Cassandra Query Language

I’ll be working through this on a cloud instance of Cassandra.

Data Types


In CQL there are many data types, but they can be grouped into three main categories:

  • built-in data types
  • collection data types
  • user-defined data types

The user can choose any of them according to the requirements of the application and data model.

Built-in

The built-in data types are predefined in Cassandra. Any column can be declared with one of them.

  • Besides regular data types like ascii, boolean, decimal, double, float, int, and text, which are fairly straightforward, a few built-in types are worth a closer look.

Blob

Although Cassandra mainly stores text-based information, it can also store blobs. Blob stands for binary large object.

  • Blobs are typically used to store images, audio, or other multimedia objects.
  • A blob is a collection of binary data stored as a single entity. In Cassandra it is recommended that a blob not exceed 1 megabyte.
  • Thus, you could store a small image or string using a blob.
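
As a minimal sketch of how a blob column might be used, the statements below assume a keyspace is already selected. The table name user_avatars and its columns are hypothetical and only serve the illustration. Blob values can be written as hexadecimal constants or produced from text with the built-in textAsBlob function.

```cql
-- Hypothetical table holding a small binary value per user
CREATE TABLE IF NOT EXISTS user_avatars (
    user_id int PRIMARY KEY,
    avatar  blob
);

-- A blob constant is written in hexadecimal with a 0x prefix
-- (these four bytes are the start of a PNG header, used as a stand-in)
INSERT INTO user_avatars (user_id, avatar) VALUES (1, 0x89504e47);

-- A string can be stored as a blob by converting it first
INSERT INTO user_avatars (user_id, avatar) VALUES (2, textAsBlob('hello'));

-- blobAsText converts the stored bytes back into a string
SELECT user_id, blobAsText(avatar) FROM user_avatars WHERE user_id = 2;
```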

Bigint

The bigint data type can be used for a 64-bit signed long integer.

  • This data type covers a wider range of integers than the int data type, which is a 32-bit signed integer.
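
To make the difference in range concrete, here is a small sketch that assumes a keyspace is already selected. The table page_views and its columns are made up for this example.

```cql
-- Hypothetical table with one int column and one bigint column
CREATE TABLE IF NOT EXISTS page_views (
    page_id    int PRIMARY KEY,   -- 32-bit signed: up to 2147483647
    view_count bigint             -- 64-bit signed: up to 9223372036854775807
);

-- Both values below are the largest each type can hold
INSERT INTO page_views (page_id, view_count)
VALUES (2147483647, 9223372036854775807);
```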

Varchar

  • The well-known varchar is also available in Cassandra as a data type.
  • It represents UTF-8 encoded strings and is an alias for the text type.

Collection Data Type

Cassandra provides collection types as a way to group and store data together in a column.

For example, in a relational database, a grouping such as a user’s multiple email addresses is modeled with a many-to-one joined relationship between an email table and a users table.

Cassandra avoids joins between two tables by storing the user’s email addresses in a collection column in the users table.

  • Each collection specifies the data type of the data it holds.
  • A collection is appropriate if the amount of data to be stored in it is limited.
  • If the data has unbounded growth potential, like messages sent or sensor events registered every second, do not use the collection data type.
  • Instead, use a table with a compound primary key where data is stored in the clustering columns.
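
The alternative for unbounded data can be sketched as follows, assuming a keyspace is already selected. The table sensor_events and its columns are hypothetical. Each event becomes its own row, with sensor_id as the partition key and event_time as the clustering column.

```cql
-- Hypothetical table: one row per event instead of one growing collection
CREATE TABLE IF NOT EXISTS sensor_events (
    sensor_id  int,
    event_time timestamp,
    reading    double,
    PRIMARY KEY (sensor_id, event_time)   -- compound primary key
) WITH CLUSTERING ORDER BY (event_time DESC);

INSERT INTO sensor_events (sensor_id, event_time, reading)
VALUES (1, '2024-01-15 10:00:00+0000', 21.5);

-- Rows for one sensor are read in clustering order, newest first
SELECT event_time, reading FROM sensor_events WHERE sensor_id = 1 LIMIT 10;
```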

Within the collection data types category, there are three data types.

Lists

A list represents an ordered collection of one or more elements stored in a column.

  • Use a list when the order of the elements must be maintained and the same value may be stored multiple times, such as entries in a log.
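
A minimal sketch of a list column, assuming a keyspace is already selected. The table device_logs and its columns are hypothetical.

```cql
-- Hypothetical table with a list of log entries per device
CREATE TABLE IF NOT EXISTS device_logs (
    device_id int PRIMARY KEY,
    entries   list<text>
);

INSERT INTO device_logs (device_id, entries) VALUES (1, ['boot', 'login']);

-- Appending a value that is already present keeps both copies
UPDATE device_logs SET entries = entries + ['login'] WHERE device_id = 1;

-- The list is returned in insertion order: boot, login, login
SELECT entries FROM device_logs WHERE device_id = 1;
```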

Maps

A map represents a collection of key-value pairs. It is used to store elements that are looked up by key, such as entries in a journal where the key is a date and the value is text.
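
The journal case could look roughly like this, assuming a keyspace is already selected and a Cassandra version that supports the date type. The table user_journal and its columns are hypothetical.

```cql
-- Hypothetical table: journal entries keyed by date
CREATE TABLE IF NOT EXISTS user_journal (
    user_id int PRIMARY KEY,
    entries map<date, text>
);

INSERT INTO user_journal (user_id, entries)
VALUES (1, {'2024-01-15': 'Started learning CQL'});

-- Add or overwrite a single key
UPDATE user_journal SET entries['2024-01-16'] = 'Tried collection types'
WHERE user_id = 1;

-- Remove a single key
DELETE entries['2024-01-15'] FROM user_journal WHERE user_id = 1;
```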

Sets

A set represents a collection of one or more unique elements stored in a column.

  • Use a set to store a group of elements where duplicates are not needed.
  • The elements of a set are returned in sorted order. An example would be a group of email addresses.
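
A short sketch of a set of email addresses, assuming a keyspace is already selected. The table user_emails and the addresses are made up for illustration.

```cql
-- Hypothetical table with a set of email addresses per user
CREATE TABLE IF NOT EXISTS user_emails (
    user_id int PRIMARY KEY,
    emails  set<text>
);

INSERT INTO user_emails (user_id, emails)
VALUES (1, {'zoe@example.com', 'amy@example.com'});

-- Adding an address that is already present leaves the set unchanged
UPDATE user_emails SET emails = emails + {'amy@example.com'} WHERE user_id = 1;

-- Elements come back sorted: amy@example.com before zoe@example.com
SELECT emails FROM user_emails WHERE user_id = 1;

-- Remove an address from the set
UPDATE user_emails SET emails = emails - {'zoe@example.com'} WHERE user_id = 1;
```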

Example

Let’s go back to the users table and add a new column called jobs, which holds a list of the user’s jobs.

  • We would like to store the jobs in the order of their occurrences.
  • Remember that the users table is a static table with its primary key consisting of the user id column.
  • In Cassandra, we will store all of a user’s jobs in a single column.
  • Since we cannot perform joins, we will use the list type of the collection data types, because we want to preserve the order of the jobs.
  • Another reason is that a person can work at a specific company more than once, so uniqueness is not required.
  • We can add a job at the beginning or end of the list, or set the entry at a specific position.
  • The entries in the list can repeat, as they do not have to be unique.
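
The steps above might be written as follows. This is a sketch under assumptions: a keyspace is already selected, the user id column is named user_id and has type int, and the company names are invented. The CREATE TABLE line only stands in for the existing users table and does nothing if the table is already there.

```cql
-- Stand-in for the existing users table (column name and type are assumed)
CREATE TABLE IF NOT EXISTS users (
    user_id int PRIMARY KEY
);

-- Add the new list column
ALTER TABLE users ADD jobs list<text>;

-- Set the initial list
UPDATE users SET jobs = ['Acme', 'Globex'] WHERE user_id = 1;

-- Append to the end; 'Acme' now appears twice, which a list allows
UPDATE users SET jobs = jobs + ['Acme'] WHERE user_id = 1;

-- Prepend to the beginning
UPDATE users SET jobs = ['Initech'] + jobs WHERE user_id = 1;

-- Overwrite the entry at position 1 (positions start at 0)
UPDATE users SET jobs[1] = 'Acme Corp' WHERE user_id = 1;

SELECT jobs FROM users WHERE user_id = 1;
```

Note that setting an entry by position replaces the value already stored there, and the position must exist in the list.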

CRUD