NoSQL – Cassandra Time Series Data

We are considering using Cassandra to store information streams from various sources.

One of the problems we face is the best way to query between two dates.

For example, we will need to retrieve an object between datetime dt1 and datetime dt2.

We are considering using the creation Unix timestamp as the key for the actual object, and then querying with get_key_range to retrieve it.

Obviously, if two items have the same timestamp, this will not work.

Is this the best way to handle datetimes in NoSQL storage?

Cassandra rows can be very large, so consider modeling the items as columns within a row rather than as separate rows in a column family (CF); you can then use column slice operations, which are faster than row slices. If there is no "natural" key to group them under, you can use daily or hourly row keys such as "2010/02/08 13:00".
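To make the column-per-item layout concrete, here is a minimal in-memory sketch in Python. It is not Cassandra client code: `TimeSeriesStore`, `bucket_key`, and `slice` are hypothetical names used only for illustration, and a plain dictionary stands in for the column family. It assumes hourly row keys in the "2010/02/08 13:00" style and column names made of the timestamp plus a random suffix, so two items with the same timestamp do not overwrite each other.

```python
import bisect
import uuid
from collections import defaultdict
from datetime import datetime, timedelta


def bucket_key(ts):
    """Hypothetical row key: the hour containing ts, e.g. '2010/02/08 13:00'."""
    return ts.strftime("%Y/%m/%d %H:00")


class TimeSeriesStore:
    """In-memory stand-in for one column family: row key -> sorted columns."""

    def __init__(self):
        # Each row is a list of ((timestamp, unique_suffix), value), kept sorted.
        self.rows = defaultdict(list)

    def insert(self, ts, value):
        # The random suffix keeps column names unique when timestamps collide.
        column_name = (ts, uuid.uuid4().hex)
        bisect.insort(self.rows[bucket_key(ts)], (column_name, value))

    def slice(self, start, end):
        """Return values with start <= timestamp <= end, in time order."""
        results = []
        hour = start.replace(minute=0, second=0, microsecond=0)
        while hour <= end:
            # Visit only the hourly rows that can overlap the requested range.
            for (ts, _suffix), value in self.rows.get(bucket_key(hour), []):
                if start <= ts <= end:
                    results.append(value)
            hour += timedelta(hours=1)
        return results


store = TimeSeriesStore()
same_time = datetime(2010, 2, 8, 13, 30)
store.insert(same_time, "a")
store.insert(same_time, "b")  # same timestamp, stored as a separate column
store.insert(datetime(2010, 2, 8, 14, 5), "c")
store.insert(datetime(2010, 2, 8, 16, 0), "d")

result = store.slice(datetime(2010, 2, 8, 13, 0), datetime(2010, 2, 8, 15, 0))
```

In a real cluster the suffix would more likely come from a time-based UUID column name, and each hourly row would be read with a single column slice; the sketch only shows how a dt1 to dt2 query maps onto a handful of bucket rows.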

Otherwise, a range query is your best option (get_key_range was deprecated in 0.5; use get_range_slice instead).
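The row-key alternative can be sketched the same way. The Python below is again an in-memory illustration, not the Thrift API: `KeyOrderedStore`, `make_key`, and `range_slice` are hypothetical names. It assumes keys built from a zero-padded Unix timestamp plus a random suffix, so that string order matches time order and equal timestamps stay distinct, and it assumes the cluster returns keys in sorted order, which in Cassandra depends on using an order-preserving partitioner.

```python
import bisect
import uuid


def make_key(unix_ts):
    """Hypothetical row key: zero-padded timestamp plus a unique suffix."""
    # Zero-padding makes lexical order agree with numeric order.
    return f"{unix_ts:012d}-{uuid.uuid4().hex}"


class KeyOrderedStore:
    """In-memory stand-in for a column family whose row keys are kept sorted."""

    def __init__(self):
        self.keys = []   # sorted row keys
        self.data = {}   # row key -> stored object

    def put(self, unix_ts, value):
        key = make_key(unix_ts)
        bisect.insort(self.keys, key)
        self.data[key] = value
        return key

    def range_slice(self, start_ts, end_ts):
        """Return objects with start_ts <= timestamp <= end_ts, in key order."""
        lo = bisect.bisect_left(self.keys, f"{start_ts:012d}")
        # The bare prefix for end_ts + 1 sorts before every key at that second.
        hi = bisect.bisect_left(self.keys, f"{end_ts + 1:012d}")
        return [self.data[key] for key in self.keys[lo:hi]]


events = KeyOrderedStore()
events.put(100, "x")
events.put(100, "y")  # same timestamp, different key thanks to the suffix
events.put(200, "z")
events.put(300, "w")

in_range = events.range_slice(100, 200)
```

The trade-off against the bucketed layout is that every item becomes its own row, so a dt1 to dt2 query turns into a scan over many row keys rather than a column slice inside a few rows.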
