Data Model Cassandra

# Data Model Cassandra [TOC] ## Data Model For Following Use Cases ``` Brain Key - Partition Key Cluster Key - Time stamp equivalent Features - Data Time stamp equivalent - Partition Key Cluster Key - Brain Key Features - Data 5 x 24 x 365 x 10 = Use Case 1 - Get all feature values for a given function Use Case 2 - Get last 1 year data of a feature for a given key and time_window Use Case 3 - Get last 15 data points of a feature for a given key and time_window Use Case 4 - Get latest feature values for list of feature for a given time_window ``` ``` sql Create keyspace featurerepository with replication={'class':'SimpleStrategy','replication_factor':1}; Use featurerepository; CREATE TABLE IF NOT EXISTS feature_by_5minutes ( brain_key text, feature_key int, window_time timestamp, feature_value bigint, PRIMARY KEY ((brain_key, feature_key), window_time) ) WITH CLUSTERING ORDER BY (window_time DESC); CREATE TABLE IF NOT EXISTS feature_by_hour ( brain_key text, feature_key int, window_time timestamp, feature_value bigint, PRIMARY KEY ((brain_key, feature_key), window_time) ) WITH CLUSTERING ORDER BY (window_time DESC); CREATE TABLE IF NOT EXISTS feature_by_day ( brain_key text, feature_key int, window_time timestamp, feature_value bigint, PRIMARY KEY ((brain_key, feature_key), window_time) ) WITH CLUSTERING ORDER BY (window_time DESC); CREATE TABLE IF NOT EXISTS feature_by_week ( brain_key text, feature_key int, window_time timestamp, feature_value int, PRIMARY KEY ((brain_key, feature_key), window_time) ) WITH CLUSTERING ORDER BY (window_time DESC); CREATE TABLE IF NOT EXISTS feature_by_month ( brain_key text, feature_key int, window_time timestamp, feature_value bigint, PRIMARY KEY ((brain_key, feature_key), window_time) ) WITH CLUSTERING ORDER BY (window_time DESC); CREATE TABLE IF NOT EXISTS feature_by_quarter ( brain_key text, feature_key int, window_time timestamp, feature_value bigint, PRIMARY KEY ((brain_key, feature_key), window_time) ) WITH CLUSTERING ORDER BY (window_time DESC); CREATE TABLE IF NOT EXISTS feature_by_year ( brain_key text, feature_key int, window_time timestamp, feature_value bigint, PRIMARY KEY ((brain_key, feature_key), window_time) ) WITH CLUSTERING ORDER BY (window_time DESC); CREATE TABLE IF NOT EXISTS feature_latest ( brain_key text, feature_key int, window_time timestamp, feature_value bigint, PRIMARY KEY ((brain_key), feature_key) ); DROP TABLE featurerepository.feature_by_hour ``` Use full cassandra Notes ``` sql docker pull cassandra:latest docker network create cassandra //Cassandra Instance docker run -d --name cassandra --hostname cassandra --network cassandra -p 9042:9042 cassandra Client to interact with cassandra docker run -it --network cassandra --rm cassandra cqlsh cassandra -p 9042:9042 ``` ### Use case 2,3,4 ```sql= Use Case 2:- Select * from featurerepository.feature_by_hour where window_end_time > '2022-06-23 08:15:00' and brain_key ='51f39b5b0aa76ca051280d3bd5f7eeecd82e84e9c73e86e091b4c49424f4f4eb' and feature_key = 1234567 Use Case 3:- Select * from featurerepository.feature_by_hour where brain_key = '51f39b5b0aa76ca051280d3bd5f7eeecd82e84e9c73e86e091b4c49424f4f4eb' and feature_key = 1234567 Limit 2 Use Case 4:- Select * from features_latest where brain_key = ? and feature_key IN ( list_of_feature_keys) ```