# Skynet Social Sentiment Design

## Background

To add more dynamics and differentiation to the Skynet offering, we want to introduce additional insights. One direction is to use social media data to analyze community sentiment. The data sources can include Twitter, Telegram, Discord, and more, but the initial phase focuses on Twitter.

## Overview

![System Overview](https://i.imgur.com/hweQsxQ.png)

## Components

### Database

Requirement: Data size can be large (200-300MB)

Option 1: DynamoDB: very scalable, but development cost can be higher

~~Option 2: MySQL (relational)~~: good for small to mid-size data, but schema changes are less flexible

Option 3: S3 File (the fastest way to retrieve a big data set)

Final Design: A hybrid approach

#### S3 Files: for storing large chunks of text

community tweets file (file naming convention: projectId/community_daily_YYYYMMDD)

```
[
  "tweet 1 text",
  "tweet 2 text",
  ...
]
```

project own tweets file (file naming convention: projectId/own_daily_YYYYMMDD)

```
[
  "tweet 1 text",
  "tweet 2 text",
  ...
]
```

#### DynamoDB: for storing metric history

skynet-twitter-metrics

```
name                                  date      value
followersCount(twitterId=dypfinance)  20210101  1000
followersCount(twitterId=dypfinance)  20210102  1100
tweetsCount(twitterId=dypfinance)     20210101  30
```

### Twitter Data Indexer (ETL)

Runs as a Nomad cron job

The Twitter Data Indexer queries the Twitter API and saves the raw data into the database. The data includes:

- user tweets with an @Project tag
- project Twitter account stats and their history: followers count/tweets count
- project Twitter account tweets

The Data Indexer is triggered once per day and writes data to S3 and DynamoDB, where it is consumed by the downstream API endpoints.

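
The storage naming conventions described above can be sketched as follows. This is a minimal illustration, not the indexer's actual code: the helper names `s3_key` and `metric_item` are hypothetical, and no AWS calls are made. It only shows how the S3 object key and the `skynet-twitter-metrics` row might be derived from a project, a metric, and a date.

```python
from datetime import date


def s3_key(project_id: str, kind: str, day: date) -> str:
    """Build an S3 object key following projectId/<kind>_daily_YYYYMMDD.

    `kind` is "community" (tweets tagging the project) or "own"
    (tweets from the project's account). Hypothetical helper.
    """
    if kind not in ("community", "own"):
        raise ValueError(f"unknown file kind: {kind!r}")
    return f"{project_id}/{kind}_daily_{day:%Y%m%d}"


def metric_item(metric: str, twitter_id: str, day: date, value: int) -> dict:
    """Build one metric-history row in the name/date/value shape shown above."""
    return {
        "name": f"{metric}(twitterId={twitter_id})",
        "date": day.strftime("%Y%m%d"),
        "value": value,
    }


if __name__ == "__main__":
    print(s3_key("dypfinance", "community", date(2021, 1, 1)))
    print(metric_item("followersCount", "dypfinance", date(2021, 1, 1), 1000))
```

Keeping the key and row construction in small pure functions would let the daily job and the API endpoints share one definition of the naming scheme, assuming both are written in the same language.
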
### Twitter Metrics API

Runs as a Lambda function

RESTful Interface

```
GET /twitter/metrics?name=followersCount(twitterId=dypfinance)&dates=20210101,20210102

[
  {"date": "20210101", "value": 1000},
  {"date": "20210102", "value": 2000}
]
```

### Twitter Sentiment API

Runs as a Lambda function

RESTful Interface

```
GET /twitter/dypfinance/sentiment?start=20210101&end=20210114

{
  "positive": 20,
  "negative": 30,
  "neutral": 50
}
```

### Twitter Word Cloud API

Runs as a Lambda function

RESTful Interface

```
GET /twitter/dypfinance/word-cloud?start=20210101&end=20210114

[
  ["told", 64],
  ["mistake", 17],
  ["thought", 16],
  ["bad", 11],
  ...
]
```
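
One way the word-cloud response above could be computed from the stored tweet files is a plain word-frequency count. The sketch below is an assumption, not the documented implementation: the function name `word_cloud`, the tokenization regex, the tiny stop-word list, and the `top_n` parameter are all hypothetical. It takes a list of tweet texts (the shape of the S3 files) and returns `[word, count]` pairs in the same shape as the API response.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stop-word list for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "to", "of", "is", "it", "in", "i"}


def word_cloud(tweets, top_n=50):
    """Return the top_n [word, count] pairs across all tweet texts."""
    counts = Counter()
    for text in tweets:
        # Lowercase and keep runs of letters/apostrophes as words.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    # Sort by count descending, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [[word, count] for word, count in ranked[:top_n]]


if __name__ == "__main__":
    sample = ["I told you it was a mistake", "Bad call, bad mistake", "told you"]
    print(word_cloud(sample, top_n=3))
```

For a `start`/`end` range, the same function could be applied to the concatenated contents of each daily file in that range, assuming the files fit in the Lambda's memory.
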