# Skynet Social Sentiment Design
## Background
To add more dynamics and differentiation to the Skynet offering, we want to introduce some additional insights.
One direction we can work towards is using social media data to analyze the sentiment of the community.
Data sources can include Twitter, Telegram, Discord, and more, but the initial phase focuses on Twitter.
## Overview

## Components
### Database
Requirement: the data size can be large (200-300 MB).

- Option 1: DynamoDB. Very scalable, but development cost can be higher.
- ~~Option 2: MySQL (relational)~~. Good for small to mid-size data, but schema changes are less flexible.
- Option 3: S3 files. The fastest way to retrieve a large data set.

Final design: a hybrid approach.
#### S3 Files: for storing large chunks of text
Community tweets file (file naming convention: projectId/community_daily_YYYYMMDD):
```
[
"tweet 1 text",
"tweet 2 text",
...
]
```
Project's own tweets file (file naming convention: projectId/own_daily_YYYYMMDD):
```
[
"tweet 1 text",
"tweet 2 text",
...
]
```
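The naming convention above can be sketched as a small helper. This is a minimal illustration, not the actual implementation: `daily_key` and `serialize_tweets` are hypothetical names, and using `dypfinance` as a `projectId` is an assumption made for the example.

```python
import json
from datetime import date


def daily_key(project_id: str, kind: str, day: date) -> str:
    """Build the S3 object key for one day's tweets file.

    `kind` is "community" or "own", matching the two file types above.
    """
    if kind not in ("community", "own"):
        raise ValueError(f"unknown file kind: {kind}")
    return f"{project_id}/{kind}_daily_{day:%Y%m%d}"


def serialize_tweets(tweets: list[str]) -> str:
    """Serialize tweet texts as a JSON array of strings, as in the file examples."""
    return json.dumps(tweets, ensure_ascii=False)


# Hypothetical project ID, used only for illustration.
key = daily_key("dypfinance", "community", date(2021, 1, 1))
body = serialize_tweets(["tweet 1 text", "tweet 2 text"])
```

Keeping the date in the key means a date-range query only needs to compute the keys to fetch, without listing the bucket.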
#### DynamoDB: for storing metric history
Table name: skynet-twitter-metrics
```
name                                  date      value
followersCount(twitterId=dypfinance)  20210101  1000
followersCount(twitterId=dypfinance)  20210102  1100
tweetsCount(twitterId=dypfinance)     20210101  30
```
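As a rough sketch of how rows in this table might be built, the helper below composes the `name` column from a metric and its dimensions. The function names are hypothetical, and treating `name` as the partition key and `date` as the sort key is an assumption; the design above does not specify the key schema.

```python
def metric_name(metric: str, **dimensions: str) -> str:
    """Compose a metric name such as followersCount(twitterId=dypfinance)."""
    dims = ",".join(f"{key}={value}" for key, value in sorted(dimensions.items()))
    return f"{metric}({dims})"


def metric_item(metric: str, day: str, value: int, **dimensions: str) -> dict:
    """Build one row for the metrics table.

    Assumption: `name` is the partition key and `date` (YYYYMMDD) the sort key.
    """
    return {"name": metric_name(metric, **dimensions), "date": day, "value": value}


item = metric_item("followersCount", "20210101", 1000, twitterId="dypfinance")
```

Because YYYYMMDD strings sort chronologically, a date-range lookup for one metric would map onto a single range query under this assumed key layout.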
### Twitter Data Indexer (ETL)
Runs as a Nomad cron job.

The Twitter Data Indexer queries the Twitter API and saves the raw data into the database. The data includes:
- user tweets that tag the project (@project)
- project Twitter account stats and their history: followers count and tweets count
- project Twitter account tweets

The Data Indexer is triggered once per day and writes data to S3 and DynamoDB, where it is consumed by the downstream API endpoints.
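One daily run of the indexer could look roughly like the sketch below. Everything here is illustrative: `FakeTwitterClient` and its methods are a hypothetical wrapper interface, not real Twitter API calls, and plain dictionaries stand in for S3 and DynamoDB.

```python
from datetime import date


class FakeTwitterClient:
    """In-memory stand-in for a Twitter API wrapper (hypothetical interface)."""

    def mentions(self, twitter_id: str, day: date) -> list[str]:
        return [f"@{twitter_id} looks promising", f"not sure about @{twitter_id}"]

    def own_tweets(self, twitter_id: str, day: date) -> list[str]:
        return ["We just shipped a new release"]

    def account_stats(self, twitter_id: str) -> dict[str, int]:
        return {"followersCount": 1000, "tweetsCount": 30}


def run_daily_index(project_id, twitter_id, day, client, files, metrics):
    """Fetch one day's data and write it to the two stores.

    `files` stands in for S3 (key -> list of tweet texts) and `metrics`
    for DynamoDB ((name, date) -> value).
    """
    stamp = day.strftime("%Y%m%d")
    files[f"{project_id}/community_daily_{stamp}"] = client.mentions(twitter_id, day)
    files[f"{project_id}/own_daily_{stamp}"] = client.own_tweets(twitter_id, day)
    for metric, value in client.account_stats(twitter_id).items():
        metrics[(f"{metric}(twitterId={twitter_id})", stamp)] = value


files, metrics = {}, {}
run_daily_index("dypfinance", "dypfinance", date(2021, 1, 1),
                FakeTwitterClient(), files, metrics)
```

Passing the client and stores in as arguments keeps the run testable without network access; a real deployment would substitute actual API and storage clients.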
### Twitter Metrics API
Runs as a Lambda function.

RESTful interface:
```
GET /twitter/metrics?name=followersCount(twitterId=dypfinance)&dates=20210101,20210102
[
{"date": "20210101", "value": 1000},
{"date": "20210102", "value": 1100}
]
```
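A minimal sketch of the lookup behind this endpoint follows, using an in-memory dictionary seeded with the sample rows from the metrics table in place of DynamoDB. `get_metrics` is a hypothetical name, and silently skipping dates with no stored value is an assumption. Note that the metric name contains parentheses and `=`, so clients should URL-encode the query string.

```python
from urllib.parse import parse_qs, urlencode

# Stand-in for the skynet-twitter-metrics table: (name, date) -> value.
STORE = {
    ("followersCount(twitterId=dypfinance)", "20210101"): 1000,
    ("followersCount(twitterId=dypfinance)", "20210102"): 1100,
    ("tweetsCount(twitterId=dypfinance)", "20210101"): 30,
}


def get_metrics(query_string: str, store: dict) -> list[dict]:
    """Return [{date, value}, ...] for the requested metric and dates."""
    params = parse_qs(query_string)
    name = params["name"][0]
    dates = [d for d in params.get("dates", [""])[0].split(",") if d]
    # Assumption: dates with no stored value are omitted from the response.
    return [{"date": d, "value": store[(name, d)]} for d in dates if (name, d) in store]


# urlencode percent-encodes the parentheses, "=" and "," in the parameter values.
query = urlencode({
    "name": "followersCount(twitterId=dypfinance)",
    "dates": "20210101,20210102",
})
result = get_metrics(query, STORE)
```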
### Twitter Sentiment API
Runs as a Lambda function.

RESTful interface:
```
GET /twitter/dypfinance/sentiment?start=20210101&end=20210114
{
"positive": 20,
"negative": 30,
"neutral": 50
}
```
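The aggregation step behind this response might be sketched as follows. The design does not say how tweets are classified, so the sketch assumes each tweet already carries one of three labels; the function names are hypothetical, and treating `end` as inclusive is also an assumption.

```python
from collections import Counter
from datetime import datetime, timedelta

LABELS = ("positive", "negative", "neutral")


def days_in_range(start: str, end: str) -> list[str]:
    """List YYYYMMDD strings from start to end (assumed inclusive)."""
    first = datetime.strptime(start, "%Y%m%d").date()
    last = datetime.strptime(end, "%Y%m%d").date()
    if last < first:
        raise ValueError("end must not be before start")
    span = (last - first).days
    return [(first + timedelta(days=i)).strftime("%Y%m%d") for i in range(span + 1)]


def summarize(labels) -> dict[str, int]:
    """Count per-tweet labels into the response shape shown above."""
    counts = Counter(labels)
    unknown = set(counts) - set(LABELS)
    if unknown:
        raise ValueError(f"unexpected labels: {sorted(unknown)}")
    return {label: counts[label] for label in LABELS}


days = days_in_range("20210101", "20210114")
summary = summarize(["positive", "negative", "negative", "neutral"])
```

Each entry in `days` corresponds to one daily tweets file, so the handler would read those files, label their tweets, and pass the labels to `summarize`.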
### Twitter Word Cloud API
Runs as a Lambda function.

RESTful interface:
```
GET /twitter/dypfinance/word-cloud?start=20210101&end=20210114
[
["told", 64],
["mistake", 17],
["thought", 16],
["bad", 11],
...
]
```
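The word counts in this response could be produced along the lines of the sketch below. The tokenization rule, the `top_n` limit, and the stopword handling are all assumptions made for illustration; the design does not specify them.

```python
import re
from collections import Counter

# Lowercase words, optionally with one internal apostrophe (e.g. "don't").
WORD_RE = re.compile(r"[a-z]+(?:'[a-z]+)?")


def word_cloud(tweets, top_n=50, stopwords=frozenset()):
    """Return [[word, count], ...] sorted by descending count."""
    counts = Counter()
    for text in tweets:
        for word in WORD_RE.findall(text.lower()):
            if word not in stopwords:
                counts[word] += 1
    return [[word, count] for word, count in counts.most_common(top_n)]


cloud = word_cloud(
    ["Bad bad mistake", "I thought it was bad"],
    top_n=3,
    stopwords=frozenset({"i", "it", "was"}),
)
```

Returning pairs as two-element lists matches the JSON shape in the example response, since JSON has no tuple type.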