Infinite Karma
===
###### tags: `refinitiv`, `2020`, `innovationweek`
[toc]
Link to [Innovation Week Schedule](https://hackmd.io/DmoskMeaQ7iOYqtmhkUCIQ?both)
## Deck
Ayla's recommended flow:
1. Problem (including quantifiable negative impact)
1. Solution Features & Benefits (including quantifiable commercial or efficiency gain, screenshots of solution)
1. User Profile (internal or external)
1. Tech Architecture (diagram)
1. Techniques Used (what and why)
#### Problem
Slide title: Traders find it difficult to gauge reliable sources of market sentiment.
Content:
Understanding market sentiment is crucial to identifying **when** a trader should make a trade.
The raw stock price does not explain to what extent a particular news item is already priced in.
It is vital to understand both the stock and the market in their wider context.
However, due to an overload of information online, it is difficult to sift out what is relevant and useful.
It is difficult to distinguish price moves driven by different classes of traders, whether institutional or retail.
Information overload
Slide 2: Map of Financial Communities online
Slide 3: Focus on Reddit's financial communities
Text: We chose r/wallstreetbets as it's the most active, r/securityanalysis for its high-quality posts, and r/investing for its generality.
#### Solution
Slide 1 Title: We identify the degree to which online sources provide timely and reliable information.
Caption: Traders need a product that leverages alternative data sources to accurately reflect market sentiment.
We start with Reddit, which has an extremely active user base (include stats on financial subreddits)
It is known for influencing and also recording serious market movements.
What is the quantifiable commercial or efficiency gain?
#### User Profile
Slide Title: Our Users
Icons for: Traders, investment managers, equity analysts
How will they use this?
Traders
Investment managers
#### Tech architecture
Pushshift + EC2 + S3 + Athena -> DS Zoo (JupyterLab) -> ElementJS
#### Techniques Used
## Important for Slides
Rationale:
- [How Quant Traders Use Sentiment To Get An Edge On The Market](https://www.forbes.com/sites/kumesharoomoogan/2015/08/06/how-quant-traders-use-sentiment-to-get-an-edge-on-the-market/#67c1b2ea4b5d)
Use cases:
- Nitish: SentiMine for Reddit, could we do a mapping from a Reddit post to a stock
- Nitish: not all traders want to follow what a model says; Reddit sentiment might allow someone to overweight or underweight
- Regan likes this one, could be interesting to understand market sentiment
- Kai Xin: subset the data in an unsupervised way, make it related to ESG--
- Nitish: when doing a market-level sentiment aggregation, we can look at who posted, number of comments,
- Josh: my friends use Reddit and search for specific tickers. Is there a way to measure sentiment on a per-stock basis?
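
A minimal sketch of two of the ideas above: mapping a Reddit post to the stocks it mentions, and aggregating sentiment per ticker weighted by the number of comments. Everything here is an assumption for illustration: the `KNOWN_TICKERS` set is a hypothetical placeholder, the `sentiment` score is assumed to come from some upstream model, and the post fields (`text`, `sentiment`, `num_comments`) are our own naming rather than a confirmed schema.

```python
import re
from collections import defaultdict

# Hypothetical ticker universe; a real one would come from a reference data source.
KNOWN_TICKERS = {"AAPL", "TSLA", "MSFT"}

# Matches "$TSLA" or bare "TSLA": 1 to 5 capital letters on word boundaries.
TICKER_PATTERN = re.compile(r"\$?\b([A-Z]{1,5})\b")


def extract_tickers(text):
    """Return the set of known tickers mentioned in a post's text."""
    return {m for m in TICKER_PATTERN.findall(text) if m in KNOWN_TICKERS}


def aggregate_sentiment(posts):
    """Return the comment-weighted mean sentiment per ticker.

    Each post is a dict with 'text', 'sentiment' (a float, assumed to be
    produced by an upstream model) and 'num_comments'.
    """
    weighted_sum = defaultdict(float)
    total_weight = defaultdict(float)
    for post in posts:
        # +1 so that posts with no comments still contribute.
        weight = 1 + post["num_comments"]
        for ticker in extract_tickers(post["text"]):
            weighted_sum[ticker] += weight * post["sentiment"]
            total_weight[ticker] += weight
    return {t: weighted_sum[t] / total_weight[t] for t in weighted_sum}


example_posts = [
    {"text": "$TSLA to the moon", "sentiment": 1.0, "num_comments": 9},
    {"text": "TSLA is overvalued, buying AAPL", "sentiment": -0.5, "num_comments": 4},
]
scores = aggregate_sentiment(example_posts)
```

Filtering regex matches against a known ticker list is a crude way to avoid treating ordinary capitalised words as tickers; it would still confuse tickers that are also common words, so treat it as a starting point only.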
## Resources
### To Explore
How to Analyze Every Reddit Submission and Comment, in Seconds, for Free [Link here.](https://minimaxir.com/2015/10/reddit-bigquery/) Max Woolf also has another great post on "Quantifying and Visualizing the Reddit Hivemind", [link here](https://minimaxir.com/2015/10/reddit-topwords/).
A surprisingly readable dissertation on: "Does Reddit Sentiment Predict Abnormal Stock Price Return?", which I found on [this random guy](https://www.linkedin.com/in/christian-taylor-8abaa3194/)'s LinkedIn.
"Who drives the market? Sentiment analysis of financial news posted on Reddit and Financial Times". A 2018 [Bachelor's degree thesis](http://ad-publications.informatik.uni-freiburg.de/theses/Bachelor_Michael_Lubitz_2018.pdf). They used r/economics.
An evaluation of document clustering and modelling on Reddit and Twitter - [link here](https://www.sciencedirect.com/science/article/abs/pii/S0306457318307805).
This paper analysing the [significance of Twitter](https://www.researchgate.net/publication/282049046_The_Effects_of_Twitter_Sentiment_on_Stock_Price_Returns) sentiment for stock returns could provide some helpful parallels to Reddit data.
### Methods/Infrastructure
Analysing Reddit Sentiment with AWS: GitHub tutorial [here](https://github.com/aws-samples/analyzing-reddit-sentiment-with-aws)
PRAW docs may be found [here](https://praw.readthedocs.io/en/latest/index.html)
Pushshift (instead of PRAW) - [API Docs](https://pushshift.io/api-parameters/)
A paper on the Pushshift dataset - [link](https://www.aaai.org/ojs/index.php/ICWSM/article/view/7347/7201)
Reddit on BigQuery till Feb 2017. [Link](https://console.cloud.google.com/bigquery?project=fh-bigquery&page=dataset&d=reddit_comments&p=fh-bigquery&redirect_from_classic=true&pli=1)
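
As a rough sketch of how the Pushshift API listed above might be queried, the snippet below only builds a request URL and makes no network call. The endpoint and the `subreddit`, `q`, `after` and `size` parameters follow the Pushshift docs as we understood them at the time of writing; availability and parameters may have changed, so check the API docs before relying on this. The helper name `build_submission_query` is our own.

```python
from urllib.parse import urlencode

# Submission search endpoint as described in the Pushshift docs (assumed current).
PUSHSHIFT_SUBMISSIONS = "https://api.pushshift.io/reddit/search/submission/"


def build_submission_query(subreddit, query, after, size=100):
    """Build a URL searching one subreddit's submissions for a keyword.

    'after' is passed through unchanged, e.g. an epoch timestamp or a
    relative value such as "7d" (assumed to be accepted by the API).
    """
    params = {"subreddit": subreddit, "q": query, "after": after, "size": size}
    return PUSHSHIFT_SUBMISSIONS + "?" + urlencode(params)


url = build_submission_query("wallstreetbets", "TSLA", "7d")
print(url)
```

The resulting URL could then be fetched from the EC2 ingestion step and the JSON response written to S3 for querying.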
## Link to Mural
[Infinite Karma's Mural](https://app.mural.co/t/refinitivlabs3203/m/refinitivlabs3203/1592297137223/4ca30501e2633a34b41832d9bc5bb044595a3832)