# Fake News Detection for Social Media and News Outlets
## Data & References
- BuzzFeed dataset
- Based on [Reis et al.](https://homepages.dcc.ufmg.br/~fabricio/download/websci-reis-2019.pdf)
- From [dataflair](https://data-flair.training/blogs/advanced-python-project-detecting-fake-news/)
- [FakeNewsChallenge](http://www.fakenewschallenge.org): [Cisco-Talos group](https://github.com/Cisco-Talos/fnc-1)
- [r/Fakeddit](https://paperswithcode.com/paper/rfakeddit-a-new-multimodal-benchmark-dataset)
- [Check-It plugin](https://www.groundai.com/project/check-it-a-plugin-for-detecting-and-reducing-the-spread-of-fake-news-and-misinformation-on-the-web/1)
- [papers with code](https://paperswithcode.com/task/fake-news-detection/latest)
- Shuo Yang et al. Unsupervised Fake News Detection on Social Media: A Generative Approach. "In the experiment, we use two public datasets, i.e., LIAR (Wang 2017) and BuzzFeed News to evaluate the performance of our algorithm. LIAR is one of the largest fake news datasets, containing over 12,800 short news statements and labels collected from a fact-checking website politifact.com. BuzzFeed dataset contains 1,627 news articles related to the 2016 U.S. election from Facebook. We use Twitter’s advanced search API with the titles of news to collect related news tweets. After eliminating duplicate news and filtering out the news with no verified user’s tweets, we finally obtain 332 news for LIAR and 144 news for BuzzFeed. For each news tweet, the unverified users’ engagements are also collected using web scraping. We observed that users tend to explicitly express negative sentiments (using words like “lie”, “fake”) when they think a news report is fake. Thus, we use the sentiments as their opinions. As for likes and retweets, we treat them as positive opinions. Note that if a user has very few engagement records, the user’s credibility cannot be accurately estimated. Thus, we filter out the users who have less than 3 engagement records."
- My goal is to study the dataset and analyze the features of fake news, such as its source and content.
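
The engagement preprocessing quoted from Yang et al. above could be sketched roughly as follows. This is a minimal illustration, not the authors' code: the record layout (`user`, `type`, `text`), the function names, and the two-word keyword list standing in for a real sentiment model are all assumptions.

```python
from collections import Counter

# Assumption: a tiny keyword list stands in for a real sentiment model.
NEGATIVE_WORDS = {"lie", "fake"}
# Threshold taken from the quote: drop users with fewer than 3 records.
MIN_ENGAGEMENTS = 3


def opinion(engagement):
    """Map one engagement record to +1 (positive) or -1 (negative)."""
    # Likes and retweets are treated as positive opinions.
    if engagement["type"] in ("like", "retweet"):
        return 1
    # For text replies, look for explicit negative words.
    words = {
        w.strip(".,!?\"'").lower()
        for w in engagement.get("text", "").split()
    }
    # Simplifying assumption: replies without negative words count as positive.
    return -1 if words & NEGATIVE_WORDS else 1


def filter_active_users(engagements, min_records=MIN_ENGAGEMENTS):
    """Keep only records from users with at least `min_records` engagements."""
    counts = Counter(e["user"] for e in engagements)
    return [e for e in engagements if counts[e["user"]] >= min_records]
```

A real pipeline would replace the keyword match with a proper sentiment classifier; the filtering step would stay the same.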
## Application
- Fake news has been a major source of misinformation since the 2016 US presidential election, where it significantly influenced public opinion and, arguably, the outcome.
- The goal of my project is to create a web app where users submit a link, either to a social media post or to a news article. For a news link, the app reports whether the news source is fake; for a social media link, it highlights the tweets that contain fake news.
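
As a first step, the app has to decide which of the two cases a submitted link falls into. A minimal sketch of that routing is shown here; the domain list and the `link_kind` helper are hypothetical placeholders, not part of any existing code.

```python
from urllib.parse import urlparse

# Assumption: an illustrative, incomplete list of social media domains.
SOCIAL_MEDIA_DOMAINS = {"twitter.com", "facebook.com", "reddit.com"}


def link_kind(url):
    """Return 'social', 'news', or 'invalid' for a submitted URL."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if not host:
        return "invalid"
    # Match the domain itself or any subdomain of it.
    for domain in SOCIAL_MEDIA_DOMAINS:
        if host == domain or host.endswith("." + domain):
            return "social"
    # Everything else is treated as a news link in this sketch.
    return "news"
```

For example, `link_kind("https://twitter.com/user/status/1")` routes to the tweet-highlighting path, while any other valid host is sent to the news-source check.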
## Development plan
1. Collect and process a realistic dataset. Build model features (1.5-2 weeks)
2. Train the model (0.5 week)
3. Evaluate and select the best features of the model and dataset (0.5 week)
4. Build the web app (1 week)
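
For the evaluation step, one plausible way to compare feature sets is precision, recall, and F1 on the fake class. The sketch below assumes labels are the strings `"fake"` and `"real"`; the function name and label values are illustrative choices, and the plan above does not specify which metric will be used.

```python
def precision_recall_f1(y_true, y_pred, positive="fake"):
    """Compute precision, recall, and F1 for the `positive` label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Running this on the held-out predictions of each candidate feature set gives a like-for-like number to select on.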