# Deep Learning Final Project
## Sweet Talkers | NLP Final Project
**Building a model that predicts course and professor ratings based on Critical Review language data**
Tags: NLP, TensorFlow, Python
## Who We Are
- Geireann Lindfield Roberts (glindfie)
- Raymond Yeo (ryeo)
## Project Introduction
We are building a classification model that estimates course and professor ratings based on Critical Review feedback and summaries. We are interested in whether the way we write feedback and evaluations accurately reflects the scores we give the classes we take. We hope this tool will provide further insight into how courses have performed based on their written evaluations rather than just the multiple-choice responses. We also hope this project acts as a starting point for something that can be extended, so that written evaluations, which are often not even read, can be put to more useful ends.
We were intrigued by the language processing portion of the class and wanted to explore it further.
## Related Work
As far as we are aware, nothing has been done in this regard with Critical Review's data. There is, however, related NLP work on predicting ratings and performance from review text.
- [**Detecting bad customer reviews with NLP**, 2018](https://towardsdatascience.com/detecting-bad-customer-reviews-with-nlp-d8b36134dc7e)
This article uses sentiment analysis to predict the rating a customer gives based on their review text. It uses Word2Vec, which we also hope to use for our model.
- [**Joint Training of Ratings and Reviews with Recurrent Recommender Networks**, 2017](https://openreview.net/pdf?id=Bkv9FyHYx)
This paper uses a recurrent recommender network, built on recurrent neural networks, to jointly model ratings and review text.
## Data
We have been in touch with the Critical Review team to get our data. While we are still communicating and preparing our dataset, we will likely utilize two datasets in the following formats:
**`tally.csv`**
* edition (e.g. 2021.2022.1)
* department code (e.g. CSCI)
* course code (e.g. 1951F)
* average course rating (e.g. 3.5)
* average professor rating (e.g. 3.5)
**`review.csv`**
* edition (e.g. 2021.2022.1)
* department code (e.g. CSCI)
* course code (e.g. 1951F)
* review text
Our preprocessing will involve tokenizing the input texts and accounting for any data integrity problems. We will also merge the two datasets on `course code` so that, given a review text, we can look up the average course rating and average professor rating (a rough sketch of this step follows below). For any additional data that Critical Review does not have readily accessible, we plan to ask for permission to web scrape it.
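A minimal sketch of what this preprocessing could look like with pandas and TensorFlow's `TextVectorization` layer. The file names, column names, and join keys are placeholders until we receive the final schema from Critical Review.

```python
import pandas as pd
from tensorflow.keras.layers import TextVectorization

# NOTE: all column names below are placeholders until we see the final CSV headers.
KEY_COLUMNS = ["edition", "department_code", "course_code"]

tally = pd.read_csv("tally.csv")
reviews = pd.read_csv("review.csv")

# Attach each review to its course's average ratings; joining on the full course
# identifier avoids collisions across editions, though `course code` alone may be
# enough depending on how the data is delivered.
merged = reviews.merge(tally, on=KEY_COLUMNS, how="inner")

# Basic integrity check: drop rows with missing review text or ratings.
merged = merged.dropna(subset=["review_text", "avg_course_rating", "avg_prof_rating"])

# Tokenize review text into fixed-length integer sequences.
vectorizer = TextVectorization(max_tokens=20_000, output_sequence_length=300)
vectorizer.adapt(merged["review_text"].astype(str).to_numpy())
tokens = vectorizer(merged["review_text"].astype(str).to_numpy())
```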
## Methodology
1. Pre-process the data so that it is compiled into a single CSV file and ready to be fed into the model.
2. Utilize language modeling deep learning techniques to train our model. We will likely use an embedding matrix, coupled with techniques seen in LSTM and Word2Vec models (see the sketch after this list).
3. When testing, we will round our labels: a course rating of 3.79, for example, would be considered a 4. This helps gauge our model's accuracy and constrains its possible outputs to just five integer values (1, 2, 3, 4, 5). Our training data will consist of Critical Review data from 2010 (when the Critical Review feedback format changed) until 2019.
4. We will then test the model on the feedback from the last two years.
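A minimal sketch of the kind of model described in step 2, using a Keras embedding layer followed by an LSTM and a five-way softmax, together with the label rounding from step 3. The vocabulary size, layer widths, and training hyperparameters are placeholder values, and `merged`/`tokens` come from the preprocessing sketch above.

```python
import numpy as np
import tensorflow as tf

# Round the continuous course ratings to the nearest integer in [1, 5],
# then shift to [0, 4] for sparse categorical cross-entropy.
ratings = merged["avg_course_rating"].to_numpy()
labels = np.clip(np.rint(ratings), 1, 5).astype(int) - 1

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=20_000, output_dim=128),  # embedding matrix
    tf.keras.layers.LSTM(64),
    tf.keras.layers.Dense(5, activation="softmax"),               # one class per rating 1-5
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# `tokens` comes from the preprocessing sketch above; in practice only the
# 2010-2019 editions would be used here, with the last two years held out.
model.fit(tokens, labels, validation_split=0.1, epochs=5, batch_size=32)
```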
## Metrics
Classification accuracy is a good metric for gauging this model's performance.
Base Goal: Perform better than random guessing. With five possible rating classes, this means our model accuracy would need to be greater than 20%.
Target Goal: Accuracy > 30%
Stretch Goal: Accuracy > 50%
We will reassess our goals after we begin training our model and gauging its performance.
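A small sketch of how we might check these goals once the model is trained, assuming `test_tokens` and `test_labels` hold the held-out reviews from the last two years, prepared the same way as the training data. The 20% figure corresponds to random guessing over five classes; the majority-class baseline shown here is a stricter point of comparison.

```python
import numpy as np

# `test_tokens` / `test_labels` are assumed to be the last two years of reviews,
# prepared exactly like the training data above.
loss, accuracy = model.evaluate(test_tokens, test_labels)

# Random guessing over five classes gives an expected accuracy of 1/5 = 20%;
# always predicting the most common rating is usually a tougher reference point.
majority_baseline = np.bincount(test_labels).max() / len(test_labels)
print(f"model accuracy: {accuracy:.3f} | random baseline: 0.200 | majority baseline: {majority_baseline:.3f}")
```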
## Ethics
Deep Learning serves as a good approach to this problem, as course and professor ratings are dependent on a variety of factors.
We understand that an unintended result of this model is that we are implicitly evaluating the effectiveness of Critical Review summaries. Our goal here is not to see how "good" a course review is, but rather to evaluate to what extent a qualitative metric like a holistic course review can be used to predict quantitative metrics.
Critical Review, along with students and staff at Brown, are the main stakeholders in the problem we are trying to solve. We hope that our metrics will support Critical Review and help them develop more useful quantifications of written feedback alongside the numerical feedback. We have been in touch with Critical Review and have their support in this matter.
If our model were inaccurate and gave incorrect results, this would negatively affect the reputation of Critical Review and undermine the feedback it provides to course staff and students. For that reason, it is important that we ensure the accuracy of our model.
## Division of labor
Ray + Geireann = 100% of work = 100% sweet talkers. We have not decided on fixed work distributions yet.