# NLP Project
# I. Introduction
A. Overview of Twitter Financial News dataset
- The Twitter Financial News dataset is an English-language corpus of finance-related tweets.
- It consists of 11,932 annotated documents, each labeled with one of three sentiments.
- Sentiments:
- "LABEL_0": "Bearish"
- "LABEL_1": "Bullish"
- "LABEL_2": "Neutral"
- The data was collected via the Twitter API, providing a diverse, real-world sample of financial sentiment.
- This dataset supports a multi-class classification task for sentiment analysis.
B. Importance of sentiment analysis in finance
- Sentiment analysis plays a crucial role in understanding market dynamics and investor sentiment.
- Accurate sentiment classification enables better decision-making in financial investments.
C. Objectives of the project
- Develop and compare models for sentiment analysis on financial tweets.
- Evaluate the effectiveness of different approaches in capturing and classifying financial sentiment.
# II. Data Preparation
A. Data Format
- The dataset is organized in the DatasetDict format, consisting of train and validation splits.
```python
DatasetDict({
    train: Dataset({
        features: ['text', 'label'],
        num_rows: 9543
    })
    validation: Dataset({
        features: ['text', 'label'],
        num_rows: 2388
    })
})
```
- Utilize the 'train' split for model training and the 'validation' split for final model performance evaluation.
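- A minimal loading sketch, assuming the `datasets` library and the dataset ID linked in the references:
```python
from datasets import load_dataset

# Load the Twitter Financial News sentiment dataset from the Hugging Face Hub
dataset = load_dataset('zeroshot/twitter-financial-news-sentiment')

print(dataset)              # DatasetDict with 'train' and 'validation' splits
print(dataset['train'][0])  # e.g. {'text': '...', 'label': 2}
```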
# III. Model Development
## A. BERT as Embedding + Logistic Regression
1. Tokenization and Embedding using BERT
- Utilize BERT for contextual word embeddings, capturing nuanced financial language.
- Model Configuration:
```python
from transformers import AutoTokenizer, AutoModel

# Load the cased BERT base tokenizer and encoder
tokenizer = AutoTokenizer.from_pretrained('bert-base-cased')
bert_model = AutoModel.from_pretrained('bert-base-cased')
```
2. Logistic Regression Model
- Apply logistic regression on BERT embeddings to predict sentiment.
- Model Configuration:
```python
from sklearn.linear_model import LogisticRegression

# Default logistic regression model from scikit-learn
logistic_regression_model = LogisticRegression()
```
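- Putting the two pieces together, a minimal sketch of the embedding-plus-classifier pipeline (the [CLS] pooling choice, batching, and variable names are illustrative assumptions; `dataset` is the DatasetDict loaded in Section II):
```python
import torch

def embed(texts, batch_size=32):
    """Encode texts with frozen BERT and keep each sequence's [CLS] vector."""
    bert_model.eval()
    chunks = []
    with torch.no_grad():
        for i in range(0, len(texts), batch_size):
            batch = tokenizer(texts[i:i + batch_size], padding=True,
                              truncation=True, return_tensors='pt')
            chunks.append(bert_model(**batch).last_hidden_state[:, 0, :])
    return torch.cat(chunks).numpy()

# Fit the scikit-learn classifier on the frozen BERT features
X_train = embed(dataset['train']['text'])
logistic_regression_model.fit(X_train, dataset['train']['label'])
```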
## B. Fine-tune BERT
1. Model Architecture
- Load BERT with a three-way sequence-classification head matching the dataset's sentiment labels.
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained('bert-base-cased')
pretrained_bert_model = AutoModelForSequenceClassification.from_pretrained('bert-base-cased', num_labels=3)
```
2. Fine-tuning Process
- Train the classification model end-to-end on the annotated Twitter Financial News dataset, as sketched below.
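- A minimal fine-tuning sketch with the Hugging Face `Trainer` (the hyperparameters are illustrative assumptions; `dataset` is the DatasetDict loaded in Section II):
```python
from transformers import Trainer, TrainingArguments

def tokenize(batch):
    return tokenizer(batch['text'], padding='max_length', truncation=True, max_length=128)

tokenized_dataset = dataset.map(tokenize, batched=True)

# Illustrative hyperparameters; the settings actually used may differ
training_args = TrainingArguments(
    output_dir='bert-finetuned-finance',
    num_train_epochs=3,
    per_device_train_batch_size=16,
    evaluation_strategy='epoch',
)

trainer = Trainer(
    model=pretrained_bert_model,
    args=training_args,
    train_dataset=tokenized_dataset['train'],
    eval_dataset=tokenized_dataset['validation'],
)
trainer.train()
```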
## C. Off-the-shelf Pretrained Model from Hugging Face
1. Selecting Pretrained Model
- Choose the off-the-shelf model 'google/flan-t5-base' for sentiment analysis.
- Model Information:
- Model Name: 'google/flan-t5-base'
- Brief Description: Based on the T5 architecture and instruction-tuned on a wide range of tasks, including sentiment classification, which enables zero-shot prompting.
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Specify the off-the-shelf model name
model_name = 'google/flan-t5-base'

# FLAN-T5 is a text-to-text model, so load it for generation-based prompting
t5_tokenizer = AutoTokenizer.from_pretrained(model_name)
off_the_shelf_model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
```
2. Integration with Sentiment Analysis Task
- Utilize the zero-shot learning approach with a predefined prompt for sentiment analysis.
- Prompt Configuration:
- Define the prompt as follows:
```python
prompt = """
Classify the sentiment of the given finance-related tweet and return the numerical code: 0 for "Bearish," 1 for "Bullish," and 2 for "Neutral."
"""
```
- Make predictions by prompting the model through a text-to-text pipeline.
```python
from transformers import pipeline

# Prompt FLAN-T5 through a text-to-text generation pipeline
zero_shot_classifier = pipeline('text2text-generation', model=off_the_shelf_model, tokenizer=t5_tokenizer)

# Example usage with a finance-related tweet
tweet = "Stocks are expected to rise sharply in the coming quarter."
prediction = zero_shot_classifier(prompt + "\nTweet: " + tweet)
```
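- Since FLAN-T5 returns free text, the generated string still has to be mapped back to one of the three label codes; a minimal sketch (the fallback to Neutral is an assumption):
```python
# Map the generated string back to a label id, defaulting to Neutral (2)
generated = prediction[0]['generated_text'].strip()
label_id = int(generated) if generated in {'0', '1', '2'} else 2
```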
## D. Model Trained by Users - DeBERTa for Financial Sentiment Classification
- Model Information:
- Model Name: 'RashidNLP/finance-sentiment-deberta-base'
- Brief Description: A DeBERTa model trained on over 1 million reviews from Amazon's multi-review dataset and then fine-tuned on four finance sentiment datasets (financial_phrasebank, chiapudding/kaggle-financial-sentiment, zeroshot/twitter-financial-news-sentiment, FinanceInc/auditor_sentiment).
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Specify the user-trained model name
user_model_name = 'RashidNLP/finance-sentiment-deberta-base'

# Load the user-trained DeBERTa model and its tokenizer
user_tokenizer = AutoTokenizer.from_pretrained(user_model_name)
user_trained_model = AutoModelForSequenceClassification.from_pretrained(user_model_name)
```
- Integration with Sentiment Analysis Task
- Utilize the user-trained DeBERTa model for sentiment analysis on financial tweets.
- Make predictions through a sentiment-analysis pipeline (the raw model cannot be called on a string directly).
```python
from transformers import pipeline

# Wrap the model and its tokenizer in a pipeline that handles tokenization
sentiment_pipeline = pipeline('sentiment-analysis', model=user_trained_model, tokenizer=user_tokenizer)

# Example usage with a finance-related tweet
user_prediction = sentiment_pipeline(tweet)
```
# IV. Model Comparison
A. Evaluation Metrics
1. Accuracy, Precision, Recall, F1-score
- Accuracy is used as the primary metric to evaluate model performance; a scoring sketch follows.
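- An illustrative sketch of computing validation accuracy with, e.g., the DeBERTa pipeline from Section III.D (the `LABEL_n` output format is an assumption about the model's label names):
```python
from sklearn.metrics import accuracy_score

val_texts = dataset['validation']['text']
val_labels = dataset['validation']['label']

# Run the sentiment pipeline over the validation split and parse the label ids
outputs = sentiment_pipeline(val_texts, truncation=True)
predicted = [int(o['label'].split('_')[-1]) for o in outputs]
print('Accuracy:', accuracy_score(val_labels, predicted))
```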
B. Comparative Analysis
| Methods          | Accuracy | Num_parameters (M) | Model_size (MB) |
| ---------------- | -------- | ------------------ | --------------- |
| BERT_lr          | 0.79     | 108                | 413             |
| Finetuned_BERT   | 0.87     | 108                | 413             |
| LLM_zero_shot    | 0.65     | 247                | 944             |
| Pretrained_model | 0.90     | 184                | 413             |
1. **BERT as Embedding + Logistic Regression (BERT_lr):**
- Achieved an accuracy of 79%.
- Utilized BERT embeddings with a logistic regression model.
- The underlying BERT encoder accounts for the footprint: 108 million parameters and a 413 MB model size.
2. **Fine-tuned BERT (Finetuned_BERT):**
- Outperformed BERT_lr with an accuracy of 87%.
- Used the same BERT architecture but fine-tuned on the financial sentiment dataset.
- Similar model size and number of parameters as BERT_lr.
3. **LLM Zero-Shot Learning (LLM_zero_shot):**
- Demonstrated a lower accuracy of 65%.
- Leveraged 'google/flan-t5-base' with prompt-based zero-shot classification.
- Larger model size (944 MB) compared to BERT-based models.
4. **User-Trained DeBERTa Model (Pretrained_model):**
- Achieved the highest accuracy of 90%.
- Utilized 'RashidNLP/finance-sentiment-deberta-base', a DeBERTa model already fine-tuned on financial sentiment data.
- 184 million parameters and a 413 MB model size, comparable to the BERT-based models.
- **Insights:**
- The fine-tuned BERT model outperforms the BERT as Embedding + Logistic Regression model, indicating the effectiveness of fine-tuning on financial sentiment tasks.
- The user-trained DeBERTa model, 'RashidNLP/finance-sentiment-deberta-base', demonstrated the highest accuracy, showcasing the value of large-scale pretraining combined with finance-specific fine-tuning (its fine-tuning data includes this very Twitter dataset, which likely contributes to its lead).
# V. Conclusion
A. Summary of Findings
- In this project, we explored and compared various methods for sentiment analysis on financial tweets using the Twitter Financial News dataset.
- Four different methods were implemented and evaluated: BERT as Embedding + Logistic Regression, Fine-tuned BERT, LLM Zero-Shot Learning (using 'google/flan-t5-base'), and a user-trained pretrained model ('RashidNLP/finance-sentiment-deberta-base').
- Performance metrics, including accuracy, were analyzed for each method to assess their effectiveness in capturing financial sentiment.
B. Implications for Finance
- The adoption of Large Language Models (LLMs) is a prevailing trend in recent NLP advancements. While our project utilized a smaller-scale LLM ('google/flan-t5-base') instead of larger models like GPT-3 or GPT-4, it provides a glimpse into the potential of in-context learning. Future improvements may explore incorporating more powerful LLMs, such as GPT-3 or GPT-4, with a focus on enhancing few-shot learning capabilities.
C. Model Performance Insights
- Fine-tuned BERT outperformed the BERT as Embedding + Logistic Regression method. This suggests that adapting BERT through fine-tuning on finance-specific data improves its ability to capture nuanced financial sentiment. The tailored adjustments in the fine-tuning process contribute to a more contextually relevant representation for financial language.
D. Considerations for Deployment
- When contemplating model deployment, especially for large-scale ML or DL models, the number of parameters and the model size become critical considerations. The user-trained DeBERTa model ('RashidNLP/finance-sentiment-deberta-base') delivered the highest accuracy at a footprint (184M parameters, 413 MB) comparable to BERT, highlighting the importance of model efficiency for practical deployment.
E. Recommendations and Future Work
- Future work could delve into leveraging larger LLMs, such as GPT-3 or GPT-4, to enhance the model's ability to grasp intricate financial contexts in a few-shot learning setting.
- The success of fine-tuned BERT encourages further exploration of domain-specific fine-tuning strategies and parameter adjustments for improved sentiment analysis on financial data.
- Considerations for deployment should involve optimizing models for efficiency without compromising performance, potentially exploring model compression techniques.
# VI. References
- [Dataset](https://huggingface.co/datasets/zeroshot/twitter-financial-news-sentiment)
- [Pretrained Model](https://huggingface.co/RashidNLP/Finance-Sentiment-Classification)