# NLP Project

# I. Introduction

A. Overview of the Twitter Financial News dataset

- The Twitter Financial News dataset is an English-language corpus of finance-related tweets.
- It consists of 11,932 annotated documents, each labeled with one of three sentiments:
  - "LABEL_0": "Bearish"
  - "LABEL_1": "Bullish"
  - "LABEL_2": "Neutral"
- The data was collected using the Twitter API, ensuring a diverse, real-world representation of financial sentiment.
- The dataset supports a multi-class classification task for sentiment analysis.

B. Importance of sentiment analysis in finance

- Sentiment analysis plays a crucial role in understanding market dynamics and investor sentiment.
- Accurate sentiment classification enables better-informed financial investment decisions.

C. Objectives of the project

- Develop and compare models for sentiment analysis on financial tweets.
- Evaluate the effectiveness of different approaches in capturing and classifying financial sentiment.

# II. Data Preparation

A. Data Format

- The dataset is organized as a `DatasetDict` with train and validation splits:

```python
DatasetDict({
    train: Dataset({
        features: ['text', 'label'],
        num_rows: 9543
    })
    validation: Dataset({
        features: ['text', 'label'],
        num_rows: 2388
    })
})
```

- Use the 'train' split for model training and the 'validation' split for final model performance evaluation.

# III. Model Development

## A. BERT as Embedding + Logistic Regression

1. Tokenization and Embedding using BERT
   - Use BERT to produce contextual embeddings that capture nuanced financial language.
   - Model configuration:

```python
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained('bert-base-cased')
bert_model = AutoModel.from_pretrained('bert-base-cased')
```

2. Logistic Regression Model
   - Fit a logistic regression classifier on the BERT embeddings to predict sentiment.
   - Model configuration:

```python
from sklearn.linear_model import LogisticRegression

# Default logistic regression model from scikit-learn
logistic_regression_model = LogisticRegression()
```

## B. Fine-tune BERT

1. Model Architecture
   - Add a three-way classification head on top of BERT for financial sentiment analysis.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained('bert-base-cased')
pretrained_bert_model = AutoModelForSequenceClassification.from_pretrained('bert-base-cased', num_labels=3)
```

2. Fine-tuning Process
   - Train the model end to end on the annotated Twitter Financial News dataset.

## C. Off-the-shelf Pretrained Model from Hugging Face

1. Selecting a Pretrained Model
   - Choose the off-the-shelf model 'google/flan-t5-base' for sentiment analysis.
   - Model information:
     - Model name: 'google/flan-t5-base'
     - Brief description: an instruction-tuned model based on the T5 encoder-decoder architecture, pretrained on a wide variety of tasks, including sentiment analysis.

```python
from transformers import AutoModelForSeq2SeqLM

# Specify the off-the-shelf model name
model_name = 'google/flan-t5-base'

# FLAN-T5 is a sequence-to-sequence model, so load it with the seq2seq class
off_the_shelf_model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
```

2. Integration with the Sentiment Analysis Task
   - Use a zero-shot learning approach with a predefined prompt for sentiment analysis.
   - Prompt configuration:

```python
prompt = """
Classify the sentiment of the given finance-related tweet and return the numerical code:
0 for "Bearish," 1 for "Bullish," and 2 for "Neutral."
"""
```

   - Make predictions by prompting the model zero-shot, as sketched below.
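Transformers has no built-in pipeline called `zero_shot_classifier`, and its `zero-shot-classification` pipeline expects an NLI-style model rather than FLAN-T5. The wrapper below is a minimal sketch of what the classifier used in the example could look like, assuming the `text2text-generation` pipeline; the helper name `zero_shot_classifier` is our own, not a library function.

```python
from transformers import pipeline

# Text-to-text generation pipeline around FLAN-T5
generator = pipeline('text2text-generation', model='google/flan-t5-base')

def zero_shot_classifier(tweet, prompt):
    """Prepend the instruction prompt to the tweet and return the model's raw output."""
    output = generator(f"{prompt}\nTweet: {tweet}", max_new_tokens=5)
    # The pipeline returns a list of dicts with a 'generated_text' field
    return output[0]['generated_text']
```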
```python
# Example usage with a finance-related tweet
tweet = "Stocks are expected to rise sharply in the coming quarter."
prediction = zero_shot_classifier(tweet, prompt)
```

## D. Model Trained by Users

- DeBERTa for Financial Sentiment Classification
  - Model information:
    - Model name: 'RashidNLP/finance-sentiment-deberta-base'
    - Brief description: a DeBERTa model trained on over 1 million reviews from Amazon's multi-reviews dataset and fine-tuned on four finance datasets with sentiment labels (financial_phrasebank, chiapudding/kaggle-financial-sentiment, zeroshot/twitter-financial-news-sentiment, FinanceInc/auditor_sentiment).

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load the user-trained DeBERTa model and its tokenizer for sequence classification
user_model_name = 'RashidNLP/finance-sentiment-deberta-base'
user_tokenizer = AutoTokenizer.from_pretrained(user_model_name)
user_trained_model = AutoModelForSequenceClassification.from_pretrained(user_model_name)
```

- Integration with the Sentiment Analysis Task
  - Use the user-trained DeBERTa model for sentiment analysis on financial tweets:

```python
# Example usage: tokenize the tweet, run the model, and take the argmax over the logits
inputs = user_tokenizer(tweet, return_tensors='pt')
logits = user_trained_model(**inputs).logits
user_prediction = logits.argmax(dim=-1).item()
```

# IV. Model Comparison

A. Evaluation Metrics

1. Accuracy, Precision, Recall, F1-score
   - Use accuracy as the primary metric to evaluate model performance (a sketch of the computation appears at the end of this section).

B. Comparative Analysis

| Method           | Accuracy | Num_parameters (M) | Model_size (MB) |
| ---------------- | -------- | ------------------ | --------------- |
| BERT_lr          | 0.79     | 108                | 413             |
| Finetuned_BERT   | 0.87     | 108                | 413             |
| LLM_zero_shot    | 0.65     | 247                | 944             |
| Pretrained_model | 0.90     | 184                | 413             |

1. **BERT as Embedding + Logistic Regression (BERT_lr):**
   - Achieved an accuracy of 79%.
   - Used frozen BERT embeddings as features for a logistic regression classifier.
   - Model size of 413 MB, with 108 million parameters.

2. **Fine-tuned BERT (Finetuned_BERT):**
   - Outperformed BERT_lr with an accuracy of 87%.
   - Used the same BERT architecture, fine-tuned on the financial sentiment dataset.
   - Same model size and parameter count as BERT_lr.

3. **LLM Zero-Shot Learning (LLM_zero_shot):**
   - Showed the lowest accuracy, 65%.
   - Prompted the off-the-shelf 'google/flan-t5-base' model in a zero-shot setting.
   - Largest model size (944 MB) of the compared models.

4. **User-Trained Pretrained Model (Pretrained_model):**
   - Achieved the highest accuracy, 90%.
   - Used the 'RashidNLP/finance-sentiment-deberta-base' model for sentiment classification.
   - 184 million parameters with a 413 MB model size.

- **Insights:**
  - The fine-tuned BERT model outperforms BERT as Embedding + Logistic Regression, indicating the effectiveness of fine-tuning on financial sentiment tasks.
  - The user-trained model, 'RashidNLP/finance-sentiment-deberta-base', achieved the highest accuracy, showcasing the value of large-scale pretraining combined with domain-specific fine-tuning.
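As a concrete illustration of how the table's numbers can be produced, the sketch below computes validation accuracy for the fine-tuned BERT model and estimates its parameter count and fp32 size. It assumes the `tokenizer` and `pretrained_bert_model` from Section III.B are in scope after fine-tuning; the loop is a minimal version, not necessarily the exact evaluation code behind the reported results.

```python
import torch
from datasets import load_dataset

# Validation split of the Twitter Financial News dataset
val = load_dataset('zeroshot/twitter-financial-news-sentiment')['validation']

# Accuracy: fraction of tweets whose predicted label matches the gold label
correct = 0
for example in val:
    inputs = tokenizer(example['text'], return_tensors='pt', truncation=True)
    with torch.no_grad():
        logits = pretrained_bert_model(**inputs).logits
    correct += int(logits.argmax(dim=-1).item() == example['label'])
accuracy = correct / len(val)

# Parameter count and approximate fp32 model size (4 bytes per parameter)
num_params = sum(p.numel() for p in pretrained_bert_model.parameters())
size_mb = num_params * 4 / (1024 ** 2)
print(f"accuracy={accuracy:.2f}, params={num_params / 1e6:.0f}M, size={size_mb:.0f}MB")
```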
# V. Conclusion

A. Summary of Findings

- In this project, we explored and compared methods for sentiment analysis on financial tweets using the Twitter Financial News dataset.
- Four methods were implemented and evaluated: BERT as Embedding + Logistic Regression, Fine-tuned BERT, LLM Zero-Shot Learning (with 'google/flan-t5-base'), and a user-trained pretrained model ('RashidNLP/finance-sentiment-deberta-base').
- Performance metrics, primarily accuracy, were analyzed for each method to assess its effectiveness in capturing financial sentiment.

B. Implications for Finance

- The adoption of Large Language Models (LLMs) is a prevailing trend in recent NLP advancements. While this project used a smaller-scale LLM ('google/flan-t5-base') rather than larger models such as GPT-3 or GPT-4, it offers a glimpse into the potential of in-context learning. Future improvements may incorporate more powerful LLMs, with a focus on enhancing few-shot learning capabilities.

C. Model Performance Insights

- Fine-tuned BERT outperformed the BERT as Embedding + Logistic Regression method. This suggests that adapting BERT through fine-tuning on finance-specific data improves its ability to capture nuanced financial sentiment, since the updated weights yield a more contextually relevant representation of financial language.

D. Considerations for Deployment

- When deploying large ML or DL models, the number of parameters and the model size become critical considerations. The user-trained model ('RashidNLP/finance-sentiment-deberta-base') achieved the highest accuracy at a BERT-scale footprint (184 million parameters, 413 MB), far smaller than the 944 MB FLAN-T5 checkpoint, highlighting the importance of model efficiency for practical deployment.

E. Recommendations and Future Work

- Future work could leverage larger LLMs, such as GPT-3 or GPT-4, to better capture intricate financial contexts in a few-shot learning setting.
- The success of fine-tuned BERT encourages further exploration of domain-specific fine-tuning strategies and hyperparameter adjustments for financial sentiment analysis.
- Deployment efforts should optimize models for efficiency without compromising performance, potentially through model compression techniques.

# VI. References

- [Dataset](https://huggingface.co/datasets/zeroshot/twitter-financial-news-sentiment)
- [Pretrained Model](https://huggingface.co/RashidNLP/Finance-Sentiment-Classification)