bradley-p/sentiment-analysis

Tweet Sentiment Analysis project

Authors

Bradley Payne

Nithya Alavala


Project Summary

Sentiment analysis is an important task in many domains, such as gathering product feedback and gauging how the general public feels about a given company, product, or topic. Many people share their ideas, concerns, movie reviews, and other opinions on Twitter through tweets. The goal of this project is to take the text of tweets and classify each one as negative, neutral, or positive.

One area that we would like to explore is the use of emojis in sentiment analysis.


Datasets


Models

We used the following methods:

  • Random Forest
  • Decision Tree
  • Naive Bayes
  • SVM classifier
  • LSTM
  • BERT
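
To make the three-class setup concrete, here is a minimal sketch of one of the methods listed above, Naive Bayes, applied to tokenized text. It is not the project's code: the project fits its Naive Bayes model with sklearn, while this version is written in plain Python so it runs without dependencies. The class name `TinyNaiveBayes` and the toy tweets are made up for illustration.

```python
import math
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, docs, labels):
        # Count documents per class and word occurrences per class.
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                if tok in self.vocab:  # skip words never seen in training
                    score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up toy data, not taken from the project's dataset.
train_docs = [
    ["love", "this", "phone"],
    ["great", "service", "love", "it"],
    ["hate", "this", "update"],
    ["terrible", "service", "hate", "it"],
    ["meeting", "at", "noon"],
    ["flight", "lands", "at", "noon"],
]
train_labels = ["positive", "positive", "negative", "negative", "neutral", "neutral"]

model = TinyNaiveBayes().fit(train_docs, train_labels)
prediction = model.predict(["love", "the", "service"])
print(prediction)
```

The same fit/predict shape applies to the other classical models in the list; only the estimator changes.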

SRC files

Below is a brief summary of each source code file:

    • Contains the LSTM with the added attention layer
    • Implements a version of BERT from the huggingface transformers library
    • Trains BERT for 2 epochs
    • Evaluates on the test set
    • Our first notebook to explore the data
    • Fits the random forest as proof of concept
    • Defines a data preprocessing pipeline that:
      • Loads the data
      • Cleans
      • Tokenizes
      • Splits into testing and training sets
    • Fits an LSTM model on:
      • Raw data
      • Preprocessed data
    • Uses data_pipeline.py functionality to get training and testing set
    • Fits sklearn models:
      • Naive Bayes
      • SVM
      • Decision Tree
      • Random Forest
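
The preprocessing steps listed above (load, clean, tokenize, split) can be sketched as follows. This is an assumption-laden illustration, not the contents of data_pipeline.py: the function names, the `text` and `label` column names, the cleaning rules, and the 25% test fraction are all hypothetical. Emojis are kept as their own tokens here, since the project is interested in their role in sentiment.

```python
import csv
import io
import random
import re


def load_rows(handle):
    """Read (text, label) pairs from a CSV; column names are assumed."""
    return [(row["text"], row["label"]) for row in csv.DictReader(handle)]


def clean(text):
    """Lowercase, then drop URLs, @mentions, and the '#' of hashtags."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"@\w+", " ", text)
    text = text.replace("#", " ")
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Word tokens, plus every other non-space character (e.g. an emoji) as its own token."""
    return re.findall(r"[a-z0-9']+|[^\sa-z0-9']", text)


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle deterministically and hold out a fraction for testing."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = max(1, int(len(rows) * test_fraction))
    return rows[n_test:], rows[:n_test]


# Made-up sample standing in for the real dataset file.
sample_csv = io.StringIO(
    "text,label\n"
    '"Loving the new update!! https://t.co/abc @support #happy",positive\n'
    '"Worst flight ever 😡",negative\n'
    '"Meeting moved to noon",neutral\n'
    '"love it 😍!",positive\n'
)

rows = [(tokenize(clean(text)), label) for text, label in load_rows(sample_csv)]
train, test = train_test_split(rows)
print(len(train), len(test))
```

The resulting token lists can then be passed to a vectorizer or embedding layer before fitting any of the models.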
