Lexis

Team 2b|!2b's project for the HackFMI8 hackathon

###Dataset mining - more than 150k unique texts

Downloaded more than 100k tweets using the Twitter API

An additional 50k texts from other sources

All data is labeled

###Data preprocessing

Removed all hashtags, links, user mentions, and retweets

Removed meaningless data

Removed all stopwords
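
The cleaning steps above could look roughly like the following sketch. It is not the project's actual code: the regular expressions, the tiny `STOPWORDS` set, and the `clean_tweet` helper are all hypothetical, and it assumes that retweets are dropped entirely and that "meaningless data" means texts with no tokens left after cleaning.

```python
import re

# Hypothetical, deliberately tiny stopword list; the project's real list is not given.
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on", "for", "this", "it"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
RETWEET_RE = re.compile(r"^RT\b")


def clean_tweet(text):
    """Return the cleaned token list, or None if the tweet should be discarded."""
    text = text.strip()
    if RETWEET_RE.match(text):
        return None  # assumption: retweets are dropped entirely
    # Strip links first so that "#" or "@" inside a URL goes away with it.
    for pattern in (URL_RE, MENTION_RE, HASHTAG_RE):
        text = pattern.sub(" ", text)
    tokens = re.findall(r"[a-z']+", text.lower())
    tokens = [t for t in tokens if t not in STOPWORDS]
    # Assumption: a text with no tokens left counts as "meaningless data".
    return tokens or None


sample = clean_tweet("Loving the new phone from @acme! #happy http://t.co/xyz")
```

Returning `None` for discarded tweets lets the caller filter the dataset in one pass, for example `[t for t in map(clean_tweet, raw) if t]`.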

###Algorithm

"Bag of Words" vectorization

Implemented different classification algorithms (SVC, Naive Bayes)

Compared and tuned the results

Computed the result for a sample input and graphed the class probabilities

Worked out how to export and import classifiers

API - Python

GUI - HTML and JS
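
To make the algorithm steps above concrete, here is a minimal pure-Python sketch of bag-of-words vectorization, a multinomial Naive Bayes classifier with add-one smoothing, per-class probabilities for a sample input, and an export/import round trip. It is an illustration under assumptions, not the project's implementation: the function names, the toy training data, and the `pos`/`neg` labels are hypothetical (the README does not say which classes were used), and `pickle` is just one common way to serialize a trained model.

```python
import math
import pickle
from collections import Counter


def vectorize(tokens, vocabulary):
    """Bag of words: count how often each vocabulary word occurs in the tokens."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def train(docs, labels):
    """Fit multinomial Naive Bayes with add-one (Laplace) smoothing."""
    vocabulary = sorted({t for doc in docs for t in doc})
    model = {"vocabulary": vocabulary, "log_prior": {}, "log_likelihood": {}}
    for c in sorted(set(labels)):
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        model["log_prior"][c] = math.log(len(class_docs) / len(docs))
        totals = [0] * len(vocabulary)
        for d in class_docs:
            for i, n in enumerate(vectorize(d, vocabulary)):
                totals[i] += n
        denom = sum(totals) + len(vocabulary)
        model["log_likelihood"][c] = [math.log((n + 1) / denom) for n in totals]
    return model


def predict_proba(model, tokens):
    """Return a dict mapping each class to its normalized probability."""
    x = vectorize(tokens, model["vocabulary"])
    scores = {
        c: model["log_prior"][c] + sum(n * w for n, w in zip(x, weights))
        for c, weights in model["log_likelihood"].items()
    }
    top = max(scores.values())  # subtract the maximum for numerical stability
    exp = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exp.values())
    return {c: v / total for c, v in exp.items()}


# Toy, made-up training data with placeholder labels.
docs = [["great", "phone"], ["love", "great", "day"], ["bad", "phone"], ["hate", "bad", "day"]]
labels = ["pos", "pos", "neg", "neg"]

model = train(docs, labels)
probs = predict_proba(model, ["great", "love"])

# Export and import: the model is a plain dict, so it serializes directly.
restored = pickle.loads(pickle.dumps(model))
```

Keeping the model as a plain dict rather than a class instance means it can be unpickled without importing any project code. The dict returned by `predict_proba` is also the kind of structure a front end could graph directly.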
