Implementing BiLSTM-CRF and BiGRU-CRF based segmentation algorithms to parse Thai text.

abanuelo/CS224N-PROJECT

CS224N-PROJECT

Thai is written without explicit word boundaries, so most word-based models cannot be applied to it directly. In this paper, we tackle this problem by implementing BiLSTM-CRF and BiGRU-CRF based segmentation algorithms to parse Thai text. Our model achieves an F1 score of 94.78 (micro) and 96.26 (macro). It outperforms previous models on micro-averaged F1 and is comparable to them on macro-averaged F1. The model also works well on small datasets, but struggles with named entities.
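To make the task concrete, here is a minimal sketch of how word segmentation can be cast as character-level tagging, which is the usual setup for BiLSTM-CRF style segmenters. The two-label B/I scheme (B = first character of a word, I = any other character) and the function names `words_to_labels` and `labels_to_words` are assumptions for illustration; the label set actually used in this project is not stated here.

```python
def words_to_labels(words):
    """Flatten segmented words into characters plus B/I labels.

    Assumed scheme: "B" marks the first character of a word,
    "I" marks every other character.
    """
    chars, labels = [], []
    for word in words:
        for i, ch in enumerate(word):
            chars.append(ch)
            labels.append("B" if i == 0 else "I")
    return chars, labels


def labels_to_words(chars, labels):
    """Rebuild words from characters and predicted B/I labels."""
    words = []
    for ch, label in zip(chars, labels):
        # Start a new word on "B" (or when no word has been opened yet).
        if label == "B" or not words:
            words.append(ch)
        else:
            words[-1] += ch
    return words


# Hypothetical example sentence, already segmented into three words.
segmented = ["ฉัน", "รัก", "แมว"]
chars, labels = words_to_labels(segmented)
restored = labels_to_words(chars, labels)
```

A tagger trained on such pairs predicts one label per character, and `labels_to_words` turns its predictions back into a segmented sentence.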

Model

[Image: model diagram]
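In a BiLSTM-CRF or BiGRU-CRF tagger, the recurrent encoder produces a score for each tag at each character, and the CRF layer adds tag-to-tag transition scores. The best tag sequence is commonly found with Viterbi decoding. The sketch below is a generic, dependency-free version of that step, not the code from this repository; the function name `viterbi` and the toy scores are assumptions for illustration.

```python
def viterbi(emissions, transitions):
    """Return the highest-scoring tag path and its score.

    emissions[t][k]   : score of tag k at position t (from the encoder)
    transitions[i][j] : score of moving from tag i to tag j (CRF layer)
    """
    if not emissions:
        return [], 0.0
    n_tags = len(transitions)
    scores = list(emissions[0])  # best score of any path ending in each tag
    backptrs = []                # best previous tag for each step and tag
    for emit in emissions[1:]:
        new_scores, ptrs = [], []
        for j in range(n_tags):
            best_i = max(range(n_tags),
                         key=lambda i: scores[i] + transitions[i][j])
            new_scores.append(scores[best_i] + transitions[best_i][j] + emit[j])
            ptrs.append(best_i)
        scores = new_scores
        backptrs.append(ptrs)
    # Follow the back-pointers from the best final tag.
    best_last = max(range(n_tags), key=lambda j: scores[j])
    path = [best_last]
    for ptrs in reversed(backptrs):
        path.append(ptrs[path[-1]])
    path.reverse()
    return path, scores[best_last]


# Toy scores for two tags (0 = "B", 1 = "I"); values are made up.
emissions = [[2.0, 0.0], [0.0, 1.0], [0.5, 0.6]]
transitions = [[-1.0, 0.5], [0.0, 0.0]]
path, score = viterbi(emissions, transitions)
```

Decoding the whole sequence jointly lets the transition scores penalize unlikely tag pairs, which per-character argmax cannot do.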

Results

The table below summarizes the character-level F1 metrics for our project.

[Image: results table]
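For readers unfamiliar with the two averages, the following sketch computes micro- and macro-averaged F1 over character-level labels using the standard definitions. How the averages reported for this project were computed is not stated here, so treat the function `micro_macro_f1` and the B/I labels as assumptions for illustration; the toy data is made up and does not reproduce the reported scores.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(gold, pred):
    """Return (micro_f1, macro_f1) for two equal-length label sequences."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[g] += 1  # gold class gains a false negative
    classes = sorted(set(gold) | set(pred))
    if not classes:
        return 0.0, 0.0
    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in classes) / len(classes)
    return micro, macro


# Toy example: one of nine characters is mislabeled.
gold = list("BIIBIIBII")
pred = list("BIIBIBBII")
micro, macro = micro_macro_f1(gold, pred)
```

Micro-averaging weights every character equally, while macro-averaging weights every class equally, so the two can diverge when classes are imbalanced.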
