Parts of Speech Predictor

September 2022 – November 2022 | AI / ML

Java, HMM, NLP

A part-of-speech prediction model developed for CS 10, Object Oriented Programming, at Dartmouth. It is a Hidden Markov Model trained on files containing sentences and their corresponding part-of-speech tags.

Two hash maps, observations and transitions, store the frequency counts. Observations records how often each word appears as a given part of speech, and transitions records how often one part of speech follows another. These counts are converted to log probabilities, so that the score of a tag sequence is a sum rather than a product of many small numbers. The Viterbi algorithm then computes the best score for each state at each word and backtraces from the best-scoring state for the last observation to recover the most probable sequence of parts of speech. The model achieved over 93% accuracy.
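
The approach described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class name PosHmmSketch, the "#" start state, and the fixed penalty for unseen words are all assumptions made for the example, and the training data is passed in as arrays rather than read from files.

```java
import java.util.*;

/** Hypothetical sketch of an HMM part-of-speech tagger; all names are illustrative. */
class PosHmmSketch {
    static final String START = "#";      // assumed start-of-sentence state
    static final double UNSEEN = -100.0;  // assumed log-score penalty for unseen words

    // observations: tag -> (word -> score); transitions: tag -> (next tag -> score).
    // Both hold raw counts during training and log probabilities afterwards.
    final Map<String, Map<String, Double>> observations = new HashMap<>();
    final Map<String, Map<String, Double>> transitions = new HashMap<>();

    /** Counts occurrences in parallel word/tag sentences, then converts them to log probabilities. */
    void train(List<String[]> sentences, List<String[]> tags) {
        for (int s = 0; s < sentences.size(); s++) {
            String prev = START;
            for (int i = 0; i < sentences.get(s).length; i++) {
                String word = sentences.get(s)[i].toLowerCase();
                String tag = tags.get(s)[i];
                observations.computeIfAbsent(tag, k -> new HashMap<>()).merge(word, 1.0, Double::sum);
                transitions.computeIfAbsent(prev, k -> new HashMap<>()).merge(tag, 1.0, Double::sum);
                prev = tag;
            }
        }
        normalize(observations);
        normalize(transitions);
    }

    /** Replaces each count with log(count / row total). */
    static void normalize(Map<String, Map<String, Double>> table) {
        for (Map<String, Double> row : table.values()) {
            double sum = 0;
            for (double count : row.values()) sum += count;
            final double total = sum;
            row.replaceAll((key, count) -> Math.log(count / total));
        }
    }

    /** Viterbi decoding: returns the highest-scoring tag sequence for the given words. */
    List<String> viterbi(String[] words) {
        List<Map<String, String>> backPointers = new ArrayList<>();
        Map<String, Double> scores = new HashMap<>();
        scores.put(START, 0.0);
        for (String raw : words) {
            String word = raw.toLowerCase();
            Map<String, Double> nextScores = new HashMap<>();
            Map<String, String> back = new HashMap<>();
            for (Map.Entry<String, Double> cur : scores.entrySet()) {
                Map<String, Double> outgoing =
                    transitions.getOrDefault(cur.getKey(), Collections.<String, Double>emptyMap());
                for (Map.Entry<String, Double> tr : outgoing.entrySet()) {
                    String next = tr.getKey();
                    double obs = observations
                        .getOrDefault(next, Collections.<String, Double>emptyMap())
                        .getOrDefault(word, UNSEEN);
                    double score = cur.getValue() + tr.getValue() + obs;
                    // Keep only the best-scoring way of reaching each next state.
                    if (!nextScores.containsKey(next) || score > nextScores.get(next)) {
                        nextScores.put(next, score);
                        back.put(next, cur.getKey());
                    }
                }
            }
            backPointers.add(back);
            scores = nextScores;
        }
        // Backtrace from the best-scoring state for the last observation.
        String best = null;
        for (String state : scores.keySet()) {
            if (best == null || scores.get(state) > scores.get(best)) best = state;
        }
        LinkedList<String> path = new LinkedList<>();
        for (int i = words.length - 1; i >= 0 && best != null; i--) {
            path.addFirst(best);
            best = backPointers.get(i).get(best);
        }
        return path;
    }
}
```

In this sketch, a word never seen with a given tag receives the fixed UNSEEN penalty instead of ruling that tag out. If no trained transition can continue a sentence, viterbi returns an empty list; a fuller implementation would need to decide how to handle that case.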