1. Natural Language Processing (NLP)
2. Annotated Bibliography
3. Future Work

Natural Language Processing (NLP)

Document Similarity

I am collaborating with Professor Buxton at the University of Illinois (Springfield) to develop a language model that measures document similarity across hundreds of documents split into two groups. We are comparing NLP models (BERT, Longformer, SBERT) to determine which performs best on technical (scientific) document similarity.

The project uses Python and PyTorch to load data from two sets of files (A and B) and test which files from A most closely match the files from B. I wrote custom classes to load and clean the data. The code runs on Google Colab to fine-tune existing models on the specific technical files.
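The A-to-B matching step can be illustrated with a minimal sketch. This is not the project's code: the function names (`clean`, `embed`, `cosine`, `best_matches`) and the toy documents are hypothetical, and the embedding is a simple term-count vector standing in for the embeddings a fine-tuned BERT, Longformer, or SBERT model would produce. Only the matching logic (cosine similarity, then picking the highest-scoring B file for each A file) is the point here.

```python
import math
import re
from collections import Counter


def clean(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Stand-in embedding (assumption): a sparse term-count vector.

    In the project described above, this step would instead come from a
    fine-tuned model such as BERT, Longformer, or SBERT.
    """
    return Counter(clean(text))


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty document matches nothing
    return dot / (norm_u * norm_v)


def best_matches(docs_a, docs_b):
    """For each document in A, find the closest document in B and its score."""
    vecs_b = {name: embed(text) for name, text in docs_b.items()}
    matches = {}
    for name_a, text_a in docs_a.items():
        vec_a = embed(text_a)
        scores = {name_b: cosine(vec_a, vec_b) for name_b, vec_b in vecs_b.items()}
        best = max(scores, key=scores.get)
        matches[name_a] = (best, scores[best])
    return matches


# Hypothetical toy documents, for illustration only.
docs_a = {
    "a1.txt": "Thermal conductivity of copper alloys at low temperature",
    "a2.txt": "Transformer models for long document classification",
}
docs_b = {
    "b1.txt": "Long document classification with sparse attention transformer models",
    "b2.txt": "Measuring thermal conductivity in copper and aluminium alloys",
}

matches = best_matches(docs_a, docs_b)
for name_a, (name_b, score) in matches.items():
    print(f"{name_a} -> {name_b} (cosine {score:.2f})")
```

Swapping `embed` for a function that returns model embeddings would leave the matching loop unchanged, which is one reason to keep the embedding and comparison steps separate.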

Annotated Bibliography



Future Work

I will continue my NLP and AI research to support the creation of a smart game. A game is smart if it can learn enough about its players to teach them something about their life purpose.

Copyright © 1971-2023
Matt Pavlik
Centerville, Ohio 45459