High-Level Project Summary
Our group's objective is to make it easier to locate relevant documents in the NASA Technical Reports Server (NTRS). Our project has two main components. First, we created an AI model that automatically assigns keywords to PDF documents in the NTRS. To do this, we collected PDF files and their annotations using the NTRS API. The PDF files are preprocessed and split into a training dataset and a test dataset. The training dataset is used to fine-tune a pre-trained NLP model, and the resulting model is then used to generate keywords for the documents in the test dataset. Second, we created a web app that allows users to search and navigate documents based on the generated keywords.
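The split into training and test datasets can be sketched as follows. This is a minimal illustration, not our exact code: the function name `split_documents`, the 20% test fraction, the seed, and the document IDs are all assumptions made for the example.

```python
import random


def split_documents(doc_ids, test_fraction=0.2, seed=42):
    """Shuffle document IDs reproducibly and split them into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    # Sort first so the result does not depend on the order of the input.
    shuffled = sorted(doc_ids)
    # A dedicated Random instance keeps the split repeatable for a given seed.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical document IDs, for illustration only.
doc_ids = [f"doc-{i:03d}" for i in range(10)]
train_ids, test_ids = split_documents(doc_ids)
print(len(train_ids), len(test_ids))
```

Fixing the seed means the same documents land in the test dataset on every run, so the generated keywords can be compared across model versions.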
Link to Final Project
Link to Project "Demo"
Detailed Project Description
Our project uses AI to automatically tag NTRS documents with relevant keywords. To demonstrate our results, we also created a web application that searches documents by the keywords we generated. We used the following tools for this project: Python, RoBERTa-base (https://huggingface.co/roberta-base), KeyBERT (https://github.com/MaartenGr/KeyBERT/), Firebase, and jQuery. We hope that this project makes it easier to find relevant content in the NTRS.
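Searching documents by keyword can be illustrated with a small inverted index. This is only a sketch of the idea in Python: the web app itself is built with Firebase and jQuery, and the function names, the comma-separated query format, and the sample documents and keywords below are assumptions made for the example.

```python
from collections import defaultdict


def build_keyword_index(doc_keywords):
    """Map each lowercased keyword to the set of document IDs tagged with it."""
    index = defaultdict(set)
    for doc_id, keywords in doc_keywords.items():
        for keyword in keywords:
            index[keyword.strip().lower()].add(doc_id)
    return index


def search(index, query):
    """Return sorted IDs of documents tagged with every comma-separated keyword."""
    terms = [t.strip().lower() for t in query.split(",") if t.strip()]
    if not terms:
        return []
    # index.get avoids creating empty entries for unknown keywords.
    matches = set.intersection(*(index.get(t, set()) for t in terms))
    return sorted(matches)


# Hypothetical documents and keywords, for illustration only.
doc_keywords = {
    "doc-001": ["Mars rover", "thermal control"],
    "doc-002": ["thermal control", "heat shield"],
    "doc-003": ["Mars rover"],
}
index = build_keyword_index(doc_keywords)
print(search(index, "thermal control"))
print(search(index, "mars rover, thermal control"))
```

Lowercasing keywords when building the index and when parsing the query keeps the search case-insensitive, and requiring every term to match narrows results as the user adds keywords.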
Space Agency Data
NASA (NASA Technical Reports Server)
Hackathon Journey
It was a fun and fulfilling experience.
References
NTRS API (https://ntrs.nasa.gov/api/openapi/#/)
Python (https://www.python.org/)
RoBERTa-base (https://huggingface.co/roberta-base)
KeyBERT (https://github.com/MaartenGr/KeyBERT/)
Firebase (https://firebase.google.com/)
jQuery (https://jquery.com/)
Tags
ntrs

