High-Level Project Summary
We designed a framework that scores the similarity between document titles and search keywords, so that researchers can find the documents they need more accurately.
Link to Final Project
Link to Project "Demo"
Detailed Project Description
What are the details of this project?
We plan to use a pre-trained RoBERTa model to perform named entity recognition (NER) on the document titles in the database, then generate embeddings for the extracted entities and for the search keywords with a pre-trained Sentence-BERT model. Finally, we compute the cosine similarity between the entity embeddings and the keyword embeddings and present the similarity results to the researcher.
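The scoring step described above can be sketched as follows. This is a minimal illustration, not the project's implementation: `rank_titles`, `toy_entities`, and `toy_embed` are hypothetical names, and the two toy functions are crude stand-ins for the RoBERTa NER model and the Sentence-BERT encoder so that the sketch runs without downloading any model. Scoring a title by its best-matching entity (the `max`) and falling back to the whole title when no entity is found are also assumptions.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_titles(
    keyword: str,
    titles: Sequence[str],
    extract_entities: Callable[[str], List[str]],
    embed: Callable[[str], Vector],
) -> List[Tuple[str, float]]:
    """Score each title against the keyword and return (title, score), best first."""
    query_vec = embed(keyword)
    scored = []
    for title in titles:
        # Assumption: fall back to the whole title when no entity is found.
        entities = extract_entities(title) or [title]
        # Assumption: a title is scored by its best-matching entity.
        score = max(cosine_similarity(query_vec, embed(e)) for e in entities)
        scored.append((title, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def toy_entities(title: str) -> List[str]:
    """Stand-in for NER: treat every capitalized word as an entity."""
    return [word for word in title.split() if word[:1].isupper()]


def toy_embed(text: str) -> List[float]:
    """Stand-in for a sentence encoder: a 26-dimensional letter-count vector."""
    lowered = text.lower()
    return [float(lowered.count(ch)) for ch in "abcdefghijklmnopqrstuvwxyz"]


if __name__ == "__main__":
    example_titles = [
        "Survey of Mars Rover Wheels",
        "Thermal Analysis of Apollo Heat Shield",
    ]
    for title, score in rank_titles("Apollo", example_titles, toy_entities, toy_embed):
        print(f"{score:.3f}  {title}")
```

Because the NER step and the encoder are passed in as plain callables, the toy functions could later be swapped for real model wrappers without changing the ranking logic.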
What is the purpose of this project?
We hope that this framework will make it easier for researchers to find the results they are actually looking for.
What are the tools used in this project?
The framework will be written in Python, using libraries such as PyTorch and scikit-learn.
Space Agency Data
While browsing the NASA Technical Reports Server (NTRS), we found that the subjects of many document titles are proper nouns or specialized terms, so we wanted to measure similarity against those terms to help with search.
Hackathon Journey
This challenge was fun, and if the project succeeds, it could also improve the search engines people use in everyday life.
During the challenge, we learned how to discuss and shape an idea quickly. When one of us lacked some piece of knowledge, other group members could fill the gap, so we learned a lot along the way.
References
Tags
# semi-finished project

