Improve the accuracy of keyword search with artificial intelligence

High-Level Project Summary

We designed a framework that scores the similarity between document titles and search keywords, so that researchers can find the documents they need more accurately.

Detailed Project Description

What are the details of this project?

We plan to use a pre-trained RoBERTa model to perform named entity recognition (NER) on the document titles in the database, then generate embeddings of the extracted entities and of the searched keywords with a pre-trained Sentence-BERT model. Finally, we calculate the cosine similarity between the two embeddings and present the similarity results to the researcher.
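As a rough illustration of the similarity step described above, the following Python sketch ranks titles by the cosine similarity between a keyword and the entities extracted from each title. It is only a sketch under stated assumptions: the entities are assumed to have been extracted already by the NER step, `toy_embed` is a hypothetical bag-of-words stand-in for the Sentence-BERT embeddings, the example titles are made up, and scoring a title by its best-matching entity is an illustrative choice rather than a settled part of the design.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Tuple

Vector = Dict[str, float]


def toy_embed(text: str) -> Vector:
    """Hypothetical stand-in embedder: word counts over lowercase tokens.

    The real framework would use Sentence-BERT embeddings here instead.
    """
    return {word: float(count) for word, count in Counter(text.lower().split()).items()}


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two sparse vectors; 0.0 if either vector is empty."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_titles(
    keyword: str,
    title_entities: Dict[str, List[str]],
    embed: Callable[[str], Vector] = toy_embed,
) -> List[Tuple[str, float]]:
    """Rank titles by the best similarity between the keyword and any entity.

    `title_entities` maps each title to the entities found in it by NER
    (assumed to be precomputed). Titles without entities score 0.0.
    """
    query_vec = embed(keyword)
    scored = []
    for title, entities in title_entities.items():
        best = max(
            (cosine_similarity(query_vec, embed(entity)) for entity in entities),
            default=0.0,
        )
        scored.append((title, best))
    # Highest similarity first; ties keep their original order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Made-up example data (not taken from NTRS).
title_entities = {
    "Thermal analysis of the Hubble Space Telescope": ["Hubble Space Telescope"],
    "Mars Reconnaissance Orbiter navigation": ["Mars Reconnaissance Orbiter"],
    "Untitled memo": [],
}
ranking = rank_titles("space telescope", title_entities)
for title, score in ranking:
    print(f"{score:.3f}  {title}")
```

In the intended framework, the `embed` argument would presumably wrap a pre-trained Sentence-BERT model, so the same ranking logic would run over dense sentence embeddings instead of word counts.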

What is the purpose of this project?

We hope that this framework will be more effective than plain keyword matching at finding the results the researcher is looking for.

What are the tools used in this project?

The framework will be written in Python, using modules such as PyTorch and scikit-learn.

Space Agency Data

In the NASA Technical Reports Server (NTRS), we found that the subjects of many document titles are proper terms, so we want to analyze the similarity of these proper terms to help with search.

Hackathon Journey

This challenge was fun, and if the project succeeds, it could also help improve the search engines people use in everyday life.

During the challenge, we learned how to discuss and shape an idea quickly. When one of us did not understand something, other group members could fill in the gap, so we learned a lot in the process.

Tags

# semi-finished project