High-Level Project Summary
We developed a feature for the NASA Technical Reports Server (NTRS) that lets the user (the researcher) choose between two search modes: a diverged search, driven by a satisfier algorithm that returns every paper containing the searched keyword at least once, and a converged search, driven by a totalizer algorithm that sorts papers by the number of times the keyword is repeated. The idea came from an interview with Sonia Botta (MSc in space exploration systems), who encouraged us to think about researchers who are not comfortable with search engines and who want either a diverged search while brainstorming or a converged search once they are sure of their research topic.
Link to Final Project
Link to Project "Demo"
Detailed Project Description
On the NTRS home page, we would add two options, one for each of the two searches mentioned earlier. If the user chooses a diverged search, we show a broad set of results, with our algorithm maximizing the diversity of those results. For a converged search, we switch to a more focused mode in which the user sees only results that are strongly linked to the search parameters and specific to the entered keywords.
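The two search modes can be illustrated with a minimal sketch. It assumes papers are held as a plain dictionary of title to text and that matching is whole-word and case-insensitive; the function names (`keyword_count`, `satisfier_search`, `totalizer_search`), the `top_n` parameter and the sample papers are hypothetical and are not taken from our project code. It covers only the keyword matching described in the summary, not the diversity maximization.

```python
import re


def keyword_count(text, keyword):
    """Count whole-word, case-insensitive occurrences of keyword in text."""
    pattern = r"\b" + re.escape(keyword) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def satisfier_search(papers, keyword):
    """Diverged search: keep every paper that mentions the keyword at least once."""
    return [title for title, text in papers.items()
            if keyword_count(text, keyword) >= 1]


def totalizer_search(papers, keyword, top_n=10):
    """Converged search: rank papers by how often the keyword appears."""
    counts = {title: keyword_count(text, keyword) for title, text in papers.items()}
    matching = [title for title in counts if counts[title] > 0]
    ranked = sorted(matching, key=lambda title: counts[title], reverse=True)
    return [(title, counts[title]) for title in ranked[:top_n]]


# Hypothetical sample data, for illustration only.
papers = {
    "Paper A": "Mars rover wheels. The rover crossed the crater; rover telemetry follows.",
    "Paper B": "Orbital debris tracking with a brief note on a lunar rover.",
    "Paper C": "Thermal protection systems for reentry vehicles.",
}

diverged = satisfier_search(papers, "rover")            # broad: any mention counts
converged = totalizer_search(papers, "rover", top_n=1)  # focused: most mentions first
```

Here the diverged search returns both papers that mention "rover", while the converged search keeps only the paper that repeats it most often.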
Space Agency Data
We took a sample of 26,000 files from the NTRS database to develop and test our search features. These files went through cleaning and feature extraction, which ordered the words of each file by relevance (using TF-IDF).
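As a rough sketch of the cleaning and TF-IDF ordering step, the example below uses only the Python standard library so that the arithmetic stays visible. It assumes the simplest form of the weighting (term frequency multiplied by the logarithm of inverse document frequency), which may differ from the exact variant we ran; scikit-learn's `TfidfVectorizer` provides a ready-made version with smoothing and normalization. The names `tokenize` and `tfidf_rank` and the sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only (a simple cleaning step)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_rank(docs):
    """Return, for each document, its distinct words ordered by descending TF-IDF."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents in which each word appears.
    df = Counter(word for tokens in tokenized for word in set(tokens))
    ranked = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores = {
            word: (count / len(tokens)) * math.log(n_docs / df[word])
            for word, count in tf.items()
        }
        # Sort by score (highest first), breaking ties alphabetically.
        ranked.append(sorted(scores, key=lambda word: (-scores[word], word)))
    return ranked


# Hypothetical sample documents, for illustration only.
docs = [
    "Rover wheel design: rover wheel wear on Mars",
    "Thermal shield design for Mars reentry",
    "Orbit insertion burn on arrival",
]

ranked = tfidf_rank(docs)
```

Words that are frequent in one file but rare across the sample rise to the top of that file's list, which is what makes the ordering useful as a relevance signal.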
Hackathon Journey
It was a really fulfilling experience that gave us the opportunity to meet a wide variety of people. We learned a lot about what researchers go through when carrying out their research. Fun fact: we played some football tennis with balloons in the evening. We got our inspiration mainly from Sonia Botta who, as explained in the summary, encouraged us to find ways to make searching more efficient. Our team communicated well throughout and, most importantly, we gave each other the chance to brainstorm individually before sharing ideas together. I would like to thank both the local and global organizers for making this experience unique and comfortable.
References
For software purposes, we used Python, Jupyter Notebook, scikit-learn, pandas and GitHub.
For the presentation and the project demo, we used Figma.
For the summary video, we used Adobe Premiere.
For the brainstorming, we used Miro.
Tags
#software, #nlp, #ai, #search engines, #research

