Awards & Nominations

FIND has received the following awards and nominations.

Global Nominee

FIND: Improving NTRS efficiency and paving the way for the next generation of researchers

High-Level Project Summary

Using both extractive and abstractive NLP text summarization, FIND can condense what might be a 100-page research paper into a single sentence. Paired with keywords relevant to each article's subject matter and field, this lets you sort through multiple research papers in the time it would take to read just one. We set out to use data and artificial intelligence to save users time and make the research process more efficient and user-friendly. This is important because it opens up what a researcher can do: regardless of their field of expertise, they can branch out and widen the scope of their research.
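As a rough sketch of the keyword step, the snippet below uses KeyBERT (listed in the references) to pull candidate keyphrases from a paper's text. The parameters and the helper name are illustrative assumptions, not necessarily FIND's exact configuration.

# Keyword-extraction sketch using KeyBERT (see references).
# Parameter choices here are illustrative; FIND's settings may differ.
from keybert import KeyBERT

def extract_keywords(text: str, top_n: int = 5):
    kw_model = KeyBERT()  # uses a default sentence-transformers model
    # Return (keyphrase, relevance score) pairs for one- and two-word phrases.
    return kw_model.extract_keywords(
        text,
        keyphrase_ngram_range=(1, 2),
        stop_words="english",
        top_n=top_n,
    )

if __name__ == "__main__":
    sample = "We study lightweight ablative heat-shield materials for Mars entry vehicles."
    for phrase, score in extract_keywords(sample):
        print(f"{phrase}: {score:.2f}")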

Detailed Project Description

The problem with abstractive text summarization is that, due to the way NLP models encode and decode bodies of text, a model cannot take in anything longer than a short paragraph. There are two intuitive solutions. The first is to recursively repeat the encode-decode process until the text reaches the desired length; the complication is that this is both computationally expensive and time-consuming. Our solution was the second: combine extractive and abstractive text summarization. Using a neural network, the input text is first narrowed down to only its most important sentences. From these, an abstractive summary is generated with Google's NLP model Pegasus, condensing what may originally have been a 100-page research paper into a single sentence. With everything kept short and concise, users can navigate the endless supply of academic papers far more efficiently. Upon running the program, you also receive several keywords relevant to your article, its subject matter, and its field.

All of this is available on a website we built with HTML, CSS, and JavaScript and hosted on a server using Django and AWS Lightsail, with the aim of keeping the process as simple as possible for the user. Inspired by a weekend of sifting through overly technical research papers and realizing that the current system is flawed, this tool is meant to save precious time and make the research process more efficient.
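The sketch below illustrates the two-stage pipeline described above, using pdfminer.six to pull text from a PDF, spacy-pytextrank for the extractive pass, and the google/pegasus-xsum model from Hugging Face for the abstractive pass (all listed in the references). The specific parameters, function names, and file paths are illustrative assumptions rather than FIND's exact implementation.

# Two-stage summarization sketch: extractive (pytextrank) then abstractive (Pegasus).
# Names and parameter values are illustrative; FIND's production code may differ.
import spacy
import pytextrank  # registers the "textrank" spaCy pipeline component
from pdfminer.high_level import extract_text
from transformers import PegasusForConditionalGeneration, PegasusTokenizer

MODEL_NAME = "google/pegasus-xsum"

def extractive_summary(text: str, max_sentences: int = 5) -> str:
    """Keep only the highest-ranked sentences so the input fits Pegasus' context."""
    nlp = spacy.load("en_core_web_sm")
    nlp.add_pipe("textrank")
    doc = nlp(text)
    sentences = doc._.textrank.summary(limit_phrases=15, limit_sentences=max_sentences)
    return " ".join(sent.text for sent in sentences)

def abstractive_summary(text: str) -> str:
    """Compress the extracted sentences into a single abstractive sentence."""
    tokenizer = PegasusTokenizer.from_pretrained(MODEL_NAME)
    model = PegasusForConditionalGeneration.from_pretrained(MODEL_NAME)
    batch = tokenizer(text, truncation=True, padding="longest", return_tensors="pt")
    ids = model.generate(**batch)
    return tokenizer.batch_decode(ids, skip_special_tokens=True)[0]

def summarize_pdf(path: str) -> str:
    raw_text = extract_text(path)            # pdfminer.six: PDF -> plain text
    condensed = extractive_summary(raw_text)
    return abstractive_summary(condensed)

if __name__ == "__main__":
    print(summarize_pdf("paper.pdf"))        # hypothetical input file

Because Pegasus truncates anything beyond roughly a short paragraph, the extractive pass is what keeps long reports usable without recursively re-summarizing them.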

Space Agency Data

We used NASA's Technical Report Server (NTRS) to test and improve the accuracy of our model. This project was inspired by our own independent research and our time in NASA's L'SPACE Academy, where, through repeated use, we realized that NASA's current system (NTRS) could be improved.

Hackathon Journey

Our Space Apps experience was one of growth, perseverance, and teamwork. We chose this challenge because, after extensive use of NASA's NTRS, we realized that the entire research process could be streamlined and made more efficient. We faced many challenges, especially in setting up our website's server. We wanted to distinguish ourselves from our competitors by building an entire site from scratch, from the front end to the server to the backend computation. This was no small task, and it was where most of our setbacks arose. With simple perseverance and an undisclosed amount of coffee, we successfully set up the website and implemented its key features within 48 hours.

References

https://huggingface.co/google/pegasus-xsum

https://huggingface.co/docs/transformers/index

https://github.com/google/sentencepiece

https://github.com/MaartenGr/KeyBERT

https://spacy.io/universe/project/spacy-pytextrank

https://github.com/pdfminer/pdfminer.six

https://github.com/axelpale/minimal-django-file-upload-example

https://aws.amazon.com/lightsail/

https://httpd.apache.org/

https://www.djangoproject.com/

Tags

#nlp #AI #machinelearning #summarization #data