High-Level Project Summary
Imagine needing to use the NASA Technical Reports Server (NTRS) without knowing English. Most NTRS documents are written in English, so researchers who are not proficient in the language will find it difficult to search for information. We propose to make the NTRS more inclusive with a question-answering system, wrapped around Google Translate, that lets people search for facts in their native language and receive the answer in that language. Our solution provides an accessible interface to the NTRS consisting of a PDF summarizer, question answering, and file translation.
Link to Final Project
Link to Project "Demo"
Detailed Project Description
Imagine needing to use the NTRS without knowing English.
Our proposed solution aims to make the NTRS more accessible by letting people browse it and ask factual questions in their native language.
We were inspired by Google's search engine, which often provides direct answers to natural language questions.
Our proposed solution is a Q&A system wrapped with a translation layer.
First, the user will ask a question in their native language.
Next, we will use Google Translate to translate the question into English.
Concurrently, we will use PyPDF2 to turn the NTRS's large database of PDFs into searchable text. We ran out of time to carry out this step.
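The PDF-to-text step could look roughly like the sketch below. It is not the code from our demo: the function names (`extract_pdf_text`, `clean_text`, `split_into_passages`) and the 200-word passage size are assumptions chosen for illustration. Splitting into passages is included because BERT-style models only accept a limited amount of context at a time.

```python
import re
from typing import List


def extract_pdf_text(path: str) -> str:
    """Return the text of every page in a PDF, joined by newlines."""
    # Imported lazily so the helpers below also work without PyPDF2 installed.
    # PdfReader is the PyPDF2 2.x name; older releases call it PdfFileReader.
    from PyPDF2 import PdfReader

    reader = PdfReader(path)
    # Scanned pages may yield no text at all, so fall back to an empty string.
    return "\n".join((page.extract_text() or "") for page in reader.pages)


def clean_text(raw: str) -> str:
    """Rejoin words hyphenated across line breaks and collapse whitespace."""
    # Heuristic: this also joins genuine compounds that happen to break
    # at a line end, which is acceptable for a rough search index.
    dehyphenated = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", raw)
    return re.sub(r"\s+", " ", dehyphenated).strip()


def split_into_passages(text: str, max_words: int = 200) -> List[str]:
    """Split text into consecutive passages of at most max_words words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]
```

A document would then be prepared with `split_into_passages(clean_text(extract_pdf_text("report.pdf")))`, where `report.pdf` stands for any file downloaded from the NTRS.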
Then we will utilize transformers to search for the answer to the user's question. In the demo, we used the deepset/bert-base-cased-squad2 model.
Once the transformer finds the answer, we take the answer and translate it back to the user's language.
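The steps above can be sketched as a single function. This is a minimal illustration under stated assumptions, not our demo code: `answer_in_native_language`, `make_bert_answerer`, and the `translate(text, target_language)` callable are hypothetical names, and the translation backend is passed in rather than hard-coded because we do not want to assert a particular translation API here. Only the `transformers` question-answering pipeline call reflects a real library interface.

```python
from typing import Callable, Dict, Iterable, Optional

# translate(text, target_language_code) -> translated text (hypothetical interface)
Translate = Callable[[str, str], str]
# answer(question, context) -> dict with at least "answer" and "score"
Answer = Callable[[str, str], Dict]


def make_bert_answerer(
    model_name: str = "deepset/bert-base-cased-squad2",
) -> Answer:
    """Build an answerer backed by a Hugging Face question-answering pipeline."""
    # Imported lazily: loading transformers and the model is slow.
    from transformers import pipeline

    qa = pipeline("question-answering", model=model_name)
    # The pipeline returns a dict with "answer", "score", "start" and "end".
    return lambda question, context: qa(question=question, context=context)


def answer_in_native_language(
    question: str,
    user_language: str,
    passages: Iterable[str],
    translate: Translate,
    answer: Answer,
) -> Optional[str]:
    """Translate the question to English, answer it, translate the answer back."""
    english_question = translate(question, "en")

    # Ask the model about each non-empty passage and keep the most confident answer.
    candidates = [
        answer(english_question, passage)
        for passage in passages
        if passage.strip()
    ]
    if not candidates:
        return None  # nothing to search, so there is no answer to translate

    best = max(candidates, key=lambda candidate: candidate["score"])
    return translate(best["answer"], user_language)
```

Passing the translator and the model in as arguments keeps the flow easy to test with stand-ins, and lets the Google Translate wrapper or the BERT model be swapped without touching the pipeline logic.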
The same process can be reused to produce a summary of an NTRS technical document, as well as its keywords.
Due to time constraints, we weren't able to complete the summary and keyword features.
However, the demo shows that it is possible to use AI to translate questions, retrieve answers, and return each answer to the user in their native language.
This can help increase access to NTRS technical documents for people all around the world.
Space Agency Data
We use files from the NASA Technical Reports Server (NTRS). In the demo, you can download PDF documents from the NTRS and upload them to view the results.
Hackathon Journey
Our Space Apps experience was excellent. We learned how to use AI in question-answering scenarios, and learned a great deal about the NTRS. Our approach to the project involved speaking to our friends about the NTRS, most of whom had never heard of it. When we showed it to them, they were not interested because they couldn't understand the contents. But once we explained it to them in our own language, they became interested. That was the inspiration for our project.
Our main setback was that we didn't have much experience with AI, so we had to learn it during the two days of the hackathon. We would like to thank all the organizers, the people who compiled the NTRS, and all the contributors to the dependencies and open-source libraries we used. This project was eye-opening.
References
- https://ntrs.nasa.gov/
- Laura E Dennis, Andrea M Spaeth, Nammi Goel, Phenotypic Stability of Energy Balance Responses to Experimental Total Sleep Deprivation and Sleep Restriction in Healthy Adults, 2016, Nutrients, 8(12), 823, accessed online: https://www.mdpi.com/2072-6643/8/12/823#
- James Briggs, Question And Answering with Bert, 2021, accessed online: https://towardsdatascience.com/question-and-answering-with-bert-6ef89a78dac
- Hugging Face, deepset/bert-base-cased-squad2, accessed online: https://huggingface.co/deepset/bert-base-cased-squad2
- Packages used: TensorFlow, PyPDF2, transformers, PyTorch, Translators, Google Translate
Tags
#software #ai #translation #multilingual #international #accessible

