High-Level Project Summary
We developed a prototype that makes the publications on the NTRS platform more accessible. Its added value is that it finds information regardless of syntactical errors or word order, and offers content based on similar context or characteristic terms. In short, it aims to expose knowledge in an agile way that matches users' expectations.
Link to Final Project
Link to Project "Demo"
Detailed Project Description
Our solution can be abstracted into the following flow:
1) Data collection (reports from January 1, 2000 to the present day)
2) Data processing (PDF-to-text conversion)
3) Artificial intelligence (conversion of text to vectors grouped by grammatical context)
4) Cognitive search (estimation of each PDF document's relevance to the search)
Berthym Project Flow
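
Step 2 of the flow above (PDF to text) can be sketched roughly as follows with the pdf2image and easyocr libraries listed in the references. This is a minimal sketch under assumptions, not the project's actual code: the helper names pdf_to_text and join_page_lines, the 200 dpi setting and the English-only reader are all chosen for illustration. pdf2image also needs poppler installed on the system.

```python
def join_page_lines(pages_lines):
    """Join OCR output (one list of text lines per page) into one string.

    Lines within a page are joined with spaces, pages with newlines.
    Empty or whitespace-only lines are dropped.
    """
    page_texts = []
    for lines in pages_lines:
        cleaned = [line.strip() for line in lines if line.strip()]
        page_texts.append(" ".join(cleaned))
    return "\n".join(page_texts)


def pdf_to_text(pdf_path, dpi=200, languages=("en",)):
    """Render each PDF page to an image and run OCR on it (hypothetical helper)."""
    # Imported lazily: both libraries are heavy, and pdf2image needs poppler.
    import numpy as np
    import easyocr
    from pdf2image import convert_from_path

    reader = easyocr.Reader(list(languages))      # loads the OCR models
    pages = convert_from_path(pdf_path, dpi=dpi)  # list of PIL images, one per page
    # detail=0 makes readtext return plain strings instead of boxes and scores.
    pages_lines = [reader.readtext(np.array(page), detail=0) for page in pages]
    return join_page_lines(pages_lines)
```

A call such as pdf_to_text("report.pdf"), where the file name is only a placeholder, would return the recognized text with one line per page.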

We designed this solution to process the PDFs and automatically extract features from their content.
It performs searches by formulating a sentence similar to the search text, using the characteristics of the documents.
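
The relevance estimation in steps 3 and 4 can be illustrated by ranking documents by the cosine similarity between a query vector and each document vector. The sketch below is an assumption-laden illustration, not the project's implementation: instead of BERT vectors it uses character-trigram counts as a lightweight stand-in, and the names trigram_vector, cosine_similarity and rank_documents, as well as the sample documents, are hypothetical.

```python
import math
from collections import Counter


def trigram_vector(text):
    """Stand-in for a BERT embedding: counts of character trigrams per word."""
    counts = Counter()
    for word in text.lower().split():
        padded = f" {word} "  # pad so word boundaries form trigrams too
        for i in range(len(padded) - 2):
            counts[padded[i:i + 3]] += 1
    return counts


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(query, documents):
    """Return (doc_id, score) pairs sorted from most to least relevant."""
    query_vec = trigram_vector(query)
    scores = [
        (doc_id, cosine_similarity(query_vec, trigram_vector(text)))
        for doc_id, text in documents.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample data: the query has a typo and a different word order.
docs = {
    "propulsion": "ion propulsion thruster performance for deep space missions",
    "materials": "thermal protection materials for atmospheric reentry",
}
ranking = rank_documents("thruster propulsoin ion", docs)
```

In this toy example the "propulsion" document ranks first even though the query is misspelled and reordered, which is the behavior the prototype aims for. A real system would replace trigram_vector with vectors produced by the BERT model.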


Link to code:
https://www.kaggle.com/code/maximilianoalarcon/nasa-spacechallenge-2022/
Space Agency Data
NTRS - NASA Technical Reports Server
We used it to develop the modules that would form part of the system.
Hackathon Journey
Our experience gave us insight into our strengths and weaknesses when tackling a problem with an innovative solution.
It was somewhat exhausting, but we were pleased to have reached the end. The challenge is so complex that it pushes us to repeat this experience for as many years as necessary.
We learned to use tools that we can reuse in other events.
We will be anxiously waiting until next year!!!
References
BERT from Hugging Face (large-base-cased)
It was used to tokenize the text from the PDFs
https://www.kaggle.com/datasets/sauravmaheshkar/huggingface-bert-variants
Library pdf2image
It was used to convert the PDF pages to images
Library easyocr
It was used to extract text from the images
Library poppler
It is required by the pdf2image library to render PDF pages
BERT documentation
https://huggingface.co/docs/transformers/model_doc/bert
Evolution of coherence using neural networks models
I was inspired by the section about the semantic similarity graph
https://www.mdpi.com/2076-3417/11/7/3210
Canva template - Credits to: Jimena Domech
https://www.canva.com
Tags
#SpaceApps #AI #Software #STI #NLP #NTRS #Accessibility

