NL Preserving our Science Knowledge with AI

High-Level Project Summary

The goal of the project is to reduce the main pain point in accessing information through an inclusive app: a browser for NASA's legacy PDF files that makes their information available to more people around the world. We envision a tool that facilitates access to the PDF files in a corpus, processed with an artificial intelligence layer. Users can easily and graphically view the content of the repository, classified and grouped by topic and search key. We are confident that our project will help preserve and spread NASA's scientific legacy.

Detailed Project Description

Information has several dimensions of value: as a commodity, as a means of education, as a means of influence, and as a means of negotiating and understanding the world. Legal and socioeconomic interests also shape how information is produced and disseminated [1]. Access to the information stored in a system depends on how the system categorizes and classifies the resources it contains and on how it is accessed [1]. Globally, an estimated 1.3 billion people live with some form of visual impairment [3].


We are a group of Uruguayans who met through this challenge. We were drawn to the chance to work together on an idea that adds value to the information NASA holds.

Our web application uses artificial intelligence to analyze raw, unstructured documents and discover their characteristics, from the locations, people, and organizations they name to their keywords.

The application also supports searching across all documents, filtering by the desired characteristics, and it generates graphs that make the results easier to understand. It is a multilingual, user-friendly app that democratizes search and access to the corpus. It takes into account people with low vision, that is, users who find everyday activities such as reading and writing more difficult. It also lowers the language barrier to accessing quality information. We are confident that our project will help preserve and spread NASA's scientific legacy.
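As an illustration of the filtering and graphing described above, here is a minimal sketch of how enriched documents could be filtered by a characteristic and summarized into counts for a chart. The `EnrichedDoc` shape, its field names, and the functions `filterDocs` and `facetCounts` are hypothetical and chosen only for this example; they are not taken from the project's code.

```typescript
// Hypothetical shape of a document after AI enrichment; field names are assumptions.
interface EnrichedDoc {
  id: string;
  people: string[];
  locations: string[];
  organizations: string[];
  keyPhrases: string[];
}

type FacetField = "people" | "locations" | "organizations" | "keyPhrases";

// Keep only the documents whose chosen field contains every requested value.
function filterDocs(docs: EnrichedDoc[], field: FacetField, values: string[]): EnrichedDoc[] {
  return docs.filter((doc) => {
    const present = new Set(doc[field]);
    return values.every((value) => present.has(value));
  });
}

// Count how many documents mention each value of a field.
// The result is sorted by count (descending), then alphabetically,
// which is a convenient input for a bar chart.
function facetCounts(docs: EnrichedDoc[], field: FacetField): Array<[string, number]> {
  const counts = new Map<string, number>();
  docs.forEach((doc) => {
    // A Set ensures a value repeated inside one document is counted once.
    new Set(doc[field]).forEach((value) => {
      counts.set(value, (counts.get(value) ?? 0) + 1);
    });
  });
  return Array.from(counts).sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]));
}
```

For example, calling `facetCounts(docs, "organizations")` on the search results would give one `[name, count]` pair per organization, which a charting component could render directly.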

Our artificial intelligence service is built on a search engine that reads the documents a user adds to a repository (or corpus) and processes them with optical character recognition (OCR) and natural language processing (NLP) to extract and transform their content. It then builds several search indices to keep searches fast and generates an approximation score for each result to indicate how reliable the match is. The service is exposed through an API in the Azure cloud; our backend connects to it, and the results are displayed in an interface built with React.
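To make the idea of a search index and a per-result score concrete, here is a minimal sketch of an inverted index with a simple TF-IDF style score. This is a stand-in for illustration only: the actual indexing and scoring are performed by the Azure service, and the functions `tokenize`, `buildIndex`, and `search` are hypothetical names that do not come from the project.

```typescript
// Split text into lowercase alphanumeric tokens.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
}

// term -> (document id -> number of occurrences of the term in that document)
type InvertedIndex = Map<string, Map<string, number>>;

// Build the index once, so that a query only touches the terms it contains.
function buildIndex(docs: Record<string, string>): InvertedIndex {
  const index: InvertedIndex = new Map();
  Object.keys(docs).forEach((id) => {
    tokenize(docs[id]).forEach((term) => {
      const postings = index.get(term) ?? new Map<string, number>();
      postings.set(id, (postings.get(id) ?? 0) + 1);
      index.set(term, postings);
    });
  });
  return index;
}

interface Hit {
  id: string;
  score: number;
}

// Score each document as the sum over query terms of tf * idf,
// with idf = ln(1 + N / df). Higher scores mean closer matches.
function search(index: InvertedIndex, totalDocs: number, query: string): Hit[] {
  const scores = new Map<string, number>();
  tokenize(query).forEach((term) => {
    const postings = index.get(term);
    if (!postings) return; // term not present in any document
    const idf = Math.log(1 + totalDocs / postings.size);
    postings.forEach((tf, id) => {
      scores.set(id, (scores.get(id) ?? 0) + tf * idf);
    });
  });
  return Array.from(scores, ([id, score]) => ({ id, score })).sort((a, b) => b.score - a.score);
}
```

Because the index maps each term directly to the documents that contain it, a query does not need to rescan every document, which is the reason an index keeps searches fast as the corpus grows.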


This solution lets you quickly discover what the documents contain, identify where the information you are looking for is located, and learn its characteristics.


We use React, Node, Express and Azure Cognitive Service.
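As a sketch of how a Node backend could talk to the search service, the following builds a query for the Azure Cognitive Search REST API. It rests on several assumptions not stated in this document: that the Azure service in use is Azure Cognitive Search, that the index has collection fields named `organizations` and `locations`, and that API version `2020-06-30` is used. The service name, index name, and function names are hypothetical.

```typescript
// Assumed query options; the field names are hypothetical, not the project's schema.
interface SearchOptions {
  text: string;
  organization?: string; // optional filter on the assumed "organizations" field
  top?: number; // maximum number of results to return
}

// Quote a string literal for an OData filter: embedded single quotes are doubled.
function odataString(value: string): string {
  return "'" + value.replace(/'/g, "''") + "'";
}

// Build the JSON body for a "Search Documents" POST request.
function buildSearchBody(opts: SearchOptions): Record<string, unknown> {
  const body: Record<string, unknown> = {
    search: opts.text,
    count: true, // ask the service for the total number of matches
    top: opts.top ?? 10,
    // Facets return per-value counts, useful for drawing charts.
    facets: ["organizations,count:10", "locations,count:10"],
  };
  if (opts.organization) {
    body["filter"] = "organizations/any(o: o eq " + odataString(opts.organization) + ")";
  }
  return body;
}

// Build the endpoint URL for a given (hypothetical) service and index name.
function searchUrl(service: string, index: string): string {
  return (
    "https://" + service + ".search.windows.net/indexes/" +
    encodeURIComponent(index) + "/docs/search?api-version=2020-06-30"
  );
}
```

Under these assumptions, an Express route would POST the body to `searchUrl(...)` with an `api-key` header and forward the returned documents, each carrying an `@search.score` value, to the React interface.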


(For this challenge, only the basic functionality will be implemented.)

Space Agency Data

We navigated to the NTRS home page and selected Legacy CDMS. We searched for PDF files and downloaded some of them for our work using the free API.


The documents used as the corpus were:


https://ntrs.nasa.gov/citations/19700019477

https://ntrs.nasa.gov/citations/19700025029

https://ntrs.nasa.gov/citations/19700032732

https://ntrs.nasa.gov/citations/19710001109

https://ntrs.nasa.gov/citations/19710001115

https://ntrs.nasa.gov/citations/19710001361

https://ntrs.nasa.gov/citations/19740004979

https://ntrs.nasa.gov/citations/19770026932

https://ntrs.nasa.gov/citations/19820016960

https://ntrs.nasa.gov/citations/19850024793

https://ntrs.nasa.gov/citations/19880005241

https://ntrs.nasa.gov/citations/19880009059

https://ntrs.nasa.gov/citations/19890011217

https://ntrs.nasa.gov/citations/19890016813

https://ntrs.nasa.gov/citations/19900004904

https://ntrs.nasa.gov/citations/19900017750

https://ntrs.nasa.gov/citations/19910016315

https://ntrs.nasa.gov/citations/19910016793

https://ntrs.nasa.gov/citations/19910021747

https://ntrs.nasa.gov/citations/19920001776

https://ntrs.nasa.gov/citations/19920001806

https://ntrs.nasa.gov/citations/19920001924

https://ntrs.nasa.gov/citations/19920002000

https://ntrs.nasa.gov/citations/19920008426

https://ntrs.nasa.gov/citations/19920011303

https://ntrs.nasa.gov/citations/19920011706

https://ntrs.nasa.gov/citations/19920012214

https://ntrs.nasa.gov/citations/19920012984

https://ntrs.nasa.gov/citations/19920014184

https://ntrs.nasa.gov/citations/19920015857

https://ntrs.nasa.gov/citations/19920015959

https://ntrs.nasa.gov/citations/19920017294

https://ntrs.nasa.gov/citations/19930003502


We used the full content of these documents to feed the ML algorithm.
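The citation URLs listed above can be processed programmatically. The sketch below extracts the numeric citation ID from each URL and builds a metadata URL for it. It assumes that the NTRS API serves a citation record at `/api/citations/{id}`; that path pattern and the function names `citationId` and `metadataUrl` are assumptions for illustration and should be checked against the NTRS API documentation.

```typescript
// Extract the numeric citation ID from an NTRS citation page URL.
// Returns null when the URL does not have the expected form.
function citationId(url: string): string | null {
  const match = /^https:\/\/ntrs\.nasa\.gov\/citations\/(\d+)\/?$/.exec(url.trim());
  return match ? match[1] : null;
}

// Assumption: the metadata record for a citation is served at /api/citations/{id}.
function metadataUrl(id: string): string {
  return "https://ntrs.nasa.gov/api/citations/" + id;
}

// Convert a list of citation page URLs into metadata URLs, skipping malformed entries.
function corpusMetadataUrls(citationUrls: string[]): string[] {
  const result: string[] = [];
  citationUrls.forEach((url) => {
    const id = citationId(url);
    if (id !== null) result.push(metadataUrl(id));
  });
  return result;
}
```

Keeping the corpus as a list of IDs rather than downloaded files makes it easy to rebuild or extend the corpus later.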

Hackathon Journey

We feel lucky to have had the opportunity to participate in this wonderful experience. All the information provided before and during the challenge was very helpful, as was the technical support from the leaders. We learned that as a team you go further: the product is enriched by the different perspectives and experiences of each member.

In our Asado Vegano team, some participants already knew each other and others met during the challenge, forming a multidisciplinary team that we are proud of. We started with a brainstorming session, and each member contributed what they do best without neglecting the other topics. We put into practice software analysis and development techniques, artificial intelligence, elevator pitching, and research. We strengthened soft skills such as empathic and active communication, respecting different points of view and choosing the best path together. We overcame the setbacks and challenges little by little, some with more effort than others, and despite our differences (technical backgrounds, ages, gender, experiences), the short time, and the difficulty of working online, we achieved our goal.

We want to thank Federico Vazquez, Francisco Delgado, Laura Morín, Caro Maneiro, and Ana González, who were available to help us.

Tags

#software #ai #Nasa #Uruguay #scientific #ML