Cooperation measurement

High-Level Project Summary

This project aims to evaluate the effectiveness of open science activities. Metrics are developed to visualize the level of international scientific cooperation in research and regional scientific strengths. We focus strictly on analyzing scientific articles, from which we can obtain the specific data of interest and generate these metrics. These metrics would be very useful to public entities such as universities and to the private sector, which could use them to make better decisions regarding the formation of strategic international alliances and agreements.

Link to Project "Demo"

Detailed Project Description

What is the real problem?


There are many definitions of open science. According to NASA:


 NASA’s Earth Science Data Systems defines open science as "a collaborative culture enabled by technology that empowers the open sharing of data, information, and knowledge within the scientific community and the wider public to accelerate scientific research and understanding." (Open Science, https://www.earthdata.nasa.gov/technology/open-science, October 2022)


Open science has significantly impacted our lives and overall technological advancement, but... how do we know? So far, there are no clear indicators.


And how do we carry it out?

Our challenge is to create metrics to evaluate the effectiveness of open science activities.

Metrics cannot offer a one-size-fits-all solution. Moreover, we need greater clarity as to which indicators are most useful for specific contexts. [2]

In this project, metrics have been developed to visualize the level of international scientific cooperation in research and regional scientific strengths.

We will focus strictly on analyzing scientific articles, from which we can obtain the specific data of interest and generate these metrics.

These metrics would be very useful for public entities, universities and the private sector, which could use them to make better decisions regarding the formation of strategic international alliances and agreements, such as strengthening faculty and student exchange programs with institutions in other regions that either reinforce our research strengths or allow us to expand into areas that are less covered.


Our proposal

We divide the solution into four steps:

1- Construction of a database of papers’ metadata.

2- Design of metrics that reflect the national and international interconnectedness of the open science community.

3- Metrics calculation and insights generation

4- User interface for results query



1- Database

First, a database of papers will be built containing the following attributes for each paper:






To develop it, public repositories of papers will be used, such as those of NASA and other space agencies, as well as Google Scholar and other sources. The metadata of these papers, such as region, institution, etc., will be collected.
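
As an illustration of the kind of record the database could hold, here is a minimal Python sketch. The field names (title, authors, institutions, country, category, year, contact_email, cited_by) are assumptions made for this example, not the final schema.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class PaperRecord:
    """Minimal metadata record for one paper (assumed fields, not a final schema)."""
    title: str
    authors: List[str]
    institutions: List[str]
    country: str                 # country of the lead institution
    category: str                # research category, e.g. "Earth Science"
    year: int
    contact_email: str = ""      # filled in by the scraping step described next
    cited_by: List[str] = field(default_factory=list)  # IDs of citing papers

# Example record built from metadata harvested from a public repository
example = PaperRecord(
    title="Example open-science paper",
    authors=["A. Author"],
    institutions=["Example University"],
    country="Argentina",
    category="Earth Science",
    year=2022,
)
```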

Since not all of these repositories contain all this data in an orderly fashion, the following methodologies are used:

Through web scraping, certain information, such as the author's contact, can be obtained directly by searching for an "@" in the document and then stored in the database.
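
A minimal sketch of that scraping step, assuming the paper is reachable at a public URL and that plain-text e-mail addresses appear in the page; the function name and the example NTRS URL are illustrative only.

```python
import re
import requests

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def extract_contact_emails(url: str) -> list[str]:
    """Download a paper's landing page or full text and return any e-mail
    addresses found by searching for the '@' pattern described above."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return sorted(set(EMAIL_RE.findall(response.text)))

# Usage (hypothetical record URL):
# emails = extract_contact_emails("https://ntrs.nasa.gov/citations/...")
```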

To collect data that is not structured according to a standard within the paper, a natural language processing (NLP) model such as GPT-3 can be used.
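
A sketch of how such an extraction could be prompted. Here `complete(prompt)` is only a placeholder for whichever completion API (GPT-3 or another NLP model) is ultimately used, and the field list mirrors the assumed record above.

```python
import json

def build_extraction_prompt(paper_text: str) -> str:
    """Prompt asking a language model to pull out metadata that is not stored
    in a standard, structured way inside the paper itself."""
    return (
        "Extract the following fields from the text of this scientific paper and "
        "answer only with JSON: title, authors, institutions, country, category, year.\n\n"
        + paper_text[:4000]  # keep the prompt within typical model context limits
    )

def parse_model_answer(answer: str) -> dict:
    """Turn the model's JSON answer into a dictionary ready for the database."""
    return json.loads(answer)

# `complete(prompt)` stands in for the chosen completion API:
# metadata = parse_model_answer(complete(build_extraction_prompt(raw_text)))
```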

As for the infrastructure that will support this database, since only the metadata of the papers will be stored, the storage requirements are modest. It could be hosted by an organization dedicated to the philosophy of open science, such as NASA, or by an NGO supported by the collaboration of interested entities.



2- Metrics

In the past 25 years, impact measurements have been limited to citation analysis of academic journal articles to assess scientific contributions (Fenner, 2014).

Thanks to the database, and through a system that filters the data according to need, we will be able to obtain several metrics. To measure open science, we defined the indices below, which adapt citation metrics to reflect the regional interrelationships between countries.


A_cat(country) = B_cat(country) / N_cat(country)   (international scientific relevance index)

B_cat(country) = number of citations received internationally in a given category

N_cat(country) = number of papers published in the country in this category

This index reflects how much of a reference that country is in that category at the international level: it is the average number of international citations received by a paper from that country in that category. The more citations received, the higher the index, reflecting impact in terms of quality and not only the number of papers generated.

a_cat(country) = b_cat(country) / N_cat(country)   (national scientific collaboration index)

b_cat(country) = number of citations received nationally in a given category

N_cat(country) = number of papers published in the country in this category

It reflects the internal scientific interconnection of a country. It shows the average number of national citations that a paper from that country receives in that category.

C = A_cat(country) + a_cat(country)

It shows the average number of citations that a paper from that region receives in that category. By taking an average and not the total number of citations, we have some notion of the quality of the papers.
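
As a minimal sketch of how the metrics calculation could run on the step-1 database, the following Python function computes A, a and C for one (country, category) pair. It assumes records like the PaperRecord sketch above and a hypothetical citing_country mapping from citing-paper IDs to countries; both are illustrative assumptions, not the project's prescribed data model.

```python
def scientific_indices(papers, country, category, citing_country):
    """Compute the indices defined above for one (country, category) pair.

    `papers` is an iterable of records like the PaperRecord sketch in step 1;
    `citing_country` maps a citing paper's ID to the country it was published in
    (both are assumptions about how the step-1 database is exposed)."""
    subset = [p for p in papers if p.country == country and p.category == category]
    n = len(subset)  # N_cat(country): papers published in the country in this category
    if n == 0:
        return {"A": 0.0, "a": 0.0, "C": 0.0}

    international = national = 0
    for paper in subset:
        for citing_id in paper.cited_by:
            if citing_country.get(citing_id) == country:
                national += 1       # contributes to b_cat(country)
            else:
                international += 1  # contributes to B_cat(country)

    A = international / n  # international scientific relevance index
    a = national / n       # national scientific collaboration index
    return {"A": A, "a": a, "C": A + a}
```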



3- Metrics calculation and insights generation


Some insights that provide valuable information will be presented through a web application. The database-processing code used to generate this information is already developed and documented in our GitHub repository.



4- User interface for results query


Through a web application, users will be able to browse the results and filter the data according to what they need to know.

It will have two main functions:


Function 1: If a user is looking to link with other regions, they select the first option, then choose their country and the publication years of interest. They automatically obtain a ranking of the categories in which that country is most scientifically linked to other countries, and can drill down to see the nature of each link.

This would be useful for a government entity seeking to promote public policies that facilitate links that already exist or strengthen areas where it finds potential. That potential is reflected in the metrics in the table: the international scientific relevance index of a country, calculated as the average number of international citations received by papers from that country in that category.

Function 2: If the user wants to see the scientific strengths of each country in a specific category, they select the second option. The program automatically ranks the countries with the highest scientific contribution in that category.

This can be useful for private entities to know where to establish themselves to develop solutions in a specific area.
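
As a rough sketch of how the two query functions could sit on top of the metrics code above (again relying on the assumed PaperRecord and citing_country structures, which are illustrative only):

```python
def top_linked_categories(papers, country, year, citing_country, top_n=5):
    """Function 1: rank the categories in which `country` is most internationally
    linked for papers published in `year`, using the international relevance index A."""
    year_papers = [p for p in papers if p.year == year]
    categories = {p.category for p in year_papers if p.country == country}
    scored = [(cat, scientific_indices(year_papers, country, cat, citing_country)["A"])
              for cat in categories]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


def top_countries_in_category(papers, category, citing_country, top_n=10):
    """Function 2: rank countries by their combined citation index C in one category."""
    countries = {p.country for p in papers if p.category == category}
    scored = [(c, scientific_indices(papers, c, category, citing_country)["C"])
              for c in countries]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]
```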



Conclusion

Based on this analysis, we encourage the formation of international alliances and the identification of countries' strengths and weaknesses. The metrics developed aim to enrich the analysis and bring a better understanding of each country's current situation, so that it serves as a baseline for further development in strategic areas.



Space Agency Data

We will use NASA's repository of papers (https://ntrs.nasa.gov/), and we aim to use those of other space and public organizations as well, as the data sources to build our database of paper metadata.

All this data enriches our database, and we work from it to build meaning with our metrics.


Hackathon Journey

It was a very enriching experience in which we experienced many feelings of nerves, anxiety and doubts as well as motivation, joy and fun. We learned how to develop a project in record time, how to organize ourselves as a team, and we learned a lot about the challenge we were facing, that is, open science.

What motivated us to choose this challenge is that there is little progress on this topic, and we saw the possibility of developing a highly fruitful project for the scientific community as well as for the common public and private institutions.

We were able to overcome setbacks by concentrating on essential issues and taking the necessary breaks to relieve anxiety.

We would like to give special thanks to the Mars Society Argentina for the great organization and services they provided during the two days of the hackathon.



References

Blumenfeld, J. (October 2022). https://www.earthdata.nasa.gov/learn/data-chats/data-chat-kaylin-bugbee

Open Science. (October 2022). https://www.earthdata.nasa.gov/technology/open-science


Fenner, M. (2014). Altmetrics and other novel measures for scientific impact. In S. Bartling, & S. Friesike (Eds.), Opening science (pp. 179–189). Springer.

Ramachandran, R., Bugbee, K., & Murphy, K. (2021). From open data to open science. Earth and Space Science, 8(5), e2020EA001562.

Rocker, J., Roncaglia, G. J., Heimerl, L. N., & Nelson, M. L. (2002, June). The NASA Scientific and Technical Information (STI) Program's Implementation of Open Archives Initiative (OAI) for Data Interoperability and Data Exchange. http://www.sla.org/content/Events/conference/2002annual/confpap2002/papers2002conf.cfm



Tags

#OpenScience