Endava at NASA’s 2020 Space Apps Challenge

 
 

Insights Through Data | Gabriel Preda | 27 May 2021

INTRODUCTION

A multidisciplinary team represented Endava at the NASA COVID-19 Space Apps Challenge hackathon on 30-31 May 2020. The challenge was to use a wide variety of data from NASA and external sources to study the impact of the COVID-19 epidemic in the spring of 2020.

The Endava team was made up of Data Engineers, Developers and Data Scientists as well as a UX Designer, and we tried to solve the challenge using data from various sources, e.g. NASA satellite measurements, EU social and economic data, GitHub COVID-19 data as well as Kaggle UN countries economic and social data. Our aim was to understand if there are correlations between the energy and transportation sectors, pollution, unemployment rates and the incidence and dynamics of COVID-19 cases.

The analysis was done only for European countries, so all the data and findings are related to those only.

OUR APPROACH

For this challenge, our aim was to accomplish the following objectives:

  • Understand if there are correlations between NASA satellite data about pollution and various economic and social data, including transportation, energy, oil and coal production and unemployment, as well as health indicators and the COVID-19 incidence.
  • Use these correlations to establish a risk score for each country.
  • Build a user-friendly application to display this data.
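The article does not describe how the risk score was computed. As a purely illustrative sketch, one common way to combine several indicators into a single score is to min-max normalise each indicator across countries and take a weighted average. The function names, the indicator names, the weights and the country labels below are all hypothetical and are not taken from the Endava project.

```python
from typing import Dict


def min_max(values: Dict[str, float]) -> Dict[str, float]:
    """Rescale one indicator to the range [0, 1] across countries."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        # A constant indicator carries no information; map it to 0.
        return {country: 0.0 for country in values}
    return {country: (v - lo) / (hi - lo) for country, v in values.items()}


def risk_scores(indicators: Dict[str, Dict[str, float]],
                weights: Dict[str, float]) -> Dict[str, float]:
    """Weighted average of normalised indicators, one score per country.

    A negative weight means that a higher indicator value lowers the risk,
    so the normalised value is inverted before it is added.
    """
    total_weight = sum(abs(w) for w in weights.values())
    if total_weight == 0:
        raise ValueError("at least one weight must be non-zero")
    scores: Dict[str, float] = {}
    for name, weight in weights.items():
        for country, value in min_max(indicators[name]).items():
            contribution = value if weight >= 0 else 1.0 - value
            scores[country] = (scores.get(country, 0.0)
                               + abs(weight) * contribution / total_weight)
    return scores


# Made-up numbers for three placeholder countries (assumption, not real data).
example_indicators = {
    "population_density": {"A": 10.0, "B": 20.0, "C": 30.0},
    "physicians_per_1000": {"A": 5.0, "B": 3.0, "C": 1.0},
}
example_weights = {"population_density": 1.0, "physicians_per_1000": -1.0}
example_scores = risk_scores(example_indicators, example_weights)
```

In a real analysis the weights would need to be justified, for instance by the strength and sign of the observed correlations, which is a modelling decision this sketch does not attempt to make.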

We started by investigating NASA satellite measurement data for Europe, using country coordinates and/or boundary limits, and extracted the data from the netCDF4 format. More specifically, we analysed and processed MERRA-2 CO2 and SO2 data, IASI METOP CO data and Copernicus air quality data (SO2, CO2, PM2.5 and PM10).
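Selecting satellite measurements by country boundary limits can be sketched as averaging the grid cells that fall inside a latitude/longitude bounding box. The sketch below assumes the latitude axis, longitude axis and value grid have already been read from a netCDF4 file (the reading step is not shown); the function name, the bounding-box layout and the toy grid are hypothetical.

```python
import math
from typing import Optional, Sequence, Tuple

# Assumed layout: (lat_min, lat_max, lon_min, lon_max), in degrees.
BBox = Tuple[float, float, float, float]


def mean_in_bbox(lats: Sequence[float],
                 lons: Sequence[float],
                 grid: Sequence[Sequence[float]],
                 bbox: BBox) -> Optional[float]:
    """Average grid[i][j] over cells whose centre lies inside bbox.

    grid is indexed as grid[latitude_index][longitude_index].
    NaN cells (missing measurements) are skipped.
    Returns None if no valid cell falls inside the box.
    """
    lat_min, lat_max, lon_min, lon_max = bbox
    total, count = 0.0, 0
    for i, lat in enumerate(lats):
        if not lat_min <= lat <= lat_max:
            continue
        for j, lon in enumerate(lons):
            if lon_min <= lon <= lon_max and not math.isnan(grid[i][j]):
                total += grid[i][j]
                count += 1
    return total / count if count else None


# Toy 3x3 grid with made-up values (assumption, not real measurements).
toy_lats = [40.0, 45.0, 50.0]
toy_lons = [20.0, 25.0, 30.0]
toy_grid = [[1.0, 2.0, 3.0],
            [4.0, 5.0, 6.0],
            [7.0, 8.0, 9.0]]
toy_mean = mean_in_bbox(toy_lats, toy_lons, toy_grid, (43.0, 49.0, 19.0, 26.0))
```

A bounding box is only a rough approximation of a country's shape; masking with the actual country polygon would be more precise but needs a geometry library.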

IASI CO distribution across Europe, April 2020

 

For COVID-19 data, we used the dataset provided by Johns Hopkins University, which we curated ourselves, again focusing on European countries.

We ingested, processed and enriched air and maritime transport data extracted from UN open-source data. Starting from the initial datasets, we added information from the lookup files and, where needed, transformed quarterly data into monthly data by using the quarterly figure as an average value for its 3 months. The transport data covered:

  • international intra-EU freight and mail air transport, reported by the main airports in each reporting country and EU partner country
  • air passenger transport, reported by the main airports in each reporting country
  • airport traffic, from reporting airports and airlines
  • the gross weight of goods transported to/from main ports, by direction and type of traffic (national and international)
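The quarterly-to-monthly step described above can be sketched as follows. The description leaves some room for interpretation, so this sketch assumes by default that the quarterly figure is copied to each of its three months, and offers a hypothetical `split_total` flag for the alternative reading in which a quarterly total is divided evenly by three. The function name and the "YYYY-Qn" label format are assumptions.

```python
from typing import Dict


def quarterly_to_monthly(quarterly: Dict[str, float],
                         split_total: bool = False) -> Dict[str, float]:
    """Expand {"YYYY-Qn": value} into {"YYYY-MM": value} for every month.

    split_total=False: each month receives the quarterly value unchanged.
    split_total=True:  each month receives one third of the quarterly value.
    """
    monthly: Dict[str, float] = {}
    for label, value in quarterly.items():
        year, quarter = label.split("-Q")
        first_month = 3 * (int(quarter) - 1) + 1  # Q1 -> 1, Q2 -> 4, ...
        monthly_value = value / 3 if split_total else value
        for month in range(first_month, first_month + 3):
            monthly[f"{year}-{month:02d}"] = monthly_value
    return monthly


# Made-up figures (assumption, not real transport data).
copied = quarterly_to_monthly({"2020-Q1": 300.0})
divided = quarterly_to_monthly({"2020-Q2": 300.0}, split_total=True)
```

Which variant is appropriate depends on whether the source column is a rate or average (copy it) or a cumulative total such as tonnes of goods (divide it).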

Then we investigated various factors related to the impact of COVID-19. Some relationships could be direct, such as the correlation between morbidity and mortality and the level of healthcare sector development in each country. Others were indirect and could be inferred through the measures imposed by each country, for example changes in unemployment rates among young people, or the prevalence of internet access, which mattered because of the massive shift towards working from home and online education.

Earthdata MERRA-2 CO Column Burden (COCL) distribution, April 2020

We also looked at sectors that saw major shifts, like mobility and transportation, or that might have a relationship with pollution, for example the energy and industrial sectors, which consume large amounts of energy and may have been subject to lockdown-related work restrictions. In addition, we looked at factors that might have an impact on COVID-19 dynamics, like education, literacy, healthcare system quality, population age, forest area as a percentage of total land area, GDP per capita, population density and the percentage of seats held by women in parliament.

Even though the hackathon and our analysis took place early in what would become a global pandemic, we were already able to see some interesting correlations. For example, we observed an inverse correlation between the number of physicians per 1,000 people and the number of fatalities. Furthermore, we also found a correlation between the percentage of fatalities in a population and the unemployment rate.
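To make the notion of an inverse correlation concrete, here is a minimal sketch of the Pearson correlation coefficient between two indicators measured across countries. The article does not state which correlation measure was used, so Pearson is an assumption, and the numbers are synthetic values chosen only to show a perfectly inverse relationship.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Synthetic values (assumption, not real data): as the first series rises,
# the second falls linearly, so the coefficient is -1.
physicians_per_1000 = [1.0, 2.0, 3.0, 4.0]
fatalities = [8.0, 6.0, 4.0, 2.0]
r = pearson(physicians_per_1000, fatalities)
```

A coefficient near -1 or +1 indicates a strong linear relationship, but it does not by itself show that one indicator causes the other.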

Correlation between COVID-19 aggregate indicators and UN economic and social indicators (Western Europe)

CONCLUSION

While focusing on the potential methods and execution of such complex data analysis projects, the work of the Endava team already highlighted various interesting insights into how the COVID-19 epidemic affected certain areas of the EU economy as well as various social factors. Some of our findings are captured in the visualisation and can be used as a starting point for further targeted analysis.

Over the last year, the COVID-19 pandemic has disrupted the way that people live and work on a global level. Therefore, it could be very valuable to take the data analysis methods developed during the NASA hackathon, including the analysis application developed by the Endava team, and apply them to the ever-growing database of environmental, logistical and social data. This could help pinpoint COVID-19 risk factors more precisely and thereby reduce and prevent the spread of the disease. Looking to the future, robust yet flexible data analysis frameworks can also be used to support countries in their overall healthcare management systems, independent of COVID-19.

Gabriel Preda

Lead Data Scientist

Gabriel applied 20+ years ago what is now called Machine Learning to solve ill-posed inverse problems in Nondestructive Testing (NDT). He survived a PhD in computational electromagnetics, worked in academic and private research, co-founded two technology start-ups, and has been working in Software Development for more than 15 years. He is currently a Lead Data Scientist at Endava, recently working mainly on Natural Language Processing (NLP) projects for asset management, healthcare or banking clients. Gabriel is a high-profile contributor in the world of competitive machine learning and currently one of the few triple Kaggle Grandmasters. When not working in Data Science, Gabriel enjoys hiking, climbing and reading.

 
