Kamakani o ka lā

Honolulu | Save the Earth from Another Carrington Event!

Awards & Nominations

Kamakani o ka lā has received the following awards and nominations. Way to go!

Global Finalists Honorable Mentions

Saving the Earth using AI

High-Level Project Summary

We develop a deep neural network anomaly detector and use it to detect anomalous DSCOVR data. Our model is an autoencoder trained on one day of WIND magnetic field vector data. We use the reconstruction loss (mean absolute error, MAE) with an anomaly threshold to identify anomalous DSCOVR magnetic field vector data. Because the model is trained on just one day of data, training on the full WIND data set may significantly improve performance. This project is important because the WIND and DSCOVR spacecraft observe the solar magnetic field, which is critical for forecasting "Space Weather" on Earth caused by solar coronal mass ejections.

Link to Final Project

https://github.com/ifauh/kamakaniokala

Link to Project "Demo"

https://www.canva.com/design/DAFN2CzTF7w/ZTGcAn8KoaXWx63WggVceQ/view?utm_content=DAFN2CzTF7w

Detailed Project Description

Method

Based on a Keras anomaly detection example [1], we implement an anomaly detector trained on WIND mfi data. We use the "astro-ft" data transfer script [2] to download the data in parallel from the NASA GSFC public data website using curl.

Data Preparation

We use the CDF library [3] and pycdf package [4] to access the downloaded data in Python. Following the example, we preprocess the data into successive, contiguous, 300-second sequences to form a data set comprising 86101 examples. We use a 90%/10% training/validation split yielding 77491 training and 8610 validation examples. We adapted a useful Python example [5] to convert WIND fractional day of year values into seconds. The WIND time sampling interval is different from the DSCOVR time sampling interval, which posed a challenge for us. We empirically determined a linear function to produce indices for WIND "Time_PB5" values closest to each DSCOVR whole second interval. The equation we use is idx_wind = idx_dscovr * 10.828726851851853 + 5, which gives a good linear fit [Fig. 1].
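The windowing and index mapping described above can be sketched as follows. This is a minimal illustration, not the project's code. It assumes one sample per second over a full day (86400 samples) and a stride of one, which is consistent with the 86101 sequences reported, and it assumes the fitted index is rounded to the nearest integer. The function names are hypothetical.

```python
SEQ_LEN = 300                 # samples per sequence (from the text)
SLOPE = 10.828726851851853    # empirical linear fit (from the text)
OFFSET = 5                    # empirical linear fit (from the text)


def make_sequences(values, seq_len=SEQ_LEN):
    """Return every contiguous window of seq_len samples (stride 1 is assumed)."""
    return [values[i:i + seq_len] for i in range(len(values) - seq_len + 1)]


def num_sequences(n_samples, seq_len=SEQ_LEN):
    """Number of stride-1 windows of length seq_len in n_samples samples."""
    return max(n_samples - seq_len + 1, 0)


def wind_index(idx_dscovr):
    """Map a DSCOVR whole-second index to a WIND sample index.

    Rounding to the nearest integer is an assumption.
    """
    return round(idx_dscovr * SLOPE + OFFSET)


# Small demonstration of the windowing on six samples with a window of three.
print(make_sequences(list(range(6)), seq_len=3))
# One day of per-second samples yields the sequence count quoted in the text.
print(num_sequences(86400))              # 86101
print(wind_index(0), wind_index(1))      # 5 16
```

The slope of roughly 10.83 implies about eleven WIND samples for every DSCOVR second, so the mapping simply picks one WIND sample per second.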

Model Architecture

Our anomaly detector is a 1D convolutional autoencoder [Fig. 2] with an information bottleneck in the first two layers. The information bottleneck constrains the model to learn a compressed latent representation of the data (encoder). The next two 1D transpose convolutional layers learn to reproduce the input from the latent representation (decoder). The final output transpose convolutional layer outputs (300,3) values matching the input dimension of each training sequence.

Fig. 2 Neural network model architecture - multilayer autoencoder
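To make the bottleneck concrete, the sketch below traces how the sequence length could change through the five layers. Only the layer order and the (300, 3) input and output shape come from the description above. The filter counts and strides are assumptions borrowed from the Keras example the model is based on, and "same" padding is assumed throughout.

```python
import math

# (layer kind, filters, stride). Filter counts and strides are ASSUMED from
# the Keras example; the project's actual values may differ.
LAYERS = [
    ("conv", 32, 2),
    ("conv", 16, 2),
    ("conv_transpose", 16, 2),
    ("conv_transpose", 32, 2),
    ("conv_transpose", 3, 1),   # output layer restores the 3 field components
]


def trace_shapes(input_shape, layers):
    """Return the (length, channels) shape after each layer with 'same' padding."""
    length, _ = input_shape
    shapes = []
    for kind, filters, stride in layers:
        if kind == "conv":
            length = math.ceil(length / stride)   # strided conv shortens the sequence
        else:
            length = length * stride              # transposed conv lengthens it
        shapes.append((length, filters))
    return shapes


for (kind, _, _), shape in zip(LAYERS, trace_shapes((300, 3), LAYERS)):
    print(kind, shape)
```

Under these assumptions the narrowest point is 75 time steps, and the decoder doubles the length twice to return to 300.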

Hyperparameter Optimization

We experimented with batch sizes of (128, 256, 512), learning rates of (1e-4, 5e-4, 1e-3) and dropout rates of (0, 0.2, 0.5). Our final parameters were batch size 256, learning rate 1e-3 and no dropout. Notably, and somewhat counterintuitively, removing dropout significantly reduced both overfitting and the loss at convergence, and increased the number of training epochs before early stopping. We trained for up to 50 epochs with an early stopping patience of 5.
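The early stopping rule can be illustrated in a few lines. This sketch assumes the monitored quantity is the validation loss, which the text does not state, and the loss values are made up for the example.

```python
def early_stopping(val_losses, patience=5):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    Training stops once the loss has failed to improve for `patience`
    consecutive epochs; otherwise it runs through every epoch supplied.
    """
    best_loss = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch


# Hypothetical validation losses, not values from the project.
losses = [1.0, 0.9, 0.95, 0.96, 0.97, 0.98, 0.99, 0.5]
print(early_stopping(losses, patience=5))   # (7, 2)
```

In the example the improvement at epoch 8 is never reached, which shows why a larger patience can matter when the loss plateaus before dropping again.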

Training Results

Our best model converged at epoch 48 with training loss of 4.1072e-04 and validation loss of 6.4473e-04. We achieved a very good fit to the training data set [Fig. 3]. More significantly, the fit to the out-of-distribution DSCOVR data is nearly as good as the fit to the WIND training set [Fig. 4].

Fig. 3 Fit on training data (WIND mfi)

Fig. 4 Fit on unseen inference data (DSCOVR mfi)

Anomaly Detection Method

We use the distribution of the reconstruction MAE loss [Fig. 5] to set an anomaly threshold equal to the maximum MAE on the training set. To filter a DSCOVR data sequence, we use the model to reconstruct (predict) the input sequence from itself. The reconstruction loss is the mean absolute error between the reconstructed input and the "real" input. If this loss exceeds the anomaly threshold, the sequence is considered an anomaly. We visualize the detected anomalies in a DSCOVR sample by overplotting anomalies in red [Fig. 6].

Fig. 5 Reconstruction loss distribution of WIND magnetic field vector training data

Fig. 6 DSCOVR magnetic field vector data with anomalies overplotted in red
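The thresholding step can be sketched as below. The helper names and the toy numbers are hypothetical. Averaging the absolute error over all time steps and all three field components of a sequence is an assumption about how the per-sequence MAE is computed.

```python
def sequence_mae(original, reconstructed):
    """Mean absolute error over every component of every time step."""
    total = 0.0
    count = 0
    for row, row_hat in zip(original, reconstructed):
        for value, value_hat in zip(row, row_hat):
            total += abs(value - value_hat)
            count += 1
    return total / count


def anomaly_threshold(train_losses):
    """Threshold is the largest reconstruction loss seen on the training set."""
    return max(train_losses)


def flag_anomalies(losses, threshold):
    """A sequence is anomalous when its loss exceeds the threshold."""
    return [loss > threshold for loss in losses]


# Toy two-step sequence of (Bx, By, Bz) values and its reconstruction.
original = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)]
reconstructed = [(1.5, 2.0, 3.0), (4.0, 5.0, 5.0)]
print(sequence_mae(original, reconstructed))   # 0.25

# Hypothetical per-sequence losses, not values from the project.
threshold = anomaly_threshold([0.01, 0.03, 0.02])
print(flag_anomalies([0.02, 0.05, 0.03], threshold))   # [False, True, False]
```

Using the training maximum as the threshold means no training sequence is ever flagged, so every detection on DSCOVR data is worse reconstructed than anything seen during training.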

Filtering Method

We demonstrate how to use the anomaly detector as a filter by removing the anomalous data points from the DSCOVR example and replotting the "corrected" DSCOVR and WIND data [Fig. 7].

Fig. 7 DSCOVR mfi data with anomalies removed (top, green) and same day WIND data (bottom, magenta)
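One way to turn sequence-level flags into removable data points is sketched below. The text does not say how this mapping is done. The rule used here, that a point is anomalous only if every sequence containing it is flagged, is an assumption in the spirit of the Keras example, and the function names are hypothetical.

```python
def anomalous_points(seq_flags, seq_len):
    """Indices of points whose covering sequences are ALL flagged (assumed rule)."""
    if not seq_flags:
        return []
    n_seq = len(seq_flags)
    n_points = n_seq + seq_len - 1          # stride-1 windows over the series
    flagged = []
    for i in range(n_points):
        first = max(0, i - seq_len + 1)     # first sequence containing point i
        last = min(i, n_seq - 1)            # last sequence containing point i
        if all(seq_flags[first:last + 1]):
            flagged.append(i)
    return flagged


def remove_points(values, flagged):
    """Return the series with the flagged indices dropped."""
    drop = set(flagged)
    return [v for i, v in enumerate(values) if i not in drop]


# Toy series of seven points with windows of length three.
flags = [False, True, True, True, False]
series = [10, 11, 12, 99, 14, 15, 16]
points = anomalous_points(flags, seq_len=3)
print(points)                          # [3]
print(remove_points(series, points))   # [10, 11, 12, 14, 15, 16]
```

This rule is conservative: a single flagged sequence does not remove its 300 points, only points that no unflagged sequence can vouch for.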

Tools

We used open source software, the most significant of which include Python, Jupyter, TensorFlow, Keras, Matplotlib and pycdf. We used Canva to prepare our presentation and GitHub as a code repository and for version control. We ran our Jupyter notebook in an NVIDIA docker container using an RTX-3090 GPU.

Space Agency Data

We used the following space agency data for this challenge.

WIND

https://cdaweb.gsfc.nasa.gov/cgi-bin/eval2.cgi

https://cdaweb.gsfc.nasa.gov/pub/data/wind/mfi/

https://cdaweb.gsfc.nasa.gov/pub/data/wind/swe/swe_h1/

https://cdaweb.gsfc.nasa.gov/pub/data/wind/mfi/mfi_h2/2022/wi_h2_mfi_20220514_v04.cdf

DSCOVR

https://cdaweb.gsfc.nasa.gov/pub/data/dscovr/h0/mag/

https://cdaweb.gsfc.nasa.gov/pub/data/dscovr/h0/mag/2022/dscovr_h0_mag_20220514_v01.cdf

Hackathon Journey

Our team had a lot of fun in this hackathon. We collaborated in Zoom with "stand up" meetings every 3 hours. We tended to just leave the Zoom running and hang out while we worked together. Some of us learned about the Carrington event and CMEs and how they could impact the Earth. Some focused on creative aspects of making a slide show to present our solution. Others sharpened their coding skills, especially with CDF, matplotlib, TensorFlow and Keras. The limited time was the biggest challenge. Some of us worked late into the night with little sleep.

Challenges

A technical challenge we encountered was the different time sampling for WIND and DSCOVR mfi data. We analyzed the data and ultimately used an empirical approach to find a solution.

Approach

The overall approach we took was first to select a high-level strategy (anomaly detection). Then we chose an existing neural network architecture that we could adapt to the challenge data set and objective. We set out to accomplish more than just an anomaly detector, but we ran out of time. We rescoped to focus on submitting early, then iterating to improve both our solution and our submission.