High-Level Project Summary
The goal of the project is to predict the solar wind using DSCOVR data, potentially predicting another Carrington event and giving a warning beforehand.
The first challenge of the project was to calibrate the measurements from the DSCOVR satellite against those of the WIND spacecraft. The two spacecraft are located relatively close to each other in space, so the recorded signals are very similar up to a time shift caused by the varying distance between them. We used dynamic time warping (DTW) to compare the magnetic field measurements of the two spacecraft.
The second challenge was implementing a recurrent neural network (RNN) on the data in order to predict the solar wind and the required parameters: density, temperature and speed.
Link to Final Project
Link to Project "Demo"
Detailed Project Description
Introduction
A Carrington event is a strong geomagnetic storm and a major space weather event. The last such event occurred in 1859. A similar event today would cause extensive damage to telecommunications satellites, power lines and electrical systems. These could be put out of commission for months or years, with a major impact on safety and the economy.
The Deep Space Climate Observatory (DSCOVR) is a satellite for monitoring space weather and providing early warnings of solar events that could affect Earth. The satellite was successfully launched in February 2015 to the L1 Lagrange point between Earth and the Sun, and can warn us about 45 minutes in advance of a space weather event.
WIND is a spacecraft that observes very similar data to DSCOVR, except for a small offset in space and time. Although DSCOVR is a more modern satellite with a higher sampling frequency than WIND, its data is much noisier. That is why DSCOVR data is calibrated and validated by comparison to WIND data. Cleaner data from DSCOVR will allow us to make better predictions of geomagnetic storms and potentially give an earlier warning in case of another Carrington event.
In this project, we worked in Python and uploaded all our code to a GitHub repository.
First Stage - Calibration
We first prepared our data for the machine learning pipeline. We loaded the CDF files and converted the required fields to pandas DataFrames. We also created a “validate vector” that indicates whether an observation (a row in the DataFrame) is valid, meaning all of its values are present and within the valid range, and dropped the invalid observations. We did not have enough time to try it, but another interesting approach would be to “clean” the data by replacing each bad value with the median of its neighboring values.
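The validity check and the median-filling idea can be sketched as follows. This is a minimal sketch, not the code from our repository: the column names, the valid ranges in `VALID_RANGES`, and the helper names are hypothetical, and in practice the limits would come from the CDF metadata.

```python
import numpy as np
import pandas as pd

# Hypothetical valid ranges per column (assumption, for illustration only).
VALID_RANGES = {"bz_gse": (-100.0, 100.0), "proton_density": (0.0, 200.0)}


def validity_mask(df, valid_ranges):
    """Boolean Series: True where every listed column is present and in range."""
    mask = pd.Series(True, index=df.index)
    for column, (low, high) in valid_ranges.items():
        values = df[column]
        # between() is False for NaN, notna() makes the intent explicit.
        mask &= values.notna() & values.between(low, high)
    return mask


def drop_invalid(df, valid_ranges):
    """Keep only the rows flagged as valid."""
    return df[validity_mask(df, valid_ranges)].copy()


def fill_with_local_median(df, valid_ranges, window=5):
    """Alternative to dropping: replace bad values by the median of their neighbors."""
    cleaned = df.copy()
    for column, (low, high) in valid_ranges.items():
        # Mark out-of-range and missing values as NaN.
        good = cleaned[column].where(cleaned[column].between(low, high))
        # Centered rolling median that ignores the NaN entries.
        local_median = good.rolling(window, center=True, min_periods=1).median()
        cleaned[column] = good.fillna(local_median)
    return cleaned


frame = pd.DataFrame(
    {
        "bz_gse": [1.0, 2.0, -1e31, 4.0, np.nan],  # -1e31 mimics a fill value
        "proton_density": [5.0, 6.0, 7.0, 8.0, 9.0],
    }
)
print(validity_mask(frame, VALID_RANGES).tolist())
print(fill_with_local_median(frame, VALID_RANGES)["bz_gse"].tolist())
```

Dropping rows is the simpler option but leaves gaps in the time series, while the median fill keeps the sampling grid regular, which matters for the resampling and alignment steps that follow.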
In order to concatenate the DSCOVR FC1 data with the solar wind data, we had to understand how WIND timestamps correspond to DSCOVR timestamps. We therefore compared the magnetic field data recorded by DSCOVR and WIND in order to calibrate them. We performed Dynamic Time Warping (DTW) using the dtw-python library. We resampled the data from DSCOVR and WIND to the same frequency (one sample every 2 minutes) and split the data into 12-hour batches. We then applied the DTW algorithm with a slope-constrained step pattern (Sakoe and Chiba, 1978) to each batch to find a time warping. We concatenated all the results and used them to translate the timestamps of the WIND dataset to their corresponding DSCOVR timestamps.
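The per-batch alignment can be sketched as follows. This is a minimal sketch under stated assumptions, not our actual pipeline: it uses a textbook DTW with the basic symmetric step pattern written in plain NumPy instead of the slope-constrained pattern from dtw-python, and it assumes both series were already resampled onto the same 2-minute grid and have equal length. The names `dtw_path` and `align_in_batches` are hypothetical.

```python
import numpy as np


def dtw_path(query, reference):
    """Textbook DTW: return the total cost and the optimal warping path."""
    n, m = len(query), len(reference)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(query[i - 1] - reference[j - 1])
            cost[i, j] = d + min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
    # Backtrack from the end, always stepping to the cheapest predecessor.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [
            (cost[i - 1, j - 1], i - 1, j - 1),
            (cost[i - 1, j], i - 1, j),
            (cost[i, j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    path.reverse()
    return cost[n, m], path


def align_in_batches(wind, dscovr, batch_size=360):
    """For each WIND sample, return the index of a matched DSCOVR sample.

    360 samples at one sample per 2 minutes correspond to a 12-hour batch.
    """
    matched = np.empty(len(wind), dtype=int)
    for start in range(0, len(wind), batch_size):
        stop = min(start + batch_size, len(wind))
        _, path = dtw_path(wind[start:stop], dscovr[start:stop])
        for wind_idx, dscovr_idx in path:
            # If a WIND sample is matched several times, keep the last match.
            matched[start + wind_idx] = start + dscovr_idx
    return matched


wind_bz = np.array([0.0, 0.0, 1.0, 2.0, 1.0, 0.0])
dscovr_bz = np.array([0.0, 1.0, 2.0, 1.0, 0.0, 0.0])  # same shape, shifted in time
matched = align_in_batches(wind_bz, dscovr_bz, batch_size=6)
print(matched)
```

Given an array of DSCOVR timestamps on the same grid, indexing it with `matched` would give the DSCOVR timestamp assigned to each WIND sample. Working in batches keeps the quadratic cost of DTW manageable, at the price of forcing the warping path to restart at every batch boundary.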
Second Stage - Solar Wind Prediction
We planned to load the DSCOVR FC1 data and group it by timestamp, so that all channel recordings for a given timestamp end up in a single row that can be fed into the machine learning model at once.
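One way to do this grouping with pandas is sketched below. The long-format layout (one row per timestamp and channel) and the column names `timestamp`, `channel` and `current` are assumptions made for illustration; the real FC1 files may be organized differently.

```python
import pandas as pd

# Hypothetical long-format table: one row per (timestamp, channel) reading.
readings = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            [
                "2022-01-01 00:00",
                "2022-01-01 00:00",
                "2022-01-01 00:02",
                "2022-01-01 00:02",
            ]
        ),
        "channel": [0, 1, 0, 1],
        "current": [0.5, 0.7, 0.6, 0.9],
    }
)


def channels_to_rows(long_format):
    """Pivot so that each timestamp becomes one row with one column per channel."""
    wide = long_format.pivot_table(
        index="timestamp", columns="channel", values="current", aggfunc="mean"
    )
    wide.columns = [f"channel_{c}" for c in wide.columns]
    return wide.sort_index()


wide = channels_to_rows(readings)
print(wide)
```

Using `pivot_table` with an aggregation function means that duplicate readings for the same timestamp and channel are averaged instead of raising an error.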
In addition, we wanted to train an RNN to predict the ion parameters (density, temperature and speed) from the data collected by DSCOVR. RNNs are suitable for time series prediction because they can use past data when predicting the next element. Specifically, we planned to use either a GRU or an LSTM architecture. These gated architectures are also designed to be robust to time warping, which makes them well suited to our purpose, since they should be less sensitive to oscillations in DSCOVR’s location.
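A model of this kind could look roughly like the PyTorch sketch below. It is an illustration under assumptions and not the architecture from our repository: the class name `SolarWindRNN`, the hidden size, and the choice to predict from the last time step only are all hypothetical.

```python
import torch
from torch import nn


class SolarWindRNN(nn.Module):
    """Gated RNN that maps a window of measurements to the ion parameters."""

    def __init__(self, n_features, hidden_size=64, n_targets=3, cell="gru"):
        super().__init__()
        rnn_cls = nn.GRU if cell == "gru" else nn.LSTM
        self.rnn = rnn_cls(n_features, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, n_targets)

    def forward(self, x):
        # x has shape (batch, time, n_features).
        outputs, _ = self.rnn(x)
        # Use the hidden state at the last time step for the prediction.
        return self.head(outputs[:, -1, :])


def train_step(model, optimizer, features, targets):
    """Run one gradient step with a mean squared error loss."""
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(features), targets)
    loss.backward()
    optimizer.step()
    return loss.item()


model = SolarWindRNN(n_features=8, cell="lstm")
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
dummy_x = torch.randn(4, 30, 8)  # 4 windows, 30 time steps, 8 channels
dummy_y = torch.randn(4, 3)  # density, temperature, speed
print(train_step(model, optimizer, dummy_x, dummy_y))
```

Switching between the two architectures only requires changing the `cell` argument, which makes it easy to compare them on the same data.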
We intended to compare our predictions to the “ground truth” data, that is, the data from WIND. Unfortunately, due to a lack of time and computing resources, we could not run the model itself within the scope of the hackathon. However, we implemented two potential architectures (LSTM and GRU) and the code to train them. All that is missing is to prepare the data: extracting the channel data from DSCOVR, combining it with the target solar wind parameters from WIND, and then separating the features (to feed into the models) from the targets (to compare to the output).
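The missing data preparation step could be sketched as follows. This is a hedged sketch, not finished project code: it assumes both tables are already indexed by the same (calibrated) timestamps, the DSCOVR column names are made up, and the helper names `split_features_targets` and `make_windows` are hypothetical. The target column names are a subset of the WIND fields listed in the Space Agency Data section.

```python
import numpy as np
import pandas as pd

# Subset of the WIND ion parameters named in this write-up.
TARGET_COLUMNS = ["Proton_Np_nonlin", "Proton_W_nonlin", "Proton_VX_nonlin"]


def split_features_targets(dscovr, wind, target_columns):
    """Join on the shared index, then split into feature and target arrays."""
    merged = dscovr.join(wind[target_columns], how="inner").dropna()
    features = merged.drop(columns=target_columns).to_numpy(dtype=float)
    targets = merged[target_columns].to_numpy(dtype=float)
    return features, targets


def make_windows(features, targets, window):
    """Build sliding windows: X is (N, window, F), y holds the target at each window end."""
    n_windows = len(features) - window + 1
    X = np.stack([features[i : i + window] for i in range(n_windows)])
    y = targets[window - 1 :]
    return X, y


# Tiny synthetic example with an integer index standing in for timestamps.
dscovr = pd.DataFrame(
    {"channel_0": np.arange(6.0), "channel_1": np.arange(6.0) * 2}, index=range(6)
)
wind = pd.DataFrame(
    {name: np.arange(6.0) + k for k, name in enumerate(TARGET_COLUMNS)},
    index=range(1, 7),
)
features, targets = split_features_targets(dscovr, wind, TARGET_COLUMNS)
X, y = make_windows(features, targets, window=3)
print(X.shape, y.shape)
```

The inner join keeps only timestamps present in both datasets, so gaps in either source shorten the training set instead of introducing missing values.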
Suggestions for the future
We also hoped to test the architectures and compare them to simple baselines such as linear regression and a simple artificial neural network. Another direction we thought about is to use the RNNs to forecast the future solar wind, which would provide extra time for a warning signal.
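A linear regression baseline of the kind mentioned above could be as small as the following sketch. It uses ordinary least squares from NumPy on flattened input windows; the function names are hypothetical and the data is synthetic, so it says nothing about how such a baseline would perform on the real measurements.

```python
import numpy as np


def _design_matrix(X):
    """Flatten each window and append a constant column for the bias term."""
    flat = X.reshape(len(X), -1)
    return np.hstack([flat, np.ones((len(flat), 1))])


def fit_linear_baseline(X, y):
    """Ordinary least squares fit; X is (N, window, F) or (N, F), y is (N, K)."""
    weights, *_ = np.linalg.lstsq(_design_matrix(X), y, rcond=None)
    return weights


def predict_linear_baseline(weights, X):
    return _design_matrix(X) @ weights


# Synthetic data that is exactly linear, so the baseline should recover it.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4, 2))
true_weights = rng.normal(size=(8, 3))
y = X.reshape(50, -1) @ true_weights + 1.5

weights = fit_linear_baseline(X, y)
error = np.abs(predict_linear_baseline(weights, X) - y).max()
print(error)
```

Because the baseline takes the same windowed inputs as the RNNs, its error on held-out data gives a direct reference point for judging whether the recurrent models add value.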
Space Agency Data
We used the Z (last) coordinate of B1GSE from the “dscovr” dataset and the Z (last) coordinate of BGSE from the “wind mfi” dataset to get the magnetic field data from DSCOVR and WIND, respectively. This information was used in the calibration stage to determine the relevant time epochs for the later stage. Both datasets are from NASA.
We took the data for the RNN from the fc0 dataset of NOAA.
We got the WIND ion parameters (Proton_VX_nonlin, Proton_VY_nonlin, Proton_VZ_nonlin, Proton_W_nonlin, Proton_Np_nonlin) for the ground truth from the “wind swe” dataset, also from NASA.
Hackathon Journey
The Space Apps experience was really great! It was exciting to work on the project with our friends for almost 48 hours, surrounded by lots of people who are also enthusiastic about space challenges. It was also thrilling to see people from all around the world in the Discord group trying to solve the same problems as our team.
We chose this challenge because, as computer science and engineering Ph.D. students, we have experience with data science and machine learning tools. We were captivated by this project and believed we could apply methods from our fields of research to this challenge and get interesting results with potentially high impact.
We started by learning about the different datasets and filling our knowledge gaps. Then, some of the group members worked on the calibration of the data from the two sources, while the others worked on the RNN architecture and implementation. When we encountered an unsolved question, we looked for answers in the project channel on Discord or searched the relevant papers.
We would like to thank mike_from_smithsonian-SME from the global team, who helped a lot and was very responsive in the Discord chat. We would also like to thank MonkeyTech, the company that hosted us at the Tel Aviv event, and especially Gil Ayalon, who took the initiative and organized the Tel Aviv event.
References
Papers:
- Vech, D., Stevens, M. L., Paulson, K. W., Malaspina, D. M., Case, A. W., Klein, K. G., & Kasper, J. C. (2021). A powerful machine learning technique to extract proton core, beam and alpha-particle parameters from velocity distribution functions in space plasmas. arXiv preprint arXiv:2105.08651.
- Owens, M. J., & Nichols, J. D. (2021). Using in situ solar-wind observations to generate inner-boundary conditions to outer-heliosphere simulations–I. Dynamic time warping applied to synthetic observations. Monthly Notices of the Royal Astronomical Society, 508(2), 2575-2582.
- Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural computation, 9(8), 1735-1780.
- Cho, K., Van Merriënboer, B., Bahdanau, D., & Bengio, Y. (2014). On the properties of neural machine translation: Encoder-decoder approaches. arXiv preprint arXiv:1409.1259.
- Bronstein, M. M., Bruna, J., Cohen, T., & Veličković, P. (2021). Geometric deep learning: Grids, groups, graphs, geodesics, and gauges. arXiv preprint arXiv:2104.13478.
Code repositories (the full library list is in the requirements file on GitHub):
dtw-python 1.3.0
https://pypi.org/project/dtw-python/
Scikit-learn
https://scikit-learn.org/stable/
PyTorch
NumPy
pandas
SciPy
cdflib 0.4.7
https://pypi.org/project/cdflib/
Information:
DESIGN AND EARLY OBSERVATIONS FROM THE DSCOVR SOLAR WIND FARADAY CUP
https://www.swpc.noaa.gov/sites/default/files/images/u33/12-Kasper%202016-SWW-DSCOVRFC.pdf
DSCOVR Instrumentation Capabilities and Calibration Test Plan
Data:
The Wind Mission’s Magnetic Field Data Sets, BW(t)
https://cdaweb.gsfc.nasa.gov/pub/data/wind/mfi/mfi_h2/2022/
The DSCOVR Magnetic Field Data Sets, BD(t)
https://cdaweb.gsfc.nasa.gov/pub/data/dscovr/h0/mag/2022/
The Wind Mission’s Ion Parameters
https://cdaweb.gsfc.nasa.gov/pub/data/wind/swe/swe_h1/2022/
Data from DSCOVR
https://www.ngdc.noaa.gov/dscovr/portal/index.html#/
Photos for the demo:
https://en.wikipedia.org/wiki/Stellar_corona
https://www.nationalgeographic.com/science/article/sun-gallery
Tags
#software #ML #RNN #AI #solar #solarwind #spaceweather #carringtonevent #DSCVR #Wind #NOAA #IONparameters #calibration #DTW #NASA #monkeytech #geomagneticstorm #stellarcorona #astrophysics #climate #sun

