Leveraging Physics-Informed Neural Networks for Solar Wind Forecasting
I developed this work as part of my Master's thesis at the Laboratory of Artificial Intelligence and Computer Science, FEUP. It led to a publication at the European Symposium on Artificial Neural Networks, and earned me the Vestas Award for the best dissertation in my degree 😊!
Introduction #
Space weather refers to the dynamic conditions in the solar system, particularly the interactions between the solar wind — a stream of charged particles emitted by the Sun — and Earth’s magnetic field and atmosphere. Accurate space weather forecasting is crucial for mitigating potential impacts on satellite operations, communication systems, power grids, and astronaut safety.
Solar wind, flowing from the Sun’s corona, is a primary driver of these space weather disturbances. However, accurately predicting the solar wind’s behavior is challenging due to the complexity of the processes it undergoes as it travels from the Sun to Earth.
Existing models like MULTI-VP, which operate under complex pipelines such as SWiFT-Helio1D, provide accurate simulations of solar wind streams but require substantial computational resources, typically taking several hours to produce reliable forecasts [1]. This computational intensity makes rapid, real-time forecasting a significant challenge.
The Problem #
While MULTI-VP excels in accuracy, its computational demands limit its operational efficiency for real-time applications. Moreover, it sometimes produces numerical noise in its outputs, affecting the reliability of downstream models that depend on its data.
Previous attempts to mitigate computational demands have included using neural networks to generate more precise initial conditions, reducing computational time by up to 8% [6]. However, these approaches still rely on running MULTI-VP simulations, and the need for more efficient computation remains.
Additionally, purely data-driven surrogate models may produce physically inconsistent results, complicating their integration with downstream models. This inconsistency arises because traditional neural networks do not inherently respect the physical laws governing the solar wind’s behavior.
The Solution: Physics-Informed Neural Networks (PiNNs) #
To address these challenges, we propose leveraging Physics-Informed Neural Networks (PiNNs) as surrogate models for MULTI-VP. PiNNs blend data-driven machine learning techniques with physical laws, ensuring that the model’s outputs are not only accurate but also physically consistent.
What Are Physics-Informed Neural Networks? #
Physics-Informed Neural Networks are a class of neural networks that incorporate physical laws directly into the learning process. Introduced by Raissi et al. [7], PiNNs leverage both data and governing physical equations to guide the training of neural networks.
By embedding differential equations and physical constraints into the loss function, PiNNs ensure that the learned solutions are consistent with the underlying physics of the problem. This approach is particularly valuable in areas where data may be scarce or noisy, but physical laws are well understood.
In the context of solar wind forecasting, PiNNs allow us to create surrogate models that are both computationally efficient and physically accurate, bridging the gap between purely data-driven models and traditional physics-based simulations.
Methodology #
Data Preparation #
The PiNN model builds upon data generated by MULTI-VP, focusing on solar wind streams from five distinct solar events. MULTI-VP uses magnetogram-based data to map the Sun’s surface magnetic field and trace open magnetic field lines, each representing an elemental solar wind stream.
The geometry of each field line is defined by several physical properties, including:
- Distance from the Sun ( \(R\) )
- Position within the flux tube ( \(L\) )
- Magnetic field amplitude ( \(B\) )
- Inclination of the flux tube relative to the Sun ( \(\alpha\) )
- Tube expansion ratio ( \(A_{\text{exp}}\) )
These geometrical characteristics significantly influence the simulated outputs, which include the plasma density ( \(n\) ), velocity ( \(v\) ), and temperature ( \(T\) ) of the solar wind stream.
To train the PiNN, we prepared a dataset comprising these input properties and the corresponding outputs from MULTI-VP. We applied a comprehensive normalization process and used smoothing techniques to remove numerical noise from the outputs.
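As a rough illustration of this preprocessing step, the sketch below min-max normalizes a profile and smooths it with a moving average. Both method choices, the function names, the window size, and the synthetic profile are assumptions for illustration; the normalization and smoothing actually used in the thesis may differ.

```python
import numpy as np

def min_max_normalize(x):
    """Scale a 1-D profile to [0, 1]; a constant profile maps to zeros."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    if span == 0.0:
        return np.zeros_like(x)
    return (x - x.min()) / span

def moving_average(x, window=5):
    """Smooth a 1-D profile with a centered moving average (odd window).

    The edges are padded by repeating the end values, so the output
    keeps the input length.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    x = np.asarray(x, dtype=float)
    padded = np.pad(x, window // 2, mode="edge")
    kernel = np.ones(window) / window
    return np.convolve(padded, kernel, mode="valid")

# Synthetic, hypothetical velocity profile with added noise (not MULTI-VP data).
rng = np.random.default_rng(0)
clean = np.linspace(0.0, 1.0, 200) ** 0.5
noisy = clean + rng.normal(scale=0.02, size=clean.size)
smoothed = moving_average(min_max_normalize(noisy), window=9)
```

A larger window removes more noise but also flattens sharp features, so the window size would need tuning against the real profiles.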
Model Definition #
The neural network architecture is designed to handle the complexity of the data. It consists of:
- An initial hidden layer with 2056 neurons
- Two hidden layers with 1024 neurons each
- Another hidden layer with 2056 neurons
- An output layer producing predictions for \(n\), \(v\), and \(T\) along the solar wind stream
We used ReLU activation functions and batch normalization to manage gradient flow. The model was optimized using the AdamW optimizer with an initial learning rate of \(10^{-2}\), employing a learning rate scheduler and weight decay to enhance training stability.
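The architecture described above could be sketched in PyTorch roughly as follows. The framework, the input and output sizes, the ordering of batch normalization and ReLU, the weight decay value, and the scheduler type are all assumptions; only the layer widths, the AdamW optimizer, and the initial learning rate come from the text.

```python
import torch
from torch import nn

def build_surrogate(n_inputs, n_outputs, widths=(2056, 1024, 1024, 2056)):
    """MLP made of Linear -> BatchNorm -> ReLU blocks plus a linear output layer."""
    layers = []
    previous = n_inputs
    for width in widths:
        layers += [nn.Linear(previous, width), nn.BatchNorm1d(width), nn.ReLU()]
        previous = width
    layers.append(nn.Linear(previous, n_outputs))
    return nn.Sequential(*layers)

# Hypothetical sizes: 5 geometry profiles in, 3 plasma profiles (n, v, T) out,
# each sampled at n_points positions along the flux tube.
n_points = 64
model = build_surrogate(n_inputs=5 * n_points, n_outputs=3 * n_points)

# lr=1e-2 is from the text; weight_decay and the scheduler are placeholders.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-2, weight_decay=1e-2)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.5, patience=5)

batch = torch.randn(8, 5 * n_points)  # random stand-in for normalized inputs
prediction = model(batch)             # shape: (8, 3 * n_points)
```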
Incorporating Physical Laws #
To ensure physical consistency, we integrated two key physical laws into the model’s loss function:
Mass Conservation: In a steady flow along a flux tube, the mass flux per unit of magnetic flux, \(n \cdot v / B\), stays constant along the tube. The loss therefore penalizes the spread of this quantity:
$$ mc = \sigma\left( \frac{n \cdot v}{B} \right) \approx 0 $$
where \(\sigma\) denotes the standard deviation, \(n\) is the plasma density, \(v\) is the velocity, and \(B\) is the magnetic field amplitude.
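A minimal NumPy sketch of this term is shown below. Averaging the standard deviation over several streams and the array layout are assumptions, and the profiles are synthetic rather than MULTI-VP outputs.

```python
import numpy as np

def mass_conservation_loss(n, v, B):
    """Mean over streams of the standard deviation of n*v/B along each tube.

    n, v, B: arrays of shape (n_streams, n_points).
    """
    flux = n * v / B
    return float(np.mean(np.std(flux, axis=1)))

# Synthetic single stream: B falls off as 1/R^2 and n is chosen so that
# n*v/B is constant, which should give a loss of (numerically) zero.
R = np.linspace(1.0, 30.0, 50)[None, :]
B = 1.0 / R**2
v = np.full_like(B, 2.0)
n = 3.0 * B / v

conserving = mass_conservation_loss(n, v, B)
# Perturbing the density breaks the constant flux and raises the loss.
perturbed = mass_conservation_loss(n * (1.0 + 0.1 * np.sin(R)), v, B)
```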
Momentum Conservation: Represented by a simplified Magnetohydrodynamics (MHD) system of differential equations, the momentum conservation can be expressed as:
$$ pc = P_{\text{grad}} + g_{\text{term}} + v_{\text{grad}} + \nu_{\text{damp}} \approx 0 $$
where:
Pressure Gradient Term:
$$ P_{\text{grad}} = \frac{\partial (n \cdot T)}{\partial L} \cdot n $$
Gravitational Term:
$$ g_{\text{term}} = G \frac{\cos(\alpha)}{R^2} $$
with \(G\) being the gravitational constant.
Velocity Gradient Term:
$$ v_{\text{grad}} = \frac{\partial v^2}{\partial L} - v \frac{\partial v}{\partial L} $$
Viscous Damping Term:
$$ \nu_{\text{damp}} = -\nu\left( \frac{\partial^2 v}{\partial L^2} + A_{\text{exp}} \frac{\partial v}{\partial L} \right) $$
where \(\nu\) is a predefined viscosity constant.
By incorporating these physical laws as additional loss terms during training, the PiNN is guided to produce outputs that are not only accurate in terms of the data but also adhere to essential physical principles.
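To make the momentum term concrete, the sketch below evaluates the residual \(pc\) exactly as written above, using finite differences along \(L\). The default values of `G` and `nu`, the use of `np.gradient`, and the mean-squared reduction are assumptions; during training the same expression would have to be computed with differentiable operations in the deep learning framework.

```python
import numpy as np

def momentum_residual(n, v, T, L, R, alpha, A_exp, G=1.0, nu=0.1):
    """Point-wise residual pc of the simplified momentum equation.

    All profile arguments are 1-D arrays sampled along one flux tube.
    Derivatives with respect to L use finite differences (np.gradient).
    G and nu are placeholder constants in normalized units.
    """
    dv_dL = np.gradient(v, L)
    p_grad = np.gradient(n * T, L) * n               # pressure gradient term
    g_term = G * np.cos(alpha) / R**2                # gravitational term
    v_grad = np.gradient(v**2, L) - v * dv_dL        # velocity gradient term
    nu_damp = -nu * (np.gradient(dv_dL, L) + A_exp * dv_dL)  # viscous damping
    return p_grad + g_term + v_grad + nu_damp

def momentum_loss(*args, **kwargs):
    """Mean squared residual, one possible way to turn pc into a scalar loss."""
    return float(np.mean(momentum_residual(*args, **kwargs) ** 2))

# Sanity check on uniform synthetic profiles: every derivative vanishes,
# so only the gravitational term G*cos(alpha)/R^2 remains.
L = np.linspace(0.0, 10.0, 101)
R = 1.0 + L
ones = np.ones_like(L)
res = momentum_residual(n=ones, v=ones, T=ones, L=L, R=R,
                        alpha=np.zeros_like(L), A_exp=ones)
```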
The Loss Function #
The overall loss function \(L\) for training the PiNN (not to be confused with the flux-tube position \(L\) used above) is defined as:
$$ L = \lambda_s L_s + \lambda_{\text{phys}} L_{\text{phys}} $$
- \(L_s\): The supervised loss (e.g., Mean Squared Error) between the PiNN’s predictions and the MULTI-VP outputs.
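- \(L_{\text{phys}}\): The physics-informed loss derived from the conservation laws, i.e. the mass conservation term (\(L_{\text{mass}}\)), the momentum conservation term (\(L_{\text{mom}}\)), or both, depending on the model variant.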
- \(\lambda_s\) and \(\lambda_{\text{phys}}\): Hyperparameters to balance the supervised and physics-informed losses.
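The weighted sum can be sketched as below. The weights, the placeholder physics value, and the toy arrays are hypothetical; the actual \(\lambda\) values are not given here.

```python
import numpy as np

def mse(prediction, target):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((np.asarray(prediction) - np.asarray(target)) ** 2))

def total_loss(l_s, l_phys, lambda_s=1.0, lambda_phys=0.1):
    """Weighted sum L = lambda_s * L_s + lambda_phys * L_phys."""
    return lambda_s * l_s + lambda_phys * l_phys

# Toy numbers only: a three-point prediction and a made-up physics loss.
prediction = np.array([0.9, 1.1, 1.0])
target = np.array([1.0, 1.0, 1.0])
l_s = mse(prediction, target)
l_phys = 0.5  # placeholder standing in for a conservation loss
loss = total_loss(l_s, l_phys)
```

Since the conservation losses reported in the results table are orders of magnitude larger than the MSE, the two weights presumably need to compensate for that difference in scale.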
Results #
The PiNN models produced results that closely mirrored MULTI-VP’s outputs, with computation times reduced from hours to seconds. The models also showed improved physical consistency: each PiNN achieved lower conservation losses than the classical baseline on the laws it was trained to enforce.
Metric | MULTI-VP | Classical NN | Mass Conservation PiNN | Momentum Conservation PiNN | Combined PiNN |
---|---|---|---|---|---|
Mean Coefficient of Variation (MCV) | 0.392 | 0.336 | 0.311 | 0.305 | 0.319 |
Mean Squared Error (MSE) | — | \(3.60 \times 10^{-2}\) | \(3.73 \times 10^{-2}\) | \(5.85 \times 10^{-2}\) | \(3.77 \times 10^{-2}\) |
Mass Conservation Loss (\(L_{\text{mass}}\)) | \(1.47 \times 10^6\) | \(2.08 \times 10^6\) | \(1.00 \times 10^6\) | \(7.22 \times 10^6\) | \(9.83 \times 10^5\) |
Momentum Conservation Loss (\(L_{\text{mom}}\)) | \(2.84 \times 10^4\) | \(2.11 \times 10^4\) | \(1.89 \times 10^4\) | \(1.71 \times 10^2\) | \(9.45 \times 10^3\) |
Key Observations:
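- The Mass Conservation and Combined PiNNs stayed close to the classical neural network baseline in MSE (\(3.73 \times 10^{-2}\) and \(3.77 \times 10^{-2}\) versus \(3.60 \times 10^{-2}\)) while lowering both conservation losses. The Momentum Conservation PiNN reached by far the lowest momentum loss (\(1.71 \times 10^2\)), but at the cost of a higher MSE (\(5.85 \times 10^{-2}\)) and a higher mass conservation loss than the baseline.
- The Combined PiNN model, which incorporates both mass and momentum conservation, roughly halved both conservation losses relative to the classical baseline (\(9.83 \times 10^5\) versus \(2.08 \times 10^6\) for mass, \(9.45 \times 10^3\) versus \(2.11 \times 10^4\) for momentum) while maintaining comparable MSE.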
- Computation Time: All PiNN models reduced computation times from hours (MULTI-VP) to seconds, enabling real-time forecasting.
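The trade-off for the Combined PiNN can be read directly off the table. The short snippet below only redoes that arithmetic with the tabulated values; the helper name is made up for illustration.

```python
# Values copied from the results table above.
classical = {"mse": 3.60e-2, "mass": 2.08e6, "momentum": 2.11e4}
combined = {"mse": 3.77e-2, "mass": 9.83e5, "momentum": 9.45e3}

def relative_change(new, old):
    """Signed relative change of new with respect to old, in percent."""
    return 100.0 * (new - old) / old

changes = {key: relative_change(combined[key], classical[key]) for key in classical}
for key, value in changes.items():
    print(f"{key}: {value:+.1f}%")
```

This works out to roughly a 5% increase in MSE in exchange for a reduction of about 53% in the mass conservation loss and about 55% in the momentum conservation loss.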
Why This Matters #
By reducing computation times and ensuring physical consistency, the PiNN surrogate models enable real-time solar wind forecasting. This advancement could significantly enhance our ability to mitigate the risks posed by space weather on critical technologies and infrastructure.
Moreover, the physical consistency of the PiNN outputs facilitates seamless integration with downstream heliospheric models like Helio1D and EUHFORIA, potentially improving the accuracy and reliability of broader space weather predictions.
Future Directions #
The next steps involve:
- Integrating the PiNN surrogate models with heliospheric models to validate their effectiveness in larger-scale space weather predictions.
- Optimizing the network to handle even more complex physical constraints.
- Exploring the application of PiNNs to other areas in space weather modeling and beyond.
For more details, you can explore the project on GitHub.
References #
[1] Rui F Pinto and Alexis P Rouillard. A multiple flux-tube solar wind model. The Astrophysical Journal, 838(2):89, 2017.
[6] Filipa S. Barros, J. Lima, André Restivo, Rui Pinto, Paula Graça, and Murillo Villa. Using recurrent neural networks to improve initial conditions for a solar wind forecasting model. Engineering Applications of Artificial Intelligence, 133, 2024. doi:10.1016/j.engappai.2024.108266.
[7] M. Raissi, P. Perdikaris, and G. E. Karniadakis. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378:686–707, 2019.
By integrating physics into neural networks, we’re not just making predictions — we’re ensuring those predictions make sense within the laws of our universe. This blend of data and physics opens new doors for more reliable and efficient models in not just space weather forecasting, but across all domains where physical laws govern complex systems.