For centuries, fishermen in Peru have noticed a connection between warmer-than-usual ocean waters, now known as the El Niño phenomenon, and droughts and floods on land.
But making accurate hydrologic predictions about El Niño's impact on regional weather patterns—and even understanding the complex El Niño phenomenon itself—has stymied climate scientists for decades.
It was long believed that making the connection required developing an incredibly complex physics model, one involving hard-to-measure flows between ocean and atmosphere and between atmosphere and land, says Auroop Ganguly, co-director of Northeastern's Global Resilience Institute.
In a recent paper published in Nature Communications, Ganguly and three co-authors showed that using machine learning to crunch existing data can yield explainable insights about the impact of El Niño on the world's great river systems—the Ganges, Congo and Amazon—and ultimately, regional weather patterns.
Treasure troves of data
Climate scientists have collected a vast amount of data, based on both observation and models, related to weather patterns, ocean temperatures from across the globe, flood levels, droughts and other climate phenomena, Ganguly says.
Up to now, the treasure trove of data has not been fully exploited, he says, adding that it has sometimes been stored in separate climate science silos.
But new developments in machine learning, particularly deep learning, make it possible to use servers with high computing power to harness the immense stores of data to develop predictive algorithms, Ganguly says.
"This explainable deep learning is new," he says. "We can say that here is how sea surface temperature correlates to itself and influences river flow. And we learn that from the past. That was not in the realm of feasibility before."
Deep learning can discover what is essentially a long-distance connection between sea surface temperatures in the eastern Pacific, where El Niño or La Niña intermittently occurs, and what that implies for river flows across the world, says Ganguly, who is also the lead for Climate-AI at the Institute for Experiential AI.
Deep learning to tackle society's big challenges
"The power of these approaches is in the ability to extract this information from the vast ocean—pun intended!—of data, rather than through oversimplified indices of this complex phenomenon," he says.
Think of being able to assess what is happening by connecting ocean temperatures to cloud development to precipitation affecting rivers, in different parts of the world, says co-author Yumin Liu of Amazon, who worked on the study as a Northeastern Ph.D. student.
"Now people realize they can mutually benefit by connecting the machine learning community and the climate community," he says.
The resulting information could help stakeholders better prepare for floods, droughts and other climate events that affect lives, homes, industry, transportation and food production.
"Developing and adapting machine learning methods to address societal grand challenges is an urgent need of our age," says co-author Jennifer Dy, Northeastern electrical and computer engineering professor.
"This paper is an interesting demonstration of how data sciences, specifically deep learning and complex network constructs, can fill gaps in our predictive understanding about hydroclimates," says co-author Kate Duffy, who worked on the study as a Ph.D. student in Northeastern's Sustainability and Data Science Lab.
Global sea surface temperature fluctuations, including the El Niño–Southern Oscillation, impact interannual variability in the flow of large rivers such as the Amazon and Congo. (a) Regions for calculating El Niño–Southern Oscillation (ENSO) indices (Niño 1 + 2, Niño 3, Niño 3.4 and Niño 4) and the Indian Ocean Dipole Mode Index (DMI), and two hydrological regions (Amazon River basin and Congo River basin). The colors shown on the ocean are the annual sea surface temperature (SST) anomaly in 2008, a La Niña year. (b) Time series of standardized annual river flow in m³/s for the Amazon (green) and Congo (lime) and the monthly Oceanic Niño Index (ONI) in the Niño 3.4 region over the same time period. The ONI data are from the United States Climate Prediction Center (NOAA 2021). Warm (red) and cold (blue) periods show months that are higher than the +0.5 °C threshold or lower than the −0.5 °C threshold for a minimum of five consecutive months. A warm/cold year is a year when warm/cold anomaly months dominate, and a neutral year is a year that is neither a warm nor a cold year. For the Amazon, the river flow decreases during the warm period and increases during the cold period. However, the relations between Congo River flow and ONI are more complicated and not obvious. Credit: Nature Communications (2023). DOI: 10.1038/s41467-023-35968-5
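The warm and cold periods described in the caption follow a simple rule that can be sketched in a few lines of Python. This is only an illustration of the rule as the caption states it, not code from the study: the function names `label_oni_months` and `label_year` are hypothetical, the sample ONI values are made up, and reading "dominate" as "more warm months than cold months" (or the reverse) is an assumption.

```python
from typing import List


def label_oni_months(oni: List[float], threshold: float = 0.5,
                     min_run: int = 5) -> List[str]:
    """Label each month 'warm', 'cold' or 'neutral'.

    A month counts as warm (cold) only if it sits in a run of at least
    `min_run` consecutive months above +threshold (below -threshold).
    """
    labels = ["neutral"] * len(oni)
    start = 0
    while start < len(oni):
        if oni[start] > threshold:
            sign = "warm"
        elif oni[start] < -threshold:
            sign = "cold"
        else:
            start += 1
            continue
        # Extend the run while the anomaly stays on the same side.
        end = start
        while end < len(oni) and (
            oni[end] > threshold if sign == "warm" else oni[end] < -threshold
        ):
            end += 1
        # Keep the run only if it is long enough.
        if end - start >= min_run:
            for i in range(start, end):
                labels[i] = sign
        start = end
    return labels


def label_year(month_labels: List[str]) -> str:
    """Assumed reading of 'dominate': the sign with more months wins."""
    warm = month_labels.count("warm")
    cold = month_labels.count("cold")
    if warm > cold:
        return "warm"
    if cold > warm:
        return "cold"
    return "neutral"


# Made-up ONI values in degrees Celsius, for illustration only.
sample = [0.6] * 5 + [0.0] + [0.7] * 3 + [-0.6] * 5
months = label_oni_months(sample)
year = label_year(months[:12])
```

In this sample, the three-month stretch at 0.7 °C stays neutral because it is shorter than five months, which is the point of the consecutive-month requirement: brief excursions past the threshold are not treated as El Niño or La Niña episodes.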
Developing predictive models
Temperature is relatively easy to measure, though massive amounts of data are needed to keep track of temperatures across the planet's wide expanse of oceans.
Precipitation is more difficult to measure, because precipitation systems "can be somewhat random and evolve very rapidly," according to NASA. The deep learning model allows scientists to leverage both types of information to develop a potentially predictive model, the scientists say.
Duffy says the paper also shows that improvements in earth systems models can improve software systems, called couplers, that connect large model components such as ocean, atmospheric and land models, and facilitate better feedback and information flow. Couplers of this kind are used in earth system models such as the Energy Exascale Earth System Model.
"What the paper suggests is that in the future it could be important to examine the possibility that gaps in the science of coupling can be addressed by developing what are called hybrid physics-AI approaches, where, for example, numerical models and partial differential equation based systems can be at least partly connected via machine learning," Ganguly says.
Improving information flows
Dy, who is director of experiential AI postdoc education, says Northeastern's Institute for Experiential AI plans to work on developing "generalizable and trustworthy solutions" combining global climate models with customized machine learning.
Being able to predict information about river flows from sea surface temperature maps may appear to be a niche solution, says Ganguly, a Northeastern College of Engineering distinguished professor.
"However, it allows incredible opportunities to open up," he says.
The research shows how data-driven methods can enable improved climate-informed water resources projections, says Duffy, who recently stepped down as a NASA scientist to launch her own startup in AI-based satellite remote sensing with a NASA SBIR grant.
The intriguing possibility, according to Ganguly and Dy, is that this opens up the development of hybrid-AI systems for more effective coupling of model components within earth systems and global climate models.