Does data science help or hinder how we respond to the incredible transformation of the earth’s social and environmental systems?
Answering this question is vitally important because we live in times of profound change. The global population reached eight billion people in 2022. Many people’s lives have grown better over the last century, as witnessed by longer lifespans and socioeconomic progress. At the same time, these shifts are increasingly marked by mounting inequality and uncertainty about many foundational legal, social, and political institutions. They have also led to significant impacts on the natural world, including anthropogenic climate change, sea-level rise, landscape degradation, impaired plant and animal communities, and widespread pollution of the ground, water, and air. Of course, these environmental impacts often have unintended consequences for human well-being. Compromising earth systems reduces our access to food, fuel, fiber, and clean water and air. We are seeing these consequences now in all-too-frequent “once-in-a-lifetime” events such as rivers drying up in Europe and North America, significant deepening of long-term drought and famine in Africa, unprecedented wildfires in Australia, and catastrophic flooding in Asia.
Can data science help us understand these complicated human-environment interactions and their attendant impacts? Data science and cognate fields like big data and artificial intelligence hold much promise for studying and responding to transformations of human and ecological systems. At the same time, researchers who study human-environment systems can aid data science by offering sophisticated data, approaches, and theories that apply to pressing real-world problems.
The potential and pitfalls of data science for human-environment systems center on data, methods, theory, and policy.
Data. Data science offers exciting new kinds of information but sometimes struggles with vast and complicated human-environment datasets. A dizzying array of measurements is being gleaned from an expanding host of sensors, internet activities, and the merging of existing databases. At the same time, human-environment data are hard to handle because they span broad spatial and temporal extents and have increasingly fine resolutions. These data often suffer from bias and error that make analysis difficult. And for all our impressive capacity to collect new data, we still have huge gaps. We have studied only a tiny part of the world’s oceans and a fraction of the millions of species on earth, for example, and we are missing information on people over much of the globe, especially before the year 2000.
Method. Data science offers new and exciting ways to understand human-environment systems but faces unresolved methodological challenges. Key data science approaches, including big data, machine learning, and artificial intelligence, can balk in the face of human-environment data because these data represent complex assemblages of social and biophysical entities. Addressing these challenges involves borrowing from other fields to manage data lifecycles and engage in reproducible science. Many problems are technological in nature, as are the solutions, such as ongoing advances in high-performance and cloud computing. The data science of human-environment research is also expanding the use of smart computing and drawing on the embedded processing and sensing capacity of internet-connected devices.
Theory. Some proponents argue that data science does not need theory to advance knowledge: data and computing are all that is necessary. But this stance ignores important questions about doing science. Robust research combines data science methods like machine learning and big data with existing domain-specific knowledge of the workings of human and natural systems. Educational institutions must also better reconcile competing conceptions of what data science should do to find synergies among academic fields and commercial interests. Many intriguing conversations about theory are happening in the context of the science of cities as places where human and environmental systems interact.
Policy. Even as it is touted by many policymakers as a powerful new way to help people and the planet, data science poses profound legal, political, and ethical conundrums. Skeptics point out that the rapid and largely unregulated growth of data science has led to discrimination, data dispossession, harmful surveillance, and privacy breaches. The manner in which data science plays out is governed in part by divides between people and places in how they access and use data, and by competing public and private interests. Dilemmas and potentials of data science for policy are especially apparent in the context of sustainable development around the globe.
So, what is the answer? Does data science help or hinder how we respond to the transformation of the earth’s social and environmental systems?
The best (if unsatisfying!) answer is that it depends. Data science plays an increasingly prominent role in understanding and addressing many issues of interest to people and the planet. Serious shortcomings exist in social and environmental data, some of which we are only beginning to understand and rectify. Researchers can improve their methods and broaden engagement with the many theories that drive the data science of human-environment systems. We must also address many significant legal, ethical, and policy concerns. We need to move past the common positions of, on the one hand, hyperbolic embrace of data science and, on the other, unhelpful (if often justified) critiques. A constructive middle ground can and should be found because data science is here to stay, as are many human-environment challenges.
Author: Steven M. Manson