04 Apr 2017

Participatory mining of data streams: beyond crisis mapping

Written by: Carlos Castillo

 

Emergency situations force us to take critical decisions in a very short time. Often, the first place people turn to for information is social media. A combination of human and machine intelligence can help make sense of the immense amount of data that comes from these widely varied perspectives and posts.

 

Emergency situations force us to take critical decisions in a very short time. Is there a real danger or am I misinterpreting the situation? Should I shelter in place or move some place safer? Should I reach out to others to tell them what is going on right now, or wait until the situation is clearer? These and many other questions require information, which is sought from the sources that are most readily available: people in our immediate vicinity, people we can reach through our mobile phones, and broadcast media such as radio and television.

Social media platforms are familiar technologies for hundreds of millions of people. They mediate and concentrate social relationships online, allowing us to disseminate and gather information in near real time. It is not surprising that, as communication needs increase during a crisis or disaster, social media usage skyrockets. Hundreds of messages, photos, and videos are exchanged through channels both private and public.

Many emergencies and disasters unfold in a fairly predictable pattern. The first posts on social media carry caution and advice; they are followed by messages of solidarity (thoughts and prayers), information about people injured or dead, reports of damage to infrastructure and/or utilities (when applicable), and calls for donations of goods and services. The quality of this information, like everything else online, varies from repetitive to unique, from vague to precise, from unverified to confirmed, and so on.

Is there anything that can be done with this information to support emergency relief efforts, to make the public safer, to understand how to mitigate and perhaps prevent some of these disasters? In my opinion, the answer is yes. However, the huge amount of information posted every minute far exceeds human capacity to comprehend it, and reliably understanding a confusing situation through a set of fragmented postings far exceeds current computational capacities.

A combination of human and machine intelligence can help make sense of this data. Some systems are already in place, and there are thousands of digital volunteers who are willing to contribute. In recent years, we have seen many examples of hybrid systems applied to the processing of tweets and photos, as well as photos and video footage from UAVs.

These systems are still in their infancy, but they offer the first glimpses of an emerging paradigm. Just as Wikipedia editors have been organizing human knowledge for over a decade by performing what one could call batch processing, crisis mappers perform online processing, sometimes in near real time. The input is a fast, chaotic, rich stream of events (texts, photos, videos, data) from a variety of sources. The sought output is a slowly flowing stream of key updates that help make sense of a rapidly developing situation.
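To make the idea of a fast stream in, slow stream out a little more concrete, here is a minimal sketch in Python. It is not how any particular crisis-mapping system works: the keyword lists, the `machine_label` scorer, the `ask_volunteer` callback, and the 0.75 threshold are all hypothetical stand-ins chosen for illustration. The machine handles the messages it is confident about, drops repeats and messages with no signal, and routes the ambiguous ones to a human.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional, Tuple

# Hypothetical keyword lists; a real system would use a trained classifier.
KEYWORDS = {
    "caution_advice": ("warning", "evacuate", "avoid"),
    "casualties": ("injured", "dead", "missing"),
    "infrastructure": ("bridge", "power", "road"),
    "donations": ("donate", "supplies"),
}


@dataclass(frozen=True)
class Update:
    category: str
    text: str
    source: str  # "machine" or "human"


def machine_label(text: str) -> Tuple[Optional[str], float]:
    """Toy stand-in for a classifier: return (category, confidence)."""
    lowered = text.lower()
    hits = {cat: sum(kw in lowered for kw in kws) for cat, kws in KEYWORDS.items()}
    best = max(hits, key=hits.get)
    if hits[best] == 0:
        return None, 0.0
    # Confidence is the share of keyword hits that belong to the top category.
    return best, hits[best] / sum(hits.values())


def triage(
    messages: Iterable[str],
    ask_volunteer: Callable[[str], Optional[str]],
    threshold: float = 0.75,  # assumed cut-off, not taken from any real system
) -> Iterator[Update]:
    """Turn a fast message stream into a slower stream of labeled updates."""
    seen = set()
    for text in messages:
        key = " ".join(text.lower().split())  # crude normalization for duplicates
        if key in seen:
            continue  # repeated message: skip
        seen.add(key)
        category, confidence = machine_label(text)
        if category is None:
            continue  # no recognizable signal: drop
        if confidence >= threshold:
            yield Update(category, text, "machine")
        else:
            # Ambiguous message: defer to a human volunteer.
            human_category = ask_volunteer(text)
            if human_category is not None:
                yield Update(human_category, text, "human")
```

In use, `ask_volunteer` would be whatever puts a message in front of a digital volunteer and returns their answer (or `None` if they judge it irrelevant). The point of the design is simply that human attention, the scarce resource, is spent only where the machine is unsure.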

Crisis informatics is a fascinating topic and a huge technological challenge, which should attract some of the best minds interested in creating a new breed of socio-technological systems. The Digital Humanitarian Network is a great starting point for people willing to contribute their skills to digital response efforts. Big Crisis Data is my attempt at organizing what is known about this topic from a researcher’s perspective.

Find out more about Dr. Castillo’s Big Crisis Data.

Featured image is courtesy of Social Media Koppelingen | ATLAS Social Media | Flickr


About the Author: Carlos Castillo

Carlos Castillo is a researcher in social computing. He is a web miner with a background in information retrieval, and has been influential in the areas of web content quality and credibility, and adversarial web search. He has published more than seventy publications, with over 6,700 citations, in top-tier international conferences and journals as...

 
