Assessing secondary effects of earthquakes with Twitter

Social media tools give access to content published by citizens. Increasingly, people use them to describe what they see in places affected by disasters, or to report facts or rumours about a local or remote situation.
Some of this information can be valuable to emergency responders.

The key benefits are:

  • Assessing disaster impacts: early warning tools use models to predict the size of the impact, but models are never perfect. Social media content can be used to quickly assess whether the impact is as expected.
  • Assessing effectiveness of response: early response should address the most urgent needs of the affected population. Social media content should reflect this, or can highlight changing needs.

However, social media often only reflects what has already been reported in the traditional media, and it mainly shows the interests of well-connected citizens. For example, earthquakes in the US will produce a surge in social media content, even if they have only local consequences, while disastrous earthquakes in developing countries might only show a slight increase in social media content.

Filtering out irrelevant content and separating real signals from the noise in social media therefore remain major challenges.
The Joint Research Centre of the European Commission developed an approach to exploit Twitter data in earthquake disasters. For earthquakes, a lot of information is available through seismological measurements and models, including the time and location of an event. Impact models also do a good job at estimating the severity of the disaster (e.g. GDACS alerts) up to estimates of casualties and financial impact (e.g. USGS PAGER).

The likelihood of secondary effects, e.g. collapsed buildings, landslides or Natech disasters, can also be assessed by looking at the presence of nearby buildings, slopes or industrial plants. However, no model can accurately predict the occurrence of such secondary effects in near real-time. The JRC implemented the following simple approach to assess the occurrence of secondary effects:

  • Periodically retrieve Tweets containing the word 'earthquake': this is a very small subset of the total number of Tweets and avoids the need for fast processing techniques and large databases.
  • Store all words in the Tweet with a time stamp, except stopwords like 'me', 'the', 'it', 'I', etc.
  • Provide a query interface to this database that takes a date (e.g. the date of an earthquake) and a list of keywords of interest, and produces a graph of Tweet counts.
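
The storage and query steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the JRC implementation: the in-memory `WordStore` class, the `tokenize` helper, the 10-minute bucket size and the short stopword list are all hypothetical, and the retrieval of Tweets from Twitter is left out.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime, timedelta

# Tiny illustrative stopword list (assumption); a real list would be much longer.
STOPWORDS = {"me", "the", "it", "i", "a", "an", "and", "of", "in", "to", "is"}


def tokenize(text):
    """Lowercase the text and return its words, minus stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOPWORDS]


class WordStore:
    """In-memory stand-in for the database: one (timestamp, word) row per word."""

    def __init__(self):
        self.rows = []

    def add_tweet(self, timestamp, text):
        # Store each distinct word once per Tweet, so that a count means
        # "number of Tweets mentioning the word".
        for word in set(tokenize(text)):
            self.rows.append((timestamp, word))

    def counts(self, keywords, start, end, bucket=timedelta(minutes=10)):
        """Return {keyword: Counter{bucket_start: n_tweets}} for start <= t < end."""
        wanted = {k.lower() for k in keywords}
        series = defaultdict(Counter)
        for timestamp, word in self.rows:
            if word in wanted and start <= timestamp < end:
                offset = (timestamp - start) // bucket  # index of the time bucket
                series[word][start + offset * bucket] += 1
        return series
```

Each per-keyword `Counter` maps the start of a time bucket to the number of Tweets mentioning that keyword, which is the data series a graph of Tweet counts would plot.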

For each earthquake in GDACS, an automatic query is generated using information already known. The query considers the date of the earthquake (defining a time window of 1h before and 4h after the earthquake) and a predefined list of keywords including: 'landslide', 'nuclear', 'collapsed', 'tsunami'. GDACS also derives the country from the earthquake location, and the country name is used as an additional keyword.
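
As a rough sketch of how such a query could be assembled, the snippet below builds the time window and keyword list from an event record. The `Earthquake` class, its field names and the `build_query` function are assumptions made for illustration; they do not reflect the actual GDACS data model, and the keyword list contains only the four terms named above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Only the keywords named in the text; the real predefined list may be longer.
SECONDARY_EFFECT_KEYWORDS = ["landslide", "nuclear", "collapsed", "tsunami"]


@dataclass
class Earthquake:
    """Hypothetical event record; field names are assumptions, not the GDACS schema."""
    time: datetime
    country: str


def build_query(event):
    """Build the query window (1h before to 4h after) and the keyword list."""
    return {
        "start": event.time - timedelta(hours=1),
        "end": event.time + timedelta(hours=4),
        # The country of the earthquake is appended as an additional keyword.
        "keywords": SECONDARY_EFFECT_KEYWORDS + [event.country.lower()],
    }
```

For an event at 12:00, this yields a window from 11:00 to 16:00. Including the hour before the earthquake gives a baseline against which any increase in keyword counts can be judged.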

The resulting graphs, automatically available under the 'Media' tab in earthquake reports, show clearly which secondary effects actually occurred, according to social media reports. The system is currently in an experimental state, but it already provides useful and practical information. Some examples are shown below.