Towards Real-time Emergency Response using Crowd
Supported Analysis of Social Media
Jakob Rogstadius, Vassilis Kostakos
Madeira Interactive Technologies Institute
University of Madeira
9000-390 Funchal, Portugal
{jakob,vk}@m-iti.org
Jim Laredo, Maja Vukovic
IBM T.J. Watson Research Center
Hawthorne NY 10532, USA
{laredoj,maja}@us.ibm.com
ABSTRACT
This position paper outlines an ongoing research project
that aims to incorporate crowdsourcing as part of an
emergency response system. The proposed system's novelty
is that it integrates crowdsourcing into its architecture to
analyze and structure social media content posted by
microbloggers and service users, including emergency
response coordinators and victims, during the event or
disaster. An important challenge in this approach is
identifying appropriate tasks to crowdsource, and adopting
effective motivation strategies.
Author Keywords
Emergency response, social media, crowdsourcing, text
mining.
ACM Classification Keywords
H5.m. Information interfaces and presentation (e.g., HCI):
Miscellaneous.
General Terms
Design, Experimentation.
INTRODUCTION
The period of time following a natural disaster or other
large scale emergency is traditionally characterized by
individuals having limited situational awareness bound to
their immediate surroundings, combined with sparse high
level summaries provided by traditional media. More
recently, during events such as earthquakes, elections,
bushfires and terrorist attacks, people have systematically
chosen to share their knowledge on a micro level with
others through online social media such as Twitter [3,9,10].
In fact, it is often the case that reports of incidents get
published through social media before they reach regular
media. However, despite the timeliness and volume of this
new information source, it is highly challenging for users to
overview and navigate the torrent of information which can
result from such large scale events. In addition, the absence
of summaries and validity checks of claims made by posters
adds further complexity to the already challenging task. In
the near future we are likely to see an increase in volume of
produced social media content, thus further increasing the
need for improved structure and overview.
ENVISIONED SYSTEM
Architecture
We envision a system design (Figure 1) in which machine
learning and automated tools work hand in hand with a
crowdsourcing community to quickly and efficiently
organize and analyze information on microblogging
websites during crisis and emergency situations. The
system we envision has six main responsibilities or
components.
1. Collect crowd-generated data (e.g. by tracking keywords
on Twitter).
2. Make a "best attempt" at structuring the data using NLP
and other text mining techniques, as well as extracting
named entities, locations and important points in time.
3. Identify shortcomings in the collected data and the
inferred structure, formulate tasks and seek answers via
a crowdsourcing platform.
4. Integrate the new knowledge provided by the crowd into
the existing knowledge base.
5. Present aggregated and structured data back to the
community, i.e. emergency response professionals,
affected community members and others.
6. Wherever possible, support direct interaction between
users of the presentation layer and the original
information providers.
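The six responsibilities above can be read as a processing pipeline. A minimal sketch of the first four stages, assuming simple dictionary records for posts (all function names and data fields here are illustrative, not from the paper):

```python
# Hypothetical sketch of stages 1-4 of the pipeline; names are illustrative.

def collect(keywords, stream):
    """1. Collect crowd-generated data, e.g. posts matching tracked keywords."""
    return [post for post in stream
            if any(k in post["text"].lower() for k in keywords)]

def structure(posts):
    """2. Best-attempt automatic structuring (stub standing in for NLP,
    named-entity extraction, and geolocation)."""
    for post in posts:
        post.setdefault("entities", [])
        post.setdefault("location", None)
    return posts

def find_gaps(posts):
    """3. Identify shortcomings in the inferred structure and formulate
    them as crowdsourcing tasks."""
    return [{"post": p, "ask": "location"} for p in posts
            if p["location"] is None]

def integrate(posts, answers):
    """4. Merge knowledge provided by the crowd back into the records."""
    for post, answer in zip(posts, answers):
        post["location"] = answer
    return posts
```

Stages 5 and 6 (presentation and direct interaction) are user-interface concerns and are omitted from this sketch.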
Crowd in the loop
Two vital feedback loops exist in this design. The analysis
loop is one where the system gives users an improved
understanding of the event, enabling improved actions and
communication (for instance by directly addressing
messages to service users whose reports have been
collected by the system). This in turn changes the state of
the event, which is reflected by a change in the inflow of
information to the system.
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise,
or republish, to post on servers or to redistribute to lists, requires prior
specific permission and/or a fee.
CHI 2011, May 7–12, 2011, Vancouver, BC, Canada.
Copyright 2011 ACM 978-1-4503-0267-8/11/05…$10.00.
In the clarification loop, the system identifies information
gaps, contradictions, weaknesses or uncertainties in the
current information coverage of the situation. It then turns
these flaws in the knowledge base into tasks suitable for
crowdsourcing, sends them off to a crowdsourcing engine
and integrates the results back into the knowledge base. We
argue that by merging automatically aggregated information
from social media with the output of crowdsourced work,
the system can have the short processing times and
scalability of an algorithmic approach, combined with the
adaptability of humans. By integrating the crowd into the
analytic process, the system will be able to infer structure in
ways that closely match human cognitive models, even for
topic domains where training corpora are scarce.
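One concrete form the clarification loop can take is contradiction handling: when collected reports disagree, the disagreement itself becomes a crowdsourcing task, and crowd answers are integrated back by aggregation. A sketch under the assumption of simple majority voting (the paper does not prescribe a specific aggregation rule; all names here are illustrative):

```python
from collections import Counter

def detect_contradiction(reports):
    """Flag an attribute of the knowledge base whose reports disagree."""
    values = {r["value"] for r in reports}
    return len(values) > 1

def make_verification_task(attribute, reports):
    """Turn a detected contradiction into a crowdsourcing task
    (illustrative task format)."""
    return {
        "question": f"Which value of '{attribute}' is correct?",
        "options": sorted({r["value"] for r in reports}),
    }

def resolve(task, crowd_answers):
    """Integrate crowd answers back into the knowledge base by
    simple majority vote -- one possible aggregation strategy."""
    return Counter(crowd_answers).most_common(1)[0][0]
```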
Information generated through crowdsourced tasks should
ideally be fed back into the medium which formed the
original input to the system, to decouple the analysis and
knowledge representation from the presentation layer and
thereby simplify the development of clients for different
technical platforms. In fact, if gathered data and drawn
conclusions can be made publicly available (e.g. as social
media updates or RSS feeds), this knowledge can be
accessed in its raw form through any existing client. For
instance, members of the crowd can be asked to track down
images depicting an event, which the system then
automatically shares through a designated Twitter account.
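Publishing conclusions in a standard, client-agnostic format such as RSS keeps the presentation layer decoupled, as described above. A minimal sketch of rendering crowd-verified findings as an RSS 2.0 feed string (the item structure is an assumption for illustration):

```python
from xml.sax.saxutils import escape

def to_rss(title, items):
    """Render crowd-verified findings as a minimal RSS 2.0 feed string,
    so any existing feed reader can consume the system's output."""
    entries = "".join(
        f"<item><title>{escape(i['title'])}</title>"
        f"<description>{escape(i['text'])}</description></item>"
        for i in items
    )
    return (
        '<?xml version="1.0"?><rss version="2.0"><channel>'
        f"<title>{escape(title)}</title>{entries}</channel></rss>"
    )
```

A production feed would add required elements such as `link` and `pubDate`; this sketch shows only the decoupling idea.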
A significant strength of our proposed system over existing
disaster tracking systems such as Ushahidi
(www.ushahidi.com) is that it listens to communication
channels that people already use in their pre-event lives,
rather than attempting to rally information providers for a
new channel once the event has already taken place. This
social media content is available regardless of the success
and popularity of the system itself. Because the system
merely acts as a crowd- and algorithm-powered information
catalyst, it is easier to deploy, particularly during the early
stages of an event.
RESEARCH AGENDA
Related work
The proposed system builds on existing research in text
mining methods, such as clustering, named entity extraction
and relevance classification, and in particular methods
adapted for social media content [2,5,6]. Furthermore,
crowdsourcing platform design, e.g. Amazon’s Mechanical
Turk (mturk.com) and CrowdFlower (crowdflower.com)
are directly relevant to this work, as are media aggregation
systems such as Twitris+ [8] and the Europe Media
Monitor [7]. Finally, motivational factors governing the
quality and quantity of crowdsourced work, both of
extrinsic and intrinsic nature [1,4], are directly relevant.
Ongoing research
The ongoing research efforts in this project are currently
focused on measuring the interaction effects of intrinsic and
extrinsic motivation on crowdsourced work. In addition, we
are currently adapting text mining methods to streaming
social media, in ways that permit integration of a crowd-in-
the-loop at different stages of the analysis. Finally, we are
in the process of identifying types of crowdsourcing tasks
suitable for being generated by the system.
Research challenges
Several research challenges need to be
addressed. In terms of knowledge mining, we believe there
is a need for knowledge representations that support both
the identification of missing information and turning the
gaps into crowdsourced tasks. Additionally, we require
suitable techniques for keyword extraction and pruning for
high-quality topic tracking in real time, as well as
techniques for capturing the location (and possibly context)
of people contributing information to the system.

Figure 1. The proposed system architecture that integrates crowdsourcing into the analysis of social media content. Two feedback loops are present in the flow of information: the analysis loop and the clarification loop.
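The keyword pruning challenge mentioned above can be illustrated with a common streaming technique: exponentially decaying frequency counts, so that stale topics fall out of the tracked set over time. This is a generic sketch, not the paper's method; the decay and pruning parameters are assumptions:

```python
class DecayingKeywordTracker:
    """Track keyword frequencies over a stream, decaying old counts so
    that stale topics are pruned; illustrative parameters."""

    def __init__(self, decay=0.9, floor=0.5):
        self.decay = decay    # multiplier applied to all scores each tick
        self.floor = floor    # scores below this threshold are pruned
        self.scores = {}

    def observe(self, words):
        """Count keywords from a newly arrived post."""
        for w in words:
            self.scores[w] = self.scores.get(w, 0.0) + 1.0

    def tick(self):
        """Advance time: decay all scores and prune stale keywords."""
        self.scores = {w: s * self.decay
                       for w, s in self.scores.items()
                       if s * self.decay >= self.floor}

    def top(self, n=5):
        """Return the currently strongest keywords for topic tracking."""
        return sorted(self.scores, key=self.scores.get, reverse=True)[:n]
```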
In terms of representation, we expect to develop
visualization techniques for the collected data, and suitable
UIs for commenting and responding to content generated
by others or the system. A further challenge will be the
design of compact and accurate summaries of large
clusters of social media content on similar topics.
Finally, in terms of crowdsourcing, there exists an open
issue of managing the tradeoffs between quality, cost and
time needed to complete tasks, in the context of the varying
priorities that are applicable during disasters. In addition,
the critical information flow in the system must be
algorithmic, as processing bottlenecks are otherwise likely
to appear due to lack of people willing to work, or lack of
incentives to offer the workers. Part of this research must
clearly identify the functionality that belongs to this critical
path and define support tasks that can be delegated to a
crowd. Even if human intelligence is necessary for high
quality output, the system as a whole must still be
functional without workers.
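The requirement that the critical path stay algorithmic can be sketched as follows: every task is processed automatically first, and crowd answers, when available, only refine the result, so the pipeline never blocks on worker availability. This is one possible realization, not the paper's design; all names are illustrative:

```python
def process(tasks, crowd_answers, classify):
    """Process every task algorithmically first, then overwrite with
    crowd refinements where they exist -- so the system stays functional
    with zero workers. `crowd_answers` maps task ids to answers and may
    be empty; `classify` is the automatic fallback (illustrative names)."""
    # Critical, algorithmic path: always produces a result for every task.
    results = {t["id"]: classify(t) for t in tasks}
    # Non-critical crowd path: refines results where answers arrived.
    results.update({tid: ans for tid, ans in crowd_answers.items()
                    if tid in results})
    return results
```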
CONCLUSION
This paper has outlined our ongoing efforts at building a
crowd-powered system for real-time response to emergency
events by analyzing information available on social media.
We have outlined a proposed system architecture for
involving crowds in the real-time analysis, and have
designed two important feedback loops. These will enable
a) the system to give users feedback about the status of the
event, and b) the users to give feedback to the system to
improve its analysis. Finally, the paper summarizes our
initial findings, our ongoing efforts, as well as a set of
challenges that we expect to tackle in the future.
ACKNOWLEDGMENTS
This work is funded by an IBM Open Collaboration Award
and by the Portuguese Foundation for Science and
Technology (FCT) grant CMU-PT/SE/0028/2008 (Web
Security and Privacy).
AUTHOR BIOGRAPHIES
Jakob Rogstadius is pursuing a PhD in HCI at the
University of Madeira, where he conducts research on how
community generated content (such as Twitter) can be
leveraged to create situational awareness in crisis situations
through both crowdsourcing and algorithmic approaches.
Before joining the University of Madeira, he designed
visualization tools and data mining algorithms for a
company in the fuel and energy sector; and he has been
employed as research engineer at Sweden's National Center
for Visual Analytics. He holds a MSc in Media Technology
and Engineering from the University of Linköping,
Sweden, where his final thesis project dealt with applied
information visualization.
Vassilis Kostakos is an Assistant Professor in the Madeira
Interactive Technologies Institute at the University of
Madeira, and an Adjunct Assistant Professor at the Human
Computer Interaction Institute at Carnegie Mellon
University. He received an IBM Open Collaboration Award
in 2010, and is a Fellow of the Finland Distinguished
Professor Program. His research interests include: mobile
and pervasive computing, human-computer interaction,
social networks, crowdsourcing.
REFERENCES
1. von Ahn, L. Games with a purpose. Computer, IEEE
Computer Society (2006), vol. 39, issue 6, 92-94.
2. Berry, M. Survey of text mining: clustering,
classification and retrieval, Second Edition. Springer-
Verlag, New York, USA, 2004. ISBN 0-387-95563-1.
3. Burns, A., Eltham, B. Twitter free Iran: An evaluation
of Twitter’s role in public diplomacy and information
operations in Iran’s 2009 election crisis. In Record of
the Communications Policy & Research Forum 2009,
Network Insight Institute (2009), 298-310.
4. Gneezy, U., Rustichini, A. Pay enough or don't pay at
all. The Quarterly Journal of Economics, MIT Press
(2000), 791-810.
5. Kotsiantis, S. B. Supervised Machine Learning: A
Review of Classification Techniques. Informatica 31
(2007), 249-268.
6. Petrović, S., Osborne, M., Lavrenko, V. Streaming First
Story Detection with application to Twitter. In Proc.
HLT 2010, Association for Computational Linguistics
(2010), 181-189.
7. Piskorski, J., Tanev, H., Atkinson, M., van der Goot, E.
Cluster-Centric Approach to News Event Extraction. In
Proceeding of the 2008 conference on New Trends in
Multimedia and Network Information Systems, IOS
Press (2008), 276-290.
8. Sheth, A., Purohit, H., Jadhav, A., Kapanipathi, P.,
Chen, L. Understanding Events Through Analysis Of
Social Media. In Proc. WWW 2011, ACM Press (2010).
9. Slagh, C. L. Managing chaos, 140 characters at a time:
How the usage of social media in the 2010 Haiti crisis
enhanced disaster relief. Georgetown University, USA,
2010.
10. Vieweg, S., Hughes, A., Starbird, K., Palen, L.
Microblogging during two natural hazards events: what
Twitter may contribute to situational awareness. In
Proc. CHI 2010, ACM Press (2010), 1079-1088.
... In 2011, Jakon Rogstadius et al. [32] presented a paper that depicted recent trends in research area targeting to include crowdsourcing in systems for emergency response. The system is unique as it incorporates crowdsourcing in the basic structure to analyze and organize the content from the internet uploaded by numerous customers and micro-bloggers. ...
Article
Full-text available
In previous years, the world has gone through several natural disasters like Tsunami, earthquakes, floods, tornadoes, hurricanes, cyclones, etc., and manufactured disasters such as stampedes, fire, terror attacks, etc. A large number of causalities are reported, with a massive loss to life, economy, and other things. Knowing this, we should make a transit from a reliable and flexible disaster management approach to a proactive one by leveraging advances in science and technology. A colossal increase in worldwide population points out that the occurrence of a crowd at any place is becoming more and more familiar with each passing day. It is undeniable that these mass gatherings often become a source of a crowd-related catastrophe such as sudden escape, terror attacks, mob lynching, human stampede, or human crushing. Prior research on the crowd’s social, psychological, and computational dynamics has indicated that the crowd's behavior under such devastating conditions is greatly decisive for crowd safety, its access or escape from the affected region, and emergency evacuation. Despite this, there is a certain paucity of pragmatic research on extreme crowd-related use cases and how to deal with such situations effectively. Through the past years, people and media have shared the details of such happenings and their experiences on a micro-level through various social network mediums. Attempts are being made to analyze this data using advanced technological tools and methods to extract the trends out of such happenings and predict any future happenings so that countermeasures can be taken and they can be prevented. This paper makes a structured literature assessment on the current scenario and systematically surveys the studies made in this field. It paves the path for future rendezvous in this area to unearth the hidden gold mine of information along the timeline. 
Also, an attempt is made to develop a technological solution or system that may help achieve an elevated level of social security via holistic video surveillance capable of detecting any crowd related anomaly and proactively warning the concerned authorities about any such casualty. This will ensure that crowd disasters can be prevented well in time by gaining prior insights about them. The system is developed that encompasses everything from human detection, tracking, and counting to any abnormal behavior detection. The same has been achieved with 93.33% accuracy.
... In the last decade, research on the potential of ICTs in DRM has been carried out (Cioca et al., 2009;Ludwig & MATTEDI, 2018;Reddick, 2011;Wang et al., 2016), covering a wide range of uses, such as SMS (Cioca et al., 2009;Homier et al., 2018), television broadcast (Grassau et al., 2019;Segura et al., 2015;Wahyu et al., 2012), radio (Cardoso et al., 2014), social network (Doktor & Giroux, 2011;Dunn Cavelty & Giroux, 2011;Norris, 2017;Peterson et al., 2019;Rogstadius et al., 2011), e.g. Twitter (Hong et al., 2017;Layek et al., 2018;Peterson et al., 2019), as well as digital applications (Bachmann et al., 2015;Park, 2017;Verrucci et al., 2016), and the internet (Bachmann et al., 2015;Webb et al., 2010). ...
Article
Full-text available
Natural events continue to take a heavy toll on human lives. Added to this are the challenge of dynamic at-risk settings, uncertainty, and increasing threats, which demand holistic, flexible, and quickly adaptable solutions. In this context, mobile applications are strongly emerging as communication tools that can assist in disaster reduction. Yet, these have not been sufficiently evaluated. In view of this, the aim of this research is to evaluate the adequacy of mobile applications in disaster risk reduction in reference to some of the deadliest natural events. To this purpose, a two-part methodology is developed. Firstly, a random sample of applications is evaluated and contrasted with the literature. Secondly, the viability of mobile applications is determined based on the Digital Application Potential Index proposed by the authors, cross-referenced in Geographical Information Systems with the WorldRiskIndex. The results show that most mobile applications limit their coverage range to only one stage of Disaster Risk Management (DRM) and one type of hazard event, failing to address systemic risk and hampering the scale-up of humanitarian response. For these to become adequate and wide-reaching, strong policies to promote reliability, transparency, and citizen empowerment would be required. The policies establishing the use of mobile applications as a viable tool for DRM must consider reducing the prices of internet connectivity while increasing educational levels, on top of language translation. At this point, the adoption of mobile applications is unable to ensure DRM communication, especially in countries with higher-risk levels, requiring these to be complemented with auxiliary tools. Graphic abstract
... There is a massive scarcity of formalized data, such as domain-specific annotated data sets, as well as severe problems regarding the use of social media data, such as the lack of methods to filter trustful information and to exploit noisy social media data streams of high volume, in near real-time [22]. Although geotagging is supported on Twitter, only approximately 1% of published tweets are geotagged [23]. The short nature of tweets, having a maximum size of 280 characters, makes applying the current state-of-the-art machine learning algorithms-that build latent topic models-quite difficult. ...
Article
Full-text available
This research is aimed at creating and presenting DisKnow, a data extraction system with the capability of filtering and abstracting tweets, to improve community resilience and decision-making in disaster scenarios. Nowadays most people act as human sensors, exposing detailed information regarding occurring disasters, in social media. Through a pipeline of natural language processing (NLP) tools for text processing, convolutional neural networks (CNNs) for classifying and extracting disasters, and knowledge graphs (KG) for presenting connected insights, it is possible to generate real-time visual information about such disasters and affected stakeholders, to better the crisis management process, by disseminating such information to both relevant authorities and population alike. DisKnow has proved to be on par with the state-of-the-art Disaster Extraction systems, and it contributes with a way to easily manage and present such happenings.
... Social media data can be also used to incorporate crowdsourcing and guide the crowd to move to safe locations in emergency situations. The research project described by Rogstadious et al. [64] integrates real-time crowdsourcing to identify the appropriate tasks and strategies in the case of a disaster. Another interesting topic in the analysis of social media data is to determine the impact of social connections on people's mobility. ...
Article
Full-text available
Extracting features from crowd flow analysis has become an important research challenge due to its social cost and the impact of inadequate planning of high-quality services and security monitoring on the lives of citizens. This paper descriptively reviews and compares existing crowd analysis approaches based on different data sources. This survey provides the fundamentals of crowd analysis and considers three main approaches: crowd video analysis, crowd spatio-temporal analysis, and crowd social media analysis. The key research contributions in each approach are presented, and the most significant techniques and algorithms used to improve the precision of results that could be integrated into solutions to enhance the quality of services in a smart city are analyzed.
... Crowdsourcing, as introduced by Howe (2006), means to distribute a specific task to an unknown set of volunteers to solve a problem by harnessing collective contribution rather than by an individual. Another form is to perform data processing tasks like translating, filtering, tagging or classifying this content by voluntary crowd workers (Rogstadius et al., 2011). These operations can already be facilitated during the creation of social media posts by explicit user-driven assignment of predefined thematic keywords or hashtags, for instance "#flood" or "#need", to allow information that is contained in the posts to be easily extracted, filtered and computationally evaluated (Starbird and Stamberger, 2010). ...
Article
Full-text available
During and shortly after a disaster, data about the hazard and its consequences are scarce and not readily available. Information provided by eyewitnesses via social media is a valuable information source, which should be explored in a~more effective way. This research proposes a methodology that leverages social media content to support rapid inundation mapping, including inundation extent and water depth in the case of floods. The novelty of this approach is the utilization of quantitative data that are derived from photos from eyewitnesses extracted from social media posts and their integration with established data. Due to the rapid availability of these posts compared to traditional data sources such as remote sensing data, areas affected by a flood, for example, can be determined quickly. The challenge is to filter the large number of posts to a manageable amount of potentially useful inundation-related information, as well as to interpret and integrate the posts into mapping procedures in a timely manner. To support rapid inundation mapping we propose a methodology and develop "PostDistiller", a tool to filter geolocated posts from social media services which include links to photos. This spatial distributed contextualized in situ information is further explored manually. In an application case study during the June 2013 flood in central Europe we evaluate the utilization of this approach to infer spatial flood patterns and inundation depths in the city of Dresden.
... Crowdsourcing as introduced by Howe (2006) means to distribute a specific task to an unknown set of volunteers to solve a problem by harnessing collective contribution rather than by an indi-20 vidual. Another form is to perform data processing tasks like translating, filtering, tagging or classifying this content by voluntary crowd workers (Rogstadius et al., 2011). These operations can be facilitated already during creation of social media posts by explicit user-driven assignment of predefined thematic keywords or hashtags, for instance #flood or #need, to allow easy extraction, filtering and computational evaluation 25 of contained information (Starbird and Stamberger, 2010). ...
Article
Full-text available
During and shortly after a disaster data about the hazard and its consequences are scarce and not readily available. Information provided by eye-witnesses via social media are a valuable information source, which should be explored in a more effective way. This research proposes a methodology that leverages social media content to support rapid inundation mapping, including inundation extent and water depth in case of floods. The novelty of this approach is the utilization of quantitative data that are derived from photos from eye-witnesses extracted from social media posts and its integration with established data. Due to the rapid availability of these posts compared to traditional data sources such as remote sensing data, for example areas affected by a flood can be determined quickly. The challenge is to filter the large number of posts to a manageable amount of potentially useful inundation-related information as well as their timely interpretation and integration in mapping procedures. To support rapid inundation mapping we propose a methodology and develop a tool to filter geo-located posts from social media services which include links to photos. This spatial distributed contextualized in-situ information is further explored manually. In an application case study during the June 2013 flood in central Europe we evaluate the utilization of this approach to infer spatial flood patterns and inundation depths in the city of Dresden.
... The amount of social media message is still increasing, which means that a good structural representation is necessary that should either focus on visualization techniques data or clusters of similar topics [Rogstadius et al. 2011]. During disasters, use is growing largely because of the spatial distribution of people with smart-phones in a high densification and the fact that sensor networks are hard to implement and deploy [Crooks et al. 2013]. ...
Conference Paper
Full-text available
Flood risk management requires updated and accurate information about the overall situation in vulnerable areas. Social media messages are considered to be as a valuable additional source of information to complement authoritative data (e.g. in situ sensor data). In some cases, these messages might also help to complement unsuitable or incomplete sensor data, and thus a more complete description of a phenomenon can be provided. Nevertheless, it remains a difficult matter to identify information that is significant and trustworthy. This is due to the huge volume of messages that are produced and which raises issues regarding their authenticity, confidentiality, trustworthiness, ownership and quality. In light of this, this paper adopts an approach for on-the-fly prioritiza-tion of social media messages that relies on sensor data (esp. water gauges). A proof-of-concept application of our approach is outlined by means of a hypothetical scenario, which uses social media messages from Twitter as well as sensor data collected through hydrological stations networks maintained by Pegelon-line in Germany. The results have shown that our approach is able to prioritize social media messages and thus provide updated and accurate information for supporting tasks carried out by decision-makers in flood risk management.
... A number of recent works highlight the utility of this potential, for example in real-time emergency response [47,48], opportunistic data dissemination [49], crowd-supported sensing and processing [50], and many others. Crowdsourced ICT based solutions are also increasingly applied in the assistive technology arena, as evidenced by several recent works on e.g. ...
Article
Full-text available
This paper summarizes recent developments in audio and tactile feedback based assistive technologies targeting the blind community. Current technology allows applications to be efficiently distributed and run on mobile and handheld devices, even in cases where computational requirements are significant. As a result, electronic travel aids, navigational assistance modules, text-to-speech applications, as well as virtual audio displays which combine audio with haptic channels are becoming integrated into standard mobile devices. This trend, combined with the appearance of increasingly user-friendly interfaces and modes of interaction has opened a variety of new perspectives for the rehabilitation and training of users with visual impairments. The goal of this paper is to provide an overview of these developments based on recent advances in basic research and application development. Using this overview as a foundation, an agenda is outlined for future research in mobile interaction design with respect to users with special needs, as well as ultimately in relation to sensor-bridging applications in general.
Conference Paper
As the development of Web 2.0, the social media like microblog, blogs and social network have supplied a bunch of information with locations (Volunteered Geographical Information, VGI).Recent years many cases have shown that, if disaster happened, the cyber citizens will get together very quickly and share the disaster information, this results a bunch of volunteered geographical information about disaster situation which is very valuable for disaster response if this VGIs are used efficiently and properly. This project will take typhoon disaster as case study. In this paper, we study the relations between weibo messages and the real typhoon situation, we proposed an analysis framework for mine the relations between weibo messages distribution and physical space. We found that the number of the weibo messages, key words frequency and spatial temporary distribution of the messages have strong relations with the disaster spread in the real world, and this research results can improve our disaster situation awareness in the future. The achievement of the study will give a method for typhoon disaster situation awareness based on VGI from the bottom up, and will locate the disaster spot and evolution quickly which is very important for disaster response and recover.
Conference Paper
Crowdsourced mobile microtasking represents a significant opportunity in emerging economies such as India, which are characterized by high levels of mobile phone penetration and large numbers of educated people who are unemployed or underemployed. Indeed, mobile phones have been used successfully in many parts of the world for microtasking, primarily for crowdsourced data collection and text- or image-based tasks. More complex tasks, such as annotation of multimedia like audio or video, have traditionally been confined to desktop interfaces. With the rapid evolution in the multimedia capabilities of mobile phones in these geographies, we believe that the nature of microtasks carried out on these devices, as well as the design of interfaces for such microtasks, warrants investigation. In this paper we explore the design of mobile phone interfaces for a set of multimedia-based microtasks on feature phones, which represent the vast majority of multimedia-capable mobile phones in these geographies. As part of an initial study using paper prototypes, we evaluate three types of multimedia content: images, audio and video, and three interfaces for data input: Direct Entry, Scroll Key Input and Key Mapping. We observe that while there are clear interface preferences for image and audio tasks, the user preference for video tasks varies based on the 'task complexity' - the 'density' of data the annotator has to deal with. In a second study, we prototype two different interfaces for video-based annotation tasks - a single screen input method, and a two screen phased interface. We evaluate the two interface designs and the three data input methods studied earlier by means of a user study with 36 participants. Our findings show that where less dense data was concerned, participants preferred Key Mapping as the input technique.
For dense data, while participants still preferred Key Mapping, our data shows that the accuracy of data input with Key Mapping is significantly lower than with Scroll Key Input. The study also provides insight into the game plan each user develops and employs to input data. We believe these findings will enable other researchers to build effective user interfaces for mobile microtasks, and be of value to UI developers, HCI researchers and microtask designers.
Conference Paper
This paper presents a real-time and multilingual news event extraction system developed at the Joint Research Centre of the European Commission. It is capable of accurately and efficiently extracting violent and natural disaster events from online news. In particular, a linguistically relatively lightweight approach is deployed, in which clustered news are heavily exploited at all stages of processing. The paper focuses on the system's architecture, real-time news clustering, geolocating clusters, event extraction grammar development, adapting the system to the processing of new languages, cluster-level information fusion, visual event tracking and accuracy evaluation.
Article
Users are sharing vast amounts of social data through social networking platforms accessible via the Web and, increasingly, via mobile devices. This opens an exciting opportunity to extract social perceptions and obtain insights about events around us. We discuss the significant need and opportunity for analyzing event-centric user-generated content on social networks, present some of the technical challenges, and describe our approach to addressing them. This includes aggregating social data related to events of interest, along with Web resources (news, Wikipedia pages, multimedia) related to an event, and supporting analysis along spatial, temporal, thematic, and sentiment dimensions. The system is unique in its support for user-generated content both in developed countries, where Twitter is popular, and via SMS, which is popular in emerging regions.
Conference Paper
We analyze microblog posts generated during two recent, concurrent emergency events in North America via Twitter, a popular microblogging service. We focus on communications broadcast by people who were "on the ground" during the Oklahoma Grassfires of April 2009 and the Red River Floods that occurred in March and April 2009, and identify information that may contribute to enhancing situational awareness (SA). This work aims to inform next steps for extracting useful, relevant information during emergencies using information extraction (IE) techniques.
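A first step toward the information extraction the abstract above calls for can be sketched with simple pattern matching over tweet text. This is a hedged illustration only; the cue categories and regular expressions below are hypothetical, not the techniques of the cited work.

```python
import re

# Hypothetical situational-awareness cue patterns for emergency tweets.
PATTERNS = {
    "road_closure": re.compile(r"\b(road|highway|bridge)\s+\w*\s*closed\b", re.I),
    "evacuation":   re.compile(r"\bevacuat\w+\b", re.I),
    "water_level":  re.compile(r"\b(river|water)\s+(rising|cresting)\b", re.I),
}

def extract_cues(tweet):
    """Return the situational-awareness categories a tweet matches."""
    return sorted(name for name, pat in PATTERNS.items() if pat.search(tweet))
```

A production system would replace these patterns with trained extractors, but even rule-based cues can triage "on the ground" reports during an unfolding event.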
Article
Social media platforms such as Twitter pose new challenges for decision-makers in an international crisis. We examine Twitter's role during Iran's 2009 election crisis using a comparative analysis of Twitter investors, US State Department diplomats, citizen activists, and Iranian protesters and paramilitary forces. We code for key events during the election's aftermath from 12 June to 5 August 2009, and evaluate Twitter's role. Foreign policy, international political economy and historical sociology frameworks provide a deeper context for how Twitter was used by different users for defensive information operations and public diplomacy. Those who believe Twitter and other social network technologies will enable ordinary people to seize power from repressive regimes should consider the fate of Iran's protesters, some of whom paid for their enthusiastic adoption of Twitter with their lives.
Article
Economists usually assume that monetary incentives improve performance, and psychologists claim that the opposite may happen. We present and discuss a set of experiments designed to test these contrasting claims. We found that the effect of monetary compensation on performance was not monotonic. In the treatments in which money was offered, a larger amount yielded a higher performance. However, offering money did not always produce an improvement: subjects who were offered monetary incentives performed more poorly than those who were offered no compensation. Several possible interpretations of the results are discussed.
Book
As the volume of digitized textual information continues to grow, so does the critical need for designing robust and scalable indexing and search strategies/software to meet a variety of user needs. Knowledge extraction or creation from text requires systematic, yet reliable processing that can be codified and adapted for changing needs and environments. Survey of Text Mining is a comprehensive edited survey organized into three parts: Clustering and Classification; Information Extraction and Retrieval; and Trend Detection. Many of the chapters stress the practical application of software and algorithms for current and future needs in text mining. Authors from industry provide their perspectives on current approaches for large-scale text mining and obstacles that will guide R&D activity in this area for the next decade. Topics and features:
* Highlights issues such as scalability, robustness, and software tools
* Brings together recent research and techniques from academia and industry
* Examines algorithmic advances in discriminant analysis, spectral clustering, trend detection, and synonym extraction
* Includes case studies in mining Web and customer-support logs for hot-topic extraction and query characterizations
* Extensive bibliography of all references, including websites
This useful survey volume taps the expertise of academicians and industry professionals to recommend practical approaches to purifying, indexing, and mining textual information. Researchers, practitioners, and professionals involved in information retrieval, computational statistics, and data mining, who need the latest text-mining methods and algorithms, will find the book an indispensable resource.
Article
Supervised machine learning is the search for algorithms that reason from externally supplied instances to produce general hypotheses, which then make predictions about future instances. In other words, the goal of supervised learning is to build a concise model of the distribution of class labels in terms of predictor features. The resulting classifier is then used to assign class labels to the testing instances where the values of the predictor features are known, but the value of the class label is unknown. This paper describes various supervised machine learning classification techniques. Of course, a single article cannot be a complete review of all supervised machine learning classification algorithms (also known as induction algorithms), yet we hope that the references cited will cover the major theoretical issues, guiding the researcher in interesting research directions and suggesting possible bias combinations that have yet to be explored.
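The supervised-classification setting described above can be illustrated with one of the simplest induction algorithms, a nearest-neighbour classifier. This is a generic sketch with made-up training data, not an algorithm from the cited survey.

```python
import math

def predict_1nn(train, x):
    """1-nearest-neighbour: assign x the label of the closest training
    instance under Euclidean distance. The training instances are the
    'externally supplied instances'; the returned label is the prediction."""
    best_label, best_dist = None, math.inf
    for features, label in train:
        d = math.dist(features, x)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

# Hypothetical labelled instances: (feature vector, class label).
train = [
    ((0.0, 0.0), "neg"), ((0.1, 0.2), "neg"),
    ((1.0, 1.0), "pos"), ((0.9, 1.1), "pos"),
]
```

In an emergency-response pipeline, the feature vectors might encode message text (e.g. keyword counts) and the labels message categories such as "request for help" versus "status report".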
Conference Paper
With the recent rise in popularity and size of social media, there is a growing need for systems that can extract useful information from this amount of data. We address the problem of detecting new events from a stream of Twitter posts. To make event detection feasible on web-scale corpora, we present an algorithm based on locality-sensitive hashing which is able to overcome the limitations of traditional approaches, while maintaining competitive results. In particular, a comparison with a state-of-the-art system on the first story detection task shows that we achieve over an order of magnitude speedup in processing time, while retaining comparable performance. Event detection experiments on a collection of 160 million Twitter posts show that celebrity deaths are the fastest spreading news on Twitter.
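The core idea of hashing-based detection, bucketing posts so that only likely-similar ones are compared, can be sketched with a MinHash signature. This is a simplified illustration of the general technique, not the cited system's algorithm.

```python
import hashlib

def minhash_signature(tokens, num_hashes=8):
    """MinHash signature: for each seeded hash function, keep the minimum
    token hash. Posts with similar token sets tend to agree on components."""
    return tuple(
        min(int(hashlib.md5(f"{seed}:{t}".encode()).hexdigest(), 16)
            for t in tokens)
        for seed in range(num_hashes)
    )

def bucket_posts(posts):
    """Group posts whose full signatures collide. Only posts sharing a
    bucket need pairwise comparison, avoiding the all-pairs scan that
    makes naive first-story detection infeasible at web scale."""
    buckets = {}
    for post in posts:
        key = minhash_signature(post.lower().split())
        buckets.setdefault(key, []).append(post)
    return buckets
```

Real LSH implementations band the signature into several shorter keys so that partially similar posts also collide; using the full signature, as here, only groups near-identical token sets.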