A model for environmental data extraction from multimedia and its evaluation
against various chemical weather forecasting datasets
Anastasia Moumtzidou a,⁎, Victor Epitropou b, Stefanos Vrochidis a, Kostas Karatzas b, Sascha Voth c, Anastasios Bassoukos b, Jürgen Moßgraber c, Ari Karppinen d, Jaakko Kukkonen d, Ioannis Kompatsiaris a

a Information Technologies Institute, Centre for Research and Technology Hellas, Greece
b Informatics Systems and Applications Group, Aristotle University of Thessaloniki, Greece
c Fraunhofer Institute of Optronics, System Technologies and Image Exploitation, Germany
d Finnish Meteorological Institute, Helsinki, Finland
Article info

Article history: Received 31 January 2013; received in revised form 12 July 2013; accepted 20 August 2013; available online xxxx.

Keywords: Air quality; Heatmap; Image processing; OCR; Environmental; Multimedia

Abstract
Environmental data analysis and information provision are considered of great importance for people, since environmental conditions are strongly related to health issues and directly affect a variety of everyday activities. Nowadays, there are several free web-based services that provide environmental information in several formats, with map images being the most commonly used to present air quality and pollen forecasts. This format, despite being intuitive for humans, complicates the extraction and processing of the underlying data. Typical examples of this case are chemical weather forecasts, which are usually encoded as heatmaps (i.e. graphical representations of matrix data with colors), while the forecasted numerical pollutant concentrations are commonly unavailable. This work presents a model for the semi-automatic extraction of such information based on a template configuration tool, on methodologies for data reconstruction from images, and on text processing and Optical Character Recognition (OCR). The aforementioned modules are integrated into a standalone framework, which is extensively evaluated by comparing data extracted from a variety of chemical weather heatmaps against the real numerical values produced by chemical weather forecasting models. The results demonstrate satisfactory performance in terms of data recovery and positional accuracy.
© 2013 Elsevier B.V. All rights reserved.
1. Introduction
The analysis of environmental data and the generation, combination and reuse of related information, such as air pollutant concentrations, are of particular interest to people. Environmental status information (in particular, the concentration of certain pollutants in the air) is considered to be correlated with a series of health issues, such as cardiovascular and respiratory diseases; it directly affects several outdoor activities (e.g. commuting, sports, trip planning, agriculture) and is therefore strongly related to the overall quality of life. In addition, the analysis
of environmental information is often a prerequisite for the fulfillment
of legal mandates on the management and preservation of environmen-
tal quality, according to the EU's and other legal frameworks (Karatzas
and Moussiopoulos, 2000). With a view to offering personalized deci-
sion support services for people based on environmental information
regarding their everyday activities (Wanner et al., 2012) and supporting
environmental experts in air quality preservation tasks, there is a need
to extract, combine and compare complementary and competing envi-
ronmental information from several resources in order to generate
more reliable and cross-validated information on the environmental
conditions. One of the main steps towards this goal is the extraction of environmental information from heterogeneous resources.
Environmental observations are performed automatically by specialized instruments hosted in stations established by environmental organizations, while forecasts, which are used to foretell weather conditions and the levels of pollution or pollen concentration in areas of interest, are provided by environmental prediction models, whose outputs are gridded numerical data, henceforth referred to as ‘actual’ or ‘original’ data. In practice, only a few data providers make
available to the public some means of access to their actual (numerical)
forecast data, while the majority publishes the results in the form of
preprocessed images that address specific environmental pressures
(like air pollution concentrations), for specific temporal scales (usually
in the order of hours or days), and for specific geographical areas of in-
terest. However, even if the original data values of environmental infor-
mation had been available, these would commonly be presented in
various technical formats, using various coordinates and spatial resolu-
tions, different units, and several other choices (e.g., Kukkonen et al.,
⁎ Corresponding author at: Centre for Research and Technology Hellas, Information Technologies Institute, 6th km Charilaou-Thermi Road, P.O. Box 60361, 57001 Thermi, Thessaloniki, Greece. Tel.: +30 2311257746.
E-mail addresses: moumtzid@iti.gr (A. Moumtzidou), vepitrop@isag.meng.auth.gr (V. Epitropou), stefanos@iti.gr (S. Vrochidis), kkara@eng.auth.gr (K. Karatzas), sascha.voth@iosb.fraunhofer.de (S. Voth), abas@isag.meng.auth.gr (A. Bassoukos), juergen.mossgraber@iosb.fraunhofer.de (J. Moßgraber), ari.karppinen@fmi.fi (A. Karppinen), jaakko.kukkonen@fmi.fi (J. Kukkonen), ikom@iti.gr (I. Kompatsiaris).
1574-9541/$ – see front matter © 2013 Elsevier B.V. All rights reserved.
Please cite this article as: Moumtzidou, A., et al., A model for environmental data extraction from multimedia and its evaluation against various chemical weather forecasting datasets, Ecological Informatics (2013), http://dx.doi.org/10.1016/j.ecoinf.2013.08.003
2012). It can therefore be a laborious task to convert these data files to
the same harmonized format, for inter-comparison purposes. Conse-
quently, the main sources of environmental information for everyday
use are web portals and sites, which provide a variety of information
of diverse spatial and temporal nature. Although the weather forecasts
are usually presented in textual format (Moumtzidou et al., 2012b),
important environmental information such as the air quality and pollen
forecasts is encoded in multimedia formats (Karatzas, 2005). Specifically, the vast majority of such environmental data are published as static heatmaps (i.e. graphical representations of matrix data with colors), or as sequences of heatmaps (time-lapse animations). A characteristic example of a heatmap is presented in Fig. 1 (generated by the SILAM model, courtesy of FMI). However, since this information comes from different providers and is presented in a variety of mutually incompatible visual forms, it is not possible to combine it directly and compile a synthetic service that takes into account all available data sources. In order to deal with this problem, it is necessary to design and develop a model that is capable of extracting environmental information from heatmaps and translating it into a structured numerical format. The processing of images for their conversion into numerical data would comprise the core of environmental data recovery techniques, at least in the air pollution and pollen concentration domains.
In this context, this paper addresses the extraction of air quality and
pollen forecasts from heatmaps, by proposing a semi-automatic frame-
work, which consists of three main components: an annotation tool
for administrative user intervention used for generating configuration
templates for each heatmap, an Optical Character Recognition (OCR)
and text processing module used for fetching text information em-
bedded in the image and making the necessary corrections, as well
as the AirMerge heatmap processing module (Epitropou et al., 2011)
that allows for the automatic harvesting, annotation, harmonization
and reconversion of heatmaps into numerical data. The framework is
evaluated against the AirMerge system and various chemical weather
forecast datasets. It should be highlighted that the AirMerge system per se does not include an automated annotation process; any heatmap harvesting and parsing procedure must therefore be scripted manually, even though the programmatic generation of certain types of highly repetitive scripts (e.g. to handle series of images from the same provider) is possible. On the contrary, the proposed framework aims at automating this scripting process by generating the configuration scripts required by AirMerge on a per-case basis, via optical heatmap analysis, the use of graphical templates and machine automation. The outputs of the generated scripts are then compared to those obtained by using the best manually configured AirMerge scripts for a given heatmap template, and the differences in their setup and final data extraction results are discussed.
The contribution of this work is a novel framework that integrates
multimedia annotation and processing modules, in order to allow for
the semi-automatic extraction of air quality and/or pollen forecast
data presented in heatmaps. Specifically, this framework integrates
multimedia configuration components (annotation tool), advanced sys-
tems for heatmap image processing (AirMerge) and optimized OCR
techniques. These modules are integrated in a standalone, user-based
interface that allows for template-based customization of heatmaps
and thus assists in handling several formats of heatmaps. This paper
substantially extends the works presented in Moumtzidou et al.
(2012a) and Vrochidis et al. (2012), which have demonstrated the ini-
tial results of this framework, by providing an extensive evaluation,
which includes a comparative study of the proposed framework against
the manually configured AirMerge system and real numerical data pro-
vided by forecast models for a variety of providers.
This paper is structured as follows: Section 2 presents the previous
research on heatmap analysis and content extraction, Section 3 describes the results of studies on the presentation format of environmental
Fig. 1. An example of an air quality heatmap: the forecast of NO₂ concentrations (μg/m³) at 08 UTC on 6 December 2012, using the SILAM chemical transport model.
information, as well as the structure of a typical heatmap. Section 4 presents the problem and its requirements, while Section 5 describes the overall architecture, the involved modules (i.e. the annotation tool, the text extraction and processing module and the heatmap processing module) and a short comparison of the proposed system and AirMerge. The evaluation results are presented in Section 6 and, finally, Section 7 concludes the paper.
2. Previous research
The task of map analysis strongly depends on the map type and
the information we need to extract. Depending on the application, a
straightforward requirement would be to perform semantic image seg-
mentation (e.g. rivers, forests, etc.), while in the case of heatmaps it is to
transform color into numerical data. In general, the discriminating fac-
tors between map types are reflected in their scale, colorization, quality,
accuracy, topology and many other aspects. In the case of air quality
(or chemical weather) maps there are mainly two types of information
covered by the map data:
• Geographical information: points and lines describing country frontiers or other well-known points of interest or structures (e.g. sea, land, capitals) in a given coordinate system. These features can often be used as cues for manually or automatically identifying the geographical registration of a map.
• Feature information: forecasted or measured data of any kind (e.g. average temperature or pollutant concentration), which are coded via a color scale representing the measured values. Single values are referenced geographically by the color value at the corresponding geographical point.
Chemical weather maps often use raster map images to represent
measured or forecasted data. There are several approaches to extract
and digitize this image information automatically. Musavi et al. (1988) describe the process of vectorization of digital image data, whereby the geographical information, in the form of lines, is extracted and converted into storable digital vector data.
In another work (Desai et al., 2005), the authors describe an approach to efficiently identify street maps among several other images by applying image processing techniques that identify unique patterns, such as street lines, which differentiate them from all other images. For the identification of street maps, Laws' texture classification algorithm is applied in order to recognize unique image patterns such as street lines and street labels. Finally, the authors use GEOPPM, an algorithm for automatically determining the geocoordinates and scale of the maps. In another similar work (Henderson and Linton,
scale of the maps. In another similar work (Henderson and Linton,
2009), the authors use the specific knowledge of the known colorization
in USGS maps, to automatically segment these maps based on their
semantic contents (e.g. roads, rivers). Chiang and Knoblock (2006)
propose an algorithm using 2-D Discrete Cosine Transformation (DCT)
coefficients and Support Vector Machines (SVM) to classify the pixels
of lines and characters on raster maps.
In Michelson et al. (2008), the authors present an automatic ap-
proach to mine collections of maps from the web. This method harvests
images from the web and then classifies them as maps or non-maps by
comparing them to previously classified map and non-map images
using methods from Content-Based Image Retrieval (CBIR). Specifically, a voting k-Nearest Neighbor classifier is used, as it allows exploiting image similarities without explicitly modeling them, in contrast to other traditional machine learning techniques such as Support Vector Machines.
Finally, Cao and Tan (2002) improve the segmentation quality of text and graphics in color map images, in order to enhance the results of subsequent analysis processes (e.g. OCR), by selecting black or dark pixels from color maps and cleaning them of possible errors or known unwanted structures (e.g. dashed lines), so as to obtain cleaner text structures.
In addition, a specific attempt for map recognition was realized within
the context of TRECVID workshops (Smeaton et al., 2006). Specifically,
the ‘maps’ concept was evaluated in the high-level feature extraction task of TRECVID 2007 (Kraaij et al., 2007). The best performing system for the map concept was that of Yuan et al. (2007), which is based on supervised machine learning techniques over several fused visual descriptors. In another approach evaluated in TRECVID 2007 (Ngo et al.,
2007), the authors explore the upper limit of the bag-of-visual-words (BoW) approach based upon local appearance features, and evaluate several factors which could impact its performance. The proposed
system is based on the fusion of Support Vector Machine classifiers
that use BoW, spatial layout of keypoints, edge histogram, grid based
color moment and wavelet texture features. In this context, Chang
et al. (2007) developed a cross-domain SVM (CDSVM) algorithm for
adapting previously learned support vectors from one domain to help
the classification in another domain. However, these algorithms were tested on maps in general, and no testing was performed on heatmaps.
Although research work has been conducted towards the automatic
extraction of information in maps, very few works address the automatic
extraction of information from chemical weather maps or environmental
maps in general. However, such an extraction method has been included
in the European Open Access Chemical Weather Forecasting Portal (ECWFP, http://www.chemicalweather.eu/Domains), while an overview of the first version of this portal has been presented by Balk et al. (2011). In Epitropou et al. (2011, 2012), a method to reconstruct environmental data out of chemical weather images is
described and developed (AirMerge system). First, the relevant map sec-
tion is scraped from the chemical weather image. Then, disturbances are
removed and a color classification is used to classify every single data
point (pixel), to recover the measured data. With the aid of the known
geographical boundaries, given by the coordinate axes and the map projection type, the geographical position of each measured data point can be
retrieved. In the case of missing data points, a special interpolation algo-
rithm (based on a novel Artificial Neural Network algorithm developed
by the authors) is used to close these gaps. The authors in Moumtzidou
et al. (2012a) and Vrochidis et al. (2012) propose a framework that inte-
grates the system of Epitropou et al. (i.e. AirMerge system) and aims at
automating and thus facilitating its use. In both works the proposed sys-
tem is evaluated only against the AirMerge system (semi-automated
versus manual configuration), while in the current work a more exten-
sive evaluation is realized by using the original numerical values that
were generated by the corresponding forecast models as the ground
truth.
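The gap-filling step described above relies on a novel ANN-based interpolation, which is not reproduced here. As a much simpler stand-in, the sketch below fills missing grid cells (marked `None`) with the value of the nearest known cell; the function name and the `None` marker are assumptions for illustration only.

```python
def fill_gaps_nearest(grid):
    """Fill missing cells (None) with the value of the nearest known cell,
    using squared Euclidean distance on the raster indices. A simplified
    stand-in for AirMerge's ANN-based interpolation."""
    known = [(i, j, v) for i, row in enumerate(grid)
             for j, v in enumerate(row) if v is not None]
    filled = [row[:] for row in grid]
    for i, row in enumerate(grid):
        for j, v in enumerate(row):
            if v is None:
                # Nearest known cell wins; ties resolve in row-major order.
                ni, nj, nv = min(known,
                                 key=lambda k: (k[0] - i) ** 2 + (k[1] - j) ** 2)
                filled[i][j] = nv
    return filled

grid = [[1.0, None],
        [None, 4.0]]
filled = fill_gaps_nearest(grid)
```

In practice a scheme like this only closes small gaps left by overlaid lines or labels; large masked regions would need the more elaborate interpolation the authors describe.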
3. Study and description of forecasted chemical weather heatmaps
In this section we present insights into the presentation of environ-
mental information, focusing on air quality and pollen forecasts. The re-
sults of a study we have conducted on more than 60 environmental
websites (dealing with weather, air quality and pollen), as well as the
findings of previous works (Karatzas, 2005) revealed that a consider-
able share of environmental content, almost 60%, is encoded in images
and, specifically, heatmaps. Overall, it can be said (Balk et al., 2011)
that the chemical weather forecasting information is usually presented
in the form of images representing pollutant concentrations over a geo-
graphically bounded region, typically in terms of maximum or average
concentration values for the time scale of reference, which is usually
the hour or day (Epitropou et al., 2011). These providers present air
quality forecasts almost exclusively in the form of preprocessed images
with a color index scale indicating the concentration of pollutants. In
addition, they individually choose the image resolution and the color
scale employed for visualizing pollution loadings, the covered region,
as well as the geographical map projection. The mode of presentation
varies from simple web images to AJAX, Java or Adobe Flash viewers
(Kukkonen et al., 2009) and while this representation is informative
for the casual user (e.g. compared to a table with numerical values), it
has the drawback that the data are being presented in a wide range of
highly heterogeneous forms, which makes it very complicated to extract and compare their results. Moreover, some of the images are permanently marked with visible watermarks, text, lines, etc., which make the extraction phase even more challenging.
In general, the heatmaps that contain chemical weather information
are commonly static bitmap images, which represent the coverage data
(e.g. concentrations) in terms of a color-coded scale over a geographical
map. A characteristic example of such a heatmap, obtained from the SILAM FMI website (http://silam.fmi.fi/AQ_forecasts/Regional_v4_9/index.html), is depicted in Fig. 1.
In general, the information that can be embedded in a heatmap image comprises the geographical coordinates of the map, the type of environmental aspect (e.g. ozone, birch pollen), the date/time of the forecast and the color scale. After careful observation of numerous heatmaps, we conclude that the information considered of importance, besides the geographical coordinates and the concentrations, is the type of physical property (i.e. concentration of NO₂), the date/time information (i.e. 2012-12-06 08:00) and the color scale. Summarizing, the main parts of information that need to be extracted and/or processed from all images are the following:
• Heatmap: a map depicting a geographical region, with colors representing the values of an environmental quantity.
• Color scale: a range indicating the correspondence between value and color.
• Coordinate axes (x, y): indicate the geographical longitude and latitude of every map point for a specific geographic projection. On some heatmaps, the coordinates and their scale are explicit, while for others they must be deduced differently, e.g., by using known landmarks.
• Title: contains information such as the type of measured physical property, the time and date of the forecast, and additional information such as the type of measurement procedure (e.g. hourly average or daily maximum).
• Additional information: watermarks, border and coastal lines, wind fields superimposed on concentration maps and any other information that can be useful for visual interpretation and geographical registration purposes. However, this type of information is categorized as “noise” in terms of influencing the information content and representation value of the specific heatmap.
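The parts listed above map naturally onto a per-provider configuration template. The sketch below models such a template as plain Python dataclasses; all class and field names are hypothetical and do not reflect the actual AnT/AirMerge template schema.

```python
from dataclasses import dataclass, field

@dataclass
class ROI:
    """A rectangular region of interest inside the published image."""
    x: int
    y: int
    width: int
    height: int

@dataclass
class HeatmapTemplate:
    """Static layout description of one provider's heatmap images."""
    heatmap: ROI                  # the colour-coded map itself
    color_scale: ROI              # the legend strip
    x_axis: ROI                   # longitude tick labels
    y_axis: ROI                   # latitude tick labels
    title: ROI                    # physical property, date/time text
    noise_rois: list = field(default_factory=list)  # watermarks, insets, ...

# Example instantiation: a 600x400 map with a vertical legend on the right.
template = HeatmapTemplate(
    heatmap=ROI(40, 30, 600, 400),
    color_scale=ROI(660, 30, 30, 400),
    x_axis=ROI(40, 435, 600, 20),
    y_axis=ROI(10, 30, 25, 400),
    title=ROI(40, 5, 600, 20),
)
```

Because the layout is static per provider, a template like this needs to be produced only once per image type and can then drive the extraction of every subsequent forecast image.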
4. Problem statement and requirements
After having described the format of heatmaps and the type of the
encoded information (i.e. geographical and color information), we will
briefly describe the problem we address and the steps towards its
solution.
The problem can be summarized as follows: retrieve the numerical values and geographical coordinates of air pollutant concentrations (or of other environmental aspects, such as birch pollen concentration) from a heatmap, taking into consideration that the original values have been quantized in order to allow their visualization, and thus no one-to-one mapping is possible. The proposed procedure towards the solution of this problem is a four-step process and is depicted in Fig. 2. The steps, which reflect the requirements of the proposed framework, are the following:
1) Removal of noisy elements (e.g. border and coastal lines)
2) Retrieval of the heatmap's raster grid's coordinates and mapping
them to actual geographical coordinates
3) Mapping of the heatmap's pixel color to a range of values according
to the color scale
4) Retrieval of the final result, i.e. coordinates and pollutant values.
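Steps 2 and 3 above can be sketched in a few lines: a pixel is classified to the nearest legend colour (yielding a value range, since the original values were quantized), and its raster indices are mapped linearly to geographic coordinates. This is a simplified illustration assuming an equirectangular projection and a Euclidean RGB distance; the actual colour classification and projection handling of the framework are more involved.

```python
def nearest_band(rgb, legend):
    """Classify a pixel by the nearest legend colour (Euclidean in RGB).
    legend maps an RGB triple to a (low, high) concentration range;
    a range, not a single value, because the visualization quantized
    the original data."""
    def d2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return legend[min(legend, key=lambda c: d2(rgb, c))]

def pixel_to_lonlat(col, row, width, height, bounds):
    """Linear mapping of raster indices to geographic coordinates.
    bounds = (lon_min, lon_max, lat_min, lat_max); assumes an
    equirectangular (plate carrée) projection, row 0 at the top."""
    lon_min, lon_max, lat_min, lat_max = bounds
    lon = lon_min + (col + 0.5) / width * (lon_max - lon_min)
    lat = lat_max - (row + 0.5) / height * (lat_max - lat_min)
    return lon, lat

# Hypothetical two-band legend: blue = 0-10, red = 50-100 ug/m3.
legend = {(0, 0, 255): (0, 10), (255, 0, 0): (50, 100)}
band = nearest_band((250, 10, 10), legend)   # a slightly off-red pixel
```

Nearest-colour matching also gives a degree of robustness against anti-aliasing and compression artifacts, which shift pixel colours slightly away from the exact legend entries.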
5. Overall architecture of the heatmap processing model
The architecture of the proposed framework draws upon the re-
quirements that were set in the previous section. The idea is to employ
image analysis and processing techniques to map the color variations in the images onto specific categories, which directly correspond to ranges of values, in order to further automate the process supported by the AirMerge system. Normally, the latter relies on manually or programmatically prepared scripts to perform this task, but its modular architecture allows for automating it, making AirMerge suitable for use in an automated service. Such automation is crucially needed for the use
Fig. 2. Problem statement and steps involved.
of this system in an open access portal, such as the ECWFP. To this end,
optical character recognition techniques need to be employed for recog-
nizing text encoded in image format, such as image titles, dates, envi-
ronmental information and coordinates, while an annotation tool is
required to support the intervention of an administrative user. Due to
the fact that there is a large variation of images and many different
representations, there is a need for optimizing and configuring the algo-
rithms involved. Specifically, the intervention of an administrative user
is required in order to annotate and manually segment the different parts of a new image type (such as data, legend, etc.), which need to be processed by the content extraction algorithms. Ideally, the goal is to construct a complete configuration template – with metadata – for AirMerge in a more automatic way, thus limiting the user input.
The proposed system workflow and the involved modules are
depicted in Fig. 3. In order to facilitate this configuration through a
graphical user interface as already discussed, we have implemented
the “annotation tool” (AnT), which is tailored for dealing with heatmaps. The output of this tool is a configuration file that holds the static information of the image. The second module is the “text extraction and processing” module, which uses the information in the configuration file to extract data from the corresponding image. More specifically, it retrieves and analyzes the information captured in text format using text processing techniques and OCR. The third module is the “heatmap processing” module, which uses information both from the output of the “text processing” module and from the configuration file to process the heatmap located inside the image.
The input of the framework is an image containing a heatmap and
the output is an XML file, in which each geographical coordinate of
the initial heatmap is associated with a value (e.g., pollutant concentra-
tion or air quality index).
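A minimal sketch of producing such an output file, using Python's standard ElementTree, is shown below. The element and attribute names are illustrative only, since the actual output schema of the framework is not specified here; each point carries a value range rather than a single value, reflecting the quantization of the source heatmap.

```python
import xml.etree.ElementTree as ET

def to_xml(points):
    """Serialize (lon, lat, low, high) tuples into a simple XML document.
    Illustrative schema: one <point> element per grid cell, with the
    recovered concentration range as min/max attributes."""
    root = ET.Element("heatmap_data")
    for lon, lat, low, high in points:
        ET.SubElement(root, "point",
                      lon=f"{lon:.3f}", lat=f"{lat:.3f}",
                      min=f"{low}", max=f"{high}")
    return ET.tostring(root, encoding="unicode")

# One recovered grid cell near Helsinki with a 20-30 ug/m3 band.
doc = to_xml([(24.94, 60.17, 20, 30)])
```

Keeping the output as structured XML (rather than another image) is what makes the recovered datasets comparable and mergeable across providers.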
5.1. Annotation tool
To facilitate the annotation process for the user, an annotation tool
(AnT) was developed, which can easily be used to annotate heatmaps interactively. The annotation tool was realized in C++ on top of the Qt framework, which allowed for creating a platform-independent tool. To ensure expandability, the MVC (model/view/controller) pattern was used in the software design. Based on this pattern, two different interaction methods were implemented on two different data views, both of which are derived from one data structure (the loaded XML template). The first data view was implemented as a simple TreeView, which represents the underlying XML data structure and its entries as a traversable tree. The second data view was implemented as a GraphicsView, which is capable of interpreting and viewing the selected datasets graphically. This view is used to draw regions of interest (ROIs) or points of interest (POIs) as overlays over the heatmap. The initial drawing of the data is triggered by the selection of the data element (e.g. the ROI element for the legend) in the TreeView. Fig. 4 shows the AnT tool with an already loaded heatmap from the SILAM FMI website (Fig. 1).
The left section of the AnT user interface shows the loaded heatmap.
The loaded image consists of the following elements: a) the dyed map
(i.e. heatmap), b) the x and y axes of the map, c) the color legend
with its corresponding d) measurement values, and e) the title and de-
scription of the heatmap. The smaller heatmaps on the bottom left and
right are secondary information heatmaps present in this particular in-
stance of a published chemical weather image, which however are not
being considered in this particular example. After selecting a ROI ele-
ment from the pre-defined basic template, the ROI is drawn over the
heatmap as a red rectangle. Then, the user has the ability to manipulate
the ROI directly by moving the rectangle boundaries with the mouse, or
alternatively by manipulating the values in the TreeView through direct
text input. Both input methods record their changes to the same XML
template data structure and update the other data views.
5.2. Text extraction and processing module
This module is driven by the configuration file produced by the AnT
tool and focuses on retrieving the textual information captured in the
image using text extraction and processing techniques through a two-
step procedure. The first step (i.e. text extraction) includes the appli-
cation of OCR on the following parts of the input image: title, color
scale, map x and y axes, searching for potential text strings contain-
ing relevant information to the heatmap itself. The OCR software that
is used is Abbyy FineReader (http://www.abbyy.com.gr/), though any OCR module could, in theory, be plugged in. It should be noted that the OCR step is not expected to be error-free, and thus a second step (i.e. text processing) for text correction is required. In this step, we apply text processing based on heuristic rules, in order to correct (to a certain extent), extract and understand the semantic information encoded in the aforementioned locations. It should be noted that each of these locations is treated in a different way.
The module produces two output files: the first one is used as input
for the heatmap processing and holds information concerning the color
scale and the map geographical coordinates, while the second captures
general information, such as the date and the type of environmental
aspect.
In the sequel, we describe these two steps by applying them on the
characteristic heatmap example of Fig. 1 and present the results. It
should be noted that this example is very demanding, since the resolution of the text that describes the x and y axes, in particular, is very low.
5.2.1. OCR on title, color scale, axes
Based on the study on heatmaps (see Section 4), a considerable part of the meaningful information can be extracted from the text surrounding the image. More specifically, the color scale and the map axes are essential elements that provide information about the values and the geographical area covered. On the other hand, the title (usually) contains information about the environmental physical property measured and the corresponding date/time. The locations of the aforementioned image parts need to be captured in the configuration template. Therefore, we apply OCR on the aforementioned parts of the heatmap depicted in Fig. 1. Tables 1, 2, 3 and 4 contain the input and output of
Fig. 3. Overall heatmap content distillation architecture.
OCR for the title, the color scale, and the x and y axes, respectively. The values in bold indicate the errors produced by OCR. It should be noted that for the color scale and the x and y axes, we also retrieved the exact position of the text, in order to relate the latter with the corresponding colors and geographical coordinates. This is done on the grounds that it is reasonable to assume that, e.g., a number under a horizontal line in the image most likely represents a longitude value, while a number located under (or at the side of, in the case of vertical scales) the beginning of a color region in the color scale most likely represents the minimum or starting value for that color.
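This positional heuristic can be sketched as follows for a vertical colour scale: each colour band is assigned the OCR token whose vertical centre lies closest to the band's start. The token and band representations below are assumptions for illustration; real OCR output would carry full bounding boxes.

```python
def attach_labels_to_bands(tokens, band_tops):
    """tokens: [(text, y_center)] recognized beside a vertical colour bar;
    band_tops: y coordinate where each colour band begins (top edge).
    Each band receives the label whose centre is closest to its start,
    mirroring the 'number at the start of a colour region' heuristic."""
    out = {}
    for i, top in enumerate(band_tops):
        text, _ = min(tokens, key=lambda t: abs(t[1] - top))
        out[i] = text
    return out

# Three bands starting at y=98, 81, 59 and three OCR'd numbers nearby.
labels = attach_labels_to_bands(
    [("0", 100), ("10", 80), ("20", 60)], [98, 81, 59])
```

The same nearest-position matching applies to axis ticks, with x-distance to a longitude tick mark replacing the y-distance used here.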
A careful observation of Tables 1 and 2 shows that the text in the title and the color scale was identified accurately, in contrast to that of the axes. The results after processing the text on the y axis, in particular, contain many errors. This is due to the fact that the resolution of the figures along the y axis is particularly low, which makes them difficult to recognize even for the human eye. We attempt to correct these errors as far as possible in the second step.
5.2.2. Text processing on OCR results
The next step applies heuristic rules that derive from the study of the sites containing heatmaps and aim at correcting the OCR output and understanding the semantic information encoded in the aforementioned image parts. Each of these segments is treated differently, since the type of semantic information it contains differs.
5.2.2.1. Title. The title usually contains the name of the environmental aspect, the measurement units and the date/time. The measurement units are usually standard for a given environmental aspect and therefore we do not attempt to extract them. The date/time is considered the most complex element, given that it is presented in several different formats, which need to be handled separately using a trial-and-error and maximum-likelihood strategy. In order to correct possible errors in the textual format of the month, day and aspect, we apply the following procedure:
Fig. 4. Annotation tool (AnT) user interface.

1) Construct manually three English vocabularies, which are used as ground truth datasets. These vocabularies hold all the possible values of the aforementioned elements, that is, the month (e.g. January, Jan.), the day (e.g. Monday, Mon.) and the environmental aspect (e.g. O3, ozone);
2) Split the text returned by OCR into words;
3) Compare the words returned by OCR with each one of the manually constructed English ground truth sets using the Levenshtein distance metric (Levenshtein, 1966); the Levenshtein distance is a string metric for measuring the difference (distance) between two sequences (words in our case); specifically, it is calculated as the minimum number of single-character edits required to change one word into the other;
4) Correct the initial OCR result by considering the word from the ground truth dataset that has the minimum distance from it.

Table 1
Title — input image (top) and OCR output (bottom).
Forecast for NO2. Last analysis time: 20121206_00 Concentration, μgN/m3, 08Z06DEC2012
Forecast for NO2. Last analysis time: 20121206_00 Concentration, μgN/m3, 08Z06DEC2012

Table 2
Color scale — input image (top) and OCR with text position output (bottom), expressed in pixel coordinates (horizontal and vertical positions, with upper-left origin (0,0)).
0.1 0.2 0.4 0.8 1.5 2.5 4 7 15 25
Position (left, top, right, bottom): 47, 5, 90, 30 — value: 0,1
Position (left, top, right, bottom): 149, 5, 195, 30 — value: 0,2
Position (left, top, right, bottom): 248, 5, 294, 30 — value: 0.4
Position (left, top, right, bottom): 347, 5, 393, 30 — value: 0,8
Position (left, top, right, bottom): 452, 5, 495, 30 — value: 1,5
Position (left, top, right, bottom): 548, 5, 594, 30 — value: 2,5
Position (left, top, right, bottom): 662, 5, 681, 24 — value: 4-
Position (left, top, right, bottom): 764, 5, 780, 30 — value: 7
Position (left, top, right, bottom): 857, 5, 891, 31 — value: 15
Position (left, top, right, bottom): 953, 5, 990, 30 — value: 25
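Steps 1) to 4) can be sketched as follows. This is a minimal illustration, not the authors' implementation; the vocabularies shown are abbreviated placeholders.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits turning a into b."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

# Abbreviated stand-ins for the three manually built vocabularies.
MONTHS = ["January", "Jan", "February", "Feb", "December", "Dec"]
DAYS = ["Monday", "Mon", "Saturday", "Sat"]
ASPECTS = ["O3", "ozone", "NO2", "PM10", "SO2"]

def correct(word: str, vocabulary: list) -> str:
    """Step 4: replace an OCR token with its nearest vocabulary entry."""
    return min(vocabulary, key=lambda v: levenshtein(word.lower(), v.lower()))

print(correct("Decernber", MONTHS))  # OCR read 'm' as 'rn' -> "December"
```

The minimum-distance rule silently picks one candidate on ties; a real system would also need a distance threshold to avoid "correcting" words that are not in any vocabulary.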
In the specific example of Fig. 1, the OCR module recognized the date/time and aspect parameters correctly and thus no corrections were required. The information obtained from the title is the following: date/time: 2012-12-06 08:00:00, aspect: NO2.
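The trial-and-error strategy for the date/time can be sketched as below; the format list is illustrative (it includes the two formats seen in the title example) rather than the system's actual list.

```python
from datetime import datetime

# Candidate formats, tried in order; an illustrative, non-exhaustive list.
FORMATS = [
    "%Y%m%d_%H",       # e.g. "20121206_00"
    "%HZ%d%b%Y",       # e.g. "08Z06DEC2012"
    "%Y-%m-%d %H:%M",  # e.g. "2012-12-06 08:00"
]

def parse_datetime(token: str):
    """Return the first successful parse of the token, else None."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(token, fmt)
        except ValueError:
            continue
    return None

print(parse_datetime("08Z06DEC2012"))  # 2012-12-06 08:00:00
```

Trying formats in a fixed order is the simplest form of the maximum-likelihood idea; a fuller implementation would rank the candidate parses instead of taking the first success.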
5.2.2.2. Color scale. The color scale holds the mapping between color
variations in the map and aspect values. The extraction of information
from the color scale is a two-step procedure. During the first step, the results of OCR (i.e. values and positions) are corrected, while in the second step we associate values with colors. Regarding the first step, it should be noted that if the scale values change in a linear way, the most common difference among them is calculated and the scale values are then adapted accordingly. Otherwise, we do not attempt such an adaptation, since the resulting values might not be correct.
The information regarding the linearity of color scale values is provided
by the administrative user through the AnT tool. Then, the correlation of
values to colors is achieved by taking into consideration the orientation
of the color scale and by using the pixel coordinates given by OCR.
In the specific example, the values 0.8–1.5 are mapped to the color
found at the (268, 447) coordinates of the initial image. It should be
noted that since the scale values do not increase in a linear way, no
attempt is made to modify them.
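For a linear scale, the correction by the most common difference can be sketched as follows; this is a simplified illustration, and the function name and example values are hypothetical.

```python
from collections import Counter

def correct_linear_scale(values):
    """Snap OCR scale readings onto the grid implied by the most common
    difference between consecutive values (linear scales only)."""
    diffs = [round(b - a, 6) for a, b in zip(values, values[1:])]
    step = Counter(diffs).most_common(1)[0][0]
    # Rebuild the scale from the first reading; this assumes the first
    # value itself was read correctly.
    return [round(values[0] + i * step, 6) for i in range(len(values))]

# OCR misread 15 as 1.5 on a linear scale 5, 10, ..., 25:
print(correct_linear_scale([5, 10, 1.5, 20, 25]))  # [5, 10, 15, 20, 25]
```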
5.2.2.3. X and Y axes. Similar processing techniques are applied to the x and y axes, since they both represent the geographical coordinates of the map. Specifically, at least two points of the map (giving two distinct geographical coordinates), as well as their positions with respect to the map's raster (giving two distinct pixel coordinates), need to be resolved in order to successfully identify all the point coordinates through a geographical bearing extrapolation procedure. The procedure followed again includes two steps: a) correction of the errors produced by OCR and b) use of the coordinates' elements. In a similar way to the color scale processing, in order to correct the values on both axes we estimate the most common difference among the axis values and adjust the others accordingly, since the values in this case change in a linear way.
For the specific example of Fig. 1, after correcting the OCR results, we associated the geographical coordinates (9°, 70°) and (18°, 68°) with the image map pixels (162, 130) and (292, 164) respectively. It should be noted that for this specific site a lot of processing and several assumptions were required, since the OCR results for the coordinate axes (especially for the y axis) were not satisfactory.
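Given the two keypoints above, the extrapolation to every other pixel amounts to a linear mapping; a sketch assuming an axis-aligned (equirectangular) grid:

```python
def make_pixel_to_geo(p1, g1, p2, g2):
    """Build a linear pixel -> (lon, lat) mapping from two reference
    keypoints; assumes an axis-aligned equirectangular grid."""
    (x1, y1), (lon1, lat1) = p1, g1
    (x2, y2), (lon2, lat2) = p2, g2
    lon_per_px = (lon2 - lon1) / (x2 - x1)
    lat_per_px = (lat2 - lat1) / (y2 - y1)
    def to_geo(x, y):
        return (lon1 + (x - x1) * lon_per_px,
                lat1 + (y - y1) * lat_per_px)
    return to_geo

# Keypoints from the Fig. 1 example: (9°E, 70°N) at pixel (162, 130)
# and (18°E, 68°N) at pixel (292, 164).
to_geo = make_pixel_to_geo((162, 130), (9.0, 70.0), (292, 164), (18.0, 68.0))
print(to_geo(162, 130))  # (9.0, 70.0)
```

Other projections (conical, polar stereographic) would need their own deprojection formulas; the linear form only covers the equirectangular case.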
5.3. Heatmap processing module
In this section, we present the heatmap processing module that
extracts data from different models and coordinate systems. This is real-
ized by the AirMerge engine, which is a complex processing framework
with its primary purpose being the extraction of environmental data
from heatmaps, by using image segmentation, scraping and processing
algorithms. Even though it was initially designed for the extraction of chemical weather forecasting data, its methodology is generalizable to any type of heatmap that can be algorithmically processed. In addition, AirMerge implements auxiliary
functionality such as automatic harvesting of heatmaps, batch pro-
cessing of large numbers of heatmaps, and persistence of processing
results (database storage).
The most important component of AirMerge, a derivative of which
is also reused (under license) by the proposed framework, is the
AirMerge Core Engine, which performs the conversion from image
data (heatmaps) to numerical gridded data. The functionality and performance of this engine have been described in Epitropou et al. (2011, 2012) and Karatzas et al. (2011). The Core Engine performs the extraction of data from heatmaps using a processing chain that consists of two
main procedures: a) the screen scraping procedure, where raw RGB
pixel data are extracted from heatmaps, and classified according to a
color scale in order to be mapped to ranges of numerical values;
finally, this procedure includes a linear deprojection phase, where the
images' raster is interpreted as a geographical grid in a specified geo-
graphical projection, centered on reference keypoints; b) the recon-
struction of missing values and data gap procedure, which deals with
noisy elements on heatmaps.
5.3.1. Screen scraping procedure
This step handles the cropping of the original image to a region of
interest and parsing of it into a 2D data array directly mapped to the
original image's pixels. Also, it deals with the association of the color
to minimum/maximum value ranges of the air pollutant concentration
levels, which is often implied by the color scale associated with the original image. It should be noted that the information about where to crop, where each color of the legend is located, to which index it should correspond, etc., is provided by the configuration template of the AnT tool in the proposed system. In this phase, the mapping of the images' raster to a specific geographical grid is performed, since the images themselves represent a geographical region. The configuration options of AirMerge allow for choosing between the most commonly encountered geographical projections (equirectangular, conical, polar stereographic, etc.) and for choosing keypoints in the image to allow precise pixel-coordinate mapping. While the selection of keypoints is performed manually when using AirMerge as a standalone tool, in the proposed work this functionality is realized automatically with the aid of the “text processing and extraction” module.
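The color-to-value-range association of this phase can be sketched as follows; the scale and the matching tolerance are hypothetical, since the real mapping comes from the configuration template.

```python
# Hypothetical color scale: RGB color -> (min, max) concentration range.
SCALE = [
    ((0, 0, 255), (0.1, 0.2)),   # blue
    ((0, 255, 0), (0.2, 0.4)),   # green
    ((255, 0, 0), (0.4, 0.8)),   # red
]

def to_value_range(pixel, tolerance=30):
    """Map one RGB pixel to the value range of the first scale color
    within the tolerance, or None for invalid data (no scale match)."""
    for color, value_range in SCALE:
        if all(abs(p - c) <= tolerance for p, c in zip(pixel, color)):
            return value_range
    return None

print(to_value_range((0, 250, 10)))   # (0.2, 0.4)
print(to_value_range((40, 40, 40)))   # None: e.g. a border line or text
```

The tolerance absorbs anti-aliasing and compression noise around each scale color; pixels that match nothing are left for the gap-filling step described next.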
5.3.2. “Reconstruction of missing values and data gaps” procedure
This step is introduced to deal with unwanted elements such as
legends, text, geomarkings and watermarks, as well as regions that are
not part of the forecast area, which might be present after the screen scraping phase. The image pixels are classified into three main categories: valid data (with colors that satisfy the color scale's classification), invalid data (with colors not present in the color scale), and regions containing colors that are explicitly marked for exclusion, which are considered void during processing. Such marked regions are not considered part of the forecast and thus do not undergo data correction. However, regions containing unmarked invalid data are considered regions with correctable errors or “data gaps”, which can be filled in. This distinction is due to their different appearance patterns: void regions are usually extended and continuous (e.g. sea regions not covered by the forecast, but present on the map), while invalid data regions are usually smaller but more noticeable (e.g. lines, text, watermarks, etc.) and have more noise-like patterns, and thus it is more compelling to remove them by using gap-filling techniques. These techniques include traditional grid interpolation, as well as pattern-based interpolation using neural networks.

Table 3
Coordinates of x axis — input image (top) and OCR (with position) output (bottom), expressed in pixel coordinates (horizontal and vertical positions, with upper-left origin (0,0)).
6E 9E 12E 15E 18E 21E 24E 27E 30E
Position (left, top, right, bottom): 18, 3, 44, 24 — value: 6E
Position (left, top, right, bottom): 147, 3, 173, 24 — value: 9E
Position (left, top, right, bottom): 273, 3, 311, 24 — value: 1iE
Position (left, top, right, bottom): 401, 3, 441, 24 — value: 16E
Position (left, top, right, bottom): 531, 3, 570, 24 — value: 1AE
Position (left, top, right, bottom): 659, 2, 684, 24 — value: 21
Position (left, top, right, bottom): 689, 2, 702, 24 — value: E
Position (left, top, right, bottom): 788, 2, 831, 24 — value: 24E
Position (left, top, right, bottom): 918, 2, 960, 24 — value: 2?E
Position (left, top, right, bottom): 1050, 3, 1088, 24 — value: 3u£

Table 4
Coordinates of y axis — input image (left) and OCR with position output (right), expressed in pixel coordinates (horizontal and vertical positions, with upper-left origin (0,0)).
70N Position (left, top, right, bottom): 36, 69, 77, 86 — value: TON
68N Position (left, top, right, bottom): 39, 168, 77, 189 — value: WN
66N Position (left, top, right, bottom): 39, 270, 78, 287 — value: E4N
64N Position (left, top, right, bottom): 39, 369, 78, 387 — value: G4M
62N Position (left, top, right, bottom): 38, 467, 78, 489 — value: &2N
60N Position (left, top, right, bottom): 39, 570, 77, 587 — value: 60N
58N Position (left, top, right, bottom): 36, 668, 78, 690 — value: & N
56N Position (left, top, right, bottom): 35, 770, 78, 788 — value: 56N
54N Position (left, top, right, bottom): 36, 885, 77, 891 — value: −c4fl
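The valid/void/gap distinction and a single gap-filling pass can be sketched as follows; this simple neighbour average is only a stand-in for the grid- and pattern-based interpolation techniques mentioned above.

```python
def fill_gaps(grid):
    """Replace each None cell (invalid data, e.g. lines or text) with the
    average of its numeric 4-neighbours; cells marked "void" (e.g. sea
    areas outside the forecast) are left untouched."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] is None:
                neigh = [grid[nr][nc]
                         for nr, nc in ((r-1, c), (r+1, c), (r, c-1), (r, c+1))
                         if 0 <= nr < rows and 0 <= nc < cols
                         and isinstance(grid[nr][nc], float)]
                if neigh:
                    out[r][c] = sum(neigh) / len(neigh)
    return out

grid = [[1.0, 1.0, 2.0],
        [1.0, None, 2.0],          # a thin "line" artifact over the data
        ["void", "void", "void"]]  # region excluded from the forecast
print(fill_gaps(grid)[1][1])  # average of 1.0, 1.0 and 2.0
```

Wide gaps would need several passes or a proper interpolation scheme; the point here is only that void cells never contribute to, nor receive, reconstructed values.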
In order for the Core Engine module to function, it must be guided
through all the relevant details of the heatmap (position, dimension,
colors, geographical projection etc.). Normally, this is achieved via an
XML scripting subsystem, which is used as AirMerge's configuration
template. Each distinct type of heatmap needs its own scripting/config-
uration file, although similar heatmaps can use the same configuration
with no or only minor variations.
Generally speaking, whenever a new source of environmental
heatmaps is added to AirMerge's list of tasks, a new configuration tem-
plate/script (using XML syntax) must be created by hand, though it is
possibleto partially customize this template so that a series oftemplates
will be automatically produced from it. For example, the pattern of the
URLs used by a model provider to publish their own heatmaps can be
encoded in the template, and used to automatically produce variations
of the template only for the parts that vary, e.g. the resolution may be constant for all images from a given provider, but the color scale may be different for every available pollutant, and there might be several different time series available (e.g. 48 or 72 h) for the same pollutant and region.
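The automatic production of template variations can be sketched as follows; the URL pattern and parameter lists are hypothetical, not a real provider's scheme.

```python
from itertools import product

# Hypothetical provider URL pattern; only pollutant and forecast hour
# vary, everything else in the template stays constant.
PATTERN = "http://example.org/forecasts/{pollutant}_{hour:02d}.png"

def expand_templates(pollutants, hours):
    """Produce one harvesting task per (pollutant, hour) combination."""
    return [PATTERN.format(pollutant=p, hour=h)
            for p, h in product(pollutants, hours)]

urls = expand_templates(["NO2", "O3"], range(0, 72, 24))
print(len(urls))  # 2 pollutants x 3 time steps = 6 URLs
```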
The proposed framework aims at automating the creation of these configuration scripts, which can be quite time consuming and requires technical skills. Thus, any comparisons are drawn primarily between the accuracy achievable by a technically skilled human operator, who knows how to classify heatmaps and create appropriate scripts, and a semi-automated system, which instead relies on cues contained in the heatmaps themselves and on guidance by environmental operators who do not possess technical skills.
5.4. Comparison of the proposed system and AirMerge
Given that both the proposed framework and the AirMerge component can be employed to perform the same task, it is useful to list their advantages and disadvantages, in order to clarify which limitations of AirMerge the proposed architecture attempts to overcome and which errors are introduced when limiting user intervention.
The advantages of using a manually configured system (i.e. AirMerge)
are that in general, it is a very accurate system, if spot-on information
(i.e. latitude and longitude lines and their values) is available and that
it allows a skilled operator to detect optimizations and cues that are dif-
ficult for an automated system to realize e.g. template redundancy and
reuse (“master templates”), the use of unusual map projections, images
with little or no geographical cues etc. However, the main disadvantage
is that the manual configuration of the system is a laborious, time con-
suming and error prone task, while specific expertise and technical
skills are required.
On the other hand, the proposed framework further automates the data extraction procedure from heatmaps by relieving human operators of the tedious task of manual configuration, and allows usage by administrative users (i.e. environmental experts) who do not have
technical skills. However, this automation does not come without a
cost, since it is possible that error is introduced during the second mod-
ule, which includes the OCR and coordinate mapping step.
Although both systems have pros and cons, they could serve differ-
ent application needs. For instance the proposed framework could be
more useful for administrative environmental experts without technical skills, while AirMerge could certainly be used by technically qualified personnel to provide quality measurements. Table 5 contains a brief
overview of the advantages and disadvantages of the manually and
the semi-automatic configured systems.
6. Results
The evaluation of the framework is carried out in two steps with different focuses. The first step deals with evaluating the text extraction and processing module (i.e. OCR and text processing using heuristic rules) by presenting a visual assessment of the output. The final XML output of the system (i.e. the mapping of geographic coordinates to forecast values) is not provided, since its visual presentation is not informative compared to the reconstructed image, which derives from this representation and is more appropriate for visual inspection. The second step presents a direct comparison of the results of the proposed framework with those of the AirMerge system, as well as with the numerical values obtained from the corresponding forecast models.
6.1. Qualitative evaluation
The tests that have been performed during this step focus on the recognition of the x and y axes and evaluate the mapping of pixels to geographical coordinates.

Table 5
Advantages and disadvantages of the manually and semi-automatically configured systems.

Manually configured system (AirMerge)
Advantages:
• Potentially very accurate, if spot-on information is available
• A technically skilled operator can detect optimizations and cues that are difficult for an automated system to realize, e.g. template redundancy and reuse (“master templates”).
Disadvantages:
• Creating proper templates is laborious and error prone
• Incorrect assumptions on the part of the operator can lead to sub-optimal templates
• The template configuration requires technical skills and cannot easily be used by environmental experts

Semi-automatically configured system (framework)
Advantages:
• Relieves human operators from a potentially tedious task
• Significant step towards the creation of completely automated systems
• Can automatically deal with unknown/unlisted types of heatmaps
• Usable in a completely automated service
Disadvantages:
• Certain types of heatmaps do not contain enough cues for an automated system to completely analyze without manual intervention.
• Introduction of error during the geographical mapping procedure

Table 6
OCR error in websites (for each axis: original degrees, OCR-based estimation, absolute error).

Site        Longitude: orig.  Estimation  Abs. error   Latitude: orig.  Estimation  Abs. error
FMI Pollen  5                 4.98775     0.01225      5                4.98404     0.01596
FMI         5                 4.97516     0.02484      5                4.96523     0.03477
GEMs        5                 5.0634      0.0125       5                4.9776      0.0045
LAPS        2                 1.9924      0.004        1                1.0236      0.023
AOPG        2                 1.9937      0.003        1                0.9958      0.004

Given the fact that in this case we are not
aware of the original forecasted data that were used for constructing
the heatmap values, we assess the results by visual comparison of the
original image and the one produced by the proposed framework.
The images tested are extracted from the following sites:
•FMI Pollen, Pollen Finnish Meteorological Institute site (http://
pollen.fmi.fi)
It contains forecast measurements for several types of pollen, such as birch and grass, for Europe in general.
•FMI, SILAM Finnish Meteorological Institute site (http://silam.fmi.fi/)
It contains forecasted measurements for several air pollutants such as
nitrogen oxides and fine particles for Europe and for the Northern
European countries.
•GEMS, Global and regional Earth-system Monitoring using Satellite
and in-situ data project site (http://gems.ecmwf.int/d/products/raq/)
It contains outputs from several state-of-the-art chemistry and trans-
port models for Europe.
•LAPS, Laboratory of Atmospheric Physics of the Aristotle University of
Thessaloniki site (http://lap.physics.auth.gr/forecasting/airquality.htm)
It contains regional air quality forecasts for Greece.
•AOPG, Atmospheric and Oceanic Physics Group site (http://www.fisica.
unige.it/atmosfera/bolchem/MAPS/)
It presents the results of BOLCHEM numerical model that simulates the
composition of the atmosphere for Italy.
Fig. 5. Original image retrieved from Pollen FMI site representing the fraction of birch.
Fig. 6. Reconstructed image.
Fig. 7. Original image retrieved from FMI site representing the NO2 forecast concentration at 500 m using the SILAM model.
Table 6 contains the error introduced by the text extraction and processing module (called “absolute error”) during the process of recognizing the values and the positions on the horizontal (longitude) and vertical (latitude) axes. This error is introduced mostly due to the inability of OCR to perfectly identify the position of each coordinate on the map axes. It is calculated as the average difference between the OCR estimation (e.g. 4.98775 in the first line) and the initial degrees range (e.g. 5 in the first line) for two consecutive lines, and represents the error in the latitude and longitude step (i.e. the difference between two subsequent degrees on the map). It should be noted that the pixel coordinate matching depends on how well the OCR recognizes the position of each coordinate axis value and how well this value is aligned with the coordinate lines (or ticks). Since several heatmaps do not include grid lines, this approach relies only on the position of the coordinates on the heatmap to define the pixel coordinate matching. In the following, we present the results for each website in detail.
6.1.1. FMI Pollen website
Fig. 5 is the original image retrieved from the site and represents the fraction of birch (%). Fig. 6 depicts the reconstructed image produced by the proposed system after visualizing the XML output. The reconstructed figure is almost identical and, in addition, any noise (e.g. black lines) was removed. Moreover, the absolute error for both the latitude and longitude steps is very low (around 0.3%).
6.1.2. FMI website
In the case of the FMI site, based on visual assessment, the reconstructed image (Fig. 8) is almost identical to the initial one (Fig. 7). The original image depicts NO2 forecast concentrations for a height of 500 m as estimated by the SILAM model. The absolute geo-coordinate error is very low (around 0.6%) for both latitude and longitude, and thus the error introduced by OCR is not significant.
6.1.3. GEMs website
The images capture O3 forecast concentrations using the EURAD-IM model. Figs. 9 and 10 depict the original image and the image reconstructed by the AirMerge system, which are almost identical, and any noise (e.g. black lines) is removed. In both cases the error is very low.
6.1.4. LAPS site
In a similar way, we present the initial and the reconstructed image of this website in Figs. 11 and 12. The results are reported in Table 6 and the error is again very small. The original image was produced using the Fifth-Generation Penn State/NCAR Mesoscale Model (MM5) and the Eulerian photochemical air quality model CAMx, and represents the maximum concentration of NO2.
Fig. 8. Reconstructed image.
Fig. 9. Original image from GEMS site representing the O3 forecast concentration using the EURAD-IM model.
6.1.5. AOPG site
The results of the last provider report an average error of 0.35%. The initial and the reconstructed maps are illustrated in Figs. 13 and 14. It should be noted that the white region in Fig. 13 is treated as “void space” in Fig. 14, and is considered a distinct case from the national border lines, which are instead treated as unwanted noise and filled in. Regarding the original image, it represents the concentration of the PM10 pollutant as predicted by the BOLCHEM model.

Fig. 11. Original image from LAPS site representing the maximum forecast concentration of NO2 using the Fifth-Generation Penn State/NCAR Mesoscale Model (MM5) and the Eulerian photochemical air quality model CAMx.
Fig. 12. Reconstructed image.
Fig. 10. Reconstructed image.
6.2. Quantitative evaluation

The quantitative evaluation focuses on comparing the performance of the AirMerge system and of the proposed framework against the real numerical data. This is realized by comparing the data reconstructed from the published images by both systems with the original forecast data as produced by the forecast model. In this way, we can estimate more accurately, compared to the first evaluation step, how significant the error introduced by OCR is, as well as the quality of the final results.

The tests are performed on a set of 108 images extracted from the FMI site⁴, and the data reconstructed from these images are compared with the original data provided by FMI in a NetCDF format file. These images are selected so that multiple air pollutants and dates/times are covered. The selection of diverse input data aims at retrieving a variety of images and thus testing the systems with input images that are as different as possible.
Specifically, the dataset is created using the following restrictions:
•6 pollutants are handled (i.e. CO, NO, NO2, PM10, PM2.5 and SO2)
•3 h per day (i.e. 8:00, 16:00, 24:00)
•6 days: a weekend and 4 weekdays were selected
•Surface height was used exclusively.
4 http://silam.fmi.fi/AQ_forecasts/Regional_v4_9/index.html.

Fig. 13. Original image from AOPG site representing the forecast concentration of PM10 using the BOLCHEM model.
Fig. 14. Reconstructed image.

Table 7 contains the following results for each pollutant separately: a) the number of images; b) the absolute average latitude and longitude step differences, which indicate the error introduced by the proposed framework per 5° on each axis; c) the average percentage of pixels with correct values (i.e. compared with the original numerical values produced by the SILAM model); d) the average error (er) introduced in each pixel by AirMerge (AM) during data extraction from the heatmaps (the mathematical formula for er, which is presented below, is based on the formula used for estimating the relative error); e) the average error (er) introduced in each pixel by the framework (FW) due to OCR and the resulting misalignment of the coordinates; f) the mean squared error per pixel for the AirMerge (AM) system; and g) the root mean squared errors of AM and FW respectively, which have the same units as the quantity being estimated. The error er is calculated as:

$$er = \frac{1}{n}\sum_{i=0}^{n}\frac{v_i - \tilde{v}_i}{v_i},$$
where $n$ is the total number of pixels, $v_i$ is the original value at the specific geographical coordinates and $\tilde{v}_i$ is the value produced by AirMerge with manual configuration, or by the proposed framework, for the same coordinates. The mathematical formula for the Mean Squared Error (MSE) is:

$$MSE = \frac{1}{n}\sum_{i=0}^{n}\left(v_i - \tilde{v}_i\right)^2,$$
where $n$, $v_i$ and $\tilde{v}_i$ stand for the same parameters as in the error er. Finally, the Root Mean Squared Error (RMSE) is defined as the square root of the MSE:

$$RMSE = \sqrt{MSE} = \sqrt{\frac{1}{n}\sum_{i=0}^{n}\left(v_i - \tilde{v}_i\right)^2}.$$
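The three metrics can be transcribed directly; a minimal sketch over flat lists of original and reconstructed pixel values (not the evaluation code used in the paper):

```python
from math import sqrt

def error_metrics(v, ev):
    """Relative error er, MSE and RMSE between the original values v
    and the reconstructed values ev, following the formulas above."""
    n = len(v)
    er = sum((vi - evi) / vi for vi, evi in zip(v, ev)) / n
    mse = sum((vi - evi) ** 2 for vi, evi in zip(v, ev)) / n
    return er, mse, sqrt(mse)

er, mse, rmse = error_metrics([2.0, 4.0, 5.0], [2.0, 3.0, 5.5])
print(er, mse, rmse)
```

Note that er keeps the sign of each deviation, so over- and under-estimates can cancel, whereas MSE and RMSE penalize both directions.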
Based on Table 7, it is evident that the latitude and longitude errors
are common for all pollutants, since the map types considered are sim-
ilar. They are both quite low and thus the proposed framework could
identify rather well the position of the horizontal and vertical lines on
the map. The percentage of pixels with correct values is satisfactory
for both systems with that of the framework being slightly lower due
to the OCR error. Regarding the error introduced in each pixel value, it
is in general quite low, except in the case of CO, where the error is higher. This is probably due to the fact that the values between sequential pixels varied more coarsely compared to the other cases, a phenomenon which was also observed in Epitropou et al. (2012) and is attributable to the use of a linearly spaced, but coarse and sparse, color scale, as well as to the higher average magnitude of the observed values. The same
applies for the MSE and RMSE errors. In general, it is evident that the
proposed framework introduces an additional error to the original
values compared to AirMerge. However, the error introduced is not significant, and shows that a manually configured extraction system could be substituted by a semi-automatic one, which could facilitate the tasks of environmental administrators.
7. Conclusions
In this paper, we have proposed a framework for environmental
information extraction from air quality and pollen forecast heatmaps,
combining image processing, template configuration, as well as textual
recognition components. The proposed framework overcomes the
limitation of not having access to the raw data, since it only considers information in the form of heatmaps that are publicly available on the Internet, and estimates the original numerical forecast data using the data reconstructed from the heatmaps. The evaluation revealed that the proposed semi-automatically configured system produces results very similar to those of the manually configured one (i.e. the estimated values are rather close to the original ones), since in most cases no significant error is introduced by the OCR.
Potential uses for the proposed framework include supporting environmental systems that provide air quality information from several providers, either for direct comparison or for orchestration purposes, or that offer decision support on everyday issues (e.g. travel planning) (Wanner et al., 2012). More generally, the framework provides a way to access sufficiently usable numerical environmental data for a host of applications involving the processing of such data, without requiring explicit changes in data publishing policy on the part of environmental data providers, thus creating more flexibility.
Future work includes evaluation with images in different projections
(such as conical) and an effort to further automate the procedure. This
can be achieved by applying segmentation techniques to the original
image, which will result in the automatic recognition of the boundaries
of its elements (heatmap, color scale, axes). In this direction, we
plan to investigate and apply segmentation techniques that are based
only on rough image features (Hoenes and Lichter, 1994), on Voronoi
diagrams (Kise et al., 1998) and on connected components (Bukhari
et al., 2010).
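For illustration, the connected-component idea underlying the cited segmentation approaches can be sketched as a simple 4-connected labeling of a binary foreground mask (a generic textbook routine, not the authors' planned method):

```python
from collections import deque

def connected_components(mask):
    """Label 4-connected foreground regions in a binary grid; such
    components are a common starting point for segmenting an image
    into regions (e.g. heatmap area, color scale, axis labels)."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for sy in range(h):
        for sx in range(w):
            if mask[sy][sx] and not labels[sy][sx]:
                count += 1                      # new component found
                labels[sy][sx] = count
                queue = deque([(sy, sx)])
                while queue:                    # flood-fill via BFS
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w \
                                and mask[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return count, labels
```

The bounding boxes of the resulting components could then be classified into heatmap, legend, and axis regions.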
Acknowledgments
This work was supported by the PESCaDO project (FP7-248594).
References
Balk, T., Kukkonen, J., Karatzas, K., Bassoukos, A., Epitropou, V., 2011. A European open
access chemical weather forecasting portal. Atmos. Environ. 45, 6917–6922.
Bukhari, S., Al Azawi, M.I.A., Shafait, F., Breuel, T.M., 2010. Document image segmentation
using discriminative learning over connected components. Proceedings of the 9th
IAPR International Workshop on Document Analysis Systems (DAS '10). ACM, New
York, NY, USA, pp. 183–190.
Cao, R., Tan, C., 2002. Text/graphics separation in maps. In: Blostein, D., Kwon, Y.-B. (Eds.),
Fourth IAPR Workshop on Graphics Recognition. Lecture Notes in Computer Science,
vol. 2390. Springer, Berlin, pp. 167–177.
Chang, S., Jiang, W., Yanagawa, A., Zavesky, E., 2007. Columbia University TRECVID 2007
high-level feature extraction. Proceedings of TREC Video Retrieval Workshop
(TRECVID 07).
Chiang, Y.Y., Knoblock, C.A., 2006. Classification of line and character pixels on
raster maps using discrete cosine transformation coefficients and support vector
machine. Proceedings of the 18th International Conference on Pattern Recognition,
pp. 1034–1037.
Desai, S., Knoblock, C.A., Chiang, Y.-Y., Desai, K., Chen, C.-C., 2005. Automatically identify-
ing and georeferencing street maps on the web. Proceedings of the 2005 Workshop
on Geographic Information Retrieval (GIR '05). ACM, New York, NY, USA, pp. 35–38.
Epitropou, V., Karatzas, K.D., Bassoukos, A., Kukkonen, J., Balk, T., 2011. A new environmental
image processing method for chemical weather forecasts in Europe. Proceedings of the
5th International Symposium on Information Technologies in Environmental Engineer-
ing, Poznan.
Epitropou, V., Karatzas, K., Kukkonen, J., Vira, J., 2012. Evaluation of the accuracy of an
inverse image-based reconstruction method for chemical weather data. International
Journal of Artificial Intelligence 9 (S12), 152–171.
Henderson, T.C., Linton, T., 2009. Raster map image analysis. Proceedings of the 2009 10th
International Conference on Document Analysis and Recognition (ICDAR '09). IEEE
Computer Society, Washington, DC, USA, pp. 376–380.
Hoenes, F., Lichter, J., 1994. Layout extraction of mixed mode documents. Mach. Vis. Appl.
7, 237–246.
Karatzas, K., 2005. Internet-based management of environmental simulation tasks. In:
Farago, I., Georgiev, K., Havasi, A. (Eds.), Advances in Air Pollution Modelling for
Environmental Security, pp. 253–262 (NATO Reference EST.ARW980503, 406 p.).
Table 7
Results comparing the proposed framework (FW) and the AirMerge system (AM) with the
original numerical values produced by the SILAM model.

                                       CO       NO2      NO       PM10     PM2.5    SO2      Total
Number of images                       18       18       18       18       18       18       108
Avg. % of pixels without error (AM)    74.9%    83.2%    89.7%    85.6%    86.6%    77.2%    82.9%
Avg. % of pixels without error (FW)    72.1%    76.4%    89.6%    80.3%    81.5%    69.5%    78.3%
Average error per pixel (AM)           19.857   0.283    0.025    0.712    0.622    0.188    3.490
Average error per pixel (FW)           20.566   0.3426   0.029    0.831    0.717    0.238    3.657
RMSE per pixel (AM)                    36.218   0.473    0.219    1.156    0.922    0.454    6.574
RMSE per pixel (FW)                    38.497   0.638    0.250    1.520    1.193    0.618    7.120

Latitude step difference between AM and FW: 8.72·10⁻⁴; longitude step difference
between AM and FW: 1.33·10⁻⁴ (single values across all pollutants).
13 A. Moumtzidou et al. / Ecological Informatics xxx (2013) xxx–xxx
Please cite this article as: Moumtzidou, A., et al., A model for environmental data extraction from multimedia and its evaluation against various
chemical weather forecasting datasets, Ecological Informatics (2013), http://dx.doi.org/10.1016/j.ecoinf.2013.08.003
Karatzas, K., Moussiopoulos, N., 2000. Urban air quality management and information
systems in Europe: legal framework and information access. J. Environ. Assess. Policy
Manag. 2 (No. 2), 263–272.
Karatzas, K., Kukkonen, J., Bassoukos, A., Epitropou, V., Balk, T., 2011. A European chemical
weather forecasting portal. In: Steyn, Douw G., Trini Castelli, Silvia (Eds.), 31st ITM -
NATO/SPS International Technical Meeting on Air Pollution Modelling and Its Appli-
cation, Torino, 28 Sept. 2010. Published in Air Pollution Modeling and Its Applications
XXI, Springer, NATO Science for Peace and Security Series C: Environmental Security,
pp. 239–243.
Kise, K., Sato, A., Iwata, M., 1998. Segmentation of page images using the area Voronoi
diagram. Comput. Vis. Image Underst. 70 (3), 370–382.
Kraaij, W., Over, P., Awad, G., 2007. TRECVID-2007 high-level feature task: overview.
Online Proceedings of the TRECVID Video Retrieval Evaluation Workshop.
Kukkonen, J., Klein, T., Karatzas, K., Torseth, K., Fahre Vik, A., San José, R., Balk, T., Sofiev,
M., 2009. COST ES0602: towards a European network on chemical weather forecast-
ing and information systems. Adv. Sci. Res. J. 1, 1–7.
Kukkonen, J., Olsson, T., Schultz, D.M., Baklanov, A., Klein, T., Miranda, A.I., Monteiro, A.,
Hirtl, M., Tarvainen, V., Boy, M., Peuch, V.-H., Poupkou, A., Kioutsioukis, I., Finardi, S.,
Sofiev, M., Sokhi, R., Lehtinen, K.E.J., Karatzas, K., San José, R., Astitha, M., Kallos, G.,
Schaap, M., Reimer, E., Jakobs, H., Eben, K., 2012. A review of operational, regional-
scale, chemical weather forecasting models in Europe. Atmos. Chem. Phys. 12, 1–87.
Levenshtein, V.I., 1966. Binary codes capable of correcting deletions, insertions, and rever-
sals. Sov. Phys. Dokl. 10, 707–710.
Michelson, M., Goel, A., Knoblock, C.A., 2008. Identifying maps on the World Wide Web.
In: Cova, Thomas J., Miller, Harvey J., Beard, Kate, Frank, Andrew U., Goodchild,
Michael F. (Eds.), Proceedings of the 5th International Conference on Geographic
Information Science (GIScience '08). Springer-Verlag, Berlin, Heidelberg, pp. 249–260.
Moumtzidou, A., Epitropou, V., Vrochidis, S., Voth, S., Bassoukos, A., Karatzas, K., Mossgraber,
J., Kompatsiaris, I., Karppinen, A., Kukkonen, J., 2012a. Environmental data extraction
from multimedia resources. Proceedings of the 1st ACM International Workshop on
Multimedia Analysis for Ecological Data (MAED 2012), November 2, Nara, Japan,
pp. 13–18.
Moumtzidou, A., Vrochidis, S., Tonelli, S., Kompatsiaris, I., Pianta, E., 2012b. Discovery of
environmental nodes in the web. Proceedings of the 5th IRF Conference, Austria,
Vienna, July 2–3.
Musavi, M.T., Shirvaikar, M.V., Ramanathan, E., Nekovei, A.R., 1988. Map processing
methods: an automated alternative. Proceedings of the Twentieth Southeastern Sym-
posium on System Theory. IEEE Computer Society, pp. 300–303.
Ngo, Ch., et al., 2007. Experimenting VIREO-374: bag-of-visual-words and visual-based
ontology for semantic video indexing and search. Proceedings of TREC Video Retrieval
Workshop (TRECVID 07).
Smeaton, A.F., Over, P., Kraaij, W., 2006. Evaluation campaigns and TRECVid. Proceedings
of 8th ACM International Workshop on Multimedia Information Retrieval, California,
USA, pp. 321–330.
Vrochidis, S., Epitropou, V., Bassoukos, A., Voth, S., Karatzas, K., Moumtzidou, A., Mossgraber,
J., Kompatsiaris, I., Karppinen, A., Kukkonen, J., 2012. Extraction of environmental data
from on-line environmental information sources. Artificial Intelligence Applications
and Innovations. IFIP Advances in Information and Communication Technology,
vol. 382, pp. 361–370.
Wanner, L., Rospocher, M., Vrochidis, S., Bosch, H., Bouayad-Agha, N., Bugel, U.,
Casamayor, G., Ertl, T., Hilbring, D., Karppinen, A., Kompatsiaris, I., Koskentalo, T.,
Mille, S., Mossgraber, J., Moumtzidou, A., Myllynen, M., Pianta, E., Saggion, H.,
Serafini, L., Tarvainen, V., Tonelli, S., 2012. Personalized environmental service
configuration and delivery orchestration: the PESCaDO demonstrator. Proceedings
of the 9th Extended Semantic Web Conference (ESWC 2012), Heraclion, Crete,
Greece.
Yuan, Y., et al., 2007. THU and ICRC at TRECVID 2007. Proceedings of TREC Video Retrieval
Workshop (TRECVID 07).