Big Data Applications in the Government Sector: A Comparative Analysis among Leading Countries

Abstract

Businesses, governments, and the research community can all derive value from the massive amounts of digital data they collect. Analyzing big-data application projects by governments offers guidance for follower countries for their own future big-data initiatives. Decision making in government usually takes much longer than in business and is conducted through consultation and mutual consent of a large number of diverse actors, including officials, interest groups, and ordinary citizens. Governments deal not only with the general issues of big-data integration (multiple sources, different formats, and cost) but also with some special challenges. The biggest is collecting data, which comes not only from multiple channels but also from different sources. Most governments operating or planning big-data projects need to take a step-by-step approach for setting the right goals and realistic expectations. Success depends on their ability to integrate and analyze information, develop supporting systems, and support decision making through analytics.
78 COMMUNICATIONS OF THE ACM | MARCH 2014 | VOL. 57 | NO. 3
contributed articles
Big data, a general term for the massive amount of
digital data being collected from all sorts of sources,
is too large, raw, or unstructured for analysis through
conventional relational database techniques. Almost
90% of the world’s data today was generated during
the past two years, with 2.5 quintillion bytes of data
added each day.7 Moreover, approximately 90% of
it is unstructured. Still, the overwhelming amount
of big data from the Web and the cloud offers new
opportunities for discovery, value creation, and rich
business intelligence for decision support in any
organization. Big data also means new challenges
involving complexity, security, and risks to privacy, as
well as a need for new technology and human skills.
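To give a sense of scale for the daily volume quoted above, the following back-of-the-envelope sketch converts 2.5 quintillion bytes per day into decimal exabytes and into the sustained rate a system would need in order to keep up. The variable names are illustrative only, and the calculation assumes decimal (SI) units.

```python
# Back-of-the-envelope conversion of the daily data volume quoted in the text.
BYTES_PER_DAY = 2.5e18          # 2.5 quintillion bytes per day, as cited
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

# Decimal (SI) units are assumed: 1 EB = 1e18 bytes, 1 TB = 1e12 bytes.
exabytes_per_day = BYTES_PER_DAY / 1e18
terabytes_per_second = BYTES_PER_DAY / SECONDS_PER_DAY / 1e12

print(f"{exabytes_per_day:.1f} EB per day")
print(f"{terabytes_per_second:.1f} TB per second, sustained")
```

Under these assumptions the figure works out to roughly 29 terabytes arriving every second.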
Big data is redefining the landscape
of data management, from extract,
transform, and load, or ETL, processes
to new technologies (such as Hadoop)
for cleansing and organizing unstruc-
tured data in big-data applications.
Although the business sector is
leading big-data-application develop-
ment, the public sector has begun to
derive insight to help support decision
making in real time from fast-growing
in-motion data from multiple sourc-
es, including the Web, biological and
industrial sensors, video, email, and
social communications.3 Many white
papers, journal articles, and business
reports have proposed ways govern-
ments can use big data to help them
serve their citizens and overcome
national challenges (such as rising
health care costs, job creation, natu-
ral disasters, and terrorism).9 There
is also some skepticism as to whether
it can actually improve government
operations, as governments must de-
velop new capabilities and adopt new
technologies (such as Hadoop and
NoSQL) to transform it into informa-
tion through data organization and
analytics.4
Here, we ask whether governments are able to implement some of today’s big-data applications associated with the business sector. We first compare the two sectors in terms of goals, missions, decision-making processes, decision actors, organizational structure, and strategies (see the table), then turn to several current applications in technologically advanced countries, including Australia, Japan, Singapore, South Korea, the U.K., and the U.S. Also examined are some business-sector big-data applications and initiatives that can be implemented by governments. Finally, we suggest ways for governments of follower countries to pursue their own future big-data strategies and implementations.

DOI:10.1145/2500873
In the same way businesses use big data to pursue profits, governments use it to promote the public good.
BY GANG-HOON KIM, SILVANA TRIMI, AND JI-HYONG CHUNG

key insights
Businesses, governments, and the research community can all derive value from the massive amounts of digital data they collect.
Governments of leading ICT countries have initiated big-data application projects to enhance operational efficiency, transparency, citizens’ well-being and engagement in public affairs, economic growth, and national security.
Analyzing big-data application projects by governments offers guidance for follower countries for their own future big-data initiatives.

Business and Government Compared
Although the primary missions of
businesses and governments are not
in conflict, they do reflect different
goals and values. In business, the
main goal is to earn profits by provid-
ing goods and services, developing/
sustaining a competitive edge, and
satisfying customers and other stake-
holders by providing value. In govern-
ment, the main goal is to maintain
domestic tranquility, achieve sustain-
able development, secure citizens’
basic rights, and promote the general
welfare and economic growth.
Most businesses aim to make short-
term decisions with a limited number
of actors in a competitive market envi-
ronment. Decision making in govern-
ment usually takes much longer and is
conducted through consultation and
mutual consent of a large number of
diverse actors, including officials, in-
terest groups, and ordinary citizens.
Many well-defined steps are therefore
required to reduce risk and increase
the efficiency and effectiveness of gov-
ernment decision making.18 It follows
that big-data applications likewise dif-
fer between public and private sectors.
Dataset Attributes Compared
The big-data environment reflects
the evolution of IT-enabled decision-
support systems: data processing in
the 1960s, information applications
in the 1970s–1980s, decision-support
models in the 1990s, data warehous-
ing and mining in the 2000s, and big
data today. The big-data era is at an
early stage, as most related technology
and analytics applications were first
introduced only around 2010.4
The attributes and challenges of
big data have been described in terms
of “three Vs”: volume, velocity, and
variety (see Figure 1). Volume is big
data’s primary attribute, as terabytes
or even petabytes of it are generated by
organizations in the course of doing
business while also complying with
government regulations. Velocity is
the speed data is generated, delivered,
and processed; that is, big data is so
large and difficult to manage and to
extract value from that conventional information technologies are not effective for its management.13 Variety means data comes in all forms: structured (as in traditional relational, SQL-based databases); semi-structured (with tags and markers but without formal structure like a database); and unstructured (unorganized data with no business intelligence behind it).

The concept of big data has evolved to imply not only a vast amount of data but also the process through which organizations derive value from it. Big data, synonymous today with business intelligence, business analytics, and data mining, has shifted business intelligence from reporting and decision support to prediction and next-move decision making.13 New data-management systems aim to meet the challenges of big data; for example, Hadoop, an open-source platform, is the most widely applied technology for managing storage and access, overhead associated with large datasets, and high-speed parallel processing.22 However, Hadoop is a challenge for many businesses, especially small- and mid-size ones, as applications require expertise and experience not widely available and may thus need outsourced help. Finding the right talent to analyze big data is perhaps the greatest challenge for business organizations, as required skills are neither simple nor solely technology-oriented. Searching for and finding competent data scientists (in data mining, visualization, analysis, manipulation, and discovery) is difficult and expensive for most organizations.

Other big-data technologies include the Cassandra database, a Dynamo-based tool that can store two million columns in a single row, allowing inclusion of a large amount of data without prior knowledge of how it is formatted.13 Another challenge for businesses is deciding which technology is best for them: open source technology (such as Hadoop and Cassandra) or commercial distributions (such as Cloudera, Hortonworks, and MapR).

Governments deal not only with general issues of big-data integration from multiple sources and in different formats and cost but also with some special challenges. The biggest is collecting data; governments have difficulty, as the data comes not only from multiple channels (such as social networks, the Web, and crowdsourcing) but also from different sources (such as countries, institutions, agencies, and departments). Sharing data and information between countries is a special challenge, as shown by the terrorist bombing attack on the Boston Marathon in April 2013. National governments must be prepared and willing to share data and build systems for preventing and fighting crime. As reported in the public media, the Boston Marathon tragedy might have been prevented if the Russian secret services had shared critical information about the terror suspects with U.S. intelligence agencies. In addition, sharing information across national boundaries involves language translation and interpretation of text semantics (meaning of content) and sentiment (emotional content) so the true meaning is not lost. Dealing with language requires sophisticated and costly tools.

Data sharing within a country among different government departments and agencies is another challenge. The most important difference between government data and business data is scale and scope, both of which have grown steadily for years. Governments, both local and national, in the process of implementing laws and regulations and performing public services and financial transactions, accumulate an enormous amount of data with attributes, values, and challenges that differ from their counterparts in the business sector.

Government big-data issues can be categorized as silo, security, and variety. Each government agency or department typically has its own warehouse, or silo, of confidential or public information, with agencies often reluctant to share what they might consider proprietary data. The “tower of Babel” in which each system keeps its data isolated from other systems complicates trying to integrate complementary data among government agencies and departments. Communication failure is sometimes the issue for data integration; for example, in the U.K., a coalition of police departments and hospitals intended to share data on violent crimes has been reported as a failure due to a lack of communication among participating organizations.19 Another challenge for sharing and organizing government data involves finding a cohesive format that would allow for analytics in the legacy systems of different agencies. Even though most government data is structured, rather than semi-structured or unstructured, collecting it from multiple channels and sources is a further challenge. Then there is the lack of standardized solutions, software, and cross-agency solutions for extracting useful information from discrete datasets in multiple government agencies and insufficient funding due to government austerity measures to develop and implement these solutions.

Attributes of business- and government-sector projects.
Attribute | Business Firm | Government
Goal | Profit to stakeholders | Domestic tranquility, sustainable development
Mission | Development of competitive edge, customer satisfaction | Security of basic rights (equality, liberty, justice), promotion of general welfare, economic growth
Decision Making | Short-term decision-making processes for maximizing self-interest and minimizing cost | Long-term decision-making processes for maximizing self-interest and promoting the public interest
Decision Actors | Limited number of decision actors | Diverse decision actors
Organizational Structure | Hierarchical | Governance
Financial Resources | Revenue | Taxes
Nature of Collective Activity | Competition and engagement | Cooperation and checking

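The silo and format problems described in this section can be made concrete with a small sketch. It assumes two hypothetical agency exports, a structured CSV file from a police records system and a semi-structured JSON file from a hospital system, and maps both onto one shared schema before joining them on a common identifier. All field names, identifiers, and records here are invented for illustration; real cross-agency integration would also have to satisfy the legal and security constraints discussed in this article.

```python
import csv
import io
import json

# Hypothetical export from a police records system (structured, CSV).
POLICE_CSV = """incident_id,date,district
A-101,2013-04-02,North
A-102,2013-04-05,East
"""

# Hypothetical export from a hospital system (semi-structured, JSON).
HOSPITAL_JSON = """[
  {"ref": "A-101", "admitted": "02/04/2013", "injury": {"type": "assault"}},
  {"ref": "A-103", "admitted": "07/04/2013"}
]"""


def normalize_police(text):
    """Map police CSV rows onto the shared schema, keyed by incident id."""
    return {
        row["incident_id"]: {"date": row["date"], "district": row["district"]}
        for row in csv.DictReader(io.StringIO(text))
    }


def normalize_hospital(text):
    """Map hospital JSON records onto the shared schema (ISO dates, flat fields)."""
    records = {}
    for item in json.loads(text):
        day, month, year = item["admitted"].split("/")
        records[item["ref"]] = {
            "date": f"{year}-{month}-{day}",
            # Optional nested field: a missing value becomes None instead of failing.
            "injury_type": item.get("injury", {}).get("type"),
        }
    return records


def integrate(police, hospital):
    """Outer-join the two silos on the shared incident identifier."""
    return {
        key: {"police": police.get(key), "hospital": hospital.get(key)}
        for key in sorted(police.keys() | hospital.keys())
    }


combined = integrate(normalize_police(POLICE_CSV), normalize_hospital(HOSPITAL_JSON))
for incident_id, record in combined.items():
    print(incident_id, record)
```

The outer join keeps incidents known to only one agency, which is usually where gaps in shared information show up. A shared identifier is the key assumption here, and in practice agreeing on one across agencies is often the hard part.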
Governments must also address
related legality, security, and compli-
ance requirements when using data.
There is a fine line between collecting and using big data for predictive analysis and ensuring citizens’ right to privacy. In the U.S., the USA PATRIOT Act allows legal monitoring and sometimes spying on citizens; the Electronic Communications Privacy Act allows email access without a warrant; and the proposed Cyber Intelligence Sharing and Protection Act (CISPA) (not enacted as of February 2014) raises concern, as it might position the U.S. government toward the ultimate big-data end game: access to all data for all entities in the U.S.14 Even though the intent is to prevent attacks from both domestic and foreign sources against networks and systems, CISPA raises concerns of misconstrued profiling and/or inappropriate use of information.
Data security is the primary attribute of government big data, as collecting, storing, and using it requires special care. However, most big-data technologies today, including Cassandra and Hadoop, lack sufficient security tools, making security another challenge for governments.
Compliance in highly regulated
industries (such as financial services
and health care) is yet another ob-
stacle for gathering data for big-data
government projects; for example,
U.S. health-care regulations must be
addressed when extracting knowl-
edge from health-related big data.
The two U.S. laws posing perhaps the
greatest obstacle to big-data analytics
in health care are the Health Insur-
ance Portability and Accountability
Act (HIPAA) and the Health Informa-
tion Technology for Economic and
Clinical Health Act (HITECH). HIPAA
protects the privacy of individually
identifiable health information, pro-
vides national standards for securing
electronic data and patient records,
and sets rules for protecting patient
identity and information in analyz-
ing patient safety events. HITECH ex-
panded HIPAA in 2009 to protect the
health records and electronic use of
health information by various institu-
tions. Together, these laws limit the
amount and types of health records
used for big-data analytics in health
care. Because big data by definition
involves large-scale data, these laws
complicate collecting data and per-
forming analysis on such a scale. As
of February 2014, health-care infor-
mation in the U.S. intended for big-
data analytics is collected only from
volunteers willing to share their own.
Businesses use big data to address customer needs and behavior, develop unique core competencies, and create innovative products and services. Governments use it, along with predictive analytics, to enhance transparency, increase citizen engagement in public affairs, prevent fraud and crime, improve national security, and support the well-being of people through better education and health care.
Choosing and implementing tech-
nology to extract value and finding
skilled personnel are constant chal-
lenges for businesses and govern-
ments alike. However, the challenges
for governments are more acute, as
they must look to break down depart-
mental silos for data integration, im-
plement regulations for security and
compliance, and establish sufficient
control towers (such as the Federal
Data Center in the U.S.).
Figure 1. Business and government dataset attributes compared.
Business attributes: volume (exponential growth of traditional business data and machine-generated data); variety (data in all forms: traditional, unstructured, semi-structured; expanded use of unstructured data); velocity (real-time processing of streaming data).
Business challenges: data scientists (analysts, statisticians); data mining (storing, interlinking, processing).
Business value: provide actionable solutions (predicting customer behavior, developing competitive edge).
Government attributes: silo (enormous amount of data in legacy databases of each department); variety (data in all forms: traditional, unstructured, semi-structured; expanded use of unstructured data); security (privacy when using records; authority and legitimacy for accessing database and data records).
Government challenges: breaking silos; control tower; regulation and technologies.
Government value: provide sustainable solutions (enhancing government transparency, balancing social communities).
Shared value: better understanding of problems like climate modeling.

Big-Data Applications
Comparing the big-data applications of leading e-government countries can reveal where current and future applications are focused and serve as a guide for follower countries looking to initiate their own big-data applications:

U.S. To manage real-time analysis of high-volume streaming data, the U.S. government and IBM collaborated in 2002 to develop a massively scalable, clustered infrastructure.1 The results, IBM InfoSphere Streams and IBM Big Data, both widely used by government agencies and business organizations, are platforms for discovery and visualization of information from thousands of real-time sources, encompassing application development and systems management built on Hadoop, stream computing, and data warehousing.

In 2009, the U.S. government launched Data.gov as a step toward government transparency and accountability. It is a warehouse containing 420,894 datasets (as of August 2012) covering transportation, economy, health care, education, and human services and the data source for multiple applications: 1,279 by governments, 236 by citizens, and 103 mobile-oriented.21 In 2010, the President’s Council of Advisors on Science and Technology (the primary mechanism the federal government uses to coordinate its unclassified networking and information technology research investments) spelled out a big-data strategy in its report Designing a Digital Future: Federally Funded Research and Development in Networking and Information Technology.15 In 2012, the Obama Administration announced the Big Data Research and Development Initiative,12 a $200 million investment involving multiple federal departments and agencies, including the White House Office of Science and Technology Policy, National Science Foundation (NSF), National Institutes of Health (NIH), Department of Defense (DoD), Defense Advanced Research Projects Agency, Department of Energy, Health and Human Services, and U.S. Geological Survey. The main objectives were to advance state-of-the-art core big-data technologies; accelerate discovery in science and engineering; strengthen national security; transform teaching and learning; and expand the work force needed to develop and use big-data technologies.11

Figure 2. Government data and big-data practices and initiatives.
Axes: project stage (initiating, implementing, operating); data characteristics (government data to big data); applied sector (government; citizens and firms).
Projects shown:
U.S.: Genome Data on AWS (Amazon Web Services)
U.S. NSF and NIH: BIGDATA
U.S. NASA: GEOSS (Global Earth Observation System of Systems)
U.S. CDC: NPBOID (Networked Phylogenomics for Bacteria and Outbreak ID)
U.S. NIH: TCGA
U.S.: RRP (Return Review Program)
U.S.: data.gov
U.S. Michigan: MSDW (Michigan Statewide Data Warehouse)
U.S. Syracuse: Smarter City Project
E.U.: DOME
U.K.: HSC (Horizon Scanning Centre)
U.K.: data.gov.uk
Australia: data.gov.au
Singapore: data.gov.sg
Singapore: RAHS (Risk Assessment and Horizon Scanning)
Japan: Collaboration between MEXT (Ministry of Education, Culture, Sports, Science, and Technology) and NSF
Japan: ITS (Intelligent Traffic System)
Japan: Info-plosion
Korea ACRC (Anti-Corruption and Civil Rights Commission of Korea): CIAS (Complaints Information Analysis Center)
Korea KOSTAT (Statistics Korea): EPS (Employment Position Statistics)
Korea KNOC: opinet.co.kr
Korea MOPAS (Ministry of Public Administration and Security): PDS (Preventing Disasters System)
Korea MFAFF (Ministry for Food, Agriculture, Forestry, and Fisheries) and MOPAS: PFMDS (Preventing Foot and Mouth Disease Syndrome System)
Korea KOBIC: NDM

As of February 2014, NIH has accumulated hundreds of terabytes of data for human genetic variations on Amazon Web Services, enabling researchers to access and analyze huge amounts of data without having to develop their own supercomputing
capability. In 2012, NSF joined NIH to
launch the Core Techniques and Tech-
nologies for Advancing Big Data Sci-
ence & Engineering program, aiming
to advance core scientific and techno-
logical means of managing, analyzing,
visualizing and extracting useful in-
formation from large, diverse, distrib-
uted, heterogeneous datasets. Several
federal agencies have launched their
own big-data programs. The Internal Revenue Service has been integrating big-data analytic capabilities into its Return Review Program (RRP), which, by analyzing massive amounts of data, allows it to detect, prevent, and resolve tax-evasion and fraud cases.10 DoD is
also spending millions of dollars on
big-data-related projects; one goal is
developing autonomous robotic sys-
tems (learning machines) by harness-
ing big data.
Local governments have also initi-
ated big-data projects; for example, in
2011, Syracuse, NY, in collaboration
with IBM, launched a Smarter City
project to use big data to help predict
and prevent vacant residential proper-
ties.7 Michigan’s Department of Infor-
mation Technology constructed a data
warehouse to provide a single source
of information about the citizens of
Michigan to multiple government
agencies and organizations to help
provide better services.
European Union. In 2010, the European Commission initiated its “Digital Agenda for Europe” to address how
to deliver sustainable economic and
social benefits to EU citizens from a
single digital market through fast and
ultra-fast interoperable Internet appli-
cations.5 In 2012, in its “Digital Agenda
for Europe and Challenges for 2012,”
the European Commission made big-
data strategy part of the effort, em-
phasizing the economic potential of
public data locked in filing cabinets
and data centers of public agencies;
ensuring data protection and increas-
ing individuals’ trust; developing the
Internet of things, or communication
between devices without direct hu-
man intervention; and assuring Inter-
net security and secure treatment of
data and online exchanges.5
U.K. The U.K. was among the earliest EU countries to implement big-data programs, establishing the U.K. Horizon Scanning Centre (HSC) in 2004 to improve the government’s ability to deal with cross-departmental and multi-disciplinary challenges.17 In 2011, the HSC’s Foresight International Dimensions of Climate Change effort addressed climate
change and its effect on the availabil-
ity of food and water, regional ten-
sions, and international stability and
security by performing an in-depth
analysis on multiple data channels.
Another U.K. government initiative
was the creation of the public website
http://data.gov.uk in 2009, opening to
the public more than 1,000 existing
datasets from seven government de-
partments initially, later increased to
8,633 datasets.
The Netherlands, Switzerland, the
U.K., and 17 other countries launched
a collaborative project with IBM called
DOME to develop a supercomputing
system able to handle a dataset in ex-
cess of one exabyte per day derived
from the Square Kilometer Array (SKA)
radio telescope.3 The project aims to
investigate emerging technologies for exascale computing, data transport and storage, and streaming analytics required to read, store, and analyze all the raw data collected daily. This big-data project, headquartered at Manchester’s Jodrell Bank Observatory in England, aims to address a range of scientific questions about the observable universe.

Asia. The United Nations’ 2012 E-Government Survey gave high marks to several Asian countries, notably South Korea, Singapore, and Japan.20 Australia also ranked highly. These leaders have launched diverse initiatives on big data and deployed numerous projects:

South Korea. The Big Data Initiative, launched in 2011 by the President’s Council on National ICT Strategies (the highest-level coordinating body for government ICT policy),16 aims to converge knowledge and administrative analytics through big data. Its Big Data Task Force was created to play the lead role in building the necessary infrastructure. The Big Data Initiative aims to establish pan-government big-data-network-and-analysis systems; promote data convergence between the government and the private sectors; build a public data-diagnosis system; produce and train talented professionals; guarantee privacy and security of personal information and improve relevant laws; develop big-data infrastructure technologies; and develop big-data management and analytical technologies.

Many South Korean ministries and agencies have proposed related action plans; for example, the Ministry of Health and Welfare initiated the Social Welfare Integrated Management Network to analyze 385 different types of public data from 35 agencies, comprehensively managing welfare benefits and services provided by the central government, as well as by local governments, to deserving recipients. The Ministry of Food, Agriculture, Forestry, and Fisheries and the Ministry of Public Administration and Security, or MOPAS, plan to launch the Preventing Foot and Mouth Disease Syndrome system, harnessing big data related to animal disease overseas, customs/immigration records, breeding-farm surveys, livestock migration, and workers in the livestock industry. Another system MOPAS is planning is the Preventing Disasters System to forecast disasters based on past damage records and automatic and real-time forecasts of weather and/or seismic conditions. Moreover, the Korean Bioinformation Center plans to develop and operate the National DNA Management System to integrate massive DNA and medical patient information to provide customized diagnosis and medical treatment to individuals.

Singapore. In 2004, to address national security, infectious diseases, and other national concerns, the Singapore government launched the Risk Assessment and Horizon Scanning (RAHS) program within the National Security Coordination Centre.6 Collecting and analyzing large-scale datasets, it proactively manages national threats, including terrorist attacks, infectious diseases, and financial crises. The RAHS Experimentation Center (REC), which opened in 2007, focuses on new technological tools to support policy making for RAHS and enhance and maintain RAHS through systematic upgrades of the big-data infrastructure. A notable REC application is exploration of possible scenarios involving importation of avian influenza into Singapore and assessment of the threat of outbreaks occurring throughout southeast Asia.

Aiming to create value through big-data research, analysis, and applications, the Singapore government also launched the portal site http://data.gov.sg/ to provide access to publicly available government data gathered from more than 5,000 datasets from 50 ministries and agencies.

Japan. The Japanese government has initiated several programs to use accumulated large-scale data. From 2005 to 2011, the Ministry of Education, Culture, Sports, Science, and Technology (MEXT), in association with universities and research institutes, operated the New IT Infrastructure for the Information-explosion Era project (the so-called Info-plosion). Since 2011, the government’s top priority has been to address the consequences of the Fukushima earthquake, tsunami, and nuclear-power-plant disaster and the reconstruction and rehabilitation of affected areas, as well as relief of related social and economic consequences. MEXT has been collaborating with the U.S. National Science Foundation to enhance research and leverage big-data technologies for preventing, mitigating, and managing natural disasters.

The Council of Information and Communications and the ICT Strategy Committee, both branches of the Ministry of Internal Affairs and Communications, designated “big data applications” as a crucial mission for 2020 Japan. A big-data expert group was formed to search for technical solutions and manage institutional issues in deploying big data.

Australia. The Australian Government Information Management Office (AGIMO) provides public access to government data through the Government 2.0 program, which runs the http://data.gov.au/ website to support repository and search tools for government big data. The government expects to save time and resources by using automated tools that let users search, analyze, and reuse enormous amounts of data.

Implementations and Initiatives Compared
Reviewing big-data projects and initiatives in leading countries (see Figure 2) identifies three notable big-data trends: First, most projects operated or implemented today can only marginally be classified as big-data applications, as outlined in the figure’s upper-left quadrant. The majority of government data projects in these countries appear to share structured databases of stored data; they do not use real-time, in-motion, and unstructured or semi-structured data. Second, large and complex datasets are becoming the norm for public-sector organizations. Governments expect big data to enhance their ability to serve their citizens and address major national challenges involving the economy, health care, job creation, natural disasters, and terrorism. However, the majority of big-data applications are in the citizen (participation in public affairs) and business sectors, rather than in the government sector. And third, most big-data initiatives in the government sector, especially in the U.S. (such as the National Science Foundation’s and National Institutes of Health’s Big Data program), are just getting under way or being planned for future implementation. This means big-data application projects in the government sector are still at an early stage of development, with only a handful of projects in operation (such as the U.S.’s RRP, Singapore’s RAHS, and the U.K.’s HSC).

Conclusion
Elected officials, administrators, and
citizens all seem to recognize that be-
ing able to manage and create value
from large streams of data from dif-
ferent sources and in many forms
(structured/stored, semi-structured/
tagged, and unstructured/in-motion)
represents a new form of competitive
differentiation. Most governments op-
erating or planning big-data projects
need to take a step-by-step approach
for setting the right goals and realis-
tic expectations. Success depends on
their ability to integrate and analyze
information (through new technolo-
gies like Hadoop), develop support-
ing systems (such as big-data control
towers), and support decision making
through analytics.4
Here, we have explored the chal-
lenges governments face and the
opportunities they find in big data.
Such insights can also help follower
countries in trying to deploy their own
big-data systems. Moreover, follower
countries may be able to leapfrog the
leaders’ applications through careful
analysis of their successes and fail-
ures, as well as exploit future opportu-
nities in mobile services.
Follower countries should there-
fore be cognizant of several insights
regarding big-data applications in the
public sector:
National priorities. All big-data proj-
ects in leading countries’ governments
share similar goals (such as easy and
equal access to public services, better
citizen participation in public affairs,
and transparency). The main concerns
with big-data applications converge
on security, speed, interoperability,
analytics capabilities, and lack of com-
petent professionals. However, each
government has its own priorities, op-
portunities, and threats based on its
unique environment (such as terror-
ism and health care in the U.S., natu-
ral disasters in Japan, and national
defense in South Korea).1
Analytics agency. For data that cuts
across departmental boundaries,
a top-down approach is needed to
manage and integrate big data. Gov-
ernments should look to establish
big-data control towers to integrate ac-
cumulated datasets, structured or un-
structured, from departmental silos.
Moreover, governments need to estab-
lish an advanced analytics agency re-
sponsible for developing strategies for
how big data can be managed through
new technology platforms and analyt-
ics and how to secure skilled profes-
sional staff.
Real-time analysis. Governments need to
manage real-time analysis of in-mo-
tion big data while protecting individ-
ual citizens’ privacy and security. They
should also explore new technological
playgrounds (such as cloud comput-
ing, advanced analytics, security tech-
nologies, and legislation).
Global collaboration. Much govern-
ment data is global in nature and can
be used to prevent and solve global is-
sues; for example, the Group on Earth
Observations (GEO) is a collaborative
international intergovernmental ef-
fort to integrate and share Earth-ob-
servation data. Its Global Earth Ob-
servation System of Systems (GEOSS),
a global public infrastructure that
generates comprehensive, near-real-
time environmental data, intends
to provide information and analyses
for a wide range of global users and
decision makers. Governments also
need to share data related to security
threats, fraud, and illegal activities.
Such big data needs not only transla-
tion technologies but an international
collaborative effort to share and integrate
data.
ICT big brothers. Finally, govern-
ments should collaborate with “ICT big
brothers” like EMC, IBM, and SAS; for
example, Amazon Web Services hosts
many public datasets, including Japa-
nese and U.S. census data, and many
genomic and medical databases.
References
1. Accenture. Build It and They Will Come?
Chicago, 2012; http://www.accenture.com/
SiteCollectionDocuments/PDF/Accenture-Digital-
Citizen-FullSurvey.pdf
2. Braham Group Inc. Maximizing the Value Provided By
a Big Data Platform. Salt Lake City, UT, June 2012;
http://public.dhe.ibm.com/common/ssi/ecm/en/
iml14324usen/IML14324USEN.PDF
3. Broekema, C.P. et al. DOME: Towards the ASTRON and
IBM Center for Exascale Technology. In Proceedings
of the 2012 Workshop on High-Performance
Computing for Astronomy Data, 2012, 1–4.
4. Chen, H., Chiang, R.H.L., and Storey, V.C. Business
intelligence and analytics: From big data to big impact.
MIS Quarterly 36, 4 (Dec. 2012), 1165–1188.
5. European Commission. A Digital Agenda for Europe.
Brussels, Aug. 26, 2010; http://ec.europa.eu/
digital-agenda/
6. Habegger, B. Strategic foresight in public policy:
Reviewing the experiences of the U.K., Singapore, and
the Netherlands. Futures 42, 1 (Feb. 2010), 49–58.
7. IBM. IBM’s Smarter Cities Challenge: Syracuse.
Dec. 2011; http://smartercitieschallenge.org/city_
syracuse_ny.html
8. McAfee, A. and Brynjolfsson, E. Big data: The
management revolution. Harvard Business Review
(Oct. 2012), 61–68.
9. McKinsey Global Institute. Big Data: The Next Frontier
for Innovation, Competition, and Productivity. New
York, May 2011; http://www.mckinsey.com/insights/
business_technology/big_data_the_next_frontier_for_
innovation
10. National Information Society Agency. Evolving World
on Big Data: Global Practices. May 2012; http://
www.koreainformationsociety.com/2013/11/koreas-
national-information-society.html
11. Office of Science and Technology Policy, Executive
Office of the President. Fact Sheet: Big Data Across the
Federal Government. Washington, D.C., Mar. 29, 2012;
http://www.whitehouse.gov/administration/eop/ostp
12. Office of Science and Technology Policy, Executive
Office of the President. Obama Administration Unveils
‘Big Data’ Initiative: Announces $200 Million in New
R&D Investments. Washington, D.C., Mar. 29, 2012;
http://www.whitehouse.gov/administration/eop/ostp
13. Ohlhorst, F.J. Big Data Analytics: Turning Big Data Into
Big Money. John Wiley & Sons, Hoboken, NJ, 2013.
14. Plant, R. CISPA: Information without representation?
Big Data Republic, Apr. 24, 2013; http://www.
bigdatarepublic.com/author.asp?section_
id=2635&doc_id=262480
15. President’s Council of Advisors on Science and
Technology. Designing a Digital Future: Federally
Funded Research and Development in Networking
and Information Technology. Washington, D.C., Dec.
2010; http://www.whitehouse.gov/sites/default/files/
microsites/ostp/pcast-nitrd-report-2010.pdf
16. President’s Council on National ICT Strategies.
Establishing a Smart Government by Using Big Data.
Washington, D.C., Nov. 7, 2011.
17. Sherry, S. 33B pounds drive U.K. government big
data agenda. Big Data Republic, Nov. 16, 2012; http://
www.bigdatarepublic.com/author.asp?section_
id=2642&doc_id=254471
18. Stone, D.A. Policy Paradox: The Art of Political Decision
Making. W.W. Norton & Company, Inc., New York, 2002.
19. Stonebraker, M. What does ‘big data’ mean? Blog@
CACM, Sept. 21, 2012; http://cacm.acm.org/blogs/
blog-cacm/155468-what-does-big-data-mean/fulltext
20. United Nations. E-government Survey 2012:
E-government for the People, 2012; http://www.
un.org/en/development/desa/publications/connecting-
governments-to-citizens.html
21. U.S. Government. Data.gov; http://www.data.gov
22. Zikopoulos, P.C., Eaton, C., DeRoos, D., Deutsch, T.,
and Lapis, G. Understanding Big Data: Analytics
for Enterprise-Class Hadoop and Streaming Data.
McGraw-Hill, New York, 2012.
Gang-Hoon Kim (ironhoon@etri.re.kr) is a researcher
in the Creative Future Research Laboratory at the
Electronics and Telecommunications Research Institute,
Daejeon, South Korea.
Silvana Trimi (strimi@unl.edu) is an associate professor
of management information systems in the College of
Business Administration at the University of Nebraska–
Lincoln.
Ji-Hyong Chung (jhc123@etri.re.kr) is a researcher in the
Creative Future Research Laboratory at the Electronics
and Telecommunications Research Institute, Daejeon,
South Korea.
© 2014 ACM 0001-0782/14/03 $15.00