Barry Devlin, Ph.D., is a founder of the data warehousing industry and among the foremost worldwide authorities on business intelligence and the emerging field of business insight. He is a widely respected consultant, lecturer, and author of the seminal book, Data Warehouse: From Architecture to Implementation. He is founder and principal of 9sight Consulting (www.9sight.com).

barry@9sight.com
Beyond Business
Intelligence
Barry Devlin
Abstract

It has been almost 25 years since the original data warehouse was conceived. Although the term business intelligence (BI) has since been introduced, little has changed from the original architecture. Meanwhile, business needs have expanded dramatically and technology has advanced far beyond what was ever envisioned in the 1980s. These business and technology changes are driving a broader and more inclusive view of what the business needs from IT; not just in BI but across the entire spectrum—from transaction processing to social networking. If BI is to be at the center of this revolution, we practitioners must raise our heads above the battlements and propose a new, inclusive architecture for the future.

Business integrated insight (BI2) is that architecture. This article focuses on the information component of BI2—the business information resource. I introduce a data topography and a new modeling approach that can help data warehouse implementers look beyond the traditional hard information content of BI and consider new ways of addressing such diverse areas as operational BI and (so-called) unstructured content. This is an opportunity to take the next step beyond BI to provide complete business insight.
The Evolution of an Architecture
e rst article describing a data warehouse architecture
was published in 1988 in the IBM Systems Journal (Devlin
and Murphy, 1988), based on work in IBM Europe over
the previous three years. At almost 25 years old, data
warehousing might thus be considered venerable. It has
also been successful; almost all of that original architec-
ture is clearly visible in today’s approaches.
e structure and main components of that rst
warehouse architecture are shown in Figure 1, inverted
to match later bottom-to-top ows but otherwise
unmodied. Despite changes in nomenclature, all
but one of the major components of the modern data
warehouse architecture appear. The data interface clearly corresponds to ETL. The business data directory was later labeled metadata. The absence of data marts is more apparent than real. The business data warehouse explicitly described data at different levels of granularity, derivation, and usage—all the characteristics that later defined data marts. The only missing component, seen only recently in data warehouses, is enterprise information integration (EII) or federated access.

Figure 1 is a logical architecture. It shows two distinct types of data—operational and informational—and recognizes the fundamental differences between them. Operational data was the ultimate source of all data in the warehouse, but was beyond the scope of the warehouse: fragmented, often unreliable, and in need of cleansing and conditioning before being loaded. The warehouse data, on the other hand, was cleansed, consistent, and enterprisewide. This dual view of data has informed how decision support has been viewed by both business and IT since its invention in the 1960s (Power, 2007).

A key mutation occurred in the architecture in the early 1990s. This mutation, shown in Figure 2, split the singular business data warehouse (and all informational data) into two horizontal layers—the enterprise data warehouse (EDW) and the data marts—and also vertically split the data mart layer into separate stovepipes of data for different informational needs. The realignment was driven largely by the need for better query performance in relational databases. The highly normalized tables in the EDW usually required extensive and expensive joins to answer user queries. Another driver was "slice-and-dice" analysis, which is most easily supported using dimensional models and even specialized data stores.
[Figure 1. Data warehouse architecture, 1988: operational systems feed a data interface into the business data warehouse (raw detailed data, enhanced detailed data, enhanced summary data), described by a data dictionary and business process definitions, with an end-user interface, local data, and reports above.]

[Figure 2. The layered data warehouse architecture (Devlin, 1997): operational systems, enterprise data warehouse, and data marts, with metadata alongside and end-user workstations at the top.]

This redrawing of the original, logical architecture picture has had significant consequences for subsequent thinking about data warehousing. First was a level of mental confusion about whether the architecture picture was supposed to be logical or physical. Such a basic architectural misunderstanding divides the community
into factions debating the "right" architecture—recall the Inmon versus Kimball battles of the 1990s.

Second, and more important, is the disconnect from a key requirement of the original architecture: that decision-support information must be consistent and integrated across the whole enterprise. When viewed as a physical picture, Figure 2 can encourage fragmentation of the information vertically (based on data granularity or structure) and horizontally (for different organizational/user needs or divisions). The implication is that data should be provided to users through separate data stores, optimized for specific query types, performance needs, etc. Vendors of data mart tools thus promoted quick solutions to specific data and analysis needs, paying lip service—at best—to the EDW. In truth, most general-purpose databases struggled to provide the performance required across all types of queries. The EDW is often little more than a shunting yard for data on its way to data marts or a basic repository for predefined reporting.

The third, and more subtle, consequence is that thinking about logical and physical data models and storage has also split into two camps. Enterprise architects focus on data consistency and integrity, often assuming that the model may never be physically instantiated. On the other hand are solution developers who focus on application performance at the expense of creating yet more copies of data. The result is dysfunctional IT organizations in which corporate and departmental factions promote diametrically opposed principles to the detriment of the business as a whole.
Of course, Figure 2 is not the end of the architecture evolution. Today's pictures show even more data storage components. Metadata is split off into a separate layer or pillar. The EDW is complemented by stores such as master data management (MDM) and the operational data store (ODS). Data marts have multiplied into various types based on usage, function, and data type. The connectivity of EII has been added in recent years. In truth, these modern pictures have become more like graphical inventories of physical components than true logical architectures; they have begun to look like the spaghetti diagrams beloved by BI vendors to show the current mess in decision support that will be cured by data warehousing.

This brief review of the evolution of data warehousing poses three questions:

1. After 25 years of changing business needs, do we need a new architecture to meet the current and foreseen business demands?

2. What would a new logical data architecture look like?

3. What new modeling and implementation approaches are needed to move to the new architecture?
What Business Needs from IT in the 21st Century

The concepts of operational BI and unstructured content analytics point to the most significant changes in what business expects of IT over the past decade. The former reflects a huge increase in the speed and agility required by modern business; the latter points to a fundamental shift in focus by decision makers and a significant expansion in the scope of their attention.

Speed has become one of the key drivers of business success today. Decisions or processes that 20 years ago took days or longer must now be completed in hours or even minutes. The data required for such activities must now be up to the minute rather than days or weeks old. Increasing speed may require eliminating people from decision making, which drives automation of previously manual work and echoes the prior automation of "blue collar" work. As a result, the focus of data warehousing has largely shifted from consistency to speed of delivery. In truth, of course, delivering inconsistent data more quickly is actually worse in the long term than delivering it slowly, but this obvious consideration is often conveniently ignored.

As the term "operational BI" implies, decision making is being driven into the operational environment by this trend. Participants from IT in operational BI seminars repeatedly ask: How is this different from what goes on in the operational systems? The answer is: not a lot. This response has profound implications for data warehouse
architecture, disrupting the division that has existed
between operational and informational data since
the 1960s. If BI architects can no longer distinguish
between operational and informational activities, how
will users do so?
Agility—how easily business systems cope with and respond to internal and external change—is a major driver of evolution in the operational environment. Current thinking favors service-oriented architecture (SOA) as a means of allowing rapid and easy modification of workflows and exchange of business-level services as business dictates. Such rapid change in the operational environment creates problems for data loading using traditional ETL tools, whose development cycles are comparatively lengthy. On the plus side, the message-oriented interfaces between SOA services can provide the means to load data continuously into the warehouse.
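To make this concrete, here is a minimal sketch of message-driven, micro-batched loading. It is illustrative only: the in-memory queue stands in for an enterprise service bus subscription (in practice a JMS, AMQP, or similar consumer), and the table, columns, and message format are invented.

```python
# Minimal sketch of continuous, message-driven warehouse loading.
# queue.Queue stands in for a subscription to the enterprise service
# bus; table, columns, and message format are invented for illustration.
import json
import queue
import sqlite3

bus = queue.Queue()
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE orders (order_id TEXT, qty INTEGER, ts TEXT)")

# Two in-flight messages, as they might appear on the wire.
bus.put('{"order_id": "A1", "qty": 5, "ts": "2010-06-01T10:00"}')
bus.put('{"order_id": "A2", "qty": 3, "ts": "2010-06-01T10:01"}')

def drain(batch_size=100):
    """Load arriving messages as micro-batches instead of a nightly ETL run."""
    rows = []
    while not bus.empty() and len(rows) < batch_size:
        event = json.loads(bus.get())
        rows.append((event["order_id"], event["qty"], event["ts"]))
    warehouse.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    warehouse.commit()

drain()
print(warehouse.execute("SELECT COUNT(*) FROM orders").fetchone())  # (2,)
```

Micro-batching preserves set-based loading while approaching a continuous feed; a real deployment would add error handling and idempotent writes.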
Furthermore, the operational-informational boundary becomes even more blurred as SOA becomes pervasive, especially as it is envisaged that business users may directly modify business processes. Users simply do not distinguish between operational and informational functions. They require any and all services to operate seamlessly in a business workflow. In this environment, the old warehousing belief that operational data is inconsistent while warehouse data is reliable simply cannot be maintained. Operational data will have to be cleansed and made consistent at the source, and as this occurs, one rationale for the EDW—as the dependable source of consistent data—disappears.
Turning to the growing interest in and importance of
unstructured data, we encounter further fundamental
challenges to our old thinking about decision making
and how to support it. We are constantly reminded of the
near-exponential growth in these data volumes and the
consequent storage and performance problems. However,
this is really not the issue.
The real problem lies in the oxymoron "unstructured data." All data is structured—by definition. "Structured" data, as it's known, is designed to be internally consistent and immediately useful to IT systems that record and analyze largely numerical and categorized information. Such hard information is modeled and usually stored in tabular or relational form. "Unstructured" information, in reality, has some other structure, one less amenable to numerical use or categorization. This soft information often contains or is related to hard information. For example, a business order can exist as: (1) a message on a voicemail system; (2) a scanned, handwritten note; (3) an e-mail message; (4) an XML document; and (5) a row in a relational database. As we proceed along this list, the information becomes harder, that is, more usable by a computer. On the other hand, we may lose some value inherent in the softer information: the tone of voice in the voicemail message may alert a person to the urgency of the order or to some dissatisfaction on the buyer's part.
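The hardness spectrum can be shown in a few lines of code. This is a toy sketch, not the article's method: the order text, field names, and the naive regex parser are all invented.

```python
# Illustrative sketch of the hardness spectrum for a single business
# order; the field names, free text, and naive parser are all invented.
import re
from dataclasses import dataclass

# Softest (multiplex): free text transcribed from a voicemail or note.
soft_order = "Please send 5 units of part XJ-11 to ACME, urgent!"

# Harder (compound): keys travel with the values, as in XML or JSON.
compound_order = {"part": "XJ-11", "qty": "5", "customer": "ACME"}

# Hardest (atomic): a typed record, ready to become a relational row.
@dataclass
class OrderRow:
    part: str
    qty: int
    customer: str

match = re.search(r"(\d+) units of part (\S+) to (\w+)", soft_order)
row = OrderRow(part=match.group(2), qty=int(match.group(1)),
               customer=match.group(3))
print(row)  # OrderRow(part='XJ-11', qty=5, customer='ACME')
# Note what hardening discards: "urgent!" has no column unless modeled.
```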
Business decision makers, especially at senior levels, have always used soft information, often from beyond the enterprise, in their work. Such information was gleaned from the press and other sources, gathered in conversations with peers and competitors, and grafted together in face-to-face interactions between team members. Today, these less-structured decision-making processes are electronically supported and computerized. The basic content is digitized, stored, and used online. Conversations occur via e-mail and instant messaging. Conferences are remote and Web-based.
For data warehousing, as a result, the implications extend far beyond the volumes of unstructured data that must be stored. These volumes would pose major problems—the viability of copying so much data into the data warehouse and the management of potentially multiple copies—if we accepted the current architecture. However, of deeper significance is the question of how soft information and its associated processes can be meaningfully and usefully integrated with existing hard information and processes.

At its core, this is an architectural question. How can existing modeling and design approaches for hard information extend to soft information? Assuming they can, how can soft information, with its loose and fluid structure, be mined on the fly for the metadata inherent in its content? Although these questions are not new, there is little consensus so far about how this will be
done. As was the case for enterprise data modeling, which matured in tandem with the data warehouse architecture, methods of dealing with soft information will surface as a new architecture for life beyond BI is defined.
In the case of operational BI and SOA, the direction is clear and the path is emerging: the barrier between operational and informational data is collapsing, and improvements in database technology suggest that we can begin to envisage something of a common store. For the structured/unstructured divide, the direction is only now emerging and the path is as yet unclear. However, the direction echoes that for operational/informational stores—the barriers we have erected between these data types no longer serve the business. We need to tear down the walls.
Business Integrated Insight and the Business
Information Resource
Business integrated insight (BI2), a new architecture
that shows how to break down the walls, is described
elsewhere (Devlin, 2009). As Figure 3 shows, this is
again a layered architecture, but one where the layers are
information, process, and people, and all information
resides in a single layer.
As seen in the business directions described earlier, a single, consistent, and integrated set of all information used by the organization—from minute-to-minute operations to strategic decision making—is needed. At its most comprehensive, this comprises every disparate business data store on every computer in the organization, all relevant information on business partners' computers, and all relevant information on the Internet! It includes in-flight transaction data, operational databases, data warehouses and data marts, spreadsheets, e-mail repositories, and content stores of all shapes and sizes inside the business and on the Web.
This article focuses on the business information resource (BIR), the information layer in BI2, to provide an expanded and improved view of that component of Figure 3. The BIR provides a single, logical view of the entire information foundation of the business, one that aims to significantly reduce the tendency to physically separate and then duplicate data in multiple stores. The BIR is a unified information space with a conceptual structure that allows for reasoned decisions about where to draw boundaries of significant business interest or practical implementation viability. As business changes or technology evolves, the BIR allows boundaries to change in response without reinventing the logical architecture or defining new physical components simply to store alternative representations of the same information.
[Figure 3. The business integrated insight architecture: three layers—the personal action domain, the business function assembly, and the business information resource.]
The structure of the BIR is based on data topography, with a set of three continuously variable axes characterizing the data space. Data topography refers to the type and use of data in a general sense—easy to recognize but often difficult to define. This corresponds to physical topography, where most people can easily recognize a hill or a mountain when they see one, but formal definitions of the difference between them seldom make much sense. Similarly, most business or IT professionals can distinguish between hard and soft information as discussed earlier, but creating definitions of the two and drawing a boundary between them can be problematic.

The three axes of data topography, as shown in Figure 4, provide business and IT with a common language to understand information needs and technological possibilities and constraints. Placing data elements or sets along the axes of the data space defines their business usage and directs us to the appropriate technology.
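As a thought experiment, the three axes might be represented as a coordinate attached to each data element or set. This is a simplifying sketch: the article insists each axis is a continuum, so the discrete phase names below (introduced in the sections that follow) are broad markers, not categories.

```python
# Sketch of a data topography coordinate for one data element or set.
# Each axis is really a continuum; discrete enum members are used here
# only as the broad phase markers named in the text.
from dataclasses import dataclass
from enum import Enum

class Timeliness(Enum):            # TC axis, left to right
    IN_FLIGHT = 1
    LIVE = 2
    STABLE = 3
    RECONCILED = 4
    HISTORICAL = 5

class KnowledgeDensity(Enum):      # KD axis, lowest density first
    ATOMIC = 1
    DERIVED = 2
    COMPOUND = 3
    MULTIPLEX = 4

class RelianceUsage(Enum):         # RU axis, least dependable first
    VAGUE = 1
    PERSONAL = 2
    LOCAL = 3
    ENTERPRISE = 4
    GLOBAL = 5

@dataclass
class Topography:
    """Position of a data element in the BIR data space."""
    tc: Timeliness
    kd: KnowledgeDensity
    ru: RelianceUsage

# A reconciled, atomic, enterprise-managed customer master record.
customer_master = Topography(Timeliness.RECONCILED,
                             KnowledgeDensity.ATOMIC,
                             RelianceUsage.ENTERPRISE)
print(customer_master)
```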
The Timeliness/Consistency Axis
The timeliness/consistency (TC) axis defines the time period over which data validly exists and its level of consistency with logically related data. These two factors reside on the same axis because there is a distinct, and often difficult, inverse technical relationship between them. From left to right, timeliness moves from ephemeral data to eternal data; consistency moves from standalone to consistent, integrated data. When data is very timely (i.e., close to real time), ensuring consistency between related data items can be challenging. As timeliness is relaxed, consistency is more easily ensured. Satisfying a business need for high consistency in near-real-time data can be technically challenging and ultimately very expensive.

Along this axis, in-flight data consists of messages on the wire or the enterprise service bus; the data is valid only at the instant it passes by. This data-in-motion might be processed, used, and discarded. However, it is normally recorded somewhere, at which stage it becomes live. Live data has a limited period of validity and is subject to continuous change. It is also not necessarily completely consistent with other live data. That is the characteristic of stable and reconciled data, which are stable over the medium term. In addition to its stability, reconciled data is internally consistent in meaning and timing. Historical data is where the period of validity and consistency is, in principle, forever.

The TC axis broadly mirrors the lifecycle of data from creation through use to disposal or archival. Within its lifecycle, data traverses the TC axis from left to right, although some individual data items may traverse only part of the axis or may be transformed en route. A financial transaction, for example, starts life in-flight and exists unchanged right across the axis to the historical view. Customer information, on the other hand, usually appears first in live data, often in inconsistent subsets that are transformed into a single set of reconciled data and further expanded with validity time frame data in the historical stage.
[Figure 4. The axes of the business information resource: timeliness/consistency (in-flight, live, stable, reconciled, historical); knowledge density (atomic, derived, compound, multiplex); reliance/usage (vague, personal, local, enterprise, global).]

It is vital to note that this axis (like the others) is a continuum. The words in-flight, live, and so on denote broad phases in the continuous progression of timeliness from shorter to longer periods of validity and of consistency from less to more easily achieved. They are not discrete categories of data. Nor are there five data layers between
which data must be copied and transformed. They represent broad, descriptive levels of data timeliness and consistency against which business needs and technical implementation can be judged. Placing data at the left end of the axis emphasizes the need for timeliness; at the right end, consistency is more important.

It should be clear that the TC axis is the primary one along which data warehousing has traditionally operated. The current architecture splits data along this axis into discrete layers, assigning separate physical storage to each layer and distributing responsibility for the layers across the organization. Reuniting these layers, at first logically and perhaps eventually physically, is a key aim of BI2.
The Knowledge Density Axis
The knowledge density (KD) axis shows the amount of knowledge contained in a single data instance and reflects the ease with which meaning can be discerned in information. In principle, this measure could be numerical. For example, a single data item, such as Order Item Quantity, contains a single piece of information, while another data item, such as a Sales Contract, contains multiple pieces of information. In practice, however, counting and agreeing on the information elements in more complex data items is difficult and, as with the TC axis, the KD axis is more easily described in terms of general, loosely bounded classes.

At the lowest density level is atomic data, containing a single piece of information (or fact) per data item. Atomic data is extensively modeled and is most often structured according to the relational model. It is the most basic and simple form of data, and the most amenable to traditional (numerical) computer processing. The modeling process generates separate descriptions of the data (the metadata) without which the actual data is meaningless. At the next level of density is derived data, which typically consists of multiple occurrences of atomic data that have been manipulated in some way. Such data may be derived or summarized from atomic data; the latter process may result in data loss. Derived data is usually largely modeled, and its metadata is also separate from the data itself.

Compound data is the third broad class on the KD axis and refers to XML and similar data structures, where the descriptive metadata has been included (at least in part) with the data and where the combined data and metadata is stored in more complex or hierarchical structures. These structures may be modeled, but their inherent flexibility allows for less rigorous implementation. Although well suited to SOA and Web services approaches, such looseness can impact internal consistency and cause problems when combining compound data with atomic or derived data.

The final class is multiplex data, which includes documents, general content, images, video, and all sorts of binary large object (BLOB) data. In such data, much of the metadata about the meaning of the content is often implicit in the content itself. For example, in an e-mail message, the "To:" and "From:" fields clearly identify recipient and sender, but we need to apply judgment to the content of the fields and even the message itself to decide whether the sender is a person or an automated process.
This axis allows us to deal with the concepts of hard and soft information mentioned earlier. The KD axis also relates to the much-abused terms "structured," "semi-structured," and "unstructured." Placing information on this axis is increasingly important in modern business as more soft information is used. Given that such data makes up 80 percent or more of all stored data, it makes sense that much useful information can be found here, for example, by text mining and automated modeling tools. Just as we have traditionally transformed and moved information along the TC axis in data warehousing, we now face decisions about whether and how to transform and move data along the KD axis. In this case, the direction of movement is likely to be from multiplex to compound, with further refinement into atomic or derived data. The challenge is to do so with minimal copying.
The Reliance/Usage Axis

The final axis, reliance/usage (RU), has been largely ignored in traditional data warehousing, which confines itself to centrally managed and allegedly dependable data. However, the widespread use of personal data, such as spreadsheets, has always been problematic for data management (Eckerson and Sherman, 2008). Similarly,
data increasingly arrives from external sources: from trusted business partners all the way to the "world wild west" of the Internet. All this unmanaged and undependable information plays an increasingly important role in running a business. It is becoming clear that centrally managed and certified information is only a fraction of the information resource of any business.
The RU axis, therefore, classifies information according to how much faith can be placed in it and the uses to which it can be put. Global and enterprise information is strongly managed, either at an enterprise level or more widely by government, industry, or other regulatory bodies. It adheres to a well-defined and controlled information model, is highly consistent, and may be subject to audit. By definition, reconciled and historical information fall into these classes. Local information is also strongly managed, but only within a departmental or similar scope. Internal operational systems, with their long history of management and auditability, usually contain local or enterprise-class data. Information produced and managed by a single individual is personal and can be relied upon and used only within a very limited scope. A collaborative effort by a group of individuals produces information of higher reliability and wider usage and thus has a higher position on the RU axis.

Vague information is the most unreliable and poorly controlled. Internet information is vague, requiring validation and verification before use. Information from other external sources, such as business partners, has varying levels of reliability and usage.
The placement of information on this axis and the definition of rules and methods for handling different levels of reliance and usage are topics still in their infancy, but they will become increasingly important as the volumes and value of less closely managed and controlled data grow.
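Although such rules are, as noted, in their infancy, one can imagine them as a simple policy keyed by RU class. The policy table and admission gate below are purely illustrative, not a proposal from the article.

```python
# Hypothetical sketch of a reliance/usage handling rule. The policy
# values and the admission gate are invented for illustration.
RU_POLICY = {
    "global":     {"audit": True,  "verify_before_use": False},
    "enterprise": {"audit": True,  "verify_before_use": False},
    "local":      {"audit": False, "verify_before_use": False},
    "personal":   {"audit": False, "verify_before_use": True},
    "vague":      {"audit": False, "verify_before_use": True},
}

def admit(item_ru_class: str, verified: bool) -> bool:
    """Admit an information item into a decision process only if its
    reliance class permits direct use or it has been explicitly verified."""
    policy = RU_POLICY[item_ru_class]
    return verified or not policy["verify_before_use"]

print(admit("enterprise", verified=False))  # True: strongly managed data
print(admit("vague", verified=False))       # False: Internet data needs checks
```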
A Note about Metadata
The tendency of prior data warehouse architectures to carve up business information is also evident in their positioning of metadata as a separate layer or pillar. Such separation was always somewhat arbitrary and is no longer reasonable. We have probably all encountered debates about whether timestamps, for example, are business data or metadata. This new architecture places metadata firmly and fully in the business information resource for three key reasons. First, as discussed earlier, metadata is actually embedded in the compound and multiplex information classes by definition. Second, metadata is highly valuable and useful to the business. This is obvious for business metadata, but even so-called technical metadata is often used by power users and business analysts as they search for innovative ways to combine and use existing data. Third, as SOA exposes business services to users, those services' metadata will become increasingly important in creating workflows. Integrating metadata into the BIR simply makes life easier for business and IT alike. Metadata, when extracted from business information, resides in the compound data class.
Introducing Data Space Modeling and Implementation
The data topography and data space described above recognize and describe a fact of life for the vast majority of modern business processes: any particular business process (or, in many cases, a specific task) requires information that is distributed over the data space. A call center, for example, uses live, stable, and historical data along the TC axis; atomic, derived, and multiplex data along the KD axis; and local and enterprise data on the RU axis, as shown in Figure 5.
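Such a mapping could be encoded directly, for instance to flag processes whose needs span multiple phases of an axis and therefore cannot be served by one specialized store without federation or copying. The structure below is a hypothetical sketch using the phase names from the text.

```python
# Sketch of the Figure 5 mapping: the call center process touches data
# spread across all three axes. Plain strings keep the sketch standalone.
call_center_needs = {
    "timeliness_consistency": {"live", "stable", "historical"},
    "knowledge_density": {"atomic", "derived", "multiplex"},
    "reliance_usage": {"local", "enterprise"},
}

def spans_axis(needs, axis):
    """A process needing more than one phase on an axis cannot be served by
    a single store optimized for one phase without federation or copying."""
    return len(needs[axis]) > 1

for axis in call_center_needs:
    print(axis, "spans multiple phases:", spans_axis(call_center_needs, axis))
```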
Although this data space illustration provides a valuable
visual representation of the data needs of the process
and their inherent complexity, a more formal method of
describing the data relationships is required to support
practical implementation: data space modeling. Its aim
is to create a data model beyond the traditional scope
of hard information. Data space modeling includes soft
information and describes the data relationships that exist
within and across all data elements used by a process or
task, irrespective of where they reside in the data space.
To do this, I introduce a new modeling construct, the
information nugget, and propose that a new, dynamic
approach to modeling is needed, especially for soft
information. It should be noted that much work remains
to bring data space modeling to fruition.
The Information Nugget
An information nugget is the smallest set of related data (wherever it resides in, or is distributed through, the data space) that is of value to a business user in a particular context. It is the information equivalent of an SOA service, which is likewise defined as the smallest piece of business function from a user viewpoint. An information nugget can thus be as small as a single record when dealing with an individual transaction or as large as an array of data sets used by a business process at a particular time. As with SOA services, information nuggets may be composed of smaller nuggets or be part of many larger nuggets. They are thus granular, reusable, modular, composable, and interoperable. They often span traditional information types.
As modeled, an information nugget exists only once in the
BIR, although it may be widely dispersed along the three
axes. At a physical level, it ideally maps to a single data
instantiation, although the usual technology performance
and access constraints may require some duplication.
However, the purpose of this new modeling concept is
to ensure that information, as seen by business users, is
uniquely and directly related to its use, while minimizing
the level of physical data redundancy. When implemented,
the information nugget leads to rational decisions about
when and how data should be duplicated and to what extent
federation/EII approaches can be used.
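A sketch of what a nugget might look like as a catalog structure follows. The class, fields, and store names are hypothetical; the point is composability and a single logical identity with explicitly recorded physical locations.

```python
# Hypothetical sketch of the information nugget as a composable catalog
# entry. A real BIR catalog would hold references into physical stores
# rather than the data itself; all names here are invented.
from dataclasses import dataclass, field

@dataclass
class Nugget:
    """Smallest set of related data of value to a user in a context.
    A nugget exists once logically; `locations` records where its parts
    physically reside, so duplication is a recorded decision, not an
    accident."""
    name: str
    locations: list[str] = field(default_factory=list)
    parts: list["Nugget"] = field(default_factory=list)

    def compose(self, *nuggets: "Nugget") -> "Nugget":
        """Build a larger nugget from smaller, reusable ones."""
        self.parts.extend(nuggets)
        return self

order_line = Nugget("order_line", ["edw.orders"])
complaint_email = Nugget("complaint_email", ["content_store.inbox"])
# A larger nugget for a call-center task, spanning hard and soft data.
case_view = Nugget("customer_case").compose(order_line, complaint_email)
print([p.name for p in case_view.parts])  # ['order_line', 'complaint_email']
```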
Modeling Soft Information
Traditional information modeling approaches focus on (and, indeed, define and create) hard information. It is a relatively small step from such traditional modeling to envision how the relationships between multiple sets of hard information used in a particular task can be represented through simple extensions of existing models to describe information nuggets. The real problem arises with soft information, particularly that represented by the multiplex data class on the KD axis. Such data elements are most often modeled simply as text or object entities at the highest level, with no recognition that more fundamental data elements exist within these high-level entities.
Returning to the call center example, consider the customer complaint information that is vital to interactions between agents and customers. When such information arrives in the form of an e-mail or voicemail message from the customer, we can be sure that within the content exists real, valuable, detailed information, including product name, type of defect, failure conditions, where purchased, name of customer, etc. In order to relate such information to other data of interest, we must model the complaint information (multiplex data) at a lower level, internal to the usual text or object class.

Such modeling must recognize and handle two characteristics of soft information. First is the level of uncertainty about the information content and our ability to recognize the data items and values contained therein. For example, "the clutch failed when climbing a hill" and "I lost the clutch going up the St. Gotthard Pass" contain the same information about the conditions of a clutch failure, but that may be difficult to recognize immediately. Second, because soft information may contain lower-level information elements in different instances of the same text/object entity, each instance must be individually modeled on the fly as it arrives in the store.
[Figure 5. Sample data space mapping for the call center process: live, stable, and historical data on the timeliness/consistency axis; atomic, derived, and multiplex data on the knowledge density axis; local and enterprise data on the reliance/usage axis.]
Automated text mining and semantic and structural analysis are key components in soft information modeling, given the volumes and variety of information involved. Such tools essentially extract the tacit metadata from multiplex data and store it in a usable form. This enables multiplex data to be used in combination with the simpler atomic, derived, and compound classes on the KD axis. By storing this metadata in the BIR and using it as pointers to the actual multiplex data, we can avoid the need to transform, extract, and copy vast quantities of soft information into traditional warehouse data stores. We may also decide to extract certain key elements for performance or referential integrity needs. The important point is that we need to automatically model soft information at a lower level of detail to enable such decisions and to use this information class fully.
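A toy sketch of this extract-and-point pattern follows, under stated assumptions: the regular expression and keyword list stand in for real text mining and semantic analysis, and the content-store URI scheme and product code are invented.

```python
# Sketch of extracting tacit metadata from multiplex data and storing it
# as a pointer to the source. The patterns are toy stand-ins for the text
# mining and semantic analysis the article calls for; names are invented.
import re

complaint = {
    "uri": "content-store://mail/8872",   # where the multiplex data lives
    "body": "I lost the clutch on my T-400 going up the St. Gotthard Pass.",
}

PRODUCT = re.compile(r"\b([A-Z]-\d{3})\b")
COMPONENTS = {"clutch", "gearbox", "brakes"}

def extract_metadata(doc):
    """Model one instance on the fly: pull out the hard elements we can
    recognize, keep a pointer to the soft original for everything else."""
    body = doc["body"]
    m = PRODUCT.search(body)
    return {
        "source": doc["uri"],                    # a pointer, not a copy
        "product": m.group(1) if m else None,
        "component": next((c for c in COMPONENTS if c in body.lower()), None),
    }

print(extract_metadata(complaint))
# {'source': 'content-store://mail/8872', 'product': 'T-400',
#  'component': 'clutch'}
```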
Conclusions
This article posed three questions: (1) Do we need a new architecture for data warehousing after 25 years of evolution in business needs and technology? (2) If so, what would such an architecture look like? and (3) What new approaches would we need to implement it? The answers are clear.

1. Business needs and technology have evolved dramatically since the first warehouse architecture. Speed of response, agility in the face of change, and a significantly wider information scope for all aspects of the business demand a new, extensive level of information and process integration beyond any previously attempted. We need a new data warehouse architecture as well as a new enterprise IT architecture of which data warehousing is one key part.
2. Business integrated insight (BI2) is a proposed new architecture that addresses these needs while taking into account current trends in technology. It is an architecture with three layers—information, process, and people. Contrary to the traditional data warehouse approach, all information is placed in a single layer—the business information resource—to emphasize the comprehensive integration of information needed and the aim to eliminate duplication of data.
3. An initial step toward implementing this architecture is to describe and model a new topography of data based on broad types and uses of information. A data space mapped along three axes is proposed and a new modeling concept, the information nugget, is introduced. The architecture also requires dynamic, in-flight modeling, particularly of soft information, to handle the expanded data scope.
Although seemingly of enormous breadth and impact, the
BI2 architecture builds directly on current knowledge and
technology. Prior work to diligently model and implement a
true enterprise data warehouse will contribute greatly to this
important next step beyond BI to meet future enterprise
needs for complete business insight.
References
Devlin, B. [1997]. Data Warehouse: From Architecture to Implementation, Addison-Wesley.

——— [2009]. "Business Integrated Insight (BI2): Reinventing enterprise information management," white paper, September. http://www.9sight.com/bi2_white_paper.pdf

———, and P. T. Murphy [1988]. "An architecture for a business and information system," IBM Systems Journal, Vol. 27, No. 1, p. 60.

Eckerson, Wayne W., and Richard P. Sherman [2008]. Strategies for Managing Spreadmarts: Migrating to a Managed BI Environment, TDWI Best Practices Report, Q1. http://tdwi.org/research/2008/04/strategies-for-managing-spreadmarts-migrating-to-a-managed-bi-environment.aspx

Power, D. J. [2007]. "A Brief History of Decision Support Systems," v 4.0, March 10. http://dssresources.com/history/dsshistory.html