NoSE: Schema Design for NoSQL Applications
Michael J. Mior, Kenneth Salem
University of Waterloo
{mmior,kmsalem}@uwaterloo.ca
Ashraf Aboulnaga
Qatar Computing Research Institute, HBKU
aaboulnaga@qf.org.qa
Rui Liu
HP Vertica
r.liu@hp.com
Abstract—Database design is critical for high performance in
relational databases and many tools exist to aid application de-
signers in selecting an appropriate schema. While the problem of
schema optimization is also highly relevant for NoSQL databases,
existing tools for relational databases are inadequate for this
setting. Application designers wishing to use a NoSQL database
instead rely on rules of thumb to select an appropriate schema.
We present a system for recommending database schemas for
NoSQL applications. Our cost-based approach uses a novel binary
integer programming formulation to guide the mapping from the
application’s conceptual data model to a database schema.
We implemented a prototype of this approach for the Cassandra
extensible record store. Our prototype, the NoSQL Schema
Evaluator (NoSE), is able to capture rules of thumb used by expert
designers without explicitly encoding the rules. Automating the
design process allows NoSE to produce efficient schemas and to
examine more alternatives than would be possible with a manual
rule-based approach.
I. INTRODUCTION
NoSQL systems have become a popular choice as database
backends for applications because of the high performance,
scalability, and availability that they provide. In this paper,
we focus on one particular type of NoSQL system, termed
extensible record stores in Cattell’s taxonomy [1]. In these
systems, applications can create tables of records, with each
record identified by a key. However, the set of columns in
the records need not be defined in advance. Instead, each
record can have an arbitrary collection of columns, each with
an associated value. Because of this flexibility, applications
can encode their data in both the keys and column values. We
refer to tables in such systems as column families. Examples of
extensible record stores that support this column family model
include Cassandra [2], HBase [3], and BigTable [4].
Before an extensible record store application can be devel-
oped, it is necessary to define a schema for the underlying rec-
ord store. Although the schema of an extensible record store is
flexible in the sense that specific columns need not be defined
in advance, it is still necessary to decide what column families
will exist in the record store, and what information will be
encoded in each column family. These choices are important
because the performance of the application depends strongly
on the underlying schema. For example, some schemas may
provide answers to queries with a single lookup while others
may require multiple requests to the extensible record store.
Although it is important to choose a good schema, there
are no tools or established methodologies to guide and support
this process. Instead, current practices in schema design for
extensible record stores are captured in general heuristics
and rules of thumb. For example, eBay [5] and Netflix [6]
have shared examples and general guidelines for designing
schemas for Cassandra. Specific recommendations include not
designing column families as one would relational tables,
ensuring that column families reflect the anticipated workload,
and denormalizing data in the record store. While such recom-
mendations are clearly useful, they are necessarily vague and
generic, and must be tailored to each application. In Section II,
we illustrate some of the choices faced when designing a
schema for a specific application.
In this paper, we propose a more principled approach to
the problem of schema design for extensible record stores. Our
objective is to replace general schema design rules of thumb
with a tool that can recommend a specific extensible record
store schema optimized for a target application. Our tool uses
a cost-based approach to schema optimization. By estimating
the performance that candidate schemas would have for the
target application, we recommend the schema that results in the
best estimated performance. NoSE is intended to be used early
in the application development process: the tool recommends
a schema and the application is then developed using that
schema. In addition to providing a schema definition, NoSE
also recommends how the application should be implemented
against the schema.
This paper makes the following contributions. First, we
formulate the schema design problem for extensible record
stores. Second, we propose a cost-based solution to the schema
design problem, which is embodied in a schema design ad-
visor we call NoSE, the NoSQL Schema Evaluator. Using a
conceptual model of the data required by a target application,
as well as a description of how that data will be used, NoSE
recommends both an extensible record store schema, i.e., a
set of column family definitions which is optimized for the
target application, and guidelines for application development
using this schema. Finally, we present an evaluation of this
approach. For the evaluation, we use a simple online auction
Web application as the target and we evaluate both the quality
of the NoSE-recommended schema and the time required to
generate recommendations.
II. SCHEMA DESIGN EXAMPLE
In this section we present a simple example to illustrate the
schema design problem for extensible record stores. Suppose
we are building an application to manage hotel reservations.
The data that will be managed by this application are described
by the conceptual model in Figure 1, adapted from Hewitt [7].
The schema design problem for extensible record stores
is the problem of deciding what column families to create
and what information to store in each column family, for a
given application. In general, this will depend on what the
Fig. 1. Entity graph for a hotel booking system. Each box represents an
entity set, and edges between boxes represent relationships.
target application needs to do. For example, suppose that the
application will need to use the extensible record store to
obtain information about the points of interest (POIs) near
hotels that have been booked by a guest, given the guest’s
GuestID. The primary operations supported by an extensible
record store are retrieval (get) or update (put) of one or
more columns from a record, given a record key. Thus, this
query could be answered easily if the record store includes a
column family with GuestIDs as record keys and columns
corresponding to POIs. That is, the column family would
include one record for each guest. A guest’s record would
include one column for each POI that is associated with a
hotel at which that guest has booked a room. The column
names are POIIDs, and each column stores a composite value
consisting of POIName and POIDescription. In general,
each guest’s record in this column family may have different
columns. Furthermore, columns may be added to or removed
from a guest’s record when that guest’s hotel bookings are
updated in the record store. With such a column family, the
application can obtain point of interest information for a given
guest using a single get operation. This column family is
effectively a materialized view which stores the result of the
application query for all guests.
In this paper, we will describe such a column family using
the following triple notation:
[GuestID][POIID][POIName, POIDescription]
The first element of the triple indicates which attributes’
values will be used as record keys in the column family.
The second element indicates which attributes’ values will
be used as column names, and the third indicates which
will be used as column values. We refer to the first element as
the partitioning key, since extensible record stores typically
horizontally partition column families based on the record
keys. We refer to the second element as the clustering key,
since extensible record stores typically physically cluster each
record’s columns by column name.
Although this column family is ideal for executing the
single application query we have considered, it may not be
ideal when we consider the application’s entire workload. For
example, if the application expects to be updating the names
and descriptions of points of interest frequently, the above
column family may not be ideal because the point of interest
information is denormalized: the name and description of a
POI may be stored in many guests’ records. Instead, it may
be better to create two column families, as follows:
[GuestID][POIID][]
[POIID][][POIName, POIDescription]
This stores information about each point of interest once, in a
separate column family, making it easy to update. Similarly, if
the application also needs to perform another query that returns
information about the points of interest near a given hotel, it
may be better to create three column families, such as these:
[GuestID][HotelID][]
[HotelID][POIID][]
[POIID][][POIName, POIDescription]
In this schema, which is more normalized, records in the
third column family consist of a key (a POIID) and a single
column which stores the POIName and POIDescription
as a composite value. The second column family, which maps
HotelIDs to POIIDs, will be useful for both the original
query and the new one.
The goal of our system, NoSE, is to explore this space of
alternatives and recommend a good set of column families,
taking into account both the entire application workload and
the characteristics of the extensible record store.
The schema design problem that NoSE solves is related
to schema design for relational databases. However, there are
also significant differences between the two. Relational sys-
tems provide a clean separation between logical and physical
schemas. Standard procedures exist for translating a conceptual
model, like the entity graph in Figure 1, to a normalized logical
relational schema, i.e., a set of table definitions, against which
the application’s workload can be defined. The logical schema
often determines a physical schema consisting of a set of base
tables. The physical layout of these base tables can then be
optimized and supplemented with additional
physical structures, such as indexes and materialized views, to
tune the physical design to the anticipated workload. There are
many tools for recommending a good set of physical structures
for a given workload [8]–[15].
Extensible record stores, in contrast, do not provide a clean
separation between logical and physical design. There is only
a single schema, which is both logical and physical. Thus,
NoSE starts with the conceptual model, and produces both a
recommended schema and plans for implementing the appli-
cation against the schema. Further, the schema recommended
by NoSE represents the entire schema, not a supplement to a
fixed set of base tables. Unlike most relational physical design
tools, NoSE must ensure that the workload is covered, i.e.,
that the column families it recommends are sufficient to allow
the entire workload to be implemented. We provide further
discussion of relational physical design tools in Section VIII.
III. SYSTEM OVERVIEW
Figure 2 gives a high level illustration of the NoSE schema
advisor. NoSE is intended to be used early in the process
of developing an application using an extensible record store
backend. It takes two inputs. The first is a conceptual model
of the data required by the application. The second is a
description of the workload, indicating how the application
expects to query and update the record store.
Fig. 2. Schema advisor overview
Given these inputs, the advisor produces two outputs. The
first is a recommended schema, which describes the column
families that should be used to store the application’s data.
The second output is a set of plans, one plan for each query
and update in the workload. Each plan describes how the ap-
plication should use the column families in the recommended
schema to implement a query or an update. These plans are
used as a guide for the application developer.
In the remainder of this section, we describe the conceptual
model, workload, and schema of the record store in more de-
tail. Query plans are described in more detail in Section IV-C.
A. Database Conceptual Model
To recommend a schema for the target application, NoSE
must have a conceptual model describing the information that
will be stored in the record store. NoSE expects this conceptual
model in the form of an entity graph, such as the one shown in
Figure 1. Entity graphs are simply a restricted type of entity-
relationship (ER) model [16]. Each box represents a type of
entity, and each edge is a relationship between entities and the
associated cardinality of the relationship (one-to-many, one-to-
one, or many-to-many). Entities have attributes, one or more
of which serve as a key. For example, the model shown in
Figure 1 indicates that each room has an associated room
number and rate. In addition, each room is associated with
a hotel and with a set of reservations.
B. Workload Description
The target application’s workload is described as a set
of parameterized query and update statements. Each query
and update is associated with a weight indicating its relative
frequency in the anticipated workload. We focus here on the
queries, and defer discussion of updates to Section VI.
Each query in the workload returns information about
a particular type of entity. Figure 3 shows an example of
a NoSE query, expressed using an SQL-like syntax, which
returns the names and email addresses of guests who have
reserved rooms in a given city at a given minimum rate. In this
example, ?city and ?rate are parameters, values for which
would be provided by the application when it performs such a
query. NoSE queries are expressed directly over the conceptual
model. Specifically, each query identifies a target entity set
(in the FROM clause) and a path that originates at the target
entity set and traverses the entity graph. Each query returns
the attribute values from one or more entities along this path.
We emphasize that the underlying extensible record store
supports only simple put and get operations on column fam-
ilies, and is unable to directly interpret or execute queries like
the one shown in Figure 3. Instead, the application itself must
SELECT Guest.GuestName, Guest.GuestEmail FROM
Guest WHERE Guest.Reservation.Room.Hotel
.HotelCity = ?city AND
Guest.Reservation.Room.RoomRate > ?rate
Fig. 3. An example query against the hotel booking system schema
implement queries such as this, typically using a series of get
operations, perhaps combined with application-implemented
filtering, sorting, or joining of results. Nonetheless, by de-
scribing the workload to NoSE in this way, the application
developer can convey the purpose of a sequence of low-
level operations, allowing NoSE to optimize over the scope
of entire high-level queries, rather than individual low-level
optimizations. Of course, another problem with describing the
application workload to NoSE in terms of get and put
operations on column families is that the column families are
not known. Indeed, the purpose of NoSE is to recommend a
suitable set of column families for the target application.
Although it is not shown in Figure 3, NoSE queries can also
specify a desired ordering on the query results, using an
ORDER BY clause. This allows NoSE to recommend column
families which exploit the implicit ordering of clustering keys
to allow results to be constructed in the desired order.
C. Extensible Record Stores
The target of our system is extensible record stores, such as
Cassandra or HBase. These systems store collections of keyed
records in column families. Records in a collection need not
all have the same columns.
Given a domain K of partition keys, an ordered domain
C of clustering keys, and a domain V of column values, we
model a column family as a table of the form
K ↦ (C ↦ V)
That is, a column family maps a partition key to a set of
clustering keys, each of which maps to a value. Records
in a single partition are ordered by the clustering keys. For
example, in Section II, we used an example of a column family
with GuestIDs as partition keys, POIIDs as clustering keys,
and POI names and descriptions as values. Such a column
family would have one record for each GuestID, with POI
information for that guest’s records clustered using the POIID.
We assume that the extensible record store supports only
put and get operations on column families. To perform a
get operation, the application must supply a partition key and
a range of clustering key values. The get operation returns
all C ↦ V pairs within the specified clustering key range, for
the record identified by the partition key. For example, the
application could use a get operation to retrieve information
about the points of interest associated with a given GuestID.
Similarly, a put operation can be used to modify the C ↦ V
pairs associated with a single partition key.
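To make this model concrete, the following sketch (our own illustration, not part of NoSE; the class and the data values are hypothetical) represents a column family as a nested mapping and implements get over a clustering-key range and put on a single partition:

from bisect import bisect_left, bisect_right

class ColumnFamily:
    """Models K ↦ (C ↦ V): a partition key maps to ordered clustering keys, each mapping to a value."""

    def __init__(self):
        self.partitions = {}  # partition key -> {clustering key: value}

    def put(self, partition_key, clustering_key, value):
        # Insert or overwrite one C ↦ V pair within a single partition.
        self.partitions.setdefault(partition_key, {})[clustering_key] = value

    def get(self, partition_key, c_min=None, c_max=None):
        # Return (clustering key, value) pairs in clustering-key order,
        # optionally restricted to the range [c_min, c_max].
        row = self.partitions.get(partition_key, {})
        keys = sorted(row)
        lo = 0 if c_min is None else bisect_left(keys, c_min)
        hi = len(keys) if c_max is None else bisect_right(keys, c_max)
        return [(c, row[c]) for c in keys[lo:hi]]

# The guest-to-POI column family from Section II:
# [GuestID][POIID][POIName, POIDescription]
guest_pois = ColumnFamily()
guest_pois.put("g42", "poi7", ("Art Museum", "A downtown gallery"))
guest_pois.put("g42", "poi9", ("City Park", "A riverside park"))
print(guest_pois.get("g42"))  # a single get answers the POI query for guest g42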
Some extensible record stores provide additional capabil-
ities beyond the basic get and put operations we have de-
scribed. For example, in HBase it is possible to get information
for a range of partition keys, since records are also sorted based
on their partition key. As another example, Cassandra provides
Fig. 4. Complete schema advisor architecture
a limited form of secondary indexing, allowing applications to
select records by something other than the partitioning key.
However, many applications do not use them for performance
reasons [17]. For simplicity, we restrict ourselves to the simple
get/put model we have described, as it captures functionality
that is commonly present in extensible record stores.
D. The Schema Design Problem
A schema for an extensible record store consists of a set
of column family definitions. Each column family is identified
by a name, and its definition includes the domains of partition
keys, clustering keys, and column values used in that column
family.
Given a conceptual schema, an application workload, and
an optional space constraint, the schema design problem is to
recommend a schema such that (a) each query in the workload
can be answered using one or more get requests to column
families in the schema, (b) the weighted total cost of answering
the queries is minimized, and optionally (c) the aggregate size
of the recommended column families is within a given space
constraint. Solving this optimization problem is the objective
of our schema advisor. In addition to the schema, for each
query in the workload, NoSE recommends a specific plan for
obtaining an answer to that query using the recommended
schema. We discuss these plans further in Section IV-C.
IV. SCHEMA ADVISOR
Given an application’s conceptual schema and workload,
as shown in Figure 2, our advisor proceeds through four
steps to produce a recommended schema and a set of query
implementation plans.
1) Candidate Enumeration The first step is to generate
a set of candidate column families based on the
workload. By inspecting the workload, the advisor
generates only candidates which may be useful for
answering the queries in the workload.
2) Query Planning The advisor generates a space of
possible implementation plans for each query. These
plans make use of the candidate column families
produced in the first step.
3) Schema Optimization The possible plans for the
queries are used to generate a binary integer program
(BIP) used to choose a subset of the candidate column
families to form the recommended schema. The BIP
is then given to an off-the-shelf solver (we have
chosen to use Gurobi [18]) which chooses a set of
column families that minimizes the cost of answering
the queries.
4) Plan Recommendation The advisor chooses a single
plan from the plan space of each query to be the rec-
ommended implementation plan for that query based
on the column families selected by the optimizer.
This process is illustrated in Figure 4. In the remainder
of this section, we discuss candidate enumeration and query
planning. Schema optimization and plan recommendation are
presented in Section V.
A. Candidate Enumeration
One possible approach to candidate enumeration is to
consider all possible column families for a given set of entities.
However, the number of possible column families is exponen-
tial in the number of attributes, entities, and relationships in
the conceptual model. Thus, this approach does not scale well.
Instead, we enumerate candidates using a two-step process
based on the application’s workload. First, we enumerate can-
didate column families for each query in the application work-
load using an algorithm Enumerate(q). The union of these
sets C is used as our initial pool, i.e., C ← ∪q Enumerate(q).
Second, we supplement this pool with additional column
families constructed by combining candidates from the initial
pool, C ← C ∪ Combine(C). The goal of the second
step is to add candidates which are likely to be useful for
answering more than one query while consuming less space
than two separate column families. Both the Enumerate and
Combine algorithms are described in the following sections.
We note that we do not claim to enumerate column families
which result in an optimal schema. Any enumeration algorithm
which produces valid column families capable of answering
queries in the workload could be substituted here. We leave
heuristics to determine additional useful column families as
future work. However, the optimization process we discuss
in Section V produces the optimal subset of the enumerated
candidates for the given cost model.
1) Candidate Column Families: Recall from Section III-C
that a column family is a mapping of the form K ↦ (C ↦ V).
To define a specific column family, we need to determine K,
C, and V. That is, we need to specify what the keys, columns,
and values will be for the column family.
We consider column families in which keys, columns, and
values consist of one or more attributes from the application’s
conceptual model. For example, using pairs (HotelCity,
HotelState) as partition keys, HotelNames as clustering
keys, and (HotelAddress, HotelPhone) as values, we
can define a column family that can be used to retrieve, for
a given city and state, a list of hotel names, addresses, and
phone numbers, in order of hotel name. We represent each
column family as a triple, consisting of a set of partition
key attributes, an ordered list of clustering key attributes,
and a set of value attributes. Each column family has an
associated path of relationships linking the entities contained
in the column family. However, for the examples we show,
the path is unambiguous and is omitted from the description.
For example, the column family described above would be
represented as
[HotelCity, HotelState][HotelName,
HotelID][HotelAddress, HotelPhone].
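As an illustration (our own encoding, not NoSE's internal representation), such a candidate can be captured as a simple record holding the three elements of the triple plus the relationship path:

from dataclasses import dataclass
from typing import FrozenSet, Tuple

@dataclass(frozen=True)
class Candidate:
    partition_keys: FrozenSet[str]    # attributes forming the partition key
    clustering_keys: Tuple[str, ...]  # ordered clustering key attributes
    values: FrozenSet[str]            # value attributes stored in each record
    path: Tuple[str, ...] = ()        # relationship path linking the entities

# The hotel lookup column family described above:
hotels_by_location = Candidate(
    partition_keys=frozenset({"HotelCity", "HotelState"}),
    clustering_keys=("HotelName", "HotelID"),
    values=frozenset({"HotelAddress", "HotelPhone"}),
    path=("Hotel",),
)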
Column families are not limited to containing information
on a single entity from the conceptual model. For any query
in our language, we can define a column family that can be
used to directly retrieve answers to that query, which we call a
materialized view. For example, the query shown in Figure 3,
which returns the names and emails of guests who have
reserved rooms at hotels in a given city, at room rates above
a given rate, corresponds to the following column family:
[HotelCity][RoomRate, GuestID]
[GuestName, GuestEmail]
By supplying a city name and a minimum room rate,
an application can use this column family to retrieve a list
of (RoomRate, GuestID) pairs, with each mapped to the
name and email address of the specified guest. We do not
show it here, but we also include the ID of each entity along
the path in the clustering key (e.g., HotelID, RoomID, and
ResID). This ensures we have a unique record for each guest
reservation since the same guest and hotel may be connected
multiple ways (e.g. through different reservations for the same
hotel). The use of HotelCity in the partition key restricts
the results to a given city while having RoomRate as part of
the clustering key allows for range queries to find reservations
over the given rate.
2) Per-Query Candidate Enumeration: Since each query
corresponds to a column family, the Enumerate algorithm
could enumerate just one candidate for each query, namely a
column family which has the appropriate attributes to answer
the query using a single get operation against the extensible
record store. This single column family is the materialized
view for the query as described above. However, the schema
optimizer requires more flexibility than this, since its space
budget may not allow the recommendation of a materialized
view for each query. In addition, when we later consider
updates, a column family for each query may become too
expensive to maintain. Therefore, Enumerate also includes
additional column families in the candidate pool for each
query. Each provides a partial answer for the query. The
application can use these to answer the query by combining the
results of multiple get requests to different column families.
To generate the full pool of candidate column families
for a given query, we decompose the query at each entity
along the query path. Decomposing a query at a specific
entity along its path splits the query into two parts, which
we call the prefix query and a remainder query. Enumerate
produces one or more candidate column families based on
the prefix query, then recursively decomposes the remainder
query using the same approach. Figure 5 illustrates the first
level of this recursive splitting process for the example query
that was shown in Figure 3. The query path for this query
is Guest.Reservation.Room.Hotel, so the query is
decomposed at four points.

Decomposition point: Guest
  Prefix query:    SELECT Guest.GuestName, Guest.GuestEmail FROM Guest
                   WHERE Guest.Reservation.Room.Hotel.HotelCity = ?
                   AND Guest.Reservation.Room.RoomRate > ?
  Remainder query: none

Decomposition point: Reservation
  Prefix query:    SELECT Reservation.ResID FROM Reservation
                   WHERE Reservation.Room.Hotel.HotelCity = ?
                   AND Reservation.Room.RoomRate > ?
  Remainder query: SELECT Guest.GuestName, Guest.GuestEmail FROM Guest
                   WHERE Guest.Reservation.ResID = ?

Decomposition point: Room
  Prefix query:    SELECT Room.RoomID FROM Room
                   WHERE Room.Hotel.HotelCity = ?
                   AND Room.RoomRate > ?
  Remainder query: SELECT Guest.GuestName, Guest.GuestEmail FROM Guest
                   WHERE Guest.Reservation.Room.RoomID = ?

Decomposition point: Hotel
  Prefix query:    SELECT Hotel.HotelID FROM Hotel
                   WHERE Hotel.HotelCity = ?
  Remainder query: SELECT Guest.GuestName, Guest.GuestEmail FROM Guest
                   WHERE Guest.Reservation.Room.Hotel.HotelID = ?
                   AND Guest.Reservation.Room.RoomRate > ?

Fig. 5. Example of query decomposition for candidate enumeration
For each prefix query, NoSE enumerates a column family
with the appropriate construction to allow the prefix query
to be answered with a single request to the backend record
store. In addition, depending on the form of the prefix query,
the enumerator may enumerate additional candidates. If the
SELECT clause of the prefix query includes attributes not in
the key of the target entity, two additional candidates will be
enumerated: one that returns only the key attributes, and a
second that returns the attributes from the SELECT clause,
given a key. For example, for the first prefix query in Figure 5,
in addition to the materialized view for that query, which is
[HotelCity][RoomRate, GuestID]
[GuestName, GuestEmail]
we also enumerate the following two column families:
[HotelCity][RoomRate, GuestID][]
[GuestID][][GuestName, GuestEmail].
The former can be used to return a set of GuestIDs, given a
city and a room rate, and the latter can then be used to retrieve
the guests’ names and email addresses.
Finally, when the prefix query contains multiple predicates,
the enumerator generates additional candidates corresponding
to relaxed versions of the prefix query. Specifically, when the
enumerator considers a query of the form
SELECT attribute-list FROM entity WHERE
entity.attr op ? AND predicate2 AND ...
it also generates materialized view column families for relaxed
queries of the form
SELECT attribute-list, attr FROM entity
WHERE predicate2 AND ...
That is, we remove one or more predicates and add the
attributes involved in the predicates to the SELECT list. These
column families can be used to retrieve a superset of the result
of the original prefix query, which can then be filtered by
the application. When relaxing prefix queries, the enumerator
removes only predicates that test an attribute of the target
entity (the one mentioned in the FROM clause). Furthermore,
predicates are only considered for removal if the remaining
query will have at least one equality predicate remaining. (This
is required to construct a valid get request on the column
family that will be constructed by the schema advisor.)
We can relax queries involving ordering in the same way by
moving an attribute in an ORDER BY clause to the SELECT
list. The plan for a query using this column family would then
perform a sort in the application on the selected attribute.
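The sketch below illustrates this relaxation step on a toy query representation (the Predicate and PrefixQuery types and the example values are our own; NoSE's internal structures differ):

from dataclasses import dataclass, replace
from typing import Tuple

@dataclass(frozen=True)
class Predicate:
    attribute: str  # e.g. "Room.RoomRate"
    op: str         # "=", ">", "<", ...

@dataclass(frozen=True)
class PrefixQuery:
    select: Tuple[str, ...]
    entity: str                       # target entity in the FROM clause
    predicates: Tuple[Predicate, ...]

def relaxations(q):
    """Drop one removable predicate and move its attribute to the SELECT list."""
    for i, p in enumerate(q.predicates):
        remaining = q.predicates[:i] + q.predicates[i + 1:]
        on_target = p.attribute.count(".") == 1 and p.attribute.startswith(q.entity + ".")
        # keep at least one equality predicate so a valid get can still be issued
        if on_target and any(r.op == "=" for r in remaining):
            yield replace(q, select=q.select + (p.attribute,), predicates=remaining)

q = PrefixQuery(
    select=("Room.RoomID",),
    entity="Room",
    predicates=(Predicate("Room.Hotel.HotelCity", "="), Predicate("Room.RoomRate", ">")),
)
for relaxed in relaxations(q):
    print(relaxed.select, [p.attribute for p in relaxed.predicates])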
3) Candidate Combinations: Once candidates have been
enumerated for each query in the workload, Combine con-
siders additional candidates by combining the per-query can-
didates that are already present in the pool. There are many
opportunities for creating new column families by combining
candidates from the pool, but Combine currently only exploits
one simple opportunity. Specifically, Combine looks for pairs
of column families in the pool for which
• both column families have the same partition key, and
• both column families have no clustering key, and
• the two column families have different data attributes.
For each such pair, Combine adds an additional column
family to the candidate pool. The new column family has the
same partition keys as the column families on which it is based,
and the union of their value attributes. Thus, the new candidate
will be larger than either of the original candidates but will be
potentially useful for answering more than one query.
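A minimal sketch of this combining rule, with candidates encoded as (partition keys, clustering keys, value attributes) triples (an illustrative encoding, not NoSE's own):

from itertools import combinations

def combine(pool):
    """Merge pairs that share a partition key, have no clustering key,
    and store different value attributes."""
    merged = set()
    for a, b in combinations(pool, 2):
        if a[0] == b[0] and not a[1] and not b[1] and a[2] != b[2]:
            merged.add((a[0], (), a[2] | b[2]))
    return pool | merged

pool = {
    (frozenset({"GuestID"}), (), frozenset({"GuestName"})),
    (frozenset({"GuestID"}), (), frozenset({"GuestEmail"})),
}
for candidate in combine(pool) - pool:
    print(candidate)  # one new candidate holding both GuestName and GuestEmail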
Increasing the number of candidates increases the opportu-
nity for the schema advisor to identify a high-quality schema
but also increases the running time of the advisor. As future
work, we intend to explore other opportunities for candidate
generation in light of this tradeoff as well as heuristics to prune
column families which are unlikely to be useful.
B. Application Model
To determine which candidate column families to rec-
ommend, the schema advisor must understand the way in
which the candidates will be used by the application. This
understanding is based on an application model. This model
is used in two ways. First, it allows the advisor to estimate the
cost (to the application) of executing queries in the workload
under a particular schema recommendation by using the model
to determine how the application would use the recommended
schema to obtain answers to the workload queries. Second,
the schema advisor can use the model as a framework for
describing to the application developers how the recommended
schema should be used to answer the queries. We refer to
these descriptions as query implementation plans or simply,
plans. The schema advisor’s output includes a plan for each
application query in the input workload. We note that the
exact cost model used to estimate the cost of each query
implementation plan is not important to our approach. The
simple cost model we have developed is detailed in [19].
Consider the example query from Figure 3, and suppose
the following column families exist in the schema:
CF1: [HotelCity][RoomID][RoomRate]
CF2: [RoomID][GuestID][GuestName, GuestEmail]
The application can use these column families to answer
the example query by performing the following steps:
1) Using the given city as the partition key value,
perform a get operation on CF1 to obtain a list of
(RoomID, RoomRate) pairs.
2) Using the given minimum room rate, filter the list of
(RoomID, RoomRate) pairs to eliminate those with a
RoomRate less than the minimum allowable rate.
3) For each remaining RoomID, use the RoomID as the
partition key value to perform a get on CF2 to obtain
a list of (GuestID, GuestName, GuestEmail)
triples. Merge the results (discarding duplicates) to
obtain the final query result.
We consider this plan to consist of instances of four primitive
operation types, which form part of the application model.
The first is get operations on the underlying record store. The
second is a filter operation, which is used in step 2 to eliminate
rooms with low rates from the result of the get operation in
step 1. The third is a join operation, which is used in step 3
to take results from an earlier step and use them as keys to
get data from another column family. Finally, our application
model also includes a sort operation, which may be used, for
example, when the schema does not allow the application to
retrieve results in the desired order. Of these operations, only
data access (the get operation) is implemented by the NoSQL
data store. Filtering, sorting, and joining, when required, must
be performed on the client side by the application itself.
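The following sketch shows how an application could carry out this three-step plan on the client side, with plain dictionaries standing in for the record store (all identifiers and values are made up for illustration):

# CF1: [HotelCity][RoomID][RoomRate]
cf1 = {"Waterloo": {"r1": 80, "r2": 150, "r3": 200}}
# CF2: [RoomID][GuestID][GuestName, GuestEmail]
cf2 = {
    "r2": {"g1": ("Alice", "alice@example.com")},
    "r3": {"g1": ("Alice", "alice@example.com"), "g2": ("Bob", "bob@example.com")},
}

def run_plan(city, min_rate):
    rooms = cf1.get(city, {})                                        # step 1: get on CF1 by partition key
    matching = [r for r, rate in rooms.items() if rate > min_rate]   # step 2: client-side filter
    result = {}
    for room in matching:                                            # step 3: client-side join via gets on CF2
        for guest, info in cf2.get(room, {}).items():
            result[guest] = info                                     # merge, discarding duplicate guests
    return result

print(run_plan("Waterloo", 100))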
C. Query Planning
The task of the query planner is to enumerate all possible
plans for evaluating a given query, under the assumption that
all candidate column families are available. Each plan is a
sequence of application operations, using candidate column
families, that will produce an answer to an application query.
We refer to the result of this process as the plan space for the
given query. Later, during schema optimization, the schema
advisor will use the plan spaces for each query to determine
which of the candidate column families to recommend.
Query planning in our schema advisor is performed as
part of the same recursive decomposition process that is used
to generate candidate column families. Consider again the
decomposition of the running example query (Figure 3) that
was shown in Figure 5. For each prefix query, the query planner
generates a set of implementation plans, each of which starts
by retrieving the results of the prefix query, and finishes by
joining those results to a (recursively calculated) plan for the
corresponding remainder query. When generating plans for a
prefix query, the planner will generate one set of plans for each
of the candidate column families that was generated for that
prefix query.
Figure 6 shows the plan space for the simple relaxed prefix
query seen in Section IV-A:
SELECT Room.RoomID FROM Room WHERE
Room.Hotel.HotelCity = ?city AND
Room.RoomRate > ?rate

CF1: [HotelCity][RoomRate, RoomID][]
CF2: [HotelCity][RoomID][]
CF3: [HotelCity][HotelID][]
CF4: [HotelID][RoomID][]
CF5: [RoomID][][RoomRate]
Fig. 6. Example query plan space
There are three possible plans in the plan space. The
first uses the materialized view CF1 to answer the query
directly. The second finds the HotelID for all hotels in a
given HotelCity using CF3. The HotelID is then used to
find all the RoomIDs for the given hotel using CF4. Finally,
the RoomRates are retrieved using CF5 and the RoomIDs
are filtered to only those matching the predicate on
RoomRate. The final plan is similar, but goes directly from
a HotelCity to a list of RoomIDs using CF2.
Note that the number of column families that can be used to
execute these plans is larger than what is shown. Each plan can
make use of any column family which contains the necessary
data, even one that is "larger" through the addition of a suffix
to the clustering key or of additional value attributes.
V. SCHEMA OPTIMIZATION
A naïve approach to schema optimization is to examine
each element in the power set of candidate column families
and evaluate the cost of executing each workload query using
a plan that involves only the selected candidates. However, this
approach scales poorly as it is exponential in the total number
of candidate column families.
Papadomanolakis and Ailamaki [13] present a more effi-
cient approach to the related problem of index selection in
relational database systems. Their approach formulates the
index selection problem as a binary integer program (BIP)
which selects an optimal set of indices based on the index
configurations that are useful for each query in the workload.
Their approach uses a set of decision variables for each query,
with the number of variables per query equal to the number
of combinations of indices useful to that query. This is still
exponential, as in the naïve approach, but only in the number
of indices relevant to each query, rather than the total number
of candidate indices.

minimize    Σi Σj Cij δij
subject to
    δij ≤ δj  ∀ i, j        (all used column families must be present)
    Σj sj δj ≤ S            (maximum space usage S)
    plus per-query path constraints (see text)

Fig. 7. Binary integer program for schema optimization
Like Papadomanolakis and Ailamaki, we have imple-
mented schema optimization by formulating the problem as a
BIP. However, because of the relatively simple structure of the
query implementation plans that our schema advisor considers,
we are able to provide a simpler formulation for our problem.
Our schema advisor uses the query plan spaces described
in Section IV-C to generate a BIP. The program uses a binary
decision variable, δij, for each combination of a candidate
column family and a workload query. The variable δij indicates
whether the ith query will use the jth column family in
its implementation plan. The objective of the optimization
program is to minimize the quantity Σi Σj Cij δij, where Cij
represents the cost of using the jth column family in the plan
for the ith query. However, after solving this optimization
problem, we run the solver a second time with an additional
constraint that the cost of the workload equals the minimum
value which was just discovered. The new objective function
is the total number of column families in the recommended
schema. This allows NoSE to produce the smallest schema
from among those which are most efficient.
In addition to the decision variables δij, our program for-
mulation uses one other decision variable per candidate column
family. These variables indicate whether the corresponding
column families will be included in the set of column families
recommended by the schema advisor. We use δj to represent
this per-column-family decision variable for the jth candidate
column family. Our BIP includes constraints that ensure that
• the jth column family is included in the set of recommended
column families if it is used in the plan for at least one query,
and
• (optionally) the total size of the recommended column
families is less than the specified space constraint.
Overall, this approach requires |Q| · |P| variables representing
the use of column families in query implementation plans, and
|P| variables representing which column families should be
recommended, where |Q| represents the number of queries and
|P| the number of candidate column families. We also allow
an optional storage constraint whereby the user can specify
a limit S on the amount of storage occupied by all column
families. The estimated size of each column family, sj, is also
given as a parameter to the BIP. The binary integer program
is summarized in Figure 7.
As noted in Figure 7, the BIP also requires a set of
path constraints, on the variables δij, which ensure that the
solver will choose a set of column families for each query
that correspond to one of the plans in the query plan space.
These constraints are derived from the per-query plan spaces
determined by the query planner. For example, in Figure 6, at
most one of CF1, CF3, and CF2 can be selected for use in
answering this query, since each is useful for different plans,
and only one plan is selected per query. In addition, if CF3 is
selected for use, then CF4 must also be selected. The BIP will
include corresponding constraints on the decision variables δij
that indicate whether those column families are used to answer
this query. For this example, the following constraints would
be generated for the example in Figure 6 (assuming we number
the query as query 1):
δ1,1 + δ1,3 + δ1,2 = 1
δ1,4 = δ1,3
δ1,5 = δ1,4 + δ1,2
After solving the BIP, making the final plan recommenda-
tion is straightforward. There is a unique plan with minimal
cost based on the values of the decision variables in the BIP.
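For illustration, the toy problem above can be written out directly for an off-the-shelf solver. The sketch below uses the open-source PuLP library rather than Gurobi, and the cost and size values are made up; it only shows the shape of the program, not NoSE's actual formulation:

from pulp import LpBinary, LpMinimize, LpProblem, LpVariable, lpSum, value

families = ["CF1", "CF2", "CF3", "CF4", "CF5"]
cost = {"CF1": 1.0, "CF2": 2.0, "CF3": 1.5, "CF4": 1.5, "CF5": 1.5}  # illustrative C1j
size = {"CF1": 40, "CF2": 20, "CF3": 10, "CF4": 15, "CF5": 15}       # illustrative sj
S = 60                                                                # space budget

prob = LpProblem("schema_selection", LpMinimize)
use = {j: LpVariable(f"delta_q1_{j}", cat=LpBinary) for j in families}  # δ1j
pick = {j: LpVariable(f"delta_{j}", cat=LpBinary) for j in families}    # δj

prob += lpSum(cost[j] * use[j] for j in families)        # minimize workload cost
for j in families:
    prob += use[j] <= pick[j]                            # used families must be recommended
prob += lpSum(size[j] * pick[j] for j in families) <= S  # optional space constraint
# path constraints for the plan space of Figure 6 (query 1)
prob += use["CF1"] + use["CF3"] + use["CF2"] == 1
prob += use["CF4"] == use["CF3"]
prob += use["CF5"] == use["CF4"] + use["CF2"]

prob.solve()
print([j for j in families if value(use[j]) > 0.5])  # column families used by the query's plan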
VI. UPDATES
The previous sections described how NoSE functions on a
read-only workload, but it is important to also consider updates
in the workload description. Updates implicitly constrain the
amount of denormalization present in the generated schema.
This effect results from the maintenance required when the
same attribute appears in multiple column families. Each
column family which contains an attribute that is modified
by an update must also be modified, so repetition of attributes
increases update cost.
We first introduce extensions to our workload description
to express updates. We then discuss how NoSE executes
these updates for a given set of column families. Finally, we
describe modifications required to the enumeration algorithm
and the BIP used by NoSE in order to support these updates.
Extensions to our workload description to consider such tasks
as performing aggregation at insertion time are left as future
work.
A. Update Language
In order to support updates to the workload, we extend
our query language with additional statements which support
updates to data according to the conceptual model as in the
examples shown in Figure 8. INSERT statements create new
entities and result in insertions to column families containing
INSERT INTO Reservation SET ResEndDate = ?date
DELETE FROM Guest WHERE Guest.GuestID = ?guestid
UPDATE Reservation FROM Reservation.Guest SET
Reservation.ResEndDate = ?
WHERE Guest.GuestID = ?guestid
CONNECT User(?userid) TO Reservations(?resid)
DISCONNECT User(?userid) FROM Reservations(?resid)
Fig. 8. Example update statements in the workload
Query
SELECT Room.RoomRate FROM
Room.Hotel.PointsOfInterest
WHERE Room.RoomFloor=?floor
AND PointsOfInterest.POIID=?id
Materialized view
[Room.RoomFloor][PointsOfInterest.POIID,
Hotel.HotelID, Room.RoomID]
[Room.RoomRate]
Update
UPDATE Room FROM Room.Reservations.Guest
SET RoomRate=?rate1 WHERE Guest.GuestID=?id
AND Room.RoomRate=?rate2
Fig. 9. An example workload against the hotel schema
attributes from this entity. We assume that the primary key of
each entity is provided with the entity when it is inserted, but
all other attributes are optional. UPDATE statements modify
attributes, resulting in updates to any corresponding column
families in the schema. DELETE statements remove entities,
and all data about the entity is removed from any associated
column families. Both UPDATE and DELETE statements spec-
ify the entities to modify using the same predicates available
for queries. Finally, CONNECT and DISCONNECT statements
allow for relationships between entities to be created or re-
moved. These statements simply specify the primary key of
each entity and the relationship to modify. We also allow
relationships to be created when a new entity is inserted.
B. Update Execution
As with queries, NoSE must provide an implementation
plan for each update. To update a column family, we delete
records for old data which is being modified or removed (in
the case of UPDATE,DELETE, and DISCONNECT). We then
insert records corresponding to the new data (in the case of
UPDATE,INSERT, and CONNECT). For the example given
in Figure 9, we would update the materialized view by first
removing the record for the old RoomRate. This would
be a tuple consisting of values for RoomFloor,POIID,
HotelID, and RoomID. The associated RoomRate is not
required since the partition and clustering key attributes con-
stitute the primary key for the record.
In general, to modify records we must have the primary
key (partition and clustering key attributes) available for each
record in each column family which needs to be modified in
order to construct a valid put request. Since these attributes
may not be provided with the update, we construct one or
more support queries which select values for the attributes in
the primary key, given the predicates specified in
the update. Support query construction varies depending on the
type of update, with details provided in [19]. A complete plan
for an update consists of executing support queries followed
by insertion and/or deletion into the associated column family.
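A sketch of this delete-then-insert pattern for the materialized view of Figure 9, with the view held in a plain dictionary and the support-query result hard-coded (in NoSE the rows would come from executing the generated support query):

# [Room.RoomFloor][PointsOfInterest.POIID, Hotel.HotelID, Room.RoomID][Room.RoomRate]
view = {3: {("poi7", "h1", "r9"): 120}}

def apply_rate_update(view, support_rows, new_rate):
    """Delete the old records identified by the support query, then insert
    records carrying the new RoomRate."""
    for floor, poi_id, hotel_id, room_id in support_rows:
        view.get(floor, {}).pop((poi_id, hotel_id, room_id), None)          # delete old record
        view.setdefault(floor, {})[(poi_id, hotel_id, room_id)] = new_rate  # insert new record

# The support query returns the partition and clustering key attributes of
# every record in the view touched by the UPDATE statement.
support_rows = [(3, "poi7", "h1", "r9")]
apply_rate_update(view, support_rows, 150)
print(view)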
C. Column Family Enumeration for Updates
Additional column families may be necessary to answer
support queries for updates in the workload. To expand our
enumeration to include updates, we use the modified procedure
shown in Algorithm 1. The Enumerate and Combine func-
tions are the same as previously described in Section IV-A. The
Modifies? function is a predicate which tests if an update
requires modifications to a given column family. Finally, the
Support function produces all the support queries necessary
for a given update on a column family. Note that we run the
candidate enumeration process twice for support queries. This
is because when generating support queries for statements in
the original workload, we may cover new paths in the entity
graph. As a result, it is possible that there are no column
families suggested by Enumerate which can provide answers
to these support queries. By performing enumeration for these
support queries a second time, we are guaranteed to cover all
paths which were expanded in the first step.
input : A set of queries Q and updates U
output: A set of enumerated column families
/* enumeration for workload queries */
C ← ∅;
foreach query q in Q do
    C ← C ∪ Enumerate(q);
end
/* enumeration for support queries */
do twice
    C′ ← C;
    foreach update u in U do
        foreach column family c in C′ do
            if Modifies?(u, c) then
                foreach query q in Support(u, c) do
                    C ← C ∪ Enumerate(q);
                end
            end
        end
    end
end
return C ∪ Combine(C);
Algorithm 1: Column family enumeration for updates
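A direct Python transcription of Algorithm 1 is shown below; Enumerate, Combine, Modifies? and Support are stubbed out, since their real implementations are the components described in Sections IV and VI:

def enumerate_cf(query):
    """Stub for Enumerate(q): per-query candidate enumeration (Section IV-A)."""
    return {f"CF({query})"}

def combine(pool):
    """Stub for Combine(C): merged candidates built from the pool."""
    return set()

def modifies(update, cf):
    """Stub for Modifies?(u, c): does the update touch this column family?"""
    return True

def support(update, cf):
    """Stub for Support(u, c): support queries needed to apply the update to cf."""
    return {f"support({update}, {cf})"}

def enumerate_with_updates(queries, updates):
    pool = set()
    for q in queries:                 # enumeration for workload queries
        pool |= enumerate_cf(q)
    for _ in range(2):                # support-query enumeration, run twice
        snapshot = set(pool)          # C′ ← C
        for u in updates:
            for cf in snapshot:
                if modifies(u, cf):
                    for sq in support(u, cf):
                        pool |= enumerate_cf(sq)
    return pool | combine(pool)

print(enumerate_with_updates(["q1"], ["u1"]))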
D. BIP Modifications
To incorporate updates into our BIP, we first add constraints
for all support queries in a similar manner to queries in the
original workload. However, there is an additional constraint
that ensures a plan is only generated for a support query
if the decision variable for the associated column family is
set. This ensures that we do not generate implementation
plans for support queries which do not need to be executed.
A term Σm Σn C′mn δn is added to the objective function to
represent the cost of updating each column family, which is
contingent on the column family being selected for the final
schema. C′mn is the cost of updating column family n for
update m, given that the column family is selected for inclusion
in the final schema (δn). The cost of the support queries is also
added using the same weight specified for the update in the
workload. This modification to the objective function constrains
the problem to reduce denormalization of attributes which are
heavily updated. These modifications are shown in Figure 10.

minimize    Σi Σj Cij δij + Σm Σn C′mn δn
subject to
    all constraints from Figure 7
    additional constraints per support query (see text)

Fig. 10. BIP modifications for updates
After solving this new BIP, we plan each update by first
generating any necessary support query plans in the same way
as plans for queries in the original workload. Each update plan
then consists of a series of support query plans along with
insertion or deletion as necessary for the update. Note that we
only need to generate update plans for column families which
are included in the recommended schema (i.e. those that have
their decision variables set).
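Continuing the earlier solver sketch (again with PuLP and made-up numbers), the update term adds a cost on the per-column-family decision variables, which pushes the solver away from heavily updated, denormalized candidates:

from pulp import LpBinary, LpMinimize, LpProblem, LpVariable, lpSum, value

# One query that can be answered by CF_a (denormalized) or CF_b (normalized),
# and one update whose cost is paid only for column families in the schema.
query_cost = {"CF_a": 1.0, "CF_b": 3.0}
update_cost = {"CF_a": 5.0, "CF_b": 0.5}  # C′mn for the single update m
weight = 10                                # weight of the update in the workload

prob = LpProblem("schema_with_updates", LpMinimize)
use = {j: LpVariable(f"delta_q1_{j}", cat=LpBinary) for j in query_cost}
pick = {j: LpVariable(f"delta_{j}", cat=LpBinary) for j in query_cost}

prob += (lpSum(query_cost[j] * use[j] for j in query_cost)
         + weight * lpSum(update_cost[j] * pick[j] for j in query_cost))
for j in query_cost:
    prob += use[j] <= pick[j]
prob += use["CF_a"] + use["CF_b"] == 1  # exactly one plan for the query

prob.solve()
print([j for j in query_cost if value(pick[j]) > 0.5])  # frequent updates favour CF_b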
VII. EVALUATION
In this section we present an evaluation of NoSE, designed
to address two questions. First, does NoSE produce good
schemas (Section VII-A)? Second, how long does it take for
NoSE to generate schema recommendations (Section VII-B)?
A. Schema Quality
To evaluate the schemas recommended by NoSE, we
used it to generate schema and plan recommendations for a
target application. We then implemented the recommended
schema in Cassandra along with the recommended application
plans. While executing the plans against the record store, we
measured their execution times. In a similar manner, we also
implemented and executed the same statements against two
other schemas for comparison, which we describe later. Our
implementation of NoSE is available on GitHub [20].
Although extensible record stores like Cassandra are in
wide use, we are not aware of open-source applications or
benchmarks, with the exception of YCSB [21], which is
intended for performance and scalability testing of NoSQL
systems, but offers no flexibility in schema design. Instead,
we created a target application by adapting RUBiS [22], a
Web application benchmark originally backed by a relational
database which simulates an online auction site.
To adapt RUBiS for Cassandra, we created a conceptual
model based on the entities managed by RUBiS. The resulting
model contains eight entity sets, with eleven relationships
among them. Using this model, we generated a workload
weighted according to the original RUBiS request distribution.
This workload consists of one or more statements correspond-
ing to each SQL statement used by the RUBiS bidding work-
load. We excluded RUBiS queries which involve browsing and
searching by region as they cannot be efficiently implemented
using our current model.

Fig. 11. Bidding workload performance on different schemas (average
response time in ms for each transaction type under the NoSE, Normalized,
and Expert schemas).
The first schema we examine, the recommended schema,
is determined by the NoSE advisor with no storage con-
straint. This results in a highly denormalized, workload-
specific schema, generally consistent with the rules of thumb
for NoSQL schema design discussed in Section I. Second,
the normalized schema is a manually created schema which
is highly normalized. This schema contains a column family
for each entity set, where the partition key is the primary
key of the entity and which contains all data associated with the
entity. The schema also contains column families which serve
as secondary indices for queries which do not specify entity
primary keys. These column families use the attributes given
in query predicates as the partition keys and store the primary
key of the corresponding entities. Finally, we defined a third
schema which we refer to as the “expert” schema. This schema
was defined manually by a human designer familiar with
Cassandra using the same workload input to NoSE. In addition
to this schema, the designer also provided a set of execution
plans for each query.
We used each schema to define a set of column families
in Cassandra, and populated the record store using data for a
RUBiS instance with 200,000 users. Finally, we developed a
simple execution engine which can execute the plans recom-
mended by NoSE. One machine served to execute the state-
ments and measure the execution time, with another machine
running an instance of Cassandra 2.0.9. Each machine has two
six core Xeon E5-2620 processors operating at 2.10 GHz and
64 GB of memory. A 1TB SATA drive operating at 7,200 RPM
was used for the Cassandra data directory. All Cassandra-level
caching was disabled since we do not attempt to model the
effects of caching in our cost model. A more complicated
cost model which captures the effects of caching could be
substituted into NoSE without changing the rest of the system.
We examined the RUBiS bidding workload by evaluating
user transactions, which are groups of statements which would
be executed for a single request to the application server.
Results for the three schemas are shown in Figure 11. Mean
response times for the different transaction types ranged from
1.25-113 ms for the schema recommended by NoSE, and 2.42-
157 ms for the expert schema. We then measured the average
response time for each transaction over 1,000 executions. The
schema recommended by NoSE was able to achieve up to a
125× improvement in performance over the expert schema for
a single transaction. Additionally, NoSE was able to reduce
the average response time weighted by the frequency of each
transaction in the workload by a factor of 1.8. This is because
NoSE was able to exploit information in the workload to
perform more expensive updates for infrequent transactions
(e.g., StoreBuyNow) or avoid denormalization for data which
is frequently updated but infrequently read (e.g., AboutMe).
By allowing more expensive updates which have little impact
on the overall workload, the schema recommended by NoSE
can contain additional column families to support frequently
executed queries.

Fig. 12. Execution plan performance for varied write workloads (weighted
average response time in ms for the Browsing, Bidding, 10× and 100×
workload mixes, comparing the NoSE, Normalized, and Expert schemas).
Furthermore, we examined the adaptability of NoSE to
different workload distributions. We increased the weight of
each RUBiS interaction involving writes by a factor of 10
and 100. In addition, we also tested NoSE against the RUBiS
browsing workload mix which contains no updates. Each
of these workload mixes leads to a different NoSE schema
since writes become more or less expensive, which changes
the optimal level of denormalization. Results are shown in
Figure 12. We note that in the 100× case, NoSE performs
worse than the expert schema. NoSE has no knowledge of
the correlation of queries in the input workload and cannot
share the results of support query execution. In contrast, the
expert schema does exploit this knowledge and is thus able to
avoid unnecessary queries. Additionally, NoSE is not currently
capable of exploiting queries which make use of GROUP BY
clauses. By using this knowledge, the expert schema is able to
reduce the number of updates. For example, it is necessary to
keep a record of which items a user has placed a bid on, but it
is not necessary for the application to efficiently retrieve all
the individual bids for a given user. Since NoSE is unaware of
the grouping of query results, we are unable to avoid storing
data for individual bids, increasing the cost of writes.

Fig. 13. Advisor runtime for varying workload sizes (average execution time
in seconds as the workload scale factor grows from 1 to 10, broken down into
cost calculation, BIP construction, BIP solving, and other time).
B. Advisor Performance
Running NoSE for the RUBiS workload takes less than ten
seconds. To evaluate the advisor runtime for workloads larger than
RUBiS, we generated random entity graphs and queries to use
as input to our tool. The entity graph generation is based on the
Watts-Strogatz random graph model [23]. After generating the
graph, we randomly assign a direction to each edge and create
a foreign key at the head node. We then add a random number
of attributes to each entity in the graph. Statements are defined
using a random walk through the graph to identify the path of
the statement. For any statements involving a WHERE clause,
we randomly generate three predicates along the statement
path. Queries and updates have randomly chosen attributes
along the path which are selected/updated.
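A sketch of such a generator (our own simplification using the networkx library; NoSE's generator differs in its details) is shown below:

import random
import networkx as nx

def random_workload(num_entities=8, num_statements=10, seed=42):
    rng = random.Random(seed)
    # Entity graph from the Watts-Strogatz model; each edge becomes a relationship.
    graph = nx.watts_strogatz_graph(num_entities, k=4, p=0.3, seed=seed)
    attrs = {n: [f"E{n}.attr{i}" for i in range(rng.randint(2, 6))] for n in graph.nodes}
    statements = []
    for _ in range(num_statements):
        # Random walk through the graph to pick the statement's path.
        path = [rng.choice(list(graph.nodes))]
        while len(path) < 4:
            neighbours = [n for n in graph.neighbors(path[-1]) if n not in path]
            if not neighbours:
                break
            path.append(rng.choice(neighbours))
        selected = [rng.choice(attrs[n]) for n in path]
        predicates = [rng.choice(attrs[n]) for n in rng.sample(path, min(3, len(path)))]
        statements.append({"path": path, "select": selected, "where": predicates})
    return statements

for stmt in random_workload()[:3]:
    print(stmt)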
Figure 13 shows the results of a simple experiment in
which we started with a random workload having similar prop-
erties to the RUBiS workload discussed in the previous section.
We then increased the size of the workload by multiplying
the number of entities and statements by a constant factor.
The figure shows the time required for NoSE to recommend
a schema and a set of execution plans as a function of this
factor. All experiments were run using a machine with the same
specifications as in the previous section. Note that the runtime
increases exponentially with the workload size, but this increase is
independent of the statement weights. The increase in runtime is a
result of the growing number of support queries required to update
the column families chosen for each query. The number of these
interactions grows non-linearly with the workload size (i.e., with
the numbers of queries and updates), since there are exponentially
more ways in which column families recommended for queries can
interact with updates. We note that the time to solve the BIP is
relatively short; the largest component of the runtime is the
construction of the BIP and its associated constraints.
There is likely to be room for optimization in the NoSE code
to significantly reduce the runtime.
VIII. RELATED WORK
Numerous tools are available, or have been proposed, for
solving related design problems in relational database systems.
Many of these tools select an optimized collection of indexes
and materialized views to support a given workload [8]–[15].
However, as was noted in Section II, there are some significant
differences between relational physical schema design and
schema design for NoSQL systems, which NoSE addresses.
Others focus on vertical partitioning of relations, either to
recommend a set of covering indexes [24] or to determine
a physical representation for the relations [25], [26]. There
are also tools for automating relational partitioning and layout
across servers [10], [27] or storage devices [28], [29]. DBDesigner [30]
determines the physical representations (called projections) of tables
for the Vertica [25] column store, based on an
input workload. However, DBDesigner recommends a single
projection at each iteration which may not produce a globally
optimal solution. In addition, DBDesigner does not explicitly
consider the effect of updates but instead relies on heuristics
to limit the number of projections.
Typically, relational physical design involves the iden-
tification of candidate physical structures, followed by the
selection of a good subset of these candidates. NoSE uses
the same general approach. A few relational design tools,
including CoPhy [13], [15], CORADD [14], and a physical
design technique for C-Store [26] have formulated the task
of choosing a good set of candidates as a BIP. As was noted
in Section V, Papadomanolakis and Ailamaki [13] presented
a simple formulation of the problem as a binary integer
program. CoPhy [15], an extension of this work, adds further
optimizations to reduce calls to the relational query optimizer.
As in our work, their approach exploits the decomposition of
queries into components which can be analyzed independently.
CoPhy also includes a rich set of constraints which may
be useful as extensions to NoSE.
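To make this shared BIP framing concrete, the following sketch (a generic illustration under our own assumptions, not CoPhy's or NoSE's actual formulation; the candidates, costs, and sizes are invented) selects a subset of candidate physical structures that minimizes the estimated workload cost subject to a storage budget:

from pulp import LpProblem, LpVariable, LpMinimize, LpBinary, lpSum, PULP_CBC_CMD

queries = ["q1", "q2"]
candidates = ["c1", "c2", "c3"]          # candidate physical structures
cost = {("q1", "c1"): 5, ("q1", "c2"): 2, ("q1", "c3"): 9,
        ("q2", "c1"): 4, ("q2", "c2"): 8, ("q2", "c3"): 3}
size = {"c1": 10, "c2": 40, "c3": 25}
budget = 60

prob = LpProblem("candidate_selection", LpMinimize)
x = {c: LpVariable(f"x_{c}", cat=LpBinary) for c in candidates}   # build candidate c?
y = {(q, c): LpVariable(f"y_{q}_{c}", cat=LpBinary)
     for q in queries for c in candidates}                        # answer q with c?

# Objective: estimated cost of answering the whole workload.
prob += lpSum(cost[q, c] * y[q, c] for q in queries for c in candidates)

for q in queries:
    # Every query is answered by exactly one chosen candidate.
    prob += lpSum(y[q, c] for c in candidates) == 1
    for c in candidates:
        # A query may only use a candidate that is actually built.
        prob += y[q, c] <= x[c]

# Storage budget over the candidates that are built.
prob += lpSum(size[c] * x[c] for c in candidates) <= budget

prob.solve(PULP_CBC_CMD(msg=False))
chosen = [c for c in candidates if x[c].value() > 0.5]

Update costs and the interactions between column families and support queries, as modeled by NoSE, would add further terms and constraints on top of this basic structure.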
Our approach to the schema design problem for extensible
record stores owes an intellectual debt to GMAP [31], which
was proposed as a technique for improving the physical data
independence of relational database systems. In GMAP, both
application queries and physical structures are described using
a conceptual entity-relationship model. Queries are mapped to
one or more physical structures which can be used to produce
answers to the query. This approach is used out of a desire to
provide a more thorough form of physical data independence.
In our case, we adopt a similar approach out of necessity, as the
extensible record stores we target do not implement separate
logical and physical schemas. However, in GMAP, the primary
algorithmic task is to map each query to a given set of physical
structures. In contrast, our task is to choose a set of physical
structures to handle a given workload, in addition to specifying
which physical structures should be used to answer each query.
Others have also proposed writing queries directly against a
conceptual schema. For example, ERQL [32] is a conceptual
query language over enhanced ER diagrams. It defines path
expressions referring to a series of entities or relationships.
Our query model is somewhat more restrictive as all predicates
given in a query must lie along the same path and we disallow
self references. Queries over our conceptual model are also
similar to path expressions in object databases, and the physical
structures our technique recommends are similar to the nested
indexes and path indexes described by Bertino and Kim [33].
Vajk et al. [34] discuss schema design in a setting similar
to ours. Their approach, like ours, starts with a conceptual
schema. Queries are expressed in the UML Object Constraint
Language. They sketch an algorithm that appears to involve
the use of foreign key constraints in the conceptual schema
to exhaustively enumerate candidate denormalizations. An un-
specified technique is then used to make a cost-based selection
of candidates. Although this approach is similar to ours, it
is difficult to make specific comparisons because the schema
design approach is only sketched. Rule-based approaches have
also been proposed for adapting relational [35] and OLAP [36]
schemas for NoSQL databases. However, these approaches are
workload agnostic and do not necessarily produce schemas
which can efficiently implement any particular workload. A
workload-aware approach such as the one we take with NoSE
is suggested as future work by Li [35].
IX. CONCLUSION
Schema design for NoSQL databases is a complex problem
with many additional challenges as compared to the analo-
gous problem for relational databases. We have developed a
workload-driven approach for schema design for extensible
record stores, which is able to effectively explore various trade-
offs in the design space. Our approach implicitly captures best
practices in NoSQL schema design without relying on general
design rules-of-thumb, and is thereby able to generate effective
NoSQL schema designs. Our approach also allows applications
to explicitly control the tradeoff between normalization and
query performance by varying a space constraint.
Currently, NoSE only targets Cassandra. However, we
believe that with minimal effort, the same approach could be
applied to other extensible record stores, such as HBase. We
also intend to explore the use of a similar model on data stores
with significantly different data models, such as key-value
stores and document stores. Applying our approach to very
similar data stores may only require changing the cost model
and the physical representation of column families. However,
we imagine more significant changes may be required to fully
exploit the capabilities of different data models.
ACKNOWLEDGEMENTS
This work was supported by the Natural Sciences and
Engineering Research Council of Canada (NSERC).
REFERENCES
[1] R. Cattell, “Scalable SQL and NoSQL data stores,” ACM SIGMOD
Record, vol. 39, no. 4, pp. 12–27, 2011.
[2] A. Lakshman and P. Malik, “Cassandra: a decentralized structured
storage system,” ACM SIGOPS Operating Systems Review, vol. 44,
no. 2, pp. 35–40, 2010.
[3] “HBase: A distributed database for large datasets,” retrieved March 7,
2013 from http://hbase.apache.org.
[4] F. Chang et al., “Bigtable: A distributed storage system for structured
data,” in OSDI, 2006.
[5] J. Patel, “Cassandra data modeling best practices, part 1,” 2012,
retrieved Oct. 21, 2014 from http://ebaytechblog.com/?p=1308.
[6] N. Korla, “Cassandra data modeling - practical consid-
erations @ Netflix,” 2013, retrieved Oct. 21, 2014 from
http://www.slideshare.net/nkorla1share/cass-summit-3.
[7] E. Hewitt, Cassandra: The Definitive Guide, 2nd ed., M. Loukides, Ed.
Sebastopol, CA: O’Reilly Media, 2011.
[8] S. J. Finkelstein, M. Schkolnick, and P. Tiberio, “Physical database de-
sign for relational databases,” ACM Transactions on Database Systems,
vol. 13, no. 1, pp. 91–128, 1988.
[9] S. Agrawal et al., “Automated selection of materialized views and
indexes in SQL databases,” PVLDB, pp. 496–505, 2000.
[10] D. C. Zilio et al., “DB2 design advisor: integrated automatic physical
database design,” PVLDB, pp. 1087–1097, 2004.
[11] B. Dageville, D. Das, K. Dias, K. Yagoub, and M. Zait, “Automatic
SQL tuning in Oracle 10g,” PVLDB, vol. 30, pp. 1098–1109, 2004.
[12] N. Bruno and S. Chaudhuri, “Automatic physical database tuning: A
relaxation-based approach,” in SIGMOD, 2005.
[13] S. Papadomanolakis and A. Ailamaki, “An integer linear programming
approach to database design,” ICDEW, pp. 442–449, 2007.
[14] H. Kimura et al., “CORADD: Correlation aware database designer for
materialized views and indexes,” PVLDB, pp. 1103–1113, 2010.
[15] D. Dash, N. Polyzotis, and A. Ailamaki, “CoPhy: A scalable, portable,
and interactive index advisor for large workloads,” PVLDB, vol. 4, no. 6,
pp. 362–372, 2011.
[16] P. P.-S. Chen, “The entity-relationship model - toward a unified view
of data,” ACM TODS, vol. 1, no. 1, pp. 9–36, 1976.
[17] DataStax, “About indexes in Cassandra,” 2014, retrieved Nov. 4, 2014
from http://www.datastax.com/docs/1.1/ddl/indexes.
[18] Gurobi Optimization, Inc., “Gurobi optimizer reference manual,” 2014,
retrieved Aug. 8, 2014 from http://www.gurobi.com.
[19] M. J. Mior et al., “NoSE: Schema design for NoSQL applications,”
University of Waterloo, Tech. Rep. CS-2015-17, 2015.
[20] M. Mior, “NoSE - Automated schema design for NoSQL
applications,” 2016, retrieved Jan. 15, 2016 from
https://github.com/michaelmior/NoSE.
[21] B. F. Cooper et al., “Benchmarking cloud serving systems with YCSB,”
in Proc. ACM Symp. on Cloud Computing, 2010.
[22] E. Cecchet et al., “Performance and scalability of EJB applications,”
ACM SIGPLAN Notices, vol. 37, no. 11, pp. 246–261, 2002.
[23] D. J. Watts and S. H. Strogatz, “Collective dynamics of ‘small-world’
networks,” Nature, vol. 393, no. 6684, pp. 440–442, 1998.
[24] S. Papadomanolakis and A. Ailamaki, “AutoPart: Automating schema
design for large scientific databases using data partitioning,” in SSDBM,
2004, pp. 383–392.
[25] A. Lamb et al., “The Vertica analytic database: C-Store 7 years later,”
PVLDB, vol. 5, no. 12, pp. 1790–1801, 2012.
[26] A. Rasin and S. Zdonik, “An automatic physical design tool for
clustered column-stores,” EDBT, pp. 203–214, 2013.
[27] J. Rao et al., “Automating physical database design in a parallel
database,” in SIGMOD, 2002, pp. 558–569.
[28] S. Agrawal, S. Chaudhuri, A. Das, and V. Narasayya, “Automating
layout of relational databases,” in ICDE, 2003, pp. 607–618.
[29] O. Ozmen et al., “Workload-aware storage layout for database systems,”
in SIGMOD, 2010, pp. 939–950.
[30] R. Varadarajan et al., “DBDesigner: A customizable physical design
tool for Vertica analytic database,” in ICDE 2014, 2014, pp. 1084–1095.
[31] O. G. Tsatalos et al., “The GMAP: a versatile tool for physical data
independence,” The VLDB Journal, vol. 5, no. 2, pp. 101–118, 1996.
[32] M. Lawley and R. Topor, “A query language for EER schemas,” in ADC
94, 1994, pp. 292–304.
[33] E. Bertino and W. Kim, “Indexing techniques for queries on nested
objects,” IEEE Transactions on Knowledge and Data Engineering,
vol. 1, no. 2, pp. 196–214, 1989.
[34] T. Vajk, L. Deák, K. Fekete, and G. Mezei, “Automatic NoSQL schema
development: A case study,” in Artificial Intelligence and Applications.
ACTA Press, 2013, pp. 656–663.
[35] C. Li, “Transforming relational database into HBase: A case study,” in
Proc. ICSESS 2010, 2010, pp. 683–687.
[36] M. Chevalier et al., “Implantation not only SQL des bases de données
multidimensionnelles,” in Colloque VSST. VSST, 2015.