Can Quantitative Finance Benefit from IoT?

Peng Zhang
Stony Brook University
Stony Brook, NY 11794, USA
peng.zhang@stonybrook.edu
Xiang Shi
Advanced Risk and Portfolio
Management
New York, NY 10023, USA
xiang.shi@arpm.co
Samee U. Khan
North Dakota State University
Fargo, ND 58108, USA
samee.khan@ndsu.edu
ABSTRACT
The Internet of Things (IoT) is a novel paradigm in which smart
devices connected to the Internet exchange information. In this
context, such devices can leverage our understanding of, and
capabilities in, big data, deep analysis, and artificial
intelligence to solve problems in real time. The IoT paradigm has
already benefited many applications in the social sciences and in
industry. However, amid the rise of IoT, at least one question
remains unanswered: Can Quantitative Finance (QF) benefit from IoT?
QF is a field that applies sophisticated mathematical models and
advanced computing techniques to global financial markets. By
taking market and social information as input, a QF model can
derive profitable insights and control risk when making trading
decisions. Today, many Internet-based techniques are extensively
employed in the field: (a) market and social data are provided via
the Internet; (b) big data infrastructures are built in the Cloud;
and (c) deep learning tools are accessible over the Internet. Even
trading models and strategies can be executed over the Internet.
In this paper, we provide an overview of the challenges and
opportunities presented by this new paradigm in the QF industry.
To unlock the potential of IoT, we propose a system architecture,
termed QuantCloud, for modern quantitative trading firms.
CCS CONCEPTS
• Software and its engineering → Software organization
and properties → Software system structures
KEYWORDS
Internet of Things, Quantitative Finance, Big Data, Cloud
Computing
ACM Reference format:
P. Zhang, X. Shi, and Samee U. Khan. 2017. Can Quantitative Finance
Benefit from IoT? In Proceedings of Second ACM/IEEE Symposium on
Edge Computing: Workshop on Smart IoT (SmartIoT'17), San Jose /
Silicon Valley, CA, USA, October 14, 2017, 6 pages.
https://doi.org/10.1145/3132479.3132491
1 INTRODUCTION
The Internet of Things, or IoT for short, is a new
paradigm that motivates renewed thinking in many fields,
such as retail, healthcare, and cyber and physical infrastructures
[1]. With advances in communication technologies: (a)
more scattered information can be effectively integrated into a
consolidated big data management system; (b)
knowledge-based decisions can be made more accurately
based on the consolidated information; and (c) tools for
modeling and integrating varied and large volumes of
metadata can be delivered more rapidly from vendors to
customers through the Internet. These IoT benefits are
drawing the attention of the social sciences and industry [1, 2].
Quantitative finance (QF) plays a key role in many areas of
modern financial markets, including stocks, bonds, and foreign
exchange. QF is a field that relies on sophisticated
mathematical models, statistical tools, machine learning, and
computing techniques to derive profitable insights, control
portfolio risks in rapidly changing markets, and make trading
decisions [3, 4]. In this field, proprietary trading firms were the
pioneers in the use of high-frequency quantitative trading,
which accounts for more than half of US equity volumes and
about 45% of futures trading, according to Tabb Group
estimates. In the past, these firms focused primarily on speed
between the exchanges. Today, however, being fast alone is
insufficient to make profits. One piece of evidence is that US
high-frequency trading (HFT) equity market maker revenue
decreased from more than $7B in 2009 to $1B in 2016. There
is a growing trend of firms performing big and deep data analysis
to improve their trading decisions. In general, this requires the
following: (a) consolidating vast amounts of data on different
instruments from different sources at different locations; (b)
developing mathematical models and statistical tools that are
able to mine “big value” from the data; (c) building hardware
platforms to capture market inefficiencies in a timely manner; and
(d) delivering data, software, and hardware services as an
integrated solution.
The QF evolution from ultra-low-latency systems to “smart
trading” systems could be an opportunity for IoT to
revolutionize the trading industry. This motivated us to
investigate the connectivity between IoT and QF.
Consequently, in this paper, we present the potential benefits
for QF from utilizing IoT. We also propose a novel
Internet-based system architecture, termed QuantCloud, that
may address the missed opportunities and challenges in this
field.
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. Copyrights for
components of this work owned by others than ACM must be honored.
Abstracting with credit is permitted. To copy otherwise, or republish, to post
on servers or to redistribute to lists, requires prior specific permission
and/or a fee. Request permissions from Permissions@acm.org.
SmartIoT'17, October 14, 2017, San Jose / Silicon Valley, CA, USA
© 2017 Association for Computing Machinery.
ACM ISBN 978-1-4503-5528-5/17/10$15.00
https://doi.org/10.1145/3132479.3132491
The main contributions of this work are as follows:
1. A discussion of the benefits QF may gain from IoT
and of the ongoing evolution of quantitative trading
from “fast trading” to “smart trading”.
2. A detailed presentation of the need for Internet-based
technologies in QF, covering essential aspects
such as data, methods, platforms, and services.
3. A QuantCloud system architecture, proposed as a
next-generation quantitative finance platform that is
able to support “fast and smart trading” in this field.
This work attempts to extend the IoT paradigm and
realize IoT benefits in an industrial environment. There is a
strong business need for practitioners who
can develop innovative IoT solutions so that firms can
quickly benefit from IoT insights. Our work is
a step towards this industry-driven approach.
The remainder of the article is organized as follows. In Section
2, some necessary background on modern QF systems is
reviewed, which also serves as a motivation for the work. To
fulfill the needs discussed in Section 2, Section 3 presents an
IoT QF system, termed the QuantCloud. We discuss some
preliminary QuantCloud results in Section 4. Finally, we
conclude the work in Section 5.
2 BACKGROUND AND MOTIVATIONS
“Things” within the IoT refers to a wide variety of entities, which
in the context of QF could be a mixture of data, methods,
platforms, and services.
2.1 Financial Big Data
Data is the most important “thing” in quantitative analytics. To
date, financial big data remains a major challenge for financial
institutions [5-7]. The 3 Vs of Big Data (volume, velocity, and
variety) never stop growing [5, 6].
First, data volume in market transactions is increasing at a
tremendous rate. For example, there was a tenfold increase in
market data between 2008 and 2011, and data volumes are
growing in all areas of the financial domain [5]. The
New York Stock Exchange (NYSE) by itself creates several
terabytes of market and reference data per day covering the
use and exchange of financial instruments [5, 8]. More broadly,
the total number of transactions has increased 50-fold
compared to 20 years ago, and was more than
120 times larger during the financial crisis [6].
Second, data velocity is an important factor in preserving a
competitive advantage. High-speed market data are delivered
directly to high-frequency trading (HFT) firms through
low-latency networks. HFT transactions are highly
sensitive to small price fluctuations, even at the microsecond
level. It has been reported that HFT systems can handle
several thousand orders per day [5].
Third, modern financial firms focus on “wide data”, not just big
data, in their strategies. Unstructured data from news and social
media, such as Twitter and Facebook, need to be
modeled to gain insights for risk analysis and trading
predictions [5, 7]. Consequently, it is no longer possible for a
traditional relational database management system (RDBMS)
to handle such heterogeneous data [9, 10].
Under this scenario, a proprietary database management
system optimized for time-series queries is the first
“must-have” component. In current best practice,
a columnar database is viewed as the preferred option
for financial big data applications [10]. Consequently, we also
use the columnar database approach in our big data
infrastructure.
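To illustrate why a columnar layout suits time-series queries, the following is a minimal sketch and not the QuantCloud implementation. It assumes trades arrive in time order, so the timestamp column stays sorted and a time window can be located by binary search. The class name `TradeColumns` and its fields are hypothetical.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass, field


@dataclass
class TradeColumns:
    """Hypothetical column-oriented table: one Python list per field."""
    timestamps: list = field(default_factory=list)  # sorted, epoch seconds
    prices: list = field(default_factory=list)
    sizes: list = field(default_factory=list)

    def append(self, ts, price, size):
        # Assumption: trades arrive in time order, so the timestamp
        # column stays sorted and can be binary-searched.
        if self.timestamps and ts < self.timestamps[-1]:
            raise ValueError("out-of-order timestamp")
        self.timestamps.append(ts)
        self.prices.append(price)
        self.sizes.append(size)

    def range_query(self, start, end, column):
        """Return values of one column for start <= timestamp <= end."""
        lo = bisect_left(self.timestamps, start)
        hi = bisect_right(self.timestamps, end)
        # Only the requested column is touched; other columns are not read.
        return getattr(self, column)[lo:hi]


table = TradeColumns()
for ts, price, size in [(100, 10.0, 5), (101, 10.5, 2), (103, 10.2, 7), (107, 10.8, 1)]:
    table.append(ts, price, size)

window_prices = table.range_query(101, 103, "prices")
```

Because each field lives in its own contiguous list, a query for prices in a time window reads two columns and skips the rest, which is the property a row-oriented layout lacks.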
2.2 CEP and AI Methods
Big data is not just about volume, velocity, and variety; what
decision-makers really need is a better analytics method. The
financial services industry is a pioneer in utilizing complex
event processing (CEP) technology [11] to organize
data-driven events so that they can inform algorithmic trading
behavior by identifying opportunities and/or risks in a timely
manner. Nowadays, CEP technology is extensively utilized in most
financial applications, such as quantitative trading, signal
generation, and risk management. The CEP-based approach
is also a popular IoT solution for processing multiple streams of
data/events to identify patterns of interest [11, 12].
Financial firms are also adopting artificial intelligence (AI) as
another fast-developing approach. For example, it is
reported that traditional hedge funds, such as Renaissance
Technologies and Bridgewater Associates, have invested heavily
in AI to generate investment ideas [13].
Likewise, AI is a fast-developing tool within the
world of the IoT [2, 14]. However, most conservative investors,
though keen on the idea of AI, are still slow to adopt this
emerging technology.
In practice, the CEP technology and AI algorithms must be
integrated with the time-series databases. An ideal case is
when the time-series databases are data providers for
historical and real-time data; meanwhile, the CEP and AI
methods are data consumers to derive hidden opportunities
and assess portfolio risks [3, 6, 11]. This is a cornerstone
feature in our proposed system, described in Section 3.
2.3 Cloud Computing Platform
The IoT and the Cloud are different paradigms, but we consider
them complementary. The IoT generates vast amounts of data,
and the Cloud provides a scalable platform to store and process
the data. For example, the popular cloud IoT platforms include
Amazon Web Services IoT [15], Google Cloud IoT [16],
Microsoft Azure IoT [17], etc. Similarly, quantitative analytics
in finance is also moving the computing tasks to the Cloud. For
example, popular cloud platforms of such classification include
Amazon Web Services for Financial Services [18], Google Cloud
for Financial Services Solutions [19], Microsoft Azure for
Financial Services [20], etc. Consequently, both the IoT and QF
are using the Cloud as a platform [21]. Naturally, we also utilize
Cloud as a platform of our proposed system, described in
Section 3.
2.4 Internet-based Services
By moving to the Cloud, the financial services sector is
currently undergoing a significant change. Traditionally,
enterprise-level integrated trading systems were affordable
only to the top financial institutions. Today, this situation has
changed. Internet-based services have enabled an explosion in
the availability of integrated trading platforms for smaller
firms and even professional individual traders. For example,
QuantStart offers Internet-based educational resources for
learning algorithmic trading [22]. Quantopian is a platform that
supports coding of investment algorithms [23], and
QuantConnect is yet another example [24]. Among these
platforms/tools, Collective2 provides a rich set of
multi-source data, such as stocks, forex, futures, and options
[25]. We predict that more powerful tools and expressive
services will be implemented and delivered via the Internet,
shaping the financial services model infused with collaborative
technologies [6, 26, 27]. The same ideological change is
happening within the world of the IoT [28]. Consequently, we
adopt the same Internet-based service model in our proposed
architecture, described in Section 3.
2.5 Summary
Observing industrial cases across various applications helps us
understand the real needs in industries. The overview of
quantitative analytics in finance is outlined in Fig 1.
Figure 1: Overview of Quantitative Analytics in Finance
3 A QUANTCLOUD SYSTEM ARCHITECTURE
After in-depth analysis of motivational factors in QF, we
present an integrated system architecture, termed QuantCloud,
to leverage the capabilities of financial big data, time-series
analytics techniques, parallel processing, and Internet-based
services while preserving legacy interfaces, such as Python.
3.1 System Architecture
The QuantCloud system architecture is composed of three
abstraction layers, namely: user, client, and server, as shown in
Fig 2. The user layer provides an Internet portal through which
users submit their tasks in XML and receive their results in
CSV. The portal lets quantitative analysts program
their algorithms in C/C++ or Python. Specifically, a task
describes a strategy the analyst builds, including the market data
types, the trade strategy and frequency, and the user account and
exchange information. Such a design minimizes the hardware
and device-to-cloud communication requirements for the
end-point Internet-connected devices. It is also possible to
access results from mobile devices, such as smartphones.
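As an illustration of the XML-in, CSV-out exchange described above, the following sketch parses a task document and serializes placeholder results. The task schema (element and attribute names) and the result columns are assumptions made for this example; the paper does not specify them.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical task document; the element and attribute names are
# illustrative and not taken from the QuantCloud portal.
TASK_XML = """
<task id="demo-1">
  <account user="analyst01" exchange="NYSE"/>
  <data type="trades" symbols="IBM,MSFT" days="7"/>
  <strategy name="arma_forecast" frequency="1min" language="python"/>
</task>
"""


def parse_task(xml_text):
    """Turn the XML task into a plain dictionary."""
    root = ET.fromstring(xml_text.strip())
    data = root.find("data")
    strategy = root.find("strategy")
    return {
        "id": root.get("id"),
        "exchange": root.find("account").get("exchange"),
        "symbols": data.get("symbols").split(","),
        "days": int(data.get("days")),
        "strategy": strategy.get("name"),
        "frequency": strategy.get("frequency"),
    }


def results_to_csv(rows):
    """Serialize result rows (a list of dicts) as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["symbol", "signal"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


task = parse_task(TASK_XML)
# Placeholder results: a real client would run the strategy here.
csv_text = results_to_csv([{"symbol": s, "signal": "hold"} for s in task["symbols"]])
```

Keeping the task declarative means the end-point device only needs to send a small text document and read back a small table, which is consistent with the stated goal of minimizing device-side requirements.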
The client layer is at the heart of quantitative analytics. Briefly,
it consists of the following modules:
a. Data push and fetch services: It queries time-series data
from its connected server (fetch); and pushes results to a user
on completion of the user tasks (push).
b. Shared memory system (SHM): It buffers queried
time-series and allows other modules to make use of the data.
c. Complex event processing (CEP): On arrival of a user task,
it analyzes the dependencies between tasks and data. It then
sends queries to server and starts to execute the tasks as long
as queried data arrive at SHM.
d. Artificial intelligence (AI): It is a built-in function module
that is callable by tasks in CEP. When a function call is made, an
AI subroutine reads associated data from SHM and starts
analyzing on the read data.
e. Accelerators (ACC): These are additional computational
units for host processors. An accelerator appears as a device on
the bus for better performance. In general, some specific
operators, such as large matrix operations could be accelerated
on the Nvidia GPU [29, 30]; and some specific complex models,
such as machine learning models could be improved in speed
and accuracy on the Google TPU.
The server layer is at the heart of quantitative data and is
comprised of:
a. Database (DB): Ideally, it adopts non-relational
columnar data storage optimized for time-series
queries. In short, a time series is data that carries timestamps,
such as IoT device data and QF stock transactions. Further, a
real-time data collection interface will be added to collect
market information streams in real-time through the Internet.
b. Hybrid storage solution: It combines in-memory and
on-disk storage. In particular, an in-memory database (IMDB)
holds the part of the DB that is most frequently accessed,
such as stock trades. An on-disk database, which may
consist of an SSD and an HDD, stores the rest of the data, such as
stock quotes.
c. Data push and fetch services: It pushes queried data to the
requester client (push); and fetches data in real-time from
sources, such as financial markets and exchanges (fetch).
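The hybrid storage idea in item (b) can be sketched as a two-tier store. This is an illustrative toy under stated assumptions, not the QuantCloud server: the class `HybridStore`, the choice of JSON files for the on-disk tier, and the key names are all hypothetical.

```python
import json
import os
import tempfile


class HybridStore:
    """Hypothetical two-tier store: hot series in memory, the rest on disk."""

    def __init__(self, directory, hot_keys):
        self.directory = directory
        self.hot_keys = set(hot_keys)   # e.g. frequently read trade series
        self.memory = {}                # in-memory tier

    def _path(self, key):
        return os.path.join(self.directory, key + ".json")

    def put(self, key, series):
        if key in self.hot_keys:
            self.memory[key] = list(series)
        else:
            with open(self._path(key), "w") as handle:
                json.dump(list(series), handle)

    def get(self, key):
        # Serve from memory when possible, otherwise read from disk.
        if key in self.memory:
            return self.memory[key]
        with open(self._path(key)) as handle:
            return json.load(handle)

    def tier(self, key):
        return "memory" if key in self.memory else "disk"


with tempfile.TemporaryDirectory() as tmp:
    store = HybridStore(tmp, hot_keys=["IBM.trades"])
    store.put("IBM.trades", [10.0, 10.5])
    store.put("IBM.quotes", [9.9, 10.1, 10.4])
    tiers = (store.tier("IBM.trades"), store.tier("IBM.quotes"))
    quotes = store.get("IBM.quotes")
```

Here the set of hot keys is fixed up front; a production system would more likely decide placement from observed access frequency.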
3.2 Key Components and Their Functions
3.2.1 Big Data Management
Within the server, data is managed in a non-relational
columnar storage. In support of the QF use cases, we
considered the following factors: (a) fast range queries for time
series; (b) support simultaneous read operations; (c) data
compression; and (d) data hashing for security [27].
At the client side, data is stored and managed within the SHM,
and a client adopts a hybrid multi-threading programming
model. On arrival of packets from server, data packets are
decrypted and restored as time series for other subroutines to
use [27].
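Factors (c) and (d) above, together with the client-side restore step, can be sketched as follows. The specific choices are assumptions for illustration only: zlib for compression and SHA-256 for hashing are not named in the paper, and the function names are hypothetical. Note that a hash provides an integrity check, not confidentiality; any encryption step is outside this sketch.

```python
import hashlib
import struct
import zlib


def pack_column(values):
    """Server side: serialize, compress, and fingerprint one float column."""
    raw = struct.pack(f"<{len(values)}d", *values)
    payload = zlib.compress(raw)
    digest = hashlib.sha256(payload).hexdigest()
    return payload, digest


def unpack_column(payload, digest):
    """Client side: verify the fingerprint, then restore the time series."""
    if hashlib.sha256(payload).hexdigest() != digest:
        raise ValueError("payload failed integrity check")
    raw = zlib.decompress(payload)
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))


# Synthetic price column with repeating values, which compresses well.
prices = [100.0 + 0.25 * (i % 4) for i in range(1000)]
payload, digest = pack_column(prices)
restored = unpack_column(payload, digest)
ratio = len(payload) / (8 * len(prices))  # compressed size / raw size
```

Compressing one column at a time tends to work well because values within a column are similar to each other, which is another practical argument for the columnar layout.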
3.2.2 Complex Event Processing (CEP)
A task could be viewed as a data-driven decision-making
process and is comprised of multiple data streams and
data-dependent subtasks (i.e. events). Technically, a CEP
solution needs to, at least, analyze data dependencies between
events and process events on the availability of the needed
data. From a practical perspective, users simply define the
events and their behaviors and do not need to worry about the
execution order, in the style of model-driven development. In
QF, an event instance could be a simple event, such as a
quotation, or a composite event derived from other discrete
events. In financial index models, an event could also be an
index of interest computed using some regression approach.
Given all this, a data-driven paradigm analyzes
the data dependencies between complex events and
records them in a data-dependency matrix [31]. Taking this matrix as
an input, a scheduler executes an event when its dependencies are
ready. This approach allows concurrent execution of multiple
events, enabling event-level parallel processing.
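The matrix-driven scheduling described above can be sketched as follows. This is a simplified illustration, not the scheduler of [31] or [34]: it groups events into waves and runs each wave on a thread pool, whereas a finer-grained scheduler would launch each event the moment its own inputs are ready. The event names and the `run_event` placeholder are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical events for one task; names and actions are illustrative.
EVENTS = ["quote", "trade", "spread", "signal"]
# DEPENDS[i][j] == 1 means event i needs the output of event j.
DEPENDS = [
    [0, 0, 0, 0],  # quote: raw data stream
    [0, 0, 0, 0],  # trade: raw data stream
    [1, 1, 0, 0],  # spread: needs quote and trade
    [0, 0, 1, 0],  # signal: needs spread
]


def run_event(name):
    # Placeholder for real event logic.
    return f"{name} done"


def schedule(events, depends):
    """Run events in waves; each wave holds events whose inputs are ready."""
    done, waves = set(), []
    with ThreadPoolExecutor() as pool:
        while len(done) < len(events):
            ready = [
                i for i in range(len(events))
                if i not in done
                and all(j in done for j, flag in enumerate(depends[i]) if flag)
            ]
            if not ready:
                raise ValueError("cyclic dependency between events")
            # Events in the same wave are independent and run concurrently.
            list(pool.map(run_event, [events[i] for i in ready]))
            done.update(ready)
            waves.append([events[i] for i in ready])
    return waves


waves = schedule(EVENTS, DEPENDS)
```

The user only declares which event needs which input; the execution order falls out of the matrix, which matches the point that users need not worry about ordering.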
3.2.3 Nvidia GPU and Google TPU Accelerators
Today, the GPU is one of the most popular accelerators and has
exhibited its superiority over the CPU in some specific algorithms,
such as large-scale matrix operations and Monte Carlo (MC)
simulation. MC is extensively used to calculate portfolio risk for
the simple reason that it does not require closed-form expressions.
However, the accuracy of risks estimated by MC depends on
the number of generated scenarios. Therefore, in practice, a
considerable number of scenarios must be calculated. In this case,
the GPU is a preferable means of solving such a
problem.
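The dependence of MC accuracy on the number of scenarios can be seen in a small CPU-only sketch. It estimates a 95% value-at-risk (VaR) under strong simplifying assumptions that are not from the paper: a single position with normally distributed returns, zero mean, and an illustrative volatility of 2%. The run-to-run spread of the estimate is expected to shrink roughly in proportion to the inverse square root of the scenario count.

```python
import random
import statistics

SIGMA = 0.02   # assumed return volatility (illustrative)
LEVEL = 0.95   # confidence level for value-at-risk


def simulate_var(num_scenarios, rng):
    """Estimate VaR as the LEVEL-quantile of simulated portfolio losses."""
    # Assumption: returns are normal with zero mean; loss = -return.
    losses = sorted(-rng.gauss(0.0, SIGMA) for _ in range(num_scenarios))
    index = min(int(LEVEL * num_scenarios), num_scenarios - 1)
    return losses[index]


def estimator_spread(num_scenarios, trials=30, seed=7):
    """Repeat the estimate to see how much it varies between runs."""
    rng = random.Random(seed)
    estimates = [simulate_var(num_scenarios, rng) for _ in range(trials)]
    return statistics.mean(estimates), statistics.stdev(estimates)


mean_small, spread_small = estimator_spread(200)
mean_large, spread_large = estimator_spread(20000)
```

Each scenario is independent of the others, so the inner loop maps naturally onto many parallel threads. That independence is what makes this workload a good fit for GPUs once scenario counts grow large.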
Another example is the cone programming problem in modern
portfolio theory [32, 33]. The problem classes include linear
programming (LP), quadratic programming (QP), and
semidefinite programming (SDP). Compared with the CPU, the GPU is
more powerful in solving these problems. Therefore, compute-intensive
methods, such as MC, LP, QP, and SDP, stand to benefit from moving
from traditional CPUs to GPUs.
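As one concrete instance, the classical mean-variance portfolio selection problem of [32] can be written as a QP. The symbols are introduced here for illustration and are not from the paper: w is the vector of portfolio weights over n assets, Σ is the covariance matrix of asset returns, μ is the vector of expected returns, and r_min is a target return.

```latex
\begin{aligned}
\min_{w \in \mathbb{R}^n} \quad & w^{\top} \Sigma\, w \\
\text{subject to} \quad & \mu^{\top} w \ge r_{\min}, \\
& \mathbf{1}^{\top} w = 1, \\
& w \ge 0 .
\end{aligned}
```

Since Σ is positive semidefinite, the problem is convex. The dominant cost for large n lies in dense linear algebra on Σ, which is the kind of matrix operation that GPUs accelerate well.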
Google TPU is a novel accelerator that is purpose-built
for machine learning (ML). However, it is currently accessible
only through the cloud. To harness the TPU benefit,
the training workloads and the execution of the ML models
must also be moved to the Google Cloud. This usage
differs from that of the GPU, which can be used locally.
Figure 2: A QuantCloud System Architecture
3.3 Software Environments
Among popular programming languages for QF and IoT
development, Python is a preferred language for building
solutions as it requires fewer lines of code and has a wide
range of statistical libraries available. On the other hand, C++ is
still the first-choice language for programmers who code at the
lowest layer of the software. In theory, there is not much
difference between these high-level languages for writing
desktop apps and servers. However, in practice, there are big
differences in writing code for the next-generation
Internet-based “things”. For example, most computing
resources are remote and all communication goes
through the network. Ideally, a user should not be concerned
with any of this and should simply implement the algorithm as
objects.
Keeping all of this in mind, in our software environment, C++ is
used for developing the big data system architecture [27] and
its built-in CEP scheduler [34]. In the high-level user
environment, in addition to the conventional C++ callbacks, a
Python interface is supplied to execute Python code in the
multi-threaded C++ runtime. This integrated software
architecture is important as it provides an effortless interface
to many Python libraries, such as Theano and Pylearn2 for
machine learning [35, 36], and StatsModels for statistical tests
and data exploration [37].
3.4 Hardware/Software Co-Development for QF
In development, we coordinate the software specification with
the hardware properties and favor neither the hardware nor the
software implementation. This co-design method has been
extensively used for powering IoT development [38], so it
is applied here to QF development.
Among off-the-shelf processors, the manycore architecture is the
most popular, containing a number of independent cores and
shared memory. Technically, a program must be written with a
degree of parallelism so that it can fully exploit the power
of a manycore processor system. Our platform uses a hybrid
multi-threading and multi-coprocessing approach.
A manycore processor usually has just a few cores (e.g. 4, 8, 16)
and may be complemented by an accelerator, such as an Nvidia
GPU, in a heterogeneous system. Each GPU device has its own
memory. Communication between the host CPU and its attached
GPUs goes through the host memory. Strictly speaking, the GPU is
also a form of manycore architecture, but one more suitable for
highly parallel, compute-intensive applications.
The Google Tensor Processing Unit, offered as Cloud TPU, may be
considered another form of novel accelerator, suitable
only for a specific purpose: machine learning (ML). The TPU
is now available as part of Google Cloud and is programmable in
TensorFlow. Currently, an ML application or object has to
move to the cloud to use the TPU. This changes
hardware acquisition, which in turn requires a change
in software development. Under this scenario, the client
part in Fig. 2 is designed as an Internet-based analytics
provider, rather than merely a standalone instance.
Consequently, QF can benefit from a transformation towards an
Internet-based architecture paradigm.
4 PRELIMINARY RESULTS
We built a proof-of-concept (POC) system to demonstrate the
benefit of IoT for QF. This POC system is shown in Fig. 3. In
this POC, we simulated a conventional user who operates a
personal desktop to perform analysis locally and an
Internet-based user who uses Internet-based services to
perform the same analysis on the Cloud. For the conventional
user, we use the Matlab toolbox on a Microsoft Windows
operating system. For the Internet-based user, we use a laptop
to submit tasks to one of the clients in the Cloud over TCP/IP.
On receipt of a task, the client queries data from the server and
executes the task. The big data system infrastructure
follows the work in [27].
Figure 3: A Proof-of-Concept System for the QuantCloud Architecture
We tested the autoregressive moving-average (ARMA) model
using this POC system. In this test, we assume that the
conventional user operates a local computer that has a
relatively old CPU, an Intel Xeon E5 processor, whereas
the Internet-based user accesses a cloud compute instance that
has a newer CPU, an Intel Xeon Phi (Knights Landing) processor.
Both users run the ARMA code in Python with StatsModels [37]
and use a total of 7 days of trade data. It took the conventional
user 78 seconds to process 16 stocks and the Internet-based
user 18 seconds to process 64 stocks. Reading these timings as
per-stock averages, a conventional user can process only about
46 stocks per hour on a local server, whereas an Internet-based
user could process as many as 195 stocks per hour using a single
cloud instance. The NYSE trades stocks for some 2800 companies. So, an
Internet-based user needs only 15 compute instances to
complete such an analysis of all stocks in an hour. By comparison,
the conventional user needs about 2.5 days for this job.
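The throughput figures above can be checked with a short calculation. One assumption is needed: the hourly figures are only consistent with the quoted timings if those timings are read as seconds per stock, so the sketch treats 78 seconds as a per-stock time. For the cloud instance it uses the reported 195 stocks per hour directly, since a rounded 18 seconds per stock would give 200 per hour rather than 195.

```python
import math

SECONDS_PER_HOUR = 3600
NYSE_LISTED = 2800                # companies, as quoted in the text
CONVENTIONAL_SEC_PER_STOCK = 78   # assumption: per-stock reading of the timing
CLOUD_STOCKS_PER_HOUR = 195       # hourly figure reported in the text

# Whole stocks a conventional user finishes in one hour.
conventional_per_hour = SECONDS_PER_HOUR // CONVENTIONAL_SEC_PER_STOCK

# Cloud instances needed to cover every listed company within one hour.
cloud_instances = math.ceil(NYSE_LISTED / CLOUD_STOCKS_PER_HOUR)

# Wall-clock days for a single conventional machine to cover all companies.
conventional_days = NYSE_LISTED / conventional_per_hour / 24

# Per-stock time implied by the reported cloud throughput.
implied_cloud_sec_per_stock = SECONDS_PER_HOUR / CLOUD_STOCKS_PER_HOUR
```

Under these assumptions the calculation gives 46 stocks per hour, 15 instances, and roughly 2.5 days, with an implied cloud time of about 18.5 seconds per stock.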
Therefore, in the IoT age, such conventional users may quickly
lose their competitive advantage in the industry. Some
example “things” in our model could be:
1. Real-time collection of socio-temporal events: People use
mobile devices, such as smartphones, to comment on
social affairs. For example, people “like” or “dislike” a
company’s news. These events are collected through
smartphones, transmitted to a cloud, and organized as
socio-temporal events. Analyzing such events may help us
understand customers’ preferences regarding a company
and its products in a timely fashion. In this
manner, such mobile devices are tangible “things” that let the
financial cloud understand the timely social impact on
the financial market.
2. Placing an order using smart devices: Individual traders can
use their smartphones to place an order, for example, to buy
or sell stocks. Such individual orders are transmitted
through a network to an exchange broker, where the orders
are placed and executed. In this manner, these
smartphones are tangible “things” that help the financial
cloud’s customers place orders in an agile way.
3. Extract live news from a website: For example, get live news
about a company or a sector, from markets.wsj.com and
www.nytimes.com. Search some keywords and use AI
models to predict how these live news data impact a
company’s stocks. Therefore, in this manner, such news
websites are intangible “things” but important for the
financial cloud to reflect the public opinions on the
companies.
4. Extract a company’s financial data: These data are collected
to a cloud center and organized as time-series events used
to understand a company’s financial situation and help
price its stocks. Therefore, in this manner, the company
websites are intangible “things” but reliable sensors for
the financial cloud to demonstrate a company’s
performance in a timely manner.
5 CONCLUSIONS
In this work, we see great potential in leveraging the IoT
paradigm for quantitative trading firms to transform
business practices. By extending the IoT paradigm, we can
collect multi-source data through the Internet, utilize
Internet-based toolchains to gain deep insights from the
collected data, minimize resource provisioning costs by
using the Cloud, and create end-to-end integrated solutions in
a timely manner. The benefits that the IoT paradigm could
bring would change the best practices of most quantitative
trading firms. Therefore, the rise of IoT is an opportunity to
revolutionize the financial industry as it is better aligned with
the needs of modern financial practitioners. To harness this
opportunity, the QuantCloud system architecture is one
solution with a clear focus on the capabilities of financial big
data, complex event processing, artificial intelligence, and
Cloud portability.
ACKNOWLEDGMENTS
Samee U. Khan’s work was supported by (while serving at) the
National Science Foundation. Any opinions, findings, and
conclusions or recommendations expressed in this material
are those of the authors and do not necessarily reflect the
views of the National Science Foundation.
REFERENCES
[1]
Y. Sun, R. Bie, P. Thomas, and X. Cheng, "Advances on data, information,
and knowledge in the internet of things," ed: Springer, 2014.
[2]
J. Gubbi, R. Buyya, S. Marusic, and M. Palaniswami, "Internet of Things
(IoT): A vision, architectural elements, and future directions," Future
generation computer systems, vol. 29, pp. 1645-1660, 2013.
[3]
X. Shi, P. Zhang, and S. U. Khan, "Quantitative Data Analysis in Finance,"
in Handbook of Big Data Technologies, A. Y. Zomaya and S. Sakr, Eds.,
ed Cham: Springer International Publishing, 2017, pp. 719-753.
[4]
J.-P. Bouchaud, M. Mézard, and M. Potters, "Statistical properties of
stock order books: empirical results and models," Quantitative finance,
vol. 2, pp. 251-256, 2002.
[5]
T. Seth and V. Chaudhary, "Big Data in Finance," ed, 2015.
[6]
B. Fang and P. Zhang, "Big Data in Finance," in Big Data Concepts,
Theories, and Applications, S. Yu and S. Guo, Eds., ed Cham: Springer
International Publishing, 2016, pp. 391-412.
[7]
H. Jagadish, J. Gehrke, A. Labrinidis, Y. Papakonstantinou, J. M. Patel, R.
Ramakrishnan, et al., "Big data and its technical challenges,"
Communications of the ACM, vol. 57, pp. 86-94, 2014.
[8]
M. Versace and K. Massey, "The case for Big Data in the financial
services industry," IDC Financial Insights, White paper, 2012.
[9]
A. McAfee and E. Brynjolfsson, "Big data: the management revolution,"
Harvard business review, vol. 90, pp. 60-68, 2012.
[10]
S. Müller and H. Plattner, "Aggregates caching in columnar in-memory
databases," in In Memory Data Management and Analysis, ed: Springer,
2015, pp. 69-81.
[11]
R. Bhargavi, "Complex Event Processing Framework for Big Data
Applications," in Data Science and Big Data Computing: Frameworks
and Methodologies, Z. Mahmood, Ed., ed Cham: Springer International
Publishing, 2016, pp. 41-56.
[12]
S. Haller, S. Karnouskos, and C. Schroth, "The internet of things in an
enterprise context," in Future Internet Symposium, 2008, pp. 14-28.
[13]
A. Satariano, "Silicon Valley Hedge Fund Takes On Wall Street With AI
Trader," in Bloomberg Technology, ed, 2017.
[14]
D. Miorandi, S. Sicari, F. De Pellegrini, and I. Chlamtac, "Internet of
things: Vision, applications and research challenges," Ad Hoc
Networks, vol. 10, pp. 1497-1516, 2012.
[15]
Amazon Web Services IoT. URL: https://aws.amazon.com/iot/
[16]
Google Cloud IoT. URL: https://cloud.google.com/solutions/iot/
[17]
Microsoft Azure IoT. URL:
https://www.microsoft.com/en-us/internet-of-things/azure-iot-suite
[18]
Amazon Web Services for Financial Services. URL:
https://aws.amazon.com/financial-services/
[19]
Google Cloud Platform for Financial Services Solutions. URL:
https://cloud.google.com/solutions/financial-services/
[20]
Microsoft Azure for Financial Services. URL:
https://azure.microsoft.com/en-us/industries/financial/
[21]
A. Botta, W. De Donato, V. Persico, and A. Pescapé, "On the integration
of cloud computing and internet of things," in Future Internet of Things
and Cloud (FiCloud), 2014 International Conference on, 2014, pp.
23-30.
[22]
QuantStart. URL: https://www.quantstart.com/
[23]
Quantopian. URL: https://www.quantopian.com/
[24]
QuantConnect. URL: https://www.quantconnect.com/
[25]
Collective2. URL: https://trade.collective2.com/
[26]
M. Miller, "Cloud computing: Web-based applications that change the
way you work and collaborate" online: Que publishing, 2008.
[27]
P. Zhang, K. Yu, J. Yu, and S. Khan, "QuantCloud: Big Data Infrastructure
for Quantitative Finance on the Cloud," IEEE Transactions on Big Data
DOI: 10.1109/TBDATA.2017.2649544, 2017.
[28]
Y. Lu and J. Cecil, "An Internet of Things (IoT)-based collaborative
framework for advanced manufacturing," The International Journal of
Advanced Manufacturing Technology, vol. 84, pp. 1141-1152, May 01
2016.
[29]
P. Zhang and Y. Gao, "Matrix Multiplication on High-Density Multi-GPU
Architectures: Theoretical and Experimental Investigations," in High
Performance Computing: 30th International Conference, ISC High
Performance 2015, Frankfurt, Germany, July 12-16, 2015, Proceedings,
J. M. Kunkel and T. Ludwig, Eds., ed Cham: Springer International
Publishing, 2015, pp. 17-30.
[30]
K. Fatahalian, J. Sugerman, and P. Hanrahan, "Understanding the
efficiency of GPU algorithms for matrix-matrix multiplication," in
Proceedings of the ACM SIGGRAPH/EUROGRAPHICS conference on
Graphics hardware, 2004, pp. 133-137.
[31]
P. Zhang, Y. Gao, and M. Qiu, "A data-oriented method for scheduling
dependent tasks on high-density multi-GPU systems," in High
Performance Computing and Communications (HPCC), IEEE 17th
International Conference on, 2015, pp. 694-699.
[32]
H. Markowitz, "Portfolio selection," The journal of finance, vol. 7, pp.
77-91, 1952.
[33]
E. J. Elton and M. J. Gruber, "Modern portfolio theory, 1950 to date,"
Journal of Banking & Finance, vol. 21, pp. 1743-1759, 1997.
[34]
R. Al-Rfou, G. Alain, A. Almahairi, C. Angermueller, D. Bahdanau, N.
Ballas, et al., "Theano: A Python framework for fast computation of
mathematical expressions," arXiv preprint arXiv:1605.02688, 2016.
[35]
I. J. Goodfellow, D. Warde-Farley, P. Lamblin, V. Dumoulin, M. Mirza, R.
Pascanu, et al., "Pylearn2: a machine learning research library," arXiv
preprint arXiv:1308.4214, 2013.
[36]
S. Seabold and J. Perktold, "Statsmodels: Econometric and statistical
modeling with python," in Proceedings of the 9th Python in Science
Conference, 2010, p. 61.
[37]
H. Jayakumar, K. Lee, W. S. Lee, A. Raha, Y. Kim, and V. Raghunathan,
"Powering the internet of things," in Proceedings of the 2014
international symposium on Low power electronics and design, 2014,
pp. 375-380.