Conference Paper

Constructing an Efficient State Space Query System for the Voyage Data Recorder

William W. Y. Hsu a,b,1, Yi-Wen Wu a, Min-Ruey You a, Cheng-Hsin Liao c, Cheng-Yu Lu d, Hao-Hsun Wang a
a Department of Computer Science and Engineering, National Taiwan Ocean University, Keelung, Taiwan
b Institute of Information Science, Academia Sinica, Nankang, Taiwan
c Department of Environmental Biology and Fisheries Science, National Taiwan Ocean University, Keelung, Taiwan
d PixNET, Taipei, Taiwan
Abstract. Voyage data recorders are devices kept on vessels to track their trajectories when overseas. They receive signals from global positioning satellites to compute their current location on Earth and record the current speed, course, and bearing in internal memory. The information is downloaded and recorded each time the fishing vessel docks for fueling. Currently, over 10 thousand devices are installed and tracked, and they have collected over 3 billion GPS samples over an 8-year period. This paper presents our approach to cleaning and reorganizing the data and to designing an efficient, incrementally updatable database for fast queries on the state and spatial location of vessels. We have also adapted WebGL technology to visualize trajectories with more than 100 thousand points. This initial work will provide a better platform for future fisheries management, such as resource estimation, catch per unit effort analysis, environmental protection, operational efficiency, and activity surveillance.
Keywords. Voyage data recorder (VDR), Fisheries, Trajectories, Global positioning
systems (GPS), Visualization
1. Introduction
Fisheries are a major industry in Taiwan, and the government supports the industry by providing fishing vessel fuel stipends. In 2006, the Fisheries Agency, Council of Agriculture, Executive Yuan of Taiwan, asked National Cheng-Kung University to develop the Voyage Data Recorder (VDR) system for the purpose of detecting unauthorized or illegal applications for the fishing vessel fuel stipend and for future marine resource management. After a year of research, development, prototyping, testing, and onboard evaluation, the Fisheries Agency required all fishing vessels in Taiwan to be equipped with the VDR system. Establishing such a system is a must for future research. Currently, coast guard records of vessels leaving and entering harbors, catch statistics (fish type, amount, and location) from seed vessels, fuel stipend records, and the VDR data are kept separately. Without an efficient system, it is difficult to join these data for further statistical analysis and estimation.

1 Corresponding author. The authors are supported by the Ministry of Science and Technology of Taiwan, under grant MOST-103-2221-E-019-041, Council of Agriculture, Fisheries Agency of Taiwan under grant 103AG-11.2-1-FA-F7, and by the National Taiwan Ocean University under grant NTOU-RD-AA-2014-2-05021. This research has no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript. (Email: wwyhsu@ntou.edu.tw)
The VDR systems use the Global Positioning System (GPS) for trajectory tracking. The latitude, longitude, course bearing, course speed, and timestamp are recorded every 3 minutes. Upon returning to the harbor for refueling, the raw VDR records are uploaded to a central server at the Center of Systems and Naval Mechatronic Engineering (CSNME) for storage. Currently, over 10 thousand unique devices are monitored. Due to budget limitations, the VDR cannot use satellite communication to relay data in real time. However, the delayed VDR data can still be used for off-line analysis, including trajectory analysis and operational efficiency.
To our knowledge, the VDR data center at National Cheng-Kung University focuses on preserving raw data and maintains only a minimal relational database system for analysis and queries. Due to the vast amount of accumulated VDR data, the original system design cannot keep up with the need for fast responses. We propose a system that cleans and reorganizes the raw data, redesigns the database structure, and creates an environment for fast post-processing, queries, and data visualization.
2. Background
Research on the use of GPS data is already widespread. Wu et al. used GPS data to discover the moving styles of monitored objects in different regions [1]. Based on clustering techniques, trajectories with similar behavior can be joined to represent a condition in a region at a specific time. Pao et al. proposed a Markov chain model to find hidden patterns embedded in the trajectories produced by user inputs [2]. Although the data in Pao et al. is not from GPS, the properties of the data are similar. Mobile devices with GPS can also produce trajectory data. Yin et al. [3] and Chen et al. [4] used data mining techniques to predict possible future trajectories of subjects from past information. Semantic analysis has also been used in mining GPS information; for example, Ying et al. tried to identify the possible future location of the GPS device bearer using semantic analysis [5]. Finding frequent trajectory patterns is also a research topic. Wang et al. used vague space partition to analyze and find frequent patterns in spatiotemporal data [6]. This can be a relatively important technique for VDRs, as fishing vessels travel irregularly and may span halfway around the Earth.
Vessel monitoring systems (VMS) are similar to VDRs: both are specialized GPS devices used on vessels, except that a VMS takes samples at a much longer interval and relays information using satellite communication. Walker and Bez studied VMS data to estimate the behavior of vessels at sea [7]. They tried to identify whether the vessel is cruising, idling, or fishing using fuzzy logic. Mak et al. used the Vessel Performance Monitoring and Analysis System (VPMAS) to support efficient operations that reduce the fuel consumption of vessels [8]. All such research depends on a stable and fast database for storing VDR data, which contains GPS information.
[Figure 1 components: user terminal, web portal, third-party GIS, internal GIS, internal server (VDR data processor), private terminal, data computing and preprocessing module, and voyage database service provider. Links are labeled read/write, read only, upload raw data, and HTTP request (Ajax/jQuery).]
Figure 1. System architecture. Our architecture isolates private and public data, and communication between modules is built on web services.
3. System Architecture
A cloud infrastructure may host an ensemble of software delivered as services under the Software as a Service (SaaS) model, together with computing hardware and networking [9]. It provides high computing performance and security, and meets compliance requirements. Cloud infrastructure also helps users outsource their IT framework and reduce establishment costs so that they can concentrate on their business. We modularize our system to provide isolation where required and to reserve flexibility for the future. Our system modules communicate using web service protocols, which allows each module to be independent of programming language and operating system. De-identification is necessary at this point because the data include personal information, which is highly confidential. As shown in Figure 1, we isolate the database from the front-end user. Only through private secured terminals can the data be viewed or modified directly without any de-identification process. For regular de-identified queries, users go through the web portal, which prepares the data for display at the user terminal. The web portal requests data from the computing and preprocessing module, where the data is prepared for display with Cesium WebGL [10]. The data computing and preprocessing module can only be accessed by authorized users with security clearance. This module extracts information from the central VDR voyage database and processes it for visualization. Finally, GIS information from either internal servers or third-party providers is loaded to form the geographical layer. The rendering process is done without sending any trajectory information to the third-party providers.
The detailed flow is shown in Figure 2. Upon receiving the raw VDR data (text files of NMEA strings), we parse the data and insert the samples into the database sorted by timestamp. Next, we join related GPS points to form a trajectory, i.e., a voyage segment for each vessel. For voyages that span more than a month, we check whether the segment continues a voyage from the previous month or is continued into the following month. The database indexes are created after all the analyses have been completed. At this point, our system is ready for visualization and queries.
We chose MariaDB as our underlying database management system [11]. First, according to its documentation, it has many optimizer enhancements and faster and safer replication. Second, it supports NoSQL. Third, sharding is supported, which allows tables to be dispersed across servers in the future.

[Figure 2 flow: Start → Raw data parsing → Construct voyage segments → Voyage segment analysis → Is there any ongoing voyage? (Yes: implement the virtual double linked list; No: continue) → Create database indexes → Query → Display.]
Figure 2. System flow diagram. The detailed procedures of our system.
4. Database Construction
4.1. Definition of GPRMC Strings
VDR information is stored using the NMEA (National Marine Electronics Association) format [12]. An example string from our raw data is
$GPRMC,033416.00,A,2619.06271,N,11952.82288,E,4.059,198.65,120514,,,A*6D
Detailed field definitions are given in Table 1.
Table 1. Fields in the VDR sample. The fields in our VDR sample are defined using the NMEA standard (obtained from [12]).
Field            Comment
$GPRMC           Recommended Minimum sentence C
033416.00        Fix taken at 03:34:16.00 UTC
A                Status: A = active, V = void
2619.06271,N     Latitude 26 deg 19.06271’ N
11952.82288,E    Longitude 119 deg 52.82288’ E
4.059            Speed over the ground in knots
198.65           Track angle in degrees
120514           Date 2014/05/12
,,,A             Magnetic variation and “Unknown”
*6D              The checksum data, always begins with *
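The field layout in Table 1 can be illustrated with a small parser. This is a minimal sketch, not the authors' implementation: the names `RmcRecord` and `parse_gprmc` are hypothetical, the latitude and longitude fields are kept as raw strings, and checksum validation is deliberately left out.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class RmcRecord(NamedTuple):
    """One GPRMC fix, following the field order of Table 1 (hypothetical type)."""
    time_utc: datetime
    active: bool
    lat_field: str
    lat_hemisphere: str
    lon_field: str
    lon_hemisphere: str
    speed_knots: float
    course_deg: float


def parse_gprmc(sentence: str) -> RmcRecord:
    """Split a GPRMC sentence into typed fields; raises ValueError if malformed."""
    body = sentence.strip().split("*", 1)[0]      # drop the "*hh" checksum suffix
    fields = body.split(",")
    if fields[0] != "$GPRMC" or len(fields) < 10:
        raise ValueError("not a complete GPRMC sentence")
    # Date is ddmmyy and time is hhmmss.ss; fractional seconds are ignored here.
    stamp = datetime.strptime(fields[9] + fields[1][:6], "%d%m%y%H%M%S")
    return RmcRecord(
        time_utc=stamp.replace(tzinfo=timezone.utc),
        active=(fields[2] == "A"),
        lat_field=fields[3],
        lat_hemisphere=fields[4],
        lon_field=fields[5],
        lon_hemisphere=fields[6],
        speed_knots=float(fields[7]),
        course_deg=float(fields[8]),
    )


SAMPLE = "$GPRMC,033416.00,A,2619.06271,N,11952.82288,E,4.059,198.65,120514,,,A*6D"
record = parse_gprmc(SAMPLE)
print(record.time_utc.isoformat(), record.speed_knots, record.course_deg)
```

Keeping the coordinate fields as strings at this stage avoids losing digits before the conversion to decimal degrees.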
GPRMC uses degrees (D), minutes (M), and seconds (S) for angle measurements. We convert this representation into decimal degrees (DD) using

$$DD = D + \frac{M}{60} + \frac{S}{3600}.$$

In our raw data the minute field carries the fractional part (for example, 19.06271’), so the seconds term is zero. The minute field has a precision of up to five decimal places, so its finest increment corresponds to $\frac{0.00001}{60} \approx 0.00000017$ degrees, and six decimal digits suffice in decimal degree representation. One unit of the sixth decimal digit corresponds to 11.132 cm at the equator and 4.3496 cm at 67° N/S. This precision is adequate since the accuracy of GPS is around 7.8 meters at a 95% confidence level (see footnote 2). Storing the decimal degrees with 6 decimal digits in our database therefore does not introduce any meaningful bias.
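As a worked illustration of the conversion, the sketch below turns the latitude and longitude fields of the example string into decimal degrees rounded to six digits, and estimates the east-west ground distance covered by one unit of the sixth decimal digit. The function names are hypothetical, and the figure of 111.32 km per degree at the equator is an approximation assumed here rather than a value taken from the paper.

```python
import math

# Assumption: approximate length of one degree of longitude at the equator.
METERS_PER_DEGREE_AT_EQUATOR = 111_320.0


def nmea_to_decimal_degrees(field: str, hemisphere: str) -> float:
    """Convert a (d)ddmm.mmmmm field to signed decimal degrees with 6 digits."""
    dot = field.index(".")
    degrees = int(field[:dot - 2])       # digits before the two minute digits
    minutes = float(field[dot - 2:])     # mm.mmmmm, seconds term is zero
    value = degrees + minutes / 60.0
    if hemisphere in ("S", "W"):
        value = -value
    return round(value, 6)


def sixth_digit_resolution_cm(latitude_deg: float) -> float:
    """East-west distance in cm spanned by 0.000001 degree at a given latitude."""
    meters = METERS_PER_DEGREE_AT_EQUATOR * 1e-6 * math.cos(math.radians(latitude_deg))
    return meters * 100.0


lat = nmea_to_decimal_degrees("2619.06271", "N")
lon = nmea_to_decimal_degrees("11952.82288", "E")
print(lat, lon, sixth_digit_resolution_cm(lat))
```

Working on the string rather than on a parsed float keeps the degree and minute parts exact before the division by 60.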
The VDR equipment sometimes produces erroneous results due to the harsh environment on board fishing vessels at sea. We check every sample by computing the checksum (see Table 1) and discard samples that cannot be parsed or that have an incorrect checksum. Other sources of error should also be considered, including invalid triangulation caused by hardware faults, accidental human intervention that changes file contents, and power shortages on board fishing vessels, which can cause long breaks between VDR samples. These types of data account for 0.6% of the total samples over 8 years.
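The checksum test can be sketched as follows. In NMEA 0183 the checksum is the XOR of every character between `$` and `*`, written as two hexadecimal digits. The function names are hypothetical, and the filter only covers the parsability and checksum checks, not the other error sources listed above.

```python
from functools import reduce


def nmea_checksum(body: str) -> str:
    """XOR of all characters between '$' and '*', as two uppercase hex digits."""
    return f"{reduce(lambda acc, ch: acc ^ ord(ch), body, 0):02X}"


def is_valid_sentence(sentence: str) -> bool:
    """Return True when the sentence is well formed and its checksum matches."""
    sentence = sentence.strip()
    if not sentence.startswith("$") or "*" not in sentence:
        return False
    body, _, received = sentence[1:].rpartition("*")
    if len(received) != 2:
        return False
    return nmea_checksum(body) == received.upper()


def keep_valid(lines):
    """Filter raw lines, discarding anything unparsable or with a bad checksum."""
    return [line for line in lines if is_valid_sentence(line)]


# Build a sentence with a freshly computed checksum, then corrupt one digit.
body = "GPRMC,033416.00,A,2619.06271,N,11952.82288,E,4.059,198.65,120514,,,A"
good = f"${body}*{nmea_checksum(body)}"
bad = good.replace("4.059", "4.058")
print(keep_valid([good, bad, "garbage"]) == [good])
```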
4.2. Distributing the Raw VDR Samples
The VDR data we obtained are not real-time data. Real-time reporting methods such as the vessel monitoring system (VMS), which relays information through satellite communication, are very costly. The government considers relaying VDR information in real time a tremendous financial burden because VDR data are recorded very frequently: VMS positions are recorded once every 2 to 6 hours, whereas VDR samples are recorded once every 3 minutes under the current configuration and regulation. As a result, VDR samples can only be acquired when a vessel docks for refueling. The VDR data is downloaded from the fishing vessel and uploaded to the server at CSNME during the fueling process. The files downloaded at this point contain the trajectory from the last download up to the present, which may span more than a month.
After parsing the raw VDR data, we redistribute the samples into their corresponding year and month tables. As illustrated in Figure 3, the data for 2014/06 (green) may contain information from before 2014/06, as the vessel may operate overseas for a varying number of months. After filtering out the samples that belong to 2014/06 (blue), we merge the remaining data (light purple) with the data of 2014/05 for the next iteration. This process continues until the original set is empty or contains only trash data.
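The month-by-month filtering can be sketched as a loop that walks backward from the upload month. This is an illustrative sketch under stated assumptions: samples are `(vessel_id, timestamp)` pairs, monthly tables are modeled as an in-memory dictionary rather than database tables, and the `earliest` cut-off that ends the loop is a hypothetical parameter.

```python
from collections import defaultdict
from datetime import datetime


def previous_month(year: int, month: int):
    """Return the (year, month) pair preceding the given one."""
    return (year - 1, 12) if month == 1 else (year, month - 1)


def distribute_by_month(samples, start, earliest):
    """Filter samples into monthly buckets, walking backward from `start`.

    Returns (tables, leftover); leftover holds samples that fit no month in
    the range, which corresponds to the "trash data" case.
    """
    tables = defaultdict(list)
    remaining = list(samples)
    current = start
    while remaining and current >= earliest:
        keep, rest = [], []
        for sample in remaining:
            stamp = sample[1]
            (keep if (stamp.year, stamp.month) == current else rest).append(sample)
        if keep:
            tables[current].extend(keep)
        remaining = rest                      # carried over to the previous month
        current = previous_month(*current)
    return dict(tables), remaining


uploaded = [
    ("CT1000001", datetime(2014, 6, 3, 1, 0)),
    ("CT1000001", datetime(2014, 5, 20, 8, 30)),
    ("CT1000001", datetime(2014, 4, 1, 12, 0)),
    ("CT1000001", datetime(2030, 1, 1, 0, 0)),   # implausible timestamp
]
tables, leftover = distribute_by_month(uploaded, (2014, 6), (2008, 1))
print(sorted(tables), len(leftover))
```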
4.3. Voyage Construction
The next step constructs voyages from the VDR data. We define a single voyage of a vessel to be one of the following:
• After a vessel enters a harbor and before the vessel exits a harbor. (Docking)
• When a vessel leaves a harbor and then re-enters a harbor. (Operating)
• When a vessel is at a harbor, and the next sample shows up in another harbor. (Possibly the device was carried on land to be installed somewhere else.)
We group related VDR samples together to form individual voyage segments. The pseudocode of the partitioning algorithm based on the three criteria above is shown in Algorithm 1. We classify the vessel into 2 states in this paper. The first is the idling state, in which the vessel is within a port. As long as the vessel does not leave this port, all activities are considered part of the idling state. This includes docking, fueling, unloading goods, and maintenance. The second is the active state, in which the vessel leaves a port for operation.

2 Source: http://www.gps.gov/systems/gps/performance/accuracy

[Figure 3: a file uploaded in 2014/06 is filtered month by month; the samples belonging to 2014/06 are stored, and the remainder is merged with 2014/05, then 2014/04, 2014/03, and 2014/02 in turn.]
Figure 3. Distributing raw samples into the database. A single file may contain information not only from the current month, but also from previous months.

Algorithm 1 Voyage partitioning algorithm.
Require: The set of vessels v ∈ {CT0.., CT1.., ..., CT9, CTX};
Ensure: Voyage fragments r for all v;
Read v’s trajectory T from the database, sorted by timestamp;
Let h be the set of harbors in Taiwan;
Initialize a new voyage r;
for all VDR samples t_x ∈ T do
    if t_x ∈ h and t_{x-1} ∉ h then
        We have found a voyage r of v;
        Record it into the database and initialize r for the next voyage;
    else if t_x ∉ h and t_{x-1} ∈ h then
        We have found a voyage r of v; record it into the database and reset r;
    else if t_x and t_{x-1} are both ∉ h or both ∈ h then
        Merge t_x into the current voyage r;
    end if
end for
return
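The partitioning step of Algorithm 1 can be sketched in Python as follows. This is an interpretation rather than the authors' code: `harbor_of` is a hypothetical caller-supplied function that returns a harbor name for a sample inside a harbor and `None` otherwise, the sample that triggers a split is assumed to open the next voyage, and a change from one harbor directly to another is also treated as a split, following the third voyage criterion.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Optional


@dataclass
class Sample:
    timestamp: int      # e.g. seconds since the epoch
    lat: float
    lon: float


@dataclass
class Voyage:
    state: str                          # "docking" or "operating"
    harbor: Optional[str]               # harbor name, or None when at sea
    samples: List[Sample] = field(default_factory=list)


def partition_voyages(
    samples: Iterable[Sample],
    harbor_of: Callable[[Sample], Optional[str]],
) -> List[Voyage]:
    """Split one vessel's samples into voyages at harbor entry and exit."""
    voyages: List[Voyage] = []
    current: Optional[Voyage] = None
    for sample in sorted(samples, key=lambda s: s.timestamp):
        harbor = harbor_of(sample)
        if current is None or harbor != current.harbor:
            # Harbor membership changed: close the voyage and open a new one.
            current = Voyage("docking" if harbor else "operating", harbor)
            voyages.append(current)
        current.samples.append(sample)   # otherwise merge into current voyage
    return voyages


# Toy harbor test (assumption): everything south of 25.2 N is in harbor "A".
def toy_harbor(sample: Sample) -> Optional[str]:
    return "A" if sample.lat < 25.2 else None


track = [Sample(t, lat, 121.7) for t, lat in enumerate([25.1, 25.1, 25.5, 25.6, 25.1])]
result = partition_voyages(track, toy_harbor)
print([(v.state, len(v.samples)) for v in result])
```

A real harbor test would more likely use harbor polygons or bounding boxes; the one-line latitude rule is only there to make the example self-contained.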
4.4. Route Stitching
Since the VDR data stored in our database are partitioned at monthly resolution, the voyages constructed using Algorithm 1 can be fragmented for vessels that have been docking in a harbor or navigating around the world across month boundaries. Algorithm 2 is applied to stitch related voyage fragments into a complete trajectory. We maintain a virtual doubly linked list structure in our database, which has the advantages of easier future maintenance and minimal modification to the original structure. First, we do not need another database schema to store the full trajectory. Second, as data arrive asynchronously from different VDR downloading sites, the virtual doubly linked list pointers can be updated incrementally without affecting other data. For any given time frame, we can trace the voyage trajectory both forward and backward by following the links. This data structure allows us to find the beginning and end of a voyage across months with O(n) queries, where n is the number of months this particular trajectory covers. For example, in Figure 4, CT8000001’s voyage ends at month x, and the previous trajectory can be obtained by tracing backward to month x−1. CT1000001’s voyage starts at month x and extends into month x+1 and beyond.

Algorithm 2 Voyage stitching algorithm.
Require: The set of voyages r from the database;
Ensure: Complete voyages R for all vessels;
for all voyages r_n of a vessel v do
    if r_n ends in a harbor h then
        if the next voyage r_{n+1} of v starts in the same harbor h then
            Update r_n to point to the next voyage r_{n+1};
            Update r_{n+1} to point back to the current voyage r_n;
        end if
    end if
    if r_n ends outside a harbor then
        Let the next voyage of v be r_{n+1};
        Update r_n to point to the next voyage r_{n+1};
        Update r_{n+1} to point back to the current voyage r_n;
    end if
end for
return

We make the following observations in designing our voyage stitching algorithm:
• If a vessel stays within the same harbor (or an adjacent and connected harbor), its segments are stitched into the same voyage.
• If a vessel’s voyage terminates somewhere outside any harbor, then the first voyage of the next month is treated as a continuation of the same trip.
• If a vessel ends in some harbor and begins in another, non-adjacent harbor (for example, because the VDR was moved manually), the segments are considered to be two voyages.
As shown in Figure 5, we have a vessel whose complete voyage spans 4 months. We use our algorithm to join the segments into a complete trajectory, as in Figure 6. Partitioning by time-distance information from 2 consecutive VDR samples may not work for earlier models of the VDR, which are not equipped with an internal battery. Fishing vessels tend to anchor and power down the whole vessel when waiting or idling at sea to conserve fuel, which leads to an arbitrary gap between two consecutive VDR samples.
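The stitching rule and the linked-list traversal described in this section can be sketched as follows. This is a minimal in-memory sketch, not the database implementation: the `Fragment` type and its field names are hypothetical, the pointers are modeled as fragment ids, adjacency between harbors is ignored, and the rule is applied only to the last fragment of one month and the first fragment of the next.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Fragment:
    frag_id: int
    vessel: str
    month: int                        # running month index
    start_harbor: Optional[str]       # None means the fragment starts at sea
    end_harbor: Optional[str]         # None means the fragment ends at sea
    prev_id: Optional[int] = None     # backward pointer of the virtual list
    next_id: Optional[int] = None     # forward pointer of the virtual list


def stitch(last_of_month: Fragment, first_of_next: Fragment) -> bool:
    """Link two fragments across a month boundary when they form one voyage."""
    if last_of_month.vessel != first_of_next.vessel:
        return False
    ends_at_sea = last_of_month.end_harbor is None
    same_harbor = last_of_month.end_harbor == first_of_next.start_harbor
    if not (ends_at_sea or same_harbor):
        return False                   # different harbors: two separate voyages
    last_of_month.next_id = first_of_next.frag_id
    first_of_next.prev_id = last_of_month.frag_id
    return True


def full_voyage(start: Fragment, table: Dict[int, Fragment]) -> List[int]:
    """Follow the pointers backward to the head, then forward to the tail."""
    frag = start
    while frag.prev_id is not None:
        frag = table[frag.prev_id]
    chain = [frag.frag_id]
    while frag.next_id is not None:
        frag = table[frag.next_id]
        chain.append(frag.frag_id)
    return chain


# A voyage spanning four months: leaves harbor "A" and returns to it.
frags = [
    Fragment(0, "CT1000001", 0, "A", None),
    Fragment(1, "CT1000001", 1, None, None),
    Fragment(2, "CT1000001", 2, None, None),
    Fragment(3, "CT1000001", 3, None, "A"),
]
table = {f.frag_id: f for f in frags}
for left, right in zip(frags, frags[1:]):
    stitch(left, right)
print(full_voyage(frags[2], table))
```

Each traversal step touches one fragment per month, which is where the linear number of queries in the text comes from.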
4.5. Incremental Data Flow
The VDR data is collected each time the vessel refuels, but the raw data is pooled and delivered once per season. Thus, the database is updated at a quarterly interval. The process for updating the database when new data arrives is:
1. The raw data is parsed and checked for any errors.
2. The data is distributed among the months preceding the arrival month.
3. For each month that has received new data, Algorithm 1 is executed to generate new voyage segments.
4. For months that have new voyage segments, Algorithm 2 is executed to extend (or merge) incomplete voyage segments.

[Figure 4: monthly tables for months x−1, x, and x+1, each holding voyage segments of vessels such as CT1000001, CT8000001, and CT9000001, linked across months.]
Figure 4. Double linked list implementation in the database. This structure allows our algorithm to find the initial position and final position of a voyage given any segment.

[Figure 5 panels: (a) The vessel initiates a voyage in month x. (b) The vessel is traveling through month x+1. (c) The vessel is traveling through month x+2. (d) The vessel returns to its port in month x+3.]
Figure 5. Voyage stitching. We join voyage segments which should be one single trip into a single trajectory.
5. Experiment Results
We have processed VDR data from January 2008 to June 2014. These data are confidential government data, and thus masking and de-identification of vessel owners is required. We have processed approximately 3.4 billion VDR samples from over 12 thousand unique VDR devices. The front-end visualization is done using Cesium WebGL [10], which utilizes 3D acceleration in web browsers. GIS information is acquired from Microsoft Bing Maps and OpenStreetMap. The primary server is an Asus RS700/E7 with a 2.0 GHz Xeon processor, and the primary database storage is offloaded to a QNAP TS-469 network-attached storage (NAS) device.

Figure 6. Complete voyage constructed from segments. The complete voyage can be constructed with our algorithm with minimal changes to the database.
Figure 7 shows 7 trajectories of a vessel with over 8000 GPS points plotted with Cesium WebGL (with some modifications to conserve browser memory) using Microsoft Bing Maps and OpenStreetMap as the GIS layer. We have added bearing information to the trajectories (see Figure 8), with longer and larger arrows indicating that the vessel is traveling faster. The query time for a single trajectory usually ranges from 0.5 seconds to 3 seconds, depending on the length of the voyage and the number of months the voyage spans.
For the next experiment, we visualized a larger number of points. We ran 2 queries, each for a lengthy voyage. The first voyage contains 37488 points and the second contains 53869 points, and the queries completed in 1.281 seconds and 2.57 seconds, respectively. The results are shown in Figure 9, which contains 91357 points in total. The regular Google 2D map API was unable to process this number of data points, but resorting to 3D WebGL solved the problem.
6. Conclusion
Computers and equipment today generate information at a tremendous rate. In our case, the VDR sampling rate could be adjusted to 1 sample per minute or even 1 sample per second. In other disciplines, approximation algorithms and methods have been used to analyze this mass of information (see [13,14]). The first difficulty lies in exactness. Our system needs to be exact in some aspects when it comes to computing stipends for fishing vessels. A small error will cause the government to pay too much or, conversely, infringe on the right of fishermen to receive their exact fuel compensation.

Figure 7. Trajectory of a vessel. This plot shows 7 trajectories of a vessel leaving a harbor for operation and returning.
Second, VDR data are collected each time a fishing vessel returns to harbor for fueling. This means that events overseas have already happened and we cannot have a real-time view. Although VMS uses satellite communication to relay information in real time, it does so at a 2 to 6 hour interval. Doing the same would cost too much for VDRs, which take samples at a high frequency.
Third, our database provides efficient queries of the vessels’ state and trajectory. A systematic way of parsing the raw data into the database and a well-designed data representation speed up the process of acquiring meaningful information. Combined with 3D WebGL visualization, we can show the position of designated vessels on the globe.

Figure 8. Zooming the map. The arrows showing the bearing of the vessel are now visible. Larger arrows indicate that the vessel is traveling faster.
Figure 9. Plotting lengthy trajectories. The 2 trajectories span a period of over 6 months.
Last, our research provides a platform for future fisheries management and surveillance. We know that more than 20 fishing methods are used among the vessels monitored. With extra information such as catch yield reports, we can estimate the efficiency of each vessel. Moreover, we can see whether any vessel is trespassing into no-fishing zones or territories belonging to other countries, which can cause international disputes. Providing this efficient system with clean data will save preprocessing time for future research.
Future development of this research will include merging coast guard records, catch statistics, fuel records, and hydrological information. Vessel states are broadly classified into two sets in the current system and will be subdivided into more categories in the future. For example, the docking state will contain fueling, unloading, maintenance, and powered off. The operating state will contain navigating, seeking, net casting, harvesting, and idling. With this system, further analysis, including resource estimation and environmental monitoring, becomes possible.
References
[1] H.-R. Wu, M.-Y. Yeh, and M.-S. Chen, “Profiling moving objects by dividing and clustering trajectories spatiotemporally,” IEEE Transactions on Knowledge and Data Engineering, vol. 25, no. 11, pp. 2615–2628, 2013.
[2] H.-K. Pao, J. Fadlil, H.-Y. Lin, and K.-T. Chen, “Trajectory analysis for user verification and recognition,” Knowledge-Based Systems, vol. 34, pp. 81–90, 2012.
[3] P. Yin, M. Ye, W.-C. Lee, and Z. Li, “Mining GPS data for trajectory recommendation,” in Advances in Knowledge Discovery and Data Mining. Springer, 2014, pp. 50–61.
[4] L. Chen, M. Lv, and G. Chen, “A system for destination and future route prediction based on trajectory mining,” Pervasive and Mobile Computing, vol. 6, no. 6, pp. 657–676, 2010.
[5] J. J.-C. Ying, W.-C. Lee, T.-C. Weng, and V. S. Tseng, “Semantic trajectory mining for location prediction,” in Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems. ACM, 2011, pp. 34–43.
[6] C. Wang, D. De, and W.-Z. Song, “Trajectory mining from anonymous binary motion sensors in smart environment,” Knowledge-Based Systems, vol. 37, pp. 346–356, 2013.
[7] E. Walker and N. Bez, “A pioneer validation of a state-space model of vessel trajectories (VMS) with observers data,” Ecological Modelling, vol. 221, no. 17, pp. 2008–2017, 2010.
[8] L. Mak, M. Sullivan, A. Kuczora, and J. Millan, “Ship performance monitoring and analysis to improve fuel efficiency,” in Oceans-St. John’s, 2014. IEEE, 2014, pp. 1–10.
[9] A. Agopyan, E. Sener, and A. Beklen, “Financial business cloud for high-frequency trading: A research on financial trading operations with cloud computing,” International Journal On Advances in Intelligent Systems, vol. 4, no. 3-4, pp. 203–217, 2012.
[10] P. Cozzi and D. Bagnell, “A WebGL globe rendering pipeline,” GPU Pro 4: Advanced Rendering Techniques, vol. 4, pp. 39–48, 2013.
[11] (2015) MariaDB. [Online]. Available: https://mariadb.org
[12] K. Betke, “The NMEA 0183 protocol,” Standard for Interfacing Marine Electronics Devices, National Marine Electronics Association, Severna Park, Maryland, USA, 2001.
[13] T. Heinis, “Data analysis: Approximation aids handling of big data,” Nature, vol. 515, no. 7526, p. 198, 2014.
[14] G. Luo and G. Pei, “A novel multivariate polynomial approximation factorization of big data,” in Intelligent Computation in Big Data Era. Springer, 2015, pp. 484–496.
... The VDR data center at the National Cheng-Kung University periodically uploads raw VDR data. We collect the raw data from the Fisheries Agency and then parse, construct voyages according to the method explained in Hsu et al. [10]. However, due to the vast amount of data to be processed, we choose to use MongoDB as our engine, a document store model noSQL database [11]. ...
... However, due to the vast amount of data to be processed, we choose to use MongoDB as our engine, a document store model noSQL database [11]. For our case, we following the definition of a voyage as in [10] being: ...
Article
Full-text available
Managing and maintaining fishing ports is crucial for fishing activities. With the lack of budget and labor force, some fishing ports face the brink of being shut down. A first choice for demolishing would be fishing ports with low utilization, but this does not mean the port is totally useless. When severe weather condition, i.e., typhoons, is in the proximity of the coastal line, offshore fishing vessels will return to dock in ports for safety. When major ports are fully docked, some of the minor ports or less active ports will be chosen as the haven. These ports are seldom used and lowly utilized, but should still be maintained for safety issues as they serve as a sanctuary for extreme weather conditions. This research uses big data aggregation approach to identify haven ports by processing fishing vessel states using collected voyage data recorder (VDR) records and correlates this information with the typhoon information data source.
... The S-VDR is not required to store the same level of detail as the VDR but should maintain a store, in a secure and retrievable form, of information concerning the position, movement, physical status, command, and control of a ship over the period leading up to and following an incident. Table 1 summarizes the two types of data items to be saved on the VDR [6][7][8][9][10][11]. The interfaces for the items that are saved need to satisfy International Electrotechnical Commission standard IEC 61162. ...
Article
Full-text available
p>Identifying the causes of marine accidents is difficult because of problems in scene preservation, reenactment, and procuring of witnesses. Thanks to new regulations, larger vessels are now required to carry voyage data recorders (VDRs) and automatic identification systems (AISs). However, the content of these devices, which is created, stored, and managed digitally, has security vulnerabilities such as the potential for data modification. Therefore, when managing digital records it is important to guarantee reliability. To this end, we suggest a digital forensics-based digital records migration method using a hash algorithm to guarantee the integrity and authenticity of digital records.</p
... If the vessel moves out of range, the FAS will store the information in the memory, wait for the connection to be re-establish, and then resume the data transfer. All of the data will be relayed to a central control center for recording and further analysis, i.e., trajectory analysis [7] and catch analysis and behavior [8]. The required databases are processed and stored on the SD card. ...
Conference Paper
Full-text available
This paper will focus on developing and designing a portable fisheries assistant system (FAS), a micro fisheries activity assistant system which integrates operation zone detections, i.e., marine protected zones and closed fishing season areas, online digital crew declaration system, emergency distress calls, and trajectory recording. FAS combines external modules, i.e., GSM transmitter and receivers, GPS receivers, and external storage, and uses microdevices which is light and can be powered by a battery bank. This makes FAS suitable as a carry-on device for fishing rafts and sampans which do not have a stable source of electric power. Development of the software will also take account of using operations that consume less electricity to lengthen the operation time of FAS. The current development stage uses mobile network systems to relay information and is field-tested by going on the seas. Finally, research and experiments will be done to evaluate the capability of FAS integrated with a current existing central control system at the Fisheries Agency of Taiwan.
Conference Paper
Managing and maintaining fishing ports is crucial for fishing activities. Lacking budget and labor, some fishing ports are on the brink of being shut down. The first candidates for demolition would be fishing ports with low utilization, but low utilization does not mean a port is useless. When severe weather, i.e., a typhoon, approaches the coastline, offshore fishing vessels return to dock in ports for safety. When major ports are fully docked, some of the minor or less active ports are chosen as havens. These ports are seldom used and poorly utilized, but should still be maintained for safety reasons because they serve as sanctuaries in extreme weather conditions. This research uses a big data aggregation approach to identify haven ports by processing fishing vessel states from collected voyage data recorder (VDR) records and correlating this information with a typhoon information data source.
Conference Paper
The wide use of GPS sensors in smartphones encourages people to record their personal trajectories and share them with others on the Internet. A recommendation service is needed to help people process the large quantity of trajectories and select potentially interesting ones. GPS trace data is a new format of information, and few works focus on building user preference profiles from it. In this work we propose a trajectory recommendation framework and develop three recommendation methods, namely Activity-Based Recommendation (ABR), GPS-Based Recommendation (GBR), and Hybrid Recommendation. ABR recommends trajectories relying purely on activity tags. For GBR, we propose a generative model to construct user profiles based on GPS traces. Hybrid Recommendation combines ABR and GBR. Finally, we conducted extensive experiments to evaluate the proposed solutions, and the hybrid solution showed the best performance.
Conference Paper
In practical engineering, processing big data sometimes requires building a large number of physical models, and each physical model requires a corresponding mathematical model. This produces a large number of multivariate polynomials, whose effective reduction is currently a difficult problem. A novel algorithm is proposed to achieve the approximate factorization of multivariate polynomials with complex coefficients, based on the characteristics of multivariate polynomials. First, the multivariate polynomial is reduced to a binary polynomial; then the approximate factorization of the binary polynomial produces irreducible binary factors; finally, the irreducible binary factors are restored to irreducible multivariate factors. Because roots of unity are cyclic, selecting a root of unity as the reduction factor ensures that the coefficients do not grow during the reduction process. The Chinese remainder theorem is adopted in the corresponding reduction process, which lowers the computational complexity. The algorithm is based on the approximate factorization of binary polynomials and the calculation of an approximate greatest common divisor (GCD). The algorithm can solve the reduction of multivariate polynomials in large collections of mathematical models and can effectively obtain the null points of multivariate polynomials, providing a new approach for further analysis and explanation of physical models. The experimental results show that the irreducible factors obtained by this method are close to the real factors and are computed with high efficiency.
Article
This paper defines a new business cloud model to create an efficient high-frequency trading platform while validating the portability and cost-efficiency of cloud execution environments for financial operations. High-frequency trading systems, built to analyze trends in tick-by-tick financial data and thus to inform buying and selling decisions, demand speed and computing power. They also require high availability and scalability of back-end systems, which in turn require costly investments. The defined model uses a cloud computing architecture to fulfill these requirements, boosting availability and scalability while reducing costs and raising profitability. It incorporates data collection, analytics, trading, and risk management modules in the same cloud, all of which are the main components of a high-frequency trading platform.
Article
For many computer activities, user verification is necessary before the system will authorize access. The objective of verification is to separate genuine account owners from intruders or miscreants. In this paper, we propose a general user verification approach based on user trajectories. A trajectory consists of a sequence of coordinated inputs. We study several kinds of trajectories, including online game traces, mouse traces, handwritten characters, and traces of the movements of animals in their natural environments. The proposed approach, which does not require any extra action by account users, is designed to prevent the possible copying or duplication of information by unauthorized users or automatic programs, such as bots. Specifically, the approach focuses on finding the hidden patterns embedded in the trajectories produced by account users. We use a Markov chain model with a Gaussian distribution in its transitions to describe trajectory behavior. To distinguish between two trajectories, we introduce a novel dissimilarity measure combined with a manifold-learned tuning technique to capture the pairwise relationship between the two trajectories. Based on that pairwise relationship, we plug in effective classification or clustering methods to detect attempts to gain unauthorized access. The method can also be applied to the task of recognition and used to predict the type of trajectory without the user's predefined identity. Our experimental results demonstrate that the proposed method performs better than, or is competitive with, existing state-of-the-art approaches for both the verification and recognition tasks.
Article
An object can move at various speeds and in arbitrarily changing directions. Given a bounded area in which a set of objects is moving around, the objects exhibit some typical moving styles in different local regions because of the geography or other spatiotemporal conditions. Beyond the paths along which the objects move, we also want to know how different groups of objects move at various speeds. Therefore, given a set of collected trajectories spread over a bounded area, we are interested in discovering the typical moving styles of all the monitored moving objects in different regions. These regional typical moving styles are regarded as the profile of the monitored moving objects, which may help reflect the geoinformation of the observed area and the moving behaviors of the observed objects. In this paper, we present DivCluST, an approach to finding regional typical moving styles by dividing and clustering the trajectories in consideration of both spatial and temporal constraints. Unlike existing works that consider only the spatial properties or just the interesting regions of trajectories, DivCluST focuses on typical movements in local regions of a bounded area and takes temporal information into account when designing the criteria for trajectory dividing and the distance measurement for adaptive $k$-means clustering. Extensive experiments on three types of real data sets with specially designed visualization are presented to show the effectiveness of DivCluST.
Article
One of the key applications of a Smart Environment (one deployed with anonymous binary motion sensors) is user activity behavior analysis. The necessary prerequisite to finding behavior knowledge of users is to mine trajectories from the massive amount of sensor data. However, this becomes more challenging when the Smart Environment must use only non-invasive, binary sensing to protect user privacy. Furthermore, existing trajectory tracking algorithms mainly either track objects using sophisticated, invasive, and expensive sensors, or treat tracking as a Hidden Markov Model (HMM), which needs an adequate training data set to obtain the model's parameters [5]. It is therefore imperative to propose a framework that can distinguish different trajectories based only on data collected from anonymous binary motion sensors. In this paper, we propose a framework, Mining Trajectory from Anonymous Binary Motion Sensor Data (MiningTraMo), that can mine valuable and trustworthy motion trajectories from the massive amount of sensor data. The proposed solution makes use of both temporal and spatial information to remove the system noise and ambiguity caused by motion crossover and overlapping. MiningTraMo also introduces the Multiple Pairs Best Trajectory Problem (MPBT), which is inspired by the multiple pairs shortest path algorithm in [6], to search for the most probable trajectory using walking speed variance when there are several trajectory candidates. The time complexity of the proposed algorithms is analyzed, and their accuracy is evaluated by designed experiments that not only have ground truth but also represent typical situations in real applications. A mining experiment using real historical data from a smart workspace is also carried out to find the users' behavior patterns.
Article
In the context of the expansion of animal tracking and bio-logging, state-space models have been developed with the objective of characterising animals’ trajectories and understanding the factors controlling their behaviour. In the fisheries community, the electronic tagging of vessels, commonly referred to as Vessel Monitoring Systems (VMS), is developing and provides new insight for the understanding, analysis, and modelling of the trajectories of vessels and their prospecting behaviour. VMS data are thus a key to the proper definition of fishing effort, which remains a fundamental parameter of tuna stock assessments. In this context, we used the VMS data (hourly position records) of the French tropical tuna purse-seiners operating in the Indian Ocean to characterise three types of movement (states) on the VMS trajectories (stillness, tracking, and cruising). Based on empirical evidence and on the regular frequency of VMS acquisition, this was achieved by developing a Bayesian Hidden Markov model for the speeds and turning angles derived from the hourly steps of the trajectories. In a second phase, states were related to activities, disentangling stillness into fishing or stops at sea. Finally, the quality of the model's performance was rigorously quantified using observers’ data. Comparing model predictions with true activities allowed us to estimate that 10% of the hourly steps were misclassified. The assumptions and model choices are discussed, highlighting the fact that, because VMS data and observers’ data have different time resolutions, the effective use of validation data was troublesome. However, without validation, these analyses remain speculative. The validation part of this work represents an important step for the operational use of state-space models in ecology in the broad sense (predators’ tracking data, e.g. bird or mammal trajectories).
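The speeds and turning angles mentioned in this abstract can be derived from consecutive position fixes. The Python sketch below shows one common way to do so; it is not the authors' Bayesian Hidden Markov model, only the feature-derivation step. The spherical-Earth radius, the function names, and the sample coordinates are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean radius, spherical-Earth approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial bearing in degrees clockwise from north, in [0, 360)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlmb)
    return math.degrees(math.atan2(y, x)) % 360.0


def step_features(positions, hours_per_step=1.0):
    """Return (speeds in km/h, turning angles in degrees) for a list of fixes.

    Turning angles are wrapped to [-180, 180); positive means a turn to
    the right (clockwise), negative a turn to the left.
    """
    pairs = list(zip(positions, positions[1:]))
    speeds = [haversine_km(*a, *b) / hours_per_step for a, b in pairs]
    bearings = [bearing_deg(*a, *b) for a, b in pairs]
    turns = [((b2 - b1 + 180.0) % 360.0) - 180.0
             for b1, b2 in zip(bearings, bearings[1:])]
    return speeds, turns


# Hypothetical hourly fixes: one degree east along the equator, then one degree north.
speeds, turns = step_features([(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)])
print(speeds, turns)
```

A trajectory of n fixes yields n - 1 speeds and n - 2 turning angles, which would then serve as the observation sequence for a state model such as the one described above.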