Jurnal Teknologi Informasi dan Pendidikan
Volume 16, No. 1, March 2023
https://doi.org/10.24036/tip.v16i1.721
P.ISSN: 2086 4981
E.ISSN: 2620 6390
tip.ppj.unp.ac.id
Extract, Transform, Load Monitoring Sales Multi Marketplace
Leonardus Adityo Toto Pratomo1*, Deni Mahdiana1
1Universitas Budi Luhur, Jakarta, Indonesia
*Corresponding Author: 2211602178@student.budiluhur.ac.id
Article Information
ABSTRACT
Article history:
No. 721
Rec. May 29, 2023
Rev. August 28, 2023
Acc. September 16, 2023
Pub. September 24, 2023
Page. 174-188
This study designs a database system and handles online sales files for sales at a medium business scale in the Era of Society 5.0, in which online marketplaces make it easy for medium-sized businesses to sell their products. The process used previously calculated raw sales data in Microsoft Excel, consolidating raw data from three marketplace sources, which took considerable effort before the data reached the form of raw fact data. An Extract Transform Load (ETL) process in MS Access accelerates this manual process, so that business entities can view sales reports quickly and efficiently. The goals of implementing ETL are as follows: integrating a data mart of fact data stored in MySQL; proposing a prototype method by building a Microsoft Access program whose Extract Transform Load steps run automatically at the press of a button; and producing analysis data in fact tables from a three-dimensional data mart sourced from the three sets of raw marketplace data, with the table transformation process in the MySQL data warehouse exporting daily and period-specific reports. This study designs a combination of open-source systems and a Microsoft Access license for business entities that require sales analysis for multichannel sales (online or offline), in order to optimize inventory and provide products consistently through every sales channel.
This is an open-access article under the CC BY-SA license.
Keywords:
ETL
RDBMS
MySQL
Microsoft Access
Multi Marketplace
1. INTRODUCTION
The application of database technology began in the 1960s. At first, database systems stored data centrally in table or file formats that could be accessed by multiple users. In the 1970s, Edgar F. Codd, a mathematician and computer scientist at IBM, developed the relational model for databases, which later gave rise to SQL (Structured Query Language). In the 2000s, database technology continued to evolve with NoSQL databases [1].
The application of information technology plays a role in managing raw data for both large and small businesses: data can be imported and processed into the output desired at the management level. Open-source databases are used by business actors as an affordable alternative [2]; common examples are MySQL [1] and PostgreSQL. A simple design in a Microsoft Access program connected to an open-source database can be changed quickly from day to day, aided by the built-in query builder and macro tools [3].
A database is a structured collection of data [4]. MySQL is a multi-user, multithreaded SQL database management system (DBMS) whose language derives from SQL (Structured Query Language) [5]. As a DBMS built on basic SQL commands, MySQL is easy to use for data management and offers fast performance in data storage [6].
Many companies today offer their goods through a variety of online and offline sales
methods. A system that reliably and successfully combines and analyzes sales data from
many sources is essential. By combining open-source technologies and Microsoft Access
licensing, this research offers a solution. This study is vital for understanding sales patterns,
demand trends, and consumer preferences, as well as for managing product inventories.
This study provides a solution by combining sales data from several channels to assure
product information uniformity. Product consistency across all channels is also an issue.
Businesses may increase operational effectiveness, save time, and lower human error rates
by using a system that combines and analyzes sales data. Additionally, properly integrated and evaluated sales data aids day-to-day decision-making, opportunity identification, and strategic planning. There is therefore an urgent need for this study to help firms maximize multichannel sales, manage inventory effectively, and deliver consistent products across all sales channels. Accordingly, the goal of this project is to create a system combining an open-source platform and a Microsoft Access license for businesspeople who require multichannel (online and offline) sales analysis, in order to manage inventory and deliver goods reliably across all channels.
2. RESEARCH METHOD
This research analyzes the requirements for, and plans, an information system built around ETL using the prototype development method. The prototype method accelerates the stages of the software development life cycle (SDLC) [7] with CASE tools, visual programming languages, and generators that can immediately turn the analysis into reports to be implemented. The research begins with planning: examining files already downloaded from several marketplaces, collecting the data, analyzing each column, and designing the database architecture. The ETL design follows a logical process flow over the raw sales data that
arrive every day [8]. The programming language used is Visual Basic for Applications (VBA) in Microsoft Access, connected via Microsoft Open Database Connectivity (ODBC). Testing was carried out directly and incrementally on raw data of up to hundreds of thousands of rows in CSV [9] and Excel [10] files. The research flow is shown in Figure 1.
Figure 1 shows the stages and flow of the prototype method [11]. It starts from planning around the files downloaded from several marketplaces, then analyzing the database variables, designing, implementing data processing and transformation, and designing the data architecture, before entering the graphical user interface stage; the resulting report takes the form of raw data, with attention paid to the columns in the raw data. The ETL engine can then be implemented and tested by watching for column formats that change because of stray characters and spaces when data is imported into the ETL system, and for contents that often change dynamically.
Case studies on the ETL system are divided into three designs: the ETL engine system, the entity relationship diagram (ERD) and logical record structure (LRS) design, and the master report [12]. The theoretical and practical aspects studied earlier involve merging several tables and aligning tables of different shapes, as illustrated in the image below:
Figure 1. A segment of tables and processes carried out previously from 3 marketplaces
Aligning the three raw marketplace data formats proved inefficient in time and human resources when executed at the herbal store before being transformed into a report, which led the author to propose the prototype method. Within this ETL system, the ETL process begins with the identification of
files through renaming and incorporating the database into the Microsoft Access system,
turning the data into a format that can be analyzed according to user needs. An overview of
the system can be observed in Figure 2.
Figure 2. Theory and practice studied in the ETL engine system
Figure 2 above shows the sequence of ETL processes from marketplace to report. The data originally comes from marketplace sales and is downloaded, after which all of it is extracted and transformed. The ETL process equalizes the data columns and applies a data-scrubbing refinement step that detects duplicate data and divergent characters within each field, producing higher-quality data. Data that has passed through scrubbing is then loaded and saved to the herbal medicine store database, which serves as the data warehouse [2][13][14].
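The scrubbing step (equalizing fields and dropping doubled rows before load) can be sketched as follows. This is an illustrative Python stand-in for the Access-based process; the order fields are invented for the example:

```python
def scrub(rows):
    """Scrubbing pass: trim stray whitespace in every field and drop
    exact duplicate transactions, keeping the first occurrence."""
    seen, clean = set(), []
    for row in rows:
        row = {k: v.strip() for k, v in row.items()}
        key = tuple(sorted(row.items()))  # duplicates compare equal after trimming
        if key not in seen:
            seen.add(key)
            clean.append(row)
    return clean

orders = [
    {"order_id": "A1", "sku": "JAMU-01 ", "qty": "2"},  # trailing space
    {"order_id": "A1", "sku": "JAMU-01", "qty": "2"},   # duplicate once trimmed
    {"order_id": "A2", "sku": "JAMU-02", "qty": "1"},
]
clean_orders = scrub(orders)
print(len(clean_orders))  # 2
```

Only scrubbed rows proceed to the load stage, so the warehouse never sees the doubled transaction.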
In the ETL system's entity relationship diagram, sales data is divided into several tables in an RDBMS database that the prototype method can use through combined and denormalized queries; both query forms were tested on an assembled computer with a first-generation Core i3. The ETL uses DDL and DML queries such as INSERT, UPDATE, and DELETE, which run in 1 to 7 minutes on the herbal store's sales data, following the relational database management system (RDBMS) concept [15][16].
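The DDL and DML operations named above can be illustrated with a minimal, self-contained sketch. SQLite stands in here for the MySQL warehouse, and the table and column names are assumptions for illustration only:

```python
import sqlite3

# In-memory stand-in for the MySQL data warehouse.
con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE raw_sales (          -- DDL: define the raw sales table
        order_id   TEXT PRIMARY KEY,
        product_id TEXT NOT NULL,
        qty        INTEGER,
        net_value  INTEGER
    )""")
# DML: the three statement types the ETL engine issues.
con.execute("INSERT INTO raw_sales VALUES ('A1', 'JAMU-01', 2, 30000)")
con.execute("UPDATE raw_sales SET qty = 3 WHERE order_id = 'A1'")
con.execute("DELETE FROM raw_sales WHERE qty = 0")
qty = con.execute("SELECT qty FROM raw_sales WHERE order_id = 'A1'").fetchone()[0]
print(qty)  # 3
```

In the actual system these statements are generated by Access queries and fired against MySQL over ODBC; the sequence of operations is the same.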
Figure 3. Data Warehouse in the form of ERD
The data warehouse ERD in Figure 3 has nine tables, each with its function in the queries, interconnected by macros that run automatically from four graphical-user-interface buttons. The data warehouse diagram in Figure 3 stores data for all marketplaces; the detail tables for the several marketplaces store prefix data presented across columns and hold the material for data analysis [14].
Figure 4. LRS ETL System
The LRS of the ETL system shows the flow of data through the ETL process, from storage in the database to the exported raw data, which contains many sales transactions from several marketplaces, row by row and column by column.
The RDBMS data warehouse software is engineered on the LRS architecture. The data architecture consists of raw tables and master tables connected to the raw marketplace tables. There are two raw tables, namely the raw sales table and the raw staging table, and six normalization tables [17]: three hold raw fact data from the marketplaces, and three hold processed, normalized data, covering the product-matching dimension, the combined raw-fact dimension from the three marketplaces, the product master dimension, and the report dimension. The data warehouse schema design can be seen in Figure 4.
In the RDBMS data warehouse design of Figure 4 above, the raw sales table has several product fields, such as the product code, hna_net, and value_net, that are matched by relations to the normalization tables for problem-solving [18]. The product-match dimension table contains fields including Product_ID for the product, hna for the net price, qty for the sales quantity, and the marketplace code, and is joined against the normalization table of the product master. The raw normalization table serves as a temporary table for all data, connected on each field, to make it easier to automate each piece of data in the tables.
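The relation between the fact rows and the product-master dimension described above can be sketched with a foreign key and a join. SQLite again stands in for MySQL, and all identifiers are illustrative assumptions rather than the study's exact schema:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")
con.execute("""
    CREATE TABLE master_product (
        product_id TEXT PRIMARY KEY,
        name       TEXT,
        hna_net    INTEGER          -- net list price
    )""")
con.execute("""
    CREATE TABLE fact_sales (
        order_id    TEXT,
        product_id  TEXT REFERENCES master_product(product_id),
        marketplace TEXT,            -- 'blibli' | 'shopee' | 'tokopedia'
        qty         INTEGER
    )""")
con.execute("INSERT INTO master_product VALUES ('JAMU-01', 'Herbal Tonic', 15000)")
con.execute("INSERT INTO fact_sales VALUES ('A1', 'JAMU-01', 'shopee', 2)")

# Joining fact to dimension yields the sales value per transaction.
value = con.execute("""
    SELECT f.qty * m.hna_net
    FROM fact_sales f JOIN master_product m USING (product_id)
""").fetchone()[0]
print(value)  # 30000
```

The foreign key guarantees every fact row matches a product in the master dimension, which is what makes the automated matching between tables reliable.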
3. RESULTS AND DISCUSSION
The implementation is designed to prevent duplicate data during the transformation
process in each data mart from the three sources of raw data and dimension data. Therefore,
each relation has its distinct function. The table master_group dimension data serves as the
product master, and the table xmatch_product dimension data acts as the dimension for a
collection of products, with a foreign key linking the dimension table to the fact table. The
transformation query generates analysis data to support decision-making.
Below are three marketplace data sources for the data warehouse connected to the
master data.
In gure 5, the fact data from Blibli is related to the MySQL data warehouse as shown
below:
Figure 5. The Blibli data warehouse in the Microsoft Access ETL system
In the rst example above, it presents data that includes sales and quantity analysis
for products sold in the Blibli marketplace. In gure 6, the fact data from Shopee is related
to the MySQL data warehouse as shown below:
Figure 6. The Shopee data warehouse in the Microsoft Access ETL system
The second example above presents data covering sales and quantity analysis for products sold in the Shopee marketplace. In Figure 7, the fact data from Tokopedia is related to the MySQL data warehouse as shown below:
Figure 7. The Tokopedia data warehouse in the Microsoft Access ETL system
The third example above presents data covering sales and quantity analysis for products sold in the Tokopedia marketplace.
All of these queries are processed automatically in Microsoft Access using the macro tool, specifically macro code. The results of this study implement ETL and produce reports at minimal cost in hardware and human resources, because the system uses Microsoft Office, which is already in daily use, in an application program built from the previously designed RDBMS data warehouse prototype; the ETL engine and raw report data can then be used with pivot tables according to analysis needs. The ETL process works in two modes, a manual ETL process and an automatic ETL process that runs on a button click (not yet on a schedule) to execute the delete, import, and output steps. In both modes, the data has
already been run through the data-scrubbing process, so that the input data in the report is neat and matches what is needed for decision-making at the management level.
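The difference between the manual and automatic modes is essentially whether each step is invoked by hand or chained from one trigger, the way a single button click chains the Access macros. A minimal Python sketch of that orchestration (the step names and in-memory "tables" are illustrative assumptions):

```python
def step_delete(state):
    """Clear the staging area before a fresh import."""
    state["staging"] = []

def step_import(state, downloaded_rows):
    """Pull the downloaded marketplace rows into staging."""
    state["staging"].extend(downloaded_rows)

def step_export(state):
    """Load the staged rows into the warehouse."""
    state["warehouse"].extend(state["staging"])

def run_etl(state, downloaded_rows):
    """'Automatic' mode: one call runs every step in order; 'manual'
    mode would call the three steps individually."""
    step_delete(state)
    step_import(state, downloaded_rows)
    step_export(state)

state = {"staging": [], "warehouse": []}
run_etl(state, [{"order_id": "A1"}, {"order_id": "A2"}])
print(len(state["warehouse"]))  # 2
```

Keeping each step as a separate callable is what makes both modes possible from the same code path.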
The form design feature of the ETL data warehouse at Toko Herbal serves to describe the data, defining the objects contained in each table. Through this feature, data from the raw and normalized tables can be processed more accurately, covering the id, field name, data type, length, and description [19].
This form design feature is built with Visual Basic for Applications so that users only have to click four buttons. Four buttons are used so that, if there is a difference between the data and the query, the user can manually trace the source and destination tables of the extract process at the herbal store. The feature is also equipped with an automatic message field that reports the completion of each of the four buttons' processes: the query, delete, import/append, and export steps. The keys are integrated into Microsoft Access macros so that all query functions in each table can be connected between tables through primary and foreign keys. With this feature, users can also find out which fields the ETL process will act on in the Toko Herbal data warehouse database. The form design and table list are illustrated in Figure 4.
Figure 4. The form design feature of the ETL data warehouse
Figure 5. The Automatic Process ETL feature
The Automatic Process ETL feature produces an automated process that runs ETL more efficiently, without having to click through each query of the manual process [20].
The automatic ETL process runs the data extraction step, whose queries delete and import files as in Figure 4.2.1, then transforms and loads the data automatically from the files downloaded on a daily sales basis [8].
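Picking up the day's downloaded exports before the extract step can be sketched as a simple filename filter over the download folder. This is illustrative Python; the `marketplace_date.csv` naming pattern is an assumption, not the study's actual convention:

```python
from pathlib import Path
import tempfile

def find_daily_exports(folder: Path, date_tag: str):
    """Return the marketplace export files downloaded for one day,
    assuming a 'marketplace_<date>.csv' naming pattern."""
    return sorted(folder.glob(f"*_{date_tag}.csv"))

# Demo with a temporary stand-in for the download folder.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    for name in ("shopee_2023-03-01.csv", "tokopedia_2023-03-01.csv",
                 "shopee_2023-02-28.csv"):
        (folder / name).write_text("order_id,qty\n")
    todays = find_daily_exports(folder, "2023-03-01")
    print([p.name for p in todays])
```

Only the files matching the requested date are handed to the import queries, so yesterday's already-loaded export is not re-imported.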
Figure 6. Transform and load macros in Visual Basic for Applications
After the transform and load macros of Figure 6 are in place, the ETL process runs by clicking the button that connects the VBA syntax with the Microsoft Access macros. The automatic-ETL graphical user interface is a prototype of the Automatic ETL feature's display for the herbal marketplace store in Figure 4.1. An imported AutoNumber and the type of each field let the ETL determine whether there are duplicate transactions, conditioned by a primary key and foreign key, so deduplication happens automatically when the four buttons are clicked. After setting the Microsoft Access popup and the automatic compact-and-repair tool, the ETL process at the herbal store is carried out automatically according to what the user wants.
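The key-based duplicate guard described above can be shown concretely: a composite primary key rejects a re-downloaded transaction while still admitting the same order id from a different marketplace. SQLite stands in for MySQL and the schema is illustrative:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE fact_sales (
        marketplace TEXT,
        order_id    TEXT,
        qty         INTEGER,
        PRIMARY KEY (marketplace, order_id)  -- blocks double transactions
    )""")
rows = [
    ("shopee", "A1", 2),
    ("shopee", "A1", 2),   # re-downloaded duplicate: silently skipped
    ("blibli", "A1", 1),   # same order id, different marketplace: kept
]
con.executemany("INSERT OR IGNORE INTO fact_sales VALUES (?, ?, ?)", rows)
count = con.execute("SELECT COUNT(*) FROM fact_sales").fetchone()[0]
print(count)  # 2
```

MySQL offers the equivalent `INSERT IGNORE`, so the same load query can be re-run on overlapping daily downloads without doubling the facts.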
Report feature. This research builds a data analysis system in an RDBMS data warehouse, called ETL, intended for the management level to support decision-making in a company by examining pivoted data [21].
Figure 7. The ETL system's report feature
The ETL system in Figure 7 provides various types of reports, such as daily, monthly, and yearly sales reports for each marketplace, united by append or union queries built in the query builder tools from the raw table data [22].
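The union-style report query can be sketched as follows, with SQLite standing in for MySQL and three illustrative per-marketplace raw tables:

```python
import sqlite3

con = sqlite3.connect(":memory:")
for mp in ("blibli", "shopee", "tokopedia"):
    con.execute(f"CREATE TABLE raw_{mp} (sale_date TEXT, qty INTEGER)")
con.execute("INSERT INTO raw_blibli VALUES ('2023-03-01', 1)")
con.execute("INSERT INTO raw_shopee VALUES ('2023-03-01', 2)")
con.execute("INSERT INTO raw_tokopedia VALUES ('2023-03-02', 4)")

# Union-style append query uniting the three raw tables into one daily report.
report = con.execute("""
    SELECT sale_date, SUM(qty) AS total FROM (
        SELECT sale_date, qty FROM raw_blibli
        UNION ALL
        SELECT sale_date, qty FROM raw_shopee
        UNION ALL
        SELECT sale_date, qty FROM raw_tokopedia
    ) GROUP BY sale_date ORDER BY sale_date
""").fetchall()
print(report)  # [('2023-03-01', 3), ('2023-03-02', 4)]
```

Grouping the union by month or year instead of day yields the monthly and yearly variants of the same report.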
4. CONCLUSION
From the implementation and testing carried out in this study, it can be concluded that an ETL engine that can work manually or automatically makes it easier for upper management to make decisions. The ETL engine built here is equipped with a data-scrubbing feature, which is very important for minimizing wrong input by the user and simplifies the ETL process. The ETL engine software is also equipped with a simple button-based user interface, and the coding is kept simple, following the query builder and macros with consistent naming in each process; this helps users of the raw technical data find the information held in each table. The ETL engine can extract, transform, and load from the RDBMS data warehouse and finalize data reports from the database, after which users can pivot the tables as needed. The weakness of this system is that it cannot show process logs, so the report data view cannot display the history of the processes the ETL engine has carried out. Nevertheless, at the small business level (Falchuk et al., 2021) the system built in this study can display sales transaction column data from
several marketplaces, and comparison data such as stock and costs can easily be added for herbal stores.
REFERENCES
[1] A. Rahmat Iskandar, A. Junaidi, and A. Herman, “Extract, Transform, Load sebagai
upaya Pembangunan Data Warehouse,” 2019.
[2] I. Putu, W. Prasetia, I. Nyoman, and H. Kurniawan, “Implementasi ETL (Extract,
Transform, Load) pada Data warehouse Penjualan Menggunakan Tools Pentaho,”
TIERS Information Technology Journal, vol. 2, no. 1, pp. 39-47, 2021, [Online]. Available:
https://journal.undiknas.ac.id/index.php/tiers
[3] E. Abdelhamid, N. Tsikoudis, M. Duller, M. Sugiyama, N. E. Marino, and F. M. Waas, “Adaptive Real-time Virtualization of Legacy ETL Pipelines in Cloud Data Warehouses,” 2023, doi: 10.48786/edbt.2023.64.
[4] B. Rianto, F. Yunita, M. Amin, and R. Rahma, “Information Systems of Official Trips at
General Election Commission Office (KPU),” Jurnal Teknologi Informasi dan Pendidikan,
vol. 14, no. 1, pp. 1-9, 2021.
[5] D. Novaliendry and V. D. Puteri, “E-Retail Percetakan Anambaleh Desain
Menggunakan Framework Laravel,” Jurnal Teknologi Informasi dan Pendidikan, vol. 13,
no. 1, pp. 10-19, 2020.
[6] F. Oktavia and Y. Hendriyani, “Perancangan E-Report Financial System Berbasis Web (Studi Kasus: Toko Kripik Balado Mahkota),” Jurnal Teknologi Informasi dan Pendidikan, vol. 13, no. 1, pp. 87-95, 2020.
[7] M. Hendayun, E. Yulianto, J. F. Rusdi, A. Setiawan, and B. Ilman, “Extract transform
load process in banking reporting system,” MethodsX, vol. 8, Jan. 2021, doi:
10.1016/j.mex.2021.101260.
[8] D. Andriansyah, “Implementasi Extract-Transform-Load (ETL) Data Warehouse
Laporan Harian Pool”.
[9] K. Y. Cheng, S. Pazmino, and B. Schreiweis, “ETL Processes for Integrating Healthcare
Data - Tools and Architecture Patterns,” in Studies in Health Technology and Informatics,
IOS Press BV, Nov. 2022, pp. 151-156. doi: 10.3233/SHTI220974.
[10] A. D. Barahama and R. Wardani, “Utilization Extract, Transform, Load for
Developing Data Warehouse in Education Using Pentaho Data Integration,” in Journal
of Physics: Conference Series, IOP Publishing Ltd, Dec. 2021. doi: 10.1088/1742-
6596/2111/1/012030.
[11] F. L. Gaol, L. Abdillah, and T. Matsuo, “Adoption of Business Intelligence to Support
Cost Accounting Based Financial Systems-Case Study of XYZ Company,” Open
Engineering, vol. 11, no. 1, pp. 14-28, Jan. 2020, doi: 10.1515/eng-2021-0002.
[12] aljanabi alharqi haider, “An Optimal Warehouse Design for Crime Dataset,”
International Journal of Advanced Trends in Computer Science and Engineering, vol. 9, no. 5,
pp. 9080-9088, Oct. 2020, doi: 10.30534/ijatcse/2020/313952020.
[13] G. Khan Ghajar, M. Plager, and S. Hallberg, “Conversational And Live Interactions
Development And Marketplace Distribution System And Process,” 2018, doi: 10.1109/I.
[14] R. Venkatakrishnan, “Design, Implementation, and Assessment of Innovative Data
Warehousing; Extract, Transformation, and Load(ETL); and Online Analytical
Processing(OLAP) on BI,” International Journal of Database Management Systems, vol. 12,
no. 3, pp. 1-6, Jun. 2020, doi: 10.5121/ijdms.2020.12301.
[15] J. C. Quiroz, T. Chard, Z. Sa, A. Ritchie, L. Jorm, and B. Gallego, “Extract, Transform,
Load Framework for the Conversion of Health Databases to OMOP,” 2021, doi:
10.1101/2021.04.08.21255178.
[16] Mr. R. S. Ratnaparkhi, “The Extract Transform Load (ETL) Process and Optimization
using Ab Initio,” Int J Res Appl Sci Eng Technol, vol. 7, no. 5, pp. 1714-1717, May 2019,
doi: 10.22214/ijraset.2019.5287.
[17] D. Chatzistefanou, “Data Warehousing in Business Intelligence and ETL Processes,” 2023.
[18] A. Sychev, Data Analytics and Management in Data Intensive Domains, vol. 1427, Communications in Computer and Information Science. Cham: Springer International Publishing, 2021. doi: 10.1007/978-3-030-81200-3.
[19] Ardhian Agung Yulianto, “Extract Transform Load process in distributed database
academic datawarehouse,” 2021.
[20] S. M. Bouknight, “Automated Etl Workflow,” 2022, doi: 10.1109/W1.2016.0083.
[21] B.-E. B. Semlali, C. El Amrani, and G. Ortiz, “SAT-ETL-Integrator: an extract-
transform-load software for satellite big data ingestion,” J Appl Remote Sens, vol. 14, no.
01, p. 1, Jan. 2020, doi: 10.1117/1.jrs.14.018501.
[22] A. Ta’a, N. Ishak, E. M. Elias, and N. Mahidin, “An Impact Analysis Of Extract Transform Load Process For Maintaining The System Of Data Warehouse,” Journal of Information System and Technology Management, vol. 7, no. 27, pp. 168-186, Sep. 2022, doi: 10.35631/jistm.727014.
ResearchGate has not been able to resolve any citations for this publication.
Conference Paper
Full-text available
Extract, Transform, and Load (ETL) pipelines are widely used to ingest data into Enterprise Data Warehouse (EDW) systems. These pipelines can be very complex and often tightly coupled to a given EDW, making it challenging to upgrade from a legacy EDW to a Cloud Data Warehouse (CDW). This paper presents a novel solution for a transparent and fully-automated porting of legacy ETL pipelines to CDW environments.
Chapter
Full-text available
Improving the interoperability of healthcare information systems is a crucial clinical care issue involving disparate but coexisting information systems. However, healthcare organizations are also facing the dilemma of choosing the right ETL tool and architecture pattern as data warehouse enterprises. This article gives an overview of current ETL tools for healthcare data integration. In addition, we demonstrate three ETL processes for clinical data integration using different ETL tools and architecture patterns, which map data from various data sources (e.g. MEONA and ORBIS) to diverse standards (e.g. FHIR and openEHR). Depending on the project’s technical requirements, we choose our ETL tool and software architecture pattern to boost team efficiency.
Article
Full-text available
The Extract Transform Load (ETL) process involves extracting data from database sources, transforming them into a suitable form for research analysis, and then loading it into a data warehouse (DW) to support effective decision-support implementation. To maintain the target of the data warehouse, several issues are discussed in the DW life cycle, ETL processes, and impact analysis for maintaining the DW. This research focuses on the issue high frequency of data changes makes ETL processes difficult to propagate the data changes and to maintain the changes history in the DW life cycle. Therefore, the focus issues of this research are identifying factors for frequent data changes that occur in data sources and the DW structure that need to be modified by performing impact analysis. The general factors of data changes in DW were identified by questionnaire from 41 respondents, and the factor of impact analysis was evaluated using the statistic test method called Kruskal Wallis H Test to make a comparison between the impact analysis factor and the category of users. The aims of this research are to perform the impact analysis of the DW process in order to maintain the DW system and to achieve DW with the right requirements definition. In addition, this will help users working with the DW to understand the elements of impact analysis in DW, especially on how to ensure the DW process runs efficiently and successfully. Therefore, the database administrators, data analysts, and DW developers can utilize these research findings as a guideline to deal with the data changes in the DW process.
Article
Full-text available
Pool Daily Report (LHP) is a report used by the Operations department and other relevant departments to find out the position, condition, and availability of rental units. In the process, this report is formed by several data sources including from the main system and mobile applications. Differences in data structure and format become the next obstacle faced in the data integration process. The management decision-making process is hampered because the information provided is less relevant, because some of the data is in another database that is different from the database used by the company. The implementation of a data warehouse is one of the efforts to integrate data. Extract-transform-load (ETL) is a method that can be used in a data warehouse to extract, transform, and save data back into the prepared target table. The result of this research is the implementation of the ETL process with a snowflake approach to produce the required dimensions, which are then related to form reports that are in accordance with the wishes of the user.
Article
Full-text available
Data science and analytics represent one of the most emerging fields nowadays. Collecting, storing and analyzing the data are challenging issues in the field since they require the most advanced techniques and technologies. Data Warehouse and Data Marts represent some solutions for collecting, storing and accessing the data. Good Warehouse design leads to better analysis results. Among different application fields of the data, crime data is an important and complex discipline that contains a number of complex relationships between its contents, a wide range of applications and its crucial importance. The aim of the work in this paper is building an optimal Data warehouse for crime dataset using real crime data collected from the internet. Among the different DW modules available in this field galaxy module is used in this work. The data warehouse will support the decision-making process for lawmaker and police departments by understanding crime subjects, and statistics that allow them to track actions, foretell the probability of occurring crimes and efficiently use supplies which are inverted in this paper. The proposed design of the DW shows more reliability, better storing and accessing capabilities and lower anomalies among the other designs. The proposed design was supported with a crime database design to remove heterogonous of the data and to apply some preprocessing issues from which they require data is extracted, transformed and loaded (ETL) into the warehouse. Finally, more than six million high quality, clean, and preprocessed of crime records data are available for the researchers.
Article
Full-text available
The General Election Commission (KPU) of Indragiri Hilir Regency has not had a dedicated application to help financial staff prepare warrants for official trips; until recently these were produced in Microsoft Word. The researchers therefore built a system to simplify the process. The study uses the waterfall method to develop the system, covering the stages of needs analysis, data collection, application development, testing, and implementation. Data were collected through observation, interviews, and literature study. The system is modeled with UML (Unified Modeling Language) and built as a web application in PHP (Hypertext Preprocessor) with a MySQL database. The results make it more straightforward for financial staff to prepare official travel warrants, and easier to find information about past official travel when needed.
Preprint
Full-text available
Objective: Develop an extract, transform, load (ETL) framework for the conversion of health databases to the Observational Medical Outcomes Partnership Common Data Model (OMOP CDM) that supports transparency of the mapping process, readability, refactoring, and maintainability. Materials and Methods: We propose an ETL framework that is metadata-driven and generic across source datasets. The ETL framework reads mapping logic for OMOP tables from YAML files, which organize SQL snippets in key-value pairs that define the extract and transform logic to populate OMOP columns. Results: We developed a data manipulation language (DML) for writing the mapping logic from health datasets to OMOP, which defines mapping operations on a column-by-column basis. A core ETL pipeline converts the DML in YAML files and generates an ETL script. We provide access to our ETL framework via a web application, allowing users to upload and edit YAML files and obtain an ETL SQL script that can be used in development environments. Discussion: The structure of the DML and the mapping operations defined in column-by-column operations maximizes readability, refactoring, and maintainability, while minimizing technical debt, and standardizes the writing of ETL operations for mapping to OMOP. Our web application allows institutions and teams to reuse the ETL pipeline by writing their own rules using our DML. Conclusion: The research community needs tools that reduce the cost and time effort needed to map datasets to OMOP. These tools must support transparency of the mapping process for mapping efforts to be reused by different institutions.
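The metadata-driven idea in this abstract — mapping logic declared column-by-column as SQL snippets in key-value pairs, from which a generic pipeline generates the ETL script — can be sketched as below. The dictionary standing in for a parsed YAML file, the source table, and all column expressions are hypothetical illustrations, not the paper's actual DML.

```python
# Would normally be loaded from a YAML mapping file; each key is an OMOP
# column and each value is the SQL expression that populates it.
mapping = {
    "person": {
        "person_id":         "src.patient_id",
        "year_of_birth":     "CAST(strftime('%Y', src.birth_date) AS INT)",
        "gender_concept_id": "CASE src.sex WHEN 'M' THEN 8507 ELSE 8532 END",
    }
}

def generate_insert(table, columns, source="source_patients src"):
    """Turn a column->expression mapping into a single INSERT..SELECT."""
    cols = ", ".join(columns)
    exprs = ",\n       ".join(columns.values())
    return f"INSERT INTO {table} ({cols})\nSELECT {exprs}\nFROM {source};"

sql = generate_insert("person", mapping["person"])
print(sql)
```

Keeping the mapping in data rather than code is what makes the pipeline reusable: another institution can supply its own YAML rules without touching the generator.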
Article
Full-text available
Banks must maintain, calculate, and monitor liquidity using the Liquidity Coverage Ratio (LCR) indicator. In Indonesia, it is reported daily, monthly, or quarterly online through a template prepared by the Financial Services Authority (OJK). This reporting must be accurate and on time, or the bank is subject to penalties. For banks that still process the LCR semi-automatically, this reporting regime is an obstacle they continue to face. This article discusses an automation method developed using the Extract Transform Load (ETL) concept with a waterfall software development model, so that daily reports are generated automatically. It proposes a methodology for anticipating problems in integrating banks with regulators by applying one of the Basel III frameworks, based primarily on Indonesian case studies. The finding of the research is a method for producing the LCR report through ETL; the proposed method proved successful in processing the LCR in the banking industry and is offered as a solution and recommendation for banks completing Basel III LCR reporting.
• Maintain, calculate, and monitor liquidity using the Liquidity Coverage Ratio (LCR) indicator in banking.
• An automation process method developed using the Extract Transform Load (ETL) concept.
• A recommendation for banks to complete LCR reporting through the ETL method.
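One transform step of the kind of daily ETL this abstract describes can be sketched as computing the LCR from already-extracted balances. The field names and figures below are invented; the formula follows the Basel III definition (LCR = high-quality liquid assets / total net cash outflows over 30 days, with recognised inflows capped at 75% of gross outflows), which should be checked against the regulator's exact template before use.

```python
def compute_lcr(hqla: float, outflows: float, inflows: float) -> float:
    """LCR = HQLA / net cash outflows; Basel III caps recognised
    inflows at 75% of gross outflows."""
    net_outflows = outflows - min(inflows, 0.75 * outflows)
    return hqla / net_outflows

# Hypothetical extracted daily balances (in any consistent currency unit).
daily = {"hqla": 1_200.0, "outflows": 1_000.0, "inflows": 900.0}
lcr = compute_lcr(**daily)
print(f"{lcr:.0%}")  # inflows capped at 750 -> net outflows 250 -> 480%
```

In a full pipeline this function would sit between the extraction of balance-sheet positions and the load of the finished figure into the OJK report template.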
Article
This paper provides a general overview of the Extract, Transform, and Load (ETL) process as the data input for multidimensional modeling of a data mart and data warehouse. It was written by implementing a database from Online Transaction Processing (OLTP) into an Online Analytical Processing (OLAP) database, using the open-source classicmodels database from MySQL. The method used in this research is to perform the Extract, Transform, and Load (ETL) process on the classicmodels data, carrying out all three steps from the OLTP database into the OLAP database. The output of this research is a fact order table containing data from all of the dimension tables built for the classicmodels data, using the Pentaho Data Integration (Kettle) software and the MySQL database management system.
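A rough Python/sqlite3 analogue of the Kettle flow this abstract describes: dimension tables are populated first, then a fact-order row is built by looking up their surrogate keys. The tables and sample values below only loosely echo the classicmodels schema and are partly hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE dim_product  (product_key  INTEGER PRIMARY KEY, code TEXT UNIQUE);
CREATE TABLE fact_order   (customer_key INTEGER, product_key INTEGER,
                           quantity INTEGER, amount REAL);
""")

# OLTP-style source row: natural keys and line-item detail, no surrogate keys.
oltp_order = {"customer": "Mini Gifts", "product": "S10_1678",
              "quantity": 30, "price": 95.70}

# Load dimensions, then resolve surrogate keys for the fact row.
conn.execute("INSERT OR IGNORE INTO dim_customer (name) VALUES (?)",
             (oltp_order["customer"],))
conn.execute("INSERT OR IGNORE INTO dim_product (code) VALUES (?)",
             (oltp_order["product"],))
ck = conn.execute("SELECT customer_key FROM dim_customer WHERE name = ?",
                  (oltp_order["customer"],)).fetchone()[0]
pk = conn.execute("SELECT product_key FROM dim_product WHERE code = ?",
                  (oltp_order["product"],)).fetchone()[0]
conn.execute("INSERT INTO fact_order VALUES (?, ?, ?, ?)",
             (ck, pk, oltp_order["quantity"],
              oltp_order["quantity"] * oltp_order["price"]))

amount = conn.execute("SELECT amount FROM fact_order").fetchone()[0]
print(round(amount, 2))  # 2871.0
```

In Pentaho the same lookup-then-insert pattern is typically expressed with "Dimension lookup/update" and "Table output" steps rather than hand-written SQL.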
Article
A company operating commercially needs to analyze its sales performance; by doing so, it can improve that performance. One way to analyze sales performance is to collect historical sales data and process it into information about the company's sales results that can serve as a reference for decision-making. Here the authors analyze sales data from a superstore in the United States, covering products, sales segments, sales transactions, and so on. Once all the data needed to build the data warehouse has been collected, the next process is ETL (Extract, Transform, and Load), carried out with Pentaho. The extraction stage covers a single data source: sales data in an Excel file. After extraction, the transformation stage applies several changes to the extracted data to make it more consistent and uniform, in line with the needs of the data warehouse. After transformation, the final result of the ETL process is a simple data warehouse containing Customer, Home Office, and Corporate data, which is then loaded into the data warehouse and presented in a MySQL database and a Microsoft Excel file.
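The transformation stage this abstract describes — making extracted rows consistent and uniform before loading — can be sketched minimally as below. The column names and values are invented for illustration and are not from the actual superstore dataset.

```python
# Hypothetical rows as extracted from the Excel source, with inconsistent
# casing and whitespace in the segment column.
raw = [
    {"customer": "Aaron", "segment": " consumer "},
    {"customer": "Beth",  "segment": "HOME OFFICE"},
    {"customer": "Chad",  "segment": "corporate"},
]

def transform(rows):
    """Uniform the segment labels so the warehouse sees consistent values."""
    cleaned = []
    for r in rows:
        seg = r["segment"].strip().title()  # trim whitespace, normalize case
        cleaned.append({"customer": r["customer"], "segment": seg})
    return cleaned

rows = transform(raw)
segments = sorted({r["segment"] for r in rows})
print(segments)  # ['Consumer', 'Corporate', 'Home Office']
```

After this step, loading can safely split or group the rows by the Consumer, Corporate, and Home Office segments the abstract mentions.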