Data Warehousing - Science topic
Publications related to Data Warehousing (8,365)
The rapid growth of the ready-to-wear industry has created a need for continuous improvement, along with the necessity to shorten production times and increase quality. The processes in this industry comprise a series of sequential activities carried out by machines and workers in a specific order. Particularly before cutting, checking model inform...
This article addresses the importance of HaaS (Hadoop-as-a-Service) in cloud technologies, with specific reference to its usefulness in big data mining for environmental computing applications. The term environmental computing refers to computational analysis within environmental science and management, encompassing a myriad of techniques, especia...
The large volume of data in systems in the collection area leads to the lack of adequate management of information, as well as dissatisfaction on the part of the user. The purpose of the study is to implement business intelligence (BI) technology to improve the effectiveness of the information and the satisfaction of the attention of the users of a...
Introduction
Traumatic brain injury (TBI) is a leading cause of death and disability in children, but data on the longitudinal healthcare and financial needs of pediatric patients is limited in scope and duration. We sought to describe and predict these metrics following acute inpatient treatment for TBI.
Methods
Children surviving their initial i...
The integration of genomic medicine within mainstream patient care promises advances in healthcare and potential benefits for disease prediction and personalised treatment approaches. This paper explores the challenges of integrating genomic medicine within the UK’s National Health Service (NHS) and potential solutions for alignment with the NHS’s...
The purpose of this study was to determine the real-world incidence and predictors of additional vertebroplasty or balloon kyphoplasty after initial vertebral augmentation, as a proxy for subsequent symptomatic vertebral fracture. Of patients, 15.5% underwent subsequent vertebral augmentation. The patient’s comorbidities are strongly associated wit...
In the era of data-driven decision-making, effective data management and data warehousing are critical to the success of Management Information Systems (MIS). This review explores recent advancements in data warehousing technologies and their transformative impact on MIS. Key topics include the fundamentals of data warehousing, advances in big data, c...
Purpose
Chemotherapy in combination with trastuzumab is the standard neoadjuvant and adjuvant therapy for human epidermal growth factor receptor 2 (HER2)-positive breast cancer (BC). Assessing the regimens administered to patients with HER2-positive BC in the real world is lacking. We evaluated neoadjuvant and adjuvant regimen patterns among HER2-p...
Objectives: This study described real-world patient characteristics and outcomes among selpercatinib-treated patients in the United States, using the Flatiron Health electronic health record-derived deidentified database (FHD) for advanced/metastatic non-small cell lung cancer (a/mNSCLC) and Optum’s de-identified Clinformatics® Data Mart Database (...
A list of diagnosis codes mapped to CDC-defined high-risk conditions for severe COVID-19 outcomes is currently not available in the literature. We reviewed the CDC list of underlying conditions associated with severe COVID-19 and a coding expert and two clinicians mapped the relevant high-risk conditions to the appropriate ICD-10-CM codes. We addit...
The current investigation aimed to develop a novel approach for risk prediction modeling of clinical outcomes in common diseases based on computational and human intelligence techniques with no a priori input on risk factors using real-world individual patient-level data from administrative claims. Bootstrapping multivariable Cox regression and ant...
Purpose
This study assessed the clinical and economic burden of geographic atrophy (GA) using real-world data from elderly patients with Medicare Advantage plans in the United States.
Patients and Methods
A retrospective cohort design of patients with GA only, GA + visual impairment (GA + VI), GA + blindness (GA + B), and patients without GA were...
Background
Checkpoint inhibitors (CPIs) have significantly enhanced cancer treatment, yet formation of antidrug antibodies (ADA) can reduce drug efficacy and lead to increased immune toxicity.¹ Identifying biomarkers predictive of ADA formation is crucial to optimize administration of CPIs. Human leukocyte antigen (HLA) genes are responsible for pr...
Background
Total hip, knee and shoulder arthroplasties (THKSA) are increasing due to expanding demands in ageing population. Material surveillance is important to prevent severe complications involving implantable medical devices (IMD) by taking appropriate preventive measures. Automating the analysis of patient and IMD features could benefit physi...
Background: Early detection of cognitive decline during the preclinical stage of Alzheimer's disease is crucial for timely intervention and treatment. Clinical notes, often found in unstructured electronic health records (EHRs), contain valuable information that can aid in the early identification of cognitive decline. In this study, we utilize adv...
Integrating Generative AI (GenAI) with real-time data streaming analytics on Google Cloud represents a groundbreaking approach to harnessing data for immediate and impactful insights. As businesses face the challenge of processing massive volumes of continuously generated data, traditional analytics solutions often fall short in meeting the demands...
The integration of Generative Artificial Intelligence (GenAI) with real-time data streaming capabilities has emerged as a transformative strategy in advanced data analytics. With the continuous growth of data volumes, enterprises seek efficient ways to process, analyze, and derive actionable insights in real time. Google Cloud, known for its scalab...
This paper presents a detailed analysis of three widely-used data storage formats—Parquet, Avro, and ORC— evaluating their performance across key metrics such as query execution, compression efficiency, data skipping, schema evolution, and throughput. Each format offers distinct advantages depending on the nature of the workload. Parquet is optimiz...
Innovating Real-Time Data Analytics Technology: Opportunities and Challenges is a Group project presented as part of the Tsinghua University Global program Innovating Education and Entrepreneurship for the Digital Economy (IEDE), Autumn program from Sept 9th-Oct 4th 2024. This presentation explores the rapidly evolving field of real-time data analy...
In an increasingly data-driven world, effective visualization of warehouse data is essential for optimizing operations, enhancing decision-making, and improving overall efficiency. This paper explores the role of cloud-based dashboards and reporting tools in visualizing warehouse data, highlighting their capabilities to provide real-time insights,...
Studies of fluoroquinolone (FQ) safety across indications show increased collagen/neurological adverse event (AE) risk, yet patients still receive FQs for uncomplicated urinary tract infections (uUTIs). This retrospective, cohort study investigated the risk of collagen/neurological AEs of special interest (AESIs) with short-term FQ use versus stand...
Sotorasib was the first drug approved for adults with Kirsten rat sarcoma G12C-mutated locally advanced/metastatic non-small cell lung cancer (NSCLC) who received prior systemic therapy in the US. This study aimed to provide initial real-world evidence on patient characteristics, treatment patterns, healthcare resource utilization (HCRU), and healt...
In this article, we will discuss an application of Business Intelligence (BI) in analyzing data from the Brazilian educational system, specifically working with INEP (The Instituto Brasileiro de Geografia e Estatística — IBGE). This paper then discusses BI implementation's positive contributions and critical challenges in this domain. From the cons...
Background
In adults aged 50 + years, vaccine-preventable diseases (VPDs) pose a significant health burden and can lead to additional ‘downstream effects’ of infection beyond the acute phase e.g., increasing the risk for non-communicable disease and exacerbating chronic conditions. The aim was to understand and quantify the burden of VPD downstream...
Eosinophilic granulomatosis with polyangiitis (EGPA) is an eosinophil-associated disease (EAD) characterized by inflammation in small- to medium-sized blood vessels. In the REal-world inVestigation of Eosinophilic-Associated disease overLap (REVEAL) study, overlap among 11 EADs was assessed. In the present sub-study, we evaluated EGPA overlap with...
BACKGROUND
Atrial fibrillation (AF) is associated with an increased risk of stroke, yet the limitations of conventional monitoring have restricted our understanding of AF burden risk thresholds. Predictive algorithms incorporating continuous AF burden measures may be useful for predicting stroke. This study evaluated the performance of temporal AF...
Purpose: The study discusses the increasing challenges faced by financial services due to fast-growing transaction, regulatory, and client data, and the need for more flexible, scalable, and affordable data management systems. It examines the potential of Snowflake, a cloud-based data warehousing platform, to address these issues through its multi-...
Background
Real-world data on the use, healthcare resource utilization (HCRU), and associated costs of antifibrotic therapies in patients with idiopathic pulmonary fibrosis (IPF) are limited.
Objectives
To assess the prevalence of antifibrotic treatment, characteristics of patients receiving treatment, discontinuation rates, and HCRU and costs ass...
Aims
Racial disparities exist in clinical outcomes for valvular heart disease (VHD). It is unknown whether clinician segregation contributes to these disparities. Among an adequately insured population, we evaluated the relationship between clinician segregation in a hospital and receipt of care by a cardiologist according to patient race. We also...
This article examines the growing trend of cloud-based data integration and warehousing solutions in response to the exponential growth of data generation and the need for scalable, flexible analytics capabilities. It explores the key drivers behind cloud adoption, including data volume increases, demand for real-time insights, cost efficiencies, a...
This article examines the critical components of an effective portfolio for data engineering professionals specializing in Snowflake and Teradata platforms. As the data landscape evolves, the ability to showcase practical skills alongside theoretical knowledge has become paramount for career advancement. Through an analysis of industry trends and e...
This article explores the future of data warehousing, discussing emerging trends, technologies, and challenges shaping the landscape of data management and analytics in the era of big data, cloud computing, and artificial intelligence. It delves into topics such as the rise of cloud data warehouses, the increasing adoption of artificial intelligenc...
Background
Findings regarding the protective effect of Angiotensin II receptor blockers (ARBs) against Alzheimer’s disease and related dementias (AD/ADRD) and cognitive decline have been inconclusive.
Methods
Individuals with hypertension who do not have any prior ADRD diagnosis were included in this retrospective cohort study from Optum’s de-iden...
As generative AI applications gain traction across various industries, the demand for scalable and robust data architectures has become paramount. This paper presents a comparative analysis of Amazon Web Services (AWS) and Google Cloud Platform (GCP) solutions tailored for generative AI workloads. Both cloud providers offer a comprehensive suite of...
The explosion of data in today's digital landscape necessitates effective strategies for real-time analytics to derive actionable insights and drive informed decision-making. Cloud platforms like Amazon Web Services (AWS) and Google Cloud offer powerful tools and services that facilitate real-time data processing and analytics. These platforms empo...
Business intelligence greatly aids decision-making by providing a complete view of the business and of its workflow, offering decision-makers several alternatives. The research aimed to speed up decision-making in the company by implementing a...
In the dynamic landscape of sports marketing, the ability to effectively ingest and manipulate data is critical for driving decision-making and enhancing competitive advantage. This paper explores strategies for optimizing data ingestion and manipulation processes tailored to sports marketing analytics. We begin by analyzing the complexities of dat...
Hidradenitis suppurativa (HS) is a painful, inflammatory skin disease associated with a high disease burden and long diagnostic delay. Prevalence estimates of HS vary widely in the literature owing to differing estimation methodologies. This study aimed to apply stepwise algorithms to estimate the prevalence of possible/diagnosed cases of HS in the...
Adilmart CCK is a fast-growing retail company that needs the right strategy for setting standard cost prices in order to optimize production and maximize profit. This study aims to analyze and design a Business Intelligence system that can help the company set standard cost prices...
Introduction Memphis, Tennessee, ranks among the top U.S. cities for breast cancer mortality, especially among African American women. Breast cancer presents a significant public health challenge, exacerbated by various comorbid conditions that complicate patient outcomes. The Charlson Comorbidity Index (CCI) predicts ten-year mortality risk based...
Background: Combating antibiotic resistance, exacerbated by widespread unnecessary outpatient antibiotic prescriptions, necessitates innovative stewardship solutions. Audit and feedback reports are effective but often resource heavy. We introduced a free, open-source system, Outpatient Automated Stewardship Information System (OASIS©), for automati...
Background: CAP is often inappropriately treated with agents active against multidrug-resistant organisms (MDRO; methicillin-resistant S. aureus [MRSA] and P. aeruginosa [PSA]) and for prolonged duration. We assessed the relationship between antibiotic use with ATS/IDSA guideline-unjustified empiric and definitive MDRO therapy and prolonged duratio...
Amplifying the utilization of big data in healthcare analytics through cloud and Snowflake migration presents a significant opportunity to enhance data-driven insights and decision-making in the healthcare sector. This migration makes it easier to move large amounts of healthcare data to the cloud. Applications deployed in the cloud are scalable for in...
Enterprise decision-making entails results extracted through Online Analytical Processing (OLAP) queries. The performance of result retrieval from data warehouse is a critical factor. Frequent OLAP queries have to access warehouse data repeatedly for generating the same results. To avoid executing the same OLAP query and access data warehouse, our...
Purpose: This research examines the utilization of machine learning (ML) in data warehousing systems and the extent to which it will transform business intelligence and analytics. It aims to know how ML improves conventional data warehousing systems to support prediction and forecasting. Methodology: This research uses a literature review together...
Objective
This study uses electronic health record (EHR) data to predict 12 common cancer symptoms, assessing the efficacy of machine learning (ML) models in identifying symptom influencers.
Materials and Methods
We analyzed EHR data of 8156 adults diagnosed with cancer who underwent cancer treatment from 2017 to 2020. Structured and unstructured...
The aim of this study was to evaluate the performance of a computable phenotype for systemic lupus erythematosus (SLE) patients when it is ported from a local data warehouse to the i2b2, OMOP, and PCORnet CDMs. We adapted the SLE phenotype to the Northwestern Medicine (NM) Enterprise Data Warehouse (EDW) and NU i2b2, OMOP, and PCORnet instances. Ea...
This study explores the transformative role of data warehousing in enhancing decision-making and operational efficiency for franchise businesses, focusing on a case study involving Foodad, a growing franchise chain. The research highlights the importance of centralizing and organizing data to overcome the challenges of inconsistent and scattered da...
In the era of big data, the optimization of machine learning models within cloud-based data warehousing systems has emerged as a critical domain of research and application. This paper presents an in-depth analysis of advanced data science techniques aimed at enhancing the performance and scalability of machine learning models in such environments....
An ETL (Extract-Transform-Load) process is highly complex in terms of data flows and of the tasks responsible for cleaning, filtering, normalizing and loading data into the data warehouse. Extraction of data from the sources, transformation (delivering quality data that has value for analysis), loading of...
Existing text-to-SQL benchmarks have largely been constructed using publicly available tables from the web with human-generated tests containing question and SQL statement pairs. They typically show very good results and lead people to think that LLMs are effective at text-to-SQL tasks. In this paper, we apply off-the-shelf LLMs to a benchmark cont...
Congestive heart failure (CHF) and opioid use disorder (OUD) commonly coexist and are major contributors to high healthcare utilization in the United States. Medication assisted treatment (MAT; e.g., buprenorphine and methadone) reduces opioid-related mortality by about 50 %; yet little is known about how OUD treatment impacts CHF outcomes in patie...
Agile techniques have transformed project management and software development by stressing flexibility, collaboration, and customer-centricity. Data warehouse initiatives have traditionally used a waterfall methodology, which may delay, cost more, and misalign with business needs. Agile techniques are applied to data warehouse projects in this arti...
Introduction: The implementation of data warehouse systems offers great potential for improving patient care, operational efficiency, and strategic decision-making. This study explores the challenges and opportunities of implementing data storage solutions in the Jordanian healthcare industry. Objectives: To investigate current data management prac...
INTRODUCTION: A clinical data warehouse (CDW) is a powerful resource that supports clinical decision-making and secondary data use by integrating and presenting heterogeneous data sources. Despite considerable effort within healthcare organizations (HCOs) to develop CDWs, scientific literature surrounding clinical data warehousing methods is limited. OB...
A real-time data warehouse is a crucial tool for information management and analysis, enabling the capture, processing, and analysis of vast amounts of data from diverse sources in real-time. It offers enterprises enhanced decision support through its efficient processing capabilities and timely data feedback. This paper reviews the technical chara...
Purpose: The aim of the study was to assess the effect of data integration techniques on operational efficiency in manufacturing industries in Iran. Materials and Methods: This study adopted a desk methodology. A desk study research design is commonly known as secondary data collection. This is basically collecting data from existing resources pref...
Rare diseases pose significant challenges due to their heterogeneity and lack of knowledge. This study develops a comprehensive pipeline interoperable with a document-oriented clinical data warehouse, integrating cohort characterization, patient clustering and interpretation. Leveraging NLP, semantic similarity, machine learning and visualization,...
The term "Big Data" has garnered significant attention in recent years, evolving into a highly valued asset akin to oil in the modern world. But what exactly does it encompass? Big Data refers to the massive volume of data that cannot be effectively processed and analyzed using conventional processing techniques. It encompasses a collection of d...
Data warehousing has become a pivotal element in modern data management strategies, enabling organizations to consolidate large volumes of data from diverse sources for analysis and decision-making. This article provides a comprehensive overview of data warehousing, including its architecture, components, and the processes involved in building and...
Chronic sialorrhea is a condition characterized by excessive drooling, often associated with neurological and neuromuscular disorders such as Parkinson’s disease, cerebral palsy, and stroke. Despite its prevalence, it remains underdiagnosed and poorly understood, leading to a lack of comprehensive data on patient demographics, clinical characterist...
Movement disorders such as cervical dystonia, blepharospasm, and hemifacial spasm negatively impact the quality of life of people living with these conditions. Botulinum toxin (BoNT) injections are commonly used to treat these disorders. We sought to describe patient characteristics, BoNT utilization, and potential adverse events (AEs) among patien...
Associations between increased functional disability and higher healthcare resource utilization (HCRU) and costs were reported in patients with psoriatic arthritis (PsA). We assessed characteristics/outcomes of patients with PsA receiving tofacitinib monotherapy vs combination therapy with conventional synthetic disease-modifying antirheumatic drug...
An important component of modern enterprise data infrastructure is the diversity of data sources. Building efficient data pipelines is necessary so that data can pass through the various stages of the value chain. The goal of this master's thesis is to build an automated data pipeline as a SaaS application...
An information system is a series that includes aspects of software, hardware and brainware in a structured manner with performance in a process that is interconnected so as to create a certain desired product, then the development of website-based system infrastructure can make it easier to data warehouse electronic products for both incoming and...
This article presents a comprehensive framework for data quality assurance in data warehousing, addressing the critical need for maintaining data integrity, accuracy, and reliability in modern enterprise environments. It explores common data quality issues such as duplicates, inconsistencies, missing values, and data drift while offering best pract...
User comments on the Play Store are crucial sources for developers to understand user feedback on their applications. In this research, we introduce the application of Extract, Transform, Load (ETL) Technique to analyze user comments on the Flip.id application on the Play Store using the Pentaho application. The ETL method is utilized to extract co...
Background:
Patients with COPD often develop other morbidities, suggesting a systemic component to this disease. This retrospective non-interventional cohort study investigated relationships between multimorbidities in COPD and their impact on COPD exacerbations and COPD-related healthcare resource utilization (HCRU) using real-world evidence from...
Background
SARS-CoV-2 vaccines are safe and effective against infection and severe COVID-19 disease worldwide. Certain co-morbid conditions cause immune dysfunction and may reduce immune response to vaccination. In contrast, those with co-morbidities may practice infection prevention strategies. Thus, the real-world clinical impact of co-morbiditie...
Customer Relationship Platforms (CRP) are essential tools for managing customer interactions, but traditional systems often fall short in handling the exponential growth of data and the need for real-time insights. This manuscript explores the transformative potential of integrating Artificial Intelligence (AI) into CRP systems, presenting a model...
Background
The randomized, dose-optimization, open-label ReDOS study in US patients with metastatic colorectal cancer (CRC) showed that, compared with a standard dosing approach, initiating regorafenib at 80 mg/day and escalating to 160 mg/day depending on tolerability increased the proportion of patients reaching their third treatment cycle and re...
This review examines the implementation of AI-powered data warehouse solutions to optimize big data management and utilization, analyzing 25 peer-reviewed articles published over the last decade. As organizations increasingly rely on vast amounts of data for strategic decision-making, traditional data warehousing techniques have struggled to keep p...
This research explores the application of big data in education, examining its nature, technologies, challenges, and benefits. Big data in education refers to the vast amounts of structured and unstructured data generated from various educational sources. Key characteristics include high volume, velocity, and variety of data. The study reviews majo...