Youngjin Choi's research while affiliated with Eulji University and other places

Publications (6)


[Figure previews: Study workflow; Mortality distribution; Importance analysis results; Comparison of algorithm performance; Comparison of sample correction methods]
Comparison of mortality prediction models for road traffic accidents: an ensemble technique for imbalanced data
  • Article
  • Full-text available

August 2022 · 45 Reads · 10 Citations · BMC Public Health · Youngjin Choi

Background: Injuries caused by road traffic accidents (RTA) are classified under the International Classification of Diseases-10 as ‘S00-T99’ and represent imbalanced samples, with a mortality rate of only 1.2% among all RTA victims. To predict the characteristics of external causes of RTA injuries and mortality, we compared performances based on differences in the correction and classification techniques for imbalanced samples.

Methods: The present study extracted and utilized data spanning a 5-year period (2013–2017) from the Korean National Hospital Discharge In-depth Injury Survey (KNHDS), a national-level survey conducted by the Korea Disease Control and Prevention Agency. A total of eight variables were used in the prediction, including patient, accident, and injury/disease characteristics. As the data were imbalanced, a sample consisting of only severe injuries was constructed and compared against the total sample. The samples were preprocessed according to their characteristics: they were standardized first, because they contained many variables with different units. Among the ensemble techniques for classification, the present study utilized Random Forest, Extra-Trees, and XGBoost. Four different over- and under-sampling techniques were used, and the performance of the algorithms was compared using “accuracy”, “precision”, “recall”, “F1”, and “MCC”.

Results: Among the prediction techniques, XGBoost had the best performance. While the synthetic minority oversampling technique (SMOTE), a type of over-sampling, also demonstrated a certain level of performance, under-sampling performed best. Overall, prediction by the XGBoost model with samples using SMOTE produced the best results.

Conclusion: This study presented the results of an empirical comparison of the validity of sampling techniques and classification algorithms that affect the accuracy of imbalanced samples by combining two techniques. The findings could be used as reference data in classification analyses of imbalanced data in the medical field.
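The abstract above compares models with accuracy, precision, recall, F1, and MCC on data where only 1.2% of cases are deaths. The following is a minimal sketch, not the study's code, of why accuracy alone can mislead at that rate. The labels are synthetic, and the two "classifiers" (`always_survive` and `screened`) are hypothetical illustrations, not models from the paper.

```python
import math

def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = death)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def metrics(y_true, y_pred):
    """Return accuracy, precision, recall, F1, and MCC; undefined ratios become 0.0."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}

# Synthetic sample: 12 deaths among 1000 cases, i.e. a 1.2% mortality rate.
y_true = [1] * 12 + [0] * 988

# Hypothetical classifier A: predicts survival for everyone.
always_survive = [0] * 1000
baseline = metrics(y_true, always_survive)

# Hypothetical classifier B: finds 9 of 12 deaths at the cost of 40 false alarms.
screened_pred = [1] * 9 + [0] * 3 + [1] * 40 + [0] * 948
screened = metrics(y_true, screened_pred)

print(baseline)
print(screened)
```

In this synthetic case the trivial classifier has the higher accuracy but zero recall and zero MCC, which is one reason imbalanced-data studies tend to report recall, F1, and MCC alongside accuracy.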


[Figure previews: Figure 1. Sample data processing process; Figure 2. Sample data processing process; Frequency analysis]
Comparison of Prediction Models for Mortality Related to Injuries from Road Traffic Accidents after Correcting for Undersampling

May 2021 · 52 Reads · 7 Citations · International Journal of Environmental Research and Public Health (IJERPH)

In this study, four models were compared for their accuracy in determining mortality caused by road traffic injuries: logistic regression (LR), random forest (RF), linear support vector machine (SVM), and radial basis function (RBF)-SVM. They were tested using five years of national-level data (2013 through 2017) from the Korea Disease Control and Prevention Agency’s (KDCA) National Hospital Discharge In-Depth Survey. Model performance was measured by accuracy, precision, recall, F1 score, and Brier score in a classification analysis that included characteristics of patients, accidents, injuries, and illnesses. Because the rates of survival and mortality related to road traffic accidents were imbalanced and the many variables had differing units, the data were corrected and standardized before the classification models’ performances were compared. Using importance analysis, the main diagnosis, the type of injury, the site of the injury, the operation status, the type of accident, the role at the time of the accident, and the sex were selected as the analysis factors. The biggest contributing factor was the role in the accident, namely being the driver, and the major injury sites were head injuries and deep injuries. Using the selected factors, comparisons of the classification performance of each model indicated that the RBF-SVM and RF models were superior to the others. Of the SVM models, the RBF kernel model was superior to the linear kernel model; it can be inferred that the high-dimensional transformed RBF model performs better when the dimension is complex because of the use of multiple variables. The findings suggest there are limitations to analyses involving imbalanced, multidimensional original data, such as data on road traffic mortality. Thus, analyses must be performed after imbalances are corrected.
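The abstract above mentions correcting the class imbalance before modelling and scoring models with the Brier score, without giving the exact procedure. The sketch below shows one common way such a correction can be done, random under-sampling of the majority class, together with a plain Brier score. The function names, the seed, and the toy data are assumptions for illustration and are not taken from the paper.

```python
import random

def random_undersample(rows, labels, seed=0):
    """Balance a binary dataset by randomly dropping majority-class rows.

    Assumes labels are 0/1. Keeps every row of the smaller class and an
    equally sized random subset of the larger class.
    """
    rng = random.Random(seed)
    ones = [i for i, y in enumerate(labels) if y == 1]
    zeros = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (ones, zeros) if len(ones) <= len(zeros) else (zeros, ones)
    kept = minority + rng.sample(majority, len(minority))
    rng.shuffle(kept)  # avoid leaving the classes in contiguous blocks
    return [rows[i] for i in kept], [labels[i] for i in kept]

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

# Toy data: 1000 placeholder records, 12 of them deaths (label 1).
rows = list(range(1000))
labels = [1] * 12 + [0] * 988

balanced_rows, balanced_labels = random_undersample(rows, labels)
print(len(balanced_labels), sum(balanced_labels))

# Brier score for four hypothetical predicted death probabilities.
example_brier = brier_score([1, 0, 1, 0], [0.9, 0.1, 0.8, 0.3])
print(example_brier)
```

Under-sampling discards most majority-class records, so it trades information for balance; over-sampling methods such as SMOTE make the opposite trade by synthesizing minority-class records.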


The method to secure scalability and High density in cloud data-center

July 2014 · 38 Reads · 7 Citations · Information Systems

YoungJin Choi · SangHak Lee · JinHwan Kim · [...] · Yong-Gyu Jung

As IT infrastructures have recently shifted to cloud computing, the demand for cloud data centers has increased. Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to shared computing resources that can be rapidly provisioned and released with minimal management effort, and interest in data centers that can provide cloud computing services economically and flexibly is increasing. This study analyzes the factors that improve power efficiency while securing the scalability of data centers, and presents considerations for cloud data center construction in terms of power distribution method, power density per rack, and expansion unit. The results of this study may be used for making rational decisions concerning power input, voltage transformation, and unit of expansion when constructing a cloud data center or migrating an existing data center to a cloud data center.


Analysis of questions regarding morbidity coding posted to the online coding clinic of the Korean Medical Record Association

January 2014 · 10 Reads · 1 Citation · Health information management: journal of the Health Information Management Association of Australia

Accuracy and consistency in morbidity coding are important in both clinical research and practice. However, Health Information Managers (HIMs) sometimes face difficulties in assigning morbidity codes. To assist them, the Korean Medical Record Association operates an online coding clinic bulletin board, on which HIMs can post questions and receive answers. Frequency analysis and Fisher's exact testing were performed to identify differences among the types of questions posted and the characteristics of the HIMs who posted them. The statistical analysis showed that HIMs working at hospitals with fewer than 500 beds and those with more than 10 years of work experience posted more questions than other HIMs. The study also identified the characteristics of HIMs who require more coding education and particular diagnoses for which further training is required. Our findings will assist the development of coding procedures, guidelines, education programs, and a more user-friendly database.
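For readers unfamiliar with the Fisher's exact testing mentioned in the abstract above, here is a minimal sketch of a two-sided test on a 2×2 table, computed from the hypergeometric distribution. The function name and the counts are hypothetical and are not data from the study; the table could stand for, say, two groups of HIMs cross-tabulated against two question types.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Hypothetical counts only: rows are two groups, columns are two question types.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(round(p_value, 5))
```

The exact test is typically preferred over a chi-squared test when some cell counts are small, which is common when questions are split across many categories.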


Analysis of Questions regarding Morbidity Coding Posted to the Online Coding Clinic of the Korean Medical Record Association

October 2013 · 39 Reads · 1 Citation · Health information management: journal of the Health Information Management Association of Australia

Accuracy and consistency in morbidity coding are important in both clinical research and practice. However, Health Information Managers (HIMs) sometimes face difficulties in assigning morbidity codes. To assist them, the Korean Medical Record Association operates an online coding clinic bulletin board, on which HIMs can post questions and receive answers. Frequency analysis and Fisher's exact testing were performed to identify differences among the types of questions posted and the characteristics of the HIMs who posted them. The statistical analysis showed that HIMs working at hospitals with fewer than 500 beds and those with more than 10 years of work experience posted more questions than other HIMs. The study also identified the characteristics of HIMs who require more coding education and particular diagnoses for which further training is required. Our findings will assist the development of coding procedures, guidelines, education programs, and a more user-friendly database.


A Case of Standard Develop Framework Based on Open-Source Software in Korea Public Sector

January 2012 · 9 Reads · 1 Citation · Communications in Computer and Information Science

The use of various development frameworks causes problems such as system maintenance costs, dependency on outsourcing firms, and a lack of interoperability between systems. To solve these problems, the Korean government developed a standard development framework for e-Government, called eGovFrame, using open-source software. Many agencies have adopted eGovFrame, including HIRA. In this study, we present a case in which the framework was applied to the DUR (Drug Utilization Review) system of Korea HIRA.

Citations (5)


... The optimal split is selected by a greedy algorithm to generate multiple decision trees and combine their predictions by weighting them to build a stronger model. 27,28 In this study, machine learning models (including Random Forest and XGBoost) were used to predict the health utility values of elderly hypertensive stroke patients, and the importance of each feature in the prediction was analyzed. In view of the small sample size of the study, deep learning models may be overfitted with small samples, so machine learning models are chosen instead of deep learning models in this study. ...

Reference:

Analysis of Health-Related Quality of Life in Elderly Patients with Stroke Complicated by Hypertension in China Using the EQ-5D-3L Scale
Comparison of mortality prediction models for road traffic accidents: an ensemble technique for imbalanced data

BMC Public Health

... From the literature review presented in Santos et al. can be extracted information about studies that compared different machine learning algorithms to develop injury severity prediction models in road traffic accidents (12). There are several studies that compare different machine learning techniques to predict injury severity where at least two of the following algorithms were used: RF, SVM, DT, KNN, and XGBoost (18)(19)(20)(21)(22)(23)(24)(25)(26)(27)(28)(29)(30)(31)(32). In all these studies, the most-used performance metric to compare the model results was accuracy. ...

Comparison of Prediction Models for Mortality Related to Injuries from Road Traffic Accidents after Correcting for Undersampling

... (Ortega et al., 2014). This function may not be used in cases of poor accessibility, delayed responses, or weak interaction (Boo et al., 2013). For cancer registration staffs, factors including the absence of devoted cancer registration staff, poor expertise, and insufficient training can cause cancer registration inquiries and lower the quality of cancer registration data (Boo et al., 2014). ...

Analysis of questions regarding morbidity coding posted to the online coding clinic of the Korean Medical Record Association
  • Citing Article
  • January 2014

Health information management: journal of the Health Information Management Association of Australia

... Cloud computing has grown significantly over the last decade, resulting in the proliferation of various web applications, gradually changing the conventional computing example and making it more centralized, where the operational core is located at remote CDCs [124]. Despite the enormous benefits generated by this shift, as the number of cloud users and services rapidly increases, various scalability problems arise in the data centers [125], relating for example to performance issues, energy efficiency, resource allocation, server capacities, response time, network architecture, service availability, etc. It has already been reported that VMs are an important element for the provision of on-demand cloud services, whether they are hosted in data centers or deployed at the edge, closer to the end users. ...

The method to secure scalability and High density in cloud data-center
  • Citing Article
  • July 2014

Information Systems

... Moreover, utilizing an inquiry system, if present in a data collection system, can help improve the collected data's accuracy (Ortega et al., 2014). This function may not be used in cases of poor accessibility, delayed responses, or weak interaction (Boo et al., 2013). For cancer registration staffs, factors including the absence of devoted cancer registration staff, poor expertise, and insufficient training can cause cancer registration inquiries and lower the quality of cancer registration data (Boo et al., 2014). ...

Analysis of Questions regarding Morbidity Coding Posted to the Online Coding Clinic of the Korean Medical Record Association

Health information management: journal of the Health Information Management Association of Australia