Joong-Ho Won

Seoul National University, Sŏul, Seoul, South Korea

Publications (22), 34.5 Total impact

  • ABSTRACT: Orthodontists are interested in finding a set of standard arch forms for clinical orthodontic practice. In this paper, we propose a functional clustering method for the dental arches based on a mixture of U-shaped curves. We decide the number of clusters (equivalently, mixture components) using the Bayesian information criterion and the jump criterion based on a given distortion function. We apply our method to clustering the dental arch data from the nationwide standard occlusion study conducted in Korea from 1997 to 2005. The data are composed of dental arches of 306 subjects with normal occlusion selected from 15,836 young adults. We also compare the proposed method with other existing methods.
    Journal of the Korean Statistical Society 05/2015; DOI:10.1016/j.jkss.2015.04.003
  • Sang-Yun Oh, Bala Rajaratnam, Joong-Ho Won
    ABSTRACT: The recently introduced condition-number-regularized covariance estimation method (CondReg) has been demonstrated to be highly useful for estimating high-dimensional covariance matrices. Unlike L1-regularized estimators, this approach has the added advantage that no sparsity assumptions are made. The regularization path of the lasso solution has received much attention in the literature. Despite their importance, however, the solution paths of covariance estimators have not been considered in much detail. In this paper, we provide a complete characterization of the entire solution path of the CondReg estimator. Our characterization of the solution path has important applications as it yields fast algorithms that compute the CondReg estimates for all possible values of the regularization parameter at the same cost as that for a single fixed parameter. We present two instances of fast algorithms: the forward and the backward algorithms. These algorithms greatly speed up the cross-validation procedure that selects the optimal regularization parameter. Our new method is efficiently implemented with the R package CondReg.
  • ABSTRACT: As the need for large-scale data analysis is rapidly increasing, Hadoop, a platform for large-scale data processing, and MapReduce, the internal computational model of Hadoop, are receiving great attention. This paper reviews the basic concepts of Hadoop and MapReduce necessary for data analysts who are familiar with statistical programming, through examples that combine the R programming language and Hadoop.
    09/2013; 24(5). DOI:10.7465/jkdi.2013.24.5.1013
  • ABSTRACT: In this paper, we propose a majorization-minimization (MM) algorithm for high-dimensional fused lasso regression (FLR) suitable for parallelization using graphics processing units (GPUs). The MM algorithm is stable and flexible as it can solve the FLR problems with various types of design matrices and penalty structures within a few tens of iterations. We also show that the convergence of the proposed algorithm is guaranteed. We conduct numerical studies to compare our algorithm with other existing algorithms. We demonstrate that the proposed MM algorithm is competitive in general settings. The merit of GPU parallelization is also exhibited.
    Journal of Computational and Graphical Statistics 06/2013; 24(1). DOI:10.1080/10618600.2013.878662
  • ABSTRACT: Estimation of high-dimensional covariance matrices is known to be a difficult problem, has many applications, and is of current interest to the larger statistics community. In many applications, including the so-called "large p small n" setting, the estimate of the covariance matrix is required to be not only invertible, but also well-conditioned. Although many regularization schemes attempt to do this, none of them address the ill-conditioning problem directly. In this paper, we propose a maximum likelihood approach, with the direct goal of obtaining a well-conditioned estimator. No sparsity assumption on either the covariance matrix or its inverse is imposed, thus making our procedure more widely applicable. We demonstrate that the proposed regularization scheme is computationally efficient, yields a type of Steinian shrinkage estimator, and has a natural Bayesian interpretation. We investigate the theoretical properties of the regularized covariance estimator comprehensively, including its regularization path, and proceed to develop an approach that adaptively determines the level of regularization that is required. Finally, we demonstrate the performance of the regularized estimator in decision-theoretic comparisons and in the financial portfolio optimization setting. The proposed approach has desirable properties, and can serve as a competitive procedure, especially when the sample size is small and when a well-conditioned estimator is required.
    Journal of the Royal Statistical Society Series B (Statistical Methodology) 06/2013; 75(3):427-450. DOI:10.1111/j.1467-9868.2012.01049.x
  • ABSTRACT: This paper proposes a procedure to obtain monotone estimates of both the local and the tail false discovery rates that arise in large-scale multiple testing. The proposed monotonization is asymptotically optimal for controlling the false discovery rate and also has many attractive finite-sample properties.
    Statistics & Probability Letters 05/2013; 87. DOI:10.1016/j.spl.2013.12.011
  • ABSTRACT: To investigate the safety and efficacy of a double stent in patients with malignant extrahepatic biliary obstruction. Materials and Methods: From January 2010 to March 2012, 160 consecutive patients (102 men, 58 women; mean age, 63.8 years; range, 33–91 years) with malignant extrahepatic biliary obstructions were prospectively enrolled in four academic tertiary referral centers. All patients were treated with a double stent system (covered stent-in-uncovered stent). Results: The double stents were successfully placed in all patients. Stent migration was not observed in any patients. Percutaneous drainage catheters were successfully removed from all patients. Mean serum bilirubin level, which was 8.9 mg/dl ± 5.6 before drainage, decreased significantly, to 2.2 mg/dl ± 0.4, one month after stent placement (p < 0.001). Clinical success rate was 95%. Median patient survival time was 135 days (95% confidence interval [CI], 101–169 days). Cumulative stent patency rates at 1, 3, 6, 9, and 12 months were 99%, 95%, 87%, 85%, and 85%, respectively. Fifteen patients (11.1%) presented with stent occlusion due to sludge incrustation (n=12) or tumor overgrowth (n=3), and required repeat intervention. Tumor ingrowth was not observed in any of these patients. Complications including pancreatitis (n=3), acute cholecystitis (n=3), and hepatic abscess (n=2) occurred in eight patients. Conclusion: Percutaneous treatment of malignant extrahepatic biliary obstruction using a double stent is feasible, safe, and effective in achieving internal biliary drainage.
    Journal of Vascular and Interventional Radiology 04/2013; 24(4):S52-S53. DOI:10.1016/j.jvir.2013.01.120
  • Johan Lim, Joong-Ho Won
    ABSTRACT: The ROC convex hull (ROCCH) is the least convex majorant of the empirical ROC curve, and represents the optimal ROC curve of a set of classifiers. This paper provides a probabilistic view of the ROCCH. We show that the ROCCH can be characterized as a nonparametric maximum likelihood estimator (NPMLE) of a convex ROC curve. We provide two NPMLE formulations, one unconditional and the other conditional, both of which yield the ROCCH as the solution. The solution technique relates the NPMLEs to convex optimization and classifier calibration. The connection between the NPMLEs and the ROCCH also suggests efficient algorithms to compute NPMLEs of a convex ROC curve, and a conditional bootstrap procedure for assessing uncertainties in the ROCCH.
    Machine Learning 09/2012; 88(3). DOI:10.1007/s10994-012-5290-y
  • Yongkweon Jeon, Joong-Ho Won, Sungroh Yoon
    ABSTRACT: Images captured using computed tomography and magnetic resonance angiography are used in the examination of the abdominal aorta and its branches. The examination of all clinically relevant branches simultaneously in a single 2D image without any misleading overlaps facilitates the diagnosis of vascular abnormalities. This problem is called uncluttered single-image visualization (USIV). We can solve the USIV problem by assigning energy-based scores to visualization candidates and then finding the candidate that optimizes the score; this approach is similar to the manner in which the protein side-chain placement problem has been solved. To obtain near-optimum images, we need to explore the energy space extensively, which is often time consuming. This paper describes a method for exploring the energy space in a massively parallel fashion using graphics processing units. According to our experiments, in which we used 30 images obtained from 5 patients, the proposed method can reduce the total visualization time substantially. We believe that the proposed method can make a significant contribution to the effective visualization of abdominal vascular structures and precise diagnosis of related abnormalities.
    IEEE Transactions on Biomedical Engineering 08/2012; 60(1). DOI:10.1109/TBME.2012.2214386
  • ABSTRACT: Highly multiplexed assays using antibody coated, fluorescent (xMap) beads are widely used to measure quantities of soluble analytes, such as cytokines and antibodies in clinical and other studies. Current analyses of these assays use methods based on standard curves that have limitations in detecting low or high abundance analytes. Here we describe SAxCyB (Significance Analysis of xMap Cytokine Beads), a method that uses fluorescence measurements of individual beads to find significant differences between experimental conditions. We show that SAxCyB outperforms conventional analysis schemes in both sensitivity (low fluorescence) and robustness (high variability) and has enabled us to find many new differentially expressed cytokines in published studies.
    Proceedings of the National Academy of Sciences 02/2012; 109(8):2848-53. DOI:10.1073/pnas.1112599109
  • ABSTRACT: Direct projection of three-dimensional branching structures, such as networks of cables, blood vessels, or neurons onto a 2D image creates the illusion of intersecting structural parts and creates challenges for understanding and communication. We present a method for visualizing such structures, and demonstrate its utility in visualizing the abdominal aorta and its branches, whose tomographic images might be obtained by computed tomography or magnetic resonance angiography, in a single two-dimensional stylistic image, without overlaps among branches. The visualization method, termed uncluttered single-image visualization (USIV), involves optimization of geometry. This paper proposes a novel optimization technique that utilizes an interesting connection of the optimization problem regarding USIV to the protein structure prediction problem. Adopting the integer linear programming-based formulation for the protein structure prediction problem, we tested the proposed technique using 30 visualizations produced from five patient scans with representative anatomical variants in the abdominal aortic vessel tree. The novel technique can exploit commodity-level parallelism, enabling use of general-purpose graphics processing unit (GPGPU) technology that yields a significant speedup. Comparison of the results with the other optimization technique previously reported elsewhere suggests that, in most aspects, the quality of the visualization is comparable to that of the previous one, with a significant gain in the computation time of the algorithm.
    01/2012; 19(1). DOI:10.1109/TVCG.2012.25
  • ABSTRACT: Though recently they have fallen into some disrepute, genome-wide association studies (GWAS) have been formulated and applied to understanding essential hypertension. The principal goal here is to use data gathered in a GWAS to gauge the extent to which SNPs and their interactions with other features can be combined to predict mean arterial blood pressure (MAP) in 3138 pre-menopausal and naturally post-menopausal white women. More precisely, we quantify the extent to which data as described permit prediction of MAP beyond what is possible from traditional risk factors such as blood cholesterol levels and glucose levels. Of course, these traditional risk factors are genetic, though typically not explicitly so. In all, there were 44 such risk factors/clinical variables measured and 377,790 single nucleotide polymorphisms (SNPs) genotyped. Data for the women we studied are from first-visit measurements taken as part of the Atherosclerotic Risk in Communities (ARIC) study. We begin by assessing non-SNP features in their abilities to predict MAP, employing a novel regression technique with two stages: first the discovery of main effects, and next the discovery of their interactions. The long list of SNPs genotyped is reduced to a manageable list for combining with non-SNP features in prediction. We adapted Efron's local false discovery rate to produce this reduced list. Selected non-SNP and SNP features and their interactions are used to predict MAP using adaptive linear regression. We quantify quality of prediction by an estimated coefficient of determination (R²). We compare the accuracy of prediction with and without information from SNPs.
    PLoS ONE 11/2011; 6(11):e27891. DOI:10.1371/journal.pone.0027891
  • ABSTRACT: The authors develop a method to visualize the abdominal aorta and its branches, obtained by CT or MR angiography, in a single 2D stylistic image without overlap among branches. The abdominal aortic vasculature is modeled as an articulated object whose underlying topology is a rooted tree. The inputs to the algorithm are the 3D centerlines of the abdominal aorta, its branches, and their associated diameter information. The visualization problem is formulated as an optimization problem that finds a spatial configuration of the bounding boxes of the centerlines most similar to the projection of the input into a given viewing direction (e.g., anteroposterior), while not introducing intersections among the boxes. The optimization algorithm minimizes a score function regarding the overlap of the bounding boxes and the deviation from the input. The output of the algorithm is used to produce a stylistic visualization, made of the 2D centerlines modulated by the associated diameter information, on a plane. The authors performed a preliminary evaluation by asking three radiologists to label 366 arterial branches from the 30 visualizations of five cases produced by the method. Each of the five patients was presented in six different variant images, selected from ten variants with the three lowest and three highest scores. For each label, they assigned confidence and distortion ratings (low/medium/high). They studied the association between the quantitative metrics measured from the visualization and the subjective ratings by the radiologists. All resulting visualizations were free from branch overlaps. Labeling accuracies of the three readers were 93.4%, 94.5%, and 95.4%, respectively. For the total of 1098 samples, the distortion ratings were low: 77.39%, medium: 10.48%, and high: 12.12%. The confidence ratings were low: 5.56%, medium: 16.50%, and high: 77.94%. The association study shows that the proposed quantitative metrics can predict a reader's subjective ratings and suggests that the visualization with the lowest score should be selected for readers. The method for eliminating misleading false intersections in 2D projections of the abdominal aortic tree conserves the overall shape and does not diminish accurate identifiability of the branches.
    Medical Physics 11/2009; 36(11):5245-60. DOI:10.1118/1.3243866
  • Y. Ghim, J. Won, S. Shim
    ABSTRACT: Visibility was sharply reduced for a few days from the 20th of May 2003 in the greater Seoul area. The episodic field study revealed that organic and elemental carbon levels in PM10 increased up to 40 μg/m³ and 10 μg/m³, respectively. The potassium ion level also increased to as high as 1.0 μg/m³, which is 5 times higher than the normal level. The PM2.5/PM10 ratio was very close to 1, with the average PM2.5 concentration during the period being 111 μg/m³. It was reported that extensive forest fires occurred across Siberia and the far eastern region of Russia in May 2003. Satellite images showed hundreds of active fires during the same period in the south of the Russian Federation to the north of the Korean Peninsula. In the former part of May 2003, the Korean Peninsula was under the influence of an air mass from China. At the beginning of the episode, a large amount of pollutants from the forest fires was transported directly from Siberia and mixed with existing pollutants over the Korean Peninsula. During the episode, air was stagnant in association with a slow-moving high pressure system over the Korean Peninsula. PM10, PM2.5, inorganic species, and organic and elemental carbon were measured at Seoul (37 60 N, 127 05 E) and Deokjeok Island (37 13 N, 126 09 E), which is located 70 km off the west coast of the Korean Peninsula. The effects of anthropogenic sources and Siberian forest fires were evaluated during the evolution of the episode.
  • ABSTRACT: Many urban areas have subsided due to various causes, including withdrawal of groundwater, oil, and natural gas, underground excavation, mining, and tectonic motion. The impacts of land subsidence are most critical in coastal cities, where land elevation is close to or below sea level. These cities are highly susceptible to flooding, as shown last year when New Orleans was flooded by Hurricane Katrina. Subsidence caused by aquifer system compaction associated with groundwater withdrawal, or organic soil drainage, often shows continuous but waning slow ground movement with time. Due to residual compaction, subsidence continues even after stabilization of the groundwater level. In reclaimed land, rapid and waning subsidence rates of the surface also result from soil consolidation by the overburden of increased soil loading over the surface during reclamation works. Here we demonstrate interferometric mapping of surface deformation related to soil consolidation. Twenty-three JERS-1 SAR images acquired from 1992 to 1998 were used to estimate the land subsidence rate in the city of Mokpo, located on the southwestern coast of Korea. Large regions within Mokpo are subject to significant subsidence because about 70% of the city area is land reclaimed from the sea. Two subsidence field maps were retrieved by means of permanent scatterer InSAR (PSInSAR), adopting a constant velocity model as well as a hyperbolic model to describe the waning ground subsidence. The results indicate continuous subsidence in some areas with different decaying velocity. The subsidence velocity reaches over 6 cm/yr in the fastest sinking area. The validity of the estimated subsidence rates was verified by comparison with ENVISAT SAR measurements acquired during 2004-2005. A hyperbolic model constrained by our JERS PSInSAR results allows us to better fit the observed subsidence derived from ENVISAT data. The result clearly confirms that PSInSAR techniques coupled with a hyperbolic model are a valuable tool for monitoring long-term land subsidence characterized by a time-varying subsidence rate.
  • ABSTRACT: Radiance measurements and inversions of the AErosol RObotic NETwork (AERONET, Holben et al., 2001) are used to characterize global atmospheric aerosols and key parameters such as the phase functions, single scattering albedos, asymmetry factors and extinction coefficients of various aerosol types. This study uses more than 105 records of aerosol size distributions and complex refractive indices to generate the optical properties of the aerosol at more than 200 sites worldwide. These properties together with the radiance measurements are then classified using classical clustering methods to group the sites according to the type of aerosol with the greatest frequency of occurrence at each site. Seven significant clusters are identified: two types of desert dust, two types of biomass burning, urban industrial pollution, rural background aerosol, and polluted marine aerosol. The variances of the aerosol properties are determined using principal component analysis (PCA). In addition to categorizing the aerosol types, all the column properties that are important for the calculation of radiative forcing are determined for each cluster.
  • ABSTRACT: Tokchok Island is located in the Yellow Sea, a semi-enclosed, shelf-type shallow basin between Korea and China. On the Korean side, the island is 50 km west of the greater Seoul area where around 20 million people reside; on the Chinese side, it is 250 km east of the greater Qingdao area where more than 8 million people reside. Prevailing winds over the Yellow Sea are westerlies, especially in winter. However, in summer with strong insolation, the island could be affected by the sea-land breeze from the Korean Peninsula. Although the emission of air pollutants is quite low on this small island of 20 km² with 20 thousand inhabitants, there are an oil-fired power plant and a ferry wharf. Gaseous pollutants including ozone and SO2 and fine particles such as TSP and PM2.5 were measured from April 1999 to June 2000. While ozone and SO2 were constantly measured during the period, other pollutants were intermittently measured during four intensive measurement periods. First, the measurement periods were classified into three groups on the basis of the correlation between ozone and SO2. They included the period more affected by long-range transport, the period more affected by local emissions, and the remaining period. Characteristics of mass and ion concentrations of TSP and PM2.5 in each group were investigated in order to further study the variations of atmospheric trace substances according to the transport scales. The transport paths of trace substances were estimated by using back trajectory analysis over the refined meteorological fields. Seasonal and meteorological influences on the variations of trace substances were also discussed.
  • International Journal of Radiation Oncology, Biology, Physics 11/2001; 51(3):260-260. DOI:10.1016/S0360-3016(01)02300-8
  • ABSTRACT: The Yellow Sea is a semi-enclosed, shelf-type shallow basin with reduced water exchange with the open ocean. The rim of the Yellow Sea--the west side is China and the east side is Korea--is one of the fastest developing zones in the world. During the past several years, considerable measurements have been made both around and over the Yellow Sea in order to study the pollutant transport in the region. Fine particles as well as gaseous pollutants have been routinely measured at three national background monitoring stations on the Korean side. Two ground stations have been operated for supplementing these monitoring stations; one is on the Korean side and the other is on the Chinese side. Aircraft and shipboard measurements were also made during selected intensive measurement periods. However, not all these measurements have been made for a common object. Rather, several research teams carried out their measurements for their own purposes according to separate plans. In the present work, the amounts of nitrogen and sulfur deposited in the region of the Yellow Sea in both dry and wet forms were estimated. Concentration data available from each measurement were reviewed to choose adequate ones. Meteorological data at ground stations were readily obtained either from a collocated automatic weather station or from a surface weather station in the nearby area. However, those over the sea were estimated from the output of RDAPS (Regional Data Assimilation and Prediction System), which were provided by the Korea Meteorological Administration. Precipitation data were only available from several routinely operated ground stations since intensive measurements accompanying aircraft or shipboard measurements were not made on rainy days. The amounts of dry and wet depositions were compared at these stations. (This work was supported in part by the Korea Ministry of Science and Technology under grant 98-LO-01-01-A-003 and in part by the Sustainable Water Resources Research Center of the 21st Century Frontier Research Program.)
  • S. Kim, J. Won
    ABSTRACT: Radar interferometric measurement of the sea surface has not been considered feasible, but Alsdorf et al. (2000) recently demonstrated that interferometric phases of L-HH SAR were correlated with centimeter-scale changes in the height of water surfaces within flooded vegetation. We present the characteristics of the JERS-1 SAR interferometric phase on seawater around Kaduckdo, Korea, and propose a possible application of SAR to measuring instantaneous relative sea level. Coherent signals, caused by manmade oyster farm structures and comparable to those from land in terms of coherence, were observed. Using 21 interferograms produced from 11 JERS-1 SAR single look complex data sets, the instantaneous sea level changes were estimated for the first time. The absolute sea level changes could not properly be restored by interferometric phases alone because of the discontinuity of phase and the large sea level changes in the area of interest. The wrapped phases are limited to an estimation of changes between -7.6 and 7.6 cm due to uncertainty of sign (up or down). The comparison of the radar measurements with the tide gauge data (OTT-R20) yielded a relatively low correlation coefficient, 0.57. The possible error sources included the tide gauge measurements, which were made not on site but 5 km away from the test site, and phase noise error (1.8 cm). We have overcome the ambiguity problem to some extent by exploiting radar back-scattering intensity. The radar intensity from sea farms was normalized using the statistics of the intensities at seawater and urban land area. The normalized intensity was inversely proportional to the sea level with a correlation coefficient of -0.83. We could thus constrain the number of wrapping counts to one (13 pairs) or two (9 pairs) within a 68% confidence interval. When the wrapping count was chosen through the proposed method, the correlation coefficient improved to 0.96 with an r.m.s. error of 6.0 cm. The results show the feasibility of applying radar interferometry combined with altimetry to sea level measurement.
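
The ROC convex hull (ROCCH) described in the Machine Learning abstract listed above can be illustrated with a small computation. The following is a minimal sketch, not the authors' NPMLE-based method: it assumes each classifier is summarized by a (false positive rate, true positive rate) point and builds the upper hull with the standard monotone-chain scan. The function names `roc_convex_hull` and `hull_auc` are hypothetical and chosen only for this illustration.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def _cross(o: Point, a: Point, b: Point) -> float:
    """Cross product of vectors o->a and o->b; negative means a clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points: Iterable[Point]) -> List[Point]:
    """Return the vertices of the upper hull of the ROC points, left to right.

    The trivial classifiers (0, 0) and (1, 1) are always included.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull: List[Point] = []
    for p in pts:
        # Drop the last vertex while it lies on or below the segment
        # joining its predecessor to the new point p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def hull_auc(hull: List[Point]) -> float:
    """Area under the piecewise-linear hull, by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(hull, hull[1:])
    )
```

For example, calling `roc_convex_hull([(0.1, 0.5), (0.3, 0.8), (0.5, 0.6), (0.7, 0.85)])` discards the two points that fall below the hull, so only the classifiers at (0.1, 0.5) and (0.3, 0.8) remain between the trivial endpoints.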

Publication Stats

32 Citations
34.50 Total Impact Points

Institutions

  • 2015
    • Seoul National University
      • Department of Statistics
      Sŏul, Seoul, South Korea
  • 2012–2013
    • Korea University
      Sŏul, Seoul, South Korea
  • 2009–2012
    • Stanford University
      • Division of Biostatistics
      • Department of Radiology
      Palo Alto, CA, United States