NVIDIA
  • Santa Clara, CA, United States
Recent publications
The IceCube Neutrino Observatory is a cubic-kilometer neutrino detector located at the geographic South Pole, designed to detect high-energy astrophysical neutrinos. To thoroughly understand the detected neutrinos and their properties, the detector response to signal and background has to be modeled using Monte Carlo techniques. An integral part of these studies is the optical properties of the ice into which the observatory is built. The simulated propagation of individual photons from particles produced by neutrino interactions in the ice can be greatly accelerated using graphics processing units (GPUs). In this paper, we (a collaboration between NVIDIA and IceCube) reduced the propagation time per photon by a factor of up to 3 on the same GPU. We achieved this by porting the OpenCL parts of the program to CUDA and optimizing the performance. This involved careful analysis and multiple changes to the algorithm. We also ported the code to NVIDIA OptiX to handle the collision detection. The hand-tuned CUDA algorithm turned out to be faster than OptiX because it exploits the detector geometry and the fact that only a small fraction of photons ever travel close to one of the optical sensors.
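The per-photon propagation loop described above lends itself to a compact illustration. Below is a minimal NumPy sketch of that loop with a single hypothetical optical module and made-up scattering/absorption lengths; the production code runs one photon per CUDA thread against the full detector geometry and a far more detailed ice model.

```python
# Illustrative sketch of the per-photon propagation loop; all constants and the
# single-module collision test are assumptions, not IceCube's actual ice model.
import numpy as np

rng = np.random.default_rng(0)

SCATTER_LEN = 25.0                     # assumed mean scattering length (m)
ABSORB_LEN = 100.0                     # assumed mean absorption length (m)
DOM_POS = np.array([0.0, 0.0, 50.0])   # one hypothetical optical module
DOM_RADIUS = 0.2                       # illustrative collision radius (m)

def propagate_photon(pos, direction, max_steps=1000):
    """Step one photon until it is absorbed or passes within DOM_RADIUS of the module."""
    travel_budget = rng.exponential(ABSORB_LEN)    # distance until absorption
    for _ in range(max_steps):
        step = min(rng.exponential(SCATTER_LEN), travel_budget)
        # Collision test on this segment; the production CUDA kernel checks many
        # modules but skips most of them by exploiting the detector geometry.
        t = np.clip(np.dot(DOM_POS - pos, direction), 0.0, step)
        if np.linalg.norm(pos + t * direction - DOM_POS) < DOM_RADIUS:
            return True                            # photon detected
        travel_budget -= step
        if travel_budget <= 0.0:
            return False                           # photon absorbed
        pos = pos + step * direction
        # Isotropic scattering here; real ice uses an anisotropic phase function.
        direction = rng.normal(size=3)
        direction /= np.linalg.norm(direction)
    return False

hits = sum(propagate_photon(np.zeros(3), np.array([0.0, 0.0, 1.0]))
           for _ in range(10_000))
print(f"detected {hits} of 10000 simulated photons")
```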
Acquiring pixel-level annotation has been a major challenge for machine learning methods in medical image analysis. The difficulty mainly comes from two sources: localization, which requires high expertise, and delineation, which requires tedious and time-consuming work. Existing methods of easing the annotation effort mostly focus on the latter, the extreme of which is replacing the delineation with a single label for all cases. We postulate that under a clinical-realistic setting, such methods alone may not always be effective in reducing the annotation requirements of conventional classification/detection algorithms, because the major difficulty can come from localization, which is often neglected but can be critical in the medical domain, especially for histopathology images. In this work, we performed a worst-case scenario study to identify the information loss from missing detection. To tackle the challenge, we 1) proposed a different annotation strategy for image data with different levels of disease severity, 2) combined semi- and self-supervised representation learning with probabilistic weak supervision to make use of the proposed annotations, and 3) illustrated its effectiveness in recovering useful information under the worst-case scenario. As a shift from previous convention, it can potentially save significant expert annotation time for AI model development. Keywords: Clinical-realistic annotation, Histopathology, Probabilistic semi-supervision, Worst-case study
Which volume to annotate next is a challenging problem in building medical imaging datasets for deep learning. One of the promising methods to approach this question is active learning (AL). However, AL has been a hard nut to crack in terms of which AL algorithms and acquisition functions are most useful for which datasets. The problem is exacerbated by the question of which volumes to label first when there is no labeled data to start with, known as the cold start problem in AL. We propose two novel strategies for AL specifically for 3D image segmentation. First, we tackle the cold start problem by proposing a proxy task and then utilizing the uncertainty generated from the proxy task to rank the unlabeled data to be annotated. Second, we craft a two-stage learning framework for each active iteration in which the unlabeled data is also used in the second stage as a semi-supervised fine-tuning strategy. We show the promise of our approach on two well-known large public datasets from the Medical Segmentation Decathlon. The results indicate that both the initial selection of data and the semi-supervised framework yield significant improvements for several AL strategies. Keywords: Active learning, Deep learning, Semi-supervised learning, Self-supervised learning, Segmentation, CT
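The cold-start selection step can be pictured with a short sketch: a (pretend-trained) proxy model scores each unlabeled volume by mean predictive entropy, and volumes are sent for annotation in decreasing order of uncertainty. The tiny network and the entropy score below are assumptions for illustration; the paper's actual proxy task and acquisition function may differ.

```python
# Minimal sketch of uncertainty-based ranking for cold-start active learning.
import torch
import torch.nn as nn

class TinyProxyNet(nn.Module):
    """Stand-in 3D network producing per-voxel class logits."""
    def __init__(self, n_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1, 8, 3, padding=1), nn.ReLU(),
            nn.Conv3d(8, n_classes, 1),
        )

    def forward(self, x):
        return self.net(x)

@torch.no_grad()
def uncertainty_score(model, volume):
    """Mean per-voxel predictive entropy of one unlabeled volume."""
    probs = torch.softmax(model(volume.unsqueeze(0)), dim=1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1)
    return entropy.mean().item()

model = TinyProxyNet()            # pretend this was trained on the proxy task
unlabeled = [torch.randn(1, 32, 32, 32) for _ in range(5)]  # dummy CT volumes

# Rank volumes so the most uncertain ones are annotated first.
ranking = sorted(range(len(unlabeled)),
                 key=lambda i: uncertainty_score(model, unlabeled[i]),
                 reverse=True)
print("annotation order:", ranking)
```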
Training hyperparameters (learning rate, augmentation policies, etc.) are key factors affecting the performance of deep networks for medical image segmentation. Manual or automatic hyperparameter optimization (HPO) is used to improve performance. However, manual tuning is infeasible for a large number of parameters, and existing automatic HPO methods such as Bayesian optimization are extremely time-consuming. Moreover, they can only find a fixed set of hyperparameters. Population based training (PBT) can find dynamic hyperparameter schedules and has fast search speed thanks to parallel training processes. However, it is still expensive for large 3D medical image datasets with limited GPUs, and its performance lower bound is unknown. In this paper, we focus on improving network performance through hyperparameter scheduling via PBT at limited computation cost. The core idea is to train the network with a default setting derived from prior knowledge and then fine-tune it using PBT-based hyperparameter scheduling. Our method achieves 1%–3% performance improvements over the default setting while requiring only 3%–10% of the computation cost of training from scratch with PBT. Keywords: Hyperparameter optimization, Population based training, Bayesian optimization, Medical image segmentation
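A rough sketch of the PBT fine-tuning stage described above: population members train in parallel for short intervals, then losers copy (exploit) the weights and hyperparameters of winners and perturb (explore) them. The quartile cut-off, perturbation factors, and the single tuned hyperparameter (learning rate) are illustrative assumptions, not the paper's exact configuration.

```python
# Skeleton of the exploit/explore loop behind PBT-based fine-tuning.
import copy
import random

def pbt_finetune(population, train_step, evaluate, rounds=10):
    """population: list of dicts {'weights': ..., 'hparams': {'lr': float}}."""
    for _ in range(rounds):
        # Train each member for a short interval with its own hyperparameters,
        # then score it on the validation metric.
        for member in population:
            member['weights'] = train_step(member['weights'], member['hparams'])
            member['score'] = evaluate(member['weights'])
        population.sort(key=lambda m: m['score'], reverse=True)
        q = max(1, len(population) // 4)
        top, bottom = population[:q], population[-q:]
        for loser in bottom:
            winner = random.choice(top)
            # Exploit: copy weights and hyperparameters from a top performer.
            loser['weights'] = copy.deepcopy(winner['weights'])
            loser['hparams'] = dict(winner['hparams'])
            # Explore: perturb the copied hyperparameters for the next interval.
            loser['hparams']['lr'] *= random.choice([0.8, 1.2])
    return max(population, key=lambda m: m['score'])
```

Because only this short fine-tuning stage pays the population cost, the search overhead stays a small fraction of a full PBT run from scratch, which is where the quoted 3%–10% figure comes from.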
Rheumatic heart disease (RHD) is a common medical condition in children in which acute rheumatic fever causes permanent damage to the heart valves, thus impairing the heart’s ability to pump blood. Doppler echocardiography is a popular diagnostic tool used in the detection of RHD. However, the execution of this assessment requires the work of skilled physicians, which poses a problem of accessibility, especially in low-income countries with limited access to clinical experts. This paper presents a novel, automated, deep learning-based method to detect RHD using color Doppler echocardiography clips. We first homogenize the analysis of ungated echocardiograms by identifying two acquisition views (parasternal and apical), followed by extracting the left atrium regions during ventricular systole. Then, we apply a model ensemble of multi-view 3D convolutional neural networks and a multi-view Transformer to detect RHD. This model allows our analysis to benefit from the inclusion of spatiotemporal information and uses an attention mechanism to identify the relevant temporal frames for RHD detection, thus improving the ability to accurately detect RHD. The performance of this method was assessed using 2,136 color Doppler echocardiography clips acquired at the point of care of 591 children in low-resource settings, showing an average accuracy of 0.78, sensitivity of 0.81, and specificity of 0.74. These results are similar to RHD detection conducted by expert clinicians and superior to the state-of-the-art approach. Our novel model thus has the potential to improve RHD detection in patients with limited access to clinical experts.
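The two-branch ensemble can be sketched as averaging clip-level probabilities from a 3D convolutional branch and a frame-level Transformer branch. The toy architectures below are stand-ins chosen for brevity; the paper's multi-view networks and attention over systolic frames are considerably richer.

```python
# Illustrative averaging of clip-level RHD probabilities from two branches.
import torch
import torch.nn as nn

class Tiny3DCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 8, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool3d(1))
        self.head = nn.Linear(8, 2)

    def forward(self, clip):                  # clip: (N, 3, T, H, W)
        return self.head(self.features(clip).flatten(1))

class TinyFrameTransformer(nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        self.embed = nn.Sequential(
            nn.Conv2d(3, dim, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1))
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(dim, 2)

    def forward(self, clip):                  # clip: (N, 3, T, H, W)
        n, c, t, h, w = clip.shape
        frames = clip.permute(0, 2, 1, 3, 4).reshape(n * t, c, h, w)
        tokens = self.embed(frames).flatten(1).reshape(n, t, -1)
        return self.head(self.encoder(tokens).mean(dim=1))

@torch.no_grad()
def ensemble_predict(cnn, transformer, clip):
    probs = (torch.softmax(cnn(clip), dim=1) +
             torch.softmax(transformer(clip), dim=1)) / 2
    return probs[:, 1]                        # probability of RHD

clip = torch.randn(1, 3, 16, 64, 64)          # dummy color Doppler clip
print(ensemble_predict(Tiny3DCNN(), TinyFrameTransformer(), clip))
```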
Perovskite Solar Cells. In article number 2200332, Meng Li, Bo Hou, Fei Wang, Zhe Li, and co-workers reported a quantitative analysis of the Pb leaching dynamics by investigating five types of state-of-the-art perovskite solar cells (PSCs). They found that Pb leaching occurs rapidly, with more than 60% of the total Pb leaching within the first 120 s of aqueous exposure, and that the Pb leaching rate likely depends on the type of PSC.
Indonesian peatlands and their large carbon stores are under threat from recurrent large-scale fires driven by anthropogenic ecosystem degradation. Although the key drivers of peatland fires are known, a holistic methodology for assessing the potential of fire mitigation strategies is lacking. Here, we use machine learning (a convolutional neural network) to develop a model capable of recreating historic fire observations based on pre-fire season parameters. Using this model, we test multiple land management and peatland restoration scenarios and quantify the associated potential for fire reduction. We estimate that converting heavily degraded swamp shrubland areas to swamp forest or plantations can reduce fire occurrence by approximately 40% or 55%, respectively. Blocking all but major canals to restore these degraded areas to swamp forest may reduce fire occurrence by 70%. Our findings suggest that effective land management strategies can influence fire regimes and substantially reduce carbon emissions associated with peatland fires, in addition to enabling sustainable management of these important ecosystems.
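One way to picture the scenario testing is as re-running a trained fire model on modified land-cover rasters and comparing the predicted burned fraction against the baseline, as in the sketch below. The land-cover class codes, the placeholder model, and the 0.5 decision threshold are assumptions for illustration only.

```python
# Sketch of scenario testing: reassign land-cover classes in the pre-fire-season
# inputs and compare the predicted burned fraction to the baseline.
import numpy as np

SHRUBLAND, SWAMP_FOREST, PLANTATION = 3, 1, 2    # hypothetical class codes

def predicted_fire_fraction(model, landcover, other_layers):
    """Fraction of pixels the model flags as likely to burn."""
    inputs = np.stack([landcover.astype(np.float32), *other_layers])
    fire_prob = model(inputs)                    # (H, W) map of fire probability
    return float((fire_prob > 0.5).mean())

def scenario_reduction(model, landcover, other_layers, target_class):
    """Relative fire reduction if degraded shrubland became `target_class`."""
    baseline = predicted_fire_fraction(model, landcover, other_layers)
    restored = landcover.copy()
    restored[restored == SHRUBLAND] = target_class
    scenario = predicted_fire_fraction(model, restored, other_layers)
    return 1.0 - scenario / baseline

# Dummy example with a placeholder "model" that only reacts to shrubland cover.
rng = np.random.default_rng(1)
landcover = rng.integers(1, 4, size=(64, 64))
other = [rng.random((64, 64), dtype=np.float32)]       # e.g. canal density
toy_model = lambda x: 0.2 + 0.6 * (x[0] == SHRUBLAND)  # illustrative stand-in
print(f"estimated reduction: "
      f"{scenario_reduction(toy_model, landcover, other, SWAMP_FOREST):.0%}")
```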
In massive multiple-input multiple-output (MIMO) systems, hybrid analog-digital beamforming is an essential technique for exploiting the potential array gain without using a dedicated radio frequency chain for each antenna. However, due to the large number of antennas, the conventional channel estimation and hybrid beamforming algorithms generally require high computational complexity and signaling overhead. In this work, we propose an end-to-end deep-unfolding neural network (NN) joint channel estimation and hybrid beamforming (JCEHB) algorithm to maximize the system sum rate in time-division duplex (TDD) massive MIMO. Specifically, the recursive least-squares (RLS) algorithm and stochastic successive convex approximation (SSCA) algorithm are unfolded for channel estimation and hybrid beamforming, respectively. In order to reduce the signaling overhead, we consider a mixed-timescale hybrid beamforming scheme, where the analog beamforming matrices are optimized based on the channel state information (CSI) statistics offline, while the digital beamforming matrices are designed at each time slot based on the estimated low-dimensional equivalent CSI matrices. We jointly train the analog beamformers together with the trainable parameters of the RLS and SSCA induced deep-unfolding NNs based on the CSI statistics offline. During data transmission, we estimate the low-dimensional equivalent CSI by the RLS induced deep-unfolding NN and update the digital beamformers. In addition, we propose a mixed-timescale deep-unfolding NN where the analog beamformers are optimized online, and extend the framework to frequency-division duplex (FDD) systems where channel feedback is considered. Simulation results show that the proposed algorithm can significantly outperform conventional algorithms with reduced computational complexity and signaling overhead.
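The deep-unfolding idea can be illustrated in miniature: an iterative estimator is unrolled for a fixed number of iterations, and selected algorithm parameters become trainable weights of the resulting network. The sketch below unfolds a toy real-valued RLS channel estimator with a learnable per-layer forgetting factor; the paper's formulation operates on complex matrix-valued CSI and jointly unfolds SSCA for the beamformers.

```python
# Toy deep-unfolded RLS: the forgetting factor of each unrolled iteration is a
# trainable parameter. Real-valued and scalar-channel for illustration only.
import torch
import torch.nn as nn

class UnfoldedRLS(nn.Module):
    def __init__(self, dim, n_layers=8):
        super().__init__()
        self.dim = dim
        # One trainable forgetting factor per unrolled RLS iteration.
        self.lam = nn.Parameter(torch.full((n_layers,), 0.95))

    def forward(self, pilots, observations):
        """pilots: (T, dim), observations: (T,), with T == number of layers."""
        w = torch.zeros(self.dim)                    # channel estimate
        P = torch.eye(self.dim) * 100.0              # inverse correlation matrix
        for t, lam in enumerate(self.lam):
            x, y = pilots[t], observations[t]
            k = P @ x / (lam + x @ P @ x)            # gain vector
            w = w + k * (y - w @ x)                  # estimate update
            P = (P - torch.outer(k, x @ P)) / lam    # correlation update
        return w

# Toy training loop: learn the per-layer forgetting factors so the unfolded
# estimator matches a known channel on random pilot sequences.
dim, layers = 4, 8
model = UnfoldedRLS(dim, layers)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
true_h = torch.randn(dim)
for step in range(200):
    pilots = torch.randn(layers, dim)
    obs = pilots @ true_h + 0.1 * torch.randn(layers)
    loss = ((model(pilots, obs) - true_h) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
print("final estimation error:", loss.item())
```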
How should dependability be understood in electrical/electronic (E/E) product development? We introduce an answer that can be seen as the first step toward a novel approach to dependable E/E product development, called the dual-cone V-model.
712 members
Stefan Jeschke
  • Department of Research
John A. Gunnels
  • Mathematical Libraries/Quantum Computing
Karel Petrak
  • Department of Research
Information
Address
2788 San Tomas Expressway, 95051, Santa Clara, CA, United States
Website
http://nvidia.com/
Phone
+1 (408) 486-2000