
Jason T. L. Wang
New Jersey Institute of Technology | NJIT · Department of Computer Science
PhD
251 Publications
18,878 Reads
4,451 Citations
Publications (251)
Solar flares, especially the M- and X-class flares, are often associated with coronal mass ejections. They are the most important sources of space weather effects, which can severely impact the near-Earth environment. Thus it is essential to forecast flares (especially the M- and X-class ones) to mitigate their destructive and hazardous consequence...
Solar activity is often caused by the evolution of solar magnetic fields. Magnetic field parameters derived from photospheric vector magnetograms of solar active regions (ARs) have been used to analyze and forecast eruptive events such as solar flares and coronal mass ejections. Unfortunately, the most recent solar cycle 24 was relatively weak with...
Solar activity is usually caused by the evolution of solar magnetic fields. Magnetic field parameters derived from photospheric vector magnetograms of solar active regions have been used to analyze and forecast eruptive events such as solar flares and coronal mass ejections. Unfortunately, the most recent solar cycle 24 was relatively weak with few...
Obtaining high-quality magnetic and velocity fields through Stokes inversion is crucial in solar physics. In this paper, we present a new deep learning method, named Stacked Deep Neural Networks (SDNN), for inferring line-of-sight (LOS) velocities and Doppler widths from Stokes profiles collected by the Near InfraRed Imaging Spectropolarimeter (NIR...
The Sun constantly releases radiation and plasma into the heliosphere. Sporadically, the Sun launches solar eruptions such as flares and coronal mass ejections (CMEs). CMEs carry away a huge amount of mass and magnetic flux with them. An Earth-directed CME can cause serious consequences to the human system. It can destroy power grids/pipelines, sat...
The impact of climate change on the environment has become increasingly visible today, and foreseeing future climate events, which involves long-term prediction of climate variables (e.g., temperature, wind speed, precipitation, etc.) at a local small scale in a local region, is crucial for disaster risk management. General Circulation Models (GCMs...
Solar flares, especially the M- and X-class flares, are often associated with coronal mass ejections (CMEs). They are the most important sources of space weather effects, which can severely impact the near-Earth environment. Thus it is essential to forecast flares (especially the M- and X-class ones) to mitigate their destructive and hazardous conseq...
Geomagnetic activities have a crucial impact on Earth, which can affect spacecraft and electrical power grids. Geospace scientists use a geomagnetic index, called the Kp index, to describe the overall level of geomagnetic activity. This index is an important indicator of disturbances in the Earth’s magnetic field and is used by the U.S. Space Weath...
Wind dynamics are extremely complex and have critical impacts on the level of damage from natural hazards, such as storms and wildfires. In the wake of climate change, wind dynamics are becoming more complex, making the prediction of future wind characteristics a more challenging task. Nevertheless, having long-term projections of some wind charact...
Data science and technology offer transformative tools and methods to science. This review article highlights the latest developments and progress in the interdisciplinary field of data-driven plasma science (DDPS). A large amount of data and machine learning algorithms go hand in hand. Most plasma data, whether experimental, observational or computatio...
The disturbance storm time (Dst) index is an important and useful measurement in space weather research. It has been used to characterize the size and intensity of a geomagnetic storm. A negative Dst value means that the Earth's magnetic field is weakened, which happens during storms. In this paper, we present a novel deep learning method, called t...
Solar energetic particles (SEPs) are an essential source of space radiation, and are hazardous for humans in space, spacecraft, and technology in general. In this paper, we propose a deep-learning method, specifically a bidirectional long short-term memory (biLSTM) network, to predict if an active region (AR) would produce an SEP event given that (...
Solar energetic particles (SEPs) are an essential source of space radiation, which are hazards for humans in space, spacecraft, and technology in general. In this paper we propose a deep learning method, specifically a bidirectional long short-term memory (biLSTM) network, to predict if an active region (AR) would produce an SEP event given that (i...
Solar and Heliosphere physics are areas of remarkable data-driven discoveries. Recent advances in high-cadence, high-resolution multiwavelength observations, growing amounts of data from realistic modeling, and operational needs for uninterrupted science-quality data coverage generate the demand for a solar metadata standardization and overall heal...
Wind plays a crucial part during adverse events, such as storms and wildfires, and is a widely leveraged source of renewable energy. Predicting long-term daily local wind speed is critical for effective monitoring and mitigation of climate change, as well as to locate suitable locations for wind farms. Long-term simulations of wind dynamics (until...
We present a new deep-learning method, named FibrilNet, for tracing chromospheric fibrils in Hα images of solar observations. Our method consists of a data preprocessing component that prepares training data from a threshold-based tool, a deep-learning model implemented as a Bayesian convolutional neural network for probabilistic image segmentatio...
Solar flare prediction plays an important role in understanding and forecasting space weather. The main goal of the Helioseismic and Magnetic Imager (HMI), one of the instruments on NASA’s Solar Dynamics Observatory, is to study the origin of solar variability and characterize the Sun’s magnetic activity. HMI provides continuous full-disk observati...
The Earth's primary source of energy is the radiant energy generated by the Sun, which is referred to as solar irradiance, or total solar irradiance (TSI) when all of the radiation is measured. A minor change in the solar irradiance can have a significant impact on the Earth's climate and atmosphere. As a result, studying and measuring solar irradi...
We present a new deep learning method, dubbed FibrilNet, for tracing chromospheric fibrils in Hα images of solar observations. Our method consists of a data pre-processing component that prepares training data from a threshold-based tool, a deep learning model implemented as a Bayesian convolutional neural network for probabilistic image segmen...
Although the source active regions of some coronal mass ejections (CMEs) were identified in CME catalogues, the vast majority of CMEs do not have an identified source active region. We propose a method that uses a filtration process and machine learning to identify the sunspot groups associated with a large fraction of CMEs and compare the physical par...
Solar flare prediction plays an important role in understanding and forecasting space weather. The main goal of the Helioseismic and Magnetic Imager (HMI), one of the instruments on NASA's Solar Dynamics Observatory, is to study the origin of solar variability and characterize the Sun's magnetic activity. HMI provides continuous full-disk observati...
Deep learning has drawn a lot of interest in recent years due to its effectiveness in processing big and complex observational data gathered from diverse instruments. Here we propose a new deep learning method, called SolarUnet, to identify and track solar magnetic flux elements or features in observed vector magnetograms based on the Southwest Aut...
The ability to predict future frames in video sequences, known as video prediction, is an appealing yet challenging task in computer vision. This task requires an in-depth representation of video sequences and a deep understanding of real-world causal rules. Existing approaches for tackling the video prediction problem can be classified into two...
Predicting the response, or sensitivity, of a clinical drug to a specific cancer type is an important research problem. By predicting the clinical drug response correctly, clinicians are able to understand patient-to-patient differences in drug sensitivity outcomes, which in turn results in less time spent and lower cost associated with identifyi...
Directed networks find many applications in computer science, social science and biomedicine, among others. In this paper we propose a new graph mining algorithm that is capable of locating all frequent induced subgraphs in a given set of directed networks. We present an incremental coding scheme for representing the canonical form of a graph, stud...
Reverse engineering gene regulatory networks (GRNs), also known as GRN inference, refers to the process of reconstructing GRNs from gene expression data. A GRN is modeled as a directed graph in which nodes represent genes and edges show regulatory relationships between the genes. By predicting the edges to infer a GRN, biologists can gain a better...
Transfer learning (TL) algorithms aim to improve the prediction performance in a target task (e.g. the prediction of cisplatin sensitivity in triple-negative breast cancer patients) via transferring knowledge from auxiliary data of a related task (e.g. the prediction of docetaxel sensitivity in breast cancer patients), where the distribution and ev...
The Gene Regulatory Network (GRN) inference problem in computational biology is challenging. Many algorithmic and statistical approaches have been developed to computationally reverse engineer biological systems. However, there are no known bioinformatics tools capable of performing perfect GRN inference. Here, we review and compare seven recent bi...
Reverse engineering gene regulatory networks (GRNs), also known as network inference, refers to the process of reconstructing GRNs from gene expression data. Biologists model a GRN as a directed graph in which nodes represent genes and links show regulatory relationships between the genes. By predicting the links to infer a GRN, biologists can gain...
Traditional machine learning approaches to drug sensitivity prediction assume that training data and test data must be in the same feature space and have the same underlying distribution. However, in real-world applications, this assumption does not hold. For example, we sometimes have limited training data for the task of drug sensitivity predicti...
Gene regulation is a series of processes that control gene expression and its extent. The connections among genes and their regulatory molecules, usually transcription factors, and a descriptive model of such connections are known as gene regulatory networks (GRNs). Elucidating GRNs is crucial to understand the inner workings of the cell and the co...
MicroRNAs (miRNAs) are non-coding RNAs with approximately 22 nucleotides (nt) that are derived from precursor molecules. These precursor molecules or pre-miRNAs often fold into stem-loop hairpin structures. However, a large number of sequences with pre-miRNA-like hairpins can be found in genomes. It is a challenge to distinguish the real pre-miRNAs...
RNA junctions are important structural elements of RNA molecules. They are formed when three or more helices come together in three-dimensional space. Recent studies have focused on the annotation and prediction of coaxial helical stacking (CHS) motifs within junctions. Here we exploit such predictions to develop an efficient alignment tool to hand...
Results obtained by aligning five pairs of riboswitches from Table 2 using CHSalign_p.
For each pair of riboswitches, the input and output of the CHSalign_p program are displayed. The input includes two riboswitches in bpseq format. CHSalign_p invokes Junction-Explorer to predict coaxial helical stacking (CHS) motifs in the input molecules, and ali...
Results obtained by aligning two pairs of riboswitches from Table 2 using CHSalign_u.
For each pair of riboswitches, the input and output of the CHSalign_u program are displayed. The input includes two riboswitches in bpseq format along with CHS motifs annotated manually by the user. The output includes the CPU time spent in performing the alignmen...
Network inference through link prediction is an important data mining problem that finds many applications in computational social science and biomedicine. For example, by predicting links, i.e., regulatory relationships, between genes to infer gene regulatory networks (GRNs), computational biologists gain a better understanding of the functional e...
RNA pseudoknots play important roles in many biological processes. Previous methods for comparative pseudoknot analysis mainly focus on simultaneous folding and alignment of RNA sequences. Little work has been done to align two known RNA secondary structures with pseudoknots taking into account both sequence and structure information of the two RNA...
Link prediction is an important data mining problem that has many applications in different domains such as social network analysis and computational biology. For example, biologists model gene regulatory networks (GRNs) as directed graphs where nodes are genes and links show regulatory relationships between the genes. By predicting links in GRNs,...
Use of computational methods to predict gene regulatory networks (GRNs) from gene expression data is a challenging task. Many studies have been conducted using unsupervised methods to fulfill the task; however, such methods usually yield low prediction accuracies due to the lack of training data. In this article, we propose semi-supervised methods...
We consider a new tree mining problem that aims to discover restrictedly embedded subtree patterns from a set of rooted labeled unordered trees. We study the properties of a canonical form of unordered trees, and develop new Apriori-based techniques to generate all candidate subtrees level by level through two efficient rightmost expansion operatio...
Developing effective artificial intelligence tools to find motifs in DNA, RNA and proteins poses a challenging yet important problem in life science research. In this paper, we present a computational approach for finding RNA tertiary motifs in genomic sequences. Specifically, we predict genomic coordinate locations for coaxial helical stackings in...
In the text mining field, obtaining training data requires human experts' labeling efforts, which is often time consuming and expensive. Supervised learning with only a small number of positive examples and a large amount of unlabeled data, which is easy to get, has attracted growing interest in the field. A recently proposed relabeling method, wh...
This chapter presents two case studies related to RNA data analysis. The first case study focuses on classification of microRNA precursors (also known as pre-miRNAs). The second case study focuses on prediction of RNA secondary structures, including pseudoknots. Pseudoknots are important RNA tertiary motifs. The authors describe in detail the probl...
RNA tertiary interactions or tertiary motifs are conserved structural patterns formed by pairwise interactions between nucleotides. They include base-pairing, base-stacking, and base-phosphate interactions. A-minor motifs are the most common tertiary interactions in the large ribosomal subunit. The A-minor motif is a nucleotide triple in which mino...
MicroRNAs play important roles in most biological processes, including cell proliferation, tissue differentiation, and embryonic development, among others. They originate from precursor transcripts (pre-miRNAs), which contain phylogenetically conserved stem-loop structures. An important bioinformatics problem is to distinguish the pre-miRN...
We present in this paper an ab initio method, named KnotFold, for RNA H-type pseudoknot prediction. Our method employs an ensemble of RNA folding tools and a filtering heuristic to generate a set of pseudoknot-free stems, and then predicts pseudoknots by utilizing a search technique with a pseudo-probability scoring scheme. Experimental results sho...
Motif finding in DNA, RNA and proteins plays an important role in life science research. In this paper, we present a computational approach to searching for RNA tertiary motifs in genomic sequences. Specifically, we describe a method, named CSminer, and show, as a case study, the application of CSminer to genome-wide search for coaxial helical stac...
Motif finding in DNA, RNA and proteins plays an important role in life science research. Recent patents concerning motif finding in biomolecular data are recorded in the DNA Patent Database which serves as a resource for policy makers and members of the general public interested in fields like genomics, genetics and biotechnology. In this paper, we...
We propose an ab initio method, named DiscoverR, for finding common patterns from two RNA secondary structures. The method works by representing RNA secondary structures as ordered labeled trees and performs tree pattern discovery using an efficient dynamic programming algorithm. DiscoverR is able to identify and extract the largest common substruc...
We study a data mining problem concerning the elastic peak detection in 2D liquid chromatography-mass spectrometry (LC-MS) data. These data can be modeled as time series, in which the X-axis represents time points and the Y-axis represents intensity values. A peak occurs in a set of 2D LC-MS data when the sum of the intensity values in a sliding ti...
Recently non-coding RNA (ncRNA) genes have been found to serve many important functions in the cell such as regulation of gene expression at the transcriptional level. Potentially there are more ncRNA molecules yet to be found and their possible functions are to be revealed. The discovery of ncRNAs is a difficult task because they lack sequence ind...
MicroRNAs (miRNAs) are short single-stranded RNA molecules with 21-22 nucleotides known to regulate post-transcriptional expression of protein-coding genes involved in most of the cellular processes. Prediction of miRNA targets is a challenging bioinformatics problem. AU-rich elements (AREs) are regulatory RNA motifs found in the 3’ untranslated re...
XML's tree structure provides a rich background for complicated structural searches. In this paper we present a new system, called XML Query by Example (XML QBE) that allows the user to query XML documents exploiting their inherent tree structure. We present some interesting queries and describe the underlying query processing algorithms. We also d...
We consider the problem of comparing CUAL graphs (Connected, Undirected, Acyclic graphs with nodes being Labeled). This problem is motivated by the study of information retrieval for bio-chemical and molecular databases. Suppose we define the distance between two CUAL graphs G1 and G2 to be the weighted number of edit operations (insert node, delet...
We propose here a new approach for ncRNA prediction. Our approach selects features derived from RNA folding programs and ranks these features using a class separation method that measures the ability of the features to differentiate between positive and negative classes. The target feature set comprising top-ranked features is then used to construc...
RNA junctions are important structural elements that form when three or more helices come together in space in the tertiary structures of RNA molecules. Determining their structural configuration is important for predicting RNA 3D structure. We introduce a computational method to predict, at the secondary structure level, the coaxial helical stacki...
Analysis of a large number of RNA molecules indicates that variations in their nucleotide sequences do not necessarily convey differences in their secondary structures. Numerous methods have been developed to find patterns in RNA molecules, including the detection of structural motifs in families of noncoding RNAs (ncRNAs). When almost identical se...
We present a method, called BlockMatch, for aligning two blocks, where a block is an RNA multiple sequence alignment with the consensus secondary structure of the alignment in Stockholm format. The method employs a quadratic-time dynamic programming algorithm for aligning columns and column pairs of the multiple alignments in the blocks. Unlike man...