You-Gan Wang
The University of Q · Mathematics
D.Phil, Oxford
About
294 Publications
96,744 Reads
6,328 Citations
Introduction
Optimization and Machine Learning
Dependent Data Analysis
Robust Analysis
https://staff.qut.edu.au/staff/you-gan.wang
Additional affiliations
April 2010 - September 2015
Education
September 1988 - February 1991
Publications (294)
Due to its favorable traits—such as lower lignin content, higher oil concentration, and increased protein levels—the genetic improvement of yellow-seeded rapeseed has attracted more attention than other rapeseed color variations. Traditionally, yellow-seeded rapeseed has been identified visually, but the complex variability in the seed coat color o...
Study region: Poyang Lake, China's largest freshwater lake Study focus: The water level variations of Poyang Lake and the combined effects of the upstream rivers and the Yangtze River during extreme drought events are not yet fully understood. In this study, the temporal and spatial variations of Poyang Lake's water level and the river-lake interac...
This chapter focuses on two critical issues of concept inventories (CIs) in STEM subjects (STEM-CIs) for initial teacher education (ITE)—one is enhancing testing efficiency and accuracy for CI tests through modern technologies and the other is evaluating learning gain in a more general approach. First, we review the research conducted on STEM-CIs u...
Physics-informed extreme learning machines (PIELM) with common sigmoid, tangent, and Gaussian activation functions suffer from parameter sensitivity and unsatisfactory precision when solving high-order partial differential equations (PDEs) arising in scientific computation and engineering applications. In this work, a F...
This paper reviews the integration of Q‐learning with meta‐heuristic algorithms (QLMA) over the last 20 years, highlighting its success in solving complex optimization problems. We focus on key aspects of QLMA, including parameter adaptation, operator selection, and balancing global exploration with local exploitation. QLMA has become a leading sol...
Lake temperature forecasting is crucial for understanding and mitigating climate change impacts on aquatic ecosystems. The meteorological time series data and their relationship have a high degree of complexity and uncertainty, making it difficult to predict lake temperatures. In this study, we propose a novel approach, Probabilistic Quantile Multi...
The water quality index (WQI) is a widely used tool for comprehensive assessment of river environments. However, its calculation involves numerous water quality parameters, making sample collection and laboratory analysis time-consuming and costly. This study aimed to identify key water parameters and the most reliable prediction models that could...
Bayesian Additive Regression Trees (BART), a machine learning technique that incorporates Bayesian methods, provides a nonparametric Bayesian approach, but its forecasting accuracy needs improvement in the presence of outliers, especially when dealing with potential nonlinear relationships and complex interactions among the response and explanatory vari...
Multidimensional forced-choice (MFC) items have been found to be useful to reduce response biases in personality assessments. However, conventional scoring methods for the MFC items result in ipsative data, hindering the wider applications of the MFC format. In the last decade, a number of item response theory (IRT) models have been developed, majo...
The transmission of inflation is a widespread occurrence, and managing inflationary pressures is a crucial macroeconomic challenge. Although inflation is a typical macroeconomic variable, its contemporaneous and lagged causal relationships have not been thoroughly investigated, which could result in missing important policy insights. The Bayesian g...
Information technology and statistical modeling have made significant contributions to smart agriculture. Machine vision and hyperspectral technologies, with their non-destructive and real-time capabilities, have been extensively utilized in the non-destructive diagnosis and quality monitoring of crops and seeds, becoming essential tools in traditi...
Deep neural networks have garnered widespread attention due to their simplicity and flexibility in the fields of engineering and scientific calculation. In this study, we probe into solving a class of elliptic partial differential equations (PDEs) with multiple scales by utilizing Fourier-based mixed physics informed neural networks (dubbed FMPINN)...
In‐situ observations of hydrodynamics and suspended sediment concentrations (SSCs) were conducted on an abandoned lobe in the northern part of the modern Yellow River Delta, China. The SSC record at the site is found to be the superposition of a general trend (fast increase and slow decrease cycle) caused by storm waves (SubSSC1) and relatively sma...
The interest in predicting online learning performance using ML algorithms has been steadily increasing. We first conducted a scientometric analysis to provide a systematic review of research in this area. The findings show that most existing studies apply the ML methods without considering learning behavior patterns, which may compromise the predi...
The Support Vector Regression (SVR) technique can approximate intricate systems by addressing learning and estimation challenges within a reproducing kernel Hilbert space, devoid of reliance on specific parameter assumptions. However, when dealing with correlated data like time series, the SVR method often falls short in accounting for underlying t...
Deep learning methods have gained considerable interest in the numerical solution of various partial differential equations (PDEs). One particular focus is physics-informed neural networks (PINN), which integrate physical principles into neural networks. This transforms the process of solving PDEs into optimization problems for neural networks. To...
Deep learning methods have gained considerable interest in the numerical solution of various partial differential equations (PDEs). One particular focus is on physics-informed neural networks (PINNs), which integrate physical principles into neural networks. This transforms the process of solving PDEs into optimization problems for neural networks....
Analytical solutions are practical tools in ocean engineering, but their derivation is often constrained by the complexities of the real world. This underscores the necessity for alternative approaches. In this study, the potential of Physics-Informed Neural Networks (PINN) for solving the one-dimensional vertical suspended sediment mixing (settlin...
This work aims to provide a review of methodology on analysis of longitudinal data focusing on (i) how to select different model components: the covariance (correlation and variance) functions or structures and the predictive variables; (ii) the robust approaches including rank and quantile regression; and (iii) machine learning algorithms that inc...
We propose an adjusted robust heteroscedastic autoregressive spatiotemporal model with a data-driven process to predict the hourly PM2.5 concentrations in Xi'an and Xianyang, China. To begin with, an adjusted variance function for the heteroscedastic model is proposed to capture the different variances of the PM2.5 concentrations during the period...
Electricity demand forecasting is crucial for practical power system management. However, during the COVID-19 pandemic, electricity demand deviated from its normal patterns, which introduces a detrimental bias into future forecasts. To overcome this problem, we propose a deep learning framework with a COVID-19 adjustment for electricity demand for...
Accurately predicting runoff (Q) and Suspended Sediment Concentration (SSC) is crucial for the environmental and geological evolution of the Yellow River Delta, a region with numerous oil fields and wetlands. However, accurate prediction of Q and SSC in the Yellow River Delta, characterized by the intricate interplay of high-frequency and low-frequ...
Energy efficiency is crucial for the operation and management of cloud data centers, which are the foundation of cloud computing. Virtual machine (VM) placement plays a vital role in improving energy efficiency in data centers. The genetic algorithm (GA) has been extensively studied for solving the VM placement problem due to its ability to provide...
Forecasting stock market movements is a challenging task from the practitioners’ point of view. We explore how model selection via the least absolute shrinkage and selection operator (LASSO) approach can be better used to forecast stock closing prices using real-world datasets of daily stock closing prices of three major international airlines. Com...
Background: The central biological clock governs numerous facets of mammalian physiology, including sleep, metabolism, and immune system regulation. Understanding gene regulatory relationships is crucial for unravelling the mechanisms that underlie various cellular biological processes. While it is possible to infer circadian gene regulatory relatio...
The utilization of gene selection techniques is crucial when dealing with extensive datasets containing limited cases and numerous genes, as they enhance the learning processes and improve overall outcomes. In this research, we introduce a hybrid method that combines the binary reptile search algorithm (BRSA) with the LASSO regression method to eff...
A spatial sampling design for optimally selecting additional locations should capture the complex relationships of spatial variables. Spatial variables may be complex in the following ways: non-Gaussian spatial dependence, spatially nonlinear, and there may be multiple spatially correlated variables. For example, multiple variables are sampled over...
The extreme learning machine (ELM) is a well-known approach for training single hidden layer feedforward neural networks (SLFNs) in machine learning. However, ELM is most effective when used for regression on datasets with simple Gaussian distributed error because it often employs a squared loss in its objective function. In contrast, real-world da...
Wave and water depth were measured with an instrumented tripod in the Yellow River Delta from 9 December 2014 to 29 April 2015. Concurrent wind data were also collected from a nearby wind station. A high‐precision model for predicting local significant wave height (Hs) with wind speed (vw) is constructed using an improved data‐driven approach. The...
The transmission of inflation is a widespread occurrence, and managing inflationary pressures is a crucial macroeconomic challenge. Although inflation is a typical macroeconomic variable, its contemporaneous and lagged causal relationships have not been thoroughly investigated, which could result in missing important policy insights. The Bayesian...
Deep neural networks have received significant attention due to their simplicity and flexibility in the fields of engineering and scientific calculation. In this work, we probe into solving a class of elliptic PDEs with multiple scales by means of Fourier-based mixed physics-informed neural networks (called FMPINN), and its solver is configured as...
Many person-fit statistics have been proposed to detect aberrant response behaviors (e.g., cheating, guessing). Among them, lz is one of the most widely used indices. The computation of lz assumes the item and person parameters are known. In reality, they often have to be estimated from data. The better the estimation, the better lz will perform. W...
In time series forecasting with outliers and random noise, parameter estimation in a neural network via minimizing the $l_{2}$ loss is unreliable. Therefore, an adaptive rescaled lncosh loss function is proposed in this article to handle time series modeling with outliers and random noise. It overcomes the limitation of the single distribution of...
The COVID-19 pandemic has given rise to significant changes in electricity demand around the world. Although these changes differ from region to region, countries that have implemented stringent lockdown measures to curtail the spread of the virus have experienced the greatest alterations in demand. Within Australia, the state of Victoria has been...
China implemented a strict lockdown policy to prevent the spread of COVID-19 in the worst-affected regions, including Wuhan and Shanghai. This study aims to investigate the impact of these lockdowns on the air quality index (AQI) using a deep learning framework. In addition to historical pollutant concentrations and meteorological factors, we incorporate s...
The hyperparameters in support vector regression (SVR) determine the effectiveness of the support vectors in fitting and prediction. However, the choice of these hyperparameters has always been challenging in both theory and practice. The ν-support vector regression elegantly eliminates the need to specify an ε value, but at the cost of specifying...
Background: The central biological clock controls countless aspects of mammalian physiology, such as sleep, metabolism, and immune system regulation. Gene regulatory relationships are essential in understanding the mechanisms that underlie various cellular biological processes. It is feasible to infer circadian gene regulatory relationships from ti...
The presence of heterogeneous variances is the norm in practice, which makes machine learning predictions less reliable when noise variance is implicitly assumed to be equal. To this end, we extend support vector regression by allowing a range of variances in the model training. Specifically, we model the variance as a function of the mean and othe...
Standard methods for forecasting electricity loads are not robust to cyberattacks on electricity demand data, potentially leading to severe consequences such as major economic loss or a system blackout. Methods are required that can handle forecasting under these conditions and detect outliers that would otherwise go unnoticed. The key challenge is...
Many engineering and scientific problems in the real world boil down to optimization problems, which are difficult to solve by using traditional methods. Meta-heuristics are appealing algorithms for solving optimization problems while keeping computational costs reasonable. The marine predators algorithm (MPA) is a modern optimization meta-heuristi...
Centrality has long been used in transportation networks, especially shipping networks, to estimate the status and importance of a node. However, most studies either treat the shipping network as an unweighted network or consider only the tie weights in weighted networks, ignoring the fact that both the number of...
Energy efficiency is a critical issue in the management and operation of cloud data centers, which form the backbone of cloud computing. Virtual machine (VM) placement has a significant impact on energy-efficiency improvement for virtualized data centers. Among various methods to solve the VM-placement problem, the genetic algorithm (GA) has been w...
In load forecasting, electricity demand with a hierarchical structure is very common, and the investigated load series differ because of geography or customers' habits. Common methods usually ignore these differences and introduce complex models to improve forecasting performance. Therefore, appropriately dealing w...
The insensitivity parameter in support vector regression determines the set of support vectors that greatly impacts the prediction. A data-driven approach is proposed to determine an approximate value for this insensitivity parameter by minimizing a generalized loss function originating from the likelihood principle. This data-driven support vector...
Accurate air quality index (AQI) forecasting makes a difference to public health, local economic development, and ecological environment. As a typical geographical datum, the spatial autocorrelation (SAC) of the AQI is often ignored, which may violate the assumptions of some models, such as machine learning which requires variables to be independen...
We consider predictions in longitudinal studies, and investigate the well known statistical mixed-effects model, piecewise linear mixed-effects model and six different popular machine learning approaches: decision trees, bagging, random forest, boosting, support-vector machine and neural network. In order to consider the correlated data in machine...
Many studies have considered temperature trends at the global scale, but the literature is commonly associated with an overall increase in mean temperature in a defined past time period and hence lacking in in-depth analysis of the latent trends. For example, in addition to heterogeneity in mean and median values, daily temperature data often exhib...
In environmental monitoring, multiple spatial variables are often sampled at a geographical location that can depend on each other in complex ways, such as non-linear and non-Gaussian spatial dependence. We propose a new mixture copula model that can capture those complex relationships of spatially correlated multiple variables and predict univaria...
In engineering applications, many real-world optimization problems are nonlinear with multiple local optima. Traditional algorithms that require gradients are not suitable for these problems. Meta-heuristic algorithms are widely employed to deal with these problems because they can escape local optima and do not need any gradi...
A better understanding of the phosphorus-transfer process and its influencing factors at the Sediment-Water Interface (SWI) is essential to develop effective and efficient river management strategies. In this study, overlying water, pore water and riverbed sediment samples were collected in the Huaihe River (HR), a highly polluted river in Eastern China, in May...
An Underwater Data Center (UDC) is an underwater vessel full of computing servers and designed with a cooling system using cold water in the ocean. A UDC vessel is composed of cabinets for computing servers, and the cabinets are finally packed into racks that facilitate the installation of the computing servers. We formulate the problem of packing...
Wind energy is a core sustainable source of electric power, and accurate wind-speed forecasting is pivotal to enhancing the power stability, efficiency, and utilization. The existing forecasting methods are still limited by the influence of outliers and the modelling difficulties caused by complex features in wind speed series. This paper proposes...
Energy efficiency is a critical issue in data centre management, which is the foundation for cloud computing. The VM placement has a considerable impact on a data centre's energy efficiency and resource utilisation. The assignment of VMs to PMs is an NP-hard problem without an easy way to find an optimal solution, particularly in large-scale data c...
In-situ monitoring of water quality (suspended sediment concentration, SSC) and concurrent hydrodynamics was conducted in the subaqueous Yellow River Delta in China. Empirical mode decomposition and spectral analysis on the SSC time series reveal the distinct periodicity of each physical mechanism that contributes to the SSC variations. Based...
Background: Polyploids are common in flowering plants and they tend to have more expanded ranges of distributions than their diploid progenitors. Possible mechanisms underlying polyploid success have been intensively investigated. Previous studies show that polyploidy generates novel changes and that subgenomes in allopolyploid species often differ i...
With a rapid decline in the cost of battery energy storage, a battery system plays an increasingly important role in managing an imbalance between ordering and consumption in the electricity wholesale market. To determine the optimal battery capacity that minimizes costs, we develop a new cost-oriented load forecasting framework accounting for batter...