Hai-Wei Shen’s research while affiliated with Macau University of Science and Technology and other places


Publications (4)


[Figures: Schematic diagram of prime solution in dual space; Various p and fixed λ; Various rows and fixed p; Various atoms and fixed p; Normalized compute time by dictionary without noise; and 3 more]

A greedy screening test strategy to accelerate solving LASSO problems with small regularization parameters
Article · April 2020 · 109 Reads · 1 Citation

Soft Computing

Hai-Wei Shen · [...]

In the era of big data, marked by high dimensionality and large sample sizes, least absolute shrinkage and selection operator (LASSO) problems demand efficient algorithms. Both static and dynamic strategies based on the screening test principle have recently been proposed to safely filter out irrelevant atoms from the dictionary. However, such strategies work well only for LASSO problems with large regularization parameters and lose their efficiency for those with small regularization parameters. This paper presents a novel greedy screening test strategy to accelerate solving LASSO problems with small regularization parameters; it adopts a relatively larger regularization parameter to filter out irrelevant atoms in every iteration. Furthermore, a convergence proof of the greedy strategy is given, and the computational complexity of LASSO solvers integrated with this strategy is investigated. Numerical experiments on both synthetic and real data sets support the effectiveness of the greedy strategy, and the results show that it outperforms both the static and dynamic strategies for LASSO problems with small regularization parameters.
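To make the screening idea in the abstract above concrete, here is a minimal sketch of a basic static "SAFE"-style screening test for the LASSO objective 0.5·||y − Xw||² + λ||w||₁. It is not the greedy strategy proposed in the paper, whose details the abstract does not give; the function name `safe_screen` and the toy data are assumptions made for illustration only.

```python
import numpy as np

def safe_screen(X, y, lam):
    """Return a boolean mask of atoms (columns of X) to KEEP.

    Basic static SAFE-style test (illustrative, not the paper's greedy rule):
    atom j can be discarded when
        |x_j^T y| < lam - ||x_j|| * ||y|| * (lam_max - lam) / lam_max,
    where lam_max = max_j |x_j^T y|.
    """
    corr = np.abs(X.T @ y)            # correlation of each atom with y
    lam_max = corr.max()              # smallest lam giving the all-zero solution
    if lam >= lam_max:
        # The LASSO solution is identically zero, so every atom is irrelevant.
        return np.zeros(X.shape[1], dtype=bool)
    norms = np.linalg.norm(X, axis=0)
    threshold = lam - norms * np.linalg.norm(y) * (lam_max - lam) / lam_max
    return corr >= threshold

# Toy dictionary (assumed data): three orthonormal atoms.
X = np.eye(3)
y = np.array([3.0, 1.0, 0.1])

keep_large = safe_screen(X, y, lam=2.9)   # lam close to lam_max = 3
keep_small = safe_screen(X, y, lam=0.5)   # small lam
print(keep_large, keep_small)
```

On this toy input only the first atom survives at λ = 2.9, while at λ = 0.5 the threshold is negative and no atom is discarded. That is the loss of efficiency for small regularization parameters that the abstract describes.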



Complex harmonic regularization with differential evolution in a memetic framework for biomarker selection

February 2019 · 138 Reads · 8 Citations

In the study of cancer and genetic diseases, identifying highly correlated genes from high-dimensional data is an important problem. It is a great challenge to select relevant biomarkers from gene expression data that contain important correlation structures, where some of the genes can be divided into groups with a common biological function, chromosomal location, or regulation. In this paper, we propose CHR-DE, a penalized accelerated failure time model that uses a non-convex regularization (local search) with differential evolution (global search) in a wrapper-embedded memetic framework. The complex harmonic regularization (CHR) can approximate the combination of ℓp (1/2 ≤ p < 1) and ℓq (1 ≤ q < 2) for selecting biomarkers in groups. Differential evolution (DE) is used to globally optimize the CHR's hyperparameters, which gives CHR-DE a strong capability for selecting groups of genes in high-dimensional biological data. We also develop an efficient path-seeking algorithm to optimize this penalized model. The proposed method is evaluated on synthetic data and three gene expression datasets: breast cancer, hepatocellular carcinoma, and colorectal cancer. The experimental results demonstrate that CHR-DE is a more effective tool for feature selection and learning prediction.
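The global-search half of the framework described above can be illustrated with a minimal differential evolution loop (the classic DE/rand/1/bin variant). This is a generic sketch, not the authors' CHR-DE implementation: the objective below is a toy stand-in for a validation loss, and the two "hyperparameters" and their bounds are assumptions chosen only to echo the ℓp and ℓq ranges mentioned in the abstract.

```python
import numpy as np

def differential_evolution(objective, bounds, pop_size=20, F=0.8, CR=0.9,
                           n_gen=100, seed=0):
    """Minimize `objective` over a box with DE/rand/1/bin (illustrative)."""
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)
    low, high = bounds[:, 0], bounds[:, 1]
    dim = len(bounds)
    pop = rng.uniform(low, high, size=(pop_size, dim))
    fitness = np.array([objective(x) for x in pop])
    for _ in range(n_gen):
        for i in range(pop_size):
            # Pick three distinct individuals other than i.
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, size=3, replace=False)]
            # Mutation, clipped back into the search box.
            mutant = np.clip(a + F * (b - c), low, high)
            # Binomial crossover; force at least one gene from the mutant.
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True
            trial = np.where(cross, mutant, pop[i])
            # Greedy selection: keep the trial only if it is no worse.
            f_trial = objective(trial)
            if f_trial <= fitness[i]:
                pop[i], fitness[i] = trial, f_trial
    best = np.argmin(fitness)
    return pop[best], fitness[best]

# Toy stand-in for a validation loss over two hypothetical hyperparameters.
def toy_loss(theta):
    return float(np.sum((theta - np.array([0.7, 1.5])) ** 2))

best_theta, best_loss = differential_evolution(toy_loss, [(0.5, 1.0), (1.0, 2.0)])
print(best_theta, best_loss)
```

In a memetic setup of the kind the abstract describes, each call to the objective would instead fit the penalized model (the local search) for the candidate hyperparameters and return a validation score.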


A novel logistic regression model combining semi-supervised learning and active learning for disease classification

August 2018 · 572 Reads · 28 Citations

Traditional supervised learning classifiers need many labeled samples to achieve good performance; however, in many biological datasets only a small number of samples are labeled and the remaining samples are unlabeled. Labeling these unlabeled samples manually is difficult or expensive. Techniques such as active learning and semi-supervised learning have been proposed to exploit the unlabeled samples and improve model performance. However, in active learning the model suffers from being short-sighted or biased, and some manual workload is still needed, while semi-supervised learning methods are easily affected by noisy samples. In this paper we propose a novel logistic regression model based on the complementarity of active learning and semi-supervised learning, which uses the unlabeled samples at the least cost to improve disease classification accuracy. In addition, a mechanism for updating pseudo-labeled samples is designed to reduce the number of false pseudo-labels. The experimental results show that the new model achieves better performance than widely used semi-supervised learning and active learning methods in disease classification and gene selection.
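The complementarity described above can be sketched as one training round in which the least certain unlabeled samples are sent to a human annotator (active learning) and the most confident ones receive pseudo-labels (semi-supervised self-training). This is a simplified illustration under stated assumptions, not the paper's model: the helper names, the confidence threshold `conf`, the query budget `n_query`, and the `oracle` callback are all hypothetical.

```python
import numpy as np

def _with_bias(X):
    return np.hstack([X, np.ones((X.shape[0], 1))])

def fit_logreg(X, y, lr=0.1, n_iter=500):
    """Plain logistic regression fitted by batch gradient descent."""
    Xb = _with_bias(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict_proba(w, X):
    return 1.0 / (1.0 + np.exp(-_with_bias(X) @ w))

def split_unlabeled(w, X_unl, n_query=5, conf=0.95):
    """Pick samples to query (least certain) and to pseudo-label (confident)."""
    p = predict_proba(w, X_unl)
    query_idx = np.argsort(np.abs(p - 0.5))[:n_query]
    confident = np.maximum(p, 1.0 - p) >= conf
    confident[query_idx] = False          # never pseudo-label a queried sample
    pseudo_idx = np.flatnonzero(confident)
    pseudo_labels = (p[pseudo_idx] >= 0.5).astype(int)
    return query_idx, pseudo_idx, pseudo_labels

def one_round(X_lab, y_lab, X_unl, oracle, n_query=5, conf=0.95):
    """Refit on labeled + queried + pseudo-labeled samples.

    `oracle(indices)` stands in for a human annotator (hypothetical).
    Pseudo-labels are recomputed on every call rather than stored, a crude
    stand-in for a pseudo-label update step.
    """
    w = fit_logreg(X_lab, y_lab)
    q, ps, ps_y = split_unlabeled(w, X_unl, n_query, conf)
    X_aug = np.vstack([X_lab, X_unl[q], X_unl[ps]])
    y_aug = np.concatenate([y_lab, oracle(q), ps_y])
    return fit_logreg(X_aug, y_aug)
```

Calling `one_round` repeatedly, each time moving the queried samples from the unlabeled pool into the labeled set, gives a simple loop in which only `n_query` manual labels are requested per round.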

Citations (2)


... Year Applications
Prediction: [124] 2016; [125] 2018; [126] 2018; [127] 2019
Industrial control: [128] 2012; [129] 2014; [130] 2015; [131] 2015; [132] 2019; [133] 2020
Computational systems: [134] 2017; [135] 2020; [136] 2020
Electrical and power systems: [137] 2012; [138] 2017; [139] 2017; [140] 2019; [141] 2020
Feature selection: [142] 2013; [143] 2017; [144] 2017; [145] 2018; [146] 2018; [147] 2018; [148] 2020; [149] 2020; [150] 2020
Image processing: [151] 2013; [152] 2017; [153] 2018; [154] 2019; [155] 2020
Clustering: [156] 2017; [157] 2019; [158] 2019; [159] 2019; [160] 2020
Health care: [161] 2018; [162] 2019; [163] 2019
Path planning: [164] 2016; [165] 2019; [166] 2020; [167] 2020; [168] 2020; [169] 2020
Wireless and sensor: [170] 2018; [171] 2018; [172] 2018; [173] 2020
Differential equations: [174] 2007; [175] 2013; [176] 2014; [177] 2019; [178] 2020

4.6.5. Feature selection
Ghosh et al. [142] proposed a self-adaptive DE (SADE) to address the feature subset selection problem of a hyperspectral image that suffers from high computational intensiveness and redundancy issues due to the presence of large numbers of neighbouring bands. ...

Reference:

Differential evolution: A recent review based on state-of-the-art works
Complex harmonic regularization with differential evolution in a memetic framework for biomarker selection

... More specifically, the goal is to model the probability of one of the binary outcomes based on the predictor variables. In machine learning and various scientific applications, logistic regression appears in numerous settings, including online learning (Zhang et al. 2012), feature selection (Koh, Kim, and Boyd 2007), anomaly detection (Hendrycks, Mazeika, and Dietterich 2019; Feng et al. 2014), disease classification (Liao and Chin 2007; Chai et al. 2018), image & signal processing (Dong, Zhu, and Gong 2019; Rosario 2004), probability calibration (Kull et al. 2019) and many more. ...

A novel logistic regression model combining semi-supervised learning and active learning for disease classification