Fig 8 - uploaded by Zhongfei Zhang
The illustration of cross-media neural network hashing architecture (adapted from [210]).

Source publication
Article
Full-text available
Recently, multi-view representation learning has become a rapidly growing direction in the machine learning and data mining areas. This paper first reviews the root methods and theories of multi-view representation learning, especially canonical correlation analysis (CCA) and several of its extensions. We then investigate the advancement of multi-v...

Context in source publication

Context 1
... Zhuang et al. [210] propose a cross-media hashing approach based on a correspondence multi-modal neural network, referred to as Cross-Media Neural Network Hashing (CMNNH). The network structure of CMNNH can be considered a combination of two modality-specific neural networks with an additional correspondence layer, as shown in Figure 8. Denote the two neural networks corresponding to the multi-modal input {X, Y} as NN_x and NN_y, respectively. ...
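For intuition, the forward pass of such an architecture can be sketched as two small feedforward networks whose top (correspondence) layers produce codes that are binarized into hash bits, with a loss penalizing disagreement between the two code layers on paired inputs. This is a minimal NumPy illustration with hypothetical helper names (`init_net`, `forward`, `hash_code`, `correspondence_loss`), not the authors' CMNNH implementation or its training procedure:

```python
import numpy as np

def init_net(dims, rng):
    """Random weights for a feedforward net with layer sizes `dims`."""
    return [(0.1 * rng.normal(size=(a, b)), np.zeros(b))
            for a, b in zip(dims[:-1], dims[1:])]

def forward(x, weights):
    """One modality-specific network: tanh hidden layers, linear code layer."""
    h = x
    for W, b in weights[:-1]:
        h = np.tanh(h @ W + b)
    W, b = weights[-1]
    return h @ W + b  # activations of the correspondence (code) layer

def hash_code(x, weights):
    """Binarize the correspondence-layer output to obtain the hash bits."""
    return (forward(x, weights) >= 0).astype(np.int8)

def correspondence_loss(X, Y, wx, wy):
    """Mean squared disagreement between the two code layers on paired data."""
    return np.mean((forward(X, wx) - forward(Y, wy)) ** 2)
```

Training CMNNH would minimize a combination of this correspondence term and per-modality reconstruction/classification terms; the sketch only shows how the two networks meet at the shared code layer.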

Citations

... A composite image provides comprehensive information to humans and machines, supplying more detail than a single-model/view image [10]. In a similar manner to multimodal learning, multi-view learning, which can be considered an information fusion technique, has attracted attention in machine learning and data mining due to the increasing amount of multi-view data produced by applications [22] [23]. Data from different views of a modality (e.g., image) carry complementary information that single-view data lack. ...
... Multi-view learning uses this related information to boost learning performance. Study [23] investigates the progress of multi-view representation learning from shallow to deep methods. Researchers in [22] introduce two categories for multi-view representation learning, as shown in Figure 3 [22]. Generally, there are two main categories of multi-view representation learning: multi-view alignment and multi-view representation fusion. ...
Preprint
Full-text available
Patients diagnosed with irregular astigmatism require certain means of vision correction. In this regard, the use of a Rigid Gas Permeable (RGP) lens is among the most effective treatment methods. However, RGP lens base-curve detection remains a challenging issue, and current techniques fall short in detection accuracy. In this paper, a new method is defined based on multi-modal feature fusion on Pentacam images for automatic RGP lens base-curve detection using image processing and machine learning techniques. To this end, four types of features have been extracted from Pentacam images, followed by a serial feature fusion mechanism. The fusion technique provides all possible combinatory views of these feature types to a Multi-Layered Perceptron (MLP) network to determine the base-curve. The first type of feature is obtained from the middle layer after passing the RGB combination of maps through a Convolutional Autoencoder (CAE) neural network. The second set is obtained by calculating the ratio of the area of the colored areas of the front cornea map. A feature vector is derived from the Cornea Front parameters as the third modality, and the fourth feature vector is the radius of the reference sphere/ellipse of the front elevation map. Our evaluations on a manually labeled dataset show that the proposed technique provides an accurate detection rate, with a mean square error (MSE) of 0.005 and a coefficient of determination of 0.79, superior to previous methods. This can be considered an effective step towards automatic base-curve determination, minimizing manual intervention in lens fitting.
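The serial fusion step described above amounts to concatenating per-modality feature vectors, and enumerating every combination of feature types is then a simple loop over subsets. A sketch under that reading, with hypothetical helper names (`serial_fuse`, `all_fusions`), not the paper's code:

```python
from itertools import combinations
import numpy as np

def serial_fuse(feature_sets):
    """Serial feature fusion: concatenate per-modality feature matrices
    along the feature axis, yielding one combined vector per sample."""
    return np.concatenate(feature_sets, axis=1)

def all_fusions(named_features):
    """Enumerate every non-empty combination of the feature types and
    fuse each combination serially (each fused matrix could then be fed
    to an MLP, as the abstract describes)."""
    names = list(named_features)
    for r in range(1, len(names) + 1):
        for combo in combinations(names, r):
            yield combo, serial_fuse([named_features[k] for k in combo])
```

With four feature types this yields 15 candidate fused views, matching the "all possible combinatory views" idea.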
... In recent years, a series of unsupervised multi-view representation learning methods have been proposed [11][12][13][14][15][16][17]. The classical model is canonical correlation analysis (CCA) [18], the earliest classical multi-view feature learning method, which projects two views into a common space by maximizing their correlation. Many variants of CCA have been applied to improve generalization performance on several tasks [19][20][21], such as kernel CCA (KCCA) [22], sparse CCA [23], probabilistic CCA [24], and generalized CCA (GCCA) [25]. ...
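Classical two-view CCA, as summarized in this snippet, can be written compactly with NumPy: whiten each view's covariance, take the SVD of the whitened cross-covariance, and read off the canonical correlations as singular values. An illustrative sketch (the ridge term `reg` is an added numerical stabilizer, not part of the classical formulation):

```python
import numpy as np

def cca(X, Y, k=1, reg=1e-6):
    """Classical CCA: projections Wx, Wy maximizing corr(X @ Wx, Y @ Wy).

    X: (n, dx), Y: (n, dy); k = number of canonical components.
    Returns Wx, Wy and the top-k canonical correlations.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / (n - 1)
    # Whiten each view via Cholesky factors: Kx @ Cxx @ Kx.T = I.
    Kx = np.linalg.inv(np.linalg.cholesky(Cxx))
    Ky = np.linalg.inv(np.linalg.cholesky(Cyy))
    # Singular values of the whitened cross-covariance = canonical correlations.
    U, s, Vt = np.linalg.svd(Kx @ Cxy @ Ky.T)
    Wx = Kx.T @ U[:, :k]
    Wy = Ky.T @ Vt.T[:, :k]
    return Wx, Wy, s[:k]
```

KCCA, sparse CCA, and GCCA each modify a different ingredient here: the feature map, a sparsity penalty on Wx/Wy, or the number of views being correlated.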
Article
Full-text available
Multi-view data can collaborate to provide more comprehensive information than single-view data. Although a few unsupervised multi-view representation learning methods take both the discrepancies between views and their complementary information into consideration, they consistently ignore inner-view discriminant information, and it remains challenging to learn a meaningful shared representation of multiple views. To overcome this difficulty, this paper proposes a novel unsupervised multi-view representation learning model, MRL. Unlike most state-of-the-art multi-view representation learning methods, which can be used only for a clustering or classification task, our method explores the proximity-guided representation within each view and completes the tasks of multi-label classification and clustering simultaneously through the discriminative fusion representation. MRL consists of three parts. The first part is deep representation learning for each view, which aims to capture the latent view-specific discriminant characteristics; the second part builds proximity-guided dynamic routing to preserve inner features such as direction and location; the third part, GCCA-based fusion, exploits the maximum correlations among multiple views based on Generalized Canonical Correlation Analysis (GCCA). To the best of our knowledge, the proposed MRL could be one of the first unsupervised multi-view representation learning models to combine proximity-guided dynamic routing with GCCA. MRL is tested on five multi-view datasets on two different tasks. In the multi-label classification task, the results show that our model is superior to state-of-the-art multi-view learning methods in precision, recall, F1, and accuracy. In the clustering task, its performance is better than the latest related popular algorithms. An analysis of how the performance varies w.r.t. the dimensionality of G is also included to explore the characteristics of MRL.
... In parallel, this decade has witnessed a surge in the development of deep learning-based methods for data learning, ranging from autoencoders and feedforward networks to Generative Adversarial Networks (GANs) [34] [35]. However, limited research has been done to exploit deep learning in clustering multi-aspect data. ...
... Multi-modal deep learning [37] is a prominent method wherein a shared representation between input modalities is learned and used for different purposes, such as reconstructing one or both modalities or performing classification on test data. Another group of DNN-based methods uses canonical correlation analysis (CCA) to constrain data learned from different views while using DNNs [34]. DCCA [38] is the most notable method exploiting CCA in DNNs for learning two-view data. ...
Preprint
Full-text available
Clustering on the data with multiple aspects, such as multi-view or multi-type relational data, has become popular in recent years due to their wide applicability. The approach using manifold learning with the Non-negative Matrix Factorization (NMF) framework, that learns the accurate low-rank representation of the multi-dimensional data, has shown effectiveness. We propose to include the inter-manifold in the NMF framework, utilizing the distance information of data points of different data types (or views) to learn the diverse manifold for data clustering. Empirical analysis reveals that the proposed method can find partial representations of various interrelated types and select useful features during clustering. Results on several datasets demonstrate that the proposed method outperforms the state-of-the-art multi-aspect data clustering methods in both accuracy and efficiency.
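As background for the NMF framework the preprint builds on, plain NMF with the standard Lee-Seung multiplicative updates looks as follows. This is a generic sketch of unregularized NMF, not the proposed manifold-regularized method:

```python
import numpy as np

def nmf(V, r, iters=500, eps=1e-9, seed=0):
    """Plain NMF via Lee-Seung multiplicative updates: V ≈ W @ H.

    V: (m, n) nonnegative matrix; r: target rank.
    The updates keep W and H nonnegative and monotonically decrease
    the Frobenius reconstruction error.
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, r)) + eps
    H = rng.random((r, n)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

Manifold-regularized variants, such as the one proposed here, add graph Laplacian terms to the objective, which changes these update rules accordingly.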
... Multiview learning has a long history [4], and many literature reviews have been produced on this topic, including the following: Li and colleagues [83] focus on multiview representation learning methods; Zhao and colleagues [84], Sun [85,86], and Sun and colleagues [87] focus on some theoretical aspects (that is, generalization bounds) of some older paradigms of multiview learning (e.g., co-training); one of the first reviews discussing extensively the consensus and complementary principles of multiview learning is by Xu and colleagues [16]; Chao and colleagues [88] categorize multiview clustering methods into generative and discriminative methods; and Baltrušaitis and colleagues [89] conducted a comprehensive survey that categorizes multiview learning methods by 5 technical challenges: representation, translation, alignment, fusion, and co-learning. Most methods surveyed by Baltrušaitis and colleagues [89] are general or specialized for multimedia applications. ...
Article
Full-text available
The molecular mechanisms and functions in complex biological systems currently remain elusive. Recent high-throughput techniques, such as next-generation sequencing, have generated a wide variety of multiomics datasets that enable the identification of biological functions and mechanisms via multiple facets. However, integrating these large-scale multiomics data and discovering functional insights are, nevertheless, challenging tasks. To address these challenges, machine learning has been broadly applied to analyze multiomics. This review introduces multiview learning—an emerging machine learning field—and envisions its potentially powerful applications to multiomics. In particular, multiview learning is more effective than previous integrative methods for learning data’s heterogeneity and revealing cross-talk patterns. Although it has been applied to various contexts, such as computer vision and speech recognition, multiview learning has not yet been widely applied to biological data—specifically, multiomics data. Therefore, this paper firstly reviews recent multiview learning methods and unifies them in a framework called multiview empirical risk minimization (MV-ERM). We further discuss the potential applications of each method to multiomics, including genomics, transcriptomics, and epigenomics, in an aim to discover the functional and mechanistic interpretations across omics. Secondly, we explore possible applications to different biological systems, including human diseases (e.g., brain disorders and cancers), plants, and single-cell analysis, and discuss both the benefits and caveats of using multiview learning to discover the molecular mechanisms and functions of these systems.
... Multi-view learning has aroused a great deal of interest in the past decades (Quang et al. 2013; Sun and Chao 2013; Li et al. 2016; Ye et al. 2015; Liu et al. 2016; Chen and Zhou 2018). Nowadays, there are many multi-view learning approaches, e.g., multiple kernel learning (Gönen and Alpaydın 2011), disagreement-based multi-view learning (Blum and Mitchell 1998), late fusion methods which combine the outputs of models constructed from different view features (Ye et al. 2012), and subspace learning methods for multi-view data (Chen et al. 2012). ...
Article
Full-text available
Multi-view data have become increasingly popular in many real-world applications where data are generated from different information channels or different views, such as image + text, audio + video, and webpage + link data. Recent decades have witnessed a number of studies devoted to multi-view learning algorithms, especially predictive latent subspace learning approaches, which aim at obtaining a subspace shared by multiple views and then learning models in that shared subspace. However, few efforts have been made to handle online multi-view learning scenarios. In this paper, we propose an online Bayesian multi-view learning algorithm which learns a predictive subspace with the max-margin principle. Specifically, we first define the latent margin loss for classification or regression in the subspace, and then cast the learning problem into a variational Bayesian framework by exploiting the pseudo-likelihood and data augmentation ideas. With the variational approximate posterior inferred from past samples, we can naturally combine historical knowledge with newly arriving data, in a Bayesian passive-aggressive style. Finally, we extensively evaluate our model on several real-world data sets, and the experimental results show that our models can achieve superior performance compared with a number of state-of-the-art competitors.
... ManiNetCluster realizes a general multi-view learning approach by implementing manifold alignment/warping to combine multiple views into a common latent subspace for further analysis, i.e., clustering. Previous studies have emphasized the importance of multiview learning for heterogeneous biological data [54] or discussed different methods realizing multiview learning [52,53], but, to the best of our knowledge, very few of them [55,56] regarded manifold alignment as such a method. In our approach, manifold alignment is considered a natural and effective method for multiview representation learning. ...
Article
Full-text available
Background: The coordination of genomic functions is a critical and complex process across biological systems such as phenotypes or states (e.g., time, disease, organism, environmental perturbation). Understanding how the complexity of genomic function relates to these states remains a challenge. To address this, we have developed a novel computational method, ManiNetCluster, which simultaneously aligns and clusters gene networks (e.g., co-expression) to systematically reveal the links of genomic function between different conditions. Specifically, ManiNetCluster employs manifold learning to uncover and match local and non-linear structures among networks, and identifies cross-network functional links. Results: We demonstrated that ManiNetCluster better aligns the orthologous genes from their developmental expression profiles across model organisms than state-of-the-art methods (p-value < 2.2×10^-16). This indicates potential non-linear interactions of evolutionarily conserved genes across species in development. Furthermore, we applied ManiNetCluster to time series transcriptome data measured in the green alga Chlamydomonas reinhardtii to discover the genomic functions linking various metabolic processes between the light and dark periods of a diurnally cycling culture. We identified a number of genes putatively regulating processes across each lighting regime. Conclusions: ManiNetCluster provides a novel computational tool to uncover the genes linking various functions from different networks, offering new insight into how gene functions coordinate across different conditions. ManiNetCluster is publicly available as an R package at https://github.com/daifengwanglab/ManiNetCluster.
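A minimal flavor of aligning two views into a common space, in the spirit of manifold alignment but far simpler, is orthogonal Procrustes: given paired embeddings of the same items from two views, find the rotation that best maps one onto the other. A linear stand-in for illustration only, not the ManiNetCluster algorithm:

```python
import numpy as np

def procrustes_align(X, Y):
    """Orthogonal Procrustes: rotation R minimizing ||X @ R - Y||_F.

    X, Y: (n, d) paired embeddings of the same n items from two views.
    The optimal R is U @ Vt, where U, Vt come from the SVD of X.T @ Y.
    """
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt
```

Full manifold alignment replaces this global rotation with non-linear maps that also preserve each view's local neighborhood structure, which is what lets it match non-linear structures among networks.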
... Multi-view learning is a popular machine learning paradigm which aims at handling instances with multiple feature views [39]. Detailed surveys of this framework can be found in [29], [40]. A large portion of the multi-view learning literature focuses on supervised (or semi-supervised) learning tasks, such as recommendation [41] and classification [42], [43]. ...
Article
Full-text available
Link prediction is a demanding task in real-world scenarios, such as recommender systems, which targets to predict the unobservable links between different objects by learning network-structured data. In this paper, we propose a novel multi-view graph convolutional neural network (MV-GCN) model to solve this problem based on Matrix Completion method by simultaneously exploiting the interactive relationship and the content information of different objects. Unlike existing approaches directly concatenate the interactive and content information as a single view, the proposed MV-GCN improves the accuracy of the predictions by restricting the consistencies on the graph embedding from multiple views. Experimental results on six primary benchmark datasets, including two homogeneous datasets and four heterogeneous datasets, both show that MV-GCN outperforms the recent state-of-the-art methods.
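Graph convolutional models such as MV-GCN build on a standard propagation layer: add self-loops, symmetrically normalize the adjacency matrix, then apply a linear map and a nonlinearity. A generic single-layer sketch in NumPy (the common GCN layer form, not the MV-GCN code):

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution layer: ReLU(D^-1/2 (A + I) D^-1/2 @ H @ W).

    A: (n, n) adjacency matrix; H: (n, d) node features; W: (d, k) weights.
    """
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)                      # degrees of A_hat
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))     # symmetric normalization
    return np.maximum(0, D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W)
```

A multi-view model stacks such layers per view (e.g., interaction graph vs. content graph) and constrains the resulting embeddings to be consistent across views.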
... Previously, DR-GAN [15] attempted to solve this problem by providing a pose code along with the image data during training. Li et al. [12] approached this challenge by using Canonical Correlation Analysis to compare the differences between the subspaces of various poses. Tian et al. [14] tried to solve this problem with a dual-pathway architecture. ...
Conference Paper
Full-text available
Generating a multi-faceted view from a single image has been a challenging problem for decades. Recent developments in technology enable us to tackle it effectively. Previously, several Generative Adversarial Network (GAN) based models have been applied to this problem as a linear GAN: a linear framework with a generator (generally an encoder-decoder) followed by a discriminator. Such structures helped to some extent but are not powerful enough to tackle this problem effectively. In this paper, we propose a GAN-based dual-architecture model called DUO-GAN. In the proposed model, we add a second pathway to the linear GAN framework with the aim of better learning the embedding space. We propose two learning paths which compete with each other in a parameter-sharing manner. Furthermore, the proposed two-pathway framework primarily trains multiple sub-models, which combine to give realistic results. The experimental results of DUO-GAN outperform state-of-the-art models in the field.
... Utilizing all of these images in the process of abnormal-tissue modeling potentially leads to better segmentation results. Extracting information from multiple image modalities is referred to as multi-view representation learning, which is strongly related to the topic of multi-view learning [5]. Multi-view representation learning is also effective when the views are artificially generated, derived from real views that are themselves the result of measurements performed by sensors. ...
Article
Full-text available
Automated segmentation of abnormal tissues in medical images assists both physicians and medical researchers in disease diagnosis and research activities, respectively. Intelligent techniques of automated segmentation are gaining popularity over non-intelligent ones. In these techniques, quality representation of pixels/voxels, by considering the multiple natural and artificial views that exist in medical images, increases segmentation accuracy. The proposed method for segmentation of abnormal tissues in medical images is based on multi-view representation, with six phases: pre-processing, view generation, representation generation, classification, post-processing, and evaluation. In the representation phase, raw data of medical images are represented based on the modes of variation or clusters existing in the original multi-view feature space. Quantitative results of the experiment demonstrate that representations generated via the proposed method are effective, especially when the Random Forest classifier is employed. A DSC of 0.72 for a subject shows that the results are promising. This study shows that cluster-based representation of raw pixels/voxels from multiple views is effective in supervised segmentation of abnormal tissues.