June 2022 · 8 Reads · 20 Citations
February 2022 · 119 Reads · 1 Citation
Transfer learning and meta-learning offer some of the most promising avenues to unlock the scalability of healthcare and consumer technologies driven by biosignal data. This is because current methods cannot generalise well across human subjects' data, nor handle learning from heterogeneously collected data sets, thus limiting the scale of training data. On the other hand, developments in transfer learning would benefit significantly from a real-world benchmark with immediate practical application. Therefore, we pick electroencephalography (EEG) as an exemplar of what makes biosignal machine learning hard. We design two transfer learning challenges around diagnostics and Brain-Computer Interfacing (BCI) that must be solved in the face of low signal-to-noise ratios, major variability among subjects, differences in the data recording sessions and techniques, and even differences between the specific BCI tasks recorded in the dataset. Task 1 is centred on medical diagnostics, addressing automatic sleep stage annotation across subjects. Task 2 is centred on BCI, addressing motor imagery decoding across both subjects and data sets. The BEETL competition, with over 30 competing teams and 3 winning entries, brought attention to the potential of deep transfer learning and of combinations of set theory and conventional machine learning techniques to overcome these challenges. The results set a new state of the art for the real-world BEETL benchmark.
February 2022 · 36 Reads
Building subject-independent deep learning models for EEG decoding faces the challenge of strong covariate shift across datasets, subjects, and recording sessions. Our approach to this difficulty is to explicitly align feature distributions at various layers of the deep learning model, using both simple statistical techniques and trainable methods with more representational capacity. This follows a similar vein to covariance-based alignment methods, often used in a Riemannian-manifold context. The methodology proposed herein won first place in the 2021 Benchmarks in EEG Transfer Learning (BEETL) competition, hosted at the NeurIPS conference. The first task of the competition consisted of sleep stage classification, which required models trained on younger subjects to perform inference on multiple subjects of older age groups without personalized calibration data, i.e. subject-independent models. The second task required transferring models trained on the subjects of one or more source motor imagery datasets to perform inference on two target datasets, given a small set of personalized calibration data for multiple test subjects.
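As an illustration of the simple statistical end of this spectrum, the sketch below standardizes features per subject, a basic form of distribution alignment. This is an illustrative sketch under our own assumptions, not the competition code; the function name and shapes are ours.

```python
import numpy as np

def align_features(feats):
    """Standardize each feature dimension to zero mean and unit
    variance, so feature distributions from different subjects or
    sessions overlap before further (trainable) alignment."""
    mu = feats.mean(axis=0, keepdims=True)
    sigma = feats.std(axis=0, keepdims=True) + 1e-8
    return (feats - mu) / sigma

# Two "subjects" whose raw features differ in offset and scale.
rng = np.random.default_rng(0)
subj_a = rng.normal(loc=5.0, scale=3.0, size=(200, 8))
subj_b = rng.normal(loc=-2.0, scale=0.5, size=(200, 8))
a_aligned = align_features(subj_a)
b_aligned = align_features(subj_b)
# After alignment, both share (approximately) zero mean and unit variance.
```

In the trainable variants described in the abstract, the fixed mean/variance statistics would be replaced by learned alignment layers inside the network.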
December 2021 · 229 Reads
Automatic road graph extraction from aerial and satellite images is a long-standing challenge. Existing algorithms are either based on pixel-level segmentation followed by vectorization, or on iterative graph construction using next-move prediction. Both strategies suffer from severe drawbacks, in particular high computing-resource requirements and incomplete outputs. By contrast, we propose a method that directly infers the final road graph in a single pass. The key idea is to combine a Fully Convolutional Network in charge of locating points of interest, such as intersections, dead ends, and turns, with a Graph Neural Network which predicts links between these points. Such a strategy is more efficient than iterative methods and allows us to streamline the training process by removing the need to generate starting locations, while keeping the training end-to-end. We evaluate our method against existing works on the popular RoadTracer dataset and achieve competitive results. We also benchmark the speed of our method and show that it outperforms existing approaches. This opens the possibility of in-flight processing on embedded devices.
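To make the link-prediction stage concrete, the sketch below turns detected points of interest into a graph by thresholding a pairwise link score. The toy distance-based scorer merely stands in for the trained Graph Neural Network of the paper; all names and values are illustrative assumptions.

```python
import numpy as np

def predict_links(points, score_fn, threshold=0.5):
    """Given keypoints (N, 2) from the localisation network, score
    every candidate pair with `score_fn` and keep edges whose
    predicted link probability exceeds `threshold`."""
    n = len(points)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            if score_fn(points[i], points[j]) > threshold:
                edges.append((i, j))
    return edges

def toy_score(p, q, scale=2.0):
    """Stand-in scorer favouring nearby points; the real model
    predicts link probabilities from learned features."""
    return float(np.exp(-np.linalg.norm(p - q) / scale))

pts = np.array([[0.0, 0.0], [1.0, 0.0], [10.0, 10.0]])
edges = predict_links(pts, toy_score)   # only the two nearby points are linked
```

A real implementation would score pairs in a single batched pass rather than a Python loop, but the structure, point detection followed by pairwise link classification, is the same.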
September 2021 · 203 Reads · 23 Citations
International Journal of Computer Vision
Standard registration algorithms need to be independently applied to each surface to register, following careful pre-processing and hand-tuning. Recently, learning-based approaches have emerged that reduce the registration of new scans to running inference with a previously-trained model. The potential benefits are multifold: inference is typically orders of magnitude faster than solving a new instance of a difficult optimization problem, deep learning models can be made robust to noise and corruption, and the trained model may be re-used for other tasks, e.g. through transfer learning. In this paper, we cast the registration task as a surface-to-surface translation problem, and design a model to reliably capture the latent geometric information directly from raw 3D face scans. We introduce Shape-My-Face (SMF), a powerful encoder-decoder architecture based on an improved point cloud encoder, a novel visual attention mechanism, graph convolutional decoders with skip connections, and a specialized mouth model that we smoothly integrate with the mesh convolutions. Compared to the previous state-of-the-art learning algorithms for non-rigid registration of face scans, SMF only requires the raw data to be rigidly aligned (with scaling) with a pre-defined face template. Additionally, our model provides topologically-sound meshes with minimal supervision, offers faster training time, has orders of magnitude fewer trainable parameters, is more robust to noise, and can generalize to previously unseen datasets. We extensively evaluate the quality of our registrations on diverse data. We demonstrate the robustness and generalizability of our model with in-the-wild face scans across different modalities, sensor types, and resolutions. Finally, we show that, by learning to register scans, SMF produces a hybrid linear and non-linear morphable model. Manipulation of the latent space of SMF allows for shape generation and morphing applications such as expression transfer in-the-wild. We train SMF on a dataset of human faces comprising 9 large-scale databases on commodity hardware.
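Because the learned model behaves as a hybrid linear and non-linear morphable model, morphing reduces to interpolation in latent space. The sketch below illustrates the idea with a toy linear decoder standing in for SMF's graph-convolutional decoder; every name and dimension here is an illustrative assumption, not the paper's architecture.

```python
import numpy as np

def morph(z_a, z_b, t):
    """Linear interpolation in latent space: t=0 gives shape A,
    t=1 gives shape B, intermediate t blends the two."""
    return (1.0 - t) * z_a + t * z_b

rng = np.random.default_rng(1)
decoder = rng.normal(size=(16, 5 * 3))   # latent dim 16 -> 5 vertices (x, y, z)

def decode(z):
    """Toy linear decoder mapping a latent code to mesh vertices."""
    return (z @ decoder).reshape(5, 3)

z_a, z_b = rng.normal(size=16), rng.normal(size=16)
mesh_mid = decode(morph(z_a, z_b, 0.5))
# With a linear decoder, the latent midpoint decodes to the
# vertex-wise average of the two end meshes.
```

Expression transfer follows the same pattern: replace part of one scan's latent code (or add a latent offset) before decoding.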
June 2021 · 26 Reads · 48 Citations
December 2020 · 168 Reads
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data. As they generalize the operations of classical CNNs on grids to arbitrary topologies, GNNs also inherit many of the implementation challenges of their Euclidean counterparts. Model size, memory footprint, and energy consumption are common concerns for many real-world applications. Network binarization allocates a single bit to network parameters and activations, thus dramatically reducing memory requirements (up to 32x compared to single-precision floating-point parameters) and maximizing the benefits of fast SIMD instructions on modern hardware for measurable speedups. However, in spite of the large body of work on binarization for classical CNNs, this area remains largely unexplored in geometric deep learning. In this paper, we present and evaluate different strategies for the binarization of graph neural networks. We show that, through careful design of the models and control of the training process, binary graph neural networks can be trained at only a moderate cost in accuracy on challenging benchmarks. In particular, we present the first dynamic graph neural network in Hamming space, able to leverage efficient k-NN search on binary vectors to speed up the construction of the dynamic graph. We further verify that the binary models offer significant savings on embedded devices.
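The Hamming-space k-NN search mentioned above is cheap because the distance between binary vectors is just the number of differing bits. A minimal sketch (illustrative, not the paper's implementation) with codes stored as 0/1 arrays:

```python
import numpy as np

def hamming_knn(query, codes, k):
    """Indices of the k codes closest to `query` in Hamming distance.
    Codes are 0/1 uint8 arrays: XOR marks the differing bits and the
    row sum counts them. Bit-packed codes would use XOR + popcount,
    which maps directly onto fast SIMD instructions."""
    dists = np.bitwise_xor(codes, query).sum(axis=1)
    return np.argsort(dists, kind="stable")[:k]

codes = np.array([[0, 0, 0, 0],
                  [1, 1, 0, 0],
                  [1, 1, 1, 1]], dtype=np.uint8)
query = np.array([1, 1, 0, 1], dtype=np.uint8)
nearest = hamming_knn(query, codes, 2)   # codes 1 and 2, at distance 1 each
```

In a dynamic graph network, this search would be re-run on the binary embeddings at each layer to rebuild the k-NN graph.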
December 2020 · 177 Reads
Existing surface registration methods focus on fitting in-sample data with little to no generalization ability and require both heavy pre-processing and careful hand-tuning. In this paper, we cast the registration task as a surface-to-surface translation problem, and design a model to reliably capture the latent geometric information directly from raw 3D face scans. We introduce Shape-My-Face (SMF), a powerful encoder-decoder architecture based on an improved point cloud encoder, a novel visual attention mechanism, graph convolutional decoders with skip connections, and a specialized mouth model that we smoothly integrate with the mesh convolutions. Compared to the previous state-of-the-art learning algorithms for non-rigid registration of face scans, SMF only requires the raw data to be rigidly aligned (with scaling) with a pre-defined face template. Additionally, our model provides topologically-sound meshes with minimal supervision, offers faster training time, has orders of magnitude fewer trainable parameters, is more robust to noise, and can generalize to previously unseen datasets. We extensively evaluate the quality of our registrations on diverse data. We demonstrate the robustness and generalizability of our model with in-the-wild face scans across different modalities, sensor types, and resolutions. Finally, we show that, by learning to register scans, SMF produces a hybrid linear and non-linear morphable model that can be used for generation, shape morphing, and expression transfer through manipulation of the latent space, including in-the-wild. We train SMF on a dataset of human faces comprising 9 large-scale databases on commodity hardware.
June 2020 · 51 Reads · 27 Citations
April 2020 · 54 Reads
Graph convolution operators bring the advantages of deep learning to a variety of graph and mesh processing tasks previously deemed out of reach. With their continued success comes the desire to design more powerful architectures, often by adapting existing deep learning techniques to non-Euclidean data. In this paper, we argue geometry should remain the primary driving force behind innovation in the emerging field of geometric deep learning. We relate graph neural networks to widely successful computer graphics and data approximation models: radial basis functions (RBFs). We conjecture that, like RBFs, graph convolution layers would benefit from the addition of simple functions to the powerful convolution kernels. We introduce affine skip connections, a novel building block formed by combining a fully connected layer with any graph convolution operator. We experimentally demonstrate the effectiveness of our technique and show the improved performance is the consequence of more than the increased number of parameters. Operators equipped with the affine skip connection markedly outperform their base performance on every task we evaluated, i.e., shape reconstruction, dense shape correspondence, and graph classification. We hope our simple and effective approach will serve as a solid baseline and help ease future research in graph neural networks.
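To make the construction concrete: an affine skip connection adds a vertex-wise affine map of the input features to the output of a graph convolution operator. A minimal numpy sketch, with a mean-aggregation graph convolution standing in for an arbitrary operator (names and shapes are our own, illustrative choices):

```python
import numpy as np

def graph_conv(A, X, W):
    """Toy graph convolution: mean-aggregate neighbour features,
    then apply a linear projection."""
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1.0)
    return (A @ X / deg) @ W

def affine_skip_conv(A, X, W_conv, W_skip, b_skip):
    """Affine skip connection: add a vertex-wise fully connected
    (affine) map of the input to the graph convolution output."""
    return graph_conv(A, X, W_conv) + X @ W_skip + b_skip

rng = np.random.default_rng(0)
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)   # path graph on 3 vertices
X = rng.normal(size=(3, 4))              # 4 input features per vertex
W_conv = rng.normal(size=(4, 2))
W_skip = rng.normal(size=(4, 2))
out = affine_skip_conv(A, X, W_conv, W_skip, np.zeros(2))
```

As in the RBF analogy of the paper, the affine term supplies a simple function alongside the more expressive convolution kernel, and the same wrapper applies to any graph convolution operator.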
... Máttyus et al. [24] directly infer the road graph from aerial images using deep learning for initial segmentation and an algorithm to reason about missing connections as a shortest path problem. Other methods use graph neural networks [1], [41] or encoding-dependent heuristics [19], [34] to infer the road graph from the features computed by an encoder network. Xu et al. propose various methods for image-based map feature extraction tasks, which aim to infer topologically structured graphs, including road curb detection through imitation learning [39], road network graph generation via Transformers and imitation learning [36], [38], and cityscale road boundary annotation with continuous graph inference [35]. ...
June 2022
... Notably, the results of the CNN pipelines show a more pronounced bimodal distribution compared to the Riemannian pipelines. The observation that deep convolutional networks may be particularly useful in transfer learning settings is in line with the results of a recent BCI decoding competition that was also won by a CNN approach [37]. However, we remark that the methods benchmarked here were not explicitly designed for cross-subject decoding. ...
February 2022
... Bahri et al. [92] propose a binary quantized GNN that can operate on energy-constrained devices. The authors highlight that although the use of non-Euclidean data makes GNNs challenging in several aspects compared to CNN models, small network sizes can be achieved through a controlled training process and model design. ...
June 2021
... Several deep learning models [?, 3,4,9,25] proposed ways to learn a latent representation of face scans using robust encoders such as PointNet [5,32] and Transformers but the resulting mesh is registered, which tends to smooth out some details from the scan geometry. However, this extra registration step may be handled efficiently with recent industrial applications such as MetaHuman from Epic Games. ...
September 2021
International Journal of Computer Vision
... Li et al. [15] use the 56-layer graph convolutional network they constructed to semantically segment point cloud data, and achieve better performance than the shallow network. Larger graphs and meshes also need deep GCNs to capture long-range dependencies between nodes [16], [17]. But there are still two problems in training deep GCNs. ...
June 2020
... Generally, there are two common and fundamental definitions: CP-rank and Tucker-rank. Several robust CP-/Tucker-based approaches have been introduced to recover low-rank tensors [6][7][8]. Chen et al. [6] described a generalized weighted low-rank tensor factorization (GWLRTF) and extended its applicability by integrating it with CP and Tucker factorization. Bahri et al. [7] proposed a robust Kronecker component analysis (RKCA), which combined RPCA with ideas from sparse dictionary learning. ...
January 2018
IEEE Transactions on Pattern Analysis and Machine Intelligence
... However, these methodologies do not account for dependence within tensor observations, the sampling distribution of their estimated coefficients and the natural connection that exists between ToTR and the related analysis of variance (ANOVA). Here, we propose a general ToTR framework that renders four low-rank tensor formats on the coefficient B: CP, Tucker (TK) [26]–[28], TR and the OP, while simultaneously allowing the errors to follow a tensor-variate normal (TVN) distribution [29]–[32] that posits a Kronecker structure on the Σ of (1). Assuming TVN-distributed errors allows us to obtain the sampling distributions of the estimated coefficients under their assumed low-rank format. ...
October 2017
... To address this issue, several robust tensor decomposition approaches have been developed to remove the influence of outliers. These techniques decompose a tensor into a summation of three tensors: a smooth low-rank tensor, a sparse tensor, and an error tensor (Anandkumar et al., 2016; Gu et al., 2014; Xue et al., 2017), among which the sparse tensor contains outliers. To estimate these tensors, Anandkumar et al. (2016) propose a non-convex iterative algorithm that alternates between low-rank CP decomposition through gradient ascent and hard thresholding of residuals. ...
August 2017