Dimitris N. Metaxas

Rutgers, The State University of New Jersey | Rutgers · Department of Computer Science

PhD


816
Publications
144,670
Reads
31,316
Citations
Additional affiliations
August 2001 - present
Rutgers, The State University of New Jersey
Position
  • Professor (Full)
Description
  • Medical Image Analysis, Computer Vision, Graphics, Robotics, AI, dynamic data analytics
September 1992 - July 2001
University of Pennsylvania
Position
  • Professor (Associate)
Education
September 1988 - August 1992
University of Toronto
Field of study
  • Computer vision, Graphics and Medical Image Analysis
September 1986 - June 1988
University of Maryland, College Park
Field of study
  • Computer Science
September 1981 - July 1986
National Technical University of Athens
Field of study
  • Electrical Engineering

Publications (816)
Article
Examination of pathological images is the gold standard for diagnosing and screening many kinds of cancers. Multiple datasets, benchmarks, and challenges have been released in recent years, resulting in significant improvements in computer-aided diagnosis (CAD) of related diseases. However, few existing works focus on the digestive system. We rel...
Preprint
Full-text available
Scene-understanding is an important topic in the area of Computer Vision, and illustrates computational challenges with applications to a wide range of domains including remote sensing, surveillance, smart agriculture, robotics, autonomous driving, and smart cities. We consider the active explanation-driven understanding and classification of scene...
Preprint
Full-text available
Joint 2D cardiac segmentation and 3D volume reconstruction are fundamental to building statistical cardiac anatomy models and understanding functional mechanisms from motion patterns. However, due to the low through-plane resolution of cine MR and high inter-subject variance, accurately segmenting cardiac images and reconstructing the 3D volume are...
Preprint
Neural architecture search (NAS) algorithms save tremendous labor from human experts. Recent advancements further reduce the computational overhead to an affordable level. However, it is still cumbersome to deploy the NAS techniques in real-world applications due to the fussy procedures and the supervised learning paradigm. In this work, we propose...
Article
Full-text available
Although Convolutional Neural Networks (CNNs) have achieved promising performance in many medical image segmentation tasks, they rely on a large set of labeled images for training, which is expensive and time-consuming to acquire. Semi-supervised learning has shown the potential to alleviate this challenge by learning from a large set of unlabe...
Preprint
Full-text available
Combining information from multi-view images is crucial to improve the performance and robustness of automated methods for disease diagnosis. However, due to the non-alignment characteristics of multi-view images, building correlation and data fusion across views largely remain an open problem. In this study, we present TransFusion, a Transformer-b...
Preprint
Most methods for conditional video synthesis use a single modality as the condition. This comes with major limitations. For example, it is problematic for a model conditioned on an image to generate a specific motion trajectory desired by the user since there is no means to provide motion information. Conversely, language information can describe t...
Preprint
Full-text available
Transformers have emerged to be successful in a number of natural language processing and vision tasks, but their potential applications to medical imaging remain largely unexplored due to the unique difficulties of this field. In this study, we present UTNetV2, a simple yet powerful backbone model that combines the strengths of the convolutional n...
Preprint
Full-text available
Multi-modality images have been widely used and provide comprehensive information for medical image analysis. However, acquiring all modalities among all institutes is costly and often impossible in clinical settings. To leverage more comprehensive multi-modality information, we propose a privacy secured decentralized multi-modality adaptive learni...
Preprint
Full-text available
Medical image segmentation has been widely recognized as a pivotal procedure for clinical diagnosis, analysis, and treatment planning. However, the laborious and expensive annotation process slows further advances. Contrastive learning-based weight pre-training provides an alternative by leveraging unlabeled data to learn a good repr...
Article
Full-text available
Signet ring cell carcinoma (SRCC) is a malignant tumor of the digestive system. This tumor has long been considered to be poorly differentiated and highly invasive because it has a higher rate of metastasis than well-differentiated adenocarcinoma. But some studies in recent years have shown that the prognosis of some SRCC is more favorable than oth...
Article
Full-text available
Automatic and accurate lung nodule detection from 3D Computed Tomography (CT) scans plays a vital role in efficient lung cancer screening. Despite the state-of-the-art performance obtained by recent anchor-based detectors using Convolutional Neural Networks (CNNs) for this task, they require predetermined anchor parameters such as the size, number,...
Chapter
Cardiac magnetic resonance (CMR) imaging is the most accurate imaging modality for cardiac function analysis. However, respiration misalignment can negatively impact the accuracy of the cardiac wall 3D segmentation and the assessment of cardiac function. A learning-based misalignment correction method is needed in order to build an end-to-end accur...
Chapter
Efficient and accurate segmentation of the heart is important for analysis of cardiac magnetic resonance imaging (MRI). Although many convolutional neural networks (CNNs) have been proposed to address cardiac segmentation in cine MRI, the task is still an open and challenging problem due to highly complex and variable cardiac shape in various patho...
Preprint
Full-text available
Statistically and information-wise adequate data plays a critical role in training a robust deep learning model. However, collecting sufficient medical data to train a centralized model is still challenging due to various constraints such as privacy regulations and security. In this work, we develop a novel privacy-preserving federated-discriminato...
Article
Computed Tomography (CT) plays an important role in monitoring radiation-induced Pulmonary Fibrosis (PF), where accurate segmentation of the PF lesions is highly desired for diagnosis and treatment follow-up. However, the task is challenged by ambiguous boundary, irregular shape, various position and size of the lesions, as well as the difficulty i...
Chapter
Weak supervision learning on classification labels has demonstrated high performance in various tasks, while a few pixel-level fine annotations are also affordable. A natural question is whether combining pixel-level (e.g., segmentation) and image-level (e.g., classification) annotations can bring further improvement. Ho...
Chapter
Transformer architecture has emerged to be successful in a number of natural language processing tasks. However, its applications to medical vision remain largely unexplored. In this study, we present UTNet, a simple yet powerful hybrid Transformer architecture that integrates self-attention into a convolutional neural network for enhancing medical...
Article
Full-text available
Breast carcinoma, the most common cancer among women worldwide, comprises a heterogeneous group of subtypes. Whole-slide images (WSIs) can capture the cell-level heterogeneity and are routinely used for cancer diagnosis by pathologists. However, key driver genetic mutations related to targeted therapies are identified by genomi...
Chapter
The goal of every contemporary recognition approach is to learn robust and unambiguous object representations in feature space. These learned powerful disentangled representations make it possible to build effective classifiers and are an active research topic in many fields such as face analytics.
Preprint
Full-text available
Weak supervision learning on classification labels has demonstrated high performance in various tasks. When a few pixel-level fine annotations are also affordable, it is natural to leverage both of the pixel-level (e.g., segmentation) and image level (e.g., classification) annotation to further improve the performance. In computational pathology, h...
Preprint
Full-text available
Transformer architecture has emerged to be successful in a number of natural language processing tasks. However, its applications to medical vision remain largely unexplored. In this study, we present UTNet, a simple yet powerful hybrid Transformer architecture that integrates self-attention into a convolutional neural network for enhancing medical...
Preprint
Full-text available
Signet ring cell carcinoma (SRCC) is a malignant tumor of the digestive system. This tumor has long been considered to be poorly differentiated and highly invasive because it has a higher rate of metastasis than well-differentiated adenocarcinoma. But some studies in recent years have shown that the prognosis of some SRCC is more favorable than othe...
Preprint
Full-text available
Instance segmentation is of great importance for many biological applications, such as study of neural cell interactions, plant phenotyping, and quantitatively measuring how cells react to drug treatment. In this paper, we propose a novel box-based instance segmentation method. Box-based instance segmentation methods capture objects via bounding bo...
Chapter
Data augmentation has proved extremely useful by increasing training data variance to alleviate overfitting and improve deep neural networks’ generalization performance. In medical image analysis, a well-designed augmentation policy usually requires much expert knowledge and is difficult to generalize to multiple tasks due to the vast discrepancies...
Preprint
Full-text available
Attention mechanisms have been widely applied to cross-modal tasks such as image captioning and information retrieval, and have achieved remarkable improvements due to their capability to learn fine-grained relevance across different modalities. However, existing attention models could be sub-optimal and lack precision because there is no direct su...
Article
Instance segmentation is of great importance for many biological applications, such as study of neural cell interactions, plant phenotyping, and quantitatively measuring how cells react to drug treatment. In this paper, we propose a novel box-based instance segmentation method. Box-based instance segmentation methods capture objects via bounding bo...
Preprint
Image and video synthesis are closely related areas aiming at generating content from noise. While rapid progress has been demonstrated in improving image-based models to handle large resolutions, high-quality renderings, and wide variations in image content, achieving comparable video generation results remains problematic. We present a framework...
Preprint
Full-text available
Automatic and accurate lung nodule detection from 3D Computed Tomography scans plays a vital role in efficient lung cancer screening. Despite the state-of-the-art performance obtained by recent anchor-based detectors using Convolutional Neural Networks, they require predetermined anchor parameters such as the size, number, and aspect ratio of ancho...
Preprint
Full-text available
Data augmentation has proved extremely useful by increasing training data variance to alleviate overfitting and improve deep neural networks' generalization performance. In medical image analysis, a well-designed augmentation policy usually requires much expert knowledge and is difficult to generalize to multiple tasks due to the vast discrepancies...
Preprint
Normalization techniques are crucial in stabilizing and accelerating the training of deep neural networks. However, they are mainly designed for the independent and identically distributed (IID) data, not satisfying many real-world out-of-distribution (OOD) situations. Unlike most previous works, this paper presents two normalization methods, SelfN...
Article
Temporal correlation in dynamic magnetic resonance imaging (MRI), such as cardiac MRI, is informative and important to understand motion mechanisms of body regions. Modeling such information into the MRI reconstruction process produces temporally coherent image sequence and reduces imaging artifacts and blurring. However, existing deep learning bas...
Preprint
Full-text available
As deep learning technologies advance, increasingly more data is necessary to generate general and robust models for various tasks. In the medical domain, however, large-scale, multi-party data training and analysis are infeasible due to privacy and data security concerns. In this paper, we propose an extendable and elastic learning framew...
Chapter
A movie’s key moments stand out of the screenplay to grab an audience’s attention and make movie browsing efficient. But a lack of annotations makes the existing approaches not applicable to movie key moment detection. To get rid of human annotations, we leverage the officially-released trailers as the weak supervision to learn a model that can det...
Chapter
Data augmentation is one of the most important tools in training modern deep neural networks. Recently, great advances have been made in searching for optimal augmentation policies in the image classification domain. However, two key points related to data augmentation remain uncovered by the current methods. First is that most if not all modern au...
Article
Full-text available
In this paper, we consider the problem of image-to-video translation, where one or a set of input images are translated into an output video which contains motions of a single object. Especially, we focus on predicting motions conditioned by high-level structures, such as facial expression and human pose. Recent approaches are either driven by stru...
Chapter
Scene classification is an important computer vision problem with applications to a wide range of domains including remote sensing, robotics, autonomous driving, defense, and surveillance. However, many approaches to scene classification make simplifying assumptions about the data, and many algorithms for scene classification are ill-suited for rea...
Preprint
Data augmentation is one of the most important tools in training modern deep neural networks. Recently, great advances have been made in searching for optimal augmentation policies in the image classification domain. However, two key points related to data augmentation remain uncovered by the current methods. First is that most if not all modern au...
Preprint
Full-text available
Nuclei segmentation is a fundamental task in histopathology image analysis. Typically, such segmentation tasks require significant effort to manually generate accurate pixel-wise annotations for fully supervised training. To alleviate such tedious and manual effort, in this paper we propose a novel weakly supervised segmentation framework based on...
Article
Full-text available
Instance segmentation of biological images is essential for studying object behaviors and properties. The challenges, such as clustering, occlusion, and adhesion problems of the objects, make instance segmentation a non-trivial task. Current box-free instance segmentation methods typically rely on local pixel-level information. Due to a lack of glo...
Preprint
Full-text available
Cross-modal knowledge distillation deals with transferring knowledge from a model trained with superior modalities (Teacher) to another model trained with weak modalities (Student). Existing approaches require that paired training examples exist in both modalities. However, accessing the data from superior modalities may not always be feasible. For exam...
Preprint
Full-text available
Adolescent idiopathic scoliosis (AIS) is a lifetime disease that arises in children. Accurate estimation of Cobb angles of the scoliosis is essential for clinicians to make diagnosis and treatment decisions. The Cobb angles are measured according to the vertebrae landmarks. Existing regression-based methods for the vertebra landmark detection typic...
Preprint
Full-text available
Instance segmentation of biological images is essential for studying object behaviors and properties. The challenges, such as clustering, occlusion, and adhesion problems of the objects, make instance segmentation a non-trivial task. Current box-free instance segmentation methods typically rely on local pixel-level information. Due to a lack of glo...
Preprint
Full-text available
Graph kernels are kernel methods measuring graph similarity and serve as a standard tool for graph classification. However, the use of kernel methods for node classification, which is a related problem to graph representation learning, is still ill-posed and the state-of-the-art methods are heavily based on heuristics. Here, we present a novel theo...
Chapter
Image segmentation plays an important role in pathology image analysis as the accurate separation of nuclei or glands is crucial for cancer diagnosis and other clinical analyses. The networks and cross entropy loss in current deep learning-based segmentation methods originate from image classification tasks and have drawbacks for segmentation. In t...
Chapter
Most existing methods handle cell instance segmentation problems directly without relying on additional detection boxes. These methods generally fail to separate touching cells due to the lack of global understanding of the objects. In contrast, box-based instance segmentation solves this problem by combining object detection with segmentation. Ho...
Chapter
The 3D morphology and quantitative assessment of knee articular cartilages (i.e., femoral, tibial, and patellar cartilage) in magnetic resonance (MR) imaging is of great importance for knee radiographic osteoarthritis (OA) diagnostic decision making. However, effective and efficient delineation of all the knee articular cartilages in large-sized an...
Preprint
Full-text available
This paper proposes a new deep neural network for object detection. The proposed network, termed ASSD, builds feature relations in the spatial space of the feature map. With the global relation information, ASSD learns to highlight useful regions on the feature maps while suppressing the irrelevant information, thereby providing reliable guidance f...
Article
This paper proposes a new deep neural network for object detection. The proposed network, termed ASSD, builds feature relations in the spatial space of the feature map. With the global relation information, ASSD learns to highlight useful regions on the feature maps while suppressing the irrelevant information, thereby providing reliable guidance f...
Preprint
Full-text available
The 3D morphology and quantitative assessment of knee articular cartilages (i.e., femoral, tibial, and patellar cartilage) in magnetic resonance (MR) imaging is of great importance for knee radiographic osteoarthritis (OA) diagnostic decision making. However, effective and efficient delineation of all the knee articular cartilages in large-sized an...
Preprint
Full-text available
Most existing methods handle cell instance segmentation problems directly without relying on additional detection boxes. These methods generally fail to separate touching cells due to the lack of global understanding of the objects. In contrast, box-based instance segmentation solves this problem by combining object detection with segmentation. Howe...
Preprint
Full-text available
We propose a Dynamic Graph-Based Spatial-Temporal Attention (DG-STA) method for hand gesture recognition. The key idea is to first construct a fully-connected graph from a hand skeleton, where the node features and edges are then automatically learned via a self-attention mechanism that performs in both spatial and temporal domains. We further prop...
Preprint
We present a simple method that achieves unexpectedly superior performance for Complex Reasoning involved Visual Question Answering. Our solution collects statistical features from high-frequency words of all the questions asked about an image and uses them as accurate knowledge for answering further questions about the same image. We are fully aware t...