András Lörincz

Eötvös Loránd University · Department of Software Technology and Methodology

PhD

About

404 Publications
50,436 Reads
3,713 Citations
Introduction
Interests:
  • Human-machine/robot collaboration
  • Reinforcement learning, structure learning, deep learning
  • Cyber-physical systems
  • Smart tools, assistive tools, joining intelligences
  • Neuroscience, episodic memory
  • Diagnosis, computerized cognitive behavioral therapy, serious games
Projects:
  • Cyber-Physical Systems for Smart Factories project
  • Spatio-temporal process identification and recognition in Health and Wellbeing project
  • Pro-active machine action in human-machine collaboration
Additional affiliations
September 1992 - December 1992
Brown University
Position
  • Professor
August 1988 - June 1989
Illinois Institute of Technology
Position
  • Research Assistant
September 1980 - July 1984
University of Chicago
Position
  • Research Associate
Description
  • Experimental and theoretical studies of ultrafast processes in molecules

Publications

Preprint
Full-text available
The quality of human social relationships is intricately linked to human memory processes, with memory serving as the foundation for the creation of social bonds. Since human memory is selective, differing recollections of the same events within a group can lead to misunderstandings and misalignments in what is perceived to be common ground in the...
Chapter
We present an innovative augmented reality game which aims to demonstrate a new dimension of interaction between humans and large language models through non-verbal gesture-based communication. Players collaborate with an LLM-controlled avatar to identify and correct discrepancies in an augmented reality environment, relying solely on non-verbal cu...
Conference Paper
Full-text available
Social interactions are fundamental to human life. Accurately identifying and interpreting verbal and non-verbal cues is essential for analyzing human behavior and human-machine interactions. The complexity of these interactions, along with the different communication signals and their varying frequencies, is a challenge that Deep Neural Networks c...
Article
Full-text available
This work presents BlinkLinMulT, a transformer-based framework for eye blink detection. While most existing approaches rely on frame-wise eye state classification, recent advancements in transformer-based sequence models have not been explored in the blink detection literature. Our approach effectively combines low- and high-level feature sequences...
Article
Full-text available
Allocentric semantic 3D maps are highly useful for a variety of human–machine interaction related tasks since egocentric viewpoints can be derived by the machine for the human partner. Class labels and map interpretations, however, may differ or could be missing for the participants due to the different perspectives. Particularly, when considering...
Article
Full-text available
Objectives: Melanoma is the deadliest form of skin cancer, but it can be fully cured through early detection and treatment in 99% of cases. Our aim was to develop a non-invasive machine learning system that can predict the thickness of a melanoma lesion, which is a proxy for tumor progression, through dermoscopic images. This method can serve as a...
Article
Full-text available
Dataset dependence affects many real‐life applications of machine learning: the performance of a model trained on a dataset is significantly worse on samples from another dataset than on new, unseen samples from the original one. This issue is particularly acute for small and somewhat specific databases in medical applications; the automated recogn...
Preprint
Dyadic and small group collaboration is an evolutionarily advantageous behaviour, and the need for such collaboration is a regular occurrence in day-to-day life. In this paper we estimate the perceived personality traits of individuals in dyadic and small groups over thin slices of interaction on four multimodal datasets. We find that our transformer...
Article
Full-text available
According to the recently proposed omnigenic theory, all expressed genes in a relevant tissue contribute directly or indirectly to the manifestation of complex disorders such as autism. Thus, in studying the genetics of these complex disorders, holistic approaches can complement the focus on a limited number of candidate genes. Gene intera...
Article
Full-text available
Modern industries still commonly use traditional methods to visually inspect products, even though automation has many advantages over the skills of human labour. The automation of redundant tasks is one of the greatest successes of Artificial Intelligence (AI). It employs human annotation and finds possible relationships between features within a...
Article
Full-text available
We consider, evaluate, and develop methods for home rehabilitation scenarios. We show the required modules for this scenario. Due to the large number of modules, the framework falls into the category of Composite AI. Our work is based on collected videos with high-quality execution and samples of typical errors. They are augmented by sample dialogu...
Preprint
While deep neural networks are sensitive to adversarial noise, sparse coding using the Basis Pursuit (BP) method is robust against such attacks, including its multi-layer extensions. We prove that the stability theorem of BP holds upon the following generalizations: (i) the regularization procedure can be separated into disjoint groups with differe...
Article
Full-text available
Identity tracking and instance segmentation are crucial in several areas of biological research. Behavior analysis of individuals in groups of similar animals is a task that emerges frequently in agriculture or pharmaceutical studies, among others. Automated annotation of many hours of surveillance videos can facilitate a large number of biological...
Preprint
Full-text available
The goal of this paper is to review the relevant literature on autism questionnaires, models, analytic tools, and subgrouping, focusing on the opportunities and limitations. We examined how the size and nature of the database and the number and type of parameters measured determine the use of analytic tools and the expected type of results. To supp...
Article
Full-text available
Cloud-based speech services are powerful practical tools but the privacy of the speakers raises important legal concerns when exposed to the Internet. We propose a deep neural network solution that removes personal characteristics from human speech by converting it to the voice of a Text-to-Speech (TTS) system before sending the utterance to the cl...
Chapter
Continuous identification of objects with identical appearance is crucial to analyze the behavior of laboratory animals. Most existing methods attempt to avoid this problem by excluding direct social interactions or facilitating it by implants or artificial markers. Unfortunately, these techniques may distort the results as they can affect the beha...
Chapter
Human pose estimation is a crucial step towards understanding and characterizing people’s behavior in images and videos. Current state-of-the-art results on human pose estimation were achieved by large Deep Learning models that are restricted to cloud computing for real-time applications. However, with the development of edge computing, Deep Learni...
Preprint
Full-text available
Pixelwise annotation of image sequences can be very tedious for humans. Interactive video object segmentation aims to utilize automatic methods to speed up the process and reduce the workload of the annotators. Most contemporary approaches rely on deep convolutional networks to collect and process information from human annotations throughout the v...
Preprint
Full-text available
Our goal is to bridge human and machine intelligence in melanoma detection. We develop a classification system exploiting a combination of visual pre-processing, deep learning, and ensembling for providing explanations to experts and to minimize false negative rate while maintaining high accuracy in melanoma detection. Source images are first autom...
Article
Full-text available
Autism spectrum disorder (ASD) is a heterogeneous neuropsychiatric condition traditionally defined by core symptoms in social behavior, speech/communication, repetitive behavior, and restricted interests. Beyond the core symptoms, autism has strong association with other disorders such as intellectual disability (ID), epilepsy, schizophrenia among...
Chapter
In multi-person pose estimation actors can be heavily occluded, even become fully invisible behind another person. While temporal methods can still predict a reasonable estimation for a temporarily disappeared pose using past and future frames, they exhibit large errors nevertheless. We present an energy minimization approach to generate smooth, va...
Chapter
Tracking and segmentation of moving objects in videos continue to be the central problem in the separation and prediction of concurrent episodes and situation understanding. Along with critical issues such as collision avoidance, tracking and segmentation have numerous applications in other disciplines, including medical research. To infer the po...
Preprint
Full-text available
In multi-person pose estimation actors can be heavily occluded, even become fully invisible behind another person. While temporal methods can still predict a reasonable estimation for a temporarily disappeared pose using past and future frames, they exhibit large errors nevertheless. We present an energy minimization approach to generate smooth, va...
Chapter
In 3D human pose estimation one of the biggest problems is the lack of large, diverse datasets. This is especially true for multi-person 3D pose estimation, where, to our knowledge, there are only machine generated annotations available for training. To mitigate this issue, we introduce a network that can be trained with additional RGB-D images in...
Article
Full-text available
Manual annotation of video segmentation datasets requires an immense amount of human effort, thus, reduction of human annotation costs is an active topic of research. While many papers deal with the propagation of masks through frames of a video, only a few results attempt to optimize annotation task selection. In this paper we present a deep learn...
Preprint
Full-text available
In 3D human pose estimation one of the biggest problems is the lack of large, diverse datasets. This is especially true for multi-person 3D pose estimation, where, to our knowledge, there are only machine generated annotations available for training. To mitigate this issue, we introduce a network that can be trained with additional RGB-D images in...
Preprint
Full-text available
Autism spectrum disorder (ASD) is a heterogeneous neuropsychiatric problem with a few core symptoms: weaknesses in social behavior, verbal impairments, repetitive behavior and restricted interests. Beyond the core symptoms, autism has strong association with other disorders such as intellectual disability, epilepsy, schizophrenia among many others....
Chapter
Previously, we have put forth the concept of Cartesian abstraction and argued that it can yield ‘cognitive maps’. We suggested a general mechanism and presented deep learning based numerical simulations: an observed factor (head direction) was non-linearly projected to form a discretized representation (head direction cells). That representation, i...
Article
Full-text available
In this paper, inspired by our previous works, we propose an architecture for the design and realization of cyber-physical systems (CPS) that considers the spatio-temporal context of events, promotes anomaly detection, facilitates efficient human-computer interaction and is capable of discovering novel human and/or machine knowledge. We view deep n...
Preprint
Full-text available
The common approach to 3D human pose estimation is predicting the body joint coordinates relative to the hip. This works well for a single person but is insufficient in the case of multiple interacting people. Methods predicting absolute coordinates first estimate a root-relative pose then calculate the translation via a secondary optimization task...
Article
In monocular 3D human pose estimation a common setup is to first detect 2D positions and then lift the detection into 3D coordinates. Many algorithms suffer from overfitting to camera positions in the training set. We propose a siamese architecture that learns a rotation equivariant hidden representation to reduce the need for data augmentation. Ou...
Preprint
Full-text available
In monocular 3D human pose estimation a common setup is to first detect 2D positions and then lift the detection into 3D coordinates. Many algorithms suffer from overfitting to camera positions in the training set. We propose a siamese architecture that learns a rotation equivariant hidden representation to reduce the need for data augmentation. Ou...
Article
Full-text available
[This corrects the article DOI: 10.1371/journal.pone.0195131.].
Preprint
Full-text available
We propose a theory of ASD as a condition of comorbid cognitive impairments that corrupt the learning, encoding, and manipulation of episodic and semantic memories. We consider (i) episodic and semantic memory functions of the entorhinal-hippocampal complex, (ii) constraints on the transfer and encoding of these memory components into neocortical a...
Article
Full-text available
In this study we investigate the strategies of subjects in a complex divided attention task. We conducted a series of experiments with ten participants and evaluated their performance. After an extensive analysis, we identified four strategic measures that justify the achievement of the participants, by highlighting the individual differences and p...
Chapter
Full-text available
Industry 4.0 factories become more and more complex with increased maintenance costs. Reducing costs by cyber-physical (CP) controllers should ensure the commercialization of the CPS for smart factory project results. We implement multi-adaptive CP controllers in the following domains: industrial robot arms, car manufacturing, steel industry, and a...
Article
Full-text available
Fine-tuning of a deep convolutional neural network (CNN) is often desired. This paper provides an overview of our publicly available py-faster-rcnn-ft software library that can be used to fine-tune the VGG_CNN_M_1024 model on custom subsets of the Microsoft Common Objects in Context (MS COCO) dataset. For example, we improved the procedure so that...
Article
Machine learning is making substantial progress in diverse applications. The success is mostly due to advances in deep learning. However, deep learning can make mistakes and its generalization abilities to new tasks are questionable. We ask when and how one can combine network outputs, when (i) details of the observations are evaluated by learned d...
Article
Full-text available
The existence of place cells (PCs), grid cells (GCs), border cells (BCs), and head direction cells (HCs) as well as the dependencies between them have been enigmatic. We make an effort to explain their nature by introducing the concept of Cartesian Factors. These factors have specific properties: (i) they assume and complement each other, like dire...
Article
Full-text available
It has long been debated how the so-called cognitive map, the set of place cells, develops in the rat hippocampus. The function of this organ is of high relevance, since the hippocampus is the key component of the medial temporal lobe memory system, responsible for forming episodic memory, declarative memory, the memory for facts and rules that serve c...
Article
Full-text available
Machine learning is making substantial progress in diverse applications. The success is mostly due to advances in deep learning. However, deep learning can make mistakes and its generalization abilities to new tasks are questionable. We ask when and how one can combine network outputs, when (i) details of the observations are evaluated by learned d...
Article
Facial expression communicates emotion, intention, and physical state, and regulates interpersonal behavior. Automated face analysis (AFA) for the detection, synthesis, and understanding of facial expression is a vital focus of basic research with applications in behavioral science, mental and physical health and treatment, marketing, and human-rob...
Conference Paper
Full-text available
There is a growing interest in behavior-based biometrics. Although biometric data has considerable variations for an individual and may be faked, the combination of such ‘weak experts’ can be rather strong. A remotely detectable component is gaze direction estimation and thus, eye movement patterns. Here, we present a novel personalization meth...
Conference Paper
Full-text available
Recent advances in deep learning technology enable one to train complex input-output mappings, provided that a high-quality training set is available. In this paper, we show how to extend an existing dataset of depth maps of hands annotated with the corresponding 3D hand poses by fitting a 3D hand model to smart glove-based annotations and generatin...
Conference Paper
We introduce a learning architecture that can serve compression while it also satisfies the constraints of factored reinforcement learning. Our novel Cartesian factors enable one to decrease the number of variables being relevant for the ongoing task, an exponential gain in the size of the state space. We demonstrate the working, the limitations an...
Conference Paper
Serious games for mental health are seen as the groundwork for assistive technology to maintain and improve mental health. We present a technical system layout we partly implemented for demonstration purposes and highlight vision-based perception and manipulation capabilities. These include physical interactions employing artificial general intellig...
Article
We extend the scope of Wikification to novel words by relaxing two premises of Wikification: (i) we wikify without using the surface form of the word, and (ii) we wikify to a mixture of Wikipedia senses instead of a single sense. We identify two types of "novel" words: words where the connection between their surface form and their meaning is broken (e.g., a mi...
Conference Paper
Full-text available
The problem of anomaly detection is a critical topic across application domains and is the subject of extensive research. Applications include finding frauds and intrusions, warning on robot safety, and many others. Standard approaches in this field exploit simple or complex system models, created by experts using detailed domain knowledge. In this...
Conference Paper
Cyber-Physical Systems have many components including physical ones with heavy demands on workflow management; a real-time problem. Furthermore, the complexity of the system involves some degree of stochasticity, due to interactions with the environment. We argue that the factored version of the event-learning framework (ELF) being able to exploit...
Article
Full-text available
We argue that recent technology developments hold great promise for health and wellbeing. In our view, recent advances of (1) smart tools and wearable sensors of diverse kinds, (2) data collection and data mining methods, (3) 3D visual recording and visual processing methods, (4) 3D models of the environment with robust physics engine, and last bu...
Article
Full-text available
Recently, computer-simulation-aided work has become a standard routine in all engineering fields. Accordingly, simulation plays a fundamental role even in road traffic engineering. A reliable simulator is able to provide effective analysis of a given traffic network if the applied simulation scenarios properly converge to the real-world situation. T...
Conference Paper
In many behavioral domains, such as facial expression and gesture, sparse structure is prevalent. This sparsity would be well suited for event detection but for one problem. Features typically are confounded by alignment error in space and time. As a consequence, high-dimensional representations such as SIFT and Gabor features have been favored des...
Conference Paper
Full-text available
For multiple reasons, the automatic annotation of video recordings is challenging. The amount of database video instances to be annotated is huge, tedious manual labeling sessions are required, the multi-modal annotation needs exact information of space, time, and context, and the different labeling opportunities require special agreements between...
Conference Paper
The ability to communicate with others is of paramount importance for mental well-being. In this paper, we describe an interaction system to reduce communication barriers for people with severe speech and physical impairments (SSPI) such as cerebral palsy. The system consists of two main components: (i) the head-mounted human-computer interaction (H...
Conference Paper
Full-text available
In this paper, we describe how we combine active and passive user input modes in clinical environments for knowledge discovery and knowledge acquisition towards decision support. Active input modes include digital pens, smartphones, and automatic handwriting recognition for a direct digitalisation of patient data. Passive i...
Conference Paper
Full-text available
Indoor navigation in emergency scenarios poses a challenge to evacuation and emergency support, especially for injured or physically encumbered individuals. Navigation systems must be lightweight, easy to use, and provide robust localization and accurate navigation instructions in adverse conditions. To address this challenge, we combine magnetic l...
Conference Paper
Recent advances in kernel methods are very promising for improving the estimation, clustering and classification of spatio-temporal processes. Facial expression estimation can take advantage of such methods if one considers marker positions and their motion in 3D space. We applied support vector classification with kernels derived from dynamic time...