Sebastian Stober

  • Prof. Dr.-Ing.
  • Professor (Full) at Otto-von-Guericke University Magdeburg

About

142
Publications
28,970
Reads
1,567
Citations
Introduction
I am an interdisciplinary researcher fascinated by the abilities of humans and intelligent machines to learn, adapt and interact with each other. I am especially interested in so-called “human-in-the-loop” scenarios, in which both humans and machines learn from each other and together contribute to the solution of a problem. My research is driven by the idea that many challenges in the age of Big Data could be solved by such fruitful collaborations. For more information visit sebastianstober.de
Current institution
Otto-von-Guericke University Magdeburg
Current position
  • Professor (Full)
Additional affiliations
September 2013 - present
Western Caspian University
Position
  • PostDoc Position
September 2013 - present
Western University
Position
  • PostDoc Position
January 2006 - August 2013
Otto-von-Guericke University Magdeburg
Position
  • Researcher


Publications (142)
Chapter
Full-text available
Personalized and user-aware systems for retrieving multimedia items are becoming increasingly important as the amount of available multimedia data has been spiraling. A personalized system is one that incorporates information about the user into its data processing part (e.g., a particular user taste for a movie genre). A context-aware system, in c...
Chapter
Full-text available
The hubness phenomenon, as it was recently described, consists in the observation that for increasing dimensionality of a data set the distribution of the number of times a data point occurs among the k nearest neighbors of other data points becomes increasingly skewed to the right. As a consequence, so-called hubs emerge, that is, data points that...
Conference Paper
Full-text available
Electroencephalography (EEG) recordings of rhythm perception might contain enough information to distinguish different rhythm types/genres or even identify the rhythms themselves. We apply convolutional neural networks (CNNs) to analyze and classify EEG data recorded within a rhythm perception study in Kigali, Rwanda which comprises 12 East African...
Article
Full-text available
With the development of more and more sophisticated Music Information Retrieval approaches, aspects of adaptivity are becoming an increasingly important research topic. Even though adaptive techniques have already found their way into Music Information Retrieval systems and contribute to robustness or user satisfaction, they are not always identifi...
Conference Paper
Full-text available
A common way to support exploratory music retrieval scenarios is to give an overview using a neighborhood-preserving projection of the collection onto two dimensions. However, neighborhood cannot always be preserved in the projection because of the inherent dimensionality reduction. Furthermore, there is usually more than one way to look at a music...
Conference Paper
Full-text available
The rapid evolution of artificial intelligence (AI) underscores the critical necessity for engineers to comprehend their responsibilities in AI use and development. This imperative requires equipping engineers with the requisite skills and knowledge to address societal challenges while ensuring that AI technologies are harnessed for societal benefi...
Preprint
Full-text available
Traffic signal control plays a crucial role in urban mobility. However, existing methods often struggle to generalize beyond their training environments to unseen scenarios with varying traffic dynamics. We present TransferLight, a novel framework designed for robust generalization across road-networks, diverse traffic conditions and intersection g...
Preprint
Full-text available
The increasing use of cloud-based speech assistants has heightened the need for effective speech anonymization, which aims to obscure a speaker's identity while retaining critical information for subsequent tasks. One approach to achieving this is through voice conversion. While existing methods often emphasize complex architectures and training te...
Preprint
Full-text available
Speech anonymisation aims to protect speaker identity by changing personal identifiers in speech while retaining linguistic content. Current methods fail to retain prosody and unique speech patterns found in elderly and pathological speech domains, which is essential for remote health monitoring. To address this gap, we propose a voice conversion-b...
Conference Paper
The increasing use of cloud-based speech assistants has heightened the need for effective speech anonymization, which aims to obscure a speaker's identity while retaining critical information for subsequent tasks. One approach to achieving this is through voice conversion. While existing methods often emphasize complex architectures and training te...
Preprint
Building up competencies in working with data and tools of Artificial Intelligence (AI) is becoming more relevant across disciplinary engineering fields. While the adoption of tools for teaching and learning, such as ChatGPT, is garnering significant attention, integration of AI knowledge, competencies, and skills within engineering education is la...
Chapter
Full-text available
In contrast to Variational Autoencoders, Dynamical Variational Autoencoders (DVAEs) learn a sequence of latent states for a time series. Initially, they were implemented using recurrent neural networks (RNNs) known for challenging training dynamics and problems with long-term dependencies. This led to the recent adoption of Transformers close to th...
Conference Paper
Full-text available
Speech anonymisation aims to protect speaker identity by changing personal identifiers in speech while retaining linguistic content. Current methods fail to retain prosody and unique speech patterns found in elderly and pathological speech domains, which is essential for remote health monitoring. To address this gap, we propose a voice conversion-b...
Preprint
Full-text available
Speech anonymisation aims to protect speaker identity by changing personal identifiers in speech while retaining linguistic content. Current methods fail to retain prosody and unique speech patterns found in elderly and pathological speech domains, which is essential for remote health monitoring. To address this gap, we propose a voice conversion-b...
Conference Paper
Full-text available
Deep neural networks are applied in more and more areas of everyday life. However, they still lack essential abilities, such as robustly dealing with spatially transformed input signals. Approaches to mitigate this severe robustness issue are limited to two pathways: Either models are implicitly regularised by increased sample variability (data aug...
Article
Full-text available
The outbreak of COVID-19 has shocked the entire world with its fairly rapid spread, and has challenged different sectors. One of the most effective ways to limit its spread is the early and accurate diagnosing of infected patients. Medical imaging, such as X-ray and computed tomography (CT), combined with the potential of artificial intelligence (A...
Article
Full-text available
Purpose Albinism is a congenital disorder affecting pigmentation levels, structure, and function of the visual system. The identification of anatomical changes typical for people with albinism (PWA), such as optic chiasm malformations, could become an important component of diagnostics. Here, we tested an application of convolutional neural network...
Conference Paper
Full-text available
As Artificial Intelligence (AI) becomes increasingly important in engineering, instructors need to incorporate AI concepts into their subject-specific courses. However, many teachers may lack the expertise to do so effectively or don't know where to start. To address this challenge, we have developed the AI Course Design Planning Framework to help...
Preprint
Full-text available
The detailed images produced by Magnetic Resonance Imaging (MRI) provide life-critical information for the diagnosis and treatment of prostate cancer. To provide standardized acquisition, interpretation and usage of the complex MRI images, the PI-RADS v2 guideline was proposed. An automated segmentation following the guideline facilitates consisten...
Article
Full-text available
The use of artificial intelligence (AI) is becoming increasingly important in various domains, making education about AI a necessity. The interdisciplinary nature of AI and the relevance of AI in various fields require that university instructors and course developers integrate AI topics into the classroom and create so-called domain-specific AI co...
Conference Paper
Full-text available
The integration of tools and methods of Artificial Intelligence (AI) into the engineering domain has become increasingly important, and with it comes a shift in required competencies. As a result, engineering education should now incorporate competencies into its courses and curricula. While interdisciplinary education at a subject level has alread...
Conference Paper
Full-text available
Voice conversion (VC) transforms an utterance to sound like another person without changing the linguistic content. A recently proposed generative adversarial network-based VC method, StarGANv2-VC is very successful in generating natural-sounding conversions. However, the method fails to preserve the emotion of the source speaker in the converted s...
Conference Paper
Full-text available
Speech anonymisation prevents misuse of spoken data by removing any personal identifier while preserving at least linguistic content. However, emotion preservation is crucial for natural human-computer interaction. The well-known voice conversion technique StarGANv2-VC achieves anonymisation but fails to preserve emotion. This work presents an any-...
Chapter
The use of predictive models in education promises individual support and personalization for students. To develop trustworthy models, we need to understand what factors and causes contribute to a prediction. Thus, it is necessary to develop models that are not only accurate but also explainable. Moreover, we need to conduct holistic model evaluati...
Preprint
Voice conversion (VC) transforms an utterance to sound like another person without changing the linguistic content. A recently proposed generative adversarial network-based VC method, StarGANv2-VC is very successful in generating natural-sounding conversions. However, the method fails to preserve the emotion of the source speaker in the converted s...
Chapter
Full-text available
Machine Learning with Deep Neural Networks (DNNs) has become a successful tool in solving tasks across various fields of application. However, the complexity of DNNs makes it difficult to understand how they solve their learned task. To improve the explainability of DNNs, we adapt methods from neuroscience that analyze complex and opaque systems. H...
Preprint
Full-text available
We present PredProp, a method for optimization of weights and states in predictive coding networks (PCNs) based on the precision of propagated errors and neural activity. PredProp jointly addresses inference and learning via stochastic gradient descent and adaptively weights parameter updates by approximate curvature. Due to the relation between pr...
Conference Paper
Full-text available
The use of Artificial Intelligence (AI) in engineering is on the rise and comes with the promise of cost reductions and efficiency gains. However, classical engineers often lack the necessary skills to implement data-driven solutions. At the same time, computer scientists lack the required understanding of engineering systems. Thus, we need to exte...
Conference Paper
Full-text available
A major challenge in engineering education is to empower students to use their acquired technical skills to solve real-world problems. In particular, methods of Artificial Intelligence (AI) need to be studied as tools in their respective application contexts. This puts pressure on university lecturers concerning the didactical design and elaboratio...
Book
Full-text available
In a new edited volume, the second cohort of KI-Campus fellows shares its experiences with integrating digital learning offerings on artificial intelligence (AI) into higher education teaching. The focus is on application orientation. Ten contributions show the broad range of possible uses of the openly licensed and freely...
Conference Paper
Full-text available
Predictive coding networks (PCNs) have an inherent degree of biological plausibility and perform approximate backpropagation of error in supervised settings. It is less clear how predictive coding compares to state-of-the-art architectures, such as VAEs, in unsupervised and probabilistic settings. We propose a generalized PCN that, like its inspir...
Chapter
Most deep learning models are known to be black-box models due to their overwhelming complexity. One approach to make models more interpretable is to reduce the representations to a finite number of objects. This can be achieved by clustering latent spaces or training models which include quantization by design such as the Vector Quantised-Variatio...
Chapter
The rise of Artificial Intelligence in Education opens up new possibilities for analysis of student data. However, the protection of private data in these applications is a major challenge. According to data regulations, the application designer is responsible for technical and organizational measures to ensure privacy. This paper aims to guide dev...
Preprint
Full-text available
Nearly all state of the art vision models are sensitive to image rotations. Existing methods often compensate for missing inductive biases by using augmented training data to learn pseudo-invariances. Alongside the resource demanding data inflation process, predictions often poorly generalize. The inductive biases inherent to convolutional neural n...
Article
The C-arm Cone-Beam Computed Tomography (CBCT) increasingly plays a major role in interventions and radiotherapy. However, the slow data acquisition and high dose hinder its predominance in the clinical routine. To overcome the high-dose issue, various protocols such as sparse-view have been proposed, where a subset of projections is acquired over...
Preprint
Full-text available
Machine Learning with Deep Neural Networks (DNNs) has become a successful tool in solving tasks across various fields of application. The success of DNNs is strongly connected to their high complexity in terms of the number of network layers or of neurons in each layer, which makes it much harder to understand how DNNs solve their learned task. To...
Article
Full-text available
Goal-directed actions frequently require a balance between antagonistic processes (e.g., executing and inhibiting a response), often showing an interdependency concerning what constitutes goal-directed behavior. While an inter-dependency of antagonistic actions is well described at a behavioral level, a possible inter-dependency of underlying proce...
Preprint
Full-text available
The C-arm Cone-Beam Computed Tomography (CBCT) increasingly plays a major role in interventions and radiotherapy. However, the slow data acquisition and high dose hinder its predominance in the clinical routine. To overcome the high-dose issue, various protocols such as sparse-view have been proposed, where a subset of projections is acquired over...
Preprint
Full-text available
This paper deals with differentiable dynamical models congruent with neural process theories that cast brain function as the hierarchical refinement of an internal generative model explaining observations. Our work extends existing implementations of gradient-based predictive coding with automatic differentiation and allows integrating deep neural...
Conference Paper
Full-text available
Humans efficiently extract relevant information from complex auditory stimuli. Oftentimes, the interpretation of the signal is ambiguous and musical meaning is derived from the subjective context. Predictive processing interpretations of brain function describe subjective music experience driven by hierarchical precision-weighted expectations. Ther...
Preprint
Full-text available
There is an increasing convergence between biologically plausible computational models of inference and learning with local update rules and the global gradient-based optimization of neural network models employed in machine learning. One particularly exciting connection is the correspondence between the locally informed optimization in predictive...
Conference Paper
Full-text available
Active Inference states that the human brain minimizes a statistical quantity of surprise with respect to current observations and the planned future [1]. So far, implementations based on active inference with artificial neural networks have been used to model individual planners and interaction between multiple autonomous agents [2]. However, the...
Conference Paper
Machine Learning with deep Artificial Neural Networks (ANNs) has become a successful tool in solving tasks across various fields of application. However, this success is typically achieved by increasing the ANN complexity in terms of the number of network layers or of neurons in each layer. This severely complicates the understanding of how modern...
Preprint
Full-text available
The concerns over radiation-related health risks associated with the increasing use of computed tomography (CT) have accelerated the development of low-dose strategies. There is a higher need for low dosage in interventional applications as repeated scanning is performed. However, using the noisier and undersampled low-dose datasets, the standard r...
Preprint
Full-text available
Humans efficiently extract relevant information from complex auditory stimuli. Oftentimes, the interpretation of the signal is ambiguous and musical meaning is derived from the subjective context. Predictive processing interpretations of brain function describe subjective music experience driven by hierarchical precision-weighted expectations. Ther...
Article
Full-text available
Deep Learning-based Automatic Speech Recognition (ASR) models are very successful, but hard to interpret. To gain a better understanding of how Artificial Neural Networks (ANNs) accomplish their tasks, several introspection methods have been proposed. However, established introspection techniques are mostly designed for computer vision tasks and re...
Preprint
Full-text available
Various convolutional neural network (CNN) based concepts have been introduced for the prostate's automatic segmentation and its coarse subdivision into transition zone (TZ) and peripheral zone (PZ). However, when targeting a fine-grained segmentation of TZ, PZ, distal prostatic urethra (DPU) and the anterior fibromuscular stroma (AFS), the task be...
Article
Full-text available
Various convolutional neural network (CNN) based concepts have been introduced for the prostate's automatic segmentation and its coarse subdivision into transition zone (TZ) and peripheral zone (PZ). However, when targeting a fine-grained segmentation of TZ, PZ, distal prostatic urethra (DPU) and the anterior fibromuscular stroma (AFS), the task be...
Conference Paper
Full-text available
This paper discusses representation learning from electroencephalographic (EEG) signal with deep variational predictive coding networks. We introduce a hierarchical probabilistic network that minimises prediction error at multiple levels of spatio-temporal abstraction. While the lowest layer predicts brain activity directly, higher layers abstract...
Article
Full-text available
Relationships between neuroimaging measures and behavior provide important clues about brain function and cognition in healthy and clinical populations. While electroencephalography (EEG) provides a portable, low cost measure of brain dynamics, it has been somewhat underrepresented in the emerging field of model-based inference. We seek to address...
Conference Paper
Full-text available
The human response to music combines low-level expectations that are driven by the perceptual characteristics of audio with high-level expectations from the context and the listener's expertise. This paper discusses surprisal based music representation learning with a hierarchical predictive neural network. In order to inspect the cognitive validi...
Preprint
Full-text available
The human response to music combines low-level expectations that are driven by the perceptual characteristics of audio with high-level expectations from the context and the listener's expertise. This paper discusses surprisal based music representation learning with a hierarchical predictive neural network. In order to inspect the cognitive validit...
Preprint
This paper discusses representation learning from electroencephalographic (EEG) signal with deep predictive coding networks. We introduce a hierarchical probabilistic network that minimises prediction error on multiple levels. While the lowest layer predicts brain activity directly, higher layers abstract away from the data and predict sequences of...
Preprint
Full-text available
The outbreak of COVID-19 has shocked the entire world with its fairly rapid spread and has challenged different sectors. One of the most effective ways to limit its spread is the early and accurate diagnosis of infected patients. Medical imaging such as X-ray and Computed Tomography (CT) combined with the potential of Artificial Intelligence (AI) p...
Article
Full-text available
Efficient action control is indispensable for goal-directed behaviour. Different theories have stressed the importance of either attention or response selection sub-processes for action control. Yet, it is unclear to what extent these processes can be identified in the dynamics of neurophysiological (EEG) processes at the single-trial level and be...
Preprint
Deep Learning based Automatic Speech Recognition (ASR) models are very successful, but hard to interpret. To gain better understanding of how Artificial Neural Networks (ANNs) accomplish their tasks, introspection methods have been proposed. Adapting such techniques from computer vision to speech recognition is not straight-forward, because speech...
Preprint
The uninformative ordering of artificial neurons in Deep Neural Networks complicates visualizing activations in deeper layers. This is one reason why the internal structure of such models is very unintuitive. In neuroscience, activity of real brains can be visualized by highlighting active regions. Inspired by those techniques, we train a convoluti...
Preprint
Full-text available
The domain of medical imaging, especially brain Magnetic Resonance Imaging (sMRI), suffers from limited availability of labelled data. In this paper, we study how to effectively perform transfer learning from a large generic sMRI dataset to a small dataset of specific neurological disorder. We highlight the major challenges of transfer learning and...
Article
Full-text available
Attention Deficit Hyperactivity Disorder (ADHD) is one of the most prevalent neuropsychiatric disorders in childhood and adolescence and its diagnosis is based on clinical interviews, symptom questionnaires, and neuropsychological testing. Much research effort has been undertaken to evaluate the usefulness of neurophysiological (EEG) data to aid th...
Conference Paper
The increasing complexity of deep Artificial Neural Networks (ANNs) makes it possible to solve complex tasks in various applications. This comes with less understanding of decision processes in ANNs. Therefore, introspection techniques have been proposed to interpret how the network accomplishes its task. Those methods mostly visualize their results in the in...
Conference Paper
Full-text available
Retrieving music information from brain activity is a challenging and still largely unexplored research problem. In this paper we investigate the possibility to reconstruct perceived and imagined musical stimuli from electroencephalography (EEG) recordings based on two datasets. One dataset contains multi-channel EEG of subjects listening to and im...
Article
Full-text available
Relationships between neuroimaging measures and behavior provide important clues about brain function and cognition in healthy and clinical populations. While electroencephalography (EEG) provides a portable, low cost measure of brain dynamics, it has been somewhat underrepresented in the emerging field of model-based inference. We seek to address...
Chapter
The target group of search engine users in the Internet is very wide and heterogeneous. The users differ in background, knowledge, experience, etc. That is why, in order to find relevant information, such search systems not only have to retrieve web documents related to the search query but also have to consider and adapt to the user’s interests, s...
Conference Paper
The increasing complexity of Artificial Neural Networks (ANNs) makes it difficult to understand what they have learned and how they accomplish their goal. As their complexity approaches that of the human brain, neuroscientific techniques could facilitate the analysis of ANNs. This paper investigates an adaptation of the Event-Relat...
Article
Full-text available
As an emerging sub-field of music information retrieval (MIR), music imagery information retrieval (MIIR) aims to retrieve information from brain activity recorded during music cognition, such as listening to or imagining music pieces. This is a highly inter-disciplinary endeavor that requires expertise in MIR as well as cognitive neuroscience and p...
Article
End-to-end training of automated speech recognition (ASR) systems requires massive data and compute resources. We explore transfer learning based on model adaptation as an approach for training ASR models under constrained GPU memory, throughput and training data. We conduct several systematic experiments adapting a Wav2Letter convolutional neural...
Conference Paper
Full-text available
We compare Visual Berrypicking, an interactive approach allowing users to explore large and highly faceted information spaces using similarity-based two-dimensional maps, with traditional browsing techniques. For large datasets, current projection methods used to generate maplike overviews suffer from increased computational costs and a loss of acc...
Article
Full-text available
We introduce and compare several strategies for learning discriminative features from electroencephalography (EEG) recordings using deep learning techniques. EEG data are generally only available in small quantities, they are high-dimensional with a poor signal-to-noise ratio, and there is considerable variability between individual subjects and re...
Conference Paper
Full-text available
In this paper we describe a novel concept of a search history visualization that is primarily designed for children. We propose to visualize the search history as a treasure map: The treasure map shows a landscape of islands. Each island represents the context of a user query. We visualize visited and unvisited relevant results and bookmarked docum...
Conference Paper
Full-text available
Sometimes users of a music retrieval system are not able to explicitly state what they are looking for. They rather want to browse a collection in order to get an overview and to discover interesting content. A common approach for browsing a collection relies on a similarity-preserving projection of objects (tracks, albums or artists) onto the (typ...
Conference Paper
Full-text available
Exploring image collections using similarity-based two-dimensional maps is an ongoing research area that faces two main challenges: with increasing size of the collection and complexity of the similarity metric projection accuracy rapidly degrades and computational costs prevent online map generation. We propose a prototype that creates the impress...
Conference Paper
Full-text available
Music imagery information retrieval (MIIR) systems may one day be able to recognize a song just as we think of it. As one step towards such technology, we investigate whether rhythms can be identified from an electroencephalography (EEG) recording taken directly after their auditory presentation. The EEG data has been collected during a rhythm perc...
Conference Paper
Full-text available
In this paper, we explore alternative ways to visualize search results for children. We propose a novel search result visualization using characters. The main idea is to represent each web document as a character where a character visually provides clues about the webpage's content. We focused on children between six and twelve as a target user gro...
Conference Paper
Full-text available
Electroencephalography (EEG) recordings of rhythm perception might contain enough information to distinguish different rhythm types/genres or even identify the rhythms themselves. In this paper, we present first classification results using deep learning techniques on EEG data recorded within a rhythm perception study in Kigali, Rwanda. We tested 1...
Article
The use of feedback-based systems in the music domain dates back to the 1960s. Their applications span from music composition and sound organization to audio synthesis and processing, as the interest in feedback resulted both from theoretical reflection ...
Article
Full-text available
Boasting over 200 attendees from around the globe, the International Society of Music Information Retrieval (ISMIR) concluded its 13th annual conference in Porto, Portugal in October. Although light rain early in the week ostensibly ensured that attendees would remain inside the stunning São Bento da Vitória Monastery, better weather would later in...
Conference Paper
Full-text available
Children need advanced support during web search or related interactions with computer systems. At this point, a voice-controlled search engine offers different benefits. Children who have difficulties in writing will not make spelling errors using a voice control. Voice control is a natural input method and is supposed to be easier to use for chil...
Conference Paper
Full-text available
In this work, we investigate techniques for voice-controlled interaction with search engines for young users. Voice control has many advantages for children. For example, the emotional state can be recognized from speech and used to support the search. The following presents the results of a Wizard-of-Oz experiment...
Conference Paper
Full-text available
Map-based visualizations -- sometimes also called projections -- are a popular means for exploring music collections. But how useful are they if the collection is not static but grows over time? Ideally, a map that a user is already familiar with should be altered as little as possible and only as much as necessary to reflect the changes of the und...
