Klaus Schoeffmann, Ph.D.
Alpen-Adria-Universität Klagenfurt · Institut für Informationstechnologie (ITEC)
About
233 Publications
45,814 Reads
3,164 Citations
Introduction
Additional affiliations
February 2015 - February 2015
Publications (233)
In this paper, we consider the problem of the visual scanning mechanism underpinning sensorimotor tasks, such as walking and driving, in dynamic environments. We exploit eye-tracking data to offer two new cognitive-effort measures for visual scanning behavior in virtual driving. By utilizing the retinal flow induced by fixation, two novel measures o...
In recent years, the landscape of computer-assisted interventions and post-operative surgical video analysis has been dramatically reshaped by deep-learning techniques, resulting in significant advancements in surgeons’ skills, operation room management, and overall surgical outcomes. However, the progression of deep-learning-powered surgical techn...
CLIP-based text-to-image retrieval has proven to be very effective at the interactive video retrieval competition Video Browser Showdown 2022, where all three top-scoring teams had implemented a variant of a CLIP model in their system. Since the performance of these three systems was quite close, this post-evaluation was designed to get better insi...
Analyzing laparoscopic surgery videos presents a complex and multifaceted challenge, with applications including surgical training, intra-operative surgical complication prediction, and post-operative surgical assessment. Identifying crucial events within these videos is a significant prerequisite in a majority of these applications. In this paper,...
According to our experience from VBS2023 and the feedback from the IVR4B special session at CBMI2023, we have largely revised the diveXplore system for VBS2024. It now integrates OpenCLIP trained on the LAION-2B dataset for image/text embeddings that are used for free-text and visual similarity search, a query server that is able to distribute diff...
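As an illustration of how embedding-based free-text search of this kind typically works, here is a minimal sketch using the open_clip_torch package; the model variant, keyframe file names, and query text are assumptions for illustration, not the diveXplore implementation.

```python
# Minimal sketch of CLIP-based free-text keyframe search (illustrative only;
# model choice, keyframe paths and query are assumptions, not diveXplore code).
import torch
import open_clip
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k")    # OpenCLIP weights trained on LAION-2B
tokenizer = open_clip.get_tokenizer("ViT-B-32")

keyframes = ["shot_0001.jpg", "shot_0002.jpg"]     # hypothetical keyframe images
with torch.no_grad():
    imgs = torch.stack([preprocess(Image.open(p).convert("RGB")) for p in keyframes])
    img_emb = model.encode_image(imgs)
    img_emb /= img_emb.norm(dim=-1, keepdim=True)  # L2-normalize for cosine similarity

    txt = tokenizer(["a person riding a bicycle"])
    txt_emb = model.encode_text(txt)
    txt_emb /= txt_emb.norm(dim=-1, keepdim=True)

scores = (txt_emb @ img_emb.T).squeeze(0)          # cosine similarity per keyframe
ranked = scores.argsort(descending=True)
print([keyframes[i] for i in ranked])              # keyframes ranked by text relevance
```

In practice, the image embeddings would be precomputed and indexed once, so that only the text embedding and the similarity ranking are computed at query time.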
Purpose: Semantic segmentation plays a pivotal role in many applications related to medical image and video analysis. However, designing a neural network architecture for medical image and surgical video segmentation is challenging due to the diverse features of relevant classes, including heterogeneity, deformability, transparency, blunt boundaries...
A critical yet unpredictable complication following cataract surgery is intraocular lens dislocation. Postoperative stability is imperative, as even a tiny decentration of multifocal lenses or inadequate alignment of the torus in toric lenses due to postoperative rotation can lead to a significant drop in visual acuity. Investigating possible intra...
Video streaming and its applications are growing rapidly, making video optimization a primary target for content providers looking to enhance their services. Enhancing the quality of videos requires the adjustment of different encoding parameters such as bitrate, resolution, and frame rate. To avoid brute force approaches for predicting optimal enc...
This paper conducts a thorough examination of the 12th Video Browser Showdown (VBS) competition, which is a well-established international benchmarking campaign for interactive video search systems. The annual VBS competition has witnessed a steep rise in the popularity of multimodal embedding-based approaches in interactive video retrieval. The ma...
Visual scanning is achieved via head motion and gaze movement for visual information acquisition and cognitive processing, and it plays a critical role in undertaking common sensorimotor tasks such as driving. The coordination of the head and eyes is an important human behavior that makes a key contribution to goal-directed visual scanning and sensorim...
Models capable of leveraging unlabelled data are crucial in overcoming large distribution gaps between the acquired datasets across different imaging devices and configurations. In this regard, self-training techniques based on pseudo-labeling have been shown to be highly effective for semi-supervised domain adaptation. However, the unreliability o...
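To make the self-training idea referred to above concrete, here is a minimal, classification-style sketch of confidence-thresholded pseudo-labeling; the model, optimizer, batch, and the 0.9 threshold are placeholders, not the proposed method.

```python
# Illustrative pseudo-labeling step for semi-supervised domain adaptation
# (generic sketch; model, data and the confidence threshold are assumptions).
import torch
import torch.nn.functional as F

def pseudo_label_step(model, optimizer, unlabeled_batch, threshold=0.9):
    # 1) Predict on unlabeled target-domain data and keep only confident predictions.
    model.eval()
    with torch.no_grad():
        probs = F.softmax(model(unlabeled_batch), dim=1)
        conf, pseudo = probs.max(dim=1)            # per-sample confidence and pseudo-label
        mask = conf >= threshold                   # reliability filter

    # 2) Train on the confident pseudo-labels as if they were ground truth.
    model.train()
    logits = model(unlabeled_batch)
    if mask.any():
        loss = F.cross_entropy(logits[mask], pseudo[mask])
    else:
        loss = logits.sum() * 0.0                  # no confident pseudo-labels in this batch
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The unreliability mentioned in the abstract stems from step 1: wrong but confident predictions pass the threshold and are then reinforced in step 2, which is what more careful pseudo-label selection schemes try to mitigate.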
This paper presents findings of the eleventh Video Browser Showdown competition, where sixteen teams competed in known-item and ad-hoc search tasks. Many of the teams utilized state-of-the-art video retrieval approaches that demonstrated high effectiveness in challenging search scenarios. In this paper, a broad survey of all utilized approaches is...
In the last decade, the need for storing videos from cataract surgery has increased significantly. Hospitals continue to improve their imaging and recording devices (e.g., microscopes and cameras used in microscopic surgery, such as ophthalmology) to enhance their post-surgical processing efficiency. The video recordings enable many use cases...
This paper presents the first version of our video search system Perfect Match for the Video Browser Showdown 2023 competition. The system indexes videos from the large V3C video dataset and derives visual content descriptors automatically. Furthermore, it provides an interactive video search user interface (UI), which implements approaches from th...
The diveXplore system has been participating in the VBS since 2017 and uses a sophisticated content analysis stack as well as an advanced interface for concept, object, event, and texts search. This year, we perform several changes in order to make the system both much easier to use as well as more efficient. These changes include using shot-based...
Adaptive live video streaming applications utilize a predefined collection of bitrate-resolution pairs, known as a bitrate ladder, for simplicity and efficiency, eliminating the need for additional run-time to determine the optimal pairs for each video. These applications do not incorporate two-pass encoding methods due to increased latency. Howe...
The rise of video streaming applications has increased the demand for Video Quality Assessment (VQA). In 2016, Netflix introduced VMAF, a full reference VQA metric that strongly correlates with perceptual quality, but its computation is time-intensive. This paper proposes a Discrete Cosine Transform (DCT)-energy-based VQA with texture information f...
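To illustrate the kind of DCT-energy feature such a metric can build on, here is a simplified sketch of block-wise DCT energy for a grayscale frame; the 8x8 block size, the exclusion of the DC coefficient, and the input file name are assumptions, not the exact feature definition from the paper.

```python
# Simplified block-wise DCT energy of a grayscale frame (illustrative only).
import cv2
import numpy as np

def dct_energy(frame_gray, block=8):
    h, w = frame_gray.shape
    h, w = h - h % block, w - w % block            # crop to a multiple of the block size
    f = frame_gray[:h, :w].astype(np.float32)
    energy = 0.0
    for y in range(0, h, block):
        for x in range(0, w, block):
            coeffs = cv2.dct(f[y:y + block, x:x + block])
            coeffs[0, 0] = 0.0                     # drop the DC term, keep AC (texture) energy
            energy += float(np.sum(coeffs ** 2))
    return energy / (h * w)                        # normalize by pixel count

frame = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)   # hypothetical input frame
print(dct_energy(frame))
```

Because this only requires a transform and a sum per block, it is far cheaper to compute than a full-reference metric such as VMAF, which is the trade-off the abstract alludes to.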
The Lifelog Search Challenge (LSC) is an interactive benchmarking evaluation workshop for lifelog retrieval systems. The challenge was first organised in 2018 aiming to find the system that can quickly retrieve relevant lifelog images for a given semantic query. This paper provides an analysis of the performance of all 17 systems participating in t...
Adaptive live video streaming applications utilize a predefined collection of bitrate-resolution pairs, known as a bitrate ladder, for simplicity and efficiency, eliminating the need for additional run-time to determine the optimal pairs during the live streaming session. These applications do not incorporate two-pass encoding methods due to inc...
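For context, a bitrate ladder is simply a fixed table of bitrate-resolution pairs from which an entry is picked at run time; the sketch below shows one such table and a trivial selection rule, where the specific rungs and the selection policy are assumptions for illustration, not the ladder used in the paper.

```python
# Illustrative bitrate ladder and a simple rung-selection rule
# (the rung values and the selection policy are assumptions).
BITRATE_LADDER = [
    (145,  "256x144"),
    (365,  "426x240"),
    (730,  "640x360"),
    (1100, "854x480"),
    (3000, "1280x720"),
    (6000, "1920x1080"),
]  # (bitrate in kbit/s, encoding resolution)

def select_rung(available_kbps):
    """Pick the highest rung whose bitrate fits the available throughput."""
    feasible = [r for r in BITRATE_LADDER if r[0] <= available_kbps]
    return feasible[-1] if feasible else BITRATE_LADDER[0]

print(select_rung(2500))   # -> (1100, "854x480")
```

A content-aware approach, as motivated in the abstract, would instead adapt the rungs to each video's characteristics rather than reusing one static table.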
Teaching text-based programming poses significant challenges in both school and university contexts. This study explores the potential of ChatGPT as a sustainable didactic tool to support students, freshmen, and teachers. By focusing on a beginner’s course with examples also relevant to vocational schools, we investigated three research questions....
Visual scanning is achieved by eye movement control for visual information acquisition and cognitive processing, and it plays a critical role in undertaking common sensorimotor tasks such as driving. The specific coordination of the head and eyes, with head motions temporally preceding eye movements, is an important human behavior that makes a key cont...
The retrieval of multimedia content remains a difficult problem where a high accuracy or specificity can often only be achieved interactively, with a user working closely and iteratively with a retrieval system. While there exist several venues for the exchange of insights in the area of information retrieval in general and multimedia retrieval spe...
During the last 10 years of Video Browser Showdown (VBS), there were many different approaches tested for known-item search and ad-hoc search tasks. Undoubtedly, teams incorporating state-of-the-art models from the machine learning domain had an advantage over teams focusing just on interactive interfaces. On the other hand, VBS results indicate th...
Semantic segmentation in cataract surgery has a wide range of applications contributing to surgical outcome enhancement and clinical risk reduction. However, the varying issues in segmenting the different relevant structures in these surgeries make designing a single network quite challenging. This paper proposes a semantic segmentation ne...
Humans flexibly modulate their behavior after receiving and processing information from the environment in a timely manner. To better understand and measure human behavior in the driving process, we integrate humans and the environment as a system. Eye-movement methodologies are used to provide a bridge between humans and the environment. Thus, we conduct a goal...
For the fifth time since 2018, the Lifelog Search Challenge (LSC) facilitated a benchmarking exercise to compare interactive search systems designed for multimodal lifelogs. LSC'22 attracted nine participating research groups who developed interactive lifelog retrieval systems enabling fast and effective access to lifelogs. The systems competed in...
In the last decade, user-centric video search competitions have facilitated the evolution of interactive video search systems. So far, these competitions focused on a small number of search task categories, with few attempts to change task category configurations. Based on our extensive experience with interactive video search contests, we have ana...
The Video Browser Showdown addresses difficult video search challenges through an annual interactive evaluation campaign attracting research teams focusing on interactive video retrieval. The campaign aims to provide insights into the performance of participating interactive video retrieval systems, tested by selected search tasks on large video co...
Endometriosis is a common gynecologic condition typically treated via laparoscopic surgery. Its visual versatility makes it hard to identify for non-specialized physicians and challenging to classify or localize via computer-aided analysis. In this work, we take a first step in the direction of localized endometriosis recognition in laparoscopic gy...
Having participated continuously since the sixth Video Browser Showdown (VBS2017), diveXplore is a veteran interactive search system that has offered and evaluated numerous features throughout its lifetime. After undergoing major refactoring for the most recent VBS2021, however, the system since version 5.0 is less feature-rich, yet more modern, leaner...
Semantic segmentation in surgical videos is a prerequisite for a broad range of applications towards improving surgical outcomes and surgical video analysis. However, semantic segmentation in surgical videos involves many challenges. In particular, in cataract surgery, various features of the relevant objects such as blunt edges, color and context...
In light of the increased use of premium intraocular lenses (IOLs), such as EDOF, multifocal, or toric IOLs, even minor intraoperative complications such as decentrations or an IOL tilt will hamper the visual performance of these IOLs. Thus, the post-operative analysis of cataract surgeries to detect even minor intraoperative deviations...
A critical complication after cataract surgery is the dislocation of the lens implant, leading to vision deterioration and eye trauma. In order to reduce the risk of this complication, it is vital to discover the risk factors during the surgery. However, studying the relationship between lens dislocation and its suspected risk factors using numerou...
Semantic segmentation in cataract surgery has a wide range of applications contributing to surgical outcome enhancement and clinical risk reduction. However, the varying issues in segmenting the different relevant instances make designing a single network quite challenging. This paper proposes a semantic segmentation network termed Deep...
The Lifelog Search Challenge (LSC) is an annual benchmarking challenge for comparing approaches to interactive retrieval from multi-modal lifelogs. LSC'21, the fourth challenge, attracted sixteen participants, each of which had developed interactive retrieval systems for large multimodal lifelogs. These interactive retrieval systems participated in...
Comprehensive and fair performance evaluation of information retrieval systems represents an essential task for the current information age. Whereas Cranfield-based evaluations with benchmark datasets support development of retrieval models, significant evaluation efforts are required also for user-oriented systems that try to boost performance wit...
In cataract surgery, the operation is performed with the help of a microscope. Since the microscope allows only up to two people to watch the surgery in real time, a major part of surgical training is conducted using the recorded videos. To optimize the training procedure with the video content, surgeons require an automatic relevance detection ap...
For research results to be comparable, it is important to have common datasets for experimentation and evaluation. The size of such datasets, however, can be an obstacle to their use. The Vimeo Creative Commons Collection (V3C) is a video dataset designed to be representative of video content found on the web, containing roughly 3800 hours of video...
Eye movement behavior, which supports visual information acquisition and processing, plays an important role when human beings perform everyday sensorimotor tasks such as driving. In performing such tasks, eye movement arises from a specific coordination of head and eyes during gaze changes, with hea...
Despite all its irrefutable benefits, the development of steganography methods has sparked ever-increasing concerns over steganography abuse in recent decades. To prevent the inimical usage of steganography, steganalysis approaches have been introduced. Since motion vector manipulation leads to random and indirect changes in the statistics of video...
As a longstanding participating system in the annual Video Browser Showdown (VBS2017-VBS2020) as well as in two iterations of the more recently established Lifelog Search Challenge (LSC2018-LSC2019), diveXplore is developed as a feature-rich Deep Interactive Video Exploration system. After its initial successful employment as a competitive tool at...
We present our NoShot Video Browser, which has been successfully used at the last Video Browser Showdown competition, VBS2020 at MMM2020. NoShot owes its name to the fact that it neither makes use of any kind of shot detection nor utilizes the VBS master shots. Instead, videos are split into frames with a time distance of one second. The...
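A minimal sketch of that one-frame-per-second sampling step is shown below, using OpenCV; the file names are placeholders, not the actual NoShot code.

```python
# Illustrative one-frame-per-second sampling of a video (placeholder paths).
import cv2

cap = cv2.VideoCapture("video.mp4")                # hypothetical input video
fps = cap.get(cv2.CAP_PROP_FPS) or 25.0            # fall back if FPS metadata is missing
frame_idx, saved = 0, 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    if frame_idx % int(round(fps)) == 0:           # keep one frame per second of video
        cv2.imwrite(f"frame_{saved:06d}.jpg", frame)
        saved += 1
    frame_idx += 1
cap.release()
```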
We present IVOS, an interactive video content search system that allows for object-based search and filtering in video archives. The main idea behind it is to use the results of recent object detection models to index all keyframes with a manageable set of object classes and to allow the user to filter by different characteristics, such as object name, o...
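A toy sketch of such an object-class index over keyframes follows; the detection results and attribute names are fabricated placeholders, not the IVOS implementation or its data.

```python
# Toy inverted index from object class to keyframes, used for object-based filtering
# (the detections and attribute values below are placeholders).
from collections import defaultdict

# keyframe -> detected objects as (class name, relative size, position)
detections = {
    "v01_kf003.jpg": [("car", 0.30, "left"), ("person", 0.05, "right")],
    "v02_kf017.jpg": [("dog", 0.12, "center")],
}

index = defaultdict(list)
for keyframe, objects in detections.items():
    for name, size, pos in objects:
        index[name].append((keyframe, size, pos))

# Filter: keyframes containing a "car" covering at least 20% of the frame.
hits = [kf for kf, size, pos in index["car"] if size >= 0.2]
print(hits)
```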
The two-volume set LNCS 12572 and 12573 constitutes the thoroughly refereed proceedings of the 27th International Conference on MultiMedia Modeling, MMM 2021, held in Prague, Czech Republic, in June 2021.
Of the 211 submitted regular papers, 40 papers were selected for oral presentation and 33 for poster presentation; 16 special session papers were a...