About
134
Publications
72,984
Reads
4,381
Citations
Additional affiliations
January 2006 - present
January 2000 - February 2015
Publications (134)
Machine Learning is tasked with extracting relationships from data that are relevant to the user. In this chapter, we formulate a simple linear model, the logistic regression model, which predicts the corresponding output for given inputs. The goal is to automatically find the relations between the existing input values and the output category in t...
This chapter shows that deep neural networks (DNNs) can creatively generate novel images, text, music, and dialogue. In the case of images, generative adversarial networks (GANs) are able to create images with specific properties or style features. In addition, they can convert images from one type to another, such as a photo to a painting. For aut...
For more complex problems, simple linear models are insufficient. A way out is offered by models with several nonlinear layers (operators), which can represent arbitrary “curved” relationships between inputs and outputs. This chapter describes the properties of such deep neural networks and shows how to find the optimal parameters using the backpro...
In recent years, advances in computer processing power and the availability of suitable programming environments and algorithms have made it possible to solve some Artificial Intelligence subtasks in a satisfactory manner. This chapter provides an informal overview of the state of the art. Of particular importance here is the interpretation of sens...
The vast majority of information in our society is available as written text. Therefore, this chapter describes the extraction of knowledge from written text. In deep neural networks (DNN), words, sentences and documents are usually represented by embedding vectors. While simple embedding creation methods can only be used to approximate the meaning...
This chapter describes models for speech recognition, i.e. for transferring spoken language into text. Speech recognizers use derived sound features for small time intervals as input. For speech processing, deep sequence-to-sequence models based on LSTM or transformers are used, which generate the recognized text. Alternatively, Convolutional Neura...
Reinforcement learning is an area of Machine Learning in which a software program (agent) must select an action at each time step with the goal of achieving the highest possible sum of rewards over time. An action is determined based on the current state and affects the reward, often many time steps later. Example applications include games, robot...
Recently, the term Artificial Intelligence (AI) came into the focus of public discussion. An Artificial Intelligence system is supposed to be able to perceive its environment and behave intelligently, similar to humans. However, this definition is imprecise because the term “intelligence” is difficult to delineate. Therefore, this chapter discusses...
Image recognition is about finding automatic methods to identify objects and their arrangement in an image or photo. This includes classifying the image objects and determining their position in the image. The majority of DNNs for image processing are Convolutional Neural Networks (CNN). They use layers with small receptive fields (convolutions), w...
Artificial intelligence has established itself as a central trend topic in the global technology industry in recent years. It is realized through deep neural networks and offers a wide range of opportunities and innovation potential, for example in the smart home, in medicine and in industrial applications. AI has a huge impact on economic developm...
This chapter presents the main architecture types of attention-based language models, which describe the distribution of tokens in texts: Autoencoders similar to BERT receive an input text and produce a contextual embedding for each token. Autoregressive language models similar to GPT receive a subsequence of tokens as input. They produce a context...
In this chapter we consider Information Extraction approaches that automatically identify structured information in text documents and comprise a set of tasks. The Text Classification task assigns a document to one or more pre-defined content categories or classes. This includes many subtasks such as language identification, sentiment analysis, etc....
Foundation Models emerged as a new paradigm in sequence interpretation that can be used for a large number of tasks to understand our environment. They offer the remarkable property of combining sensory input (sound, images, video) with symbolic interpretation of text and may even include action and DNA sequences. We briefly recap the process of pr...
Foundation Models are able to model not only tokens of natural language but also token elements of arbitrary sequences. For images, square image patches can be represented as tokens; for videos, we can define tubelets that span an image patch across multiple frames. Subsequently, the proven self-attention algorithms can be applied to these tokens....
During pre-training, a Foundation Model is trained on an extensive collection of documents and learns the distribution of words in correct and fluent language. In this chapter, we investigate the knowledge acquired by PLMs and the larger Foundation Models. We first discuss the application of Foundation Models to specific benchmarks to test knowledg...
This chapter discusses Foundation Models for Text Generation. This includes systems for Document Retrieval, which accept a query and return an ordered list of text documents from a document collection, often evaluating the similarity of embeddings to retrieve relevant text passages. Question Answering systems are given a natural language question a...
This chapter describes a number of different approaches to improve the performance of Pre-trained Language Models (PLMs), i.e. variants of BERT, autoregressive language models similar to GPT, and sequence-to-sequence models like Transformers. First we may modify the pre-training tasks to learn as much as possible about the syntax and semantics of l...
This open access book provides a comprehensive overview of the state of the art in research and applications of Foundation Models and is intended for readers familiar with basic Natural Language Processing (NLP) concepts. In recent years, a revolutionary new paradigm has been developed for training models for NLP. These models are first pre-t...
Download location: https://ki-verband.de/leam-machbarkeitsstudie-2023/
In 2002, the New York Times published an article headlined „Google's toughest search is for a Business Model“ (Hansell, 2002). The author of the article was convinced that Google would not be able to hold its own against its competitors of the time in the online advertising business...
in Linux Magazin 2022
https://www.linux-magazin.de/ausgaben/2022/09/ki-und-sprache/
Whether voice assistants, chatbots, or the automatic analysis of documents: rapid developments in AI have made language technologies ubiquitous. But how does AI manage to understand the subtleties of human language?
What does Natural Language Processing mean, what is behind GPT-3, and how do chatbots actually work? Answers to these questions are provided by the new study »Moderne Sprachtechnologien – Konzepte, Anwendungen, Chancen« by KI.NRW. In a comprehensive introduction, scientists from the Fraunhofer Institute for Intelligent Analysis and...
Discuss how machine learning and deep learning methods may be made transparent and reliable. Discuss general capabilities of deep learning.
Reinforcement learning is an area of machine learning in which a software program (agent) must select an action at each time step with the goal of achieving the highest possible sum of rewards over time. An action is determined based on the current state and often, only after many time steps, affects...
The vast majority of information in our society is available as written text. This chapter therefore describes the extraction of knowledge from written text. In deep neural networks (DNNs), words, sentences, and documents are usually represented by embedding vectors. While simple methods for determining embeddings only...
This chapter shows that deep neural networks (DNNs) can creatively generate novel images, text, music, and dialogue. For images, generative adversarial networks (GANs) are able to generate images with specific properties or style features. In addition, they can convert images of one type into another type, e.g. a photo...
In recent years, advances in computer processing power and the availability of suitable programming environments and algorithms have made it possible to solve some subtasks of Artificial Intelligence in a satisfactory manner. This chapter provides an informal overview of the state of the art. Particularly important...
Artificial Intelligence (AI) is already present in our everyday lives and will be encountered in almost all areas of life in the future, from image-based diagnosis in medicine to autonomous driving and intelligent machine maintenance in industry to voice control in the smart home. The potential of AI is enormous; at the same...
This chapter describes models for speech recognition, i.e. for transferring spoken language into text. Speech recognizers use derived features for small time intervals as input. For processing, deep sequence-to-sequence models based on LSTMs or transformers are used on the one hand, which output the recognized text. As an alter...
Artificial Intelligence has established itself in recent years as a central trend topic in the global technology industry. It is realized through deep neural networks and offers a wide range of opportunities and innovation potential, for example in the smart home, in medicine, and in industrial applications. AI has an enormous impact on econom...
Recently, the term Artificial Intelligence (AI) has been on everyone's lips. An Artificial Intelligence system is supposed to be able to perceive its environment and behave intelligently, similar to a human. However, this definition is imprecise because the term “intelligence” is difficult to delineate. This chapter therefore...
Image recognition seeks automatic methods for identifying objects in an image or photo. This involves, on the one hand, classifying the image objects and, on the other, determining their position in the image. The vast majority of DNNs for image processing are Convolutional Neural Networks (CNNs). Their lay...
For more complex problems, simple linear models are insufficient. A way out is offered by models with several nonlinear layers (operators), which can represent arbitrary relationships between inputs and outputs. This chapter describes the properties of such deep neural networks and shows how, with the help of the backp...
Machine learning has the task of reconstructing relationships from data that are relevant to the user. In this chapter, a simple linear model, the logistic regression model, is formulated, which predicts the corresponding output for arbitrary inputs. The goal is to find the relations between the existing input and output...
Objective:
The aim of this study was to assess (1) whether vasoreactivity is altered in patients with epilepsy and (2) whether the two most commonly used approaches, the trans-Sylvian (TS) and the trans-cortical (TC) route, differ in their impact on cortical blood flow.
Methods:
Patients were randomized to undergo selective amygdalohippocampecto...
Understanding the semantics of text plays an important role in many real-world applications such as machine translation, information extraction, sentiment detection, summarization, etc. Semantic Role Labeling (SRL) is an important NLP task which for each verb assigns semantic roles (such as agent, patient, instrument, etc.) to different phrases in...
The extraction of semantics of unstructured documents requires the recognition and classification of textual patterns, their variability and their interrelationships, i.e. the analysis of the linguistic structure of documents. Being the integral part of a larger real-life application, this linguistic analysis process must be robust, fast and adapt...
In this paper the elicitation of probabilities from human experts is considered as a measurement process, which may be disturbed by random 'measurement noise'. Using Bayesian concepts a second order probability distribution is derived reflecting the uncertainty of the input probabilities. The algorithm is based on an approximate sample representati...
The EM-algorithm is a general procedure to get maximum likelihood estimates if part of the observations on the variables of a network are missing. In this paper a stochastic version of the algorithm is adapted to probabilistic neural networks describing the associative dependency of variables. These networks have a probability distribution, which i...
Probabilistic reasoning systems combine different probabilistic rules and probabilistic facts to arrive at the desired probability values of consequences. In this paper we describe the MESA-algorithm (Maximum Entropy by Simulated Annealing) that derives a joint distribution of variables or propositions. It takes into account the reliability of prob...
Semantic technologies immensely simplify the content indexing of digital documents. With „Smart Semantics”, which Fraunhofer IAIS developed in the context of the THESEUS program, it is possible to find important relationships in texts and use them for new applications. Using a mobile app and a company example...
In news stories verbatim quotes of persons play a very important role, as they carry reliable information about the opinion of that person concerning specific aspects. As thousands of new quotes are published every hour it is very difficult to keep track of them. In this paper we describe a set of algorithms to solve the knowledge management proble...
During the last decades, the disciplines of Data Mining and Operations Research have been working mostly independently of each other. However, the increasing complexity of today's applications in areas such as business, medicine, and science requires more and more interaction between both disciplines. On the one hand, several data mining algorithms a...
Name ambiguity arises from the polysemy of names and causes uncertainty about the true identity of entities referenced in unstructured text. This is a major problem in areas like information retrieval or knowledge management, for example when searching for a specific entity or updating an existing knowledge base.
We approach this problem of named e...
The backbone of the information age is digital information which may be searched, accessed, and transferred instantaneously. Therefore the digitization of paper documents is extremely interesting. This chapter describes approaches for document structure recognition detecting the hierarchy of physical components in images of documents, such as pages...
An important step for understanding the semantic content of text is the extraction of semantic relations between entities in natural language documents. Automatic extraction techniques have to be able to identify different versions of the same relation which usually may be expressed in a great variety of ways. Therefore these techniques benefit fro...
Phishing is a serious threat to global security and economy. Previously we have developed a phishing filtering system based on automatic classification. We perform statistical filtering of emails, where a classifier is trained on characteristic features of existing emails and subsequently is able to identify new phishing emails with different contents. I...
The automatic extraction of relations from unstructured natural text is challenging but offers practical solutions for many problems like automatic text understanding and semantic retrieval. Relation extraction can be formulated as a classification problem using support vector machines and kernels for structured data that may include parse trees to...
The output of a speech recognition system is a stream of text features that is overlaid by noise resulting from errors in the system's statistical classification of the audio input. Conditional random fields (CRFs), which have already proven themselves to be efficient, high-performance named entity recognizers (NERs) for named entities from text,...
In recent years, text mining has moved far beyond the classical problem of text classification with an increased interest in more sophisticated processing of large text corpora, such as, for example, evaluations of complex queries. This and several other tasks are based on the essential step of relation extraction. This problem becomes a typica...
The annotation of words and phrases by ontology concepts is extremely helpful for semantic interpretation. However many ontologies, e.g. WordNet, are too fine-grained and even human annotators often have disagreements about the precise word sense. Therefore we use coarse-grained supersenses of WordNet. We employ conditional random fields (CRFs) to...
Phishing emails usually contain a message from a credible looking source requesting a user to click a link to a website where she/he is asked to enter a password or other confidential information. Most phishing emails aim at withdrawing money from financial institutions or getting access to private information. Phishing has increased enormously ove...
The automatic extraction of relations between entities expressed in natural language text is an important problem for IR and text understanding. In this paper we show how different kernels for parse trees can be combined to improve the relation extraction quality. On a public benchmark dataset the combination of a kernel for phrase grammar parse tr...
In recognition of knowledge as a valuable resource, there is a whole spectrum of processes, methods and systems for the generation, identification, representation, distribution and communication of knowledge, which aim to provide targeted support to individuals, organisations and enterprises, particularly in solving knowledge-based tasks. This is k...
One major problem in text mining and semantic retrieval is that detected entity mentions have to be assigned to the true underlying entity. The ambiguity of a name results from both the polysemy and synonymy problem, as the name of a unique entity may be written in variant ways and different unique entities may have the same name. The term "bush"...
Computer supported communication and infrastructure are integral parts of modern economy. Their security is of incredible importance to a wide variety of practical domains ranging from Internet service providers to the banking industry and e-commerce, from corporate networks to the intelligence community. The CSI-KDD workshop focuses on novel knowl...
Methods of acquiring, seeking and processing knowledge are a strategically vital issue in the context of globalized competition. One of the main subjects currently being researched is the development of semantic technologies that are capable of recognizing and classifying the content and meaning of information (words, pictures or sounds). In the co...
An important problem in text mining and semantic retrieval is entity resolution which aims at detecting the identity of a named entity. Note that the name of a unique entity may be written in variant ways and different unique entities may have the same name. The term "bush" for instance may refer to a woody plant, a mechanical fixing, 52 persons an...
Salting is the intentional addition or distortion of content, aimed to evade automatic filtering. Salting is usually found in spam emails. Salting can also be hidden in phishing emails, which aim to steal personal information from users. We present a novel method that detects hidden salting tricks as visual anomalies in text. We solely use these sa...
Phishing emails are a real threat to internet communication and web economy. Criminals are trying to convince unsuspecting online users to reveal passwords, account numbers, social security numbers or other personal information. Filtering approaches using blacklists are not completely effective as about every minute a new phishing scam is...
Spam and phishing emails are not only annoying to users, but are a real threat to internet communication and web economy. The fight against unwanted emails has become a cat-and-mouse game between criminals and people trying to develop techniques for detecting such unwanted emails. Criminals are constantly developing new tricks and adopt the ones...
We categorise contributions to an e-discussion platform using Classifier Induced Semantic Spaces and Self-Organising Maps. Analysing the contributions delivers insight into the nature of the communication process, makes it more comprehensible and renders the resulting decisions more transparent. Additionally, it can serve as a basis to monitor how...
As the internet becomes more pervasive in all areas of human activity, attackers can use the anonymity of cyberspace to commit crimes and compromise the IT infrastructure. As currently there is no generally implemented authentication technology we have to monitor the contents and relations of messages and internet traffic to detect infringeme...
We present an algorithm that is able to integrate uncertain probability statements of different default levels. In case of conflict between statements of different levels the statements of the lower levels are ignored. The approach is applicable to inference networks of arbitrary structure including loops and cycles. The simulated annealing algorit...
The Breeder Genetic Algorithm (BGA) is based on the equation for the response to selection. In order to use this equation for prediction, the variance of the fitness of the population has to be estimated. For the usual sexual recombination the computation can be difficult. In this paper we briefly state the problem and investigate several modificat...
The enormous amount of information stored in unstructured texts cannot simply be used for further processing by computers, which typically handle text as simple sequences of character strings. Therefore, specific (pre-)processing methods and algorithms are required in order to extract useful patterns. Text mining refers generally to the process of...
We investigate the performance of text mining systems for annotating press articles in two real-world press archives. Seven commercial systems are tested which recover the categories of a document as well as named entities and catchphrases. Using cross-validation we evaluate the precision-recall characteristic. Depending on the depth of the category t...
The goal of the paper is to give an overview on the state of the art of data mining and text mining approaches which are useful for bibliometrics and patent databases. The paper explains the basics of data mining in a non-technical manner. Basic approaches from statistics and machine learning are introduced in order to clarify the groundwork of dat...
In this paper, we present a new approach for classifying video content into semantic classes at a high level of abstraction by exploiting the connoted visual code. The method is based on the concept of supervised learning algorithms that have already been applied for the classification of written text and spoken language quite successfully. In orde...
An ontology is a specification of a conceptualization, a shared understanding of some domain of interest. The paper develops an algorithm that hierarchically groups words together which conceptually belong together. We assume conceptual similarity if words often appear in the same context. This leads to a hierarchical extension of Bayesian probab...
Support Vector Machines (SVM) can classify objects described by an effectively infinite-dimensional feature vector. This gives them the ability to use counts of different words in a document, i.e. more than 100000 words, directly for classification. In this paper we describe the results of a large number of experiments of different preprocessing st...
Politicians, planners and social scientists have an increasing need for tools clarifying the spatial distribution of relevant features. Special interest is in predicting changes in a what-if analysis: what would happen if we change some features in a specific way. To predict future developments requires a statistical model with inherent modelling u...
In this paper we explore the use of text-mining methods for the identification of the author of a text. We apply the support vector machine (SVM) to this problem: as it is able to cope with half a million inputs, it requires no feature selection and can process the frequency vector of all words of a text. We performed a number of experiments with...
We have developed a novel approach to determine the similarity of documents using probabilistic latent semantic indexing. For each document a probability vector of latent factors is estimated which on the one hand takes into account the distribution of words in the text and on the other hand the distribution of category values. Th...
In this paper we use SVMs to classify spoken and written documents. We show that classification accuracy for written material is improved by the utilization of strings of sub-word units with dramatic gains for small topic categories. The classification of spoken documents for large categories using sub-word units is only slightly worse than for wri...
We extend a multi-class categorization scheme proposed by Dietterich and Bakiri (1995) for binary classifiers, using error correcting codes. The extension comprises the computation of the codes by a simulated annealing algorithm and optimization of Kullback-Leibler (KL) category distances within the code-words. For the first time, we apply the scheme...
Automatic text categorization has become a vital topic in many applications. Imagine for example the automatic classification of Internet pages for a search engine database. The traditional 1-of-n output coding for classification scheme needs resources increasing linearly with the number of classes. A different solution uses an error correcting cod...
Complex classification models like neural networks usually have lower errors than simple models. They often have very many interdependent parameters, whose effects no longer can be understood by the user. For many applications, especially in the financial industry, it is vital to understand the reasons why a classification model arrives at a specif...
Bayesian evolutionary algorithms (BEAs) are a probabilistic model of evolutionary computation for learning and optimization. Starting from a population of individuals drawn from a prior distribution, a Bayesian evolutionary algorithm iteratively generates a new population by estimating the posterior fitness distribution of parent individuals and th...
If the collection of training data is costly, one can gain by actively selecting particular informative data points in a sequential way. In a Bayesian decision theoretic framework we develop a query selection criterion for classification models which explicitly takes into account the utility of decisions. We determine the overall utility and its de...
Due to the high number of insolvencies in the credit business, automatic procedures for testing the credit-worthiness of enterprises become increasingly important. For this task we use classification trees with soft splits which assign the observations near the split boundary to both branches. Tree models involve an extra complication as the number...
We develop a Bayesian procedure for classification with trees by switching between different model structures. For classification trees with overlap we use a Markov chain Monte Carlo procedure to produce an ensemble of trees which allow the assessment of prediction uncertainty and the value of new information. The approach is applied to a large cre...