# Rémi Eyraud

Hubert Curien Laboratory · Data Intelligence Team

Ph.D

## About

33 publications · 6,851 reads · 481 citations

Additional affiliations

September 2007 - present

Education

August 2003 - November 2006

## Publications

Primary open-angle glaucoma (POAG) is a frequent blindness-causing neurodegenerative disorder characterized by optic nerve and retinal ganglion cell damage most commonly due to a chronic increase in intraocular pressure. The preservation of visual function in patients critically depends on the timeliness of detection and treatment of the disease, w...

Simple Summary: The toolkit for diagnosing glioblastoma (GBM), the most aggressive primary brain tumor, is very limited. We recently demonstrated that plasma denaturation profiles (PDPs) of GBM patients and healthy controls obtained with nanoDSF can be automatically classified using artificial intelligence (AI) algorithms. Since PDPs have been shown t...

This paper is an attempt to bridge the gap between deep learning and grammatical inference. It provides an algorithm to extract a (stochastic) formal language from any recurrent neural network trained for language modelling. In detail, the algorithm uses the already trained network as an oracle, and thus does not require access to the in...

Simple Summary: Brain cancers, such as gliomas, are very difficult to detect because of their localization and late onset of symptoms. Here, we have developed a novel cancer detection method based on plasma denaturation profiles obtained by a non-conventional use of Differential Scanning Fluorimetry. Using blood samples from glioma patients and heal...

This paper is an attempt to bridge the gap between deep learning and grammatical inference. It provides an algorithm to extract a (stochastic) formal language from any recurrent neural network trained for language modelling. In detail, the algorithm uses the already trained network as an oracle, and thus does not require access to the...

We describe a novel cancer diagnostic method based on plasma denaturation profiles obtained by a non-conventional use of Differential Scanning Fluorimetry. We show that 84 glioma patients and 63 healthy controls can be automatically classified using denaturation profiles with the help of machine learning algorithms with 92% accuracy. Proposed high...

Background: Differential scanning fluorimetry (DSF) has recently been proposed for high-throughput biofluid profiling by following protein denaturation. Our objective was to discriminate patients with glioma from healthy controls using plasma DSF profiles.
Material and Methods: We included 78 glioma patients and 44 healthy cont...

This paper examines the characterization and learning of grammars defined with enriched representational models. Model-theoretic approaches to formal language theory traditionally assume that each position in a string belongs to exactly one unary relation. We consider unconventional string models where positions can have multiple, shared properties...

Understanding how a learned black box works is of crucial interest for the future of Machine Learning. In this paper, we pioneer the question of the global interpretability of learned black box models that assign numerical values to symbolic sequential data. To tackle that task, we propose a spectral algorithm for the extraction of weighted automat...

Though graph grammars have been widely investigated for 40 years, few learning results exist for them. The main reasons come from complexity issues that are inherent when graphs, and a fortiori graph grammars, are considered. The picture is however different if one considers drawings of graphs, rather than the graphs themselves. For instance, it ha...

The Sequence PredIction ChallengE (SPiCe) is an on-line competition that took place between March and July 2016. Each of the 15 problems was made of a set of whole sequences as training sample, a validation set of prefixes, and a test set of prefixes. The aim was to submit a ranking of the 5 most probable symbols to be the next symbol of each prefi...

The most widely used learning paradigm in Grammatical Inference was introduced in 1967 and is known as identification in the limit. An important issue that has been raised with respect to the original definition is the absence of efficiency bounds. Nearly fifty years after its introduction, it remains an open problem how to best incorporate a notio...

This paper characterizes a subclass of subsequential string-to-string functions called Output Strictly Local (OSL) and presents a learning algorithm which provably learns any OSL function in polynomial time and data. This algorithm is more efficient than other existing ones capable of learning this class. The OSL class is motivated by the study of...

We define two proper subclasses of subsequential functions based on the concept of Strict Locality (McNaughton and Papert, 1971; Rogers and Pullum, 2011; Rogers et al., 2013) for formal languages. They are called Input and Output Strictly Local (ISL and OSL). We provide an automata-theoretic characterization of the ISL class and theorems establishi...

In this paper, we present a new algorithm that can identify in polynomial time and data using positive examples any class of subsequential functions that share a particular finite-state structure. While this structure is given to the learner a priori, it allows for the exact learning of partial functions, and both the time and data complexity of th...

Approximating distributions over strings is a hard learning problem. Typical techniques involve using finite state machines as models and attempting to learn these; these machines can either be hand built and then have their weights estimated, or built by grammatical inference techniques: the structure and the weights are then learned simultaneousl...

While some heuristics exist for the learning of graph grammars, little has been done on the theoretical side. Due to complexity issues, the class of graphs has to be restricted: this paper deals with the subclass of plane graphs, which correspond to drawings of planar graphs. This allows us to introduce a new kind of graph grammars, using a face-repla...

We present a polynomial update time algorithm for the inductive inference of a large class of context-free languages using the paradigm of positive data and a membership oracle. We achieve this result by moving to a novel representation, called Contextual Binary Feature Grammars (CBFGs), which are capable of representing richly structured context-f...

Contextual Binary Feature Grammars were recently proposed by (Clark et al., 2008) as a learnable representation for richly structured context-free and context sensitive languages. In this paper we examine the representational power of the formalism, its relationship to other standard formalisms and language classes, and its appropriateness for mode...

We present a polynomial algorithm for the inductive inference of a large class of context free languages, that includes all regular languages. The algorithm uses a representation which we call Binary Feature Grammars based on a set of features, capable of representing richly structured context free languages as well as some context sensitive langua...

This paper formalises the idea of substitutability introduced by Zellig Harris in the 1950s and makes it the basis for a learning algorithm from positive data only for a subclass of context-free languages. We show that there is a polynomial characteristic set, and thus prove polynomial identification in the limit of this class. We discuss the relat...

Whereas there are a number of methods and algorithms to learn regular languages, moving up the Chomsky hierarchy is proving to be a challenging task. Indeed, several theoretical barriers make the class of context-free languages hard to learn. To tackle these barriers, we choose to change the way we represent these languages. Among the formalisms t...

We present a simple context-free grammatical inference algorithm, and prove that it is capable of learning an interesting subclass of context-free languages. We also demonstrate that an implementation of this algorithm is capable of learning auxiliary fronting in polar interrogatives (AFIPI) in English. This has been one of the most important test...

This paper formalises the idea of substitutability introduced by Zellig Harris in the 1950s and makes it the basis for a learning algorithm from positive data only for a subclass of context-free grammars. We show that there is a polynomial characteristic set, and thus prove polynomial identification in the limit of this class. We discuss the relat...

Powerful methods and algorithms are known to learn regular languages. Aiming at extending them to more complex grammars, we choose to change the way we represent these languages. Among the formalisms that allow one to define classes of languages, that of string-rewriting systems (SRS) has outstanding properties. Indeed, SRS are expressive enough to...

Approximating distributions over strings is a hard learning problem. Typical techniques involve using finite state machines as models and attempting to learn these; these machines can either be hand built and then have their weights estimated, or built by grammatical inference techniques: the structure and the weights are then learnt simultaneously...

The machine learning team of the "Laboratoire d'Informatique Fondamentale de Marseille" (ML-LIF) is a Joint Research Unit of the Université de Provence and the CNRS. All of the 5 professors/assistant professors of the team are involved in the organisation of a spring school in machine learning that will be held at the end of next May. The particu...