Article

A bibliographical study of grammatical inference


Abstract

The field of grammatical inference (also known as grammar induction) is transversal to a number of research areas including machine learning, formal language theory, syntactic and structural pattern recognition, computational linguistics, computational biology and speech recognition. There is no uniform literature on the subject and one can find many papers with original definitions or points of view. This makes research on this subject very hard, mainly for a beginner or for someone who does not wish to become a specialist but simply to find the most suitable ideas for his or her own research activity. The goal of this paper is to introduce a certain number of papers related to grammatical inference. Some of these papers are essential and should constitute a common background to research in the area, whereas others are specialized in particular problems or techniques, but can be of great help on specific tasks.


... Grammar Inference, also called Grammar Induction or Grammatical Inference, is the process of learning a grammar from examples, either positive (i.e., strings the grammar generates) and/or negative (i.e., strings the grammar does not generate) [1,2]. Grammar Inference has been applied successfully to many diverse domains, such as Language Acquisition [3], Pattern Recognition [4], Speech Recognition [5], Computational Biology [6,7], Robotics and Software Engineering [8,9]. ...
... An example for the Robot language, begin right up up end, is shown in Figure 2, where the robot stopped in position (1,2). The current position of the robot is adjusted with four commands: left (decrease the x coordinate by 1), right (increase the x coordinate by 1), down (decrease the y coordinate by 1) and up (increase the y coordinate by 1). ...
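The command semantics in the snippet above can be made concrete with a tiny interpreter (an illustrative sketch; the `run` helper and its error handling are my own, not taken from the cited work):

```python
# Minimal interpreter for the Robot language described above (illustrative sketch).
# A program is "begin <commands> end"; each command moves the robot one step.

MOVES = {
    "left":  (-1, 0),   # decrease the x coordinate by 1
    "right": (1, 0),    # increase the x coordinate by 1
    "down":  (0, -1),   # decrease the y coordinate by 1
    "up":    (0, 1),    # increase the y coordinate by 1
}

def run(program: str) -> tuple[int, int]:
    tokens = program.split()
    if tokens[0] != "begin" or tokens[-1] != "end":
        raise ValueError("program must be wrapped in 'begin ... end'")
    x = y = 0
    for cmd in tokens[1:-1]:
        dx, dy = MOVES[cmd]
        x, y = x + dx, y + dy
    return (x, y)

print(run("begin right up up end"))  # -> (1, 2)
```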
... The Grammar Inference process [1,2] can be stated as follows: given a set of positive samples S+ and a set of negative samples S− (which might also be empty), find at least one grammar G such that S+ ⊆ L(G) and S− ∩ L(G) = ∅, where L(G) is the language generated by G. This theoretical problem has triggered much research applying Grammar Inference to various grammar-based systems [10]. ...
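The problem statement above suggests a direct consistency check: given a candidate grammar (here represented as a DFA), verify that every positive sample is accepted and no negative sample is. A minimal sketch, with a hypothetical target language of strings ending in 'a':

```python
# Checking a candidate grammar (here a DFA) against positive and negative
# samples, as in the Grammar Inference problem statement above (sketch).

def accepts(dfa, word):
    """dfa = (transitions, start, accepting); transitions: (state, sym) -> state."""
    trans, state, accepting = dfa
    for sym in word:
        state = trans.get((state, sym))
        if state is None:
            return False
    return state in accepting

def consistent(dfa, positives, negatives):
    return (all(accepts(dfa, w) for w in positives) and
            not any(accepts(dfa, w) for w in negatives))

# Hypothetical target: strings over {a, b} ending in 'a'.
dfa = ({(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 0}, 0, {1})
print(consistent(dfa, positives=["a", "ba", "aba"], negatives=["", "b", "ab"]))  # True
```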
Thesis
Full-text available
The presented doctoral dissertation describes research work on Semantic Inference, which can be regarded as an extension of Grammar Inference. The main task of Grammar Inference is to induce a grammatical structure from a set of positive samples (programs), which can sometimes also be accompanied by a set of negative samples. Successfully applying Grammar Inference can result only in identifying the correct syntax of a language. But, when valid syntactical structures are additionally constrained with context-sensitive information, Grammar Inference needs to be extended to Semantic Inference. With Semantic Inference a further step is realised, namely towards inducing language semantics. In this doctoral dissertation it is shown that a complete compiler/interpreter for small Domain-Specific Languages (DSLs) can be generated automatically solely from given programs and their associated meanings using Semantic Inference. For the purpose of this research work the tool LISA.SI has been developed on top of the compiler/interpreter generator tool LISA; it uses Evolutionary Computation to explore and exploit the enormous search space that appears in Semantic Inference. A wide class of Attribute Grammars has been learned. Using a Genetic Programming approach, S-attributed and L-attributed grammars have been inferred successfully, while inferring Absolutely Non-Circular Attribute Grammars (ANC-AG) with complex dependencies among attributes has been achieved by integrating a Memetic Algorithm (MA) into the LISA.SI tool. In addition to the Memetic Algorithm, this research work presents several other approaches implemented in LISA.SI that are used to improve the performance of Semantic Inference, specifically Long Term Memory Assistance (LTMA) for duplicate identification and multi-threaded fitness evaluation using a self-tuning thread pool. Semantic Inference is a hot topic in many areas such as Image Processing, Natural Language Processing, the Semantic Web, etc.
As such, it can be useful not only for automatically generating compilers/interpreters, but also in areas that are difficult to predict at the moment.
... Grammar Inference, also called Grammar Induction or Grammatical Inference, is the process of learning a grammar from examples, either positive (i.e., strings the grammar generates) and/or negative (i.e., strings the grammar does not generate) [1,2]. Grammar Inference has been applied successfully to many diverse domains, such as Speech Recognition [3], Computational Biology [4,5], Robotics, and Software Engineering [6]. ...
... The Grammar Inference process [1,2] can be stated as follows: given a set of positive samples S+ and a set of negative samples S− (which might also be empty), find at least one grammar G such that S+ ⊆ L(G) and S− ∩ L(G) = ∅, where L(G) is the language generated by G. Grammar Inference has been investigated for more than 40 years now, and has found applications in several research domains, such as Language Acquisition [16], Pattern Recognition [17], Computational Biology [4], and Software Engineering [6,8,10]. ...
... Therefore, there are 8116 (4 + 48 + 8064) different semantic equations for production P1. Similarly, in the second production P2 (Listing 3), only the attribute A[1].val and the constant 1 can be used as operands, on which only the operator + can be applied. Since the operators && and == return a boolean value, the assignment statement for the attribute A[0].val in the production P2 would cause a type error. ...
Article
Full-text available
This paper describes research work on Semantic Inference, which can be regarded as an extension of Grammar Inference. The main task of Grammar Inference is to induce a grammatical structure from a set of positive samples (programs), which can sometimes also be accompanied by a set of negative samples. Successfully applying Grammar Inference can result only in identifying the correct syntax of a language. With Semantic Inference a further step is realised, namely towards inducing language semantics. When syntax and semantics can be inferred, a complete compiler/interpreter can be generated solely from samples. In this work Evolutionary Computation was employed to explore and exploit the enormous search space that appears in Semantic Inference. For the purpose of this research work the tool LISA.SI has been developed on top of the compiler/interpreter generator tool LISA. The first results are encouraging, since we were able to infer the semantics solely from samples and their associated meanings for several simple languages, including the Robot language.
... These guards are boolean expressions that must be true prior to following a transition. [2] We wanted to find the weaknesses of these algorithms and propose modifications and improvements as necessary. ...
... In addition, they can be potentially under-generalized and non-deterministic [3]. The GK Tail algorithm is an improved version of K Tails which generates EFSMs [2]. It uses Daikon [23] to make generalized transition guards based on the variable assignments. ...
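The k-tails idea underlying K Tails (and its GK Tail extension) is that two states of the prefix-tree acceptor are equivalent when they are followed by the same set of suffixes ("tails") of length at most k. A simplified sketch of that equivalence computation (my own rendering; the real algorithms merge states in an automaton rather than just grouping prefixes, and GK Tail additionally attaches Daikon-inferred guards):

```python
# Simplified k-tails state grouping over a prefix tree of traces (sketch).
from collections import defaultdict

def k_tails(traces, k):
    traces = [tuple(t) for t in traces]
    # States of the prefix-tree acceptor: every prefix of every trace.
    states = {()}
    for t in traces:
        for i in range(1, len(t) + 1):
            states.add(t[:i])

    def tails(state):
        # The k-tail of a state: the suffixes (truncated to length k)
        # that follow this prefix in some trace.
        return frozenset(t[len(state):][:k] for t in traces
                         if t[:len(state)] == state)

    # States with identical k-tails are considered equivalent and merged.
    classes = defaultdict(list)
    for s in sorted(states):
        classes[tails(s)].append(s)
    return list(classes.values())

merged = k_tails([["login", "browse", "logout"], ["login", "logout"]], k=1)
```

With k = 1, the two trace-final states share the empty tail and fall into the same class, so the five prefix states collapse to four equivalence classes.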
... Further explanation of these options can be found in the tool's documentation. However, during our experiments we faced a number of problems. In the following sections we explain each problem, try to find its root cause, suggest solutions for fixing them, and present the way we solved some of them. ...
Preprint
Full-text available
Dynamic model inference techniques have been the center of many research projects recently. There are now multiple open source implementations of state-of-the-art algorithms, which provide basic abstraction and merging capabilities. Most of these tools and algorithms have been developed with one particular application in mind, which is program comprehension. The output models can abstract away the details of the program and represent the software behavior in a concise and easy-to-understand form. However, one application context that is less studied is using such inferred models for debugging, where the behavior to abstract is a faulty behavior (e.g., a set of execution traces including a failed test case). We tried to apply some of the existing model inference techniques (implemented in a promising tool called MINT) in a real-world industrial context to support program comprehension for debugging. Our initial experiments have shown many limitations both in terms of implementation and of the algorithms. The paper discusses the root cause of the failures and proposes ideas for future improvement.
... Yet recent advances in this field have shown that, although Gold's theorem still holds in a strict formal sense, by using heuristic, statistical or evolutionary methods we are able to infer more complex grammars, such as context-free grammars (CFG). De la Higuera presents a review of possible inference methods in [4]. And not only formal languages can be inferred. ...
... And we may view context-free expressions as higher-order regular expressions, i.e., expressions that may take another expression as an argument. A simple example is shown in expression (4). ...
... Grammatical inference is a mature and well-studied field with many application domains ranging from various computer science fields, e.g., machine learning, to areas of natural sciences, e.g. computational biology [1]. The identification of a minimum size deterministic finite automaton (DFA) from labeled examples is one of the most well-investigated problems in this field. ...
... Related Work. Existing work considers the problem of minimal DFA identification from labeled examples [1]. It is shown that the DFA identification problem with a given upper bound on the number of states is an NP-complete problem [5]. ...
Preprint
The identification of a deterministic finite automaton (DFA) from labeled examples is a well-studied problem in the literature; however, prior work focuses on the identification of monolithic DFAs. Although monolithic DFAs provide accurate descriptions of systems' behavior, they lack simplicity and interpretability; moreover, they fail to capture sub-tasks realized by the system and introduce inductive biases away from the inherent decomposition of the overall task. In this paper, we present an algorithm for learning conjunctions of DFAs from labeled examples. Our approach extends an existing SAT-based method to systematically enumerate Pareto-optimal candidate solutions. We highlight the utility of our approach by integrating it with a state-of-the-art algorithm for learning DFAs from demonstrations. Our experiments show that the algorithm learns sub-tasks realized by the labeled examples, and it is scalable in the domains of interest.
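A conjunction of DFAs accepts a word exactly when every component DFA accepts it, which is what makes the decomposition interpretable: each component can encode one sub-task. A toy membership check (hypothetical component DFAs; the SAT-based enumeration of Pareto-optimal candidates itself is not shown):

```python
# Membership in a conjunction of DFAs: a word belongs to the language iff
# every component DFA accepts it (sketch; the SAT-based learning is omitted).

def accepts(dfa, word):
    trans, state, accepting = dfa  # trans: (state, sym) -> state
    for sym in word:
        state = trans.get((state, sym))
        if state is None:
            return False
    return state in accepting

def conjunction_accepts(dfas, word):
    return all(accepts(d, word) for d in dfas)

# Hypothetical sub-tasks over {a, b}: "contains at least one a" AND "even length".
contains_a = ({(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 1}, 0, {1})
even_len   = ({(0, "a"): 1, (0, "b"): 1, (1, "a"): 0, (1, "b"): 0}, 0, {0})
print(conjunction_accepts([contains_a, even_len], "ab"))  # True
print(conjunction_accepts([contains_a, even_len], "b"))   # False
```

Each two-state component stays small and readable, whereas the equivalent monolithic DFA would need the product of the component state spaces.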
... Furthermore, a programming language's semantics are much harder to define than its syntax. Hence, an interesting theoretical problem arises: Would it be possible to infer the semantics of a programming language automatically solely from provided valid programs and their meanings, in a similar manner as Grammatical Inference [21,22] can induce a grammar from positive and/or negative samples (programs)? ...
... In [31] the authors used Answer Set Programming (ASP) annotations to express context-sensitive constraints, which can be inferred automatically using Inductive Logic Programming, whilst an Evolutionary Algorithm has been applied in our approach [23]. As in our case, the Context-Free Grammar (CFG) is fixed, assuming that the syntax of the target language is known or can be inferred by Grammar Inference [21,22], but the semantics are unknown. ...
Article
When valid syntactical structures are additionally constrained with context-sensitive information the Grammar Inference needs to be extended to the Semantic Inference. In this paper, it is shown that a complete compiler/interpreter for small Domain-Specific Languages (DSLs) can be generated automatically solely from given programs and their associated meanings using Semantic Inference. In this work a wider class of Attribute Grammars has been learned, while only S-attributed and L-attributed Grammars have previously been inferred successfully. Inferring Absolutely Non-Circular Attribute Grammars (ANC-AG) with complex dependencies among attributes has been achieved by integrating a Memetic Algorithm (MA) into the LISA.SI tool. The results show that the proposed Memetic Algorithm is at least four times faster on the selected benchmark than the previous method.
... Besides this theoretical bent, GI algorithms have also been applied to practical problems (e.g., Natural Language Processing, Computational Biology, etc.). Excellent surveys on the field of GI can be found in [6,17]. ...
... GI studies have focused on learning REG and CF languages (i.e., the first two levels in the Chomsky Hierarchy) [6,17]. However, the Chomsky Hierarchy has some limitations that should be taken into account when we want to study natural language syntax. ...
Article
The field of Grammatical Inference provides a good theoretical framework for investigating a learning process. Formal results in this field can be relevant to the question of first language acquisition. However, Grammatical Inference studies have focused mainly on mathematical aspects and have not exploited the linguistic relevance of their results. With this paper, we try to enrich Grammatical Inference studies with ideas from Linguistics. We propose a non-classical mechanism that has relevant linguistic and computational properties, and we study its learnability from positive data.
... Grammatical inference is also useful in problems of robotics and control systems, computational biology, document management, compression and data mining. In general, grammatical inference can contribute to the solution of structural pattern recognition problems (De la Higuera, 2005). ...
... One possibility that has occasionally been considered is to infer nondeterministic finite automata (NFA) instead of deterministic automata. Yokomori (1994) published a work on this topic whose method used queries to an oracle and counterexamples (De la Higuera, 2005). So did Coste & Fredouille (2000, 2003a, 2003b) and Coste et al. (2004), who pose the inference of NFAs as a particular case of the inference of unambiguous finite automata. ...
Article
The development of new algorithms that are convergent and efficient is a necessary step towards the profitable use of grammatical inference for solving real and larger problems. This work presents two algorithms, called DeLeTe2 and MRIA, that implement grammatical inference by means of nondeterministic automata, in contrast with the most commonly used algorithms, which employ deterministic automata. The advantages and disadvantages of this change in the representation model are considered through a detailed description and comparison of the two inference algorithms with respect to the approach used in their implementation, their computational complexity, their termination criteria, and their performance on a corpus of synthetic data.
... In the field of grammar inference [26,27], it is typically assumed that there exists an unknown formal language that generates the sequential dataset. As such, the sequences in the dataset are considered positive examples, on the basis of which the resulting formal language is to be inferred. ...
... Other grammar inference methods assume the language to belong to the class of context-free languages and focus on extracting a context-free language in extended Backus-Naur form [82]. We refer to [26,27] for an extensive overview of the grammar inference field. ...
Article
Full-text available
Data of sequential nature arise in many application domains in the form of, e.g., textual data, DNA sequences, and software execution traces. Different research disciplines have developed methods to learn sequence models from such datasets: (i) In the machine learning field methods such as (hidden) Markov models and recurrent neural networks have been developed and successfully applied to a wide range of tasks, (ii) in process mining process discovery methods aim to generate human-interpretable descriptive models, and (iii) in the grammar inference field the focus is on finding descriptive models in the form of formal grammars. Despite their different focuses, these fields share a common goal: learning a model that accurately captures the sequential behavior in the underlying data. Those sequence models are generative, i.e., they are able to predict what elements are likely to occur after a given incomplete sequence. So far, these fields have developed mainly in isolation from each other and no comparison exists. This paper presents an interdisciplinary experimental evaluation that compares sequence modeling methods on the task of next-element prediction on four real-life sequence datasets. The results indicate that machine learning methods, which generally do not aim at model interpretability, tend to outperform methods from the process mining and grammar inference fields in terms of accuracy.
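The next-element prediction task used in the comparison can be illustrated with the simplest generative sequence model, a first-order Markov (bigram) predictor; this is a toy baseline of my own, not one of the evaluated methods:

```python
# A first-order Markov (bigram) next-element predictor (toy baseline sketch).
from collections import Counter, defaultdict

def train(sequences):
    # Count how often each element is followed by each successor.
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, prefix):
    """Predict the most frequent successor of the last element of the prefix."""
    last = prefix[-1]
    if not counts[last]:
        return None  # element never seen with a successor during training
    return counts[last].most_common(1)[0][0]

model = train([["open", "read", "close"], ["open", "read", "read", "close"]])
print(predict_next(model, ["open"]))  # -> read
```

Richer models (hidden Markov models, RNNs, discovered process models, inferred grammars) differ in how they condition on more than the last element, but they are evaluated on exactly this kind of query.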
... Learning regular languages has been extensively studied because of its wide variety of applications in many fields such as pattern recognition, model checking, data mining and computational linguistics [10]. Angluin [1] presented an algorithm L* which learns the minimum deterministic finite automaton (DFA) accepting an unknown target language using membership queries (MQs) and equivalence queries (EQs). ...
... give a to Λ(x′, x) as a counterexample; find v′ ∈ Σ*, a ∈ Σ and q2 ∈ Q such that av′ is a suffix of v, ∀q1 ∈ Q, MQ(q1 v′) = + ⇔ v′ ∈ Lq1 and MQ(q2 av′) = + ⇔ av′ ∉ Lq2; if ∃q′ ∈ Q, row(q′) ⊆ temp row ∧ q′ ∉ δ(q, a), then give a to Λ(q, q′) as a counterexample; reading all v′ ∈ V from each q ∈ Q. Note that, since V is not necessarily suffix-closed, differently from the previous case, we do not perform this in length-ascending order. ...
Article
Full-text available
We propose a query learning algorithm for residual symbolic finite automata (RSFAs). Symbolic finite automata (SFAs) are finite automata whose transitions are labeled by predicates over a Boolean algebra, in which a big collection of characters leading to the same transition may be represented by a single predicate. Residual finite automata (RFAs) are a special type of non-deterministic finite automata which can be exponentially smaller than the minimum deterministic finite automata and have a favorable property for learning algorithms. RSFAs have the properties of both SFAs and RFAs, and can have a more succinct representation of transitions and fewer states than RFAs and deterministic SFAs accepting the same language. The implementation of our algorithm efficiently learns RSFAs over a huge alphabet and outperforms an existing learning algorithm for deterministic SFAs. The result also shows that the benefit of non-determinism in efficiency is even larger in learning SFAs than non-symbolic automata.
... Grammatical inference (GI), or grammar learning, deals with idealized learning procedures for acquiring grammars on the basis of evidence about the languages [4][5][6]; it was extensively studied in [6][7][8][9][10][11][12][13][14] due to its wide fields of application in solving practical problems. In the GI process, an individual grammar rule needs to be verified using an input string of a specific language over a finite state controller, whereas a pushdown automaton is used to generate the equivalent proliferation for the language. ...
... In GI, the production rule length (PRL) is also an important factor that affects the result. The orthogonal array involves five control factors with two levels: PS = [120, 360], PRL = [5, 8], CS = [120, 240], CR = [0.6, 0.9] and MR = [0.5, 0.8]. The following setting gave the best results: PS = 120, PRL = 5, CS = 120, CR = 0.9 and MR = 0.8 (see Table 3, experiment number 2, SNR = −12.1889); this setting is taken for the robust process design and to conduct the experiments. ...
Article
Full-text available
The focus of this paper is on developing a grammatical inference system that uses a genetic algorithm (GA), which has a powerful global exploration capability that can exploit the optimum offspring. The implemented system runs in two phases: first, generation and verification of grammar rules, and then application of the GA's operations to optimize the rules. A pushdown automata simulator has been developed which parses the training data over the grammar's rules. An inverted mutation with a random mask followed by an 'XOR' operator introduces diversity in the population and helps the GA not to get trapped at a local optimum. The Taguchi method has been incorporated to tune the parameters, making the proposed approach more robust, statistically sound and quickly convergent. The performance of the proposed system has been compared with classical GA, random offspring GA and crowding algorithms. Overall, a grammatical inference system has been developed that employs a PDA simulator for verification.
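One plausible reading of the mask-based mutation mentioned above is to XOR the chromosome with a random bit mask, flipping the genes where the mask is 1; the exact operator in the paper may differ, so treat the function name, parameters and rate below as assumptions:

```python
# A hedged sketch of a mask-based XOR mutation on a bit-string chromosome.
# The exact operator in the paper may differ; this only illustrates the idea
# of flipping bits via XOR with a random mask to inject diversity.
import random

def xor_mask_mutation(chromosome, rate=0.8, rng=random):
    # Each mask bit is 1 with probability `rate`; XOR flips the gene there.
    mask = [1 if rng.random() < rate else 0 for _ in chromosome]
    return [g ^ m for g, m in zip(chromosome, mask)]

child = xor_mask_mutation([0, 1, 1, 0, 1], rate=0.8, rng=random.Random(42))
```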
... The problem of identifying an AF specification from a regular sequence of observations has direct connections with the widely studied (and partially overlapping) fields of automata identification and grammatical inference [34]. Defining algorithms for the identification of AF specifications is an interesting issue for future work, which we are confident can be faced by resorting to techniques borrowed from the above-mentioned areas. ...
Preprint
The theory of abstract argumentation frameworks (afs) has, in the main, focused on finite structures, though there are many significant contexts where argumentation can be regarded as a process involving infinite objects. To address this limitation, in this paper we propose a novel approach for describing infinite afs using tools from formal language theory. In particular, the possibly infinite set of arguments is specified through the language recognized by a deterministic finite automaton, while a suitable formalism, called attack expression, is introduced to describe the relation of attack between arguments. The proposed approach is shown to satisfy some desirable properties which cannot be achieved through other "naive" uses of formal languages. In particular, the approach is shown to be expressive enough to capture (besides any arbitrary finite structure) a large variety of infinite afs, including two major examples from previous literature and two sample cases from the domains of multi-agent negotiation and ambient intelligence. On the computational side, we show that several decision and construction problems which are known to be polynomial time solvable in finite afs are decidable in the context of the proposed formalism, and we provide the relevant algorithms. Moreover, we obtain additional results concerning the case of finitary afs.
... The task of inferring a minimum-size separating automaton from two disjoint sets of samples has gained much attention from various fields, including computational biology [21], inference of network invariants [19], regular model checking [26], and reinforcement learning [24]. More recently, this problem has also arisen in the context of parity game solving [6], where separating automata can be used to decide the winner. ...
Chapter
Full-text available
We propose a passive learning tool for learning minimal separating deterministic finite automata (DFA) from a set of labelled samples. Separating automata are an interesting class of automata that occurs generally in regular model checking and has raised interest in foundational questions of parity game solving. We first propose a simple and linear-time algorithm that incrementally constructs a three-valued DFA (3DFA) from a set of labelled samples given in the usual lexicographical order. This 3DFA has accepting and rejecting states as well as don't-care states, so that it can exactly recognise the labelled examples. We then apply our tool to mining a minimal separating DFA for the labelled samples by minimising the constructed automata via a reduction to SAT solving. Empirical evaluation shows that our tool outperforms current state-of-the-art tools significantly on standard benchmarks for learning minimal separating DFAs from samples. Progress in the efficient construction of separating DFAs can also lead to finding the lower bound of parity game solving, where we show that our tool can create optimal separating automata for simple languages with up to 7 colours. Future improvements might offer inroads to better data structures.
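The incremental three-valued construction can be sketched as a trie whose nodes are labelled accepting, rejecting, or don't-care; this is a simplified rendering of the idea, not the tool's linear-time implementation:

```python
# Incrementally build a three-valued acceptor (accept / reject / don't-care)
# from labelled samples, represented as a trie (simplified sketch).

class Node:
    def __init__(self):
        self.children = {}
        self.label = None  # True = accept, False = reject, None = don't care

def build_3dfa(labelled_samples):
    root = Node()
    for word, accepted in labelled_samples:
        node = root
        for sym in word:
            node = node.children.setdefault(sym, Node())
        node.label = accepted
    return root

def classify(root, word):
    node = root
    for sym in word:
        node = node.children.get(sym)
        if node is None:
            return None  # word never labelled: don't care
    return node.label

root = build_3dfa([("ab", True), ("a", False)])
print(classify(root, "ab"), classify(root, "a"), classify(root, "b"))  # True False None
```

Minimising such a 3DFA then amounts to finding the smallest DFA whose behaviour agrees with every accept/reject label while the don't-care states may be resolved freely, which is where the reduction to SAT comes in.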
... The problem of identifying a deterministic finite state automaton (DFA) from labeled traces is one of the best-studied problems in grammatical inference [11]. The latter sees applications in various areas, including Business Process Management (BPM) [1], non-Markovian Reinforcement Learning [14,26], automatic control [7], speech recognition [3], and computational biology [25]. ...
Preprint
Full-text available
In this work, we introduce DeepDFA, a novel approach to identifying Deterministic Finite Automata (DFAs) from traces, harnessing a differentiable yet discrete model. Inspired by both the probabilistic relaxation of DFAs and Recurrent Neural Networks (RNNs), our model offers interpretability post-training, alongside reduced complexity and enhanced training efficiency compared to traditional RNNs. Moreover, by leveraging gradient-based optimization, our method surpasses combinatorial approaches in both scalability and noise resilience. Validation experiments conducted on target regular languages of varying size and complexity demonstrate that our approach is accurate, fast, and robust to noise in both the input symbols and the output labels of training data, integrating the strengths of both logical grammar induction and deep learning.
... Related Work. For a broad overview of AAL refer to the survey paper of de la Higuera [27] from 2005 and to a more recent paper by Howar and Steffen [28]. ...
Chapter
Full-text available
Existing active automata learning (AAL) algorithms have demonstrated their potential in capturing the behavior of complex systems (e.g., in analyzing network protocol implementations). The most widely used AAL algorithms generate finite state machine models, such as Mealy machines. For many analysis tasks, however, it is crucial to generate richer classes of models that also show how relations between data parameters affect system behavior. Such models have shown potential to uncover critical bugs, but their learning algorithms do not scale beyond small and well curated experiments. In this paper, we present SL^λ, an effective and scalable register automata (RA) learning algorithm that significantly reduces the number of tests required for inferring models. It achieves this by combining a tree-based cost-efficient data structure with mechanisms for computing short and restricted tests. We have implemented SL^λ as a new algorithm in RALib. We evaluate its performance by comparing it against SL*, the current state-of-the-art RA learning algorithm, in a series of experiments, and show superior performance and substantial asymptotic improvements in bigger systems.
... Learning finite automata has become an important field in machine learning (Kearns and Vazirani, 1994) and has been applied to wide-ranging realistic problems (Higuera, 2005;Vaandrager, 2017), for example, smartcards, network protocols, legacy software, robotics and control systems, pattern recognition, computational linguistics, computational biology, data compression, data mining, etc. In Vaandrager (2017), learning finite automata is termed as model learning. ...
Article
Learning finite automata (termed model learning) has become an important field in machine learning and has found useful realistic applications. Quantum finite automata (QFA) are simple models of quantum computers with finite memory. Due to their simplicity, QFA have good physical realizability, but one-way QFA still have essential advantages over classical finite automata with regard to state complexity (two-way QFA are more powerful than classical finite automata in computation ability as well). As a different problem in quantum learning theory and quantum machine learning, in this paper our purpose is to initiate the study of learning QFA with queries (naturally, it may be termed quantum model learning), and the main results concern learning two basic one-way QFA (1QFA): (1) we propose a learning algorithm for measure-once 1QFA (MO-1QFA) with query complexity of polynomial time, and (2) we propose a learning algorithm for measure-many 1QFA (MM-1QFA) with query complexity of polynomial time as well.
... Other related works in this direction of automata learning and grammar extraction include learning of context-free languages, such as by Omphalos [2], extracting context-free languages in extended Backus-Naur form [3], and learning of probabilistic finite state machines, such as by PAutomaC [4]. Several other works [5] [6] [7] provide surveys of this field of grammar inference [8], with a focus on constructing a grammar for the underlying language by learning through examples. ...
Conference Paper
Full-text available
When processes execute through their business logic, their activities generate event logs, which contribute to trace sets. Since its introduction, the field of process mining has evolved; however, accuracy issues persist. In this paper, we explore the L* algorithm in the context of process mining, especially towards the development of interactive process mining techniques. We discuss the results of our experiments, the limitations of the approach, and possible future directions towards adapting the technique to cover a richer set of features.
... The goal of grammatical inference is to obtain a grammatical structure from a set of samples [26], [27]. Several examples of the DESK language [28] are shown below: ...
Article
Full-text available
ABSTRACT: A genetic algorithm is a heuristic search method inspired by biological evolution processes. The method has proven successful for many types of problems whose solution requires searching a large space of potential solutions and for which exact methods, such as dynamic programming, are not efficient enough. In this paper we describe the basic principles of genetic algorithms, such as selection, crossover, mutation and the fitness function, together with some application areas such as optimization, genetic programming, and grammatical and semantic inference. Although the basic principles of genetic algorithms are simple, some problems that limit their usability in finding an optimal and/or acceptable solution are a complex and/or inefficient fitness function and reaching a local optimum, which stops them from converging towards the global optimum. Keywords: genetic operators, evolutionary computation, optimization
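The principles described above (selection, crossover, mutation, fitness) fit in a few lines; the sketch below applies them to the standard OneMax toy problem of maximizing the number of 1-bits (all parameter values are illustrative):

```python
# A minimal genetic algorithm on the OneMax toy problem: tournament selection,
# one-point crossover, bit-flip mutation, and a fitness function (sketch).
import random

def run_ga(n_bits=20, pop_size=30, generations=60, p_mut=0.05, seed=0):
    rng = random.Random(seed)
    fitness = lambda ind: sum(ind)  # OneMax: count of 1-bits
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        def select():
            # Binary tournament: keep the fitter of two random individuals.
            a, b = rng.sample(pop, 2)
            return a if fitness(a) >= fitness(b) else b
        nxt = []
        while len(nxt) < pop_size:
            p1, p2 = select(), select()
            cut = rng.randrange(1, n_bits)           # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [g ^ (rng.random() < p_mut) for g in child]  # bit-flip mutation
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

best = run_ga()
```

As the abstract notes, on harder problems than OneMax a poorly designed fitness function or premature convergence to a local optimum can prevent the population from ever reaching the global optimum.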
... The problem of identifying a deterministic finite state automaton (DFA) from labeled traces is one of the best-studied problems in grammatical inference. The latter sees applications in various areas, including machine learning, formal language theory, syntactic and structural pattern recognition, computational linguistics, computational biology, and speech recognition (de la Higuera 2005). Both passive (Heule and Verwer 2010) and active (Angluin 1987) exact methods were proposed for DFA identification. ...
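The passive setting mentioned in this excerpt can be illustrated with a minimal brute-force sketch (this is a hypothetical illustration, not the SAT-based method of Heule and Verwer): enumerate complete DFAs by increasing state count and return the first one consistent with a labeled sample.

```python
from itertools import product

def accepts(delta, accepting, word):
    # Run the DFA (transition table delta, accepting-state set) on a word.
    state = 0
    for ch in word:
        state = delta[(state, ch)]
    return state in accepting

def consistent(delta, accepting, sample):
    return all(accepts(delta, accepting, w) == label for w, label in sample)

def smallest_dfa(sample, alphabet="ab", max_states=4):
    # Enumerate all complete DFAs with 1, 2, ... states until one fits.
    for n in range(1, max_states + 1):
        keys = [(q, ch) for q in range(n) for ch in alphabet]
        for targets in product(range(n), repeat=len(keys)):
            delta = dict(zip(keys, targets))
            for bits in product([False, True], repeat=n):
                accepting = {q for q in range(n) if bits[q]}
                if consistent(delta, accepting, sample):
                    return n, delta, accepting
    return None

# Labeled sample: strings with an even number of 'a' are positive.
sample = [("", True), ("a", False), ("aa", True), ("ab", False),
          ("ba", False), ("aab", True), ("b", True)]
n, delta, accepting = smallest_dfa(sample)
print(n)  # a 2-state DFA suffices for this sample
```

Exact solvers replace this exponential enumeration with SAT or graph-coloring encodings, which is what makes identification feasible beyond toy alphabets.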
Preprint
Full-text available
This work addresses the problem of identifying a deterministic finite automaton (DFA) from traces by using a differentiable recurrent model. Our model is similar to a recurrent neural network but, unlike one, is designed to be completely transparent after training. At the same time, using gradient-based optimization makes our approach faster and more resilient than combinatorial methods for DFA identification. Experiments on the Tomita language benchmark and on randomly generated DFAs show that our approach is accurate, fast and robust to errors in the training data.
... Deterministic and nondeterministic finite automata play a crucial role in various practical applications, including artificial intelligence, grammatical inference, and bioinformatics [1,2,3]. The last field of application is particularly interesting, as also stated in [4], since the automata can be used to detect patterns hidden in bioinformatics data. ...
Chapter
Full-text available
The parallel induction algorithm discussed in the paper finds a minimal nondeterministic finite automaton (NFA) consistent with the given sample. The sample consists of examples and counterexamples, i.e., words that are accepted and rejected by the automaton, respectively. The algorithm transforms the problem into a family of constraint satisfaction problems solved in parallel. Only the first solution is sought, which means that upon finding a consistent automaton, the remaining processes terminate their execution. We analyze the parallel algorithm in terms of the achieved speedups. In particular, we discuss the reasons for the observed superlinear speedups. The analysis includes experiments conducted for samples defined over alphabets of different sizes.
... Grammatical inference for context-free languages (CFLs), which are represented by grammars or pushdown automata, has a long tradition. Though it is widely considered an open problem, the literature provides several approaches to learning restricted CFLs (see de la Higuera [12,13] for an overview). In this section, we sketch a seamless generalization of Angluin's algorithm for visibly pushdown languages (VPLs), which has recently been proposed by Barbot et al. [5]. ...
... Children everywhere need parental control to achieve a specific, positive effect on school performance. In Cuenca, Ecuador, a study of a population between 9 and 11 years old showed similarities between the levels of acceptance, rejection and control of the paternal and maternal figures for both boys and girls, indicating that the children perceive themselves as highly rejected, that is, they do not feel accepted, and the control exerted by their attachment figures is very low. In addition, the correlation results showed a significant relationship between parental acceptance and maternal indifference, paternal indifference and maternal aggression, paternal aggression and maternal indifference, and paternal rejection and maternal aggression (De La Higuera, 2005; Feng et al., 2015). ...
Article
Full-text available
This study concerns the effect of parental control on the school performance of the middle-school students of the Eugenio Espejo Educational Unit #29 of the Tosagua Canton, where students show negative behavior due to a lack of emotional bonds with their parents, who in many cases overprotect their children; the children adopt unusual values and practices, which leads to low performance in their learning results. The objective of the research is to investigate how parental control affects the school performance of the students of the Eugenio Espejo Educational Unit #29 of the Tosagua Canton in the period 2021-2022. The methodology was a qualitative analysis based on a field and bibliographic study, together with the inductive-deductive, analytical-synthetic and statistical methods; the data-collection tools were surveys of parents, interviews with teachers and a student observation sheet, with a sample of 107 students, 100 parents and 3 teachers. The results show that parental control over children is fundamental to their education and the formation of values.
... Early results in computational linguistics quickly established fundamental limits of what could be achieved: it was shown that not even regular languages can be identified given only positive examples [20]. Nevertheless, with applications ranging from speech recognition to computational biology, grammatical inference is an active and vibrant field [13,14]. ...
... Can grammatical structures be inferred from programs, and can semantic structures be inferred from the provided meanings of programs? Grammar Induction, or Grammatical Inference (GI) [44,45], deals with the first problem. Various works show that, indeed, a grammar can be inferred from positive and/or negative samples for small programming languages [46,47,48,49]. ...
Article
Using advanced AI approaches, the development of Domain-Specific Languages (DSLs) can be facilitated for domain experts who are not proficient in programming language development. In this paper, we first addressed the aforementioned problem using Semantic Inference. However, this approach is very time-consuming. Namely, a lot of code bloat is present in the generated language specifications, which increases the time required to evaluate a solution. To improve this, we introduced a multi-threaded approach, which accelerates the evaluation process by over 9.5 times, while the number of fitness evaluations using the improved Long Term Memory Assistance (LTMA) was reduced by up to 7.3%. Finally, a reduction in the number of input samples (fitness cases) was proposed, which reduces CPU consumption further.
... Process-mining techniques build formal models that describe a process as a set of all valid process instances (van der Aalst 2011), so these techniques are similar to traditional approaches in the field of grammatical inference. These grammatical-inference techniques build formal models that describe languages as sets of valid sentences (de la Higuera 2005). Language models are foundational to many modern technologies that deal with language, such as search engines and machine translation systems, but to get where it is now, grammatical inference had to depart from using formal, logical models and base its contemporary applications on probabilistic approaches that model a language as probability distributions over sentences instead of as a set of valid sentences (Chater and Manning 2006). ...
Article
Predictive modeling approaches in business process management provide a way to streamline operational business processes. For instance, they can warn decision makers about undesirable events that are likely to happen in the future, giving the decision maker an opportunity to intervene. The topic is gaining momentum in process mining, a field of research that has traditionally developed tools to discover business process models from data sets of past process behavior. Predictive modeling techniques are built on top of process-discovery algorithms. As these algorithms describe business process behavior using models of formal languages (e.g., Petri nets), strong language biases are necessary in order to generate models with the limited amounts of data included in the data set. Naturally, corresponding predictive modeling techniques reflect these biases. Based on theory from grammatical inference, a field of research that is concerned with inducing language models, we design a new predictive modeling technique based on weaker biases. Fitting a probabilistic model to a data set of past behavior makes it possible to predict how currently running process instances will behave in the future. To clarify how this technique works and to facilitate its adoption, we also design a way to visualize the probabilistic models. We assess the effectiveness of the technique in an experimental evaluation with synthetic and real-world data.
... Early results in computational linguistics quickly established fundamental limits of what could be achieved: it was shown that not even regular languages can be identified given only positive examples [20]. Nevertheless, with applications ranging from speech recognition to computational biology, grammatical inference is an active and vibrant field [13,14]. ...
Preprint
Full-text available
Ad hoc parsers are everywhere: they appear any time a string is split, looped over, interpreted, transformed, or otherwise processed. Every ad hoc parser gives rise to a language: the possibly infinite set of input strings that the program accepts without going wrong. Any language can be described by a formal grammar: a finite set of rules that can generate all strings of that language. But programmers do not write grammars for ad hoc parsers -- even though they would be eminently useful. Grammars can serve as documentation, aid program comprehension, generate test inputs, and allow reasoning about language-theoretic security. We propose an automatic grammar inference system for ad hoc parsers that would enable all of these use cases, in addition to opening up new possibilities in mining software repositories and bi-directional parser synthesis.
... Learning finite automata has become an important field in machine learning [21] and has been applied to wide-ranging realistic problems [18,40], for example, smartcards, network protocols, legacy software, robotics and control systems, pattern recognition, computational linguistics, computational biology, data compression, data mining, etc. In [40] learning finite automata is termed as model learning. ...
Preprint
Learning finite automata (termed model learning) has become an important field in machine learning and has useful realistic applications. Quantum finite automata (QFA) are simple models of quantum computers with finite memory. Due to their simplicity, QFA have good physical realizability, but one-way QFA still have essential advantages over classical finite automata with regard to state complexity (two-way QFA are also more powerful than classical finite automata in computational ability). As a different problem in quantum learning theory and quantum machine learning, in this paper our purpose is to initiate the study of learning QFA with queries (naturally, it may be termed quantum model learning), and the main results concern learning two basic one-way QFA: (1) We propose a learning algorithm for measure-once one-way QFA (MO-1QFA) with polynomial-time query complexity; (2) We propose a learning algorithm for measure-many one-way QFA (MM-1QFA) with polynomial-time query complexity as well.
... Passive learning is a setting in which labeled data is provided to the learner. The learner is tasked with finding a model that represents this data [36]–[38]. The data is usually an observation log from the SUL. ...
Thesis
Full-text available
Our society's reliance on computer-controlled systems is rapidly growing. Such systems are found in various devices, ranging from simple light switches to safety-critical systems like autonomous vehicles. In the context of safety-critical systems, safety and correctness are of utmost importance. Faults and errors could have catastrophic consequences. Thus, there is a need for rigorous methodologies that help provide guarantees of safety and correctness. Supervisor synthesis, the concept of being able to mathematically synthesize a supervisor that ensures that the closed-loop system behaves in accordance with known requirements, can indeed help. This thesis introduces supervisor learning, an approach to help automate the learning of supervisors in the absence of plant models. Traditionally, supervisor synthesis makes use of plant models and specification models to obtain a supervisor. Industrial adoption of this method is limited due to, among other things, the difficulty in obtaining usable plant models. Manually creating these plant models is an error-prone and time-consuming process. Thus, supervisor learning intends to improve the industrial adoption of supervisory control by automating the process of generating supervisors in the absence of plant models. The idea here is to learn a supervisor for the system under learning (SUL) by active interaction and experimentation. To this end, we present two algorithms, SupL*, and MSL, that directly learn supervisors when provided with a simulator of the SUL and its corresponding specifications. SupL* is a language-based learner that learns one supervisor for the entire system. MSL, on the other hand, learns a modular supervisor, that is, several smaller supervisors, one for each specification. Additionally, a third algorithm, MPL, is introduced for learning a modular plant model. 
The approach is realized in the tool MIDES and has been used to learn supervisors in a virtual manufacturing setting for the Machine Buffer Machine example, as well as learning a model of the Lateral State Manager, a sub-component of a self-driving car. These case studies show the feasibility and applicability of the proposed approach, in addition to helping identify future directions for research.
... However, the formalism of process-based modeling requires complex learning mechanisms in logic [29]. When using probabilistic context free grammars as inductive bias, the latter can be learned using a number of standard algorithms for inferring grammars [30]. ...
Article
Full-text available
Equation discovery, also known as symbolic regression, is a type of automated modeling that discovers scientific laws, expressed in the form of equations, from observed data and expert knowledge. Deterministic grammars, such as context-free grammars, have been used to limit the search spaces in equation discovery by providing hard constraints that specify which equations to consider and which not. In this paper, we propose the use of probabilistic context-free grammars in equation discovery. Such grammars encode soft constraints, specifying a prior probability distribution on the space of possible equations. We show that probabilistic grammars can be used to elegantly and flexibly formulate the parsimony principle, that favors simpler equations, through probabilities attached to the rules in the grammars. We demonstrate that the use of probabilistic, rather than deterministic grammars, in the context of a Monte-Carlo algorithm for grammar-based equation discovery, leads to more efficient equation discovery. Finally, by specifying prior probability distributions over equation spaces, the foundations are laid for Bayesian approaches to equation discovery.
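The idea of rule probabilities acting as soft parsimony constraints can be illustrated with a minimal PCFG sampler; the grammar and probabilities below are invented for illustration and are not taken from the paper. Rules that build simpler expressions get higher probability, so sampled expressions tend to be short:

```python
import random

GRAMMAR = {
    # nonterminal: list of (probability, right-hand side) pairs
    "E": [(0.6, ["V"]),              # prefer a bare variable (simplest)
          (0.3, ["E", "+", "E"]),    # less likely: addition
          (0.1, ["E", "*", "E"])],   # least likely: multiplication
    "V": [(0.5, ["x"]), (0.5, ["y"])],
}

def sample(symbol="E"):
    # Recursively expand a symbol; anything not in GRAMMAR is a terminal.
    if symbol not in GRAMMAR:
        return [symbol]
    r, acc = random.random(), 0.0
    for p, rhs in GRAMMAR[symbol]:
        acc += p
        if r <= acc:
            return [tok for s in rhs for tok in sample(s)]
    # Fallback for floating-point rounding: take the last rule.
    return [tok for s in GRAMMAR[symbol][-1][1] for tok in sample(s)]

print(" ".join(sample()))
```

Because the expected number of new "E" symbols per expansion is 0.4 × 2 = 0.8 < 1, sampling terminates with probability one; tilting the probabilities toward the recursive rules would instead favor larger, more complex equations.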
... However, the formalism of process-based modeling requires complex learning mechanisms in logic. When using probabilistic context free grammars as inductive bias, the latter can be learned using a number of standard algorithms for inferring grammars [8]. ...
Preprint
Full-text available
Equation discovery, also known as symbolic regression, is a type of automated modeling that discovers scientific laws, expressed in the form of equations, from observed data and expert knowledge. Deterministic grammars, such as context-free grammars, have been used to limit the search spaces in equation discovery by providing hard constraints that specify which equations to consider and which not. In this paper, we propose the use of probabilistic context-free grammars in the context of equation discovery. Such grammars encode soft constraints on the space of equations, specifying a prior probability distribution on the space of possible equations. We show that probabilistic grammars can be used to elegantly and flexibly formulate the parsimony principle, that favors simpler equations, through probabilities attached to the rules in the grammars. We demonstrate that the use of probabilistic, rather than deterministic grammars, in the context of a Monte-Carlo algorithm for grammar-based equation discovery, leads to more efficient equation discovery. Finally, by specifying prior probability distributions over equation spaces, the foundations are laid for Bayesian approaches to equation discovery.
... Solving the fault diagnosis problem without any explicit model of the system requires the use of learning techniques. Our proposal can be considered a type of supervised learning technique, closer in style to grammatical inference [17], which mainly tries to identify the underlying system by learning automata [18]. We do not attempt to learn the model of the system [19], but rather a set of concise representations of the fault signatures, as proposed in [8,20] in an unsupervised learning context (classification). ...
Conference Paper
Full-text available
In this paper, we propose a method to diagnose faults in a discrete event system that relies only on past observed logs and not on any behavioral model of the system. Given a set of tagged logs produced by the system, the first objective is to extract from them a set of fault signatures. These fault signatures are represented by a set of critical observations that are the support of the diagnosis method. We first propose a method to compute the fault signatures from an initial log journal and then detail how the signatures can be updated when new logs are available.
... Finally, we are investigating alternative methods for the inference of grammatical models, in particular with reference to active learning [17]. This paradigm is based on the assumption that an informant, or oracle, may be used to guide inference by a process of queries and assessment. ...
Article
Full-text available
Recognizing users’ daily life activities without disrupting their lifestyle is a key functionality to enable a broad variety of advanced services for a Smart City, from energy-efficient management of urban spaces to mobility optimization. In this paper, we propose a novel method for human activity recognition from a collection of outdoor mobility traces acquired through wearable devices. Our method exploits the regularities naturally present in human mobility patterns to construct syntactic models in the form of finite state automata, thanks to an approach known as grammatical inference. We also introduce a measure of similarity that accounts for the intrinsic hierarchical nature of such models, and allows to identify the common traits in the paths induced by different activities at various granularity levels. Our method has been validated on a dataset of real traces representing movements of users in a large metropolitan area. The experimental results show the effectiveness of our similarity measure to correctly identify a set of common coarse-grained activities, as well as their refinement at a finer level of granularity.
... The theoretical fundamentals of grammatical inference go back to Gold (1967), and for active learning, to Angluin (1987). The surveys by de la Higuera (2005, 2010) provide an overview of the various approaches to many variants of this inference problem. These rather theoretical approaches typically simplify the setting, for example, by restricting the alphabet size of the automata to two. ...
Article
Full-text available
In software engineering, the imprecise requirements of a user are transformed into a formal requirements specification during the requirements elicitation process. This process is usually guided by requirements engineers interviewing the user. We want to partially automate this first step of the software engineering process in order to enable users to specify a desired software system on their own. With our approach, users are only asked to provide exemplary behavioral descriptions. The problem of synthesizing a requirements specification from examples can partially be reduced to the problem of grammatical inference, to which we apply an active coevolutionary learning approach. However, this approach would usually require many feedback queries to be sent to the user. In this work, we extend and generalize our active learning approach to receive knowledge from multiple oracles, also known as proactive learning. The "user oracle" represents input received from the user and the "knowledge oracle" represents available, formalized domain knowledge. We call our two-oracle approach the "first apply knowledge then query" (FAKT/Q) algorithm. We compare FAKT/Q to the active learning approach and provide an extensive benchmark evaluation. As a result, we find that the number of required user queries is reduced and the inference process is sped up significantly. Finally, with so-called On-The-Fly Markets, we present a motivation and an application of our approach where such knowledge is available.
... The inference of regular languages represented by means of finite automata is widely studied in the field of machine learning (de la Higuera, 2005). The motivation for studying this problem is its position in the Chomsky hierarchy. ...
Thesis
Full-text available
Automatic control is a technique for designing control devices that control machinery processes without human intervention. However, devising controllers using conventional control theory requires first-principles design based on a full understanding of the environment and the plant, which is infeasible for complex control tasks such as driving in a highly uncertain traffic environment. Intelligent control offers new opportunities for deriving the control policy of human beings by mimicking our control behaviors from demonstrations. In this thesis, we focus on intelligent control techniques from two aspects: (1) how to learn a control policy from supervisors with the available demonstration data; (2) how to verify that the controller learned from data will safely control the process.
Article
This paper presents four state-of-the-art methods for finite-state automaton inference based on a sample of labeled strings. The first algorithm is Exbar, and the next three are mathematical models based on ASP, SAT and SMT theories. Exploring the potential of multiprocessor computers in the context of automata inference was our research's primary goal. In a series of experiments, we showed that our parallelization of the Exbar algorithm is the best choice when a multiprocessor system is available. Furthermore, we obtained a superlinear speedup on some of the prepared datasets, achieving almost a 5-fold speedup on the median using 12 and 24 processes.
Article
Full-text available
Lindenmayer systems (L-systems) are a grammar system that consists of string rewriting rules. The rules replace every symbol in a string in parallel with a successor to produce the next string, and this procedure iterates. In a stochastic context-free L-system (S0L-system), every symbol may have one or more rewriting rules, each with an associated probability of selection. Properly constructed rewriting rules have been found to be useful for modeling and simulating some natural and human-engineered processes, where each derived string describes a step in the simulation. Typically, processes are modeled by experts who meticulously construct the rules based on measurements or domain knowledge of the process. This paper presents an automated approach to finding stochastic L-systems, given a set of string sequences as input. The implemented tool is called the Plant Model Inference Tool for S0L-systems, or PMIT-S0L. PMIT-S0L is evaluated using 960 procedurally generated S0L-systems in a test suite, each of which is used to generate input strings, and PMIT-S0L is then used to infer the system from only the sequences. The evaluation shows that PMIT-S0L infers S0L-systems with up to 9 rewriting rules each in under 12 hours. Additionally, it is found that 3 sequences of strings are sufficient to find the correct original rewriting rules in 100% of the cases in the test suite, and 6 sequences of strings reduce the difference in the associated probabilities to approximately 1% or less.
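The parallel, probabilistic rewriting that such tools infer can be illustrated with a minimal sketch; the rules and probabilities below are invented for illustration and are not taken from the paper:

```python
import random

RULES = {
    # symbol: list of (probability, successor string)
    "A": [(0.7, "AB"), (0.3, "A")],
    "B": [(1.0, "A")],
}

def rewrite(symbol):
    # Pick a successor for one symbol according to the rule probabilities.
    r, acc = random.random(), 0.0
    for p, successor in RULES.get(symbol, [(1.0, symbol)]):
        acc += p
        if r <= acc:
            return successor
    return symbol

def derive(axiom, steps):
    # Apply one parallel rewriting step per iteration: every symbol in the
    # string is replaced simultaneously, which is what distinguishes
    # L-systems from ordinary sequential grammars.
    s = axiom
    for _ in range(steps):
        s = "".join(rewrite(ch) for ch in s)
    return s

print(derive("A", 5))
```

An inference tool works in the opposite direction: given several derived sequences such as the output above, it searches for rewriting rules and probabilities that could have produced them.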
Thesis
Full-text available
In the context of the energy transition and the increasing interconnection of electricity transmission networks in Europe, the operators of the French grid now face greater fluctuations and new dynamics on the network. To guarantee the security of this network, operators rely on software tools to run simulations or to follow the evolution of indicators built manually by experts using their knowledge of how the network operates. The French transmission system operator RTE (Réseau de Transport d'Electricité) is particularly interested in developing tools to assist operators in their task of monitoring power flows on transmission lines. Flows are especially important quantities for keeping the network in a secure state, guaranteeing the safety of equipment and people. However, the indicators in use are not easy to update, owing to the expertise needed to build and analyze them. To address this problem, this thesis is concerned with constructing indicators, in the form of symbolic expressions, for estimating the flows on transmission lines. The problem is studied from the angle of Symbolic Regression and investigated both through genetic approaches based on Grammatical Evolution and through Reinforcement Learning, in which explicit and implicit expert knowledge is taken into account. Explicit knowledge of the physics and expertise of the electrical domain is represented as a context-free grammar that delimits the function space from which the expression is created.
A first approach, Interactive Grammatical Evolution, incrementally improves the discovered expressions by updating a grammar between evolutionary learning runs. The expressions obtained on real data from the network's history are validated by an evaluation of learning metrics, complemented by an evaluation of their interpretability. Second, we propose a reinforcement learning approach that searches a space delimited by a context-free grammar in order to construct a symbolic expression relevant for applications with physical constraints. This method is validated on state-of-the-art symbolic regression data, as well as on a dataset with physical constraints, to evaluate its interpretability. Moreover, to exploit the complementarity between the capabilities of machine learning algorithms and the expertise of network operators, interactive Symbolic Regression algorithms are proposed and integrated into interactive platforms. Interactivity is used both to update the knowledge represented in grammatical form and to analyze, interact with and comment on the solutions proposed by the different approaches. These interactive algorithms and interfaces also aim to take into account implicit knowledge, which is harder to formalize, through interaction mechanisms based on user suggestions and preferences.
Article
Residuality plays an essential role for learning finite automata. While residual deterministic and nondeterministic automata have been understood quite well, fundamental questions concerning alternating automata (AFA) remain open. Recently, Angluin, Eisenstat, and Fisman (2015) have initiated a systematic study of residual AFAs and proposed an algorithm called AL⋆ – an extension of the popular L⋆ algorithm – to learn AFAs. Based on computer experiments they conjectured that AL⋆ produces residual AFAs, but have not been able to give a proof. In this paper we disprove this conjecture by constructing a counterexample. As our main positive result we design an efficient learning algorithm, named AL⋆⋆, and give a proof that it outputs residual AFAs only. In addition, we investigate the succinctness of these different finite automata (FA) types in more detail.
Chapter
Ensuring the correctness and reliability of deep neural networks is a challenge. Suitable formal analysis and verification techniques have yet to be developed. One promising approach towards this goal is model learning, which seeks to derive surrogate models of the underlying neural network in a model class that permits sophisticated analysis and verification techniques. This paper surveys several existing model learning approaches that infer finite-state automata and context-free grammars from Recurrent Neural Networks, an essential class of deep neural networks for sequential data. Most of these methods rely on Angluin’s approach for learning finite automata but implement different ways of checking the equivalence of a learned model with the neural network. Our paper presents these distinct techniques in a unified language and discusses their strengths and weaknesses. Furthermore, we survey model learning techniques that follow a novel trend in explainable artificial intelligence and learn models in the form of formal grammars.
Article
We present a method to extract knowledge, in terms of quantifier-free sentences in disjunctive normal form, from noisy samples of classified strings. We show that the problem of finding such a sentence is NP-complete, and our approach to solving it is based on a reduction to the Boolean satisfiability problem. Moreover, our method bounds the number of disjuncts and the maximum number of literals per clause, since sentences with few clauses and few literals per clause are easier to interpret. As the logic we consider defines exactly the class of locally threshold testable (LTT) languages, our results can be useful in grammatical inference when the goal is to find a model of an LTT language from a sample of strings. We also use results of the Ehrenfeucht–Fraïssé game over strings in order to handle consistent and inconsistent samples of strings.
Article
Full-text available
The construction of a model that recognizes the semantic components of spontaneous dialogues about telephone queries on schedules and prices of long-distance train tickets is reported in this paper. Grammatical inference techniques were used to infer an automaton. The accuracy of the automaton in recognizing sequences of semantic components is 96.75%.
Article
In this work, we present a model for the automatic generation of written dialogues through the use of grammatical inference. The model allows the automatic inference of grammars from a set of dialogues employed as a training set. The inferred grammars are then used to generate templates of responses within the dialogues. The final objective is to apply this model in a domain-specific dialogue system that answers questions in Spanish with the use of a knowledge base. The experiments were performed using the DIHANA project corpus, which contains dialogues written in Spanish about the schedules and prices of a rail system.
Article
In order to define a DNF version of first-order sentences over strings, in which atomic sentences represent substring properties of strings, we use results of the Ehrenfeucht–Fraïssé game over strings. Then, given a sample of strings and the number of disjunctive clauses, we investigate the problem of finding a DNF formula that is consistent with the sample. We show that this problem is NP-complete, and we solve it by a translation into Boolean satisfiability. We also present an extension of this problem that is robust to noisy samples. We solve the generalized version by an encoding into the maximum satisfiability problem. As first-order logic over strings defines exactly the class of locally threshold testable (LTT) languages, our results can be useful in the grammatical inference framework when the goal is to find a model of an LTT language from a sample of strings.
Article
Full-text available
The topic of data mining and machine learning for complex types of data, such as images, videos, texts or three-dimensional data, is a very complex one and still in its infancy. Most of the work approaches this topic from the point of view of multimedia systems. Here the system aspect is in the foreground, not the methods. Mostly, multimedia data cannot be represented by attribute–value pairs, but require more complex representations of data, which in processing raise a number of problems with which pattern recognition experts occupy themselves intensely from the theoretical side. They have taken up the problem of mining images, texts, videos and web documents and have already made some substantial contributions in this field (see MLDM 1999). In order to be successful in the field of data mining and machine learning, a study of the problem from the angles of both system design (multimedia systems) and theory (pattern recognition) is necessary. The conference "Machine Learning and Data Mining in Pattern Recognition" (MLDM) is dedicated to this concern. It is in this domain that a number of new results are to be expected. The most important results of the 2nd MLDM, 2001, are summarized in this volume. The contributions deal with theoretical as well as practical aspects of machine learning and data mining from the point of view of pattern recognition. The next MLDM will take place (as a conference) in 2003. We would like to render our special thanks to all reviewers for their excellent work. Petra Perner and Uwe Zscherpel
Article
Full-text available
Works dealing with grammatical inference of stochastic grammars often evaluate the relative entropy between the model and the true grammar by means of large test sets generated with the true distribution. In this paper, an iterative procedure to compute the relative entropy between two stochastic deterministic regular grammars is proposed.
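To make the evaluation setting concrete, here is a toy sketch of the baseline the abstract refers to: estimating the relative entropy by sampling a large test set from the true distribution. The probabilistic-DFA representation and all names (`P`, `Q`, `string_prob`, `sample`, `kl_estimate`) are illustrative assumptions; the paper's contribution is an exact iterative computation, not this Monte-Carlo estimate:

```python
import math
import random

# A stochastic deterministic regular grammar, modeled here as a probabilistic
# DFA: each state has a stop probability and a map symbol -> (prob, next state).
P = {0: {'stop': 0.5, 'trans': {'a': (0.5, 0)}}}  # "true" grammar
Q = {0: {'stop': 0.2, 'trans': {'a': (0.8, 0)}}}  # model to evaluate

def string_prob(pdfa, s):
    """Probability the automaton assigns to the whole string s."""
    state, p = 0, 1.0
    for ch in s:
        prob, nxt = pdfa[state]['trans'].get(ch, (0.0, None))
        if prob == 0.0:
            return 0.0
        p, state = p * prob, nxt
    return p * pdfa[state]['stop']

def sample(pdfa, rng):
    """Draw one string from the automaton's distribution."""
    state, out = 0, []
    while True:
        node, r = pdfa[state], rng.random()
        if r < node['stop']:
            return ''.join(out)
        r -= node['stop']
        for ch, (prob, nxt) in node['trans'].items():
            if r < prob:
                out.append(ch)
                state = nxt
                break
            r -= prob

def kl_estimate(p, q, n=2000, seed=0):
    """Monte-Carlo estimate of D(p || q) from a test set drawn from p."""
    rng = random.Random(seed)
    total = sum(math.log(string_prob(p, s) / string_prob(q, s))
                for s in (sample(p, rng) for _ in range(n)))
    return total / n
```

The estimate of D(P || P) is exactly zero, while D(P || Q) comes out positive, as relative entropy must; the exact value for this pair is about 0.45 nats.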
Article
Full-text available
New XML schema languages have recently been proposed to replace Document Type Definitions (DTDs) as the schema mechanism for XML data. These languages consistently combine grammar-based constructions with constraint- and pattern-based ones and have a better expressive power than DTDs. As schemas remain optional for XML data, we address the problem of schema extraction from XML data. We model the XML schema as extended context-free grammars and propose a schema extraction algorithm that is based on methods of grammatical inference. The extraction algorithm also copes with the schema determinism requirement imposed by XML DTDs and XML Schema languages. We report test results of schema extraction on a collection of real XML documents.
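As a toy illustration of the kind of evidence such an extraction algorithm starts from, one can collect the observed child-tag sequences of each element and build the prefix-tree acceptor that state-merging grammatical inference then generalizes. The function names are invented for this sketch, and the merging step itself is omitted:

```python
import xml.etree.ElementTree as ET

def child_sequences(xml_text):
    """For each element tag, collect the observed sequences of child tags —
    the raw data a schema-extraction algorithm generalizes from."""
    seqs = {}
    for elem in ET.fromstring(xml_text).iter():
        seqs.setdefault(elem.tag, set()).add(tuple(c.tag for c in elem))
    return seqs

def build_pta(sequences):
    """Prefix-tree acceptor over the sequences (state 0 is the root).
    Inference algorithms would merge states of this automaton to obtain a
    compact, deterministic content model."""
    trans, accept, fresh = {}, set(), 1
    for seq in sequences:
        state = 0
        for sym in seq:
            if (state, sym) not in trans:
                trans[(state, sym)] = fresh
                fresh += 1
            state = trans[(state, sym)]
        accept.add(state)
    return trans, accept
```

For two `book` elements with children `title author` and `title author author`, the acceptor shares the common `title author` prefix and has two accepting states; a merging step could then generalize it to something like `title author+`.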
Book
This volume presents the proceedings of the Second International Colloquium on Grammatical Inference (ICGI-94), held in Alicante, Spain in September 1994. Besides 25 research papers carefully selected and refereed by the program committee, the book contains a survey by E. Vidal. The book is devoted to all those aspects of automatic learning that explicitly focus on principles, theory, and applications of grammars and languages. The papers are organized in sections on formal aspects; language modelling and linguistic applications; stochastic approaches, applications and performance analysis; and neural networks, genetic algorithms, and artificial intelligence techniques.
Book
The Sixth International Colloquium on Grammatical Inference (ICGI2002) was held in Amsterdam on September 23-25, 2002. ICGI2002 was the sixth in a series of successful biennial international conferences on the area of grammatical inference. Previous meetings were held in Essex, U.K.; Alicante, Spain; Montpellier, France; Ames, Iowa, USA; and Lisbon, Portugal. This series of meetings seeks to provide a forum for the presentation and discussion of original research on all aspects of grammatical inference. Grammatical inference, the process of inferring grammars from given data, is a field that not only is challenging from a purely scientific standpoint but also finds many applications in real-world problems. Despite the fact that grammatical inference addresses problems in a relatively narrow area, it uses techniques from many domains, and is positioned at the intersection of a number of different disciplines. Researchers in grammatical inference come from fields as diverse as machine learning, theoretical computer science, computational linguistics, pattern recognition, and artificial neural networks. From a practical standpoint, applications in areas like natural language acquisition, computational biology, structural pattern recognition, information retrieval, text processing, data compression and adaptive intelligent agents have either been demonstrated or proposed in the literature. The technical program included the presentation of 23 accepted papers (out of 41 submitted). Moreover, for the first time a software presentation was organized at ICGI. Short descriptions of the corresponding software are included in these proceedings, too.
Chapter
This chapter discusses the finite nondeterministic and probabilistic automata. The automata and sequential machines are strictly deterministic in their actions and at each moment, the next state is uniquely determined by the present state, and the scanned letter. The output is uniquely determined by the input and the initial state. The automata that possess several choices for their actions are considered in the chapter. The moves are chosen at random, possibly with prefixed probabilities. Finite nondeterministic automata introduced in the chapter are direct generalizations of finite deterministic automata. When scanning the letter x in the internal states, a nondeterministic automaton is at liberty to choose one of the possible next states. A nondeterministic automaton NA may also possess several initial states. Given a designated set S1 of final states, the language represented by S1 in NA consists of all words that cause at least one sequence of state transitions from an initial state to a final state.
Chapter
Most of the developments in pattern recognition research during the past decade deal with the decision-theoretic approach [1.1–11] and its applications. In some pattern recognition problems, the structural information which describes each pattern is important, and the recognition process includes not only the capability of assigning the pattern to a particular class (to classify it), but also the capacity to describe aspects of the pattern which make it ineligible for assignment to another class. A typical example of this class of recognition problem is picture recognition, or more generally speaking, scene analysis. In this class of recognition problems, the patterns under consideration are usually quite complex and the number of features required is often very large, which makes the idea of describing a complex pattern in terms of a (hierarchical) composition of simpler subpatterns very attractive. Also, when the patterns are complex and the number of possible descriptions is very large, it is impractical to regard each description as defining a class (for example, in fingerprint and face identification problems, recognition of continuous speech, Chinese characters, etc.). Consequently, the requirement of recognition can be satisfied only by a description for each pattern rather than the simple task of classification.
Article
Learning recursive rules and inventing predicates are difficult tasks for Inductive Logic Programming techniques. We propose an approach where given a set of examples and counter-examples, and a background knowledge, a human expert must propose constructive rules in order to parse the examples. These rules are used to associate with each example (or counter-example) a tree. Through type inference each tree is transformed into a many-sorted term. These are then used as input for a grammatical inference algorithm that returns a deterministic tree automaton. The automaton is finally combined with the expert knowledge in order to obtain a logic program for the concept described by the examples. We report in this paper the general construction of GIFT, its main algorithms, argue the necessity of the human expert, and show how it performs on some benchmarks.
Article
We address the problem of guiding a robot in such a way that it can decide, based on perceived sensor data, which future actions to choose in order to reach a goal. To realize this guidance, the robot has access to a (probabilistic) automaton (PA), whose final states represent concepts which have to be recognized in order to verify that a goal has been achieved. The contribution of this work is to learn these PAs from classified sensor data of robot traces through known environments. Within this framework, we account for the uncertainties arising from ambiguous perceptions. We introduce a knowledge structure, called a prefix tree, in which the sample data, represented as cases, is organized. The prefix tree is used to derive and estimate the parameters of deterministic as well as probabilistic automata models, which reflect the inherent knowledge implicit in the data and which are used for recognition in a restricted first-order logic framework.
Article
We present new algorithms for inferring an unknown finite-state automaton from its input/output behavior, even in the absence of a means of resetting the machine to a start state. A key technique used is inference of a homing sequence for the unknown automaton. Our inference procedures experiment with the unknown machine, and from time to time require a teacher to supply counterexamples to incorrect conjectures about the structure of the unknown automaton. In this setting, we describe a learning algorithm that, with probability 1 − δ, outputs a correct description of the unknown machine in time polynomial in the automaton's size, the length of the longest counterexample, and log(1/δ). We present an analogous algorithm that makes use of a diversity-based representation of the finite-state system. Our algorithms are the first which are provably effective for these problems in the absence of a "reset." We also present probabilistic algorithms for permutation automata which do not require a teacher to supply counterexamples. For inferring a permutation automaton of diversity D, we improve the best previous time bound by roughly a factor of D³/log D.
Article
We deal with inductive inference of an indexed family of recursive languages. We give two sufficient conditions for inductive inferability of an indexed family from positive data, each of which does not depend on the indexing of the family. We introduce two notions: the finite cross property for a class of languages, and a pair of finite tell-tales for a language. The former is a generalization of finite elasticity due to Wright, and the latter consists of two finite sets of strings, one of which is a finite tell-tale introduced by Angluin. The main theorem in this paper is that if every language of a class has a pair of finite tell-tales, then the class is inferable from positive data. Also, it is shown that any language of a class with the finite cross property has a pair of finite tell-tales. Hence a class with the finite cross property is inferable from positive data. Furthermore, it is proved that a language has a finite tell-tale if and only if there does not exist any infinite cross sequence of languages contained in the language.
Article
A pattern is a finite string of constant and variable symbols. The language generated by a pattern is the set of all strings of constant symbols which can be obtained from the pattern by substituting non-empty strings for variables. Descriptive patterns are a key concept for inductive inference of pattern languages. A pattern π is descriptive for a given sample if the sample is contained in the language L(π) generated by π and no other pattern having this property generates a proper subset of L(π). The best previously known algorithm for computing descriptive one-variable patterns requires time O(n⁴ log n), where n is the size of the sample. We present a simpler and more efficient algorithm solving the same problem in time O(n² log n). In addition, we give a parallel version of this algorithm that requires time O(log n) and O(n³/log n) processors on an EREW-PRAM. Previously, no efficient parallel algorithm was known for this problem. Using a modified version of the sequential algorithm as a subroutine, we devise a learning algorithm for one-variable patterns whose expected total learning time is O(ℓ² log ℓ), provided the sample strings are drawn from the target language according to a probability distribution with expected string length ℓ. The probability distribution must be such that strings of equal length have equal probability, but can be arbitrary otherwise. Furthermore, we show how the algorithm for descriptive one-variable patterns can be used for learning one-variable patterns with a polynomial number of superset queries.
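The notions above can be made concrete with a small sketch. Membership in a one-variable pattern language maps neatly onto regular-expression backreferences, and a crude stand-in for descriptiveness is to prefer consistent patterns with the most constant symbols. Both function names are invented here, and the brute-force search is nothing like the O(n² log n) algorithm of the paper:

```python
import re

def matches(pattern, s):
    """Does s belong to the language of the one-variable pattern? The
    variable 'x' must stand for the same non-empty constant string at every
    occurrence; all other symbols are constants."""
    parts, seen = [], False
    for ch in pattern:
        if ch == 'x':
            parts.append(r'\1' if seen else r'(.+)')  # backreference ties occurrences
            seen = True
        else:
            parts.append(re.escape(ch))
    return re.fullmatch(''.join(parts), s) is not None

def naive_descriptive(sample):
    """Brute-force sketch: try every pattern obtained by replacing one
    substring of the shortest sample string with the variable, and keep a
    consistent pattern with the most constant symbols (a rough proxy for
    generating a smallest language)."""
    s0, best = min(sample, key=len), None
    for i in range(len(s0)):
        for j in range(i + 1, len(s0) + 1):
            pat = s0[:i] + 'x' + s0[j:]
            if all(matches(pat, s) for s in sample):
                if best is None or len(pat) > len(best):
                    best = pat
    return best
```

For the sample {aba, abba, abbba}, the search returns a three-symbol pattern such as `xba` or `axa`, each of which covers all three strings with a single non-empty substitution per string.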
Article
We introduce a formal model of teaching in which the teacher is tailored to a particular learner, yet the teaching protocol is designed so that no collusion is possible. Not surprisingly, such a model remedies the nonintuitive aspects of other models in which the teacher must successfully teach any consistent learner. We prove that any class that can be exactly identified by a deterministic polynomial-time algorithm with access to a very rich set of example-based queries is teachable by a computationally unbounded teacher and a polynomial-time learner. In addition, we present other general results relating this model of teaching to various previous results. We also consider the problem of designing teacher/learner pairs in which both the teacher and learner are polynomial-time algorithms and describe teacher/learner pairs for the classes of 1-decision lists and Horn sentences.
Article
This work describes algorithms for the inference of minimum-size deterministic automata consistent with a labeled training set. The algorithms presented represent the state of the art for this problem, which is known to be computationally very hard. In particular, we analyze the performance of algorithms that use implicit enumeration of solutions and algorithms that perform explicit search but incorporate a set of techniques known as dependency-directed backtracking to prune the search tree effectively. We present empirical results that show the comparative efficiency of the methods studied and discuss alternative approaches to this problem, evaluating their advantages and drawbacks.
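For scale, the problem itself fits in a few lines as a naive explicit-enumeration baseline. This is only a sketch usable on toy samples, with invented names, and bears no resemblance to the optimized implicit-enumeration and backtracking techniques the article analyzes:

```python
from itertools import product

def consistent(delta, accept, sample):
    """Check a DFA (transition map delta, accepting set, start state 0)
    against a list of (string, label) pairs."""
    for w, label in sample:
        state = 0
        for ch in w:
            state = delta[(state, ch)]
        if (state in accept) != label:
            return False
    return True

def smallest_dfa(sample, alphabet, max_states=4):
    """Enumerate every DFA of 1, 2, ... states until one is consistent with
    the labeled sample; returns (size, delta, accept) or None. Exponential
    in the number of states, hence toy-only."""
    for n in range(1, max_states + 1):
        keys = [(q, a) for q in range(n) for a in alphabet]
        for targets in product(range(n), repeat=len(keys)):
            delta = dict(zip(keys, targets))
            for bits in product([False, True], repeat=n):
                accept = {q for q in range(n) if bits[q]}
                if consistent(delta, accept, sample):
                    return n, delta, accept
    return None
```

On a sample labeling strings by whether they contain an even number of a's, no one-state machine is consistent, and the search finds a two-state (parity) automaton.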
Conference Paper
In 1990, Angluin showed that no class exhibiting a combinatorial property called "approximate fingerprints" can be identified exactly using polynomially many Equivalence queries (of polynomial size). Here we show that this is a necessary condition: every class without approximate fingerprints has an identification strategy that makes a polynomial number of Equivalence queries. Furthermore, if the class is "honest" in a technical sense, the computational power required by the strategy is within the polynomial-time hierarchy, so proving non-learnability is at least as hard as showing P ≠ NP.
Article
This note introduces subclasses of even linear languages for which there exist inference algorithms using positive samples only.
Article
The problem of identifying an unknown regular set from examples of its members and nonmembers is addressed. It is assumed that the regular set is presented by a minimally adequate Teacher, which can answer membership queries about the set and can also test a conjecture and indicate whether it is equal to the unknown set and provide a counterexample if not. (A counterexample is a string in the symmetric difference of the correct set and the conjectured set.) A learning algorithm L∗ is described that correctly learns any regular set from any minimally adequate Teacher in time polynomial in the number of states of the minimum dfa for the set and the maximum length of any counterexample provided by the Teacher. It is shown that in a stochastic setting the ability of the Teacher to test conjectures may be replaced by a random sampling oracle, EX(). A polynomial-time learning algorithm is shown for a particular problem of context-free language identification.
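The minimally adequate Teacher assumed above can be sketched directly. Here the equivalence test is approximated by bounded exhaustive checking (a real Teacher answers exactly, so counterexamples longer than `max_len` would be missed), and the target, the even-number-of-a's language over {a, b}, is just an illustrative choice; the class and names are this sketch's, not Angluin's:

```python
from itertools import product

class Teacher:
    """Minimally adequate Teacher for a target DFA with start state 0:
    exact membership queries plus an approximate equivalence query."""
    def __init__(self, delta, accept, alphabet, max_len=8):
        self.delta, self.accept = delta, accept
        self.alphabet, self.max_len = alphabet, max_len

    def member(self, w):
        # run the target DFA on w
        state = 0
        for ch in w:
            state = self.delta[(state, ch)]
        return state in self.accept

    def equivalent(self, hyp_member):
        # compare the hypothesis (given as a membership function) with the
        # target on every string up to max_len; return a counterexample
        # from the symmetric difference if one is found
        for n in range(self.max_len + 1):
            for tup in product(self.alphabet, repeat=n):
                w = ''.join(tup)
                if hyp_member(w) != self.member(w):
                    return False, w
        return True, None

# target: strings over {a, b} with an even number of a's
delta = {(0, 'a'): 1, (0, 'b'): 0, (1, 'a'): 0, (1, 'b'): 1}
teacher = Teacher(delta, {0}, ['a', 'b'])
```

L∗ alternates membership queries (to fill its observation table) with equivalence queries, folding each returned counterexample back into the table until the Teacher accepts the conjecture.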
Article
This paper presents a unifying framework of syntactic and statistical pattern recognition for one-dimensional observations and signals like speech. The syntactic constraints will be based upon stochastic extensions of the grammars in the Chomsky hierarchy. These extended stochastic grammars can be applied to both discrete and continuous observations. Neglecting the mathematical details and complications, we can convert a grammar of the Chomsky hierarchy to a stochastic grammar by attaching probabilities to the grammar rules and, for continuous observations, attaching probability density functions to the terminals of the grammar. In such a framework, a consistent integration of syntactic pattern recognition and statistical pattern recognition, which is typically based upon Bayes’ decision rule for minimum error rate, can be achieved such that no error correction or postprocessing after the recognition phase is required. Efficient algorithms and closed-form solutions for the parsing and recognition problem will be presented for the following types of stochastic grammars: regular, linear and context-free. It will be shown how these techniques can be applied to the task of continuous speech recognition.
Article
In Part I, four ostensibly different theoretical models of induction are presented, in which the problem dealt with is the extrapolation of a very long sequence of symbols—presumably containing all of the information to be used in the induction. Almost all, if not all problems in induction can be put in this form. Some strong heuristic arguments have been obtained for the equivalence of the last three models. One of these models is equivalent to a Bayes formulation, in which a priori probabilities are assigned to sequences of symbols on the basis of the lengths of inputs to a universal Turing machine that are required to produce the sequence of interest as output. Though it seems likely, it is not certain whether the first of the four models is equivalent to the other three. Few rigorous results are presented. Informal investigations are made of the properties of these models. There are discussions of their consistency and meaningfulness, of their degree of independence of the exact nature of the Turing machine used, and of the accuracy of their predictions in comparison to those of other induction methods. In Part II these models are applied to the solution of three problems—prediction of the Bernoulli sequence, extrapolation of a certain kind of Markov chain, and the use of phrase structure grammars for induction. Though some approximations are used, the first of these problems is treated most rigorously. The result is Laplace's rule of succession. The solution to the second problem uses less certain approximations, but the properties of the solution that are discussed, are fairly independent of these approximations. The third application, using phrase structure grammars, is least exact of the three. First a formal solution is presented. Though it appears to have certain deficiencies, it is hoped that presentation of this admittedly inadequate model will suggest acceptable improvements in it. 
This formal solution is then applied in an approximate way to the determination of the “optimum” phrase structure grammar for a given set of strings. The results that are obtained are plausible, but subject to the uncertainties of the approximation used.
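The closed form recovered for the first problem, Laplace's rule of succession, is a one-liner, included here only to make the result concrete (the function name is this sketch's):

```python
from fractions import Fraction

def laplace_rule(successes, trials):
    """Laplace's rule of succession: the predicted probability that the next
    Bernoulli trial succeeds, after observing `successes` in `trials`."""
    return Fraction(successes + 1, trials + 2)
```

With no observations at all the prediction is 1/2, and after 9 successes in 10 trials it is 10/12 = 5/6 rather than the maximum-likelihood 9/10.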
Book
This book provides an introduction to basic concepts and techniques of syntactic pattern recognition. The presentation emphasizes fundamental and practical material rather than strictly theoretical topics, and numerous examples illustrate the principles. The subject is developed according to the following topics: introduction (background, patterns and pattern classes, approaches to pattern recognition, elements of a pattern recognition system, concluding remarks); elements of formal language theory (introduction; string grammars and languages; examples of pattern languages and grammars; equivalent context-free grammars; syntax-directed translations; deterministic, nondeterministic, and stochastic systems; concluding remarks); higher-dimensional grammars (introduction; tree grammars; web grammars; plex grammars; shape grammars; concluding remarks); recognition and translation of syntactic structures (introduction; string language recognizers; automata for simple syntax-directed translation; parsing in string languages; recognition of imperfect strings; tree automata; concluding remarks); stochastic grammars, languages, and recognizers (introduction; stochastic grammars and languages; consistency of stochastic context-free grammars; stochastic recognizers; stochastic syntax-directed translations; modified Cocke-Younger-Kasami parsing algorithm for stochastic errors of changed symbols; concluding remarks); and grammatical inference (introduction; inference of regular grammars; inference of context-free grammars; inference of tree grammars; inference of stochastic grammars; concluding remarks). 155 references, 93 figures, 4 tables. (RWR)
Article
The minimum consistent DFA problem is that of finding a DFA with as few states as possible that is consistent with a given sample (a finite collection of words, each labeled as to whether the DFA found should accept or reject it). Assuming that P ≠ NP, it is shown that for any constant k, no polynomial-time algorithm can be guaranteed to find a consistent DFA of size opt^k, where opt is the size of a smallest DFA consistent with the sample. This result holds even if the alphabet is of constant size two, and if the algorithm is allowed to produce an NFA, a regular grammar, or a regular expression that is consistent with the sample. Similar hardness results are described for the problem of finding small consistent linear grammars.
Article
Learning is regarded as the phenomenon of knowledge acquisition in the absence of explicit programming. A precise methodology is given for studying this phenomenon from a computational viewpoint. It consists of choosing an appropriate information-gathering mechanism, the learning protocol, and exploring the class of concepts that can be learned using it in a reasonable (polynomial) number of steps. Although inherent algorithmic complexity appears to set serious limits to the range of concepts that can be learned, it is shown that there are some important nontrivial classes of propositional concepts that can be learned in a realistic sense.