# Takayoshi Shoudai

Kyushu International University · Faculty of Contemporary Business

Dr. Sci.

## About

108 publications · 4,464 reads · 609 citations

Additional affiliations

April 2014 - present

January 2014 - May 2019

October 1993 - March 2014

## Publications

A regular pattern is a string consisting of constant symbols and distinct variable symbols. The language of a regular pattern is the set of all constant strings obtained by replacing all variable symbols in the regular pattern with non-empty strings. The present paper deals with the learning problem of languages of regular patterns within Angluin's...
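The definition above can be illustrated with a small membership test. This is a minimal sketch, not the learning algorithm of the paper: it assumes a pattern is given as a list of symbols plus a set naming which symbols are variables, and that any character may appear in a substituted string. The function names `regular_pattern_to_regex` and `in_language` are hypothetical.

```python
import re


def regular_pattern_to_regex(pattern, variables):
    """Translate a regular pattern into a compiled regular expression.

    `pattern` is a list of symbols. Symbols listed in `variables` are
    variable symbols; all others are constants. In a regular pattern every
    variable symbol is distinct, so a repeated variable is rejected.
    """
    seen = set()
    parts = []
    for symbol in pattern:
        if symbol in variables:
            if symbol in seen:
                raise ValueError(f"variable {symbol!r} occurs more than once")
            seen.add(symbol)
            parts.append("(.+)")  # a variable is replaced by a non-empty string
        else:
            parts.append(re.escape(symbol))  # constants must match literally
    return re.compile("".join(parts), re.DOTALL)


def in_language(pattern, variables, word):
    """Return True if `word` belongs to the language of the regular pattern."""
    return regular_pattern_to_regex(pattern, variables).fullmatch(word) is not None


# The pattern a x b y, with variables x and y.
example = ["a", "x", "b", "y"]
print(in_language(example, {"x", "y"}, "acbd"))  # x = "c", y = "d"
print(in_language(example, {"x", "y"}, "ab"))    # too short: x and y must be non-empty
```

Because each variable occurs only once, no back-references are needed and the language is regular, which is why a plain regular expression suffices here.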

A formal graph system (FGS) is a logic programming system that directly manipulates graphs by dealing with graph patterns instead of terms of first-order predicate logic. In this paper, based on an FGS, we introduce a primitive formal ordered tree system (pFOTS) as a formal system defining labeled ordered tree languages. A pFOTS program is a finite...

A term is a connected acyclic graph (unrooted unordered tree) pattern with structured variables, which are ordered lists of one or more distinct vertices. A variable of a term has a variable label and can be replaced with an arbitrary tree by hyperedge replacement according to the variable label. The dimension of a term is the maximum number of ver...

A cograph (complement reducible graph) is a graph which can be generated by disjoint union and complement operations on graphs, starting with a single vertex graph. Cographs arise in many areas of computer science and are studied extensively. With the goal of developing an effective data mining method for graph structured data, in this paper we int...
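The two generating operations in this definition can be written down directly. The sketch below is only an illustration of the definition, under the assumption that a graph is stored as a pair of a vertex set and a set of two-element edges; the helper names `single_vertex`, `disjoint_union`, and `complement` are hypothetical and do not come from the paper.

```python
from itertools import combinations


def single_vertex(v):
    """The graph with one vertex and no edges."""
    return (frozenset([v]), frozenset())


def disjoint_union(g, h):
    """Disjoint union of two graphs whose vertex sets do not overlap."""
    (vg, eg), (vh, eh) = g, h
    if vg & vh:
        raise ValueError("vertex sets must be disjoint")
    return (vg | vh, eg | eh)


def complement(g):
    """Complement graph: same vertices, exactly the missing edges."""
    vertices, edges = g
    all_pairs = {frozenset(pair) for pair in combinations(vertices, 2)}
    return (vertices, frozenset(all_pairs - edges))


# Build the path a - b - c using only the three operations above.
a, b, c = single_vertex("a"), single_vertex("b"), single_vertex("c")
edge_ac = complement(disjoint_union(a, c))       # a and c joined by an edge
path = complement(disjoint_union(b, edge_ac))    # b adjacent to both a and c
print(sorted(sorted(e) for e in path[1]))
```

Every graph produced this way is a cograph by definition; recognizing whether an arbitrary graph is a cograph is a separate problem that this sketch does not address.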

This paper deals with the problem of deciding whether a given graph structure appears as a pattern in the structure of a given graph. A graph pattern is a triple p=(V,E,H), where (V,E) is a graph and H is a set of variables, which are ordered lists of vertices in V. A variable can be replaced with an arbitrary connected graph by a kind of hyperedge rep...

In this paper, we describe how distributional learning techniques can be applied to formal graph system (FGS) languages. An FGS is a logic program that deals with term graphs instead of the terms of first-order predicate logic. We show that the regular FGS languages of bounded degree with the 1-finite context property (1-FCP) and bounded treewidth...

An efficient means of learning tree-structural features from tree-structured data would enable us to construct effective mining methods for tree-structured data. Here, a pattern representing rich tree-structural features common to tree-structured data and a polynomial time algorithm for learning important tree patterns are necessary for mining know...

A term tree pattern is a rooted ordered tree pattern which consists of ordered tree structures with edge labels and structured variables with labels. All variables with the same label in a term tree pattern can be simultaneously replaced with ordered trees isomorphic to the same rooted ordered tree. Then, a term tree pattern is suitable for represe...

A formal graph system (FGS) is a logic program that deals with term graphs instead of the terms of first-order predicate logic. In this paper, we introduce context-deterministic (c-deterministic) regular FGSs as a subclass of FGSs and propose a polynomial time algorithm for learning the class of c-deterministic regular FGSs by using membership and eq...

A tree contraction pattern (TC-pattern) is an unordered tree-structured pattern common to given unordered trees, which is obtained by merging every uncommon connected substructure into one vertex by edge contraction. In order to extract meaningful and hidden knowledge from tree structured documents, we consider a minimal language (MINL) problem for...

A tree contraction pattern (TC-pattern) is an unordered tree-structured pattern which can express a tree-structure common to given unordered trees. A TC-pattern has some special vertices, called contractible vertices, into which every uncommon connected substructure is merged by edge contractions. In this paper, we propose a probabilistic method for...

A graph contraction pattern is a triplet h = (V, E, U) where (V, E) is a connected graph and U is a distinguished subset of V. The graph contraction pattern matching problem is defined as follows. Given a graph contraction pattern h = (V, E, U) and a graph G, can G be transformed to (V, E) by edge contractions so that for any v ∈ V\U, only one vert...

Darknet monitoring plays an important role in understanding various botnet activities so that the threats that botnets pose on the Internet can be detected early. However, common illegal accesses by ordinary malware make such detection difficult. To remove such accesses by ordinary malware from the results of network monitoring, Tsuruta et al. (2012)...

In this paper, we present a concept of edge contraction-based tree-structured patterns as a graph pattern suited to represent tree-structured data. A tree contraction pattern (TC-pattern) is an unordered tree-structured pattern common to given tree-structured data, which is obtained by merging every uncommon connected substructure into one vertex...

Darknet monitoring is very important for understanding various botnet activities, and thus for the early detection of and defense against the threats that botnets pose on the Internet. However, common illegal accesses by ordinary malware make such detection difficult. To remove such accesses by ordinary malware from the results of network monitoring, we propose a data...

This paper deals with the problem of deciding whether a given graph structure appears as a pattern in the structure of a given graph. A graph pattern is a connected graph with structured variables. A variable is an ordered list of vertices that can be replaced with a connected graph by a kind of hyperedge replacement. The graph pattern matching proble...

Darknet monitoring is very important for understanding various botnet activities, and thus for detecting and defending against the threats that botnets pose on the Internet at an early stage. However, common illegal accesses by ordinary malware make such detection difficult. In this paper, in order to remove such accesses by ordinary malware from the results of network monitoring, we p...

An outerplanar graph is a planar graph that can be embedded in the plane in such a way that all vertices lie on the outer boundary. Outerplanar graphs express many chemical compounds. An externally extensible outerplanar graph pattern (eeo-graph pattern for short) represents a graph pattern common to a finite set of outerplanar graphs, like a datas...

Due to the rapid growth of information technologies, the use of electronic data such as XML/HTML documents, which are a form of tree structured data, has been rapidly increasing. We have developed an algorithm for effectively compressing tree structured data and one for decompressing a compressed tree that are based on the Lempel–Ziv compression sc...

Extending the concept of ordered graphs, we propose a new data structure to express rooted planar maps, which is called a planar map pattern. In order to develop an efficient data mining method from a dataset of rooted planar maps, we propose a polynomial time algorithm for finding a minimally generalized planar map pattern, which represents maxi...

An outerplanar graph is a planar graph which can be embedded in the plane in such a way that all vertices lie on the outer boundary. Many chemical compounds are known to be expressed by outerplanar graphs. An externally extensible outerplanar graph pattern (eeo-graph pattern for short) represents a graph pattern common to a finite set of outerpl...

A linear term tree is defined as an edge-labeled rooted tree pattern with ordered children and internal structured variables whose labels are mutually distinct. A variable can be replaced with arbitrary edge-labeled rooted ordered trees. We consider the polynomial time learnability of finite unions of linear term trees in the exact learning model f...

Recently, due to the rapid growth of electronic data having graph structures such as HTML and XML texts and chemical compounds, many researchers have been interested in data mining and machine learning techniques for finding useful patterns from graph-structured data (graph data). Since graph data contain a huge number of substructures and it tends...

Electronic data like XML/HTML documents, called tree structured data, have been rapidly increasing and have become larger day by day. In this paper, we propose efficient compression and decompression algorithms based on the Lempel-Ziv compression scheme by improving XMill and XDemill (Liefke and Suciu, SIGMOD 2000), which are a compressor and a...

A graph is an interval graph if and only if each vertex in the graph can be associated with an interval on the real line such that any two vertices are adjacent in the graph exactly when the corresponding intervals have a nonempty intersection. A number of interesting applications for interval graphs have been found in the literature. In order to f...
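As an illustration of this definition, the following sketch builds the edge set of the interval graph determined by a given family of intervals. It assumes closed intervals given as `(left, right)` pairs keyed by vertex name; the function name `interval_graph_edges` is hypothetical, and the sketch does not address the harder problem of recognizing whether an arbitrary graph is an interval graph.

```python
from itertools import combinations


def interval_graph_edges(intervals):
    """Return the edges of the interval graph for closed intervals.

    `intervals` maps each vertex to a pair (left, right) with left <= right.
    Two vertices are adjacent exactly when their intervals intersect.
    """
    edges = set()
    for (u, (lu, ru)), (v, (lv, rv)) in combinations(intervals.items(), 2):
        # Closed intervals intersect iff the larger left endpoint does not
        # exceed the smaller right endpoint.
        if max(lu, lv) <= min(ru, rv):
            edges.add(frozenset((u, v)))
    return edges


intervals = {"a": (0, 2), "b": (1, 3), "c": (4, 5)}
print(sorted(sorted(e) for e in interval_graph_edges(intervals)))
```

With open intervals the comparison would be strict instead, so intervals that only touch at an endpoint would not be adjacent.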

An outerplanar graph is a planar graph which can be embedded in the plane in such a way that all vertices lie on the outer boundary. Many chemical compounds are known to be expressed by outerplanar graphs. We proposed a block preserving outerplanar graph pattern (bpo-graph pattern, for short) as a graph pattern common to a set of outerplanar gr...

An outerplanar graph is a planar graph which can be embedded in the plane in such a way that all vertices lie on the outer boundary. Many chemical compounds are known to be expressed by outerplanar graphs. In this paper, we first introduce an externally extensible outerplanar graph pattern (eeo-graph pattern for short) as a graph pattern comm...

A linear graph pattern is a labeled graph such that its vertices have constant labels and its edges have either constant or mutually distinct variable labels. An edge having a variable label is called a variable and can be replaced with an arbitrary labeled graph. Let \(\mathcal{GPC}\) be the set of all linear graph patterns having a structural fe...

An outerplanar graph is a planar graph which can be embedded in the plane in such a way that all vertices lie on the outer boundary. Many semi-structured data like the NCI dataset having about 250,000 chemical compounds can be expressed by outerplanar graphs. In this paper, we consider a data mining problem of extracting structural features fr...

A graph is an interval graph if and only if each vertex in the graph can be associated with an interval on the real line such that any two vertices are adjacent in the graph exactly when the corresponding intervals have a nonempty intersection. A number of interesting applications for interval graphs have been found in the literature. In order to f...

In the fields of data mining and knowledge discovery, many semistructured data such as HTML/XML files are represented by rooted trees t such that all children of each internal vertex of t are ordered and t has edge labels. In order to represent structural features common to such semistructured data, we propose a linear ordered term tree, which is a...

In the fields of data mining and knowledge discovery, many semistructured data such as HTML/XML files are represented by rooted trees t such that all children of each internal vertex of t are ordered and t has edge labels. In order to represent structural features common to such semistructured data, we propose a regular term tree which is a rooted...

A cograph (complement reducible graph) is a graph which can be generated by disjoint union and complement operations on graphs, starting with a single vertex graph. Cographs arise in many areas of computer science and are studied extensively. With the goal of developing an effective data mining method for graph structured data, in this paper we int...

Tree structured data such as HTML/XML files are represented by rooted trees with ordered children and edge labels. Knowledge representations for tree structured data are quite important to discover interesting features which such tree structured data have. In order to represent tree structured patterns with rich structural features, we introduce a...

In order to realize Web information retrieval using characteristic tree structured patterns in semistructured Web documents, methods for discovering frequent patterns or common characteristics in semistructured documents become more and more important. We have studied methods for discovering maximally frequent tree structured patterns in semistruct...

A wrapper is a program which extracts data from a web site and reorganizes them in a database. Wrapper generation from web sites is a key technique in realizing such a metasearch system. We present a new method of automatic wrapper generation for metasearch using our efficient learning algorithm for term trees. Term trees are ordered tree structure...

We consider the polynomial time learnability of ordered tree patterns with internal structured variables, in the query learning model of Angluin (1988). An ordered tree pattern with internal structured variables, called a term tree, is a representation of a tree structured pattern in semistructured or tree structured data such as HTML/XML files. St...

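In recent years, tree structured data such as Web documents have been increasing, so extracting information from tree structured data has become more important. To represent characteristic patterns common to such tree structured data, this paper uses ordered term trees. An ordered term tree is an ordered tree pattern with internal structured variables, and any ordered tree can be substituted for such a variable. A height-constrained variable is a new kind of variable that restricts the height of the substituted ordered tree: only ordered trees of height at most i can be substituted for a height-constrained variable whose height is restricted to i. In this paper, we show that the languages of ordered term trees with height-constrained variables are polynomial time inductively inferable.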

In order to extract meaningful and hidden knowledge from semistructured documents such as HTML or XML files, methods for discovering frequent patterns or common characteristics in semistructured documents have become more and more important. We propose new methods for discovering maximally frequent tree structured patterns in semistructured Web docum...

Due to the rapid growth of tree structured data such as Web documents, efficient learning from tree structured data becomes more and more important. In order to represent structural features common to such tree structured data, we propose a term tree, which is a rooted tree pattern consisting of tree structures and labeled variables. A variable is a...

In the field of Web mining, a Web page can be represented by a rooted tree T such that every internal vertex of T has ordered children and string data such as tags or texts are assigned to edges of T. A term tree is an ordered tree pattern, which has ordered tree structures and variables, and is suited for a representation of a tree structured patt...

Many semistructured data such as HTML/XML files are represented by rooted trees t such that all children of each internal vertex of t are ordered and all edges of t have labels. Such data is called tree structured data. Analyzing large tree structured data is a time-consuming process in data mining. If we can reduce the size of input data without l...

In order to represent structural features common to tree structured data, we propose an unlabeled term tree, which is a rooted tree pattern consisting of an unlabeled ordered tree structure and labeled variables. A variable is a labeled hyperedge which can be replaced with any unlabeled ordered tree of size at least 2. In this paper, we deal with a...

Advances in Knowledge Discovery and Data Mining: 7th Pacific-Asia Conference, PAKDD 2003, Seoul, Korea, April 30 - May 2, 2003, Proceedings. Information extraction from semistructured data is becoming more and more important. In order to extract meaningful or interesting contents from semistructured data, we need to extract common structured patterns fr...

In this paper, we present an effective algorithm for extracting characteristic substructures from graph structured data with geometric information, such as CAD, map data and drawing data. Moreover, as an application of our algorithm, we give a method of lossless compression for such data. First, in order to deal with graph structured data with geomet...

Computational knowledge discovery can be considered to be a complicated human activity concerned with searching for something new from data with computer systems. The optimization of the entire process of computational knowledge discovery is a big challenge in computer science. If we had an atlas of hypothesis classes which describes prior and basic...

We consider the polynomial time learnability of finite unions of ordered tree patterns with internal structured variables, in the query learning model of Angluin (1988). An ordered tree pattern with internal structured variables, called a term tree, is a rooted tree pattern which consists of tree structures with ordered children and internal struct...

Tree structured data such as HTML/XML files are represented by rooted trees with ordered children and edge labels. Knowledge representations for tree structured data are quite important to discover interesting features which such tree structured data have. In this paper, as a representation of structural features we propose a structured ordered t...

Tree structured data such as HTML/XML files are represented by rooted trees with ordered children and edge labels. As a representation of a tree structured pattern in such tree structured data, we propose an ordered tree pattern, called a term tree, which is a rooted tree pattern consisting of ordered children and internal structured variables. A t...

We developed a machine learning system, HAKKE, which is suitable for predicting functional regions from sequences, such as protein-coding region prediction and transmembrane domain prediction. HAKKE is a hybrid system in which a pool of algorithms cooperate to make an accurate prediction. The system uses an extension of the weighted majority...

Many Web documents such as HTML files and XML files have no rigid structure and are called semistructured data. In general, such semistructured Web documents are represented by rooted trees with ordered children. We propose a new method for discovering frequent tree structured patterns in semistructured Web documents by using a tag tree pattern as...

Electronic documents such as SGML/HTML/XML files and LaTeX files have been rapidly increasing, owing to the rapid progress of network and storage technologies. Many electronic documents have no rigid structure and are called semistructured documents. Since a lot of semistructured documents contain large plain texts, we focus on the structural characteris...

How to help learners plan a navigation path on the Web is an important issue in Web-based learning/education. Our approach to this is to allow the learners to preview a sequence of Web pages as a navigation path plan. In this paper, we introduce an assistant system that enables learners to plan the navigation path in a self-directed way before navi...

With increasing access to the Internet and the wealth of material online, a Web-based self-teaching system has considerable educational value. Accordingly, we developed AEGIS (Automatic Exercise Generator based on the Intelligence of Students), which automatically generates questions whose difficulty level fits the achievement level of a student. H...

The popularization of computers and the Internet enables people to hold lectures using Web contents as teaching material. Although teachers have prepared a lot of Web contents, most of them are only browsed by students. If we arrange some exercises according to lecture notes and prepare an answering mechanism for the exercises via the...

The protein conformation problem, one of the hard and important problems, is to identify conformation rules which transform sequences to their tertiary structures, called conformations. The aim of this work is to give a concrete theoretical foundation for a graph-theoretic approach to the protein conformation problem in the framework of a probabilis...

Many documents such as Web documents or XML files have tree structures. A term tree is an unordered tree pattern consisting of internal variables and tree structures. In order to extract meaningful and hidden knowledge from such tree structured documents, we consider a minimal language (MINL) problem for term trees. The MINL problem for term trees...

Many documents such as Web documents or XML files have no rigid structure. Such semistructured documents have been rapidly increasing. We propose a new method for discovering frequent tree structured patterns in semistructured Web documents. We consider the data mining problem of finding all maximally frequent tag tree patterns in semistructured da...

We present a new method for discovering knowledge from structured data which are represented by graphs in the framework of Inductive Logic Programming. A graph, or network, is widely used for representing relations between various data and expressing a small and easily understandable hypothesis. The analyzing system directly manipulating graphs is...

In this paper we describe its meaning and functions and show how the system works.

We present a new framework for discovering knowledge from two-dimensional structured data by using Inductive Logic Programming. Two-dimensional graph structured data such as image or map data are widely used for representing relations and distances between various objects. First, we define a layout term graph suited for representing two-dimensional...

Graphs have enough richness and flexibility to express discrete structures hidden in a large amount of data. Some searching methods utilizing graph algorithmic techniques have been developed in Knowledge Discovery. A term graph, which is one of the expressions for graph-structured data, is a hypergraph whose hyperedges are regarded as variables. Althou...

Many Internet technologies enable us to hold lectures with Web contents and even develop new lecture methods using the technologies. This paper proposes AEGIS (Automatic Exercise Generator based on the Intelligence of Students) that generates exercises of various levels according to each student's achievement level, marks his/her answers and ret...

We present a method for discovering new knowledge from structural data which are represented by graphs in the framework of inductive logic programming. A graph, or network, is widely used for representing relations between various data and expressing a small and easily understandable hypothesis. Formal Graph System (FGS) is a kind of logic programm...

We consider discovery of new concepts which are represented by graph patterns having tree structures. A term tree is a graph pattern which has tree structures and contains variables. The concept represented by a term tree, called the term tree language, is the set of all trees obtained by putting arbitrary trees to the places of its variables. A te...

Genome sequencing projects on almost 70 biological organisms have been progressing steadily, and some of them have already made their complete genomes public, for example through the Internet. These genomes are very carefully analyzed by skilled experts, usually by using some powerful tools for biological sequences like homol...

Software tools for genomic research, such as homology search, are very useful and have contributed to the progress of genomic research. However, these tools are not designed directly toward scientific discovery, and more discovery-oriented software tools are strongly expected to assist scientific discovery in genomic research. We have designed...

The refutation tree problem is to compute a refutation tree which is associated with the structure of a graph generated by a formal graph system (FGS). We present subclasses of FGSs, called simple FGSs, size-bounded simple FGSs and bounded simple FGSs. In order to show that the refutation tree problem for simple FGSs can be solved in polynomial tim...

Term graphs are a kind of hypergraph in which arbitrary graphs can be substituted for the hyperedges. Each hyperedge in a term graph is labeled with a variable. A term graph is called regular if each variable attached to a hyperedge does not occur more than once. Let f be a regular term graph. If a regular term graph which is obtained from...

This paper presents a brief overview of BONSAI Garden and describes computational experiments by BONSAI Garden scheduled for the workstation session.

By using an \(O((\log n)^2)\) time EREW PRAM algorithm for a maximal independent set problem (MIS), we show the following two results: (1) Given a graph, the maximal vertex-induced subgraph satisfying a hereditary graph property π can be found in time \(O(\Delta^{\lambda(\pi)} T_{\pi}(n) (\log n)^2)\) using a polynomial number of processors, where λ(π) is the maximum of diameters o...