Article

# Tree–Tree Matrices and Other Combinatorial Problems from Taxonomy

European Journal of Combinatorics (Impact Factor: 0.65). 02/1996; 17(2-3):191-208. DOI: 10.1006/eujc.1996.0017

Source: DBLP

### Full-text

Michiel Hazewinkel, Sep 26, 2014


**ABSTRACT:** This paper is concerned with information retrieval from large databases of scientific literature. The central idea is to define metrics on the information space of terms (key phrases) and on the information space of documents. This leads naturally to the idea of a weak enriched thesaurus and to the semiautomatic generation of such tools. Quite a large number of unsolved (mathematical) problems turn up in this context. Some of these are described and discussed. They mostly have to do with classification and clustering issues.

Mathematics subject classification 1991: 68P20

Key words & phrases: information space, discrete metric space, Lipschitz distance, clustering, single link clustering, information retrieval, data base, local search, neighborhood search, classification schemes, hierarchical schemes, classification trees, key phrases, co-citation analysis, thesaurus, weak thesaurus

Note. The present text is a write-up of a talk presented at the workshop on "Metadata: qualify...
##### Article: A New Cluster Algorithm for Graphs


**ABSTRACT:** A new cluster algorithm for graphs, called the Markov Cluster algorithm (MCL algorithm), is introduced. The graphs may be both weighted (with nonnegative weights) and directed. Let G be such a graph. The MCL algorithm simulates flow in G by first identifying G in a canonical way with a Markov graph $G_1$. Flow is then alternately expanded and contracted, leading to a sequence of Markov graphs $G_i$. The expansion step computes higher-step transition probabilities (TPs); the contraction step creates a new Markov graph by favouring high TPs and demoting low TPs in a specific way. The heuristic underlying this approach is the expectation that flow between dense regions which are sparsely connected will evaporate. The stable limits of the process are easily derived, and in practice the algorithm converges very fast to such a limit, the structure of which has a generic interpretation as an overlapping clustering of the graph G. Overlap is limited to cases where the input gr...

**ABSTRACT:** In [6] a cluster algorithm for graphs was introduced called the Markov cluster algorithm or MCL algorithm. The algorithm is based on simulation of (stochastic) flow in graphs by means of alternation of two operators, expansion and inflation. The results in [8] establish an intrinsic relationship between the corresponding algebraic process (MCL process) and cluster structure in the iterands and the limits of the process. Several kinds of experiments conducted with the MCL algorithm are described here. Test cases with varying homogeneity characteristics are used to establish some of the particular strengths and weaknesses of the algorithm. In general the algorithm performs well, except for graphs which are very homogeneous (such as weakly connected grids) and for which the natural cluster diameter (i.e. the diameter of a subgraph induced by a natural cluster) is large. This can be understood in terms of the flow characteristics of the MCL algorithm and the heuristic on which the...
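The alternation of expansion and inflation described in these abstracts can be illustrated with a small sketch. It is not the authors' implementation, and the abstracts do not fix the details. The sketch assumes that expansion squares a column-stochastic matrix, that inflation raises each entry to a power `inflation` (here 2.0) and rescales each column to sum to 1, and that a self-loop is added to every node before normalising. The function names, the stopping tolerance, the `eps` threshold, and the way clusters are read off the nonzero rows of the limit matrix are all choices made for this example.

```python
def normalize_columns(m):
    """Rescale every column to sum to 1 (column-stochastic matrix)."""
    n = len(m)
    out = [[0.0] * n for _ in range(n)]
    for j in range(n):
        s = sum(m[i][j] for i in range(n))
        for i in range(n):
            out[i][j] = m[i][j] / s if s > 0 else 0.0
    return out


def expand(m):
    """Expansion: square the matrix, giving two-step transition probabilities."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, r):
    """Inflation: entrywise power r, then renormalise the columns.
    For r > 1 this favours high probabilities and demotes low ones."""
    return normalize_columns([[x ** r for x in row] for row in m])


def mcl(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Alternate expansion and inflation until the matrix stops changing.
    The self-loops, inflation value, and tolerance are assumptions."""
    n = len(adjacency)
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        nxt = inflate(expand(m), inflation)
        delta = max(abs(nxt[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = nxt
        if delta < tol:
            break
    return m


def clusters(m, eps=1e-6):
    """Read clusters off the limit: each row with nonzero entries marks an
    attractor, and the columns it reaches form one cluster."""
    found = set()
    for row in m:
        members = frozenset(j for j, x in enumerate(row) if x > eps)
        if members:
            found.add(members)
    return sorted(sorted(c) for c in found)


# Hypothetical test graph: two triangles (0,1,2) and (3,4,5) joined by edge 2-3.
two_triangles = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]

if __name__ == "__main__":
    print(clusters(mcl(two_triangles)))
```

On this small graph the flow across the single bridging edge shrinks with every inflation step, which is the "evaporation" heuristic the first abstract refers to. A larger `inflation` value would generally be expected to give a finer clustering.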