On graph modelling, node ranking and visualisation.

IJISTA 01/2007; 3:188-210. DOI: 10.1504/IJISTA.2007.014259
Source: DBLP

ABSTRACT Graphs traditionally have many applications in various areas of computer science. Research in graph-based data mining has recently attracted considerable attention due to its broad range of applications. Examples include XML documents, web logs, web searches and molecular biology. Most of the approaches used in these applications focus on deriving interesting, frequent patterns from given datasets. Two fundamental questions are, however, ignored: how to derive a graph from a set of objects, and how to order nodes according to their relations with others in the graph. In this paper, we provide approaches to building a graph from a given set of objects accompanied by their feature vectors, as well as to ranking nodes in the graph. The basic idea of our ranking approach is to quantify the importance of a node as the degree to which it has direct and indirect relationships with other nodes in a graph. A method for visualising graphs with ranked nodes is also presented. Visual examples and applications are provided to demonstrate the effectiveness of our approaches.
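
The abstract does not give the paper's actual formulas, so the following is only an illustrative sketch of the two questions it raises, not the authors' method. It assumes (hypothetically) that two objects are linked when the cosine similarity of their feature vectors reaches a threshold, and that a node's rank is a decayed count of walks of length 1 to max_hops starting at it, so that direct links weigh more than indirect ones. The names build_graph, rank_nodes, threshold, decay and max_hops are all invented for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def build_graph(features, threshold=0.8):
    """Assumed rule: link two objects when their similarity >= threshold."""
    names = list(features)
    adj = {name: set() for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if cosine(features[a], features[b]) >= threshold:
                adj[a].add(b)
                adj[b].add(a)
    return adj


def rank_nodes(adj, decay=0.5, max_hops=3):
    """Assumed score: sum over k of decay**(k-1) * (walks of length k from the node)."""
    scores = {n: 0.0 for n in adj}
    walks = {n: 1.0 for n in adj}  # one walk of length 0 from every node
    weight = 1.0
    for _ in range(max_hops):
        # Walks of length k from n = sum of walks of length k-1 from its neighbours.
        walks = {n: sum(walks[m] for m in adj[n]) for n in adj}
        for n in adj:
            scores[n] += weight * walks[n]
        weight *= decay
    # Highest score first; ties broken by node name.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    feats = {"x": [1.0, 0.0], "y": [1.0, 0.0], "z": [0.0, 1.0]}
    graph = build_graph(feats)
    print(graph)
    print(rank_nodes(graph))
```

Other similarity measures or ranking schemes would fit the abstract's description equally well; this one was chosen only because it is short and makes the direct-versus-indirect weighting explicit.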

  • ABSTRACT: Graph visualization is commonly used to visually model relations in many areas. Examples include Web sites, CASE tools, and knowledge representation. When the amount of information in these graphs becomes too large, however, users cannot perceive all elements at the same time. A clustered graph can greatly reduce visual complexity by temporarily replacing a set of nodes in clusters with abstract nodes. This paper proposes a new approach to clustering graphs. The approach constructs the node similarity matrix of a graph that is derived from a novel metric of node similarity. The linkage pattern of the graph is thus encoded into the similarity matrix, and then one obtains the hierarchical abstraction of densely linked subgraphs by applying the k-means algorithm to the matrix. A heuristic method is developed to overcome the inherent drawbacks of the k-means algorithm. For clustered graphs we present a multilevel multi-window approach to hierarchically drawing them in different abstract level views with the purpose of improving their readability. The proposed approaches demonstrate good results in our experiments. As application examples, visualizations of parts of Java class diagrams and Web graphs are provided. We also conducted usability experiments on our algorithm and approach. The results have shown that the hierarchically clustered graph used in our system can improve user performance for certain types of tasks.
    Journal of Visual Languages & Computing 01/2006; · 0.56 Impact Factor
  • ABSTRACT: This paper discusses the use of graph-theoretic methods for the representation and searching of three-dimensional patterns of side-chains in protein structures. The position of a side-chain is represented by pseudo-atoms, and the relative positions of pairs of side-chains by the distances between them. This description of the geometry can be represented by a labelled graph in which the nodes and the edges of the graph represent the pseudo-atoms and the sets of inter-pseudo-atomic distances, respectively. Given such a representation, a protein can be searched for the presence of a user-defined query pattern of side-chains by means of a subgraph-isomorphism algorithm which is implemented in the program ASSAM. Experiments with one such algorithm, that due to Ullmann, show that it provides both an effective and a highly efficient way of searching for patterns of side-chains. The method is illustrated by searches for the serine protease catalytic triad, for residues involved in the catalytic activity of staphylococcal nuclease, and for the zinc-binding side-chains of thermolysin. The catalytic triad pattern search revealed the existence of a second Asp-His-Ser triad-like arrangement of residues in trypsinogen and chymotrypsinogen, in addition to the catalytic residues. In addition, the program can be used to search for hypothetical patterns, as is shown for a pattern of three tryptophan side-chains. These searches demonstrate that the search algorithm can successfully retrieve the great majority of the expected proteins, as well as other, previously unreported proteins that contain the pattern of interest.
    Journal of Molecular Biology 11/1994; 243(2):327-44. · 3.91 Impact Factor

