Article

Tidier Drawings of Trees


Abstract

Various algorithms have been proposed for producing tidy drawings of trees: drawings that are aesthetically pleasing and use minimum drawing space. We show that these algorithms contain some difficulties that lead to aesthetically unpleasing, wider-than-necessary drawings. We then present a new algorithm with comparable time and storage requirements that produces tidier drawings. Generalizations to forests and m-ary trees are discussed, as are some problems in discretization when alphanumeric output devices are used.

... This method teased out the partitions of the explained variance (R²) among all the regressors, after accounting for the direct effects of each variable on alcohol and drug addiction, and the indirect effects of other variables. Furthermore, we visualized the network structure as three 'flow diagrams' using the flow function of R package qgraph according to the Reingold-Tilford graphical layout algorithm (Reingold & Tilford, 1981), which placed the nodes of the ASI-A, ASI-D and SDS on the left, and created a vertical network indicating which edges would be directly or indirectly related to the three nodes of ASI-A, ASI-D and SDS. ...
Article
Full-text available
Substance use disorder (SUD) is characterized by alcohol and drug use, dependence features and adverse psychosocial consequences. SUD represents an interplay between substance use, personality, impulsivity and psychopathology. Network analysis is a powerful method for examining the direct and indirect relationships between variables, and could help in understanding SUD as a self-sustaining system. We examined the network structure of addictive behaviour, personality, impulsivity and psychopathology in treatment-seeking people with SUD. This cross-sectional network analysis utilized a convenience sample of 391 treatment-seeking patients with SUD in a specialized addiction psychiatric clinic. We measured drug and alcohol use, dependence features and psychopathology using clinician-rated scales. Personality and impulsivity were measured using self-report instruments. LASSO network, centrality indices and network stability were estimated. We also estimated the relative importance of the network nodes in determining levels of drug and alcohol use and dependence severity. The domain-level LASSO network of addictive behaviour, psychopathological variables, traits of the Big Five Inventory (BFI) and impulsivity formed a highly connected network. BFI-neuroticism lay at the centre of the network with the highest closeness index. Depressive symptoms, anxiety symptoms and ‘general’ symptoms in the Positive and Negative Syndrome Scale showed the highest expected influence and predictability. The mean predictability of the network suggested a self-sustaining system. The three psychopathological nodes significantly determined the variance of drug use and dependence severity. The network structure of SUD is influenced by anxiety and depressive symptoms. Clinicians should detect and intervene on these symptoms to break the self-sustaining system of SUD.
... 1A-2), no layout was made accessible to phylogeneticists to make a better use of these empty areas. Such a layout, however, exists: the nonlayered tidy tree layout, which can be drawn in linear time following the Reingold-Tilford algorithm (Reingold and Tilford 1981) and its extension to trees with branch lengths (i.e., nonlayered, van der Ploeg 2014). ...
Article
Full-text available
Many layouts exist for visualizing phylogenetic trees, allowing the same information (evolutionary relationships) to be displayed in different ways. For large phylogenies, the choice of the layout is a key element, because the printable area is limited, and because interactive on-screen visualizers can lead to unreadable phylogenetic relationships at high zoom levels. A visual inspection of available layouts for rooted trees reveals large empty areas that one may want to fill in order to use less drawing space and eventually gain readability. This can be achieved by using the nonlayered tidy tree layout algorithm that was proposed earlier but has never been used in a phylogenetic context so far. Here, we present its implementation, and we demonstrate its advantages on simulated and biological data (the measles virus phylogeny). Our results call for the integration of this new layout in phylogenetic software. We implemented the nonlayered tidy tree layout in the R language as a stand-alone function (available at https://github.com/damiendevienne/non-layered-tidy-trees), as an option in the tree plotting function of the R package ape, and in the recent tool for visualizing reconciled phylogenetic trees thirdkind (https://github.com/simonpenel/thirdkind/wiki).
... Upon clustering, the grouped nodes are placed on their respective layer according to the local layout selected from those offered by the igraph library. These layouts vary from simple ones like circle, grid, star, random to more advanced force-directed ones such as Fruchterman-Reingold (26), Distributed Recursive (Graph) Layout (27), Multidimensional scaling (28), Large Graph Layout (LGL) and Graphopt or hierarchical-based ones like Reingold-Tilford (30) and Sugiyama. A more detailed description of these algorithms can be found in our previous Arena3D web article (16). ...
Preprint
Full-text available
Arena3Dweb is an interactive web tool that visualizes multi-layered networks in 3D space. In this update, Arena3Dweb supports directed networks as well as up to nine different types of connections between pairs of nodes with the use of Bezier curves. It comes with different color schemes (light/gray/dark mode), custom channel coloring, four node clustering algorithms which one can run on-the-fly, visualization in VR mode and predefined layer layouts (zig-zag, star and cube). This update also includes enhanced navigation controls (mouse orbit controls, layer dragging and layer/node selection) while its newly developed API allows integration with external applications as well as saving and loading of sessions in JSON format. Finally, a dedicated Cytoscape App has been developed, through which users can automatically send their 2D networks from Cytoscape to Arena3Dweb for 3D multi-layer visualization. Arena3Dweb is accessible at http://arena3d.pavlopouloslab.info or http://arena3d.org
... Vertices and edges can move during the animation and can have arbitrary lifetime. The purpose is to pursue aesthetic criteria commonly adopted for tree drawings [20]. ...
Preprint
Full-text available
In a graph story the vertices enter a graph one at a time and each vertex persists in the graph for a fixed amount of time $\omega$, called viewing window. At any time, the user can see only the drawing of the graph induced by the vertices in the viewing window and this determines a sequence of drawings. For readability, we require that all the drawings of the sequence are planar. For preserving the user's mental map we require that when a vertex or an edge is drawn, it has the same drawing for its entire life. We study the problem of drawing the entire sequence by mapping the vertices only to $\omega+k$ given points, where $k$ is as small as possible. We show that: $(i)$ The problem does not depend on the specific set of points but only on its size; $(ii)$ the problem is NP-hard and is FPT when parameterized by $\omega+k$; $(iii)$ there are families of graph stories that can be drawn with $k=0$ for any $\omega$, while for $k=0$ and small values of $\omega$ there are families of graph stories that can be drawn and others that cannot; $(iv)$ there are families of graph stories that cannot be drawn for any fixed $k$ and families of graph stories that require at least a certain $k$.
... This phenomenon violates a common aesthetic principle for graph drawing, which holds that "a sub-tree should be drawn the same way regardless of where it occurs in the tree" (Reingold and Tilford, 1981). More importantly, it also results in a larger graph than is necessary due to excess space being used for some nodes, particularly those without children. Therefore, the overall objectives of this revised algorithm are to ensure that: ...
Article
Full-text available
Visualizing the academic descendants of prolific researchers is a challenging problem. To this end, a modified Pavlo algorithm is presented and its utility is demonstrated based on manually-collected academic genealogies of five researchers in biomechanics and biomedicine. The researchers have 15–32 children each and between 93 and 384 total descendants. The graphs generated by the modified algorithm were over 97% smaller than the original. Mentorship metrics were also calculated; their hm-indices are 5–7 and gm-indices are in the range 7–13. Of the 1096 unique researchers across the five family trees, 153 (14%) had graduated their own PhD students by the end of 2021. It took an average of 9.6 years after their own graduation for an advisor to graduate their first PhD student, which suggests that an academic generation in this field is approximately one decade. The manually collected data sets used were also compared against the crowd-sourced academic genealogy data from the AcademicTree.org website. The latter included only 45% of the people and 34% of the connections, so this limitation must be considered when using it for analyses where completeness is required. The data sets and an implementation of the algorithm are available for re-use. Peer Review https://publons.com/publon/10.1162/qss_a_00205
... Our algorithm works by laying out the graph using the maximal spanning tree. Inspired by early works on tidy tree drawing [57,64,73], our approach has two main steps, with a focus on simplicity and efficiency. First, we generate an abstract layout of the maximal spanning tree that determines the distribution of space to nodes and subtrees of the graph. ...
Preprint
Full-text available
Force-directed layouts belong to a popular class of methods used to position nodes in a node-link diagram. However, they typically lack direct consideration of global structures, which can result in visual clutter and the overlap of unrelated structures. In this paper, we use the principles of persistent homology to untangle force-directed layouts thus mitigating these issues. First, we devise a new method to use 0-dimensional persistent homology to efficiently generate an initial graph layout. The approach results in faster convergence and better quality graph layouts. Second, we provide a new definition and an efficient algorithm for 1-dimensional persistent homology features (i.e., tunnels/cycles) on graphs. We provide users the ability to interact with the 1-dimensional features by highlighting them and adding cycle-emphasizing forces to the layout. Finally, we evaluate our approach with 32 synthetic and real-world graphs by computing various metrics, e.g., co-ranking, edge crossing, etc., to demonstrate the efficacy of our proposed method.
... In the beginning, we used the default algorithm implemented in d3-tree. This algorithm uses the Reingold-Tilford [20] algorithm with an improvement of Buchheim et al. [4] to run in linear time. ...
Conference Paper
Full-text available
A chatbot can automatically process a user's request, e.g. to provide requested information. In doing so, the user starts a conversation with the chatbot and can specify the request by further inquiry. Due to the developments in the field of NLP in recent years, algorithmic text comprehension has been significantly improved. As a result, chatbots are increasingly used by companies and other institutions for various tasks such as order processes or service requests. Knowledge bases are often used to answer users' queries, but these are usually curated manually in various text files and are prone to errors. Visual methods can help the expert to identify common problems in the knowledge base and can provide an overview of the chatbot system. In this paper, we present Chatbot Explorer, a system to visually assist the expert to understand, explore, and manage a knowledge base of different chatbot systems. For this purpose, we provide a tree-based visualization of the knowledge base as an overview. For a detailed analysis, the expert can use appropriate visualizations to drill down the analysis to the level of individual elements of a specific story to identify problems within the knowledge base. We support the expert with automatic detection of possible problems, which can be visually highlighted. Additionally, the expert can also change the order of the queries to optimize the conversation lengths and it is possible to add new content. To develop our solution, we have conducted an iterative design process with domain experts and performed two user evaluations. The evaluations and the feedback from our domain experts have shown that our solution can significantly improve the maintainability of chatbot knowledge bases.
... Therefore, an ad hoc approach based on a tree-like scheme was implemented to produce satisfactory, aesthetically functional drawings. This was achieved using the JavaScript-powered D3.tree library as a base, which generates node-link diagrams that lay out the connectivity between nodes in a parent-child correlation considering the tree representation algorithm exhibited in [39]. Under this approach, every child element is expected to have only one parent element. ...
Article
Full-text available
For the validation of vehicular Electrical Distribution Systems (EDS), engineers are currently required to analyze dispersed information regarding technical requirements, standards and datasheets. Moreover, an enormous effort takes place to elaborate testing plans that are representative for most EDS possible configurations. These experiments are followed by laborious data analysis. To diminish this workload and the need for physical resources, this work reports a simulation platform that centralizes the tasks for testing different EDS configurations and assists the early detection of inadequacies in the design process. A specific procedure is provided to develop a software tool intended for this aim. Moreover, the described functionalities are exemplified considering as a case study the main wire harness from a commercial vehicle. A web-based architecture has been employed in alignment with the ongoing software development revolution and thus provides flexibility for both developers and users. Due to its scalability, the proposed software scheme can be extended to other web-based simulation applications. Furthermore, the automatic generation of electrical layouts for EDS is addressed to favor an intuitive understanding of the network. To favor human–information interaction, utilized visual analytics strategies are also discussed. Finally, full simulation workflows are exposed to provide further insights on the deployment of this type of computer platforms.
... A second criterion beyond relationship visualization was needed, given the existence of different graphing algorithms. Here, we used an algorithm from Fruchterman and Reingold (1991), but other algorithms may be found in Kamada and Kawai (1989), Sugiyama, Tagawa, and Toda (1981), Davidson and Harel (1996), and Reingold and Tilford (1981). To our knowledge, the scholarly literature on financial networks contains no discussions of the merits of various algorithms in effectively representing networks. ...
Article
Financial Stability Indices and Financial Networks Dynamics in Europe
While many analyses focus on the simulation of different contagion scenarios in the event of a crisis, there is little work devoted to the topology of financial networks. We investigate a sample of 260 European banks. The networks observed are unique and more sophisticated than the theoretical networks typically used in contagion scenarios. We show there are topological particularities in the banking networks. We demonstrate that the location of a bank in its networks of relationships and the empirical properties observed in its neighborhood have an impact on the stability of the financial system. In addition, we show that these topologies of connections have been significantly transformed both during and after the financial crisis.
... The containing behaviors of an IP are aligned horizontally within the node. To display the behaviors in a legible way, we calculated an even distribution of the graphical positions of the behavior nodes within the IP node with the Reingold-Tilford algorithm [Reingold and Tilford, 1981]. ...
Preprint
In recent years, an increased effort has been invested to improve the capabilities of robots. Nevertheless, human-robot interaction remains a complex field of application where errors occur frequently. The reasons for these errors can primarily be divided into two classes. Foremost, the recent increase in capabilities also widened possible sources of errors on the robot's side. This entails problems in the perception of the world, but also faulty behavior, based on errors in the system. Apart from that, non-expert users frequently have incorrect assumptions about the functionality and limitations of a robotic system. This leads to incompatibilities between the user's behavior and the functioning of the robot's system, causing problems on the robot's side and in the human-robot interaction. While engineers constantly improve the reliability of robots, the user's understanding about robots and their limitations has to be addressed as well. In this work, we investigate ways to improve the understanding about robots. For this, we employ FAMILIAR - FunctionAl user Mental model by Increased LegIbility ARchitecture, a transparent robot architecture with regard to the robot behavior and decision-making process. We conducted an online simulation user study to evaluate two complementary approaches to convey and increase the knowledge about this architecture to non-expert users: a dynamic visualization of the system's processes as well as a visual programming interface. The results of this study reveal that visual programming improves knowledge about the architecture. Furthermore, we show that with increased knowledge about the control architecture of the robot, users were significantly better in reaching the interaction goal. Finally, we show that anthropomorphism may reduce interaction success.
... They can be visualized explicitly or implicitly [SS06]. The most common explicit representations are node-link diagrams [RT81], where nodes are connected by edges to express hierarchical relations between them. Implicit techniques are more space-efficient as they use alignment instead of edges to encode the relationship [SHS10]. ...
Article
Full-text available
The exploration of large‐scale conflicts, as well as their causes and effects, is an important aspect of socio‐political analysis. Since event data related to major conflicts are usually obtained from different sources, researchers developed a semi‐automatic matching algorithm to integrate event data of different origins into one comprehensive dataset using hierarchical taxonomies. The validity of the corresponding integration results is not easy to assess since the results depend on user‐defined input parameters and the relationships between the original data sources. However, only rudimentary visualization techniques have been used so far to analyze the results, allowing no trustworthy validation or exploration of how the final dataset is composed. To overcome this problem, we developed VEHICLE, a web‐based tool to validate and explore the results of the hierarchical integration. For the design, we collaborated with a domain expert to identify the underlying domain problems and derive a task and workflow description. The tool combines both traditional and novel visual analysis techniques, employing statistical and map‐based depictions as well as advanced interaction techniques. We showed the usefulness of VEHICLE in two case studies and by conducting an evaluation together with conflict researchers, confirming domain hypotheses and generating new insights.
... [39]: This layout places the vertices on a plane by simulating a physical model of springs. [40]: This is a tree-like layout and is suitable for trees, hierarchies, or graphs without many cycles. ...
Article
Full-text available
The Network Makeup Artist (NORMA) is a web tool for interactive network annotation visualization and topological analysis, able to handle multiple networks and annotations simultaneously. Precalculated annotations (e.g., Gene Ontology, Pathway enrichment, community detection, or clustering results) can be uploaded and visualized in a network, either as colored pie-chart nodes or as color-filled areas in a 2D/3D Venn-diagram-like style. In the case where no annotation exists, algorithms for automated community detection are offered. Users can adjust the network views using standard layout algorithms or allow NORMA to slightly modify them for visually better group separation. Once a network view is set, users can interactively select and highlight any group of interest in order to generate publication-ready figures. Briefly, with NORMA, users can encode three types of information simultaneously. These are 1) the network, 2) the communities or annotations of interest, and 3) node categories or expression values. Finally, NORMA offers basic topological analysis and direct topological comparison across any of the selected networks. NORMA service is available at: http://bib.fleming.gr:3838/NORMA, whereas the code is available at: https://github.com/PavlopoulosLab/NORMA.
... Different design options of the same network diagram are used to fulfill different esthetic and usability criteria in an application. These visualization methods have been used in different applications such as social network analysis, bioinformatics, linguistics, economics, chemistry, and computer network diagrams (Gorko et al., 2018; Herman et al., 2000; Jenny et al., 2017, 2018; Purchase et al., 2008; Reingold & Tilford, 1981). Flow map visualization models, on the other hand, have been mainly used in cartography (Phan et al., 2005). ...
Article
Full-text available
In multiple watershed planning and design problems, such as conservation planning, quantitative estimates of costs, and environmental benefits of proposed conservation decisions may not be the only criteria that influence stakeholders' preferences for those decisions. Their preferences may also be influenced by the conservation decision itself: specifically, the type of practice, where it is being proposed, existing biases, and previous experiences with the practice. While human‐in‐the‐loop type search techniques, such as Interactive Genetic Algorithms (IGA), provide opportunities for stakeholders to incorporate their preferences in the design of alternatives, examination of user‐preferred conservation design alternatives for patterns in Decision Space can provide insights into which local decisions have higher or lower agreement among stakeholders. In this paper, we explore and compare spatial patterns in conservation decisions (specifically involving cover crops and filter strips) within design alternatives generated by IGA and noninteractive GA. Methods for comparing patterns include nonvisual as well as visualization approaches, including a novel visual analytics technique. Results for the study site show that user‐preferred designs generated by all participants had strong bias for cover crops in a majority (50%–83%) of the subbasins. Further, exploration with heat map visualization indicates that IGA‐based search yielded very different spatial patterns of user‐preferred decisions in subbasins in comparison to decisions within design alternatives that were generated without the human‐in‐the‐loop. Finally, the proposed coincident‐nodes, multiedge graph visualization was helpful in visualizing disagreement among participants in local subbasin scale decisions, and for visualizing spatial patterns in local subbasin scale costs and benefits.
... The graph visualisations in this paper were produced with the igraph library [2] of the R programming language. We visualised the merged STN models using the Reingold-Tilford [11] layout algorithm, which is specially suited for drawing trees (graphs without layout cycles). It generates a layout where vertices are organised into layers based on their geodesic distance (path length) from a chosen root vertex, which in our case are the start of trajectories. ...
Chapter
Full-text available
NeuroEvolution of Augmenting Topologies (NEAT) is a system for evolving neural network topologies along with weights that has proven highly effective and adaptable for solving challenging reinforcement learning tasks. This paper analyses NEAT through the lens of Search Trajectory Networks (STNs), a recently proposed visual approach to study the dynamics of evolutionary algorithms. Our goal is to improve the understanding of neuroevolution systems. We present a visual and statistical analysis contrasting the behaviour of NEAT, with and without using the crossover operator, when solving the two benchmark problems outlined in the original NEAT article: XOR and double-pole balancing. Contrary to what is reported in the original NEAT article, our experiments without crossover perform significantly better in both domains.
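The layering idea mentioned in the citing excerpt above (vertices organised into layers by their geodesic distance from a chosen root) can be illustrated with a minimal Python sketch. This is not igraph's Reingold-Tilford implementation; the function name `layer_by_distance` and the adjacency-dictionary input are assumptions made for illustration, and horizontal positions are simply assigned in discovery order within each layer.

```python
from collections import deque


def layer_by_distance(adjacency, root):
    """Hypothetical helper: return {vertex: (x, y)} where y is the
    breadth-first (geodesic) distance from root and x is the order in
    which the vertex was discovered within its layer."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        v = queue.popleft()
        for w in adjacency.get(v, []):
            if w not in depth:  # first visit gives the shortest path length
                depth[w] = depth[v] + 1
                queue.append(w)

    next_x = {}      # next free horizontal slot per layer
    positions = {}
    for v, d in depth.items():  # dicts keep insertion (discovery) order
        x = next_x.get(d, 0)
        positions[v] = (x, d)
        next_x[d] = x + 1
    return positions


# Small example tree, given as parent -> children lists.
tree = {"a": ["b", "c"], "b": ["d"], "c": [], "d": []}
layout = layer_by_distance(tree, "a")
```

Vertices unreachable from the root receive no position in this sketch; a fuller layout would also need to space subtrees so that edges do not cross.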
... In VICTOR we also visualize similarity metrics as weighted networks after applying a force directed layout algorithm such as Fruchterman-Reingold [67], Reingold-Tilford [68] and Davidson-Harel [69]. If no cutoff-threshold is selected then the drawn graph is fully connected (complete graph/clique). ...
Preprint
Full-text available
Clustering is the process of grouping together different data objects based on similar properties. Clustering has applications in various case studies from several fields such as graph theory, image analysis, pattern recognition, statistics and others. Nowadays, there are numerous algorithms and tools able to generate clustering results. However, different algorithms or parameterization may result in very different clusters. This way, the user is often forced to manually filter and compare these results in order to decide which of them produce the ideal clusters. To automate this process, in this study, we present VICTOR, the first fully interactive and dependency-free visual analytics web application which allows the comparison and visualization of various clustering algorithms. VICTOR can handle multiple clustering results simultaneously and compare them using ten different metrics. Clustering results can be filtered and compared to each other with the use of interactive heatmaps, bar plots, correlation networks, sankey and circos plots. We demonstrate VICTOR's functionality using three examples. In the first case, we compare five different algorithms on a protein-protein interaction dataset whereas in the second example, we test four different parameters of the same clustering algorithm applied on the same dataset. Finally, as a third example, we compare four different meta-analyses with hierarchically clustered differentially expressed genes found to be involved in myocardial infarction. VICTOR is available at http://bib.fleming.gr:3838/VICTOR
... Balance comparison can be supported by a tree visualization encoding that can be aligned by a central axis so that the height/width of all subtrees can be compared [93]. For instance, the Reingold & Tilford [94] node-link diagram factors in the comparison between the left and right sides of the tree and provides a symmetric tree visualization. This distinction of specific target and its attribute provides strong evidence to support the final encoding choice for the visualization. ...
Article
Full-text available
In the field of information visualization, the concept of tasks is an essential component of theories and methodologies for how a visualization researcher or a practitioner understands what tasks a user needs to perform and how to approach the creation of a new design. In this paper, we focus on the collection of tasks for tree visualizations, a common visual encoding in many domains ranging from biology to computer science to geography. In spite of their commonality, no prior efforts exist to collect and abstractly define tree visualization tasks. We present a literature review of tree visualization papers and generate a curated dataset of over 200 tasks. To enable effective task abstraction for trees, we also contribute a novel extension of the Multi-Level Task Typology to include more specificity to support tree-specific tasks as well as a systematic procedure to conduct task abstractions for tree visualizations. All tasks in the dataset were abstracted with the novel typology extension and analyzed to gain a better understanding of the state of tree visualizations. These abstracted tasks can benefit visualization researchers and practitioners as they design evaluation studies or compare their analytical tasks with ones previously studied in the literature to make informed decisions about their design. We also reflect on our novel methodology and advocate more broadly for the creation of task-based knowledge repositories for different types of visualizations. The Supplemental Material will be maintained on OSF: https://osf.io/u5ehs/
... It is a simple layout generator that places one vertex in the center of a circle and the rest of the vertices equidistantly on the perimeter. • Reingold-Tilford (21): It is a tree-like layout more suitable for trees, hierarchies and graphs without many cycles. ...
Preprint
Full-text available
Efficient integration and visualization of heterogeneous biomedical information in a single view is a key challenge. In this study, we present Arena3Dweb, the first, fully interactive and dependency-free, web application which allows the visualization of multilayered graphs in 3D space. With Arena3Dweb, users can integrate multiple networks in a single view along with their intra- and inter-layer connections. For clearer and more informative views, users can choose between a plethora of layout algorithms and apply them on a set of selected layers either individually or in combination. Users can align networks and highlight node topological features, whereas each layer as well as the whole scene can be translated, rotated and scaled in 3D space. User-selected edge colors can be used to highlight important paths, while node positioning, coloring and resizing can be adjusted on-the-fly. In its current version, Arena3Dweb supports weighted and unweighted undirected graphs and is written in R, Shiny and JavaScript. We demonstrate the functionality of Arena3Dweb using two different use-case scenarios; one regarding drug repurposing for SARS-CoV-2 and one related to GPCR signaling pathways implicated in melanoma. Arena3Dweb is available at http://bib.fleming.gr:3838/Arena3D.
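The simple layout described in the excerpt above (one vertex at the center of a circle, the remaining vertices spaced equidistantly on the perimeter) can be sketched in a few lines of Python. This is an illustrative sketch only, not the code used by Arena3Dweb or igraph; the function name `star_layout` and its `center` and `radius` parameters are hypothetical.

```python
import math


def star_layout(vertices, center=None, radius=1.0):
    """Hypothetical helper: place `center` at the origin and every other
    vertex at equal angular steps on a circle of the given radius."""
    vertices = list(vertices)
    if not vertices:
        return {}
    if center is None:
        center = vertices[0]  # assumption: default to the first vertex
    others = [v for v in vertices if v != center]

    positions = {center: (0.0, 0.0)}
    for i, v in enumerate(others):
        angle = 2 * math.pi * i / len(others)
        positions[v] = (radius * math.cos(angle), radius * math.sin(angle))
    return positions


pos = star_layout(["hub", "n1", "n2", "n3", "n4"])
```

With four outer vertices, the angular step is a quarter turn, so each outer vertex sits at distance `radius` from the hub.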
... Since the backbone is a binary tree, its mapping requires less computation. In this paper, we use two different strategies to make a better use of the available space, and to improve vertices' distance preservation: the radial layout algorithm [36] and an adaptation of the H-tree algorithm [37]. ...
Article
Full-text available
Graph visualization has been successfully applied in a wide range of problems and applications. Although different approaches are available to create visual representations, most of them suffer from clutter when faced with many nodes and/or edges. Among the techniques that address this problem, edge bundling has attained relative success in improving node-link layouts by bending and aggregating edges. Despite their success, most approaches perform the bundling based only on visual space information. There is no explicit connection between the produced bundled visual representation and the underlying data (edges or vertices attributes). In this paper, we present a novel edge bundling technique, called Similarity-Driven Edge Bundling (SDEB), to address this issue. Our method creates a similarity hierarchy based on a multilevel partition of the data, grouping edges considering the similarity between nodes to guide the bundling. The novel features introduced by SDEB are explored in different application scenarios, from dynamic graph visualization to multilevel exploration. Our results attest that SDEB produces layouts that consistently follow the similarity relationships found in the graph data, resulting in semantically richer presentations that are less cluttered than the state-of-the-art.
... In theory any graph layout algorithm could be used but for the clustree package we have made use of the two algorithms specifically designed for tree structures available in the igraph package [13]. These are the Reingold-Tilford tree layout, which places parent nodes above their children [14], and the Sugiyama layout which places nodes of a directed acyclic graph in layers while minimising the number of crossing edges [15]. Both of these algorithms can produce attractive layouts and as such we have not found the need to design a specific layout algorithm for clustering trees. ...
Preprint
Full-text available
Clustering techniques are widely used in the analysis of large data sets to group together samples with similar properties. For example, clustering is often used in the field of single-cell RNA-sequencing in order to identify different cell types present in a tissue sample. There are many algorithms for performing clustering and the results can vary substantially. In particular, the number of groups present in a data set is often unknown and the number of clusters identified by an algorithm can change based on the parameters used. To explore and examine the impact of varying clustering resolution we present clustering trees. This visualisation shows the relationships between clusters at multiple resolutions allowing researchers to see how samples move as the number of clusters increases. In addition, meta-information can be overlaid on the tree to inform the choice of resolution and guide in identification of clusters. We illustrate the features of clustering trees using a series of simulations as well as two real examples, the classical iris dataset and a complex single-cell RNA-sequencing dataset. Clustering trees can be produced using the clustree R package available from CRAN ( https://CRAN.R-project.org/package=clustree ) and developed on GitHub ( https://github.com/lazappi/clustree ).
... Other layout algorithms could be used; for example, when there are no cycles or very few cycles, a tree-layout algorithm can be useful. Example visualisations using the Reingold-Tilford [43] tree layout algorithm for the MCSP problem can be found in the accompanying document on supplementary material. Force-directed layout algorithms are based on physical analogies and do not rely on any assumptions about the structure of the networks. ...
Article
Full-text available
We present a comparative analysis of two hybrid algorithms for solving combinatorial optimisation problems. The first one is a specific variant of an established family of techniques known as large neighbourhood search (LNS). The second one is a much more recent algorithm known as construct, merge, solve & adapt (CMSA). Both approaches generate, in different ways, reduced sub-instances of the tackled problem instance at each iteration. The experimental analysis is conducted on two NP-hard combinatorial subset selection problems: the multidimensional knapsack problem and minimum common string partition. The results support the intuition that CMSA has advantages over the LNS variant in the context of problems for which solutions contain rather few items. Moreover, they show that the opposite may be the case for problems in which solutions contain rather many items. The analysis is supported by a new way of visualising the trajectories of the compared algorithms in terms of merged monotonic local optima networks.
... For each sink, from top to bottom, we trace the lower contour of each class, and make sure that the preceding classes are shifted such that a minimum separation guarantees they do not overlap. This is similar to the shifting of subtrees in the tree-drawing algorithm of Reingold and Tilford [4]. ...
Preprint
We point out two flaws in the algorithm of Brandes and Köpf (Proc. GD 2001), which is often used for the horizontal coordinate assignment in Sugiyama's framework for layered layouts. One of them has been noted and fixed multiple times, the other has not been documented before and requires a non-trivial adaptation. On the bright side, neither running time nor extensions of the algorithm are affected adversely.
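The citing snippet above compares its contour-based shifting to the way Reingold and Tilford push subtrees apart. A minimal sketch of that separation step follows. It assumes each contour is given as a plain list of x-coordinates indexed by depth; the function name `required_shift` and the parameter `min_sep` are illustrative and do not come from either paper.

```python
def required_shift(left_right_contour, right_left_contour, min_sep=1.0):
    """Smallest non-negative shift of the right subtree that keeps it clear.

    left_right_contour[d]  : x of the rightmost node of the left subtree at depth d
    right_left_contour[d]  : x of the leftmost node of the right subtree at depth d
    Only depths present in both subtrees can collide, so zip() stops at the
    shallower of the two.
    """
    shift = 0.0
    for left_x, right_x in zip(left_right_contour, right_left_contour):
        # After shifting, we need right_x + shift >= left_x + min_sep.
        shift = max(shift, left_x + min_sep - right_x)
    return shift


# Left subtree widens to the right with depth; right subtree leans left.
print(required_shift([0.0, 1.0, 2.0], [0.0, -1.0]))  # 3.0
```

The sketch never shifts the right subtree to the left (the shift starts at zero); a full layout would also centre the parent over its children and propagate shifts lazily, both of which are omitted here.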
... For example, we can use any layout for the low-stretch tree backbone. Figure 5a and Figure 5b show a proof-of-concept viewer using two popular layout approaches: the standard force-directed graph layout that is built into D3 [9], and the standard Reingold-Tilford tree drawing approach [40]. Future work could use any previously proposed or custom-designed tree drawing approach, since the LSQT backbone is itself a tree. ...
Preprint
We introduce low-stretch trees to the visualization community with LSQT, our novel technique that uses quasi-trees for both layout and edge bundling. Our method offers strong computational speed and complexity guarantees by leveraging the convenient properties of low-stretch trees, which accurately reflect the topological structure of arbitrary graphs with superior fidelity compared to arbitrary spanning trees. Low-stretch quasi-trees also have provable sparseness guarantees, providing algorithmic support for aggressive de-cluttering of hairball graphs. LSQT does not rely on previously computed vertex positions and computes bundles based on topological structure before any geometric layout occurs. Edge bundles are computed efficiently and stored in an explicit data structure that supports sophisticated visual encoding and interaction techniques, including dynamic layout adjustment and interactive bundle querying. Our unoptimized implementation handles graphs of over 100,000 edges in eight seconds, providing substantially higher performance than previous approaches.
... "dr" Large graph layout "layout.lgl" "lg" (1991); ka, Kamada and Kawai (1989); ge, Frick et al. (1995); da, Newman (2006); md, Cox and Cox (2001); re, Reingold and Tilford (1981); su, Sugiyama et al. (1981); dr, Martin et al. (2008); lg, Martin et al. (2008). ...
Article
Full-text available
The aim of the R package netCoin is to explore data structures using a number of statistical techniques that share the handling of interdependent variables. The main objective of this analysis is to detect events, characters, objects, attributes or characteristics that tend to appear together within a given set of scenarios. Its most notable feature is the combination of traditional multivariate statistical analysis and network analysis supported by topological graph theory. In addition, netCoin produces HTML graphs using the D3.js visualization library to support the interactive exploration of networked data. Among its many applications, netCoin can be used for the analysis of multiple responses in questionnaires to explore relevant associations, for the development of textual networks, for the study of ecological communities, for audience analysis, for mining large databases or for basket market analysis.
... Five different layouts can be used to visualize the resulting spanning tree. A tree layout based on Reingold and Tilford's tidy drawing algorithm (27) shows the nodes in hierarchical order, where the leaf nodes of the tree are placed on the left of the graph and the root node is placed on the right. In this way, the graph shows the metabolic flux toward the root node from left to right. ...
Article
Full-text available
Next-generation sequencing has paved the way for the reconstruction of genome-scale metabolic networks as a powerful tool for understanding metabolic circuits in any organism. However, the visualization and extraction of knowledge from these large networks comprising thousands of reactions and metabolites is a current challenge in need of user-friendly tools. Here we present Fluxer (https://fluxer.umbc.edu), a free and open-access novel web application for the computation and visualization of genome-scale metabolic flux networks. Any genome-scale model based on the Systems Biology Markup Language can be uploaded to the tool, which automatically performs Flux Balance Analysis and computes different flux graphs for visualization and analysis. The major metabolic pathways for biomass growth or for biosynthesis of any metabolite can be interactively knocked-out, analyzed and visualized as a spanning tree, dendrogram or complete graph using different layouts. In addition, Fluxer can compute and visualize the k-shortest metabolic paths between any two metabolites or reactions to identify the main metabolic routes between two compounds of interest. The web application includes >80 whole-genome metabolic reconstructions of diverse organisms from bacteria to human, readily available for exploration. Fluxer enables the efficient analysis and visualization of genome-scale metabolic models toward the discovery of key metabolic pathways.
... Tree representations have been widely used to represent hierarchical information. The first intuitive approaches were based on node-link representations like the one of Reingold and Tilford [33]. However, these kinds of layouts require a lot of space and it is difficult to visualize large tree structures. ...
Article
Data mining techniques allow users to discover novelty in huge amounts of data. Frequent pattern methods have proved to be efficient, but the extracted patterns are often too numerous and thus difficult for end users to analyse. In this paper, we focus on sequential pattern mining and propose a new visualization system, which aims at helping end users to analyse extracted knowledge and to highlight the novelty according to referenced biological document databases. Our system is based on two visualization techniques: clouds and solar systems. We show that these techniques are very helpful for identifying associations and hierarchical relationships between patterns among related documents. Sequential patterns extracted from gene data using our system were successfully evaluated by two biology laboratories working on Alzheimer's disease and cancer.
... In [25], each subgraph of the story is given (each subgraph is a tree, whereas the entire graph may be arbitrary), each object can have an arbitrary lifetime, the model is off-line, and the vertices can move. Aesthetic criteria as in the classical Reingold-Tilford algorithm [22] are pursued. ...
... • Reingold-Tilford [38] : This is a tree-like layout and is suitable for trees or graphs without many cycles. • LGL : A force directed layout suitable for larger graphs. ...
Preprint
Full-text available
NORMA is a web tool for interactive network annotation visualization and topological analysis, able to handle multiple networks and annotations simultaneously. Precalculated annotations (e.g. Gene Ontology/Pathway enrichment or clustering results) can be uploaded and visualized in a network either as colored pie-chart nodes or as color-filled convex hulls in a Venn-diagram-like style. In the case where no annotation exists, algorithms for automated community detection are offered. Users can adjust the network views using standard layout algorithms or allow NORMA to slightly modify them for visually better group separation. Once a network view is set, users can interactively select and highlight any group of interest in order to generate publication-ready figures. Briefly, with NORMA, users can encode three types of information simultaneously. These are: i) the network, ii) the communities or annotations and iii) node categories or expression values. Finally, NORMA offers basic topological analysis and direct topological comparison across any of the selected networks. NORMA service is available at: http://bib.fleming.gr:3838/NORMA or http://genomics-lab.fleming.gr:3838/NORMA. Code is available at: https://github.com/PavlopoulosLab/NORMA
... The user must manually attempt to find the best position for symbols in order to prevent overlapping and perplexing criss-crossing of curved lines. Previous research recommended that the number of edge crossings in drawings should be minimised [33]. For these reasons, curved connector lines do not appear to be the proper solution. ...
Article
Full-text available
This article presents an evaluation of the QGIS Processing Modeler from the point of view of effective cognition. The QGIS Processing Modeler uses visual programming language for workflow design. The functionalities of the visual component and the visual vocabulary (set of symbols and line connectors) are both important. The form of symbols affects how workflow diagrams may be understood. The article discusses the results of assessing the Processing Modeler’s visual vocabulary in QGIS according to the Physics of Notations theory. The article evaluates visual vocabularies from the older QGIS 2.x and newer 3.x versions. The paper identifies serious design flaws in the Processing Modeler. Applying the Physics of Notations theory resulted in certain practical recommendations, such as changing the fill colour of symbols, increasing the size and variety of inner icons, removing functional icons, and using a straight connector line instead of a curved line. Another recommendation was to provide a supplemental preview window for the entire model in order to improve user navigation in huge models. Objective eye-tracking measurements validated some results of the evaluation using the Physics of Notations. The respondents read workflows to solve different tasks and their gazes were tracked. Evaluation of the eye-tracking metrics revealed the respondents’ reading patterns of the diagram. Evaluation using both Physics of Notation theory and eye-tracking measurements inspired recommendations for improving visual notation. A set of recommendations for users is also given, which can be applied easily in practice using a contemporary visual notation.
Chapter
In a graph story the vertices enter a graph one at a time and each vertex persists in the graph for a fixed amount of time ω, called viewing window. At any time, the user can see only the drawing of the graph induced by the vertices in the viewing window and this determines a sequence of drawings. For readability, we require that all the drawings of the sequence are planar. For preserving the user’s mental map we require that when a vertex or an edge is drawn, it has the same drawing for its entire life. We study the problem of drawing the entire sequence by mapping the vertices only to ω+k given points, where k is as small as possible. We show that: (i) The problem does not depend on the specific set of points but only on its size; (ii) the problem is NP-hard and is FPT when parameterized by ω+k; (iii) there are families of graph stories that can be drawn with k=0 for any ω, while for k=0 and small values of ω there are families of graph stories that can be drawn and others that cannot; (iv) there are families of graph stories that cannot be drawn for any fixed k and families of graph stories that require at least a certain k.
Article
Extracting color palettes from an image is a common practice used by artists in different visual domains. In this study, we introduce a novel tool for extracting color palettes from an image. Based on the hierarchical color model (HCM) (Jeong et al. (2019)), we develop a prototype user interface system comprising novel interactions and visualization. Our visualization is a node-link diagram tailored for the HCM; the accompanying interactions originate from the hierarchical structure. We evaluate our prototype system by performing a usability test and comparing it with contemporary alternatives for professional usage. The results from the user study validate the effectiveness of the presented approach. We also find a few user requirements that can be useful in the further development of the related tools. Moreover, we expect that the proposed interactive visualization can facilitate additional studies on the HCM. The prototype is available in the following URL: https://int-vis-hcm.web.app.
Article
Negative symptoms, particularly the motivation and pleasure (MAP) deficits, are associated with impaired social functioning in patients with schizophrenia (SCZ). However, previous studies seldom examined the role of the MAP on social functioning while accounting for the complex interplay between other psychopathology. This network analysis study examined the network structure and interrelationship between negative symptoms (at the "symptom-dimension" and "symptom-item" levels), other psychopathology and social functioning in a sample of 269 patients with SCZ. The psychopathological symptoms were assessed using the Clinical Assessment Interview for Negative Symptoms (CAINS) and the Positive and Negative Syndrome Scale (PANSS). Social functioning was evaluated using the Social and Occupational Functioning Assessment Scale (SOFAS). Centrality indices and relative importance of each node were estimated. The network structures between male and female participants were compared. Our resultant networks at both the "symptom-dimension" and the "symptom-item" levels suggested that the MAP factor/its individual items were closely related to social functioning in SCZ patients, after controlling for the complex interplay between other nodes. Relative importance analysis showed that MAP factor accounted for the largest proportion of variance of social functioning. This study is among the few which used network analysis and the CAINS to examine the interrelationship between negative symptoms and social functioning. Our findings supported the pivotal role of the MAP factor to determine SCZ patients' social functioning, and as a potential intervention target for improving functional outcomes of SCZ.
Article
We introduce a novel visual-interactive approach for analyzing, understanding, and correcting neural machine translation. Our system supports users in automatically translating documents using neural machine translation and identifying and correcting possible erroneous translations. User corrections can then be used to fine-tune the neural machine translation model and automatically improve the whole document. While translation results of neural machine translation can be impressive, there are still many challenges such as over- and under-translation, domain-specific terminology, and handling long sentences, making it necessary for users to verify translation results. Our system aims at supporting users in this task. Our visual analytics approach combines several visualization techniques in an interactive system. A parallel coordinates plot with multiple metrics related to translation quality can be used to find, filter, and select translations that might contain errors. An interactive beam search visualization and graph- or matrix-based visualizations for attention weights can be used for post-editing and understanding machine-generated translations. The machine translation model is updated from user corrections to improve the translation quality of the whole document. We designed our approach for an LSTM-based translation model and extended it to also include the Transformer architecture. We show for representative examples possible mistranslations and how to use our system to deal with them. A user study revealed that many participants favor such a system over manual text-based translation, especially for translating large documents. Furthermore, we performed quantitative computer-based experiments that show that our system can be used to improve translation quality and reduce post-editing efforts for domain-specific documents.
Chapter
Most algorithms on trees require a systematic method of visiting the nodes of a tree. The most common methods of exploring a tree are the preorder, the postorder, the top-down, and the bottom-up traversal. In a preorder traversal of a rooted tree, parents are visited before children, and siblings are visited in left-to-right order. In a postorder traversal of a rooted tree, children are visited before parents, and siblings are visited in left-to-right order. In a top-down traversal of a rooted tree, also known as level-order traversal, nodes are visited in order of non-decreasing depth. In a bottom-up traversal of a tree, not necessarily rooted, nodes are visited in order of non-decreasing height. These systematic methods of visiting the nodes of a tree are the subject of this chapter.
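The four traversal orders named in this abstract can be sketched as follows. The sketch assumes a rooted tree stored as a dictionary from each node to its ordered list of children; the tree `TREE` and all function names are illustrative. The bottom-up version here works on a rooted tree only, whereas the abstract notes that the tree need not be rooted.

```python
from collections import deque

# Hypothetical rooted tree: each node maps to its children, left to right.
TREE = {"a": ["b", "c"], "b": ["d", "e"], "c": ["f"], "d": [], "e": [], "f": []}


def preorder(tree, root):
    """Parents before children, siblings left to right."""
    order = [root]
    for child in tree[root]:
        order.extend(preorder(tree, child))
    return order


def postorder(tree, root):
    """Children before parents, siblings left to right."""
    order = []
    for child in tree[root]:
        order.extend(postorder(tree, child))
    order.append(root)
    return order


def top_down(tree, root):
    """Level order: nodes in order of non-decreasing depth."""
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(tree[node])
    return order


def bottom_up(tree, root):
    """Nodes in order of non-decreasing height (leaves have height 0)."""
    height = {}
    for node in postorder(tree, root):  # children are measured before parents
        height[node] = 1 + max((height[c] for c in tree[node]), default=-1)
    return sorted(height, key=height.get)  # stable sort keeps ties in postorder


print(preorder(TREE, "a"))   # ['a', 'b', 'd', 'e', 'c', 'f']
print(postorder(TREE, "a"))  # ['d', 'e', 'b', 'f', 'c', 'a']
print(top_down(TREE, "a"))   # ['a', 'b', 'c', 'd', 'e', 'f']
print(bottom_up(TREE, "a"))  # ['d', 'e', 'f', 'b', 'c', 'a']
```

The recursive versions are kept for readability; very deep trees would need an explicit stack to stay within Python's recursion limit.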
Article
Full-text available
In this paper, an overview-based interactive visualization for temporally long dynamic data sequences is described. To reach this goal, each data object at a certain time point can be mapped to a number value based on a given property. The property is application-dependent: for dynamic graph data it can be, among others, the number of vertices, number of edges, average degree, density, number of self-loops, degree (maximum and total), or edge weight (minimum, maximum, and total), but it can equally be the number of ball contacts in a football match, or the time-dependent visual attention paid to a stimulus in an eye tracking study. To achieve an overview over time, an aggregation strategy based on either the mean, minimum, or maximum of two values is applied. This temporal value aggregation generates a triangular shape with an overview of the entire data sequence as the peak. The color coding can be adjusted, forming visual patterns that can be rapidly explored for certain data features over time, supporting comparison tasks between the properties. The usefulness of the approach is illustrated by means of applying it to dynamic graphs generated from US domestic flight data as well as to dynamic Covid-19 infections on country levels.
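One way to read the temporal value aggregation described in this abstract is sketched below. The abstract does not say which two values are combined at each step, so the sketch assumes that every row combines adjacent pairs of the row beneath it; the names `aggregation_triangle` and `mean2` are illustrative.

```python
def mean2(a, b):
    """Mean of two values."""
    return (a + b) / 2


def aggregation_triangle(values, combine=max):
    """Build rows from the raw sequence (base) up to a single peak value.

    Assumption: each row combines adjacent pairs of the previous row, so a
    sequence of n values yields rows of length n, n-1, ..., 1.
    """
    if not values:
        raise ValueError("values must not be empty")
    rows = [list(values)]
    while len(rows[-1]) > 1:
        prev = rows[-1]
        rows.append([combine(a, b) for a, b in zip(prev, prev[1:])])
    return rows


for row in aggregation_triangle([3, 1, 4, 1], combine=max):
    print(row)
# [3, 1, 4, 1]
# [3, 4, 4]
# [4, 4]
# [4]
```

With `min` or `max` the peak equals the minimum or maximum of the whole sequence. With `mean2` under this adjacent-pair assumption, the peak is a weighted mean that favours the middle of the sequence rather than the plain average.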
Article
In this article we present the Network Analysis Profiler (NAP v2.0), a web tool to directly compare the topological features of multiple networks simultaneously. NAP is written in R and Shiny and currently offers both 2D and 3D network visualisation, as well as simultaneous visual comparisons of node- and edge-based topological features as bar charts or scatterplot matrix. NAP is fully interactive, and users can easily export and visualise the intersection between any pair of networks using Venn diagrams or a 2D and a 3D multi-layer graph-based visualisation. NAP supports weighted, unweighted, directed, undirected and bipartite graphs.
Article
Clustering is the process of grouping different data objects based on similar properties. Clustering has applications in various case studies from several fields such as graph theory, image analysis, pattern recognition, statistics and others. Nowadays, there are numerous algorithms and tools able to generate clustering results. However, different algorithms or parameterizations may produce quite dissimilar cluster sets. In this way, the user is often forced to manually filter and compare these results in order to decide which of them generate the ideal clusters. To automate this process, in this study, we present VICTOR, the first fully interactive and dependency-free visual analytics web application which allows the visual comparison of the results of various clustering algorithms. VICTOR can handle multiple cluster set results simultaneously and compare them using ten different metrics. Clustering results can be filtered and compared to each other with the use of data tables or interactive heatmaps, bar plots, correlation networks, sankey and circos plots. We demonstrate VICTOR’s functionality using three examples. In the first case, we compare five different network clustering algorithms on a Yeast protein-protein interaction dataset whereas in the second example, we test four different parameters of the MCL clustering algorithm on the same dataset. Finally, as a third example, we compare four different meta-analyses with hierarchically clustered differentially expressed genes found to be involved in myocardial infarction. VICTOR is available at http://victor.pavlopouloslab.info or http://bib.fleming.gr:3838/VICTOR.
Article
Full-text available
Efficient integration and visualization of heterogeneous biomedical information in a single view is a key challenge. In this study, we present Arena3Dweb, the first, fully interactive and dependency-free, web application which allows the visualization of multilayered graphs in 3D space. With Arena3Dweb, users can integrate multiple networks in a single view along with their intra- and inter-layer connections. For clearer and more informative views, users can choose between a plethora of layout algorithms and apply them on a set of selected layers either individually or in combination. Users can align networks and highlight node topological features, whereas each layer as well as the whole scene can be translated, rotated and scaled in 3D space. User-selected edge colors can be used to highlight important paths, while node positioning, coloring and resizing can be adjusted on-the-fly. In its current version, Arena3Dweb supports weighted and unweighted undirected graphs and is written in R, Shiny and JavaScript. We demonstrate the functionality of Arena3Dweb using two different use-case scenarios; one regarding drug repurposing for SARS-CoV-2 and one related to GPCR signaling pathways implicated in melanoma. Arena3Dweb is available at http://bib.fleming.gr:3838/Arena3D or http://bib.fleming.gr/Arena3D.
Chapter
In SODA’99, Chan introduced a simple type of planar straight-line upward order-preserving drawings of binary trees, known as LR drawings: such a drawing is obtained by picking a root-to-leaf path, drawing the path as a straight line, and recursively drawing the subtrees along the paths. Chan proved that any binary tree with n nodes admits an LR drawing with \(O(n^{0.48})\) width. In SODA’17, Frati, Patrignani, and Roselli proved that there exist families of n-node binary trees for which any LR drawing has \(\varOmega (n^{0.418})\) width. In this paper, we improve Chan’s upper bound to \(O(n^{0.437})\) and Frati et al.’s lower bound to \(\varOmega (n^{0.429})\).
Preprint
Full-text available
In this article we present the Network Analysis Profiler (NAP v2.0), a web tool to directly compare the topological features of multiple networks simultaneously. NAP is written in R and Shiny and currently offers both 2D and 3D network visualization as well as simultaneous visual comparisons of node- and edge-based topological features, either as bar charts or as a scatterplot matrix. NAP is fully interactive and users can easily export and visualize the intersection between any pair of networks using Venn diagrams or a 2D and a 3D multi-layer graph-based visualization. NAP supports weighted, unweighted, directed, undirected and bipartite graphs and is available at: http://bib.fleming.gr:3838/NAP/. Its code can be found at: https://github.com/PavlopoulosLab/NAP
Chapter
Automated planning tools are complex pieces of software that take declarative domain descriptions and generate plans from domains and problems. New users often find it challenging to understand the plan generation process, while experienced users often find it difficult to track semantic errors and efficiency issues. In response, we develop a cloud-based planning tool with code editing and state-space visualization capabilities that simplifies this process. The code editor focuses on visualizing the domain, problem, and resulting sample plan, helping the user see how such descriptions are connected without changing context. The visualization tool explores two alternative visualizations aimed at illustrating the operation of the planning process and how the domain dynamics evolve during plan execution.
Article
Full-text available
Trees are extremely common data structures, both as internal objects and as models for program output. But it is unusual to see a program actually draw trees for visual inspection. Although part of the difficulty lies in programming graphics devices, most of the problem arises because naive algorithms to draw trees use too much drawing space and sophisticated algorithms are not obvious. We survey two naive tree drawers, formalize aesthetics for tidy trees, and describe two algorithms which draw tidy trees. One of the algorithms may be shown to require the minimum possible paper width. Along with the algorithms proper, we discuss the reasoning behind the algorithm development.
Article
A pleasing layout of printed tree structures is difficult to achieve automatically. The paper points out three main problems. First, the horizontal position of a node on the printed page depends on global consideration of the position of other nodes; secondly, the physical characteristics of printers require scanning the tree in left-to-right top-to-bottom sequence; finally, page overflow for wide trees must be handled. These problems are illustrated by analysing the shortcomings of a simple printing algorithm. A suitable general binary tree printing algorithm is presented and its adaptation to other types of trees is shown.
Article
We investigate the complexity of producing aesthetically pleasing drawings of binary trees, drawings that are as narrow as possible. The notion of what is aesthetically pleasing is embodied in several constraints on the placement of nodes, relative to other nodes. Among the results we give are: (1) There is no obvious “principle of optimality” that can be applied, since globally narrow, aesthetic placements of trees may require wider than necessary subtrees. (2) A previously suggested heuristic can produce drawings on n-node trees that are Θ(n) times as wide as necessary. (3) The problem can be reduced in polynomial time to linear programming; hence, if the coordinates assigned to the nodes are continuous variables, then the problem can be solved in polynomial time. (4) If the placement is restricted to the integral lattice then the problem is NP-hard, as is its approximation to within a factor of about 4 per cent.
Tree drawing algorithms
  • J S Tilford
J. S. Tilford, "Tree drawing algorithms," M.S. thesis, Dep. Comput. Sci., Univ. Illinois, Urbana, IL, Rep. UIUC DCS-R-81-1055, 1981.
Tidy drawings of trees
  • C Wetherell
  • A Shannon
C. Wetherell and A. Shannon, "Tidy drawings of trees," IEEE Trans. Software Eng., vol. SE-5, pp. 514-520, 1979.
Edward M. Reingold received the B.S. degree from the Illinois Institute of Technology, Chicago, IL, and the M.S. and Ph.D. degrees from Cornell University, Ithaca, NY. He is currently an Associate Professor of Computer Science at the University of Illinois, Urbana-Champaign. His areas of specialization are data structures and the analysis of algorithms.
John S. Tilford received the B.A. degree from DePauw University, Greencastle, IN, in 1977. He is currently a Ph.D. candidate in computer science at the University of Illinois, Urbana-Champaign. His interests include data structures and relational database design theory.