Visualizing and Assessing Navigation in Hypertext

John E. McEneaney

Division of Education

Indiana University South Bend

South Bend, IN 46634-7111, USA

Tel/Fax: 219-237-4576 / 219-237-4550

Email: JMcEneaney@IUSB.edu

ABSTRACT

User navigation has been a central theme in both theoretical

and empirical work since the earliest days of hypertext

research and development. Studies exploring user

navigation have, however, tended to rely on indirect

navigational measures and have rarely tried to relate

navigation to performance solving problems or locating

information. The purpose of this paper is to propose

methods that lead to a more direct representation and

analysis of user movement in hypertext and to empirically

explore the relationship of resulting measures to

performance in a hypertext search task. Results of this study

support the claim that the proposed graphical and numerical

methods have empirical significance and may be useful in

applications related to assessing and modeling user

navigation.

KEYWORDS: visualization; user paths; path analysis;

navigation patterns; navigation metrics; empirical

validation.

INTRODUCTION

In response to problems related to hypertext navigation,

researchers and developers have created a variety of

powerful tools, many of them based on visualization

techniques. Site maps are now commonly used and there is

evidence that users find them helpful in navigating and in

establishing a clearer idea of the organizational structure of

a site [8, 32]. In larger networks where complete site maps

are impractical, fish-eye views [12, 28, 2], clustering

techniques that organize nodes into meaningful groups [13,

21], and a variety of other filtering and mapping techniques

(e.g. [15, 22]) have been developed to assist both in

development and use of large scale hypertext networks.

Permission to make digital or hard copies of all or part of this work for

personal or classroom use is granted without fee provided that copies are not

made or distributed for profit or commercial advantage and that copies bear

this notice and the full citation on the first page. To copy otherwise, to

republish, to post on servers or to redistribute to lists requires prior specific

permission and/or a fee.

Hypertext 99 Darmstadt Germany

Copyright ACM 1999 1-58113-064-3/99/2...$5.00

Numerical measures and metrics have also been

proposed to help assess global properties of hypertext that

may be related to the difficulties users experience. Some of

these metrics [1] have been imported directly from other

disciplines, most notably from social network theory, where

there is a long history of using proximity measures that have

been more recently applied in hypertext research. Other

metrics have been developed to address the specific needs

and perspectives of hypertext researchers (e.g. [3]).

There has also been interest in developing a better

understanding of how users navigate hypertext under

various task and environmental conditions. Some of these

studies have employed “static” measures related to numbers

of nodes or links accessed, measures of time and path length

[29, 26, 19] or analyzed selected episodes of movement,

tabulating navigation within or across sections of a network

[29]. Some investigators have applied statistical techniques

to identify “clusters” of nodes and interpret user navigation

in terms of these constructs [17]. Other statistical

approaches include collapsing data from large numbers of

users into state transition probability tables that are used as

a basis for analysis [7], and identification of statistical

benchmarks that might be useful in developing theoretical

models [26, 23].

Perhaps the most consistent trend in user navigation

research, however, is the use of navigational “paths” as a

primary source of data. Although the concept of a path has

been referred to using a variety of terms including “route”

[4], “user trail” [14], “audit trail” [20] and others, these

terms all refer to data sets that record the sequence of nodes

visited by a subject in a hypertext session and often also

include measures of time related to visits. That user paths

are commonly employed in navigational studies should be

no surprise. Although a time-stamped path misses

deliberations that go into a user’s decision making (e.g. a

pointer hovers momentarily over one link before moving on

to another that is clicked), a path represents the single most

complete measure of user navigation and thus affords an

important window on the search process [17]. Moreover,

since data can be recorded and formatted in an unobtrusive

manner on-the-fly, this approach provides empirical

investigators with a powerful data collection tool that has

the additional benefit of being seamlessly integrated with the

delivery of experimental materials.

For all of the interest in navigational paths, however,

there have been relatively few studies that have sought to

examine the relationship between patterns of navigation and

outcome measures. Two exceptions are studies by Smith

[30] and Cardle [5], both of whom investigated the

relationship of informal measures of search success (i.e.

displays of frustration or confidence by subjects) with a path

efficiency measure. Both of these studies, however, were

based on static navigational measures (e.g. number of pages

visited) that did not attempt to incorporate spatial or

directional aspects of user paths. Other studies that sought

to relate path data to outcome measures have been based on

similarly indirect path measures. Chang and McDaniel [6]

relied on video transcripts of their subjects and

accompanying “think aloud” data to subjectively categorize

navigational patterns. Pirolli, Pitkow, and Rao [25] and

Chen [7] adopted a more quantitative focus on user paths

but employed large-scale aggregate data drawn from web

server statistics. By using these aggregated data sets,

however, the “paths” studied had to be inferred from access

logs and were thus subject to problems related to firewalls,

proxy masking of user identity, intentional reloading of

documents by users, and missed hits as a result of local

browser caches.

Although discrete static measures, informal descriptive

characterizations, and aggregated summarizations of user

navigation are useful starting points, our understanding of

user navigation will be significantly enhanced if user

movement can be more directly represented, analyzed, and

related to performance outcomes. It is the purpose of this

paper to achieve these ends - to define methods that support

more direct representation and analysis of user paths and to

empirically relate resulting measures to hypertext search

outcomes.

In this paper, user paths will be analyzed in two

different ways. One form of analysis is based on a graphical

method intended to illustrate user paths in a way that makes

navigational patterns visually distinct. The second form of

analysis relies on path-specific structural metrics that are

related to the visually distinct categories identified by

graphic analysis. Following definition of these path metrics,

the paper will report on an experimental study that explores

the relationship between the proposed measures and a

quantitative search outcome measure. The paper will

conclude by considering limitations of the proposed methods

and will suggest some potential applications of the proposed

methods and metrics.

VISUALIZING AND ASSESSING NAVIGATION

The framework adopted to visualize and assess user

navigation is based on the traditional node-and-link model.

At least part of the popularity of this model can be attributed

to two simple but powerful formalisms that support analysis:

adjacency matrices that are well suited to computational

analysis and labeled directed graphs (digraphs) that present

structural information in a readily interpreted visual format.

Both of these formalisms have played important roles in

earlier work establishing metrics to assess structure in

hypertext and they serve similar roles in the present study.

The two structural metrics of special interest in the

present study are compactness and stratum [3]. The purpose

of these metrics is to yield global network-based assessments

of structure that are grounded in node-based centrality

measures. Briefly, compactness refers to the overall

connectedness of a network with more sparsely linked

networks resulting in values for compactness close to 0,

while densely connected networks yield compactness closer

to 1 (a continuum labeled “sparse vs. rich” hypertext by

Nielsen [24]). Stratum, on the other hand, refers to the

degree of linearity of a network, as indicated by the extent to

which a network is organized so that certain nodes must be

read before others. Stratum also ranges between 0 and 1,

with more linear networks closer to 1 and more web-like

networks closer to 0.

Although compactness and stratum were originally

developed to assess the structure of hypertext networks,

subsequent work [27] has suggested that these same metrics

might be usefully applied as tools to help users navigate

networks. The present study attempts to broaden the

application of these metrics still further by proposing

adaptations intended to support the analysis of user

movement in hypertext. Before considering how these

metrics can be adapted, however, it will be useful to be more

specific both about the larger conceptual framework that

supports the proposed methods and the concept of a path.

Conceptual Preliminaries

Distance matrices have proven to be particularly useful

for analyzing hypertext networks. Each cell Dij in a distance

matrix identifies the minimum number of steps required to

move from node i to node j. By convention, the distance

from a node to itself is “0” and the distance to unreachable

nodes is infinite. In circumstances where infinite values

complicate subsequent computation, they are replaced with

a conversion constant K (usually set equal to the number of

nodes in the network), resulting in a converted distance

matrix. Sums across rows in the converted distance matrix

($\sum_j C_{ij}$) are referred to as converted out-distances and

represent the centrality of nodes when considered as points

of departure. Sums down columns in the converted distance matrix

($\sum_i C_{ij}$) are referred to as converted in-distances and

represent the centrality of nodes as destinations. The sum

of all entries in the converted matrix is referred to as the

converted distance (CD) of a network and is commonly

A. Path: ⟨6, 30, 6, 37, 6, 15, 16, 6, 21, 6, 21, 30, 6, 23, 24, 6, 35, 6⟩.

                        to
from     6   15   16   21   23   24   30   35   37
   6     0    1    0    2    1    0    1    1    1
  15     0    0    1    0    0    0    0    0    0
  16     1    0    0    0    0    0    0    0    0
  21     1    0    0    0    0    0    1    0    0
  23     0    0    0    0    0    1    0    0    0
  24     1    0    0    0    0    0    0    0    0
  30     2    0    0    0    0    0    0    0    0
  35     1    0    0    0    0    0    0    0    0
  37     1    0    0    0    0    0    0    0    0

B. Path matrix for the path above.

Figure 1. A path across a larger 78-node hypertext (A) and its resulting path matrix (B).

                        to
from     6   15   16   21   23   24   30   35   37
   6     0    1    2    1    1    2    1    1    1
  15     2    0    1    3    3    4    3    3    3
  16     1    2    0    2    2    3    2    2    2
  21     1    2    3    0    2    3    1    2    2
  23     2    3    4    3    0    1    3    3    3
  24     1    2    3    2    2    0    2    2    2
  30     1    2    3    2    2    3    0    2    2
  35     1    2    3    2    2    3    2    0    2
  37     1    2    3    2    2    3    2    2    0

A. Converted distance matrix for the path in Figure 1A.

B. Path diagram for the path in Figure 1A.

Figure 2. The converted distance matrix and path diagram for the path in Figure 1A.

employed to normalize in- and out-distances so that

comparisons can be made between networks that differ in

size or connectedness.
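As an illustration of these definitions, the following Python sketch builds a distance matrix for a small network, converts it, and computes the converted out-distances, converted in-distances, and converted distance. The four-node network, the function names, and the use of the Floyd-Warshall algorithm for shortest paths are assumptions made for this example; the text above specifies only the definitions, not an implementation.

```python
import math

def distance_matrix(adj):
    """All-pairs shortest path lengths (Floyd-Warshall) for a 0/1 adjacency matrix."""
    n = len(adj)
    d = [[0 if i == j else (1 if adj[i][j] else math.inf) for j in range(n)]
         for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

def converted(d, k=None):
    """Replace infinite distances with the conversion constant K (default: node count)."""
    k = len(d) if k is None else k
    return [[k if math.isinf(x) else x for x in row] for row in d]

# Hypothetical network: 0 -> 1 -> 2 -> 0 form a cycle; node 3 links to 0,
# but no node links to 3, so 3 is unreachable from the others.
adj = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
c = converted(distance_matrix(adj))
out_distances = [sum(row) for row in c]        # row sums: nodes as points of departure
in_distances = [sum(col) for col in zip(*c)]   # column sums: nodes as destinations
cd = sum(out_distances)                        # converted distance of the network
```

For this hypothetical network the converted out-distances are 7, 7, 7, 6 and the converted in-distances are 4, 5, 6, 12, giving CD = 27. Node 3, which no other node links to, has by far the largest in-distance because every entry in its column is the conversion constant.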

A path P in a network is simply the sequence of nodes

or pages visited by a reader during a browsing session (i.e.

$P = \langle p_1, p_2, p_3, \ldots, p_L \rangle$, where L refers to the length of the path

or number of node visits). Although this use of the term

“path” deviates from its meaning in the underlying graph-

theoretic framework (since it allows nodes to appear more

than once), this usage has been adopted in recent work [25,

10, 9] and will be employed in the present study as well.

Since nodes can occur in more than one position in a path,

however, the length of a path may not correspond to the

number of distinct nodes visited and this has some important

consequences for both path visualization and the calculation

of metrics based on path matrices.

The Path Matrix And Its Derivatives

Unlike the distance matrix described earlier that

represents distances between nodes, a path matrix represents

frequencies of node transitions during a browsing session

from each node in the path to every other node in the path

[25]. If the hypertext under consideration is closed, the path

matrix can also be normalized by representing every node in

the hypertext as a whole, regardless of whether a node

appears in the path. As a result of this expansion, it is

possible to sum individual user paths into group “paths”.

The normalizing expansion of the path matrix is achieved

by inserting rows and columns filled with zeroes in the

appropriate places in the path matrix so that every node in

the hypertext is represented. The net result of this

expansion is to embed the path within the larger structure of

the hypertext. The structural features of the original path

matrix are, however, preserved while establishing a normal

form for all paths.
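The expansion and summation described above can be sketched in Python as follows. The five-node hypertext, the two user paths, and the function names are hypothetical and chosen only to keep the example small.

```python
def expand(nodes, matrix, all_nodes):
    """Embed a path matrix in a zero-filled matrix covering every node of the hypertext."""
    pos = {node: i for i, node in enumerate(all_nodes)}
    size = len(all_nodes)
    big = [[0] * size for _ in range(size)]
    for i, src in enumerate(nodes):
        for j, dst in enumerate(nodes):
            big[pos[src]][pos[dst]] = matrix[i][j]
    return big

def add_matrices(a, b):
    """Cell-by-cell sum of two matrices of the same order."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

# Hypothetical closed hypertext with five nodes and two hypothetical users.
all_nodes = [1, 2, 3, 4, 5]
user_a = ([1, 2], [[0, 2], [1, 0]])                      # path 1,2,1,2
user_b = ([1, 2, 4], [[0, 1, 0], [0, 0, 1], [1, 0, 0]])  # path 1,2,4,1

group = add_matrices(expand(*user_a, all_nodes), expand(*user_b, all_nodes))
```

Because both expanded matrices share the same order and node labeling, they can be added directly. In the resulting group matrix the transition from node 1 to node 2 has frequency 3, while the rows and columns for the unvisited nodes 3 and 5 remain zero.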

Calculation of the proposed path metrics, like their

structural counterparts, requires a distance matrix and

suitable conversions. Consider, for example, the path in

Figure 1A, consisting of 17 transitions, beginning and

ending at node 6. The path matrix is constructed by

creating a suitably labeled matrix that includes each distinct

node in the path and then incrementing the appropriate cell

for each transition represented. The resulting path matrix

(Figure 1B) indicates the number of transitions from each

node to every other node in the path.
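The construction just described can be sketched in a few lines of Python. The function name and the choice to order rows and columns by ascending node number (which matches the layout of Figure 1B) are assumptions made for illustration.

```python
def path_matrix(path):
    """Count transitions between consecutive nodes of a path.

    Returns the sorted list of distinct nodes and a square matrix whose
    cell [i][j] holds the number of transitions from nodes[i] to nodes[j].
    """
    nodes = sorted(set(path))
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for src, dst in zip(path, path[1:]):
        matrix[index[src]][index[dst]] += 1
    return nodes, matrix

# The path from Figure 1A.
path = [6, 30, 6, 37, 6, 15, 16, 6, 21, 6, 21, 30, 6, 23, 24, 6, 35, 6]
nodes, matrix = path_matrix(path)
```

Applied to the path in Figure 1A, this yields nine distinct nodes and a matrix whose entries sum to the 17 transitions of the path, with the row for node 6 reading 0, 1, 0, 2, 1, 0, 1, 1, 1 as in Figure 1B.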

The distance matrix and converted distance matrix for

a path are created through straightforward adaptations of the

procedures described by Botafogo et al. [3, 27], with one

important difference. Creation of a distance matrix for a

path begins by substituting a value of “1” for all entries in

the original path matrix that exceed one (i.e., that represent

multiple transitions) and then generating a distance matrix

as would be done with an adjacency matrix representing a

hypertext. For path stratum, no further conversion is

needed. The path compactness metric, however, requires

conversion, replacing infinite cell entries with the

conversion constant K, where K equals the number of nodes

in the original path matrix (Figure 2A). The path diagram

is constructed from the original path matrix by creating a

vertex for each node represented and an arc for each non-

zero cell entry (Figure 2B).
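The steps above (collapse multiple transitions to 1, compute shortest distances, then replace infinite entries with K) can be sketched as follows. The study itself used Mathematica routines; this Python version, its function names, and its use of breadth-first search are assumptions made for illustration.

```python
import math
from collections import deque

def path_distance_matrix(pm):
    """Shortest distances between path nodes, treating any nonzero
    transition count as a single link (entries above 1 collapse to 1)."""
    n = len(pm)
    adj = [[1 if count > 0 else 0 for count in row] for row in pm]
    dist = []
    for start in range(n):
        row = [math.inf] * n
        row[start] = 0
        queue = deque([start])
        while queue:  # breadth-first search from `start`
            u = queue.popleft()
            for v in range(n):
                if adj[u][v] and math.isinf(row[v]):
                    row[v] = row[u] + 1
                    queue.append(v)
        dist.append(row)
    return dist

def convert(dist):
    """Replace infinite entries with K, the number of nodes in the path matrix."""
    k = len(dist)
    return [[k if math.isinf(d) else d for d in row] for row in dist]

# Path matrix from Figure 1B (rows and columns: 6, 15, 16, 21, 23, 24, 30, 35, 37).
pm = [
    [0, 1, 0, 2, 1, 0, 1, 1, 1],
    [0, 0, 1, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 0],
    [2, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 0],
]
converted = convert(path_distance_matrix(pm))
```

For this path the result reproduces Figure 2A. Every node is reachable from every other node, so the conversion constant K = 9 is never actually substituted; it would only appear for paths that do not return to earlier nodes.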

Path Metrics

Although the computational procedures involved in

generating path metrics are straightforward manipulations

of the structural metrics they are based on [3, 27], there is an

important difference between path matrices and

corresponding structural matrices. In structural matrices,

every node in the hypertext is represented but this is clearly

not the case in the path matrix presented in Figure 1B,

which only represents nodes actually in the path. While the

expanded path matrix does represent all nodes in the

hypertext it will usually be much larger than the path matrix

and this has significant consequences for both path

visualization and metrics, suggesting that the path matrix

rather than the expanded path matrix should serve as the

basis for subsequent visualization and calculation.

With respect to visualization, use of the expanded path

matrix will tend to crowd the display of a path with large

numbers of isolated (i.e. unconnected) nodes that do little

more than communicate the size of the network traversed,

something more easily done by simply reporting this

information. A second problem introduced by using the

expanded path matrix is that subsequent calculations will

tend to be dominated by the conversion constant. Since

most paths traverse only small portions of hypertexts, the

expanded distance matrix will typically include large

numbers of unreachable nodes with infinite entries that, on

conversion, will be replaced by K, the conversion constant.

Although the influence of the conversion constant can be

moderated somewhat by scaling this constant up or down, it

seems counterproductive to define a metric in a way that

knowingly obscures the variability that is of greatest interest.

Although a complete representation of nodes is clearly

needed in assessing network structure, trying to distinguish

user paths within a visual display of a larger network seems

ill-advised. The very nature of path analysis, particularly

when it attempts to treat individual users, suggests that

ignoring unvisited nodes is a better approach since any

attempt to account for the influence of these unvisited

nodes is speculative at best. Moreover, if, as cognitive

flexibility theory suggests [31, 16], users define their

“knowledge space” as a result of their particular experiences

traversing larger spaces, it seems theoretically justifiable (as

well as computationally convenient) to base path

visualization and metrics on the smaller path matrix.

As a consequence of these considerations, it is possible

to define path compactness and path stratum, the specialized

metrics that are a central objective of this study. Path

compactness (PCp) refers to the complexity of a user’s path,

based on the same notion of connectedness employed in the

corresponding structural metric and is formally defined as

$$PC_p = \frac{P_{Max} - \sum_i \sum_j PC_{ij}}{P_{Max} - P_{Min}},$$

where PC refers to the converted distance matrix of the path

and PMax and PMin refer, respectively, to the maximum and

minimum converted distance values that the path matrix can

assume for a completely-connected (PMin) and completely-

disconnected (PMax) network consisting of as many nodes as

are in the path. PMax and PMin are given by

$$P_{Max} = K(n^2 - n), \quad \text{and} \quad P_{Min} = (n^2 - n),$$

where n is simply the order of the path matrix.
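As a worked example, the following Python sketch applies the definition of path compactness to the converted distance matrix of Figure 2A. The function name is an assumption, and K defaults to the order of the path matrix as described earlier.

```python
def path_compactness(pc, k=None):
    """Path compactness from a converted distance matrix `pc`.

    K defaults to n, the order of the path matrix. Requires n >= 2,
    since PMax equals PMin for a single-node path.
    """
    n = len(pc)
    k = n if k is None else k
    p_max = k * (n * n - n)   # completely disconnected: every off-diagonal entry is K
    p_min = n * n - n         # completely connected: every off-diagonal entry is 1
    total = sum(sum(row) for row in pc)
    return (p_max - total) / (p_max - p_min)

# Converted distance matrix from Figure 2A.
pc = [
    [0, 1, 2, 1, 1, 2, 1, 1, 1],
    [2, 0, 1, 3, 3, 4, 3, 3, 3],
    [1, 2, 0, 2, 2, 3, 2, 2, 2],
    [1, 2, 3, 0, 2, 3, 1, 2, 2],
    [2, 3, 4, 3, 0, 1, 3, 3, 3],
    [1, 2, 3, 2, 2, 0, 2, 2, 2],
    [1, 2, 3, 2, 2, 3, 0, 2, 2],
    [1, 2, 3, 2, 2, 3, 2, 0, 2],
    [1, 2, 3, 2, 2, 3, 2, 2, 0],
]
value = path_compactness(pc)
```

With n = 9 and K = 9, PMax = 648 and PMin = 72. The entries of Figure 2A sum to 153, so path compactness is (648 - 153) / (648 - 72) = 495/576, or about 0.86.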

Path stratum (PSt) is also defined in a manner analogous

to its structural equivalent with

$$PSt = \frac{\text{path absolute prestige}}{LAP},$$

where the absolute prestige of a path is defined similarly to

a network with the exception that the distance matrix is

derived from a path matrix rather than an adjacency matrix.

The linear absolute prestige (LAP) that serves as the

normalizing measure for this metric is also defined

analogously to the structural metric, based on the number of

distinct nodes (n) in the path with

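$$LAP = \begin{cases} \dfrac{n^3}{4}, & \text{if } n \text{ is even,} \\[4pt] \dfrac{n^3 - n}{4}, & \text{if } n \text{ is odd.} \end{cases}$$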
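The following Python sketch computes path stratum for the path in Figure 1A. The text does not spell out absolute prestige, so the sketch assumes the definition usually attributed to the structural metric in [3]: a node's status is its row sum in the distance matrix, its contrastatus is its column sum, its prestige is status minus contrastatus, and absolute prestige is the sum of the absolute prestige values over all nodes, with infinite entries ignored. Function names are likewise illustrative.

```python
import math

def linear_absolute_prestige(n):
    """LAP for a path with n distinct nodes."""
    return n ** 3 / 4 if n % 2 == 0 else (n ** 3 - n) / 4

def path_stratum(d):
    """Path stratum from an (unconverted) distance matrix `d`.

    Assumption: infinite entries are ignored (counted as 0) when summing.
    """
    n = len(d)
    finite = [[0 if math.isinf(x) else x for x in row] for row in d]
    status = [sum(row) for row in finite]
    contrastatus = [sum(col) for col in zip(*finite)]
    absolute_prestige = sum(abs(s - c) for s, c in zip(status, contrastatus))
    return absolute_prestige / linear_absolute_prestige(n)

# Distance matrix for the path in Figure 1A. It has no infinite entries,
# so it is identical to the converted matrix shown in Figure 2A.
d = [
    [0, 1, 2, 1, 1, 2, 1, 1, 1],
    [2, 0, 1, 3, 3, 4, 3, 3, 3],
    [1, 2, 0, 2, 2, 3, 2, 2, 2],
    [1, 2, 3, 0, 2, 3, 1, 2, 2],
    [2, 3, 4, 3, 0, 1, 3, 3, 3],
    [1, 2, 3, 2, 2, 0, 2, 2, 2],
    [1, 2, 3, 2, 2, 3, 0, 2, 2],
    [1, 2, 3, 2, 2, 3, 2, 0, 2],
    [1, 2, 3, 2, 2, 3, 2, 2, 0],
]
stratum = path_stratum(d)

# A strictly linear four-node path 0 -> 1 -> 2 -> 3 for comparison.
inf = math.inf
chain = [[0, 1, 2, 3], [inf, 0, 1, 2], [inf, inf, 0, 1], [inf, inf, inf, 0]]
chain_stratum = path_stratum(chain)
```

Under these assumptions the absolute prestige of the Figure 1A path is 26 and LAP is (729 - 9) / 4 = 180, giving a path stratum of about 0.14, which is consistent with a path that repeatedly returns to one node. The strictly linear four-node chain yields a stratum of exactly 1.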

EMPIRICAL VALIDATION

The purpose of this section is to describe the results of

an empirical study designed to assess whether the proposed

measures are empirically meaningful in the sense that they

can be shown to be related to objective categories of users.

More specifically, the focus of the present validation will be

to determine whether the proposed graphic techniques and

path metrics can be shown to be associated with user success

in a hypertext search task.

Participants in the validation study were adult students

at a medium-sized Midwestern public university in the USA.

Identities of subjects were coded so that individuals could

not be identified and all procedures were reviewed and

approved by a university human subjects board. A total of

89 teacher education students participated, with data

collection extending across both terms of the 1997-1998

academic year. The experiment required subjects to respond

to a set of academic advising questions using an electronic

student advising handbook. Subjects were to answer as

many questions as possible within a 15-minute period. The

handbook consisted of approximately 31,000 words in 78

text nodes structured in a hierarchical-linear fashion with

major handbook divisions organized hierarchically and

nodes within those divisions organized in a linear fashion.

The handbook duplicated the content and overall structure

of a print version that had been in use for a number of

years.

Since all subjects participating in the study were new

Figure 3. Three path diagrams for subjects whose scores on the hypertext search task were low.

Figure 4. Three path diagrams for subjects whose scores on the hypertext task were high.

admissions to the teacher education program and had not yet

participated in formal academic advising, it was unlikely

they were familiar with the handbook content. Moreover,

since all subjects were at the same point in their academic

careers, it is highly unlikely there was any systematic

variation of subjects’ familiarity with program policies and

procedures across the experimental groups.

Following completion of the experimental sessions,

subjects’ browser cookie files were retrieved and path data

were extracted. Mathematica [33] routines were developed

to format graph files for display by GraphViz 1.3 [11],

employing a hierarchical embedding format. Mathematica

routines were also used to calculate the path compactness

and path stratum metrics described above. Subjects’

responses to academic advising questions were scored on the

basis of information provided in the handbook across a scale

of three values with 0 points awarded for incorrect and

omitted responses, ½ point for partially correct and correct-

but-incomplete responses, and 1 point for complete and

correct responses [30, 18, 34].

Visual Analysis

The first stage of analysis validating the proposed

graphical methods was based on a subset (n = 29) of the

larger subject pool (n=89). In this preliminary graphical

analysis subjects were grouped according to their search

performance answering questions using the hypertext

handbook. The “high” group consisted of subjects with the

top three scores in each of four groups that had been set up

to counter-balance experimental conditions. The “low”

group, on the other hand, consisted of subjects with the

bottom three scores in each of the four counter-balancing

groups. As a result, the analysis was based on two groups of

12 subjects that differed according to their success in

carrying out the hypertext search task.

The next step in the analysis was to create and review

path diagrams for the subjects in the two groups, with the

intent of discerning visually distinctive patterns that might

be related to success in the search task. In reviewing these

path diagrams, it soon became apparent that there were,

indeed, visually salient features that seemed to be related to

subjects’ success in the experimental task. A number of

examples of these distinctive path diagrams for individual

high and low scoring subjects are presented in Figures 3 and

4.

Review of the path diagrams suggested that subjects

whose scores on the search task were low tended to assume

a “passive” approach to locating answers in the handbook,

relying much more heavily on sequential “page-turning”

than those subjects who did well on the search task. The

path diagrams for these low scoring subjects typically

revealed distinctively linear patterns of movement. Path

diagrams for high scoring subjects, on the other hand,

tended to show shallow hierarchical patterns of movement

with the handbook table of contents serving as the root of

the tree, indicating repeated visits to the table of contents

during the course of the browsing session. Moreover, as

indicated in Figures 5 and 6, similar navigational patterns

resulted when group paths were generated by summing the

individual expanded path matrices for high- and low-scoring

subjects.

Since some link traversals in group paths probably

represent idiosyncratic thinking and navigational errors,

“noise” was eliminated from the summed group diagrams by

Figure 5. Group path matrix ($\sum\sum p_i$) for low ability hypertext readers for path transition frequencies ≥ 3.

Figure 6. Group path matrix ($\sum\sum p_i$) for high ability hypertext readers for path transition frequencies ≥ 3.

setting a threshold that had to be met in order for a

traversal to be displayed. In Figures 5 and 6, the threshold

is set equal to 3, with the result that only those links are

displayed that were traversed at least three times by the

subjects in each group. While setting thresholds to other

values (5,2,1, etc.) altered diagrams in minor ways (mainly

by increasing the number of links displayed), alternative

settings did not alter the characteristic linear and shallow

hierarchical patterns associated with the low- and high-

scoring groups.
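The thresholding step can be sketched as follows. The study used Mathematica routines to prepare graph files for GraphViz; the Python functions, the three-node group matrix, and the node labels below are hypothetical stand-ins used only to show the filtering and a minimal DOT-format output.

```python
def thresholded_edges(group, labels, threshold=3):
    """Keep only transitions whose summed frequency meets the threshold."""
    return [
        (labels[i], labels[j], count)
        for i, row in enumerate(group)
        for j, count in enumerate(row)
        if count >= threshold
    ]

def to_dot(edges, name="group_path"):
    """Render the retained transitions as a GraphViz DOT digraph."""
    lines = [f"digraph {name} {{"]
    for src, dst, count in edges:
        lines.append(f'    "{src}" -> "{dst}" [label="{count}"];')
    lines.append("}")
    return "\n".join(lines)

# Hypothetical summed group matrix over three nodes.
labels = ["toc", "a", "b"]
group = [
    [0, 5, 4],
    [3, 0, 1],
    [2, 0, 0],
]
edges = thresholded_edges(group, labels, threshold=3)
dot = to_dot(edges)
```

With a threshold of 3, the transitions a to b (frequency 1) and b to toc (frequency 2) are dropped, and three arcs remain. Lowering the threshold adds arcs back without changing the retained ones.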

The graphical analysis carried out provides fairly

compelling, if informal, evidence in support of distinctive

navigational patterns associated with hypertext search

outcome measures. More effective hypertext search is

associated with a shallow hierarchical path diagram that

results from subjects making repeated trips back to the main

table of contents in order to make decisions about how to

locate information in the electronic handbook. Less

effective hypertext search is associated with a more passive

linear path diagram that reflects users’ reliance on

sequential “page-turning” with users hoping to locate

desired information by simply coming across it in their

browsing.

Moreover, these graphical analyses are suggestive about

what we might expect to find when we examine the

association between the path metrics that have been defined

and hypertext search outcome scores. Specifically, the

linear character of the low scoring group suggests that path

stratum will be negatively correlated with a subject’s

hypertext search score since low scoring subjects seemed

more likely to adopt linear navigational paths. Conversely,

since the compactness of a bidirectional star pattern

(characteristic of the paths of more successful subjects) tends

to approach 1 as the number of nodes increases, while both

bidirectional cycles and linear patterns (characteristic of less

successful subjects) approach values less than 1 [3], it

appears likely that path compactness will be positively

associated with users’ hypertext search scores. The

quantitative analysis that follows will therefore involve

one-tailed, directional tests of significance of the

following research hypotheses:

1) The path compactness metric will correlate significantly,

in a positive fashion, with subjects’ hypertext search

scores, and

2) The path stratum metric will correlate significantly, in an

inverse fashion, with subjects’ hypertext search scores.
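The limiting behavior of compactness that motivates the first hypothesis can be checked with a short derivation. It is a sketch under two idealizing assumptions that are not stated in the text: the successful pattern is a perfect bidirectional star (one hub and n - 1 leaves), the unsuccessful pattern is a strictly one-directional chain of n nodes, and K = n in both cases.

```latex
\begin{align*}
% Bidirectional star: hub-leaf distances are 1 (both directions),
% leaf-leaf distances are 2.
\sum_i \sum_j PC_{ij}^{\,star} &= 2(n-1) + 2(n-1)(n-2) = 2(n-1)^2 \\
PC_p^{\,star} &= \frac{n(n^2-n) - 2(n-1)^2}{n(n^2-n) - (n^2-n)}
               = \frac{n^2 - 2n + 2}{n^2 - n}
               \;\longrightarrow\; 1 \quad (n \to \infty) \\[6pt]
% One-directional chain: n-d ordered pairs at distance d,
% and n(n-1)/2 unreachable pairs converted to K = n.
\sum_i \sum_j PC_{ij}^{\,chain} &= \sum_{d=1}^{n-1} d\,(n-d) + n \cdot \frac{n(n-1)}{2}
               = \frac{n(n-1)(n+1)}{6} + \frac{n^2(n-1)}{2}
               = \frac{n(n-1)(4n+1)}{6} \\
PC_p^{\,chain} &= \frac{n^2(n-1) - \tfrac{1}{6}\,n(n-1)(4n+1)}{n(n-1)^2}
               = \frac{2n-1}{6(n-1)}
               \;\longrightarrow\; \tfrac{1}{3} \quad (n \to \infty)
\end{align*}
```

For a path with nine distinct nodes, for example, these expressions give about 0.90 for the star and about 0.35 for the chain, so the two idealized patterns are well separated by the compactness metric.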

Analysis Based on Path Metrics

The purpose of this section is to present results of

analyses carried out to determine if the predictions made on

the basis of the graphical analysis hold up under quantitative

analysis. As noted earlier, subjects’ outcome measures are

based on numbers of questions answered correctly using a

hypertext student advising handbook. Pearson correlation

coefficients were determined relating a number of

experimental variables and hypertext search scores,

including path compactness and path stratum. Results of

the analyses are indicated in Table 1, with significant

correlations (α < .05) flagged. Note that, of the variables

examined, only path compactness and path stratum

correlated significantly with the hypertext search measure.

Correlations of Various Experimental Variables with Hypertext Search Scores

Variable                   r        p       n
Print Ability            .210     .137     29
Pages viewed            -.006     .955     89
Order of path matrix    -.011     .921     89
Path compactness         .239*    .012†    89
Path stratum            -.205*    .027†    89

* p < .05; † Indicates a one-tailed test of association. All other tests are two-tailed.

Table 1. Correlations (with p values and subjects) for experimental variables and the hypertext search scores.

Moreover, these correlations were as expected, with

compactness exhibiting a significant positive correlation and

stratum exhibiting a significant inverse correlation. These

analyses suggest that the observed relationship is not likely

to be the result of chance, and thus support the interpretation

of path diagrams and the proposed metrics as reflecting

empirically meaningful and potentially useful measures of

hypertext navigation.
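The one-tailed p values in Table 1 can be roughly checked from the reported correlations alone. The sketch below is not the author's procedure: it assumes the usual t test for a Pearson correlation with n - 2 degrees of freedom, uses the rounded r values from Table 1, and integrates the t density numerically with Simpson's rule so that it needs only the Python standard library. All function names are illustrative.

```python
import math

def t_statistic(r, n):
    """t statistic for a Pearson correlation r based on n observations."""
    return r * math.sqrt((n - 2) / (1 - r * r))

def t_density(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))

def upper_tail(t, df, steps=2000):
    """P(T >= t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_density(0.0, df) + t_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    return 0.5 - total * h / 3

n = 89
# Compactness was predicted to be positive, stratum negative, so each
# one-tailed p is the tail area in the predicted direction.
p_compactness = upper_tail(t_statistic(0.239, n), n - 2)
p_stratum = upper_tail(abs(t_statistic(-0.205, n)), n - 2)
```

Run as written, both tail probabilities should land close to the reported values of .012 and .027, within the precision allowed by the three-decimal rounding of r.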

GENERAL DISCUSSION AND LIMITATIONS

Results of the empirical validation suggest that the

proposed methods and metrics can be productively applied

in assessing user navigation. Moreover, the results reported

suggest that navigational patterns and their associated

metrics may be useful as indirect measures of user strategy

and perhaps even of users’ success in cognitively

“modeling” the domain represented by a hypertext. If, as

cognitive flexibility theory suggests, learning in hypertext

materials involves the cognitive reconstruction of a domain

space through repeated traversals across that space, the

paths users choose are sure to have a powerful influence on

learning outcomes. In the present study, subjects who

adopted shallow, hierarchical search strategies that more

accurately “modeled” the organization of the hypertext

materials were more successful in their search, while those

who adopted more linear paths through the materials were

less successful. In effect, more successful subjects

recognized and took advantage of the structure of the

domain space by returning to the higher ground of the table

of contents and the broader cognitive view it afforded of the

domain.

That more successful hypertext users recognize and take

advantage of the structure of the materials they are using is

not, in itself, very surprising. That is, after all, the purpose

of graphic overviews, site maps, and other techniques that

have been shown to promote more effective use of hypertext.

What is important about the methods and metrics proposed

is that they are not merely ad hoc constructions, but are

grounded in a widely used conceptual framework, and that

they hold up under both qualitative and quantitative

scrutiny. Although prior work has identified qualitative

features similar to those noted here, informal

characterizations such as “loopiness” and “spikiness” [34]

can now be related to objectively assessed metrics. Another

important feature of the methods proposed is that they are

based on information that is immediately and unobtrusively

available during reading, something that is not generally

true of outcome measures. Given the demonstrated

association of path information and outcome measures, it

may be possible to apply real-time path data in generating

user models that will lead to more effective adaptive

hypertext systems. It may also be possible to apply these

metrics in designing user paths to meet particular objectives

or needs of users. Even in the absence of immediate

applications, however, it will be important to explore these

more direct measures of user movement, given the interest

in, and widespread use of, less direct measures.

While the results of the present study are relatively

clear-cut, three limitations suggest that these findings

should be considered preliminary. One limitation has to do

with the strength of the observed association between path

metrics and search success. A second limitation has to do

with the choice that has been made with regard to

normalization, and a third set of related limitations is

associated with the generalizability of findings, given

constraints imposed by the design of the study and the

experimental materials.

Although the observed association between the proposed

metrics and search success is not likely to be due to chance,

the strength of the association is not great. A weak

association remains of significant interest, but it also

suggests that this variable should be considered within a

larger explanatory context. Regarding normalization, it is

relevant to note that Botafogo et al. [3] recognize the general

nature of the normalization problem in their work

establishing structural metrics. In response to this problem,

they suggest that alternative normalization procedures be

considered, particularly for the stratum metric. They note,

for instance, that since stratum depends on LAP, a measure

that is O(n³) for matrices of order n, it may be problematic

to compare networks that have large differences in numbers

of nodes. Given this, the question arises whether the

variation in the order of path matrices across subjects is

sufficient to raise concerns about the analyses that have been

carried out.
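
To make the normalization issue concrete, the following is a minimal sketch of a stratum calculation. It assumes the definition given in [3]: a node's status is the sum of finite distances from it, its contrastatus is the sum of finite distances to it, prestige is the difference of the two, and stratum is the sum of absolute prestige values divided by LAP, with LAP taken as n³/4 for even n and (n³ - n)/4 for odd n. The function names and the adjacency-matrix input format are illustrative assumptions and do not reproduce the software used in this study.

```python
from collections import deque


def shortest_distances(adj):
    """All-pairs shortest path lengths by breadth-first search.

    adj is a square 0/1 adjacency matrix (list of lists).
    Unreachable pairs are recorded as None.
    """
    n = len(adj)
    dist = [[None] * n for _ in range(n)]
    for start in range(n):
        dist[start][start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if adj[u][v] and dist[start][v] is None:
                    dist[start][v] = dist[start][u] + 1
                    queue.append(v)
    return dist


def linear_absolute_prestige(n):
    """LAP as assumed here: n^3/4 for even n, (n^3 - n)/4 for odd n."""
    return n ** 3 / 4 if n % 2 == 0 else (n ** 3 - n) / 4


def stratum(adj):
    """Absolute prestige divided by LAP (assumed definition, after [3])."""
    n = len(adj)
    lap = linear_absolute_prestige(n)
    if lap == 0:  # graphs with 0 or 1 nodes have no ordering to measure
        return 0.0
    dist = shortest_distances(adj)
    # Status: sum of finite distances from a node (row sums).
    status = [sum(d for d in row if d is not None) for row in dist]
    # Contrastatus: sum of finite distances to a node (column sums).
    contrastatus = [
        sum(dist[i][j] for i in range(n) if dist[i][j] is not None)
        for j in range(n)
    ]
    absolute_prestige = sum(abs(s - c) for s, c in zip(status, contrastatus))
    return absolute_prestige / lap


# A strictly linear four-node path and a three-node cycle.
chain = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 0, 0]]
cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
print(stratum(chain))  # 1.0
print(stratum(cycle))  # 0.0
```

Under these assumptions a strictly linear path yields a stratum of 1 and a closed cycle yields 0. Because the divisor grows with the cube of n (250 for n = 10, but 2000 for n = 20), matrices of very different order are normalized by very different quantities, which is the concern raised above.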

Two circumstances of the present study suggest,

however, that the problem of normalization has not

compromised the specific results reported. One

circumstance is that although there was variation in the size

of the path matrices used to calculate path stratum values,

there was no significant correlation between subjects’ search

scores and the size of their associated path matrices

(Pearson r = -.011, p = .921). Path matrices varied, but there

is no evidence of a systematic variation that might influence

the relationship noted between the outcome measure and

path stratum.
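
The check described above amounts to a Pearson correlation between two per-subject values. A minimal sketch of that calculation follows; the function name and the sample values are hypothetical and are not data from this study. The significance test (the p value) is omitted because it requires a t distribution that the standard library does not provide.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for illustration only: search scores and the
# order (number of nodes) of each subject's path matrix.
search_scores = [12, 9, 15, 7, 11]
matrix_orders = [18, 22, 17, 20, 25]
print(pearson_r(search_scores, matrix_orders))
```

A value near zero, as reported above, indicates that matrix order does not vary systematically with search score.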

The second circumstance is related to the observation

([3], pp. 169-170) that differences in stratum values can

result when index and reference nodes are excluded from the

stratum calculation, the implication being that the presence

of prominent nodes can distort the stratum metric. In an

effort to determine whether this should be a source of

concern in the present study, path stratum values were

recalculated excluding index and reference nodes identified

as those whose in- and out-degrees differed from their

respective means by more than one standard deviation.

After path stratum values were recalculated, the association

between observed path stratum and search scores

remained significant, with only a very minor deviation from

prior results. As noted in Table 1, analysis across all nodes

resulted in r = -.205. Reanalysis with index and

reference nodes excluded resulted in r = -.200, suggesting

that the path stratum metric had not been influenced

significantly by unusual user movement across specific

links.
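
The exclusion step can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually used: index nodes are taken to be those whose out-degree exceeds the mean out-degree by more than one standard deviation, reference nodes those whose in-degree exceeds the mean in-degree by the same margin, and the population standard deviation is used. The function names are hypothetical.

```python
from statistics import mean, pstdev


def prominent_nodes(adj):
    """Return (index_nodes, reference_nodes) for a 0/1 adjacency matrix.

    Assumption: a node is prominent when its degree exceeds the mean
    degree by more than one (population) standard deviation.
    """
    n = len(adj)
    if n == 0:
        return set(), set()
    out_deg = [sum(1 for v in row if v) for row in adj]
    in_deg = [sum(1 for i in range(n) if adj[i][j]) for j in range(n)]
    out_cut = mean(out_deg) + pstdev(out_deg)
    in_cut = mean(in_deg) + pstdev(in_deg)
    index_nodes = {i for i in range(n) if out_deg[i] > out_cut}
    reference_nodes = {j for j in range(n) if in_deg[j] > in_cut}
    return index_nodes, reference_nodes


def drop_nodes(adj, excluded):
    """Remove the rows and columns of the excluded nodes."""
    keep = [i for i in range(len(adj)) if i not in excluded]
    return [[adj[i][j] for j in keep] for i in keep]


# Hypothetical five-node path matrix: node 0 links to every other node,
# and node 1 also links to node 2.
adj = [
    [0, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
index_nodes, reference_nodes = prominent_nodes(adj)
reduced = drop_nodes(adj, index_nodes | reference_nodes)
print(index_nodes, reference_nodes)  # {0} {2}
```

The reduced matrix can then be passed to whatever stratum calculation is in use, and the resulting values compared with those obtained from the full matrix.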

A limitation having to do with the design of the

validation study arises because the investigation has focused

on the relationship between user paths and outcome

measures from a strictly correlational perspective. Had user

movement been more carefully controlled so that a path

“factor” could be established, a stronger inference regarding

the contribution of user movement to search success might

have been possible. That, however, must be addressed in

future research. For the present, we must be satisfied with

the observation that user movement and search success are

significantly associated, without a clear understanding of

how and why these measures are related.

Finally, it is important to note that the present

investigation is limited to a test of the proposed methods and

metrics with a single hypertext and a specific browser

interface. The findings reported are consistent with prior

work [4, 34, 35] that suggests hierarchical patterns of

movement benefit readers unfamiliar with the material

presented, but larger questions remain. It is, for instance,

still unclear if the proposed methods and metrics will work

equally well in other hypertext structures and under different

browsing conditions. While it is apparent that the observed

efficacy of navigational patterns reflects structural features

of the hypertext itself, which inevitably sets conditions

within which users must operate, it is not clear how specific

hypertext structures influence user movement and the

metrics proposed. Like the limitation related to the design

of the study, this question requires further empirical work.

These limitations notwithstanding, the proposed

methods and metrics afford hypertext developers and

researchers a number of important benefits. One benefit is

that these methods and metrics support more direct analysis

of user movement in hypertext than has been possible

before. A second benefit is that the concepts and

computational framework these methods and metrics are

based on are natural extensions of prior methods and metrics

developed to analyze the structure of hypertext, and thus

support a more general perspective that encompasses both

structure and navigation in hypertext. Finally, there are

both informal and quantitative reasons for confidence in the

adequacy of these methods since the metrics that have been

proposed are clearly related to the graphical displays

developed and these metrics have been shown to have

significant empirical association with success in a hypertext

search task.

ACKNOWLEDGMENT

This work was supported in part by an Indiana

University Summer Faculty Fellowship. The author would

also like to thank Mr. Michael Mancini for his assistance

collecting data and four anonymous reviewers for

suggestions and comments that contributed to the revision

of this manuscript.

REFERENCES

1. Astleitner, H. & Leutner, D. (1996). Applying standard

network analysis to hypermedia systems: Implications

for learning. Journal of Educational Computing

Research, 14(3), 285-303.

2. Bartram, L., Ho, A., Dill, J., & Henigman, F. (1995).

The continuous zoom: A constrained fisheye technique

for viewing and navigating large information spaces.

Proceedings, User Interface and Software Technology

‘95. ACM. New York, 207-215.

3. Botafogo, R. A., Rivlin, E., & Shneiderman, B. (1992).

Structural analysis of hypertexts: Identifying hierarchies

and useful metrics. ACM Transactions on Information

Systems, 10(2), 142-180.

4. Canter, D., Rivers, R., & Storrs, G. (1985).

Characterizing user navigation through complex data

structures. Behaviour and Information Technology,

4(2), 93-102.

5. Cardle, N. T. (1994). A hypercard on Celtic history to

assess navigability measures for hypertext. MSc.

Dissertation. Institute of Information Technology,

University of Nottingham, UK.

6. Chang, C. & McDaniel, E. D. (1995). Informal search

strategies in loosely structured settings. Journal of

Educational Computing Research, 12(1), 95-107.

7. Chen, C. (1997). Structuralizing and visualizing the

WWW by generalised similarity analysis. In Mark

Bernstein, Leslie Carr, & Kasper Østerbye (Eds.),

Proceedings of the Eighth ACM Conference on

Hypertext - Hypertext ‘97, 177-186. New York: ACM.

8. Chen, C. & Rada, R. (1996). Interacting with

hypertext: A meta-analysis of experimental studies.

Human-Computer Interaction, 11, 125-156.

9. Cockburn, A. & Jones, S. (1996). Which way now?

Analysing and easing inadequacies in WWW

navigation. International Journal of Human-Computer

Studies, 45, 105-129.

10. Eklund, J. & Zeiliger, R. (1996). Navigating the web:

Possibilities and practicalities for adaptive navigational

support [On-line]. Available:

http://www.scu.edu/sponsored/ausweb/ausweb96/tech/eklund1/paper.html.

11. Ellson, J., Koutsofios, E., & North, S. (1998).

GraphViz 1.3 [Computer software]. Murray Hill, NJ:

Lucent Technologies. Available:

http://www.research.att.com/sw/tools/graphviz/.

12. Furnas, G. (1986). Generalized fisheye views.

Proceedings of CHI ‘86, Human Factors in Computing

Systems, Boston, April, 1986, 16-23.

13. Gloor, P. A. (1991). Cybermap: Yet another way of

navigating in hyperspace. In Proceedings of the Third

ACM Conference on Hypertext, Hypertext ‘91. San

Antonio, Texas, December, 1991, 107-121.

14. Hill, G., Hutchings, G., James, R., Loades, S., Halé, J.,

& Hatzopulous, M. (1997). Exploiting serendipity

amongst users to provide support for hypertext

navigation. In Mark Bernstein, Leslie Carr, & Kasper

Østerbye (Eds.), Proceedings of the Eighth ACM

Conference on Hypertext - Hypertext ‘97, 212-213.

New York: ACM.

15. Husemann, H., Petersen, J., Kanty, C., Kochs, H., &

Hase, P. (1997). A user adaptive navigation metaphor

to connect and rate the coherence of terms and complex

objects. In Mark Bernstein, Leslie Carr, & Kasper

Østerbye (Eds.), Proceedings of the Eighth ACM

Conference on Hypertext - Hypertext ‘97, 177-186.

New York: ACM.

16. Jacobson, M. & Spiro, R. (1995). Hypertext learning

environments, cognitive flexibility, and the transfer of

complex knowledge: An empirical investigation.

Journal of Educational Computing Research, 12(4),

301-333.

17. Lawless, K. A. & Kulikowich, J. M. (1996).

Understanding hypertext navigation through cluster

analysis. Journal of Educational Computing Research,

14(4), 385-399.

18. McKnight, C., Dillon, A., & Richardson, J. (1990). A

comparison of linear and hypertext formats in

information retrieval. In R. McAleese & C. Green

(Eds.), Hypertext: State of the art (pp. 10-19). Oxford:

Intellect.

19. Melara, G. E. (1996). Investigating learning styles on

different hypertext environments: Hierarchical-like and

network-like structures. Journal of Educational

Computing Research, 14(4), 313-328.

20. Misanchuk, E. R. & Schwier, R. A. (1992).

Representing interactive multimedia and hypermedia

audit trails. Journal of Educational Multimedia and

Hypermedia, 1, 355-372.

21. Mukherjea, S., Foley, J. D., & Hudson, S. (1995).

Visualizing complex hypermedia networks through

multiple hierarchical views [On-line].

Available: http://www.acm.org/sigs/sigchi/sigchi95/electronic/documnts/papers/sm_bdy.htm.

22. Neves, F. D. (1997). The Aleph: A tool to spatially

represent user knowledge about the WWW docuverse.

In Mark Bernstein, Leslie Carr, & Kasper Østerbye

(Eds.), Proceedings of the Eighth ACM Conference on

Hypertext - Hypertext ‘97, 197-207. New York: ACM.

23. Nielsen, J. (1989). The matters that really matter for

hypertext usability. Proceedings of the ACM

conference on hypertext - Hypertext ‘89, 239-248. New

York: ACM.

24. Nielsen, J. (1990). Hypertext and hypermedia.

London: Academic Press.

25. Pirolli, P., Pitkow, J., & Rao, R. (1996). Silk from a

sow’s ear: Extracting usable structures from the web

[On-line]. Available:

http://www.acm.org/sigchi/chi96/proceedings/papers/Pirolli2/pp2.html.

26. Qiu, L. (1994). Frequency distributions of hypertext

path patterns: A pragmatic approach. Information

Processing & Management, 30(1), 131-140.

27. Rivlin, E., Botafogo, R., & Shneiderman, B. (1994).

Navigating in hyperspace: Designing a structure-based

toolbox. Communications of the ACM, 37(2), 87-96.

28. Sarkar, M. & Brown, M. H. (1994). Graphical fisheye

views, Communications of the ACM, 37, 12, July,

1994, 73-84.

29. Schroeder, E. E., & Grabowski, B. L. (1995). Patterns

of exploration and learning with hypermedia. Journal

of Educational Computing Research, 13(4), 313-335.

30. Smith, P. A. (1996). Towards a practical measure of

hypertext usability. Interacting with Computers, 8(4),

365-381.

31. Spiro, R. & Jehng, J. (1990). Cognitive flexibility and

hypertext: Theory and technology for the nonlinear and

multidimensional traversal of complex subject matter.

In Don Nix and Rand Spiro (Eds.), Cognition,

Education, Multimedia. Hillsdale, NJ: Lawrence

Erlbaum.

32. Utting, K. & Yankelovich, N. (1989). Context and

orientation in hypermedia networks. ACM Transactions

on Office Information Systems, 7(1), 58-84.

33. Wolfram, S. (1994). Mathematica (Version 2.2)

[Computer software]. New York: Addison Wesley.

34. Leventhal, L. M., Teasley, B. M., Instone, K.,

Rohlman, D. S., & Farhat, J. (1993). Sleuthing in

HyperHolmes™: an evaluation of using hypertext vs. a

book to answer questions. Behaviour & Information

Technology, 12(3), 149-164.

35. Simpson, A. & McKnight, C. (1990). Navigation in

hypertext: structural cues and mental maps. In R.

McAleese and C. Green (Eds.), Hypertext: State of the

art. Intellect: Oxford.