All content in this area was uploaded by Jaume Nualart Vilaplana on Feb 11, 2018
Title:
Visual Articulation of Navigation and Search Systems for Digital
Libraries
Abstract
Journal and digital library portals are the information systems that
researchers turn to most frequently for undertaking and disseminating their
academic work. However, their interfaces have not been improved. We
propose an articulation of the navigation and search systems in a single
visual solution that would allow the simultaneous exploration and
interrogation of the information system. Area is a low-cost visualization tool
that is easy to implement, and which can be used with large collections of
documents. Moreover, it has a short learning curve that enhances both
user-experience and user-satisfaction with journal and digital library
websites.
Keywords
Searching; Browsing; Information visualization; Information management;
Digital libraries
1. Introduction
When designing a digital information system, the first objective that has to
be met is that of facilitating the most intuitive means for users of locating
information. To satisfy this objective, the systems of organization, labeling,
navigation and searching have to be properly designed, as do the
controlled vocabularies that articulate this digital environment (Morville,
2007).
For a web page, for example, this means that the organizational systems
must serve to structure and organize website content. They are usually
constructed by using a classification, based on one or more specific criteria
of the content housed on that page (for example, the subject that is being
dealt with, the date of creation or the audience being targeted). The
labeling system consistently and efficiently defines and determines the
terms used to name the categories, options and links used on the web in a
user-friendly language. The navigation system allows users to move
comfortably around the different sections that make up the website. It
provides a method of orientation for users to move in a controlled way from
one point of the website to another and to ensure that at all times they
know where they are and where they can go within the structure of the
web. Based on a previous indexing strategy, the search system allows the
user to formulate queries and to retrieve information from within the
website. Controlled vocabularies or languages are documentary resources
(thesauri, taxonomies, synonym rings, etc.) that facilitate, by articulating the
other elements of the architectural structure, the search and retrieval of
information on the site (Pérez-Montoro, 2010).
While all these elements form part of the architectural anatomy of a digital
information system, the two elements used most frequently by users when
seeking information are the search and navigation systems. These two
systems tend to be clearly identified in the system interface using the
search box and the navigation bar, respectively. Users are typically well
versed in their use and, to improve their performance, they are usually
articulated via the labeling system (i.e., the navigation system labels are
used as indexing terms in the search engine).
In the case of journals and digital libraries, in common with other digital
information systems, architectural elements are usually employed to
facilitate user location of the information they manage.
Among these elements, the most frequently used are typically their
navigation and search systems. In this case, the navigation system is
usually quite simple, allowing an exploration of the resources filtered
through such criteria as author, year of publication, journal or publisher and,
in the best of cases, subject. The results of this navigation appear as a list
of clickable labels that lead the user to the set of resources, listed
alphabetically, corresponding to these criteria. Search systems usually
allow the formulation of queries (e.g., Any Word, All Words or Exact
Phrase) by field (e.g., title, description, keywords or anywhere). The result
of the query is a list of resources, normally sorted alphabetically too, which
corresponds to the criteria in the search interface.
These architectural systems and their interfaces are typically adapted to
the nature of the documents managed by these systems and to the
metadata used. The documents are static, non-dynamic, resources as far
as their content is concerned, and they do not change over time. Moreover,
their metadata describe the contents stored (based on qualitative, ordinal,
nominal or hierarchical data) (Hearst, 2009, van Hoek et al., 2014).
These systems are the direct heirs of the classical interfaces of the
document databases on CD-ROM developed in the eighties and which
have barely evolved since. In contrast with other information systems, such
as e-commerce websites, their interfaces have not been improved on the
basis of the findings provided by user studies, nor have the advances
developed in specific disciplines, such as information architecture, or those
derived more generally from User Experience (UX), been applied to them.
2. Visualization of information in digital libraries
One of the options for improving classical interfaces is the introduction of
new visual solutions in the search process that improve user-experience
and user-satisfaction with these digital systems of scientific information.
Traditionally, following on from the initial query, the search systems
implemented in information systems of this type offer a very simple
representation of the results retrieved. They usually only provide a vertical
list of results sorted alphabetically, and, for each result, they give additional
information about the retrieved item, such as its author, the title or date of
publication of the document, among others.
This strategy of traditional representation has significant limitations. On the
one hand, it does not always provide sufficient information about the
content of the document to enable the user to accept it or dismiss it without
having to read or interact with it first (Baeza-Yates, 2011, Nualart et al.,
2014). And, on the other, it does not allow the user to deploy techniques of
berrypicking in the search process (Bates, 1989), which could refine the
results obtained so as to propose subsequent, more efficient searches in
keeping with the user’s changing information needs following interaction
with the results.
In an attempt at overcoming these limitations, from the late eighties
onward, a series of prototypes have been developed that seek to improve
the visualization of results from journal and digital library portals. Some
have focused on the representation of the content of the retrieved
documents (Hearst, 1995, Egan et al., 1989, Weiss-Lijn et al., 2001,
Woodruff et al., 2001, Lam et al., 2005, Hoebar et al., 2006, Nualart et al.,
2013); while others have contributed new interactive visualizations of the
set of results after formulating the search query.
If we focus on the second group of prototypes, we can identify two main
types of strategy, some of which are interactive: first, those that provide
support for query creation and refinement and, second, those that offer
visual support for the presentation of results.
The earliest techniques were designed to help the user in formulating the
query, facilitating the use of Boolean operators (Jones, 1998, Wong et al.,
2011) or supplying and suggesting possible terms to the user for building
their queries (Schatz et al., 1996).
Those focusing on the visual presentation of results include different
alternatives. Some offer two-dimensional visualizations of the relationships
between the retrieved documents by using maps or clusters (Chalmers et
al., 1992, Andrews et al., 2001, Andrews et al., 2002) or by using two-
dimensional tables or grids (Fox et al., 1993, Shneiderman et al., 2000, Kim
et al., 2011). Others present strategies based on three-dimensional
visualizations of the retrieved results (Robertson et al., 1991, Hearst et al.,
1997, Cugini et al., 2000). These visual prototypes made a series of
significant improvements to the classical interfaces of journal and digital
library portals. Thus, on the one hand, they provided more rapid search
times compared to those of traditional non-visual methods (Hienert et al.,
2012) and, on the other, they permitted a more efficient formulation of
queries in a way that was tailored to the information needs of users. And,
finally, they provided additional information to users, information that was
not available on a page of more conventional results. This extra
information, which shows different semantic relationships between the
documents retrieved, provides a better interaction with the results and
facilitates the refinement of subsequent queries (Bauer, 2014).
Yet, even with these advantages, these prototypes and advances in
visualization have not been widely implemented in the portals or websites
of journals or digital libraries. The reasons for this are varied, but they can
be classified into two main groups: reasons of a practical nature and
methodological reasons.
In the case of the practical reasons, in resources of this type these tools
are implemented as separate pages from the basic search interfaces,
which means users perceive them as being secondary tools. Furthermore,
these solutions, especially those that visualize the results, involve a high
level of abstraction and conceptualization that means they are not very
intuitive for users. And, perhaps more importantly, implementing these
techniques, unlike traditional interfaces, does not offer any clear
commercial or economic benefits in the world of digital systems of scientific
information of this type.
If we focus on the methodological reasons, it can be seen that very few of
the proposed techniques have been tested and evaluated with end users,
which makes it difficult to draw any clear conclusions about their efficiency.
Moreover, the prototypes have only been used with small collections of
documents, and so their efficient use with large collections has not been
demonstrated to users. Likewise, the paucity of the quantitative results
reported in these studies of visual prototypes fails to demonstrate whether
they are any better than the classical versions of the interfaces. As such,
experiments are needed that analyze widespread use over a longer
period of time before it can be concluded whether or not the
difficulty in using them stems from the users’ learning curve and their
degree of familiarity with the system. Similarly, when these prototypes are
constructed by articulating different techniques it becomes more difficult to
compare them, because it is not possible to attribute unequivocally the
success or failure of the system to one or more of the techniques
implemented. And, in this sense, these tools do not share a methodological
design that would allow us to compare the results of each proposal and to
analyze them jointly.
3. Area: an alternative visualization proposal
To overcome these practical and methodological limitations, new solutions
and low-cost tools that can be readily implemented, and which can improve
user-experience and user-satisfaction with these information systems, need
to be identified. One possible alternative is the articulation of the navigation
and search systems in a single visual solution that would allow the
simultaneous exploration and interrogation of the information system.
Area is a new, low-cost visualization tool that is easy to implement and
can be used with large collections of documents. Moreover, it has a
short learning curve and articulates the two systems in a two-dimensional
structure that can enhance both user-experience and user-satisfaction
with journal and digital library websites.
Although the idea for Area originated in 2006, it has evolved since then with
the development of versions in several computer languages for a range of
different uses and purposes. However, for the experiment reported here
Area has been completely rewritten. Today it is a simpler version that runs
completely on the client side from a standard browser. Area is free
software.
In presenting this alternative visualization proposal, we have selected the
contents of the journal Information Research to serve as our corpus of texts
on which we demonstrate the tool’s visualization and exploration
capacities. To do so, we replicated these contents on a standalone server,
where Area is presented as an alternative interface to that of the
Information Research journal, emulating all its capabilities and adding
new ones (see Table 1) (Nualart, 2014a).
Feature | Information Research | Area | Comment
Explore by issue as a list of papers | YES | YES | No changes: Area redirects to the existing issue page
Search with Atomz, and Search with Google | YES | YES | No changes: it redirects to the IR search page
Multiple overviews of the collection | NO | YES | New feature: (no. of eligible properties)^2; this is 5^2 = 25 combinations of eligible properties
Numerical overview | NO | YES | New feature: Area shows an overview of the main numbers of the collection
Topic distribution | NO | YES | New feature: filtered papers are marked during exploration
Explore by Language | NO | YES | New feature: Language is an eligible property, so it can be represented in combination with the other properties
Explore by Year, Issue and Volume | YES | IMPROVED | Improved feature: multiple representations and evolution visualization
Explore by Subject | YES | IMPROVED | Improved feature: TAB “by topic”, allows filter by typing
Explore by Author (authors can have more than one paper) | YES | IMPROVED | Improved feature: TAB “by author”, allows filter by typing
How many papers talk about a subject? | YES | IMPROVED | Improved: Area shows the papers and their context; a better visualization of the group of results
Table 1. Comparison of features of existing Information Research site and Area
We have chosen the contents of Information Research for two reasons. On
the one hand, it serves as a good example of an open access journal with
the collection being published online under a Creative Commons license
and, on the other, academic papers represent a controlled collection of
texts with a similar language register, structure and length, which gives the
collection a homogeneous shape. All the code related to this experiment,
as well as the Area software itself, can be downloaded from the GitHub
repository (Nualart, 2014b).
3.1 Area’s Visualization Capacity
Generally speaking, Area is an architectural proposal in which we
articulate, in a single structure, the two main systems facilitating the
location of information in digital contexts, i.e., the navigation system and
the search system. These two systems are present in most contexts, which
is a guarantee that users are fully familiar with them and that additional
specific instructions are not needed for them to use Area efficiently and
comfortably.
Area represents two of the eligible properties simultaneously. The first
property is represented graphically as blocks. These form a grid of blocks
that contain the items in the collection, depicted as small squares. The
second property is the color representation of each item (see Figure 1).
This particular architectural structure provides the tool with a series of
capabilities for locating and visualizing the information contained in the
collection that makes up the web page of the journal or digital library.
Figure 1. Screenshot of Area interface. The first eligible property is “Year” of publication,
represented by twenty-five blocks. The second eligible property is “no. of
references/papers”, grouped into seventeen categories, each represented by a different
color.
First, the system can browse the collection and simultaneously select two
of the attributes of each document in the collection: the year of publication,
the volume in which it appears, the issue in which it was published, the
number of references per article and the language in which the article is
written. The application of this double selection process generates a two-
dimensional representation in which all (not just part) of the collection of
documents managed on the web page of the journal or in the digital library
is depicted, unlike classical navigation and search systems. This
presentation allows us to visualize information about the collection, such as
the volume of the collection referred to, the way in which the volume or
issues are distributed throughout the year, the annual variation in the
number of references included in the documents and the distribution of
articles by languages. These indicators are not available in classical
systems.
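The double selection described above can be illustrated with a minimal sketch. It is not Area's actual code (Area is a JavaScript application); the record fields, the palette and the function name `build_grid` are assumptions made purely for illustration. The first property decides which block an item falls into, and the second decides its color.

```python
from collections import defaultdict

# Hypothetical records; the field names are illustrative, not Area's schema.
papers = [
    {"title": "Paper A", "year": 1995, "language": "English"},
    {"title": "Paper B", "year": 1995, "language": "Spanish"},
    {"title": "Paper C", "year": 1996, "language": "English"},
]


def build_grid(items, block_prop, color_prop, palette):
    """Group items into blocks by block_prop and color each by color_prop."""
    # One color per distinct value of the second property.
    values = sorted({item[color_prop] for item in items})
    color_of = {value: palette[i % len(palette)] for i, value in enumerate(values)}

    # One block per distinct value of the first property.
    blocks = defaultdict(list)
    for item in items:
        blocks[item[block_prop]].append((item["title"], color_of[item[color_prop]]))
    return dict(sorted(blocks.items()))


grid = build_grid(papers, "year", "language", ["#1f77b4", "#ff7f0e", "#2ca02c"])
for block, squares in grid.items():
    print(block, squares)
```

Every item appears in exactly one block, which is what allows the whole collection, and not just a subset, to be depicted at once.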
By clicking on one of the rectangles (representing a document) in the grid,
a central window opens showing all the available bibliographic information
(title, author, volume, number of references, etc.) about that selected paper.
The system allows 25 combinations of “eligible properties” (5x5), of which
twenty relate two different properties (bivariates) and five represent the
collection in terms of a single attribute (univariates), where blocks and
colors coincide. Figure 1 shows the entire collection of documents from
Information Research using as our criteria the Year and the Number of
References. Each block corresponds to a Year and each rectangle
corresponds to a document colored according to the number of references
that it includes.
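The count of twenty-five combinations can be checked with a short enumeration. The sketch below simply pairs each of the five eligible properties with each of the five, and separates the pairs in which block and color coincide; the variable names are illustrative.

```python
from itertools import product

# The five eligible properties named in the paper.
eligible = ["Volume", "Issue", "Year", "Number of references (grouped)", "Language"]

# Each combination is (block property, color property).
combinations = list(product(eligible, repeat=2))
univariate = [(b, c) for b, c in combinations if b == c]
bivariate = [(b, c) for b, c in combinations if b != c]

print(len(combinations), len(bivariate), len(univariate))  # 25 20 5
```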
Second, once the collection has been presented in terms of the
combination of criteria or parameters, the system allows us to apply a
series of filters to locate documents that can help the user meet her
information needs. The documents corresponding to the filters are
highlighted in black. Specifically, three different types of filter are available:
author, subject and manual with field selection.
If we click the tab marked “by Author” (top left), we can write the name
of the author of the documents we seek or choose the author from the list
of all authors that have published in the journal. This second option should
be understood as a system query-builder (Figure 2).
Figure 2. Detail of filter “by Author”.
If we click the tab marked “by Subject” (top left), we can write the subject of
the documents we seek or choose the subject from the list of all subjects
dealt with by the documents in the collection. This second option should,
once more, be understood as a system query-builder (Figure 3).
Figure 3. Detail of Filter “by Subject”.
The manual filter – “filter paper” (in the right-hand column, mid-zone) –
allows a text to be filtered by the attributes or parameters of the document,
namely, Title, Citation, Year, Authors or Subject/s (Figure 4). The user can
choose which of these fields they want to filter for. If more than one filter is
selected, the OR operator functions between them. If the Author filter is
selected, we can also filter by the university to which the author is affiliated
or the city in which the author lives.
Figure 4. Detail of manual filter.
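The OR behavior of the manual filter can be sketched as follows. This is an illustration only: the function name `matches`, the field names and the case-insensitive substring matching are assumptions, since the text does not specify how Area compares the typed text with each field.

```python
def matches(paper, text, fields):
    """Return True if text occurs in ANY of the selected fields (OR operator)."""
    needle = text.casefold()
    return any(needle in str(paper.get(field, "")).casefold() for field in fields)


# Hypothetical records for illustration.
papers = [
    {"title": "Visual search interfaces", "authors": "A. Author",
     "subjects": "information retrieval"},
    {"title": "Library use", "authors": "B. Writer",
     "subjects": "visualization; libraries"},
    {"title": "Reading habits", "authors": "C. Visual",
     "subjects": "reading"},
]

# Filter on Title OR Subject/s; the Authors field is not selected.
highlighted = [p["title"] for p in papers if matches(p, "visual", ["title", "subjects"])]
print(highlighted)
```

The third record is not highlighted even though the text occurs in its author field, because that field was not selected.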
It should be pointed out that the first two filters (“by Author” and “by
Subject”) are not cumulative, so that every time we write something in the
corresponding box this overrides the previous filter. Once one of these two
filters is completed, if we change the attributes, the filter is maintained.
Moreover, once we have used one of those two filters, we can see what we
have typed using the query builder (clicking) as it will appear written in the
manual filter box.
Third, Area allows us to customize the visualization by giving users the
possibility of varying the colors (fixed, random or gradient mode) and thus
overcome any potential problems of color-blindness that users might suffer
from. It should also be stressed that it incorporates (left-hand column, mid-
zone) a support text which explains how to use the tool and an overview of
the data in the collection making up the journal or digital library web page.
Fourth, Area also includes the original location systems available on the
Information Research website. Thus, the filters offered by the tool can be
understood as a complement to the Google and Atomz searches offered by
the Information Research website.
Finally, unlike alternative visualization systems, the possibilities offered by
Area are not visually affected by the size of the collection represented. By
incorporating a grid that grows with the size of the collection, and
not depending on other systems such as 3-D or clusters, it avoids the
potential visual overlapping of information and the production of visual
noise when representing large quantities of documents. Area, as specified
in the technical description, is recommended for collections of up to 50,000
items (Nualart, 2014c).
3.2 Technical description
Area is a simple, small application coded in JavaScript, HTML and CSS,
which uses the jQuery and D3 libraries. The data files are stored in JSON
format and the application is accessible with a modern browser. When
visiting the Area website the client can download all the necessary files to
run the application entirely on the client side.
The implementation of this application faces two main constraints: the
number of items in the dataset and the dataset size. The first of these is
related to screen resolution, while the second is related to the amount of
RAM available on the client side. The performance tests conducted
suggest the use of collections that do not exceed fifty thousand items.
Area represents the metadata of a collection of items, allowing filtering and
the exploration of the contents of each item. Each time the application and
the data files are downloaded, the properties from the metadata schema
are analyzed. Whenever the number of possible values of a
property is not greater than a configured value, that property is added to
the group of eligible properties.
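The selection of eligible properties can be sketched as a count of distinct values per property compared against a threshold. The function name, the `max_values` parameter and the sample records are hypothetical; Area's own implementation is in JavaScript and may differ in detail.

```python
def eligible_properties(items, max_values):
    """Return the properties whose number of distinct values is <= max_values."""
    distinct = {}
    for item in items:
        for prop, value in item.items():
            distinct.setdefault(prop, set()).add(value)
    return [prop for prop, values in distinct.items() if len(values) <= max_values]


# Hypothetical records for illustration.
items = [
    {"year": 1995, "language": "English", "title": "Paper A"},
    {"year": 1995, "language": "Spanish", "title": "Paper B"},
    {"year": 1996, "language": "English", "title": "Paper C"},
]

print(eligible_properties(items, max_values=2))
```

With a threshold of two, the year and language qualify, while the title, which takes a different value for every item, does not.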
Area was tested in 2014 on desktops, laptops, mobile phones and tablets.
All were found to offer good interface responsiveness. However, small
screens need to use scroll and zoom in order to provide the same
experience as that on larger screens.
4. User evaluation test
To gain a better understanding of the potential of the visual exploration and
search of text collections with the Area tool, we undertook a web-based
survey.
The aim of the survey was to compare the text-based website with Area for
the presentation of collections of texts, specifically, scientific papers. To this
end, we addressed the following questions: Are users able to detect the
new features? Do users still prefer or require access to the existing
presentation? Are users able to understand the new features? Do users
feel confident and positive about using the new features?
The design of the experiment is based on the established technology
acceptance model (TAM) (Davis et al., 1989), and the task technology fit
(TTF) (Goodhue, 1995). TAM seeks to understand why people accept or
reject information technologies, whereas TTF says that technologies will be
used if, and only if, their available functionalities support the user’s
activities. As such, the focus is on the match between the user’s task needs
and the available functionalities of a given technology. The questions have
been designed following Taylor-Powell and Marshall (1996).
In the rest of this section we explain the data collection process: choice,
download and storage. Then we describe the demographics of the
participants. Finally, we explain in detail the content of the questionnaire
administered to the users. In the section that follows we discuss the results
of the evaluation.
4.1 Data Collection
We used the collection of papers in Information Research (IR), edited by
Prof. T.D. Wilson (http://www.informationr.net/ir). It has been published
since 1995, and as of November 2014 the journal has published 592
papers, in 74 quarterly issues, and 19 yearly volumes.
We selected the contents of Information Research to provide the corpus of
texts for this experiment for the two main reasons discussed above, namely
its status as an open access journal, published online under a Creative
Commons license and, because its academic papers constitute a controlled
collection of texts with a similar language register, structure and length,
giving the collection a homogeneous shape.
In designing Area we sought to provide most of the features that the
existing Information Research website offers. Indeed, for some features
Area redirects the user to the existing services on the website. This is the
case of Atomz search and the domain-restricted Google search. Other
features have been improved in Area, specifically, exploring the collection
by year, by language, by number of references per paper, by issue and by
volume, and exploring by subject and by author.
To obtain the data collection we harvested the contents from the journal’s
website. Papers have been published in different versions of HTML,
reflecting the evolution of the markup language since 1995 and changes
dictated by the publishers in the structure of the pages. We customized the
spiders to the non-homogeneous HTML structure of the corpus.
After cleaning the data and adding HTML entities for all special characters,
above all for authors’ names, we selected several metadata properties for
each paper. See Table 2 for a detailed list.
Metadata property | Type | In which page of the journal is this data? | Function (no. of different values)
Volume | integer | paper page + issue page | Eligible property (nineteen volumes)
Issue | integer | paper page + issue page | Eligible property (seventy-four issues and four values)
Year | integer | paper page + issue page | Eligible property (twenty years)
Number of references (grouped) | integer | paper page | Eligible property (seventeen groups)
Language | string | paper page + issue page | Eligible property & Searchable (three languages: English, Spanish, Portuguese)
Title | string | paper page + issue page | Searchable (592 values)
Authors, institution/country | string | paper page | Searchable (582 values)
Citation | string | paper page | Searchable (592 values)
Paper URL | URL string | paper page + issue page | External link (592 values)
Issue URL | URL string | paper page + issue page | External link (74 values)
Number of references | integer | paper page | Property (101 values)
Paper subjects | string | by-subject page | Property (400 subjects)
Individual author names | string | by-author page | Property (895 authors)
Table 2: Metadata properties: list and details
In line with the conditions described above, five properties were labeled as
being eligible: Volume, Issue number, Year, Grouped number of
references/papers, and Language. The remaining properties (eight) were:
Title, Authors with institutions and countries, Citation, Paper URL, Issue
URL, Number of references, Paper subjects, and Individual author names.
These valuable metadata from the contents of Information Research were
stored in JSON files and, like the rest of the code, have been published
under free licenses to allow others to reuse them.
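As an illustration of the cleaning and storage steps, the sketch below encodes special characters in an author name and serializes one record to JSON. The record, its field names and the use of numeric character references are assumptions: the text states only that HTML entities were added and that the data were stored in JSON files.

```python
import json


def to_entities(text):
    """Replace non-ASCII characters with numeric HTML character references."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# Hypothetical record; the field names are illustrative, not the published schema.
record = {
    "title": "An example paper",
    "authors": to_entities("J. Pérez"),
    "year": 2014,
    "language": "English",
}

print(json.dumps(record, indent=2))
```

Encoding the names this way keeps the JSON files pure ASCII, so they display consistently regardless of the encoding declared by the page that loads them.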
4.2 Questionnaire description
Online questionnaires are the most frequently employed method for
collecting quantitative data from users for statistical analysis.
Questionnaires allow the participation of an unlimited number of people and
can be used to gather data about users’ knowledge, beliefs, attitudes, and
behaviors. Online questionnaires also make it easier to protect the privacy
of participants.
The questionnaire comprised fourteen questions: seven demographic
questions and seven specific questions comparing the tasks and features of
the Information Research website and the Area website.
Eligible respondents of the questionnaire were any potential visitors of an
academic journal. Initially, participants were invited to visit the existing
Information Research website and the Area website in order to familiarize
themselves with them and so as to be able to answer the questions. In
order to find participants, open calls were sent out using mailing lists of
PhD and Master’s students.
5. Results
The questionnaire was answered by forty-four respondents, with thirty-
seven completing all the questions. One out of three respondents were
women and seven out of ten were between thirty and fifty years of age. All
the participants said they had either a good, very good or expert technical
knowledge of computers in approximately equal proportions. In line with
this, seven out of ten of the participants use web browsers several times a
day.
The attitude of the participants to the new features found on the websites
was positive: they like to find new features sometimes (56.82%) or often
(15.91%). Other answers were: no opinion (18.18%), and rarely (4.55%). In
contrast, almost half of the participants said they were happy (45.45%) with
the information tools and interfaces they use. Finally, more than half of the
participants (56.41%) have published scientific papers, and three out of
four read scientific papers on quite a regular basis.
We asked participants to compare several tasks completed with the
journal’s existing interface, on the one hand, and with Area’s interface, on
the other. To answer the questions we encouraged participants to visit both
sites and to familiarize themselves with their interfaces before they started
to complete the questionnaire. For all seven tasks, users preferred the new
interface. In six out of the seven, participants preferred Area for solving the
proposed tasks in 80% of cases.
6. Discussion
As is apparent from the results obtained, the Area visualization prototype
is, in the opinion of users, better than the conventional tool available on the
website of the Information Research journal.
In response to all the questions asked, Area obtained a more positive
response than that given to the classical visualization tool (in six of them
there was a roughly 80% preference and in the remaining task a 64.86%
preference).
Thus, Area was preferred by roughly 80% of the users for completing the following
tasks:
- Verifying the number of papers making up the collection.
- Identifying the number of papers addressing a specific
subject and their distribution in time.
- Obtaining an overview of the collection.
- Understanding the subjects addressed by the journal.
- Finding papers related to a user’s interests.
And 64.86% of users preferred it for:
- Exploring new topics and discovering new research in a
specific field.
These results can be attributed to the enhanced capacity of visualization
provided by Area compared to that provided by the classical resources of
information presentation included on the journal’s website. In the case of
the following functions: Explore by issue as a list of papers, Search with
Atomz, and Search with Google, Area redirects users to the resources on
the journal’s web page.
However, Area betters the classical visualization tool (included in the
interface of most journals and digital libraries) in several of its features
(Table 3). On the one hand, it incorporates new visualization features that
are not available in the classical proposal. For example, it allows the user
to visualize the whole collection in different ways depending on the two
properties and filters selected, and not just as a subset of the whole as in
classical systems. We have named this new function: Multiple overviews of
the collection. Area also provides rapid access to the quantitative
characteristics of the collection (Numbers of papers, Issues, Volumes,
Years, etc.), a function named: Numerical overview. Area also shows the
user how a subject is distributed over the history of the journal, as it allows
filtered papers to be marked during exploration, a feature we have named
Topic distribution. And, finally, Area allows the user to explore papers by
language and to see the evolution in this language, since language is an
eligible property and it can be represented in combination with the other
properties. We have named this new function: Explore by language.
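As a rough illustration of what the Numerical overview and Topic distribution functions compute, the following Python sketch counts papers by property and by year for a search term. It is not Area's implementation: the records, the field names (title, year, language) and both function names are hypothetical, and the real metadata file may be structured differently.

```python
from collections import Counter

# Hypothetical metadata records; the real Area metadata file may use
# different field names and formats.
papers = [
    {"title": "Visualization of search results", "year": 2012, "language": "English"},
    {"title": "Arquitectura de la información", "year": 2012, "language": "Spanish"},
    {"title": "Browsing digital libraries", "year": 2013, "language": "English"},
    {"title": "Text visualization and exploration", "year": 2014, "language": "English"},
]


def numerical_overview(papers):
    """Count papers overall and per value of two properties."""
    return {
        "papers": len(papers),
        "by_year": Counter(p["year"] for p in papers),
        "by_language": Counter(p["language"] for p in papers),
    }


def topic_distribution(papers, term):
    """Count, per year, the papers whose title contains the term."""
    term = term.lower()
    return Counter(p["year"] for p in papers if term in p["title"].lower())


overview = numerical_overview(papers)
distribution = topic_distribution(papers, "visualization")
print(overview["papers"], dict(overview["by_year"]))
print(dict(distribution))
```

Matching on the title alone is a simplification chosen for brevity; a filter over subjects or abstracts would follow the same pattern.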
How many papers have been published in the journal since the first issue?
Answer                   Count   Percentage
IR existing interface        2        5.41%
IR Area interface           31       83.78%
No difference                4       10.81%
No answer                    0        0.00%

How is the term “visualization” distributed in the history of the journal?
Answer                   Count   Percentage
IR existing interface        1        2.70%
IR Area interface           31       83.78%
No difference                5       13.51%
No answer                    0        0.00%

How many papers talk about visualization?
Answer                   Count   Percentage
IR existing interface        3        8.11%
IR Area interface           24       64.86%
No difference               10       27.03%
No answer                    0        0.00%

When exploring papers of the journal website: do you have a better overview
of the journal using the existing interface or the Area interface?
Answer                   Count   Percentage
IR existing interface        6       16.22%
IR Area interface           30       81.08%
No difference                1        2.70%
No answer                    0        0.00%

Understanding the topics and themes of the journal
Answer                   Count   Percentage
IR existing interface        4       10.81%
IR Area interface           31       83.78%
No difference                2        5.41%
No answer                    0        0.00%

Finding papers related to your personal interests
Answer                   Count   Percentage
IR existing interface        3        8.11%
IR Area interface           30       81.08%
No difference                4       10.81%
No answer                    0        0.00%

Exploring new topics and discovering new research in this field
Answer                   Count   Percentage
IR existing interface        5       13.51%
IR Area interface           29       78.38%
No difference                3        8.11%
No answer                    0        0.00%

Table 3: Task questionnaire
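The percentages in Table 3 are each answer's share of all responses to a question. As a minimal check, the following Python sketch recomputes them for one question (Understanding the topics and themes of the journal) from the counts reported in the table; the function name is an illustrative choice, not part of Area.

```python
# Answer counts reported in Table 3 for the question
# "Understanding the topics and themes of the journal".
counts = {
    "IR existing interface": 4,
    "IR Area interface": 31,
    "No difference": 2,
    "No answer": 0,
}


def percentages(counts):
    """Return each answer's share of all responses, rounded to two decimals."""
    total = sum(counts.values())
    return {answer: round(100 * n / total, 2) for answer, n in counts.items()}


shares = percentages(counts)
for answer, share in shares.items():
    print(f"{answer:<24}{counts[answer]:>3}{share:>8.2f}%")
```

The counts for this question sum to 37 responses, so 31 answers in favour of the Area interface correspond to 83.78%.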
On the other hand, Area improves certain functions that already exist in the
classical version. For example, it improves the Explore by Year, Issue and
Volume function by allowing multiple representations and visualization of
their evolution. It also improves the Explore “by author” and Explore “by
subject” functions by allowing filter-by-typing. Finally, Area improves the
function of identifying How many papers talk about a subject? by showing the
papers and their context.
7. Conclusions
These new visualization functions and the outcomes recorded allow us to
draw a number of conclusions.
The test conducted on Area provided positive responses to the questions
that we set out to answer: users detect and understand new features; users
prefer or require new ways of presenting information; and users feel
confident and positive about using the new features.
The simplicity and economy of the Area prototype should pave the way for
the widespread introduction of these visualization tools in the portals and
websites of journals and digital libraries. Because Area is not
implemented as a page independent of the basic search interface, users do
not perceive it as a secondary tool; nor does the prototype demand a level
of abstraction and conceptualization that would make it unintuitive to use.
Similarly, because Area bases its visualization power on the metadata file,
it is a non-intrusive system that only needs to be accessible from any point
in the network and that, once downloaded locally, allows interaction without
an Internet connection.
Unlike other prototypes that have been implemented only with small
collections of documents and in highly controlled experimental conditions,
Area has been implemented in a real-world context with the entire
collection of documents from a journal (not just with a subset of retrieved
documents). Therefore, the user-satisfaction results reported here cannot
be dismissed on the grounds of their having been obtained with a limited
collection or a limited number of documents. Finally, it should be stressed
that Area is a freely licensed tool that is readily implemented, which,
unlike other more abstract and expensive prototypes, facilitates its
adoption in journal and digital library sites.
References
Andrews, Keith, Gutl, Christian, Moser, Josef, Sabol, Vedran and Lackner,
Wilfried. "Search result visualisation with xfind." In User Interfaces to Data
Intensive Systems, 2001. UIDIS 2001. Proceedings. Second International
Workshop on, 50--58. 2001.
Andrews, Keith, Kienreich, Wolfgang, Sabol, Vedran, Becker, Jutta,
Droschl, Georg, Kappe, Frank, Granitzer, Michael, Auer, Peter and
Tochtermann, Klaus. "The infosky visual explorer: exploiting hierarchical
structure and document similarities." Information Visualization 1, no. 3-4
(2002): 166--181.
Baeza-Yates, Ricardo. "Tendencias en recuperación de información en la
web." Bid, no. 27 (2011): 1--4.
Bates, Marcia J. "The design of browsing and berrypicking techniques for
the online search interface." Online review 13, no. 5 (1989): 407--424.
Bauer, Sabine (2014). Interactive Visualizations for Search Processes. 5th
IEEE Germany Student Conference. University of Passau
Chalmers, Matthew and Chitson, Paul. "Bead: Explorations in information
visualization." In Proceedings of the 15th annual international ACM SIGIR
conference on Research and development in information retrieval,
330--337. 1992.
Cugini, John V, Laskowski, Sharon and Sebrechts, Marc M. "Design of 3D
visualization of search results: evolution and evaluation." In Electronic
Imaging, 198--210. 2000.
Davis, Fred D, Bagozzi, Richard P and Warshaw, Paul R. "User acceptance
of computer technology: a comparison of two theoretical models."
Management science 35, no. 8 (1989): 982--1003.
Egan, Dennis E, Remde, Joel R, Gomez, Louis M, Landauer, Thomas K,
Eberhardt, Jennifer and Lochbaum, Carol C. "Formative design evaluation
of superbook." ACM Transactions on Information Systems (TOIS) 7, no. 1
(1989): 30--57.
Fox, Edward A, Hix, Deborah, Nowell, Lucy T, Brueni, Dennis J, Wake,
William C, Heath, Lenwood S and Rao, Durgesh. "Users, user interfaces,
and objects: Envision, a digital library." Journal of the American Society for
Information Science 44, no. 8 (1993): 480--491.
Goodhue, Dale L. "Understanding user evaluations of information
systems." Management science 41, no. 12 (1995): 1827--1844.
Hearst, Marti A and Karadi, Chandu. "Cat-a-Cone: an interactive interface
for specifying searches and viewing retrieval results using a large category
hierarchy." In ACM SIGIR Forum, 246--255. Vol. 31, no. SI. 1997.
Hearst, Marti A. "TileBars: visualization of term distribution information in
full text information access." In Proceedings of the SIGCHI conference on
Human factors in computing systems, 59--66. 1995.
Hearst, Marti. Search user interfaces. Cambridge University Press, 2009.
Hienert, Daniel, Sawitzki, Frank, Schaer, Philipp and Mayr, Philipp.
"Integrating interactive visualizations in the search process of digital
libraries and IR systems." In Advances in Information Retrieval, 447--450.
Springer, 2012.
Hoeber, Orland and Yang, Xue Dong. "A comparative user study of web
search interfaces: HotMap, Concept Highlighter, and Google." In Web
Intelligence, 2006. WI 2006. IEEE/WIC/ACM International Conference on,
866--874. 2006.
Jones, Steve. "Graphical query specification and dynamic result previews
for a digital library." In Proceedings of the 11th annual ACM symposium on
User interface software and technology, 143--151. 1998.
Kim, Beomjin, Scott, Jon and Kim, Seung Eun. "Exploring digital libraries
through visual interfaces." (2011).
Lam, Heidi and Baudisch, Patrick. "Summary thumbnails: readable
overviews for small screen web browsers." In Proceedings of the SIGCHI
conference on Human factors in computing systems, 681--690. 2005.
Morville, Peter and Rosenfeld, Louis. Information Architecture. Sebastopol:
O'Reilly Media, Inc., 2007.
Nualart, Jaume (2014a). Area for Information Research, [accessed January
3, 2015] http://research.nualart.cat/area-ir/
Nualart, Jaume (2014b). Area repository. [accessed January 3, 2015]
https://github.com/jaumet/Area
Nualart, Jaume (2014b). Area stress. [accessed January 3, 2015]
http://research.nualart.cat/area-stress/
Nualart, Jaume and Pérez-Montoro, Mario. "Texty, a visualization tool to aid
selection of texts from search outputs." Information Research 18, no. 2
(2013).
Nualart, Jaume, Pérez-Montoro, Mario and Whitelaw, Mitchell. "How we
draw texts: A review of approaches to text visualization and exploration." El
profesional de la información 23, no. 3 (2014): 221--235.
Pérez-Montoro, Mario. "Arquitectura de la información en entornos web." El
profesional de la información 19, no. 4 (2010): 333--338.
Robertson, George G, Mackinlay, Jock D and Card, Stuart K. "Cone trees:
animated 3D visualizations of hierarchical information." In Proceedings of
the SIGCHI conference on Human factors in computing systems, 189--194.
1991.
Schatz, Bruce R, Johnson, Eric H, Cochrane, Pauline A and Chen,
Hsinchun. "Interactive term suggestion for users of digital libraries: Using
subject thesauri and co-occurrence lists for information retrieval." In
Proceedings of the first ACM international conference on Digital libraries,
126--133. 1996.
Shneiderman, Ben, Feldman, David, Rose, Anne and Grau, Xavier Ferré.
"Visualizing digital library search results with categorical and hierarchical
axes." In Proceedings of the fifth ACM conference on Digital libraries,
57--66. 2000.
Taylor-Powell, Ellen and Marshall, Mary Gladys. Questionnaire Design:
Asking questions with a purpose. University of Wisconsin-Extension
Cooperative Extension Service, 1996.
van Hoek, Wilko and Mayr, Philipp. "Is Evaluating Visual Search Interfaces
in Digital Libraries Still an Issue?" arXiv preprint arXiv:1408.5001 (2014).
Weiss-Lijn, Mischa, McDonnell, Janet T and James, Leslie. "Supporting
document use through interactive visualization of metadata." In
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries. 2001.
Wong, William, Chen, Raymond, Kodagoda, Neesha, Rooney, Chris and
Xu, Kai. "INVISQUE: intuitive information exploration through interactive
visualization." In CHI'11 Extended Abstracts on Human Factors in
Computing Systems, 311--316. 2011.
Woodruff, Allison, Faulring, Andrew, Rosenholtz, Ruth, Morrison, Julie and
Pirolli, Peter. "Using thumbnails to search the Web." In Proceedings of the
SIGCHI conference on Human factors in computing systems, 198--205.
2001.