Visual Topic Maps Layer between Document Collections and Learning Material


Abstract

This paper introduces a semantic layer, called visual topic maps, to build a bridge between annotated document collections and the use of these documents as learning material. The main components of a visual classification are metadata-based topic maps attached to documents that allow customization according to users' needs and profiles. The metadata of documents and visual topic maps are based on MPEG7 and its Semantic Descriptor, including additional attributes on instructional information. These generic metadata are complemented by domain-specific metadata based on the domain of the document collection. The paper explains the visual topic maps concept and sets it in the context of research on learning objects, metadata and educational modeling languages. The structure of visual topic maps and their metadata are then discussed in more detail, and the process of building and using visual topic maps is outlined.
Sachit Rajbhandari
FAO/NAiST Lab, Kasetsart
50 Phaholyothin Rd.
Bangkok, Thailand
Frederic Andres
2-1-2 Hitotsubashi,
Tokyo 101-8430, Japan
Asanee Kawtrakul
NAiST Research Laboratory,
Kasetsart University
P.O. Box 1212
Bangkok, Thailand
Categories and Subject Descriptors
H.4 [Information Systems Applications]: Miscellaneous;
I.7.5 [Document Capture]: Document analysis
General Terms
Management, Standardization
Keywords
Visual Semantic, Visual Topic Maps, ISO standard
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ICIKM ’08 Kathmandu, Nepal
Copyright 200X ACM X-XXXXX-XX-X/XX/XX ...$5.00.

1. INTRODUCTION
The context for the research presented in this paper is the “Memory of the Past” project being carried out at the National Institute of Informatics, Japan. This project aims at the collaborative building of a multi-lingual, multi-cultural digital memory of the past by school pupils, teachers and multi-disciplinary experts. A wide variety of techniques is employed to present the history: ancient maps and drawings are digitized, 3D animations of artefacts are constructed, historical documents are analyzed, and images and videos are recorded to show the present condition of historical sites. The resulting document collection needs to be semantically enriched and annotated with metadata to facilitate searching for documents. Currently, the documents are annotated using standardized metadata schemes such as Dublin Core and LOM (2002) from the IEEE Learning Technology Standards Committee (LTSC), and partly with specifically developed metadata schemes such as the ontology-based metadata schemes developed in [10].

From the perspective of eLearning, a project such as “Memory of the Past” delivers a very large archive of potential learning objects. The challenge that each individual teacher faces is to locate the appropriate information using a semantic search engine. This task is compounded by several factors: the sheer number of documents of various types and content, the distribution of documents and their metadata across multiple sites, the limitations of standardized metadata, the lack of context for standardized metadata, the limited time available to 'surf' and search for resources, the variety of languages connected with such a project, and possibly the lack of domain knowledge in highly specialized areas.

The research presented in this paper aims to address some of these issues in order to facilitate access to information in very large collections of documents relating to a common subject area, as in the example of the “Memory of the Past” project. The idea is to introduce a semantically rich layer, informally called 'visual topic maps', between document collection and learning material that links documents according to topics. One motivation behind this approach is to add a more focused semantic layer on top of the untargeted metadata that are commonly used to describe single documents. In an eLearning context, the visual topic maps build on learning objects and become information sources and building blocks for stories and learning material. The implementation was carried out using a semantic extension of TM4L [5], as TM4L is one of the most popular topic maps editors currently available.

In the remainder of the paper, Section 2 outlines the background of “Visual Topic Maps” and discusses the thinking behind the concept, whose main aspects are explained thereafter. The topic map implementation is then introduced in Section 3, followed by Section 4 on the authoring and use of topic maps. A summary and plans for future research conclude the paper in Section 5.
Visual topics are a form of semantics in which the knowledge base is driven by visual features. These characteristics support our idea of providing an interpretive, semantic layer on top of document collections that classifies these documents according to scope, context and constraints. The term 'visual topic' further matches well the call for 'subjective topic' or 'subjective metadata', which we outline in later sections of this paper.

For learners [11], visual topic maps support efficient context-based retrieval of learning resources [1]: textual topics are semantically limited compared to visual topics, which offer better awareness in topic-domain browsing at a higher semantic level. Further advantages relate to information visualization: customized views, adaptive guidance, and context-based feedback.

For instructors, visual topic maps improve the effectiveness of the management and maintenance of knowledge and information across several layers [3]. They provide better personalized courseware presentations using visual semantics, and they can help distributed courseware development and the reuse and exchange of learning materials. Finally, collaborative authoring of visual topic maps is possible, as shown in [4].
The Topic Maps model is defined by a resource algebra that handles topic maps produced by a semantic computing approach such as Latent Semantic Analysis [2]. The resource algebra is described in the remainder of this section.
Resource Algebra: The resource algebra uses the resources' domain data types. Resource semantic types and functions in the topic maps are represented directly using the data types and functions supported by the resource algebra. The algebra has two aims. First, it is the semantic interface between scientists, reducing the semantic gap and strengthening the metadata bridging between them. Second, this high-level semantic algebra facilitates collaboration among scientists using topic maps that integrate high-level semantics. Let us recall the notion of a many-sorted algebra [6]. Such an algebra consists of several sets of values and a set of operations (functions) between these sets. It is described by two sets of symbols, called sorts (e.g. topic, pdf, rtf, jpeg) and operators (e.g. tm transcribe, semantic similarity), which together constitute the signature of the algebra. A Second-Order Signature [7] is based on two coupled many-sorted signatures, where the top-level signature provides kinds (sets of types) as sorts (e.g. DATA, RESOURCE, SEMANTIC DATA) and type constructors as operators (e.g. set).
To illustrate the approach, we assume the following simplified many-sorted algebra as the topic map model implementation:
Type constructors
    → DATA              topic;
    → RESOURCE          pdf, rtf, htm, xml, cvs, jpeg, tiff;
    → SEMANTIC DATA     lsi sm, mpeg7 sm;
    → TOPIC MAPS        tm (topic maps);
Unary operations
    ∀ resource in RESOURCE.
        resource → sm: SEMANTIC DATA, tm        tm transcribe;
    ∀ sm in SEMANTIC DATA.
        sm → set(tm)                            semantic similarity;
Binary operations
    ∀ tm in TOPIC MAPS.
        (tm)+ → tm                              topicmaps merging;
        sm, tm → tm                             semantic merging;
    ∀ topic in DATA. ∀ tm in TOPIC MAPS.
        set(tm) × (topic → bool) → set(tm)      select;
The notation sm: SEMANTIC DATA means “some type sm in SEMANTIC DATA”. There is thus a type mapping associated with the tm transcribe operator: the operator determines the result type within the kind SEMANTIC DATA, depending on the given operand resource type. The semantic merging operation takes two or more operands that are all topic map values. The select operation takes an operand of type set(tm) and a predicate on topics, and returns the subset of the operand set that fulfils the predicate. From the implementation point of view, the resource algebra is an extensible library package providing a collection of resource data types and operations for domain-oriented resource computation (e.g. the agriculture field [8]). The major research challenge will be the formalization and standardization of cultural resource data types and semantic operations through ISO standardization.
Representation of Topic Maps: A topic map can be represented as a triple M = (T, G, r), where T is the set of topics, G is the graph representation of T, and r is a function called the “representation function”. The domain of r is a part of G, and the range of r is T. A topic is defined by a name, by properties, by links to learning resources and by metadata. The name is textual data. [...] on what the author can write. The author can use the narrative to state facts, provide interpretations, make comparisons, draw attention to particular points, or similar. The narratives of stories are structured into units and paragraphs. Within a unit, a line of argument is preserved. Units can be used to tell different aspects of a story.
Definition: A visual topic map is a topic map in which a topic name is not only a string but also a pointer to a multimedia object, so that its topics are visual topics. The semantic vectors related to resources are associated with each topic. Furthermore, links to learning object resources are the visual topic's resources. The purpose of these links is to relate the visual topic closely to the underlying documents using knowledge management systems or ontologies [9]. The documents can provide examples, illustrations, proof, further explanation, additional material or similar.
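To make the signature of the resource algebra more concrete, two of its operations can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' library: the class names Topic and TopicMap, the decision to unify topics by name when merging, and the reading of select as "keep every topic map that contains at least one topic satisfying the predicate" are all hypothetical choices made for the example. The tm transcribe and semantic merging operators are not modelled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Set


@dataclass(frozen=True)
class Topic:
    """Sort `topic` of kind DATA: a name plus links to resources (assumed shape)."""
    name: str
    resources: FrozenSet[str] = frozenset()


@dataclass(frozen=True)
class TopicMap:
    """Sort `tm` of kind TOPIC MAPS, reduced here to a set of topics."""
    topics: FrozenSet[Topic] = frozenset()


def topicmaps_merging(*maps: TopicMap) -> TopicMap:
    """(tm)+ -> tm: merge one or more topic maps.

    Assumption: topics with the same name denote the same subject, so they
    are unified and their resource links are joined.
    """
    if not maps:
        raise ValueError("topicmaps_merging needs at least one topic map")
    merged: Dict[str, Set[str]] = {}
    for tm in maps:
        for topic in tm.topics:
            merged.setdefault(topic.name, set()).update(topic.resources)
    return TopicMap(frozenset(Topic(n, frozenset(r)) for n, r in merged.items()))


def select(maps: Set[TopicMap], predicate: Callable[[Topic], bool]) -> Set[TopicMap]:
    """set(tm) x (topic -> bool) -> set(tm).

    Assumption: a topic map fulfils the predicate if any of its topics does.
    """
    return {tm for tm in maps if any(predicate(t) for t in tm.topics)}
```

Both classes are immutable and hashable, which is what allows topic maps to be collected in a set(tm) value and passed to select.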
Figure 1: Setting the language preference in TM4L
The concept of Visual Topic Maps was introduced in the context of the “Memory of the Past” project to enable organizing, maintaining, and using very large repositories of digitized images. The project's ambitious goal of building a rich, multi-lingual, multi-cultural digital memory of the past by users of different nationalities, backgrounds, and levels of expertise posed a number of requirements on the authoring environment for visual semantics, which was designed as an extension of the Topic Map Editor TM4L that supports visual topic maps. These requirements include implementing the user interface in various languages, enabling the handling of visual topics, and supporting the author in building a visual topic map. The environment includes the full functionality of TM4L, complemented with new features for supporting the creation and use of visual topic maps.
4.1 User interface internationalization
The user interface is currently implemented in more than 12 languages, including English, Spanish, German, French, Traditional and Simplified Chinese, Japanese, Nepali and Thai. While internationalization of a website or a tool with a simple interface is straightforward using the Java internationalization feature, as shown in Figure 1, the complication in this application comes from translating the object types predefined in English, which are used in a special way in the application. Examples are the predefined basic relationship types 'class-subclass', 'part-whole', and 'instance-of', which are used for building the topic hierarchy in the Topics panel.
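One common way to handle predefined types that must be both translated and used by program logic is to separate a stable identifier from its displayed label. The Python sketch below illustrates that idea only; it is not TM4L's implementation (TM4L is a Java tool), and the LABELS catalogue, the function names and the German translations are hypothetical and purely illustrative.

```python
from typing import Dict

# Stable, language-independent identifiers of the predefined relationship
# types named in the text. Application logic tests against these.
PREDEFINED_TYPES = ("class-subclass", "part-whole", "instance-of")

# Hypothetical label catalogue (illustrative translations, not TM4L's own).
LABELS: Dict[str, Dict[str, str]] = {
    "en": {
        "class-subclass": "class-subclass",
        "part-whole": "part-whole",
        "instance-of": "instance-of",
    },
    "de": {
        "class-subclass": "Klasse-Unterklasse",
        "part-whole": "Teil-Ganzes",
        "instance-of": "Instanz von",
    },
}


def display_label(type_id: str, lang: str, fallback: str = "en") -> str:
    """Return the label shown to the user; fall back if no translation exists."""
    if type_id not in PREDEFINED_TYPES:
        raise KeyError(f"not a predefined relationship type: {type_id}")
    return LABELS.get(lang, {}).get(type_id, LABELS[fallback][type_id])


def resolve_type(label: str, lang: str) -> str:
    """Map a localized label back to its stable identifier."""
    for type_id, text in LABELS.get(lang, {}).items():
        if text == label:
            return type_id
    raise KeyError(f"unknown label {label!r} for language {lang!r}")


def builds_hierarchy(type_id: str) -> bool:
    """Hierarchy building keys on the identifier, never on the displayed text."""
    return type_id in PREDEFINED_TYPES
```

With this separation, switching the interface language changes only what display_label returns, while the code that builds the topic hierarchy keeps working on the same identifiers.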
4.2 Handling visual topics
A visual topic map contains two types of topics: 'standard' topics and 'visual' topics. Visual topics represent concepts as images. A visual topic thus presents a repository image; it has the image file name as its primary name and a resource of type 'File Path' containing the path of the image file. In this way the primary topic name is not tied to any specific natural language. The term for the concept reified by the topic, translated into different languages, can be added as additional topic names, each scoped with the corresponding language topic. The use of scopes allows displaying only the names scoped with a theme specified by the user when visualizing the topic map. Concept names can thus be displayed in different languages by specifying different scopes.

Among the additions implemented in TM4L is the treatment of topic names and resources. In addition to the two resource types defined in the Topic Map standard, namely internal resources that contain text included directly in the topic map and external resources that specify URLs of web resources, we have introduced a third type. It resembles external resources in that it is not included in the topic map and is specified by its address; this address, however, is not a URL but the path of a file residing on the local machine where TM4L is installed. The file type is one of JPEG, GIF, or PNG.

Topics represent concepts. In a conventional topic map, a concept is reified with a topic, which is named with the term used to name the concept. In a visual topic map, a concept is reified with an image, which in turn is reified with a topic whose name is the name of the image file. Since a file name often does not reveal the semantics of the concept (image), it is very important for topic map authoring that the tool provides a means of displaying the image. In TM4L, topic name information is therefore displayed differently for standard and visual topics. If a topic is a standard topic, the topic name string is displayed; if it is a visual topic, the image represented by the topic is displayed in addition to the name (see Figure 2). By seeing the image, the author can identify the concept and subsequently add further names and/or annotate it.

Figure 2: View of the name information of a visual topic
Similarly, in the visualization of the topic map, topic names are displayed for all 'standard' topics, while icons of the images they represent are displayed for visual topics. For each icon, the corresponding full-size image can be displayed by selecting the “View image” option of the context-sensitive menu (see Figure 3).
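The handling of visual topics described in this subsection can be summarized as a small data structure. The following Python sketch is an illustration under assumptions rather than TM4L code: the Topic class, its field names, the make_visual_topic helper and the accepted file suffixes (.jpg, .jpeg, .gif, .png as stand-ins for "JPEG, GIF, or PNG") are hypothetical.

```python
from dataclasses import dataclass, field
from pathlib import PurePath
from typing import Dict, Optional

# Assumed suffixes for the three supported image types.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".gif", ".png"}


@dataclass
class Topic:
    primary_name: str
    # Additional names keyed by scope, e.g. a language such as "en" or "th".
    scoped_names: Dict[str, str] = field(default_factory=dict)
    # Resource of type 'File Path'; None for a standard topic.
    file_path: Optional[str] = None

    @property
    def is_visual(self) -> bool:
        """A topic is visual if it carries a local image path."""
        return self.file_path is not None

    def name_in_scope(self, scope: str) -> str:
        """Name to display for the given scope, else the primary name."""
        return self.scoped_names.get(scope, self.primary_name)


def make_visual_topic(image_path: str) -> Topic:
    """Create a visual topic whose primary name is the image file name."""
    path = PurePath(image_path)
    if path.suffix.lower() not in IMAGE_SUFFIXES:
        raise ValueError(f"not a supported image file: {image_path}")
    return Topic(primary_name=path.name, file_path=image_path)
```

Because the primary name is only a file name, a user interface built on such a structure would show the image itself for visual topics and use name_in_scope to pick the translated concept name for the language the user selected.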
4.3 Support in building visual topic maps
As already mentioned, the concept of visual topic maps was introduced to facilitate the structuring and use of large collections of images. Since the visual topics are represented in the topic map by the paths of the corresponding images, creating such a map by hand is time-consuming and tedious for the author. On the other hand, such a representation allows the automatic extraction of a draft of the topic map. In implementing this functionality we took into consideration the fact that in many cases images are already classified in subdirectories with meaningful names (indicating the scope of the images stored there). Thus, to support the author, we implemented the automatic creation of a draft topic map by recursive extraction of the structure of a specified file directory containing image files organized hierarchically in subdirectories. Figure 4 displays the dialog for extracting topics from a file directory. The author specifies the directory, the name of the relationship type to be used in building the topic hierarchy, and the root topic to which the extracted hierarchy should be attached. The extracted topics are added to the current (newly created or opened) topic map and, after it is reloaded, are displayed in the Topics panel (see Figure 5).

Figure 3: Topic Map visualization
The author can then use this draft map as a starting point to produce the desired map, for example by deleting unwanted topics, restructuring the topic hierarchy (by adding or deleting parent topics), annotating topics, and creating new relationships between topics. Figure 6 shows the visualization of the automatically extracted topic map.
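The recursive extraction of a draft topic map from a directory tree can be sketched as follows. This is a minimal Python illustration of the procedure described above, not the TM4L implementation: the function name, the tuple representation of associations, the accepted image suffixes, and the choice to attach the contents of the chosen directory directly to the root topic are assumptions made for the example.

```python
import os
from typing import List, Tuple

# Assumed suffixes for the supported image types (JPEG, GIF, PNG).
IMAGE_SUFFIXES = (".jpg", ".jpeg", ".gif", ".png")

Association = Tuple[str, str, str]  # (relationship type, parent, child)


def extract_draft_topic_map(
    directory: str, root_topic: str, relationship_type: str
) -> Tuple[List[str], List[Association]]:
    """Walk `directory` recursively and return (topics, associations).

    Each subdirectory becomes a standard topic and each image file a visual
    topic named after the file; every one is linked to its parent with the
    relationship type chosen by the author. Names are not disambiguated, so
    identical names in different folders would need extra handling.
    """
    topics: List[str] = [root_topic]
    associations: List[Association] = []

    def visit(path: str, parent: str) -> None:
        with os.scandir(path) as it:
            entries = sorted(it, key=lambda e: e.name)
        for entry in entries:
            if entry.is_dir():
                topics.append(entry.name)
                associations.append((relationship_type, parent, entry.name))
                visit(entry.path, entry.name)
            elif entry.name.lower().endswith(IMAGE_SUFFIXES):
                topics.append(entry.name)
                associations.append((relationship_type, parent, entry.name))

    visit(directory, root_topic)
    return topics, associations
```

The three parameters mirror what the author specifies in the extraction dialog: the directory, the root topic, and the relationship type. Files that are not images are skipped, so the draft contains only folder topics and visual topics.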
Figure 4: Automatic extraction of topics from a file directory
Figure 5: Visualisation of the topic map as Tree
Figure 6: Visualization of the automatically extracted topic map

The “Visual Topic Maps” concept introduces a new semantic layer between collections of learning objects and learning material. The topics link semantically related learning objects. As learning objects are by definition not restricted in size, the links from the visual topic maps refer to specific topics within the learning objects to guide the reader precisely to the relevant sections. Furthermore, visual topic maps provide the context and scope that are required for the specification and use of metadata. The visual topic map descriptions can be seen as rich metadata that annotate the referenced learning objects. Metadata specific to visual topic maps and domain-specific metadata allow topic maps to be customized according to the needs and scope of individual users. The visual topic maps themselves form new learning objects that provide annotation to thematically linked learning objects stemming from large document collections.

The metadata of visual topic maps are based on MPEG7 for the links to the multimedia documents of the referenced learning objects and on the MPEG7 Semantic Descriptor for each node of the topic map. These metadata are complemented by attributes for instructional information and story semantics. We envisage that the stories will be written by domain experts and used by teachers. The term “Visual” was chosen to express the desire to communicate via a medium, the visual, that is familiar to everyone and can therefore be easily authored and easily understood. The visual topic maps provide the mechanism required for expressing knowledge and interpretations in a natural way, while the visual topic map metadata deliver the precision needed for search and retrieval of both topics and underlying documents.

The visual topics concept aims at classifying large document collections such as those provided by the “Memory of the Past” project for the preservation of knowledge and culture. Work is currently under way to specify multi-dimensional metadata under the topic maps model and to collect related data from experts and pupils. The next step will be for experts and non-experts such as pupils to enrich visual topic maps so that the vast number of single documents or learning objects is set in context and becomes more easily accessible.
Acknowledgments
We would like to thank NII for its International Cooperation support and the Japanese Ministry of Education, Science and Technology for its support of the visual topic maps project under the Geomedia project.
References
[1] H. Allert, H. Dhraief, and W. Nejdl. How are learning
objects used in learning process? Instructional role of
learning objects in LOM. In Proceedings of
EDMedia2002, pages 40–41, Denver, Colorado, USA,
2002. AACE.
[2] F. Andres and M. Naito. Dynamic topic mapping
using latent semantic indexing. In IEEE International
Conference on Information Technology and
Applications, pages 220–225. IEEE, July 2005.
[3] F. Buendia, P. Diaz, and J. V. Benlloch. A framework
for the instructional design of multi-structured
educational applications. In Proceedings of
EDMedia2002, volume 5021, pages 40–41, Denver,
Colorado, USA, 2002. AACE.
[4] C. Dichev, D. Dicheva, and L. Aroyo. Using topic maps for
web-based education. In Int. J. of Advanced
Technology for Learning, volume 1(1), pages 1–7, 2004.
[5] D. Dicheva and C. Dichev. TM4L: Creating and
browsing educational topic maps. In British Journal of
Educational Technology - BJET, volume 37(3), pages
92–109, 2006.
[6] R. Güting. Gral: An extensible relational database
system for geometric applications. In Proc. of the 15th
Intl. Conf. on Very Large Databases, pages 33–44, 1989.
[7] R. H. Güting. Second-order signature: a tool for
specifying data models, query processing, and
optimization. In SIGMOD ’93: Proceedings of the
1993 ACM SIGMOD international conference on
Management of data, pages 277–286, New York, NY,
USA, 1993. ACM Press.
[8] A. Kawtrakul, C. Yingsaeree, and F. Andres. Semantic
tracking in peer-to-peer topic maps management. In
DNIS 2007, pages 54–69. Springer LNCS 4777, 2007.
[9] D. Noikongka, D. Thamvijit, A. Imsombut,
M. Suktarachan, S. Rajbhandari, F. Andres, and
A. Kawtrakul. A workbench for collaborative
ontological knowledge construction and maintenance
with authoring tools. In Proceedings of Ontolex
workshop (From Text to Knowledge: The
Lexicon/Ontology Interface), pages 98–107, 2007.
[10] T. Okamura, N. Fukami, C. Robert, and F. Andres.
Digital resource semantic management of Islamic
historical buildings: case study on Isfahan Islamic
architecture digital collection. In the International
Journal of Architectural Computing, volume 5 (2),
pages 356–373. Multi-Science Publishing Co Ltd, June 2007.
[11] C. Robert, F. Andres, and K. Veltman. Advances in
collaborative annotation in semantic management
environment. In IEEE/ACM ICDIM’2007, pages
339–344. IEEE, October 2007.