Shigeo Sugimoto
University of Tsukuba · College of Media Arts, Science and Technology
PhD
156
Publications
12,332
Reads
537
Citations
Publications (156)
Wikidata is evolving as the hub of Linked Open Data (LOD), with its language-neutral URIs and close adherence to Wikipedia. Well defined URIs help the data to be interoperable and linkable. This paper examines the possibilities of utilizing Wikidata as the means of a vocabulary resource for promoting the use of linkable concepts. Digital curation p...
Publishing findable datasets is a crucial step in data interoperability and reusability. Initiatives like Google data search and semantic web standards like Data Catalog Vocabulary (DCAT) and Schema.org provide mechanisms to expose datasets on the web and make them findable. Apart from these standards, it is also essential to optionally explain the...
Metadata application profiles (MAP) play a critical role in metadata interoperability. The Singapore Framework recommends publishing application profiles as documentation, with detailed usage guidelines aimed at maximizing reusability and interoperability. Authoring, maintenance, versioning, and ensuring the availability of previous versions...
In Japan, manga, anime and video games (MAG) are viewed as popular culture. Many MAG works are published across media, i.e. they share characters, story, universe, etc. Entities and relationships describing multimedia franchises are helpful for general audiences searching for MAG items. However, multimedia franchises are established by the cognition of audie...
Metadata application profiles act as a key element in interoperability of metadata instances. There are various accepted formats to express application profiles. Most of these expression formats such as RDF, OWL, JSON-LD, SHACL, and ShEx are machine-actionable, and formats like spreadsheets or web pages act as human-readable documentation. Due to l...
Background. There are many digital archives in cultural domains, but there is no well-established metadata model which covers both tangible and intangible cultural heritage. Neither is there a well-established metadata model applicable to building digital archives by aggregating existing cultural heritage information. Objectives. The objective of t...
Background. Multimedia franchises may have a single origin, but over time develop into a network of related creative works in various media formats such as film, novels, animation, and video games. A single entity to represent a whole franchise is often utilized on the Web, but the ability for existing bibliographic models to represent this entity,...
As Japanese pop culture spreads worldwide, digital libraries compiling information about works through representative media (manga, anime, video games) emerge. Some of these works may share the same story, characters or universe, thus being part of a conceptual instance which we call a transmedia work in this paper. Transmedia works are abstract en...
Purpose
Multiple studies have illustrated that the needs of various users seeking descriptive bibliographic data for pop culture resources (e.g. manga, anime, video games) have not been properly met by cultural heritage institutions and traditional models. With a focus on manga as the central resource, the purpose of this paper is to address these...
In the context of popular culture, a successful work or a work with broad cultural or scientific impact often prompts the publication of many derivative works across multiple formats and by multiple creators, works that share elements with the original work such as topics, characters or universes. We argue for reconceptualization of the “Superwork”...
Purpose
Provenance information is crucial for consistent maintenance of metadata schemas over time. The purpose of this paper is to propose a provenance model named DSP-PROV to keep track of structural changes of metadata schemas.
Design/methodology/approach
The DSP-PROV model is developed through applying the general provenance description stan...
Cultural Heritage Information (CHI) is a key feature which explains the content of a heritage and its value to society. Cultural heritage information is scattered among memory institutions, and this research focuses on accessing and organizing the informational content of heritage. The authors proposed a model-based information organization ap...
Cultural Heritage Information (CHI) is scattered among memory institutions, and connecting them together is an important issue for their continued discovery, access, and use. This study proposes a generalized model named Cultural Heritage in Digital Environments (CHDE), which enables the organizing of both tangible and intangible cultural heritage...
Typical digital archives of cultural heritage are a collection of digital resources created using original cultural resources, which may be tangible or intangible. These archives should have features originating from both libraries and museums, as they are a collection of digital copies, as well as a collection of cultural heritage resources. In ad...
Purpose: The purpose of this paper is to discuss provenance description of metadata terms and metadata vocabularies as a set of metadata terms. Provenance is crucial information to keep track of changes of metadata terms and metadata vocabularies for their consistent maintenance.
Design/methodology/approach: The W3C PROV standard for general proven...
Background: Sri Lankan memory institutions are currently not optimised to deliver Cultural Heritage Information (CHI) on the Web. Part of the issue is the lack of metadata aggregation. We propose an approach that would help aggregate Sri Lankan CHI with more contextual information in order to open up this information and make it more accessible.
Ob...
Cultural Heritage Properties (CHPs) around the world have been altered or destroyed due to various unforeseen factors, both natural and human-made. Consequently, as a preparedness approach around such disasters, documenting the CHPs is crucial to any efforts to repair, rebuild or relocate them. With advancements in digital technologies, integratin...
Many digital collections of cultural heritage resources, commonly referred to as digital archives, have been developed in Japan. This paper covers the multiple facets of digital archiving for a comprehensive understanding of Japanese activities and issues learnt since the 1990s, starting from the development of digital archives and related activiti...
Many different institutions create bibliographic data for manga, a style of Japanese comic. These institutions typically describe the same resources, but in ways that differ depending on their institution type. The sharing of this data would result in a more complete bibliographic data landscape for manga; the majority of this data, however, exists...
Cultural Heritage Properties (CHPs) around the world have been altered or destroyed due to various unforeseen factors, both natural and human-made. Consequently, as a preparedness approach around such disasters, documenting the CHPs is crucial to any efforts to repair, rebuild or relocate them. With advancements in digital technologies, integratin...
Introduction
We have experienced drastic changes in our information and knowledge infrastructure over the past few decades. As consumers of information resources, we find and access the resources from the internet. As producers of information resources, we create information resources in digital formats and provide access to them through the intern...
This paper presents a domain ontology for cloud archives, based in part on the PREMIS Editorial Committee ontology for the PREMIS Data Dictionary. Our ontology's design is based on a layered model of cloud computing where lower layers provide shared services to higher layers. It focuses on the submission of generic Submission Information Packages w...
This paper discusses some key issues for digital archives and metadata in a networked information environment to keep our community memory for the future. The paper is based primarily on the experiences and lessons learnt by the author from his research activities on metadata and digital archives. The author participated in a study group on digital...
Manga -- a Japanese term meaning graphic novel or comic -- has been globally accepted. In Japan, a huge number of manga monographs and magazines are published. The work entity defined in Functional Requirements for Bibliographic Records (FRBR) is useful to identify and find manga. This paper examines how to identify manga works in a set of...
The first Asian Summer School in Information Access (ASSIA 2013) was held between 22nd and 24th June, 2013 in Tsukuba, Japan. The summer school offered 9 lectures in Information Retrieval, Web Search, and related topics, along with two panel discussions and a poster session. This report describes a successful international summer school in Asia attracting a...
This paper is about a digital archive named Digital Dao-Fa Hui-Yuan (DDFHY) [1]. DDFHY is a collection of Fus and composed of 268 volumes. A Fu is a secret graphic symbol. The relationships among Fus are not well understood because of the huge number of Fus and the secrecy of Daoism. We analyzed the relationships and examined the distribution of al...
Manga has been very popular in Japan for many years and it is now gaining popularity worldwide. Given the new creation and display devices, the production and publication of manga are also changing. Once a manual art form, manga is becoming a digital art and so there are now emerging digital libraries of manga. This paper proposes a metadata-based pl...
Data integrity constraints are fundamental in various applications, such as data management, integration, cleaning, and schema extraction. In this paper, we address the problem of finding inclusion dependencies on the Web. The problem is important because (1) applications of inclusion dependencies, such as data quality management, are beneficial in...
Metadata is well recognised as one of the foundational components required in archiving and preservation of digital resources. It is crucial to select and/or combine metadata standards in accordance with requirements in an application domain and in the records lifecycle. Based on our previous research in which we have clarified the features of majo...
Purpose
The purpose of this paper is to examine the characteristics of managing records in a cloud computing environment and compare these with existing archiving models, exemplified by the open archival information system (OAIS) reference model.
Design/methodology/approach
The authors compare the functional entities in OAIS with a layered model o...
Dao Fa Hui Yuan is an important resource for the study of Daoism. It is a compilation of a number of Fus used in Daoism. A Fu is expressed as a complex graphical symbol composed of one or more constituent parts. This paper presents the Dao Fa Hui Yuan digital archive named Digital DFHY which is designed to help researchers explore relationsh...
An increasing number of organizations are using cloud computing to create and store digital records. To ensure safe storage and long-term preservation, standards for metadata are needed. This paper proposes a metadata application profile for cloud archives.
We use guidelines from the Singapore Framework for Dublin Core Application Profiles to defin...
In this paper, we define some of the characteristics of archiving in a cloud computing environment. Based on these, we describe a model for a cloud archiving system using concepts and information types from the OAIS reference model. The proposed model allows the sharing of functionality and information objects by making these available as services...
Metadata is one of the keys for digital archiving and preservation. This is well recognized as an important issue in our networked information society. There are several standards for archival and preservation metadata, e.g. ISAD(G), EAD, AGRkMS, PREMIS, and OAIS. This leads to selection and interoperability issues for metadata standards in the des...
Today, publishing information on Web sites is common, and the volume of Web content that needs to be managed is increasing. It is therefore important to maintain content integrity on the Web. This paper proposes a system to maintain the content integrity of Web sites without backend databases. First, we explain the architecture of the proposed...
This paper presents an experimental study of the automatic correction of broken (dead) Web links focusing, in particular, on links broken by the relocation of Web pages. Our first contribution is that we developed an algorithm that incorporates a comprehensive set of heuristics, some of which are novel, in a single unified framework. The second cont...
This paper addresses the problem of finding new locations of moved Web pages. We discuss why the content-based approach has a limitation in solving the problem and why it is important to exploit the knowledge on where to search for the pages.
Inclusivity – how to be inclusive of the privileged as well as the not so privileged; diversity – how to manage the diverse cultures and systems; convergence – how do we discover the unifying thread of flowing together – these are the issues this Round Table will address. This Round Table will focus on the characteristics, requirements, challenges a...
PageChaser is a system that monitors links between Web pages and searches for the new locations of moved Web pages when it finds broken links. The problem of searching for moved pages is different from typical information retrieval problems. First, it is impossible to identify the final destination until the page is actually moved, so the index-ser...
Archiving Web content is an important topic for digital libraries. Usually, Web archiving systems can freely collect, preserve and provide resources. However, Web archiving systems have disadvantages: it is difficult to collect all versions of a resource. We proposed a Web archiving system which is designed to collect resources in accordance with a resource...
Wrapping of Web sources is known to be one of the key tasks in information integration problems. This paper proposes Wraplet, a wrapping language for extracting structured data from Web contents written in HTML. Unlike existing solutions, Wraplet is designed as a lightweight language in which users can write scripts for wrapping easily with text ed...
The purpose of this paper is to clarify the temporal aspect of terminology focusing on the dictionary's impact on terms. We used women's studies terms as data and examined the changes of their values of five automatic term recognition (ATR) measures before and after dictionary publication. The changes of precision and recall of extraction based on...
Content Management Systems (CMS) are widely used by organizations to publish information, to keep transactions and records, and so on. However, in general, CMS in use today do not offer the required level of functionality for an organization that needs to ensure safe, legally compliant records management. It therefore becomes necessary to transfer...
The digitizing program of the French National Library
A growing number of people are using the Web to access English-language resources, among other things. In Asian countries, for example, many people want access to English texts. Many Asians are not as competent reading English as they may be in the intellectual content of their domain. The problem of accessibility to English texts is significant si...
Many communities provide Web resource directories to help users find useful resources in the community. A typical example is a resource directory in a homepage of a local government. Crosswalk of the directories of neighboring communities is a crucial function for users to collect useful resources from the communities. However, an appropriate schem...
We have developed a content construction system for collaborative learning based on a Wiki, a Web-based collaborative document authoring system. Requirements for such a tool are explained in three aspects (usability, content construction, easy and secure operation) and implemented accordingly. Experiment on usability of the system interface sugges...
Metadata schema registries have great potential to enhance usability and reusability of metadata schemas. Application profiles are a key concept for Dublin Core, and have a crucial role in promoting reuse of metadata schemas. This paper discusses basic concepts and models of metadata schemas, in order to clarify functional requirements for extendin...
In this paper we consider the suitability of the Functional Requirements for Bibliographic Records (FRBR) model for an inclusive information environment. The AccessForAll approach to digital information asserts that every user has an equal right to information resources. The FRBR study identified bibliographic records users and, for them, determined...
Today, more and more people in knowledge communities, like research laboratories, use shared file servers to store and share their information. People in such communities often work together and their files stored in a file server have relationships with each other. Information on the relationships is usually exchanged offline and used implicitly t...
Subject vocabularies are crucial components for subject gateways. From our experiences on subject gateway projects that are designed for regional- and domain-specific communities, we learned that subject gateways require a subject vocabulary that is reasonably small and tailored in accordance with the resources in the domain. The goal of this paper...
Recently, data exchange between different information sources has increased its importance, and many tools to help data exchange have been proposed. However, there has been no established method to evaluate the effectiveness of such tools. If you would like to evaluate the query execution performance of an RDBMS, we have various benchmarks, such as...
The International Conference on Asian Digital Libraries (ICADL) was born in Hong Kong in 1998 and hosted in Taipei (1999), Seoul (2000), Bangalore (2001), Singapore (2002), Kuala Lumpur (2003), Shanghai (2004) and Bangkok (2005). ICADL 2006, held in Kyoto, Japan, was the 9th of the ICADL series. ICADL has been recognized as an important event for the digit...
Metadata has been widely recognized as an important issue in digital libraries in many aspects. This report briefly describes models and frameworks of metadata schemas developed through metadata-centric research projects at University of Tsukuba, which are a few subject gateways and a few metadata schema projects primarily based on Dublin Core and...
We are developing a software tool to support schema matching for data transformations. Schema matching is the process of finding relationships between components of two given database schemas. The tool is unique in that it first extracts conceptual schemas from the two database schemas and allows the user to use the extracted conceptual schemas as...
We are developing a software tool that finds destinations (new URLs) of Web pages after pages are moved. A key feature of the tool is that it tries to find "reliable Web links," which are links that are always kept up to date. We believe this is a new approach in finding new URLs for Web pages. This paper explains how the tool works internally and shows some r...
The long-term preservation of digital resources is one of the most important issues facing the library community. In particular, libraries need a preservation strategy for digital objects, since digitization alone provides access but not preservation. The digital library community is also focusing on the problem of designing and implementing long-t...
Archiving Web content is an important topic for digital libraries and especially for deposit libraries. Web archiving systems usually collect Web resources using search robot software and/or by human labor. However, these resource gathering methods have disadvantages: for example, it is difficult to collect all historical versions of a resource or...
The DCMI metadata schema registry has been developed as an authoritative source of DCMI metadata terms. The DCMI registry has an important role to enhance semantic interoperability of the metadata terms. From our experiences in the development of the DCMI registry, we have learned that the registry has large potential to serve as a center of variou...
Metadata technologies and their standardization have been developed in many fields such as library science, broadcasting, etc., and they are already in practical use. Content with metadata is easier to manage and discover than content without it. With the growth of the Internet, metadata has become a more important function in the eff...
The Internet has a lot of rich information resources available in different languages aside from English, the major language of use. Those resources are, however, not easy to access because of language barriers. This paper proposes a collaborative model to develop a multilingual subject gateway, which is called the Internet Public Library Asia (IPL...
The Dublin Core metadata element set has been widely adopted by cultural and scientific institutions, libraries, governments, and businesses to describe resources for discovery on the Internet. This paper provides an overview of its history and underlying principles and describes the activities of Dublin Core Metadata Initiative (DCMI) as an organi...
This paper describes the International Conference on Dublin Core and Metadata Applications 2001 (DC-2001), the ninth major workshop of the Dublin Core Metadata Initiative (DCMI), which was held in Tokyo in October 2001. DC-2001 was a week-long event that included both a workshop and a conference. In the tradition of previous events, the workshop pr...
The Dublin Core Metadata Initiative is currently designing and implementing a registry for managing its official element and qualifier definitions over time and in multiple languages. Three aspects of this registry must be versioned: the schema that describes the elements and qualifiers, which may be accessed by software or Web agents; sets of elem...
With the recent Internet expansion, persons all over the world can access more and more document databases. As Unicode has become more popular, the environment for multilingual retrieval has been improved to some extent. However, there are still numerous problems to be solved, such as the multilingual input and display. This paper proposes a system...
This paper describes a multilingual metadata schema registry which has been developed at the University of Library and Information Science in Tsukuba, Japan. The registry currently has reference translations of Dublin Core Metadata Element Set written in 22 languages, DC Qualifiers and the DCMI Type Vocabulary written in English and Japanese,...
Selection and metadata issues which surround the preservation of digital information are discussed, in particular, the assignment of "collection levels" to Web materials to ensure preservation, and some Preservation Metadata Element Sets (PMES) which have been identified as informed by the Open Archival Information System (OAIS) Reference Model. A...
With the recent Internet expansion, persons all over the world can access more and more document databases. As Unicode has become more popular, the environment for multilingual retrieval has been improved to some extent. However, there are still numerous problems to be solved, such as the multilingual input and display. This paper proposes a system...
Digital Library System at University of Library and Information Science
The Internet enables people to share documents written in various languages worldwide. Many documents on the Internet are provided by the WWW. Most of them are marked up with HTML tags. The tags which indicate document elements are very useful for full-text retrieval. The author considers that a full-text retrieval system for tagged multilingual do...
The World Wide Web (WWW) covers the globe. However, the browsing functions for documents in multiple languages are not easily accessed by occasional users. Functions to display and input multilingual texts in digital libraries are clearly crucial. Multilingual HTML (MHTML) is a document browser technology for multilingual documents on the WWW. The...
Multilinguality is an important aspect of the digital library and its international access to and sharing of global information. The Web has expanded very rapidly worldwide. A document on a foreign site can easily be accessed using an off-the-shelf browser. However, that browser is usually capable of showing only documents written in English and a local...
Along with the expansion of the Internet, the worldwide network has become a reality, and information from every country and in every language is shared. This information is written in various languages, and therefore multilingual display and retrieval capabilities are indispensable functions. Against this background, we believe that a dat...
http://www.tulips.tsukuba.ac.jp/mylimedio/dl/page.do?issueid=1000066&tocid=100084704&page=75-82
Folk tales are part of every nation's cultural heritage. They are a way to share material, discover another culture, and learn its language. This article discusses a multilingual collection of Japanese old tales developed on a multilingual browsing tool for HTML texts. Every tale contained in the collection is written in English, French, and Japane...
Folk tales are an important heritage of every nation. Electronic text collections of folk tales are meaningful information resources for people who wish to learn about foreign cultures and their languages. This paper describes an electronic text collection of old folk tales which was developed using a multilingual document browsing system called th...
The digital library, widely recognized as an important application in global and national information infrastructures, is an integration of advanced information technologies. Activities related to the digital library and the information technology to enhance accessibility to information in the digital library are surveyed. The paper describes five...