HDL IP Cores Search Engine based on Semantic Web
Technologies
Vladimir Zdraveski1, Milos Jovanovik1, Riste Stojanov1, Dimitar Trajanov1
1 Faculty of Electrical Engineering and Information Technologies Skopje,
Rugjer Boskovic bb, 1000 Skopje, Republic of Macedonia
vladimir.zdraveski@feit.ukim.edu.mk, milos@feit.ukim.edu.mk, ristes@feit.ukim.edu.mk,
dimitar.trajanov@feit.ukim.edu.mk
Abstract. A few years ago, the System on Chip idea grew rapidly and 'flooded'
the market of embedded systems. Many System on Chip designers started to
write their own HDL components and made them available on the Internet. The
idea of searching for a couple of pre-written cores and building your own
System on Chip only by connecting them seemed time saving. We have
developed a system that enables a semantic description of VHDL IP
components, allows search for specific components based on an unambiguous
semantic description, and works with prebuilt VHDL IP cores. We present an
application built around the system and focus on the benefits the application
user gains during the process of System on Chip design.
Keywords: HDL, Semantic Web, VHDL, System on Chip, Components,
Search, Composition.
1 Introduction
Embedded systems give intelligence to many devices that we use in everyday life:
they are found in everything from mobile phones and MP3 players, cars and home
appliances, to complex controllers. The continuous progress of semiconductor
technology has made it possible to implement complex systems on a single chip,
which has led to new challenges in design methodologies. System on Chip (SoC) is a
complex integrated circuit, or integrated chipset, which combines the major functional
elements or subsystems of a complete end product into a single entity.
The design of SoC would not be possible if every design started from scratch. In
fact, the design of SoC depends heavily on the reuse of Intellectual Property blocks -
which are called "IP cores". IP reuse has emerged as a strong trend over recent years
and has been one key element in closing what the International Technology Roadmap
for Semiconductors calls the "design productivity gap" - the difference between the
rate of increase of complexity offered by advancing semiconductor process
technology, and the rate of increase in designer productivity offered by advances in
design tools and methodologies [1]. IP reuse is an important way of enhancing designer
productivity, and it has a dramatic impact on it. It also provides a mechanism
for design teams to create SoC products that span multiple design disciplines and
domains. The availability of both hard (laid-out and characterized) and soft
(synthesizable) IP cores from a number of IP vendors allows design teams to drop
them into their designs and thus add a required functionality to an integrated SoC. In
this sense, the advantages of IP reuse go beyond productivity - it offers both a large
reduction in design risk, and also a way for SoC designs to be done that would
otherwise be infeasible owing to the length of time it would take to acquire expertise
and design IP from scratch.
Soft IP cores are usually written in a Hardware Description Language (HDL)
such as VHDL [2], Verilog, SystemC or SystemVerilog. Following the trend of
open source development, there is a large number of available open source HDL
components. In order to design a complete SoC, one should interconnect many single
VHDL components, spending a lot of time on compatibility analysis. Today's
HDL search tools are generally statistics-based, so the process of searching for a specific
component is quite difficult. For instance, it is not a trivial task to search for an 8-bit
counter with one clock pin and one chip-enable pin. Instead, one should download, open
and analyze many HDL projects, before deciding whether the IP core matches or not.
In order to solve this problem, there is a need for a more meaningful description
of the attributes and the function of an HDL component. One way to accomplish
this is to add a semantic description to the HDL component. Semantic
annotation of HDL files opens many new opportunities for
improving the storage and search processes of HDL search engines.
Semantic information gives the machine the ability to reason about the data and to do
much more of the work than it did previously. Instead of storing HDL
information simply as a text file, with the use of semantic web technologies the
machine can understand more about the kind and interface of an HDL component. That
knowledge enables automated search by I/O interface and component type, and further
composition of an entire SoC from HDL IP cores.
In order to test the idea in practice, we developed a semantic extension of VHDL
and designed a system for it. The system has a module for automatic VHDL
annotation, a module for manual semantic annotation, a storage for the annotated
semantic data, and modules for search and composition of HDL IP cores.
2 Related Work
2.1 HDL Repository Web Portals
There are numerous open source HDL code projects and a few web portals (groups,
environments) that enable storage and search of HDL projects. One of them is "Open
Cores" [3], where existing IP cores can be found and downloaded. The search process
requires the user to search only by name or by the type of the IP core needed.
Although there are thousands of projects in this repository, it is quite difficult to find a
specific IP component that contains specific ports (e.g. an 8-bit buffer).
Another example is the Java optimized processor's group [4], where a project for
building a processor from scratch and optimizing it to execute Java instructions is
shown. But, here the user faces the same problem: despite the fact that this project
contains many IP cores, e.g. 8-bit buffers, finding a specific one is not so
straightforward. To find a specific core, one should search through all the folders and analyze
files one at a time, which takes a lot of time and sometimes ends unsuccessfully.
There are also many other similar examples [5][6][7][8][9].
Also, there are some plug-ins for programming environments that enable insertion
of pre-made IP cores, such as standard types of memories, buffers, counters, etc. For
instance, Xilinx ISE [10] has its own library, which makes it quite user-friendly. But,
the same problem of finding a specific component exists.
2.2 Semantic Search Systems
On the other hand, the technologies of the Semantic Web allowed development of
novel approaches to data storage and retrieval. A semantic search system is essentially
an information retrieval system which employs semantic technologies in order to
enhance different parts of the information retrieval process. This is achieved by
semantic indexing and annotation of content, query expansion, filtering and ranking
the retrieved information [11]. Semantic search also introduces additional
possibilities, such as search for online ontologies [12], search over online (distributed)
knowledgebases, retrieval of facts from the ontology and knowledgebase, and question
answering.
A recent survey showed that there is a diversity of approaches in semantic search
systems, based on the following categories: search goals, scope, ontology encoding,
knowledge richness, user input, architecture, and search phrase support [11].
From the viewpoint of search goals, semantic search can be classified into
information retrieval, data retrieval [13], question answering [14] or ontology
retrieval [12]. The scope of a semantic search can be the Web [15][16][17][18],
desktop search [19][20] or domain repositories [21]. The encoding format of the
ontologies used can be a proprietary format [22], OWL or RDFS as open standards
[13][20], or even some other format [23][24][25]. In order to enhance semantic
search over traditional, statistical and syntactical search, researchers initially
focused on the use of thesauri and taxonomies [26][27]. In recent years, with the
development of semantic web technologies, it has become possible to use richer and more
complex knowledge structures, such as classes, instances, object properties, relations,
axioms, etc. [14]. Based on the user interactions required by the system, the different
approaches fall into one of the following categories [11]: simple keyword-based entry
into a text field [18], natural language sentences [14], graphical taxonomy/ontology
browsing [23][28], multi-optional specification of search parameters [27], use of a
formal ontology query language [29][30], and interactive search with explicit user feedback
[31].
From the perspective of the architecture of semantic search systems, most of the
developed applications are domain-specific intranet or desktop applications, built over
semantically annotated data from a certain domain [20][24]. The main reason for this
is the most common problem the semantic web researchers face: the lack of
semantically annotated data on the Web. Still, there are other types of applications
which use different semantic techniques on top of existing standard search engines
[18].
3 Overview of HDL IP Cores Search Engine
Our approach is to use information retrieval for desktop search over a local, domain
ontology-based knowledgebase. We use an OWL domain ontology and an RDF [33]
knowledgebase. The documents needed for the application (HDL IP cores) are
semantically annotated with concepts from the domain ontology. The architecture of
the application puts it in the category of stand-alone web applications that use their
own data repository.
3.1 HDL Ontology
For the purpose of the system, we designed the HDL ontology. Although we
present this ontology from a VHDL perspective, it can be used for the classification of
any kind of hardware units, chips, etc.
The ontology contains specifications of VHDL components and many classes that enable
an intuitive classification of the different, commonly used VHDL
components written by different authors. Furthermore, it defines predicates and
relations that can be used to specify the hierarchy in the RDF description.
The HDL ontology was designed using the Protégé editor [32], shown in Fig. 1.
The ontology is used to classify and annotate all of the VHDL components in order to
store the details of the users' source code into the system.
Fig. 1. VHDL ontology in Protégé.
A component could be classified as Counter, Register, CPU, etc. There is information about mode type,
data type and semantic type of all the ports. The semantic type describes whether the pin/port is data,
control, enable, etc.
The ontology covers relations and classes inside the VHDL code. It allows
description of the entity, its ports, their data type, length and mode. A generic section
is also considered. In most of the classes, an Other subclass was nested in order to
classify all hardware components not covered elsewhere.
Besides VHDL mapping, the ontology also includes additional metadata. There is a
semantic kind of the component (to specify whether it is an Adder, a Buffer, a CPU,
etc.), author info and frequency specification. Ports are described with the
SemanticType property, which allows the user to assign semantic meaning to a port.
Thus, the user can define a port as data, control, enable, clock, etc. This additional
information adds novelty to the storage and search engines for HDL components, and
brings benefits to the end user. By semantically annotating the different components of
a VHDL solution, the user can search for a component that matches a specific
pattern, at the level of the port and pin interface, kind and working frequency.
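As an illustration of how such an annotation could look inside the knowledgebase, the following sketch builds a small RDF description of an 8-bit counter with a clock port, using the Jena API on which the system is built [34]. The namespace and the class and property names (hdl:Counter, hdl:hasPort, hdl:semanticType, etc.) are our own illustrative assumptions rather than the exact terms of the published ontology, and the package names correspond to the Jena 2.x releases available at the time.

import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;
import com.hp.hpl.jena.rdf.model.Resource;
import com.hp.hpl.jena.vocabulary.RDF;

public class ComponentAnnotationSketch {

    // Hypothetical ontology namespace; the published ontology may use a different IRI.
    static final String HDL = "http://example.org/hdl#";

    public static Model describeCounter() {
        Model model = ModelFactory.createDefaultModel();
        model.setNsPrefix("hdl", HDL);

        // The component itself, classified as a Counter working at 50 MHz.
        Resource counter = model.createResource(HDL + "counter8");
        counter.addProperty(RDF.type, model.createResource(HDL + "Counter"));
        counter.addProperty(model.createProperty(HDL, "frequencyMHz"),
                            model.createTypedLiteral(50));

        // One 1-bit input port whose semantic type marks it as a clock pin.
        Resource clk = model.createResource(HDL + "counter8_clk");
        clk.addProperty(RDF.type, model.createResource(HDL + "Port"));
        clk.addProperty(model.createProperty(HDL, "mode"), "in");
        clk.addProperty(model.createProperty(HDL, "length"), model.createTypedLiteral(1));
        clk.addProperty(model.createProperty(HDL, "semanticType"), "clock");
        counter.addProperty(model.createProperty(HDL, "hasPort"), clk);

        return model;
    }
}

A description of this shape is what the annotator stores in the repository and what the search engine later matches against the user request.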
4 Solution Description
As shown in Fig. 2, the HDL IP cores search system consists of a presentation layer
(developed using JSP technology), a business layer (developed in Java) and a
Jena-based data storage [34][35].
Fig. 2. System architecture: a JSP presentation layer; a business layer comprising the
semantic annotator, the search engine and the composer; a Jena-based API; and a Jena
RDF repository together with the VHDL ontology.
Fig. 3. Port semantics.
The business tier includes the semantic annotator, the search engine and the
composition engine. The core of the system is the Jena Framework, a Java framework
which allows the use of the technologies of the Semantic Web. We also use the Jena
repository, which serves as the data storage.
4.1 Semantic Annotator
This module is used for uploading VHDL files to the server. The server parses the
files, extracts the needed HDL information of the component and allows adding an
additional semantic description (Fig. 3). We modified the Hardware-Vhdl-Parser [36]
so that it converts the VHDL entity into its appropriate RDF representation.
Fig. 4. Nested RDF description.
Fig. 5. Search for a VHDL component.
In order to create self-contained, semantically described VHDL documents, this
module allows embedding of RDF code directly into the VHDL file, as a VHDL
comment. If such a comment exists in the loaded VHDL file, the application reads it
directly and does not show the form shown in Fig. 3. An example of a nested RDF
description within a standard VHDL file is shown in Fig. 4.
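The exact comment markers used for the nested description are not reproduced here, so the sketch below assumes a simple convention: the RDF/XML description is written one VHDL comment line ("-- ...") at a time, starting at an <rdf:RDF ...> tag and ending at the closing </rdf:RDF> tag. Under that assumption, a Jena-based reader could recover the embedded model roughly as follows.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.StringReader;

import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;

public class EmbeddedRdfReader {

    /** Extracts the RDF/XML nested in VHDL comments and parses it into a Jena model,
     *  or returns null when the file carries no embedded description. */
    public static Model readEmbeddedRdf(String vhdlPath) throws IOException {
        StringBuilder rdf = new StringBuilder();
        boolean inside = false;
        try (BufferedReader in = new BufferedReader(new FileReader(vhdlPath))) {
            String line;
            while ((line = in.readLine()) != null) {
                String text = line.trim();
                if (!text.startsWith("--")) continue;        // only look at VHDL comments
                String body = text.substring(2).trim();      // strip the leading "--"
                if (body.startsWith("<rdf:RDF")) inside = true;
                if (inside) rdf.append(body).append('\n');
                if (body.endsWith("</rdf:RDF>")) break;
            }
        }
        if (rdf.length() == 0) return null;
        Model model = ModelFactory.createDefaultModel();
        model.read(new StringReader(rdf.toString()), null, "RDF/XML");
        return model;
    }
}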
4.2 Repository
The data is stored with the use of the Jena Framework libraries. The Jena Framework
stores the data in a graph-like structure, instead of the common database approach
which uses strictly formed tables. This graph-like structure is commonly known as an
RDF Repository. On top of the Jena Framework, we wrote our own API that provides
functionality tailored to the VHDL structure. The API provides ontology and
data access and is used by the upper layers of the system.
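Reference [35] points to Jena TDB, so one plausible shape of this layer is a file-backed TDB dataset into which every newly annotated component model is merged. Whether the deployed system uses TDB or an in-memory model is not stated here, so the snippet below is only a sketch of one possible configuration.

import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.tdb.TDBFactory;

public class RepositorySketch {

    /** Merges a freshly annotated component description into the persistent RDF repository. */
    public static void store(Model componentDescription, String storeDirectory) {
        Dataset dataset = TDBFactory.createDataset(storeDirectory);  // file-backed RDF store
        try {
            dataset.getDefaultModel().add(componentDescription);     // add the new triples to the graph
        } finally {
            dataset.close();                                         // release the store
        }
    }
}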
4.3 Search Engine
The search engine is a Java application that uses the Jena API for querying the RDF
repository. It is at this point that the semantic annotations of the VHDL components
are used to increase the effectiveness, in terms of precision, over the classic ways of
search and retrieval. The search is made by matching the semantic concepts
specified in the user request against the semantic annotations of the available components
from the RDF repository of the system. We use a semantic-based comparison
algorithm, which checks for port, frequency and component class matching. We
assigned different weights to different properties, and the final matching score is the
sum of all matching weights (the numbers below the component names in Fig. 6
represent the matching score).
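The concrete weight values and matching rules are not listed here, so the following sketch only illustrates the shape of such a weighted sum: a fixed bonus when the component class matches, another when the requested frequency is met, and a smaller bonus for every requested port that the candidate exposes. All class names, fields and weight values are illustrative assumptions.

import java.util.List;

public class MatchScorerSketch {

    /** A simplified port description: mode ("in"/"out"), bit width and semantic type. */
    public static class Port {
        final String mode;
        final int width;
        final String semanticType;

        public Port(String mode, int width, String semanticType) {
            this.mode = mode;
            this.width = width;
            this.semanticType = semanticType;
        }

        boolean matches(Port other) {
            return mode.equals(other.mode)
                && width == other.width
                && semanticType.equals(other.semanticType);
        }
    }

    static final double CLASS_WEIGHT = 3.0;      // component class agrees (e.g. both are Counters)
    static final double FREQUENCY_WEIGHT = 1.0;  // candidate meets the requested frequency
    static final double PORT_WEIGHT = 0.5;       // per requested port found on the candidate

    public static double score(String wantedClass, double wantedMHz, List<Port> wantedPorts,
                               String candidateClass, double candidateMHz, List<Port> candidatePorts) {
        double score = 0.0;
        if (candidateClass.equals(wantedClass)) score += CLASS_WEIGHT;
        if (candidateMHz >= wantedMHz) score += FREQUENCY_WEIGHT;
        for (Port wanted : wantedPorts) {
            for (Port offered : candidatePorts) {
                if (offered.matches(wanted)) { score += PORT_WEIGHT; break; }
            }
        }
        return score;
    }
}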
The search form is shown in Fig. 5, where a user needs to simply specify the type,
frequency and port interface of the needed VHDL component. The results are listed as
shown in Fig. 6. An AJAX add-on gives a review of the located component (Fig. 7).
The buttons on the right side allow further search for similar components, based on
the current component. The "Find Similar" button puts the current component on top
and starts a search for similar components.
Fig. 6. Compatible components.
Fig. 7. AJAX component preview.
4.4 Composer
This module is a base for making a composition of IP cores. The composer module
API contains functions for compatibility checking, which means that compatible
components can be located. Finding a compatible component (the "Find Compatible"
button in Fig. 6) means finding components that can be connected to a chosen one; for
instance, an 8-bit output port is compatible with an 8-bit input port. An example of
usage is shown in Fig. 6, where compatible components for the PicoBlaze CPU
(kcpsm3) are listed (e.g. the ram_dp_sr_sw RAM component at the bottom). This is the
first step of creating an entire SoC composition by the use of IP cores.
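A minimal sketch of that port-level rule is shown below: two ports are considered compatible when they have opposite modes and the same width. How the actual composer treats other port properties, such as the semantic type, is not detailed here, so only mode and width are checked; the method names are illustrative.

public class CompatibilitySketch {

    /** True when one port can drive the other: opposite modes and equal bit widths. */
    public static boolean portsCompatible(String modeA, int widthA, String modeB, int widthB) {
        boolean oppositeModes = ("out".equals(modeA) && "in".equals(modeB))
                             || ("in".equals(modeA) && "out".equals(modeB));
        return oppositeModes && widthA == widthB;
    }

    public static void main(String[] args) {
        System.out.println(portsCompatible("out", 8, "in", 8));   // true: 8-bit output drives 8-bit input
        System.out.println(portsCompatible("out", 8, "in", 16));  // false: widths differ
    }
}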
4.5 Advantages and System Usability
When searching through the existing HDL portals (mentioned in the related work
section), the result is a folder with a full HDL project, containing many files. In
contrast, our HDL IP cores search engine returns a single HDL file that represents a
specific component. Despite this difference, the two approaches (file-based and
project-based) are applicable to different problems, so they can and should coexist.
Another advantage of the proposed system is that the component classification is
based on an OWL ontology, which enables knowledge sharing and easy resolution of
ambiguity. There are common methods for merging semantic data from different
ontologies, which makes the HDL IP cores system scalable and easy to maintain and
improve.
Unique features of the HDL IP cores system, which are a direct result of the usage
of semantic technologies, are the effortless retrieval of similar components and the
ability to search for compatible components. As shown in Fig. 5, Fig. 6 and Fig. 7, the
HDL IP cores system enables search "by port" and gives reliable ranking of the
similar and compatible components available in the repository. Although we tested
the HDL IP cores system with a relatively small set of VHDL examples, we find the
testing reliable because of the completeness of the test set. Namely, there are
memories, controllers, buffers, flip-flops, counters, registers, multiplexers, coders,
decoders, parity generators, etc. in the system. The results from the compatibility
search were as expected. For instance, a CPU compatibility search ranks
memories higher, and a coder compatibility search places decoders at the top of the
result list. The IP core compatibility feature is an approach that allows SoC design by
browsing through the search results.
5 Conclusion and Future Work
The system and the application we designed and developed are intended to
demonstrate the ability of semantic web technologies to build a more precise
search engine for VHDL components. The same principle can be used for describing
and searching specific chip products, too. Such a system can easily be implemented
within hardware producers' web sites or a web market engine.
In general, the improvements our system offers are a large step towards a
faster and smarter way of storing and searching data. It is described here as a VHDL
tool, but it applies just as easily to any hardware or software, class-based programming
code or product specification.
Our future plans include an extension of the current semantic description of the
ports, with information about whether a port is buffered or not. This
feature will introduce the ability to build a complete system automatically: the
application itself will be able to decide whether a buffer component is needed and to
select a suitable one from the available components in the repository.
With this feature, we will be able to develop a system that automatically composes
a logical block of components solely from the user specifications. The user will be
required to define the needed inputs and outputs of the circuit, and the system will be
able to compose a logical block which consists of semantically annotated VHDL
components from the repository, components which can be connected together in a
way that satisfies the user's specifications. This is a problem similar to the problem of
automatic composition of semantic web services, and our intention is to apply these
concepts to composition of IP cores.
References
1. International technology roadmap for semiconductors, Design, 2007 edition,
http://www.itrs.net/Links/2007ITRS/2007_Chapters/2007_Design.pdf
2. VHDL - Very High Speed IC Hardware Description Language, http://www.vhdl.org/
3. Open Cores web portal, http://opencores.org/
4. Java optimized processor group, http://tech.groups.yahoo.com/group/java-processor/
5. IP supermarket web portal, http://www.ipsupermarket.com/index.php
6. Infineon web portal, http://www.infineon.com/
7. Lattice web portal, http://www.latticesemi.com/
8. Chip Estimate web portal, http://www.chipestimate.com/
9. Design & Reuse web portal, http://www.design-reuse.com/
10. Xilinx ISE HDL programming environment, http://www.xilinx.com/
11. Strasunskas, D. and Tomassen, S.L. On Variety of Semantic Search Systems and Their
Evaluation Methods. Proceedings of International Conference on Information Management
and Evaluation, University of Cape Town, South Africa, 25-26 March 2010, Academic
Conferences Publishing, pp. 380-387.
12. Pan, J.Z., Thomas, E. and Sleeman, D. (2006) “Ontosearch2: Searching and querying web
ontologies”, In Proc. of the IADIS International Conference, pp 211-218.
13. Guha, R., McCool, R. and Miller, E. (2003) “Semantic search”, In Proc. of WWW 2003, pp
700-709.
14. Lopez, V., Uren, V., Motta, E. and Pasin, M. (2007) “AquaLog: An ontology-driven question
answering system for organizational semantic intranets”, Web Semantics, Vol 5, No. 2, pp
72-105.
15. Stojanovic, N., Studer, R., and Stojanovic, L. (2003) “An approach for the ranking of query
results in the Semantic Web”, In ISWC 2003, LNCS 2870, Springer-Verlag, pp 500-516.
16. Rocha, C., Schwabe, D. and de Aragao, M. (2004) “A hybrid approach for searching in the
semantic web”, In Proc. of WWW 2004, ACM Press, pp 374-383.
17. Corby, O., Dieng-Kuntz, R., Faron-Zucker, C., Gandon, F.L. (2006) “Searching the Semantic
Web: Approximate Query Processing Based on Ontologies”, IEEE Intelligent Systems, Vol
21, No. 1, pp 20-27.
18. Tomassen, S.L., and Strasunskas, D. (2009) “A semiotics-driven approach to Web search:
analysis of its sensitivity to ontology quality and search tasks”, In Proc. of iiWAS’2009,
ACM.
19. Kiryakov, A., Popov, B., Terziev, I., Manov, D. and Ognyanoff, D. (2004) “Semantic
annotation, indexing, and retrieval”, Journal of Web Semantics, Vol 2, No. 1, pp 49-79.
20. Chirita, P.-A., Costache, S., Nejdl, W. and Paiu, R. (2006) “Beagle++: Semantically
enhanced searching and ranking on the desktop”, ESWC 2006, LNCS 4011, pp 348-362.
21. Castells, P., Fernandez, M., and Vallet, D. (2007) “An Adaptation of the Vector-Space
Model for Ontology-Based Information Retrieval”. IEEE TKDE 19(2), pp 261-272.
22. Amaral, C., Laurent, D., Martins, A., Mendes, A. and Pinto, C. (2004) “Design and
Implementation of a Semantic Search Engine for Portuguese”, In Proc. LREC 2004, Vol 1,
pp 247-250.
23. Brasethvik, T. (2004) Conceptual modelling for domain specific document description and
retrieval- An approach to semantic document modelling. PhD thesis, NTNU, Trondheim,
Norway, 2004.
24. Zhang, L., Yu, Y., Zhou, J., Lin, Ch. and Yang, Y. (2005) “An enhanced model for searching
in semantic portals”, In WWW 2005, pp 453-462.
25. Burton-Jones, A., Storey, V.C., Sugumaran, V. and Purao, S. (2003) “A heuristic-based
methodology for semantic augmentation of user queries on the web”, ER 2003, LNCS 2813,
pp 476-489.
26. Ciorascu, C., Ciorascu, I., and Stoffel, K. (2003) “knOWLer - ontological support for
information retrieval systems”, In Proc. of SIGIR 2003 Conference, Workshop on Semantic
Web.
27. Aitken, S. and Reid, S. (2000) “Evaluation of an ontology-based information retrieval tool”,
In Proc. of Workshop on the Applications of Ontologies and Problem-Solving Methods,
ECAI 2000, Berlin.
28. Suomela, S. and Kekalainen, J. (2005) “Ontology as a search-tool: A study of real user's
query formulation with and without conceptual support”, ECIR’2005, LNCS 3408, Springer-
Verlag, pp 315-329.
29. Blacoe, I., Palmisano, I., Tamma, V. and Iannone, L. (2008) “QuestSemantics - Intelligent
Search and Retrieval of Business Knowledge”, Frontiers in Artificial Intelligence and
Applications, Vol. 178, pp 648-652.
30. Wang, H., Zhang, K., Liu, Q., Tran, T., and Yu, Y. (2008) “Q2Semantic: A Lightweight
Keyword Interface to Semantic Search”, ESWC 2008, LNCS 5021, Springer-Verlag, pp 584-
598.
31. Nagypal, G. (2007) Possibly imperfect ontologies for effective information retrieval. PhD
thesis, University of Karlsruhe.
32. Protégé semantic data editor, RDF, OWL…, Stanford Center for Biomedical Informatics
Research, http://protege.stanford.edu/, 2010
33. RDF, Resource Description Framework, http://www.w3.org/RDF/, 2010
34. Jena - A Semantic Web framework for Java, official API documentation and examples for
the Jena libraries, http://jena.sourceforge.net/, 2010
35. Jena TDB storage, http://openjena.org/wiki/TDB, 2010
36. Hardware-Vhdl-Parser, http://search.cpan.org/~gslondon/Hardware-Vhdl-Parser-0.12/Parser.pm, 2000