Article

TOWARDS A PAN-EUROPEAN E-PROCUREMENT PLATFORM TO AGGREGATE, PUBLISH AND SEARCH PUBLIC PROCUREMENT NOTICES POWERED BY LINKED OPEN DATA: THE MOLDEAS APPROACH

Abstract

This paper describes a public procurement information platform that provides a unified pan-European system exploiting the aggregation of tender notices using linked open data and semantic web technologies. The platform requires a step-based method to deal with the requirements of the public procurement sector and the open government data initiative: (1) modeling the unstructured information included in public procurement notices (contracting authorities, organizations, contracts awarded, etc.); (2) enriching that information with existing product classification systems and linked data vocabularies; (3) publishing the relevant information extracted from the notices following the linked open data approach; (4) implementing enhanced services based on advanced algorithms and techniques, such as query expansion methods, to exploit the information in a semantic way. Taking into account that public procurement notices contain different kinds of data (type of contract, region, duration, total amount, target enterprise, etc.), various methods can be applied to expand user queries, easing access to the information and providing a more accurate information retrieval system. Nevertheless, expanded user queries can add extra time to the notice retrieval process. That is why a performance evaluation is outlined to tune the semantic methods and the generated queries, providing a scalable and time-efficient system. Moreover, this platform is expected to be especially relevant for SMEs that want to tender in the European Union (EU), easing their access to the information in the notices and fostering their participation in cross-border public procurement processes across Europe. Finally, an example of use is provided to evaluate and compare the benefits and improvements of the proposed platform with regard to existing ones.
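The query expansion step (4) can be sketched against the CPV product classification that public procurement notices are annotated with. The codes, labels, and broader/narrower hierarchy below are illustrative stand-ins, not the real CPV taxonomy, and the expansion strategy is only one simple variant of the methods the paper evaluates:

```python
# Hypothetical sketch of CPV-based query expansion: a user query on one
# product code is widened with its broader categories so that more
# related tender notices are retrieved. Codes and labels are invented.
CPV_BROADER = {
    "30213100": "30213000",  # illustrative: "portable computers" -> "personal computers"
    "30213000": "30200000",  # illustrative: "personal computers" -> "computer equipment"
}
CPV_LABELS = {
    "30213100": "portable computers",
    "30213000": "personal computers",
    "30200000": "computer equipment and supplies",
}

def expand_query(cpv_code, levels=2):
    """Expand a query on one CPV code with up to `levels` broader categories."""
    expanded = [cpv_code]
    current = cpv_code
    for _ in range(levels):
        current = CPV_BROADER.get(current)
        if current is None:
            break
        expanded.append(current)
    return expanded

def to_keywords(codes):
    """Map expanded codes to searchable labels."""
    return [CPV_LABELS[c] for c in codes if c in CPV_LABELS]
```

As the abstract notes, each extra expansion level widens recall but adds query time, which is exactly the trade-off the performance evaluation is meant to tune.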

... Technology in public procurement makes it possible to automate the entire procurement process. However, implementing digital technologies for public procurement should not be based only on exchanging the paper model for the digital one [28]. For the development and implementation of digital technologies, it is necessary to characterize the presence of "technological infrastructure, the set of hardware, software, networks, Internet, services and applications" [29, p. 631], [30], so that the adopted technology contributes to the improvement of the procurement process. ...
... The use of ontology allows the sharing of information through Linked Data, resulting in a better knowledge of suppliers and citizens, as well as the improvement of the information systems of public organizations. With Linked Data, Alvarez et al. [28] modeled a pan-European platform to aggregate tender notices through Open Government Data (OGD). Implementing the platform with high-quality data can facilitate the participation of SMEs, support multilingual and multicultural issues, and encourage cross-border participation in public procurement. ...
... Ontologies play an essential role within the Semantic Web; an ontology captures and formalises rich domain knowledge by using commonly agreed vocabularies in a logic-based framework supporting automated reasoning [15]. Ontology-based approaches in public procurement include PPROC ontology [8] for describing public processes and contracts, LOTED2 ontology [16] for public procurement notices, PCO ontology [17] for contracts in public domain, and MOLDEAS ontology [18] for announcements about public tenders. Finally, there is also eProcurement ontology, which is under development and supported by the European Commission. ...
Article
Full-text available
Governments need to be accountable and transparent for their public spending decisions in order to prevent losses through fraud and corruption as well as to build healthy and sustainable economies. Open data act as a major instrument in this respect by enabling public administrations, service providers, data journalists, transparency activists, and regular citizens to identify fraud or uncompetitive markets through connecting related, heterogeneous, and originally unconnected data sources. To this end, in this article, we present our experience in the case of Slovenia, where we successfully applied a number of anomaly detection techniques over a set of open disparate data sets integrated into a Knowledge Graph, including procurement, company, and spending data, through a linked data-based platform called TheyBuyForYou. We then report a set of guidelines for publishing high quality procurement data for better procurement analytics, since our experience has shown us that there are significant shortcomings in the quality of data being published. This article contributes to enhanced policy making by guiding public administrations at local, regional, and national levels on how to improve the way they publish and use procurement-related data; developing technologies and solutions that buyers in the public and private sectors can use and adapt to become more transparent, make markets more competitive, and reduce waste and fraud; and providing a Knowledge Graph, which is a data resource that is designed to facilitate integration across multiple data silos by showing how it adds context and domain knowledge to machine-learning-based procurement analytics.
... However, these initiatives still generate a lot of heterogeneity. In order to alleviate these problems, several ontologies emerged, including PPROC [17], LOTED2 [18], MOLDEAS [19], and PCO [20], as well as the upcoming eProcurement ontology, with different levels of detail and focus (e.g., legal and process-oriented). So far, however, none of them has reached wide adoption, mainly due to their limited practical value. ...
Article
Full-text available
Public procurement is a large market affecting almost every organisation and individual; therefore, governments need to ensure its efficiency, transparency, and accountability, while creating healthy, competitive, and vibrant economies. In this context, open data initiatives and the integration of data from multiple sources across national borders could transform the procurement market, for example by lowering the barriers to entry for smaller suppliers and encouraging healthier competition, in particular by enabling cross-border bids. Increasing amounts of open data are published in the public sector; however, these are created and maintained in silos and are not straightforward to reuse or maintain because of technical heterogeneity, lack of quality, insufficient metadata, or missing links to related domains. To this end, we developed an open linked data platform, called TheyBuyForYou, consisting of a set of modular APIs and ontologies to publish, curate, integrate, analyse, and visualise an EU-wide, cross-border, and cross-lingual procurement knowledge graph. We developed advanced tools and services on top of the knowledge graph for anomaly detection, cross-lingual document search, and data storytelling. This article describes the TheyBuyForYou platform and knowledge graph, reports their adoption by different stakeholders and the challenges and experiences we went through while creating them, and demonstrates the usefulness of Semantic Web and Linked Data technologies for enhancing public procurement.
... The regional level of transactions should be regarded as a relatively new dimension of e-government-driven communication between its key subjects due to the fact that there are not so many really successful examples of political integration between countries that results in the development of centralized e-government strategies and platforms or at least universal realization instruments and laws in the sphere. The ideal example of this type of transactions could be e-government networks that arguably exist in the European Union, often called pan-European e-government institutions (Alvarez et al., 2012;Hanseth, 2014), between such actors as, for example, the regional ICT-driven public administration platforms, national and even local players (e.g. the transactions between the union e-government portals situated in Brussels, i.e. in the administrative capital of the EU and residents or business entities who live at local levels, for example, somewhere in remote areas of Gotland in Sweden, Corsica in France or Sicily in Italy). This phenomenon could be tentatively called e-confederalism in contrast to e-federalism that exists in the United States or e-centralism in Kazakhstan (Kassen, 2015) (see Figure 10). ...
Chapter
Full-text available
Asking policymakers, practitioners and other stakeholders such a simple question as what e-government politics really is, and all the more so in a cross-country and cross-institutional manner, can be an extremely fruitful undertaking, since it generates a myriad of unique stories and perspectives about this phenomenon. E-government is a universally well-known concept in public policy, public administration, political and economic sciences and beyond, and the related academic and professional literature is rich with demonstrative cases that represent these narratives from various viewpoints and fields. In this regard, the key purpose of the article is not to update the state of the art in the area but rather to synthesize and systematize all available institutional perspectives on the development of this truly multidimensional networking phenomenon, equally from stakeholder, cross-institutional and cross-country perspectives.
Conference Paper
Full-text available
The rising interest in Open Data, fostered by the proliferation of public policies on open government, has promoted methodological models that allow the reuse of those data in social and business initiatives that enable inclusive development. This article reviews the development of Open Data in Latin American countries that, due to their geographical location, share similarities in multiple areas: Brazil, Colombia, Mexico, Ecuador, Chile, Argentina, Peru, Venezuela, Cuba, Uruguay, Costa Rica, Paraguay, Bolivia, Panama, Puerto Rico, Trinidad and Tobago, Cayman Islands, Haiti, Jamaica, Dominican Republic, French Guiana, Nicaragua and Suriname. The review draws on 664 documents from the recognized Scopus repository. A semantic co-occurrence analysis of terms in the bibliography was carried out with the VOSviewer tool, based on at least five occurrences of each extracted keyword. The results show that scientific work on the theme has focused mainly on the semantic web, open government, transparency, information systems, data mining and big data, and that research related to governance contributes more strongly than other initiatives. In conclusion, Open Data is maturing as a field of scientific knowledge at a global level, due to its connection with the development of information technologies and the diversity of work areas. In Latin America the diversity of studies is still limited and research remains centralized.
Article
Public authorities promote transparent public procurement practices to increase competition and reduce public procurement costs. In this article, we focus on public procurement of the European Union (EU). We employ a multidisciplinary approach to analyse economic effects of information in public procurement. We quantify the information content of 2,390,630 EU public procurement notices published in 22 different languages using natural language processing techniques. Subsequently, we examine the impact of the information content on public procurement outcomes. We find that higher information levels have significant positive effects. Competition is considerably higher when notices contain more information. On average, contract prices would be 6%–8% lower if notices were to contain adequate information. EU governments could save up to €80 billion if all public procurement notices were to have detailed information. Based on our comprehensive analysis, we believe that authorities should regulate the information content of notices to promote competition and cost-effectiveness in public procurement.
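The idea of quantifying a notice's information content can be illustrated with a simple word-entropy measure over the notice text. The cited study uses far more elaborate multilingual NLP, so this is only a toy illustration of the underlying intuition that more varied wording carries more information:

```python
import math
from collections import Counter

# Toy information-content measure: Shannon entropy of the word
# distribution in a notice. Repetitive boilerplate scores low; varied,
# detailed text scores higher. This is NOT the cited study's metric.
def word_entropy(text):
    """Entropy (in bits) of the word frequency distribution of `text`."""
    words = text.lower().split()
    counts = Counter(words)
    n = len(words)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

A notice that repeats one phrase scores 0 bits, while a notice with uniformly varied vocabulary scores log2 of its vocabulary size.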
Chapter
Procurement or tender search is where suppliers seek opportunities for providing goods, works or services that authorities, organisations and businesses require. Such opportunities are listed as procurement contract notices for which suppliers can submit tenders. Typically, an E-Procurement system is used to help find and carry out one or more of the stages involved in the procurement process (from finding potential opportunities, bidding on such opportunities, to delivering the goods, works or services, i.e. find, win, deliver). Such systems are crucial in enabling suppliers to efficiently search through the available listings of procurement contract notices listed across various public and commercial portals. However, little research has investigated how end-users search for such opportunities. In this paper, we perform a descriptive analysis of the professional search behaviours of suppliers using a bespoke e-procurement system. Our analysis is based on a sub-sample of six months of search log interaction data. First, we provide an overview of the usage patterns of our sample of users before investigating how the behaviour of searchers is influenced by the type of search form used (quick vs advanced), user expertise (new vs experienced), and the domain of the procurement notices (General, Defence, Medical, etc.). Our findings highlight that more experienced searchers appear to be more strategic than less experienced searchers and that searchers behave differently depending on the domain in terms of querying and assessing behaviours. This analysis suggests that e-procurement search engines need to be mindful of the differences across searchers and between domains when designing a system to help support their users. Keywords: Professional search; Procurement search
Article
Full-text available
Purpose: This study aims to empirically assess the standardization of using voluntary ex ante transparency notices to announce the awards of noncompetitive large-value contracts.
Design/methodology/approach: Based on open data published in the Official Journal of the European Union, a pooled cross-sectional research design is used to determine the level of standardized use of noncompetitive contracting by member states.
Findings: Findings suggest little evidence of standardization when publicizing direct contract awards, which might warrant remedial measures for promoting standardization by the EU. Moreover, France was found to be a major outlier in the prevalence of using noncompetitive direct contract award procedures.
Social implications: Maintenance of the European Union is predicated on free, transparent and open competition among member states, and this can only be maintained if each member state transposes EU standards into their national laws.
Originality/value: Findings suggest little evidence of standardization when publicizing direct contract awards, which might warrant additional remedial measures promoting consistency across the EU. Moreover, France was found to be a major outlier in the prevalence of using noncompetitive direct contract award procedures.
Article
The study undertakes a systematic literature review of government-provided E-Services for businesses (Government to Business E-Services). The literature review process selects and analyses 331 publications. The research highlights the polydispersity of identified publications, resulting from the fragmentation of disciplines. The primary activities in this research field occur mainly in Europe and Asia. In addition, there is a steady engagement of researchers on G2B E-Services. The overall maturity level of governmental services identified in the literature review is quite satisfactory. The main research topics are E-Procurement, E-Customs, E-Taxes, Interoperability, Process Transformation, Cost and Pricing Issues, Barriers and Key Success Factors, and Assessment of Effectiveness. Case studies and surveys are the most popular research methods. In the future, a deeper analysis of the stakeholders' views, enhancement of planning, development and transformation of G2B E-Services, redesign of back-office processes, and interoperability between different systems should attract research interest. Finally, we expect further analysis of E-Customs and E-Invoices, as these services have significant added value.
Chapter
Full-text available
The release of a growing amount of open procurement data means that we are increasingly able, and even have the obligation, to scrutinize and analyse public spending for delivering better quality of public services, optimizing costs, preventing fraud and corruption, and building healthy and sustainable economies. The TheyBuyForYou project addresses this challenge by developing an integrated technology platform, with a cross-lingual and cross-border procurement knowledge graph, core services, open APIs, and online tools, and validating them in several business cases in public/corporate procurement in Slovenia, Spain and Italy. This paper gives an overview about the project’s goals and challenges.
Chapter
Full-text available
Public procurement is a large market affecting almost every organisation and individual. Governments need to ensure efficiency, transparency, and accountability, while creating healthy, competitive, and vibrant economies. In this context, we built a platform, consisting of a set of modular APIs and ontologies to publish, curate, integrate, analyse, and visualise an EU-wide, cross-border, and cross-lingual procurement knowledge graph. We developed end-user tools on top of the knowledge graph for anomaly detection and cross-lingual document search. This paper describes our experiences and challenges faced in creating such a platform and knowledge graph and demonstrates the usefulness of Semantic Web technologies for enhancing public procurement.
Chapter
Full-text available
The release of a growing amount of open procurement data led to various initiatives for harmonising the data being provided. Among others, the Open Contracting Data Standard (OCDS) is highly relevant due to its high practical value and increasing traction. OCDS defines a common data model for publishing structured data throughout most of the stages of a contracting process. OCDS is document-oriented and focuses on packaging and delivering relevant data in an iterative and event-driven manner through a series of releases. Ontologies, beyond providing uniform access to heterogeneous procurement data, could enable integration with related data sets such as with supplier data for advanced analytics and insight extraction. Therefore, we developed an ontology, the “OCDS ontology”, by using OCDS’ main domain perspective and vocabulary, since it is an essential source of domain knowledge. In this paper, we provide an overview of the developed ontology.
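The kind of OCDS-release-to-RDF mapping such an ontology enables can be sketched as follows. The `ocds:` namespace and property IRIs here are invented for illustration and are not the actual OCDS ontology terms; the release body follows OCDS's event-driven shape (an ocid, a date, and release tags):

```python
import json

# Illustrative mapping from one OCDS release (JSON) to RDF triples in
# N-Triples syntax. The http://example.org/ocds# namespace is a
# placeholder, not the real ontology IRI.
OCDS = "http://example.org/ocds#"

def release_to_ntriples(release):
    """Emit one triple per selected top-level release field."""
    subj = f"<{OCDS}release/{release['ocid']}>"
    triples = []
    for key in ("ocid", "date", "tag"):
        value = release[key]
        if isinstance(value, list):          # e.g. the "tag" array
            value = ",".join(value)
        triples.append(f'{subj} <{OCDS}{key}> "{value}" .')
    return "\n".join(triples)

release = json.loads("""{
  "ocid": "ocds-213czf-000-00001",
  "date": "2010-03-15T09:30:00Z",
  "tag": ["tender"]
}""")
nt = release_to_ntriples(release)
```

Once releases are lifted into RDF like this, they can be joined with supplier registers and other linked data sets, which is the integration benefit the chapter argues for.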
Chapter
Full-text available
Procurement affects virtually all sectors and organizations, particularly in times of slow economic recovery and enhanced transparency. Public spending alone will soon exceed EUR 2 trillion per annum in the EU. Therefore, there is a pressing need for better insight into, and management of, government spending. In the absence of data and tools to analyse and oversee this complex process, too little consideration is given to the development of vibrant, competitive economies when buying decisions are made. To this end, in this short paper, we report our ongoing work on enabling procurement data value chains through a knowledge-graph-based platform with data management, analytics, and interaction.
Article
Purpose: Searching the tender notices that are published every day on open tendering websites is a common way of finding business opportunities in public procurement. The heterogeneity of tender notices from various tendering marketplaces is a challenge for exploiting semantic technologies in tender search.
Design/methodology/approach: Most semantic matching approaches require the data to be structured and integrated according to a data model. But the integration process can be expensive and time-consuming, especially for multi-source data integration.
Findings: In this paper, a product search mechanism that had been developed in an e-procurement platform for matching product e-catalogues is applied to the tender search problem. The search performance has been compared using two procurement vocabularies on searching tender notices from two major tender resources.
Originality/value: The test results show that the matching mechanism is able to find tender notices from heterogeneous resources and different classification systems without transforming the tenders to a uniform data model.
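The central idea of matching heterogeneous notices against a query without first integrating them into one data model can be sketched as a token-overlap score applied to whatever text fields each source happens to have. The notices, field names, and Jaccard scoring below are made up for illustration and are not the article's actual mechanism:

```python
# Illustrative vocabulary-based matching across differently-structured
# tender records: score a query against every string field, so no
# uniform schema is required first.
def tokens(text):
    return set(text.lower().split())

def score(query, notice):
    """Best Jaccard overlap between the query and any text field of a notice."""
    q = tokens(query)
    best = 0.0
    for value in notice.values():
        if isinstance(value, str):
            t = tokens(value)
            if q | t:
                best = max(best, len(q & t) / len(q | t))
    return best

# Records from different marketplaces with different field names:
notices = [
    {"title": "Supply of office furniture", "desc": "chairs and desks"},
    {"objet": "Fourniture de mobilier"},      # differently-structured source
    {"title": "Road maintenance services"},
]
ranked = sorted(notices, key=lambda n: score("office furniture", n), reverse=True)
```

A real system would of course layer a procurement vocabulary (e.g. classification labels) over this, so that the French record above could still match; plain token overlap cannot cross languages.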
Article
This paper develops an approach to evaluating designs for digitalisation interventions in purchasing and supply management (PSM) and identifies some fundamental design principles for such interventions. A set of advanced technologies for digitalisation and a theory-based set of seven value drivers for PSM are identified for the proposed grid to facilitate the design of applications and interventions for digitalising PSM. The grid relates the digital technologies to the PSM value drivers in a matrix-like manner, allowing the structured consideration of the space defined by these two dimensions. The proposed approach to the strategic evolution of digitalisation in PSM is tested and its utility is demonstrated in analyses of practitioner literature and multiple case-study-based perspectives on PSM digitalisation. Two fundamental design principles relating to the use of the grid, or to the filling of its space, are set out; thus, the research provides new theoretical perspectives on the design of advanced forms of PSM digitalisation. The proposed grid may be used in application design, in communicating current and future states of PSM digitalisation to stakeholders, and specifically in developing a future-oriented strategy with a digitalisation element for the PSM function.
Article
We mechanize some of the richest yet significantly under-utilized data resources within developed, 'Open Data' economies. We show how it is possible to scrape, parse, clean and merge tens of thousands of disaggregated public payments datasets in an attempt to bridge the methodological gap between newly available data from the administrative sphere and applications in empirical social science research. We outline techniques to unambiguously link records to various freely available institutional registers. In particular, we offer guidance on overcoming the substantial challenges of heterogeneous provision and administrative recording errors in the absence of Uniform Resource Identifiers, namely in the form of an approximate, domain-specific 'record-linkage' type matching algorithm. As an illuminating example, we construct a cleaned database of 24,581,192 local government payments subject to the Local Transparency Codes which total £169.87bn in value. We overcome various challenges in a detailed examination of the procurement of services by local government from the voluntary sector: an important contemporary issue due to the rise of the 'Big Society' political ideology of the early 21st century. Finally, we motivate future work in this area and discuss potential international applications and practical advancements.
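The approximate record-linkage step described above can be sketched as name-normalised fuzzy matching of payee strings against an institutional register. The register entries, normalisation rules, and similarity threshold below are illustrative assumptions, not the authors' actual algorithm:

```python
import difflib

# Toy record linkage: match a free-text payee name from a payments
# dataset to a canonical register entry by normalised string similarity.
REGISTER = ["Oxfam GB", "British Red Cross", "Acme Consulting Ltd"]

def normalise(name):
    # Crude normalisation for the sketch: lowercase, drop "ltd" and
    # periods, collapse whitespace. (A naive substring replace like this
    # would mangle names containing "ltd" mid-word; a real pipeline
    # tokenises properly.)
    return " ".join(name.lower().replace("ltd", "").replace(".", "").split())

def link(payee, register=REGISTER, threshold=0.8):
    """Return the best register match above `threshold` similarity, else None."""
    best, best_score = None, 0.0
    for entry in register:
        s = difflib.SequenceMatcher(None, normalise(payee), normalise(entry)).ratio()
        if s > best_score:
            best, best_score = entry, s
    return best if best_score >= threshold else None
```

The threshold trades precision against recall; with millions of payment rows, as in the cited database, blocking (comparing only plausible candidate pairs) would also be needed for tractability.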
Chapter
This chapter examines the relationships among procurement and other stakeholders.
Conference Paper
The current research investigates the implementation feasibility and challenges of an e-procurement platform in Iran. To do so, it reviews the literature on e-procurement implementation challenges in general. It then looks into key figures that have a direct or indirect influence on the success of such a platform. Indicators such as population age, population distribution and the number of internet users are discussed to support the hypothesis that Iran presents strong potential for an e-procurement platform. The literature review indicates that most implementation challenges are similar to those reported in other countries; however, a few critical national factors are specific to Iran. Analysis of key figures shows that, although the environment seems ready and it is feasible to implement an e-procurement platform, the challenges have to be considered and solutions to deal with them should be planned in advance.
Chapter
In a period of economic crisis, the promotion of small and medium-sized enterprises (SMEs) seems to be an important issue as they constitute almost 99 % of European enterprises and play a key role in economic growth. The European Union (EU) still faces challenging economic conditions with an intensifying sovereign debt crisis in the euro zone, the spectre of double-dip recession looming in several countries, and faltering growth in the better performing ones. In this context, however, in 2012, SMEs retained their position as the backbone of the European economy: there are over 20.7 million such enterprises, which amount to more than 98 % of total businesses.
Article
Public procurement or tendering refers to the process followed by public authorities for the procurement of goods and services. In most developed countries, the law requires public authorities to provide online information to ensure competitive tendering as far as possible, for which the adequate announcement of tenders is an essential requirement. In addition, transparency laws being proposed in such countries are making the monitoring of public contracts by citizens a fundamental right. This paper describes the PPROC ontology, which has been developed to give support to both processes, publication and accountability, by semantically describing public procurement processes and contracts. The PPROC ontology is extensive, since it covers not only the usual data about the tender, its objectives, deadlines and awardees, but also details of the whole process, from the initial contract publication to its termination. This makes it possible to use the ontology for both open data publication purposes and for the overall management of the public contract procurement process.
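Querying contract data described with an ontology like PPROC can be sketched with a minimal in-memory triple store. The graph below is a plain list of (subject, predicate, object) triples; the `pproc:` terms mirror the ontology's stated scope (deadlines, awardees) but the exact property names are assumptions for illustration:

```python
# Minimal triple-pattern matching over PPROC-style contract data, the
# core operation a SPARQL engine performs over such an ontology's graph.
GRAPH = [
    ("contract/1", "pproc:tenderDeadline", "2012-05-01"),
    ("contract/1", "pproc:awardee", "org/acme"),
    ("contract/2", "pproc:tenderDeadline", "2012-06-15"),
]

def query(graph, predicate):
    """All (subject, object) pairs for a predicate, like a one-pattern SPARQL BGP."""
    return [(s, o) for s, p, o in graph if p == predicate]

deadlines = query(GRAPH, "pproc:tenderDeadline")
```

Because the same graph answers both "which tenders are open?" (publication) and "who was awarded what?" (accountability), one data model serves the two purposes the paper highlights.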
Article
This chapter introduces the publication of statistical data as Linked Open Data in the context of the Web Index project. A framework for the publication of raw statistics and a method to convert them to Linked Data are presented, following the W3C standards RDF, SKOS, and OWL. This case study focuses on the Web Index project; launched by the Web Foundation, the Index is the first multidimensional measure of the growth, utility, and impact of the Web on people and nations. Finally, an evaluation of the advantages of using Linked Data to publish statistics is presented, together with a discussion and future steps.
Conference Paper
Public procurement is an activity that is common to all administrations, has a major impact on their functioning, and also affects the economy as a whole. This paper presents an experience that shows how the Semantic Web provides appropriate resources to develop data models that can be used both for the management of public contracts and for the publication of information about them, with the dual objective of improving efficiency by facilitating competitive tendering and of making it easy for citizens to monitor public contracts. First, we developed the PPROC ontology, whose domain is the legal institution of public contracting, including the procedure for the preparation of contracts. Next, we used the ontology as a basis for the integration and publication of data from various Spanish administrations.
Article
The idea of open government has spread across Latin American countries, causing confusion and ambiguous interpretations. The concepts of "open government" and "transparency" have been used as synonyms and have created false expectations, but mostly they have been used to implement public policies and government strategies to accomplish the goal of creating a more open government. The purpose of this research is to provide some theoretical elements to clear up this confusion. To achieve this goal, related papers from academic journals over the past six years have been reviewed and classified into three main paths of knowledge. From this, a concept of open government is proposed and discussed. The aim of this paper is to contribute a theoretical framework that supports research on open government, transparency and accountability.
Article
In recent years, open government data has become an important movement among government administrations around the world. While there is still limited open data research conducted in East Asia, this study explores the complexity of open data initiatives in Taiwan. In particular, the influential factors and their impacts on open data initiatives are investigated from four perspectives: technology, organization, legislation and policy, and environment. Legislation and policy is found to have the most significant impact, while agencies' existing regulations and policies act as constraints. The factors residing in the organizational and environmental perspectives follow as secondary impacts. Technological factors also exist but are considered relatively easy to resolve with sufficient support. While the identified factors act as determinants influencing government agencies' intentions towards open data participation, it is also found that open data is closely related to interagency information sharing, and in the long term the two activities are expected to reinforce each other iteratively. In addition, practical implications are discussed to provide practitioners with insights. Lastly, the contributions, limitations and potential future research of the current study are listed in the Conclusion section.
Article
Purpose – The aim of this paper is to present an initiative to apply the principles of Linked Data to enhance the search and discovery of OpenCourseWare (OCW) contents created and shared by universities. Design/methodology/approach – This paper is a case study of how linked data technologies can be applied to enhance open learning contents. Findings – Results presented under the umbrella of the OCW-Universia consortium, such as the integration of and access to content from different OCW repositories and the development of a query method to access these data, reveal that linked data offers a solution to semantically filter and select open educational contents and to link them automatically to the linked open data cloud. Originality/value – The new OCW-Universia integration with linked data adds new features to the initial framework, including improved query mechanisms and interoperability.
Article
The present paper introduces and reviews existing technology and research works in the field of e-Procurement. More specifically, this survey aims to collect those relevant approaches that have tackled the challenge of delivering more advanced and intelligent e-Procurement management systems, given their relevance in industry for affording more timely, adaptable and flexible decisions in purchasing processes. Although existing tools and techniques have demonstrated their ability to manage e-Procurement processes as part of a supply management system, shortcomings such as a lack of interoperability among tools, tangled dependencies between processes, and difficulties in exploiting existing data and information are preventing a proper use of the new dynamic and data-based environment. On the other hand, semantic-based technologies have emerged to provide the adequate building blocks to represent domain knowledge and elevate the meaning of information resources through a common and shared data model (RDF) with a formal query language (SPARQL), accessible via Internet protocols. In this sense, the Linked Data effort has gained momentum in applying the principles of the aforementioned initiative to boost the re-use of information and data across different tools and processes. That is why the authors review both existing open issues in the context of e-Procurement, with special focus on public procurement, and semantic-based approaches to address them. To do so, a preliminary research study is conducted to assess the state of the art in the context of e-Procurement and semantic-based systems. Afterwards, the main drawbacks of existing e-Procurement systems are presented, narrowing down to semantic-based approaches applied to this field. Once the current status in both areas is reviewed, the authors propose the use and creation of an e-Procurement index to evaluate the quality of service of procurement systems.
In this light the Analytical Hierarchy Process (AHP) method is used to set up an initial weight for each indicator in the index and to perform a first comparison between traditional and semantic-based approaches. Finally some discussion, conclusions and future challenges are also outlined.
Article
Serbia recently decided to reorganize the country as a democracy, with the major political goal of joining the European Union. This initiated the adoption of new laws following best practices from European countries' regulations. The area of public procurement was one of them, with the major aim of increasing transparency and competition in the procurement process in order to make it more efficient. During the last 10 years, we have seen the rise and fall of the procurement process in Serbia. Initially, the new rules gave positive results, with a constantly increasing number of realized procurements and an average of 7.2 bidders per procurement in 2004. To support the process and make it more visible, an Internet portal for procurement was deployed. Unfortunately, in the following period the number of irregular procurements constantly increased while the number of bidders decreased to an average of 2.7 in 2011. Analysts agree that the major problems in the process are poor application of regulations and a high level of corruption. Further, they identified three phases of procurement where corruption usually happens: the planning phase, setting up estimation criteria, and the realization phase. Our proposal is to introduce semantic technologies into the process, in order to enable data manipulation by machines. We propose a meta-model for the definition of procurement documents, and another meta-model for the specification of alert rules, together with a domain-specific language. This would enable an expert to set up rules regarding specific conditions, in order to be alerted about possibly irregular procurements. Application of the proposed solution should enable earlier recognition of potentially irregular procurements, which can be prevented before realization or sanctioned before they become obsolete.
Article
Innovation is one of the keys to success in the business world, particularly within the current economic climate. R&D projects constitute the building blocks of the innovation process, hence the importance of searching for funding for these projects. As ontologies and semantic technologies mature, they provide a consistent and reliable means to represent and aggregate knowledge from different sources. The present work explores the use of ontologies to model R&D grant funding calls and the application of semantic technologies to the development of an enhanced funding management system. Our experiments confirm the success of the proposed approach, and reveal that it may bring considerable benefits to R&D funding.
Conference Paper
The present paper introduces a technique to deal with corporate name heterogeneities in the context of public procurement metadata. Public bodies are currently facing a big challenge trying to improve both the performance and the transparency of administrative processes. The e-Government and Open Linked Data initiatives have emerged as efforts to tackle existing interoperability and integration issues among ICT-based systems, but the creation of a truly transparent environment requires much more than the simple publication of data and information in specific open formats; data and information quality is the next major step in the public sector. More specifically, in the e-Procurement domain there is a vast amount of valuable metadata that is already available via Internet protocols and formats and can be used for the creation of new added-value services. Nevertheless, the simple extraction of statistics or creation of reports can imply extra tasks with regard to cleaning, preparing and reconciling data. On the other hand, transparency has become a major objective in public administrations and, in the case of public procurement, one of the most interesting services lies in tracking awarded contracts (mainly type, location, and supplier). Although it seems a basic kind of reporting service, the truth is that its generation can turn into a complex task due to a lack of standardization in supplier names or the use of different descriptors for the type of contract. In this paper, a stepwise method based on natural language processing and semantics to address the unification of corporate names is defined and implemented. Moreover, a research study to evaluate the precision and recall of the proposed technique, using as use case the public dataset of awarded public contracts in Australia during the period 2004-2012, is also presented. Finally, some discussion, conclusions and future work are also outlined.
Article
The present paper introduces a method to promote existing controlled vocabularies to the Linked Data initiative. A common data model and an enclosed conversion method for knowledge organization systems based on semantic web technologies and vocabularies such as SKOS are presented. This method is applied to well-known taxonomies and controlled vocabularies in the business sector, more specifically to Product Scheme Classifications created by governmental institutions such as the European Union or the United Nations. Since these product schemes are available in a common and shared data model, the needs of the European e-Procurement sector are outlined to finally demonstrate how Linked Data can address some of the challenges for publishing and retrieving information resources. As a consequence, two experiments are also provided in order to validate the gain, in terms of expressivity, and the exploitation of this emerging approach to help both expert and end-users to make decisions on the selection of descriptors for public procurement notices.
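The promotion of a product classification to the Linked Data world that this abstract describes can be pictured with a small sketch. Below is a minimal, hedged illustration of converting one classification entry to SKOS, serialized as N-Triples; the namespace, codes and labels are invented for the example and are not the paper's actual data model.

```python
# Minimal sketch: promoting a product classification entry (a CPV-like code)
# to a SKOS concept serialized as N-Triples. BASE and the sample codes are
# illustrative assumptions, not the paper's real vocabulary.

BASE = "http://example.org/cpv/"  # hypothetical namespace
SKOS = "http://www.w3.org/2004/02/skos/core#"

def to_skos_ntriples(code, label, parent=None):
    """Emit SKOS triples for one classification entry."""
    subj = f"<{BASE}{code}>"
    triples = [
        f"{subj} <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <{SKOS}Concept> .",
        f'{subj} <{SKOS}prefLabel> "{label}"@en .',
        f'{subj} <{SKOS}notation> "{code}" .',
    ]
    if parent:  # the scheme's hierarchy becomes skos:broader links
        triples.append(f"{subj} <{SKOS}broader> <{BASE}{parent}> .")
    return triples

lines = to_skos_ntriples("03100000", "Agricultural products", parent="03000000")
```

Each entry becomes a skos:Concept with a label, a notation and, when a parent code exists, a skos:broader link that preserves the hierarchy of the original scheme.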
Article
The PSGR project is the first attempt to generate, curate, interlink and distribute daily updated public spending data in LOD formats that can be useful to both expert (i.e. scientists and professionals) and naïve users. The PSGR ontology is based on the UK payments ontology and reuses, among others, the W3C Registered Organization Vocabulary and the Core Business Vocabulary. RDFized data are linked to product classifications, Geonames and DBpedia resources. Online services contain advanced search features and domain-level information (e.g. local government), simple and complex visualizations based on network analysis, linked information about payment entities, and SPARQL endpoints. As of February 2013, the growing dataset consists of approximately 2 million payment decisions valued at 44.5 billion euros, forming 65 million triples.
Research Proposal
Web technologies have evolved to effectively support the creation and existence of Linked Open Data (LOD) within the web. However, the production and mass consumption of LOD have been a matter of debate over methodologies, best practices, and technologies that should be employed. Particularly in the case of government LOD, the production and publication lifecycle needs to follow a methodological approach. The paper presents publicspending.gr, a 5-star LOD project focusing on Greek Public spending as a case of LD application that is developed according to a methodological approach, following a specific Linked Data Production Life Cycle.
Article
Efforts in integrating the basic economic functions under a common or compatible context could be accelerated by enabling semantic processing of the underlying data. We establish the basic flows among public budgeting, contracting and spending with business information and provide the necessary ontological elements that would integrate them in economic Linked Open Data corpus.
Article
Full-text available
The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries. To make the Semantic Web or Web of Data a reality, it is necessary to have a large volume of data available on the Web in a standard, reachable and manageable format. In addition, the relationships among data also need to be made available. This collection of interrelated data on the Web can be referred to as Linked Data, which lies at the heart of the Semantic Web: large-scale integration of, and reasoning on, data on the Web. Supporting the adoption of Semantic Web technologies, there currently exists a great number of tools oriented to the creation, publication and management of data, with a large subset devoted to Linked Data. However, an important weakness in this area of Web engineering is that no formal reference has been fully established that integrates the necessary infrastructure in terms of components and the order in which this infrastructure must be deployed. This gap implies slower technological adoption in both the public and private sectors. This paper explores the emergence of the Semantic Web, and in particular of Linked Data, and their potential impact on the IT industry. The main advantages of using Linked Data are discussed from an IT professional perspective, where the capability of having standard technologies and techniques to access and manipulate information is one of the most important achievements in the application of Linked Data.
Article
Full-text available
In this paper we propose a combination between collaborative tagging and semantic web technologies for the development of an image repository system. The proposed system will be part of the WESONET project which will also handle other types of resources like video and audio. Our approach combines low level features extracted by automatic means, high level descriptions provided by the content creators and sets of tags dynamically added by end-users.
Article
Full-text available
The aim of this paper is to present the application of the Spreading Activation technique in the scope of medical systems. This technique is implemented through the ONTOSPREAD framework for the development, configuration, customization and execution of the Spreading Activation technique over graph-based structures, more specifically over RDF graphs and ontologies arising from the Semantic Web area. It has been used for the efficient exploration and querying of large and heterogeneous knowledge bases based on semantic networks in the Information and Document Retrieval domains. ONTOSPREAD implements the double process of activation and spreading of concepts in ontologies, applying different restrictions of the original model, like weight degradation according to distance, or others coming from extensions of this technique, like the converging-paths reward. It is considered to be relevant to support the recommendation of concepts for tagging clinical records and to provide a tool for decision support in clinical diagnosis. Finally, an evaluation methodology and two examples using the well-known ontologies Galen and SNOMED CT are presented to validate the goodness, improvement and capabilities of this technique applied to medical systems.
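The activation-and-spreading process with distance-based weight degradation that the abstract refers to can be sketched compactly. The graph, seed weights, decay factor and activation threshold below are illustrative assumptions, not ONTOSPREAD's actual API.

```python
# Hedged sketch of spreading activation over a concept graph with
# distance-based weight degradation: each hop multiplies the pulse by an
# edge weight and a decay factor, and spreading stops below a threshold.

def spread(graph, seeds, decay=0.5, threshold=0.1):
    """graph: {node: [(neighbour, edge_weight), ...]}; seeds: {node: activation}."""
    activation = dict(seeds)
    frontier = list(seeds.items())
    while frontier:
        node, value = frontier.pop()
        for neighbour, weight in graph.get(node, []):
            pulse = value * weight * decay  # degrade with each hop
            if pulse < threshold:
                continue  # negligible activation is not propagated further
            activation[neighbour] = activation.get(neighbour, 0.0) + pulse
            frontier.append((neighbour, pulse))
    return activation

# Toy medical-flavoured graph (invented nodes and weights).
g = {"flu": [("fever", 0.9), ("cough", 0.8)], "fever": [("thermometer", 0.7)]}
result = spread(g, {"flu": 1.0})
```

Because the pulse shrinks at every hop, the process terminates on any graph, cycles included, once activation falls below the threshold.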
Article
Full-text available
The Semantic Web [1] extends the World Wide Web by providing well-defined semantics to information and services. Through these semantics machines can “understand ” the Web, making it possible to query and reason over Web information, treating the Web as if it were a giant semi-structured database.
Technical Report
Full-text available
Query engines for ontological data based on graph models mostly execute user queries without considering any optimization. Especially for large ontologies, optimization techniques are required to ensure that query results are delivered within reasonable time. OptARQ is a first prototype for SPARQL query optimization based on the concept of triple pattern selectivity estimation. The evaluation we conduct demonstrates how triple pattern reordering according to their selectivity affects the query execution performance.
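The core heuristic of selectivity-based triple pattern reordering can be illustrated in a few lines. This is a minimal sketch assuming the common rule of thumb that a pattern with fewer unbound variables is more selective; OptARQ's actual estimator is statistics-based.

```python
# Sketch of selectivity-driven reordering of a basic graph pattern:
# evaluate the most selective triple patterns first so that intermediate
# results stay small. Selectivity here is a crude variable count.

def selectivity(pattern):
    """Lower score = more selective. Variables start with '?'."""
    return sum(term.startswith("?") for term in pattern)

def reorder(patterns):
    return sorted(patterns, key=selectivity)

bgp = [
    ("?notice", "?p", "?o"),                  # all unbound: least selective
    ("?notice", "dc:title", "?title"),
    ("?notice", "rdf:type", "pproc:Notice"),  # two bound terms: most selective
]
ordered = reorder(bgp)
```

A real optimizer would estimate selectivity from index statistics rather than just counting variables, but the reordering principle is the same.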
Chapter
Full-text available
With the advent of the Web 2.0 trend, a great variety of services have appeared that are offered for free. Although the appearance of free services is not new, we consider that the popularity of these services in the Web 2.0 world is a relevant fact that has to be analyzed from an economic point of view. In this chapter we will give an overview of some of the business models that are present behind those services, like freemium, advertising, work exchange and mass collaboration. We will also present a case study, called EuroAlert, which contains a combination of the above models.
Conference Paper
Full-text available
In this paper, we apply Semantic Web technologies to the creation of an improved search engine over legal and public administration documents. Conventional search strategies based on syntactic matching of tokens offer little help when the users' vocabulary and the documents' vocabulary differ. This is often the case in public administration documents. We present a semantic search tool that fills this gap using Semantic Web technologies, in particular ontologies and controlled vocabularies, and a hybrid search approach, avoiding the expensive tagging of documents.
Conference Paper
Full-text available
The SPARQL query language is a recent W3C standard for processing RDF data, a format that has been developed to encode information in a machine-readable way. We investigate the foundations of SPARQL query optimization and (a) provide novel complexity results for the SPARQL evaluation problem, showing that the main source of complexity is operator OPTIONAL alone; (b) propose a comprehensive set of algebraic query rewriting rules; (c) present a framework for constraint-based SPARQL optimization based upon the well-known chase procedure for Conjunctive Query minimization. In this line, we develop two novel termination conditions for the chase. They subsume the strongest conditions known so far and do not increase the complexity of the recognition problem, thus making a larger class of both Conjunctive and SPARQL queries amenable to constraint-based optimization. Our results are of immediate practical interest and might empower any SPARQL query optimizer.
Conference Paper
Full-text available
The Web has established itself as the largest public data repository ever available. Even though the vast majority of information on the Web is formatted to be easily readable by the human eye, "meaningful information" is still largely inaccessible to computer applications. In this paper, we present automated algorithms to gather meta-data and instance information by utilizing global regularities on the Web and incorporating contextual information. Experimental evaluations were successfully performed on the TAP knowledge base and the faculty-course home pages of computer science departments, comprising 16,861 Web pages. The system achieves this performance without any domain-specific engineering requirement.
Conference Paper
Full-text available
Efficiently querying RDF data is an important factor in applying Semantic Web technologies to real-world applications. In this context, many efforts have been made to store and query RDF data in relational databases using particular schemas. In this paper, we propose a new scheme to store, index, and query RDF data in triple stores. The graph features of RDF data are taken into consideration, which might help reduce the join costs in the vertical database structure. We partition RDF triples into overlapped groups, store them in a triple table with one additional column for group identity, and build a signature tree to index them. Based on this infrastructure, a complex RDF query is decomposed into multiple sub-queries which can be easily filtered into some RDF groups using the signature tree index, and finally evaluated with a composed and optimized SQL query with specific constraints. We compare the performance of our method with prior art on typical queries over large-scale LUBM and UOBM benchmark data (more than 10 million triples). For some extreme cases, it can improve performance by 3 to 4 orders of magnitude.
Conference Paper
Full-text available
A Web search engine must update its index periodically to incorporate changes to the Web. We argue in this paper that index updates fundamentally impact the design of search engine result caches, a performance-critical component of modern search engines. Index updates lead to the problem of cache invalidation: invalidating cached entries of queries whose results have changed. Naive approaches, such as flushing the entire cache upon every index update, lead to poor performance and in fact, render caching futile when the frequency of updates is high. Solving the invalidation problem efficiently corresponds to predicting accurately which queries will produce different results if re-evaluated, given the actual changes to the index. To obtain this property, we propose a framework for developing invalidation predictors and define metrics to evaluate invalidation schemes. We describe concrete predictors using this framework and compare them against a baseline that uses a cache invalidation scheme based on time-to-live (TTL). Evaluation over Wikipedia documents using a query log from the Yahoo! search engine shows that selective invalidation of cached search results can lower the number of unnecessary query evaluations by as much as 30% compared to a baseline scheme, while returning results of similar freshness. In general, our predictors enable fewer unnecessary invalidations and fewer stale results compared to a TTL-only scheme for similar freshness of results.
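For contrast with the invalidation predictors the paper proposes, the TTL baseline it compares against is easy to sketch. Below is a minimal illustration, with invented names, of a result cache whose entries simply expire after a fixed time-to-live, whether or not the index update actually changed their results.

```python
# Sketch of the TTL baseline: a cached query result is treated as stale
# once its time-to-live elapses, regardless of actual index changes.

import time

class TTLResultCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}  # query -> (results, cached_at)

    def put(self, query, results, now=None):
        self.store[query] = (results, now if now is not None else time.time())

    def get(self, query, now=None):
        entry = self.store.get(query)
        if entry is None:
            return None
        results, cached_at = entry
        now = now if now is not None else time.time()
        if now - cached_at > self.ttl:  # expired: treat as a cache miss
            del self.store[query]
            return None
        return results

cache = TTLResultCache(ttl_seconds=60)
cache.put("linked data", ["doc1", "doc2"], now=0)
```

The paper's point is precisely that this policy both evicts still-fresh entries and serves stale ones; a predictor that looks at the actual index changes can do better on both counts.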
Conference Paper
Full-text available
This paper presents a search architecture that combines classical search techniques with spread activation techniques applied to a semantic model of a given domain. Given an ontology, weights are assigned to links based on certain properties of the ontology, so that they measure the strength of the relation. Spread activation techniques are used to find related concepts in the ontology given an initial set of concepts and corresponding initial activation values. These initial values are obtained from the results of classical search applied to the data associated with the concepts in the ontology. Two test cases were implemented, with very positive results. It was also observed that the proposed hybrid spread activation, combining the symbolic and the sub-symbolic approaches, achieved better results when compared to each of the approaches alone.
Conference Paper
Full-text available
Semantic Search refers to a loose set of concepts, challenges and techniques having to do with harnessing the information of the growing Web of Data (WoD) for Web search. Here we propose a formal model of one specific semantic search task: ad-hoc object retrieval. We show that this task provides a solid framework to study some of the semantic search problems currently tackled by commercial Web search engines. We connect this task to the traditional ad-hoc document retrieval and discuss appropriate evaluation metrics. Finally, we carry out a realistic evaluation of this task in the context of a Web search application.
Conference Paper
Full-text available
The goal of this tutorial is to introduce, motivate and detail techniques for integrating heterogeneous structured data from across the Web. Inspired by the growth in Linked Data publishing, our tutorial aims at educating Web researchers and practitioners about this new publishing paradigm. The tutorial will show how Linked Data enables uniform access, parsing and interpretation of data, and how this novel wealth of structured data can potentially be exploited for creating new applications or enhancing existing ones. As such, the tutorial will focus on Linked Data publishing and related Semantic Web technologies, introducing scalable techniques for crawling, indexing and automatically integrating structured heterogeneous Web data through reasoning.
Conference Paper
Full-text available
The W3C SPARQL working group is defining the new SPARQL 1.1 query language. The current working draft of SPARQL 1.1 focuses mainly on the description of the language. In this paper, we provide a formalization of the syntax and semantics of the SPARQL 1.1 federation extension, an important fragment of the language that has not yet received much attention. Besides, we propose optimization techniques for this fragment, provide an implementation of the fragment including these techniques, and carry out a series of experiments that show that our optimization procedures significantly speed up the query evaluation process.
Conference Paper
Full-text available
In previous work we have shown that the MapReduce framework for distributed computation can be deployed for highly scalable inference over RDF graphs under the RDF Schema semantics. Unfortunately, several key optimizations that enabled the scalable RDFS inference do not generalize to the richer OWL semantics. In this paper we analyze these problems, and we propose solutions to overcome them. Our solutions allow distributed computation of the closure of an RDF graph under the OWL Horst semantics. We demonstrate the WebPIE inference engine, built on top of the Hadoop platform and deployed on a compute cluster of 64 machines. We have evaluated our approach using some real-world datasets (UniProt and LDSR, about 0.9-1.5 billion triples) and a synthetic benchmark (LUBM, up to 100 billion triples). Results show that our implementation is scalable and vastly outperforms current systems when comparing supported language expressivity, maximum data size and inference speed.
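The kind of entailment rule WebPIE distributes over MapReduce can be shown in miniature. Below is a naive single-machine sketch of two RDFS rules (subclass transitivity and type propagation) applied to a fixpoint; the triples are toy data, and the real system's contribution is doing this at billion-triple scale.

```python
# Naive forward-chaining sketch of two RDFS entailment rules, iterated to
# a fixpoint. WebPIE computes the same closure, but distributed via
# MapReduce jobs over Hadoop.

SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"

def rdfs_closure(triples):
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        for s, p, o in closed:
            for s2, p2, o2 in closed:
                # (A subClassOf B), (B subClassOf C) => (A subClassOf C)
                if p == SUBCLASS and p2 == SUBCLASS and o == s2:
                    new.add((s, SUBCLASS, o2))
                # (x type A), (A subClassOf B) => (x type B)
                if p == TYPE and p2 == SUBCLASS and o == s2:
                    new.add((s, TYPE, o2))
        if not new <= closed:
            closed |= new
            changed = True
    return closed

data = {("ex:n1", TYPE, "ex:Notice"),
        ("ex:Notice", SUBCLASS, "ex:Document"),
        ("ex:Document", SUBCLASS, "ex:Resource")}
inferred = rdfs_closure(data)
```

The quadratic join inside the loop is exactly what becomes a MapReduce join at scale; OWL Horst rules add joins that are harder to distribute, which is the problem the paper addresses.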
Conference Paper
Full-text available
Keyword-matching systems based on simple models of semantic relatedness are inadequate at modelling the ambiguities in natural language text, and cannot reliably address the increasingly complex information needs of users. In this paper we propose novel methods for computing semantic relatedness by spreading activation energy over the hyperlink structure of Wikipedia. We demonstrate that our techniques can approach state-of-the-art performance, while requiring only a fraction of the background data.
Conference Paper
Full-text available
The Semantic Desktop is a means to support users in Personal Information Management (PIM). It provides an excellent test bed for Semantic Web technology: resources (e.g., persons, projects, messages, documents) are distributed amongst multiple systems, and ontologies are used to link and annotate them. Finding information is a core element in PIM. For the end user, the search interface has to be intuitive to use; natural language queries provide a simple means to express requests. State-of-the-art semantic search engines focus on fact retrieval or on semantic document retrieval. We combine both approaches to search the Semantic Desktop exploiting all available information. Our semantic search engine, built on semantic teleporting and spreading activation, is able to answer natural language queries with facts, e.g., a specific phone number, and/or relevant documents. We evaluated our approach on ESWC 2007 data in comparison with Google site search.
Article
Full-text available
Presents a spreading-activation theory of human semantic processing, which can be applied to a wide range of recent experimental results. The theory is based on M. R. Quillian's (1967) theory of semantic memory search and semantic preparation, or priming. In conjunction with this, several misconceptions concerning Quillian's theory are discussed. A number of additional assumptions are proposed for his theory to apply it to recent experiments. The present paper shows how the extended theory can account for results of several production experiments by E. F. Loftus, J. F. Juola and R. C. Atkinson's (1971) multiple-category experiment, C. Conrad's (1972) sentence-verification experiments, and several categorization experiments on the effect of semantic relatedness and typicality by K. J. Holyoak and A. L. Glass (1975), L. J. Rips et al (1973), and E. Rosch (1973). The paper also provides a critique of the Rips et al model for categorization judgments.
Article
Full-text available
The recent publication of public sector information (PSI) data sets has brought to the attention of the scientific community the redundant presence of location-based context. At the same time it stresses the inadequacy of current Linked Data services for exploiting the semantics of such contextual dimensions to ease entity retrieval and browsing. This paper describes our approach for supporting the publication of geographical subdivisions in Linked Data format, to support e-government and the public sector in publishing their data sets. The topological knowledge published can be reused to enrich the geographical context of other data sets; in particular, we propose an exploitation scenario using statistical data sets described with the SCOVO ontology. The topological knowledge is then exploited within a service that supports the navigation and retrieval of statistical geographical entities for the EU territory. Geographical entities, in the extent of this paper, are linked data resources that describe objects with a geographical extension. The data and services presented in this paper allow the discovery of resources that contain or are contained by a given entity URI, and their representation within map widgets. We present an approach for a geography-based service that helps in querying qualitative spatial relations for the EU statistical geography (proper containment so far). We also provide a rationale for publishing geographical information in Linked Data format based on our experience, within the EnAKTing project, in publishing UK PSI data.
Article
Full-text available
This paper describes a system to semi-automatically extend and refine ontologies by mining textual data from the Web sites of international online media. Expanding a seed ontology creates a semantic network through co-occurrence analysis, trigger phrase analysis, and disambiguation based on the WordNet lexical dictionary. Spreading activation then processes this semantic network to find the most probable candidates for inclusion in an extended ontology. Approaches to identifying hierarchical relationships such as subsumption, head noun analysis and WordNet consultation are used to confirm and classify the found relationships. Using a seed ontology on "climate change" as an example, this paper demonstrates how spreading activation improves the result by naturally integrating the mentioned methods.
Chapter
This chapter introduces e-procurement as a strategic tool for organizations' competitive position in the new information economy. It argues that e-procurement is significantly changing the ways businesses operate and that new business models are thus needed. E-procurement success factors that have to be considered are: cost factors, time factors, process simplification factors and the volume of e-transactions. By understanding the most important e-procurement factors, organizations can organize themselves in a way that ensures success. Furthermore, the author hopes that, knowing such factors, organizations will be able to better prepare for e-procurement and thus operate successfully and compete in the global market.
Conference Paper
Social bookmark services like del.icio.us enable easy annotation for users to organize their resources. Collaborative tagging provides a useful index for information retrieval. However, the lack of sufficient tags for developing documents, in particular for new arrivals, hides important documents from being retrieved at the earlier stages. This paper proposes a spreading activation approach to predict social annotations based on document contents and users’ tagging records. In total, 28,792 mature documents selected from del.icio.us are taken as answer keys. The experimental results show that this approach predicts 71.28% of a 100-user tag set with only 5 users’ tagging records, and 84.76% of a 13-month tag set with only a 1-month tagging record, with precision rates of 82.43% and 89.67%, respectively.
Chapter
In this paper, we propose a new similarity measure to compute the pair-wise similarity of text-based documents based on patterns of the words in the documents. First we develop a kappa measure for pair-wise comparison of documents; then we use an ordered weighted averaging operator to define a document similarity measure for a set of documents.
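The first step the abstract describes, a kappa measure over word-presence patterns, can be sketched as follows. This uses the standard Cohen's kappa formulation over a shared vocabulary; the paper's exact measure may differ in detail, and the vocabulary and documents are toy data.

```python
# Hedged sketch of a kappa-style pairwise document similarity: agreement
# on word presence/absence across a shared vocabulary, corrected for the
# agreement expected by chance (standard Cohen's kappa).

def kappa_similarity(doc_a, doc_b, vocabulary):
    a = [w in doc_a for w in vocabulary]
    b = [w in doc_b for w in vocabulary]
    n = len(vocabulary)
    observed = sum(x == y for x, y in zip(a, b)) / n
    p_a, p_b = sum(a) / n, sum(b) / n
    # chance agreement: both present or both absent by chance
    chance = p_a * p_b + (1 - p_a) * (1 - p_b)
    if chance == 1.0:
        return 1.0
    return (observed - chance) / (1 - chance)

vocab = ["tender", "notice", "contract", "semantic", "ontology"]
d1 = {"tender", "notice", "contract"}
d2 = {"tender", "notice", "semantic"}
score = kappa_similarity(d1, d2, vocab)
```

Unlike raw overlap, the chance correction keeps two documents that merely share very common (or very rare) words from scoring artificially high.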
Article
Ontologies, as knowledge engineering tools, allow information to be modelled in ways resembling those used by the human brain, and may be very useful in the context of personal information management (PIM) and task information management (TIM). This work proposes the use of ontologies as a long-term knowledge store for PIM-related information, and the use of spreading activation over ontologies in order to provide context inference to tools that support TIM. Details on the ontology creation and content are provided, along with a full description of the spreading activation algorithm and its preliminary evaluation.
Conference Paper
Graph ranking algorithms such as PageRank and HITS share the common idea of calculating eigenvectors of graph adjacency matrices. This paper shows that the power method usually used to calculate eigenvectors is also present in a spreading activation search. In addition, we empirically show that the spreading activation algorithm is able to converge on periodic graphs, where the power method fails. Furthermore, an extension to the graph ranking calculation scheme is proposed, unifying the calculation of PageRank, HITS, NodeRanking and spreading activation search.
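The shared computational core is power iteration: repeatedly applying the (normalized) adjacency matrix until the dominant eigenvector emerges as the score vector. A minimal sketch, with an assumed 3-page toy link graph rather than any graph from the paper:

```python
def power_iteration(matrix, steps=50):
    """Repeatedly apply the matrix and renormalize; on an aperiodic
    graph the vector converges to the dominant eigenvector, i.e. the
    PageRank-style score vector."""
    n = len(matrix)
    vec = [1.0 / n] * n
    for _ in range(steps):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n))
               for i in range(n)]
        norm = sum(abs(x) for x in nxt) or 1.0
        vec = [x / norm for x in nxt]
    return vec

# Column-stochastic link matrix of a 3-page toy web:
# page 0 links to 1 and 2; page 1 links to 2; page 2 links to 0.
M = [[0.0, 0.0, 1.0],
     [0.5, 0.0, 0.0],
     [0.5, 1.0, 0.0]]
v = power_iteration(M)
print(v)
```

On a periodic graph (e.g. a pure 2-cycle) this iteration oscillates instead of converging, which is exactly the case the paper reports spreading activation handling gracefully.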
Conference Paper
Query expansion methods have been studied for a long time - with debatable success in many instances. In this paper we present a probabilistic query expansion model based on a similarity thesaurus which was constructed automatically. A similarity thesaurus reflects domain knowledge about the particular collection from which it is constructed. We address the two important issues with query expansion: the selection and the weighting of additional search terms. In contrast to earlier methods, our queries are expanded by adding those terms that are most similar to the concept of the query, rather than selecting terms that are similar to the query terms. Our experiments show that this kind of query expansion results in a notable improvement in the retrieval effectiveness when measured using both recall-precision and usefulness.
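The key idea, selecting terms similar to the query *concept* rather than to individual query terms, can be sketched by ranking candidate terms against the centroid of the query-term vectors. The thesaurus below is a hypothetical toy, standing in for one constructed automatically from a collection:

```python
import math

def cosine(u, v):
    dot = sum(u.get(k, 0.0) * v.get(k, 0.0) for k in set(u) | set(v))
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def expand_query(query_terms, thesaurus, k=2):
    """Pick the k terms most similar to the query concept:
    the centroid of the query-term vectors, not each term alone."""
    centroid = {}
    for term in query_terms:
        for dim, w in thesaurus.get(term, {}).items():
            centroid[dim] = centroid.get(dim, 0.0) + w / len(query_terms)
    candidates = [t for t in thesaurus if t not in query_terms]
    ranked = sorted(candidates,
                    key=lambda t: cosine(centroid, thesaurus[t]),
                    reverse=True)
    return list(query_terms) + ranked[:k]

# Hypothetical similarity thesaurus: terms as vectors over
# co-occurrence features mined from the collection.
thesaurus = {
    "tender":   {"procure": 0.9, "bid": 0.8},
    "notice":   {"procure": 0.7, "publish": 0.6},
    "contract": {"procure": 0.8, "bid": 0.5},
    "award":    {"procure": 0.6, "bid": 0.7},
    "recipe":   {"cook": 0.9},
}
expanded = expand_query(["tender", "notice"], thesaurus)
print(expanded)
```

Terms unrelated to the overall query concept (here "recipe") score near zero against the centroid even if they would be close to some single term, which is the behavior the paper's concept-level expansion targets.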
Conference Paper
The idea of "Web of data"[21] has been widely enhanced by the establishment of Linked Data principles on the Web [5]. The emergence of the Linking Open Data Project opens the doors to the concept of Linked Open Data, establishing the basis for publishing open data, usable by anyone on the Web. However, even though the form (Linked Data) and the goal (Web of data) have been formally defined, the definition of a components architecture to support the implementation of such technologies is still fuzzy, as is a methodology of implementation associated with this architecture. The problem is that the definition and methodology together should enable both publishing and maintenance of semantic data in a standardized way for a subset of publishers, namely public administrators. In this paper we describe a first approach to an adoption process of Semantic Web technologies, specifically with regard to tools and methods that enable the publication and maintenance of Linked Open Data within the context of public administration. To this end, we review the related basic concepts, define infrastructure, describe its components and functions, and propose a sequence to be followed. Finally, we present a case of the use of our methodology in the Library of Congress of Chile.
Conference Paper
Since the current Web is largely unorganized and there is a rapid growth of information volumes, the recommendation system, whose major purpose is to reduce irrelevant content and to provide users with more pertinent and tailored information, becomes an important research area. A key issue in this area is how to discover a user's interest and behavior effectively. In this paper we investigate an approach to recommendation systems based on user ontology and the spreading activation model. Through combining the user ontology and the spreading activation model, the capability of discovering users' potential interests is enhanced. A prototype system is also developed based on this methodology.
Article
The Service-Oriented Architecture (SOA) development paradigm has emerged to improve the critical issues of creating, modifying and extending solutions for business processes integration, incorporating process automation and automated exchange of information between organizations. Web services technology follows the SOA’s principles for developing and deploying applications. Besides, Web services are considered as the platform for SOA, for both intra- and inter-enterprise communication. However, an SOA does not incorporate information about occurring events into business processes, which are the main features of supply chain management. These events and information delivery are addressed in an Event-Driven Architecture (EDA). Taking this into account, we propose a middleware-oriented integrated architecture that offers a brokering service for the procurement of products in a Supply Chain Management (SCM) scenario. As salient contributions, our system provides a hybrid architecture combining features of both SOA and EDA and a set of mechanisms for business processes pattern management, monitoring based on UML sequence diagrams, Web services-based management, event publish/subscription and reliable messaging service.
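The EDA side of such a hybrid architecture hinges on event publish/subscription: services register interest in topics and are pushed events instead of polling. A minimal in-process sketch, with hypothetical topic and event names (a real deployment would use a message broker, not this toy class):

```python
from collections import defaultdict

class EventBroker:
    """Minimal publish/subscribe broker illustrating the EDA side
    of a hybrid SOA/EDA system."""

    def __init__(self):
        self.subscribers = defaultdict(list)
        self.log = []  # kept events, standing in for a reliable message log

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, event):
        self.log.append((topic, event))
        for handler in self.subscribers[topic]:
            handler(event)

broker = EventBroker()
received = []
# A procurement service subscribes to supply-chain events.
broker.subscribe("order.created", received.append)
broker.publish("order.created", {"sku": "A-42", "qty": 10})
print(received)
```

Decoupling publishers from subscribers in this way is what lets occurring events drive business processes, the capability the paper adds on top of a plain request/response SOA.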
Article
A number of studies have examined the problems of query expansion in monolingual Information Retrieval (IR), and query translation for cross-language IR. However, no link has been made between them. This article first shows that query translation is a special case of query expansion. There is also another set of studies on inferential IR. Again, there is no relationship established with query translation or query expansion. The second claim of this article is that logical inference is a general form that covers query expansion and query translation. This analysis provides a unified view of different subareas of IR. We further develop the inferential IR approach in two particular contexts: using fuzzy logic and probability theory. The evaluation formulas obtained are shown to strongly correspond to those used in other IR models. This indicates that inference is indeed the core of advanced IR.
Article
This paper presents a framework for knowledge discovery and concept exploration. In order to enhance the concept exploration capability of knowledge-based systems and to alleviate the limitations of the manual browsing approach, we have developed two spreading activation-based algorithms for concept exploration in large, heterogeneous networks of concepts (e.g., multiple thesauri). One algorithm, which is based on the symbolic AI paradigm, performs a conventional branch-and-bound search on a semantic net representation to identify other highly relevant concepts (a serial, optimal search process). The second algorithm, which is based on the neural network approach, executes the Hopfield net parallel relaxation and convergence process to identify "convergent" concepts for some initial queries (a parallel, heuristic search process). Both algorithms can be adopted for automatic, multiple-thesauri consultation. We tested these two algorithms on a large text-based knowledge network of about 13,000 nodes (terms) and 80,000 directed links in the area of computing technologies. This knowledge network was created from two external thesauri and one automatically generated thesaurus. We conducted experiments to compare the behaviors and performances of the two algorithms with the hypertext-like browsing process. Our experiment revealed that manual browsing achieved higher term recall but lower term precision in comparison with the algorithmic systems. However, it was also a much more laborious and cognitively demanding process. In document retrieval, there were no statistically significant differences in document recall and precision between the algorithms and the manual browsing process. In light of the effort required by the manual browsing process, our proposed algorithmic approach presents a viable option for efficiently traversing large-scale, multiple thesauri (knowledge networks).
Article
In this paper we describe a concept of a recommender system for collaborative real-time web-based editing in the context of creativity sessions. Collaborative real-time editing provides creativity teams whose members are physically distributed with an emulation of synchronous collaboration in which the simultaneous presence of team members would otherwise be required (e.g., brainstorming, meetings). The concept of recommendation is based on matching the activities currently performed at the user interface against external linked open data provided through SPARQL endpoints. The real-time propagation of changes in the editor and of recommendations is achieved by reverse AJAX and the observer pattern. An experiment in the creativity domain shows that recommendations during collaborative real-time editing activities are useful for task performance, guidance, and inspiration.
Book
The World Wide Web has enabled the creation of a global information space comprising linked documents. As the Web becomes ever more enmeshed with our daily lives, there is a growing desire for direct access to raw data not currently available on the Web or bound up in hypertext documents. Linked Data provides a publishing paradigm in which not only documents, but also data, can be a first class citizen of the Web, thereby enabling the extension of the Web with a global data space based on open standards: the Web of Data. In this Synthesis lecture we provide readers with a detailed technical introduction to Linked Data. We begin by outlining the basic principles of Linked Data, including coverage of relevant aspects of Web architecture. The remainder of the text is based around two main themes: the publication and consumption of Linked Data. Drawing on a practical Linked Data scenario, we provide guidance and best practices on: architectural approaches to publishing Linked Data; choosing URIs and vocabularies to identify and describe resources; deciding what data to return in a description of a resource on the Web; methods and frameworks for automated linking of data sets; and testing and debugging approaches for Linked Data deployments. We give an overview of existing Linked Data applications and then examine the architectures that are used to consume Linked Data from the Web, alongside existing tools and frameworks that enable these. Readers can expect to gain a rich technical understanding of Linked Data fundamentals, as the basis for application development, research or further study.
Conference Paper
Arabic is a language with a particularly large vocabulary rich in words with synonymous shades of meaning. Modern Standard Arabic, which is used in formal writings, is the ancient Arabic language incorporated with loanwords derived from foreign languages. Different synonyms and loanwords tend to be used in different writings. Indeed, the Arabic composition style tends to vary throughout the Arab countries (Abdelali, 2004). Relevant documents could be overlooked when the query terms are synonyms or related to the ones used in the document collection. This could deteriorate the performance of a cross lingual information retrieval (CLIR) system. Query expansion (QE) using the document collection is the usual approach taken to enrich translated queries with context related terms. In this study, QE is explored for an English-Arabic CLIR system in which English queries are used to search Arabic documents. A thesaurus-based disambiguation approach is applied to further optimize the effectiveness of that technique. Indeed, experimental results show that QE enhanced by disambiguation gives an improved effectiveness.
Conference Paper
The graph pattern is the key body of a SPARQL query. Prevalent SPARQL query treatment delivers queries to the OWL ontology model directly. To obtain inference results, graph patterns are matched against an inference ontology model generated by an ontology inference engine. Since an inference model occupies much more space than the original model, and cannot be reused as inference requirements vary, this method is not suitable for wide deployment at large scale. Alternatively, this paper proposes a novel method which sends rewritten graph patterns to the original ontology model to acquire inference results. This method features the reuse of the original model between users, and avoids the heavy workload caused by generating and storing an ontology inference model. The paper defines the ontology inference rules which affect query resolving, and describes a detailed process for rewriting graph patterns based on these rules. A prototype system is implemented to compare our proposal with current approaches. The experimental results show our method's advantages in the aspects of completeness, soundness and effectiveness.
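The rewrite-instead-of-materialize idea can be sketched for a single subclass rule: instead of adding inferred `rdf:type` triples to the model, expand the query's type pattern to cover all subclasses. The classes, triples and rule below are illustrative assumptions, not the paper's rule set:

```python
# Rewrite rather than materialize: expand the query's graph pattern
# with the inference rules, then match against the original triples.
subclass_of = {":Car": ":Vehicle", ":Bus": ":Vehicle"}

triples = [
    (":car1", "rdf:type", ":Car"),
    (":bus1", "rdf:type", ":Bus"),
    (":v1",   "rdf:type", ":Vehicle"),
]

def rewrite_type_pattern(cls):
    """Expand one `?x rdf:type cls` pattern into the set of classes
    whose instances the subclass rules would classify as cls."""
    classes = {cls}
    changed = True
    while changed:              # follow subclass chains to a fixpoint
        changed = False
        for sub, sup in subclass_of.items():
            if sup in classes and sub not in classes:
                classes.add(sub)
                changed = True
    return classes

def match_instances(cls):
    wanted = rewrite_type_pattern(cls)
    return sorted(s for s, p, o in triples
                  if p == "rdf:type" and o in wanted)

print(match_instances(":Vehicle"))
```

The original triple store stays untouched and shared between users; only the (small) pattern grows, which is the trade-off the paper argues for over storing a full inference model.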
Article
For years, ontologies have been known in computer science as consensual models of domains of discourse, usually implemented as formal definitions of the relevant conceptual entities. Researchers have written much about the potential benefits of using them, and most of us regard ontologies as central building blocks of the semantic Web and other semantic systems. Unfortunately, the number and quality of actual, "non-toy" ontologies available on the Web today is remarkably low. This implies that the semantic Web community has yet to build practically useful ontologies for a lot of relevant domains in order to make the semantic Web a reality. Theoretically minded advocates often assume that the lack of ontologies is because the "stupid business people haven't realized ontologies' enormous benefits." As a liberal market economist, the author assumes that humans can generally figure out what's best for their well-being, at least in the long run, and that they act accordingly. In other words, the fact that people haven't yet created as many useful ontologies as the ontology research community would like might indicate either unresolved technical limitations or the existence of sound rationales for why individuals refrain from building them, or both. Indeed, several social and technical difficulties exist that put a brake on developing ontologies and eventually constrain the space of possible ontologies.
Conference Paper
In B2B relationships, electronic product catalogs and the respective catalog data gain an important meaning as the starting point for procurement decisions. Suppliers have to provide catalog data for their customers and marketplaces in standardized XML formats and defined quality. In contrast to B2C, catalog usage in B2B is characterized by the fact that data of the catalog-creating enterprise is imported into an information system (target system) of the catalog-receiving enterprise. Despite the application of standardized catalog formats, a considerable amount of coordination and communication between the involved enterprises is often necessary. Especially in the initialization phase, when the first exchange between two partners is established, many adjustments regarding the syntax, contents and quality of the transmitted data have to be made. A starting point for the improvement of exchange processes is extending the XML catalog standards so that they support the coordination and the exchange more widely by providing an appropriate process model and additional business messages. The paper pursues this approach by examining the catalog exchange processes for gaps and inadequacies, and developing a three-stage improvement concept that can be used for the extension of commercial XML catalog standards.
Unit C4, Economic Dimension of Public Procurement; e-Procurement, European Commission, consultation on the Green Paper on expanding the use of e-Procurement in the EU, April 2010. http://ec.europa.eu/internal_market/consultations/2010/e-procurement_en.htm.
D. Bennett and A. Harvey, Publishing Open Government Data, W3C working draft, W3C, 2009. http://www.w3.org/TR/gov-data/.
R. Cyganiak and A. Jentzsch, The Linking Open Data cloud diagram, November 2011. http://richard.cyganiak.de/2007/10/lod/.
LOTED Project, Linked Open Tenders Electronic Daily, 2010. http://loted.eu:8081/LOTED1Rep/.
UK Government, Opening up government, December 2011. http://data.gov.uk/.
Ch. Taggart and R. McKinnon, The Open Database of The Corporate World, 2010. http://opencorporates.com/.