Article

Digital library of University of Maribor (more than just a bunch of documents)

Abstract

This paper describes the technical background of the Digital Library of the University of Maribor. We start with a basic description of the library and its purpose, but the main focus is on features that are mostly not found in other digital libraries: plagiarism detection, informative and useful statistics, and specific content extraction. We present the existing functionality and describe some ideas for future development.

... In this paper we restrict ourselves to content-based document recommendation using the BM25 method with additional weights obtained from document metadata and from observing user activity. The input set comprises 20,000 documents from the Digital Library of the University of Maribor [4, 5] (hereafter DKUM). Each document is represented by its title, keywords, abstract and other metadata. ...
Conference Paper
We present a content-based document recommender system for digital libraries, built on the BM25 ranking function with additional ranking weights derived from document metadata and user activity. The recommendation uses the information in the titles, keywords and abstracts of the documents. We provide a comparison study of how different text preprocessing steps affect the quality of recommendation. These preprocessing steps include the use of words, phrases, lemmatisation and semantic tagging for automated keyword extraction.
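To make the ranking idea concrete, the following is a minimal sketch of BM25 scoring over pre-tokenised documents. It is not the authors' implementation: the function name `bm25_scores`, the defaults `k1=1.5` and `b=0.75`, the smoothed IDF variant, and the choice to multiply the BM25 score by a per-document weight are all assumptions made for illustration. The abstract does not say how the metadata and user-activity weights are combined with BM25.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75, weights=None):
    """Score each tokenised document in `docs` against a tokenised `query`.

    `weights` is a hypothetical list of per-document multipliers standing in
    for the metadata and user-activity weights mentioned in the abstract.
    """
    if not docs:
        return []
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / n

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for d in docs:
        df.update(set(d))

    scores = []
    for i, d in enumerate(docs):
        tf = Counter(d)
        score = 0.0
        for term in set(query):
            if term not in tf:
                continue
            # Smoothed IDF (one common variant; always non-negative).
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalisation.
            norm = tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            )
            score += idf * norm
        weight = 1.0 if weights is None else weights[i]
        scores.append(score * weight)
    return scores
```

A caller would tokenise the title, keywords and abstract of each document, score them against the tokens of the document currently being viewed, and recommend the highest-scoring items.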
... All of the above can be done with small modifications to the original algorithm and a careful choice of parameters. We successfully deployed our algorithm in the advanced search in DKUM [8], making it a fuzzy full-text search. We also used the algorithm in the upgraded version of the question answering system described in [10], as a process for detecting entities to resolve ambiguity. ...
Article
This article describes some common problems faced in natural language processing. The main problem consists of matching a user-given sentence against an existing knowledge base of semantically described words or phrases. The main difficulties in this process are outlined, and the most common solutions used in natural language processing are reviewed. A sequence matching algorithm is introduced as an alternative solution, and its advantages over the existing approaches are explained. The algorithm is described in detail, beginning with the algorithm for discovering the longest subsequences. The major components of the similarity measure are then defined, and the computation of the concurrence and dispersion measures is presented. Results of the algorithm's performance on a test set are shown, and different ways of using the algorithm are discussed. The work concludes with some ideas for the future and some examples where our approach can be used in practice.
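The abstract does not give the details of the subsequence discovery step or of the concurrence and dispersion measures. As a point of reference only, the classic dynamic program for the length of the longest common subsequence of two token sequences can be sketched as follows. The function name `lcs_length` is hypothetical, and this textbook technique is a stand-in, not the algorithm from the article.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences `a` and `b`.

    Uses the standard dynamic program, keeping only the previous row
    so that memory use is proportional to len(b).
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Extend the best subsequence that ends before both items.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise drop one item from either sequence.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]
```

Applied to word tokens, a user sentence and a knowledge-base phrase that share words in the same order yield a high value even when other words are inserted between them.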
Conference Paper
This paper describes our implementation of a natural language processing framework called TextProc. We start with a general overview of the framework and continue with a detailed description of its parts. The actual language processing is implemented as software plug-ins. Plug-ins can be combined into processes that perform a practical natural language processing function. One such process is plagiarism detection, which is explained in detail. The plagiarism detection process is used in the Digital Library of the University of Maribor, and the integration of the digital library with TextProc is also briefly described. The paper closes with some ideas for future development.
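The idea of plug-ins chained into a process can be illustrated with a small sketch. Nothing here reflects TextProc's real interfaces: the names `Plugin`, `tokenize`, `make_shingler`, `run_process` and `overlap` are hypothetical, and the Jaccard overlap of word n-grams is a simple stand-in for a similarity check, not the plagiarism detection method the paper describes.

```python
from typing import Callable, Dict, List

Doc = Dict[str, object]
Plugin = Callable[[Doc], Doc]  # a plug-in takes a document and returns an enriched copy


def tokenize(doc: Doc) -> Doc:
    """Plug-in: add lower-cased whitespace tokens."""
    return {**doc, "tokens": str(doc["text"]).lower().split()}


def make_shingler(n: int) -> Plugin:
    """Build a plug-in that adds the set of word n-grams ("shingles")."""
    def shingle(doc: Doc) -> Doc:
        tokens = doc["tokens"]
        grams = {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}
        return {**doc, "shingles": grams}
    return shingle


def run_process(plugins: List[Plugin], doc: Doc) -> Doc:
    """A process is an ordered list of plug-ins applied one after another."""
    for plugin in plugins:
        doc = plugin(doc)
    return doc


def overlap(a: Doc, b: Doc) -> float:
    """Jaccard similarity of two processed documents' shingle sets."""
    sa, sb = a["shingles"], b["shingles"]
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0
```

Under these assumptions, a process such as `[tokenize, make_shingler(2)]` is run once per document, and pairs whose `overlap` exceeds a chosen threshold would be flagged for manual review.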
Article
Today, most information is sought on the internet, commonly through search engines. However, because the result of a search engine query is a ranked list of results, this is not the final step. It is up to the user to review the results and determine which of them provides the information needed. This process is often time-consuming and does not always yield the information sought. Besides the number of returned results, the limiting factor is often the users' inability to formulate the correct query. One solution is a question answering system, where the user poses a question in natural language, much as when talking to another person. The system returns the exact answer instead of a list of possible results. This paper presents the design of a question answering system for natural-language questions in Slovene. The system searches for answers in our target domain (the Faculty of Electrical Engineering and Computer Science) using a local database, the databases of the faculty's information system, MS Excel files and web service calls. We have developed two separate applications: one for users and one for the administrators of the system. The user application is the system that answers the questions. With the administrator application, the administrators supervise the functioning and use of the entire system.