Peter Vojtáš

Charles University in Prague | CUNI · Department of Software Engineering

PhD

About

194
Publications
12,415
Reads
1,944
Citations
Citations since 2017
15 Research Items
389 Citations
[Chart: citations per year, 2017 to 2023; vertical axis 0 to 80]
Introduction
My main interest is preference learning for each e-shop user separately, and hence better recommendation: from Models and Methods to Prototypes and (so far) offline Experiments on real and/or generated Data with respect to different Metrics (mainly at the top), abbreviated MMPEDM. My second interest is web semantization, namely automating the process of semantization of third-party web resources, again following MMPEDM. This can also be understood as a contribution to data-driven fuzzy technology from a software engineering point of view.
Additional affiliations
September 2005 - present
Charles University in Prague
Position
  • Professor (Full)
Description
  • Teaching: Bc course Information Models; Master course Web Semantization; Master course Querying with Preferences. Supervised 2 defended PhD theses (2 in progress)
September 2005 - present
Charles University in Prague
Position
  • Professor (Full)
Description
  • Online learning of web user preferences; web semantization (automated extraction and annotation); use cases, models, methods, data and experiments for validation; uncertain reasoning (deduction, induction, abduction)
January 1998 - December 2009
The Czech Academy of Sciences
Position
  • Senior Researcher
Education
October 1976 - January 1981
Charles University in Prague
Field of study
  • Mathematical Logic
October 1969 - September 1974
Charles University in Prague
Field of study
  • Mathematics - theoretical Informatics

Publications (194)
Preprint
Full-text available
We study the Challenge-Response Framework (CRF, formerly Galois-Tukey) as a formalism for handling reductions of real situations to model situations in order to get clues for solutions. In a situation we recognize the challenge side (input, query, problem, …) and the response side (output, answer, solution, …). The two sides of a situation are equipped with a notio...
Article
Data mining from unstructured data can be skillfully employed to improve the performance of manufacturing or industrial processes. The main goal of this paper is to create a fast emergency aid system for object detection in SME industrial premises. The basic assumption is that SMEs do not have any IT-trained personnel, and the solution has to be un...
Chapter
Full-text available
We study possibilities and ways to increase automation, efficiency, and digitization of industrial processes by integrating knowledge gained from UAV (unmanned aerial vehicle) images with systems to support managerial decision-making. Here we present our results in the secondary wood processing industry. First, we present a deployed solution for re...
Chapter
A real problem use case represents a challenge. This is usually transformed (reduced) to a model. We expect the model to give a response/solution which is (at least to a degree) acceptable, that is, meets the challenge. Moreover, this challenge-response understanding has two levels: both the real-world situation and the model situation contain a challenge side (i...
Conference Paper
In this paper, we present our work towards comparing on-line and off-line evaluation metrics in the context of small e-commerce recommender systems. Recommending on small e-commerce enterprises is rather challenging due to the lower volume of interactions and low user loyalty, rarely extending beyond a single session. On the other hand, we usually...
Article
Full-text available
Our customer preference model is based on aggregation of partly linear relaxations of value filters often used in e-commerce applications. Relaxation is motivated by the Analytic Hierarchy Process method and by combining fuzzy information in web-accessible databases. In low dimensions our method is also well suited for data visualization. The proce...
Chapter
Our customer preference model is based on aggregation of partly linear relaxations of value filters often used in e-commerce applications. Relaxation is motivated by the Analytic Hierarchy Process method. In low dimensions our method is also well suited for data visualization.
Preprint
Full-text available
In this paper, we present our work towards comparing on-line and off-line evaluation metrics in the context of small e-commerce recommender systems. Recommending on small e-commerce enterprises is rather challenging due to the lower volume of interactions and low user loyalty, rarely extending beyond a single session. On the other hand, we usually...
Article
Full-text available
Our work is generally focused on making recommendations for small or medium-sized e-commerce portals, where we are facing scarcity of explicit feedback, low user loyalty, short visit durations or a low number of visited objects. In this paper, we present a novel approach to use a specific user behavior pattern as implicit feedback, forming binary r...
Article
The 28th 2017 ACM international conference on Hypertext and Social Media will be held in Prague, Czech Republic, from July 4 to 7. This newsletter article briefly introduces the conference and its venue. We hope to meet you all at Hypertext 2017! https://ht.acm.org/ht2017/
Conference Paper
Our starting motivation is a user visiting an e-shop. E-shops usually offer conjunction of sharp filter conditions and one attribute ordering of results. We use a top-k query system where results are ordered by a multi-criterial monotone combination of soft filter conditions. For prediction of users’ behavior, we introduce a class of basis function...
Article
Full-text available
Our research aims to address the challenge of multi-user personalized recommendation. We hypothesize that data-driven fuzzy technology can be used even in the case of large, extremely sparse data and a large number of users. In this paper we deal with this problem methodologically; technical details appear elsewhere. To test our hypothesis, we first use fuzz...
Chapter
We consider the problem of user-item recommendation as multiuser instance ranking learning. A user-item preference is monotonizable if the learning can be restricted to monotone models. A preference model is monotone if it is a monotone composition of rankings on domains of explanatory attributes (possibly describing user behavior, item content but al...
Conference Paper
In this paper we are concerned with user understanding in content-based recommendation. We assume explicit ratings with time-stamps from each user. We integrate three different movie data sets, trying to avoid features specific to a single data set and to be more generic. We use several metrics which were not used so far in the recommender sy...
Conference Paper
Full-text available
Our work is generally focused on recommending for small or medium-sized e-commerce portals, where we are facing scarcity of explicit feedback, low user loyalty, short visit times or low number of visited objects. In this paper, we present a novel approach to use specific user behavior as implicit feedback, forming binary relations between objects....
Conference Paper
Full-text available
In this paper, we present our work in progress on using LOD data to enhance recommending on existing e-commerce sites. We imagine a situation of e-commerce website employing content-based or hybrid recommendation. Such recommending algorithms need relevant object attributes to produce useful recommendations. However, on some domains, usable attribu...
Conference Paper
The main motivation of this paper is a support of knowledge management for small to medium enterprises (business). We present our tool sitIT.cz which was developed to support communication of IT specialists (both from academia and business) using public funding. The main message of this paper is that this tool is quite generic and can be used in di...
Article
In this paper we describe details of our approach to the RecSys Challenge 2014: User Engagement as Evaluation. The challenge was based on a dataset, which contains tweets that are generated when users rate movies on IMDb (using the iOS app in a smartphone). The challenge for participants is to rank such tweets by expected user interaction, which is...
Article
Full-text available
Our research is focused on interpreting user preference from his/her implicit behavior. There are many types of relevant behavior, e.g. time on page, scrolling, clickstream, etc., which we will further denote as Relevant Behavior Types (RBT). RBTs vary both in quality and incidence and thus we might need different approaches to process them. In th...
Conference Paper
Full-text available
In our work, we focus on recommending for small or medium-sized e-commerce portals. Due to high competition, users of these portals lack loyalty and e.g. refuse to register or provide any/enough explicit feedback. Furthermore, products such as tours, cars or furniture have very low average consumption rate preventing us from tracking unregistered u...
Conference Paper
Full-text available
We consider applications of user preference rule learning in marketing. We chose rules because they are human-understandable. We chose fuzzy logic because it enables ordering items for recommendation. In this paper we introduce a rule-based system equivalent to the Fagin-Lotem-Naor preference system. We show a multi-user version, introduce induction a...
Conference Paper
Full-text available
In this paper we describe the approach of our SemWex group to the ESWC 2014 RecSys Challenge. Our method is based on using an adaptation of Content Boosted Matrix factorization [1], where objects are defined through their content-based features. Features were comprised of both direct DBPedia RDF triples and derived semantic information (with some WIE...
Conference Paper
Full-text available
In this paper, we focus on small or medium-sized e-commerce portals. Due to high competition, users of these portals are not too loyal and e.g. refuse to register or provide any/enough explicit feedback. Furthermore, products such as tours, cars or furniture have very low average consumption rate preventing us from tracking unregistered user betwee...
Conference Paper
Full-text available
In this paper, we present an innovative method to use Linked Open Data (LOD) to improve content-based recommender systems. We have selected the domain of secondhand bookshops, where recommending is extraordinarily difficult because of the high ratio of objects to users, the lack of significant attributes and the small number of the same items in stock. Those diffic...
Book
Full-text available
This volume contains workshop papers, poster abstracts, and tutorial materials of the 13th ITAT conference, which took place on September 11-15, 2013 at Donovaly, Slovakia. ITAT is a computer science conference with the primary goal of presenting new results of young researchers and doctoral students from Slovakia and the Czech Republic. The confer...
Conference Paper
Full-text available
In this paper, we imagine the situation of a typical e-commerce portal employing personalized recommendation. Such a website typically receives user feedback from implicit behavior such as time on page, scrolling, etc. Implicit feedback is generally understood as positive only; however, we present several methods to identify some of the i...
Conference Paper
In this paper we evaluate various approaches to user profile modelling for news recommendation. We represent a user profile as a bag of real-world entities the user is interested in. News articles are thus recommended based on the concepts they contain and not on text similarity. We propose several ways of such a user profile construction b...
Article
This paper includes five contributions on the topic of multimedia information systems for social, cross-cultural and environmental computing. Approaches, models, and methods for Web Semantization, Cross-cultural Image Computing, Information Modelling and Data Mining, Mobile Information Systems for Ubiquitous Society, and Multimedia Systems for Cros...
Article
This paper contains my memories of how I arrived at the field of fuzziness, and my personal views on its present stage and expectations for it. The main message is: go back to the roots and start with real-world problems, large-scale data and solutions, as L. A. Zadeh did 50 years ago in his applications which emerged into fuzzy theory. I will illustrate thes...
Article
Full-text available
In this paper we summarize the efforts of our research group on web semantization (the process of increasing the degree of automation of web processing) and some of its applications. We present several methods for mining textual information and assisted annotations, as we believe these should be the first steps towards the semantic web. Then several methods fo...
Article
This paper describes the concept and some preliminary experiments of an extension of the sitIT.cz portal, the social network of ICT specialists in the Czech Republic. SitIT.cz interconnects ICT specialists and offers effective search according to several types of structured (machine-readable) profiles. It is intended to support technology transfer, shari...
Article
In this chapter we describe our project under development and proof of concept for creating large Open-Linked Data repositories. The main problem is twofold: (1) Who will create (annotate) Open-Linked Data and in which vocabularies? (2) What will be the usage and profit of it? For the first problem we propose several procedures on how to create Ope...
Chapter
In this paper, we describe the current state of the development of the web portal SitIT.cz. The portal is being developed in the scope of an EU-funded regional project SOSIREČR (http://www.sosirecr.cz). It is based on the concept of a social network which has become a very common concept in recent years. It differs from the existing portals in its s...
Conference Paper
Purpose: The purpose of this paper is to focus on the problem of named entity disambiguation. The paper disambiguates named entities on a very detailed level. To each entity is assigned a concrete identifier of a corresponding Wikipedia article describing the entity. Design/methodology/approach: For such a fine-grained disambiguation a correct repre...
Article
Full-text available
In this paper, we discuss the importance of different types of implicit user feedback for creating useful recommendations on an e-commerce website. Each website user may provide us with many different types of implicit feedback and it is difficult to decide which one to use for recommendations. If our recommendation algorithm supports using more imp...
Article
Full-text available
In this paper we study the problem of classification of textual web reports. We are specifically focused on situations in which structured information extracted from the reports is used for classification. We present an experimental classification system based on usage of third party linguistic analyzers, our previous work on web information extrac...
Article
Full-text available
In this paper, we focus on the situation of a typical e-commerce portal employing personalized recommendation. Such a website could, in addition to the explicit feedback, monitor many different patterns of implicit user behavior (implicit factors). The problem arises while trying to infer connections between observed implicit behavior and user prefer...
Article
In this paper we summarise our acquaintance with preference learning after a series of papers - presenting models, algorithms and experiments with preference learning in e-shop environment. We recall some achievements, several observations and problems left, together with thorough description of the preference learning. We conclude with our future...
Article
Full-text available
Monotone prediction problems, in which the target variable is non-decreasing given an increase of the explanatory variables, have become more popular nowadays in many problem settings which fulfill the so-called monotonicity constraint, namely, if an object is better in all attributes than another one then it should not be classified lower. Recent a...
Conference Paper
Our main motivation is the data access model and aggregation algorithm for middleware by R. Fagin, A. Lotem and M. Naor. They assume data attributes in a variety of repositories ordered by a grade of attribute values of objects. Moreover they assume the user has an aggregation function, which eventually qualifies an object to top-k answers. In this...
Conference Paper
This paper describes the concept of a social network of the ICT specialists in the regions of the Czech Republic. In particular, we focus on the web portal under development, i.e. a software tool serving for the network implementation. Associated activities concerning collecting and analyzing ICT requirements from companies and educational ICT know...
Article
Full-text available
In this paper we introduce our idea of a semantic information filtering system. Contrary to traditional information filtering systems exploiting information retrieval techniques to select relevant data, we propose a workflow exploiting semantic information obtained from the web. Our system utilises the structured information crawled from the seman...
Conference Paper
Full-text available
http://www.thinkmind.org/index.php?view=article&articleid=semapro_2011_2_10_50013 Information extraction (IE) and automated semantic annotation of text are usually done by complex tools. These tools use some kind of a model that represents the actual task and its solution. The model is usually represented as a set of extraction rules (e.g., regula...
Conference Paper
Full-text available
The main topic of this paper is description of a proposal of a web shop with user preference searching - PrefShop. A typical web shop was implemented, but new capabilities were added to help the user with finding desired object. Besides preference search, visual hints that clarify the object relevance or the lack of relevance are proposed.
Conference Paper
In this paper we study scoring and order approach to concept interpretation in description logics. Only concepts are scored/ordered, roles remain crisp. The concepts in scoring description logic are fuzzified, while the concepts in order description logic are interpreted as preorders on the domain. These description logics are used for preferentia...
Conference Paper
In this paper we deal with a task to learn a general user model from user ratings of a small set of objects. This general model is used to recommend top-k objects to the user. We consider several (also some new) alternatives of learning local preferences and several alternatives of aggregation (with or without 2CP-regression). The main contribution...
Chapter
User preference is often a source of uncertainty in web search. We propose an order-oriented description logic suited especially to represent user preference. Concepts are interpreted as preorders of individuals from the domain. We redefine reasoning tasks to reflect order-oriented approach and we present an algorithm for order instance problem in...
Conference Paper
In this paper we present a proposal of a system that combines various methods of user modelling. This system may find its application in e-commerce, recommender systems, etc. The main focus of this paper is on automatic methods that require only a small amount of data from user. The different ways of integration of user models are studied. A proof-...
Conference Paper
The task of similarity search is widely used in various areas of computing, including multimedia databases, data mining, bioinformatics, social networks, etc. For a long time, the database-oriented applications of similarity search employed the definition of similarity restricted to metric distances. Due to the metric postulates (reflexivity, non-n...
Conference Paper
We propose an alternate method for indexing data for answering queries in non-metric spaces. The traditional use of distance and triangle inequality is substituted with the use of fuzzy similarity fulfilling the transitivity property with a tuneable fuzzy conjunctor. In a non-metric space it is still possible that there is a fuzzy conjunctor such t...
Conference Paper
This contribution describes clustering of the most informative keywords within full-text query results and its visualization in 2D or 3D space using so-called sociomapping. The main goal of the clustering is to help the user with orientation in the term space and with reformulating (specifying in more detail) ambiguous queries. Test data were obta...
Article
Full-text available
We present models, methods, implementations and experiments with a system enabling personalized web search for many users with different preferences. The system consists of a web information extraction part, a text search engine, a middleware supporting top-k answers and a user interface for querying and evaluation of search results. We integrate s...
Article
Full-text available
Searching top-k objects for many users faces the problem of different user preferences. The family of Threshold algorithms computes top-k objects using sorted access to ordered lists. Each list is ordered w.r.t. user preference to one of objects' attributes. In this paper the index based methods to simulate the sorted access for different user pref...
Conference Paper
The retrieval problem is one of the main reasoning problems for ontology based systems. The retrieval problem for concept C consists in finding all individuals a which satisfy C(a). We present ontology transformation which can help to improve evaluating queries over (sublanguage of) OWL ontologies. Our solution is based on translating retrieval con...
Conference Paper
Learning user preferences is a complex area, especially difficult for performing experiments - every person is different and has different preferences, which often change in time. In this paper, we propose a method for testing a preference learning method that is in a sense more general than our previous attempts of testing an inductive method. We...
Conference Paper
We present a chain of techniques for extraction of object attribute data from web pages which contain either multiple object data or detailed data about a single object. We discover data regions containing multiple data records, which will be extracted with help of extraction ontology. Furthermore, we present an additional algorithm for detail-page...
Conference Paper
Full-text available
In this paper we present a fuzzy system which provides a fuzzy classification of textual web reports. Our approach is based on usage of third party linguistic analyzers, our previous work on web information extraction and fuzzy inductive logic programming. Main contributions are formal models and prototype implementation of the system and evaluatio...
Conference Paper
Full-text available
Web 2.0 is based on personalized access to imprecise and incomplete information from heterogeneous sources. In this paper we present a web-based system enabling preference-based search. We use modified Fagin's model of fuzzy preferences based on aggregation of attribute preferences. Aggregation is generated by user ranking of objects. Our contrib...
Conference Paper
Full-text available
This paper studies a possibility to learn a complex user preference model, based on CP-nets, from user ratings. This work is motivated by the need of user modelling in decision making support, for example in e-commerce. We extend our user model based on fuzzy logic to capture variation of preference objectives. The proposed method 2CP-regression...
Conference Paper
Web Semantization is a concept we introduce in this paper. We understand Web Semantization as an automated process of increasing the degree of semantic content on the web. Part of the content of the web is further usable; semantic content (usually annotated) is more suitable for machine processing. The idea is supported by models, methods, prototypes a...
Conference Paper
In this paper we deal with the problem of learning user preferences from the user's scoring of a small sample of objects with labels from a very small linearly ordered set. The main task of this process is to use these preferences for a top-k query, which provides the user with an ordered list of k highest ranked objects. We deal with a problem of...
Conference Paper
Full-text available
Methods of top-k search with no random access can be used to find k best objects using sorted access to the sources of attribute values. In this paper we present new heuristics over the NRA algorithm that can be used for fast search of top-k objects using wide range of user preferences. NRA algorithm usually needs a periodical scan of a large numbe...
Conference Paper
Full-text available
Methods of top-k search with no random access can be used to find k best objects using sorted lists of attributes that can be read only by sorted access. Such methods usually need to work with a large number of candidates during the computation. In this paper we propose new methods of no random access top-k search that can be used to compute k best...
Article
Full-text available
We propose a model of a middleware system enabling personalized web search for users with different preferences. We integrate both inductive and deductive tasks to find user preferences and consequently best objects. The model is based on modeling preferences by fuzzy sets and fuzzy logic. We present the model-theoretic semantics for fuzzy descript...
Conference Paper
In this paper we introduce a new "scoring" description logic s-EL(D) with concept instance ordering and top-k restriction queries. This makes it possible to create ontologies describing user preferences (ordering of concept instances) and to describe concepts consisting of top-k instances according to this ordering. We construct an algorithm for instance proble...
Conference Paper
This paper contains five contributions of the participants of the panel discussion on “Multi-agent Knowledge Modelling” in the EJC 2008 conference. We addressed four main topics: (a) Semantic Web technologies, (b) reality vs. agents, (c) cross-cultural knowledge and (d) communication of agents. Each of the discussants presented his/her view of the...
Conference Paper
A bottleneck for semantic web services is the lack of semantically annotated information. We deal with linguistic information extraction from Czech texts from the Web for semantic annotation. The method described in the paper exploits existing linguistic tools created originally for a syntactically annotated corpus, Prague Dependency Treebank (PDT 2.0)....
Conference Paper
Full-text available
We focus on replacing human processing of web resources by automated processing. On an experimental system we identify uncertainty issues making this process difficult for automated processing and try to minimize human intervention. In particular we focus on uncertainty issues in a Web content mining system and a user preference mining system. We conc...
Conference Paper
Full-text available
In this position paper we discuss the what, who, when, where, why and how of uncertain reasoning based on achievements of URW3XG [2], our experiments and some future plans. What and Why – improving semantic web practice through uncertain reasoning. This vision is described in the URW3XG charter (see [2]), especially the objective is "to identify an...