Ivan Dunđer

University of Zagreb · Department of Information Sciences

About

Publications: 40
Reads: 25,333
Citations: 225

Publications (40)
Article
Full-text available
Automatic machine translation is an increasingly popular research topic in science and in various scientific disciplines, such as information and communication sciences, computer science, computational linguistics, etc. The main reason is that it now enables indispensable communication and fast transfer of information between different natural languages. It is...
Chapter
This paper describes a novel tool for concordance searching, named Concordia. It combines the capabilities of standard concordance searchers with the usability of a translation memory. The tool is described in detail with regard to main applied methods and differences when compared to already existing CAT tools. Concordia uses three data structures...
Chapter
Full-text available
Increased use of computer-assisted translation (CAT) technology in business settings with augmented amounts of tasks, collaborative work, and short deadlines gives rise to errors and the need for quality assurance (QA). The research has three operational aims: 1) methodological framework for QA analysis, 2) comparative evaluation of four QA tools, 3...
Article
With the accelerated development and great rise in popularity of video games, the video game industry has undergone a significant process of globalization in the last few years. Video games today are an important factor in the digital entertainment industry and attract many players, regardless of gender, age, or social, political or economic status. This popularity and the rapid development of video...
Article
Full-text available
Machine translation is increasingly becoming a hot research topic in information and communication sciences, computer science and computational linguistics, due to the fact that it enables communication and transferring of meaning across different languages. As the Croatian language can be considered low-resourced in terms of available services and...
Chapter
Spatiality is a term used to describe the attributes of a given space, its various cultural identities in an established time, differentiated from the notion of territoriality. While territoriality is naturally bound by the established limits of the national territory of a state, spatiality overcomes geographical distinctions and focuses on the ide...
Article
Full-text available
Ease of access to and low cost of hardware and software for 3D scanning have made 3D technologies increasingly popular in recent research. One of the possible 3D scanning approaches is photogrammetry, which relies on a data set consisting of photographs of the same physical object. This paper evaluates different 3D models generated from...
Article
This paper explains the 3D scanning procedure of creating a virtual 3D model from photographs by using a process called photogrammetry. It starts by giving a technical explanation of different technologies for 3D scanning, explains why photogrammetry was chosen and gives general specifications of hardware and software used in the process. The whole...
Conference Paper
Full-text available
This paper presents an idea to bring crowdsourcing to a higher level, for the purpose of acquiring valuable machine translation and natural language processing resources. In the proposed scenario, students are being educated in order to improve the quality and effectiveness of their natural language processing (NLP) related work. Their motivation i...
Poster
Full-text available
The poster presents three basic steps of the TMrepository system, which is used for collecting parallel data through crowdsourcing:
- collecting and uploading parallel corpora
- review and quality check
- getting a top ranking (gamification)
Conference Paper
Full-text available
There is clear evidence of intense activities concerning start-up companies and their founders. In particular, EU governments play an active role in investing taxpayers' money into start-ups, which are, by their very definition, high-risk business ventures. To evaluate such market interventions, interested parties have to develop precise methodologi...
Conference Paper
Full-text available
The paper presents an automatic extraction process from monolingual text, performed by three language-independent tools that rely on different principles. The research is conducted in the domain of pharmaceutical documentation. After the digitization process and the use of OCR techniques, the automatic extraction process is performed. Results are compar...
Article
Full-text available
The Semantic Web, as a trend-setting technological paradigm of the World Wide Web, has attracted a large and diverse community of educational and research institutions, as well as enterprise companies, all sharing a common belief that one day the Semantic Web will reshape the way the current World Wide Web functions and is being used. The groundbreaking ide...
Conference Paper
Full-text available
Twitter is currently the most popular tool for social interaction and real-time information exchange. The outreach and importance of individual accounts are measured by the number of their followers. The aim of this paper is to investigate the applicability and usefulness of corpora containing textual and visual information for the purpose of machine ob...
Conference Paper
Full-text available
Affective computing opens a new area of research in computer science with the aim of improving the way humans and machines interact. Recognition of human emotions by machines is becoming a significant focus in recent research in different disciplines related to information sciences and Human-Computer Interaction (HCI). In particular, emotion reco...
Conference Paper
Full-text available
The socio-technical systems research paradigm is about the complexity of real situations. It confronts us with the quest for variables that could provide us with insight into the behavior of such systems. Their behavior emerges according to internal system properties and adaptation of the system to external conditions. In our view, behavioral patte...
Conference Paper
Full-text available
Electronic document and records management systems (EDRMS), which are most often parts of a more comprehensive enterprise content management system (ECMS), capture documents and records that originate in digital form as well as those that have been digitized. While born-digital records are relatively easy to describe at the time of their creation and...
Article
Full-text available
Automatic quality evaluation of machine translation systems has become an important issue in the field of natural language processing, due to raised interest and needs of industry and everyday users. Development of online machine translation systems is also important for less-resourced languages, as they enable basic information transfer and commun...
Conference Paper
Full-text available
In this research, a specific data set was machine translated by two publicly available machine translation services, Google Translate and Yandex.Translate. Machine translations were performed for two language pairs: English-Croatian and Russian-Croatian. Afterwards, automatic quality evaluation of the machine translated data set was carried out. Se...
Conference Paper
Full-text available
Automatic quality evaluation of machine translation systems has become an important issue in the field of natural language processing, due to raised interest and needs of industry and everyday users. Development of online machine translation systems is also important for less-resourced languages, as they enable basic information transfer and commun...
Conference Paper
Full-text available
Extended abstract presenting our ongoing research.
Conference Paper
This is an extended abstract presenting our ongoing research.
Conference Paper
Full-text available
Literature discussing scientific communication considers two types of communities that communicate around scientific output: primary and secondary communities. Primary communities encompass scientists, researchers, and scholars. Secondary communities involve practitioners from public, private, and non-governmental organizations. This paper gives an...
Article
Full-text available
Abstract: The paper proposes an approach to the design of an information system based on primary and secondary experience. The proposed approach was developed for an empirical study of how public, private and non-governmental organizations use scientific publications created at academic institutions, and for the purpose of building an information...
Article
Full-text available
This paper presents results of human evaluation of machine translated texts for one non closely-related language pair, English-Croatian, and for one closely-related language pair, Russian-Croatian. 400 sentences from the domain of tourist guides were analysed, i.e. 100 sentences for each language pair and for two online machine translation services...
Article
Full-text available
The paper presents combined automatic speech recognition (ASR) of English and machine translation (MT) for the English-Croatian and Croatian-English language pairs in the domain of business correspondence. The first part presents results of training the commercial ASR system on English data sets, enriched by error analysis. The second part presents...
Article
Full-text available
In the paper the evaluation of formant speech synthesis for Croatian is conducted in four domains: hotel reservation, insurance, automobile industry and weather forecast. Human evaluation is performed in order to evaluate quality of speech according to criteria of sentence comprehensibility, word intelligibility and correctness of word pronunciatio...
Article
Full-text available
The formant speech synthesis method mimics the time-varying formant frequencies of human speech and does not use prerecorded speech samples. In this paper, related work is discussed and an experiment is conducted using the formant synthesis-based text-to-speech tool CroSS (Croatian Speech Synthesizer), in order to assess and evaluate the quality of synthes...
Conference Paper
Full-text available
Monolingual and multilingual terminology and collocation bases represent valuable additional electronic resources, which can be used in further research, in written communication and in everyday communication. Building of such resources can be supported by terminology extraction tools relying on statistical or language approaches, or on hybrid mode...
Conference Paper
Full-text available
Monolingual and multilingual terminology and collocation bases represent valuable additional electronic resources, which can be used in further research, in written communication and in everyday communication. Building of such resources can be supported by terminology extraction tools relying on statistical or language approaches, or on hybrid mode...

Projects (3)
Project
Digital Document Classification
Project
The goal of this project is to develop crucial improvements to affective multimedia databases, or affective repositories, as they are also called, in regard to facilitated construction and expressive annotation, improved searching and multifaceted retrieval of emotion-inducing documents, simplified usage in experimentation, and supported sharing of expert knowledge among involved researchers. These datasets are an important and frequently used research tool in psychology, neurology, and cognitive sciences. By using modern approaches from computer science in the fields of knowledge representation and automated reasoning, machine learning, information retrieval, decision support systems, and recommender systems, many of their features may be significantly improved over existing de facto standards. Successful implementation of computer science technologies could further expedite advanced research into human emotion, attention, perception, and behavior. Improved datasets of affectively annotated multimedia documents are called advanced affective multimedia databases.
Project
Concordia is a C++ library for fast text lookup in large corpora. It uses a RAM stored index, which takes up approximately 600MB of memory for a corpus of 2 million sentences. It is based on the idea of a suffix array, enhanced by the presence of other auxiliary data structures.
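
The suffix array idea mentioned above can be illustrated with a minimal Python sketch. This is not Concordia's code or API: the function names `build_suffix_array` and `find_occurrences` are hypothetical, the index works on characters of a single string, and the construction is deliberately naive. It only shows why a sorted list of suffix positions allows fast lookup by binary search.

```python
def build_suffix_array(text: str) -> list[int]:
    """Return the start positions of all suffixes of `text`, sorted lexicographically."""
    # Naive construction for illustration only; too slow for a large corpus.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, suffix_array: list[int], pattern: str) -> list[int]:
    """Return the sorted start positions of every occurrence of `pattern` in `text`."""
    m = len(pattern)
    if m == 0:
        return []

    # Lower bound: first suffix whose first m characters are >= pattern.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose first m characters are > pattern.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[suffix_array[mid]:suffix_array[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # All suffixes in [start, lo) begin with the pattern.
    return sorted(suffix_array[start:lo])
```

For example, with `text = "the cat sat on the mat"` and `sa = build_suffix_array(text)`, the call `find_occurrences(text, sa, "the")` returns `[0, 15]`. Each lookup needs only a logarithmic number of comparisons in the length of the text, which is the property that makes a suffix array attractive for searching large corpora.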