November 2024
·
2 Reads
Lexicographica - International Annual for Lexicography / Internationales Jahrbuch für Lexikographie
September 2022
·
14 Reads
We created a prototype of an electronic dictionary for the mathematical domain of graph theory. We evaluate our prototype and compare its effectiveness in task-based tests with that of Wikipedia. Our dictionary is based on a corpus; the terms and their definitions were automatically extracted and annotated by experts (cf. Kruse/Heid 2020). The dictionary is bilingual, covering German and English; it gives equivalents, definitions and semantically related terms. For the implementation of the dictionary, we used LexO (Bellandi et al. 2017). The dictionary targets students of mathematics who attend lectures in German and work with English resources. We carried out tests to understand which items the students search for when they work on graph-theoretical tasks. We ran the same test twice, with comparable student groups, allowing either Wikipedia or our dictionary as an information source. The dictionary seems to be especially helpful for students who already have a vague idea of a term, because they can use the resource to check whether their idea is right.
January 2022
·
1,157 Reads
·
2 Citations
Legal documents often have a complex layout with many different headings, headers and footers, side notes, etc. For further processing, it is important to extract these individual components correctly from a legally binding document, for example a signed PDF. A common approach is to classify each (text) region of a page using its geometric and textual features. This approach works well when the training and test data have a similar structure and when the documents of the collection to be analyzed have a rather uniform layout. We show that the use of global page properties can improve the accuracy of text element classification: we first classify each page into one of three layout types. After that, we train a classifier for each of the three page types and thereby improve the accuracy on a manually annotated collection of 70 legal documents consisting of 20,938 text elements. When we split by page type, we achieve an improvement from 0.95 to 0.98 for single-column pages with left marginalia and from 0.95 to 0.96 for double-column pages. We developed our own feature-based method for page layout detection, which we benchmark against a standard implementation of a CNN image classifier. The approach presented here is based on a corpus of freely available German contracts and general terms and conditions. Both the corpus and all manual annotations are made freely available. The method is language agnostic.
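The two-stage idea in the abstract above (classify the page layout first, then apply a classifier trained for that layout) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' method: the nearest-centroid classifier, the two page types (the abstract mentions three), the toy coordinates, and the helper names `train_centroids`, `predict`, `page_features` and `classify_page` are all hypothetical.

```python
from collections import defaultdict

def train_centroids(samples):
    """Nearest-centroid training. samples: list of (feature_tuple, label)."""
    sums, counts = {}, defaultdict(int)
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in vec] for label, vec in sums.items()}

def predict(centroids, features):
    """Return the label whose centroid is closest to the feature tuple."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(centroids, key=lambda label: dist2(centroids[label]))

# Hypothetical training pages: (page_type, [(x_left, width, label), ...]),
# with coordinates normalized to the page width.
PAGES = [
    ("marginalia", [(0.05, 0.12, "marginalia"), (0.20, 0.75, "body"), (0.20, 0.75, "body")]),
    ("double_column", [(0.05, 0.42, "body"), (0.53, 0.42, "body")]),
]

def page_features(elements):
    # One global page property (assumed): width of the widest text element.
    return (max(width for _, width, *_ in elements),)

# Stage 1: page-type classifier trained on global page properties.
page_model = train_centroids([(page_features(els), ptype) for ptype, els in PAGES])

# Stage 2: one element classifier per page type, deliberately using only x_left.
per_type_samples = defaultdict(list)
for ptype, els in PAGES:
    for x_left, _width, label in els:
        per_type_samples[ptype].append(((x_left,), label))
element_models = {ptype: train_centroids(s) for ptype, s in per_type_samples.items()}

# Baseline for comparison: a single element classifier over all pages.
global_model = train_centroids([s for group in per_type_samples.values() for s in group])

def classify_page(elements):
    """elements: list of (x_left, width). Returns (page_type, element_labels)."""
    ptype = predict(page_model, page_features(elements))
    model = element_models[ptype]
    return ptype, [predict(model, (x_left,)) for x_left, _width in elements]

new_page = [(0.06, 0.40), (0.54, 0.41)]
print(classify_page(new_page))
print(predict(global_model, (0.06,)))
```

In this toy data, a text element near the left edge is a marginal note on one page type and a body column on the other, so a single classifier that sees only the element position confuses the two, while routing by page type resolves the ambiguity. A real system would use richer geometric and textual features for both stages.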
January 2022
·
12 Reads
International Journal of Lexicography
1. Introduction (Michael Rundell). Sue Atkins, who died in September at the age of 90, was a true visionary, and one of the most important and influential lexicographers of this or any other era. Her long and distinguished career began in the 1960s, when dictionary-making inhabited a rather amateurish, index-carded (and male-dominated) milieu, and continued well into the 21st century. By the time Sue retired (if she ever truly did), corpus-based lexicography was the norm, collaborations between lexicographers and linguists were almost routine, and tools for semi-automatic dictionary compilation were already well advanced. But the important point here is not simply that Sue lived through these dramatic changes: she was one of the key people driving them, and her impact will continue to be felt for years to come. The eight pieces collected here come from friends and colleagues who worked with Sue at different points in her career. They testify to her massive contributions across the range of activities (technical, theoretical, and practical) which her career spanned. They also show the great admiration and affection in which Sue was held by all who knew her and worked with her. Reading these reminiscences, one can’t help noticing a number of recurrent themes: her remarkable insight into the nature of language; her readiness to engage with people from other communities (academic linguists, computer scientists, and so on) in order to take things forward; her boundless energy and infectious enthusiasm; her organisational skills and talent for getting things done (and for knowing what needed to be done); and her wit, charm, and great capacity for fun and friendship. Whatever the situation, Sue was always good company, as many of these reminiscences attest.
January 2022
·
14 Reads
Jusletter IT
December 2021
·
5 Reads
December 2021
·
6 Reads
December 2021
·
15 Reads
December 2021
·
12 Reads
Online reviews of artistic artifacts can initiate educational processes. Both the productive engagement with a work and the preparation of that experience in a review text for a specific audience hold great potential for cultural participation and for overcoming educational barriers. But which processes, content and contexts play a role here? The interdisciplinary research project Rez@Kultur addressed this question, and its results are presented comprehensively here for the first time. The findings are supplemented by follow-up perspectives and commentary from research and practice.
December 2021
·
4 Reads
Online reviews of artistic artifacts can initiate educational processes. Both the productive engagement with a work and the preparation of that experience in a review text for a specific audience hold great potential for cultural participation and for overcoming educational barriers. But which processes, content and contexts play a role here? The interdisciplinary research project Rez@Kultur addressed this question, and its results are presented comprehensively here for the first time. The findings are supplemented by follow-up perspectives and commentary from research and practice.
... The queries can incorporate the hierarchical structures implied by many annotation types (e.g. syllables contained in words) but can also relate multiple independent annotation categories, such as prosodic annotation and part-of-speech-tagging (Gut et al. 2004). ...
March 2004
... Based on their origin, electronic dictionaries can also be divided into two types: dictionaries transferred from existing print dictionaries (digitized print dictionaries) and dictionaries compiled for the electronic medium (purpose-built electronic dictionaries) (Svensén, 2009: 438-439). The properties of a print dictionary that has been adapted to the electronic medium have been described by Debus-Gregor and Heid (2013: 1002) as 'somewhat "in between" those of a paper dictionary and those of a dictionary conceived to exist exclusively as an electronic tool'. It should be noted, though, that these electronic dictionaries may be very close or nearly identical to the original print dictionaries or, due to extensive use of the advantages offered by the electronic medium, may already differ from them considerably. ...
December 2013
... It should be noted that, within the discussion on the language of legislation documents, among other issues, the focus on their titles/headings/headlines emerged in the past century (Carlson, 1968). The 21st century consistently recognizes the importance of titles/headlines/headings in legal documents, as the verbal representation of legal titles/headings either contributes to or downplays the visibility of the concept hierarchy within the above sources (Josi et al., 2022; Sanchez, 2019). Currently, scholars underline the importance of accurate and consistent verbal representation of legal titles/headings in the process of avoiding legal uncertainty (Mavroidis, 2022). ...
January 2022
... Lü and Zhou (2004) propose a model for translating between English and Chinese collocations that they extract from monolingual corpora parsed with the NLPWin parser (Heidorn, 2000). Heid et al. (2008) extract German juridical terminology and use FSPAR (Schiehlen, 2003) to extract verb-object pairs. Weller and Heid (2010) use the same parser to extract German multiword expressions and their morphosyntactic features. ...
September 2008
... Today, corpora and corpus-based tools are considered almost a conventional approach to building lexicographic materials (Sinclair 1992; Abdelzaher 2022). Therefore, it comes as a surprise that it has generally not been adequately acknowledged and precisely defined by technical dictionaries and glossaries, especially as the vocabulary volume in technical and scientific texts is not as large as in GE texts, thus having higher frequency (density) of core vocabulary (Chung and Nation 2004; Kovalev et al. 2019; Kruse and Heid 2021). This may be the case because technical (often bilingual) specialized glossaries (e.g. of some medical, business or nautical terms) are frequently developed ad hoc and are often not compiled by language professionals, thus not receiving significant attention from lexicographers (cf. ...
Reference:
Frequency or Keyness?
November 2020
... Both corpora were compiled from open-access specialized journals and conference proceedings, and are similar in nature. In the case of the lexicography corpus, we used one compiled by Lindemann, Kliche and Heid (2018). The lexicography corpus, which can be considered a subfield (though not a subset) of the other, includes proceedings of Euralex and other conferences, as well as papers from open-source lexicography and lexicology journals. ...
July 2018
... (Rundell 2012: 72) Good electronic dictionaries are characterised by the utilisation of electronic features enabled by computer technology and utilisation of virtually unlimited space on the internet. The interested reader is referred to De Schryver (2003) and Prinsloo (2019a) for a more detailed discussion of such features and to Bothma, Prinsloo and Heid (2018), Prinsloo (2019a), Prinsloo and Bothma (2020) and Prinsloo and Taljard (2019) for detailed discussions on user support tools in electronic dictionaries. Bukantswe has more than 10,000 Sesotho entries with their English equivalents available from http://bukantswe.sesotho.org/. ...
August 2018
Lexicographica - International Annual for Lexicography / Internationales Jahrbuch für Lexikographie
... G-MdS: The digital format offers far more options to 'solve' problematic lexicographical issues, especially for languages with complex morphology. English is trivial lexicographically speaking (it has typically two forms for nouns, four forms for verbs, and very little morphology and thus lemmatisation issues elsewhere); this is very different for many other language families, including for instance the Bantu languages, which are agglutinative, and where digital dictionaries may truly simplify look-up for both decoding as well as encoding purposes (Prinsloo, 2005; Prinsloo et al., 2012; Prinsloo et al., 2017). Moreover, for polysynthetic languages, such as several of the Amerindian languages, the only truly successful way to lemmatise lexis in a user-friendly way is in a digital product (Frawley et al., 2002). ...
November 2017
Lexikos
... This research uses qualitative methods and builds its model on the main theories (Mani, 2022; Chen & Luo, 2019; Schoormann et al., 2017). Research with a strong theoretical literature review that consistently summarizes previous research results can summarize theoretical propositions and build a model. ...
June 2017
Lecture Notes in Business Information Processing
... Formats: plain text, CoNLL [36], TCF [25], UIMA XMI [21] ...
January 2010