Figure 3 - uploaded by Üveges István
Measured F1-Scores with {"lr": 0.3, "wordNgrams": 1, "minCount": 1, "epoch": 5} parameter set
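
The parameter set in the caption reads as a set of fastText training hyperparameters. The sketch below shows one way such a set could be parsed and passed on, assuming the Python `fasttext` package and its `train_supervised` function; the source does not state which tooling was used. The helper names `load_params` and `train_with_caption_params` and both file paths are hypothetical.

```python
import json
from typing import Optional

# Parameter set exactly as printed in the figure caption.
CAPTION_PARAMS = '{"lr": 0.3, "wordNgrams": 1, "minCount": 1, "epoch": 5}'


def load_params(raw: str) -> dict:
    """Parse the JSON parameter set and reject keys the caption does not use."""
    params = json.loads(raw)
    expected = {"lr", "wordNgrams", "minCount", "epoch"}
    unknown = set(params) - expected
    if unknown:
        raise ValueError(f"unexpected parameters: {sorted(unknown)}")
    return params


def train_with_caption_params(train_path: str, vectors_path: Optional[str] = None):
    """Hypothetical helper: train a supervised fastText classifier.

    train_path is assumed to be a text file in fastText's __label__ format.
    """
    import fasttext  # imported lazily so the parsing code runs without it

    kwargs = load_params(CAPTION_PARAMS)
    if vectors_path is not None:
        # Assumption: a .vec file whose dimension matches fastText's `dim` setting.
        kwargs["pretrainedVectors"] = vectors_path
    return fasttext.train_supervised(input=train_path, **kwargs)
```

Only `load_params` runs without the `fasttext` package installed; the training helper is shown to indicate where each caption parameter would go.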


Source publication
Article
This article briefly presents a pilot machine-learning experiment on classifying official texts addressed to lay readers, using a support vector machine as a baseline alongside fastText models. For this purpose, a hand-crafted corpus was used, created by the experts of the National Tax and Customs Administration of Hungary under t...

Context in source publication

Context 1
... kind of behavior was not affected significantly by changing any of the parameters mentioned above. Figure 3 illustrates a typical result of testing the model on both the training and test sets using pretrained word vectors. The following main conclusions were drawn from testing different combinations of the possible parameters: ...
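
Comparing F1-scores on the training and test sets, as the passage describes, can be sketched in plain Python. This is an illustration only: the labels `A` and `B` and all predictions are toy data invented for the example, not results from the publication, and per-class F1 is an assumption, since the source does not say which F1 variant was measured.

```python
from collections import Counter


def f1_per_class(gold, predicted):
    """Return a {label: F1} mapping for two equally long label sequences."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # p was predicted although it is wrong
            fn[g] += 1  # g was missed
    scores = {}
    for label in sorted(set(gold) | set(predicted)):
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


# Toy data (hypothetical): a model that fits the training set perfectly
# but makes one mistake on the test set.
train_gold = ["A", "B", "A", "B"]
train_pred = ["A", "B", "A", "B"]
test_gold = ["A", "B", "A", "B"]
test_pred = ["A", "A", "A", "B"]

train_f1 = f1_per_class(train_gold, train_pred)
test_f1 = f1_per_class(test_gold, test_pred)
for label in train_f1:
    print(f"{label}: train F1 = {train_f1[label]:.3f}, test F1 = {test_f1[label]:.3f}")
```

A large gap between the two columns is the usual sign of overfitting, which is why both sets are scored with the same function.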


Citations

Article
Increasingly, computer technologies offer linguistics advanced tools for processing, storing and selecting language data, which has triggered the rapid development of a topical branch of linguistic studies, corpus linguistics. Through the use of large-scale empirical data and advanced computer technologies to reach objective insights into language function, linguistic corpora have quickly become invaluable resources. Data obtained through corpus analysis facilitate qualitatively new conclusions about language and highlight research directions that previously received little attention. Despite substantial linguistic work on Dmytro Dontsov’s writings, a number of questions remain unstudied. Among these are the compilation of a language corpus and a concordance of this prominent thinker, public figure and publicist, using modern methods to count and record lexemes in order to identify their attributive collocations. Programming tools that facilitate the processing of large volumes of language material have proved to be a productive way to perform conceptual analysis and to serve as an additional research tool. The purpose of the study is to determine the distinctive features of Dmytro Dontsov’s maxims from the standpoint of corpus linguistics and lexicography, on the basis of the corpus space that forms the linguistic vision of the world and is a source for creating lexicographical works (concordances, dictionaries of a work's or a writer's language, etc.). We emphasize that the creation of a concordance of Dmytro Dontsov’s writings has become a subject of corpus linguistics study for the first time. The subject of our study is the writings of Dmytro Dontsov and their lexicographical parametrization, which provides every word with its description (phonetic, word-formational, grammatical) along with quantitative indices; that is, it is the result of drawing on many linguistic disciplines, unified by the dictionary. The methodology combines general theoretical methods (analysis, generalization, explanation) with applied methods of linguistics. The analysis is grounded in the text corpus of “The Spirit of Our Antiquity”, compiled using the Sketch Engine program. The analysis finds that an electronic corpus makes it possible to accelerate language study and significantly increase its effectiveness, reliability and verifiability. The article reveals the heuristic potential and practical effectiveness of the corpus and of the application of concordance technologies in conceptual studies. It was found that constructing the full concordance of Dmytro Dontsov’s writings will make it possible to show his picture of the world on the basis of studying the author’s lexical wealth and to reproduce his understanding of the political situation. The created concordance is a stage in forming lexicographical works about Dontsov, and it provides an understanding of the stages, methods, principles and peculiarities of compiling a dictionary of the language of Dmytro Dontsov’s writings. The multifaceted study of D. Dontsov’s writings is believed to be important. The concordance serves as new material, important for politicians, journalists, teachers and students, for it is an entrance into a system of new words filled with new ideas, and an opportunity to perceive the world through the nation-state aspirations of a great thinker. Realization of the project will help to create a political-intellectual product of great importance and offer bright prospects for future linguistic studies that draw on linguistic material.
Article
This article presents the development of an artificial intelligence maturity model (AIMM), specifically tailored for public sector organizations to assess their readiness for AI adoption. Using design science methodology, the research synthesizes insights from academic literature and expert consultations to propose a comprehensive AIMM. Through iterative development and expert feedback, the study refines a model that categorizes AI maturity across eight dimensions. The model’s validity is assessed through expert evaluations and questionnaires, confirming its relevance and utility in guiding public organizations toward effective AI adoption. This research contributes to the theoretical and practical understanding of AI implementation in the public sector, addressing unique challenges such as procurement models, legal compliance, and organizational capabilities.
Article
One crucial aspect of access to justice and access to legal information is the comprehensibility of legal text. The complexity and specialized terminology of legal language often prevent citizens from understanding legal texts and representing themselves effectively in legal proceedings. New developments in machine learning, such as solutions based on large language models, could represent a significant advancement in access to legal information, as they can transform complex legal texts into more straightforward, more understandable forms for laypeople. This paper attempts to exploit the capabilities of OpenAI’s GPT-4 model to produce automatic plain-language transcriptions of legal texts. The experiment concerns four specific linguistic features, and the results are analyzed manually from both a legal and a linguistic point of view.
Article
The article discusses the development of autonomous robotic transport systems for use in hospitals and medical facilities and presents a legal analysis with a particular focus on the development of related case law and legislation in recent years. The new solutions are part of evolutionary trends and hold the potential to determine the future of medical facilities. The article focuses on the search for innovative solutions intended to change the quality of operation of medical premises. An internal autonomous medical transport vehicle (IAMTV) is designed for use within hospital premises and is not intended for operation on public roads; the scope of this analysis therefore forgoes traffic-related aspects. Furthermore, the purpose of the vehicle is the transportation of items within its locked storage space. While the items transported may vary during use, this legal analysis presumes and focuses on the transport of medicines, medical samples, and even documents. Throughout the analysis, the immediate operational environment of IAMTVs is also evaluated. An assumption is made that IAMTVs are not able to operate entirely independently, in a vacuum, but rather will form a part of hospitals' internet of things network.
Conference Paper
Sentence segmentation is a common building block of natural language processing tasks, but it is a traditionally problematic area for legal texts. In this paper we present our rule-based tool "tuned" for segmenting court decisions, comparing its accuracy and runtime with other sentence segmenters applicable to Hungarian on the Szeged Treebank and UD corpora, which have been used so far to evaluate sentence segmentation, as well as on a corpus containing only Hungarian-language court decisions. On the Szeged Treebank corpus as a whole, our segmenter achieved results similar to those of the Stanza segmenter, while on the legal subcorpus it was the best model among those that had not previously seen this data. On the court decisions, our approach proved to be the best. We also examined how knowledge of the text layout and the domain of the texts affect the segmenters.
Article
Plain-language drafting by state bodies has long been known to contribute to legal certainty and to access to law, and it supports citizens in fully exercising their capacity to assert their democratic interests. Although the legal and linguistic literature has long dealt with the question of making legal texts more comprehensible, the problem has so far lacked (semi-)automatic solutions capable of substantively supporting expert work and thereby reducing the time and labor required by the associated intralingual translation. Accordingly, the study takes stock of the hand-written rules that may help reduce the unnecessary linguistic complexity of texts in the legal domain, and also reports on how and to what extent currently available machine learning solutions can be applied to promote plain-language communication.