Flammie A Pirinen’s scientific contributions


Publications (6)


Figure 1: Screenshot of Lule Sámi keyboard for Android, as defined in the listing above.
GiellaLT: an infrastructure for rule-based language technology tool development
  • Article

January 2023


Flammie A Pirinen


Currently, machine learning is presented as the ultimate solution for language technology regardless of use case and application. However, it requires as a starting point a massive amount of curated linguistic data in electronic form that is expected to be of high quality and representative of the kind of language usage that the tools will follow. For minority and indigenous languages, this can be an insurmountable task, as digital materials of the necessary sizes do not exist and cannot easily be produced. In this article, we present an approach we have successfully used for years to support indigenous languages in surviving and growing in digital contexts, and describe the potential of our approach for African contexts. Our technological solution is a free and open-source infrastructure that enables language experts and users to cooperate on creating linguistic resources such as dictionaries and grammatical descriptions. In addition, we provide language-independent frameworks for building these resources into the applications that the language community needs.


Figure 1: Modular structure of GramDivvun
Mii *eai leat gal vuollánan – Vi *ha neimen ikke gitt opp: En hybrid grammatikkontroll for å rette kongruensfeil

August 2022


Nordlyd


Flammie Pirinen


[...]


Thomas Omma

Machine learning is the dominant paradigm in natural language processing nowadays. It requires vast amounts of manually annotated or synthetically generated text data. In the GiellaLT infrastructure, on the other hand, we have worked with rule-based methods, where the linguists have full control over the development of the tools. In this article we debunk the myth that machine learning is cheaper than a rule-based approach by showing how much work lies behind data generation, whether via corpus annotation or by creating tools that automatically mark up the corpus. Earlier we have shown that the correction of grammatical errors, in particular compound errors, benefits from hybrid methods. Agreement errors, on the other hand, depend to a higher degree on the larger grammatical context. Our experiments show that machine learning methods for this error type, even when supplemented by rule-based methods generating massive data, cannot compete with the state-of-the-art rule-based approach.


Error types, percentage of all errors.
You can’t suggest that?!: Comparisons and improvements of speller error models

August 2022


2 Citations

Nordlyd

In this article, we study the correction of spelling errors, specifically how spelling errors are made and how we can model them computationally in order to fix them. The article describes two different approaches to generating spelling correction suggestions for three Uralic languages: Estonian, North Sámi and South Sámi. The first approach to modelling spelling errors is rule-based: experts write rules that describe the kinds of errors that are made, and these are compiled into a finite-state automaton that models the errors. The second is data-based: we show a machine learning algorithm a corpus of errors that humans have made, and it creates a neural network that can model the errors. Both approaches require collecting error corpora and understanding their contents; therefore we also describe in detail the actual errors we have seen. We find that while both approaches create error correction systems, with current resources the expert-built systems are still more reliable.



The results indicate that both of the models receiving a chunk of two words at a time reached the highest accuracy, and the model without the POS tags also reached the highest precision.
Rules Ruling Neural Networks - Neural vs. Rule-Based Grammar Checking for a Low Resource Language

August 2021


3 Citations

We investigate both rule-based and machine learning methods for the task of compound error correction and evaluate their efficiency for North Sámi, a low-resource language. The lack of error-free data needed for a neural approach is a challenge to the development of these tools, one that is not shared by bigger languages. To compensate for that, we used a rule-based grammar checker to remove erroneous sentences and inserted compound errors by splitting correct compounds. We describe how we set up the error detection rules and how we train a bi-RNN-based neural network. The precision of the rule-based model tested on a corpus with real errors (81.0%) is slightly better than that of the neural model (79.4%). The rule-based model is also more flexible with regard to fixing specific errors requested by the user community. However, the neural model has a better recall (98%). The results suggest that an approach that combines the advantages of both models would be desirable in the future. Our tools and data sets are open-source and freely available on GitHub and Zenodo.


Citations (2)


... We have achieved good results with machine learning for compound errors (særskrivingsfeil), i.e. local grammatical errors (Wiechetek et al. 2021). We therefore want to investigate the usefulness and limitations of the method for other error types, and the possibility of combining machine-learning-based and rule-based methods to build a better grammar checker. ...

Reference:

Mii *eai leat gal vuollánan – Vi *ha neimen ikke gitt opp: En hybrid grammatikkontroll for å rette kongruensfeil
Rules Ruling Neural Networks – Neural vs. Rule-Based Grammar Checking for a Low Resource Language

... Different approaches to AI can complement each other. For example, Wiechetek et al. (2021) investigated both rule-based and machine learning methods for grammar checking on a Sami language, which has a small number of native speakers. ...

Rules Ruling Neural Networks - Neural vs. Rule-Based Grammar Checking for a Low Resource Language