Panini: A Transformer Based Grammatical Error
Correction Method for Bangla
Nahid Hossain1, Mehedi Hasan Bijoy2, Salekul Islam1,
Swakkhar Shatabda1*
1*Computer Science and Engineering, United International University,
Dhaka, 1212, Bangladesh.
2Computer Science and Engineering, Bangladesh University of Business
and Technology, Dhaka, 1216, Bangladesh.
These authors contributed equally to this work.
The purpose of the Bangla grammatical error correction task is to sponta-
neously identify and correct syntactic, morphological, semantic, and punctuation
mistakes in written Bangla text using computational models, ultimately enhanc-
ing language precision and eloquence. The significance of the task encompasses
bolstering linguistic acumen, fostering efficacious communication, and ensuring
utmost lucidity and meticulousness in written expression, thereby mitigating
the potential for obfuscation or dissemination of fallacious connotations. Prior
endeavors have centered around surmounting the constraints inherent in rule-
based and statistical methods through the exploration of machine learning and
deep learning methods, aiming to enhance accuracy by apprehending intricate
linguistic patterns, comprehending contextual cues, and discerning semantic
nuances. In this study, we address the absence of a baseline for the task by
developing a large-scale parallel corpus comprising 7.7M source-target pairs and
exploring the untapped potential of transformers. Alongside the corpus, we intro-
duce a Vaswani-style efficient monolingual transformer-based method, the Bangla
grammatical error corrector, named Panini after one of the earliest linguists and
grammarians, whose rules Bangla grammar follows. By leveraging transfer learning,
Panini has become the state-of-the-art method for the task, surpassing the perfor-
mance of both BanglaT5 and T5-Small by 18.81% and 23.8% in accuracy scores,
and 11.5 and 15.6 in SacreBLEU scores, respectively. The empirical findings of
the method substantiate its superiority over other approaches when it comes to
capturing intricate linguistic rules and patterns. Moreover, the efficacy of our
proposed method has been compared with the Bangla paraphrase task, showcas-
ing its superior capability by outperforming the previous state-of-the-art method
for the task as well. The BanglaGEC corpus and Panini, along with the baselines
of BGEC and the Bangla paraphrase task, have been made publicly accessible.
Keywords: Transformer, Panini, Grammar Error Correction, Bangla Paraphrase,
Transfer Learning
1 Introduction
The Grammatical Error Correction (GEC) task aims to autonomously identify and
rectify errors in written texts, encompassing grammar, syntax, punctuation, and
language rules, with the purpose of optimizing overall precision, coherence, and read-
ability to facilitate effective written communication. Within the specific context of
the Bangla Grammatical Error Correction (BGEC), the task involves automatically
detecting and correcting grammatical errors in Bangla text. The significance of BGEC
lies in its capacity to enhance communication, foster language acquisition, elevate
writing excellence, safeguard the Bangla language, and escalate efficacy in handling
Bangla text, benefiting individuals, educators, and organizations alike. Despite sig-
nificant advancements in GEC for high-resource languages [1-4], the development of
accurate and effective GEC systems for low-resource languages like Bangla remains
a challenge. The intricacy of the Bangla language, characterized by its complex mor-
phology, diverse verb forms, and tangled sentence structures, renders it one of the
most formidable endeavors in Bangla Natural Language Processing. Moreover, the
paucity of a publicly available large-scale corpus for Bangla grammatical error cor-
rection poses another inevitable constraint in the pursuit of developing exceedingly
accurate models.
In the past decade, considerable research has been undertaken to address the
BGEC task through rule-based [5-18] and statistical [19-30] approaches. The lim-
itations of rule-based methods in the BGEC task, including the lack of flexibility,
challenges in rule creation, limited coverage, inability to handle ambiguity, diffi-
culty in handling language variation, and limited error detection capabilities, have
prompted researchers to investigate data-driven statistical approaches as a means to
overcome these limitations and enhance accuracy. Nevertheless, statistical methods
also exhibit inherent limitations when it comes to contextual understanding, disen-
tangling ambiguity, excessive reliance on handcrafted features, adaptability to novel
languages, handling out-of-vocabulary words, grappling with data sparsity challenges,
and limited error detection capabilities. These constraints have spurred the explo-
ration of sophisticated methods such as machine learning [31,32] and deep learning
[33-42], aimed at ameliorating accuracy and fortifying the resilience of error correction
systems. In recent years, the application of deep learning techniques has showcased
promising accomplishments in the BGEC task owing to their enhanced capacity in
capturing intricate linguistic patterns, deciphering context dependencies, and discern-
ing semantic nuances. Lately, transformer-based methods have exhibited remarkable
prowess in several Bangla natural language processing tasks, including machine trans-
lation [43], sentiment analysis [44], and spelling error correction [45], to name a
few. A Vaswani et al. [46] style transformer architecture is utilized in our proposed
method which is bifurcated into two integral components: the encoder and the decoder.
Within the encoder module, there are pivotal elements including positional encoding,
multi-head self-attention mechanism, layer normalization, residual connections, and
feedforward neural networks. Likewise, the decoder module encompasses positional
encoding, masked multi-head self-attention, layer normalization, residual connections,
and feedforward neural networks. The mainstream structure of transformers typically
involves stacking multiple encoder and decoder blocks. To the best of our knowledge,
no transformer-based baseline for the BGEC task has been proposed yet. Henceforth,
we endeavor to harness the formidable capabilities of transformers and embark upon
an exploration of their untapped potential in the realm of BGEC.
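The core operation shared by the encoder and decoder modules described above is scaled dot-product attention. The following minimal, dependency-free sketch (a toy illustration, not the paper's actual implementation) shows how attention weights are computed from queries, keys, and values:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: lists of vectors (lists of floats) of equal dimension d_k.
    Returns, for each query, softmax(q . K^T / sqrt(d_k)) applied to V."""
    d_k = len(K[0])
    out = []
    for q in Q:
        # Similarity of the query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```

In a full transformer, this computation is repeated across multiple heads and stacked blocks, with layer normalization, residual connections, and feedforward sublayers around it, as outlined in the architecture description above.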
Several constraints associated with the BGEC task have been identified, partic-
ularly concerning the dearth of a large-scale parallel corpus and the utilization of
transformer-based monolingual methods. The objective of this study is to surmount
the recognized limitations and provide a solid foundation for further advancements
in the field by establishing a comprehensive baseline, thereby paving the way for
future research endeavors. To do so, an extensive parallel corpus has been developed
by carefully crafting a diverse set of Bangla grammar rules. Moreover, a monolin-
gual transformer-based model named Panini has been proposed for the BGEC task.
Additionally, a scrutiny was conducted to ascertain whether the performance of the
monolingual transformer model is improved by the transfer learning technique. To this
end, we initially train the model on a Bangla paraphrase dataset [47] and then trans-
fer the acquired knowledge while addressing the BGEC task. In short, our proposed
Panini accepts a grammatically erroneous sentence as input, which is subsequently
tokenized using a pre-trained tokenizer [47]. The tokens are then fed into the encoder
component of the model, where they undergo transformations resulting in a sequence
of continuous representations. Following this, the decoder component integrates the
output response from the encoder along with the output from its previous time step
to generate the grammatically correct sentence.
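The encoder-decoder interaction just described follows a standard autoregressive loop: the decoder consumes the encoder's memory plus its own previous outputs until an end token is produced. The sketch below abstracts that loop with stand-in `encode` and `step` callables (hypothetical names, not the paper's API):

```python
def greedy_decode(encode, step, src_tokens, bos, eos, max_len=20):
    """Generic autoregressive decoding loop: `encode` maps the (possibly
    erroneous) source to a memory; `step` consumes that memory plus the
    tokens generated so far and returns the next token."""
    memory = encode(src_tokens)
    out = [bos]
    for _ in range(max_len):
        nxt = step(memory, out)
        out.append(nxt)
        if nxt == eos:
            break
    # Strip the begin/end markers before returning the corrected sequence.
    return out[1:-1] if out[-1] == eos else out[1:]
```

With a trained model, `step` would run the decoder stack and pick the highest-probability token; here any callable with that shape works, which is why the loop is shown separately from the model.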
The contributions of this article are summarized below:
We propose a large-scale parallel corpus for the BGEC task, which comprises
approximately 7.74M source-target instances. It has been created by carefully craft-
ing a diverse set of intricate grammar rules, thus making Bangla a resourceful
language for the task.
A state-of-the-art monolingual transformer-based model named Panini has been
introduced, exemplifying advancements in the BGEC task compared with other
transformer-based baselines including BanglaT5 and T5-Small, thereby potentially
heralding more sophisticated automated grammatical error correction.
The impact of the training corpus size on the potency of the proposed method
Panini in rectifying grammatical errors in Bangla has been investigated.
The ecacy of transfer learning from the Bangla paraphrase task in the domain of
BGEC has been meticulously scrutinized.
The empirical outcomes of the proposed Panini have been juxtaposed with the
baselines of the Bangla Paraphrase task, showcasing its supremacy in the task by
surpassing the previous state-of-the-art performance with approximately 3.5 times
fewer parameters, therefore attesting to its superior capabilities across different
Bangla Natural Language Processing (BNLP) tasks.
The remaining sections of this paper are structured as follows: Section 2 provides
a comprehensive review of contemporary research in the domain of grammatical error
correction, shedding light on the prevailing hurdles encountered specically in the
context of Bangla. In Section 3, we explicate the meticulous process employed for
the creation of a large-scale parallel corpus, outlining the step-by-step procedure in a
systematic manner. Subsequently, Section 4 elucidates the methodology and architec-
tural design of our proposed monolingual transformer-based method. Next, Section
5 showcases the experimental setup and results along with the evaluation metrics
employed to assess the performance of the model. Finally, Section 6 summarizes the
results, implications, and potential future directions for this research.
2 Related Work
A considerable amount of research has been carried out on correcting grammatical
errors in the Bangla language. While the development of the Bangla GEC task has
indeed gained steep attention since the late 2000s, it is evident that notable standards
have yet to be achieved. The existing methods can broadly be classied into four
primary groups including rule-based [5-18], statistical [19-30], machine-learning-based
[31,32], and deep-learning-based [33-42]. We observed that, after 2018, deep-learning-
based approaches became prominent while rule-based ones fell out of favor.
2.1 Rule-based Methods
The most commonly used rule-based schemes for Bangla grammatical error correc-
tion include context-free grammar (CFG) [5,8,16], context-sensitive grammar (CSG)
[9,14], head-driven phrase structure grammar (HPSG) [7], string matching algorithm
[10], and Viterbi algorithm [17]. Among rule-based approaches, [5], [6], [8], [12], and
utilize the formalism of CFG by defining a set of valid grammar rules and deter-
mining whether a given sentence conforms to these rules or not. In particular, Purohit
et al. [8] identify several features of Bangla words, and further develop a set of seman-
tic features for dierent word categories with the help of CFG to tackle the Bangla
GEC task. A CFG-based predictive parser has been proposed by [5] and [6], which
is implemented following a top-down fashion to avoid the left recursion issue of the
CFG by left factoring, for Bangla grammar error correction. In 2016, Rabbi et al. [12]
introduced a parsing method, to resolve the intricate and ambiguous Bangla gram-
mar, by employing a shift-reduce parser through constructing a parse table based
on the LR strategy. Recently, [16] utilized both CFG and CYK parsing algorithms
for Bangla GEC and found that although the CFG-based parser performed better,
the CYK-based parser worked faster. Another parser has been proposed by [9] which
incorporates both CFG and CSG rules to parse Bangla complex and compound sen-
tences semantically. However, Alamgir and Arefin [14] propose a CSG-based parser
that prioritizes the intonation or mood of a sentence over its structure. Besides, [7],
[10], and [17] bring forward a Bangla grammar checker using HPSG, string matching
algorithm, and Viterbi algorithm. Unlike CFG-based parsers, the HPSG-based one [7]
can detect syntactic and semantic errors in a sentence by utilizing the POS tags of
words. Karim et al. [10] also utilize POS tags to determine sentence types, followed
by validation of their structure through a string-matching algorithm. Furthermore,
an augmented phrase structured grammar (APSG) rule-based semantic analyzer was
proposed for scrutinizing the legitimacy of simple, complex, and compound Bangla
sentences in 2018 [15]. A more recent study by Faisal et al. [18] presented a rule-based
method for identifying grammatical errors in Bengali sentences employing only POS
tags. First, they classied words into one of seven POS tags and then checked whether
the resulting tag combination followed any of their manually written rules.
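The POS-pattern strategy used by these rule-based checkers can be sketched as follows. The tag names, lexicon, and valid patterns below are hypothetical stand-ins for illustration, not the rules of Faisal et al. [18]:

```python
# A sentence is accepted only if its tag sequence matches a manually
# written rule; everything here is a toy example of that idea.
VALID_TAG_PATTERNS = {("PRO", "NOUN", "VERB"), ("NOUN", "VERB")}

def tag_sequence(words, lexicon):
    # Tag each word via a lookup table; unknown words default to NOUN.
    return tuple(lexicon.get(w, "NOUN") for w in words)

def is_grammatical(sentence, lexicon):
    return tag_sequence(sentence.split(), lexicon) in VALID_TAG_PATTERNS
```

The approach is transparent and fast, but, as noted below in Section 2.4, its coverage is bounded by however many tag patterns the rule authors enumerate.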
2.2 Statistical Methods
In the case of statistical methods, the n-gram language model [20,24,29,30] is found
to be the most widely used where a few approaches utilized the frequency of words
[21,22] and term frequency-inverse document frequency (TF-IDF) [25] to fix Bangla
grammatical mistakes. For instance, Kundu et al. [20] come up with a natural lan-
guage generation (NLG) based approach for Bangla GEC, which rst transforms the
input sentence into word vectors. These vectors are then fed into a bi-gram language
model to determine whether the sentence is grammatically correct or not. Rana et
al. [27] and Hossain et al. [30] have also introduced methods that combine bigram
and trigram models to tackle Bangla homophone errors in real-world text and Bangla
GEC, respectively. A similar method has been presented by Mridha et al. [28], which
is the coalescence of bigram and trigram models, for addressing sentence-level miss-
ing word errors. Recently, a higher-order n-gram (n = 6) model has been proposed to
cluster Bangla words considering their contextual and semantic similarity [26]. How-
ever, to resolve the zero probability issue in the n-gram model, some approaches utilize
smoothing techniques such as Witten-Bell [23,24] and Kneser-Ney [29]. In 2020,
Rahman et al. [29] experimented with both Witten-Bell and Kneser-Ney smoothing
approaches to deal with missing words in the corpus. Their empirical outcomes mani-
fested that Kneser-Ney outperforms Witten-Bell. Moreover, [21] and [22] have brought
forward a Bangla grammar checker that counts the frequency of words. While a graph-
based edge-weighting method has been described in [22] to measure the semantic
similarity between two words, a confidence score filter has been delineated in [21] to
select an appropriate sample from the outcomes. Lately, Nipu and Pal [25] proposed
another Bangla grammar checker utilizing a vector space model (VSM) with TF-IDF.
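The n-gram approach that dominates this line of work can be illustrated with a tiny add-one-smoothed bigram model: a sentence whose word order matches the training corpus receives a higher log-probability than a scrambled one. The romanized toy corpus below is purely illustrative:

```python
import math
from collections import Counter

def train_bigram(sentences):
    # Count unigrams and bigrams over sentences padded with boundary markers.
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        toks = ["<s>"] + s.split() + ["</s>"]
        unigrams.update(toks)
        bigrams.update(zip(toks, toks[1:]))
    return unigrams, bigrams

def sentence_logprob(sentence, unigrams, bigrams):
    toks = ["<s>"] + sentence.split() + ["</s>"]
    vocab = len(unigrams)
    lp = 0.0
    for a, b in zip(toks, toks[1:]):
        # Laplace (add-one) smoothing avoids the zero-probability issue
        # that Witten-Bell and Kneser-Ney address more carefully.
        lp += math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab))
    return lp
```

A grammaticality check then reduces to comparing scores, which is exactly why these methods struggle with errors that preserve locally plausible word pairs.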
2.3 Machine Learning and Deep Learning based Methods
Due to recent advancements in Bangla NLP, machine learning [31,32] and deep learn-
ing [33,37,39,41,42] based approaches have become prominent in the Bangla GEC
task because of their impressive performance. Especially, deep-learning-based meth-
ods have received significant attention for their versatility in handling different types
of grammar errors. In 2013, Kundu et al. [31] introduced a method that uses the
K-Nearest Neighbors (k-NN) algorithm to correct Bangla grammatical errors. In addi-
tion, they introduced an active-learning-based novel complexity estimation matrix
(CMM) for quantifying the grammatical intricacy of a sentence. A more recent study
by Mridha et al. [32] presented a Naive Bayes classifier to address the same task to a
broader extent, as it incorporates both typographical and grammatical errors. How-
ever, a word embedding-based tactic has been described in [34], [38], and [41] to grasp
the semantic meaning, followed by cosine similarity to measure the semantic similar-
ity of words. In [34], the authors use a pre-trained word2vec model with an embedding
size of 300, which was trained on Bangla Wikipedia texts, to find semantic textual
similarity. Pandit et al. [38] investigated a path-based and a distributional model for
calculating semantic similarity in Bangla. Their experimental results favored the dis-
tributional model, which employed word2vec, over the path-based one. Furthermore,
Iqbal et al. [41] inspected word2vec, GloVe, and FastText to calculate semantic simi-
larity and found that FastText with a continuous bag-of-words outperformed word2vec
and GloVe. Recent studies have utilized recurrent neural networks (RNN) to keep
pace with advancements in NLP [33,37,39,40]. Rakib et al. [36], for instance, used
a Gated Recurrent Unit (GRU), a type of RNN cell, on an n-gram dataset to antici-
pate the next most appropriate word in Bangla sentences in 2019. In the same year,
Islam et al. [37] employed long short-term memory (LSTM), another type of RNN cell,
in the sequence-to-sequence model to generate coherent and grammatically correct
Bangla sentences. Likewise, in 2021, the LSTM RNN cell was utilized by Chowdhury
et al. [40] to determine the suitability, adjacency, and anticipation of simple Bengali
sentences. Recently, Anbukkarasi and Varadhaganapathy [42] proposed a GRU-based
grammar checker for Tamil which is a low-resource language. Furthermore, several
studies have presented strategies that leverage the advantages of bidirectional LSTM
RNNs [33,35,39]. Islam et al. [33] introduced a seq2seq model for both correcting and
auto-completing Bangla sentences. Even though they employed a bi-LSTM RNN in
the encoder, a conventional LSTM RNN with attention is used in the decoder part of
the seq2seq model. Abujar et al. [35] and Noshin et al. [39] proposed bi-LSTM-RNN-
based methods for predicting the next word in the sequence and correcting real-word
errors in Bangla, respectively. Lately, the T5 model has been utilized by [48] for
Bangla grammatical error correction, where it underwent fine-tuning on a tiny corpus
consisting of only 9385 sentences, which is not enough for such a model. Neverthe-
less, we were unable to reproduce this endeavor, rendering its findings inconclusive.
2.4 Drawbacks of Different Methods
In brief, rule-based approaches are limited to a few rules when detecting Bangla
grammatical errors, and their performance largely depends on POS tags. While these
approaches are capable of efficiently correcting syntactic errors, they have limitations
in their ability to address semantic errors. They are neither language-independent nor
easy to maintain and update. In the case of statistical methods, reliable performance
depends heavily on the quality of the corpus. Therefore, it is essential to have a bal-
anced corpus for these methods to be effective. However, statistical methods struggle
with correcting complex errors and have limitations in handling domain-specific
language. Similarly, the machine-learning-based approaches vastly rely on annotated
training data, and they also have several limitations such as a lack of interpretability
and performance degradation in noisy environments. On the flip side, deep learning-
based methods outperform other approaches given sufficient training data.
However, the biggest barrier to the development of Bangla GEC is the scarcity of
publicly available large-scale parallel corpora for the task. To address this challenge,
we rst develop a large-scale parallel corpus for the Bangla GEC task and make
it publicly available. To the best of our knowledge, we are the rst to propose a
transformer-based method for Bangla grammatical error correction.
3 Corpus Creation
The scarcity of parallel corpora is the most significant barrier to the development
of effective natural language processing systems for grammatical error correction. In
recent years, the availability of parallel corpora for some languages like English has
signicantly improved [4], but the situation is not the same for languages like Bangla
due to limited resources. Therefore, we take the initiative to make Bangla a resource-
ful language for the GEC task by developing a large-scale parallel corpus. To do so,
we identify seven primary types of Bangla grammatical mistakes, including errors in
verb inection, number (bochon), word choice (homonym), sentence structure, punc-
tuation, the agreement between subject and verb, and sentence fragments because of
a missing subject or verb. Furthermore, sentence fragments can be further classied
into four categories: subject missing, verb (from the dictionary) missing, auxiliary
verb missing, and main verb missing. The grammatical errors we incorporated into
our corpus are described below.
Verb Inection. It refers to a set of letters that correlate one word to another,
particularly a noun or pronoun to its corresponding verb or adjective, in a sentence.
It varies depending on the changes in number (bochon). Using verb inection incor-
rectly can disrupt the relationship between a verb and its associated noun within a
sentence. For example: (correct)     ( erroneous)
     .
Number (Bochon). In Bengali grammar, the act of determining the quantity of
nouns and pronouns is known as number (bochon). There are two types of numbers
including singular and plural. This type of error occurs when the number of a noun
or pronoun does not match the number of the verb, adjective, or article used in
the sentence. There could be three variants of errors in number (bochon): (i) using
a singular verb with a plural subject, or vice versa, (ii) using a singular article or
adjective with a plural noun, or vice versa, and (iii) using a singular pronoun to
refer to a plural noun or vice versa. It is worth mentioning that these errors can
make a sentence dicult to understand or change its semantic meaning altogether.
Therefore, it’s important to pay attention to the correct use of singular and plural
forms in Bengali grammar to ensure clear and eective communication. For example:
(correct)      ( erroneous)   
   .
Word Choice (Homonym Error). A homonym error is a mistake where two or
more words sound the same but have different meanings. The wrong choice of words
in a sentence essentially leads to confusion and miscommunication, especially in
written language. Also, the mistake in choosing appropriate homonyms brings up
the compatibility issue of the sentence. For example: (correct) [...]
(erroneous) [...].
Sentence Structure. It occurs when the arrangement of words in a sentence is
incorrect, which consequently makes the sentence grammatically incorrect. It often
changes the intended meaning and makes the sentence difficult to understand. For
example: (correct) [...] (erroneous) [...].
Punctuation. Punctuation marks are symbols that are used in writing to clarify
the meaning and structure of a sentence. Some common punctuation marks in
Bangla include the full stop or daari (।), comma (,), semicolon (;), and colon (:).
Errors of this type arise when a punctuation mark is misused, omitted, or
misplaced in a sentence. For example: (correct) [...] (erroneous) [...].
Subject-Verb Agreement. This type of error occurs when there is a mismatch in
person and number between the subject and verb. In Bangla grammar, person refers
to the grammatical category that indicates the relationship between the speaker and
the subject. Bangla grammar recognizes three persons: first person, second person,
and third person. To ensure correct grammar in Bangla, the verb must agree with
the subject in both person and number. For instance, if the subject is singular, the
verb must be singular as well, and if the subject is plural, the verb should be plural.
Similarly, if the subject is in the first person, the verb must also be in the first
person, and so on. For example: (correct) [...] (erroneous) [...].
Sentence Fragments. In Bangla grammar, a sentence fragment refers to a collec-
tion of words that do not constitute a complete sentence or convey a complete idea.
These phrases usually lack a subject, a verb, or both, making them unable to func-
tion as complete sentences independently. Such a fragment can arise when a writer
neglects to provide a complete sentence or excludes necessary components. This
may cause ambiguity or misinterpretation in communication, leading to confusion
or misunderstanding.
Subject Missing. This pertains to the circumstance in which the subject is left
out or not included. For example: (correct) [...] (erroneous) [...].
Verb (from the Dictionary) Missing. This refers to a scenario in which a
verb is absent from a predetermined list. To address this type of mistake, we
compile a list of verbs beforehand from online Bangla dictionaries. For example:
(correct) [...] (erroneous) [...].
Auxiliary Verb Missing. This type of error occurs when the sentence lacks
an auxiliary verb. For example: (correct) [...] (erroneous) [...].
Main Verb Missing. This error arises when the sentence is missing a main
verb. For example: (correct) [...] (erroneous) [...].
3.1 Data Sourcing
We source the raw data from a publicly available corpus named BanglaParaphrase
[47], which comprises approximately 466k pairs of high-quality synthetic paraphrases
in Bangla. These paraphrases are carefully crafted to ensure both semantic coherence
and syntactic diversity, thus guaranteeing their superior quality.
3.2 Data Augmentation
We introduce the previously discussed ten types of Bangla grammatical errors into
the sourced data employing the noise injection technique. To do so, we consider each
sentence as a finite set of words denoted as S = {W1, W2, ..., WN-1, WN}, where N
is the length of the sentence such that N ∈ Z+. Furthermore, each word Wi ∈ S
is considered as another finite set of Bangla characters, represented as
Wi = {C1, C2, ..., CM-1, CM}, where M is the length of the word such that M ∈ Z+
as well. However, four different approaches have been initiated to propagate these ten
types of errors due to their complex structures. Our process ensures that each syn-
thetic erroneous sentence has only one mistake. Nevertheless, some sentences contain
multiple erroneous words, and we take all of them into account in separate sentences.
Therefore, this results in one correct sentence and multiple incorrect versions.
An analogous procedure has been implemented to embed verb inflection, num-
ber (bochon), and punctuation errors into a sentence, S. Firstly, a set of suffixes for
inflection and number (bochon) errors and a set of Bangla punctuations for punc-
tuation errors are collected, delineated as A = {a1, a2, a3, ..., aB}, where
aj ∈ A is the jth suffix or punctuation symbol. The items in set A are further
grouped into sub-lists based on the similarity of suffixes or punctuations, which is
denoted as Di = [d1, d2, d3, ..., dE] such that dj ∈ A, where Di is the ith sub-list. Next,
a dictionary is created incorporating these similar groups, described as
F = {G1 : D1, G2 : D2, ..., GN : DN}, where Gj is the jth group name and Dj is its
corresponding list of similar suffixes or punctuations. Then, we iterate over each word
of a sentence, Wi ∈ S, and determine whether it is found in the dictionary, F. If
Wi ∈ F, we replace Wi with another suffix or punctuation dj from its corresponding
group, Di, such that dj ∈ A.
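The grouped-dictionary substitution described above can be sketched as follows. The group names and romanized suffix stand-ins are hypothetical placeholders; the actual corpus uses real Bangla suffixes and punctuation symbols:

```python
import random

# Hypothetical, romanized stand-ins for the dictionary F of similar groups.
F = {
    "plural_suffix": ["ra", "gulo", "guli"],
    "punctuation": ["|", ",", ";"],
}

def inject_substitution_error(sentence, groups, rng=random):
    """Replace the first token found in any group with a different member
    of the same group, yielding one synthetic error per sentence."""
    words = sentence.split()
    for i, w in enumerate(words):
        for members in groups.values():
            if w in members:
                alternatives = [m for m in members if m != w]
                words[i] = rng.choice(alternatives)
                return " ".join(words)
    return sentence  # no injectable token found
```

Because the replacement is drawn from the same similarity group, the corrupted sentence stays close to natural text, which is what makes the resulting source-target pairs useful for training.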
The same list of Bangla homonyms used by [45] is utilized here. The
homonym words' list is defined as H = [(h1, p1), (h2, p2), ..., (hK, pK)], where hj is the
jth word of the list and pj is its respective homonym version. To propagate the error,
we iterate over each word, Wj, in the sentence, S, and if it is found in the homonym
words' list, H, we simply replace the word with its homonym version. Likewise, we
accumulate a list of verbs, denoted as V = [v1, v2, v3, ..., vK], from an online dictionary
through web-scraping [45] to introduce the missing verb (from the dictionary) error.
Again, we go through each word, Wj, in a sentence, S, and remove it upon its
appearance in the previously collected verbs' list, such that Wj ∈ V. On the other
hand, we introduce errors in sentence structure by randomly exchanging the positions
of two words within a sentence.
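The three list-driven corruptions just described (homonym substitution, dictionary-verb removal, and word-position swapping) can be sketched with toy, romanized stand-ins for the homonym list H and verb list V of [45]:

```python
import random

# Toy stand-ins; the paper sources these lists from [45].
H = {"din": "deen", "shob": "sob"}   # word -> homonym variant
V = {"khay", "jay", "kore"}          # dictionary verbs

def homonym_error(sentence):
    # Replace the first word that has a homonym counterpart.
    words = sentence.split()
    for i, w in enumerate(words):
        if w in H:
            words[i] = H[w]
            break
    return " ".join(words)

def missing_verb_error(sentence):
    # Drop the first word that appears in the verb list.
    words = sentence.split()
    for i, w in enumerate(words):
        if w in V:
            del words[i]
            break
    return " ".join(words)

def structure_error(sentence, rng=random):
    # Swap two distinct word positions chosen at random.
    words = sentence.split()
    i, j = rng.sample(range(len(words)), 2)
    words[i], words[j] = words[j], words[i]
    return " ".join(words)
```

Each function introduces exactly one mistake, matching the one-error-per-sentence constraint stated earlier in this section.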
In order to generate synthetic errors related to subject-verb agreement, missing
subjects, missing auxiliary verbs, and missing main verbs, we make use of part-of-
speech (POS) tags obtained from a Bangla language toolkit called bnlp. To begin
with, we generate the corresponding POS tag for each word, Wi, in the sentence, S,
which is denoted as St = {pt1, pt2, ..., ptN}. Then, we iterate through the tag set,
St, and perform the following three operations: if the POS tag indicates a pronoun,
auxiliary verb, or main verb, we remove the corresponding word to introduce a missing-subject error,
missing-auxiliary-verb error, or missing-main-verb error, respectively. However, a
slightly different approach is taken to introduce errors in subject-verb agreement. To
do so, the subject and verb in the sentence (S) are determined using the POS tag list
(St). Then, the subject is changed so that it mismatches the form of the verb. In
order to keep the resultant sentence semantically coherent even after introducing the
error, a dictionary similar to the one previously used when introducing errors in verb inflection,
number (bochon), and punctuation has been developed for subjects as well.
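The POS-driven deletions can be sketched as below. The tag names here are illustrative placeholders; bnlp's actual tag set may differ:

```python
def inject_missing_word_error(tagged_words, target_tag):
    """Drop the first word whose POS tag matches target_tag.

    tagged_words: list of (word, pos_tag) pairs produced by a POS tagger.
    Returns the word list with the matching word removed, or an unchanged
    copy if no word carries the target tag.
    """
    words = [w for w, _ in tagged_words]
    for i, (_, tag) in enumerate(tagged_words):
        if tag == target_tag:
            return words[:i] + words[i + 1:]
    return words
```

Calling it with a pronoun, auxiliary-verb, or main-verb tag yields the missing-subject, missing-auxiliary-verb, or missing-main-verb error, respectively.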
3.3 Corpus Statistics
The developed Bangla GEC corpus comprises ten distinct types of errors. Verb
inflection errors are the most frequent (25.51%) and word choice errors are the
least frequent (2.09%) type in the corpus. Errors in sentence
structure form the second-largest type, containing 1,795,641 pairs of instances,
which is slightly above a quarter of the corpus. Four of the ten error types, namely verb inflection,
number, sentence structure, and missing main verb, comprise
78.70% of the total errors in the corpus, while the remaining six comprise the remaining
21.30%. Even though word choice (homonym) errors, punctuation errors,
subject-verb agreement errors, missing subject errors, missing dictionary verb errors,
and missing auxiliary verb errors comprise only 2.09%, 7.40%,
3.22%, 2.79%, 2.44%, and 3.36% of the corpus, respectively, each type still contains
a substantial number of instances. For instance, the three least frequent error types,
word choice (homonym) errors, missing dictionary verb errors, and missing subject
errors, contain 147,737, 172,704, and 197,540 instances, respectively.
Error Type                            No. of Instances   Percentage
Verb Inflection                            1,804,721        25.51%
Number (Bochon)                              709,480        10.03%
Word Choice (Homonym Error)                  147,737         2.09%
Sentence Structure                         1,795,641        25.38%
Punctuation                                  523,255         7.40%
Subject-Verb Agreement                       227,534         3.22%
Subject Missing                              197,540         2.79%
Verb (from the Dictionary) Missing           172,704         2.44%
Auxiliary Verb Missing                       237,741         3.36%
Main Verb Missing                          1,258,072        17.78%
Total                                      7,074,425       100.00%

Table 1 Statistics of the Bangla GEC corpus.
The percentage of dierent error types is justied as none of them were introduced
manually. All the instances have been crafted automatically based on the underlying
corpus and predened suxes which are carefully extracted by experts in the language.
Moreover, error related to word choice is the least common in the corpus, which is
logical considering the fact that the Bangla language has a relatively small number of
homonyms. On the other hand, the most prominent error type in the corpus is related
to verb inection, which is not surprising given the fact that the Bangla language has
a wide range of verb inection suxes.
4 Methodology
The proposed method is two-fold: initially, a transformer-based seq2seq
(MarianMT [49]) model, µ(.), undergoes training on the Bangla paraphrase task, and
subsequently, the acquired knowledge is transferred to an identical model that is
tweaked for the Bangla GEC task. In either case, the initial step involves passing
an input sentence, [x1, x2, ..., xn], through a pre-trained Bangla tokenizer, τ(.), that
transforms the text into numerical data. After the input sentence has been tokenized,
it is fed into the model, µ(.), to make a prediction. Finally, the model's prediction
is assessed using relevant metrics specific to the task at hand. The entire Bangla
GEC method is depicted in Figure 1.

Fig. 1 (Top) Panini is initially trained on the Bangla paraphrase task. It begins by taking
a source sentence as input, which is tokenized using a pre-trained tokenizer, τ(.). Finally, the
model, µ(.), generates a prediction. The knowledge acquired during this process is saved for further
use. (Middle) We employ transfer learning by first pretraining the model on
the Bangla paraphrase task and then harnessing the saved weights to enhance both the learning
dynamics and the efficacy of the model for the Bangla grammatical error correction task. Here, the
term 'knowledge' refers to the insights garnered from the Bangla paraphrase task. (Bottom) Panini
is trained here to address the BGEC task, harnessing the knowledge accrued from the
Bangla paraphrase task via transfer learning. The process commences with an erroneous
input sentence, which is tokenized using the pre-trained tokenizer, τ(.). These
tokenized inputs are then fed to the model, µ(.), which generates the requisite correction.

In mathematical terms, the entire process can
be succinctly summarized as:
ŷ = µ(τ([x1, x2, ..., xn]))    (1)
4.1 Problem Formulation
The word-level Bangla grammatical error correction task strives to map an erro-
neous sequence denoted as X = [x1, x2, ..., xn] into the corresponding correct sequence
denoted as Y = [y1, y2, ..., ym], where Xi and Yj are the ith and jth words of the
erroneous and correct sentences, respectively, such that the lengths n ∈ Z+ and
m ∈ Z+ are not necessarily equal. The erroneous sentence, X, is
first fed into the pre-trained tokenizer, τ(.), which tokenizes X into
Xτ = [xτ1, xτ2, ..., xτn], where xτi is the numerical value of the ith token if and only if the
word is present in the vocabulary, and an unknown (<unk>) token otherwise. Next, the
Bangla grammatical error correction model, denoted as µ(.), processes the tokenized
sentence Xτ and produces a prediction referred to as ŷ. Lastly, the model's prediction
is assessed by means of several evaluation metrics by comparing ŷ with the corresponding
correct sentence, Y.
4.2 Panini
It is essentially a tweaked MarianMT [49], a Vaswani et al. [46]-style seq2seq
transformer model, adapted for the Bangla GEC task. MarianMT [49] was selected primarily
because it incorporates several low-resource machine translation techniques for gram-
matical error correction (GEC) tasks and has achieved state-of-the-art results in
neural GEC on several benchmarks, such as the CoNLL-2014 and JFLEG test sets.
The model consists of a stack of 12 encoder and decoder blocks (six of each), with each
block including self-attention, residual connections, and feedforward neural networks.
Below are the descriptions of the encoder and decoder blocks.
4.2.1 Encoder
It is responsible for processing the input sequence of tokens, Xτ = [xτ1, xτ2, ..., xτn],
and producing a sequence of hidden states that capture the meaning and context of
each token in the input by incorporating positional encoding. Specifically, each encoder
layer includes a self-attention mechanism that computes attention scores between
all pairs of tokens in the input sequence, and a feedforward neural network that
applies a non-linear transformation to the output of the self-attention mechanism.
The self-attention mechanism computes a weighted sum of the hidden states for each
token in the input sequence, where the weights are based on the similarity between
the token and all other tokens in the sequence. This allows the encoder to focus on
the most relevant parts of the input sequence for each token, taking into account
the context in which it appears. Furthermore, the transformer model uses multi-head self-
attention to capture multiple relationships, increase expressive power, become robust
to variations in data, and improve regularization. The formula for calculating each
head's self-attention is as follows:
Attention(Q, K, V) = Softmax(QKᵀ / √dk) V    (2)

Here, Q, K, and V are matrices of queries, keys, and values, respectively, and dk is the
dimension of the keys, which is utilized to scale the resultant score.
The multi-head self-attention is essentially the amalgamation of each head's
outcome, which can be represented as:

MultiHead(Q, K, V) = Concat(Head1, Head2, ..., Headh) Wᴼ    (3)

Headi = Attention(Q Wᵢᵠ, K Wᵢᴷ, V Wᵢⱽ)    (4)

Here, Headi is the ith head, and Wᵢᵠ, Wᵢᴷ, and Wᵢⱽ are the corresponding weight matrices
of the queries (Q), keys (K), and values (V).
Finally, a feed-forward neural network (F(.)) takes the output of the multi-head self-
attention, followed by a residual connection, and applies a non-linear transformation
to the output of the self-attention mechanism, which allows the encoder to capture
more complex relationships between the tokens in the input sequence.
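The single-head scaled dot-product attention described above can be sketched in pure Python as follows. This is a minimal toy illustration with list-based matrices, not the actual MarianMT implementation, which operates on batched tensors:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of scores.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def matmul(A, B):
    # Plain list-of-lists matrix multiplication.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def attention(Q, K, V):
    # Attention(Q, K, V) = Softmax(Q K^T / sqrt(d_k)) V
    d_k = len(K[0])
    scores = matmul(Q, [list(col) for col in zip(*K)])   # Q K^T
    weights = [softmax([s / math.sqrt(d_k) for s in row])  # scale + softmax
               for row in scores]
    return matmul(weights, V)
```

Each output row is a convex combination of the rows of V, weighted by how strongly the corresponding query matches each key.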
4.2.2 Decoder
It is responsible for generating the output sequence based on the encoded input
sequence produced by the encoder. The decoder is auto-regressive: it takes the
encoded input sequence and generates the output sequence token by token. Each
decoder layer comprises a masked multi-head self-attention layer, a multi-head atten-
tion layer, and a feed-forward neural network layer. First, masked self-attention is
computed over the target sequence Y. The masked multi-head self-attention layer is
similar to the self-attention layer in the encoder, but with a mask applied to ensure
that the decoder cannot attend to future tokens in the output sequence. This sub-
layer allows the decoder to attend to relevant parts of the output sequence generated
so far and capture the dependencies between its tokens. Next,
attention is computed over the encoded hidden representations H. The multi-head
attention layer is responsible for attending to the encoded input sequence generated
by the encoder. This sub-layer enables the decoder to incorporate information from
the input sequence into the output sequence and produce a translated version of the
input sequence. Then, a position-wise feed-forward network is applied to the output
representation obtained in the previous step. The feed-forward neural network (F(.))
layer applies a non-linear transformation, including residual connections and layer nor-
malization, to the output of the attention layers to generate the final output sequence.
During training, the decoder uses teacher forcing, where the true previous token is
fed as input to the decoder at each time step. During inference, the decoder generates
the output sequence token by token by recursively predicting the most likely token at
each time step based on the previous tokens and the encoded input sequence.
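The inference-time decoding loop described above can be sketched as follows. The `next_token_scores` callback is a hypothetical stand-in for the trained decoder conditioned on the encoded source and the tokens generated so far:

```python
def greedy_decode(next_token_scores, bos, eos, max_len=50):
    """Greedy autoregressive decoding.

    next_token_scores(prefix) -> dict mapping candidate tokens to scores.
    Starting from the beginning-of-sentence token, repeatedly append the
    highest-scoring next token until the end-of-sentence token (or the
    length limit) is reached.
    """
    output = [bos]
    for _ in range(max_len):
        scores = next_token_scores(output)
        token = max(scores, key=scores.get)  # most likely next token
        output.append(token)
        if token == eos:
            break
    return output
```

In practice, beam search is often substituted for the greedy choice, but the token-by-token structure of the loop is the same.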
5 Experimental Analysis
5.1 Bangla GEC Corpus
We developed and published a large-scale Bangla GEC parallel corpus containing
7,074,425 (7.1M) source-target pairs. It amalgamates 10 distinct types of Bangla
grammar errors, as depicted in Table 1. To maintain optimal model performance and
avoid unnecessary asymptotic complexity, we limited the maximum sentence length
to 50, as we observed that including longer sentences did not yield any improvements
in model performance. The frequencies of sentence lengths have been illustrated in
Figure 2. We employ the dataset for the Bangla grammar error correction task by
partitioning it into training, validation, and test sets, ensuring that all three subsets
incorporate the comprehensive range of all 10 error types.
Training Set: There are 5,730,275 (5.73M) instances in the training set. This
subset is primarily utilized to train the model. The model assimilates information
from these instances to discern intricate patterns, establish correlations, and
generate accurate predictions.
Validation Set: It comprises 636,702 (636.7K) instances, which are used to assess
the performance and fine-tune the parameters of the trained model.
Test Set: The test set encompasses 707,448 (707.4K) instances, which are kept
untouched during training and validation to evaluate the generalization capabilities
of the model, unveiling an impartial gauge of its efficacy in handling unseen data.
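The roughly 81/9/10 partition implied by the reported counts can be sketched as below. This is an illustrative shuffle-and-slice split, not the authors' released preprocessing script:

```python
import random

def split_corpus(pairs, train_frac=0.81, val_frac=0.09, seed=42):
    """Shuffle source-target pairs and split into train/validation/test sets."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)  # fixed seed for reproducibility
    n_train = int(len(pairs) * train_frac)
    n_val = int(len(pairs) * val_frac)
    train = pairs[:n_train]
    val = pairs[n_train:n_train + n_val]
    test = pairs[n_train + n_val:]
    return train, val, test
```

Because all ten error types are generated uniformly over the shuffled corpus, each subset naturally incorporates the full range of error types, as the paper requires.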
Fig. 2 The instances vs. sentence length plot (length of the sentence, 0-50, against no. of instances), which provides a visual representation of the sentence length distribution within the BGEC corpus.
5.2 Baselines
Bangla-T5 [47]. This is a large-scale language model pre-trained primarily
for the Bangla paraphrase task. As it is a transformer-based
pre-trained sequence-to-sequence model, we further fine-tuned it and cross-checked its
performance on the Bangla GEC task by transferring knowledge.
T5-Small [50]. It is an efficient version of the T5 language model developed
by Google, designed to require fewer resources while still maintaining high per-
formance. We first fine-tuned the model for the downstream Bangla paraphrase
task. Then, we tailored the model for the Bangla GEC task while transferring the
knowledge gained from the paraphrasing task.
5.3 Performance Evaluation
Accuracy and F1-Score. Accuracy and F1 score are two commonly used metrics
for evaluating the performance of a model. Accuracy measures how often the
model's predictions are correct. It is defined as the number of correct predictions
divided by the total number of predictions. Mathematically, it can be expressed as:

Accuracy = (TP + TN) / (TP + TN + FP + FN)    (5)

where TP is the number of true positives, TN is the number of true negatives, FP
is the number of false positives, and FN is the number of false negatives.
While accuracy is a useful metric, it can be misleading when the classes are
imbalanced. The F1 score is a more balanced metric that takes into account both
precision and recall. Precision measures how many of the positive predictions are
correct, while recall measures how many of the positive examples are correctly pre-
dicted. The F1 score is the harmonic mean of precision and recall and is calculated
as follows:

F1 Score = 2 × (precision × recall) / (precision + recall)    (6)

where precision = TP / (TP + FP) and recall = TP / (TP + FN).
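Under these definitions, both metrics can be computed directly from the confusion-matrix counts, as in this minimal sketch:

```python
def accuracy(tp, tn, fp, fn):
    """Correct predictions over all predictions."""
    return (tp + tn) / (tp + tn + fp + fn)

def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```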
BERT Score. It measures the similarity between two pieces of text based on their
semantic meaning. The equation for the BERT score is as follows:

BERT Score = (1/N) Σᵢ₌₁ᴺ F1Score(BERTsimilarityᵢ, BERTprecisionᵢ, BERTrecallᵢ)    (7)

where N is the number of text pairs being evaluated, F1 Score is the harmonic mean
of BERTprecision and BERTrecall, BERTsimilarityᵢ is the cosine similarity between
the embeddings of the two text pieces, BERTprecisionᵢ is the proportion of words in
the first text that have a matching word in the second text, and BERTrecallᵢ is the
proportion of words in the second text that have a matching word in the first text.
SacreBLEU. It also measures the similarity between a prediction and one or more
human-annotated references. It evaluates not only the accuracy of individual words
but also the overall fluency and coherence of the translated text. The score ranges
from 0 to 100, with higher scores indicating better performance. The equation
for calculating the SacreBLEU score is as follows:

SacreBLEU = 100 × BP × (sum of n-gram matches / total n-grams)    (8)

BP = min(1, e^(1 − reference length / candidate length))    (9)

where BP, or brevity penalty, is a penalty term that adjusts the score for the length
of the candidate predictions relative to the length of the reference annotations; the
reference length and candidate length are the lengths of the concatenated reference
annotations and candidate predictions, respectively. The sum of n-gram matches indicates the
total number of matching n-grams between the candidate predictions and the references, and total n-grams
refers to the number of n-grams in the candidate predictions.
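The brevity penalty above can be computed directly; a minimal sketch:

```python
import math

def brevity_penalty(reference_length, candidate_length):
    """BP = min(1, e^(1 - reference_length / candidate_length)).

    Penalizes candidate predictions that are shorter than the references;
    candidates at least as long as the references receive BP = 1.
    """
    return min(1.0, math.exp(1.0 - reference_length / candidate_length))
```

For example, a candidate half the length of the references is penalized by a factor of e⁻¹, while one equal to or longer than the references is not penalized at all.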
5.4 Hyperparameters
The default conguration for MarianMT[49] uses six stacked layers for both the
encoder and decoder. Likewise, MarianMT’s default hidden layer size of 512 is uti-
lized to represent the feature vectors. A learning rate of 5×105is employed with a
batch size of 16 during the training process. The model is trained for 30 epochs with
the AdamW optimizer for updating the weights of the network to minimize the loss
5.5 Transformers on Bangla GEC Task
5.5.1 Quantitative Result
The quantitative results of dierent BGEC baselines are demonstrated in Table 2,
which explicitly manifests the preeminence of our proposed Panini model over the
BanglaT5 and T5-Small models in the BGEC task by attaining an accuracy score of
83.33%, an f1-score of 0.833, a BERT score of 99.43, and a ScarceBLEU score of 95.9.
However, the T5-Small model exhibits relatively inferior performance compared to
the other models, which is expected given its substantially lower number of trainable
parameters. On the other hand, the BanglaT5 model, which possesses around 3.5 times
more parameters than our model, falls signicantly short in all evaluation criteria,
making it the second-best performer overall. Our Panini outperforms BanglaT5 with
a substantial lead, improving the accuracy score by 18.81%, the f1-score by 0.19, the
BERT Score by 1.93, and the ScarceBLEU score by 11.5.
Method     Training Loss   Accuracy   F1 Score   BERT Score   SacreBLEU   Param.
BanglaT5   4.21 × 10⁻²     64.52%     0.645      97.5         84.4        247.53M
T5-Small   4.5 × 10⁻²      59.53%     0.595      96.73        80.3        60.51M
Panini     2.79 × 10⁻²     83.33%     0.833      99.43        95.9        74.36M

Table 2 The table of the quantitative results, where the empirical outcomes of our proposed
Panini are compared with other transformer-based baselines for the BGEC task.
5.5.2 Qualitative Result
Table 3exemplies the qualitative outcomes of BanglaT5, T5-Small, and Panini.
It unequivocally illustrates the outstanding performance of Panini compared to
BanglaT5 and T5-Small models. Our Panini model excels in rectifying a myriad of
grammatical errors, including subject-verb agreement, tense inconsistencies, articles,
prepositions, punctuation, and verb agreement, to name a few. This extraordinary
prociency showcases its superior aptitude in capturing intricate language patterns
across diverse error typologies contrasted to the other two baseline models. Moreover,
Panini exhibits adeptness in resolving errors in long sentences, in which the other
two baselines fall short. Furthermore, the errors made by our method also manifest
a discernible degree of meaningfulness, duly considering the semantic meaning of the
predicted sentence. For instance, for the erroneous input     
 ”, it generated the correction as        ”,
which is not semantically wrong at all.
To make the grammatical correction, our Panini takes the erroneous sentence
as input and processes it using the encoder module. The decoding process then
begins by generating the first token of the target sentence using embeddings and
positional encodings, similar to the encoder. Then, the decoder's masked multi-head
self-attention mechanism focuses on the generated tokens, aiding the understand-
ing of token relationships. Next, the encoder-decoder attention allows the decoder
to reference the source sentence, aiding alignment between the source and target.
Finally, the decoder predicts subsequent tokens using self-attention and encoder-
decoder attention, iteratively generating tokens until the end-of-sentence (<EOS>)
token is reached. Figure 3 illustrates how each word in the source erroneous sen-
tence attends to and influences the words in the target corrected sentence during the
translation process.

Fig. 3 Illustration of the attention dynamics across the source erroneous sentence for each decoding step.
5.6 Transformers on Bangla Paraphrase Task
BanglaT5, T5-Small, and Panini demonstrate competitive performance on the
Bangla paraphrase task. While BanglaT5 and Panini achieve the highest SacreBLEU
score and BERT score, respectively, the T5-Small model has the fewest trainable
parameters among them, implying better asymptotic complexity. BanglaT5
scores the highest SacreBLEU of 29.6, which is 8.8 and 2.1 points higher than
the scores of the T5-Small and Panini models, respectively. On the other hand, Panini
improves the BERT score over BanglaT5 and T5-Small by 1.85 and 4.27 points, respec-
tively. Despite achieving a higher SacreBLEU score than Panini, BanglaT5 is
3.33 times larger than Panini, which results in a significant increase in asymp-
totic complexity. This leads to atypical training and inference times. On the contrary,
(Input)                
      
(BanglaT5)                
      
(T5-Small)                
       ×
( Panini)                
      
(Input)            
           
            
           
             
           
 
(BanglaT5)            
            
        ×
(T5-Small)            
            
        ×
( Panini)            
           
            
           
             
           
 
(Input)                 
(BanglaT5)                 
  ×
(T5-Small)                
   ×
( Panini)                
     ×
Table 3 The qualitative results table that elucidates the effectiveness of Panini and other
transformer-based baselines in rectifying Bangla grammatical errors.
T5-Small, with the smallest parameter size of 60.51M, fails to achieve a high Sacre-
BLEU or BERT score. Here, our Panini achieves the highest BERT score of 91.13 and
demonstrates competitive performance in terms of the SacreBLEU score, considering
its relatively small parameter size.
5.7 Ablation Study
To assess the inuence of training corpus size on the ecacy of our proposed Panini, we
undertook a series of experiments employing three dierent versions of the training set.
We evaluated their performance on a shared test set, which comprised 50K instances.
The three training sets varied in size, with the rst and second sets scaled down by
factors of 100 and 50, respectively, compared to the actual set. These smaller sets
Method     SacreBLEU   BERT Score   Param.
BanglaT5   29.6        89.29        247.53M
T5-Small   20.8        86.86        60.51M
Panini     27.5        91.13        74.36M

Table 4 Comparing the empirical outcomes of various
transformer-based methods on the Bangla paraphrase task.
contained approximately 57.3K and 114.6K instances, respectively. In contrast, the
third set was the comprehensive large-scale training set consisting of around 5.73M
instances. The empirical performance of Panini on these three versions of the corpus
can be found in Table 5.
Method   Train Size   Test Size   Accuracy   F1 Score   BERT Score   SacreBLEU
Panini   57.3K        50K         52.28%     0.523      98.64        89.1
Panini   114.6K       50K         58.72%     0.587      98.82        90.6
Panini   5.73M        50K         87.52%     0.875      99.61        96.3

Table 5 The effectiveness of our proposed Panini in the context of the BGEC task across varying
sizes of the training corpus.
The empirical ndings (Table 5) emphasize the signicant impact of utilizing a
large-scale corpus in Panini to achieve optimal performance. During the model train-
ing on 57.3K instances, it achieved an accuracy of 52.28%, an f1 score of 0.523, a BERT
score of 98.64, and a ScrceBLEU score of 89.1, respectively. Then, training the model
on another variant of the corpus, consisting of 114.6K instances, led to slight improve-
ments in accuracy, f1 score, BERT score, and ScrceBLEU, with an increase of 6.44%,
6.4×102, 0.18, and 1.5, respectively. Although these improvements were negligi-
ble, training the model on the actual training set, which comprised 5.73M instances,
resulted in more signicant enhancements, with an increase of 35.22% in accuracy,
0.352 in f1 score, 0.97 in BERT score, and 7.2 in ScrceBLEU, respectively.
The ablation study serves as a vivid depiction of how corpus size profoundly
impacts the model's performance. The findings of this study provide a lucid and all-
encompassing comprehension of the intricate interplay between the size of the training
corpus and the efficacy of the model. As the corpus size increases, our model con-
sistently exhibits a discernible elevation in its performance trajectory. This empirical
insight emphatically underscores the pivotal role data volume plays in increasing the
model's performance, thereby enhancing its proficiency and effectiveness.
Fig. 4 The empirical outcomes of Panini on the three different-sized training sets (57.3K, 114.6K,
and 5.73M instances) in terms of accuracy, BERT Score, and SacreBLEU.
6 Conclusion
The Bangla grammatical error correction task holds paramount importance in ensur-
ing the utmost perspicuity and scrupulousness of written Bangla text. In response to
the task, a monolingual transformer-based baseline for Bangla grammatical error
correction has been introduced in this study, aiming to fulfill the need for an effective
BGEC method in the Bangla language by utilizing transformer models. In pursuit of
this objective, a large-scale parallel corpus for the task has been developed and made
publicly accessible, which in turn means Bangla is no longer a low-resource language
for the task. Moreover, we introduced Panini, which has emerged as a new state-of-the-art
method, outperforming the BanglaT5 and T5-Small baselines by a significant margin.
It excelled not only in the BGEC task but also in the Bangla paraphrase task,
surpassing the performance of the previous state-of-the-art method. Furthermore, the
efficacy of transfer learning from the Bangla paraphrase task in the context of the
BGEC task has been thoroughly examined and analyzed. However, the scrutiny of
the influence of training corpus size on the effectiveness of our proposed Panini has
unveiled the significant data dependency of the method, highlighting its profound
reliance on extensive data resources. Nevertheless, while Panini exhibited commendable
outcomes on the BGEC task, there remains ample scope for improving performance
by targeting specific error categories and refining its efficacy. Overall, we introduced
a robust foundation for the BGEC task, serving as a baseline for forthcoming
advancements. In the future, we will alleviate the model's reliance on copious data
through the utilization of zero-shot learning. Also, we will empirically investigate the
efficacy of combining a pre-trained model from another language with our monolingual
pre-trained model through the utilization of knowledge distillation.
Acknowledgements
This research is funded by the Institute of Advanced Research (Grant No.
UIU/IAR/02/2021/SE/22), United International University, Bangladesh.
Conict of Interest
The authors declare no conict of interest.
Data Availability
Datasets will be made available on request.
References
[1] Rozovskaya, A., Roth, D.: Grammar error correction in morphologically rich
languages: The case of russian. Transactions of the Association for Computational
Linguistics 7, 1–17 (2019)
[2] Hu, L., Tang, Y., Wu, X., Zeng, J.: Considering optimization of english grammar
error correction based on neural network. Neural Computing and Applications,
1–13 (2022)
[3] Grundkiewicz, R., Junczys-Dowmunt, M., Heafield, K.: Neural grammatical error
correction systems with unsupervised pre-training on synthetic data. In: Pro-
ceedings of the Fourteenth Workshop on Innovative Use of NLP for Building
Educational Applications, pp. 252–263 (2019)
[4] Wang, Y., Wang, Y., Dang, K., Liu, J., Liu, Z.: A comprehensive survey of
grammatical error correction. ACM Transactions on Intelligent Systems and
Technology (TIST) 12(5), 1–51 (2021)
[5] Hasan, K.A., Mondal, A., Saha, A.: A context free grammar and its predictive
parser for bangla grammar recognition. In: 2010 13th International Conference
on Computer and Information Technology (ICCIT), pp. 87–91 (2010). IEEE
[6] Hasan, K., Mondal, A., Saha, A., et al.: Recognizing bangla grammar using
predictive parser. arXiv preprint arXiv:1201.2010 (2012)
[7] Islam, M.A., Hasan, K.A., Rahman, M.M.: Basic hpsg structure for bangla
grammar. In: 2012 15th International Conference on Computer and Information
Technology (ICCIT), pp. 185–189 (2012). IEEE
[8] Purohit, P.P., Hoque, M.M., Hassan, M.K.: An empirical framework for seman-
tic analysis of bangla sentences. In: 2014 9th International Forum on Strategic
Technology (IFOST), pp. 34–39 (2014). IEEE
[9] Purohit, P.P., Hoque, M.M., Hassan, M.K.: Feature based semantic analyzer
for parsing bangla complex and compound sentences. In: The 8th International
Conference on Software, Knowledge, Information Management and Applications
(SKIMA 2014), pp. 1–7 (2014). IEEE
[10] Karim, M.S., Robi, F.R.H., Hossain, M.M., Rahman, M.T., et al.: Implementa-
tion and performance evaluation of semantic features analysis system for bangla
assertive, imperative and interrogative sentences. In: 2018 International Con-
ference on Bangla Speech and Language Processing (ICBSLP), pp. 1–5 (2018).
[11] Hasan, K.A., Hozaifa, M., Dutta, S.: Detection of semantic errors from sim-
ple bangla sentences. In: 2014 17th International Conference on Computer and
Information Technology (ICCIT), pp. 296–299 (2014). IEEE
[12] Rabbi, R.Z., Shuvo, M.I.R., Hasan, K.A.: Bangla grammar pattern recognition
using shift reduce parser. In: 2016 5th International Conference on Informatics,
Electronics and Vision (ICIEV), pp. 229–234 (2016). IEEE
[13] Al Hadi, A., Khan, M.Y.A., Sayed, M.A.: Extracting semantic relatedness for
bangla words. In: 2016 5th International Conference on Informatics, Electronics
and Vision (ICIEV), pp. 10–14 (2016). IEEE
[14] Alamgir, T., Aren, M.S.: An empirical framework for parsing bangla imperative,
optative and exclamatory sentences. In: 2017 International Conference on Elec-
trical, Computer and Communication Engineering (ECCE), pp. 164–169 (2017).
[15] Khatun, S., Hoque, M.M.: Semantic analysis of bengali sentences. In: 2018 Inter-
national Conference on Bangla Speech and Language Processing (ICBSLP), pp.
1–6 (2018). IEEE
[16] Saha Prapty, A., Rifat Anwar, M., Azharul Hasan, K.: A rule-based parsing
for bangla grammar pattern detection. In: Proceedings of International Joint
Conference on Advances in Computational Intelligence: IJCACI 2020, pp. 319–
331 (2021). Springer
[17] Afroz, S., Susmoy, M., Anjum, F., Nowshin, N.: Examining lexical and grammatical
difficulties in bengali language using nlp with machine learning. PhD thesis,
Brac University (2021)
[18] Faisal, A.M.F., Rahman, M.A., Farah, T.: A rule-based bengali grammar checker.
In: 2021 Fifth World Conference on Smart Trends in Systems Security and
Sustainability (WorldS4), pp. 113–117 (2021). IEEE
[19] Alam, M., UzZaman, N., Khan, M., et al.: N-gram based statistical grammar
checker for bangla and english (2007)
[20] Kundu, B., Chakraborti, S., Choudhury, S.K.: Nlg approach for bangla gram-
matical error correction. In: 9th International Conference on Natural Language
Processing, ICON, pp. 225–230 (2011)
[21] Kundu, B., Chakraborti, S., Choudhury, S.K.: Combining confidence score and
mal-rule filters for automatic creation of bangla error corpus: grammar checker
perspective. In: Computational Linguistics and Intelligent Text Processing: 13th
International Conference, CICLing 2012, New Delhi, India, March 11-17, 2012,
Proceedings, Part II 13, pp. 462–477 (2012). Springer
[22] Sinha, M., Dasgupta, T., Jana, A., Basu, A.: Design and development of a
bangla semantic lexicon and semantic similarity measure. International Journal
of Computer Applications 975, 8887 (2014)
[23] Khan, N.H.: Verication of bangla sentence structure using n-gram. Global
Journal of Computer Science and Technology 14, 1–5 (2014)
[24] Rahman, M.R., Habib, M.T., Rahman, M.S., Shuvo, S.B., Uddin, M.S.: An
investigative design based statistical approach for determining bangla sentence
validity. International Journal of Computer Science and Network Security 16(11),
30–37 (2016)
[25] Nipu, A.S., Pal, U.: A machine learning approach on latent semantic analysis for
ambiguity checking on bengali literature. In: 2017 20th International Conference
of Computer and Information Technology (ICCIT), pp. 1–4 (2017). IEEE
[26] Husna, A., Mostofa, M., Khatun, A., Islam, J., Mahin, M.: A framework for
word clustering of bangla sentences using higher order n-gram language model.
In: 2018 International Conference on Innovation in Engineering and Technology
(ICIET), pp. 1–6 (2018). IEEE
[27] Rana, M.M., Sultan, M.T., Mridha, M., Khan, M.E.A., Ahmed, M.M., Hamid,
M.A.: Detection and correction of real-word errors in bangla language. In: 2018
International Conference on Bangla Speech and Language Processing (ICBSLP),
pp. 1–4 (2018). IEEE
[28] Mridha, M., Rana, M.M., Hamid, M.A., Khan, M.E.A., Ahmed, M.M., Sultan,
M.T.: An approach for detection and correction of missing word in ben-
gali sentence. In: 2019 International Conference on Electrical, Computer and
Communication Engineering (ECCE), pp. 1–4 (2019). IEEE
[29] Rahman, M.R., Habib, M.T., Rahman, M.S., Islam, G.Z., Khan, M.A.A.: An
exploratory research on grammar checking of bangla sentences using statistical
language models. International Journal of Electrical and Computer Engineering
(IJECE) 10(3), 3244–3252 (2020)
[30] Hossain, N., Islam, S., Huda, M.N.: Development of bangla spell and gram-
mar checkers: Resource creation and evaluation. IEEE Access 9, 141079–141097
(2021)
[31] Kundu, S.B., Chakraborti, S., Choudhury, S.K.: Complexity guided active learn-
ing for bangla grammar correction. In: 10th International Conference on Natural
Language Processing, ICON, vol. 1, p. 4 (2013)
[32] Mridha, M., Hamid, M.A., Rana, M.M., Khan, M.E.A., Ahmed, M.M., Sultan,
M.T.: Semantic error detection and correction in bangla sentence. In: 2019 Joint
8th International Conference on Informatics, Electronics & Vision (ICIEV) and
2019 3rd International Conference on Imaging, Vision & Pattern Recognition
(icIVPR), pp. 184–189 (2019). IEEE
[33] Islam, S., Sarkar, M.F., Hussain, T., Hasan, M.M., Farid, D.M., Shatabda, S.:
Bangla sentence correction using deep neural network based sequence to sequence
learning. In: 2018 21st International Conference of Computer and Information
Technology (ICCIT), pp. 1–6 (2018). IEEE
[34] Shajalal, M., Aono, M.: Semantic textual similarity in bengali text. In: 2018
International Conference on Bangla Speech and Language Processing (ICBSLP),
pp. 1–5 (2018). IEEE
[35] Abujar, S., Masum, A.K.M., Chowdhury, S.M.H., Hasan, M., Hossain, S.A.:
Bengali text generation using bi-directional rnn. In: 2019 10th International Con-
ference on Computing, Communication and Networking Technologies (ICCCNT),
pp. 1–5 (2019). IEEE
[36] Rakib, O.F., Akter, S., Khan, M.A., Das, A.K., Habibullah, K.M.: Bangla
word prediction and sentence completion using gru: an extended version of rnn
on n-gram language model. In: 2019 International Conference on Sustainable
Technologies for Industry 4.0 (STI), pp. 1–6 (2019). IEEE
[37] Islam, M.S., Mousumi, S.S.S., Abujar, S., Hossain, S.A.: Sequence-to-sequence
bangla sentence generation with lstm recurrent neural networks. Procedia
Computer Science 152, 51–58 (2019)
[38] Pandit, R., Sengupta, S., Naskar, S.K., Dash, N.S., Sardar, M.M.: Improv-
ing semantic similarity with cross-lingual resources: A study in bangla—a low
resourced language. In: Informatics, vol. 6, p. 19 (2019). MDPI
[39] Noshin Jahan, M., Sarker, A., Tanchangya, S., Abu Yousuf, M.: Bangla real-word
error detection and correction using bidirectional lstm and bigram hybrid model.
In: Proceedings of International Conference on Trends in Computational and
Cognitive Engineering: Proceedings of TCCE 2020, pp. 3–13 (2020). Springer
[40] Chowdhury, M.A.H., Mumenin, N., Taus, M., Yousuf, M.A.: Detection of com-
patibility, proximity and expectancy of bengali sentences using long short term
memory. In: 2021 2nd International Conference on Robotics, Electrical and Signal
Processing Techniques (ICREST), pp. 233–237 (2021). IEEE
[41] Iqbal, M.A., Sharif, O., Hoque, M.M., Sarker, I.H.: Word embedding based tex-
tual semantic similarity measure in bengali. Procedia Computer Science 193,
92–101 (2021)
[42] Anbukkarasi, S., Varadhaganapathy, S.: Neural network-based error handler in
natural language processing. Neural Computing and Applications, 1–10 (2022)
[43] Dhar, A.C., Roy, A., Habib, M.A., Akhand, M., Siddique, N.: Transformer deep
learning model for bangla–english machine translation. In: Proceedings of 2nd
International Conference on Articial Intelligence: Advances and Applications:
ICAIAA 2021, pp. 255–265 (2022). Springer
[44] Aurpa, T.T., Sadik, R., Ahmed, M.S.: Abusive bangla comments detection on
facebook using transformer-based deep learning models. Social Network Analysis
and Mining 12(1), 24 (2022)
[45] Bijoy, M.H., Hossain, N., Islam, S., Shatabda, S.: Dpcspell: A transformer-based
detector-puricator-corrector framework for spelling error correction of bangla
and resource scarce indic languages. arXiv preprint arXiv:2211.03730 (2022)
[46] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N.,
Kaiser, Ł., Polosukhin, I.: Attention is all you need. Advances in neural
information processing systems 30 (2017)
[47] Akil, A., Sultana, N., Bhattacharjee, A., Shahriyar, R.: Banglaparaphrase: A
high-quality bangla paraphrase dataset. arXiv preprint arXiv:2210.05109 (2022)
[48] Shahgir, H., Sayeed, K.S.: Bangla grammatical error detection using t5 trans-
former model. arXiv preprint arXiv:2303.10612 (2023)
[49] Junczys-Dowmunt, M., Grundkiewicz, R., Dwojak, T., Hoang, H., Heafield, K.,
Neckermann, T., Seide, F., Germann, U., Aji, A.F., Bogoychev, N., et al.: Marian:
Fast neural machine translation in c++. arXiv preprint arXiv:1804.00344 (2018)
[50] Rael, C., Shazeer, N., Roberts, A., Lee, K., Narang, S., Matena, M., Zhou, Y.,
Li, W., Liu, P.J.: Exploring the limits of transfer learning with a unied text-to-
text transformer. The Journal of Machine Learning Research 21(1), 5485–5551