A technique for computer detection and correction of spelling errors.

Communications of the ACM (Impact Factor: 2.86). 01/1964; 7:171-176. DOI: 10.1145/363958.363994
Source: DBLP

ABSTRACT: The method described assumes that a word which cannot be found in a dictionary has at most one error, which might be a wrong, missing or extra letter or a single transposition. The unidentified input word is compared to the dictionary again, testing each time to see if the words match, assuming one of these errors occurred. During a test run on garbled text, correct identifications were made for over 95 percent of these error types.
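The matching test the abstract describes can be sketched directly: compare the unidentified word against a dictionary entry and accept if the two are identical or differ by exactly one wrong, missing, or extra letter, or one transposition of adjacent letters. A minimal Python sketch (the function name is my own, not from the paper):

```python
def matches_with_one_error(word: str, entry: str) -> bool:
    """True if `word` equals `entry`, or differs from it by exactly one
    wrong letter, one missing letter, one extra letter, or one
    transposition of two adjacent letters."""
    if word == entry:
        return True
    lw, le = len(word), len(entry)
    if lw == le:
        # Same length: a single wrong letter, or an adjacent transposition.
        diffs = [i for i in range(lw) if word[i] != entry[i]]
        if len(diffs) == 1:
            return True  # one wrong letter
        if len(diffs) == 2:
            i, j = diffs
            return j == i + 1 and word[i] == entry[j] and word[j] == entry[i]
        return False
    if abs(lw - le) == 1:
        # Lengths differ by one: a single extra or missing letter.
        longer, shorter = (word, entry) if lw > le else (entry, word)
        return any(longer[:i] + longer[i + 1:] == shorter
                   for i in range(len(longer)))
    return False
```

Correction then amounts to scanning the dictionary (or a length-compatible subset of it) for entries that pass this test.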

  • ABSTRACT: This article presents a system for the automatic detection of errors in second-language learners' writing. The system contains two modules: one detects and corrects spelling errors, and a second provides broad detection of higher-level language errors. The article describes extensions to existing error-detection algorithms and introduces a novel method of context-based ranking of spelling-correction candidates based on probabilistic context-free grammars. The performance of the system is evaluated on real data with manually marked and classified errors.
    Language Matters 01/2006; 37(2):141-159. DOI:10.1080/10228190608566258 · 0.23 Impact Factor
  • ABSTRACT: The manipulation of XML-based relational representations of biological systems (BioMLs, for Bioscience Markup Languages) is a major challenge in systems biology. The needs of biologists, such as translational studies of biological systems, make this challenge greater still given the volume of material produced by next-generation sequencing. Among these BioMLs, SBML is the de facto standard file format for the storage and exchange of quantitative computational models in systems biology, supported by more than 257 software packages to date. The SBML standard is used by several biological-systems modeling tools and by several databases for representation and knowledge sharing. Complex biological systems are constructed by integrating several subsystems, and the problem of combining subsystems by merging SBML files has been addressed by several algorithms and tools. However, it remains impossible to build an automatic merge system that offers reusability, flexibility, scalability, and shareability. Existing algorithms rely on name-based component comparison, which does not allow integration into a Workflow Management System (WMS) to build pipelines and does not include the mapping of quantitative data needed for a sound analysis of the biological system. In this work, we present a deterministic merging algorithm that is consumable in a given WMS engine and designed around a novel biological-model similarity algorithm. This model-merging system integrates four submodules: SBMLChecker, SBMLAnot, SBMLCompare, and SBMLMerge, for model quality checking, annotation, comparison, and merging respectively. The tools are integrated into the BioExtract server, leveraging iPlant collaborative resources to let users process large models and design workflows. They are also embedded in a user-friendly online version, SW4SBMLm.
    05/2014, Degree: MS, Supervisor: Etienne Z. Gnimpieba
  • ABSTRACT: A spelling error detection and correction application is typically based on three main components: a dictionary (or reference word list), an error model, and a language model. While most of the attention in the literature has been directed to the language model, we show how improvements in any of the three components can lead to significant cumulative improvements in the overall performance of the system. We develop our dictionary of 9.2 million fully inflected Arabic words (types) from a morphological transducer and a large corpus, validated and manually revised. We improve the error model by analyzing error types and creating an edit-distance re-ranker. We also improve the language model by analyzing the level of noise in different data sources and selecting an optimal subset to train the system on. Testing and evaluation experiments show that our system significantly outperforms Microsoft Word 2013, OpenOffice Ayaspell 3.4 and Google Docs.
    Natural Language Engineering 02/2015; · 0.47 Impact Factor
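The three components named in the abstract above fit the classic noisy-channel view of spelling correction: candidates from the dictionary are scored by the language model and penalized by the error model. A toy sketch under assumed data (the word list, counts, and per-edit penalty below are illustrative, not the paper's):

```python
import math

# Illustrative stand-ins for the three components: a tiny dictionary,
# unigram counts as the language model, and a fixed per-edit penalty
# as the error model. None of these values come from the paper.
DICTIONARY = {"form", "from", "forum", "fore"}
UNIGRAM_COUNTS = {"form": 60, "from": 500, "forum": 25, "fore": 10}
TOTAL = sum(UNIGRAM_COUNTS.values())
EDIT_PENALTY = 2.0  # log-space cost per edit operation

def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance by dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def rank(misspelling: str) -> list:
    """Rank dictionary words by language-model log-probability minus
    an edit-distance penalty, best candidate first."""
    scored = sorted(
        ((math.log(UNIGRAM_COUNTS[w] / TOTAL)
          - EDIT_PENALTY * edit_distance(misspelling, w), w)
         for w in DICTIONARY),
        reverse=True)
    return [w for _, w in scored]
```

For instance, rank("frm") places "from" ahead of "form": both are one edit away from the input, so the language model decides, which illustrates how improving any single component can shift the final ranking.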
