Gokhan Ercan

Isik University · Computer Science

PhD candidate

About

Publications: 10
Reads: 5,359
Citations: 60

Publications (10)
Conference Paper
Full-text available
MorphoLex is a study that analyzes the roots, prefixes, and suffixes of words. With MorphoLex, many words can be analyzed according to defined rules, and a useful database can be built from the results. Because Turkish is an agglutinative language with a rich structure, it offers different analyses and results from previous studie...
Conference Paper
Full-text available
In this paper, we present a two-level morphological analyzer for Turkish which consists of five main components: a finite state transducer, a rule engine for suffixation, a lexicon, a trie data structure, and an LRU cache. We use Java to implement the finite state machine logic and the rule engine, and XML to describe the finite state transducer rules...
Conference Paper
Full-text available
Two of the best-known existing works in Turkish natural language processing are Zemberek and the ITU Natural Language Processing Software Chain (İTÜ Doğal Dil İşleme Yazılım Zinciri). Of these, Zemberek is open source and is a software library made up of various natural language processing components. The ITU Natural Language Processing Software Chain, on the other hand, although it offers an online user interface, ...
Conference Paper
The aim of our natural language processing study is to prepare a dataset for Turkish for semantic discourse analysis at the paragraph-sentence level and for textual similarity measurement at the paragraph-sentence and sentence-sentence levels. The multiple-choice questions used as input, from the exams administered by the Republic of Turkey's Measurement, Selection and Placement Center (Ölçme, Seçme ve Yerleştirme Merkezi)...
Conference Paper
Full-text available
In this paper, we present AnlamVer, a semantic model evaluation dataset for Turkish designed to evaluate word similarity and word relatedness tasks while discriminating those two relations from each other. Our dataset consists of 500 word pairs annotated by 12 human subjects, and each pair has two distinct scores for similarity and relate...
Conference Paper
In this paper, we present the first multilayer annotated corpus for Turkish, which is a low-resource agglutinative language. Our dataset consists of 9,600 sentences translated from the Penn Treebank Corpus. Annotated layers contain syntactic and semantic information including morphological disambiguation of words, named entity annotation, shallow...
