Volume 18, Issue 2, Pages 346-359, 2017-06-30

STOP WORD DETECTION AS A BINARY CLASSIFICATION PROBLEM

Senem Kumova Metin [1] , Bahar Karaoğlan [2]


In a wide group of languages, stop words, which have only grammatical roles and do not contribute to information content, may be exposed simply by their relatively high occurrence frequencies. However, in agglutinative or inflectional languages, a stop word may be observed in several different surface forms because of inflection, which introduces noise into frequency counts.

In this study, several well-known binary classification methods are employed to overcome the inflectional noise problem in stop word detection. The experiments are conducted on corpora of an agglutinative language, Turkish, in which the amount of inflection is high, and a non-agglutinative language, English, in which stop words undergo less inflection. The evaluations demonstrate that, on the Turkish corpus, the classification methods improve stop word detection relative to the frequency-based method. On the English corpora, however, the classification methods yield no improvement in stop word detection performance.
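The inflectional noise described in the abstract can be made visible with a small sketch. The code below is not the authors' method or data: the token list and the `BASE_FORM` lookup are invented for illustration, and merging surface forms is used here only to show how much frequency mass is hidden by inflection (the paper itself addresses the problem with binary classifiers). It ranks tokens by raw frequency, first on surface forms and then after mapping the inflected forms of the Turkish demonstrative "bu" back to one base form.

```python
from collections import Counter


def rank_by_frequency(tokens):
    """Return (token, count) pairs, most frequent first."""
    return Counter(tokens).most_common()


# Toy token stream, invented for illustration (not from the paper's corpora).
# "bu", "bunu", "buna", "bunun" are case-inflected forms of the same function word;
# "kitap" (book) and "okul" (school) are content words.
tokens = ["bu", "kitap", "bunu", "kitap", "buna", "okul", "bunun", "kitap"]

# Hypothetical lookup from inflected surface form to base form (an assumption).
BASE_FORM = {"bunu": "bu", "buna": "bu", "bunun": "bu"}

# Frequency-based ranking on raw surface forms: the function word's count is
# split across four forms, so a content word ends up on top.
surface_ranking = rank_by_frequency(tokens)

# Same ranking after merging surface forms: the function word now leads.
merged_ranking = rank_by_frequency([BASE_FORM.get(t, t) for t in tokens])

print(surface_ranking[0])
print(merged_ranking[0])
```

In this toy stream the content word "kitap" outranks every individual form of "bu" when surface forms are counted separately, which is the kind of failure a purely frequency-based detector is exposed to in a highly inflected language.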

Keywords: stop word, content word, binary classification, tf-idf
Subjects: Engineering and Basic Sciences
Journal Section: Research Article
Authors

Author: Senem Kumova Metin
E-mail: senem.kumova@ieu.edu.tr

Author: Bahar Karaoğlan
E-mail: bahar.karaoglan@ege.edu.tr

BibTeX @article{aubtda322136, journal = {Anadolu Üniversitesi Bilim Ve Teknoloji Dergisi A - Uygulamalı Bilimler ve Mühendislik}, issn = {1302-3160}, address = {Anadolu Üniversitesi}, year = {2017}, volume = {18}, pages = {346 - 359}, doi = {10.18038/aubtda.322136}, title = {STOP WORD DETECTION AS A BINARY CLASSIFICATION PROBLEM}, language = {en}, key = {cite}, author = {Kumova Metin, Senem and Karaoğlan, Bahar} }
APA Kumova Metin, S., Karaoğlan, B. (2017). STOP WORD DETECTION AS A BINARY CLASSIFICATION PROBLEM. Anadolu Üniversitesi Bilim Ve Teknoloji Dergisi A - Uygulamalı Bilimler ve Mühendislik, 18 (2), 346-359. DOI: 10.18038/aubtda.322136
MLA Kumova Metin, S., Karaoğlan, B. "STOP WORD DETECTION AS A BINARY CLASSIFICATION PROBLEM". Anadolu Üniversitesi Bilim Ve Teknoloji Dergisi A - Uygulamalı Bilimler ve Mühendislik 18 (2017): 346-359 <http://dergipark.gov.tr/aubtda/issue/29641/322136>