Volume 13, Issue 4, Pages 873-881, 2017-12-29

Web Proxy Log Data Mining System for Clustering Users and Search Keywords

Turgay Tugay Bilgin [1], Mustafa Koray Aytekin [2]


In this study, Internet users were clustered according to the keywords they type into search engines. The proposed software, UQCS (User Queries Clustering System), was developed to demonstrate the effectiveness of this approach. UQCS works together with Strehl's relationship-based clustering toolkit and segments users by the keywords they use to search the web. Internet proxy server logs were parsed, query strings were extracted from the search engine URLs, and the resulting IP-term matrix was converted into a similarity matrix using the Euclidean, Jaccard, cosine, and Pearson correlation distance metrics. The K-means algorithm and the graph-based OPOSSUM algorithm were then used to cluster the similarity matrices, and the results were visualized with the CLUSION toolkit.
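
The pipeline described in the abstract (parse log records, extract query terms, build an IP-term matrix, derive a similarity matrix) can be illustrated with a minimal sketch. This is not the UQCS implementation. The log records, the function names, and the use of a `q` query parameter are all assumptions made for illustration, and real proxy logs and search engines will differ. The sketch computes cosine and Jaccard similarity only; a distance can be taken as one minus the similarity.

```python
from collections import Counter
from math import sqrt
from urllib.parse import urlsplit, parse_qs

# Hypothetical, simplified log records: (client IP, requested URL).
LOG = [
    ("10.0.0.1", "http://www.google.com/search?q=data+mining+tools"),
    ("10.0.0.1", "http://www.bing.com/search?q=clustering+tools"),
    ("10.0.0.2", "http://www.google.com/search?q=data+mining"),
    ("10.0.0.3", "http://www.google.com/search?q=football+scores"),
    ("10.0.0.3", "http://example.com/index.html"),  # not a search request
]

def query_terms(url, param="q"):
    """Return lowercase search terms from the URL's query string.

    Assumes the search engine passes the query in `param`.
    """
    values = parse_qs(urlsplit(url).query).get(param, [])
    return [term.lower() for value in values for term in value.split()]

def ip_term_matrix(log):
    """Build a term-count matrix with one row per IP and one column per term."""
    counts = {}
    for ip, url in log:
        counts.setdefault(ip, Counter()).update(query_terms(url))
    ips = sorted(counts)
    vocab = sorted({term for c in counts.values() for term in c})
    matrix = [[counts[ip][term] for term in vocab] for ip in ips]
    return ips, vocab, matrix

def cosine(a, b):
    """Cosine similarity of two count vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def jaccard(a, b):
    """Jaccard similarity of the sets of terms present in each vector."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0

def similarity_matrix(matrix, sim):
    """Pairwise similarity between all rows of the IP-term matrix."""
    return [[sim(a, b) for b in matrix] for a in matrix]

ips, vocab, matrix = ip_term_matrix(LOG)
cos_sim = similarity_matrix(matrix, cosine)
jac_sim = similarity_matrix(matrix, jaccard)
```

A similarity matrix of this kind is the input that a clustering step such as K-means or a graph partitioner would then operate on; that step is omitted here.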


Data mining, Document clustering, Graph clustering, Web mining
Subjects Engineering and Basic Sciences
Journal Section Articles
Authors

Author: Turgay Tugay Bilgin
Email: ttbilgin@gmail.com
Country: Turkey


Author: Mustafa Koray Aytekin
Email: kaytekn@hotmail.com
Country: Turkey


BibTeX @article{cbayarfbe330088, journal = {Celal Bayar Üniversitesi Fen Bilimleri Dergisi}, issn = {1305-130X}, address = {Celal Bayar Üniversitesi}, year = {2017}, volume = {13}, pages = {873 - 881}, doi = {10.18466/cbayarfbe.330088}, title = {Web Proxy Log Data Mining System for Clustering Users and Search Keywords}, language = {en}, key = {cite}, author = {Bilgin, Turgay and Aytekin, Mustafa} }
APA Bilgin, T., Aytekin, M. (2017). Web Proxy Log Data Mining System for Clustering Users and Search Keywords. Celal Bayar Üniversitesi Fen Bilimleri Dergisi, 13 (4), 873-881. Retrieved from http://dergipark.gov.tr/cbayarfbe/issue/33464/330088
MLA Bilgin, T., Aytekin, M. "Web Proxy Log Data Mining System for Clustering Users and Search Keywords". Celal Bayar Üniversitesi Fen Bilimleri Dergisi 13 (2017): 873-881 <http://dergipark.gov.tr/cbayarfbe/issue/33464/330088>
Chicago Bilgin, T., Aytekin, M. "Web Proxy Log Data Mining System for Clustering Users and Search Keywords". Celal Bayar Üniversitesi Fen Bilimleri Dergisi 13 (2017): 873-881
RIS TY - JOUR T1 - Web Proxy Log Data Mining System for Clustering Users and Search Keywords AU - Turgay Tugay Bilgin , Mustafa Koray Aytekin Y1 - 2017 PY - 2017 N1 - DO - T2 - Celal Bayar Üniversitesi Fen Bilimleri Dergisi JF - Journal JO - JOR SP - 873 EP - 881 VL - 13 IS - 4 SN - 1305-130X-1305-1385 M3 - UR - Y2 - 2017 ER -
EndNote %0 Celal Bayar Üniversitesi Fen Bilimleri Dergisi Web Proxy Log Data Mining System for Clustering Users and Search Keywords %A Turgay Tugay Bilgin , Mustafa Koray Aytekin %T Web Proxy Log Data Mining System for Clustering Users and Search Keywords %D 2017 %J Celal Bayar Üniversitesi Fen Bilimleri Dergisi %P 1305-130X-1305-1385 %V 13 %N 4 %R %U