Year 2018, Volume 9, Issue 1, Pages 17 - 32, 2018-03-31

Investigation of Two Facets Design With Generalizability In Item Response Modeling
Madde Tepki Modellemesinde Genellenebilirlik İle İki Yüzeyli Desenlerin İncelenmesi

Gülden KAYA UYANIK [1], Selahattin GELBAL [2]


This study investigates an approach called generalizability in item response modeling (GIRM) with a two-facet s × (i:t) design and compares its results with those of generalizability theory (GT). Simulated data are used. The data are generated from the GT linear model for a balanced s × (i:t) design with random facets. The generated data sets differ in three factors: testlet effect, testlet length, and number of testlets. The generated data comprise two universes, each with four conditions. The results show that the variance component estimates obtained with the GIRM approach are generally quite similar to those obtained with the GT approach, a finding consistent with Briggs and Wilson's study. There is no difference between the GIRM and GT results, but GIRM allows error variance to be separated from residual variance. The study also examines the reliability of testlets under different conditions. Testlets are more reliable when person-item variance is smaller. Furthermore, reliability decreases as the testlet effect increases. Across the conditions of all universes, having more items is effective in increasing reliability.
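The data generation and GT estimation steps described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes normally distributed random effects, continuous scores, and hypothetical variance values, and it estimates the variance components of a balanced s × (i:t) design from the usual ANOVA expected mean squares. The function names `simulate_s_x_i_in_t` and `estimate_variance_components` are invented for this example.

```python
import numpy as np


def simulate_s_x_i_in_t(n_s, n_t, n_i, var, mu=0.0, seed=0):
    """Draw scores X[s, t, i] from the random-effects linear model
    X = mu + s + t + i:t + st + (si:t, e). Normality is an assumption."""
    rng = np.random.default_rng(seed)

    def draw(key, shape):
        return rng.normal(0.0, np.sqrt(var[key]), size=shape)

    # Broadcasting expands each effect to the full (n_s, n_t, n_i) array.
    return (mu
            + draw("s", (n_s, 1, 1))            # person effect
            + draw("t", (1, n_t, 1))            # testlet effect
            + draw("i:t", (1, n_t, n_i))        # item nested in testlet
            + draw("st", (n_s, n_t, 1))         # person x testlet interaction
            + draw("si:t,e", (n_s, n_t, n_i)))  # residual


def estimate_variance_components(x):
    """ANOVA (expected mean square) estimates for a balanced s x (i:t) design."""
    n_s, n_t, n_i = x.shape
    grand = x.mean()
    m_s = x.mean(axis=(1, 2))   # person means
    m_t = x.mean(axis=(0, 2))   # testlet means
    m_it = x.mean(axis=0)       # item-within-testlet means
    m_st = x.mean(axis=2)       # person-by-testlet means

    ms_s = n_t * n_i * np.sum((m_s - grand) ** 2) / (n_s - 1)
    ms_t = n_s * n_i * np.sum((m_t - grand) ** 2) / (n_t - 1)
    ms_it = n_s * np.sum((m_it - m_t[:, None]) ** 2) / (n_t * (n_i - 1))
    ms_st = (n_i * np.sum((m_st - m_s[:, None] - m_t[None, :] + grand) ** 2)
             / ((n_s - 1) * (n_t - 1)))
    resid = x - m_st[:, :, None] - m_it[None, :, :] + m_t[None, :, None]
    ms_res = np.sum(resid ** 2) / ((n_s - 1) * n_t * (n_i - 1))

    # Solve the expected-mean-square equations for each component.
    return {
        "s": (ms_s - ms_st) / (n_t * n_i),
        "t": (ms_t - ms_it - ms_st + ms_res) / (n_s * n_i),
        "i:t": (ms_it - ms_res) / n_s,
        "st": (ms_st - ms_res) / n_i,
        "si:t,e": ms_res,
    }


# Hypothetical generating values, chosen only for illustration.
true_var = {"s": 1.0, "t": 0.10, "i:t": 0.20, "st": 0.25, "si:t,e": 1.0}
scores = simulate_s_x_i_in_t(n_s=2000, n_t=4, n_i=5, var=true_var)
estimates = estimate_variance_components(scores)
for name, value in estimates.items():
    print(f"{name:7s} true={true_var[name]:.2f} estimate={value:.3f}")
```

With only a handful of testlets and items, the estimates for the testlet and item-within-testlet components rest on very few sampled effects, so they can be far from the generating values and can even be negative. The person, person-by-testlet, and residual components are estimated from many persons and are much more stable.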

This study examines the Generalizability in Item Response Modeling (GIRM) approach with a two-facet s × (i:t) design and compares it with the results obtained from Generalizability Theory (GT). Simulated data were used. The data were generated from the GT linear model for the balanced random s × (i:t) design. The generated data differ in testlet effect, testlet length, and number of testlets. The data comprise two universes in total, each consisting of four conditions. The results show that, for the conditions of all universes, there is no difference between the variance estimates obtained with the GIRM approach and with GT. This result is supported by the study of Briggs and Wilson, who proposed the GIRM approach and examined it on a one-facet design. There is no difference between the values estimated with the GIRM approach and with GT; however, in the GIRM approach error variance can be observed separately from interaction variance. The study also examined testlet reliability under different conditions. Higher reliability was obtained when the person-testlet interaction was small than when it was large. In addition, reliability was observed to decrease as the testlet effect increased. Moreover, across the conditions of all universes, reliability was observed to increase as the number of items per testlet increased.
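The reliability findings can be read off the standard generalizability theory expressions for a balanced random design with items nested in testlets and crossed with persons. The block below is a sketch in textbook notation, assumed rather than taken from the article: s indexes persons, t testlets, i items, and n_t and n_i are the number of testlets and the number of items per testlet.

```latex
% Linear model for a score of person s on item i in testlet t
X_{sit} = \mu + \nu_s + \nu_t + \nu_{i:t} + \nu_{st} + \nu_{si:t,e}

% Decomposition of the total observed-score variance
\sigma^2(X_{sit}) = \sigma^2_s + \sigma^2_t + \sigma^2_{i:t}
                  + \sigma^2_{st} + \sigma^2_{si:t,e}

% Relative error variance and generalizability coefficient
\sigma^2_\delta = \frac{\sigma^2_{st}}{n_t} + \frac{\sigma^2_{si:t,e}}{n_t\, n_i},
\qquad
E\rho^2 = \frac{\sigma^2_s}{\sigma^2_s + \sigma^2_\delta}
```

Under these expressions, a larger person-by-testlet component (the testlet effect) inflates the relative error variance and lowers the coefficient, while increasing n_i shrinks only the residual term. Both patterns match the direction of the results reported in the abstract.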

  • Alkahtani, S. F. (2012). Oral performance scoring using generalizability theory and many-facet Rasch measurement: A comparison study. Unpublished Doctoral Dissertation. The Pennsylvania State University.
  • Bock, R. D., Brennan, R. L., & Muraki, E. (2002). The information in multiple ratings. Applied Psychological Measurement, 26, 364-375.
  • Bradlow, E. T., Wainer, H., & Wang, X. (1999). A Bayesian random effects model for testlets. Psychometrika, 64, 153-168.
  • Brennan, R. L. (2001). Generalizability theory. New York: Springer-Verlag.
  • Briggs, D. C., & Wilson, M. (2004). Generalizability theory in item response modeling. Presentation at the International Meeting of the Psychometric Society, Pacific Grove, CA.
  • Briggs, D. C., & Wilson, M. (2007). Generalizability theory in item response modeling. Journal of Educational Measurement, 44(2), 131-155.
  • Chien, Y. M. (2008). An investigation of testlet-based item response models with a random facets design in generalizability theory. Unpublished Doctoral Dissertation. University of Iowa.
  • Cronbach, L. J., Linn, R. L., Brennan, R. L., & Haertel, E. (1995). Generalizability analysis for educational assessments. Evaluation Comment. Los Angeles: UCLA's Center for the Study of Evaluation and The National Center for Research on Evaluation, Standards and Student Testing, http://www.cse.ucla.edu.
  • DeMars, C. E. (2006). Application of the bi-factor multidimensional item response theory model to testlet-based tests. Journal of Educational Measurement, 43(2), 145-168.
  • Dimitrov, D. M. (2003). Marginal true-score measures and reliability for binary items as a function of their IRT parameters. Applied Psychological Measurement, 27(6), 440-458.
  • Dresher, A. R. (2004). An empirical investigation of LID using the testlet model: A further look. Paper presented at the Annual Meeting of the National Council on Measurement in Education, San Diego, CA.
  • Feldt, L. S., & Qualls, A. L. (1989). Reliability. In R. L. Linn (Ed.), Educational measurement (3rd ed.) (pp. 105-146). New York: American Council on Education and Macmillan.
  • Ferrara, S., Huynh, F. L., & Bagli, H. (1997). Contextual characteristics of locally dependent open-ended item clusters on a large-scale performance assessment. Applied Measurement in Education, 12, 123-144.
  • Ferrara, S., Huynh, F. L., & Michaels, H. (1999). Contextual explanations of local dependence in item clusters in a large-scale hands-on science performance assessment. Journal of Educational Measurement, 36, 119-140.
  • Fox, J. P., & Glas, C. A. W. (2001). Bayesian estimation of a multilevel IRT model using Gibbs sampling. Psychometrika, 66, 271-288.
  • Glas, C. A. W. (1989). Contributions to estimating and testing Rasch models. Unpublished Doctoral Dissertation. Enschede, University of Twente.
  • Güler, N., Kaya Uyanık, G., & Taşdelen Teker, G. (2012). Genellenebilirlik kuramı. Ankara: Pegem Akademi Yayıncılık.
  • Hendrickson, A. B. (2001). Reliability of scores from tests composed of testlets: A comparison of methods. Paper presented at the Annual Meeting of the National Council on Measurement in Education (Seattle, WA, April 1-13, 2001).
  • Jiao, H., Kamata, A., Wang, S., & Jin, Y. (2012). A multilevel testlet model for dual local dependence. Journal of Educational Measurement, 49(1), 82-100.
  • Karasar, N. (2004). Bilimsel Araştırma Yöntemi (13th ed.). Ankara: Nobel Yayınları.
  • Kim, S. C., & Wilson, M. (2008). A comparative analysis of the ratings in performance assessment using generalizability theory and the many-facet Rasch model. Journal of Applied Measurement, 10(4), 408-423.
  • Kolen, M., & Harris, D. (1987). A multivariate test theory model based on item response theory and generalizability theory. Paper presented at the American Educational Research Association, Washington, DC.
  • Lee, G., & Park, I. Y. (2012). A comparison of the approaches of generalizability theory and item response theory in estimating the reliability of test scores for testlet-composed tests. Asia Pacific Education Review, 13(1), 47-54.
  • Lee, G., Brennan, R. L., & Frisbie, D. A. (2000). Incorporating the testlet concept in test score analyses. Educational Measurement: Issues and Practice, 19(4), 9-15.
  • Lee, G., & Frisbie, D. A. (1999). Estimating reliability under a generalizability theory model for test scores composed of testlets. Applied Measurement in Education, 12(3), 237-255.
  • Li, Y., Bolt, D. M., & Fu, J. (2006). A comparison of alternative models for testlets. Applied Psychological Measurement, 30(1), 3-21.
  • Linacre, J. M. (1989). Many-facet Rasch measurement. Chicago: MESA Press.
  • Linacre, J. M. (1999). FACETS (Version 3.17) [Computer software]. Chicago: MESA Press.
  • Lord, F. M. (1983). Unbiased estimation of ability parameters, of their variance, and of their parallel forms reliability. Psychometrika, 48, 233-245.
  • Patz, R., Junker, B., Johnson, M. S., & Mariano, L. (2002). The hierarchical rater model for rated test items and its application to large-scale educational assessment data. Journal of Educational and Behavioral Statistics, 27, 341-384.
  • Raju, N. S., & Oshima, T. C. (2005). Two prophecy formulas for assessing the reliability of item response theory-based ability estimates. Educational and Psychological Measurement, 65(3), 361-375.
  • Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen: Danish Institute for Educational Research.
  • Rosenbaum, P. R. (1988). Item bundles. Psychometrika, 53(3), 349-359.
  • Samejima, F. (1977). A use of the information function in tailored testing. Applied Psychological Measurement, 1, 233-247.
  • Samejima, F. (1994). Estimation of reliability coefficients using the test information function and its modifications. Applied Psychological Measurement, 18, 229-244.
  • Shavelson, R. J., & Webb, N. M. (1991). Generalizability theory: A primer. USA: SAGE Publications.
  • Sireci, S. G., Thissen, D., & Wainer, H. (1991). On the reliability of testlet-based tests. Journal of Educational Measurement, 28, 237-247.
  • Thissen, D., Steinberg, L., & Mooney, J. (1989). Trace lines for testlets: A use of multiple-categorical response models. Journal of Educational Measurement, 26, 247-260.
  • Verhelst, N. D., & Verstralen, H. H. F. M. (2001). IRT models for multiple raters. In A. Boomsma, T. Snijders, and M. Van Duijn (Eds.), Essays in Item Response Modeling (pp. 89-108). New York: Springer-Verlag.
  • Wainer, H. (1995). Precision and differential item functioning on a testlet-based test: The 1991 law school admissions test as an example. Applied Measurement in Education, 8, 157-186.
  • Wainer, H., & Kiely, G. L. (1987). Item clusters and computerized adaptive testing: A case for testlets. Journal of Educational Measurement, 24(3), 185-201.
  • Wainer, H., & Lewis, C. (1990). Toward a psychometrics for testlets. Journal of Educational Measurement, 27(1), 1-14.
  • Wainer, H., & Thissen, D. (1996). How is reliability related to the quality of test scores? What is the effect of local dependence on reliability? Educational Measurement: Issues and Practice, 15(1), 22-29.
  • Wainer, H., & Wang, C. (2000). Using a new statistical model for testlets to score TOEFL. Journal of Educational Measurement, 37, 203-220.
  • Wainer, H., Bradlow, E. T., & Du, Z. (2000). Testlet response theory: An analog for the 3PL model useful in testlet-based adaptive testing. Dordrecht: Kluwer Academic Publishers.
  • Wang, X., Bradlow, E. T., & Wainer, H. (2002). A general Bayesian model for testlets: Theory and application. Applied Psychological Measurement, 26(1), 109-128.
  • Wilson, M., & Hoskens, M. (2001). The rater bundle model. Journal of Educational and Behavioral Statistics, 26, 283-306.
  • Yen, W. M. (1993). Scaling performance assessments: Strategies for managing local item dependence. Journal of Educational Measurement, 30, 187-213.
  • Zhang, X., & Roberts, W. L. (2013). Investigation of standardized patient ratings of humanistic competence on a medical licensure examination using Many-Facet Rasch Measurement and generalizability theory. Advances in Health Sciences Education, 18(5), 929-944.
  • Zwinderman, A. H. (1991). A generalized Rasch model for manifest predictors. Psychometrika, 56, 589-600.
Primary Language: tr
Subjects: Social Sciences and Humanities
Journal Section: Articles
Authors

Orcid: orcid.org/0000-0002-8100-6994
Author: Gülden KAYA UYANIK
Institution: Sakar
Country: Turkey


Orcid: orcid.org/0000-0001-5181-7262
Author: Selahattin GELBAL

APA: KAYA UYANIK, G., & GELBAL, S. (2018). Madde Tepki Modellemesinde Genellenebilirlik İle İki Yüzeyli Desenlerin İncelenmesi. Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi, 9(1), 17-32. DOI: 10.21031/epod.349718