Somers' D as an Alternative for the Item–Test and Item-Rest Correlation Coefficients in the Educational Measurement Settings

Jari Metsämuuronen*

https://doi.org/10.12973/ijem.6.1.207

Pub. date: February 15, 2020
Pages: 207‒221

The Pearson product–moment correlation coefficient between item g and test score X, known as the item–test or item–total correlation (Rit), and the item–rest correlation (Rir) are two of the most widely used classical estimators of item discrimination power (IDP). Both Rit and Rir underestimate IDP because of the mismatch between the scales of the item and the score. The underestimation may be drastic when the difficulty level of the item is extreme. Based on a simulation, a good alternative to Rit and Rir in a binary dataset could be Somers’ D: it reaches the extreme values +1 and –1, it underestimates IDP remarkably less than Rit and Rir do, and, being a robust statistic, it is more stable against changes in the data structure. Somers’ D has, however, one major disadvantage in the polytomous case: it tends to underestimate the magnitude of the association between item and score more than Rit does when the item scale has four or more categories.
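As a rough illustration of the contrast described above, the following Python sketch computes Rit, Rir, and Somers' D for one very difficult binary item. The data are a made-up toy example, not the simulation behind the article, and the function names are hypothetical. The direction of Somers' D is an assumption: the abstract does not state it, so the sketch uses the direction that normalizes by the pairs of examinees who differ on the item, which coincides with the rank-biserial correlation for a binary item.

```python
import numpy as np


def pearson_r(a, b):
    """Pearson product-moment correlation between two equal-length vectors."""
    return float(np.corrcoef(np.asarray(a, float), np.asarray(b, float))[0, 1])


def somers_d(dependent, independent):
    """Somers' D of `dependent` given `independent`.

    (concordant - discordant) / (number of pairs that differ on `independent`).
    Assumed direction: the item is passed as `independent`.
    """
    y = np.asarray(dependent, float)
    x = np.asarray(independent, float)
    dx = np.sign(x[:, None] - x[None, :])  # +1, 0, -1 for every ordered pair
    dy = np.sign(y[:, None] - y[None, :])
    # Each pair appears twice (i,j and j,i) in numerator and denominator,
    # so the double counting cancels in the ratio.
    return float((dx * dy).sum() / np.count_nonzero(dx))


# Toy data (hypothetical): ten examinees, only the top scorer solves the item.
item = np.array([0, 0, 0, 0, 0, 0, 0, 0, 0, 1])
total = np.arange(1, 11)   # test score X, item included
rest = total - item        # rest score, item excluded

rit = pearson_r(item, total)        # item-total correlation, about 0.52
rir = pearson_r(item, rest)         # item-rest correlation, about 0.44
d_total = somers_d(total, item)     # 1.0: the item orders the scores perfectly
d_rest = somers_d(rest, item)       # about 0.89 (one pair is tied on the rest score)

print(f"Rit={rit:.3f}  Rir={rir:.3f}  D(total)={d_total:.3f}  D(rest)={d_rest:.3f}")
```

In this toy pattern the item separates the highest scorer from everyone else without error, yet Rit and Rir stay far below 1, while Somers' D in the assumed direction reaches 1 against the total score.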

Keywords: Item analysis, Pearson correlation, Somers' D, item–total correlation, item–rest correlation, item discrimination power.


