'ordinal data' Search Results
Somers' D as an Alternative for the Item–Test and Item-Rest Correlation Coefficients in the Educational Measurement Settings
item analysis pearson correlation somers' d item–total correlation item–rest correlation item discrimination power...
The Pearson product–moment correlation coefficient between item g and test score X, known as the item–test or item–total correlation (Rit), and the item–rest correlation (Rir) are two of the most used classical estimators of item discrimination power (IDP). Both Rit and Rir underestimate IDP because of the mismatch between the scales of the item and the score. The underestimation may be drastic when the difficulty level of the item is extreme. Based on a simulation, a good alternative for Rit and Rir in a binary dataset could be Somers’ D: it reaches the ultimate values +1 and –1, it underestimates IDP remarkably less than Rit and Rir, and, being a robust statistic, it is more stable against changes in the data structure. Somers’ D has, however, one major disadvantage in the polytomous case: it tends to underestimate the magnitude of the association between item and score more than Rit does when the item scale has four categories or more.
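As a rough illustration of the statistic discussed in this abstract, the following is a minimal sketch of Somers' D for an item and a score. The abstract does not state which direction of D is used; this sketch assumes the score is the dependent variable, so pairs tied on the item are left out of the denominator. That is the direction that can reach +1 or –1 with a binary item. The function name `somers_d` and the toy data are hypothetical.

```python
from itertools import combinations


def somers_d(item, score):
    """Somers' D with `score` as the dependent variable (an assumption).

    D = (concordant - discordant) / (number of pairs not tied on the item).
    """
    concordant = discordant = untied_on_item = 0
    for (g1, x1), (g2, x2) in combinations(zip(item, score), 2):
        if g1 == g2:
            continue  # pairs tied on the item do not enter the denominator
        untied_on_item += 1
        direction = (g1 - g2) * (x1 - x2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # pairs tied only on the score stay in the denominator
    if untied_on_item == 0:
        raise ValueError("all observations share the same item value")
    return (concordant - discordant) / untied_on_item


# Toy data (hypothetical): a binary item that splits the scores perfectly.
print(somers_d([0, 0, 0, 1, 1, 1], [1, 2, 3, 4, 5, 6]))  # 1.0
# One test taker out of order: 8 concordant and 1 discordant pair out of 9.
print(somers_d([0, 0, 1, 0, 1, 1], [1, 2, 3, 4, 5, 6]))  # 7/9, about 0.78
```

The pairwise loop is quadratic in the number of test takers, which is fine for a small illustration but not for large datasets.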
Dimension-Corrected Somers’ D for the Item Analysis Settings
item analysis pearson correlation item–total correlation item–rest correlation somers’ d item discrimination power...
A new index of item discrimination power (IDP), dimension-corrected Somers’ D (D2), is proposed. Somers’ D is one of the superior alternatives for the item–total (Rit) and item–rest (Rir) correlations in reflecting the real IDP of items with scales 0/1 and 0/1/2, that is, up to three categories. D also reaches the extreme values +1 and ‒1 correctly, while Rit and Rir cannot reach the ultimate values in real-life testing settings. However, when the item has four categories or more, Somers’ D underestimates IDP more than the Pearson correlation does. A simple correction to Somers’ D in the polytomous case seems to be effective in item analysis settings. In the simulation with real-life items, D2 showed very few cases of obvious underestimation and practically no cases of obvious overestimation. With certain restrictions discussed in the article, D2 seems to be a good alternative for these classic estimators not only with dichotomous items but also with polytomous ones. In general, the magnitudes of the estimates by D2 are higher than those by Rit, Rir, and polychoric correlation, and they seem to be close to those of the bi- and polyserial correlation coefficients, without out-of-range values.
High School Principals’ Ability to Estimate Work Time
principals’ time use; principalship; congruence of time use measurements; instructional leadership...
Time management for educational leaders has remained highly relevant to scholars, policymakers and practitioners. We analyzed survey responses from 98 public high school principals to examine the congruence between the average total hours they reported working per week and the sum of the average hours worked per week in each of five distinct categories of leadership tasks. The observed congruence was 0.32, while Cohen’s kappa coefficient was 0.10. Female principals tended to underreport, and male principals tended to overreport, total work time. Principals with doctorate degrees exhibited higher congruence than those without, and overreporting was inversely related to highest degree. Principals in charge of large teaching staffs were more likely than their counterparts to be congruent and less likely to overreport total work time. Self-report appears to be an inaccurate method to measure time use among high school principals. If time use is a key component of the quality of principal leadership, more detailed and robust techniques for collecting time use data should be utilized in future studies.
Goodman–Kruskal gamma and Dimension-Corrected Gamma in Educational Measurement Settings
item analysis goodman–kruskal gamma somers d jonckheere–terpstra test pearson correlation...
Although Goodman–Kruskal gamma (G) is used relatively rarely, it has promising potential as a coefficient of association in educational settings. Characteristics of G are studied in three sub-studies related to educational measurement settings. G appears to be unexpectedly appealing as an estimator of association between an item and a score because it strictly indicates the probability of getting a correct answer in the test item given the score, and it accurately produces perfect latent association irrespective of distributions, degrees of freedom, number of tied pairs and tied values in the variables, or the difficulty levels of the items. However, it underestimates the association in an obvious manner when the number of categories in the item is more than four. To address this, a dimension-corrected G (G2) is proposed and its characteristics are studied. Both G and G2 appear to be promising alternatives in measurement modelling settings, G with binary items and G2 with binary, polytomous and mixed datasets.
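The point about tied pairs can be made concrete with a minimal sketch of the standard Goodman–Kruskal gamma, G = (C − D) / (C + D), where C and D are the numbers of concordant and discordant pairs. Pairs tied on either variable are dropped entirely. The dimension-corrected G2 is not reproduced here because the abstract does not give its formula. The function name `goodman_kruskal_gamma` and the toy data are hypothetical.

```python
from itertools import combinations


def goodman_kruskal_gamma(item, score):
    """Goodman-Kruskal gamma: (C - D) / (C + D), ignoring all tied pairs."""
    concordant = discordant = 0
    for (g1, x1), (g2, x2) in combinations(zip(item, score), 2):
        direction = (g1 - g2) * (x1 - x2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 means a tie on the item or the score: skipped
    if concordant + discordant == 0:
        raise ValueError("no untied pairs, gamma is undefined")
    return (concordant - discordant) / (concordant + discordant)


# Toy data (hypothetical): a tie on the score does not lower gamma.
print(goodman_kruskal_gamma([0, 0, 1, 1], [1, 2, 2, 3]))  # 1.0
# Three concordant pairs and one discordant pair.
print(goodman_kruskal_gamma([0, 1, 0, 1], [1, 2, 3, 4]))  # 0.5
```

Because ties never enter the denominator, gamma is at least as large in magnitude as Somers' D computed on the same data.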
Measuring Purpose in Life in College Students: An Assessment of Invariance Properties by College Year and Undergraduate School
college students confirmatory factor analysis measurement invariance purpose in life...
Purpose in life is a key construct in the development of young adults, particularly college students. There are many instruments measuring sense of purpose in life, but few studies have examined their measurement properties among college students. The current study compares the measurement invariance properties of the Purpose in Life (PIL) scale and the Claremont Purpose Scale (CPS) across college year and undergraduate school. Using both a unidimensional and a two-dimensional model, we found that the PIL’s interpretability is limited among college students. Using a three-dimensional model, the CPS was invariant with respect to both grouping variables. The study suggests that the CPS can be used to make meaningful comparisons among college students categorized by school year and undergraduate school. The study also has some implications about the construct of purpose in life; namely, scale structures that work well statistically and theoretically among adults might not generalize to young adults.
Number of Response Options, Reliability, Validity, and Potential Bias in the Use of the Likert Scale Education and Social Science Research: A Literature Review
likert scale literature review potential bias reliability and validity...
This study reviews 60 papers that used a Likert scale and were published between 2012 and 2021. Screening for the literature review used the PRISMA method. Data were analyzed through data extraction and then synthesized in a structured manner using the narrative method. To achieve credible research results at the data collection and data analysis stages, a focus group discussion (FGD) was conducted. The findings show that only 10% of studies use a measurement scale with an even number of response options (4, 6, 8, or 10 choices). Most studies (90%) use a measurement instrument that involves a Likert scale with an odd number of response options (5, 7, 9, or 11), and the most popular choice is a five-point Likert scale. A rating scale with an odd number of responses of more than five points (especially a seven-point scale) is the most effective in terms of reliability and validity coefficients, but if the researcher wants to direct respondents to one side, then a scale with an even number of responses (six points) may be more suitable. The presence of response bias and central tendency bias can affect the validity and reliability of the Likert scale instrument.
Rethinking the Components of Regulation of Cognition through the Structural Validity of the Meta-Text Test
metacognition performance-based testing regulation of cognition structural validity...
The field of studies in metacognition points to some limitations in the way the construct has traditionally been measured and shows a near absence of performance-based tests. The Meta-Text is a performance-based test recently created to assess components of cognition regulation: planning, monitoring, and judgment. This study presents the first evidence on the structural validity of the Meta-Text, by analyzing its dimensionality and reliability in a sample of 655 Honduran university students. Different models were tested, via item confirmatory factor analysis. The results indicated that the specific factors of planning and monitoring do not hold empirically. The bifactor model containing the general cognition regulation factor and the judgment-specific factor was evaluated as the best model (CFI = .992; NFI = .963; TLI = .991; RMSEA = .021). The reliability of the factors in this model proved to be acceptable (Ω = .701 & .699). The judgment items were well loaded only by the judgment factor, suggesting that the judgment construct may actually be another component of the metacognitive knowledge dimension but having little role in cognition regulation. The results show initial evidence on the structural validity of the Meta-Text and give rise to information previously unidentified by the field which has conceptual implications for theorizing metacognitive components.
Simplification and Empirical Verification of Learning Styles Index for Indonesian Students
engineering learning style index short form verification indonesia...
This article investigates the adoption, simplification, and usage recommendations of the Indonesian Index of Learning Style Short Form (ILS-SF). The aim is to refine the initial Indonesian ILS, compare its suitability between engineering/non-engineering and high school/university students, and assess their learning styles. The participants were 678 students (413 females), with an average age of 19.4±1.92 years. The methods used in this study were adopting the existing Indonesian version of the ILS, simplifying it by reducing the number of items, empirical verification (validity and reliability), and assessment of the Indonesian data. The results show that the original ILS could be simplified without sacrificing the quality of the model. On the contrary, validity and reliability measures have increased. Confirmatory Factor Analysis (CFA) supports a reduction from 44 to 15 items. It confirms the validity with favorable indices such as CFI (0.972), TLI (0.966), RMSEA (0.021), SRMR (0.049), and GFI (0.999). Cronbach's alpha was 0.507 for Active-Reflective, 0.590 for Sensing-Intuitive, and 0.553 for Visual-Verbal. The Indonesian ILS-SF is faster, simpler, more suitable for engineering than non-engineering students, and more ideal for undergraduate than high school students. The analysis revealed that sensory (40.2%), active (18%), and visual (10.2%) preferences dominate among Indonesian students. This study highlights assessment tools tailored to diverse educational contexts.
Bringing AI into Teaching: Understanding Vietnamese Teachers’ Perspectives and Pedagogical Challenges
ai in education digital transformation educational policy pedagogical challenges teacher perspectives utaut...
Artificial Intelligence (AI) is reshaping education across the Asia-Pacific, yet its integration depends on teachers’ readiness and perspectives. This study explores AI adoption among Vietnamese teachers, a critical lens for the region’s digital education reforms, using the Unified Theory of Acceptance and Use of Technology (UTAUT). Through Structural Equation Modeling (SEM) and Latent Dirichlet Allocation (LDA), we analyzed responses from 246 teachers nationwide. Results show attitude strongly predicts adoption intention, with privacy and ethical concerns shaping acceptance, though fears of AI dependence hinder uptake. Uniform challenges across urban-rural and STEM-non-STEM contexts suggest systemic barriers in Vietnam’s education system. Teachers foresee AI as a pedagogical assistant but highlight insufficient training and privacy risks as key obstacles. These findings underscore the need for Asia-Pacific-relevant policies—AI literacy programs, ethical governance, and equitable access—to foster sustainable integration. This research informs regional educational policy by offering a Vietnam-centric model for balancing technological innovation with pedagogical integrity, addressing shared challenges in the Asia-Pacific’s digital transformation.
A Proposed Standard for the Reporting of Structural Equation Models With Ordinal Variables: Why Ordinal Data Should be Treated With Extra Care?
confirmatory factor analysis likert items ordinal data structural equation modelling...
Educational researchers, as well as researchers in other disciplines, often work with ordinal data, such as Likert item responses and test item scores. Critical questions arise when researchers attempt to implement statistical models to analyse ordinal data, given that many statistical techniques assume the data analysed to be continuous. Could ordinal data be treated as continuous data, that is, assuming the ordinal data to be continuous and then applying statistical techniques as if analysing continuous data? Why and why not? Focusing on structural equation models (SEMs), particularly confirmatory factor analysis (CFA), this article discusses an ongoing debate on the treatment of ordinal data and reports a short review on the practices of conducting and reporting SEMs, in the context of mathematics education research. The author reviewed 70 publications in mathematics education research that reported a study involving SEMs to analyse ordinal data, but less than half discussed how data were treated or guided readers through the analysis; it is therefore harder to repeat such an analysis and evaluate the results. This article invites methodological discussions on SEMs with ordinal variables in the practices of educational research. Subsequently, a standard for reporting SEMs with ordinal data is proposed, followed by an example. This standard contributes to educational research by enabling researchers (self and others) to evaluate SEMs reported. The example demonstrates, using real-life research data, how two different approaches for analysing ordinal data (as continuous or as a product of discretisation from some continuous distributions) can lead to results that disagree.