Chapter 1 Introduction

Over the past several years, diagnostic classification models (DCMs) have become a more prominent research focus in educational assessment, and in psychometrics more broadly (Bradshaw, 2017; Rupp & Templin, 2008b; Rupp, Templin, & Henson, 2010). Rather than providing a single scaled score for a unidimensional construct, as is common in many assessments based on item response theory (see Ayala, 2009), DCMs are multidimensional models that report a profile of mastery or non-mastery on the skills, or attributes, that are assessed. Thus, DCMs can provide more detailed and actionable information about which skills a student has mastered and which could use more instruction.

In all multidimensional models (e.g., multidimensional item response theory, structural equation modeling, DCMs), there is a measurement model that relates the observed data to the latent traits and a structural model that defines the relationships among the latent traits. If all possible parameters are estimated for both the measurement and structural models, some are likely to be unnecessary, which can affect the stability of other parameters and scores and increase the complexity and intensity of computation (Browne, Rockloff, & Rawat, 2016). However, Templin & Bradshaw (2014b) note that it is also important to estimate enough parameters to capture the full complexity of the data. Model reduction refers to the process of removing parameters to obtain a more parsimonious model that still captures the appropriate level of complexity.
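As an illustration of what reduction can mean in practice, consider a hypothetical item $i$ that measures two binary attributes. Under the log-linear cognitive diagnosis model, and in notation assumed here for illustration only (it may differ from the notation adopted in section 2.3), the saturated measurement model for respondent $r$ contains an intercept, two main effects, and one two-way interaction. One possible reduced form drops the interaction:

```latex
\begin{align}
% Saturated measurement model for a hypothetical item i measuring attributes 1 and 2
\operatorname{logit} P(X_{ri} = 1 \mid \boldsymbol{\alpha}_r)
  &= \lambda_{i,0}
   + \lambda_{i,1,(1)}\,\alpha_{r1}
   + \lambda_{i,1,(2)}\,\alpha_{r2}
   + \lambda_{i,2,(1,2)}\,\alpha_{r1}\alpha_{r2} \\
% One possible reduced form: the interaction is removed (fixed to zero)
\operatorname{logit} P(X_{ri} = 1 \mid \boldsymbol{\alpha}_r)
  &= \lambda_{i,0}
   + \lambda_{i,1,(1)}\,\alpha_{r1}
   + \lambda_{i,1,(2)}\,\alpha_{r2}
\end{align}
```

Here $\alpha_{r1}, \alpha_{r2} \in \{0, 1\}$ indicate mastery of each attribute. The structural model can be reduced in the same spirit: with $A$ binary attributes there are $2^A$ possible profiles, so a saturated structural model estimates $2^A - 1$ free profile proportions, and a reduced structural model replaces these with a smaller set of parameters.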

In the context of DCMs, there is little research or guidance on how the model reduction process should take place. For instance, the measurement model and structural model could be reduced simultaneously, one could be reduced after the other, or only one could be reduced. An exploration of these processes using the Diagnosing Teachers’ Multiplicative Reasoning assessment (Chapter 3) shows that the choice of model reduction process can have a profound impact on the final set of parameters included in the model, the estimates and standard errors of those parameters, and respondent assignment to attribute profiles.

The current study further explores this gap in the literature concerning best practices for model reduction of DCMs. A simulation study is conducted in which data are simulated from the log-linear cognitive diagnosis model (section 2.3), and the DCM is then estimated using each of the possible model reduction processes. Bias and mean squared error of the parameter estimates, along with agreement between estimated and true attribute mastery, provide insight into which model reduction process is most appropriate under a variety of data generation conditions. The findings of this study have practical implications for the estimation of DCMs, as the simulation study provides evidence for the effectiveness of the various model reduction processes. Additionally, practitioners using DCMs in applied settings stand to benefit, as a more parsimonious model that is still accurate may be more efficient to estimate.
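For reference, the two parameter recovery criteria named above are conventionally defined across simulation replications as follows. This is the textbook form, with notation assumed here for illustration: $\theta$ is the true generating value of a parameter, $\hat{\theta}_r$ is its estimate in replication $r$, and $R$ is the number of replications within a condition. The exact computation used in this study may differ in detail.

```latex
\begin{align}
% Average signed deviation of the estimates from the generating value
\operatorname{Bias}(\hat{\theta}) &= \frac{1}{R}\sum_{r=1}^{R}\bigl(\hat{\theta}_r - \theta\bigr) \\
% Average squared deviation of the estimates from the generating value
\operatorname{MSE}(\hat{\theta}) &= \frac{1}{R}\sum_{r=1}^{R}\bigl(\hat{\theta}_r - \theta\bigr)^{2}
\end{align}
```

Under these definitions the mean squared error equals the squared bias plus the variance of the estimates across replications, so it reflects both systematic error and sampling variability.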

1.1 Study constraints

Although DCMs can be estimated with attributes that have more than two latent categories (Rupp et al., 2010), this paper limits the discussion to binary latent attributes. Binary attributes are the most commonly used with DCMs, and this limitation simplifies the problem for the initial investigation proposed in this study. Further, the proposed study limits the discussion of data to dichotomously scored items. There are generalized DCMs that can accommodate alternative response types (e.g., Templin, Henson, Rupp, Jang, & Ahmed, 2008); however, these have not been widely used in the literature. Thus, the proposed study is limited to the types of DCMs that have been most widely investigated and used operationally: binary attributes with dichotomous item responses.

1.2 Colophon

This document was written in Rmarkdown inside RStudio (RStudio, 2018) using the rmarkdown (Allaire et al., 2017), bookdown (Xie, 2017a), and jayhawkdown (Thompson & Johnson, 2017) packages. The raw Rmarkdown was converted to HTML and PDF documents using pandoc (“Pandoc,” 2017) and the knitr package (Xie, 2017b). All graphics were created using the ggplot2 (Wickham & Chang, 2018), ggforce (Pedersen, 2016), and colorblindr (McWhite & Wilke, 2018) packages, and tables were formatted using the kableExtra package (Zhu, 2018). The website was made with jekyll (Preston-Werner, 2018) and published to Netlify (Netlify, 2018) with Travis-CI (Travis CI, 2018). The source code for this document is available on GitHub.