Simulated IPIP-10-like Likert responses

A simulated person-by-item matrix of Likert-scaled (1-5) responses built to mimic the qualitative features of a short, positively oriented Big-Five Conscientiousness subscale – in particular, the Conscientiousness items of the IPIP-50 inventory (Goldberg, 1992; Goldberg et al., 2006). It is intended as a self-contained illustration dataset for the polytomous Likert use case in the single-facet, person-by-item crossed Generalizability Theory design implemented in csem_gt.

Usage

ipip_like

Format

An integer matrix with 2000 rows (persons) and 10 columns (items). Each entry is an integer in \(\{1, 2, 3, 4, 5\}\). Columns are named item01, item02, ..., item10.

Source

Simulated to be broadly comparable to the Conscientiousness subscale of the IPIP-50 inventory, as administered in the public dataset of the Open-Source Psychometrics Project (https://openpsychometrics.org/_rawdata/). The underlying instrument is described in Goldberg, L. R. (1992) and Goldberg, L. R., Johnson, J. A., Eber, H. W., Hogan, R., Ashton, M. C., Cloninger, C. R., & Gough, H. G. (2006).

Details

These are simulated data, not real IPIP-50 responses. ipip_like was generated by drawing independent normal person, item, and residual effects under the random-effects model that Generalizability Theory assumes for the \(p \times i\) design, and then rounding and clipping the resulting scores to the 1-5 Likert metric. The pre-truncation variance components are inflated to absorb the contraction induced by the clip-and-round step, so that the post-truncation ANOVA estimates approximate the targets \(\sigma^2(p) \approx 0.434\), \(\sigma^2(i) \approx 0.136\), and \(\sigma^2(pi) \approx 1.000\), with generalizability coefficient \(E\rho^2 \approx 0.81\). These targets are representative of the Conscientiousness subscale of the IPIP-50 dataset distributed by the Open-Source Psychometrics Project (n.d.) and can be reproduced by any user with access to that public dataset via a one-way \(p \times i\) ANOVA. The matrix uses \(A = 2{,}000\) simulated persons and \(I = 10\) items, with all items oriented positively (any reverse-keying is assumed already applied so that higher scores indicate higher Conscientiousness). The full, seeded generation script is in data-raw/make_ipip_like.R.

References

Goldberg, L. R. (1992). The development of markers for the Big-Five factor structure. Psychological Assessment, 4(1), 26-42.

Goldberg, L. R., Johnson, J. A., Eber, H. W., Hogan, R., Ashton, M. C., Cloninger, C. R., & Gough, H. G. (2006). The International Personality Item Pool and the future of public-domain personality measures. Journal of Research in Personality, 40(1), 84-96.

Open-Source Psychometrics Project. (n.d.). Raw data. https://openpsychometrics.org/_rawdata/

Examples

data(ipip_like)
dim(ipip_like)
#> [1] 2000   10
ipip_like[1:5, 1:6]
#>      item01 item02 item03 item04 item05 item06
#> [1,]      3      4      5      4      4      4
#> [2,]      4      2      5      4      5      2
#> [3,]      4      5      4      4      4      4
#> [4,]      2      3      2      1      4      3
#> [5,]      5      2      5      3      4      2

# \donttest{
fit <- csem_gt(ipip_like, error_type = "relative", method = "full",
               smoother = "polynomial")
fit
#> ----------------------------------------------------------------
#> Conditional SEMs in Generalizability Theory
#> ----------------------------------------------------------------
#> Design          :  univariate single-facet (p x i, crossed)
#> Persons (n_p)   :  2000
#> G-study items   :  10
#> D-study items   :  10
#> Method          :  full
#> SE method       :  analytical
#> Smoothing       :  quadratic on observed score
#> ANOVA table
#> ----------------------------------------------------------------
#>   Effect    df              SS              MS         sigma^2
#> ----------------------------------------------------------------
#>   p         1999    10709.670200        5.357514      0.435002
#>   i            9     2490.601200      276.733467      0.137863
#>   pi       17991    18125.798800        1.007493      1.007493
#> D-study error variances and SEMs (n_i' = 10)
#> ----------------------------------------------------------------
#>   sigma^2(Delta) =   0.114536      sigma(Delta) = 0.338431  (absolute)
#>   sigma^2(delta) =   0.100749      sigma(delta) = 0.317410  (relative)
#> Reliability-like coefficients
#> ----------------------------------------------------------------
#>   Generalizability coef.    E rho^2     =   0.8119
#>   Dependability coef.       Phi         =   0.7916
#> Quadratic smoothing fits  (y = b0 + b1*score + b2*score^2)
#> --------------------------------------------------------------------------
#>   Quantity              b0         b1         b2        R^2       RMSE
#> --------------------------------------------------------------------------
#>   rel_ev_full          -0.12526    0.16103   -0.02677     0.1761    0.04126
#> Mean variance of estimator across persons
#> ----------------------------------------------------------------
#>   Quantity              Analytical
#> ----------------------------------
#>   rel_ev_full         7.243653e-03
plot(fit, plot_type = "both", cibands = "model")

# }