Article, Open Access

Normrank Correlations for Testing Associations and for Use in Latent Variable Models

  • Daniel B. Wright
Published/Copyright: April 26, 2024

Abstract

Pearson’s correlation is widely used to test for an association between two variables and also forms the basis of several multivariate statistical procedures including many latent variable models. Spearman’s ρ is a popular alternative. These procedures are compared with ranking the data and then applying the inverse normal transformation, or for short the normrank transformation. Using the normrank transformation was more powerful than Pearson’s and Spearman’s procedures when the distributions have less than normal kurtosis (platykurtic), when the distributions have greater than normal kurtosis (leptokurtic), and when the distribution is skewed. This is examined for testing if there is an association between two variables, identifying the number of factors in an exploratory factor analysis, identifying appropriate loadings in these analyses, and identifying relations among latent variables in structural equation models. R functions and their use are shown.

Pearson’s correlation is widely reported in much of the scientific literature and is the basis for many other statistical procedures. Others calculated correlations before Pearson (e.g., Bravais, Galton), but Pearson showed that the product moment correlation, which bears his name, has some valuable properties for bivariate normal distributions. Pearson (1896, p. 265), using different notation, defined the correlation between x_i and y_i as follows:

(1) r_{xy} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \sum_i (y_i - \bar{y})^2}},

where \bar{x} and \bar{y} are the sample means of the two variables. It is widely used to examine the association between two variables, and correlation matrices are used with factor analysis, structural equation modeling (SEM), etc. Intervals for these estimates and the associated p-values are reported in most statistical software. The intervals and p-values make assumptions about the variables’ population distribution (i.e., bivariate normality).

Soon after Pearson’s paper, Spearman (1904) devised a method for “characteristics that cannot be measured quantitatively” (p. 78). It has become known as Spearman’s ρ. His rationale was as follows: if variables cannot be measured quantitatively, no meaning is lost by ranking them. Spearman’s approach involves ranking the data and then applying equation (1), though in the days before computers other algorithms were available to make computation easier. It is often used when researchers do not make the bivariate normal assumption or do not want outliers and influential points to have as much impact as they do with Pearson’s correlation. These are different rationales from Spearman’s original one of not losing meaning by ranking, but the procedure is less affected by extreme values than Pearson’s r. Spearman’s ρ, as his version is called, is popular and provides more reliable estimates for distributions with outliers than Pearson’s version (e.g., de Winter, Gosling, & Potter, 2016; Edgell & Noon, 1984; Ventura-León, Peña-Calero, & Burga-León, 2022). Spearman’s procedure involves ranking both variables, assigning the mid-rank to ties, and then using the equation for Pearson’s correlation. Alternative formulas for the standard error can be used for constructing intervals and p-values (see Bonett & Wright, 2000; Ruscio, 2008).
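As a concrete illustration of "rank both variables, then apply equation (1)," here is a minimal sketch in Python (standard library only; the function names are mine, and mid-ranks are assigned to ties as Spearman's procedure specifies):

```python
from math import sqrt

def pearson_r(x, y):
    """Equation (1): the product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den

def midranks(x):
    """Rank values 1..n, assigning tied values their average (mid) rank."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    ranks = [0.0] * len(x)
    i = 0
    while i < len(x):
        j = i
        while j + 1 < len(x) and x[order[j + 1]] == x[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's procedure: mid-rank both variables, then equation (1)."""
    return pearson_r(midranks(x), midranks(y))
```

Because ranking discards spacing, any strictly increasing relationship gives Spearman's ρ of 1 even when Pearson's r would be below 1.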

Numerical transformations can be used to map the bivariate data used in a correlation such that each of the new variables is uniformly distributed (e.g., Nelsen, 2006). Ranking is one such transformation; when there are no ties, it results in a discrete uniform distribution. These ranks can then be transformed using the inverse of the normal distribution function, to make their distribution more like the normal distribution. This is often called the rank-based inverse normal transformation (Beasley, Erickson, & Allison, 2009); for short, here it will be called the normrank transformation. van der Waerden (1952) compared Student’s t test and the Mann-Whitney-Wilcoxon (MWW) test for testing the difference between the centers of two samples using this approach. The transformation he used was:

(2) \text{new } x_i = \Phi^{-1}\left( \frac{\text{rank}(x_i)}{n + 1} \right),

where \Phi^{-1} is the inverse of the standard normal distribution function. Consider the data in Table 1. The transformation increases the relative spacing for the first few values but decreases the spacing for the final few values. This is because the first few values are closer to each other than predicted for a tail of the normal distribution, while those on the right side of the table are more spread out.

Table 1

Example values before and after transforming

Ten values
Raw data −1.00 −0.90 −0.80 −0.60 −0.40 −0.20 0.00 0.50 1.00 2.00
After equation (2) −1.34 −0.91 −0.60 −0.35 −0.11 0.11 0.35 0.60 0.91 1.34
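The transformed row of Table 1 can be reproduced with a short sketch of equation (2) in Python (standard library only; the simple value-to-rank map assumes no ties, which holds for these ten values):

```python
from statistics import NormalDist

def vdw_transform(x):
    """Equation (2): rank the data, then apply the inverse standard
    normal CDF to rank/(n + 1). Assumes no tied values."""
    n = len(x)
    rank = {v: r for r, v in enumerate(sorted(x), start=1)}
    inv = NormalDist().inv_cdf
    return [inv(rank[v] / (n + 1)) for v in x]

raw = [-1.00, -0.90, -0.80, -0.60, -0.40, -0.20, 0.00, 0.50, 1.00, 2.00]
print([round(v, 2) for v in vdw_transform(raw)])
# → [-1.34, -0.91, -0.6, -0.35, -0.11, 0.11, 0.35, 0.6, 0.91, 1.34]
```

The output matches the "After equation (2)" row of Table 1.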

The transformed variables were then entered into the t-test formula. van der Waerden called this the X-test, but it is now usually called the van der Waerden or vdw test. The test is implemented in the R package PMCMRplus (Pohlert, 2020) and has been extended to compare multiple means as an alternative to the traditional one-way analysis of variance. Beasley et al. (2009) performed more extensive research comparing this transformation in the t and F test contexts.
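The transform-then-test idea can be sketched as follows (Python, standard library only). This computes a pooled-variance two-sample t statistic on the jointly transformed scores; it illustrates the idea rather than reproducing van der Waerden's exact X statistic:

```python
from statistics import NormalDist, mean, variance

def vdw_statistic(a, b):
    """Pool both samples, apply the normal-scores transform of
    equation (2) to the pooled ranks, then compute a pooled-variance
    two-sample t statistic on the scores. A sketch of the idea, not
    van der Waerden's exact X statistic. Assumes no ties."""
    pooled = list(a) + list(b)
    n = len(pooled)
    rank = {v: r for r, v in enumerate(sorted(pooled), start=1)}
    inv = NormalDist().inv_cdf
    scores = [inv(rank[v] / (n + 1)) for v in pooled]
    sa, sb = scores[:len(a)], scores[len(a):]
    sp2 = ((len(sa) - 1) * variance(sa) +
           (len(sb) - 1) * variance(sb)) / (n - 2)
    return (mean(sa) - mean(sb)) / (sp2 * (1 / len(sa) + 1 / len(sb))) ** 0.5
```

When one sample holds the lowest pooled ranks and the other the highest, the statistic is strongly negative or positive, as with an ordinary t test on the scores.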

The blom function (named for Blom, who suggested δ = 3/8 in equation (3)) from rcompanion (Mangiafico, 2020) implements a more general form of this:

(3) \text{new } x_i = \Phi^{-1}\left( \frac{\text{rank}(x_i) - \delta}{n - 2\delta + 1} \right).

When δ = 0, this is the same as equation (2), but other values have been proposed. The norm is to have δ ∈ [−1, +1), where [−1, +1) is the half-open interval: the upper limit δ = 1 would produce values of −∞ and +∞ for the ranks of 1 and n, respectively (and δ > 1 produces percentiles outside of [0, 1] for these ranks, where the inverse normal function is undefined). A simple R function for this is:

# Equation (3): rank the data, shift by delta, and apply the inverse normal
normrank <- function(x, delta)
  qnorm((rank(x) - delta) / (length(x) - 2 * delta + 1))

Similar procedures are available in SAS, SPSS, etc.
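For environments without such a built-in, the one-liner ports directly. A Python sketch of the same transformation (standard library only; assigning mid-ranks to ties mirrors R's default rank behavior, an assumption since equation (3) as written does not specify tie handling):

```python
from statistics import NormalDist

def midranks(x):
    """Rank values 1..n, assigning tied values their average (mid) rank."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    ranks = [0.0] * len(x)
    i = 0
    while i < len(x):
        j = i
        while j + 1 < len(x) and x[order[j + 1]] == x[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def normrank(x, delta):
    """Equation (3): shifted rank percentiles through the inverse normal."""
    n = len(x)
    inv = NormalDist().inv_cdf
    return [inv((r - delta) / (n - 2 * delta + 1)) for r in midranks(x)]
```

With δ = 0 this reproduces equation (2); with δ = 0.5 (the rankit method) the middle rank of an odd-length sample maps exactly to 0.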

The first simulation in this article explores different values of δ to see which δ values, in different circumstances, produce variables that are distributed most similarly to the normal distribution. In subsequent simulations, the normranked data are created using this δ. The primary purpose of this article is to show how well this transformation works when applied to two variables, as opposed to a single response variable as examined by van der Waerden (1952) and Beasley et al. (2009).

Two issues are important to raise. First, while this transformation addresses univariate normality, it does not address the assumption of bivariate normality, because both variables being normally distributed is a necessary but not a sufficient condition for bivariate normality (and similarly for multivariate normality). Second, information is lost when ranking the data. This may make it more difficult to interpret the resulting correlation as an effect size than Pearson’s version. This concern also applies to Spearman’s correlation and any rank-transformed procedure.

1 Simulation 1: Which Values of δ to Use?

The blom function in rcompanion (Mangiafico, 2020) calculates equation (3) with δ values of 0 (van der Waerden’s), π/8 ≈ 0.393 (Elfving’s), 1/3 ≈ 0.333 (Tukey’s), 3/8 = 0.375 (Blom’s), and 1/2 = 0.5 (the rankit method). Here, values in δ ∈ [0, 1) are examined: δ from 0 to 0.99 in steps of 0.01, from 0.99001 to 0.99999 in steps of 0.00001, and 0.999999999. Values very near one are used because δ = 1 produces ∞ for the rank n value, so there is interest in the transformation’s behavior near this threshold. The code for this and the other simulations is presented in Appendix A.

Five levels of discreteness in the data are examined. The first level uses data drawn from a uniform distribution with a range of (0, 10), though any continuous distribution of size n will produce the same transformed values. The next level uses integers from 1 to 20, created by doubling the first variable, adding 0.5, and rounding. Analogous approaches were used to create ten-point, five-point, and three-point integer scales, similar to those often used in surveys and personality research. This approach will produce roughly equal numbers of values in each bin. Sample sizes of n = 50, n = 100, and n = 500 were used. Thus, there are 5 × 3 = 15 conditions. The W from the Shapiro-Wilk (Shapiro & Wilk, 1965) test, as implemented in shapiro.test in R using Royston’s (1995) algorithm, is used as a measure of distance from normality. W is the square of a Pearson correlation, so it cannot exceed one.
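The discretization step can be sketched as follows (Python, standard library only; `discretize` is an illustrative name, and mapping a uniform(0, 10) draw to equal-width bins is an equal-bin equivalent of the double-add-0.5-and-round recipe described above):

```python
import math
import random

def discretize(u, k):
    """Map a uniform(0, 10) draw onto a k-point integer scale (1..k)
    with equal-width bins; for k = 20 this matches the paper's
    double-then-round construction up to bin-edge conventions."""
    return max(1, math.ceil(u * k / 10))

random.seed(1)
sample = [random.uniform(0, 10) for _ in range(500)]
# The four discrete levels: 20-, 10-, 5-, and 3-point scales
scales = {k: [discretize(u, k) for u in sample] for k in (20, 10, 5, 3)}
```

Each scale has roughly equal counts per bin, as the text describes; the continuous draws themselves form the fifth level.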

The values are stored and plotted in Figure 1, and the maximum and minimum W values, along with their δ values, are presented in Table 2. Note that for the case of continuous data, individual W values are not shown because each replication produces the same transformed data and therefore the same W. The main results are as follows:

  1. There is little effect of sample size across n = 50, n = 100, and n = 500,

  2. The more continuous a variable is, the more closely it can approximate the normal distribution, which is itself continuous,

  3. There appear to be very few differences among the Ws across the δs examined for all the discrete variables,

  4. There is an effect of δ for the continuous variable, with W reaching a maximum around δ = 0.7 (with a maximum W rounding to 1.000) and then dropping off as δ approaches 1. The dropping off does not happen for the discrete distributions because several values share the extreme bins, and therefore their mid-ranks are well above 1 or well below n. If samples were drawn from populations with fewer values in the extreme bins, they would drop off too, as sometimes only one value would fall in an extreme bin and would therefore have a rank of 1 or n, and an infinite transformed value,

  5. The δ that produces the highest W with discrete values tends to be higher than 0.7, particularly as the sample size increases.

Here, δ = 0.7 will be used, though as is clear from this figure, except when δ is near 1, the exact value does not make a large difference. One thousand replications for each δ in each condition were used.

Figure 1

Finding W values with continuous and discrete scales for different δ values, for three sample sizes. The black line is the continuous data, the red line is for the 20-point scale, the green line for the 10-point scale, the darker blue line for the 5-point scale, and the light blue line for the 3-point scale.

Table 2

Minima and Maxima W values and the corresponding δ s from the first simulation

n Discreteness W_min δ_min W_max δ_max
50 Continuous 0.813 1.000 1.000 0.710
50 20pt 0.916 1.000 0.981 0.640
50 10pt 0.945 1.000 0.955 0.770
50 5pt 0.945 0.997 0.892 0.998
50 3pt 0.945 0.992 0.787 0.998
100 Continuous 0.857 1.000 1.000 0.710
100 20pt 0.976 1.000 0.981 0.920
100 10pt 0.955 0.350 0.956 0.991
100 5pt 0.955 1.000 0.894 0.993
100 3pt 0.955 0.992 0.790 0.998
500 Continuous 0.956 1.000 1.000 0.700
500 20pt 0.982 0.040 0.982 1.000
500 10pt 0.957 0.998 0.958 0.995
500 5pt 0.957 0.999 0.896 1.000
500 3pt 0.957 0.996 0.793 0.994

2 How are Data Typically Distributed?

Simulation is used in this article to explore the value of the normrank correlation measure. Simulations examine testing for an association, exploratory factor analysis (EFA), and SEM. The aim is to show a selection of uses of the normrank procedure to illustrate its potential, rather than to provide an exhaustive set of possible data-creation scenarios.

Before discussing the main simulations reported in this article, it is necessary to examine what types of distributions are typical in education and allied disciplines. The distribution of variables used in correlations varies by discipline and by many other contextual details. The interest here is with kurtosis (for discussion of this statistic, see DeCarlo, 1997; Wright & Herrington, 2011) because of its effect on power (Wilcox, 2017). There are several types of kurtosis based on the fourth moment and more robust alternatives (e.g., Moors, 1988). These are reviewed by Joanes and Gill (1998). Type 3 kurtosis is the default of the function kurtosis found in the e1071 package (Meyer, Dimitriadou, Hornik, Weingessel, & Leisch, 2018) and is reported here.

Micceri (1989) requested data from several education sources, test publishers, and authors of psychology and education journal articles. He analyzed 440 separate distributions with respect to kurtosis. He classified their kurtosis as follows: similar to the uniform distribution (3.2% of his sample), less than the normal distribution (14.8%), similar to the normal distribution (15.8%), similar to a moderate contaminated normal distribution (17.7%), similar to an extreme contaminated normal distribution (32.5%), and similar to a double exponential distribution (16.6%). He defined a moderate contaminated distribution as 95% drawn from a normal distribution with σ = 1 and 5% drawn from a normal distribution with σ = 2 . Extreme contaminated normal was defined as 85% drawn from a normal distribution with σ = 1 and 15% drawn from a normal distribution with σ = 3 . Thus, about half of the distributions he examined had kurtosis at least that of what he called the extreme contaminated normal distribution. He found approximately 18% to have kurtosis less than normal.

Paraphrasing Poincaré, everybody believes the normal distribution assumption is justified: the mathematicians, because they think the scientists say all their data are normally distributed; the scientists, because they think the mathematicians say it is justified by the central limit theorem. Micceri (1989) titled his paper “The unicorn, the normal curve, and other improbable creatures” to stress that real data are seldom normally distributed; thus, scientists do not say that all their data are normally distributed. Tukey (1960) showed that nonnormally distributed data do affect statistical results, countering the idea that the central limit theorem makes this assumption justified in all circumstances. Tukey’s advice was to assume that “all assumptions are wrong” (Tukey, 1986, p. 72), but then to explore how problematic it is to make them and what alternatives exist. One of the main goals of this article is to examine the costs of assuming data are normally distributed and to discuss an alternative.

Four distributions are considered here to illustrate how the three correlations perform: the uniform distribution, the normal distribution, a contaminated normal distribution with 15% having σ = 3, and the χ² with df = 1, which is equivalent to the square of a standard normal variable. The normal and contaminated normal distributions are shown in Figure 2. They can appear very similar in histograms, but the tails of the contaminated normal distribution are larger (right panel). Tukey showed that this can have a large impact on the power of tests, as the standard deviations can become much larger. The excess kurtosis for these four distributions is: uniform, −1.20; normal, 0; χ₁², 12; and the contaminated normal used here, approximately 5.06. The χ₁² distribution has skewness of √8 ≈ 2.83. The others are symmetric. An example of how to perform a normrank correlation is in the first section of Appendix B.
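These kurtosis values can be checked numerically. A seeded sketch (Python, standard library only; `excess_kurtosis` is the g2 estimate m4/m2² − 3, under which the normal is 0 and χ₁² is 12):

```python
import random

def excess_kurtosis(x):
    """Sample excess kurtosis: m4 / m2^2 - 3 (the g2 estimate)."""
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n
    m4 = sum((v - m) ** 4 for v in x) / n
    return m4 / m2 ** 2 - 3

random.seed(7)
n = 200_000
uniform = [random.uniform(0, 1) for _ in range(n)]
normal = [random.gauss(0, 1) for _ in range(n)]
chisq1 = [random.gauss(0, 1) ** 2 for _ in range(n)]           # chi-square, df = 1
contam = [random.gauss(0, 3 if random.random() < 0.15 else 1)
          for _ in range(n)]                                    # 15% with sigma = 3
```

With samples this large, the estimates sit close to the population values (uniform near −1.2, contaminated normal near 5.06); the χ₁² estimate is noisier because its eighth moment, which drives the sampling variance of g2, is very large.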

Figure 2

The normal distribution ( μ = 0 , σ = 1 ) and a contaminated normal distribution that includes 15% with σ = 3 . The right panel focuses on the tails with x > 2 . This makes the heavier tail of the contaminated distribution clearer.

3 Simulation 2: Testing for an Association

Pearson’s correlation, Spearman’s correlation, and the normrank correlation are compared on whether they maintain the stated Type 1 error rate, here 5%, and on their power to detect an association. Other tests of association exist (e.g., polychoric correlations), but these three were chosen because they all apply the same procedure after transforming the original data (no transformation, ranking, or equation (3) with δ = 0.7) and because of the popularity of the first two. A concern for Pearson’s approach is that it is greatly influenced by extreme values (e.g., Wilcox, 2017). The ranking transformation used with Spearman’s correlation directly addresses this so that all values are equally spaced, provided there are no ties (ties, and discrete variables generally, are not considered in this article). Compared with ranking, equation (3) spaces the more extreme rank values further apart than the less extreme rank values. Whether these extreme values are spaced further apart than in the original data depends on the distribution of the original data (Table 1).

First, the performance of these measures is considered when there is no association. Sample sizes of 40, 60, and 100 were used. The distributions (uniform, normal, contaminated normal, and χ₁²) were the same for both variables. The χ² distribution with df = 1, which is equivalent to squaring a normally distributed variable with μ = 0 and σ = 1, was included so that one of the variables is skewed (skewness is √8 ≈ 2.83). There were 10,000 replications per condition for the remaining simulations reported in this article. The R code for these is in Appendix A and can be adapted for different sample sizes, distributions, and other parameters. The p-values are found for the Pearson, Spearman, and normrank correlations using R’s cor.test function. The nominal α is 0.05, so the proportion of p < 0.05 should be near this value. The proportions of p < 0.05 results and the 95% confidence intervals for these proportions, found with the Wilson method, are reported. Because the number of replications is large, the widths of these intervals are small. Differences of 0.01 between the correlation procedures are large enough to be discussed, and differences greater than 0.02 may be considered substantial in some contexts.

Table 3 shows, for each procedure, each distribution, and each sample size, the proportion of p-values less than 0.05 and the confidence intervals of these proportions. All are near the nominal level of 0.05. None of the individual cell proportions is more than 0.01 away from the nominal level. Therefore, it is concluded that all are adequate at controlling Type 1 error.

Table 3

Proportion of replications where p < 0.05 for the different distributions, sample sizes, and correlation measures when there is no association (i.e., when the null hypothesis is true)

Correlation measures
Pearson Spearman Normrank
Prop. 95% CI Prop. 95% CI Prop. 95% CI
n = 40
Uniform 0.048 (0.044, 0.053) 0.048 (0.044, 0.053) 0.047 (0.043, 0.051)
Normal 0.052 (0.048, 0.056) 0.053 (0.049, 0.057) 0.051 (0.047, 0.055)
Contaminated 0.055 (0.051, 0.059) 0.053 (0.049, 0.058) 0.053 (0.049, 0.058)
χ₁² 0.049 (0.045, 0.053) 0.054 (0.050, 0.058) 0.051 (0.046, 0.055)
n = 60
Uniform 0.052 (0.048, 0.057) 0.052 (0.048, 0.057) 0.053 (0.049, 0.058)
Normal 0.050 (0.046, 0.055) 0.050 (0.046, 0.054) 0.050 (0.046, 0.054)
Contaminated 0.055 (0.051, 0.060) 0.054 (0.050, 0.059) 0.053 (0.048, 0.057)
χ₁² 0.046 (0.042, 0.050) 0.053 (0.048, 0.057) 0.050 (0.046, 0.055)
n = 100
Uniform 0.049 (0.045, 0.054) 0.047 (0.043, 0.052) 0.046 (0.042, 0.050)
Normal 0.052 (0.048, 0.056) 0.049 (0.045, 0.054) 0.050 (0.046, 0.055)
Contaminated 0.051 (0.047, 0.056) 0.052 (0.048, 0.057) 0.052 (0.048, 0.057)
χ₁² 0.049 (0.045, 0.054) 0.053 (0.049, 0.058) 0.052 (0.048, 0.057)
95% confidence intervals found with Wilson’s method
Ten thousand replications per cell

This simulation was repeated, but with the null hypothesis false. The variables were created using the same distributions, but for one of the variables, 0.32 times the other was added. This results in a Pearson correlation of approximately r = 0.30, what Cohen (1992) called a medium-sized effect, for the normally distributed variables. The correlations are slightly different for the other distributions and other measures. The primary interest here is whether the different statistical procedures vary in how well they detect an association for the same data, not differences across the distributions.
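The construction can be checked directly: if x and the noise are independent standard normals and y = noise + 0.32x, the population correlation is 0.32/√(1 + 0.32²) ≈ 0.305. A seeded sketch (Python, standard library only):

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Equation (1): the product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return num / sqrt(sum((a - mx) ** 2 for a in x) *
                      sum((b - my) ** 2 for b in y))

random.seed(42)
n = 100_000
x = [random.gauss(0, 1) for _ in range(n)]
y = [random.gauss(0, 1) + 0.32 * a for a in x]  # add 0.32 times the other
r = pearson_r(x, y)  # near 0.30 for this sample size
```

The sample r lands within sampling error of the intended medium-sized effect.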

Table 4 shows the proportion of replications where the null hypothesis was correctly rejected for each procedure, each distribution, and each sample size. As expected, these proportions increase with sample size (i.e., the power increases with the sample size). For the normal distribution, Pearson’s correlation correctly rejects H₀ about 1% more often than the normrank correlation and about 4-5% more often than Spearman’s correlation. Thus, in this situation, Spearman’s is the least powerful and Pearson’s is the most powerful. With both the uniform and contaminated normal distributions, the normrank procedure had about a 3-4% advantage over Pearson’s; Spearman’s fell between the two for the contaminated normal but was least powerful for the uniform. Based on Micceri’s (1989) findings, where most distributions he analyzed had greater than normal kurtosis and a small proportion had less than normal kurtosis, it is likely that education and other social science researchers will face circumstances where the normrank is more powerful than Pearson’s procedure. In none of the situations examined was Spearman’s procedure the most powerful. It is recommended not to use Spearman’s correlation, and it will not be examined further in this article.

Table 4

Proportion of replications where p < 0.05 for the different distributions, sample sizes, and correlation measures when there is an association (i.e., when the null hypothesis is false)

Correlation measures
Pearson Spearman Normrank
Prop. 95% CI Prop. 95% CI Prop. 95% CI
n = 40
Uniform 0.497 (0.487, 0.507) 0.448 (0.438, 0.458) 0.541 (0.532, 0.551)
Normal 0.493 (0.484, 0.503) 0.446 (0.436, 0.456) 0.473 (0.463, 0.483)
Contaminated 0.497 (0.487, 0.506) 0.502 (0.492, 0.511) 0.524 (0.514, 0.533)
χ₁² 0.488 (0.478, 0.498) 0.690 (0.681, 0.699) 0.698 (0.689, 0.707)
n = 60
Uniform 0.669 (0.660, 0.679) 0.611 (0.602, 0.621) 0.736 (0.727, 0.745)
Normal 0.669 (0.660, 0.678) 0.620 (0.610, 0.629) 0.647 (0.637, 0.656)
Contaminated 0.660 (0.651, 0.670) 0.688 (0.679, 0.697) 0.700 (0.691, 0.709)
χ₁² 0.632 (0.622, 0.641) 0.861 (0.854, 0.867) 0.871 (0.864, 0.877)
n = 100
Uniform 0.882 (0.876, 0.888) 0.836 (0.828, 0.843) 0.933 (0.928, 0.938)
Normal 0.879 (0.872, 0.885) 0.839 (0.832, 0.846) 0.867 (0.860, 0.874)
Contaminated 0.864 (0.857, 0.870) 0.897 (0.891, 0.903) 0.904 (0.898, 0.910)
χ₁² 0.842 (0.835, 0.849) 0.978 (0.975, 0.980) 0.982 (0.979, 0.984)
95% confidence intervals found with Wilson’s method
Ten thousand replications per cell

4 Simulation 3: EFA

In this simulation, Pearson’s correlation and the normrank correlation are compared for EFA when there are no underlying factors and when there are one to four underlying factors with 12 observed variables. Spearman’s is not included because it was never the most powerful procedure for any distribution tested in the previous simulation.

EFA is one of the most widely used forms of the latent variable model, particularly within the educational and social sciences. It is usually the first latent variable model that students are taught. If you have a set of observed variables and believe that this set is caused by a smaller number of latent variables, but have no particular relationships hypothesized, EFA can be an appropriate tool. EFA produces two main results. First, it can suggest how many latent variables are appropriate for modeling the data; this is often done using a related technique, calculating the eigenvalues of the correlation matrix. Second, it shows which observed variables are related to the same latent variable. There are several sources for EFA, for example, in increasing level of assumed mathematical knowledge: Bartholomew, Knott, and Moustaki (2011), Loehlin and Beaujean (2017), Mair (2018), Mulaik (2010), and Wright and Wells (2020).

The simulation in this section will examine whether using a Pearson correlation matrix or a matrix composed using the normrank transformation is better able both to determine the correct number of latent variables and to identify the loadings correctly. Velicer, Eaton, and Fava (2000) (see also Auerswald & Moshagen, 2019) review different methods to determine the number of factors. Based on this, the empirical Kaiser criterion is used (Braeken & van Assen, 2017). It involves comparing the observed eigenvalues with those observed from a random data matrix. It is implemented in the EFA.dimensions package (O’Connor, 2021). To estimate whether the pattern of factor loadings is accurate, factor analysis was conducted assuming the correct number of factors. The highest loading for each observed variable was recorded. For the complete pattern of loadings to be deemed accurate, all observed variables which were influenced by the same latent variable during data creation had to have their highest loading for the same latent variable. This was done with the default settings of the R function factanal.
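The loading-accuracy criterion can be made concrete with a sketch (Python; the toy loading matrix and true assignments below are illustrative, not the simulation's actual values): every observed variable generated by the same latent variable must have its highest loading in the same fitted column.

```python
def loadings_accurate(loadings, true_factor):
    """loadings: one row per observed variable, one column per fitted factor.
    true_factor: which latent variable generated each observed variable.
    Accurate when all variables sharing a true factor peak in the same column."""
    peak = [row.index(max(row)) for row in loadings]
    seen = {}
    for tf, col in zip(true_factor, peak):
        if tf in seen and seen[tf] != col:
            # two variables from the same latent peak in different columns
            return False
        seen[tf] = col
    # natural tightening (the article states only the condition above):
    # distinct latents should claim distinct fitted columns
    return len(set(seen.values())) == len(seen)
```

For example, a matrix whose first two rows peak in column 0 and last two in column 1 is accurate for true assignment [0, 0, 1, 1]; moving one peak breaks it.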

Only a single sample size is used here. There is no agreed-upon minimum sample size for EFA (de Winter, Dodou, & Wieringa, 2009; MacCallum, Widaman, Preacher, & Hong, 2001), but several rules of thumb exist. Loehlin and Beaujean (2017) say that “with high communalities, a small number of factors, and a relatively large number of indicators per factor, Ns of 100 or less yield good recovery of factors” (p. 207). A sample size of 100 will be used with 12 observed variables, since it is near the minimum that might be considered adequate (though larger samples are recommended). Each observed variable will be created as:

observed_i = latent_i + x · noise_i,

where x is a scalar to control the amount of noise (see below).
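The data-creation step can be sketched as follows (Python, standard library only; plain normal draws stand in for the contaminated distributions used in the article, and three indicators per latent variable is an illustrative grouping):

```python
import random

def make_observed(n, n_latent, per_latent, noise_scalar):
    """observed_i = latent_i + x * noise_i, with `per_latent` observed
    indicators sharing each latent variable."""
    data = []
    for _ in range(n):
        latents = [random.gauss(0, 1) for _ in range(n_latent)]
        row = [latents[f] + noise_scalar * random.gauss(0, 1)
               for f in range(n_latent) for _ in range(per_latent)]
        data.append(row)
    return data

random.seed(3)
obs = make_observed(1000, 4, 3, 1.2)  # 4 latents, 12 observed, x = 1.2
```

Indicators sharing a latent variable covary (their common latent contributes variance 1), while indicators from different latents are uncorrelated in expectation.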

The latent variables and the noise variables are each drawn from one of several contaminated normal distributions. Because Micceri (1989) found the preponderance of the variables he examined to have greater than normal kurtosis, this simulation will focus on distributions with varying amounts of contamination for the contaminated normal distribution. In addition, results for the χ₁² distribution are presented for comparison to show a skewed distribution.

Contamination for contaminated normal distributions can be created in different ways. Here, the standard deviations of the two component distributions are fixed at σ₁ = 1 and σ₂ = 3, and the proportion drawn from each is varied between 0 and 1 in 101 steps of 0.01, with the endpoints being (uncontaminated) normal distributions. The kurtosis for these different contaminated normal distributions is shown in Figure 3. The maximum kurtosis occurs when about 10% is drawn from the normal distribution with σ = 3. The uniform distribution is presented for comparison.
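The kurtosis curve has a closed form. For a scale mixture drawing proportion p from N(0, 3²) and 1 − p from N(0, 1), the central moments are m2 = 1 + 8p and m4 = 3(1 + 80p), so the excess kurtosis is m4/m2² − 3. A short check that the maximum sits at about p = 0.10 and that p = 0.15 gives the ≈ 5.06 quoted earlier (Python, standard library only):

```python
def mixture_excess_kurtosis(p, sigma=3.0):
    """Excess kurtosis of a normal scale mixture: proportion p drawn
    from N(0, sigma^2), the rest from N(0, 1)."""
    m2 = (1 - p) + p * sigma ** 2
    m4 = 3 * ((1 - p) + p * sigma ** 4)
    return m4 / m2 ** 2 - 3

# The 101 mixture steps used in the simulation
grid = [i / 100 for i in range(101)]
best_p = max(grid, key=mixture_excess_kurtosis)  # peak near p = 0.10
```

Both endpoints (p = 0 and p = 1) are pure normals with excess kurtosis 0, matching the ends of the curve in Figure 3.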

Figure 3

Kurtosis for normal distributions with σ = 1 for proportions from 0 to 1 with the remainder being drawn from normal distributions with σ = 3 , as well as the kurtosis for the uniform distribution.

Different amounts of noise are used for the different numbers of latent variables to prevent results being at ceiling (correctly identifying that there are four latent variables is more difficult than identifying that there is one). The scalars for noise are x = 2.5 , x = 2 , x = 1.5 , and x = 1.2 for one, two, three, and four latent variables, respectively.

Figures 4 and 5 show that when the data are all produced by uncontaminated normal distributions (mixture proportions of 0 and 1), Pearson’s and the normrank correlation perform similarly, both for correctly identifying the number of factors and for identifying the correct pattern of loadings. They also perform similarly while at least about 50% is drawn from the component with the larger standard deviation; below that, the kurtosis becomes noticeably larger (Figure 3). For these higher-kurtosis distributions, the normrank procedure was more accurate both for identifying the correct number of factors and for identifying the correct pattern of factor loadings.

Figure 4

Proportion of replications identifying the correct number of latent factors. Note that y -axes are different for the five plots. The proportion from the population with σ = 3 increased from 0 to 1 in steps of 0.01, with 10,000 replications at each step.

Figure 5

Proportion of time identifying the correct loadings (i.e., the highest loading being the same as the others influenced by the same factor). The proportion from the population with σ = 3 increased from 0 to 1 in steps of 0.01, with 10,000 replications at each step. Note that the y -axis scales are different.

Table 5 shows the results when the data are distributed χ₁². When more than one factor is used to create the data, the normrank correlation is more accurate than Pearson’s correlation both for identifying the number of factors and for identifying the factor loading pattern.

Table 5

Results using χ₁² distributions for exploratory factor analyses with the data created from different numbers of latent variables. There were 10,000 replications.

                      Number of true factors
                      0               1               2               3               4
Finding the correct number of factors
Pearson   Proportion  0.963           0.747           0.777           0.440           0.165
          95% CI      (0.959, 0.966)  (0.739, 0.756)  (0.768, 0.785)  (0.431, 0.450)  (0.158, 0.173)
Normrank  Proportion  0.974           0.994           0.988           0.777           0.327
          95% CI      (0.971, 0.977)  (0.992, 0.995)  (0.986, 0.990)  (0.769, 0.786)  (0.318, 0.336)
Finding the correct loadings
Pearson   Proportion                                  0.791           0.727           0.691
          95% CI                                      (0.783, 0.799)  (0.718, 0.735)  (0.682, 0.700)
Normrank  Proportion                                  0.973           0.936           0.896
          95% CI                                      (0.969, 0.976)  (0.931, 0.941)  (0.889, 0.901)

There are many different parameters that could be used to construct data for EFA and different ways to assess the accuracy of the results. The results presented here are just to illustrate the use of the two correlation procedures. Consistent with the correlation simulation, when the data derive from normal distributions, Pearson’s and the normrank correlation perform similarly, but for nonnormal distributions, both those with less than normal kurtosis (the uniform distributions) and those with greater than normal kurtosis (the contaminated normal distributions), the normrank performed better. Given Micceri’s (1989) finding that most of the distributions he analyzed in psychology and education were not similar to the normal distribution, and that most of these had greater than normal kurtosis, the recommendation here is that the normrank correlation should be preferred over Pearson’s correlation when conducting EFA for many psychology and education data sets.

5 Simulation 4: SEM

SEM can take numerous forms. It allows the researcher to hypothesize relationships among latent variables. However, it is a complex analysis that should be used cautiously.

When we come to models for relationships between latent variables we have reached a point where so much has to be assumed that one might justly conclude that the limits of scientific usefulness have been reached, if not exceeded.

Bartholomew et al. (2011, p. 228)

A good applied introduction to SEM using R is Beaujean (2014), with more coverage in Loehlin and Beaujean (2017). The manual for lavaan (Rosseel, 2012), one of the main R SEM packages, is also a good resource. Here, a single example is used to show how the normrank correlation compares with Pearson’s correlation using three distributions (uniform, normal, contaminated normal) for a relatively simple SEM model, and to explore whether the findings are consistent with the earlier simulations. There are numerous variations of SEM models that could be explored further. As with the other simulations, within each data creation scenario, all the random variables have the same distribution: uniform (range 0 to 1), normal (μ = 0, σ = 1), or contaminated normal (85% drawn from a normal with μ = 0 and σ = 1 and 15% drawn from a normal with μ = 0 and σ = 3).

The data will be created from the model depicted in Figure 6. There are three latent variables, lv1 to lv3, each influencing three observed variables, labeled o1 to o9. lv1 influences lv2, and lv2 influences lv3. For half of the simulations, lv1 also influences lv3. The statistical models with and without this effect are compared, and each replication is evaluated for whether the effect is detected at p < 0.05. Given the nature of SEM modeling, sometimes models fail to converge. Often alternative estimation procedures and different control settings can be used to achieve convergence. Here, the BFGS (Broyden-Fletcher-Goldfarb-Shanno) algorithm is used, but some errors persisted and are treated as missing. Only 0.12% of the 20,000 replications failed to produce a result (Table 6).

Figure 6

Causal model used to create the data for SEM simulation. The effect corresponding with the arrow surrounded by the dashed ellipse is included for half the samples.

Table 6

Proportion of replications detecting a direct effect from lv1 to lv3 (the effect in the dashed ellipse in Figure 6). For half of the simulations the effect is absent, so these are incorrect decisions (top half of the table) and should be near the α = 0.05 level. For the other half the effect is present, so these are correct decisions (bottom half). There were 10,000 replications for each condition.

                        Uniform           Normal            Contaminated
Null hypothesis true (lv1 → lv3 not present)
Pearson’s  Proportion   0.170             0.080             0.240
           95% CI       (0.162, 0.177)    (0.075, 0.086)    (0.232, 0.249)
Normrank   Proportion   0.150             0.082             0.219
           95% CI       (0.143, 0.157)    (0.077, 0.088)    (0.211, 0.227)
Null hypothesis false (lv1 → lv3 present)
Pearson’s  Proportion   0.484             0.495             0.351
           95% CI       (0.474, 0.494)    (0.485, 0.505)    (0.341, 0.360)
Normrank   Proportion   0.496             0.494             0.418
           95% CI       (0.487, 0.506)    (0.484, 0.504)    (0.408, 0.428)

The primary SEM results can be summarized as follows: Pearson’s and the normrank correlations produce similar findings when the distributions are uniform or normal, but the normrank approach performs better both for not rejecting a true null hypothesis (by a couple of percentage points) and for correctly rejecting a false null hypothesis (by about seven percentage points) when the distributions have high kurtosis. Large differences were found among the distributions for incorrectly rejecting a true null (i.e., the Type 1 error proportions). In the uniform conditions, the true null was incorrectly rejected more often than the nominal level of 0.05, in the normal conditions it was rejected slightly more than the nominal level, and in the contaminated normal conditions it was rejected much more often. These differences highlight the importance of considering how the distributions underlying the data (real and simulated) can affect statistical models.

6 Summary

Pearson’s product moment correlation is one of the most used statistics in all of science. It is highly influenced by extreme values, which makes it less reliable for some distributions. Spearman’s rank correlation is a popular alternative when there are extreme outliers and the analyst wants to limit their impact. The normrank correlation, using what is sometimes called the rank-based inverse normal transformation, results in data more closely resembling a normal distribution.

Four simulations were conducted. The first showed which variation of the normrank transformation to use. For continuous distributions, δ = 0.7 appeared to perform well, and this value was used in the remaining simulations. The second simulation examined how well Pearson’s, Spearman’s, and the normrank correlation performed at detecting an association. Spearman’s was never the best of these three, so the recommendation is not to use it. Pearson’s was more powerful than the normrank correlation with normally distributed data, but the difference was fairly small. The normrank correlation was more powerful than Pearson’s correlation when the distribution had sub- or super-normal kurtosis. Because Micceri (1989) found most distributions in psychology and education had greater than normal kurtosis, the normrank correlation is likely to be more powerful and should be considered. However, it is important to stress that Pearson’s approach was better for normally distributed data; as has been shown in other contexts (e.g., Beasley et al., 2009), using the normrank transformation will not always produce more powerful statistical tests. A further reservation for all procedures involving ranking is that information about the original variable metric is lost.

The final two simulations compared the use of Pearson’s correlation with the normrank correlation in factor analysis and SEM. The results were similar to the second simulation. When the data are drawn from normal distributions, the procedures perform similarly, but when they are drawn from nonnormal distributions, with kurtosis much different from the normal distribution, the normrank performed better.

It is concluded, therefore, that:

  1. Spearman’s procedure should not be used.

  2. The normrank should be considered unless the researcher has strong reasons to believe their data are drawn from near-normal distributions, but researchers may also wish to consider numeric transformations.

It is important to stress that this article only addresses one alternative to Pearson’s correlation on untransformed variables. The article shows that the normrank transformation can improve the subsequent statistics, but there are other approaches. First, if there is a particular substantive model that is consistent with a particular transformation (e.g., count data, which according to Mosteller & Tukey, 1977, may profit from the ln transformation), this should be considered. Second, the Box-Cox transformation (Box & Cox, 1964) usually makes data more similar to the normal distribution and is often used. If you have a particular model, whether linear or otherwise, about the relationship between the variables, no transformation involving ranking should be used and the applicable model should be evaluated. If the worry is about the impact of extreme values, more robust loss functions could be used (e.g., Hoaglin et al., 2000), and if the worry is about not meeting the distributional assumptions, bootstrapping could be used (e.g., Efron & Tibshirani, 1993; Wright, London, & Field, 2011). In short, there are many alternatives to conducting Pearson’s correlation on untransformed variables, and in some situations, the normrank transformation should be considered.
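To make the Box-Cox alternative mentioned above concrete, the following is a minimal sketch using the boxcox function from the MASS package; the skewed variable x is hypothetical and not part of the article’s simulations:

```r
# Hedged sketch: estimating and applying a Box-Cox transformation with MASS.
# The data x are hypothetical; any strictly positive variable would do.
library(MASS)
set.seed(1)
x <- rchisq(200, df = 1)                 # positively skewed, all values > 0
bc <- boxcox(lm(x ~ 1), plotit = FALSE)  # profile log-likelihood over lambda
lambda <- bc$x[which.max(bc$y)]          # lambda with the highest likelihood
xt <- if (abs(lambda) < 1e-8) log(x) else (x^lambda - 1)/lambda
```

The transformed variable xt should be closer to normal than x, and, unlike a ranking transformation, it preserves a rescaled version of the original metric.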

  1. Funding information: There was no funding for this research.

  2. Conflict of interest: Author states no conflict of interest.

  3. Data availability statement: In line with the goals of this journal to have access to as much evidence as possible (Wright, 2020), the R code to reproduce the simulations is in Appendix A. Further, the final submitted document, in full, is on the author’s GitHub page (https://github.com/dbrookswr/normrank/blob/main/README.md). It includes LaTeX and R, both freely available, knitted together using knitr (Xie, 2015).

Appendix A Code for Simulations

The final submitted document is available as a reproducible knitr document (Xie, 2015) at the author’s GitHub page (https://github.com/dbrookswr/normrank/blob/main/README.md). The code for the simulations is shown below.

A.1 Simulation 1

Below is the code for when n = 50 to find values for δ that produce variables that are distributed similarly to the normal distribution. The simulations for n = 100 and n = 200 are identical except that the sample size is changed.

normrank <- function(x , delta)
qnorm((rank(x) - delta)/(length(x) - 2 * delta + 1))
set.seed(5859)
reps  <-  k  <-  1000
# columns: trial number, n, discreteness, seqvals (the delta values), Shapiro-Wilk W
vals2  <-  matrix(nrow = 5*k*length(seqvals), ncol = 5)
count  <-  0
for (ww in 1:5) {
for (i in 1:reps) {
for (j in 1:length(seqvals)) {
count  <-  count + 1
n  <-  50
x  <-  runif(n,0,10)
if(ww == 2) x  <-  round(2*x+.5)
if(ww == 3) x  <-  round(x +.5)
if(ww == 4) x  <-  round(x/2 +.5)
if(ww == 5) x  <-  round(3*x/10 +.5)
newx  <-  normrank(x,delta=seqvals[j])
w  <-  shapiro.test (newx)$statistic
vals2[count,]  <-  c (count,n,ww,seqvals[j],w)
} } }

A.2 Simulation 2

These are the functions used to make the contaminated normal distributions and calculate the different correlations:

rmixnorm  <-  function(n , mix =.85, wide = 3) {
if (abs(round(n*mix)) != n*mix)
warning("Sample does not divide by mix. Will get close.")
g1  <-  abs(round(n*mix))
sample(c(rnorm (g1), rnorm(n-g1, sd=wide)))
}
normcorr  <-  function(x , y) cor.test(normrank(x,.7),normrank(y,.7))
sigbw  <-  function(r , n) sqrt((1 + r^2/2)/(n-3))
cibw  <-  function(r , n)
tanh(atanh(r) + c(-1,1)*sigbw(r,n) * qt(.975, n-2))
corr3  <-  function(x , y , asvec =TRUE) {
pear  <-  cor.test(x,y)
spear  <-  cor.test(x,y, method = "spearman")
spearmanCI  <-  cibw(spear$estimate, length(x))
norm  <-  normcorr(x,y)
if (asvec) {
pcis  <-  c(pear$p.value, pear$conf.int[1], pear$conf.int[2],
spear$p.value, spearmanCI[1], spearmanCI[2],
norm$p.value, norm$conf.int[1], norm$conf.int[2])
} else {
pcis  <-  matrix(c(pear$p.value, pear$conf.int[1], pear$conf.int[2],
spear$p.value, spearmanCI[1], spearmanCI[2],
norm$p.value, norm$conf.int[1], norm$conf.int[2]), ncol = 3)
rownames(pcis)  <-  c("p" , "lb" , "ub")
colnames(pcis)  <-  c("Pearson" , "Spearman" , "normCorr")
}
return(pcis)
}

This is the code used to create the data for the correlation simulation with H 0 true and n = 40 . The n is changed for the other two sample sizes.

k  <-  10000
n  <-  40
vals1_40  <-  array(dim = c(k,10, length(dists)),
dimnames =
list(c(paste0("replics" ,1:k)),
c("i" , "Pearp" , "Pearlb" , "Pearub" ,
"Spearp" , "Spearlb" , "Spearub" ,
"normp" , "normlb" , "normub"),
c("unif" , "norm" , "mixnorm" , "chi1")))
for (j in 1:length(dists))
for (i in 1:k) {
set.seed(8184+i)
x  <-  dists[[j]](n)
y  <-  dists[[j]](n) # + part of x if power
cc  <-  corr3 (x,y)
vals1_40[i, , j]  <-  c(i, cc)
}

A.3 Simulation 3

The data were created for the EFA with the following functions, one each depending on the number of latent variables.

create0data  <-  function(n , mix , sigma =3, noise =1) {
ov1  <-  noise*rcnorm(n,mix, sigma =sigma)
ov2  <-  noise*rcnorm(n,mix, sigma =sigma)
ov3  <-  noise*rcnorm(n,mix, sigma =sigma)
ov4  <-  noise*rcnorm(n,mix, sigma =sigma)
ov5  <-  noise*rcnorm(n,mix, sigma =sigma)
ov6  <-  noise*rcnorm(n,mix, sigma =sigma)
ov7  <-  noise*rcnorm(n,mix, sigma =sigma)
ov8  <-  noise*rcnorm(n,mix, sigma =sigma)
ov9  <-  noise*rcnorm(n,mix, sigma =sigma)
ov10  <-  noise*rcnorm(n,mix, sigma =sigma)
ov11  <-  noise*rcnorm(n,mix, sigma =sigma)
ov12  <-  noise*rcnorm(n,mix, sigma =sigma)
return(cbind(ov1,ov2,ov3,ov4,ov5,ov6,ov7,ov8,ov9,ov10,ov11,ov12))
}
create1data  <-  function(n , mix , sigma =3, noise =1) {
lv1  <-  rcnorm(n,mix, sigma =sigma)
ov1  <-  lv1 + noise*rcnorm(n,mix, sigma =sigma)
ov2  <-  lv1 + noise*rcnorm(n,mix, sigma =sigma)
ov3  <-  lv1 + noise*rcnorm(n,mix, sigma =sigma)
ov4  <-  lv1 + noise*rcnorm(n,mix, sigma =sigma)
ov5  <-  lv1 + noise*rcnorm(n,mix, sigma =sigma)
ov6  <-  lv1 + noise*rcnorm(n,mix, sigma =sigma)
ov7  <-  lv1 + noise*rcnorm(n,mix, sigma =sigma)
ov8  <-  lv1 + noise*rcnorm(n,mix, sigma =sigma)
ov9  <-  lv1 + noise*rcnorm(n,mix, sigma =sigma)
ov10  <-  lv1 + noise*rcnorm(n,mix, sigma =sigma)
ov11  <-  lv1 + noise*rcnorm(n,mix, sigma =sigma)
ov12  <-  lv1 + noise*rcnorm(n,mix, sigma =sigma)
return(cbind(ov1,ov2,ov3,ov4,ov5,ov6,ov7,ov8,ov9,ov10,ov11,ov12))
}
create2data  <-  function(n , mix , sigma =3, noise =1) {
lv1  <-  rcnorm(n,mix, sigma =sigma)
lv2  <-  rcnorm(n,mix, sigma =sigma)
ov1  <-  lv1 + noise*rcnorm(n,mix, sigma =sigma)
ov2  <-  lv1 + noise*rcnorm(n,mix, sigma =sigma)
ov3  <-  lv1 + noise*rcnorm(n,mix, sigma =sigma)
ov4  <-  lv1 + noise*rcnorm(n,mix, sigma =sigma)
ov5  <-  lv1 + noise*rcnorm(n,mix, sigma =sigma)
ov6  <-  lv1 + noise*rcnorm(n,mix, sigma =sigma)
ov7  <-  lv2 + noise*rcnorm(n,mix, sigma =sigma)
ov8  <-  lv2 + noise*rcnorm(n,mix, sigma =sigma)
ov9  <-  lv2 + noise*rcnorm(n,mix, sigma =sigma)
ov10  <-  lv2 + noise*rcnorm(n,mix, sigma =sigma)
ov11  <-  lv2 + noise*rcnorm(n,mix, sigma =sigma)
ov12  <-  lv2 + noise*rcnorm(n,mix, sigma =sigma)
return(cbind(ov1,ov2,ov3,ov4,ov5,ov6,ov7,ov8,ov9,ov10,ov11,ov12))
}
create3data  <-  function(n , mix , sigma =3, noise =1) {
lv1  <-  rcnorm(n,mix, sigma =sigma)
lv2  <-  rcnorm(n,mix, sigma =sigma)
lv3  <-  rcnorm(n,mix, sigma =sigma)
ov1  <-  lv1 + noise*rcnorm(n,mix, sigma =sigma)
ov2  <-  lv1 + noise*rcnorm(n,mix, sigma =sigma)
ov3  <-  lv1 + noise*rcnorm(n,mix, sigma =sigma)
ov4  <-  lv1 + noise*rcnorm(n,mix, sigma =sigma)
ov5  <-  lv2 + noise*rcnorm(n,mix, sigma =sigma)
ov6  <-  lv2 + noise*rcnorm(n,mix, sigma =sigma)
ov7  <-  lv2 + noise*rcnorm(n,mix, sigma =sigma)
ov8  <-  lv2 + noise*rcnorm(n,mix, sigma =sigma)
ov9  <-  lv3 + noise*rcnorm(n,mix, sigma =sigma)
ov10  <-  lv3 + noise*rcnorm(n,mix, sigma =sigma)
ov11  <-  lv3 + noise*rcnorm(n,mix, sigma =sigma)
ov12  <-  lv3 + noise*rcnorm(n,mix, sigma =sigma)
return(cbind(ov1,ov2,ov3,ov4,ov5,ov6,ov7,ov8,ov9,ov10,ov11,ov12))
}
create4data  <-  function(n , mix , sigma =3, noise =1) {
lv1  <-  rcnorm(n,mix, sigma =sigma)
lv2  <-  rcnorm(n,mix, sigma =sigma)
lv3  <-  rcnorm(n,mix, sigma =sigma)
lv4  <-  rcnorm(n,mix, sigma =sigma)
ov1  <-  lv1 + noise*rcnorm(n,mix, sigma =sigma)
ov2  <-  lv1 + noise*rcnorm(n,mix, sigma =sigma)
ov3  <-  lv1 + noise*rcnorm(n,mix, sigma =sigma)
ov4  <-  lv2 + noise*rcnorm(n,mix, sigma =sigma)
ov5  <-  lv2 + noise*rcnorm(n,mix, sigma =sigma)
ov6  <-  lv2 + noise*rcnorm(n,mix, sigma =sigma)
ov7  <-  lv3 + noise*rcnorm(n,mix, sigma =sigma)
ov8  <-  lv3 + noise*rcnorm(n,mix, sigma =sigma)
ov9  <-  lv3 + noise*rcnorm(n,mix, sigma =sigma)
ov10  <-  lv4 + noise*rcnorm(n,mix, sigma =sigma)
ov11  <-  lv4 + noise*rcnorm(n,mix, sigma =sigma)
ov12  <-  lv4 + noise*rcnorm(n,mix, sigma =sigma)
return(cbind(ov1,ov2,ov3,ov4,ov5,ov6,ov7,ov8,ov9,ov10,ov11,ov12))
}

The simulation for the situation with no latent variables was conducted with the following. The other simulations used the other functions, from above, to create the data.

set.seed(3940)
k  <-  10000
n  <-  100
vals0  <-  array(dim = c(k,5, length(mixvals)),
dimnames =
list(c(paste0("replics" ,1:k)),
c("i" , "nf1EMPKC" , "nf2EMPKC" , "load1" , "load2"),
paste0("mix" ,mixvals)))
for (j in 1:length(mixvals)) {
for (i in 1:k) {
set.seed(2712+i)
x  <-  create0data(n,mixvals[j])
vals0[i, , j]  <-  c(i, fa(x, fac =0))
} }

A.4 Simulation 4

The following creates the SEM data.

makesemu  <-  function(n =200, noise =1, effect =TRUE) {
lv1  <-  rchisq(n,df =1)
o1  <-  lv1 + noise*rchisq(n,df =1)
o2  <-  lv1 + noise*rchisq(n,df =1)
o3  <-  lv1 + noise*rchisq(n,df =1)
lv2  <-  scale(lv1 + noise*rchisq(n,df =1))
o4  <-  lv2 + noise*rchisq(n,df =1)
o5  <-  lv2 + noise*rchisq(n,df =1)
o6  <-  lv2 + noise*rchisq(n,df =1)
ifelse(effect,
lv3  <-  scale(lv1/2 + lv2/2 + noise*rchisq(n,df =1)),
lv3  <-  scale(lv2 + noise*rchisq(n,df =1)))
o7  <-  lv3 + noise*rchisq(n,df =1)
o8  <-  lv3 + noise*rchisq(n,df =1)
o9  <-  lv3 + noise*rchisq(n,df =1)
x  <-  cbind(o1,o2,o3,o4,o5,o6,o7,o8,o9)
colnames(x)  <-  paste0("o" ,1:9)
return(x)
}
makesemn  <-  function(n =200, noise =1, effect =TRUE) {
lv1  <-  rnorm (n)
o1  <-  lv1 + noise*rnorm (n)
o2  <-  lv1 + noise*rnorm (n)
o3  <-  lv1 + noise*rnorm (n)
lv2  <-  scale(lv1 + noise*rnorm(n))
o4  <-  lv2 + noise*rnorm (n)
o5  <-  lv2 + noise*rnorm (n)
o6  <-  lv2 + noise*rnorm (n)
ifelse(effect,
lv3  <-  scale(lv1/2 + lv2/2 + noise*rnorm(n)),
lv3  <-  scale(lv2 + noise*rnorm(n)))
o7  <-  lv3 + noise*rnorm (n)
o8  <-  lv3 + noise*rnorm (n)
o9  <-  lv3 + noise*rnorm (n)
x  <-  cbind (o1,o2,o3,o4,o5,o6,o7,o8,o9)
colnames(x)  <-  paste0("o" ,1:9)
return(x)
}
makesemcn  <-  function(n =200, noise =1, effect =TRUE) {
lv1  <-  rcnorm (n)
o1  <-  lv1 + noise*rcnorm(n,mix =.85)
o2  <-  lv1 + noise*rcnorm(n,mix =.85)
o3  <-  lv1 + noise*rcnorm(n,mix =.85)
lv2  <-  scale(lv1 + noise*rcnorm(n,mix =.85))
o4  <-  lv2 + noise*rcnorm(n,mix =.85)
o5  <-  lv2 + noise*rcnorm(n,mix =.85)
o6  <-  lv2 + noise*rcnorm(n,mix =.85)
ifelse(effect,
lv3  <-  scale(lv1/2 + lv2/2 + noise*rcnorm(n,mix =.85)),
lv3  <-  scale(lv2 + noise*rcnorm(n,mix =.85)))
o7  <-  lv3 + noise*rcnorm(n,mix =.85)
o8  <-  lv3 + noise*rcnorm(n,mix =.85)
o9  <-  lv3 + noise*rcnorm(n,mix =.85)
x  <-  cbind (o1,o2,o3,o4,o5,o6,o7,o8,o9)
colnames(x)  <-  paste0("o" ,1:9)
return(x)
}

These are the two models compared:

# These are defined outside of the loop
model1  <-  '
# measurement model
lv1 =~ o1 + o2 + o3
lv2 =~ o4 + o5 + o6
lv3 =~ o7 + o8 + o9
# regressions
lv2 ~ lv1
lv3 ~ lv1 + lv2
'
model2  <-  '
# measurement model
lv1 =~ o1 + o2 + o3
lv2 =~ o4 + o5 + o6
lv3 =~ o7 + o8 + o9
# regressions
lv2 ~ lv1
lv3 ~ lv2
'

This shows the fitting procedure for the models:

semf  <-  function(model , corr , n) {
fit  <-  tryCatch(sem(model =model, sample.cov =corr, sample.nobs =n,
optim.method = "BFGS" , optim.force.converged =TRUE),
error = function(e) { } )
return (fit)
}
av  <-  function(fit1 , fit2) {
x  <-  tryCatch(anova(fit1,fit2)$Pr[2], error = function(e) { })
if (is.null(x)) x  <-  NA
return(x)
}

This is the simulation when the effect is present:

k  <-  10000
set.seed(8477)
semvalsET  <-  matrix(ncol =7, nrow =k)
semvalsET[,1]  <-  1:k
for (i in 1:k) {
egchif  <-  makesemu()
egnorm  <-  makesemn()
egcnorm  <-  makesemcn()
cmu1  <-  cor (egchif)
cmu2  <-  ncorrmat (egchif)
cmn1  <-  cor (egnorm)
cmn2  <-  ncorrmat (egnorm)
cmcn1  <-  cor (egcnorm)
cmcn2  <-  ncorrmat (egcnorm)
fitup1  <-  semf(model1,cmu1, nrow(egchif))
fitup2  <-  semf(model2,cmu1, nrow(egchif))
semvalsET[i,2]  <-  av (fitup1,fitup2)
fitua1  <-  semf(model1,cmu2, nrow(egchif))
fitua2  <-  semf(model2,cmu2, nrow(egchif))
semvalsET[i,3]  <-  av (fitua1,fitua2)
fitnp1  <-  semf(model1,cmn1, nrow(egnorm))
fitnp2  <-  semf(model2,cmn1, nrow(egnorm))
semvalsET[i,4]  <-  av (fitnp1,fitnp2)
fitna1  <-  semf(model1,cmn2, nrow(egnorm))
fitna2  <-  semf(model2,cmn2, nrow(egnorm))
semvalsET[i,5]  <-  av (fitna1,fitna2)
fitcnp1  <-  semf(model1,cmcn1, nrow(egcnorm))
fitcnp2  <-  semf (model2,cmcn1, nrow(egcnorm))
semvalsET[i,6]  <-  av (fitcnp1,fitcnp2)
fitcna1  <-  semf(model1,cmcn2, nrow(egcnorm))
fitcna2  <-  semf(model2,cmcn2, nrow(egcnorm))
semvalsET[i,7]  <-  av (fitcna1,fitcna2)
}

B Examples Using Normrank in R

B.1 From Simulation 2: Finding the Normrank Correlation

The proposed alternative is implemented in three steps: (1) decide how to treat missing values, (2) transform the variables, and (3) run Pearson’s correlation. Particularly for procedures like structural equation modeling that use correlation matrices (see below), imputing missing values can be desirable, but it should be done cautiously (Rubin, 1987).

The example dataset used is EDUCATION (Petrie, 2020). It is a data set from a regression course at the University of Tennessee, and just the male student data are used here (there are no missing data). The interest is in the association between height and two academic measures: college GPA and ACT scores. Figure A1 shows the scatter plots for these associations. Once the data are loaded, the transformations can be done in R with:

Figure A1

Scatter plots for height with college GPA and ACT scores for male students in a regression class.

newHeight  <-  qnorm(rank(Height)/(length(Height) + 1))
newCGPA  <-  qnorm((rank(CollegeGPA) - .7)/(length(CollegeGPA) - 2*.7 + 1))
newACT  <-  qnorm((rank(ACT) - .7)/(length(ACT) - 2*.7 + 1))

If you plan to use this transformation often or use it in other functions it can be useful to write a function for it.

normrank  <-  function(x , delta)
qnorm((rank(x) - delta)/(length(x) - 2 * delta + 1))

The final step is conducting the correlation. This can be done with the R functions cor or cor.test. There is a negative association between height and college GPA, but the association between height and ACT scores is near zero. The cor.test function provides more information than cor, including p -values and confidence intervals.

cor.test (newHeight,newCGPA)
##
## Pearson’s product-moment correlation
##
## data: newHeight and newCGPA
## t = −3.8639, df = 605, p-value = 0.0001236
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.23190484 -0.07655196
## sample estimates:
## cor
## -0.1551877
cor.test (newHeight,newACT)
##
## Pearson’s product-moment correlation
##
## data: newHeight and newACT
## t = 1.612, df = 605, p-value = 0.1075
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.01425778 0.14422812
## sample estimates:
## cor
## 0.06539758

These steps can be combined:

ncorr  <-  function(x , y , delta =.7) {
if (any(is.na(cbind(x,y))))
warning("Some missing values. NAs treated as maxima.")
return(cor(normrank(x,delta =delta),normrank(y,delta=delta)))
}

Note that by default the R rank function treats missing values as maxima, even higher than Inf:

rank(c(1,NA,Inf,2))
## [1] 1 4 3 2

This could lead to unexpected results, so it is important to address missing values if this behavior is not appropriate. You can set na.last = "keep" in the rank function and use = "pairwise.complete.obs" in the cor function if you want to exclude missing data pairwise.
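A hedged variant of the normrank function that excludes missing values rather than ranking them as maxima might look like the following (the name normrank_na is hypothetical, not from the article):

```r
# Hedged sketch: a normrank variant that keeps NAs as NA rather than
# treating them as the largest values.
normrank_na <- function(x, delta = .7) {
  r <- rank(x, na.last = "keep")  # NAs get NA ranks instead of the largest ranks
  n <- sum(!is.na(x))             # base the denominator on observed values only
  qnorm((r - delta)/(n - 2*delta + 1))
}
```

Then, for example, cor(normrank_na(x), normrank_na(y), use = "pairwise.complete.obs") excludes the missing pairs.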

B.2 EFA in R with the Normrank Correlation

Consider the fictional data Field (2018) created for an SPSS anxiety scale available at www.discoveringstatistics.com/statistics-hell-p/veritas-advanced-topics/factor-analysis-and-pca/. This is an SPSS file, so it needs to be read into R using one of the packages that does this, for example, foreign (R Core Team, 2022) or haven (Wickham & Miller, 2021). Once read, the data are called spssanx. Only the first 23 columns are read in; these are the scale’s questions. It is convenient to have a function that creates correlation matrices using the normrank correlation, calling the normrank function shown earlier.
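The reading step could be sketched as follows with haven; the file name "SAQ.sav" is an assumption, so substitute whatever name the downloaded file has:

```r
# Hedged sketch: read the SPSS file and keep the 23 question columns.
# "SAQ.sav" is a hypothetical file name for the downloaded data.
library(haven)
spssanx <- as.data.frame(read_sav("SAQ.sav"))[, 1:23]
```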

ncorrmat  <-  function(x) {
if (any(is.na(x)))
warning("Some missing values. NAs treated as maxima.")
return(cor(apply(x,2,normrank,delta=0.7)))
}

Applying this to spssanx results in a normrank correlation matrix:

altcor  <-  ncorrmat (spssanx)

That can be used to create a scree plot, using the principal component functions in R (e.g., prcomp, princomp):

screeplot(princomp(covmat =altcor))

or directly calling eigen or svd with the correlation matrix:

eigen (altcor)$values
svd (altcor)$d

The correlation matrix can also be placed within the factanal function for factor analysis. Suppose there are three latent variables. The code would be:

factanal(covmat =altcor, factors =3, n.obs = nrow(spssanx))

B.3 Using the Normrank with SEM in R

As with EFA, a correlation matrix can be created by transforming all the variables, and this matrix can then be read into the multivariate procedure. The functions cfa, sem, sam, and growth are all wrappers for the function lavaan. They set up various parameters consistent with common usage in confirmatory factor analysis, structural equation modeling, structural after measurement, and latent growth modeling, respectively. Consider the model for the simulation:

model1  <-  '
lv1 =~ o1 + o2 + o3
lv2 =~ o4 + o5 + o6
lv3 =~ o7 + o8 + o9
lv2 ~ lv1
lv3 ~ lv1 + lv2
'

The sem function can be used to estimate this model with the normrank correlation matrix by using the ncorrmat function described earlier. The sem function has a slot, sample.cov, in which the correlation matrix can be placed. In addition, the function needs to be told the number of people that the correlation is based on. Suppose the data are in a standard rectangular data matrix called semegdata. The sample size is the number of rows: nrow(semegdata).

sem1  <-  sem(model1,
sample.cov = ncorrmat(semegdata),
sample.nobs = nrow(semegdata))
summary (sem1)

References

Auerswald, M., & Moshagen, M. (2019). How to determine the number of factors to retain in exploratory factor analysis: A comparison of extraction methods under realistic conditions. Psychological Methods, 24, 468–491. https://doi.org/10.1037/met0000200.

Bartholomew, D. J., Knott, M., & Moustaki, I. (2011). Latent variable models and factor analysis: A unified approach (3rd ed.). Chichester, UK: Wiley. https://doi.org/10.1002/9781119970583.

Beasley, T. M., Erickson, S., & Allison, D. B. (2009). Rank-based inverse normal transformations are increasingly used, but are they merited? Behavior Genetics, 39, 580–595. https://doi.org/10.1007/s10519-009-9281-0.

Beaujean, A. A. (2014). Latent variable modeling using R (1st ed.). New York, NY: Routledge. https://doi.org/10.4324/9781315869780.

Bonett, D. G., & Wright, T. A. (2000). Sample size requirements for estimating Pearson, Kendall, and Spearman correlations. Psychometrika, 65, 23–28. https://doi.org/10.1007/BF02294183.

Box, G. E. P., & Cox, D. R. (1964). An analysis of transformations. Journal of the Royal Statistical Society. Series B (Methodological), 26, 211–252. https://doi.org/10.1111/j.2517-6161.1964.tb00553.x.

Braeken, J., & van Assen, M. A. L. M. (2017). An empirical Kaiser criterion. Psychological Methods, 22, 450–466. https://doi.org/10.1037/met0000074.

Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155–159. https://doi.org/10.1037/0033-2909.112.1.155.

DeCarlo, L. T. (1997). On the meaning and use of kurtosis. Psychological Methods, 2, 292–307. https://doi.org/10.1037/1082-989X.2.3.292.

de Winter, J. C. F., Dodou, D., & Wieringa, P. A. (2009). Exploratory factor analysis with small sample sizes. Multivariate Behavioral Research, 44, 147–181. https://doi.org/10.1080/00273170902794206.

de Winter, J. C. F., Gosling, S. D., & Potter, J. (2016). Comparing the Pearson and Spearman correlation coefficients across distributions and sample sizes: A tutorial using simulations and empirical data. Psychological Methods, 21, 273–290. https://doi.org/10.1037/met0000079.

Edgell, S. E., & Noon, S. M. (1984). Effect of violation of normality on the t test of the correlation coefficient. Psychological Bulletin, 95, 576–583. https://doi.org/10.1037/0033-2909.95.3.576.

Efron, B., & Tibshirani, R. J. (1993). An introduction to the bootstrap (Monographs on statistics and applied probability). Boca Raton, FL: Chapman and Hall/CRC.

Field, A. (2018). Discovering statistics using IBM SPSS statistics (5th ed.). London, UK: Sage Publications.

Hoaglin, D. C., Mosteller, F., & Tukey, J. W. (Eds.). (2000[1982]). Understanding robust and exploratory data analysis. New York, NY: Wiley.

Joanes, D. N., & Gill, C. A. (1998). Comparing measures of sample skewness and kurtosis. Journal of the Royal Statistical Society: D, 47, 183–189. https://doi.org/10.1111/1467-9884.00122.

Loehlin, J. C., & Beaujean, A. A. (2017). Latent variable models: An introduction to factor, path, and structural analysis (5th ed.). New York, NY: Routledge. https://doi.org/10.4324/9781315643199.

MacCallum, R. C., Widaman, K. F., Preacher, K. J., & Hong, S. (2001). Sample size in factor analysis: The role of model error. Multivariate Behavioral Research, 36, 611–637. https://doi.org/10.1207/S15327906MBR3604_06.

Mair, P. (2018). Modern psychometrics with R. Cham, Switzerland: Springer. https://doi.org/10.1007/978-3-319-93177-7.

Mangiafico, S. (2020). rcompanion: Functions to support extension education program evaluation [Computer software manual]. https://CRAN.R-project.org/package=rcompanion (R package version 2.3.25).

Meyer, D., Dimitriadou, E., Hornik, K., Weingessel, A., & Leisch, F. (2018). e1071: Misc functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien [Computer software manual]. https://CRAN.R-project.org/package=e1071 (R package version 1.7-0).

Micceri, T. (1989). The unicorn, the normal curve, and other improbable creatures. Psychological Bulletin, 105, 156–166. https://doi.org/10.1037/0033-2909.105.1.156.

Moors, J. J. A. (1988). A quantile alternative for kurtosis. Journal of the Royal Statistical Society: D, 37, 25–32. https://doi.org/10.2307/2348376.

Mosteller, F., & Tukey, J. W. (1977). Data analysis and regression: A second course in statistics. Reading, MA: Addison-Wesley.

Mulaik, S. A. (2010). Foundations of factor analysis (2nd ed.). Boca Raton, FL: CRC Press. https://doi.org/10.1201/b15851.

Nelsen, R. B. (2006). An introduction to copulas (2nd ed.). New York, NY: Springer.

O’Connor, B. P. (2021). EFA.dimensions: Exploratory factor analysis functions for assessing dimensionality [Computer software manual]. https://CRAN.R-project.org/package=EFA.dimensions (R package version 0.1.7.2).

Pearson, K. (1896). VII. Mathematical contributions to the theory of evolution.-III. Regression, heredity, and panmixia. Philosophical Transactions of the Royal Society of London. Series A, 187, 253–318. https://doi.org/10.1098/rsta.1896.0007.

Petrie, A. (2020). regclass: Tools for an introductory class in regression and modeling [Computer software manual]. https://CRAN.R-project.org/package=regclass (R package version 1.6). Search in Google Scholar

Pohlert, T. (2020). PMCMRplus: Calculate pairwise multiple comparisons of mean rank sums extended [Computer software manual]. https://CRAN.R-project.org/package=PMCMRplus (R package version 1.4.4). Search in Google Scholar

R Core team. (2022). foreign: Read data stored by ‘Minitab’, ‘S’, ‘SAS’, ‘SPSS’, ‘Stata’, ‘Systat’, ‘Weka’, ‘dBase’, [Computer software manual]. https://CRAN.R-project.org/package=foreign (R package version 0.8-82). Search in Google Scholar

Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48(2), 1–36. http://www.jstatsoft.org/v48/i02/. 10.18637/jss.v048.i02Search in Google Scholar

Royston, P. (1995). A remark on algorithm AS 181: The W-test for normality. Journal of the Royal Statistical Society. Series C (Applied Statistics), 44, 547–551. 10.2307/2986146. Search in Google Scholar

Rubin, D. B. (1987). Multiple imputation for nonresponse in surveys. New York, NY: John Wiley & Sons. 10.1002/9780470316696Search in Google Scholar

Ruscio, J. (2008). Constructing confidence intervals for Spearmanas rank correlation with ordinal data: A simulation study comparing analytic and bootstrap methods. Journal of Modern Applied Statistical Methods, 7, 416–434. https://doi.org/10.22237/jmasm/1225512360. Search in Google Scholar

Shapiro, S. S., & Wilk, M. B. (1965). An analysis of variance test for normality (complete samples). Biometrika, 52, 591–611. https://doi.org/10.1093/biomet/52.3-4.591. Search in Google Scholar

Spearman, C. (1904). The proof and measurement of association between two things. The American Journal of Psychology, 15, 72–101. https://doi.org/10.2307/1412159. Search in Google Scholar

Tukey, J. W. (1960). A survey of sampling from contaminated distributions. In: I. Olkin, S. G. Ghurye, W. Hoeffding, W. G. Madow, & H. B. Mann (Eds.), Contributions to probability and statistics: Essays in honor of Harold Hotelling (pp. 448–485). Stanford, CA: Stanford University Press. Search in Google Scholar

Tukey, J. W. (1986). Sunset Salvo. The American Statistician, 40, 72–76. https://doi.org/10.1080/00031305.1986.10475361. Search in Google Scholar

van der Waerden, B. L. (1952). Order tests for the two-sample problem and their power. Indagationes Mathematicae (Proceedings), 55, 453–458. https://doi.org/10.1016/S1385-7258(52)50063-5. Search in Google Scholar

Velicer, W. F., Eaton, C. A., & Fava, J. L. (2000). Construct explication through factor or component analysis: A review and evaluation of alternative procedures for determining the number of factors or components. In: R. D. Goffin, & E. Helmes (Eds.), Problems and Solutions in Human Assessment (pp. 41–71). Boston, MA: Kluwer. 10.1007/978-1-4615-4397-8_3Search in Google Scholar

Ventura-León, J., Pennnna-Calero, B. N., & Burga-León, A. (2022). The effect of normality and outliers on bivariate correlation coefficients in psychology: A Monte Carlo simulation. The Journal of General Psychology, 1–18. https://doi.org/10.1080/00221309.2022.2094310. Search in Google Scholar

Wickham, H., & Miller, E. (2021). haven: Import and Export ‘SPSS’, ‘Stata’ and ‘SAS’ Files. [Computer software manual]. https://CRAN.R-project.org/package=haven (R package version 2.4.1). Search in Google Scholar

Wilcox, R. R. (2017). Introduction to robust estimation and hypothesis testing (4th ed.). San Diego, CA: Academic Press. 10.1016/B978-0-12-804733-0.00001-9Search in Google Scholar

Wright, D. B. (2020). Improving trust in research: Supporting claims with evidence. Open Education Studies, 2, 1–8. https://doi.org/10.1515/edu-2020-0106. Search in Google Scholar

Wright, D. B., & Herrington, J. A. (2011). Problematic standard errors and confidence intervals for skewness and kurtosis. Behavior Research Methods, 43, 8–17. https://doi.org/10.3758/s13428-010-0044-x. Search in Google Scholar

Wright, D. B., London, K., & Field, A. P. (2011). Using Bootstrap estimation and the plug-in principle for clinical psychology data. Journal of Experimental Psychopathology, 2, 252–270. https://doi.org/10.5127/jep.013611. Search in Google Scholar

Wright, D. B., & Wells, S. M. (2020). Creating latent variables. In: G. M. Breakwell, D. B. Wright, & J. Barnett (Eds.), Research Methods in Psychology (5th ed.). London, UK: Sage Publications. Search in Google Scholar

Xie, Y. (2015). Dynamic documents with R and knitr (2nd ed.). Boca Raton, FL: Chapman and Hall/CRC. Search in Google Scholar

Received: 2021-04-27
Revised: 2023-05-02
Accepted: 2024-04-02
Published Online: 2024-04-26

© 2024 the author(s), published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.
