Exploratory Graph Analysis in context

This paper presents the network psychometric framework for dimensionality and item analysis termed exploratory graph analysis (EGA). It starts by briefly contextualizing the field of network psychometrics and the early work from the 1950s and 1960s. Then, it provides a brief overview of EGA and other recent developments, such as network loadings (a metric akin to factor loadings), the total entropy fit index (verification of dimensionality fit), dynamic EGA, bootstrap EGA for dimensionality and item stability, random intercepts EGA (handling wording effects), and hierarchical EGA to estimate higher-order structures (e.g., generalized bifactor models). The goal of the paper is to present the reader with a list of contextualized references.


The year 2022 marks the first decade of a methodological revolution in quantitative psychology that emerged from the popularization of network methods to analyze psychological datasets. This revolution was ignited by the publication of the qgraph package for R (Epskamp et al., 2012), which provided easy-to-use and free software for network visualization and estimation, with a wide range of applications in psychology and related areas. The package helped to popularize network methods in psychology, which were already gaining momentum with the publication of the mutualism model of intelligence (Van Der Maas et al., 2006) and the network perspective on psychopathology (Borsboom, 2008; Borsboom et al., 2011; Cramer et al., 2010).
The network perspective on psychological constructs gave rise to a new subfield of quantitative psychology called network psychometrics (Epskamp et al., 2017). Network models are used to estimate the relationships between multiple variables, typically using the Gaussian graphical model (GGM) (Lauritzen, 1996), where nodes (e.g., test items) are connected by edges (or links) that indicate the strength of the association between the variables (Epskamp & Fried, 2018). From this representation, some network advocates suggest that variables form a causal system of mutually reinforcing components (Cramer, 2012). This perspective on psychological phenomena is often juxtaposed against common-cause perspectives represented by latent variable models (Bringmann & Eronen, 2018; Guyon et al., 2017).
Despite differences in representation, network and latent variable models are closely related and can produce model parameters that are consistent with one another (Boker, 2018; Christensen & Golino, 2021c; Epskamp et al., 2017; Golino & Epskamp, 2017; Golino et al., 2021; Marsman et al., 2018). These statistical similarities can be used as a way to explore the dimensionality structure of measurement instruments in a new framework termed Exploratory Graph Analysis (Golino & Demetriou, 2017; Golino & Epskamp, 2017; Golino, Moulder, et al., 2020).
However, not many psychologists recognize that the use of networks in psychology has a historical precedent. In an account of the Reticular Action Model (RAM) (McArdle, 1979, 1980), Boker (2018) shows that Cattell (1965) argued for a network structure for latent variables in which both observable and latent variables would present a multi-influence graph structure. When developing the Reticular Action Model, what McArdle had in mind was the more general implementation of a network of relations, as pointed out by Boker (2018):
As McArdle presented his ideas it became evident that he was driven to implement the most general network of relations, the reticular relations proposed by Cattell (1965). I argued that "reticular" was an unnecessarily obscure term -why did he not just call it Network Analysis Modeling. In his usual lighthearted manner, McArdle replied that the acronym NAM had obvious negative associations for many Americans. But primarily, he wanted to honor Cattell's contribution in helping generalize latent structure away from the strict input-output causal implications that had previously dominated statistical modeling. There is a subtle point here. Input-output designs such as factor analysis, multiple regression, and mediation models all [...]
Boker (2018) continues his recollection of the development of RAM by pointing out that:
The "Reticular" in RAM emphasizes the fact that SEM needs to be understood as being a network model. When Cattell (1965) proposed the idea of any observational model being embedded in a larger network of unobserved relations, few understood the wider implications. One of McArdle's main advances was to instantiate this idea of a general network model in a way that it could be estimated using cost function minimization optimizers (p. 138).
In sum, the history of network models in psychology, particularly psychometric networks, is much older than the more recent popularization of modern network techniques. It even preceded the work of Cattell (1965) by a decade, with the work of Guttman (1953). Guttman (1953) proposed a method termed image structure analysis, which is essentially the basis of contemporary node-wise regression network models (Haslbeck & Waldorp, 2020).

Exploratory Graph Analysis
In network psychometrics (Epskamp et al., 2017), networks are typically estimated using the Gaussian graphical model (Lauritzen, 1996) and the EBICglasso approach (Epskamp & Fried, 2018). The EBICglasso approach operates much like McArdle imagined: minimizing a penalized log-likelihood function and selecting the best-fitting model using the extended Bayesian information criterion (EBIC) (Chen & Chen, 2008). The estimation of networks opened the door for psychologists to begin applying network science methods that have been developed in other areas of science to psychological problems such as dimensionality (e.g., factor analysis). Golino and Epskamp (2017) showed that the GGM combined with a clustering algorithm for weighted networks (Walktrap) (Pons & Latapy, 2005) could accurately recover the number of simulated factors, with higher accuracy than traditional factor analytic methods. Golino and Epskamp (2017) termed this new method exploratory graph analysis (EGA).
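To make the edge weights of a GGM concrete, the following is a minimal sketch, not the EBICglasso procedure itself: it computes unregularized partial correlations by inverting a small correlation matrix. The lasso penalty and the EBIC-based model selection are deliberately left out, and the function names `invert` and `partial_correlations` are hypothetical helpers introduced only for this illustration.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I] is reduced to [I | A^-1].
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def partial_correlations(corr):
    """Edge weights of an unregularized GGM: -k_ij / sqrt(k_ii * k_jj),
    where k is the inverse of the correlation matrix (the precision matrix)."""
    k = invert(corr)
    n = len(k)
    return [[0.0 if i == j else -k[i][j] / (k[i][i] * k[j][j]) ** 0.5
             for j in range(n)]
            for i in range(n)]


# Illustrative (made-up) correlations for a chain X1 - X2 - X3:
# X1 and X3 correlate at 0.25 only through X2.
corr = [[1.00, 0.50, 0.25],
        [0.50, 1.00, 0.50],
        [0.25, 0.50, 1.00]]
edges = partial_correlations(corr)
# edges[0][1] is about 0.447, while edges[0][2] is (numerically) zero,
# so no edge would connect X1 and X3 once X2 is controlled for.
```

In a regularized network, small partial correlations like these would additionally be shrunk toward zero, which is what makes the resulting graph sparse.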
Expanding on this work, a subsequent study compared EGA with different types of factor analytic methods (including two types of parallel analysis). The authors found that EGA (using the GGM network model) achieved the highest overall accuracy (87.91%) in estimating the number of simulated factors, followed by Horn's traditional parallel analysis with principal components. Other network estimation approaches have since been used in the EGA framework, such as the Triangulated Maximally Filtered Graph (TMFG) (Christensen et al., 2018; Massara et al., 2016) and non-regularized partial correlation approaches (Williams & Rast, 2020; Williams et al., 2019). In (cross-sectional) simulation studies, EGA usually estimates the number of factors more accurately with the GGM than with the TMFG or non-regularized partial correlation models. However, in the case of highly skewed data, the TMFG method leads to higher accuracy.
Despite the advantages of EGA, the technique had previously only been suitable for cross-sectional data. In the case of intensive longitudinal data, the researcher might be interested in evaluating the dynamic structural organization of factors. Golino et al. (2022) proposed an extension of EGA, termed Dynamic Exploratory Graph Analysis (DynEGA), to deal with intensive longitudinal data by combining techniques from dynamical systems with network psychometrics.
Instead of using the covariance (or correlation) of observable variables to estimate the network, the DynEGA technique uses the covariance (or correlation) of derivatives estimated via generalized local linear approximation (Boker et al., 2010). Therefore, communities estimated in the network represent variables that are changing together, that is, dynamical factors. Golino et al. (2022) showed that DynEGA can recover the structure of intensive longitudinal data simulated using a dynamic factor model with high accuracy.
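The idea of correlating derivatives rather than raw scores can be sketched as follows. This is an illustration under stated assumptions, not the DynEGA implementation: it estimates only the first derivative from a time-delay embedding, using the fact that with a centred window the linear column of the local linear approximation is orthogonal to the constant and quadratic columns, so the weights have a closed form. The function names, the embedding dimension of 5, and the toy sine series are all hypothetical choices.

```python
import math


def embed(series, dim, tau=1):
    """Time-delay embedding: each row holds `dim` samples spaced `tau` steps apart."""
    span = (dim - 1) * tau
    return [[series[t + k * tau] for k in range(dim)]
            for t in range(len(series) - span)]


def first_derivative(series, dim=5, tau=1, delta_t=1.0):
    """Least-squares first-derivative estimate for every embedded window.

    With centred offsets v, the weights reduce to v / (tau * delta_t * sum(v^2)).
    """
    centre = (dim - 1) / 2
    offsets = [k - centre for k in range(dim)]
    denom = tau * delta_t * sum(v * v for v in offsets)
    weights = [v / denom for v in offsets]
    return [sum(w * x for w, x in zip(weights, row))
            for row in embed(series, dim, tau)]


def pearson(a, b):
    """Pearson correlation between two equally long sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


# Two toy series that change together (y is a rescaled, shifted copy of x).
x = [math.sin(0.3 * t) for t in range(60)]
y = [2.0 * value + 1.0 for value in x]
dx = first_derivative(x)
dy = first_derivative(y)
r_derivatives = pearson(dx, dy)  # variables that change together correlate highly
```

A matrix of such derivative correlations, computed over all pairs of variables, would then take the place of the ordinary correlation matrix when the network is estimated.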
The EGA framework presents several advantages over more traditional methods. First, unlike exploratory factor analysis (EFA) methods, EGA does not require a rotation method to interpret the estimated first-order factors. Although rotations are rarely discussed in the validation literature, they have significant consequences for validation (e.g., estimation of factor loadings; Sass & Schmitt, 2010). Second, EGA automatically places items into factors without the researcher's direction, which contrasts with exploratory factor analysis, where researchers must decipher a factor loading matrix. Such a placement opens the door for dimension and item stability methods, which are discussed in the next section. Third, the network representation depicts how items relate within and between dimensions.
The structures of networks are less amenable to traditional fit indices (but see Kan et al., 2020) because of their number of parameters and exploratory nature. The total entropy fit index (TEFI) (Golino, Moulder, et al., 2020) was developed as an alternative to traditional fit measures used in factor analysis and structural equation modeling. In a comprehensive simulation study, the TEFI demonstrated higher accuracy in correctly identifying the number of simulated factors than the comparative fit index (CFI), the root mean square error of approximation (RMSEA), and other indices used in structural equation modeling (Golino, Moulder, et al., 2020). The TEFI is based on the Von Neumann (1927) entropy, a measure developed to quantify the amount of disorder in a system as well as the entanglement between two subsystems (Preskill, 2018). The TEFI is a relative measure of fit that can be used to compare two or more dimensionality structures. The dimensionality structure with the lowest TEFI value fits the data best.
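For readers unfamiliar with the quantity underlying the TEFI, the standard definition of the Von Neumann entropy of a density matrix (a positive semidefinite matrix with unit trace) is given below. The mapping from a correlation matrix $R$ of $p$ variables to a density-like matrix by dividing by $p$ is an assumption made here for illustration; the TEFI formula itself is not reproduced.

```latex
S(\rho) = -\operatorname{tr}\left(\rho \ln \rho\right)
        = -\sum_{i=1}^{p} \lambda_i \ln \lambda_i ,
\qquad
\rho = \frac{R}{p} \quad \text{(assumed scaling, so that } \operatorname{tr}(\rho) = 1\text{)}
```

Here $\lambda_i$ are the eigenvalues of $\rho$, with the convention $0 \ln 0 = 0$. The entropy is zero when a single eigenvalue carries all the variance and reaches its maximum, $\ln p$, when all eigenvalues are equal (uncorrelated variables).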

Other developments and future directions
The EGA framework has expanded into several important areas of psychometrics. Christensen and Golino (2021c) developed a new metric termed network loadings, computed by standardizing node strength (the sum of the edges a node is connected to) split between the dimensions identified by EGA. They showed in a simulation study that network loadings are akin to factor loadings, opening a new line of research within the EGA approach. These loadings have opened the door to assessing measurement (metric) invariance (Jamison et al., 2022) from the network perspective as well as determining whether data are generated from a factor or network model (Christensen & Golino, 2021b).
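The description of network loadings above can be sketched in a few lines. This is a simplified illustration, not the published estimator: node strength is split by dimension using absolute edge weights, and the standardization shown (dividing by the square root of each dimension's total) is an assumed form; the handling of edge signs is omitted. The function name `network_loadings`, the edge weights, and the community labels are hypothetical.

```python
def network_loadings(weights, membership):
    """Split each node's strength between dimensions and standardize it.

    weights: symmetric matrix of edge weights with a zero diagonal.
    membership: one dimension label per node (e.g., from a community detection step).
    Returns (dimension labels, raw split strengths, standardized loadings).
    """
    dims = sorted(set(membership))
    n = len(weights)
    # Raw loading of node i on dimension d: the sum of the absolute edge
    # weights between i and every other node assigned to d.
    raw = [[sum(abs(weights[i][j]) for j in range(n)
                if j != i and membership[j] == d)
            for d in dims]
           for i in range(n)]
    # Assumed standardization: divide by the square root of the column total.
    totals = [sum(raw[i][c] for i in range(n)) for c in range(len(dims))]
    standardized = [[raw[i][c] / totals[c] ** 0.5 if totals[c] > 0 else 0.0
                     for c in range(len(dims))]
                    for i in range(n)]
    return dims, raw, standardized


# Four items in two dimensions, with one weak cross-dimension edge (items 2 and 3).
weights = [[0.0, 0.4, 0.0, 0.0],
           [0.4, 0.0, 0.1, 0.0],
           [0.0, 0.1, 0.0, 0.5],
           [0.0, 0.0, 0.5, 0.0]]
membership = [1, 1, 2, 2]
dims, raw, standardized = network_loadings(weights, membership)
```

As with a factor loading matrix, each row can be read across dimensions: the second item has most of its strength on its own dimension and a small cross-loading on the other.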
Based on the automated item placement of EGA, Christensen and Golino (2021a) [...] (Blondel et al., 2008) to detect lower- and higher-order factors in data. They demonstrate via simulation that hierEGA outperforms the original Louvain algorithm as well as traditional factor analytic techniques such as parallel analysis for detecting higher-order factors.
All the EGA-based techniques pointed out in this brief paper are implemented in the free and open-source R package EGAnet (Golino & Christensen, 2019), which has become one of the main software packages in network psychometrics. As in any evolving field, several issues are still to be addressed in network psychometrics and the EGA framework. For example, little is known about the robustness of EGA in the presence of population error or how stable the dynamical factors estimated using DynEGA are. Additionally, more simulation studies are necessary to understand the conditions in which the TMFG network method works better than GGM networks (e.g., how much