Emotion Regulation Test: Analysis with IRT

Emotional regulation is a skill related to emotional intelligence, which has shown important impacts on several aspects of life. This work aimed to analyze the psychometric properties of the Emotion Regulation Test (ERT) based on the Item Response Theory (IRT). The instrument


Emotional regulation (ER) is a skill related to emotional intelligence and emerged as a field of study in the 1990s when James J. Gross improved understanding of this topic. The author defined ER as a set of processes through which individuals (consciously or unconsciously) modify the trajectory of one or more components of an emotional response. These components are physiological responses, cognitive processes, and behaviors associated with an emotion (Gross, 2015). Gross (2015) states that emotion occurs in an individual-situation context that mobilizes one's attention and demands a cognitive assessment of the situation, resulting in a (manifested or unmanifested) behavioral response. Thus, cognitive processing plays a prominent role in this model, which provides four stages or decisions based on a discrepancy between one's desired and actual emotional state. First, this discrepancy is identified as an opportunity to regulate emotion (identification stage); then, an ER strategy is selected (selection); this strategy is then implemented through specific tactics (implementation); and the entire cycle is monitored to successfully achieve the regulatory goal (monitoring) (McRae & Gross, 2020).
The selected and implemented ER strategies can be organized into five families related to choosing a situation, changing a situation, redirecting attention, cognitive restructuring, and response modulation (Gross, 2015). Different strategies result in different outcomes on how an individual feels, thinks, and acts immediately after and over time.
With the emergence of the ER concept, emotion is no longer considered a phenomenon one passively experiences; instead, the active role of individuals in regulating emotions through different strategies is highlighted. Some emotion theories reflect the dynamic nature of this process. Plutchik (2003), for instance, states that an emotion is not only a set of subjective sensations, but also an entire chain of events that includes feelings, cognition, impulses to act, among others.
In the psycho-evolutionary theory, Plutchik (2003) proposed that emotions have an adaptive or survival value, i.e., emotions have a purpose in people's lives. From this perspective, eight basic emotions are identified (joy, fear, sadness, acceptance, anger, surprise, disgust, and expectation) that establish relationships with other mental functioning areas.
Hence, emotion is always associated with cognition, a behavior that tends to affect the environment. Plutchik (2003) describes the sequential model of emotions as typically containing a stimulus that triggers an emotional effect, depending on how individuals interpret it. In turn, an emotional state is triggered, followed by an impulse for action, which does not always trigger an action per se, although it frequently does. Intervening in these sequences to regulate emotions is essential for achieving a healthy and successful life (English et al., 2017). The relevance of this ability has been reported by many recent studies addressing the topic, making ER one of the academic fields of most significant growth within Psychology (Gross, 2015; McRae & Gross, 2020).
ER studies focus on patterns of strategies used to regulate emotions (selection stage) and on their effectiveness or success (implementation stage). There is a consensus that research on ER has produced definitions and models of the construct; outlined different emotional consequences of an individual's involvement with different ways to regulate emotions; described psychological and neurobiological mechanisms by which regulation influences emotion; and documented the effects of ER interventions (McRae & Gross, 2020).
One of the first problems faced when a research field emerges more consistently is developing instruments with good psychometric properties capable of capturing the individual differences of the construct of interest. As a result, some self-report and performance instruments have been developed to assess emotion regulation.
The Emotion Regulation Questionnaire (ERQ) (Preece et al., 2019) and Difficulties in Emotion Regulation Scale (DERS) (Bjureberg et al., 2016; Miguel et al., 2017) are examples of self-report instruments. Self-report instruments capture the respondents' opinions about their typical behaviors related to emotion regulation rather than their performance per se. This type of instrument usually presents a higher correlation with personality traits than with intelligence, considering that they depict the respondents' self-perception and the subjective nature of one's emotional experience (Petrides, 2017).
Other instruments include the Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT) (Mayer et al., 2002), the Emotion Regulation Profile (ERP) (Gondim et al., 2015; Nelis et al., 2011), the Teste de Regulação de Emoções pelo Stroop Emocional (TRE_Stroop) (Bueno, 2013), the Situational Test of Emotional Management - Brief (STEM-B) (Allen et al., 2015), and the Emotion Regulation Test (ERT) (Lira & Bueno, 2020). These are examples of instruments assessing the performance of emotion regulation. The MSCEIT, ERP, and STEM-B are situational performance tests, whereas TRE_Stroop is an instrument based on the Emotional Stroop technique, in which a cognitive task competes with the interference of emotional content. Such measures usually present positive and significant correlations with other measures of intelligence and performance and non-significant correlations with personality measures. Even though difficulties are reported with the definition of scoring criteria, this type of measurement has reconciled the theoretical aspects of the construct operationalization by preserving the logic of using tasks to access cognitive skills required to answer the instrument (Petrides, 2017).
The ERT, the object of this study, also assesses respondents' performance by asking them to judge the effectiveness of emotion regulation strategies. It is a self-reported instrument composed of eight vignettes built upon the sequences of basic emotions proposed by Plutchik (2003). The test presents conflicting situations involving fictional characters, and respondents are asked to rate the effectiveness of each ER strategy presented to deal with a given situation. The instrument is characterized as a situational judgment test (Ambiel et al., 2015), in which it is assumed that the better the participant's judgment, the better his/her ability on what the test assesses, i.e., his/her ability to regulate emotions.
An exploratory factor analysis revealed that the ERT has a two-factor structure in which the items are distributed according to the effectiveness (factor 1) or ineffectiveness (factor 2) of the strategy used to regulate emotions. These factors presented good internal consistency indices, measured by the Kuder-Richardson formula (0.75 for factor 1 and 0.62 for factor 2); they were positively and significantly correlated with each other, although with moderate magnitude; and scores were higher for women and for participants with higher educational levels, even after controlling for age (Lira & Bueno, 2020).
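As an illustration of the internal consistency index mentioned above, the following is a minimal sketch of the Kuder-Richardson formula 20 (KR-20) for dichotomously scored items. It is not the computation used in the cited study: the function name, the toy data, and the choice of the population variance for total scores are assumptions made for this example.

```python
from typing import List


def kr20(data: List[List[int]]) -> float:
    """KR-20 for a persons x items matrix of 0/1 scores.

    Assumption: the variance of total scores is the population variance
    (divided by n); some texts divide by n - 1 instead.
    """
    n = len(data)          # number of respondents
    k = len(data[0])       # number of items
    totals = [sum(row) for row in data]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n

    # Sum of p * q over items, where p is the proportion of correct answers.
    pq_sum = 0.0
    for j in range(k):
        p = sum(row[j] for row in data) / n
        pq_sum += p * (1 - p)

    return (k / (k - 1)) * (1 - pq_sum / var_total)


# Hypothetical toy data: four respondents, three items, perfectly consistent.
toy_scores = [
    [1, 1, 1],
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]
toy_reliability = kr20(toy_scores)
```

With perfectly consistent responses, as in the toy matrix, the coefficient reaches its upper bound; real data such as the ERT factors yield lower values.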
That investigation was based on the Classical Test Theory, which presents important limitations compared to the modern Item Response Theory (IRT). The IRT allows for independently estimating parameters for the items and individuals, which provides a more detailed analysis of people's level of ability, the items' difficulty, and the interaction between the two (Draheim et al., 2018). In this context, this study's objective was to analyze the ERT psychometric properties based on the IRT. The Rasch Model analysis in the IRT was used to investigate the following aspects: 1. general adjustment of the items to the IRT model (infit and outfit); 2. internal consistency; 3. descriptive analysis of the participants (theta) and items; 4. analysis of scoring criteria according to the agreement reached by the experts; 5. analysis of the distribution of items and latent trait (map of items).

Instruments
The ERT (Lira & Bueno, 2020) was used in this study along with a sociodemographic questionnaire to characterize the sample. Both questionnaires were available on an online platform so that data were collected via the Internet.
The ERT is a situational judgment test composed of scenarios, operationalized in vignettes, in which a character is faced with one of the eight basic emotions proposed by Plutchik (2003), namely: joy, fear, sadness, acceptance, anger, surprise, disgust, or expectation. For instance, the vignette focused on fear presents the following situation: "Ana is coming home later than usual. She is walking alone and knows that there have been several robberies on the street ahead". Next, three items are presented for each of the eight vignettes, representing different actions that can be adopted to regulate emotions. After the exploratory factor analysis (EFA) described in Lira and Bueno (2021), the items in which the factor loadings were below 0.3 (items 5, 6, 13, and 20) were eliminated. The remaining scenarios and items are presented in Tables 2 and 4 in the results section.
The ERT items are based on the ER strategies proposed by Gross (2015) (selecting a situation, modifying a situation, redirecting attention, cognitive restructuring, and response modulation), in agreement with the natural sequences of each emotion as predicted by Plutchik (2003). For instance, Plutchik (2003) suggests that the elicitation sequence for fear would be: when faced with a threat (stimulus), a person can interpret a real or imaginary situation as dangerous (cognition), experience fear (emotional state), and probably run away (observable behavior) to save him/herself (effect).
The options presented in the ERT in the vignette addressing fear were: 1. "Take another path considered safer, but longer"; 2. "Stand still and wait for someone's help"; 3. "Think positively and trust that you will get home safely even if you continue on the same street". Next, the participants were instructed to rate on a five-point Likert scale the effectiveness of each strategy, assigning 1 to very ineffective strategies and 5 to very effective strategies. Scores 2, 3, or 4 represent intermediate levels of effectiveness (Lira & Bueno, 2020).
The participants' responses were scored according to the agreement reached by experts, i.e., the participants scored one point whenever their answer was in agreement with the answer the expert panel considered to be the most appropriate (see Bueno & Zuanazzi, 2019).
The expert panel was composed of graduate and undergraduate students involved with research addressing the psychological assessment of emotional intelligence skills such as emotion regulation.
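The expert-agreement scoring described above can be sketched as follows. This is only an illustration under stated assumptions: the item numbers, the keyed ratings, and the function name are hypothetical and do not reproduce the actual ERT scoring key.

```python
from typing import Dict


def score_by_expert_key(
    responses: Dict[int, int], expert_key: Dict[int, int]
) -> Dict[int, int]:
    """Score each item 1 if the Likert rating matches the expert key, else 0.

    Items missing from `responses` are scored 0.
    """
    return {
        item: int(responses.get(item) == keyed_rating)
        for item, keyed_rating in expert_key.items()
    }


# Hypothetical key: the rating (1 to 5) the expert panel considered most appropriate.
expert_key = {1: 5, 2: 1, 3: 3}

# Hypothetical ratings given by one participant.
participant = {1: 5, 2: 2, 3: 3}

item_scores = score_by_expert_key(participant, expert_key)
raw_score = sum(item_scores.values())
```

The resulting 0/1 item scores are the kind of dichotomous data that a Rasch analysis takes as input.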

Procedures
The participants were contacted through e-mail or social media and received clarification regarding the study's objectives and procedures. A link in the invitation message led to the survey page. The participants affirmatively indicated consent by checking the free and informed consent form, after which access was granted to the sociodemographic form and the ERT. The project was submitted to and approved by the Institutional Review Board under Informed Consent Form number 51159715.9.0000.5208.
Data collected were automatically stored in an electronic spreadsheet to which only the researchers had access. Statistical analysis was performed using the Rasch model, a mathematical modeling technique intended to relate the items' difficulty to the participants' abilities. Therefore, if an item's difficulty is below an individual's ability, she/he is more likely to answer it correctly; otherwise, the participant will probably get the answer wrong. In this model, the items' difficulty is represented by the letter b and the individuals' abilities by the Greek letter theta (θ) (Bond & Fox, 2015).
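The relationship between ability and difficulty described above can be made concrete with the standard dichotomous Rasch formula, in which the probability of a correct answer is 1 / (1 + exp(-(θ - b))). The sketch below is illustrative only; the function name and the example values are assumptions, and it does not reproduce the estimation performed by Winsteps.

```python
import math


def rasch_probability(theta: float, b: float) -> float:
    """Probability of a correct answer under the dichotomous Rasch model.

    theta: person ability (in logits); b: item difficulty (in logits).
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


# Hypothetical values: the same person facing an easier and a harder item.
p_easy_item = rasch_probability(theta=0.5, b=-1.0)   # difficulty below ability
p_matched = rasch_probability(theta=0.5, b=0.5)      # difficulty equals ability
p_hard_item = rasch_probability(theta=0.5, b=2.0)    # difficulty above ability
```

When difficulty equals ability, the probability is exactly 0.5; it rises above 0.5 as the item becomes easier relative to the person and falls below 0.5 as it becomes harder.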
Data analysis was performed using Winsteps, version 3.69.1.6 (Linacre & Wright, 2009). Because it is a mathematical modeling technique, the first analysis refers to the adjustment of data to the IRT model, which, in Winsteps software, is given by the infit and outfit indexes; values between 0.72 and 1.33 indicate goodness of fit (Bond & Fox, 2015). Then, descriptive analyses were calculated for the items (b) and participants (theta) along with the factors' reliability indexes, which are supposed to be above 0.7 (Cunha et al., 2016).
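For readers unfamiliar with the fit indexes, the sketch below shows how infit and outfit mean squares are commonly defined for one dichotomous item: outfit is the unweighted mean of squared standardized residuals, and infit is the information-weighted version. It assumes that person abilities and the item difficulty are already estimated, and it is not a reproduction of the Winsteps routines; the function names and example data are hypothetical.

```python
import math
from typing import Sequence, Tuple


def rasch_probability(theta: float, b: float) -> float:
    """Probability of a correct answer under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_fit(
    responses: Sequence[int], thetas: Sequence[float], b: float
) -> Tuple[float, float]:
    """Return (infit, outfit) mean squares for one item.

    responses: 0/1 answers of each person to the item.
    thetas: ability estimate of each person (assumed known here).
    """
    squared_residuals = []
    variances = []
    for x, theta in zip(responses, thetas):
        p = rasch_probability(theta, b)
        squared_residuals.append((x - p) ** 2)
        variances.append(p * (1 - p))

    # Infit: squared residuals weighted by the model variance (information).
    infit = sum(squared_residuals) / sum(variances)
    # Outfit: plain average of squared standardized residuals.
    outfit = sum(r / v for r, v in zip(squared_residuals, variances)) / len(variances)
    return infit, outfit


def within_fit_range(value: float, low: float = 0.72, high: float = 1.33) -> bool:
    """Check a mean square against the range reported in the text."""
    return low <= value <= high


# Hypothetical data: four persons whose ability equals the item difficulty.
example_infit, example_outfit = item_fit([1, 0, 1, 0], [0.0, 0.0, 0.0, 0.0], b=0.0)
```

Values near 1 indicate that the observed responses vary about as much as the model predicts; values well above 1 suggest unexpected responses, and values well below 1 suggest responses that are more predictable than the model expects.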
Descriptive analysis of items was performed to: 1. verify the pertinence of the answers considered correct in the scoring system according to the experts' agreement; 2. verify whether the items' difficulty level was appropriate to the individuals' ability; and 3. analyze the latent trait. The last two analyses were facilitated by visually inspecting the map of items and people, which outlines a parallel between the distribution of items and people over the ability continuum (see Nakano et al., 2015, for an example of this procedure).

Results
The analyses were performed per factor, respecting the assumption of IRT unidimensionality (Bond & Fox, 2015). Therefore, the results considering Factor 1, which is related to detecting effective or adaptive strategies, are presented first, and the results concerning Factor 2, which detects ineffective or maladaptive strategies, are presented later.
The infit mean in Factor 1 was 0.99 (SD=0.06), ranging from 0.91 to 1.07, and the outfit mean was 0.99 (SD=0.20), ranging from 0.55 to 1.23. The theta mean was 0.25 (SD=1.79), and the factor precision, calculated by IRT, was 0.69 (real) and 0.72 (modeled). The descriptive statistics of the items in Factor 1 are presented in Table 1.
Data analysis in Figure 1 shows whether the scores were confirmed or readjusted according to the experts' agreement. Note that, in almost all the items in Factor 1, the response considered correct by the experts (score=1) was the response chosen by the participants with the highest theta mean. An exception was item 3, which presented an inversion. The inversion was observed for two participants only, who incorrectly opted for alternative 1 instead of 5, which constitutes a minimal number of participants with a very high standard error (0.86). For this reason, alternative 5 remained as the correct answer for this item.
The means of the items and of the participants (theta mean=0.25, SD=1.79) are close. The items are distributed over a large extension of the ability-to-regulate emotions continuum. However, probably due to the small number of items, there are gaps in this continuum, indicating that new items would be needed to fill in these gaps.
As shown on the map (Figure 1), the items were grouped into three blocks, according to the similarity of difficulty levels, to facilitate the analysis of data presented in Table 2.
The analysis of the items in Factor 2 (ineffective or maladaptive strategies to regulate emotions) was also initiated by analyzing the fit of the data to the IRT model. The infit mean was 0.99 (SD=0.08), ranging from 0.80 to 1.08, while the outfit mean was 0.96 (SD=0.16), ranging from 0.65 to 1.17. The theta mean was 0.76 (SD=1.48), and the precision of Factor 2, calculated by IRT, was 0.56 (real) and 0.60 (modeled). The descriptive statistics of the items in Factor 2 are presented in Table 3.

Table 3
Descriptive Statistics by Factor 2 items

The response considered correct (according to the experts) in all the items was also the response chosen by the high-ability participants (theta mean). The relationship between the distribution and mean of items and people is presented in Figure 2.

Figure 2
People-item map of ERT Factor 2
Note. Each "#" equals 5 people and each "." equals 1 to 4 people.
Figure 2 presents the people-item map of Factor 2. Note that the means of the items and participants are relatively close and, where there are many individuals, there is also a large number of items.Additionally, the items appear distributed over a large extension of the ability-to-regulate emotions continuum.
Similar to the analysis of Factor 1, Table 3 shows the items in Factor 2 grouped according to similar difficulty levels, according to the map of items (Figure 2). Items 10 and 18 compose the block with the most difficult items, in which complex situations with avoidant but socially reprehensible strategies are presented. Items 1, 8, 9, 16, and 21 compose the block of intermediate difficulty, and items 2, 11, 15, and 16 compose the third block, in which non-regulation responses are the most easily identifiable as ineffective to deal with the situations presented.
The easiest strategies to be detected as ineffective are those with no regulation; rather, there is an emotion-motivated response.Then, there are instances where there is a strategy to solve the situation, but the problem persists.A given strategy might be effective in the most challenging situations, but does not consider all those involved; it may be a strategy frequently used, although considered socially inappropriate.

Discussion
This study's primary objective was to investigate the fit of the data to the Rasch model in IRT, using the infit and outfit indices obtained with Winsteps software. The results show the goodness of fit for both factors in the ERT, indicating that data are appropriate to IRT modeling and that subsequent analyses could be performed (Bond & Fox, 2015).
Reliability indexes were higher for Factor 1 than for Factor 2, although both were lower than those obtained by Allen et al. (2015) using other instruments to assess emotion regulation. Consequently, measurement errors are more likely to occur when estimating the scores (thetas) obtained in Factor 2 than in Factor 1. This means that high-ability individuals are missing easy items and/or low-ability individuals are answering difficult items correctly; i.e., the estimation of the individuals' ability level fails to predict which items they will answer correctly or incorrectly. Thus, even though this instrument can be used in research, it is recommended to review the item scoring criteria and the number of items and strategies that may improve the precision of Factor 2.
The analysis of both factors' descriptive statistics shows little difference between the means of the participants' ability and the items' difficulty, with the items being slightly easier for respondents. This discrepancy was slightly higher for Factor 2, although this finding generally shows that the difficulty of items was compatible with the respondents' ability, thus avoiding floor or ceiling effects.

The analysis of the items map shows that items in both factors are distributed on an extensive range of the ability continuum. There are a reasonable number of items in the central area, where most of the scores are also concentrated. In addition, there are items to discriminate different levels of abilities, precisely where most of the scores are, but there are also items in the extreme regions. Nonetheless, there are gaps (lack of items) in some areas of abilities in both factors, suggesting the need to develop more items to fill them in.
The latent trait analysis shows that the items tend to become more difficult as the complexity of the situation-response increases. For instance, in Factor 1, it is more challenging to identify the strategy "gently say that you do not like the food when having dinner in a friend's home than saying you felt disgusted" as effective. Complementarily, in Factor 2, it is even more challenging to identify the strategy "deviate from your path to avoid getting close to someone you encountered naked downtown" as ineffective. In both cases, the most efficient emotion regulation strategy would be to approach the individuals to say the food was not good (Factor 1) and to seek a way to help the naked person exposed to an embarrassing situation (Factor 2). Both are difficult items, and people tend to answer them wrongly because they choose avoidance as the strategy to be used. Similar results were found by Allen et al. (2015) among Australian college students. They found that situation modification was the strategy most frequently used by individuals who scored high on emotion regulation. On the other hand, situation selection strategies (especially avoidance) were less likely to be chosen by them. The authors highlight the relevance of an instrument that enables this distinction for clinical practice and teaching emotion regulation strategies. Koole et al. (2015) noted that choosing an ER strategy is not an easy task, considering that people can choose from many different strategies. Additionally, the adaptability of strategies is not fixed but varies according to circumstances, which makes the task even more challenging. For this reason, specific strategies may be more appropriate to achieve certain types of objectives than others, suggesting that adaptive ER involves using the correct strategy to achieve objectives in a given context (English et al., 2017).
Note that the situations proposed in the ERT that involve an important objective for the person, such as a job situation or a desire to keep a friendship, tend to make it more challenging to identify an effective/adaptive strategy (items 19, 22, and 23 in Table 2). However, as the level of complexity diminishes (because the situation does not involve a relationship, or the relationship is with unfamiliar individuals), the difficulty in detecting effective responses also decreases.
One study that identified the frequency with which individuals use ER strategies in everyday life and in their emotional experience shows that strategies differed depending on who was involved and why the person was trying to regulate her/his emotions. Thus, the selection of strategies varied due to differences in the events rather than in the individuals, in terms of stable personality factors (English et al., 2017). In addition, in many everyday situations, people deal with dynamic and unpredictable environments, making it challenging to deliberate on how they will regulate their emotions (Koole et al., 2015).
In many cases, ER is not implemented only to modulate mood, but is also motivated by instrumental objectives such as finishing a task or avoiding conflicts. Sometimes, people even regulate their emotions using counter-hedonic strategies such as keeping or increasing negative emotions or decreasing positive emotions, intending to achieve an instrumental objective (English et al., 2017). For this reason, it is essential to understand more clearly how individuals draw on the wide range of strategies available to regulate their emotions based on their current situational demands. Hence, as Aldao and Nolen-Hoeksema (2012) suggested, more variable and positive implementation of effective strategies can be a function of a more flexible assessment of the contextual variation.
Similarly, it was observed in the ERT (for instance, items 18 and 10 in Table 4) that difficulty in detecting an ineffective/maladaptive response increases in situations in which a culturally acceptable response is considered ineffective for not considering the social aspects of a situation, especially the other people involved. As previously shown, the social characteristics of the context can play an important role in determining the strategy people adopt to regulate their emotions in everyday life (English et al., 2017).
This study was intended to investigate the psychometric properties of the ERT with the support of the Rasch model, from the IRT. The results present important contributions, although specific characteristics of the sample may limit the generalization of data. For instance, the sample is predominantly composed of women with a high educational level from the state of Pernambuco, Brazil. Hence, there is a need to investigate whether the results obtained here would be the same if more diversified and balanced samples in terms of population representativeness were adopted.
The results obtained in this study indicate that the development of the ERT should focus on two main aspects: improving the instrument's precision and filling in the gaps in the continuum of abilities assessed by the test. Even though the instrument requires these aspects to be improved, its psychometric properties qualify it to be used in studies demanding the assessment of effective and non-effective strategies to regulate emotions. Additionally, the latent trait analysis supports the search for achieving these objectives and contributes to acquiring a theoretical and developmental understanding of the ability to regulate emotions.

For this reason, Aldao and Nolen-Hoeksema (2012) state that interventions should focus on helping individuals develop an awareness of the characteristics of contexts that influence the use of strategy to regulate emotion and learn to implement strategies in a flexible manner that is appropriate to each context. Knowing how emotion regulation items increase or decrease the level of difficulty supports reflecting and devising interventions with implications for both the practice of psychologists and research in psychological assessment. In terms of professional practice, this finding suggests how the ability to regulate emotions is used, enabling psychologists to intervene and encourage the selection and implementation of more effective strategies and the ability to detect ineffective ER strategies. From the perspective of psychological assessment, this result shows how to interfere in the difficulty of items, facilitating the development of new items to fill in the gaps found in some areas of the ability continuum.

Table 1
Descriptive Statistics by Factor 1 items

In Table 2, items 19 and 23 compose the block with the most challenging items; items 7, 14, 17, and 22 compose the block of intermediate difficulty; and items 3, 4, 12, and 24 compose the block with the easiest items. Note that the items' difficulty level increases according to the level of threat or pressure a given situation exerts on an individual, making it more challenging to detect the most effective strategy.

Table 2
Factor 1 Latent Trait Analysis


Table 4
Factor 2 Latent Trait Analysis