4 Study 1: Reliability, Validity, and Fakability
of a Shyness IAP and a Shyness IAT

4.1  Introduction

Recently, Asendorpf et al. (2002) adapted an Implicit Association Test (IAT; Greenwald et al., 1998) to assess the implicit personality self-concept of shyness. They showed that the shyness IAT (a) reliably assesses individual differences that (b) are partly independent of traditional direct self-ratings and (c) significantly improve the prediction of spontaneous behavior in a realistic social situation. In their Study 1, a total of 139 participants were observed in a naturalistic lab situation that induced shyness, and completed the shyness IAT and direct self-ratings of shyness. The IAT correlated moderately with the direct self-ratings and uniquely predicted spontaneous (but not controlled) shy behavior, whereas the direct ratings uniquely predicted controlled (but not spontaneous) shy behavior (double dissociation).

The robustness of the IAT against faking was investigated in Asendorpf et al.'s Study 2 by experimentally varying participants' self-presentation as non-shy. A control group of 18 females participated in a shyness-inducing role play, allegedly to study social perception. Their shyness IAT scores, direct self-ratings of shyness, observer-judged shyness, and coded behaviors were contrasted with those of an experimental group of 23 females who completed the same procedures, except that the procedures were presented as part of a simulated job application and the participants were instructed to act non-shy in order to "get the job". As expected, the direct self-ratings and the controlled shy behaviors were much lower in the experimental group, whereas the shyness IAT scores and the spontaneous shy behaviors were not lower.

The present study was an attempt to replicate the results of Asendorpf et al.'s (2002) Study 2 with a much larger, sex-balanced sample and to extend this approach in four directions. First, the study attempted to replicate the findings for the shyness IAT with a new, different indirect procedure. Second, the study explored dissociations between direct and indirect measures of shyness under faking instructions with regard not only to the group means but also to the correlates of these measures. Third, the effects of faking on observer judgments of shyness were explored. Fourth, the state dependence of the indirect measures was examined by contrasting both their mean levels and their correlates between participants who completed them either before or after the shyness-inducing role play. The next sections discuss each research question in more detail.

4.1.1  Research Question 1: A New Indirect Assessment Procedure

A new measurement tool, the Implicit Association Procedure (IAP), was employed in the present study in order to estimate the convergent validity of the IAT and a different indirect measure of the implicit personality self-concept of shyness. The shyness IAP was pre-tested in two pilot studies (see Chapter 3). The final IAP variant showed good internal consistency, correlated highly with the IAT, and, like the IAT, correlated moderately with direct shyness self-ratings. The main difference from the IAT is that the IAP induces automatic movement tendencies: the response itself carries a valence because it triggers approach (pulling the joystick toward oneself) or avoidance (pushing the joystick away from oneself) behavior. The detailed procedure of the IAP is described in Chapter 3.

4.1.2 Research Question 2: Dissociations of Indirect and Direct Measures Under Faking

Job applicants produce more socially desirable self-descriptions than research participants under most conditions (see, e.g., Ones & Viswesvaran, 1998; Rosse, Stecher, Miller, & Levin, 1998). Similarly, laboratory experiments have shown that making one's self-descriptions public and giving fake-good instructions both increase the social desirability of participants' self-descriptions (Paulhus, 1984). These situational effects on the mean social desirability of self-descriptions are commonly interpreted as a threat to the validity of these descriptions. It has less often been noted, however, that such mean effects do not necessarily imply a lower validity of the interindividual differences in the self-descriptions. If all individuals fake good to the same extent, the rank order of the individuals, and hence the validity of the self-descriptions, is perfectly preserved. Only if different individuals fake to different degrees (differential faking) is the validity threatened. There is good evidence for substantial differential faking both in job application and in research settings (Ones & Viswesvaran, 1998; Paulhus, 1984; Rosse et al., 1998).
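The distinction between uniform and differential faking can be illustrated with a small numerical sketch. All values below are hypothetical and serve only to show the logic: a constant shift in self-ratings leaves their correlation with a criterion untouched, whereas person-specific shifts change it.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data for six persons: a shyness criterion and honest
# shyness self-ratings on a 1-7 scale (not taken from the study).
criterion = [2.0, 3.0, 3.5, 4.5, 5.0, 6.0]
honest = [2.5, 2.0, 4.0, 4.0, 5.5, 6.0]

# Uniform faking: everyone lowers the self-rating by the same amount.
uniform = [s - 1.5 for s in honest]

# Differential faking: each person lowers the self-rating by a different amount.
shifts = [0.5, 0.0, 2.0, 1.5, 4.0, 3.0]
differential = [s - d for s, d in zip(honest, shifts)]

r_honest = pearson_r(honest, criterion)
r_uniform = pearson_r(uniform, criterion)
r_differential = pearson_r(differential, criterion)
print(round(r_honest, 2), round(r_uniform, 2), round(r_differential, 2))
```

In this toy example the mean self-rating drops under both kinds of faking, but only the differential variant lowers the correlation with the criterion.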

The present study investigated both the main effect of faking good and the effect of differential faking on indirect and direct measures. Faking was studied by contrasting these measures between an experimental group that was instructed to appear non-shy, and a control group that was instructed to act naturally. The between-group difference in the means informs us about the general faking susceptibility of the indirect versus direct measures. In contrast, the between-group differences in particular correlates of the indirect versus direct measures can be informative about the amount of differential faking.

According to the findings by Asendorpf et al. (2002), Study 1, a moderate correlation close to .40 is expected between the indirect and direct measures of shyness in the control group. To the extent that differential faking occurs, and affects only the direct self-ratings, the direct–indirect correlation should become much smaller in the experimental group. Furthermore, direct shyness is expected to correlate in the control group somewhat negatively with social desirability tendencies because shyness is a somewhat undesirable personality trait (e.g., Jones, Briggs, & Smith, 1986). To the extent that differential faking occurs, this negative correlation should become much stronger in the experimental group because the more participants fake good, the higher will be their social desirability score, and the lower their shyness score. Such a between-group difference is not expected for the correlations between the indirect measures and social desirability tendencies. These correlations should be low in both groups.
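One common way to test whether a correlation differs between two independent groups is Fisher's z transformation. The sketch below is an illustration only: the source does not state which test was used for the between-group comparisons, the value r = .10 for the faking group is an assumed example, and the function name is hypothetical. The group sizes (60 and 240) and the expected control correlation of .40 are taken from the text.

```python
from math import atanh, sqrt
from statistics import NormalDist

def compare_independent_correlations(r1, n1, r2, n2):
    """Fisher z test for the difference between two independent correlations.

    Returns the z statistic and the one-tailed p value for r1 > r2.
    """
    z1, z2 = atanh(r1), atanh(r2)            # Fisher z transformation
    se = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))   # standard error of z1 - z2
    z = (z1 - z2) / se
    p_one_tailed = 1 - NormalDist().cdf(z)
    return z, p_one_tailed

# Expected direct-indirect correlation of .40 in the control group (n = 60)
# versus an assumed .10 in the faking group (n = 240).
z, p = compare_independent_correlations(0.40, 60, 0.10, 240)
print(f"z = {z:.2f}, one-tailed p = {p:.3f}")
```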

Finally, it was expected that the double dissociation between indirect and direct measures with regard to spontaneous versus controlled behavior reported by Asendorpf et al. (2002), Study 1, would be found not only in the control group but also in the experimental group because the direct self-ratings would be less predictive of spontaneous shy behavior and the indirect measures would be less predictive of controlled behavior.

4.1.3 Research Question 3: Validity of Observer Judgments

In Asendorpf et al.'s (2002) Study 1, the observer judgments of shyness correlated .58 with the controlled shy behavior and .48 with the direct self-ratings, but only .35 with the spontaneous shy behavior and .31 with the IAT. Thus, they seem to reflect more strongly controlled behavior. However, the participants in this study were not particularly motivated to control expressions of shyness. Participants who were instructed to fake non-shyness in Study 2 received only slightly lower shyness ratings by observers of their social interaction despite the fact that they talked much more. It is not clear from this pattern of correlational and mean effects what one should expect for differential faking.

It could be that the observers are strongly influenced by participants' self-presentation as non-shy in the role play; in this case, the strong correlation with the direct self-ratings would be preserved, and the lower correlation with the indirect measures would decrease even more because these measures are less susceptible to faking. Such a pattern would suggest that the validity of the observer judgments for participants' true shyness is undermined by the participants' self-presentation in the role play. However, because behavior in role play situations can be faked less easily than answers in a questionnaire, it seems more likely that the participants' true shyness in the role play remains largely apparent to the observers. In this case, the direct shyness-observer correlation should decrease, and the indirect shyness-observer correlation should be less affected. Thus, the difference between the faking-induced decreases in the self-observer correlations for direct versus indirect measures informs us about the validity of the observer judgments.

4.1.4 Research Question 4: State Influences on the Indirect Measures

Research in Spielberger's state-trait anxiety tradition suggests stability of trait anxiety and increase of state anxiety when assessed immediately after anxiety induction (Spielberger, Gorsuch, & Lushene, 1970; Spielberger, Auerbach, Wadsworth, Dunn, & Taulbee, 1973). In line with these results, investigations with the German version of the Positive and Negative Affect Schedule (PANAS; Watson, Clark, & Tellegen, 1988) revealed that state affect is a better predictor for affect report regarding closer and shorter periods whereas trait affect is a better predictor for affect report regarding more prolonged periods (Krohne, Egloff, Kohlmann, & Tausch, 1996).

Recently, a study by Schmukle and Egloff (2003) provided evidence that an anxiety IAT was, in contrast to direct state anxiety measures, not influenced by an anxiety induction. Thus, whereas situational or contextual effects on implicit prejudice and stereotypes were demonstrated in several recent studies (Blair, Ma, & Lenton, 2001; Dasgupta & Greenwald, 2001; Lowery, Hardin, & Sinclair, 2001; Rudman, Ashmore, & Gary, 2001; Wittenbrink, Judd, & Park, 2001), state influences on implicit personality self-concept measures have not yet been shown. In order to make sure that the IAT and the new IAP reflect interindividual differences in the enduring self-concept rather than in fluctuating affective states, it is important to show empirically that state influences are negligible.

This is particularly important because earlier studies have consistently found that the retest or parallel test reliability of IATs is lower than their internal consistency (Asendorpf et al., 2002; Bosson et al., 2000; Dasgupta & Greenwald, 2001; Egloff & Schmukle, 2002; Greenwald & Farnham, 2000). This lower retest reliability could be due to differential learning effects that occur between test and retest (e.g., some participants develop a more efficient cognitive strategy for the more difficult part of the IAT, where they must associate self with incompatible attributes, whereas others do not develop such a strategy). Alternatively, the lower retest reliability could be due to effects of state changes between the two tests (e.g., when a first shyness IAT is assessed immediately after a shyness-inducing situation and the retest 20 minutes later, the retest correlation could be lowered by the fact that the first IAT was influenced by the actual shyness experienced in the immediately preceding situation, whereas the retest reflected more one's enduring self-concept of shyness). This latter interpretation could be ruled out if it could be shown that the IAT is unaffected by state changes. The robustness of both indirect measures was studied with regard to their mean level and their correlates by contrasting them between participants who completed the indirect measure before or after the shyness-inducing role play.

4.2 Design of the Present Study

In order to answer these four research questions, the design of Asendorpf et al.'s (2002) Study 2 was extended in two main respects. First, the new IAP was included in addition to the IAT. Second, both females and males were included, and the sample was much larger so that significant differences between correlations could be detected. Statistical power considerations suggest that detecting significant between-group differences in correlations of approximately .30 with one-tailed tests and a power of .80 (Cohen, 1988) requires n = 120 in each group. Because I wanted to experimentally vary both participants' faking tendency and the position of the two indirect tests (before/after the role play situation), a complete between-subjects design would include 2 (faking) x 2 (position) x 2 (indirect procedure) x 120 = 960 participants.
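The power consideration can be sketched with the standard normal approximation for comparing two independent correlations after Fisher's z transformation. This is not necessarily the calculation behind the figure above: the source cites Cohen (1988) without giving the inputs, so the pair of correlations (.40 versus .10, a difference of .30) and the one-tailed alpha of .05 are assumptions, and `n_per_group` is a hypothetical helper.

```python
from math import atanh, ceil
from statistics import NormalDist

def n_per_group(r1, r2, alpha=0.05, power=0.80):
    """Approximate n per group needed to detect r1 != r2 (one-tailed test).

    Uses Var(z1 - z2) = 2 / (n - 3) for two groups of equal size n.
    """
    q = abs(atanh(r1) - atanh(r2))             # difference on the Fisher z scale
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # one-tailed critical value
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) / q) ** 2 + 3)

# Assumed example: r = .40 in one group versus r = .10 in the other.
print(n_per_group(0.40, 0.10))
```

Under these assumptions the approximation yields a group size close to the n = 120 stated in the text. Other pairs of correlations with the same raw difference give somewhat different values, because the Fisher z scale is nonlinear.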

To avoid such an unrealistically large study, I (a) restricted the analysis of the position effect to the faking condition which thus required 240 participants, (b) chose only 60 participants for the control group which still provided sufficiently reliable correlations within this condition and a sufficient power for the faking effects, and (c) had each participant complete one indirect procedure before and the other indirect procedure after the role play, with a between-participant variation of the order of the tests, because I assumed that there would be only minimal transfer effects between different procedures. In this way the total sample size was reduced to 2 (position) x 120 + 60 = 300.
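The sample-size arithmetic of the full and the reduced design can be retraced as follows. The numbers are those given in the text; the variable names are illustrative only.

```python
N_PER_CELL = 120  # group size required by the power considerations

# Complete between-subjects design:
# 2 (faking) x 2 (position) x 2 (indirect procedure) cells.
full_design = 2 * 2 * 2 * N_PER_CELL

# Reduced design: position is varied only within the faking condition, and
# each participant completes both indirect procedures (one before and one
# after the role play), so two test orders of 120 participants suffice.
faking_group = 2 * N_PER_CELL
control_group = 60
reduced_design = faking_group + control_group

print(full_design, reduced_design)  # 960 300
```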

Additionally, I included two social desirability scales to study the effects of faking on the responses to these two scales, and interviewed the participants in the faking condition about possible faking strategies in the indirect procedures.

4.3 Hypotheses

Study 1 tested the following hypotheses.

Hypothesis 1 (Main faking effects). Under faking, the social desirability scores increase, and the direct self-ratings of shyness and the controlled shy behaviors decrease. In contrast, the spontaneous shy behaviors and the two indirect measures are unaffected by the faking instruction, replicating Asendorpf et al. (2002), Study 2.

Hypothesis 2 (Main position effects). Whether the indirect tests are completed before or immediately after the shyness-inducing role play has no effects on their mean level.

Hypothesis 3 (Differential coherence). The indirect and explicit self-concept measures are less strongly correlated in the faking condition than in the control condition.

Hypothesis 4 (Differential relation to social desirability). The direct self-ratings correlate more negatively with the social desirability scores in the faking condition than in the control condition. In contrast, neither indirect measure correlates with the social desirability scores in either experimental condition.

Hypothesis 5 (Robustness of observer judgments to differential faking). Under faking, the correlation of the observer judgments of shyness with the direct self-ratings decreases more strongly than their correlations with the two indirect procedures.

Hypothesis 6 (Double Dissociation). Both indirect procedures uniquely predict spontaneous (but not controlled) shy behavior whereas the direct self-ratings uniquely predict controlled (but not spontaneous) shy behavior when the alternative predictor is statistically controlled, replicating Asendorpf et al., 2002, Study 1.
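What "unique prediction when the alternative predictor is statistically controlled" means can be illustrated with the closed-form standardized regression weights for two predictors. The correlations below are hypothetical and are not results of the present study; the sketch only shows how a zero-order correlation can shrink to nearly zero once the other predictor is controlled.

```python
def standardized_betas(r_y1, r_y2, r_12):
    """Standardized regression weights for a criterion y and two predictors.

    r_y1, r_y2: correlations of y with predictors 1 and 2.
    r_12: correlation between the two predictors.
    """
    denom = 1 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2

# Hypothetical values: spontaneous shy behavior correlates .35 with an
# indirect measure and .15 with the direct self-rating; the two measures
# correlate .40 with each other.
beta_indirect, beta_direct = standardized_betas(0.35, 0.15, 0.40)
print(round(beta_indirect, 2), round(beta_direct, 2))
```

In this made-up case only the indirect measure retains a substantial weight. This is the pattern Hypothesis 6 predicts for spontaneous behavior, with the reverse pattern expected for controlled behavior.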

4.4 Methods

4.4.1  Participants

Participants were 300 university students (150 female, 150 male; age M = 24.5 years, range 20-34 years), none of whom were psychology students or had participated in the lab's earlier studies. All participants reported being native speakers of German. Most participants were approached on the campus of Humboldt University, Berlin. The remainder were recruited through postings in university buildings.

Following Asendorpf et al.'s (2002) Study 2 procedure, participants were asked to participate in either "a job application procedure" (faking condition, n = 240, 120 of either sex) or "a study on social perception" (control condition, n = 60, 30 of either sex). In the first case, they were motivated for participation by informing them that the study included a simulated job assessment center and video feedback on their performance. In addition, they were offered DM 20 (approximately US $ 10) for their cooperation in the 1.5 hour study. In the second case, they were motivated by informing them that they would receive individual feedback on their results after the study. In addition, they were offered DM 15 (approximately US $ 7.5) for their cooperation in the 1 hour study.

4.4.2 Assessments and Measures

Overall procedure and design. The overall procedure and design of Study 1 are shown in Table 8. All participants (a) completed an indirect shyness test (either IAT or IAP), (b) judged themselves on bipolar personality-describing items, (c) were videotaped in a shyness-inducing role play, (d) completed a different indirect shyness procedure (IAP or IAT), (e) judged themselves on other sets of personality items, (f) completed a retest of (d), and (h) were interviewed about the indirect tests. Participants in the assessment center condition additionally (g) judged themselves on the personality items of step (e) under an honesty instruction and (i) received video feedback on their performance in the role play from the role play partner. The shyness items were identical for both indirect procedures and were included as direct self-ratings in steps (b), (e), and (g). Thus, the first indirect test was completed before the direct ratings. This excluded possible transfer effects from the direct to the indirect measures. The direct shyness ratings, the IAT, the instructions for the faking and control group, and the role play were identical to the procedures in Asendorpf et al.'s (2002) Study 2.

As can be seen in Table 8, there were two between-subject variations: faking instruction and position of the two indirect tests. Consistent with their invitation, participants received either the faking instruction (assessment center group) or the honesty instruction (social perception group). Invitations were scheduled such that approximately every fifth participant was in the social perception group. Within each group, half of the participants completed first the IAT and later the IAP and IAP retest; the other half completed first the IAP and later the IAT and IAT retest. Assignment to the two orders alternated between successive participants. Finally, the participants were thanked, asked for permission to analyze the videotapes (all gave permission), and promised individual feedback about their results (only participants in the social perception condition). Four months after the study was finished, all participants received a letter explaining the procedures and general findings of the study, and the control participants were invited to a feedback session where they were informed about their individual results.

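Table 8
Overall Procedure and Design of Study 1

| Step | Assessment | Assessment center cover story (faking instruction): IAT first | Assessment center cover story (faking instruction): IAP first | Social perception cover story (honesty instruction): IAT first | Social perception cover story (honesty instruction): IAP first | Duration (min) |
|---|---|---|---|---|---|---|
| (a) | Indirect shyness test | IAT | IAP | IAT | IAP | 10 |
| (b) | Bipolar self-ratings: shyness, irritability, conscientiousness, intellect | yes | yes | yes | yes | 5 |
| (c) | Behavior observation: shyness-inducing role play | yes | yes | yes | yes | 5 |
| (d) | Different indirect shyness test | IAP | IAT | IAP | IAT | 10 |
| (e) | Direct self-ratings: self-monitoring scale; bipolar items for shyness and irritability; social desirability scales | yes | yes | yes | yes | 12 |
| (f) | Retest of (d) | IAP | IAT | IAP | IAT | 10 |
| (g) | Retest of (e) under honesty instruction | yes | yes | - | - | 12 |
| (h) | Interview about indirect tests: problems with IAT or IAP; answer or faking strategies | yes | yes | yes | yes | 7 |
| (i) | Video feedback on role play performance | yes | yes | - | - | 20 |
| n | | 120 | 120 | 30 | 30 | ~1.5 h / 1 h |

Note. IAT = Implicit Association Test, IAP = Implicit Association Procedure. "yes" = step administered, "-" = step not administered. The total duration in the last row refers to the assessment center / social perception condition.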

Instructions. All instructions were identical to those of Asendorpf et al.'s (2002) Study 2. Upon arrival at the lab, the participants in the assessment center condition received the following instruction: "The following assessment center assesses your ability to present yourself as successfully as possible for a position in a company that you are very interested in. An important part of your future job is to present the company as successfully as possible in interactions with new clients. Therefore, you must be able to warm up strangers quickly and to avoid insecure behavior because such insecurity could easily make an unprofessional impression." Then, the experimenter explained the different steps of the assessment center and stressed that, in order to get the job, the participant should make a favorable impression in all parts of the assessment, including both the role play and the personality tests. The instruction "Please do not forget to present yourself in a way that you get the job." was repeated before each set of direct ratings and each indirect test.

The participants in the social perception condition were informed that they would participate in a study on social perception, that is, "how you perceive yourself and how others perceive you". After explaining the different steps of the experiment, the experimenter continued "Please describe yourself in all personality tests as honestly and realistically as possible and act in the role play simply as you would do in real life". The instruction “Please do not forget to present yourself as honestly and realistically as possible.” was repeated before each set of direct ratings and each indirect test.

Role play instructions. Before participants of the assessment center condition were shown into the observation room, they were reminded that "it is very important for getting the job that you show in the role play that you can easily and openly approach strangers". In the control condition, the participants were informed that "the role play is informative about particular personality characteristics" and that they would be evaluated by their role play partner after the role play. All participants were informed that the role play would be recorded by two cameras. Then, the role play situation was described: "You are an employee in a company. In your company, the boss will be replaced by a new one. This new boss, your future boss, was supposed to meet the present boss now, but unfortunately the present boss is still in another meeting for about 10 minutes. You have been asked to fill in for these 10 minutes and to make the situation as comfortable for your future boss as possible." In the assessment center condition, this instruction was continued: "You should present yourself as favorably as possible. Have in mind that your role play partner will be your future boss." In the control condition, the instruction was continued differently: "Act in the role play just as you would do in real life."

Role play. The role play situation was identical for all participants. In the observation room, an older-looking, unfamiliar, opposite-sex, advanced psychology student, dressed in a business suit, was sitting at a low table. The participant was asked to take a seat on a chair that was placed at a 90° angle to the confederate's chair. The confederate was blind to the experimental condition. S/he was trained to play the role of the future boss described in the instruction. The confederate was instructed to act slightly indignant at the delay of the meeting with the present boss and to slightly patronize the participant. This procedure was designed to induce shyness by (a) the unfamiliarity and (b) the status difference of the boss, (c) the assumed evaluation by the boss, (d) the opposite sex of the boss, and (e) the videotaping.

The role play was videotaped with two cameras that were operated from another room using S-VHS recorders. One camera filmed the participant and the confederate from a 45° angle. These tapes were used for behavioral analyses. A second camera pointed directly at the participant and recorded a zoomed-in view of the participant's face. When participants interrupted the role play (e.g., by talking about the role play or walking around), the confederate tried to get them back into the role play as quickly as possible. The time period until the role play was continued was defined as missing. For the judgments and codings of shy behavior, secondary tapes were prepared that contained the first three minutes of uninterrupted role play of each participant.

Implicit Association Test (IAT) and Implicit Association Procedure (IAP). The same procedures as in Pilot Study 2 were used. Following Greenwald et al. (1998), the data were modified in three respects for data reduction. First, latencies below 300 ms were recoded as 300 ms, and IAT latencies above 3000 ms were recoded as 3000 ms. Given that in the IAP the presentation of the stimulus stopped after 3000 ms, there were no IAP response latencies longer than that. Second, the first two responses in the combined tasks were not analyzed. Third, calculations of the internal consistencies and the test scores were based on log-transformed latencies to correct for the skewed latency distribution. However, for presentation purposes, descriptive statistics of the IAT and the IAP are reported in milliseconds.
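The three data reduction steps can be sketched as follows. The cutoffs and the number of discarded responses come from the text; the function names, the example latencies, and in particular the direction of the difference score are assumptions made for illustration, since the scoring direction is not specified in this section.

```python
from math import log
from statistics import mean

LOWER_MS, UPPER_MS = 300, 3000  # recoding limits for latencies
SKIPPED_RESPONSES = 2           # first two responses of a combined task

def reduce_block(latencies_ms):
    """Apply the three reduction steps to one combined task."""
    kept = latencies_ms[SKIPPED_RESPONSES:]                         # drop first two
    recoded = [min(max(t, LOWER_MS), UPPER_MS) for t in kept]       # recode extremes
    return [log(t) for t in recoded]                                # log-transform

def difference_score(block_a_ms, block_b_ms):
    """Mean log latency of block A minus that of block B.

    Which combined task plays the role of A or B is an assumption here.
    """
    return mean(reduce_block(block_a_ms)) - mean(reduce_block(block_b_ms))

# Hypothetical latencies (ms) for two short combined tasks.
block_a = [1200, 950, 640, 250, 710, 3400, 820]
block_b = [1100, 900, 760, 830, 690, 910, 1020]
print(round(difference_score(block_a, block_b), 3))
```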

This data reduction procedure was identical to that used by Asendorpf et al. (2002). To maximize comparability between both studies I do not report results for the improved scoring algorithm that Greenwald, Nosek, and Banaji (2003) suggested recently. However, I analyzed both the present data and the Asendorpf et al. (2002) data with this new procedure but found only minimal changes (differences in correlations below .02). The main reason for the minimal between-procedure difference seems to be that the Asendorpf et al. (2002) procedure already included a major feature of the Greenwald et al. (2003) procedure, namely inclusion of the practice trials for the combined tasks into the analyses. The gain in internal consistency and validity due to this variation from the original procedure used by Greenwald et al. (1998) was larger than the gain due to the remaining features of the Greenwald et al. (2003) procedure.

Direct self-ratings. Concerning bipolar self-ratings in step (b), the same 10 shyness items as in Pilot Study 1 were used. These items were mixed with 30 conscientiousness, intellect, and irritability items in a fixed random order. In order to minimize transfer effects from the preceding indirect test, the shyness items occurred only among the last 20 items. Self-ratings in step (e) started with a 32-item self-monitoring scale that should again minimize transfer effects from the preceding indirect test and was not analyzed for the purpose of the present study. The scale was followed by the 10 shyness and irritability items of step (b) and concluded with the social desirability scales of Pilot Study 1. The reliability of the direct self-ratings was separately calculated for the assessment center and the social perception condition and was above α = .84 in each case.

Interview about the indirect procedures. All participants were interviewed by the experimenter about (a) problems with the IAT or IAP, and (b) whether they used particular strategies during the IAT or IAP in order to decrease error rate, increase speed, or make a favorable impression.

Judgments of shy behavior. Four student judges who were blind to the experimental condition independently rated their overall impression of the participants' shyness. Each minute of the 3-minute secondary tapes was separately rated on a 7-point scale ranging from 7 = "shy" to 1 = "not shy". Beforehand, the judgments were anchored by two examples of extremely shy and extremely nonshy participants from Asendorpf et al.'s (2002) Study 1. For each participant, the 12 ratings were averaged. The reliability (interjudge agreement) was above α = .92 for both conditions.

Codings of shy behavior. Codings were done on a PC using the Computer Aided Observation System (CAOS) software. This program synchronizes video player and PC and registers the onset and offset of behavioral codings when the appropriate key is pressed. Codings were carried out for speech duration, body movements, and tenseness of body posture. Following Ekman and Friesen's (1972) classification, body movements were coded as illustrators (movements illustrating speech), facial adaptors (self-stimulations of the face), and body adaptors (self-stimulations of the body). For data analysis, body movements and speech duration were expressed as their relative duration within the 3-minute observation time. For statistical analyses, body movement codings were log-transformed to correct for the skewed distribution. Tenseness of body posture was defined as deviation from a normally relaxed body posture and was coded on a 3-point scale as normal, slight, or strong tension. Using the weights of 0, 1, and 2, the durations of the three tension categories (in % of observation time) were summed, resulting in scores ranging from 0% to 100%. From these five variables, indices of spontaneous and controlled shy behavior were computed as in Asendorpf et al.'s (2002) studies by aggregating the z-transformed scores of the three spontaneous behaviors (facial adaptor duration, body adaptor duration, and tense body posture) and, separately, the two controlled non-shy behaviors (speech duration and illustrator duration). Coding reliability was checked by independent codings of 45 participants by another coder; the reliability was satisfactory for all five main behavioral indicators, r > .86 in each case.
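The construction of the tension score and of the two behavioral indices can be sketched as follows. Several details are assumptions and are marked as such: the division by the maximum weight (needed to reach the stated 0% to 100% range with weights 0, 1, and 2), the use of the mean of z-scores as the aggregate, and all example values. Whether the controlled index was additionally sign-reversed to express shyness is not stated in the text, so the sketch leaves it as an index of non-shy behavior.

```python
from statistics import mean, stdev

def tension_score(slight_pct, strong_pct):
    """Weighted tension duration with weights 0 (normal), 1 (slight), 2 (strong).

    Dividing by the maximum weight (2) is an assumption that maps the
    weighted sum onto the 0-100% range reported in the text.
    """
    return (1 * slight_pct + 2 * strong_pct) / 2

def z_scores(values):
    """Standardize one variable across participants (sample SD)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

def composite(*variables):
    """Average the z-scores of several variables within each participant."""
    standardized = [z_scores(v) for v in variables]
    return [mean(person) for person in zip(*standardized)]

# Hypothetical values for four participants (body movement variables are
# assumed to be log-transformed already).
facial_adaptors = [0.2, 1.1, 0.0, 1.6]
body_adaptors = [2.9, 3.6, 1.2, 3.8]
tension = [tension_score(40, 10), tension_score(30, 60),
           tension_score(20, 0), tension_score(10, 85)]
speech = [90.0, 60.0, 95.0, 55.0]
illustrators = [2.0, 1.1, 2.3, 0.7]

spontaneous_shy = composite(facial_adaptors, body_adaptors, tension)
controlled_nonshy = composite(speech, illustrators)
print([round(v, 2) for v in spontaneous_shy])
print([round(v, 2) for v in controlled_nonshy])
```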

4.5 Results

The first two sections of the Results section report the main effects of instruction (faking versus social perception) and position (before versus after the role play). Then, the effects of these experimental variations on the correlations between direct, indirect, and behavioral measures are explored.

4.5.1  Effects of Instruction and Position on Indirect and Direct Measures

In this section, the main indirect and direct measures are described, and effects of instruction (main faking effects, Hypothesis 1) and position (before versus after the role play, Hypothesis 2) are analyzed.

IATs. For both IATs, the error rates in the two combined tasks were similar to those in Asendorpf et al. (2002): for the first IAT, M = 5.1%, SD = 3.6%; for the second IAT, M = 4.9%, SD = 3.8%. Inspection of the error distributions indicated three extreme scorers (in the faking condition, 1 participant in the first IAT, 25% error, and 1 in the second IAT, 26% error; in the control condition, 1 in the first IAT, 26% error). All other error rates were below 20%. Therefore, the IAT data of these 3 participants were excluded from all analyses. The distributions of the log-based IAT and IAT retest scores were not even marginally different from a normal distribution, Z < 1. Their overall internal consistency α, calculated across IAT scores that were separately determined for trials 3-20, 21-40, 41-60, and 61-80, was .78 for test and .76 for retest and highly similar for all conditions; in particular, it was not lower in the faking condition (.78 in the faking versus .73 in the control condition for test, and .78 versus .63 for retest, respectively). Thus, internal consistency was acceptable for all conditions, although it was slightly lower than in Asendorpf et al.'s (2002) studies. The retest reliability of the IAT was r = .68 and thus highly similar to the parallel test reliability of .66 reported by Asendorpf et al. (2002).
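An internal consistency of this kind treats the four trial segments as parts of a test and applies Cronbach's α to the segment scores. The sketch below uses the standard formula with hypothetical segment scores for five participants; it does not reproduce the reported coefficients.

```python
from statistics import variance

def cronbach_alpha(parts):
    """Cronbach's alpha for k test parts.

    parts: list of k lists, each holding one score per participant.
    """
    k = len(parts)
    totals = [sum(scores) for scores in zip(*parts)]    # total score per person
    part_variance_sum = sum(variance(p) for p in parts)
    return k / (k - 1) * (1 - part_variance_sum / variance(totals))

# Hypothetical log-based IAT scores per trial segment (five participants).
segment_scores = [
    [-0.20, -0.05, 0.10, 0.00, -0.15],   # trials 3-20
    [-0.15, -0.10, 0.05, 0.05, -0.20],   # trials 21-40
    [-0.25, 0.00, 0.10, -0.05, -0.10],   # trials 41-60
    [-0.10, -0.05, 0.15, 0.00, -0.20],   # trials 61-80
]
alpha = cronbach_alpha(segment_scores)
print(round(alpha, 2))
```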

Table 9
Summary Statistics and Instruction Effect for the Main Variables

Variable (range of scores)          Faking, n = 240 (a)   Control, n = 60 (b)   Instruction effect, df = 298 (c)
                                    M          SD         M          SD         t       p       d
IAT                                 -115 ms    194 ms     -76 ms     169 ms     1.99    .05     .23
IAP                                 -85 ms     134 ms     -62 ms     142 ms     1.27    .21     .15
Bipolar shyness self-rating (1-7)   1.85       0.59       3.58       1.01       17.3    .001    2.00
- before role play                  1.90       0.64       3.62       1.01       16.3    .001    1.89
- after role play                   1.79       0.59       3.54       1.03       17.3    .001    2.00
Social desirability score (0-1)     0.85       0.14       0.48       0.17       17.8    .001    2.06
Observer shyness judgment (1-7)     3.72       1.19       4.11       1.26       2.29    .02     .27
Speech duration (%)                 85.9       26.3       68.9       24.7       4.52    .001    .52
Illustrator duration (%)            6.22       5.85       4.82       5.97       1.70    .10     .20
Facial adaptor duration (%)         3.39       10.4       5.08       11.8       1.22    .22     .14
Body adaptor duration (%)           35.1       39.7       28.6       39.9       1.25    .21     -.14
Tense body position (%) (d)         66.9       29.6       54.1       27.2       3.03    .01     -.35

Note. M and SD refer to raw scores, statistical tests to log-transformed scores in the case of the IAT and IAP latencies and the body movement codings. The effect sizes d were defined such that positive scores indicate less shyness in the faking condition.
(a) n = 239 for IAT and IAP; (b) n = 59 for IAT and IAP.
(c) df = 294 for IAT and IAP, t = √F in case of ANOVAs.
(d) Weighted duration of normal, slight, and strong tension.
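The note to Table 9 states that t = √F for the ANOVA-based tests. The text does not say how the effect sizes d were obtained; the sketch below assumes the common conversion d = 2t/√df, which is my assumption and not a formula given in the chapter, although it reproduces several of the tabled values from the reported t and df. The function names are hypothetical.

```python
from math import sqrt

def t_from_f(f_value):
    """t equivalent of an F test with 1 numerator degree of freedom."""
    return sqrt(f_value)

def d_from_t(t_value, df):
    """Effect size d from t, ASSUMING the conversion d = 2t / sqrt(df)."""
    return 2 * t_value / sqrt(df)

# IAT instruction effect: F(1,294) = 3.97 is reported in the text.
t_iat = t_from_f(3.97)            # close to the tabled t = 1.99
d_iat = d_from_t(1.99, 294)       # close to the tabled d = .23
d_iap = d_from_t(1.27, 294)       # close to the tabled d = .15
d_selfrating = d_from_t(17.3, 298)  # close to the tabled d = 2.00
```

The sign of d in Table 9 is set separately, so that positive values indicate less shyness in the faking condition; the conversion above yields only the magnitude.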

Effects of instruction, position, and their interaction on the IAT means were tested by a 2x2 ANOVA. A significant effect was found only for instruction, F (1,294) = 3.97, p < .05. Table 9 indicates that participants had lower IAT scores in the faking condition than in the control condition. Although the effect size was small, it suggested that some participants might have manipulated the IAT in order to present themselves as nonshy. Therefore, the participants' reports about faking the IAT, given in the interview with the experimenter at the end of the study, were related to their IAT scores. Of the 239 participants in the faking condition, 58 reported attempts to influence the direction of the IAT outcome. In 57 cases, they reported having biased their results by taking the perspective of a nonshy job applicant; one other participant reported having deliberately committed errors. A t test contrasting them with the other 181 participants in the faking condition confirmed the hypothesis that they had lower IAT scores, t (237) = 1.78, p < .05, one-tailed, d = .23. When these 58 participants were excluded from analysis, the remaining participants had only marginally lower IAT scores than those in the control condition, t (238) = 1.44, p < .08, one-tailed, d = .19. In terms of untransformed reaction times, the mean IAT score was -154 ms for fakers, -103 ms for assumed nonfakers, and -76 ms for control participants. Because some of the assumed nonfakers might have tried as hard as the fakers to influence the IAT but did not report it, which would account for the remaining marginal difference, the instruction effect for the IAT seems to be due to the tendency of a minority of the participants to take the perspective of a nonshy job applicant.

IAPs. For both IAPs, the error rates in the combined tasks were similar to those in the IAT (for the first IAP, M = 5.0%, SD = 5.3%; for the second IAP, M = 3.8%, SD = 3.5%). Inspection of the error distributions indicated two clear outliers (in the faking condition, 1 participant in the first IAP, 40% error; in the control condition, 1 in the first IAP, 45% error). These participants did not produce extreme scores in the IAT. All other error rates were below 24%. Therefore, the IAP data of these 2 participants were excluded from all analyses. The distributions of the log-based IAP and IAP retest scores were not even marginally different from a normal distribution, Z < 1. The internal consistency of the two IAPs was evaluated similarly to the IATs by computing Cronbach's α for the separately determined IAP scores for 4 blocks of trials (3-32, 33-64, 65-96, 97-128). The overall internal consistency was .83 for the test and .77 for the retest but was somewhat unsatisfactory in the control group for the retest. In particular, it was .82 in the faking versus .86 in the control condition for the test, and .81 versus .55 for the retest. Nevertheless, internal consistency was fully satisfactory at least for the first test. The retest reliability of the IAP was r = .65 and thus highly similar to the retest reliability of the IAT of .68.

Effects of instruction, position, and their interaction on the IAP means were tested by a 2x2 ANOVA. No significant effects were found. In particular, the instruction effect was not even marginally significant, F (1,294) = 1.61, p = .21. Thus, the IAP tended to be more robust than the IAT with regard to faking. This conclusion was also supported by an analysis of reported faking. Of the 239 participants in the faking condition, 68 reported attempts to influence the IAP outcome. In 64 cases, they reported having taken the perspective of a nonshy job applicant; 4 other participants reported having deliberately committed errors. These figures were slightly higher than for the IAT. However, a t test contrasting them with the other 171 participants in the faking condition did not reveal even marginal differences, t < 1. In terms of untransformed reaction times, the mean IAP score was -91 ms for fakers, -83 ms for assumed nonfakers, and -62 ms for control participants. Although the rank order of these means was identical to that for the IAT, the differences between the means were minimal.

Direct self-ratings. All self-rating scales showed a satisfactory internal consistency, α > .80. Both shyness means in the control condition were not even marginally different from those in Study 1 by Asendorpf et al. (2002), t < 1, which suggests that the sample of the control condition was not differently selected for shyness from the sample of this earlier study. Effects of instruction, position, and their interaction on the shyness self-ratings were tested by a mixed 2x2 ANOVA with instruction as a between-subjects factor and position as a within-subjects factor. A very large instruction effect was found, F (1,298) = 298.9, p < .001. As Table 9 indicates, participants in the faking condition reported shyness that was 2 standard deviations lower than in the control condition. In addition, a moderate position effect was found, F (1,298) = 13.25, p < .001, d = .40 (computed as √2 (M1 - M2)/SD, where SD is the standard deviation of the difference scores; see Cohen, 1988). Participants in both the faking and the control group reported somewhat less shyness after the role play than before (see Table 9). This may be attributed to the mastery of the role play, which probably led participants to consider themselves less shy than before. The position by instruction interaction was not significant, F < 1.

It should be noted that position effects on direct shyness measures were analyzed in a within-subjects design whereas position effects on indirect shyness measures were analyzed between subjects (see the overall design in Table 8). Thus, comparing results for direct and indirect measures is not entirely fair, because the statistical tests had more power for the former than for the latter. However, analysis of means did not indicate any common trend for position effects on indirect shyness measures. In terms of untransformed reaction times, the mean IAT score was -130 ms (SD = 214 ms) before and -85 ms (SD = 159 ms) after the role play. Thus, participants tended to attain higher shyness scores after the role play. The opposite was true for the IAP, -73 ms (SD = 146 ms) before and -88 ms (SD = 125 ms) after the role play. Given the standard deviations, though, none of these differences was significant, and they should be seen as chance variations. Thus, the indirect measures did seem to be more robust against position effects.

Most other direct self-ratings also showed large instruction effects, particularly the social desirability scale, d = 2.06, but also the bipolar self-ratings of conscientiousness, d = 1.34, and intellect, d = 1.23. Thus, the participants in the faking condition showed a strong, generalized tendency to present themselves in socially desirable ways.

4.5.2 Effects of Instruction on Behavioral Shyness Measures

In this section, the judgments and codings of shy behavior are described, and effects of instruction (main faking effects, Hypothesis 1) are analyzed.

Judgments of shy behavior. In the control condition, the mean of the observer judgments of shyness was marginally higher than in Asendorpf et al.’s (2002) Study 1, t (196)  = 1.65, p < .10. Because the observers used a response scale that was anchored with extreme examples from this earlier study, this difference can be attributed to a slightly more successful induction of shyness by the role play. Table 9 indicates that the participants in the faking condition were judged as less shy than those in the control condition but this instruction effect (d = 0.27) was not large compared to the effect for the direct ratings.

Codings of shy behavior. The durations of the 3 types of body movement were skewed and were therefore log(x+1)-transformed. Table 9 indicates that, as expected, the participants in the faking condition talked more and used somewhat more illustrating gestures (significant for a one-tailed test), and thus showed less controlled shy behavior than those in the control condition. In contrast, they did not show less spontaneous shy behavior with regard to facial or body adaptors, and even showed higher body tension, when they were instructed to appear non-shy. This behavioral pattern completely replicates the pattern that Asendorpf et al. (2002) reported for a much smaller, female-only sample. Thus, the participants in the faking condition followed the instruction to present themselves as non-shy in their controlled behavior. However, they failed to suppress spontaneous shy behavior and in part even showed more of it than participants in the control condition.

4.5.3 Correlational Analyses

In the preceding analyses, I explored main effects of instruction (faking versus control) and position (before versus after the role play). In this section, I study differential effects of faking and position, that is, how faking and position affected interindividual differences and their correlates (Hypotheses 3-6).

Position effects. Position effects were explored on the correlations between the implicit and direct self-concept measures, the observer judgments, and the behavior codings, both overall and within the faking and the control group. All position effects were small and not even marginally significant. Although relatively large samples are needed to detect significant differences between correlations, the sample size for each of the two positions in the faking condition was n = 120 and thus sufficient for detecting marginally significant between-correlation differences of approximately .30 or larger with a power of .80 (Cohen, 1988). In particular, no systematic trend was found that the direct or indirect self-concept assessments before the role play were less strongly related to the behavioral observations than the same assessments after the role play. Furthermore, the self-ratings before and after the role play correlated above .83 in both the faking and the control condition, which is close to the reliability of these ratings. Therefore, the two bipolar shyness self-ratings were averaged for each participant, yielding one aggregated index of the explicit self-concept of shyness, and the position of the indirect measure was ignored in the following analyses.
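The power statement above can be illustrated with a rough normal-approximation sketch. It assumes that differences between correlations are tested on Fisher z-transformed values (Cohen's q) and that "marginally significant" means a two-tailed α of .10; both are my assumptions and are not stated in the text, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def detectable_q(n1, n2, alpha_two_tailed=0.10, power=0.80):
    """Smallest difference between two Fisher z-transformed correlations
    (Cohen's q) detectable with the given power, normal approximation."""
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1 - alpha_two_tailed / 2)
    z_power = normal.inv_cdf(power)
    # Standard error of the difference between two independent Fisher z values.
    standard_error = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (z_alpha + z_power) * standard_error

# Two positions with n = 120 each in the faking condition.
q_faking = detectable_q(120, 120)
```

Under these assumptions the detectable difference comes out a little above .30 on the Fisher z scale, which is in line with the "approximately .30" given in the text; for small correlations q is close to the raw difference between the correlations.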

Table 10 indicates that IAT and IAP were moderately correlated in both the faking and the control group and showed highly similar correlations with the other main variables. Thus, all major IAT correlates were replicated with the IAP. Therefore, both IAP and IAT were z-transformed within experimental condition to make their scores comparable, and then averaged, yielding one aggregated index of the implicit self-concept of shyness. The remaining analyses of differential effects (Hypotheses 3 - 6) were restricted to the aggregated measures of the explicit and the implicit self-concept of shyness (lower right-hand side of Table 10). Numerous observations can be made from this part of Table 10.
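The aggregation step can be sketched as follows: scores are standardized separately within each experimental condition and then averaged per participant. The example values are hypothetical, and the use of the sample standard deviation (n - 1) is an assumption, since the text does not specify it.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize scores within one group (sample SD, an assumption)."""
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]

def aggregate_implicit(iat_scores, iap_scores):
    """Mean of z-transformed IAT and IAP scores, participant by participant.

    Both lists must come from the SAME experimental condition and be in the
    same participant order.
    """
    z_iat = z_scores(iat_scores)
    z_iap = z_scores(iap_scores)
    return [(a + b) / 2 for a, b in zip(z_iat, z_iap)]

# Hypothetical log-based scores of five participants in one condition.
iat_faking = [-0.42, -0.10, 0.05, -0.31, 0.12]
iap_faking = [-0.20, -0.05, 0.02, -0.25, 0.08]
implicit_faking = aggregate_implicit(iat_faking, iap_faking)
```

Standardizing within condition removes the mean difference between the faking and the control group from the index, so the aggregated measure carries only interindividual differences within each condition, which is what the correlational hypotheses address.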

First, as expected by Hypothesis 3, the implicit and explicit self-concept measures were significantly less strongly correlated in the faking condition than in the control condition, r = .19 vs. r = .50, z = 2.39, p < .01, one-tailed. The correlation of .50 in the control condition was similar to the correlation of .44 between the indirect and direct measure in Asendorpf et al.'s (2002) Study 1.
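The comparison of the two correlations can be retraced with a small sketch. It assumes that the reported z is the usual test for two independent correlations based on Fisher's r-to-z transformation, which the text does not name explicitly, and it uses the group sizes given in the note to Table 10 (n = 238 and n = 58). The function name is hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist

def compare_independent_correlations(r1, n1, r2, n2):
    """z test for the difference between two correlations from independent
    samples, using Fisher's r-to-z transformation (atanh)."""
    standard_error = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (atanh(r1) - atanh(r2)) / standard_error
    p_one_tailed = 1 - NormalDist().cdf(abs(z))
    return z, p_one_tailed

# Implicit-explicit correlation: control (r = .50, n = 58)
# versus faking (r = .19, n = 238).
z_h3, p_h3 = compare_independent_correlations(0.50, 58, 0.19, 238)
```

Because the correlations entered here are rounded to two decimals, the result agrees with the reported z = 2.39 only up to rounding in the second decimal. The same function can be applied to the other between-condition comparisons of correlations in this section.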

Table 10
Correlations of the Main Variables by Instruction

                            1        2        3        4        5        6        7        8
1. IAT                      -        .50***   .87***   .15*     -.07     .14*     .04      .06
2. IAP                      .44***   -        .87***   .18**    -.09     .10      .04      .03
3. Implicit shyness (a)     .85***   .85***   -        .19**    -.09     .14*     .05      .05
4. Explicit shyness (b)     .35**    .49***   .50***   -        -.48***  .13*     -.01     .06
5. Social desirability      -.13     -.09     -.13     -.17     -        -.08     -.04     -.03
6. Observer judgment        .17      .28*     .27*     .36**    .16      -        .19**    .47***
7. Spontaneous behavior (c) .04      .07      .07      .15      .04      .34**    -        .02
8. Controlled behavior (d)  .10      .02      .07      .18      .05      .70***   .29*     -

Note. Correlations above the diagonal refer to faking condition (n = 238), correlations below the diagonal to control condition (n = 58). * p < .05 ** p < .01 *** p < .001.
(a) Mean of z-transformed IAT and IAP.
(b) Mean of the bipolar shyness self-ratings before and after the role play.
(c) Average of z-transformed duration of facial and body adaptors and tense body posture.
(d) Average of reversed z-transformed duration of speech and illustrators.

Second, as expected by Hypothesis 4, the indirect measure correlated with social desirability neither in the faking nor in the control group, r = -.09 and r = -.13 (see Table 10). In contrast, the direct measure correlated significantly more negatively with the social desirability index in the faking condition than in the control condition, r = -.48 vs. r = -.17, z = 2.41, p < .01, one-tailed. As pointed out in the introduction, this correlational difference confirms the undermining effect of differential self-presentation tendencies on the direct shyness ratings in the faking condition.

Third, as expected by Hypothesis 5, the correlation of the observer judgments of shyness with the direct self-ratings of shyness decreased significantly under faking (from r = .36 to r = .13, z = 1.67, p < .05, one-tailed) whereas the correlation with the indirect measure did not (from r = .27 to r = .14, z = .92, n.s., one-tailed). Although the difference in the decrease of the correlations was not significant, it should be noted that the indirect and the direct measure showed significant and equally strong associations with the observer judgment under faking.

Because the indirect and direct measures were correlated, and to a different degree in the faking and the control group, I analyzed unique contributions of the indirect versus direct measures to the observer judgments, using multiple regression. In the control group, only the direct self-ratings explained significant unique variance of the observer judgments, β = .30, p < .05, whereas the unique contribution of the indirect measures was not significant, β = .11, p = .41. In the faking group, both measures explained similar unique but small variance that was significant for the direct self-rating, β =.13, p < .05, and marginal for the indirect measure, β =.11, p < .10. Thus, whereas the unique contribution of the direct self-ratings tended to be smaller in the faking than in the control group, the unique contribution of the indirect measure was the same in both groups. These findings suggest that the observers were to some extent resistant to participants' differential cheating. That was also indicated by the nonsignificant correlations between the observer judgments of shyness and the social desirability index in both groups (see Table 10).
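For two predictors, the standardized regression weights follow directly from the three pairwise correlations. The sketch below applies this textbook formula to the rounded control-group correlations in Table 10; it is an illustration under that assumption, not the original analysis, and the function name is hypothetical.

```python
def standardized_betas(r_y1, r_y2, r_12):
    """Standardized regression weights for two predictors.

    r_y1, r_y2: correlations of the criterion with predictor 1 and 2
    r_12: correlation between the two predictors
    """
    denominator = 1 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denominator
    beta_2 = (r_y2 - r_y1 * r_12) / denominator
    return beta_1, beta_2

# Control group, criterion = observer judgment (values from Table 10):
# r with explicit shyness = .36, r with implicit shyness = .27,
# r between explicit and implicit shyness = .50.
beta_direct, beta_indirect = standardized_betas(0.36, 0.27, 0.50)
```

The weights obtained this way are close to the reported β = .30 and β = .11 for the control group; small deviations in the second decimal are to be expected because the input correlations are rounded. The formula also shows why a predictor with a sizable zero-order correlation can contribute little unique variance when the two predictors are themselves correlated.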

The correlation of .19 (p < .01) between participants' direct self-ratings and the indirect measure under faking suggests that these self-ratings were not completely invalid for participants' true self-concept. This assumption was supported by a similarly high correlation of .24 (p < .001) between the self-ratings of shyness that were completed under the faking versus the honesty instruction at the end of the experiment. Although this correlation is much lower than the retest correlation of .83 under faking, the rank order of the participants in self-reported shyness was preserved to some extent despite differential faking. Thus, the significant but low correlation of .13 between the observer judgments and the direct self-ratings under faking may reflect this valid portion of the direct self-ratings rather than a faking effect on the observer shyness judgment.

Fourth, the indices of spontaneous and controlled shy behavior were significantly correlated with the observer judgment of shyness in both the faking and the control group. However, contrary to Hypothesis 6, both behavioral indices of shyness were not significantly correlated with the indirect and direct shyness measures in the control condition (both behavioral indices were significantly correlated with the indirect and the direct shyness measures in Asendorpf et al.'s, 2002, Study 1). Thus, although the observers interpreted these two behavioral indices as indicators of shyness, they were in fact unrelated to the self-concept of shyness. This lack of validity applied not only to the aggregated behavioral indices but also to each single behavioral variable. Because of these zero correlations, the expected double dissociation between the indirect and direct measures of shyness was not found for the control situation.

Fifth, given this lack of validity of the behavioral measures for the control condition, it is not surprising that they also lacked validity in the faking condition. Again, the correlations between the indirect and the direct shyness measures and the two behavior composites (and each single behavior within the composites) were not significant. Therefore, the expected double dissociation between the indirect and direct measures of shyness was also not found for the faking condition.

All in all, Hypotheses 3 to 5, which concern the correlations between the implicit and the explicit self-concept of shyness, the social desirability index, and the observer judgments of shyness, were at least marginally confirmed, but Hypothesis 6 was not, because both the spontaneous and the controlled behavioral measures were invalid for the role play situation.

Exploration of alternative behaviors indicating shyness in the role play. The significant correlations between the observer judgments and the implicit and explicit self-concept measures suggested that the observers were aware of interindividual differences in shyness but used different cues than those captured by the a priori defined spontaneous and controlled behavioral measures. Therefore, alternative behavioral measures of the self-concept of shyness were systematically explored, using the videotapes of both the control condition and the Asendorpf et al. (2002) Study 1. As a safeguard against chance findings, given the post hoc nature of these analyses, only those behavioral measures were considered that correlated significantly with the implicit or explicit self-concept of shyness in both the control role play situation and in Asendorpf et al.'s (2002) Study 1. More than a dozen different nonverbal behaviors were explored for this purpose (e.g., body posture, facial cues, vocal cues, a detailed analysis of speech pauses of different length) but not a single behavior was found that survived this test. Thus, it seems that shyness is differently expressed in behavior in the role play situations of the present study than in the more naturalistic interactions with a confederate in Study 1 by Asendorpf et al. (2002). Nevertheless, it was not possible to identify the cues that the observers used for their valid shyness judgments.

4.6 Discussion

4.6.1  Summary of the Main Findings

Study 1 tested six hypotheses on the differential operation of indirect versus direct measures of the personality self-concept under naturalistic faking conditions. The indirect measures were an Implicit Association Test (IAT) and a newly developed Implicit Association Procedure (IAP). I discuss the results separately for each hypothesis, contrast the two indirect procedures with one another, and then briefly discuss general conclusions and open questions for the indirect assessment of personality self-concept.

As expected in Hypothesis 1, the direct self-ratings of shyness, the social desirability scores and the controlled shy behaviors changed in the socially desirable direction under faking; the change was particularly strong for the questionnaire measures (approximately 2 standard deviations). Also in line with this hypothesis, the IAP scores and the spontaneous shy behaviors did not decrease, supporting their non-fakability and replicating Asendorpf et al.'s (2002) Study 2 findings. There was a slight tendency of the IAT scores to decrease under faking, but a more detailed analysis showed that this decrease was restricted to a minority of participants who had spontaneously attempted to vividly imagine themselves as a nonshy job applicant. Comparable effects of mental imagery on IAT scores and priming measures have been reported in studies that experimentally induced mental imagery of counter stereotypes (Blair, Ma, & Lenton, 2001). It should be noted, however, that even for these participants the effect was small (less than a quarter of a standard deviation).

Whether the indirect tests were completed before or immediately after the shyness-inducing role play had no effects on their mean level, confirming Hypothesis 2. In contrast, the direct self-ratings of shyness were lower after the shyness-inducing role play. This may be attributed to the mastery of the role play that decreased the direct shyness self-ratings. The higher robustness of the indirect measures against state effects is important for the interpretation of the indirect measures because they are assumed to refer to a relatively stable self-concept of personality, not to current states (cf. Schmukle & Egloff, 2003).

Turning to the correlational hypotheses, the implicit and explicit self-concept measures were significantly less strongly correlated in the faking condition than in the control condition, which fully confirmed Hypothesis 3. This hypothesis was based on the assumption that the direct self-ratings were less valid in the faking condition than in the control condition because they were distorted by differential tendencies of the participants to present themselves in socially desirable ways.

This assumption was supported by the finding that the indirect measures did not correlate with participants' social desirability scores under both experimental conditions whereas the direct self-ratings correlated more negatively with the social desirability scores in the faking condition than in the control condition. Thus, Hypothesis 4 was fully confirmed. Together, these findings for the effect of faking on the correlations between the indirect measures of shyness, the direct self-ratings of shyness, and social desirability scores strongly support the view that the indirect measures were robust with regard to interindividual differences in faking attempts.

Hypothesis 5 on the validity of the observer judgments of shyness was marginally confirmed. Whereas the observer judgments tended to correlate more strongly with the explicit than with the implicit self-concept of shyness in the control group, these correlations were virtually identical under faking. Moreover, the correlation between the indirect measures and the observer judgment was similar under both experimental conditions. The unique contribution of the direct self-rating under faking to the observer judgments, independent of the contribution of the indirect measure, does not necessarily indicate that the observers were influenced by participants' faking attempts, because there were two indications that participants' true shyness still showed in their direct self-ratings in the faking condition: a significant correlation for the direct self-ratings of shyness between the faking and the honesty condition, and a significant correlation between the direct and indirect measure under faking.

Together, these results suggest that observer judgments of temperamental traits in role play situations are not very much influenced by the role players' self-presentation even when they systematically try to fake the cues that the observers might use for their judgments (in this case, cues for non-shyness such as talking and gesturing). It seems that the observers use other cues that the participants cannot easily control. Unfortunately, it was not possible to identify such cues from the videotaped behavior.

Turning finally to Hypothesis 6 on a double dissociation between indirect and direct measures, the observer judgments of shyness correlated significantly with both the spontaneous and the controlled indices of shy behavior under both experimental conditions. This validated the spontaneous and controlled indices as behavioral measures of shyness. However, these correlations were smaller than in Asendorpf et al.'s (2002) Study 1, and contrary to these prior findings, controlled shy behavior was not significantly correlated with the direct shyness self-ratings, and spontaneous shy behavior was not significantly correlated with the indirect measures of shyness. Thus, although the observers interpreted these two behavioral indices as indicators of shyness, they were in fact unrelated to the self-concept of shyness in the control condition. Therefore, the expected double dissociation between the indirect and direct measures of shyness was not found for the control condition.

Because the mean direct self-ratings of shyness in the control condition were not lower than in Asendorpf et al.'s (2002) Study 1, and the observers rated the participants in the control condition even slightly more shy than the participants in this earlier study, the lack of validity of the behavioral measures cannot be attributed to an insufficient induction of shyness by the role play procedure. Instead, it seems that the role play framework itself, the more structured situation (a clear communication goal was defined), and/or the clear status differences between the participants ("boss" versus "employee") changed the meaning of behaviors that were found to be valid indicators of shyness in an unstructured interaction between strangers.

Given this lack of validity of the behavioral measures for the control condition, it was not surprising that all correlations between the behavioral measures and the indirect and direct measures of the self-concept of shyness were not significant also in the faking condition. Therefore, the expected double dissociation between the indirect and direct measures of shyness was again not found.

Thus, in my view the main problem of the present study did not concern the indirect procedures. Instead, it concerned the fact that behavioral cues that are valid for shyness in more naturalistic situations became completely invalid in a role play context and could not be replaced by alternative valid cues. If the assessment of shy behavior in role play situations is not the focal point, as in the present study, future studies might try to circumvent this problem by motivating participants to fake non-shyness in dyadic interactions of the type used by Asendorpf et al. (2002), Study 1. I did not pursue this approach because I feared that direct instructions to do so would be perceived by the participants as artificial and would therefore insufficiently bias their actual behavior. Alternatively, it might be possible to motivate participants more indirectly to fake non-shyness.

4.6.2  An Alternative Procedure: The IAP

The Implicit Association Procedure (IAP) produced results that were highly similar to those found for the Implicit Association Test (IAT). The error response rate was similar to the IAT, the distribution of the scores was also close to a normal distribution, and the retest correlation was virtually identical. The internal consistency was slightly higher for the IAP which can be attributed to the fact that there were 256 trials in the critical blocks in the IAP, but only 160 trials in the IAT. The total test durations were not different, though, because there is no need in the IAP for attribute and reverse target discriminations. The two indirect tests showed substantial correlations of .50 (faking condition) and .44 (control condition), and their correlations with external variables were highly similar. A minor difference was that the IAP tended to be slightly more robust against faking.
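The argument that the larger number of trials explains the higher internal consistency of the IAP can be checked roughly with the Spearman-Brown formula. This is a sketch under the assumption that IAT and IAP trials are equally informative (parallel), which the text does not claim; the function name is hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Projected reliability when a test is lengthened by length_factor,
    assuming the added parts are parallel to the existing ones."""
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)

# IAT: alpha = .78 with 160 critical trials; the IAP has 256 critical trials.
projected_alpha = spearman_brown(0.78, 256 / 160)
```

Under this assumption, lengthening the IAT to 256 trials would raise its internal consistency to about .85, which is in the neighborhood of the .83 observed for the first IAP and therefore compatible with the trial-number explanation.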

A disadvantage of the IAP is that it is more difficult to implement on standard computers than the IAT. A joystick is needed, the joystick has to be continuously calibrated, and the program routines for implementing the procedure are much more complex than for the IAT. All in all, then, the IAP may be considered less an alternative to the IAT than a useful addition to it that allows one to replicate IAT findings and to reduce the method variance of the IAT by aggregating IAT and IAP scores.

Let me conclude with a comment on the utility of indirect measures for the assessment of personality differences. On the positive side, the study showed that these indirect measures were fairly robust to faking attempts of the participants. Only participants who tried to bias their results by deliberately taking the perspective of a non-shy person were able to bias their IAT scores (but not their IAP scores), and this bias was very small compared to the bias in their direct self-ratings. Also, it was possible to construct a new indirect assessment procedure, the IAP, which correlated .50 with the IAT and showed highly similar correlates. Between-procedure correlations of this size are rarely achieved for indirect procedures that assess the same construct (Bosson et al., 2000; Cunningham et al., 2001). This new method made it possible to increase the reliability and validity of the assessment of the implicit self-concept through the aggregation of both procedures.

On the negative side, the direct self-ratings predicted the observer judgments in the control condition slightly better than the indirect measures, and in the faking condition not worse than the indirect measures. Although the direct self-ratings were strongly biased with regard to their mean, there were multiple indications that they were not completely invalid with regard to their interindividual differences. Furthermore, the .50 correlation between the IAT and IAP is not high compared to the .70 correlations that are regularly achieved when the same personality dimension is self-rated on different questionnaire scales. Both the relatively low retest correlation of .65 - .68 for the IAP and IAT and their relatively low parallel test reliability of .50 indicate that the amount of specific method variance for these indirect procedures is much higher than the specific method variance for direct ratings.

Much work may be required to reduce these methodological weaknesses of the current indirect procedures so that the assessment of stable interindividual differences reaches a psychometrically satisfactory level. Until such a level is reached, the indirect procedures can be considered interesting research instruments in need of improvement, not methods that are ready to be applied for practical assessment purposes. Another important aspect concerning practical implications is whether indirect measures, similar to direct measures, allow for the concurrent assessment of more than one personality trait. Therefore, the next study explores whether the IAT may be used to assess two traits, anxiousness and angriness, within one sample.


DiML DTD Version 4.0. Certified document server of the Humboldt-Universität zu Berlin. HTML version created on 28.11.2006.