Chapter IV:
Meta-analysis


In a series of nine field and laboratory studies with a total of 1,445 participants, involving choices among exotic jams, wines, jelly beans, restaurants, charity organizations, and classical music, there was only one case in which I found an effect of the number of options on choice motivation. This is despite the fact that each study used options with which the decision makers were largely unfamiliar, to rule out the influence of strong preferences prior to choice. The only case in which I found an effect was under the condition that participants had to give a justification for their decision. Even if future research confirms this “justification effect,” it can still be concluded that the effect of too much choice is far less robust than previously thought. Averaged over all 1,278 participants across the eight studies I conducted in which there was an option not to choose, 49% of all participants made a choice from the large set and 48% made a choice from the small set.


In the two music studies, in which participants in the United States and in Berlin were forced to make a choice and the main dependent variables were self-reported satisfaction, regret, and willingness to pay, the results look similar. If anything, the 167 participants in both countries were more satisfied, less regretful, and willing to pay more when choosing from a large set than from a small set.

As the statistician R. A. Fisher pointed out, replicability of empirical evidence is the foundation of science and the path to cumulative knowledge (Fisher, 1971). Along the same lines, Schmidt (1996) argued that any single empirical study usually yields only limited information and by itself can rarely resolve a controversial issue. As a consequence, the results of many studies need to be integrated to obtain reliable estimates and to promote scientific progress. To this end, in this chapter I will meta-analytically integrate the divergent findings into a more coherent framework.

Introduction

To get a broader picture of the true nature of the too-much-choice effect, it is advisable to incorporate as much data as possible in a meta-analysis. Toward this goal, in the following I will also include the results of the experiments on choice overload reviewed in Chapter I. Including my own, this makes a total of 26 experiments. Together, these studies represent all the published and unpublished experimental data on the effect of varying assortment sizes that I could obtain by June 2007.

Overview of the studies


The studies can be classified into two categories based on their experimental setup: In 14 of the experiments, including 8 of my own, participants had the option not to make a choice for the time being, to choose a default option, or to change their choice later on. In the other 12 studies, including 2 of my own, people were forced to make a choice. In the first case, with few exceptions (Chernev, 2003a, 2003b), the dependent variable is the percentage of people who made a choice (Table 2). In the latter case, the dependent variable is usually satisfaction with the chosen option, but sometimes also the amount of consumption or the propensity to change the decision at a later point in time (Table 3). Figure 9 provides a forest plot of all effect sizes and their respective standard errors.

Table 2: Summary of experiments with choice proportions as dependent variable. The asterisks (*) mark my own experiments

| Study name | N total | Assortment size (small set) | Assortment size (large set) | Choice % (small set) | Choice % (large set) | Difference | Effect size (d) | SE(d) |
|---|---|---|---|---|---|---|---|---|
| Iyengar & Lepper (2000), Jam study | 249 | 6 | 24 | 29.8% | 2.8% | 27.0% | 0.77 | 0.13 |
| Iyengar & Lepper (2000), Chocolate study | 67 | 6 | 30 | 48.0% | 12.0% | 36.0% | 0.88 | 0.26 |
| Shah & Wolford (2007) | 60 | 2–10 | 12–20 | 60.0% | 44.0% | 16.0% | 0.32 | 0.20 |
| Chernev (2003a) | 58 | 4 | 16 | 16.0% | 84.0% | −68.0% | −1.44 | 0.24 |
| Chernev (2003b), Study 1 | 88 | 4 | 16 | 82.0% | 74.0% | 8.0% | 0.23 | 0.22 |
| Chernev (2003b), Study 2 | 75 | 4 | 16 | 75.0% | 69.0% | 6.0% | 0.22 | 0.23 |
| Jam study Berlin* | 504 | 6 | 24 | 33.3% | 32.0% | 1.3% | 0.03 | 0.09 |
| Jelly bean study* | 66 | 6 | 30 | 63.6% | 78.8% | −3.0% | −0.27 | 0.24 |
| Wine study* | 280 | 3 | 12 | 34.5% | 38.3% | −3.8% | −0.10 | 0.12 |
| Restaurant study* | 80 | 5 | 30 | 30.0% | 35.0% | −5.0% | −0.14 | 0.22 |
| Charity study I – well known charities* | 60 | 2 | 30 | 83.3% | 93.3% | −10.0% | −0.53 | 0.26 |
| Charity study I – least known charities* | 57 | 5 | 40 | 65.5% | 71.4% | −5.9% | −0.17 | 0.26 |
| Charity study II (Bloomington)* | 112 | 5 | 40 & 79 | 80.6% | 87.0% | −6.5% | −0.26 | 0.20 |
| Charity study III* | 119 | 5 | 40 & 80 | 88.1% | 72.7% | 15.4% | 0.57 | 0.20 |

Table 3: Summary of experiments with satisfaction ratings (a), amount of consumption (b), or propensity to change the decision (c) as dependent variable. The asterisks (*) mark my own experiments

| Study name | N total | Assortment size (small set) | Assortment size (large set) | Mean (small set) | Mean (large set) | Difference | Effect size (d) | SE(d) |
|---|---|---|---|---|---|---|---|---|
| Haynes & Olson (2007), Study 1 (a) | 69 | 3 | 10 | 7.85 | 7.20 | 0.65 | 0.44 | 0.25 |
| Haynes & Olson (2007), Study 2 (a) | 72 | 5 | 20 | 7.17 | 7.28 | −0.11 | −0.20 | 0.23 |
| Lenton, Fasolo, & Todd (2005) (a) | 96 | 4 | 20 | 5.19 | 5.36 | −0.17 | −0.08 | 0.20 |
| Reutskaja & Hogarth (2005), Study 1 (a) | 60 | 10 | 30 | 8.50 | 7.10 | 1.40 | 0.68 | 0.27 |
| Reutskaja & Hogarth (2005), Study 2 (a) | 60 | 10 | 30 | 7.30 | 7.70 | −0.40 | −0.33 | 0.25 |
| Kahn & Wansink (2004), Study 1 (b) | 36 | 6 | 24 | 16.60 | 22.70 | −6.10 | −0.37 | 0.33 |
| Kahn & Wansink (2004), Study 2 (b) | 91 | 6 | 24 | 34.90 | 50.90 | −16.00 | −0.46 | 0.20 |
| Kahn & Wansink (2004), Study 5 (b) | 138 | 6 | 24 | 43.70 | 60.90 | −17.20 | −0.39 | 0.17 |
| Lin & Wu (2006) (c) | 82 | 6 | 16 | 2.76 | 2.83 | −0.07 | −0.08 | 0.22 |
| Mogilner, Rudnick, & Iyengar (2007), Study 3 (a) | 121 | 5 | 50 | 4.36 | 3.89 | 0.47 | 0.25 | 0.18 |
| Music study (Berlin)* (a) | 80 | 6 | 30 | 0.51 | 1.06 | −0.55 | −0.17 | 0.16 |
| Music study (Bloomington)* (a) | 87 | 6 | 30 | 1.90 | 2.00 | −0.10 | −0.05 | 0.15 |


Figure 9: Forest plot of all effect sizes. The bars indicate standard errors; the asterisks (*) denote my own studies

Method

One way to meta-analytically integrate these results would be simply to count the number of studies that did not find a (statistically significant) effect of assortment size. If this number exceeds the number of studies that found an effect, one would conclude that no relationship exists. As Schmidt (1996) pointed out, this so-called traditional voting method leads to wrong conclusions because it ignores the fact that the significance of a study depends on its sample size and that studies may appear inconsistent with each other due to mere random error.


To obtain a more precise method of meta-analytic integration, Hunter and Schmidt (1990) suggest relying on effect sizes and their respective sampling error rather than on statistical significance. According to Hunter and Schmidt, the results of several studies are integrated by calculating a weighted average of all single effect sizes across studies (D):

$$D = \frac{\sum_{i=1}^{m} w_i \, d_i}{\sum_{i=1}^{m} w_i} \tag{4-1}$$

where $w_i$ is the weight of each single effect size as calculated in Equation 2-25, $d_i$ is the effect size of study $i$, and $m$ is the total number of studies. The 95% confidence interval around $D$ is calculated as


$$D \pm 1.96 \cdot SE_D \tag{4-2}$$

where $SE_D$ is the standard error of $D$, calculated as

$$SE_D = \sqrt{\frac{1}{\sum_{i=1}^{m} w_i}} \tag{4-3}$$
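To make the computation concrete, here is a minimal Python sketch of Equations 4-1 to 4-3, assuming the inverse-variance weights w = 1/SE(d)² that Figure 10 also uses; the function name is my own.

```python
import math

def meta_analyze(d, se):
    """Fixed-effect integration of study effect sizes (Equations 4-1 to 4-3).

    d  -- effect sizes d_i of the single studies
    se -- their standard errors SE(d_i)
    """
    # Weight of each study, assumed here as w_i = 1 / SE(d_i)^2 (cf. Figure 10)
    w = [1.0 / s ** 2 for s in se]
    # Equation 4-1: weighted mean effect size D
    D = sum(wi * di for wi, di in zip(w, d)) / sum(w)
    # Equation 4-3: standard error of D
    se_D = math.sqrt(1.0 / sum(w))
    # Equation 4-2: 95% confidence interval around D
    ci = (D - 1.96 * se_D, D + 1.96 * se_D)
    return D, se_D, ci
```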


To find out how much of the variance across effect sizes is due to mere sampling error and how much is due to meaningful differences between the studies, one needs to conduct a homogeneity analysis based on the Q statistic as laid out in Formula 2-1. In the homogeneity analysis in Chapter II, the variance between studies could not be fully accounted for by error variance, and as a consequence, searching for moderators and mediators seemed justified. However, with the empirical data of 21 additional studies at hand, the results of this analysis may look different.
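As a sketch of this step, the homogeneity test can be computed as follows. I assume here the standard Hedges–Olkin form Q = Σᵢ wᵢ(dᵢ − D)², which under homogeneity is chi-square distributed with m − 1 degrees of freedom; I take this to be the form Formula 2-1 specifies.

```python
from scipy import stats

def homogeneity_test(d, se, alpha=0.05):
    """Homogeneity analysis based on the Q statistic.

    Assumes Q = sum_i w_i * (d_i - D)^2 with weights w_i = 1 / SE(d_i)^2;
    under homogeneity, Q is chi-square distributed with m - 1 df.
    """
    w = [1.0 / s ** 2 for s in se]
    D = sum(wi * di for wi, di in zip(w, d)) / sum(w)
    Q = sum(wi * (di - D) ** 2 for wi, di in zip(w, d))
    df = len(d) - 1
    critical = stats.chi2.ppf(1 - alpha, df)
    # Q > critical: variability across effect sizes exceeds what
    # sampling error alone would produce -> search for moderators
    return Q, df, Q > critical
```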

Selection of studies

The meta-analysis at hand is only concerned with the main effect of too much choice in controlled experimental settings. Interaction effects, such as the effect of need for cognition (Lin & Wu, 2006) or the effect of entropy (Kahn & Wansink, 2004), are not considered because there is not enough data to estimate these interactions separately in a meta-analysis.

In general, the meta-analytical integration of results requires that the studies be comparable in their design and their hypotheses. The experimental designs of the studies that used choice proportions as the dependent variable are rather similar, so a meta-analytical integration seems justified. However, in one of Chernev’s (2003b) studies, the proportion of people who changed their decision was measured, and in another (Chernev, 2003a), it was the proportion of people who chose from the large assortment rather than the small one. While both dependent variables differ somewhat from those in the other studies, which looked at the proportion of people who did (not) choose any of the options, in each case the author argued that on a conceptual level his measure reflects people’s motivation to make a choice. Thus, the integration of these studies seems justified.


The studies that used a continuous dependent variable are not directly comparable to those that measured choice proportions, because for the continuous measures, participants were forced to make a selection from a given set. These latter studies are also somewhat more heterogeneous in their design. The majority used satisfaction with the chosen option (Haynes & Olson, 2007; Mogilner et al., 2007, Experiment 3; Reutskaja & Hogarth, 2005; my own music studies), measured on a Likert scale. In contrast, Lin and Wu (2006) asked participants about their propensity to change their decision later on, which seems at least similar to satisfaction. The experiments by Kahn and Wansink (2004) were based on consumption, which according to the authors themselves is conceptually different from choice or subjective satisfaction. However, as these studies also tested the effect of assortment size, I included them in the meta-analysis initially and then checked in a second step whether the results change when this set of studies is excluded.

Coding of studies

To integrate the studies, effect sizes have to be calculated. Effect size measures such as Cohen’s d express the magnitude of an effect in units of standard deviations. For studies with choice proportions, the calculation of standard deviations is straightforward because they are a function of the sample size and the proportion, both of which are commonly provided by the authors. For studies with a continuous measure, such as satisfaction or amount of consumption, the calculation of effect sizes requires that standard deviations be explicitly reported in the original study. Unfortunately, this is often not the case. In some studies (Haynes & Olson, 2007, Study 1; Lenton et al., 2005; Lin & Wu, 2006; Mogilner et al., 2007, Experiment 3), this problem can be solved by reverse-engineering standard deviations from test statistics such as F or t values. In one case (Kahn & Wansink, 2004), however, this is not possible because the main effect of assortment size on the dependent variable, the amount of consumption, was not tested statistically. As I also could not obtain the necessary statistics from the authors, strictly speaking the data cannot be integrated into a meta-analysis. Yet because Kahn and Wansink’s studies provide important insights, for the present purpose I decided to integrate them regardless by estimating the standard deviations, knowing that the resulting effect sizes are error-prone. To err on the conservative side, I estimated the standard deviation to be equal to the mean consumption value. As this is most probably an overestimation, the effect sizes will be smaller and thus the weight of these studies in the meta-analysis will be reduced.
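To illustrate this coding step, the sketch below shows one common way to compute d from two choice proportions (treating choice as a binary variable with pooled standard deviation √(p(1−p))) and to reverse-engineer d from a reported t statistic. These particular conversion formulas are illustrative assumptions; the exact conversions used here are those given in Chapter II.

```python
import math

def d_from_proportions(p_small, n_small, p_large, n_large):
    """Cohen's d for the difference between two choice proportions,
    treating choice as a binary variable with pooled SD sqrt(p * (1 - p))."""
    p = (p_small * n_small + p_large * n_large) / (n_small + n_large)
    sd = math.sqrt(p * (1 - p))
    return (p_small - p_large) / sd

def d_from_t(t, n1, n2):
    """Reverse-engineer d from the t statistic of an
    independent-samples comparison with group sizes n1 and n2."""
    return t * math.sqrt(1.0 / n1 + 1.0 / n2)

# Illustration with Iyengar and Lepper's (2000) jam study, assuming
# roughly equal group sizes: yields d of about 0.73, in the same range
# as the 0.77 in Table 2 (the exact value depends on the conversion used).
print(d_from_proportions(0.298, 124, 0.028, 125))
```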

In some experiments (Reutskaja & Hogarth, 2005; Shah & Wolford, 2007), the number of options varied across a wide range. To integrate the results of the study by Reutskaja and Hogarth, I selected the assortment sizes for which the effect was greatest (10 vs. 30 options). In the study by Shah and Wolford, assortment sizes varied between 2 and 20 in increments of 2. To test for the main effect of too much choice, I grouped the assortment sizes of 2 to 10 options into a small assortment and compared it to the mean choice proportions for assortment sizes of 12 to 20 options.

Results

Studies with choice as dependent variable


Integrating all 14 studies listed in Table 2 with choice proportions as dependent variable, the mean effect size according to Equation 4-1 is D=0.07 (SE_D=0.05), and the 95% confidence interval ranges from −0.02 to 0.17. As the confidence interval includes zero, the mean effect size is not statistically different from zero. The Q value obtained from Formula 2-1 is 100.2. As this is larger than the critical chi-square value of 22.4 (α=0.05; df=13), the variability across effect sizes exceeds what would be expected based on sampling error alone. Thus, following Hedges and Olkin (1985), it can be concluded that the differing results between the studies are due to the influence of moderator variables.
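As a plausibility check, feeding the rounded values from Table 2 into the sketches above approximately reproduces these figures; small deviations stem from the rounding of d and SE(d) in the table.

```python
# Effect sizes and standard errors of the 14 studies in Table 2
d = [0.77, 0.88, 0.32, -1.44, 0.23, 0.22, 0.03,
     -0.27, -0.10, -0.14, -0.53, -0.17, -0.26, 0.57]
se = [0.13, 0.26, 0.20, 0.24, 0.22, 0.23, 0.09,
      0.24, 0.12, 0.22, 0.26, 0.26, 0.20, 0.20]

D, se_D, ci = meta_analyze(d, se)               # D near the reported 0.07, SE_D near 0.05
Q, df, heterogeneous = homogeneity_test(d, se)  # Q near the reported 100.2, df = 13
```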

If the data from Chernev (2003a), with its extreme effect size of d=−1.44, and the data from Chernev (2003b), which used a slightly different dependent variable, are excluded, the mean effect size D is 0.13 with a 95% confidence interval ranging from 0.05 to 0.22. The corresponding Q value is 57.7, which is still higher than the critical chi-square value of 18.3 (α=0.05; df=10).

Studies with satisfaction as dependent variable

For all eight studies with satisfaction as dependent measure, excluding the studies by Kahn and Wansink (2004) and Lin and Wu (2006), the mean effect size D is 0.02 (SE_D=0.07) with a 95% confidence interval ranging from −0.12 to 0.16. Thus, the data mirror the results obtained for choice as dependent variable in that the mean effect is not statistically significantly different from zero. The Q value for the studies with satisfaction as dependent variable is 15.2, which is larger than the critical value of 14.1 (α=0.05; df=7) and again suggests a further search for moderators.


For all 12 studies with a continuous dependent variable, including those by Kahn and Wansink and by Lin and Wu, the mean effect size is D=−0.09 (SE_D=0.06) with a 95% confidence interval ranging from −0.20 to 0.02. The homogeneity statistic is Q=24.9, which is larger than the critical chi-square value of 19.7 (α=0.05; df=11).

Publication bias and differences in assortment sizes

To see whether the mean effect size is driven by a few studies with extreme effect sizes and/or extreme weights, I drew a so-called funnel plot, in which the weight of each study is plotted against its effect size (Figure 10). As Figure 10 shows, the effect sizes are distributed relatively evenly. There are only two outliers. One is the jam study conducted in Berlin, which has an extreme weight; yet due to its small effect size, it does not have a large leverage on the mean effect size. The other outlier is the study by Chernev (2003a), which has already been dealt with in the analysis above. The funnel plot can also be used to detect potential publication or selection biases by checking whether the distribution of effect sizes follows the shape of a symmetrical inverted funnel. This is because the results of studies with a smaller sample size (and thus a larger standard error) should show higher variability. If the plot is asymmetrical, it could be, for example, that studies with a small sample size that did not find an effect are missing. Yet with the exception of Chernev’s study, the studies are fairly symmetrically distributed, which suggests a reasonably representative selection of studies.
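A minimal sketch of how such a funnel plot can be drawn, assuming matplotlib is available (the function name is my own):

```python
import matplotlib.pyplot as plt

def funnel_plot(d, se):
    """Plot each study's weight w = 1 / SE(d)^2 against its effect size."""
    w = [1.0 / s ** 2 for s in se]
    plt.scatter(d, w)
    plt.axvline(0.0, linestyle="--", color="grey")  # reference line at d = 0
    plt.xlabel("Effect size d")
    plt.ylabel("Study weight w = 1/SE(d)^2")
    plt.show()

funnel_plot(d, se)  # e.g., with the Table 2 values from above
```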

To see whether the difference between the small and the large assortment influences the effect sizes, I correlated the difference in assortment sizes (coded as large set minus small set) with the effect size across studies. The resulting Pearson correlation coefficient is r=.10. However, this correlation is mainly due to the study by Chernev (2003a): if this study is excluded, the correlation drops to r=.01. Thus, it can be concluded that increasing the difference between the small and the large set does not increase the effect size (Figure 11).
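The corresponding computation is a one-liner, sketched here with scipy's pearsonr; the coding of the assortment-size difference follows the text above.

```python
from scipy.stats import pearsonr

def size_difference_correlation(small_sizes, large_sizes, d):
    """Correlate the assortment-size difference (large set minus
    small set) with the effect size across studies."""
    diff = [large - small for small, large in zip(small_sizes, large_sizes)]
    r, p_value = pearsonr(diff, d)
    return r, p_value
```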


Figure 10: Funnel Plot: Distribution of effect sizes relative to study weight w=1/SE(d)²

Figure 11: Scatter plot of the correlation between effect size and difference in assortment sizes

Discussion

When summarizing the empirical evidence on the effect of too much choice based on the published effect sizes, two things can be concluded: First, the effect is far less generalizable than previously thought. Second, the strong effect sizes in some studies that found the effect cannot be explained by random variation, which indicates the presence of moderator and mediator variables.


With regard to the first conclusion, Kelly (2006) noted that unsuccessful replications leave no clear basis for deciding between the original and the replication study. Based on an analysis of literature databases, he found that unsuccessful replications are cited less often than the original work, which suggests that contradictory evidence is at times ignored. Also, if studies that find a strong effect have a higher probability of being published than studies that find a small effect or none, the estimation of the true effect size will be biased.

There is some evidence that studies that are not published in scientific journals have smaller effect sizes than studies that appear in journals. In a study by Rosenthal and Rubin (1978), the average effect size of 32 dissertations that did not appear in journals, expressed in fractions of standard deviations, was 0.35, whereas the average effect size of 313 published journal papers on the same topic was 0.56. With regard to the effect of choice overload, this suggests that the true effect size might likewise be lower than the published estimates indicate.

With regard to the second conclusion, exploring the boundary conditions of the too-much-choice effect clearly is an important and necessary step toward understanding its underlying psychological mechanisms. In the preceding chapters, I discussed, evaluated, and eventually tested a number of potential mediators and moderators. There the conclusion was that with one exception, none of the variables that I considered seemed to facilitate the effect of too much choice. Extending this earlier analysis, in the following chapter I will lay out and critically evaluate additional theoretical explanations as well as potential moderators and mediators that may help provide a clearer picture of the seemingly inconsistent effect of too much choice. Doing so will help to identify the most promising variables to be tested in future experiments.

