2  Instructions and Response Coding: Theoretical Positions

Research participants generally only respond when they are asked to do so (or when they infer that they are supposed to do something). They do not usually produce a response simply because they have registered some stimulus (e.g., a triangle). Models of action control frequently assume that, upon instruction, task sets are formed that specify the relevant stimulus dimension, the required responses, and the task-relevant mappings from stimuli to responses. Task sets can thus be viewed as behaviorally relevant representations of the to-be-performed task, implementing the goal to respond to certain (classes of) stimuli in a specific way. In the first section of this chapter (Section 2.1) I sample from recent theories of action control (i.e., Cohen et al., 2000; Logan & Gordon, 2001) and describe their assumptions on how instructions are translated into effective task sets. Although their predictions regarding the impact of specific response labels on action control remain relatively vague, they are consistent with at least two general theoretical positions.

In the second section of this chapter (Section 2.2) these positions will be elaborated and specified with respect to spatially organized keypress responses (i.e., left and right keypresses), on which the emphasis in the remainder of this thesis will be. To this end, I will provide a review of dual-route models of response selection (or more precisely, response activation) that have been proposed to explain so-called compatibility effects.

2.1 Task Representations and the Control of Action

Current models of action control tend to assume that behavior is controlled at different ‘levels’. Furthermore, it is commonly believed that verbal instructions (or the internal representations of the language input) do not directly control behavior, but that instead verbal instructions have to be encoded/translated/compiled into other types of internal representations that, in turn, control behavior.

For example, the prefrontal cortex (PFC) model by Cohen et al. (2000; see also O’Reilly, Braver, & Cohen, 1999), well-known implementations of which have been provided by Jonathan Cohen’s interactive activation models of Stroop performance (e.g., Cohen, Dunbar, & McClelland, 1990), assumes that instructions are encoded into discrete, combinatorial, and self-maintaining PFC representations containing internal contextual information. While O’Reilly et al. (1999; see also O’Reilly & Soto, 2002) concede a special role for the phonological loop in the encoding and maintenance of task-relevant information, the resulting PFC representations themselves are not assumed to be verbal, albeit possibly symbolic and categorical. Moreover, according to the PFC model, it is not the PFC alone that controls behavior. Rather, PFC representations are or can be used to bias and constrain the activation flow in another network, the perceptual-motor cortex (PMC) layer, which is characterized by highly distributed representations and slow integrative learning through inductive weight changes. Whereas the PFC (in some versions of the model assisted by fast-learning mechanisms such as hippocampal systems) is responsible for maintaining (instructed) task goals and for constraining the behavioral PMC network accordingly, it is the PMC layer that processes stimuli and generates responses.

Similarly, in the Logan and Gordon (2001) model that was designed to model executive processes and phenomena associated with dual-task performance, verbal instructions are parsed into propositional task level representations that are stored in working memory. However, it is not the propositional representations that control behavior. Instead the propositional representations of instructions need to be translated into a set of parameters (e.g., response categories and weights for biases, attentional weight parameters, etc.) at the parameter level, which, in turn, are passed down to ‘lower level’ behavioral modules responsible for stimulus identification and categorization as well as response selection. Thus, in the Logan and Gordon (2001) model, the parameter set that is passed down constitutes the ‘effective’ task set.

In short, both the PFC model and the Logan and Gordon model assume that behavior is controlled at different levels, and that instructions need to be translated into a ‘language of the mind’, that is, into a representational format that sets up internal constraints, which in turn allow effective control of behavior.

However, neither of these more general models contains a principled account of how exactly instructions are translated into effective task sets. For instance, in different (hybrid connectionist) implementations of the PFC model, the PFC representations contain information about either the relevant stimulus dimension, or the relevant position, or specific response alternatives (see Botvinick, Braver, Barch, Carter, & Cohen, 2001; Cohen et al., 2000). Hence, what is extracted from instructions seems to depend on task demands, that is, on what needs to be controlled or what researchers assume needs controlling on a given task.


With regard to my main question of interest, namely whether or not the response labels used in simple binary stimulus-response (S-R) instructions affect how a task is performed, the PFC model in its current form therefore seems consistent with two different general positions. On the one hand, it is conceivable that the bias exerted by PFC representations is rather unspecific with regard to response coding. That is, it might just set up general constraints (e.g., respond to color instead of color words), whereas specific S‑R mappings, and hence responses, are coded within PMC in a way that allows all possible responses in a certain task context to be discriminated by biasing (pre-) existing processing pathways.

On the other hand, however, it is also conceivable that PFC representations “mediate an appropriate behavioral response” (Cohen et al., 2000, p. 196) by exerting a more direct influence, which may well depend on the specific contents of instructions (e.g., respond “left”). In order to arrive at more specific predictions with respect to the influence of response labels on response coding, a more precise account of the nature of the assumed PFC and PMC representations and their susceptibility to instruction is needed (see Footnote 1).

Similarly, in order to derive unequivocal predictions regarding the impact of response instructions on response coding, the Logan and Gordon (2001) model needs to be more specific with respect to (a) how parameters are extracted from the (propositional) task level representation and (b) the nature of response representations. As is, the model assumes that the ‘response set’ consists of response-relevant (stimulus) categories (e.g., odd vs. even for number stimuli) that are mapped to response counters. The latter are incremented according to stimulus categorizations “that correspond to them” (p. 400). Furthermore, it is assumed that “a later motor stage [not covered by the model …] turns a symbolic representation of the response into an overt action” (p. 396). Consequently, the Logan and Gordon model is again open to two interpretations regarding response coding. According to one, responses are coded and accessed in terms of the categories they are supposed to signal (e.g., as meaning “odd”). According to this interpretation, response coding in the Logan and Gordon model would be similar to Meiran’s (e.g., Meiran, 2000, 2001) notion of response recoding, and would directly depend on instructions (i.e., on the response-relevant stimulus categories).

On the other hand, however, it is also conceivable that the ‘counters’ responsible for response selection are coded in an instruction-independent way (e.g., as left and right), and that the categorization evidence is (automatically) transmitted to the responses assigned to them. In the latter case, response coding and access would be logically independent of stimulus categorization, and, possibly, response instruction.
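The two readings of the response counters can be illustrated with a toy sketch. It is not taken from Logan and Gordon (2001): the deterministic evidence sequence, the threshold, the odd/even-to-key assignment, and all function names are assumptions made purely for illustration, and the model's actual counter mechanism is not reproduced here. The sketch only shows that the two readings differ in the label on which selection operates, not in the key that is eventually pressed.

```python
from collections import Counter

# Hypothetical instruction: "odd" -> left key, "even" -> right key (assumed mapping).
INSTRUCTION = {"odd": "left", "even": "right"}
THRESHOLD = 3  # assumed number of increments needed to select a response


def select(evidence, counter_label):
    """Increment one counter per piece of categorization evidence.

    `counter_label` decides how counters are coded: by the instructed
    category or by key location. Returns the label of the first counter
    that reaches THRESHOLD, or None if no counter gets there.
    """
    counters = Counter()
    for category in evidence:
        label = counter_label(category)
        counters[label] += 1
        if counters[label] >= THRESHOLD:
            return label
    return None


def category_coded(category):
    # Reading 1: counters are coded in terms of the instructed categories.
    return category


def location_coded(category):
    # Reading 2: counters are coded spatially; evidence is routed to them.
    return INSTRUCTION[category]


# Assumed, fixed sequence of categorization evidence for an odd digit.
evidence = ["odd", "even", "odd", "odd", "even"]

selected_category = select(evidence, category_coded)   # selection on category labels
selected_location = select(evidence, location_coded)   # selection on location labels

# A later motor stage (outside the model) turns the selected code into a key.
key_from_category = INSTRUCTION[selected_category]
key_from_location = selected_location
```

In this sketch both readings end in the same keypress, which is why the overt response alone cannot distinguish them; they differ only in whether the selected code is "odd" or "left".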

In sum, both models seem consistent with two general theoretical positions concerning how specific response labels given in task instructions for manual two-choice tasks (e.g., “when you see a square, then press the blue key; when you see a circle, then press the green key”) affect response coding, and hence response selection.

On the one hand, task instructions might set up general constraints on how actions can be coded in order to meet task demands. According to this view, the response labels used in the instruction do not directly determine response coding. Rather, responses are coded in terms of features that allow discrimination between response alternatives in the context of any given task instruction. In what follows, this view will be termed the ‘constraint hypothesis.’

On the other hand, however, it is also conceivable that instructed response labels directly influence response coding. For example, a simple S‑R instruction might set up a link between the stimulus and the response components of the instruction by activating and linking the corresponding concepts (categories) mentioned in the instruction. The motor programs (see Footnote 2) that are needed to perform the instructed response might then also be accessible via the mental representation activated by the response label. In the following, this position will be termed the ‘direct coding hypothesis.’

Note that the different coding hypotheses are not necessarily mutually exclusive. For instance, the direct coding hypothesis does not preclude the possibility that instructed codes are merely added to some sort of ‘default’ representation of responses. In this case, it is conceivable that participants use instructed codes only in the beginning of working on a new task, or when considered useful.

In the next section, the two general theoretical positions will be discussed with respect to spatially organized keypress responses (i.e., left and right keypresses), on which the emphasis in the remainder of this thesis will be. To this end, I will interpret and classify so-called dual route models of stimulus-response-compatibility (SRC) with respect to their assumptions regarding response coding.


2.2  Response Coding

Dual route models of compatibility have been proposed to explain SRC effects. SRC effects are variations in reaction time (RT) and accuracy that occur as a function of the way in which stimuli are assigned to responses. The general finding has been that responding is easier (faster and less error prone) in the compatible (matching) than in the incompatible condition. For instance, left keypress responses to stimuli appearing on the left of the screen (compatible or matching condition) are faster than left responses to right stimuli (incompatible condition) (e.g., Broadbent & Gregory, 1962). Such performance differences are observed even when irrelevant stimulus attributes match or mismatch the required response, as is the case, for instance, in Simon-type tasks (for reviews, see Lu & Proctor, 1995; Simon, 1990). In a typical Simon task, subjects are required to respond to arbitrary stimulus attributes such as the pitch of a tone or the color of a visual stimulus by pressing spatially organized (usually left vs. right) keys, while stimulus position varies randomly. Although stimulus position is task irrelevant, it affects performance such that responses are faster (and more accurate) when stimulus position and the side of the required response correspond than when they do not correspond.

Dual-route models (e.g., Barber & O’Leary, 1997; De Jong, Liang, & Lauber, 1994; Hommel, 1997; Kornblum et al., 1990; Tagliabue, Zorzi, Umiltà, & Bassignani, 2000; Zhang, Zhang, & Kornblum, 1999) represent an influential subclass of coding accounts that have been proposed to explain such SRC effects, and appear to be particularly well suited to handle irrelevant SRC effects such as the Simon effect.

Different manifestations of dual route models share the assumption that response selection is affected by two more or less independent routes. One of the routes, alternatively labeled short-term memory (STM) link(s), indirect link(s), conditional or conditionally automatic route, or translation route, directly depends on instruction. In most models, this route (if explicitly modeled at all) is implemented by links that connect internal representations of the task-relevant stimulus attributes (e.g., codes of the letters A and B in Figure 1) to representations of the responses assigned to them (see solid arrows in Figure 1). Some models explicitly distinguish between stimulus feature codes and a hidden layer (see STM nodes in Figure 1) coding “task relevant attributes” (e.g., Tagliabue et al., 2000, p. 661) that mediate S‑R translation in the STM route.


According to most of the newer dual route models, activation is transmitted automatically along these links once they are implemented. However, because these links depend on instructions (i.e., on task-relevant stimulus attributes and their assignment to responses), this route is considered conditionally automatic (e.g., De Jong et al., 1994).

In addition to the indirect route provided by the STM links, stimuli are assumed to activate their “corresponding” responses via a direct route (also called long-term memory (LTM) links, unconditional or unconditionally automatic route) if stimulus and response attributes (codes) overlap. Since activation along these direct links does not depend on the task relevance of the stimulus attribute that elicits it, this route is considered unconditionally automatic (see broken arrows in Figure 1).

Figure 1: A schematic illustration of the core assumptions of (different classes of) dual route models.

Solid arrows represent STM links that connect task relevant attributes (i.e., letter identity) with representations coding the required responses via STM nodes. The broken arrows indicate direct links connecting overlapping stimulus and response features, regardless of whether the respective stimulus attributes are task relevant or not (L=left and R=right stimulus position codes; R(A) and R(B): response representations linked to stimuli A and B; M(r) and M(l): right and left hand motor programs). See text for details.
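The core logic of the two routes can be written down as a minimal sketch. It is an illustration under stated assumptions rather than an implementation of any of the cited models: the link weights, the A/B-to-key assignment, and the names `Stimulus`, `STM_LINKS`, `response_activation`, and `net_evidence` are hypothetical. The conditional route is set up by instruction and carries the relevant attribute (letter identity); the direct route carries stimulus position whether or not it is task relevant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stimulus:
    identity: str  # task-relevant attribute, here the letter "A" or "B"
    position: str  # task-irrelevant attribute, "left" or "right"


# Conditional (STM) route, configured by a hypothetical instruction:
# letter A -> left key, letter B -> right key.
STM_LINKS = {"A": "left", "B": "right"}

W_STM = 1.0  # assumed strength of the instructed (conditional) route
W_LTM = 0.3  # assumed strength of the direct (unconditional) route


def response_activation(stim):
    """Sum the activation each response code receives from both routes."""
    activation = {"left": 0.0, "right": 0.0}
    # Conditional route: relevant attribute activates its assigned response.
    activation[STM_LINKS[stim.identity]] += W_STM
    # Direct route: stimulus position activates the spatially overlapping
    # response, regardless of task relevance.
    activation[stim.position] += W_LTM
    return activation


def net_evidence(stim):
    """Activation of the instructed response minus that of its competitor."""
    activation = response_activation(stim)
    correct = STM_LINKS[stim.identity]
    competitor = "right" if correct == "left" else "left"
    return activation[correct] - activation[competitor]


corresponding = Stimulus(identity="A", position="left")
noncorresponding = Stimulus(identity="A", position="right")

# A Simon-type advantage appears as more net evidence on corresponding trials.
simon_advantage = net_evidence(corresponding) - net_evidence(noncorresponding)
```

Under these assumed weights the instructed response always wins, but it wins by a smaller margin on noncorresponding trials, which is the qualitative pattern dual route models use to account for the Simon effect.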

Although I agree with Hommel (1996a, p. 108) that “it is clear that a principled account of response coding is lacking” in most coding accounts of SRC, dual route models seem to allow inferences regarding their often implicit or vaguely formulated response coding assumptions. This is so because activation is only transmitted via the direct route if stimulus and response codes overlap. Thus, a closer inspection of what types of S‑R overlap lead to direct activation of spatially organized responses (e.g., left and right keypress responses) in the unconditional route provides insights regarding how responses are assumed to be coded and accessed.

Roughly, two classes of models can be distinguished regarding their assumptions concerning direct response activation, and, by implication, how spatially organized responses are thought to be represented and accessed.

One class of models seems to hold a strong ‘spatial is special’ view, in that they assume that only spatial stimulus attributes (i.e., the L and R codes in Figure 1) directly (unconditionally) activate their respective responses, implying that responses are assumed to be spatially coded regardless of instructions. This assumption comes in three flavors.

The most widely accepted version of this view is represented by spatial coding accounts (e.g., Barber & O’Leary, 1997; Heister et al., 1990; Lien & Proctor, 2002; Lu & Proctor, 1995). It holds that both the indirect and the direct route converge on cognitive response codes that (primarily) represent relative key position (instead of the anatomical motor codes themselves; see the dashed lines in Figure 1 that connect stimulus position codes and response codes R(A) and R(B)), which in turn activate their corresponding motor responses. According to this view, responses are selected on the basis of spatial response codes representing relative key position whenever key position allows the discrimination of responses. A second, albeit less prominent (cf. Roswarski & Proctor, 2003b) version of this view is represented by motor priming accounts of the Simon effect (e.g., Wascher, Schatz, Kuder, & Verleger, 2001). They propose that (certain types of) spatial stimulus attributes directly specify the motor parameters of the required lateralized responses, without any intervening cognitive response codes (see dotted lines, Figure 1, that directly connect the stimulus position codes R and L with their corresponding motor programs M(r) and M(l)). Finally, according to a third version of the spatial view (e.g., De Jong et al., 1994; Tagliabue et al., 2000), stimulus position is unique (and the only source of direct activation) because of a “natural tendency to react toward the source of stimulation” (Simon, 1969, p. 174). The notion of such a ‘natural tendency’ seems rather unspecific, and, in principle, appears to be consistent with both a location coding and a motor priming interpretation of spatial coding.

Although all versions of the ‘spatial is special’ view share the basic assumption that spatially organized responses such as bimanual keypress responses are somehow spatially coded whenever spatial coding allows discrimination of the responses, in what follows, I will restrict the term ‘spatial coding hypothesis’ to the first, location coding, view unless otherwise noted. According to this view, instructing response keys non-spatially (e.g., by using symbolic non-spatial response instructions and labels, such as response labels A and B, see Figure 1) does not directly affect response coding. Rather, if response instructions have any effect at all, their impact is restricted to some intermediate translation stage in the conditional route. That is, they are assumed to only affect translation efficiency in the indirect route through usually ill-defined (stimulus and/or response) recoding processes (see Footnote 3) that facilitate or, in case of incompatible mappings, hinder translation from relevant stimulus attributes to the spatially coded responses (cf. De Jong et al., 1994; Lu & Proctor, 1995). This implies that so-called symbolic SRC effects, such as faster red-key responses to red than to green stimuli when responses are instructed in terms of color, are not attributed to a match between stimulus and response codes, but instead to a match versus mismatch of codes (e.g., verbal labels) at some intermediate stage that leads to selection of spatially coded responses (e.g., Bashore, 1990; for a more recent explication, see Mattes, Leuthold, & Ulrich, 2002).

The spatial coding hypothesis can thus be considered representative of the constraint hypothesis (see Chapter 2.1) where spatially organized keypress responses are concerned: Instructions only set up general constraints on how the conditional route is configured, and hence, how relevant stimulus attributes are translated onto responses without affecting response coding per se. Rather, responses are coded and accessed in terms of relative location whenever the spatial dimension allows discrimination of responses.

In contrast, a second class of dual route models seems to be more consistent with the direct coding hypothesis, which holds that instructed response labels directly influence response coding, for instance, by priming the corresponding concepts (categories) that are integrated into the response representations, and that can subsequently be used in response selection. Proponents of these models explicitly (e.g., Hommel, 1997; Hommel et al., 2001; Kornblum, Stevens, Requin, & Whipple, 1999) or implicitly (e.g., Kornblum et al., 1990; Zhang et al., 1999) assume that responses are represented such that every response is coded in terms of its features, dimensions, or categories (e.g., as being blue, left, manual, leading to a high pitch response effect etc.). Stimuli that overlap with respect to any of these features are assumed to (unconditionally) automatically activate their corresponding response features, regardless of whether this feature or dimension is spatial or not, and whether the overlapping stimulus is task relevant or not. Hence, according to this view, the direct route is not restricted to the spatial dimension, but extends to all overlapping features (see dash-dot-dot lines from stimulus nodes A and B to response keys labeled A and B in Figure 1). As a consequence, these models do not make a principled distinction between spatial and symbolic compatibility effects.

Within this second class of dual route models, two positions can be distinguished that differ regarding their assumptions concerning intentional weighting of response codes, and, by implication, with respect to the role of spatial response codes under non-spatial response instructions.

The weak version of the direct coding view, as, for example, represented by all instantiations of the dimensional overlap (DO) model proposed by Kornblum and colleagues (e.g., Kornblum et al., 1990; Kornblum et al., 1999; Zhang et al., 1999), does not distinguish between overlap on ‘implicit’ (uninstructed) and ‘explicit’ (instructed) dimensions. That is, the strength of direct S‑R activation is the same regardless of whether a stimulus attribute is task relevant or not, and, more importantly, whether certain response features are task relevant (e.g., instructed) or not. This view is, for instance, reflected in Zhang et al.’s (1999) implementation of a task in which colored stimuli that randomly appear to the left or right of fixation are mapped to left and right keypress responses that are instructed (and labeled) in terms of color (i.e., the Hedge & Marsh (1975) task, see Chapter 3.1.4). In their model of this task, stimulus position codes and stimulus color codes (directly) activate response codes to the same extent. Thus, uninstructed spatial default codes are not only assumed to be part of the response representations under non-spatial response instructions, but they are weighted as strongly as instructed (non-spatial) response categories.

In contrast, Hommel et al. (2001; also see Hommel, 1997) assume that both stimulus and response features can be differentially weighted, depending on task demands. More specifically, a core assumption of the theory of event coding is that action representations include codes of (perceivable) proximal and distal action effects (e.g., a “left” proprioceptive feedback, a loud click on the left side, a light on the right that is turned on by a left keypress and so on). According to the theory of event coding, responses are accessed via their intended (anticipated) action effects. Direct S‑R activation occurs in this model as a consequence of overlap of features that are used to code both stimuli and responses in a common representational medium. As in the DO model, feature overlap, and hence, direct activation, is therefore not restricted to the spatial dimension. Importantly though, the theory of event coding assumes that features can be weighted according to task demands (e.g., instructions). Hence, intended action effects contribute more strongly to response coding and response control than implicit (non-intended) features, although the latter may still be part of the response representation. As a consequence, compatibility arising from overlap between (irrelevant) stimuli and instructed (intended) response codes can be expected to override ‘implicit’ S‑R overlap. Because instructed coding can dominate uninstructed coding, this view can be considered a strong version of the direct coding hypothesis.
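The difference between the weak and the strong version can be sketched as a difference in feature weights on the direct route. The following is a hedged illustration, not the DO model or the theory of event coding as published: the response feature sets, the stimulus, the weight values, and the function name `direct_activation` are all assumptions chosen to make the contrast visible. Responses are coded as bundles of features, and a stimulus directly activates every response with which it shares a feature value.

```python
# Hypothetical response representations: each key is coded by its relative
# position and by an instructed color label (as in color-labeled keys).
RESPONSES = {
    "left_key": {"position": "left", "color": "green"},
    "right_key": {"position": "right", "color": "red"},
}


def direct_activation(stimulus, weights):
    """Sum the weights of all feature dimensions on which stimulus and
    response share the same value (direct route only)."""
    return {
        name: sum(
            weights[dimension]
            for dimension, value in features.items()
            if stimulus.get(dimension) == value
        )
        for name, features in RESPONSES.items()
    }


# A stimulus whose position overlaps with one key and whose color overlaps
# with the other, so the two sources of overlap are in conflict.
stimulus = {"position": "left", "color": "red"}

# Weak version (assumed equal weights): instructed and uninstructed
# dimensions contribute equally to direct activation.
weak = direct_activation(stimulus, {"position": 1.0, "color": 1.0})

# Strong version (assumed weights): the instructed color dimension is
# weighted more heavily than the uninstructed spatial dimension.
strong = direct_activation(stimulus, {"position": 0.4, "color": 1.0})
```

With equal weights the two keys receive the same direct activation, whereas with a higher weight on the instructed dimension the color-matching key dominates; the specific values are arbitrary and only the ordering matters for the argument.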

Taken together, two broad theoretical positions have been identified with regard to the main question of this thesis, namely whether or not the specific response labels given in simple binary choice task instructions involving spatially organized keypress responses determine how such a task is performed, that is, how responses are coded and selected. According to the spatial coding hypothesis (e.g., De Jong et al., 1994; Lu, 1997; Roswarski & Proctor, 2003a), which represents the constraint hypothesis regarding spatially organized keypress responses, response labels used in the instruction do not directly determine response coding. Rather, responses are coded in terms of relative key location whenever the spatial dimension allows discrimination between responses.

On the other hand, the direct coding hypothesis assumes that instructed response labels directly influence response coding by activating and linking the corresponding concepts (categories) mentioned in the instruction. According to both the weak and the strong versions of the direct coding hypothesis, instructed codes are included in the response representations and contribute to response selection. However, not even proponents of the direct coding hypothesis propose that non-spatial response instructions lead to complete substitution or elimination of spatial response codes. Rather, they assume that spatial (default) codes are part of the response representation and are or can be used to access responses even when response instructions do not refer to the spatial dimension. Whereas the weak direct coding hypothesis holds that spatial response coding is largely unaffected by the inclusion of instructed (non-spatial) response codes, the strong direct coding hypothesis assumes that response codes can be weighted according to instruction. Consequently, only the strong direct coding hypothesis predicts that instructed (non-spatial) codes or features can dominate spatial coding.

In Chapter 3, I will review evidence from the compatibility literature involving tasks requiring spatially organized keypress responses that is consistent or inconsistent with the spatial coding hypothesis and the strong and weak versions of the direct coding hypothesis. The main question throughout this literature review will be whether response instructions affect the size or the direction of the compatibility effects under consideration.


Footnotes and Endnotes

1 It should be noted that the authors themselves acknowledge this shortcoming in their model by stating that they “have not yet specified what this [the representational scheme of the PFC] is, nor the principles that might characterize it,” (Cohen et al., 2000, p. 207) and by declaring this a major goal for the future.

2 Here and in the following, I use the terms ‘motor program’ and ‘motor code’ interchangeably.

3 Typically, differential translation efficiency for different types of mapping has been modeled by simply assigning either high or low weights to the links from STM nodes to response codes (e.g., Tagliabue et al., 2000).



DiML DTD Version 3.0. Certified document server of the Humboldt-Universität zu Berlin. HTML generated: 02.09.2004