
Peptide generation by the proteasome

The proteasome degrades intracellular proteins to peptides between 3-30 amino acids in length. This pool of peptides is thought to be the main source of MHC-I epitopes or their N-terminally prolonged precursors. As demonstrated in the previous chapter, identification of epitopes can be improved by using predicted TAP transport efficiencies as a filter that rules out poorly transported peptides without notably reducing the number of true epitopes. The obvious next step is to check if the same strategy can be applied to filter out those epitope candidates that are unlikely to be generated by the proteasome.

This chapter starts with an evaluation of existing algorithms that predict proteasomal cleavage (section 4.1). Their predictions are poor, which is thought to be a consequence of the lesser quality of their training data, as proteasomal cleavage rates are inherently difficult to measure and interpret, which is discussed in section 4.2. To address this problem, a novel method to quantify proteasomal cleavage rates from time resolved experiments is introduced in section 4.3. This method is applied to a series of experiments analyzing the digestion of two polypeptide substrates by constitutive and immuno-type proteasomes (section 4.4). In the last section (4.5), the differences between digests with immuno- and constitutive proteasomes are discussed.

Most of the results reported in this chapter are taken from (Peters, et al., 2002; Peters, et al., 2003).

4.1  Evaluating published algorithms predicting proteasomal cleavage

To date there are no indications that the C-terminus of proteasomal fragments undergoes further trimming along the MHC-I presentation pathway (Rock and Goldberg, 1999; Shastri, et al., 2002). Therefore, selecting potential epitopes and their N-terminally prolonged precursors by the probability that their C-terminus is generated by the proteasome should single out false epitope candidates without losing true epitope candidates. Currently, three publicly available methods predict proteasomal cleavage: NetChop (Kesmir, et al., 2002), PaProc (Nussbaum, et al., 2001) and FragPredict (Holzhutter, et al., 1999). All of these are trained on data from in vitro digests of proteins or oligopeptides.


4.1.1  NetChop

The NetChop algorithm (Kesmir, et al., 2002) is an artificial neural network trained on different sets of experimental data. Here, the 20S version of NetChop trained on in vitro digest of yeast enolase (Toes, et al., 2001) and bovine β-casein (Emmerich, et al., 2000) was used. The output of the algorithm for each possible cleavage site within a protein sequence is a continuous number indicating the likelihood of cleavage. Predictions were obtained online at www.cbs.dtu.dk/services/NetChop.

Alternative versions of NetChop are available that have been trained on collections of the flanking regions of known presented epitopes, which are thought to be cleavage sites of the proteasome. While this can be a valid approach, predictions trained this way can obviously not be used to evaluate the influence of the proteasome on epitope generation, as the proteasome is implied to be the source of all epitopes when using this kind of training data. Rather, the training data has to come directly from the proteasome itself, as is the case for proteasomal in-vitro digests.

4.1.2 PaProc

PaProc (Nussbaum, et al., 2001) is essentially a matrix based method combined with pair-coefficients describing the interaction between the residues P1 and P1' surrounding the cleavage site. The coefficient values were determined using an evolutionary algorithm. The training data consists mainly of an in-vitro digest of yeast enolase plus several polypeptides (Kuttler, et al., 2000). There are several implementations of the method based on different sets of experimental data. Here, the 'wild type III' method was used, which was trained on the largest dataset. PaProc is available online at www.paproc.de. Its output consists of 4 different discrete scores ('-', '+', '++' and '+++'), where '-' is designated to be 'non-cleavable'.

4.1.3 FragPredict

The FragPredict method (Holzhutter, et al., 1999) is not available online, but as a computer program distributed on request. It was the first published prediction method, trained on all in-vitro digests of polypeptides published at that time. It is capable not only of predicting cleavage sites, but also of predicting which fragments are formed from combinations of cleavages. To be comparable to the other methods, only the cleavage site prediction algorithm was used.

4.1.4 Identifying epitopes using proteasomal cleavage predictions

For each peptide, the predictions of its C-terminal cleavage were used to determine if it has the potential to become an epitope or not. Figure 19 depicts the ROC curves for the three cleavage prediction methods when applied to the HLA-X dataset described in section 3.4. According to the AUC values, the best discriminations between epitopes and random peptides were achieved with NetChop (AUC=0.61), closely followed by FragPredict (AUC=0.59), while PaProc (AUC=0.54) was significantly inferior to the other two prediction methods. Comparing the ROC curves of Figure 19 with those of Figure 14, it can be inferred that the discriminating power of existing prediction methods for proteasomal cleavage sites is far below that of TAP transport scoring developed in the previous chapter, let alone those for MHC-I affinity.

Figure 19: ROC curves for proteasomal cleavage predictions

For the three proteasomal cleavage prediction methods NetChop (AUC=0.61), FragPredict (AUC=0.59) and PaProc (AUC=0.54), the score for C-terminal cleavage is used to predict epitopes from the HLA-X dataset.
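The AUC values quoted above summarize how well a score separates epitopes from random peptides. As a minimal sketch, using made-up scores rather than the HLA-X data, the AUC can be computed as the fraction of (epitope, non-epitope) pairs in which the epitope receives the higher cleavage score, with ties counted as one half. The function name and the example values are illustrative assumptions.

```python
def roc_auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5          # ties count one half
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical C-terminal cleavage scores (illustrative values only).
epitope_scores = [0.9, 0.6, 0.4]
random_scores = [0.7, 0.3, 0.2, 0.1]
auc = roc_auc(epitope_scores, random_scores)
```

An AUC of 0.5 corresponds to a score that carries no information, which puts the reported values of 0.54 to 0.61 into perspective.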


4.1.5  Combining proteasomal cleavage predictions with predictions of MHC-I affinity

Next, combined predictions of C-terminal proteasomal cleavage and MHC-I binding were tested, using the same two-step prediction protocol as described for TAP in section 3.5. For each of the three prediction methods of proteasomal cleavages, a cutoff value singling out peptides as 'not-generated' was chosen with a similar selective strength as the one used for TAP-transport, where 30% of the peptides were classified as 'not-transportable'. For PaProc, the fraction of omitted peptides was necessarily larger, as this method predicted about 60% of peptide bonds to have the lowest score ('-', not cleaved). The ROC curves for the combined predictions are shown in Figure 20; all of them indicate that the combined predictions are significantly worse than those based on predictions of MHC-I binding affinities alone.

Apparently, the 2-step prediction protocol used successfully to combine TAP and MHC-I predictions fails when predictions of C-terminal proteasomal cleavages are used as a filter. This disappointing result may have three different reasons. First, the selective power of the proteasome may be weak because it generates nearly every possible peptide. Second, there might be other proteases serving as suppliers of antigenic peptides besides the proteasome. Third, existing prediction algorithms of proteasomal cleavage sites might not be accurate enough. The last explanation seems most likely, because in vitro digests of epitope-containing model substrates by the proteasome provide, with very few exceptions, the epitope or one N-terminally prolonged precursor (Kessler, et al., 2001). The poor quality of prediction algorithms for proteasomal cleavage sites is also evidenced by contradictory results obtained when applying them to the same set of test protein sequences. Most likely, the poor prediction quality of proteasomal cleavages is mainly caused by the lack of a sufficiently large set of quantitative and consistent experimental data on cleavage rates, which are more difficult to measure and interpret than the affinity assays used to characterize peptide binding to TAP and MHC-I.


Figure 20: ROC curves for proteasomal cleavage + MHC-I binding predictions

Only peptides with a score for proteasomal cleavage of their C-Terminus better than a fixed cutoff are considered to be potential epitopes. They are assigned a score according to their predicted MHC-I binding affinity. The best results are obtained with NetChop (cutoff = 0.1, AUC=0.872) followed by FragPredict (cutoff = 0.5, AUC=0.858) and PaProc (cutoff = '+', AUC=0.623). All of these combined predictions are worse than using predicted MHC-I affinities alone (AUC = 0.919).
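The two-step protocol described above can be sketched as a filter followed by a ranking. The field names, the peptide labels and the scores below are hypothetical, and treating a score equal to the cutoff as passing is an assumption.

```python
def two_step_rank(peptides, cleavage_cutoff):
    """Step 1: keep peptides whose C-terminal cleavage score reaches the cutoff.
    Step 2: rank the remaining peptides by predicted MHC-I score (higher first)."""
    kept = [p for p in peptides if p["cleavage"] >= cleavage_cutoff]
    return sorted(kept, key=lambda p: p["mhc"], reverse=True)

# Hypothetical candidates (illustrative values only).
candidates = [
    {"seq": "pep1", "cleavage": 0.80, "mhc": 0.2},
    {"seq": "pep2", "cleavage": 0.05, "mhc": 0.9},
    {"seq": "pep3", "cleavage": 0.30, "mhc": 0.7},
]
ranked = two_step_rank(candidates, cleavage_cutoff=0.1)
ranked_names = [p["seq"] for p in ranked]
```

In this toy example the candidate with the best MHC-I score (pep2) is discarded by the filter. If the cleavage score that caused this is wrong, a true epitope is lost, which is how an inaccurate filter can lower the AUC of the combined prediction.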

4.2  Problems with evaluating experimental proteasome digests

For an in vitro digest, proteasomes are incubated with a polypeptide or protein as a substrate. After a defined incubation time, the digest is stopped, and the generated mixture of peptide fragments is called the proteasomal digest of the substrate. To analyze these digests, usually Edman degradation or Mass Spectrometry (MS) are used. These methods are associated with different obstacles in the interpretation of results.

4.2.1 A single snapshot of a digest does not provide reliable cleavage rates

Using Edman degradation to analyze proteasomal digests, the peptide mixture is first separated using high performance liquid chromatography (HPLC). Ideally, each fraction coming from the HPLC should contain only one kind of peptide. The sequence and amount of each peptide can then be identified using Edman degradation. This is a reliable but time consuming method to produce quantified data, which has led most experimentalists to limit the analysis of digests to a single incubation time, i.e. to analyze a snapshot of the fragment concentrations present in the digest at one time.

A naïve way of interpreting this snapshot is to divide the concentration of each generated fragment by the amount of depleted substrate and interpret these ratios as relative generation rates. This is not a valid interpretation, because proteasomal digests do not follow a simple substrate + enzyme → enzyme + product description. The proteasome can 're-process' its products, cutting them further into smaller fragments. While this re-processing may not play a significant role in vivo, where the products will either be degraded by other proteases or rescued from degradation by transport into the ER by TAP, it is unavoidable for in vitro experiments. Therefore, these relative generation rates would vary hugely depending on the incubation time, because longer fragments dominating at early times will later be cleaved into smaller fragments. This can also lead to misinterpretations of differences in the digests generated by different types of proteasomes. If two types differ only in the speed with which they degrade a substrate, the amounts of fragments generated can vary greatly after the same incubation time, even if their cleavage preference is completely identical (Figure 21).
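The effect can be illustrated with the simplest possible re-processing scheme, a substrate S that is cut into an intermediate A, which is in turn cut into B. The sketch below is a toy model with first-order rate constants chosen for illustration; it is not the kinetic model of section 4.3.2. Two hypothetical proteasomes share the same scheme and differ only by a common speed factor.

```python
import math

def intermediate_ratio(k1, k2, t, s0=1.0):
    """Naive 'relative generation rate' of intermediate A in S -> A -> B:
    amount of A at time t divided by the amount of substrate consumed.
    Uses the closed-form solution for k1 != k2."""
    s = s0 * math.exp(-k1 * t)
    a = s0 * k1 / (k2 - k1) * (math.exp(-k1 * t) - math.exp(-k2 * t))
    return a / (s0 - s)

# Rate constants in 1/h, incubation times in h (illustrative values only).
slow = intermediate_ratio(k1=0.2, k2=0.1, t=2.0)
fast = intermediate_ratio(k1=0.6, k2=0.3, t=2.0)         # same scheme, 3x faster
early = intermediate_ratio(k1=0.6, k2=0.3, t=2.0 / 3.0)  # fast enzyme, earlier snapshot
```

After the same incubation time the faster enzyme shows a lower ratio for A because more of it has been re-processed, whereas a snapshot of the fast digest taken at one third of the time reproduces the ratio of the slow digest.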


Figure 21: Different proteasome species with identical cleavage preference can produce large differences in individual fragment amounts

The data in the Figure stems from experiments described in section 4.4.1. The black and gray bars indicate the amount of nine pp89-25mer peptides produced by the T2 and T2.27 proteasome after 2h of incubation. The peptide amounts were assessed from the respective MS-signals by using calibration curves. The position of each peptide fragment in the sequence of the substrate is indicated on the x-axis. There are significant differences in the amount of the peptides 5-15, 8-15 or 16-24. Since the cleavage probabilities are unaltered (values given in Table 8), these differences result exclusively from the faster procession by the T2.27 proteasome and its tendency to re-process shorter peptides.


Figure 22: Re-processing of peptides makes the relative amounts of fragments associated with each cleavage site time dependent

The data used in this figure stems from the model fits described in 4.4.3, which are a noise-free set of peptide amount profiles. The four graphs depict the relative usage of the cleavage sites Y4, M6, Y7 and M24 in the pp89-25mer at various time points of the simulated digestion experiment with T2.27. The relative usage of a cleavage site at a given time point is calculated by summing up the amounts of all peptides beginning or ending at that cleavage site, divided by the maximum sum found for any site at that time point (always after L15 in these experiments). If the relative usage of a cleavage site was equivalent to the cleavage probability in the substrate, it should be constant over time, as the cleavage probability is an intrinsic property of the substrate. As can be seen from the graphs, the relative usage is not constant over time, as re-processing of a fragment increases the usage of weaker cleavage sites that are still present in the fragments of the substrate.


A much better way to evaluate proteasomal digests is to sum up the amounts of fragments associated with each cleavage site, thereby assigning cleavage strengths, which are thought to be equivalent to cleavage site usage in the original substrate. While this is much better than looking at individual fragments, this definition of cleavage strengths also depends on the digestion time, as shown in Figure 22. This is due to the following reasons: (1) As the strongest cleavage sites are cut first, their number decreases faster than others, making it more likely that weaker cleavage sites are used when fragments are re-processed. (2) It is known that shorter peptides are less likely to be cleaved than longer peptides, making the cleavage site usage dependent on its surrounding sequence, which changes when fragments are re-processed.
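The relative usage of a cleavage site, as defined in the caption of Figure 22, can be sketched as follows. The representation of fragments as (start, end) pairs, the convention that site k denotes the peptide bond after residue k, and the example amounts are assumptions made for illustration.

```python
def relative_site_usage(fragments, n):
    """fragments: dict mapping (start, end) -> amount, residues numbered 1..n.
    Site k is the peptide bond after residue k (1 <= k < n).
    Assumes that at least one fragment differs from the full-length substrate."""
    usage = {}
    for (start, end), amount in fragments.items():
        if start > 1:      # N-terminus was created by a cut after residue start - 1
            usage[start - 1] = usage.get(start - 1, 0.0) + amount
        if end < n:        # C-terminus was created by a cut after residue end
            usage[end] = usage.get(end, 0.0) + amount
    top = max(usage.values())
    return {site: total / top for site, total in usage.items()}

# Hypothetical snapshot of a 10-mer digest (amounts in pmol, illustrative only).
snapshot = {(1, 4): 30.0, (5, 10): 20.0, (5, 7): 10.0, (8, 10): 10.0}
usage = relative_site_usage(snapshot, n=10)
```

Applying the function to snapshots taken at different incubation times gives the time-dependent profiles discussed above.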

4.2.2 MS-signals do not give quantified peptide amounts

As discussed in the previous section, experiments evaluating only one digestion time-point can only provide a snapshot of the digest that cannot completely determine the mechanism of degradation of the proteasome. Using Edman degradation to analyze the digests, repeating an experiment for several different digestion times is very laborious. A much quicker method to analyze digest data is mass spectrometry (MS). Here, the peptides of the digest are again typically separated by HPLC and thereafter analyzed by MS. While this allows for a highly sensitive qualitative analysis of the digest (a list of peptides that were generated in a detectable amount after a certain incubation time), estimation of the quantities of the peptides is problematic. The intensity of the MS-signal is in principle related to the detected peptide amount, but several intrinsic properties of the peptides influence their ionization behavior and therefore the MS-signal. The presence of aromatic amino acids (Valero, et al., 1998), phosphate groups (Janek, et al., 2001), and charged side chains (Cohen and Chait, 1996) such as the guanidino group of arginine (Krause, et al., 1999) as well as the peptide size (Olumee, et al., 1995) have been reported to influence the signal intensity. To date there is no reliable theoretical approach enabling the calculation of the MS-signal intensity from the sequence of a given peptide. In principle, the problem of deriving amount values from MS-signals can be solved reasonably well by synthesizing the observed peptides and measuring calibration curves for each of them, but this is also rather time consuming, especially for digests of long protein substrates in which a large number of observed peptides is produced.


4.3  Novel protocol of experimental evaluation

4.3.1  Determining peptide amounts from MS-signals

In this section, a much more efficient method to assess peptide amounts from MS-signals than the use of calibration curves is proposed. The basic idea is to use mass balance rules: At an arbitrary time point of the digest experiment, the amounts of all peptides having at least one sequence position in common must add up to the amount of the substrate at the beginning of the experiment. Mathematically, this conservation rule can be stated as

$$\sum_{i:\; j \in f_i} a_i(t) = a_0, \qquad j = 1, \dots, n \tag{11}$$

where a0 is the initial amount of the substrate of length n and ai(t) denotes the amount of peptide i at time t. The sum on the left-hand side of equation (11) includes all those peptides fi that contain sequence position j. From the calibration curves shown in Figure 26, it can be inferred that the relationship between MS-signal and peptide amount can be roughly approximated by a linear function,












where si denotes the MS-signal produced by peptide i and the signal conversion coefficient vi is a characteristic constant determined by the physico-chemical properties of peptide i converting its MS-signal into the respective amount value. Demanding fulfillment of equation (11) for all sequence positions, one may estimate the scaling factors vi by inserting relation (12) into the conservation equation (11):







where the index α counts the number of discrete time points at which MS-signals for the peptides are available. Numerical values for the unknown conversion factors vi can then be estimated by minimizing the violation of the n × m conservation conditions (13). Violation of these conservation rules may result from three sources: First, measurements of the MS-signals are subject to random as well as systematic errors. Second, the true functional relationship between the signals and the amount of a peptide will certainly deviate from a simple linear one. Third, the set of detectable peptides will never be complete. In particular, short peptides (1-3 residues) are likely to escape from HPLC-MS analysis. The latter fact gives rise to a systematic loss of mass as more small peptides are formed during the time-course of the digest. Therefore it is reasonable to determine the unknown conversion factors by minimizing the violation of the conditions (13) between two successive time points of the experiments, i.e.




and choosing the distance metric in (14) as








which punishes the unlikely 'gain' of peptides (x > 0) five times higher than their more likely 'loss' (x < 0).

When minimizing the functional (14) with respect to the unknown signal conversion coefficients vi, one encounters the typical problem in regression analysis that the signal conversion coefficients of peptides with very small MS-signals are poorly determined because they can be largely varied without significant change of the functional Φ. Thus, to avoid unrealistic values of the calculated signal conversion coefficients, the minimization problem (14) is replaced by the constrained problem








where the additional term











measures the deviations of the vi's from a plausible reference value v0. This reference value v0 was determined from a set of experimental calibration curves. Depending on the choice of the positive factor λ in (16), the minimization problem may become at the extreme either completely unconstrained (λ → 0), or all signal conversion coefficients are forced to the reference value v0 (λ → ∞).
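The mass balance idea can be illustrated with a deliberately simplified sketch. It departs from the procedure described above in three ways, all of which are assumptions: the conservation condition is imposed at each time point separately rather than between successive time points, violations are penalized by an ordinary symmetric squared error instead of the asymmetric distance, and the pull towards the reference value is a plain quadratic term. The factor `u` converts a signal into an amount (amount = u × signal); depending on how relation (12) is written, it corresponds to vi or to its reciprocal. All names are hypothetical.

```python
import numpy as np

def fit_conversion_factors(fragments, signals, n, a0, lam=0.0, u0=1.0):
    """fragments: list of (start, end), 1-based inclusive, residues 1..n.
    signals: array of shape (n_times, n_fragments) holding the MS-signals.
    Returns factors u such that amount_i = u_i * signal_i (least squares)."""
    signals = np.asarray(signals, dtype=float)
    n_times, n_frag = signals.shape
    # cover[j, i] = 1 if fragment i contains sequence position j + 1
    cover = np.array([[1.0 if s <= j + 1 <= e else 0.0 for (s, e) in fragments]
                      for j in range(n)])
    # One conservation equation per (time point, sequence position).
    A = np.vstack([cover * signals[t] for t in range(n_times)])
    b = np.full(A.shape[0], float(a0))
    if lam > 0.0:
        # Extra rows that pull every factor towards the reference value u0.
        A = np.vstack([A, np.sqrt(lam) * np.eye(n_frag)])
        b = np.concatenate([b, np.full(n_frag, np.sqrt(lam) * u0)])
    u, *_ = np.linalg.lstsq(A, b, rcond=None)
    return u

# Synthetic 4-mer cut once in the middle (illustrative values only).
fragments = [(1, 4), (1, 2), (3, 4)]
true_u = np.array([0.5, 2.0, 1.0])
amounts = np.array([[100.0, 0.0, 0.0], [60.0, 40.0, 40.0], [20.0, 80.0, 80.0]])
signals = amounts / true_u
u_est = fit_conversion_factors(fragments, signals, n=4, a0=100.0)
```

With noise-free synthetic signals the factors are recovered exactly. With a very large `lam` all factors approach `u0`, mirroring the limiting behaviour described for λ → ∞.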

4.3.2  Kinetic modeling

In this section, a kinetic model of the proteasome is introduced which is supposed to serve as a mechanistic platform for the interpretation and comparison of kinetic data produced by in vitro digestion of model substrates. Proteasomal degradation comprises a multitude of distinct elementary processes, such as uptake of the substrate, transport through the interior of the proteasome, binding to the active sites, threonine-catalysed cleavage of peptide bonds under putative formation of covalent acyl-intermediates, hydrolytic liberation of these acyl-intermediates from the active-site threonine, and release of the products from the proteasome. As none of these elementary processes could be kinetically characterized so far, it makes no sense to incorporate them individually into a complex kinetic model containing a huge number of non-identifiable parameters. Instead, a simple kinetic model is established by lumping all elementary processes involved in the complete procession of a peptide into a single overall processing step. Compared with classical enzyme kinetics, the resulting proteasome model can be considered as a sort of Michaelis-Menten model expressing the most essential kinetic features in terms of a few phenomenological parameters which can be identified from the experimental data.

The time-dependent variation of the amount of peptides including the initial substrate is described by a system of linear kinetic equations,

$$\frac{d a_i(t)}{dt} = \sum_{j} k_{ij}\, a_j(t) - K_i\, a_i(t) \tag{18}$$

Here kij is the rate constant with which peptide j is converted into peptide i per time unit and Ki is the total degradation rate of the i-th peptide. The peptides are labeled with decreasing lengths, a1 being the substrate, so that kij = 0 for i < j since cleavage always shortens a peptide.
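For given rate constants, a linear system of this kind can be integrated numerically. The sketch below uses a simple matrix exponential (scaling and squaring with a truncated Taylor series) as a stand-in for the analytical solution; the helper names and the single-cleavage example are assumptions made for illustration.

```python
import numpy as np

def expm_taylor(M, squarings=10, terms=20):
    """Matrix exponential by scaling and squaring with a truncated Taylor series."""
    A = np.asarray(M, dtype=float) / 2.0**squarings
    E = np.eye(A.shape[0])
    term = np.eye(A.shape[0])
    for n in range(1, terms + 1):
        term = term @ A / n          # A**n / n!
        E = E + term
    for _ in range(squarings):
        E = E @ E                    # undo the scaling
    return E

def peptide_amounts(k, K, a_init, t):
    """Amounts a(t) for da_i/dt = sum_j k[i][j] * a_j - K[i] * a_i."""
    M = np.array(k, dtype=float)
    np.fill_diagonal(M, -np.asarray(K, dtype=float))
    return expm_taylor(M * t) @ np.asarray(a_init, dtype=float)

# Substrate (index 0) with one cleavage site yielding peptides 1 and 2.
k = [[0.0, 0.0, 0.0],
     [0.5, 0.0, 0.0],
     [0.5, 0.0, 0.0]]
K = [0.5, 0.0, 0.0]                  # only the substrate is degraded
a_t = peptide_amounts(k, K, a_init=[100.0, 0.0, 0.0], t=2.0)
```

In this example the substrate decays exponentially and each product accumulates to the amount of substrate consumed, so the mass balance of section 4.3.1 holds at every time point.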

In order to derive an explicit expression for the transition rates kij, two cardinal terms are introduced: the procession rate rj of peptide j and the cleavage probability pk of a cleavable peptide bond (= cleavage site). These two terms are explained in the following.

Procession rate

The procession rate is the rate (i.e. number of events per time unit) with which a peptide undergoes a procession cycle. A single procession cycle encompasses all events taking place between uptake of a peptide into the proteasome and release of all peptides derived from it. For peptide j with length Lj, it is put



where rmax represents the maximum possible procession rate, L0 represents a critical peptide length at which 50% of the maximum procession rate is reached, and the exponent c > 0 controls how sensitive the procession rate is to varying peptide lengths. This takes into account in a phenomenological manner that short peptides are degraded with lower turnover rates than longer peptides. A decelerated degradation with decreasing peptide length was observed for oligopeptides having up to 30 residues (Dolenc, et al., 1998), which is likely to be the maximum size of cleavage products. This type of length dependency can also explain why proteasomal digests contain medium-size peptides which are not further degraded although they contain peptide-bonds which were cleavable in the original substrates.

Cleavage probability

A cleavage probability pk is assigned to all cleavage sites k of the protein substrate, i.e. to those peptide bonds which need to be cleavable to explain the peptide pattern observed in the digest. The cleavage probability of all other peptide bonds is a priori put to zero. The assumption is made that multiple cleavages may occur independently and randomly during a processing cycle. This implies that there are as many different partitions, i.e. possible subdivisions of a given peptide into smaller pieces, as there are different combinations of possible cleavages. If the substrate contains n* cleavage sites, there are 2^n* such possible partitions, each of them occurring with a partition probability Pm (m = 0, …, 2^n* − 1) that is determined by the cleavage probabilities of the individual cleavage sites (cf. Figure 23 for a simple example with n* = 2).

Figure 23: Possible partitions of a peptide containing 2 cleavage sites

Partition probabilities Pm are calculated by treating the individual cleavages as statistically independent events. For example: The probability P2 to fractionize the substrate according to partition 2 is given by the probability p2 for a cut to occur at cleavage site 2 times the probability (1-p1) that cleavage site 1 is not cut.
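The enumeration of partitions can be written down directly. In the sketch below each partition is represented by a tuple of booleans, one per cleavage site, indicating whether that site is cut; this representation and the example probabilities are assumptions made for illustration.

```python
from itertools import product

def partition_probabilities(p):
    """p: list of cleavage probabilities, one per cleavage site.
    Returns {pattern: probability}, where pattern[k] is True if site k is cut.
    Cleavages are treated as statistically independent events."""
    result = {}
    for pattern in product([False, True], repeat=len(p)):
        prob = 1.0
        for cut, pk in zip(pattern, p):
            prob *= pk if cut else 1.0 - pk
        result[pattern] = prob
    return result

# Two cleavage sites with p1 = 0.8 and p2 = 0.3 (illustrative values only).
probs = partition_probabilities([0.8, 0.3])
only_site_2 = probs[(False, True)]      # (1 - p1) * p2
```

The 2^n* probabilities always sum to one, and the entry for "site 2 cut, site 1 not cut" reproduces the product p2 (1 − p1) given in the example above.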


As this model generates all peptides that can be produced by any combination of cleavage sites, there will usually be more peptides predicted in the model than observed in the experiment. These peptides are called hypothetical peptides.

Definition and estimation of rate constants

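The rate constants kij in the equation system (18) are chosen as the procession rate for peptide j times the sum of the probabilities of all partitions in which peptide i is generated:

$$k_{ij} = r_j \sum_{m:\; \text{partition } m \text{ of peptide } j \text{ generates peptide } i} P_m \tag{20}$$
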
Similarly, the coefficients Ki are given by

$$K_i = r_i \sum_{m \ge 1} P_m \tag{21}$$

where the sum includes all partitions except P0, in which no cleavage occurs at all. For a given set of cleavage probabilities and procession rate parameters, the kij and Ki have explicit values for which the linear differential equation system (18) can be solved analytically yielding explicit mathematical formulas for the theoretical peptide amount profiles ai(t). Thus, numerical values for the unknown model parameters (rmax, L0, c and pk with k=1, …, n*) can be determined by minimizing the distance between the theoretical peptide amount profiles and the observed ones. This minimization is performed using the following distance metric Δ:





In (22) the symbols amid, amin and amax denote the mean, minimum and maximum peptide amount as derived from the measured MS-signal and asim denotes the simulated value predicted by the model. The distance metric Δ increases steeply (as a quadratic function) for values of asim lying outside of the experimental range [amin, amax]. The weighting factor 5 is somewhat arbitrary as long as it is greater than 1. Subtracting 4 ensures continuity of the distance at δ=1. If a calibration curve was used to assess the amount of a peptide, the values for amid, amin and amax were taken directly from the calibration curve as described in Figure 26. If the mass balance method was used, the value for amid was determined using the signal conversion coefficient and putting amin=amid / 2 and amax= 2 amid.
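One functional form compatible with this description is sketched below. The definition of δ as the deviation from amid in units of the distance to the range boundary on the same side, and the linear behaviour inside the experimental range, are assumptions; only the quadratic growth outside the range, the factor 5 and the subtraction of 4 are taken from the text. The example values are those read off Figure 26.

```python
def distance(a_sim, a_min, a_mid, a_max):
    """Assumed form of the distance between a simulated and a measured amount.
    delta = 1 corresponds to the boundary of the experimental range."""
    if a_sim >= a_mid:
        delta = (a_sim - a_mid) / (a_max - a_mid)
    else:
        delta = (a_mid - a_sim) / (a_mid - a_min)
    if delta <= 1.0:
        return delta                    # inside [a_min, a_max]: assumed linear
    return 5.0 * delta**2 - 4.0         # outside: steep quadratic increase

inside = distance(300.0, 140.0, 220.0, 345.0)
at_edge = distance(345.0, 140.0, 220.0, 345.0)
outside = distance(470.0, 140.0, 220.0, 345.0)
```

Both branches give the value 1 at δ = 1, which is the continuity property mentioned above.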

To be consistent with the experiment, the hypothetical peptides found only in the model should have amounts below the quantification threshold. As discussed below, this threshold is about 5 pmol. To be on the safe side, the values amid = amin = 0 and amax = 2 pmol were chosen in the distance metric (22) for all hypothetical peptides.
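The construction of the rate constants from procession rates and partition probabilities can be sketched as follows. The Hill-type length dependence used for the procession rate is an assumption: it is one simple function with a maximum rmax, a half-maximal value at L0 and an exponent c, and is not necessarily the expression used in (19). Peptides are represented by hypothetical (start, end) pairs, and site k denotes the bond after residue k.

```python
from itertools import product

def procession_rate(length, r_max, L0, c):
    """Assumed Hill-type length dependence, half-maximal at length L0."""
    return r_max * length**c / (L0**c + length**c)

def rate_constants(peptide, site_probs, r_max, L0, c):
    """peptide: (start, end), 1-based inclusive; site_probs: {k: p_k}.
    Returns ({fragment: rate of formation from this peptide}, total degradation rate)."""
    start, end = peptide
    sites = sorted(k for k in site_probs if start <= k < end)
    r = procession_rate(end - start + 1, r_max, L0, c)
    k_out, p_uncut = {}, 1.0
    for pattern in product([False, True], repeat=len(sites)):
        prob = 1.0
        for cut, k in zip(pattern, sites):
            prob *= site_probs[k] if cut else 1.0 - site_probs[k]
        cuts = [k for cut, k in zip(pattern, sites) if cut]
        if not cuts:
            p_uncut = prob              # partition without any cleavage
            continue
        bounds = [start - 1] + cuts + [end]
        for lo, hi in zip(bounds, bounds[1:]):
            frag = (lo + 1, hi)
            k_out[frag] = k_out.get(frag, 0.0) + r * prob
    return k_out, r * (1.0 - p_uncut)

# 10-mer with two cleavage sites (illustrative values only).
k_from_substrate, K_substrate = rate_constants(
    (1, 10), {4: 0.8, 7: 0.3}, r_max=1.0, L0=10.0, c=2.0)
```

Calling the function for every peptide of the model fills the matrix of the kinetic equations column by column. Peptides without internal cleavage sites obtain a total degradation rate of zero.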

4.4  Application and testing of novel protocol

4.4.1  Experimental setup

Time-dependent peptide profiles were obtained from degradation of two model peptides: pp89 (a 25-mer derived from the IE pp89 of the Murine Cytomegalovirus) and LLO (a 27-mer representing a partial sequence region of listeriolysin O from Listeria monocytogenes). These two substrates were digested by a constitutive proteasome (T2) isolated from T2 cells lacking the gene region for β1i and β5i and by an immunoproteasome (T2.27) isolated from T2 cells transfected with β1i and β5i and characterized by an enhanced incorporation of the endogenous β2i, the third immuno subunit.

Peptide synthesis

Peptides were synthesized by solid-phase methods on an automated Pioneer Peptide Synthesis System (PerSeptive Biosystems) using Fmoc chemistry. Peptide purity was confirmed by reversed-phase HPLC and MS. Syntheses were performed by Dr. P. Henklein (Charité, Berlin).

Purification and analysis of 20S proteasome complexes

Proteasomes were purified from the human lymphoblastic cell line T2 and T2 cells transfected with the proteasomal subunits LMP2 and LMP7 (T2.27) as previously described (Kuckelkorn, et al., 1995).

Peptide digestion assays

20 μg of the pp89-25mer or LLO-27mer oligopeptide and 3.3 μg of purified proteasomes were incubated in 1 mL of assay buffer (20 mM Hepes / pH 7.8, 2 mM Mg(CH3COO)2, 1 mM dithiothreitol) at 37°C for 0, 0.5, 1, 1.5, 2, 3, 4, 6, 8 and 10 h. The reactions were stopped by adding 0.1 vol of 1% trifluoroacetic acid and the samples were then frozen at -20°C. For each time point, two samples of 30 μL digest were used independently for HPLC-MS-analysis. The resulting MS signal intensities were averaged. The experiments were carried out in duplicate.

HPLC-MS analysis

Samples (proteasomal digests and dissolved peptides for the calibration curves) were separated by reversed-phase chromatography on a µRPC C2/C18 SC 2.1/10 column (Pharmacia Biotech) by linear gradient elution (eluent A, 0.05% trifluoroacetic acid in water; eluent B, 0.045% trifluoroacetic acid in 70% acetonitrile; flow rate, 73 µL/min). Analyses were performed online with an ion trap mass spectrometer (LCQ, Thermo-Finnigan) equipped with an electrospray ion source. As internal standard the peptide 9GPS (YPHFMPTNLGPS) was added to each sample. For calculation of the peak area the most intense ion signal of the peptides was used.

4.4.2 Comparing theoretical and experimentally derived fragment amounts

Calibration curves

For calibration of the peptide amount with the MS signal, dilution series were prepared from stock solutions of 12 individual peptides found in the pp89-25mer. The peptide amounts were varied in a broad range between 1 and 500 pmol. Three types of dilution series were analyzed: In a first series of experiments, MS-signals were recorded for each individual peptide under isolated conditions, i.e. without presence of other peptides. In order to assess the impact of collective effects, the MS-signals for peptide mixtures were also recorded in two further dilution series experiments, one with equimolar mixtures of 12 peptides, the other one with different molar ratios for three groups of peptides (group A : group B : group C = 10 : 5 : 1, see the last column in Figure 24 for the group assignments).

Figure 24: Fragments of the pp89-25-mer

List of all fragments of the pp89-25mer detected in the proteasome digest. The second row contains the original 25-mer substrate, with those residues in bold print that are used as cleavage sites. For 12 peptides calibration curves were monitored, whereby the capital letter in the last column indicates the molar ratio with which the peptide was tested in a peptide mixture.


Figure 25: Fragments of the LLO-27mer

List of all fragments of the LLO-27mer detected in the proteasome digests. The second row contains the original substrate, with those residues in bold print that are used as cleavage sites.

A typical calibration curve obtained in this series of experiments is depicted in Figure 26. The average relative deviations of the recorded signals from the mean are listed in Table 7 for the set of 12 peptides at the various amounts tested. The major source of these deviations is systematic differences between the calibration curves recorded either under isolated conditions or in peptide mixtures. Compared with these systematic deviations the variations of MS-signals between repeat measurements carried out under identical conditions are small.


Figure 26: Calibration curve for the 11mer peptide 5-DMYPHFMPTNL-15 contained in the pp89-25mer.

Three types of calibration curves were recorded, differing in the number and amount of peptides that were simultaneously present in one sample: (i) only the 5-15 peptide present, (ii) all 12 analyzed peptides simultaneously present at the same amount, (iii) all 12 analyzed peptides simultaneously present but at different amounts. The solid lines interpolate linearly between the minimum, mean and maximum values recorded. In cases where the minimum or maximum recorded signal deviated from the mean by less than the average values given in Table 7, these average values were taken instead (see, for example, the minimum signal value at 500 pmol). The dotted lines indicate how a fixed MS-signal can be translated into an amount range. In this example, the MS-signal of 400 × 10^6 corresponds to a minimum amount of amin=140 pmol, mean amount of amid=220 pmol and a maximum amount of amax=345 pmol.
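The translation of a fixed MS-signal into an amount range can be sketched with piecewise-linear interpolation. The calibration points below are hypothetical and do not reproduce the measured curve; the sketch also assumes that all three curves increase monotonically with the amount.

```python
import numpy as np

def amount_range(signal, amounts, sig_min, sig_mean, sig_max):
    """Invert piecewise-linear calibration curves (signal as a function of amount).
    The curve of maximum signals yields the smallest plausible amount and the
    curve of minimum signals the largest one. Signals outside the calibrated
    range are clamped to the end points by np.interp."""
    a_min = np.interp(signal, sig_max, amounts)
    a_mid = np.interp(signal, sig_mean, amounts)
    a_max = np.interp(signal, sig_min, amounts)
    return float(a_min), float(a_mid), float(a_max)

# Hypothetical calibration points (amounts in pmol, signals in units of 10^6).
amounts = [0.0, 100.0, 500.0]
sig_min = [0.0, 160.0, 800.0]
sig_mean = [0.0, 200.0, 1000.0]
sig_max = [0.0, 250.0, 1250.0]
a_min, a_mid, a_max = amount_range(400.0, amounts, sig_min, sig_mean, sig_max)
```

The three returned values play the role of amin, amid and amax in the distance metric of section 4.3.2.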


Table 7: Amount dependent signal deviations monitored in the calibration curves

Columns: peptide amount [pmol]; median relative difference (max - mean); median relative difference (min - mean).

Increasing the peptide amount by about two orders of magnitude from 1 pmol to 500 pmol, the lower and upper boundary for signal variations around the mean value decrease by about one order of magnitude to a level of about 20%. For low peptide amounts, the relative signal variations are very large. While all peptides produced a quantifiable MS-signal when added at an amount of 10 pmol or higher, no quantifiable MS-signal was detected in 6 out of 72 experiments at a peptide amount of 5 pmol. Diminishing the peptide amount further down to 1 pmol, the number of experiments with unsuccessful peptide recovery increases to 15 out of 72. It was concluded that - under these experimental conditions - peptide amounts below 5 pmol do not necessarily produce quantifiable MS-signals.

[page 78↓]  Assessment of peptide amounts from MS signals using the mass balance method

For all peptides detected in the digests, time-dependent amount values were obtained from the measured MS-signals by applying the mass balance method described in section 4.3.1. Optimal values for the two control parameters λ and ν0 in (16) and (17) were determined in the following way: λ was increased, starting with λ=0, until the coefficients νi differed from each other by at most a factor of 5. This yielded a value of λ=1 for the pp89-25mer digest and of λ=0.1 for the LLO-27mer digest. ν0 was determined by taking the logarithmic mean of the experimental minimum and maximum values; this yielded ν0=1.
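The selection of the two control parameters can be illustrated with a minimal sketch. It reads the factor-of-5 criterion as a bound on the ratio between the largest and the smallest coefficient, and the "logarithmic mean" as the mean taken on a log scale (the geometric mean); both readings are assumptions. The callable `solve` is a hypothetical stand-in for solving equations (16) and (17), and all numbers are toy values.

```python
import math


def logarithmic_mean(lo, hi):
    """Mean taken on a log scale, i.e. the geometric mean of lo and hi."""
    return math.exp(0.5 * (math.log(lo) + math.log(hi)))


def choose_lambda(solve, candidates, max_ratio=5.0):
    """Return the first lambda whose coefficients span at most max_ratio.

    solve: hypothetical callable mapping lambda to a list of positive
    signal conversion coefficients (stands in for equations 16 and 17).
    candidates: increasing lambda values, starting at 0.
    """
    for lam in candidates:
        coeffs = solve(lam)
        if max(coeffs) / min(coeffs) <= max_ratio:
            return lam
    raise ValueError("no candidate satisfied the spread criterion")


# Toy stand-in: the penalty pulls every coefficient towards 1.
def toy_solve(lam, raw=(0.5, 0.8, 3.0, 6.0)):
    return [(r + lam) / (1.0 + lam) for r in raw]


lam = choose_lambda(toy_solve, [0, 0.01, 0.1, 1, 10])
nu0 = logarithmic_mean(0.1, 10.0)  # hypothetical experimental min and max
```

Stopping at the first λ that meets the criterion keeps the penalty as weak as possible while still preventing implausibly large differences between the coefficients.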

Figure 27 shows the theoretically determined values of the signal conversion coefficients for the pp89-25mer digest in comparison to the experimental values assessed by means of calibration curves. Both methods are in good agreement: all calculated coefficients fall into the expected range, except for the peptide 1-4, whose coefficient lies slightly above the experimental range.

4.4.3  Fitting the kinetic model to the experimental data

Comparison of experimental and theoretical time-dependent amount profiles

For both substrates, the time-courses of MS-signals were translated into time-courses of peptide amounts using the signal conversion coefficients calculated by means of the mass balance method. Fitting of the kinetic model to the time-dependent amount profiles was performed as described in section 4.3.2. As the partition probabilities Pm decline rapidly with increasing number of cleavages, only partitions including up to four cleavages for the pp89-25mer and up to six cleavages for the LLO-27mer were considered in the calculation of the transition rates kij and Ki (cf. equations 20 and 21) in order to save computation time.
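The effect of truncating the partitions can be illustrated under the assumption that cleavages at different sites are statistically independent, so that the probability of a partition is the product of pi over the cleaved sites and of (1 - pi) over the uncleaved ones. The cleavage probabilities used below are hypothetical; only the site positions are taken from Table 8.

```python
from itertools import combinations


def partition_probabilities(p, max_cleavages):
    """Probability of each cleavage combination with at most max_cleavages cuts.

    p: dict mapping cleavage site -> cleavage probability.
    Assumes independent sites: P = prod(p_i, cut) * prod(1 - p_j, uncut).
    """
    sites = sorted(p)
    result = {}
    for k in range(max_cleavages + 1):
        for cut in combinations(sites, k):
            prob = 1.0
            for s in sites:
                prob *= p[s] if s in cut else 1.0 - p[s]
            result[cut] = prob
    return result


# Hypothetical cleavage probabilities at the ten pp89-25mer sites.
p = {s: 0.1 for s in (3, 4, 5, 6, 7, 10, 11, 15, 22, 24)}
probs = partition_probabilities(p, max_cleavages=4)
covered = sum(probs.values())   # probability mass kept by the truncation
p_no_cleavage = probs[()]       # P0 = prod(1 - p_i)
```

With these toy values, the 386 partitions with at most four cleavages already carry more than 99% of the total probability, while the full enumeration would comprise 1024 partitions.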

[page 79↓]

Figure 27: Comparison of experimentally and theoretically determined signal conversion coefficients for fragments derived from the pp89-25mer

Gray columns represent the signal conversion coefficients (cf. equation 12) obtained using the mass balance method. Black and white columns represent the experimental minimum and maximum for the coefficients. These boundaries were determined by picking the maximum signal of a given peptide measured in the time series of digest experiments and using the calibration curve to translate this signal into a minimum (amin) and maximum (amax) amount value (see arrows in Figure 26). Dividing the signal by amax and by amin gives the lower and upper boundary, respectively, of the experimental range for the signal conversion coefficient.
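As a worked illustration of this division, the following sketch uses the example values quoted in the legend of Figure 26 (signal 400 × 10^6, amin = 140 pmol, amax = 345 pmol). In Figure 27 the maximum signal observed in the digest is used instead, so the numbers here only illustrate the arithmetic; the function name is hypothetical.

```python
def coefficient_range(signal, amin, amax):
    """Experimental bounds of the signal conversion coefficient (signal per pmol)."""
    if not 0 < amin <= amax:
        raise ValueError("expected 0 < amin <= amax")
    # The largest plausible amount gives the smallest coefficient and vice versa.
    return signal / amax, signal / amin


lower, upper = coefficient_range(400e6, amin=140.0, amax=345.0)
```

For these values the coefficient lies between roughly 1.16 × 10^6 and 2.86 × 10^6 signal units per pmol.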

Figure 28 depicts measured and calculated time-dependent amount profiles for the pp89-25mer substrate and the 9 peptides generated with the highest abundance. The expected range of the experimental values as defined by the boundaries amin and amax is indicated by the gray shaded area. For both types of proteasomes, the calculated amount profiles of almost all peptides fall into the range expected from the experiment. This also holds true for the other 12 peptides found in this experiment (not shown). The calculated amount of each of the 31 hypothetical peptides predicted by the model but not detected in the experiment was below 2 pmol, i.e. remained below the reliable experimental quantification threshold.

[page 80↓]

Figure 28: Time courses of peptide amounts for the pp89-25mer digests

Connected boxes represent the theoretical amount values predicted by the model. The shaded areas indicate the amount ranges determined from the MS-signals in the proteasome digests using the mass balance method. The caption of each graph names the position of the peptide in the sequence of the 25mer substrate. The x-axis indicates the incubation time in hours, and the y-axis the amount in units of pmol. Note the different scaling of the amount axis for the various peptides.

[page 81↓]

A similarly good correspondence between experiment and model was obtained for the LLO-27mer: Nearly all of the calculated time-dependent amount profiles for the 21 peptides detected in the digest (cf. Figure 25) were within the range of experimental uncertainty defined by the boundaries amin and amax, and the calculated maximal amount of all 26 hypothetical fragments was below the experimental quantification threshold of 2 pmol. For both substrates, the residuals between simulated and experimental time courses as measured by (22) are evenly spread across the entire time course, indicating random deviations between theoretical and experimental results.

Since the number of adjustable model parameters is small (pp89-25mer: 13 parameters / LLO-27mer: 12 parameters) compared with the number of data points (pp89-25mer: 220 experimentally observed data points + 310 hypothetical data points / LLO-27mer: 210 experimentally observed data points + 260 hypothetical data points), the good agreement between simulations and experimental data can be taken as a strong indication for the reliability of the model.

Assessing the variability of model parameters with a jack-knife procedure

To assess whether the numerical values for any model parameter differ significantly between the two types of proteasomes, it is necessary to relate the difference of the parameter values to their standard deviations. The standard deviation of a model parameter characterizes the expected range of its variability when determined from a set of independent repeat experiments. An alternative to carrying out new experiments is the so-called jack-knife procedure, which mimics the possible outcome of future experiments by replacing the original dataset with a computer-generated artificial dataset. Here, artificial datasets were generated by omitting the measurements at 4 consecutive time points from the original dataset. Repeating this procedure for different sets of omitted time points and fitting the kinetic model to each reduced dataset yields a collection of parameter estimates from which standard deviations can be assessed. The estimated numerical values for the model parameters and their jack-knife standard deviations are listed in Table 8 and Table 9, and are graphically displayed in Figure 29 and Figure 30. Note that the cleavage probabilities obtained in different fits cannot be directly compared because the model can provide equally good fits with either a high maximal procession rate rmax combined with a low average level of the cleavage [page 82↓] probabilities pi or, alternatively, with low rmax and high pi. Hence, to make cleavage probabilities comparable, they have to be related to the probability P0 that no cleavage is made during procession of the substrate. The numerical estimates for the model parameters turn out to be fairly insensitive to the large variations of the dataset produced in the jack-knife analysis.
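The leave-out scheme and the normalization to (1-P0) can be sketched as follows. The callable `fit` is a hypothetical stand-in for fitting the kinetic model, the number of time points and the data are toy values, and the spread is computed as the plain standard deviation of the estimates; any scaling factor of a formal jack-knife variance estimator is omitted here, which is an assumption of this sketch.

```python
import statistics


def leave_out_blocks(n_points, block=4):
    """Index lists of the time points kept after dropping `block` consecutive ones."""
    for start in range(n_points - block + 1):
        dropped = set(range(start, start + block))
        yield [i for i in range(n_points) if i not in dropped]


def jackknife_sd(times, values, fit, block=4):
    """Standard deviation of a fitted parameter over the reduced datasets.

    fit: hypothetical callable (times, values) -> parameter estimate; it
    stands in for fitting the kinetic model.
    """
    estimates = []
    for keep in leave_out_blocks(len(times), block):
        estimates.append(fit([times[i] for i in keep], [values[i] for i in keep]))
    return statistics.stdev(estimates)


def normalise(p):
    """Relate cleavage probabilities to 1 - P0, with P0 = prod(1 - p_i)."""
    p0 = 1.0
    for pi in p.values():
        p0 *= 1.0 - pi
    return {site: pi / (1.0 - p0) for site, pi in p.items()}


# Toy stand-in for the model fit: slope of a line through the origin.
def toy_fit(t, y):
    return sum(a * b for a, b in zip(t, y)) / sum(a * a for a in t)


times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
values = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1, 18.0, 19.9]
sd = jackknife_sd(times, values, toy_fit)
relative = normalise({3: 0.5, 4: 0.5})
```

Dividing by (1-P0) removes the trade-off between rmax and the overall level of the pi, so that only the relative distribution of cleavages over the sites is compared between fits.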

Table 8: Estimated model parameters for the pp89-25mer digests









Rows (numerical values not preserved): p3 / (1-P0), p4 / (1-P0), p5 / (1-P0), p6 / (1-P0), p7 / (1-P0), p10 / (1-P0), p11 / (1-P0), p15 / (1-P0), p22 / (1-P0), p24 / (1-P0), P0 = Π (1-pi)




















[page 83↓]

Table 9: Estimated model parameters for the LLO-27mer digests









Rows (numerical values not preserved): p3 / (1-P0), p5 / (1-P0), p6 / (1-P0), p8 / (1-P0), p12 / (1-P0), p14 / (1-P0), p16 / (1-P0), p21 / (1-P0), p26 / (1-P0), P0 = Π (1-pi)




















Inspection of the graphs in Figure 29 and Figure 30 reveals that for both substrates tested, the immunoproteasome possesses a significantly higher procession rate combined with a greater preference to process shorter peptides compared with the constitutive proteasome. The effects of switching from the constitutive proteasome to the immunoproteasome on the cleavage probabilities differ between the two substrates. For the pp89-25mer (Figure 29, Table 8), there are no significant changes of cleavage probabilities at all. For the LLO-27mer (Figure 30, Table 9), the analysis reveals a significant change of the cleavage probability at 4 of the 9 [page 84↓] detectable cleavage sites: an increase of cleavage probabilities at the residues I3 and S5 and a decrease at Y8 and H21. These results are discussed in more detail in section 4.5.

Figure 29: Variability of model parameters for the pp89-25mer digests assessed by a jack-knife procedure: Experimental peptide amounts determined by the mass balance method

(a)-(b): Cleavage probabilities normalized to (1-P0). Each panel shows the relative cleavage probabilities for 7 different fits: the black boxes indicate the cleavage probabilities obtained by fitting the model to the entire set of experimental data. The gray diamonds refer to parameter values obtained by fitting the model to reduced datasets in which four consecutive time points were left out of the experimental data. (c)-(d): The procession rate was calculated from the model parameters rmax, L and c according to equation (19) and then multiplied by (1-P0). Each panel shows the results of the 7 fits detailed above.

[page 85↓]

Figure 30: Variability of model parameters for the LLO-27mer digests assessed by a jack-knife procedure: Experimental peptide amounts determined by the mass balance method

The amounts of all relevant peptides identified in the digests of the LLO-27mer were derived from MS-signals by using the mass balance method. Variability of model parameters was assessed by repeated model fitting to truncated datasets (see legend of Figure 29).

Checking the equivalence of model computations based on either the mass balance method or experimental calibration curves

In a second series of computations, the model parameters for the pp89-25mer digests were estimated from time-dependent amount profiles that, for the 12 peptides indicated in the last column of Figure 24, were constructed from the available calibration curves instead of the novel mass balance method. Again, robustness of the numerical estimates was assessed by a jack-knife procedure as outlined above (Figure 31). Importantly, mean values and variances of all model parameters do not significantly deviate from the values obtained previously by fitting the model to time-dependent amount profiles derived [page 86↓] from MS signals by the mass balance method. This result underlines the finding in Figure 27 that the mass balance method enables reliable conversion of MS-signals into peptide amounts.

Figure 31: Variability of model parameters for the pp89-25mer digests assessed by a jack-knife procedure: Experimental peptide amounts determined by using calibration curves

Calibration curves were used, where available (see Figure 24), to transform the MS-signals from the proteasome digests into peptide amounts. Using this dataset, the same seven fits as described for Figure 29 were made to assess the variability of the model parameters. See the legend of Figure 29 for further details.

Adequate fitting of data requires a length-dependent procession rate

It was tested whether a simpler version of the procession rate in equation 19 can produce time courses of similar quality. To this end, constraints were imposed on the parameters L and c of the procession rate, and the achieved quality of the fit was compared with that of the unconstrained fit. The unconstrained fit yields a total residual distance of Δ=249 (Δ=458) for the pp89-25mer (LLO-27mer) digests according to the distance measure (22), when summed over both types of proteasome, all peptides and all time points. Setting L0=0, i.e. neglecting the length-dependence by equating the procession rate to rmax for all peptides, increases the total residual distance to [page 87↓] Δ=427 (Δ=605), i.e. the quality of the fit decreases significantly. At the other extreme, setting L0 = (substrate length - 0.5) and c=1000, such that only the initial substrate undergoes procession with rate rmax while re-processing of proteolytic fragments is prevented, yields a total residual distance of Δ=456 (Δ=549), again indicating a clear drop in fit quality. These findings demonstrate that length restrictions in the procession of shorter peptides are an essential feature of proteasomal cleavage.
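The loss of fit quality under the two constraints can be put into relative terms with a short calculation on the residual distances quoted above. The labels of the fits are illustrative names, and the calculation makes no statement about statistical significance.

```python
# Total residual distances quoted in the text, keyed by substrate.
delta = {
    "pp89-25mer": {"unconstrained": 249, "no length dependence": 427,
                   "substrate only": 456},
    "LLO-27mer": {"unconstrained": 458, "no length dependence": 605,
                  "substrate only": 549},
}


def relative_increase(fits):
    """Increase of each constrained fit's distance relative to the unconstrained one."""
    base = fits["unconstrained"]
    return {name: (d - base) / base
            for name, d in fits.items() if name != "unconstrained"}


increase = {substrate: relative_increase(fits) for substrate, fits in delta.items()}
```

For the pp89-25mer the residual distance rises by about 71% and 83% under the two constraints, for the LLO-27mer by about 32% and 20%.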

4.5  Differences between constitutive- and immuno-proteasomal digests

Comparing the model parameters determined for the digests by T2.2 and T2.27 proteasomes reveals how the two types of proteasome differ. First of all, both types of proteasome have a remarkably similar cleavage pattern. For the pp89-25mer, there is no significant difference in the determined cleavage probabilities at any cleavage site. This is an unexpected finding considering the large differences between the time-dependent product patterns produced by the two proteasome species. However, the theoretical analysis of the data demonstrates that these differences can be well accounted for by changes in the overall procession rate: compared with the constitutive proteasome, the immunoproteasome works faster and accepts shorter peptides for re-procession. Since the kinetic model does not explicitly relate the procession rate to the various elementary steps involved in a procession cycle, it cannot be decided whether the higher procession rate of the immunoproteasome is due to an accelerated uptake and release of peptides and/or to a general increase in the catalytic capacity of its active sites. The finding that the immunoproteasome possesses a higher turnover rate than its constitutive counterpart is in agreement with previous observations (Boes, et al., 1994;Cardozo and Kohanski, 1998;Kuckelkorn, et al., 1995).

Using the substrate LLO-27mer, the differences in the overall procession rate of the constitutive proteasome and the immunoproteasome are very similar to those obtained for the pp89-25mer. In addition, there are significant alterations of the cleavage probabilities at four cleavage sites which in a concerted fashion give rise to an enhanced production of the epitope (VAYGRQVYL) by the immunoproteasome.

[page 88↓]

In summary, the results obtained with two different oligomeric substrates show that the kinetic effects associated with replacement of the constitutive proteasome by the immunoproteasome can be subdivided into a non-specific enhancement of the overall procession rate and peptide-bond specific alterations of cleavage probabilities. Since the latter effects are clearly restricted to a few cleavage sites, it seems unlikely that the exchange of the active-site subunits by their interferon-inducible counterparts leads to a general stimulation of the trypsin-like and chymotrypsin-like activities accompanied by a depression of the peptidylglutamyl-peptide-hydrolyzing activity as postulated in several previous studies (Aki, et al., 1994;Boes, et al., 1994;Cardozo and Kohanski, 1998;Gaczynska, et al., 1996;Gaczynska, et al., 1993;Kuckelkorn, et al., 1995;Toes, et al., 2001). In particular, the absence of changes in the cleavage probabilities at the three leucine residues present in the two substrates tested is hardly compatible with the common view (Groettrup, et al., 2001) that the immunoproteasome possesses a generally increased inclination for cleavages after certain categories of P1 residues (hydrophobic, branched chain, positively charged).

Recently, Toes et al. (Toes, et al., 2001) compared the fragment patterns of denatured enolase-1 (436 amino acids) generated by the constitutive proteasome and the immunoproteasome. Only about 25% of the peptides produced by the immunoproteasome were also found in constitutive proteasome digests. Such a diversity in the peptide pools generated by the two proteasomes was not seen here. For both oligomeric substrates, the peptide pools detected in the digests were identical for both types of proteasome. The various peptides differed only in their amounts, which to a large extent could be explained by differences in the overall procession rate. The obvious inconsistency of the results reported here with those of Toes et al. is remarkable and may have two reasons. First, it is conceivable that the mechanisms by which the 20S proteasome degrades a denatured long protein substrate and a relatively short (25 or 27 residues long) oligopeptide differ, in that threading of a 436-residue peptide chain through the proteasome may pose additional constraints on the accessibility of the active sites. Second, a moderate (2-5 fold) variation of cleavage probabilities as found for some cleavage sites of the LLO-27mer may amplify to larger variations (4-25 fold) of the respective peptide amounts. Given that the abundance of a considerable portion of peptides derived from a long substrate is close to the detection threshold, such variations in peptide amounts could result in an apparent 'loss' or 'appearance' of peptides.
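One way to read the quoted amplification, offered here as an interpretation rather than a derivation given in the text, is that a peptide requires a cleavage at both of its ends, so that changes of the two flanking cleavage probabilities multiply. The function name and the assumption that all other factors stay unchanged are illustrative.

```python
def amount_fold_change(fold_n_term, fold_c_term):
    """Fold change of a peptide's generation probability when the cleavage
    probabilities at its two flanking sites change by the given factors.

    Assumes the probability is proportional to the product of both flanking
    cleavage probabilities and that all other factors stay unchanged.
    """
    return fold_n_term * fold_c_term


low = amount_fold_change(2, 2)    # both flanking sites change 2-fold
high = amount_fold_change(5, 5)   # both flanking sites change 5-fold
```

Under this reading, 2-fold and 5-fold changes at both flanking sites reproduce the quoted 4-fold and 25-fold variations of peptide amounts.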

[page 89↓]

It has to be emphasized that the model in its present form was established to describe the degradation kinetics of oligopeptides as typically used in in vitro digests. Extension of this approach to kinetic experiments with long substrates will certainly require modifications of some basic assumptions, e.g. concerning the monotonic increase of the procession rate with peptide size or the statistical independence of cleavage combinations.

4.6 Summary

Existing algorithms describing protein degradation by the proteasome deliver poor results when used to identify epitopes by their predicted C-terminal cleavage. This is believed to be the consequence of the lesser quality of experimental data available for training of these prediction algorithms. To tackle this problem, a novel protocol to interpret proteasomal digests was developed. This protocol addresses two problems: (1) How to quantify the amounts of peptides present in a digest when only MS data is available, and (2) how to extract cleavage rates from a digest in which fragments are re-processed.

The conversion of MS-signals into peptide amounts is realized using mass balance equations and assuming a linear correlation between peptide amounts and their MS-signals. The amounts calculated with this approach are in good agreement with those determined using calibration curves. Problem (2) is addressed by developing a kinetic model of proteasomal digests. By fitting this model to the amount profiles from experimental digests, numerical values for cleavage rates are obtained, which are free parameters of the model. Comparing these fitted model parameters for digests made by constitutive and immuno-proteasomes shows that the differences in observed peptide amount profiles can to a large extent be explained by an enhanced procession speed of the immuno-proteasome.

[page 90↓]
