
1.  Introduction

1.1. Genetic and Epigenetic information

The hereditary material in all known biological systems (except prions) consists of nucleic acids, in which information is encoded in the sequence of four nitrogenous bases, viz. adenine, guanine, cytosine and thymine (uracil in RNA). This forms the genetic information, which codes for the complete repertoire of functional molecules. However, not all functions are required at all times, and only a subset of genes is expressed at any given time. Expression of genes is therefore regulated in such a way that some genes are kept repressed and are called into action by appropriate environmental cues. As we move from procaryotes to eucaryotes, and from lower to higher eucaryotes, genome size and gene number increase, as does the complexity of the organisms. In higher eucaryotes, complex functions are performed by specialized cellular systems, for example the nervous system and the muscular system, each requiring expression of a unique set of genes that makes up its distinct phenotype. Thus, even though cells from different tissues within the same organism have the same genetic information, they have different gene expression programs. Once a specific gene expression program is established for a tissue, all progeny cells in that tissue faithfully inherit both the genetic information and the gene expression program to maintain the integrity of the tissue. To fulfill these requirements, organisms have evolved mechanisms to ‘mark’ genes as transcriptionally on or off in such a way that the sequence of the four bases is not changed. Such ‘marks’ form the epigenetic information which, along with the genetic information, is passed on to the daughter cells when a cell divides.

Here we examine some of the mechanisms by which transfer of genetic and epigenetic information occurs during cell division, which are essential for stable inheritance of phenotype by daughter cells.

1.2. Replication of genetic information

1.2.1. DNA replication origins

Studies on procaryotic DNA replication led to the replicon model, which provided a framework for understanding the initiation of DNA replication and its regulation. According to this model, a replicon is a genetic element that is replicated from a single origin of replication (the replicator), which is recognized by a specific positive regulatory protein (the initiator) (Jacob et al., 1964). In procaryotes, the whole genome makes up one replicon, in that replication initiates at a single defined origin from which the whole genome is replicated. Identification of origins in eucaryotes extended this model to the eucaryotic system. However, in contrast to procaryotes, the eucaryotic genome contains multiple origin sites where DNA replication initiates. In the unicellular eucaryote S. cerevisiae, origins of replication are composed of conserved DNA sequences spread over 100-150 bp, called autonomously replicating sequences (ARS) (Stinchcomb et al., 1979). The S. cerevisiae genome contains numerous such elements, and DNA replication initiates at a subset of them (Newlon and Theis, 1993). Metazoan origins of replication are more complex: they cannot be defined in terms of DNA sequence, and virtually any sequence can behave as an origin in Xenopus egg extracts and Drosophila cells (Cox and Laskey, 1991) (Smith and Calos, 1995) (Gilbert, 1998). Even though DNA replication initiates at defined sites in the genome during normal cellular DNA replication (Giacca et al., 1994) (Abdurashidova et al., 2000), these sites do not show any sequence conservation. Moreover, during early development there seems to be no sequence requirement for origin function, as observed in the early embryos of D. melanogaster and X. laevis, which can replicate virtually any DNA sequence (Blow, 2001).
Thus, it seems that even though the replicon model operates in eucaryotes, the requirement for specific sequence elements has been replaced by as yet unidentified feature(s). Various hypotheses have been proposed to explain the specificity of replication origins in higher eucaryotes, invoking the nuclear matrix, transcriptional activity of genes, chromatin structure, DNA sequence and DNA methylation (reviewed in DePamphilis, 1999). By whatever mechanism origins are selected in higher eucaryotes, DNA replication consistently initiates at these sites, and exact copies of the whole genome are synthesized during normal cell proliferation.

1.2.2. DNA replicates in discrete sites in the nucleus called replication foci

Cells grown in the presence of halogenated nucleotide analogues (e.g. BrdU) incorporate the analogue into genomic DNA during DNA replication. The halogenated nucleotides incorporated into DNA can be detected by immunocytochemical techniques, which allows DNA replication to be investigated at the cellular level (reviewed in Leonhardt and Cardoso, 1995). Using such an approach, it was observed that in cells grown in the presence of BrdU for a short period of time (pulse labeling), the BrdU signal was present at discrete sites in the nucleus (Nakamura et al., 1986). Since these sites correspond to regions of the chromosome that had incorporated BrdU by DNA replication, they are called replication foci (RF). After extraction of nuclei with physiological salt concentrations, non-ionic detergents and endonucleases, the labeled replicated DNA is retained as discrete sites in the nuclear matrix, indicating that the RF are stably tethered to the nuclear matrix (Jackson and Cook, 1986). It is suggested that each replication focus consists of a subchromosomal domain of adjacent replicons from the same chromosomal region that replicate together. These domains are stably maintained as single units through multiple rounds of cell division, and they occupy stable positions in the nucleus (Sparvoli et al., 1994) (Jackson and Pombo, 1998) (Ma et al., 1998) (Sadoni et al., 1999).

The sites of DNA replication can also be visualized as discrete foci where proteins involved in DNA replication and associated activities assemble during S phase (reviewed in Leonhardt and Cardoso, 1995). The first protein to be identified at these foci during S phase was PCNA (Celis and Celis, 1985) (Bravo and Macdonald-Bravo, 1987). Many other proteins have since been shown to be associated with these sites, including the maintenance DNA methyltransferase (DNMT1) (Leonhardt et al., 1992), DNA polymerase α (Hozak et al., 1993), DNA Ligase I (Cardoso et al., 1997), RPA-70 and the cell cycle regulators cyclin A and cdk2 (Cardoso et al., 1993), and DNA repair factors like uracil-DNA-glycosylase (Otterlei et al., 1999). Nuclear matrix preparations retain some replication factors, like PCNA, as well as polymerizing activity at these sites, indicating that the replication factors form insoluble complexes at the sites of DNA replication during S phase (reviewed in Leonhardt and Cardoso, 1995). Based on these findings, these sites were called replication factories or replication foci (we use the latter term throughout this work and abbreviate it as RF). Thus, RF can be defined as microscopically visible subchromosomal domains in the nucleus, tethered to the nuclear matrix, where DNA is being replicated and replication factors are concentrated.

It is estimated that each replication focus comprises on average 1 Mbp of DNA. Most mammalian replicons are in the size range of 75-150 kbp. Thus, it is estimated that each replication focus consists of at least 10 replicons (reviewed in Berezney et al., 2000). Since DNA replication proceeds bidirectionally in each replicon, each replication focus is expected to contain on average 20 replication forks. However, RF are heterogeneous in their size (0.25 µm to several µm) and lifetime (30 min to over 3 h) (Leonhardt et al., 2000a), and the number of replication forks might vary greatly. Based on studies on mouse 3T3 cells, it is estimated that the whole genome is replicated in ~10,000 RF (Ma et al., 1998). Not all of these RF are active at the same time; rather, they follow a defined pattern of activation throughout S phase, as discussed below.
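
The replicon and fork numbers quoted above follow from simple division, which the short Python sketch below reproduces. It is only a back-of-the-envelope illustration: the input values are the figures given in the text, while the function and constant names are hypothetical and chosen for this example.

```python
# Back-of-the-envelope estimate of replicons and forks per replication focus.
# Input values are the figures quoted in the text; all names are hypothetical.

def replicons_per_focus(focus_bp: float, replicon_bp: float) -> float:
    """Number of replicons of a given size that fit into one focus."""
    return focus_bp / replicon_bp

FOCUS_BP = 1_000_000                   # ~1 Mbp of DNA per replication focus
REPLICON_RANGE_BP = (75_000, 150_000)  # typical mammalian replicon sizes
FORKS_PER_REPLICON = 2                 # replication is bidirectional

for size in REPLICON_RANGE_BP:
    n = replicons_per_focus(FOCUS_BP, size)
    forks = n * FORKS_PER_REPLICON
    print(f"{size // 1000} kbp replicons: {n:.1f} per focus, {forks:.1f} forks")
```

With a replicon size of 100 kbp, in the middle of the quoted range, the same calculation gives 10 replicons and 20 forks per focus, which matches the averages stated in the text.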

1.2.3. Temporal and spatial order of DNA replication

Specific regions of eucaryotic chromosomes replicate at defined times during S phase, and this timing is correlated with the transcriptional competence of the DNA elements (Goldman et al., 1984) (Hatton et al., 1988). In general, it was observed that transcriptionally active regions replicate during early-S phase, whereas transcriptionally inactive regions can replicate at any interval during S phase. The more condensed heterochromatin, which is typically found at centromeric regions, has been shown to replicate during late-S phase (Ten Hagen et al., 1990). At the cellular level, this temporal program of replication is reflected in the number, size and location of RF throughout S phase (Nakayasu and Berezney, 1989) (O'Keefe et al., 1992). This results in the formation of distinct spatial patterns of RF in the nucleus during S phase (Fig. 1.1). Typically, during early-S phase, when most of the euchromatin replicates, the RF are distributed throughout the nucleus. During early to mid-S phase (called mid-S phase throughout this work), the RF can be observed as discrete perinuclear and perinucleolar sites. During mid to late-S phase (called late-S phase throughout this work), the highly condensed heterochromatin replicates and the RF form large "donut" shaped structures. The existence of these patterns of RF has been shown in living mammalian cells by labeling the RF with GFP fused to core replication factors, like DNA Ligase I or PCNA (Cardoso et al., 1997) (Leonhardt et al., 2000a). Such studies have shown that replication proteins continuously assemble at many subchromosomal domains to form RF, and continuously disassemble from sites that have completed replication (reviewed in Cardoso et al., 1999). This cycle of assembly and disassembly progresses throughout S phase, giving rise to the distinct patterns of RF.



Fig. 1.1. Distinct spatio-temporal patterns of RF in S phase mammalian nuclei. Mouse cells in S phase displaying the distinct patterns of RF. Euchromatin replicates in early-S phase and heterochromatin replicates in late-S phase. The RF are labeled with a GFP-PCNA fusion protein.

The discovery of RF indicated that replication does not occur randomly throughout the nucleus, but is organized in discrete domains in time and space. Formation of such functional domains could have consequences for the regulation of replication and for the formation and maintenance of chromatin states. To understand the process of replication as it occurs in vivo, it is essential to define the mechanisms by which the various proteins assemble at the RF (Leonhardt et al., 2000b).

1.2.4. Specific protein sequences mediate assembly of replication factors at RF

Replication factors undergo dynamic redistribution in the nucleus during S phase. In G1 phase, most replication proteins are diffusely distributed in the nucleus. On entry into S phase, replication proteins form punctate patterns that co-localize with RF (reviewed in Leonhardt and Cardoso, 1995). This association with RF is mediated by specific peptide sequences in the replication proteins called replication foci targeting sequences (RFTS). The first such domain was identified in the maintenance methyltransferase (DNMT1) and was called the targeting sequence (TS) (Leonhardt et al., 1992). Fusion of the TS to a heterologous protein like the β-gal epitope mediated association of the fusion protein with RF. It is interesting to note that DNMT1 is an enzyme that catalyzes transfer of a methyl group to cytosine residues in DNA and is not involved in the process of DNA replication per se. The association of DNMT1 with RF is best interpreted as a feature suited to its role in maintaining methylation patterns, whereby it can methylate newly synthesized DNA at the site of synthesis. Later on, two more domains that mediate association with RF were identified in DNMT1. One of these domains interacts with PCNA and is called the PCNA binding domain (PBD); site-directed mutagenesis of this domain that abolished PCNA binding also abolished association with RF (Chuang et al., 1997). The other domain identified to mediate association with RF is the PBHD/BAH domain (Liu et al., 1998).

The second protein in which an RFTS was mapped is DNA ligase I (Cardoso et al., 1997). Subsequently, it was shown that the region in DNA ligase I that mediates association with RF binds to PCNA via a domain that is related to the PBD in DNMT1. This interaction of DNA ligase I with PCNA was also shown to be essential for association with RF (Montecucco et al., 1998). Another protein whose association with RF is mediated by a similar PBD is DNA polymerase η (Kannouche et al., 2001).

Over the years, many proteins involved in DNA metabolism and cell cycle regulation have been shown to interact with PCNA via the canonical PBD found in DNMT1 and DNA Ligase I (reviewed in Leonhardt et al., 1998), and some of them have been shown to be associated with RF. Considering the presence of a PBD in various DNA replication and repair factors, and the requirement of the PBD for association of DNA Ligase I with RF, it has been suggested that the mechanistic basis for the association of PBD-containing proteins with RF is an interaction of the PBD with PCNA (Montecucco et al., 1998). In the scheme shown in Fig. 1.2, the PCNA trimer forms the central core encircling DNA, and various replication factors associate with PCNA via the PBD. The PBD is conserved in homologous proteins from archaebacteria, yeast, worms, flies, amphibians and mammals (Warbrick et al., 1998). Such conservation across different classes of organisms suggests that recruitment of proteins to RF by the PBD-PCNA interaction is a mechanism conserved throughout evolution. However, like DNMT1, other PBD-containing proteins could have additional domains involved in mediating their association with RF. In the case of DNMT1, all three RFTS are conserved in the DNMT1 homologues from metazoans (this study), indicating that they are essential. It is suggested that assembly of a replication focus might involve a web of unique interactions among replication factors (Fig. 1.2) (Leonhardt et al., 1992). To better understand the assembly of replication proteins at RF, it is essential to learn the roles played by these additional RFTS (Leonhardt et al., 2000b).

Fig. 1.2. Schematic representation of RF. (A) The three regions in the N-terminal domain that have been shown to independently associate with RF. (B) Schematic of the RF. Various proteins are shown as boxes with protrusions to depict the protein domains that interact with other members of the RF by protein-protein interactions. The different shapes of the protrusions are meant to depict different domains that may play a role in targeting proteins to the RF. The central green ring is the PCNA trimer that encircles DNA (not shown). Association of proteins with the RF is principally mediated by interaction with PCNA. DNMT1 and DNA Lig I are two typical proteins which are targeted to the RF by interaction of their PBD with PCNA. The TS and PBHD are depicted to interact with other unidentified proteins in the complex.



1.3. Epigenetic information

Epigenetic information is defined as a heritable mark on DNA that does not change the DNA sequence but that modulates gene activity, and that is stably inherited through mitotic/meiotic divisions (Wu and Morris, 2001).

1.3.1. Types of epigenetic information

Eucaryotic genomes contain two types of epigenetic marks:

  1. Histone modifications: The DNA in eucaryotes is wound around a protein core of histones to form the nucleosome. Four types of histones make up the octameric nucleosome core: an H3-H4 tetramer and two H2A-H2B dimers. All four histones are small basic proteins that are closely related to each other in that they share a globular motif called the histone fold. In addition to the histone fold, each of the core histones has a long N-terminal tail, which is rich in basic amino acids and extends out from the histone core. These histone tails are subject to several types of post-translational modifications, viz. acetylation, phosphorylation, methylation, ubiquitination and ADP-ribosylation (Berger, 2002). Covalent modifications in the globular domain have also been described (reviewed in Varga-Weisz and Dalgaard, 2002). These modifications were originally thought to affect chromatin structure by influencing histone-DNA and histone-histone contacts, thereby influencing transcription. However, observations made in the past two to three years have led to a "histone code hypothesis" that proposes an active and decisive role for histone modifications in chromatin function (Strahl and Allis, 2000). According to this hypothesis, distinct histone tail modifications, individually or in combination, create specific binding sites for various chromatin modifiers with distinct functions, thereby inducing formation of specialized chromatin domains. Such domains would have far-reaching consequences for processes like transcription, replication, recombination and mitosis. An example supporting the histone code hypothesis is the opposing effects of methylation of the histone H3 tail at lysine-4 (H3K4Me) and lysine-9 (H3K9Me) on transcriptional activity (reviewed in Lachner and Jenuwein, 2002). The presence of H3K9Me is correlated with transcriptionally silenced chromatin, while H3K4Me marks transcriptionally active regions (Noma et al., 2001) (Litt et al., 2001).
Mechanistically, H3K9Me attracts the transcriptionally repressive protein HP-1, which induces formation of a silent chromatin domain (Lachner et al., 2001) (Bannister et al., 2001). In contrast, H3K4Me prevents association of the negatively acting nucleosome remodelling and histone deacetylation (NuRD) complex, and induces formation of a transcriptionally active domain (Nishioka et al., 2002). Even though very little is known about how regions in the genome are selected for specific histone modifications, it is now clear that histone modifications function as epigenetic marks that can stably establish gene expression states.
  2. DNA methylation: The DNA of most organisms is modified by a post-replicative process that results in three types of methylated bases in DNA: C5-methylcytosine (5mC), N4-methylcytosine and N6-methyladenine. The latter two are more widespread in procaryotes, while the former is the major class of methylated base in all organisms. Formation of 5mC is accomplished by an enzyme called DNA methyltransferase (DNA MTase), which transfers a methyl group from S-adenosyl-L-methionine (SAM) to carbon-5 in the pyrimidine ring of cytosine (Fig. 1.3) (Wu and Santi, 1987). Briefly, in this process a cysteine thiol of the enzyme attacks carbon-6 of cytosine and forms a covalent DNA-protein intermediate. The addition of the cysteine thiol activates carbon-5, allowing transfer of the methyl group from SAM and release of S-adenosyl-L-homocysteine (SAH). This reaction mechanism, and the enzyme involved, are conserved across the whole spectrum of organisms. DNA methylation is a covalent modification of DNA that does not change the DNA sequence, but has an influence on gene activity. Although in procaryotes one of the major roles of DNA methylation is to protect host DNA from the restriction-modification system, in eucaryotes the role of DNA methylation as an epigenetic mark has gained great importance. In vertebrates, DNA methylation is distributed throughout the genome and primarily occurs at CpG sequences, producing methyl-CpG symmetrically on both strands of the DNA (reviewed in Bird, 2002). In human somatic cells, 5mC constitutes about 1% of total DNA bases, which corresponds to methylation of 70-80% of all CpG dinucleotides in the genome (Ehrlich and Wang, 1981). Some of the remaining unmethylated CpG dinucleotides constitute the CpG islands and are found at promoter segments of genes.
Some of these CpG islands become methylated during development, and this results in stable silencing of the gene, for example genes silenced on the inactive X chromosome and silenced alleles of imprinted regions. In general, it is accepted that promoter methylation is one of the regulatory mechanisms employed in gene silencing (reviewed in Cardoso and Leonhardt, 1999a). However, it is not an absolute requirement, as there are examples where a CpG island in a promoter is unmethylated while the gene is still kept silent; for example, the CpG island in the human α-globin gene promoter is unmethylated in both erythroid and non-erythroid tissues (Bird et al., 1987). Such cases might now be explained by the role of histone modifications in gene silencing. Nevertheless, DNA methylation is well established as essential for mammalian development and for the regulation of gene expression. This is best emphasized by the fact that disruption of the gene(s) encoding the enzymes that catalyze DNA methylation is lethal in mice early in development (Li et al., 1992) (Lei et al., 1996) (Okano et al., 1999).
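
The statement that CpG methylation is symmetric on both strands rests on the fact that the dinucleotide CG is its own reverse complement, so every CpG on one strand faces a CpG on the other. The Python sketch below illustrates this with an arbitrary, made-up sequence; the sequence and the helper names are assumptions for illustration only and are not taken from any cited work.

```python
# Toy illustration: CG is its own reverse complement, so each CpG on the top
# strand is paired with a CpG on the bottom strand.
# The sequence and all helper names are made up for this example.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def cpg_positions(seq: str) -> list:
    """0-based positions of the C in every CpG dinucleotide of seq."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]

top = "ATCGGACGTTCGA"              # arbitrary example sequence
bottom = reverse_complement(top)

top_sites = cpg_positions(top)
# Convert bottom-strand CpG positions back into top-strand coordinates.
bottom_sites = sorted(len(top) - 2 - i for i in cpg_positions(bottom))

print(top_sites, bottom_sites)     # the two lists coincide
```

Because both strands carry a CpG at the same sites, a methylation pattern on one strand can in principle be read off and copied onto the other.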



Fig. 1.3. Mechanism of transfer of methyl group to C5-cytosine based on the mechanism proposed by Wu and Santi (Wu and Santi, 1987) for thymidylate synthase and tRNA-(uracil-5)methyltransferase. A cysteine thiol of the enzyme attacks the 6-carbon of cytosine and forms a covalent DNA-enzyme intermediate. The resulting carbanion at 5-carbon of cytosine then attacks the methyl group of SAM (AdoMet) forming a covalent bond with the methyl group and SAH (AdoHcy) is released. Elimination of the conjugate occurs through abstraction of the proton from carbon-5 by a base (B:) to yield the product 5-methylcytosine.

1.3.2. Role of DNA methylation

DNA methylation has been demonstrated to play important roles during development, differentiation, aging, X-chromosome inactivation, genomic imprinting, tumourigenesis and transposon inactivation (reviewed in (Cardoso and Leonhardt, 1999a) (Leonhardt and Cardoso, 2000) (Bird, 2002) (Ehrlich, 2002) (Yoder et al., 1997b)). Most of these roles arise as a consequence of the effect of DNA methylation on transcription and chromatin structure as discussed below.

Inhibitory effect on transcription: There is a strong correlation between DNA methylation and gene silencing. For example, the CpG islands that span promoter regions are heavily methylated on the inactive X chromosome, while the corresponding regions on the active X chromosome are not (reviewed in Bird, 2002). DNA methylation can inhibit transcription in three ways (Fig. 1.4A) (reviewed in Leonhardt and Cardoso, 2000). Firstly, DNA methylation can directly block transcription factor binding, which has been shown to be the case for some transcription factors (AP-2, c-Myc/Myn, E2F and NF-κB) (Becker et al., 1987). However, other transcription factors are not sensitive to methylation (Sp1, CTF and YY1) (Tate and Bird, 1993). Secondly, DNA methylation represses promoter activity indirectly by attracting factors that specifically recognize and bind methylated cytosines, thereby blocking access of transcription factors to promoters (Fig. 1.4B). These factors share a domain called the methyl-CpG-binding domain (MBD). Of the five MBD-containing proteins identified, four (MBD1, MBD2, MBD3, and MeCP2) have been implicated in DNA methylation dependent transcriptional silencing. Thirdly, DNA methylation represses transcription by altering chromatin structure through the MBD proteins, which can function as part of complexes containing nucleosome remodeling factors (for example, the MeCP1 complex consists of MBD2 + Mi2/NuRD; MeCP2 binds Sin3/HDAC) (Fig. 1.4C). In this case, DNA methylation leads to deacetylation of histones, thereby suppressing transcription. Thus, DNA methylation can modulate the histone code and lead to repression.

Effect on chromatin structure: DNA methylation is known to have a profound influence on chromatin structure (reviewed in Leonhardt and Cardoso, 2000). For example, mutation of a gene encoding a DNA MTase (DNMT3B, discussed later) is linked to a hereditary disorder called ICF syndrome (Xu et al., 1999) (Okano et al., 1999). Cells from these patients show deletions or duplications of entire chromosomal arms, isochromosomes and centromere breakage (Franceschini et al., 1995). DNA methylation studies in ICF patients showed hypomethylation of classical satellites II and III, which are major components of constitutive heterochromatin (Jeanpierre et al., 1993). These regions are normally highly methylated, indicating that DNA methylation is essential for proper centromere structure and stability (reviewed in Robertson and Wolffe, 2000).


Fig. 1.4. Models for effect of methylation on gene activity. Unmethylated DNA is depicted as half circles and methylated DNA as filled circles. (A) Methylation directly prevents binding of transcription factor (TF) thereby inhibiting transcription. (B) Methyl DNA binding proteins bound to the methylated promoter prevent binding of TF and inhibit transcription. (C) MeCP2 complex containing HDAC binds to methylated

1.3.3. Regulation of DNA Methylation

The mammalian genome undergoes sweeping changes in its methylation pattern, the most dynamic being observed during development. In mice, just after fertilization the male pronucleus is rapidly demethylated, while the maternal genome progressively loses methylation until the blastocyst stage. Methylation levels decrease to ~30% of those in adult somatic cells but return to higher levels during implantation (Monk et al., 1987) (Oswald et al., 2000) (Mayer et al., 2000) (Reik et al., 2001). In effect, these changes erase the existing DNA methylation patterns (except in some regions like imprinted loci) and are followed by the establishment of new DNA methylation patterns. Such reprogramming of DNA methylation patterns is essential for setting up tissue-specific gene expression, X chromosome inactivation in female mammals and genomic imprinting. For normal functioning, once methylation patterns are established, they have to be faithfully inherited by daughter cells. Changes in methylation patterns sometimes occur in adult tissues with harmful effects. For example, in somatic cells, some CpG islands become methylated during aging and tumourigenesis (reviewed in (Cardoso and Leonhardt, 1999a) (Bird, 2002)). Formation of many tumours has been correlated with hypomethylation and/or hypermethylation of specific regions in the genome, resulting in activation of oncogenes or suppression of tumour suppressors (reviewed in (Leonhardt and Cardoso, 2000) (Ehrlich, 2002)). The changes in DNA methylation that occur during all biological processes, both normal and diseased, are mediated by three processes, viz. de novo methylation, maintenance methylation and demethylation (Fig. 1.5). An understanding of how these processes function is central to our understanding of development and disease.


Fig. 1.5. Processes that change or maintain DNA methylation pattern. In vertebrates DNA methylation occurs mainly in CpG dinucleotides, depicted here as CG. Methyl residues are depicted as ‘m’. New methylation patterns are established by the process of de novo methylation (left). Existing methylation patterns can be erased by demethylation (right). During DNA replication (centre), the newly synthesized DNA strand (thin line) is unmethylated while the parent strand (thick line) retains its methylation pattern. The methylation pattern from the parent strand is copied on to the daughter strand by maintenance methyltransferase.

  1. De novo methylation: This is the process in which unmethylated sites in DNA are methylated, resulting in the formation of new methylation patterns (Fig. 1.5). The highest de novo MTase activity is detected in embryonal carcinoma (EC) and embryonic stem (ES) cells (Stewart et al., 1982) (Lei et al., 1996), and specific MTases have been shown to be involved in de novo methylation (discussed in the next section). In mammals, most de novo methylation occurs during development, when both paternally and maternally derived genomes undergo gross DNA methylation (Monk et al., 1987). Other examples of de novo methylation include methylation of sites in the genome where viral DNA integrates (Toth et al., 1990), age-related hypermethylation of the c-myc gene in the liver of mice (Ono et al., 1989) and methylation of the estrogen receptor (ER) gene in ageing colorectal mucosa, resulting in predisposition to sporadic colorectal tumorigenesis (Issa et al., 1994). Even though there are many examples of de novo methylation, little is known about how specific DNA sequences are selected for DNA methylation. Many observations indicate that the DNA methylation machinery is targeted to transcriptionally inactive regions. In the case of the X-linked Hprt gene, DNA methylation occurs after chromosome inactivation and transcriptional silencing (Lock et al., 1987). Wherever examined, DNA methylation occurs after methylation of histone H3 at lysine-9 (H3K9Me), the other epigenetic mark characteristic of silent chromatin (Heard et al., 2001) (Bachman et al., 2003). Notably, silencing occurs prior to DNA methylation and concomitant with formation of H3K9Me. These studies suggest that DNA methylation plays an important role in stabilizing the silenced state established by histone modifications.
  2. Maintenance methylation: This is the process by which DNA methylation patterns are maintained after each round of DNA replication. Since each round of DNA replication results in a newly synthesized strand that is unmethylated while the parent strand is methylated, a mechanism is required to methylate the newly synthesized strand (Fig. 1.5). It was proposed that once a methylation pattern has been set by de novo methylation, it would be clonally inherited by the action of a maintenance DNA methyltransferase specific for hemi-methylated CpG sites (Riggs, 1975) (Holliday and Pugh, 1975). Biochemical experiments have shown that mammalian DNA methyltransferases purified from somatic cells prefer hemi-methylated DNA as substrate (Gruenbaum et al., 1982) (Bestor and Ingram, 1983). Such an activity has also been demonstrated in vivo: in vitro methylated DNA introduced into mouse cells by transfection retained its methylation pattern after several rounds of replication. In contrast, cells transfected with unmethylated DNA showed no methylation of the DNA, suggesting that the cell contains MTases that specifically "replicate" methylation patterns (Wigler et al., 1981). An MTase, DNMT1 (discussed in the next section), with a preference for methylating DNA containing hemi-methylated CpG dinucleotides was later cloned and characterized (Bestor et al., 1988) (Bestor, 1992). Several biochemical, cell biological and genetic studies have shown that DNMT1 is the key enzyme involved in maintaining DNA methylation patterns.
  3. Demethylation: This is the process by which DNA methylation patterns are erased (Fig. 1.5). Demethylation mainly occurs during preimplantation development, but also occurs throughout development as a prelude to transcriptional activation. The demethylation process is less well understood than DNA methylation, and two possible mechanisms have been reported:

Passive demethylation: This is the gradual decrease in DNA methylation levels that results from the absence of maintenance methylation at each round of DNA replication. For example, passive demethylation occurs after inactivation of cellular MTases with 5AzaC (Jones and Taylor, 1980). The kinetics of demethylation of the maternally inherited genome during preimplantation development correlate with a successive loss of methylation at each chromosome replication cycle (Rougier et al., 1998). This is explained by active retention of DNMT1 in the cytoplasm during development from the oocyte to the blastocyst stage (Cardoso and Leonhardt, 1999b) (Carlson et al., 1992).

Active demethylation: Active demethylation occurs independently of DNA replication and is mediated by enzymes. Cases where this is observed include the global demethylation that occurs in the zygotic paternal genome (Mayer et al., 2000), demethylation of the vitellogenin gene in chick liver upon induction of transcription (Wilks et al., 1984), demethylation of globin gene stimulated in erythroleukemia cells (Razin et al., 1986) and the genome-wide demethylation in differentiating myoblasts (Jost and Jost, 1994). Three main biochemical mechanisms have been proposed to carry out active demethylation: excision of the methylated base by a glycosylase, excision of the methylated nucleotide, or direct replacement of the methyl group by a hydrogen atom (Kress et al., 2001). However, the molecular mechanism behind these processes is still not known.
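
The replication-dependent dilution described under passive demethylation can be illustrated with a toy calculation. The sketch below is a deliberate simplification and not a model taken from the cited studies: it follows a single CpG site in a population of DNA duplexes and assumes that the starting duplex is fully methylated, that no de novo methylation occurs, and that maintenance methylation either always succeeds or is completely absent. The function names and the representation of a duplex as a pair of booleans are illustrative choices.

```python
from fractions import Fraction


def replicate(duplexes, maintenance):
    """One round of semi-conservative replication at a single CpG site.

    Each duplex is a (strand_a, strand_b) pair of booleans, True = methylated.
    Every parental strand becomes the template of one daughter duplex and is
    paired with a newly synthesized strand. With maintenance, the new strand
    is methylated only if its template is; without it, the new strand stays
    unmethylated.
    """
    daughters = []
    for duplex in duplexes:
        for template in duplex:
            new_strand = template if maintenance else False
            daughters.append((template, new_strand))
    return daughters


def methylated_fraction(duplexes):
    """Fraction of all strands in the population that carry the methyl mark."""
    strands = [s for duplex in duplexes for s in duplex]
    return Fraction(sum(strands), len(strands))


def time_course(rounds, maintenance):
    """Methylation level before replication and after each of `rounds` rounds."""
    duplexes = [(True, True)]  # assumption: one fully methylated duplex
    levels = [methylated_fraction(duplexes)]
    for _ in range(rounds):
        duplexes = replicate(duplexes, maintenance)
        levels.append(methylated_fraction(duplexes))
    return levels
```

Under these assumptions, `time_course(3, maintenance=False)` yields methylation levels of 1, 1/2, 1/4 and 1/8, i.e. the level halves at every replication cycle, whereas `time_course(3, maintenance=True)` stays at 1 throughout.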

1.3.4. Enzymes involved in methylating DNA

Numerous DNA methyltransferases (MTases) have been identified and cloned from both procaryotes and eucaryotes, and have been shown to share a conserved catalytic domain in the form of 10 small sequence motifs. Based on a phylogenetic comparison of the catalytic domains, the eucaryotic MTases are grouped into 5 families: DNMT1, DNMT2, DNMT3, Masc1 (only one member, from a fungus), and CMT (chromomethylase, only in plants) (Fig. 1.6) (Colot and Rossignol, 1999). All eucaryotic MTase families, except the DNMT2 family, consist of proteins which have an additional N-terminal domain with various functional motifs. Since DNA methylation is an important epigenetic mark, the presence of many different MTases suggests that they have specialized functions in regulating DNA methylation. Here we examine the known mammalian DNA methyltransferases.

Fig. 1.6. Five families of eucaryotic MTases. Phylogenetic relationship between known MTases based on comparison of the conserved motifs in the catalytic domains (adapted from (Colot and Rossignol, 1999)). The eucaryotic MTases group into five families (boxed): DNMT1, DNMT2, DNMT3, Chromomethylase (CMT), Masc1 (only one member). Some of the recently identified proteins (CMT3 and DNMT3L) are not shown. MTases from eubacteria and archaebacteria are divergent and lie scattered.

DNMT1: DNMT1 was the first mammalian MTase to be cloned, and is referred to as the maintenance MTase because it is responsible for maintaining DNA methylation patterns. Based on functional and structural data it is suggested to result from fusion of three genes, one of them being an ancestral procaryotic DNA MTase (Margot et al., 2000). The enzyme consists of two main domains - the C-terminal catalytic domain (570 amino acids) and the N-terminal regulatory domain (1051 amino acids) - linked by a stretch of repeating Gly-Lys dipeptides (linker) (Fig. 1.7). DNMT1 homologues have been identified in a wide range of organisms including fungi, plants, sea urchin, amphibians, fish, birds and mammals, all having a similar structure with a long N-terminal domain and a shorter C-terminal catalytic domain. In mammalian DNMT1, the N-terminal domain has various motifs with specific functions (Fig. 1.7) (reviewed in (Leonhardt and Cardoso, 2000) (Bestor, 2000)). Two properties of DNMT1 distinguish it as the maintenance MTase. Firstly, the enzyme shows a preference for hemimethylated DNA (Bestor, 1992), and secondly it is specifically relocated to RF when the cell enters S phase (Leonhardt et al., 1992). These observations paved the way for an understanding of the mechanism by which cells maintain their methylation pattern. At every round of DNA replication in organisms with a methylated genome, the product is double-stranded DNA in which the parent strand retains the methylation pattern while the newly synthesized daughter strand is unmethylated (Fig. 1.5). The preference of DNMT1 for hemimethylated CpG sequences enables DNMT1 to copy the methylation pattern onto the newly synthesized strand. By virtue of being targeted to RF, DNMT1 would be positioned exactly at the site where its substrate, hemi-methylated DNA, is synthesized (Fig. 1.8). Three regions in the N-terminal domain of DNMT1, viz. TS, PBD and PBHD/BAH, have been reported to mediate association with RF (Leonhardt et al., 1992) (Chuang et al., 1997) (Liu et al., 1998).
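
The copying of a methylation pattern at hemimethylated CpG sites can be made concrete at the sequence level. The following is a minimal sketch under stated assumptions, not a description of the enzyme's actual kinetics: the preference for hemimethylated sites is idealized as an absolute rule, the example sequence and the function names are hypothetical, and positions are given as 0-based indices on the parental (top) strand. It relies on the fact that CpG is palindromic, so the cytosine of the CpG on the new strand lies opposite the guanine of the parental CpG.

```python
def cpg_sites(top_strand):
    """Return the 0-based indices of the C in every CpG of the top strand (5'->3')."""
    seq = top_strand.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def maintenance_methylation(top_strand, parent_methylated):
    """Copy the parental methylation pattern onto the newly made bottom strand.

    parent_methylated: set of top-strand indices of methylated cytosines.
    A CpG with its C at top-strand index i has its bottom-strand C paired
    with the G at index i + 1. The returned set holds the top-strand
    coordinates of the bases paired with newly methylated bottom-strand
    cytosines.
    """
    new_marks = set()
    for i in cpg_sites(top_strand):
        if i in parent_methylated:
            # Hemimethylated site: the mark is copied to the new strand.
            new_marks.add(i + 1)
        # Unmethylated CpG sites are left untouched (idealized: no de novo activity).
    return new_marks
```

For the hypothetical sequence `ACGTTCGACG`, the CpG cytosines lie at indices 1, 5 and 8. If only the cytosines at 1 and 8 are methylated on the parent strand, `maintenance_methylation("ACGTTCGACG", {1, 8})` returns `{2, 9}`, and the unmethylated site at index 5 stays unmethylated on both strands.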

Fig. 1.7. Domain structure of DNMT1. The somatic long isoform of DNMT1 is shown here. DMAP corresponds to the region in DNMT1 that binds DMAP (Rountree et al., 2000). PBD (PCNA binding domain) (Chuang et al., 1997), TS (targeting sequence) (Leonhardt et al., 1992) and PBHD (polybromo homology domain) (Liu et al., 1998) are reported to target to replication foci. P marks an identified phosphorylation site (Glickman et al., 1997). NLS is the nuclear localization signal (Cardoso and Leonhardt, 1999b). Zn-1 (Bestor, 1992) and Zn-2 (Chuang et al., 1996) are the two Zn binding domains. DB is a DNA binding domain just preceding the TS (Chuang et al., 1996). HDAC1 corresponds to the region in DNMT1 that binds to HDAC1 (Fuks et al., 2000).

In addition, other regions in the N-terminus have been demonstrated to have special functions (Fig. 1.7). At least three nuclear localization signals (NLS) have been identified that mediate nuclear import (Cardoso and Leonhardt, 1999b). A cysteine-rich region and two Zn-binding regions have been identified (Bestor, 1992) (Chuang et al., 1996). Using biochemical approaches, it was observed that the N-terminal domain can repress transcription, in part through interaction with histone deacetylases (HDAC1 and HDAC2) and a novel transcriptional repressor (DMAP1) (Fuks et al., 2000) (Robertson et al., 2000) (Rountree et al., 2000). The N-terminal domain has also been observed to mediate transcriptional repression directly through a region related to the trithorax-related protein HRX (Fuks et al., 2000). It is suggested that DNMT1 mediates recruitment of HDAC2 to late-RF and that this serves as a mechanism to deacetylate the acetylated histones that are assembled on newly replicated DNA (Rountree et al., 2000). Thus, in addition to its role in maintaining DNA methylation patterns, DNMT1 is also proposed to play a role in maintaining heterochromatin structure.


Fig. 1.8. Coupling of DNA methylation with DNA replication. Association of DNMT1 with the replication machinery mediated by RFTS (marked as T) couples maintenance of DNA methylation with DNA replication. The replication machinery is shown tethered to the nuclear matrix. RFTS mediated association of DNA Ligase I with the replication machinery is also shown.

Observations made on DNMT1 mutant mice created by targeted mutation of the Dnmt1 gene have strongly supported the role of DNMT1 as a maintenance methyltransferase (Li et al., 1992) (Lei et al., 1996). Homozygous Dnmt1 mutation causes a severe reduction of 5mC in ES cells and embryos. These mutant embryos die very early in development and were reported to be extensively demethylated at all sites in the genome that were examined. Loss of DNMT1 results in activation of silenced alleles of imprinted genes due to inability to maintain the methylation pattern at the imprinted loci (Li et al., 1993). These observations made on mice lacking DNMT1 have established DNMT1 as an enzyme responsible for maintaining DNA methylation patterns.

It is also suggested that DNMT1 might function as a de novo methyltransferase. This is based on in vitro experiments using extracts from various tissues and cell types in which DNMT1 had a significant activity on unmethylated DNA substrates (Yoder et al., 1997a). Other studies have shown that the de novo methyltransferase activity of mouse DNMT1 is higher than that of the known de novo methyltransferases (DNMT3 proteins, see below) (reviewed in (Bestor, 2000)). In conclusion, although DNMT1 could potentially methylate DNA de novo, genetic studies indicate that it is mainly involved in maintenance of methylation patterns. It is possible that in vivo it performs both functions.

DNMT2: DNMT2 forms a family of proteins that is related to pmt1p of S. pombe, an organism that does not show any 5mC in its DNA. In S. pombe, disruption of the pmt1+ gene results in no discernible phenotype, and purified pmt1p does not show any methylation activity in vitro (Wilkinson et al., 1995). However, deletion of a single amino acid in pmt1p restores catalytic activity (Pinarbasi et al., 1996). Disruption of the Dnmt2 gene in mouse ES cells did not yield any detectable effect on DNA methylation, nor did purified DNMT2 show any methylation activity in biochemical assays (Okano et al., 1998b). Homologues of DNMT2 have been identified in other vertebrates, D. melanogaster, plants and S. pombe (reviewed in (Bestor, 2000)), but it is not clear whether DNMT2 has any function in these organisms.

DNMT3: The DNMT3 family consists mainly of three enzymes, DNMT3A, DNMT3B and DNMT3L, that were identified from EST database searches. The catalytic domain of DNMT3A and DNMT3B proteins is more similar to the bacteriophage MTases than to DNMT1 and DNMT2. The murine Dnmt3a and Dnmt3b genes are highly expressed in undifferentiated ES cells but downregulated after differentiation and expressed at low levels in adult somatic tissue (Okano et al., 1998a). In contrast, DNMT1 is expressed at high levels both in ES cells and somatic cells. Biochemical experiments showed that DNMT3A and DNMT3B could methylate both unmethylated and hemi-methylated DNA with equal activity (Okano et al., 1998a). In vivo studies showed that DNMT3 proteins indeed have the ability to catalyze de novo methylation. This was shown by an assay wherein retroviral DNA is introduced into wild type and mutant ES cells and the methylation state of the retroviral DNA is tested after several days. In such an assay it was observed that ES cells deficient in both DNMT3A and DNMT3B (double mutant), completely lacked the ability to methylate the retroviral DNA (Okano et al., 1999). These observations have heralded DNMT3A and DNMT3B as enzymes that mediate de novo methylation.

Studies on the sub-nuclear localization of epitope-tagged DNMT3A and DNMT3B have shown that both specifically associate with pericentric heterochromatin in embryonic stem cells, while in embryonic fibroblasts only DNMT3A is associated with these sites (Bachman et al., 2001). This corroborates some of the observations made in mice mutant for Dnmt3 genes, which show that DNMT3B is important in methylating the centromeric repeats during early development, and not in differentiated cells (Okano et al., 1999). It is not clear what role DNMT3A has at the pericentric regions in differentiated cells. An alternative form of DNMT3A, DNMT3A2, produced from an alternative promoter in the Dnmt3a gene, exhibited localization to euchromatin (Chen et al., 2002). In contrast to these reports, it has been demonstrated that neither Dnmt3a nor Dnmt3b is associated with pericentric heterochromatin in mouse myoblast cells (C2C12) (Margot et al., 2001). These studies indicate that DNMT3 proteins are recruited to their target sites by non-overlapping mechanisms, and that these mechanisms might be specific to the developmental stage and cell type. It follows from these studies that the targeting mechanisms could be controlled to regulate de novo methylation. Further, such studies on the localization of the various methyltransferases should shed light on the mechanisms involved in targeting and the in vivo targets of these enzymes, which are essential for our understanding of the regulation of DNA methylation.

DNMT3L: DNMT3L is the most recent addition to the list of DNA MTases. It is closely related to DNMT3A and DNMT3B, but lacks the conserved residues in the catalytic domain that are essential for enzymatic activity. Consistent with this, it does not show any catalytic activity in vitro (Hata et al., 2002). DNMT3L has been shown to interact with DNMT3A and DNMT3B, and also co-localizes with these enzymes in nuclei of transfected COS cells (Hata et al., 2002). DNMT3L is specifically expressed in undifferentiated ES cells, and mice lacking DNMT3L are defective in establishing maternal genomic imprints (Hata et al., 2002) (Bourc'his et al., 2001). It is suggested that DNMT3L co-operates with the DNMT3 family proteins to carry out imprinting of genes during oogenesis and early mouse development (Hata et al., 2002) (Bourc'his et al., 2001).

1.3.5. Organisms that lack DNA methylation

Even though DNA methylation is an essential epigenetic mark in vertebrate systems, many lower eucaryotes, such as yeast, C. elegans and D. melanogaster, lack this modification. Although D. melanogaster was long considered to lack DNA methylation, there are studies showing the presence of low levels of 5mC in its genome (Achwal et al., 1984) (Gowher et al., 2000) (Lyko et al., 2000). Unlike in mammalian cells, most of the methylation here is found in CpA dinucleotides and not CpG. In D. melanogaster, 5mC accounts for just about 0.1% of the total cytosines (Gowher et al., 2000) as compared to ~2-10% of cytosines in mammalian cells (Ehrlich and Wang, 1981). The only DNA methyltransferase found in D. melanogaster is the DNMT2 homologue, which does not show methyltransferase activity in vitro (Tweedie et al., 1999) (Lyko et al., 2000). It is possible that this enzyme is active in vivo and is responsible for the traces of 5mC. No homologues of the functional DNA MTases, present in all other organisms with a methylated genome, have been observed in Drosophila. Even though there are insects whose genome is methylated, for example the cricket Acheta domesticus (Tweedie et al., 1999), it is not known whether they have a DNMT1 homologue. However, after complete sequencing of the Drosophila genome, it is now established that Drosophila does not have any homologue of DNMT1. The absence of a DNMT1 homologue might mean either that Drosophila never had one or that it lost it during evolution.
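
To put the two figures quoted above side by side, the short calculation below derives the fold difference in 5mC content between mammalian cells and D. melanogaster. It uses only the percentages given in the text (about 0.1% versus roughly 2-10% of cytosines); the variable names are illustrative, and the result is a rough comparison of approximate values rather than a measured quantity.

```python
from fractions import Fraction

# Values quoted in the text, as percentages of total cytosines.
drosophila_5mc = Fraction("0.1")
mammal_5mc_low, mammal_5mc_high = Fraction(2), Fraction(10)

# Exact ratios; Fraction avoids floating-point rounding in the division.
fold_low = mammal_5mc_low / drosophila_5mc
fold_high = mammal_5mc_high / drosophila_5mc

print(f"{fold_low}- to {fold_high}-fold")
```

On these numbers, mammalian genomes carry roughly 20 to 100 times more 5mC per cytosine than the Drosophila genome.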

Homologues of downstream effectors of methylation, the methyl DNA binding proteins, identified in D. melanogaster totally lack the highly conserved methyl DNA binding domain (MBD) found in other organisms whose genome is known to be methylated (Tweedie et al., 1999). Also, in some fungi (N. crassa and A. immersus) whose genome is methylated, the MTases are not essential. Absence of DNA methylation in some of these eucaryotes indicates that other epigenetic mechanisms, like histone modifications, are sufficient for gene regulation in these organisms.

1.4. Questions addressed in this work

To study mechanisms by which proteins associate with RF in diverse eucaryotes and the role of the regulatory sequences of DNMT1 in this association, we addressed the following questions:

Are the mechanisms that mediate association of proteins with RF conserved in evolution?

Many features of DNA replication are conserved in higher eucaryotes. Firstly, general features of the replication process itself, like bi-directional replication fork movement, continuous leading and discontinuous lagging strand synthesis, and the requirement of RNA primers to start DNA synthesis (Baker and Bell, 1998), are all conserved. Secondly, the proteins involved in controlling DNA replication and catalyzing the process of DNA replication are conserved in diverse eucaryotes (Leipe et al., 1999). Thirdly, organization of replication into RF that follow a spatio-temporal pattern is a conserved feature in various eucaryotes (Samaniego et al., 2002) (Ahmad and Henikoff, 2001). However, it is not known whether the mechanisms by which replication factors associate with RF are conserved in evolution. To determine this, we have analyzed the ability of the various domains in DNMT1 to associate with RF in Drosophila cells. Moreover, as shown in Fig. 1.8, maintenance of epigenetic information (DNA methylation) is coupled to DNA replication by association of DNMT1 with the replication machinery. In this regard, Drosophila is interesting as its genome is scarcely methylated and it lacks a DNMT1 homologue (see section 1.3.5; when this work was planned, Drosophila was still thought to lack DNA methylation completely). An understanding of the ability of the three targeting sequences in DNMT1 to associate with RF and subnuclear structures in Drosophila cells would tell us whether the extra targeting domains have specifically evolved to function in organisms with methylated and complex genomes, coupling DNA replication with maintenance of DNA methylation.

What is the function of the various regulatory sequences in controlling the subnuclear distribution of DNMT1 during the cell cycle?

As discussed earlier, it is suggested that PBD-containing replication proteins associate with RF via the PBD. DNMT1 has three targeting sequences (TS, PBD and PBHD) that have been reported to independently mediate association with RF. Here we sought to understand whether the two extra sequences, viz. TS and PBHD, have any specialized functions. Replication in eucaryotes follows a defined spatial and temporal order that reflects the state of transcriptional activity of the chromatin. In general, the sparsely methylated euchromatin replicates early and the densely methylated heterochromatin replicates late, which can be easily discerned by microscopic examination of the pattern of RF (Fig. 1.1). DNMT1 associates with the RF and maintains DNA methylation patterns. Considering this function of DNMT1, and the differences in methylation density at early-replicating euchromatin and late-replicating heterochromatin, we were prompted to investigate whether the three targeting sequences in DNMT1 have any preference for early or late replication foci. We directly addressed this question by analyzing the subnuclear localization of the three targeting sequences, each fused individually and in combinations to GFP/YFP, during different stages in S phase and throughout the entire cell cycle.

Are the regulatory sequences of the mammalian DNMT1 present in other proteins?

As mentioned earlier, there are five classes of DNA MTases (see Fig. 1.6). Only the DNMT1 family of proteins is purported to play a role in maintaining DNA methylation patterns. One line of evidence that strongly supports such a role for DNMT1 is its association with RF, which couples DNA replication with maintenance methylation. Such an analysis has been performed only for mammalian DNMT1, and it is not known whether plant and fungal DNMT1 proteins also localize to RF or other subnuclear structures. Here all the MTase family members and other proteins in the database were analyzed for the presence of sequences similar to the targeting domains in DNMT1. This would provide insight into whether and how other MTases and other nuclear proteins could associate with RF. This analysis may thus contribute to our knowledge of the evolution of nuclear architecture and the introduction of epigenetic information.


Certified document server of the Humboldt-Universität zu Berlin. HTML version created on 11.05.2004.