Epigenetics: DNA Methylation (1/3)

Published on 1 September 2025 at 12:00

Epigenetics is the study of the modifications applied to DNA which change how that DNA is used. In this essay series, the epigenetic mechanisms of DNA methylation, histone modification, and non-coding RNA will be discussed, with a particular focus on how the epigenetic marks work together to enact important developmental events such as X-chromosome inactivation. This first essay focuses on DNA methylation: the addition, removal, and interpretation of the methyl mark, how DNA methylation is localised in the genome, and how that location can change the effect of DNA methylation.


Epigenetics: DNA Methylation

By Ashley Shipley

1 September 2025

 

1. Introduction

The human genome contains approximately 19 900 genes1 that can be used by the cell to produce a protein. These proteins vary in function from structural aspects of cells to molecular transporters, from components of hair and nails to blood and bone. As each of the 17 to 36 trillion cells2 of the human body contains the full genome, every cell can hypothetically express all of these nearly 20 000 coding genes and produce all of the nearly 230 000 proteins3 currently identified. This, however, would produce cells that all appeared and functioned the same, with no difference between skin cells and muscle cells, for example. To produce such a complex creature as the human being, therefore, cells must be able to express these genes and produce these proteins in varying amounts to create unique cell types which function together as a whole organism. This is where epigenetics comes in. If the genome can be considered as the instruction manual for the creation of an organism, the epigenome is how that instruction manual is used, controlling which genes are expressed and how much protein is produced in each individual cell. More specifically, epigenetics is the study of heritable changes to gene activity without an underlying change in DNA sequence4. Epigenetic modifications are added to or removed from the DNA molecule, changing how that part of the molecule is read by the cell machinery, typically by either opening up the DNA structure to make it easier to read or by obstructing the sequence to make it harder. In this sense, most epigenetic modifications, such as DNA methylation and histone modification, can be seen as reversible ‘highlighting’ or ‘blacking-out’ of the DNA instruction manual.
As a consequence of unique epigenetic influence, each cell type will express a unique suite of genes which will produce a unique cohort of proteins, specific to the function of that cell, which will then be inherited by the daughter cells to maintain this specificity throughout the life of the organism. This essay series will discuss the three main types of epigenetic modification present in human cells – DNA methylation, histone modification and chromatin remodelling, and non-coding RNA (ncRNA) – before consolidating all three modifications in a discussion of a major epigenetic event that requires all three, X-chromosome inactivation. This first essay focuses on how DNA methylation is added, maintained, and removed as well as its effects on gene expression.

 

2. DNA Methylation

Possibly the most stable form of epigenetic modification is DNA methylation. This modification involves the addition of a methyl (CH3) group to a component of the DNA sequence called the cytosine base to produce the modified base 5-methylcytosine (5mC). DNA is perhaps best thought of as a sequence of smaller molecules, or bases, called cytosine (C), guanine (G), adenine (A), and thymine (T). In humans, DNA methylation typically only occurs on cytosines immediately followed by guanines, a sequence referred to as CpG5. As the 5mC base stabilises the DNA structure more than the unmethylated C does6, it is unsurprising that most of the CpG sequences throughout the genome are methylated7. What may be surprising, however, is that approximately 15% of CpGs are routinely unmethylated8. These unmethylated sequences are usually concentrated in stretches of around 1000 bases9 that are rich in Cs immediately followed by a G and that lie immediately before or at the start of genes, in regions called CpG islands (CGIs)10. The vast majority of these CGIs are protected from methylation, which is a likely reason why they contain more CpG sequences than other regions of the genome, as will be discussed later10, although between 10 and 15% of CGIs may be methylated when they are associated with transposable elements or genes involved with X-chromosome inactivation11.
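To make the idea of CpG density concrete, the short Python sketch below counts CpG sequences in a DNA string and compares the count with what would be expected if Cs and Gs were scattered independently. The function name `cpg_stats`, the two toy sequences, and the observed/expected ratio are illustrative assumptions rather than anything taken from the cited sources, and no threshold for calling a region a CGI is implied.

```python
def cpg_stats(seq: str) -> dict:
    """Summarise CpG content of a DNA string (hypothetical helper for illustration)."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    # A CpG is a C immediately followed by a G on the same strand.
    # "CG" cannot overlap with itself, so str.count finds every occurrence.
    cpg = seq.count("CG")
    # Expected CpG count if C and G were positioned independently of each other.
    expected = (c * g) / n if n else 0.0
    return {
        "length": n,
        "cpg_count": cpg,
        "gc_fraction": (c + g) / n if n else 0.0,
        "obs_exp_ratio": cpg / expected if expected else 0.0,
    }


# Toy sequences, invented for this example only.
island_like = "CGCGGCGCTACGCGCGTACG"
bulk_like = "ATTTGCATAACTGATTCAAT"

print(cpg_stats(island_like))
print(cpg_stats(bulk_like))
```

The CpG-rich toy sequence contains seven CpGs in 20 bases while the other contains none, which is the kind of contrast that separates island-like stretches from the rest of the genome.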

The transfer of a methyl group from S-adenosyl methionine to unmethylated CpG sequences9, also referred to as de novo methylation, is carried out very early in development by the DNA methyltransferases (DNMTs) DNMT3a and DNMT3b12. DNA methyltransferases are proteins, more specifically enzymes, that are called writers as they add epigenetic modifications to the DNA sequence4. An additional DNA methyltransferase, DNMT1, is also present in cells and can work in combination with DNMT3a/3b for more efficient methylation. Its primary job, however, appears to be the maintenance of the methylation pattern over successive generations, so this protein will be discussed further later13.

As previously discussed, DNA methylation typically only occurs on cytosine bases immediately followed by a guanine base (CpG sequence), at least in humans. This suggests that DNMTs specifically bind to these CpG sequences but as also discussed, approximately 15% of all CpGs within the genome are left unmethylated. There are several theories for how DNMTs are localised to specific areas of the DNA sequence in order to methylate some regions and not others, and most of these theories involve other proteins.

 

2.1. Transcription Factors:

The first theory is that DNMTs methylate all CpGs in the genome, except for those that are shielded by some other mechanism12. This correlates well with three findings: most CpG sequences are methylated7; when transcription factors, proteins that may act as a shield, are downregulated or their binding sites mutated, the previously shielded sequences become methylated12; and a large number of unmethylated regions are associated with gene promoters, the beginning of genes where transcription factors typically bind9. Transcription factors may not simply act as passive shields, however, as DNMTs have been shown to interact directly with a range of transcription factors, potentially pointing to a more active role in either anchoring14 or localisation12 of DNMTs to the promoter region. As will be discussed further in the ‘Effects’ section, methylated and non-methylated gene promoters may have varying effects on the way in which that gene can be expressed.

 

2.2. Epigenetic Crosstalk:

An additional theory regarding the localisation of DNA methylation is in keeping with what will become a major theme throughout this essay series: epigenetic crosstalk. There are many examples of one epigenetic mechanism influencing another, and DNA methylation is no different. Not only have DNMTs been shown to bind to another epigenetic modifier, histone deacetylase, but their localisation has also been shown to be affected by the activity of several other histone modifiers. The actual mechanisms behind histone modifications will be discussed in more detail in the next essay, so there will only be a brief discussion here of how they can affect DNA methylation. In summary, some histone modifications are associated with increased methylation, possibly due to the binding of DNMTs within complexes which also contain histone modifiers11. Other histone modifications are associated with decreased methylation, possibly due to a negative interaction between the writer or the modification itself and the DNMTs, which prevents them existing in the same region of the genome15. Furthermore, non-coding RNAs can also influence DNA methylation patterns16.

 

2.3. DNA Sequence and Structure:

Finally, the larger DNA sequence and structure may influence DNA methylation as it has been shown that DNMTs preferentially bind to highly condensed, methylated13, and repetitive sequences of DNA7. This will therefore concentrate DNA methylation in these areas and reduce the amount of methylation found in more variable and open parts of the DNA13.

 

 

3. DNA Methylation Maintenance:

During cell division, the DNA structure separates into two strands, each of which is copied, so that four strands form two double-stranded DNA molecules, each with one strand from the original DNA molecule and one newly synthesised strand. This is referred to as semi-conservative replication and is an important characteristic that allows for faithful replication of not just the DNA sequence but also any epigenetic marks that exist on top of it. As DNMTs add methylation marks to both strands of DNA during the methylation process, but a newly synthesised strand carries no marks, newly replicated DNA contains one strand that is methylated and one that is not, a state called hemi-methylation. For full DNA methylation to be maintained across cell divisions, therefore, methylation must be reapplied to the unmethylated strand, a process carried out by DNMT1, the maintenance methyltransferase5. DNMT1 preferentially binds to hemi-methylated DNA over unmethylated DNA9 and, as only small regions of the DNA replicate and are therefore hemi-methylated at a time, is localised to this site of replication12 by the protein co-factors PCNA, which binds to the site of replication, and UHRF1/2, which binds to DNA methylation marks13. PCNA and UHRF1/2 also bind to DNMT1, leading to the methyltransferase’s activation and localisation to hemi-methylated sites15. This is an example of a reader-writer complex, as a reader of methylation marks, UHRF1/2, is working with DNMT1, a writer of that mark4, and these complexes are often key to the maintenance of epigenetic marks throughout an organism’s life.
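The replicate-then-restore logic described above can be sketched in a few lines of Python. This is an idealised toy model, assuming perfectly site-specific copying with no de novo activity; the names `replicate` and `maintain` and the four-site example are hypothetical and stand in for replication and DNMT1 activity respectively.

```python
from typing import List, Tuple

# One CpG site is modelled as (old_strand_methylated, new_strand_methylated).
Site = Tuple[bool, bool]


def replicate(parent_marks: List[bool]) -> List[Site]:
    """Semi-conservative replication: the old strand keeps its marks,
    the newly synthesised strand starts with none (hemi-methylation)."""
    return [(mark, False) for mark in parent_marks]


def maintain(sites: List[Site]) -> List[Site]:
    """DNMT1-like step (idealised): methylate the new strand only at
    sites where the old strand is already methylated."""
    return [(old, old or new) for old, new in sites]


# Four CpG sites on the parental strand; the third is unmethylated.
parent = [True, True, False, True]
hemi = replicate(parent)
restored = maintain(hemi)

print(hemi)      # new strand unmethylated everywhere
print(restored)  # new strand now matches the old strand
```

In this simplified picture the unmethylated third site stays unmethylated on both strands, because the maintenance step only copies marks that are already present on the old strand.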

It is tempting to assume that DNA methylation maintenance involves the site-specific replication of methylation marks so that all DNA molecules have methylation on the same CpG sequences. This, however, appears not to be the case. Although DNMT1 does have a much higher affinity for hemi-methylated than unmethylated DNA, some de novo methylation of unmethylated DNA does seem to be possible. This means that instead of copying DNA methylation patterns exactly to the same sequences, DNMT1 maintains methylation densities. In areas of the genome with low CpG density, such as within genes, methylation can typically occur in a site-specific manner due to the low number of CpG sequences available to be methylated and the likelihood that most if not all of these sequences will be methylated. In areas of high CpG density, however, such as gene promoters, the exact position of methylation may change during replication, although the overall amount of methylation in that region will be maintained13.

 

 

4. DNA Demethylation:

Although DNA methylation is understood to be perhaps the strongest epigenetic mark, being restored relatively faithfully following cell division and rarely being actively removed from the DNA, that is not to say that removal of DNA methylation, otherwise known as demethylation, does not occur. In fact, demethylation can be quite extensive during early development in order to remove and replace any aberrant methylation marks that accumulated during the life of the egg and sperm cells that fuse to form the new embryo, a process known as reprogramming17. Demethylation can occur via two pathways, passive and active.

 

4.1. Passive Demethylation:

As discussed in the previous section, the hemi-methylated state created by semi-conservative DNA replication is typically recognised and resolved by DNMT1. However, if there is a lack of this methyltransferase or its cofactors within the cell5, 12, or a large number of transcription factors that may block the action of DNMT1 in particular areas of the genome7, the DNA molecule will remain hemi-methylated at certain positions. Another round of semi-conservative replication will therefore produce a DNA molecule in which both strands are unmethylated in that position, effectively demethylating the molecule. This process is slow14 and requires an imbalance or loss of function in the cell’s suite of proteins, so it may be a reason for the decrease in DNA methylation found in older individuals9.
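The dilution behind passive demethylation can be expressed as simple arithmetic. The sketch below is a toy calculation, not a measured rate: it assumes a single CpG that starts fully methylated, no de novo methylation, and a hypothetical parameter `maintenance_efficiency` giving the chance that the new strand is re-methylated after each replication.

```python
def methylated_fraction(divisions: int, maintenance_efficiency: float = 0.0) -> float:
    """Expected fraction of DNA strands still methylated at one CpG
    after a number of cell divisions (illustrative model only).

    maintenance_efficiency is an assumed parameter: 0.0 means no DNMT1
    activity at all, 1.0 means every hemi-methylated site is restored.
    """
    if divisions < 0:
        raise ValueError("divisions must be non-negative")
    if not 0.0 <= maintenance_efficiency <= 1.0:
        raise ValueError("maintenance_efficiency must be between 0 and 1")
    # Each replication pairs every old strand with a new, unmethylated strand.
    # A methylated old strand passes its mark to the new strand with
    # probability maintenance_efficiency, so the fraction f becomes
    # f * (1 + maintenance_efficiency) / 2 after each division.
    return ((1.0 + maintenance_efficiency) / 2.0) ** divisions


for n in range(4):
    print(n, methylated_fraction(n))
```

With no maintenance at all, the methylated fraction halves with every division, whereas perfect maintenance keeps it at 1.0 indefinitely.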

 

4.2. Active Demethylation:

Active demethylation is a much faster process than passive demethylation14 and can occur in different ways, the two main categories being hydroxylation and oxidation, and deamination.

 

4.2.1. Hydroxylation and Oxidation:

Just as the addition of DNA methylation, like any epigenetic modification, is done by an epigenetic writer, the active removal of these modifications is enacted by an epigenetic eraser4. In the case of demethylation, the TET1 and TET2 eraser enzymes18 hydroxylate the methylated cytosine (5mC) to produce 5hmC, which is in turn oxidised to produce 5fC, which is converted by a final oxidation to 5caC. 5hmC cannot bind effectively to certain proteins that 5mC binds to, the importance of which will become clear in the ‘Effects’ section of this essay, so this initial hydroxylation step may be deemed sufficient for demethylation. However, the further oxidation steps are required to produce a substrate that can be recognised by the base excision repair (BER) pathway. Briefly, the structure of 5fC and 5caC is similar to that of the DNA base thymine. As these augmented cytosine bases are still paired with guanine bases on the opposite strand of DNA, the BER pathway, a natural mechanism within the cell which ensures faithful DNA replication, is alerted to a ‘mismatch’ pairing, so removes the augmented cytosine and replaces it with an unmethylated version12. This completes the demethylation process and the now unmethylated cytosine may go on to be remethylated, as is the case in early development17, or to be protected from remethylation, remaining unmethylated and potentially changing the expression of genes in the cell. Interestingly, TET enzymes have a high affinity for the higher-density CpG promoter regions19, 20, indicating active demethylation may contribute to the lack of methylation found in CGIs13.
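The stepwise conversion described above can be pictured as a small chain of states. The Python sketch below simply encodes the order of intermediates named in the text; the dictionary and function names are hypothetical, and real TET activity does not necessarily run every base through the full chain.

```python
from typing import List

# Order of intermediates as described in the text: 5mC -> 5hmC -> 5fC -> 5caC.
TET_STEPS = {"5mC": "5hmC", "5hmC": "5fC", "5fC": "5caC"}

# Modified bases the text describes as recognisable by base excision repair (BER).
BER_SUBSTRATES = {"5fC", "5caC"}


def base_excision_repair(base: str) -> str:
    """Swap a recognised modified base for unmethylated cytosine;
    any other base is returned unchanged."""
    return "C" if base in BER_SUBSTRATES else base


def demethylate(base: str) -> List[str]:
    """Return the sequence of states from a starting base through the
    TET-mediated steps and, where applicable, BER."""
    path = [base]
    # Apply hydroxylation/oxidation steps until no further step exists.
    while path[-1] in TET_STEPS:
        path.append(TET_STEPS[path[-1]])
    repaired = base_excision_repair(path[-1])
    if repaired != path[-1]:
        path.append(repaired)
    return path


print(demethylate("5mC"))
```

Note that 5hmC is not in the set of BER substrates here, which mirrors the point that the further oxidation steps are needed before the repair pathway can act.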

 

4.2.2. Deamination:

Another active demethylation method is deamination. Deamination can work on either the initial 5mC or the hydroxylated 5hmC and involves AID/APOBEC enzymes removing the amine group and replacing it with an oxygen atom, producing thymine and 5hmU, respectively. Just like the oxidised products described above, these two products can then be recognised by the BER pathway and removed and replaced with an unmethylated cytosine, completing the demethylation process12. The BER pathway is not perfect, however. If the thymine produced from deamination of the methylated cytosine is left within the DNA sequence, it will become a fixed mutation after cell division, leading to the loss of not just the methylation but the cytosine base, and by extension the guanine base paired with it, at that position altogether6. This is a likely reason why CpG sequences are relatively rare throughout the genome and why their density is preserved in CGIs, regions of the genome with little methylation and therefore little deamination10.
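The way an unrepaired deamination erodes CpG sequences can be shown on a toy sequence. In the Python sketch below, the function `deaminate_unrepaired`, the example sequence, and the chosen methylated position are all invented for illustration; the sketch assumes the repair pathway misses the resulting thymine entirely.

```python
from typing import Iterable


def deaminate_unrepaired(seq: str, methylated_positions: Iterable[int]) -> str:
    """Return the sequence after each methylated C at the given 0-based
    positions is deaminated to T and the change is left unrepaired."""
    bases = list(seq.upper())
    for pos in methylated_positions:
        if bases[pos] != "C":
            raise ValueError(f"position {pos} is not a cytosine")
        # Deamination of 5mC yields thymine, an ordinary DNA base.
        bases[pos] = "T"
    return "".join(bases)


# Toy sequence with two CpGs; only the first (position 2) is methylated.
original = "ATCGGACGTT"
mutated = deaminate_unrepaired(original, [2])

print(original, original.count("CG"))
print(mutated, mutated.count("CG"))
```

The methylated CpG becomes TpG and is lost from the sequence, while the unmethylated CpG is untouched, which is consistent with CpG density surviving best where methylation is rare.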

 

5. Effects:

Until this point, I have generally avoided discussing the effects associated with DNA methylation, but this will be remedied in this section. The effects of DNA methylation are typically mediated by a third and final type of protein called an epigenetic reader. Readers can bind to methylated DNA and therefore act as either attractors or simply anchors for other proteins which influence the expression of genes. Important readers include UHRF1/2, which was discussed previously in relation to methylation maintenance, and methyl-binding domain (MBD) proteins, which will be discussed in further detail in this section12.

Whether DNA methylation occurs in a gene promoter or within the gene body can influence its effect. Genes with methylated promoters tend to be less expressed than their unmethylated counterparts while genes with methylated bodies tend to be more greatly expressed, at least in dividing cells12. This shows that the same modification, the simple addition of a methyl group to a cytosine base, can cause different effects depending on where in the genome it occurs. As a great amount of research has been done on promoter DNA methylation, this shall be discussed first.

 

5.1. Gene Promoter Methylation:

When a gene promoter, the region of the genome before the gene which is the binding site for many proteins involved in gene expression, is methylated, this can have two major consequences, both of which tend to reduce gene expression: reduced transcription factor binding and increased MBD protein binding.

Transcription factors are proteins that provide the framework for and work within the transcription complex, a coalition of proteins that allows for the expression of genes. When methylation occurs and disrupts the DNA structure, these transcription factors can no longer bind, not because of a change in the underlying DNA sequence but because they simply cannot fit within the new DNA structure. Therefore, DNA methylation of promoters can obstruct transcription factor binding, which can cause the gene to be less expressed15, 21, 22.

When it comes to the effects of DNA methylation, however, most research appears to have focused on the increased binding of MBD proteins. There are several different MBD proteins, but they all contain a methyl-binding domain, as indicated by their name, which allows them to bind to a methylation mark on DNA. Once they have bound to the DNA they may act as a scaffold for the recruitment of other proteins that can alter the structure of the DNA, making it harder for transcription factors to bind and for expression of the gene to occur. Possibly the most researched MBD protein is MECP2. Once MECP2 binds to methylated DNA, it recruits a complex of proteins, referred to as the Sin3 complex, which contains a type of protein called histone deacetylase14. Separately, MECP2 can also recruit another type of protein called histone methyltransferase22. As will be discussed in greater detail in the next essay in this series, histone deacetylases and methyltransferases remove acetyl groups from and add methyl groups to histones, respectively12. These modifications are associated with greater compaction of the DNA structure, which prevents transcription factor binding and therefore reduces gene expression15. This is an example of the important crosstalk between epigenetic modifications.

In this discussion of gene promoter methylation I have conveniently left out a fact discussed throughout the earlier parts of this essay: many promoters contain CpG islands and are therefore generally unmethylated. Indeed, approximately 70% of gene promoters are considered high CpG density, with many of these being completely unmethylated regardless of the expression level of their genes. This discussion, therefore, may appear superfluous as it does not seem to relate to actual events occurring in the cell, simply hypothetical situations in which promoters are methylated. Even the 30% of gene promoters that are considered to have low CpG density, and are therefore typically methylated, tend to use other mechanisms to change gene expression. There are some high CpG density promoters that are generally methylated and do use this methylation to change gene expression, but this is relatively rare, and methylation appears not to be necessary for this regulation of gene expression. It may seem surprising, therefore, that the phenomenon of promoter methylation leading to decreased gene expression, which is theoretically sound but rare in practice in humans, should dominate the discussion on the effects of DNA methylation when much more frequent effects of DNA methylation have been observed. We must therefore turn to this more neglected yet seemingly more realistic form of DNA methylation, that which occurs within the gene body.

 

5.2. Gene Body Methylation:

When methylation occurs within the gene body of dividing cells it is generally associated with an increase in gene expression12. The precise reason for this is yet to be fully understood; however, it is possible that the methylation readers that bind within genes may attract parts of the expression machinery away from promoters, thereby reducing ‘transcriptional noise’ and allowing greater expression of the methylated gene. This hypothesis, and even the association between methylation and gene expression, is far from conclusive, however, so more research is needed to gain a more comprehensive understanding23.

Finally, an important consequence of gene body methylation is the silencing of intragenic promoters15 which may belong to retrotransposons. Retrotransposons make up approximately 45% of the genome and, when expressed, can move around and insert into other parts of the genome, thereby disrupting gene expression12. Retrotransposons contain large amounts of repetitive DNA, possibly due to their replicate-insert nature, which is a principal target for DNA methylation16. Like other promoters, intragenic retrotransposon promoters are silenced by DNA methylation, leading to lower expression of the retrotransposon24. We will return to the silencing of retrotransposons when we discuss non-coding RNAs in a future essay. This role of gene body methylation could shed some light on the contentious association described above as, when transposons are expressed, the expression of nearby genes is affected. Although the example given in Carey (2012) suggests that increased transposon expression can increase gene expression5, it certainly seems plausible that this disruption of gene expression can also work in the other direction, so that decreased transposon expression may increase nearby gene expression. This is of course simply theoretical and requires more research to be better understood.

 

 

6. Conclusion:

In conclusion, methylation of the cytosine base in a CpG sequence is mediated by writers, DNMT3a/3b, and erasers, TET1/2, which tend to add a methyl group to most CpG sequences and remove it from areas of high CpG density, CpG islands, respectively. DNA methylation is a relatively stable epigenetic mark and is typically maintained through DNA replication by DNMT1. When methylation occurs in promoters it tends to reduce the expression of the associated gene through the recruitment of readers and other effector proteins which prevent the expression machinery from binding. As most promoters contain CpG islands, however, expression-affecting DNA methylation within promoters tends to be relatively rare. The comparatively more abundant gene body methylation, which may increase expression, should therefore be of greater focus for further research.

In the next essay in this series, histone modifications and chromatin remodelling will be discussed and the theme of epigenetic crosstalk will be built on.

 

    References

    1 MedlinePlus (updated 21 May 2024). What is a gene? [online] Available at: https://medlineplus.gov/genetics/understanding/basics/gene [Accessed 11 August 2025]

    2 Hatton, I. A., Galbraith, E. D., Merleau, N. S. C. and Shander, J. A. (2023) ‘The human cell count and size distribution’, PNAS, 120 (39), e2303077120. Available at: https://www.pnas.org/doi/10.1073/pnas.2303077120

    3 Crowley, R. (2025). Proteins by the Numbers. [Blog]. Biomedical Beat Blog – National Institute of General Medical Sciences. Available at: https://biobeat.nigms.nih.gov/2025/01/proteins-by-the-numbers/ [Accessed 11 August 2025]

    4 Cavalli, G. and Heard, E. (2019) ‘Advances in epigenetics link genetics to the environment and disease’, Nature, 571(7766), pp. 489–499. Available at: https://doi.org/10.1038/s41586-019-1411-0.

    5 Carey, N. (2012) The Epigenetics Revolution. UK: Icon Books Ltd.

    6 Fletcher, H. and Hickey, I. (2013) Genetics. 4th edn. New York: Garland Science, Taylor & Francis Group (Bios Instant Notes).

    7 Edwards, J.R. et al. (2017) ‘DNA methylation and DNA methyltransferases’, Epigenetics & Chromatin, 10(1), p. 23. Available at: https://doi.org/10.1186/s13072-017-0130-8.

    8 Cotton, A.M. et al. (2015) ‘Landscape of DNA methylation on the X chromosome reflects CpG density, functional chromatin state and X-chromosome inactivation’, Human Molecular Genetics, 24(6), pp. 1528–1539. Available at: https://doi.org/10.1093/hmg/ddu564.

    9 Taby, R. and Issa, J.-P.J. (2010) ‘Cancer Epigenetics’, CA: A Cancer Journal for Clinicians, 60(6), pp. 376–392. Available at: https://doi.org/10.3322/caac.20085.

    10 Lao, V.V. and Grady, W.M. (2011) ‘Epigenetics and colorectal cancer’, Nature Reviews Gastroenterology & Hepatology, 8(12), pp. 686–700. Available at: https://doi.org/10.1038/nrgastro.2011.173.

    11 Tsai, H.-C. and Baylin, S.B. (2011) ‘Cancer epigenetics: linking basic biology to clinical medicine’, Cell Research, 21(3), pp. 502–517. Available at: https://doi.org/10.1038/cr.2011.24.

    12 Moore, L.D., Le, T. and Fan, G. (2013) ‘DNA Methylation and Its Basic Function’, Neuropsychopharmacology, 38(1), pp. 23–38. Available at: https://doi.org/10.1038/npp.2012.112.

    13 Jeltsch, A. and Jurkowska, R.Z. (2014) ‘New concepts in DNA methylation’, Trends in Biochemical Sciences, 39(7), pp. 310–318. Available at: https://doi.org/10.1016/j.tibs.2014.05.002.

    14 Gibney, E.R. and Nolan, C.M. (2010) ‘Epigenetics and gene expression’, Heredity, 105(1), pp. 4–13. Available at: https://doi.org/10.1038/hdy.2010.54.

    15 Mattei, A.L., Bailly, N. and Meissner, A. (2022) ‘DNA methylation: a historical perspective’, Trends in Genetics, 38(7), pp. 676–707. Available at: https://doi.org/10.1016/j.tig.2022.03.010.

    16 Hou, L. et al. (2012) ‘Environmental chemical exposures and human epigenetics’, International Journal of Epidemiology, 41(1), pp. 79–105. Available at: https://doi.org/10.1093/ije/dyr154.

    17 Boskovic, A. and Rando, O.J. (2018) ‘Transgenerational Epigenetic Inheritance’, Annual Review of Genetics, 52, pp. 21–41. Available at: https://doi.org/10.1146/annurev-genet-120417-031404.

    18 Inbar-Feigenberg, M. et al. (2013) ‘Basic concepts of epigenetics’, Fertility and Sterility, 99(3), pp. 607–615. Available at: https://doi.org/10.1016/j.fertnstert.2013.01.117.

    19 Tammen, S.A., Friso, S. and Choi, S.-W. (2013) ‘Epigenetics: The link between nature and nurture’, Molecular Aspects of Medicine, 34(4), pp. 753–764. Available at: https://doi.org/10.1016/j.mam.2012.07.018.

    20 Dawson, M.A. and Kouzarides, T. (2012) ‘Cancer Epigenetics: From Mechanism to Therapy’, Cell, 150(1), pp. 12–27. Available at: https://doi.org/10.1016/j.cell.2012.06.013.

    21 Chang, S.C. (2006) ‘Mechanisms of X-chromosome inactivation’, Frontiers in Bioscience, 11(1), p. 852. Available at: https://doi.org/10.2741/1842.

    22 Sharma, S., Kelly, T.K. and Jones, P.A. (2010) ‘Epigenetics in cancer’, Carcinogenesis, 31(1), pp. 27–36. Available at: https://doi.org/10.1093/carcin/bgp220.

    23 Cotton, A.M. et al. (2015) ‘Landscape of DNA methylation on the X chromosome reflects CpG density, functional chromatin state and X-chromosome inactivation’, Human Molecular Genetics, 24(6), pp. 1528–1539. Available at: https://doi.org/10.1093/hmg/ddu564.

    24 Lesk, A.M. (2012) Introduction to Genomics. 2nd edn. United States, New York: Oxford University Press.
