ZEN-3694

Epigenetic Tools (The Writers, The Readers and The Erasers) and Their Implications in Cancer Therapy

Abstract

The addition of chemical tags to DNA and the modification of histone proteins impart distinct features to chromatin architecture. With advances in scientific research, the key players underlying these changes have been identified as epigenetic modifiers of chromatin. Indeed, the plethora of enzymes catalyzing these modifications portrays the diversity of the epigenetic space and the intricacy of gene regulation. These epigenetic players are categorized as writers, which introduce various chemical modifications on DNA and histones; readers, the specialized domain-containing proteins that identify and interpret those modifications; and erasers, the dedicated group of enzymes proficient in removing these chemical tags. Research over the past few decades has established that these epigenetic tools are associated with numerous disease conditions, especially cancer. Given the involvement of epigenetics in cancer, these enzymes and protein domains also provide new targets for cancer drug development. This is evident from the volume of epigenetic research conducted in universities and the R&D sector of the pharmaceutical industry. Here, we highlight the different types of epigenetic enzymes and protein domains, with an emphasis on methylation and acetylation. This review also covers recent developments in small molecule inhibitors as potential anti-cancer drugs targeting the epigenetic space.

Abbreviations

ADA2, Adenosine deaminase 2; ASH1L, Absent, Small, or Homeotic 1-like; BAF, BRG1 associated factor; BAH, Bromoadjacent homology; BAZ2A, Bromodomain adjacent to zinc finger domain protein 2A; BET, Bromodomain and Extra-Terminal domain; BRPF1, Bromodomain and PHD finger containing protein 1; CARM1, Coactivator associated arginine methyltransferase-1; ChIP, Chromatin immunoprecipitation; CoREST, Corepressor of RE1 silencing transcription factor; CREBBP, CREB binding protein; DCD, Double chromodomain; DDT, DNA binding homeobox and Different Transcription factors; DOT1, Disruptor of telomeric silencing 1; DPF, Double PHD finger domain; EHMT1, Euchromatic histone lysine N-methyltransferase 1; EZH1, Enhancer of Zeste Homolog 1; FAD, Flavin adenine dinucleotide; GCN5, General control non-repressible 5; GLP, G9a like protein; HAT, Histone acetyltransferase; HDAC, Histone deacetylase; HKMT, Histone lysine methyltransferase; hMSL3, human male specific lethal 3; HP1, Heterochromatin protein 1; IGF, Insulin like growth factor; ING5, Inhibitor of growth protein 5; JmjC, Jumonji C; KDM, Histone lysine demethylase; KMT, Histone lysine methyltransferase; LSD, Lysine specific demethylase; MBD, Methyl CpG binding domain; MeCP2, Methyl CpG binding protein 2; MEF2, Myocyte enhancer factor 2; MSL3, Male specific lethal 3; MYND, myeloid, Nervy and DEAF1; NCoA, Nuclear receptor coactivator; NCoR, Nuclear receptor corepressor; NuRD, Nucleosome remodeling and deacetylase; NSD, Nuclear receptor SET-domain containing; ORC1, Origin recognition complex 1; PCAF, p300/CBP associated factor; PCNA, Proliferating cell nuclear antigen; PHD, Plant homeodomain; PHF1, PHD finger protein 1; PRC2, Polycomb repressive complex 2; PRMT, Protein Arginine Methyltransferase; SAM, S-adenosyl-L-methionine; SET, Su(var)3-9, Enhancer of zeste and Trithorax; SETDB1, SET domain bifurcated 1; SIRT, Sirtuins; SMRT, Silencing mediator of retinoid and thyroid receptor; SMYD, SET and MYND domain; SRA, SET and RING finger associated; SRC, Steroid receptor coactivator; STAT3, Signal transducer and activator of transcription 3; SUV39H1, Suppressor of variegation 3-9 homolog 1; TET, Ten-eleven translocation; TIF1α, Transcriptional intermediary factor 1α; TRD, Transcription repression domain; TTD, Tandem tudor domain; UHRF1, Ubiquitin like with PHD and RING finger domain 1; WHSC1, Wolf Hirschhorn Syndrome candidate 1; ZFP, Zinc finger protein; ZnF, Zinc finger.

Keywords

Epigenetics, Writers, Readers, Erasers, Cancer, Epigenetic drugs

Introduction

Cell-type-specific gene expression is a key feature of multicellular organisms and is achieved through distinct mechanisms. DNA in eukaryotic cells is wrapped around proteins known as histones, which protect and regulate the packed DNA. This structure of DNA wrapped around histones is known as chromatin, and it plays a vital role in regulating gene expression. The ability of eukaryotic cells to maintain distinct phenotypes despite containing identical genetic material is ensured by chromatin-associated proteins and the various chemical modifications occurring on DNA and histones. The way this genetic blueprint is accessed during development and differentiation in distinct cells is regulated by epigenetics. A term coined by C.H. Waddington in 1942 and refined over the years by various researchers, epigenetics refers to heritable changes in the genome that alter gene expression patterns without affecting the underlying DNA sequence. Although they leave the DNA sequence intact, these changes play a crucial role in cellular identity and regulate vital cellular processes. Chromatin in the nucleus of a cell exists in two different states: closed chromatin (heterochromatin), which is associated with transcriptional repression, and open chromatin (euchromatin), which is favorable to transcription. This level of chromatin regulation is largely governed by post-translational modifications of core histone proteins and covalent modifications of DNA.

The nucleosome, the basic building block of chromatin, is composed of an octamer of histone proteins (two copies each of H2A, H2B, H3, and H4) that forms a compact central structure around which 147 base pairs of DNA are wrapped. Histone proteins have N-terminal tails that protrude from the core and are subject to various post-translational modifications. These modifications, along with the direct modification of DNA, constitute the epigenetic code. A distinct feature of this code is its ability to adapt and respond to environmental changes occurring throughout the life cycle of an organism as it develops from a zygote to an adult. So far, researchers have identified four types of cytosine modification in DNA: methylation, hydroxymethylation, formylation, and carboxylation. In addition, more than ten different types of histone modifications have been identified. The key players associated with these modifications are being deciphered progressively as our knowledge of epigenetics grows. These are: writers, a plethora of enzymes capable of modifying nucleotide bases and specific amino acid residues on histones; erasers, a group of enzymes proficient in removing these marks; and readers, a diverse range of proteins that possess specialized domains capable of recognizing specific epigenetic marks at a locus. These enzymes and protein domains together constitute the epigenetic tools. Recent research has suggested that altered regulation of these epigenetic tools plays a key role in tumorigenesis. Since the modifications laid down by these enzymes are reversible, therapeutic approaches targeting these alterations are among the thriving areas of cancer drug discovery research in academia and industry. In this review, we focus on the concepts associated with epigenetic writers, readers, and erasers. Recent developments in targeting these tools for cancer drug discovery are also discussed.

Epigenetic Tools

Writers

Modifications of DNA and histone proteins occur through the addition of various chemical groups by numerous enzymes. Although a plethora of modifications are possible, we focus on the two most widely studied epigenetic alterations, namely methylation and acetylation. Both DNA and histone proteins are subject to methylation, whereas, of the two, only histones are acetylated. These two modifications frequently govern the gene expression pattern in a cell by switching between transcriptional activation and repression. The epigenetic writers described here include DNA methyltransferases, histone lysine methyltransferases, protein arginine methyltransferases, and histone acetyltransferases.

DNA Methyltransferases

DNA methyltransferases (DNMTs) catalyze the transfer of a methyl group from the methyl donor, S-adenosyl-L-methionine (SAM), to the 5-position of cytosine residues in DNA. Among the nucleotides that constitute the building blocks of DNA, cytosine is highly prone to methylation. Cytosine methylation predominantly occurs in the context of the cytosine-guanine (CpG) dinucleotide, resulting in the formation of 5-methylcytosine. DNMTs are a family of highly conserved proteins which in mammals consists of five members: DNMT1, DNMT2, DNMT3A, DNMT3B, and DNMT3L. DNMT1 is the most abundant DNMT in adult cells and functions as a maintenance methyltransferase. After replication, the hemi-methylated DNA (a methylated parental strand paired with an unmethylated, newly synthesized daughter strand) is recognized by DNMT1, which faithfully copies the methylation marks from the parental strand onto the daughter strand, maintaining DNA methylation patterns through mitosis. DNMT1 localizes near the replication fork, where it performs its catalytic role. DNMTs consist of two functional regions: a C-terminal catalytic domain and an N-terminal region of variable size. The maintenance methyltransferase property of DNMT1 is attributed to three elements in the N-terminal region: the heterochromatin targeting sequence (TS domain), the bromo-adjacent homology domains (BAH1/2), and the proliferating cell nuclear antigen (PCNA) binding domain. Notably, the catalytic C-terminal domain of DNMT1 is the only DNA methyltransferase domain that lacks catalytic activity when expressed separately. UHRF1 (ubiquitin-like with PHD and RING finger domain 1), also known as ICBP90 (human) or Np95 (mouse), plays a crucial role as a co-factor in recruiting DNMT1 to hemi-methylated CpG sites. UHRF1 cooperates with DNMT1 throughout the S-phase of the cell cycle and interacts with hemi-methylated DNA through its SET and RING associated (SRA) domain.
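The maintenance step described above can be illustrated with a toy model. The sketch below is only an illustration under simplifying assumptions: the function names `cpg_sites` and `maintenance_methylation` are hypothetical, methylation is tracked as a set of sequence indices, and recruitment factors such as UHRF1 and PCNA are not modeled.

```python
def cpg_sites(seq):
    """Return 0-based indices of the C in each CpG dinucleotide (top strand, 5'->3')."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def maintenance_methylation(seq, parental_methylated):
    """Toy DNMT1 step: copy parental methylation onto the daughter strand.

    parental_methylated holds top-strand indices of methylated cytosines.
    Because CpG is palindromic, the daughter-strand cytosine of the same
    site pairs with the G at index i + 1, so that index is returned.
    Unmethylated CpG sites are left untouched (no de novo activity).
    """
    return {i + 1 for i in cpg_sites(seq) if i in parental_methylated}


if __name__ == "__main__":
    seq = "ACGTTCGACG"  # CpG sites start at indices 1, 5 and 8
    # The parental strand is methylated at sites 1 and 8 only.
    print(sorted(maintenance_methylation(seq, {1, 8})))
```

In this simplified picture, only CpG sites that are already methylated on the parental strand acquire a mark on the daughter strand, which is what distinguishes maintenance from de novo methylation.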

DNMT2 is the enigmatic member of the DNMT family and is a highly conserved cytosine methyltransferase in eukaryotes. Although DNMT2 shares strong sequence homology with other DNMTs, it has hardly any detectable DNA cytosine methylation activity. In mammals, DNMT2 does not methylate DNA; instead, it methylates cytosine 38 in the anticodon loop of tRNAs, using a DNMT-like catalytic mechanism for RNA methylation. DNMT2 is the only member of the DNMT family known to catalyze RNA methylation, and this modification stabilizes the tRNAs. DNMT3A and DNMT3B are known as de novo methyltransferases since they are crucial for genome-wide methylation of DNA and establish methylation patterns during gametogenesis and early embryogenesis. Unlike DNMT1, the DNMT3 enzymes methylate both hemi-methylated and unmethylated CpG sites on the DNA and are distributed throughout the nucleus. In early embryonic cells, where the rate of de novo methylation is higher, the expression of DNMT3A and DNMT3B is high, whereas in adult somatic cells their expression is downregulated. The DNMT3 family also includes a catalytically inactive member, DNMT3L, which is a regulatory factor for de novo methylation. Although the amino acid sequence of DNMT3L is similar to that of other members of this family, it lacks the residues in the C-terminal domain required for methyltransferase activity. Surprisingly, when DNMT3L associates with DNMT3A and DNMT3B, it stimulates their catalytic activity fifteen-fold. This widely cited model of maintenance and de novo methylation is being revised in light of compelling experimental evidence that the roles of DNMT1 overlap with those of DNMT3A and DNMT3B. For further information on the DNMT family, readers are advised to consult comprehensive reviews in the field.

Histone Lysine Methyltransferases

Methylation of histones is a unique post-translational modification since up to three methyl groups can be added to a single lysine (K) residue, resulting in mono-, di-, and tri-methylated states. These modifications are associated with transcriptional activation or repression depending on the location of the lysine residue. Histone lysine methyltransferases (KMTs) catalyze the transfer of a methyl group from S-adenosyl-L-methionine (SAM), yielding mono-, di-, or tri-methylated lysine and S-adenosyl-L-homocysteine. Due to the extremely intricate nature of this system, only some well-characterized modifications are highlighted here. Lysine residues 4, 9, 27, 36, and 79 of histone H3 and residue 20 of histone H4 are highly prone to methylation by KMTs. Actively transcribed regions of DNA are associated with the H3K4me2,3, H3K9me1, H3K27me1, H3K36me3, and H4K20me1 methylation marks, whereas H3K9me2,3, H3K27me2,3, and H4K20me3 are the marks of heterochromatin. The KMTs that catalyze these modifications are divided into two broad groups based on the presence or absence of the SET (Su(var)3-9, Enhancer-of-zeste, and Trithorax) domain. H3K4 trimethylation is an epigenetic signature of actively transcribed regions and is extremely important for transcriptional initiation. The H3K4 methyltransferase family includes SET and MYND domain containing protein 3 (SMYD3), SMYD1, the mixed lineage leukemia (MLL1-5) family, SET1A/B, Absent, Small, or Homeotic 1-like (ASH1L), ASH2L, PR domain zinc finger protein 9 (PRDM9), and SET7. Promoters of transcriptionally active genes bear the trimethyl mark of H3K4, whereas mono-methyl marks are associated with enhancers and other regulatory elements.

H3K9 methylation is another important post-translational modification that is associated with gene silencing and heterochromatin formation. The major H3K9 methyltransferases include SUV39H1, G9a, GLP, and SETDB1 (also known as ESET). H3K27 methylation is catalyzed by the polycomb repressive complex 2 (PRC2), which contains the catalytic subunit EZH2. H3K27me3 is a hallmark of gene repression and is crucial for maintaining the silenced state of developmental genes. H3K36 methylation is associated with transcriptional elongation and is catalyzed by SETD2, NSD1, and ASH1L. H3K79 methylation is catalyzed by DOT1L and is linked to active transcription. H4K20 methylation is catalyzed by SUV420H1 and SUV420H2 and is associated with chromatin compaction and gene repression.
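The marks and writers named in this section can be collected into simple lookup tables. The following sketch only encodes what the text lists; the names `ACTIVE_MARKS`, `HETEROCHROMATIN_MARKS`, `WRITERS`, and `describe_mark` are hypothetical, and marks the text does not classify are reported as such rather than guessed.

```python
import re

# Marks associated with actively transcribed regions, as listed in the text.
ACTIVE_MARKS = {"H3K4me2", "H3K4me3", "H3K9me1", "H3K27me1", "H3K36me3", "H4K20me1"}
# Marks of heterochromatin, as listed in the text.
HETEROCHROMATIN_MARKS = {"H3K9me2", "H3K9me3", "H3K27me2", "H3K27me3", "H4K20me3"}

# Writers per lysine site, as listed in the text (SETDB1 is also called ESET).
WRITERS = {
    "H3K4": ["SMYD3", "SMYD1", "MLL1-5", "SET1A/B", "ASH1L", "ASH2L", "PRDM9", "SET7"],
    "H3K9": ["SUV39H1", "G9a", "GLP", "SETDB1"],
    "H3K27": ["EZH2"],
    "H3K36": ["SETD2", "NSD1", "ASH1L"],
    "H3K79": ["DOT1L"],
    "H4K20": ["SUV420H1", "SUV420H2"],
}

# Parses names such as "H3K27me3" into histone, lysine position, methyl count.
MARK_PATTERN = re.compile(r"^(H\d)K(\d+)me([123])$")


def describe_mark(mark):
    """Return the site, methylation degree, chromatin state and writers of a mark."""
    match = MARK_PATTERN.match(mark)
    if match is None:
        raise ValueError(f"not a lysine methylation mark: {mark!r}")
    histone, position, degree = match.groups()
    site = f"{histone}K{position}"
    if mark in ACTIVE_MARKS:
        state = "active"
    elif mark in HETEROCHROMATIN_MARKS:
        state = "heterochromatin"
    else:
        state = "not classified in the text"
    return {
        "site": site,
        "methyl_groups": int(degree),
        "state": state,
        "writers": WRITERS.get(site, []),
    }


if __name__ == "__main__":
    for name in ("H3K4me3", "H3K27me3", "H3K79me2"):
        print(name, describe_mark(name))
```

Keeping the site (for example H3K9) separate from the methylation degree reflects the point made above: the same lysine can carry an activating or a repressive mark depending on how many methyl groups it bears.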

Protein Arginine Methyltransferases

In addition to lysine methylation, arginine residues on histones and non-histone proteins are also subject to methylation. Protein arginine methyltransferases (PRMTs) catalyze the transfer of methyl groups from SAM to the guanidino nitrogen atoms of arginine residues. PRMTs are classified into three types based on the type of methylation they catalyze: type I PRMTs generate asymmetric dimethylarginine, type II PRMTs generate symmetric dimethylarginine, and type III PRMTs generate monomethylarginine. PRMT1, PRMT3, PRMT4 (CARM1), PRMT6, and PRMT8 are type I enzymes, while PRMT5 and PRMT9 are type II, and PRMT7 is a type III enzyme. Arginine methylation plays a role in regulating transcription, RNA processing, DNA repair, and signal transduction.

Histone Acetyltransferases

Acetylation of lysine residues in histone tails is catalyzed by histone acetyltransferases (HATs). The addition of an acetyl group neutralizes the positive charge of lysine, leading to a more relaxed chromatin structure and facilitating transcriptional activation. HATs are classified into three main families: the GNAT (GCN5-related N-acetyltransferase) family, the MYST (Moz, Ybf2, Sas2, Tip60) family, and the p300/CBP family. GCN5 and PCAF are members of the GNAT family, while Tip60, MOZ, and HBO1 belong to the MYST family. The p300 and CBP proteins are large, multi-domain coactivators with intrinsic HAT activity. HATs not only acetylate histones but also modify non-histone proteins, thereby regulating various cellular processes.

Readers

Epigenetic readers are specialized proteins that recognize and bind to specific epigenetic marks on DNA or histone proteins. These proteins contain conserved domains that mediate the recognition of methylated, acetylated, or phosphorylated residues. The main types of reader domains include bromodomains, chromodomains, plant homeodomain (PHD) fingers, Tudor domains, and methyl-CpG binding domains (MBDs).

Bromodomains are protein modules that specifically recognize acetylated lysine residues, particularly on histone tails. Proteins containing bromodomains include members of the BET (bromodomain and extra-terminal) family such as BRD2, BRD3, BRD4, and BRDT. These proteins play critical roles in transcriptional regulation, cell cycle progression, and chromatin remodeling.

Chromodomains bind methylated lysine residues, such as H3K9me3 and H3K27me3. HP1 (heterochromatin protein 1) family members contain chromodomains and are involved in the formation and maintenance of heterochromatin. The Polycomb group proteins, which are key regulators of gene silencing, also include chromodomain-containing members.

PHD fingers are zinc finger-like domains that recognize methylated or unmethylated lysine residues on histone tails. For example, the PHD finger of ING2 binds to H3K4me3, a mark of active transcription. Tudor domains are also methyl-lysine binding modules and are found in proteins involved in DNA damage response and RNA metabolism.

Methyl-CpG binding domains (MBDs) are present in proteins such as MeCP2, MBD1, and MBD2, which recognize methylated CpG dinucleotides in DNA. These proteins play a role in transcriptional repression and chromatin compaction.

Erasers

Epigenetic erasers are enzymes that remove chemical modifications from DNA or histone proteins, thereby reversing the epigenetic marks set by writers. The main classes of erasers include histone deacetylases (HDACs), histone demethylases, and DNA demethylases.

Histone Deacetylases

HDACs catalyze the removal of acetyl groups from lysine residues on histone tails, leading to chromatin condensation and transcriptional repression. HDACs are classified into four classes based on their sequence homology and domain organization. Class I HDACs (HDAC1, 2, 3, and 8) are primarily nuclear and are involved in transcriptional repression. Class II HDACs (HDAC4, 5, 6, 7, 9, and 10) shuttle between the nucleus and cytoplasm and regulate differentiation, development, and signal transduction. Class III HDACs, also known as sirtuins (SIRT1-7), are NAD+-dependent deacetylases involved in metabolism, aging, and stress response. Class IV includes only HDAC11, which shares features of both class I and II.
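The classification above maps directly onto a small lookup. This is a minimal sketch using only the class memberships stated in the text; the helper names `hdac_class` and `is_nad_dependent` are hypothetical.

```python
# Zinc-dependent HDACs grouped by class, numbered as in the text.
HDAC_CLASSES = {
    "I": {1, 2, 3, 8},
    "II": {4, 5, 6, 7, 9, 10},
    "IV": {11},
}


def hdac_class(name):
    """Return the class ("I", "II", "III" or "IV") of an HDAC or sirtuin name."""
    upper = name.upper()
    prefix, number = upper[:4], upper[4:]
    if number.isdigit():
        n = int(number)
        if prefix == "SIRT" and 1 <= n <= 7:
            return "III"  # sirtuins SIRT1-7 form class III
        if prefix == "HDAC":
            for cls, members in HDAC_CLASSES.items():
                if n in members:
                    return cls
    raise ValueError(f"unknown deacetylase: {name!r}")


def is_nad_dependent(name):
    """Per the text, only class III (sirtuins) is NAD+-dependent."""
    return hdac_class(name) == "III"


if __name__ == "__main__":
    for enzyme in ("HDAC1", "HDAC6", "SIRT3", "HDAC11"):
        print(enzyme, hdac_class(enzyme), is_nad_dependent(enzyme))
```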

Histone Demethylases

Histone demethylases remove methyl groups from lysine or arginine residues on histones. The two main families of histone lysine demethylases are the amine oxidase family (including LSD1 and LSD2) and the Jumonji C (JmjC) domain-containing family. LSD1 is a flavin adenine dinucleotide (FAD)-dependent enzyme that demethylates mono- and di-methylated H3K4 and H3K9. The JmjC family demethylases use Fe(II) and α-ketoglutarate as cofactors and can demethylate tri-methylated lysine residues. These enzymes play important roles in gene regulation, development, and disease.

DNA Demethylases

DNA demethylation occurs through both passive and active mechanisms. Active DNA demethylation is mediated by the ten-eleven translocation (TET) family of dioxygenases, which oxidize 5-methylcytosine to 5-hydroxymethylcytosine, 5-formylcytosine, and 5-carboxylcytosine. These oxidized forms are then removed by base excision repair pathways, resulting in unmethylated cytosine. TET proteins are essential for embryonic development, stem cell pluripotency, and lineage specification.
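The stepwise oxidation route described above can be written as a small state machine. The sketch below is illustrative only: the function names are hypothetical, and restricting base excision repair to 5fC and 5caC is an assumption that goes beyond the text (it reflects the known substrate preference of thymine DNA glycosylase).

```python
# Successive TET oxidation products of 5-methylcytosine, in order.
OXIDATION_CHAIN = ["5mC", "5hmC", "5fC", "5caC"]

# Assumption beyond the text: repair acts on 5fC and 5caC, not on 5hmC.
REPAIR_SUBSTRATES = {"5fC", "5caC"}


def tet_oxidize(base):
    """Apply one TET oxidation step; bases that cannot be oxidized are returned unchanged."""
    if base in OXIDATION_CHAIN[:-1]:
        return OXIDATION_CHAIN[OXIDATION_CHAIN.index(base) + 1]
    return base


def base_excision_repair(base):
    """Replace an excisable oxidized base with unmodified cytosine."""
    return "C" if base in REPAIR_SUBSTRATES else base


def active_demethylation(base):
    """Return the sequence of states from `base` to its repaired end point."""
    path = [base]
    # Oxidize until the base becomes a repair substrate (or cannot be oxidized).
    while path[-1] in ("5mC", "5hmC"):
        path.append(tet_oxidize(path[-1]))
    repaired = base_excision_repair(path[-1])
    if repaired != path[-1]:
        path.append(repaired)
    return path


if __name__ == "__main__":
    print(" -> ".join(active_demethylation("5mC")))
```

In this model repair is triggered as soon as 5fC is reached; a cytosine that has already been oxidized to 5caC is repaired in the same way.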

Epigenetic Tools and Cancer

Aberrant regulation of epigenetic modifications is a hallmark of cancer. Mutations, deletions, or overexpression of epigenetic writers, readers, or erasers can lead to abnormal gene expression patterns, contributing to tumorigenesis, metastasis, and drug resistance. For instance, mutations in DNMT3A and TET2 are commonly found in hematological malignancies. Overexpression of EZH2, the catalytic subunit of PRC2, is associated with aggressive forms of prostate and breast cancers. Mutations in the histone demethylase UTX and the acetyltransferase CREBBP have also been linked to various cancers.

Targeting Epigenetic Modifiers in Cancer Therapy

Given the reversible nature of epigenetic modifications, targeting epigenetic enzymes and reader domains has emerged as a promising strategy for cancer therapy. Several small molecule inhibitors targeting DNMTs, HDACs, HATs, KMTs, and bromodomains have been developed and are in various stages of clinical development. DNMT inhibitors such as azacitidine and decitabine are approved for the treatment of myelodysplastic syndromes and acute myeloid leukemia. HDAC inhibitors such as vorinostat, romidepsin, and panobinostat are approved for the treatment of cutaneous and peripheral T-cell lymphomas and multiple myeloma.

Inhibitors targeting the bromodomains of BET proteins, such as JQ1 and I-BET762, have shown efficacy in preclinical models of leukemia, lymphoma, and solid tumors. EZH2 inhibitors such as tazemetostat have demonstrated clinical activity in patients with relapsed or refractory follicular lymphoma and epithelioid sarcoma. Inhibitors of LSD1 and PRMT5 are also being evaluated in clinical trials for various cancers.

Conclusion

Epigenetic modifications play a crucial role in regulating gene expression and maintaining cellular identity. The dynamic interplay between writers, readers, and erasers governs the epigenetic landscape of the genome. Dysregulation of these processes contributes to the development and progression of cancer. Understanding the molecular mechanisms underlying epigenetic regulation has led to the identification of novel therapeutic targets and the development of epigenetic drugs. Continued research in this field holds promise for improving the diagnosis, prognosis, and treatment of cancer and other diseases associated with epigenetic dysregulation.