4.3 Regulation of Gene Expression in Eukaryotic Cells
In prokaryotes, a ribosome binds to mRNA that is still being synthesized and initiates protein synthesis, and the mRNA is degraded with a half-life of a few minutes (see Fig. 3-14 in Chapter 3). Whether a gene is expressed as a protein is therefore determined mainly by whether mRNA is transcribed from it, or in other words, by transcriptional regulation.
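The consequence of such a short half-life can be illustrated with a small calculation. The sketch below assumes simple first-order (exponential) decay and an illustrative half-life of 3 minutes; both the decay model and the exact value are assumptions made for the example, since the text only states "a few minutes".

```python
def fraction_remaining(minutes: float, half_life_min: float) -> float:
    """Fraction of an mRNA pool left after `minutes`, assuming first-order decay."""
    if half_life_min <= 0:
        raise ValueError("half_life_min must be positive")
    return 0.5 ** (minutes / half_life_min)


HALF_LIFE_MIN = 3.0  # assumed illustrative value, not taken from the text

for t in (0, 3, 6, 15, 30):
    remaining = fraction_remaining(t, HALF_LIFE_MIN)
    print(f"{t:>2} min after transcription stops: {remaining:.4f} of the mRNA remains")
```

Under these assumptions, almost no mRNA is left half an hour after transcription stops, which is why switching transcription on or off is enough to control protein synthesis in prokaryotes.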
In eukaryotes, on the other hand, mRNA is first transcribed as pre-mRNA (i.e., precursor mRNA), which, after going through various processes in the nucleus, is transported to the cytoplasm through the nuclear pores, where it is used for protein synthesis (Fig. 4-3). The transcriptional regulation of mRNA is also essential in eukaryotes (as discussed later in more detail), and the process between transcription and protein synthesis is also regulated. Regulation after transcription is called posttranscriptional regulation, and one of the characteristics of eukaryotic cells is that they are subject to posttranscriptional regulation in addition to transcriptional regulation.
With regard to posttranscriptional regulation, it is known that splicing, transport from the nucleus to the cytoplasm, translation in the cytoplasm and mRNA degradation all vary with the developmental stage of a given cell, show tissue and cell specificity, and change in line with in vivo physiological conditions. It is also known that pre-mRNA molecules made from the same gene can be processed differently in different cells, which leads to the production of proteins with different amino acid sequences.
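How one pre-mRNA can yield proteins with different amino acid sequences can be sketched as follows. The exon sequences, the exon choices and the cell-type labels are all invented for illustration; only the standard genetic code is real. Sequences are written in DNA letters (T instead of U) for simplicity.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical exons of one gene; each is a multiple of 3 nt long so that
# skipping an exon does not shift the reading frame.
EXONS = ["ATGGCT", "GAAAAA", "TTTGGT", "CATTAA"]


def splice(exons, keep):
    """Join the exons whose indices are listed in `keep`, in order."""
    return "".join(exons[i] for i in keep)


def translate(mrna):
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino = CODON_TABLE[mrna[i:i + 3]]
        if amino == "*":
            break
        protein.append(amino)
    return "".join(protein)


# Two hypothetical cell types that retain different exons.
isoforms = {"cell type A": [0, 1, 3], "cell type B": [0, 2, 3]}
for cell, keep in isoforms.items():
    print(cell, translate(splice(EXONS, keep)))
```

The two cell types start from the same four exons but produce the peptides MAEKH and MAFGH, which differ only in the residues encoded by the alternatively used exon.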
Operons and Regulons
In prokaryotes, the use of operons is not the only means of simultaneously regulating multiple genes. As an example, when prokaryotes are subjected to heat shock through high temperature, unusual transcription initiation factors are produced, which induce the expression of multiple genes for proteins called heat shock proteins, thus causing a response to the heat shock. Heat shock protein genes do not form operons, and are spread over the DNA. The mechanism that simultaneously regulates the expression of such scattered genes is called a regulon (a heat shock regulon in the case of the example above). Likewise, the SOS regulon induces the expression of many DNA repair enzymes when DNA has been damaged.
Gene Expression Regulation by miRNA
It was mentioned earlier in this book that rRNA and tRNA are the main classes of non-coding RNA and are found in all cells, but eukaryotes also have snRNA (a type of non-coding RNA), which is involved in splicing (see 3.3). It has recently been discovered that small RNA molecules suppress gene expression (Column Fig. 4-2). Such RNA is believed to act by binding sequence-specifically to DNA to inhibit transcription, by binding to and degrading mRNA, or by inhibiting protein synthesis. The phenomenon of gene expression interference by RNA is called RNAi (RNA interference). RNA molecules with such functions all belong to the non-coding RNA group. RNA that acts on mRNA is synthesized as complementary double-stranded RNA or hairpin RNA, which is cut into small fragments (approx. 21 bp) by a double-stranded RNA cleaving enzyme (i.e., Dicer), and functions after binding to a protein complex called RISC. These small RNA molecules are called miRNA (micro RNA), and inhibit protein synthesis only for specific mRNA. If such RNA is experimentally introduced into a cell (or synthesized in a cell), it acts as siRNA (small interfering RNA), with effects similar to those seen when gene function is knocked out. This method is therefore called knockdown, and is now widely used in research.
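The sequence specificity of this mechanism can be sketched as a search for the stretch of mRNA that is complementary to the small RNA. The 21-nt guide sequence and both mRNA sequences below are invented for illustration, and the sketch assumes perfect complementarity over the full length, which is a simplification; in cells, pairing is often only partial.

```python
COMPLEMENT = str.maketrans("AUGC", "UACG")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return rna.translate(COMPLEMENT)[::-1]


def find_target_sites(mrna: str, guide: str) -> list[int]:
    """Return 0-based start positions in `mrna` that pair perfectly with `guide`."""
    site = reverse_complement(guide)
    positions = []
    start = mrna.find(site)
    while start != -1:
        positions.append(start)
        start = mrna.find(site, start + 1)
    return positions


# Hypothetical 21-nt guide strand (not a real miRNA).
GUIDE = "ACGUUGCAAGCUUAGGCAUCG"

# Hypothetical mRNAs: one carries a matching site, the other does not.
target_mrna = "AUGGCC" + reverse_complement(GUIDE) + "GGAUAA"
other_mrna = "AUGGCCAAAGGGCCCUUUAAAGGGCCCGGAUAA"

print(find_target_sites(target_mrna, GUIDE))  # site found in the target mRNA
print(find_target_sites(other_mrna, GUIDE))   # no site in the unrelated mRNA
```

Only the mRNA carrying the complementary stretch is recognized, which mirrors the statement that protein synthesis is inhibited only for specific mRNA.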
Eukaryotic genes also have a promoter region to which RNA polymerase binds. The basic mechanism of this binding in eukaryotes is similar to that in prokaryotes, but is far more complicated. Eukaryotes have many proteins, called general transcription factors, that recognize particular sequences, and these proteins bind to RNA polymerase to bring about its binding to the promoter (Fig. 4-4). Particular DNA sequences involved in transcriptional regulation, including the promoter, are called cis-elements, and the proteins that bind to these elements to regulate gene expression (Fig. 4-5) are called trans-elements.
In addition to these basic mechanisms, some genes have transcriptional regulatory sequences called enhancers and silencers, which enable their expression to change in response to changes in intra- and extracellular conditions. These regions are cis-elements with specific sequences, and the proteins that recognize and bind to them are trans-elements. Proteins that bind specifically to the enhancer sequence promote the binding of RNA polymerase to the promoter region, resulting in a high level of gene expression (Fig. 4-5). Conversely, the silencer suppresses gene expression.
Enhancers and silencers are different from promoters in that they function even if they are located far upstream of the gene (sometimes dozens of kbp away), and in some cases if they are inside or downstream of the gene. Promoters do not exist inside or downstream of the gene because they represent points where RNA polymerase binds, thereby initiating transcription. Another difference is that enhancers and silencers can still function in the same way even if their sequences are reversed. With promoters, the binding direction of RNA polymerase is determined by the direction of the promoter sequence, based on which the direction of enzyme movement and the selection of the template strand are determined; if the sequence is reversed, the enzyme therefore proceeds in the opposite direction, preventing the promoter from functioning.
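The difference in orientation dependence can be expressed as a toy model. The motif GATTACA and the flanking sequence are invented placeholders, and the two rules below simply restate the paragraph above; they are not a model of real protein binding.

```python
from typing import Optional

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(COMPLEMENT)[::-1]


def orientation(region: str, motif: str) -> Optional[str]:
    """Report whether `motif` occurs in `region` as written or inverted."""
    if motif in region:
        return "forward"
    if reverse_complement(motif) in region:
        return "inverted"
    return None


def enhancer_functions(region: str, motif: str) -> bool:
    # Toy rule: an enhancer works in either orientation.
    return orientation(region, motif) is not None


def promoter_functions(region: str, motif: str) -> bool:
    # Toy rule: a promoter works only in its original orientation.
    return orientation(region, motif) == "forward"


MOTIF = "GATTACA"                       # hypothetical element sequence
forward_region = "CCGG" + MOTIF + "CCGG"
inverted_region = reverse_complement(forward_region)

for name, region in (("forward", forward_region), ("inverted", inverted_region)):
    print(name,
          "enhancer:", enhancer_functions(region, MOTIF),
          "promoter:", promoter_functions(region, MOTIF))
```

In this sketch the inverted region still counts as a working enhancer but no longer as a working promoter, which is the distinction the text draws.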
It is common for single genes to contain multiple types of cis-element (such as those described above) that regulate gene expression in accordance with various intra- and extracellular signals. In eukaryotes, regions that regulate gene expression are therefore very long, often exceeding 10 kbp.
The Possibility of Many Non-coding RNA Molecules Being miRNA Molecules
It has become increasingly clear that the nucleus of eukaryotes contains more RNA types than first thought, many of which are non-coding RNA molecules transcribed from DNA regions that are wider than initially estimated (see the Column in Chapter 3). Many of these are relatively small RNA molecules, and many of them form double-stranded structures. Based on these features, it is reasonable to suggest that they may be miRNA. Since miRNA plays a number of basic and important roles (including gene expression regulation at the chromosome and mRNA levels), if the large amounts of RNA present in cells turn out to be miRNA, our perception of gene expression regulation in eukaryotic cells will change dramatically. Research in this area is progressing rapidly.
Regulation by Chromatin Remodeling
Eukaryotic DNA exists in a nucleosome structure in which DNA binds firmly to histones (basic proteins) (Fig. 4-6). These nucleosomes then form a chromatin fiber. When histones and DNA are tightly connected, it is difficult for RNA polymerase to bind to the DNA and synthesize RNA. One of the roles of enhancers is to loosen the connection between histones and DNA (and, in some cases, deconstruct the nucleosomes) to promote RNA synthesis by RNA polymerase. This phenomenon is called chromatin remodeling because it changes the nucleosome structure.
Regulation by chromatin remodeling is a complex reaction in which large numbers of enzymes and proteins are involved. To give a simplified explanation, a transcriptional activation protein binds to a particular enhancer, which induces histone acetyltransferase to also bind to the enhancer, where it acetylates amino groups of the histones. This lowers the basicity of the histones, thus weakening their connection with DNA. From this starting point, dissociation between histones and DNA spreads to neighboring regions; this soon exposes the promoter, thereby facilitating the binding of the RNA polymerase complex to it (Fig. 4-7). Conversely, when expression is suppressed, histone deacetylase removes the acetyl groups to restore the nucleosome structure.
Genetics and Reverse Genetics
To investigate gene functions in traditional Mendelian genetics, the inheritance patterns of the phenotypes (or traits) of an organism are first observed, and then the characteristics of the genes that control the phenotypes in question are analyzed. Simply put, analysis starts from phenotypes and moves toward genes, with gene cloning as one of the goals.
On the other hand, while genes can now be easily cloned thanks to the completion of the Human Genome Project and other scientific advances, the phenotypes that the cloned genes give rise to remain largely unknown. Analysis that starts from the genes obtained and moves toward phenotypes (gene functions) is called reverse genetics.
Often in reverse genetics, a knockout cell is made using one of the genes obtained. An animal (such as a knockout mouse) is then cloned from this cell, and the phenotypes of the animal are then analyzed (see the Column in Chapter 12). Despite the low success rate of this method, breakthroughs have been made using the technique in areas where gene functions were previously difficult to investigate, such as the functions of genes involved in the developmental process and those involved in the higher functions of the brain.
Cloned animals and plants in which foreign genes have been artificially introduced are called transgenic animals and plants. They are used to analyze genes whose functions are unknown, and are also widely used in applied research to create livestock that produces particular proteins.
Chromatin Structure and Gene Expression Regulation
In the DNA of non-expressed genes associated with differentiation functions, cytosines - a type of nucleobase - are highly methylated. At sites with the two-base sequence 5’-CG-3’ in particular, over 70% of cytosines are methylated. If a large number of methylated cytosines exist in the DNA that constitutes a nucleosome, the protein complex that recognizes them binds to the DNA, and the histone methyltransferase contained in the complex methylates histones. A protein then binds to the methylated histones, tightly packing the chromatin at the site (Fig. 4-8). This structure is called constitutive heterochromatin (see the Column). Conversely, methylation occurs to a lesser extent in genes that are expressed.
When cells grow, the cytosines in the newly created DNA strand (i.e., the daughter strand) are not methylated. However, if the cytosines at some sites in the template strand (i.e., the parent strand) are methylated (the sequence is 5’-CG-3’ in both strands), an enzyme methylates the cytosines in the daughter strand at locations opposite the methylated parts of the template, thus maintaining the methylated state of both the parent and daughter strands. In this way, non-expressed genes in heterochromatin domains are inherited through cell division to progeny cells in unchanged form. In such cases, methylation information is passed on to progeny cells in the same way as changes in DNA sequences (genetic changes, or mutations), but no changes in sequence actually occur. The phenomenon is known as epigenetic change, and this type of gene expression regulation is called epigenetic regulation (Fig. 4-8).
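The copying of the methylation pattern onto the daughter strand can be sketched as follows. The sequence and the set of methylated positions are invented for illustration, and all positions are given as 0-based indices along the parent strand. Because 5'-CG-3' reads the same on both strands, the daughter-strand cytosine of each CG site lies opposite the parent-strand G, one position after the parent cytosine.

```python
def cpg_sites(parent: str) -> list[int]:
    """Return the index of the C in every 5'-CG-3' site of the parent strand."""
    return [i for i in range(len(parent) - 1) if parent[i:i + 2] == "CG"]


def maintain_methylation(parent: str, parent_methylated: set[int]) -> set[int]:
    """Return the daughter-strand cytosines (in parent-strand coordinates)
    that are methylated because the opposite parent CG site is methylated."""
    return {i + 1 for i in cpg_sites(parent) if i in parent_methylated}


# Hypothetical parent strand with three CG sites (C at indices 1, 5 and 8).
PARENT = "ACGTTCGACGA"
PARENT_METHYLATED = {1, 8}   # assumed: the first and third sites are methylated

daughter_methylated = maintain_methylation(PARENT, PARENT_METHYLATED)
print("CG sites on parent strand:", cpg_sites(PARENT))
print("methylated daughter cytosines:", sorted(daughter_methylated))
```

The unmethylated site at index 5 stays unmethylated on both strands, so the pattern, rather than the DNA sequence, is what the progeny cell inherits.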
Heterochromatin and Euchromatin
Under a microscope, it can be seen that a nucleus consists of heterochromatin and euchromatin. The former is stained dark when basic dye is applied, while the latter exhibits light staining. In heterochromatin, chromatin fibers are bound with proteins, which tightly packs the chromatin; non-expressed genes are concentrated in these regions. Euchromatin is a loosely packed form of chromatin in which genes that can be expressed are concentrated. Heterochromatin is believed to consist of constitutive heterochromatin and facultative heterochromatin. Constitutive heterochromatin domains remain as heterochromatin throughout the life of the cell, and the expression of the genes in them is suppressed. One of the two X chromosomes in women is packaged in constitutive heterochromatin. An often-quoted fact is that the serum albumin gene is not expressed in any cell except hepatocytes, probably because it lies within constitutive heterochromatin. Facultative heterochromatin is a type of heterochromatin that moves between heterochromatin and euchromatin states.
DNA Methylation, Development and Somatic Cell Cloned Animals
DNA methylation is closely associated with heterochromatin formation, and the genes in highly methylated DNA sites are not expressed. The DNA in the cells between the germ-cell and early development stages is hypomethylated, meaning that all genes during this period can be expressed. It can also be said that all such cells have totipotency - the ability to develop into all cell types. As somatic cells become more differentiated during the developmental process, DNA methylation gradually proceeds, thus limiting the paths of differentiation for those cells (although the regions methylated differ by cell type). Subsequently, most of the genes associated with differentiation functions (except for some specific genes) become highly methylated, meaning that they will no longer be expressed. Although the mechanism by which particular DNA regions become methylated is not yet clear, it is understood that DNA methylation proceeds during the developmental process. One of the reasons for the low success rate of animal cloning from somatic cells may be the lack of methods to efficiently transform hypermethylated DNA to a hypomethylated state (i.e., initialization).
Genomes, Transcriptomes and Proteomes
The aim of the Human Genome Project was to determine the entire DNA sequence of humans. In addition to this, the complete sequences of over 20 species of organism have so far been determined or almost determined. The term genome refers to the entire collection of DNA in one cell; although only a small portion of the genome functions as genes, in functional terms the word refers to the entire collection of genes in one cell. Using the success of the Human Genome Project as a springboard, several projects have been implemented in which the functions of organisms are comprehensively investigated (rather than concentrating on individual species). As an example, one area of investigation focuses on transcriptomes - the set of all RNA molecules transcribed (transcripts) - in terms of the types and amounts of mRNA synthesized in particular tissues or cells. These should differ by organ, tissue and cell in the same species, and should also naturally change in line with alterations in physiological function or when the organism is diseased. It is expected that comprehensive investigation of these areas will enable the identification of changes in mRNA types and amounts in diseased conditions as well as in normal ones. Likewise, comprehensive analyses are under way in areas such as proteomes (the entire complement of proteins expressed), interactomes (the whole set of protein associations and interactions), metabolomes (the complete set of all metabolites) and epigenomes (overall epigenetic changes, including DNA methylation). Some joke that we are now in the era of the “ome.”
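A transcriptome comparison of the kind described above can be sketched as a per-gene fold change between two conditions. The gene names and mRNA counts below are invented, and the log2 ratio with a pseudocount of 1 is one common convention assumed here for illustration, not a method given in the text.

```python
import math


def log2_fold_change(normal: float, diseased: float, pseudocount: float = 1.0) -> float:
    """log2 ratio of diseased to normal mRNA amount.

    The pseudocount (an assumed convention) avoids division by zero
    for genes that are not detected in one condition.
    """
    return math.log2((diseased + pseudocount) / (normal + pseudocount))


# Hypothetical mRNA counts for three genes in the same tissue.
normal_counts = {"geneA": 7, "geneB": 99, "geneC": 0}
diseased_counts = {"geneA": 31, "geneB": 99, "geneC": 15}

for gene in normal_counts:
    change = log2_fold_change(normal_counts[gene], diseased_counts[gene])
    print(f"{gene}: log2 fold change = {change:+.1f}")
```

A value of 0 means the amount of mRNA is unchanged, while positive values mark genes whose transcripts are more abundant in the diseased condition.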