2.4 Replication of DNA
Outline of DNA Replication
DNA replication produces high-molecular-weight DNA by polymerizing deoxyribonucleotides, the monomer units of DNA.
Generally, this process is expressed as follows:
[dNMP]n + dNTP → [dNMP]n+1 + PPi
dNMP, the compound that remains when pyrophosphate*1 is released from dNTP, is added to the 3'-OH of [dNMP]n. DNA synthesis therefore proceeds in the 5'→3' direction. The same holds for RNA synthesis: high-molecular-weight nucleic acids are always synthesized from 5' toward 3'.
Deoxyribonucleotides are linked by DNA polymerase. In E. coli, DNA polymerase I, II, III, etc. are known, and III is the main replicative enzyme. In mammals, several types, including α, β, γ, δ and ε, have been identified. Of these, α, δ and ε are the main replicative enzymes, while the others serve mainly to repair damage to DNA.
DNA Damage and Repair
DNA is continually subjected to damage. There are many natural or artificial chemicals that bond with DNA bases, form base-base bonds, or cut DNA strands. Radiation such as ultraviolet rays and cosmic rays also modifies bases or cuts strands. DNA faces such threats from the birth of the organism that carries it, and all organisms are equipped with a range of functions to detect and repair DNA damage. As an example, damage to bases caused by ultraviolet rays or certain chemicals is fixed by a mechanism called excision repair, in which the damaged sites (including the surrounding regions) are cut out and removed, and the resulting gaps are filled in by DNA polymerase (Column Fig. 2-1). Defects in the genes coding for the enzymes involved in this mechanism result in a hereditary disease called xeroderma pigmentosum, which makes sufferers prone to developing cancer. Many other hereditary diseases associated with repair-enzyme defects are known; these often cause cancer and, in rare cases, accelerated senescence (progeria). In short, many repair-enzyme systems are in place to continually repair gene damage, thus minimizing the accumulation of defects.
The Need for a Template in Replication
During the process of replication, the original double strands are unwound, and new nucleotides are added to each single strand in a way that forms base pairs (Fig. 2-13). This diagram shows the DNA double-helix structure published in 1953 by James Watson and Francis Crick, which suggested the possibility that DNA was replicated using templates. In fact, during the replication process, base pairs (C-G and A-T) are formed using each original strand as a template. As a result, once the process is complete, two new double strands of DNA with the same base sequence as the original are created. One strand of each new double strand is the original template (i.e., the parent strand), and the other is a new strand (i.e., the daughter strand). This method of replication is called semiconservative replication. While many macromolecules such as proteins and sugar chains exist in the living body, this template-based semiconservative method of synthesis is unique to DNA.
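The template mechanism described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: strands are plain strings written 5'→3', and the names `PAIR`, `synthesize_daughter` and `replicate` are hypothetical helpers introduced here, not part of any library.

```python
# Watson-Crick pairing rules from the text: A-T and C-G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def synthesize_daughter(template: str) -> str:
    """Return the daughter strand (written 5'->3') for a template written 5'->3'."""
    # The two strands of a duplex are antiparallel, so pair each base
    # and read the result in the opposite direction.
    return "".join(PAIR[base] for base in reversed(template))

def replicate(duplex):
    """Semiconservative replication of a duplex given as (top, bottom), both 5'->3'."""
    top, bottom = duplex
    # Each product keeps one parent strand and gains one newly made strand.
    return (top, synthesize_daughter(top)), (synthesize_daughter(bottom), bottom)

parent = ("ATGCGTTA", synthesize_daughter("ATGCGTTA"))
child1, child2 = replicate(parent)
print(child1 == parent, child2 == parent)
```

Both daughter duplexes carry the same sequence as the parent, and each contains exactly one of the original strands, which is what "semiconservative" means.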
Accuracy of Replication
DNA replication must be accurate. If DNA were synthesized using nucleotides that do not follow the pairing rule, changes in the sequence would occur in the newly created strand to form what is known as a mutation. A mutation that occurs within the region of an important gene, if the effect is significant, results in the death of the cell. Alternatively, in some cases, the cell may become cancerous. In humans, one cell contains 6 × 10⁹ base pairs, and the division of 10¹¹ to 10¹² cells is thought to occur each day.
When DNA polymerase extends a new strand, the probability of its inserting an incorrect nucleotide is said to be 10⁻⁶ to 10⁻⁴. DNA polymerase has a proofreading function by which incorrectly inserted nucleotides are removed and replaced with the correct ones. Additionally, errors missed by the proofreading function are detected and replaced with the correct nucleotides by a mismatch repair mechanism. There are multiple mismatch repair systems, and the final frequency of errors is in the range of 10⁻¹¹ to 10⁻¹⁰. To artificially build a reaction system that achieves such low error rates is difficult, even in the field of precision engineering.
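To get a feel for these figures, the short calculation below multiplies the quoted error frequencies by the size of the human genome. It is a rough sketch only: it assumes that each of the 6 × 10⁹ base pairs counts as one insertion event per round of replication, and the names `GENOME_BP` and `expected_errors` are introduced here for illustration.

```python
GENOME_BP = 6e9  # base pairs per human cell, as stated in the text

# Error frequencies per inserted nucleotide quoted in the text (low, high).
RAW_POLYMERASE = (1e-6, 1e-4)   # polymerase insertion errors
AFTER_REPAIR = (1e-11, 1e-10)   # after proofreading and mismatch repair

def expected_errors(error_rate: float, bases_copied: float = GENOME_BP) -> float:
    """Expected number of wrong bases when `bases_copied` positions are copied."""
    return error_rate * bases_copied

for label, (low, high) in [
    ("polymerase insertion alone", RAW_POLYMERASE),
    ("after proofreading and mismatch repair", AFTER_REPAIR),
]:
    print(f"{label}: {expected_errors(low):g} to {expected_errors(high):g} errors per genome copy")
```

Under these assumptions, uncorrected insertion errors would amount to thousands or even hundreds of thousands of wrong bases per cell division, whereas the final error frequency corresponds to less than one wrong base per division.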
Double strands of DNA always run in opposite directions. This is the case for completed DNA as well as for DNA during the process of replication. When DNA synthesis is considered based on the Watson-Crick model (Fig. 2-14), one of the daughter strands must be synthesized in the 3'→5' direction. However, DNA polymerase always synthesizes in the 5'→3' direction. Let's look at this in more detail.
During the synthesis of the daughter strands following the uncoiling of the parent strand, three double strands of DNA appear in a structure called the replication fork (Fig. 2-14). Looking closely at this fork structure, at the point where DNA synthesis occurs (i.e., the replication point), one of the daughter strands (i.e., the leading strand) is synthesized in the same direction as that in which the replication fork runs. The other daughter strand (i.e., the lagging strand) is synthesized in the direction opposite to that of the replication fork, because DNA is synthesized in the 5'→3' direction (Fig. 2-14). Along the lagging strand, short DNA fragments of approximately 100 nucleotides are continually synthesized, and are subsequently linked with each other. These short strands are called Okazaki fragments after Reiji Okazaki, the molecular biologist who discovered them. This type of replication is called discontinuous replication.
DNA polymerase follows the reaction below:
[dNMP]n + dNTP → [dNMP]n+1 + PPi
However, this reaction does not occur when n=1. A fragment of at least two nucleotides (called a primer) is needed so that new nucleotides can be added to it. RNA polymerase, on the other hand, can synthesize RNA from n=1 using DNA as a template. RNA primers are therefore synthesized by RNA polymerase prior to DNA synthesis, and DNA polymerase starts DNA synthesis from these primers (Fig. 2-15). This mechanism was also discovered by Okazaki et al.
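The primer requirement and the 5'→3' direction of synthesis can be illustrated with a small sketch. It makes several simplifying assumptions: the template is written 3'→5' so that it can be read left to right, the primer is written with DNA letters even though real primers are RNA, and `extend_primer` is a hypothetical function name used only here.

```python
# Watson-Crick pairing rules: A-T and C-G.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}

def extend_primer(template_3to5: str, primer: str) -> str:
    """Extend a primer 5'->3' along a template that is written 3'->5'."""
    # The text states that a fragment of at least two nucleotides is required.
    if len(primer) < 2:
        raise ValueError("DNA polymerase needs a primer of at least two nucleotides")
    # The primer must already be base-paired with the start of the template.
    if any(PAIR[t] != p for t, p in zip(template_3to5, primer)):
        raise ValueError("primer does not pair with the template")
    strand = primer
    for base in template_3to5[len(primer):]:
        # One step of [dNMP]n + dNTP -> [dNMP]n+1 + PPi:
        # the paired nucleotide is added to the 3' end of the growing strand.
        strand += PAIR[base]
    return strand

print(extend_primer("TACGCAAT", "AT"))
```

Because nucleotides are only ever appended to the right-hand (3') end of `strand`, the sketch can grow a strand in one direction only, which is the constraint that leads to discontinuous synthesis on the lagging strand.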
Along the lagging strand, DNA synthesis proceeds while the RNA primers that have served their purpose are removed, and the breaks between the short DNA fragments are subsequently sealed by DNA ligase*2.
*2 DNA ligase: An enzyme that seals a break between a 3'-OH and a 5'-phosphate on one strand of double-stranded DNA. It cannot seal a break if even one base is missing.
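The lagging-strand steps described above (synthesis of short fragments, then sealing by DNA ligase) can be sketched as follows. Fragments are modeled simply as (start, end) position intervals, the 100-nucleotide fragment length is the approximate figure given in the text, and `okazaki_fragments` and `ligate` are hypothetical names chosen for this illustration.

```python
def okazaki_fragments(lagging_len: int, fragment_len: int = 100):
    """Return (start, end) intervals of short fragments covering the lagging-strand template."""
    return [
        (start, min(start + fragment_len, lagging_len))
        for start in range(0, lagging_len, fragment_len)
    ]

def ligate(fragments):
    """Join adjacent fragments into one continuous strand.

    Mirrors the footnote on DNA ligase: a break can be sealed only if
    no base is missing between the two neighboring fragments.
    """
    start, end = fragments[0]
    for next_start, next_end in fragments[1:]:
        if next_start != end:
            raise ValueError(f"cannot seal: gap or overlap between {end} and {next_start}")
        end = next_end
    return (start, end)

fragments = okazaki_fragments(1050)
print(len(fragments), ligate(fragments))
```

A 1,050-nucleotide stretch is covered by eleven fragments in this model, and sealing them yields a single continuous strand; leaving even a one-base gap between two intervals makes `ligate` fail.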
The Many Enzymes Involved in DNA Replication
In fact, the actual replication reaction is more complicated than the outline given so far (Column Fig. 2-2). At the tip of the replication fork, helicase unwinds the parental double strand. Single-strand binding proteins stabilize the single strands exposed by the helicase. RNA primers are synthesized by primase, and DNA polymerase extends the primers to form new DNA strands. Along the lagging strand, as previously mentioned, DNA is synthesized while RNA primers are being removed, and DNA ligase subsequently joins the resulting DNA fragments together. Ahead of the replication point of the fork, topoisomerase (DNA gyrase) cuts the DNA strand to release the tension built up in the parental strands and then rejoins it. The various enzymes and proteins with these functions form a large replication complex; essentially similar mechanisms are at work in organisms from bacteria through to humans.
Replication Origin and Endpoint
In prokaryotes, the long DNA strand has a single replication origin from which replication forks proceed in both directions. Since prokaryotic DNA is circular, the two replication points meet at the opposite side of the circle at a location known as the replication endpoint. The replication origin and endpoint have characteristic base sequences, and specific proteins direct the start and the end of DNA synthesis. A DNA unit that replicates from a single origin of replication is called a replicon. Prokaryotic DNA consists of one replicon, and in E. coli the replication of the replicon takes approximately 40 minutes. Eukaryotes contain more DNA than prokaryotes and have multiple replication origins on their DNA. In other words, eukaryotic DNA consists of multiple replicons. In this case, replication forks run in both directions from each replication origin, and the replication of each replicon takes approximately one hour.
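The 40-minute figure implies a fork speed that is easy to estimate. The calculation below is a back-of-the-envelope sketch: the E. coli genome size of about 4.6 × 10⁶ base pairs is an assumption brought in from outside this text, and `fork_speed` is a hypothetical helper name.

```python
ECOLI_GENOME_BP = 4.6e6   # assumption: approximate E. coli genome size, not given in the text
REPLICATION_MINUTES = 40  # replication time of the E. coli replicon, from the text

def fork_speed(replicon_bp: float, minutes: float, forks: int = 2) -> float:
    """Average nucleotides copied per second by each replication fork."""
    # Two forks leave the single origin in opposite directions,
    # so each one copies half of the replicon.
    return replicon_bp / forks / (minutes * 60)

speed = fork_speed(ECOLI_GENOME_BP, REPLICATION_MINUTES)
print(f"about {speed:.0f} nucleotides per second per fork")
```

Under these assumptions each fork advances on the order of 1,000 nucleotides per second. The same function shows why a single origin would not suffice for the much larger eukaryotic genome within a replication time of about one hour per replicon.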
PCR (polymerase chain reaction) is a technique by which a particular fragment of DNA with a known base sequence is amplified in a test tube. Template DNA, DNA primers (fragments of 10 to 20 nucleotides that are chemically synthesized in advance) and DNA polymerase are needed for this process. As shown in Column Fig. 2-3, a particular segment of DNA can be amplified essentially without limit for analysis. In theory, DNA can be amplified even from a single cell. Fragmented DNA can also be amplified as long as the target segment itself is intact. RNA can also be used as a template. PCR has a wide range of applications, from the amplification and cloning of particular DNA segments to criminal investigations and court evidence.
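The amplifying power of PCR comes from the fact that, ideally, the number of copies of the target segment doubles in every cycle. The sketch below illustrates this growth; the `efficiency` parameter (the fraction of molecules copied per cycle) and the function name `pcr_copies` are assumptions introduced here, and real reactions eventually level off as reagents run out.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Copies of the target segment after a number of PCR cycles.

    efficiency = 1.0 models ideal doubling in every cycle;
    lower values model cycles in which only some molecules are copied.
    """
    return initial_copies * (1 + efficiency) ** cycles

# Starting from a single template molecule, e.g. DNA from a single cell.
for cycles in (10, 20, 30):
    print(cycles, f"{pcr_copies(1, cycles):.3g}")
```

With ideal doubling, 30 cycles turn one template molecule into roughly a billion copies, which is why material from a single cell can be enough for analysis.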