2.3 Genes and DNA
Definitions of Genes
It is often said that genes are DNA, or conversely, that DNA consists of genes; however, genes do not cover a high-molecular-weight DNA molecule continuously from one end to the other. A gene is defined as a region of DNA containing information that determines the primary structure of a protein (its amino acid sequence) or the structure of a non-coding RNA (its base sequence), as explained in Chapter 3. In prokaryotes, genes are generally packed densely, with very narrow intervals between them. In eukaryotes, they are located sparsely along the DNA, with wide intervals between them.
The total DNA contained in one cell is called a genome. Prokaryotic cells contain one thread (i.e., one molecule) of DNA, whereas human somatic cells have 46 threads of DNA per cell (i.e., per nucleus), of which 23 are derived from the mother (the ovum) and 23 from the father (the sperm). The somatic cells of eukaryotes commonly have two sets of genes, one derived from each parent, and such cells are called diploid. Cells that have only one set of genes are called haploid. Most prokaryotes are haploid, as are the germ cells of eukaryotes. The whole DNA of a diploid cell is called a genome; in some cases, however, from a functional viewpoint, the DNA of a haploid cell is called a genome and that of a diploid cell is referred to as two sets (copies) of the genome.
DNA Content of Organisms
DNA content varies greatly among organisms. Figure 2-10 shows the amount of DNA per haploid genome in several organisms. Generally, DNA content per cell is larger in eukaryotes than in prokaryotes. Human cells contain approximately 1,000 times as much DNA as E. coli cells (compared per haploid genome); a diploid human cell contains about 6 pg of DNA. Among eukaryotes, DNA content generally increases the higher an organism is, although there can be great variation among organisms of the same group. Among vertebrates, such variation is especially marked in fish and amphibians, with some species having more DNA than humans. Some higher plant species also have more DNA than humans, so a higher DNA content does not necessarily indicate a higher organism. In other words, humans are not the species with the largest amount of DNA.
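The figures above can be turned into rough genome sizes in base pairs. The sketch below is illustrative only: the conversion factor of about 978 Mbp per picogram of double-stranded DNA is a commonly used approximation that does not come from this text, and the names `BP_PER_PG` and `pg_to_bp` are hypothetical.

```python
# Rough conversion between DNA mass and number of base pairs.
# Assumption (not from the text): 1 pg of double-stranded DNA ~ 978 Mbp.
BP_PER_PG = 0.978e9


def pg_to_bp(mass_pg: float) -> float:
    """Convert a DNA mass in picograms to an approximate base-pair count."""
    return mass_pg * BP_PER_PG


diploid_pg = 6.0                    # human diploid cell, value from the text
haploid_pg = diploid_pg / 2         # one copy of the genome
haploid_bp = pg_to_bp(haploid_pg)   # on the order of 3 billion base pairs

# The text gives human DNA content as about 1,000 times that of E. coli
# (per haploid genome), so this is only an order-of-magnitude estimate.
ecoli_bp_estimate = haploid_bp / 1000

print(f"Human haploid genome: ~{haploid_bp / 1e9:.1f} Gbp")
print(f"E. coli estimate:     ~{ecoli_bp_estimate / 1e6:.1f} Mbp")
```

The E. coli value obtained this way is a few Mbp, the same order of magnitude as sequenced E. coli genomes; the factor of 1,000 should be read as a round number rather than a precise ratio.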
The Number of Genes in Organisms
The Human Genome Project, which aims to determine the sequence of all the base pairs in human DNA, is now almost complete, and the genome sequences of many other organisms are also being determined at an increasing rate. Contrary to predictions, the number of genes in humans is now estimated to be only about six times that in E. coli (approx. 26,000 in humans and 4,300 in E. coli). The number of genes also does not differ greatly among fruit flies, Arabidopsis thaliana and humans.
Although the difference in the number of genes between humans and E. coli is modest, in eukaryotes (including humans) one gene can direct the synthesis of multiple types of protein with different amino acid sequences, and the number of protein types produced in humans is estimated to be around 100,000. This mechanism is discussed later (Chapter 3).
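As a quick check of the arithmetic, the estimates quoted above can be compared directly. This is a minimal sketch that uses only the approximate counts given in the text; the variable names are illustrative.

```python
# Approximate counts quoted in the text.
human_genes = 26_000
ecoli_genes = 4_300
human_protein_types = 100_000

# How many times more genes humans have than E. coli (about six).
gene_ratio = human_genes / ecoli_genes

# Average number of distinct protein types per human gene (a little under four).
proteins_per_gene = human_protein_types / human_genes

print(f"Human / E. coli gene count:   {gene_ratio:.1f}")
print(f"Protein types per human gene: {proteins_per_gene:.1f}")
```

An average of several protein types per gene is only a summary figure; it says nothing about how the types are distributed across individual genes.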
Eukaryotes - Characterized by Many Non-gene DNA Regions
The DNA of eukaryotes has a much larger proportion of regions that are not genes (amino-acid-coding sequences) than that of prokaryotes. As an example, humans have about 1,000 times as much DNA as E. coli, but only about six times as many genes. As shown in Figure 2-11, in mammals, only 3% of the whole DNA sequence codes for amino acids. One of the characteristics of eukaryotic DNA is that, unlike DNA in prokaryotes, it has a large number of repetitive sequences, which represent over half the total DNA in some species. Short repetitive sequences may be clustered at particular sites or scattered throughout the genome, and little is known about their significance or function.
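The contrast in gene density can be made concrete with the round numbers used in this section (about 1,000 times more DNA in humans, and roughly 26,000 versus 4,300 genes). The following is a back-of-the-envelope sketch under those figures, not a measured result.

```python
# Round figures from this section.
dna_ratio = 1_000             # human : E. coli DNA content, per haploid genome
gene_ratio = 26_000 / 4_300   # human : E. coli gene count, about six

# Genes per unit length of DNA in E. coli relative to humans.
# If DNA grows 1,000-fold while gene count grows about six-fold,
# genes become correspondingly sparser.
relative_gene_density = dna_ratio / gene_ratio

# Fraction of mammalian DNA that does not code for amino acids,
# given the 3% coding figure quoted in the text.
coding_fraction = 0.03
non_coding_fraction = 1 - coding_fraction

print(f"E. coli gene density is ~{relative_gene_density:.0f}x that of humans")
print(f"Non-coding fraction in mammals: {non_coding_fraction:.0%}")
```

With these inputs, E. coli carries on the order of a hundred times more genes per unit of DNA than humans do, which is the quantitative form of "densely located" versus "sparsely located".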
In eukaryotes, only a small part of a gene that determines the structure of a protein consists of amino-acid-coding sequence used for protein synthesis. Figure 2-12 shows a schematic diagram of a eukaryotic gene, in which exons are separated by introns. Introns do not contain an amino-acid-coding sequence. Based on the classical definition, therefore, introns are not genes, but by convention in eukaryotes, a gene includes both its introns and its exons. Some genes have introns that are 10 to 100 times as long as their exons.
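To see how strongly introns dilute the coding sequence, consider a toy gene model. The sketch below is purely illustrative: the `Gene` class and all lengths are hypothetical values chosen so that the introns are ten times as long as the exons, the lower end of the range mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    """Toy model of a eukaryotic gene as alternating exons and introns."""
    exon_lengths: list[int]     # lengths in bp (hypothetical values)
    intron_lengths: list[int]   # lengths in bp (hypothetical values)

    def exon_fraction(self) -> float:
        """Fraction of the gene's length that lies in exons.

        This is an upper bound on the amino-acid-coding fraction,
        because exons can also contain untranslated sequence.
        """
        exon_bp = sum(self.exon_lengths)
        total_bp = exon_bp + sum(self.intron_lengths)
        return exon_bp / total_bp


# Three exons (500 bp in total) separated by two introns (5,000 bp in total).
toy_gene = Gene(exon_lengths=[150, 200, 150], intron_lengths=[2_000, 3_000])

print(f"Exon fraction: {toy_gene.exon_fraction():.1%}")
```

With introns ten times as long as the exons, fewer than one base pair in ten lies in an exon; at a 100-fold ratio the fraction falls to about one in a hundred.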
In prokaryotes, the regions that regulate gene expression are short (from tens of bp to about a hundred bp), whereas in eukaryotes they are much longer (tens of kbp). This is another major difference between prokaryotes and eukaryotes.