Read this article to learn about the composition and structure of DNA and RNA.
There are two types of nucleic acids, namely deoxyribonucleic acid (DNA) and ribonucleic acid (RNA). Primarily, nucleic acids serve as repositories and transmitters of genetic information.
DNA was discovered in 1869 by Johann Friedrich Miescher, a Swiss researcher. The demonstration that DNA contains genetic information was first made in 1944 by Avery, MacLeod and McCarty.
Functions of Nucleic Acids:
DNA is the chemical basis of heredity and may be regarded as the reserve bank of genetic information. DNA is exclusively responsible for maintaining the identity of different species of organisms over millions of years. Further, every aspect of cellular function is under the control of DNA. The DNA is organized into genes, the fundamental units of genetic information. The genes control protein synthesis through the mediation of RNA, as shown below:
DNA → RNA → Protein
The interrelationship of these three classes of biomolecules (DNA, RNA and proteins) constitutes the central dogma of molecular biology or more commonly the central dogma of life.
Components of Nucleic Acids:
Nucleic acids are the polymers of nucleotides (polynucleotides) held by 3′ and 5′ phosphate bridges. In other words, nucleic acids are built up by the monomeric units—nucleotides (It may be recalled that protein is a polymer of amino acids).
Nucleotides are composed of a nitrogenous base, a pentose sugar and a phosphate. Nucleotides perform a wide variety of functions in the living cells, besides being the building blocks or monomeric units in the nucleic acid (DNA and RNA) structure. These include their role as structural components of some coenzymes of B-complex vitamins (e.g. FAD, NAD+), in the energy reactions of cells (ATP is the energy currency), and in the control of metabolic reactions.
Structure of Nucleotides:
The nucleotide essentially consists of base, sugar and phosphate. The term nucleoside refers to base + sugar. Thus, nucleotide is nucleoside + phosphate.
Purines and pyrimidines:
The nitrogenous bases found in nucleotides (and, therefore, nucleic acids) are aromatic heterocyclic compounds. The bases are of two types: purines and pyrimidines. Their general structures are depicted in Fig. 2.1. Purines are numbered in the anticlockwise direction while pyrimidines are numbered in the clockwise direction. This is an internationally accepted system for representing the structure of bases.
Major bases in nucleic acids:
The structures of the major purines and pyrimidines found in nucleic acids are shown in Fig. 2.2. DNA and RNA contain the same purines, namely adenine (A) and guanine (G). Further, the pyrimidine cytosine (C) is found in both DNA and RNA. However, the nucleic acids differ with respect to the second pyrimidine base. DNA contains thymine (T) whereas RNA contains uracil (U). As seen in Fig. 2.2, thymine and uracil differ in structure by the presence (in T) or absence (in U) of a methyl group.
Tautomeric forms of purines and pyrimidines:
The existence of a molecule in a keto (lactam) and enol (lactim) form is known as tautomerism. The heterocyclic rings of purines and pyrimidines with oxo functional groups exhibit tautomerism as simplified below.
The purine guanine and the pyrimidines cytosine, thymine and uracil exhibit tautomerism. The lactam and lactim forms of cytosine are represented in Fig. 2.3.
At physiological pH, the lactam (keto) tautomeric forms are predominantly present.
Minor bases found in nucleic acids:
Besides the bases described above, several minor and unusual bases are often found in DNA and RNA. These include 5-methylcytosine, N4-acetylcytosine, N6-methyladenine, N6,N6-dimethyladenine, pseudouracil etc. It is believed that the unusual bases in nucleic acids help in recognition by specific enzymes.
Sugars of Nucleic Acids:
The five-carbon monosaccharides (pentoses) are found in the nucleic acid structure. RNA contains D-ribose while DNA contains D-deoxyribose. Ribose and deoxyribose differ in structure at C2. Deoxyribose has one oxygen atom fewer at C2 compared to ribose (Fig. 2.4).
Nomenclature of Nucleotides:
The addition of a pentose sugar to a base produces a nucleoside. If the sugar is ribose, ribonucleosides are formed. Adenosine, guanosine, cytidine and uridine are the ribonucleosides of A, G, C and U respectively. If the sugar is deoxyribose, deoxyribonucleosides are produced.
The term mononucleotide is used when a single phosphate moiety is added to a nucleoside. Thus adenosine monophosphate (AMP) contains adenine + ribose + phosphate. The principal bases, their respective nucleosides and nucleotides found in the structure of nucleic acids are given in Table 2.1. Note that the prefix ‘d’ is used to indicate if the sugar is deoxyribose (e.g. dAMP).
The Binding of Nucleotide Components:
The atoms in the purine ring are numbered 1 to 9 and those in the pyrimidine ring 1 to 6 (Fig. 2.1). The carbons of sugars are represented with an associated prime (′) for differentiation. Thus the pentose carbons are 1′ to 5′. The pentoses are bound to nitrogenous bases by β-N-glycosidic bonds. The N9 of a purine ring binds with C1′ of a pentose sugar to form a covalent bond in the purine nucleoside. In the case of pyrimidine nucleosides, the glycosidic linkage is between N1 of a pyrimidine and C1′ of a pentose.
The hydroxyl groups of adenosine are esterified with phosphates to produce 5′- or 3′-monophosphates. The 5′-hydroxyl is the most commonly esterified, hence 5′ is usually omitted while writing nucleotide names. Thus AMP represents adenosine 5′-monophosphate. However, for adenosine 3′-monophosphate, the abbreviation 3′-AMP is used. The structures of two selected nucleotides, namely AMP and TMP, are depicted in Fig. 2.5.
Nucleoside Di- and Triphosphates:
Nucleoside monophosphates possess only one phosphate moiety (AMP, TMP). The addition of second or third phosphates to the nucleoside results in nucleoside diphosphate (e.g. ADP) or triphosphate (e.g. ATP), respectively. The anionic properties of nucleotides and nucleic acids are due to the negative charges contributed by phosphate groups.
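The naming rules described in the preceding sections (base + sugar gives a nucleoside, each added phosphate gives a mono-, di- or triphosphate, and the prefix 'd' marks deoxyribose) can be sketched in a few lines of Python. This is only an illustration of the convention; the function name `nucleotide_name` and its parameters are hypothetical, and the sketch covers 5′-phosphates only.

```python
# Nucleoside names for the principal bases, as listed in the text.
NUCLEOSIDE = {
    "A": "adenosine",
    "G": "guanosine",
    "C": "cytidine",
    "U": "uridine",
    "T": "thymidine",
}

# Number of phosphate groups -> naming prefix.
PHOSPHATE_PREFIX = {1: "mono", 2: "di", 3: "tri"}


def nucleotide_name(base: str, deoxy: bool = False, phosphates: int = 1) -> tuple[str, str]:
    """Return (abbreviation, full name) for a nucleoside 5'-phosphate."""
    base = base.upper()
    if base not in NUCLEOSIDE:
        raise ValueError(f"unknown base: {base!r}")
    if phosphates not in PHOSPHATE_PREFIX:
        raise ValueError("phosphates must be 1, 2 or 3")
    prefix = PHOSPHATE_PREFIX[phosphates]
    # Abbreviation: optional 'd', base letter, M/D/T, then 'P'.
    abbreviation = ("d" if deoxy else "") + base + prefix[0].upper() + "P"
    full_name = ("deoxy" if deoxy else "") + NUCLEOSIDE[base] + " " + prefix + "phosphate"
    return abbreviation, full_name


print(nucleotide_name("A"))                # ('AMP', 'adenosine monophosphate')
print(nucleotide_name("A", deoxy=True))    # ('dAMP', 'deoxyadenosine monophosphate')
print(nucleotide_name("A", phosphates=3))  # ('ATP', 'adenosine triphosphate')
```

The lookup table keeps the irregular nucleoside names (cytidine, uridine, thymidine) explicit rather than trying to derive them from the base names.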
Structure of DNA:
DNA is a polymer of deoxyribonucleotides (or simply deoxynucleotides). It is composed of monomeric units namely deoxyadenylate (dAMP), deoxyguanylate (dGMP), deoxycytidylate (dCMP) and deoxythymidylate (dTMP) (It may be noted here that some authors prefer to use TMP for deoxythymidylate, since it is found only in DNA). The details of the nucleotide structure are given above.
Schematic Representation of Polynucleotides:
The monomeric deoxynucleotides in DNA are held together by 3′, 5′-phosphodiester bridges (Fig. 2.6). DNA (or RNA) structure is often represented in a short-hand form. The horizontal line indicates the carbon chain of the sugar with the base attached to C1′. Near the middle of the horizontal line is the C3′-phosphate linkage while at the other end of the line is the C5′-phosphate linkage (Fig. 2.6).
Chargaff’s Rule of DNA Composition:
Erwin Chargaff, in the late 1940s, quantitatively analysed the DNA hydrolysates from different species. He observed that in all the species he studied, DNA had equal numbers of adenine and thymine residues (A = T) and equal numbers of guanine and cytosine residues (G = C).
This is known as Chargaff’s rule of molar equivalence between the purines and pyrimidines in DNA structure. The significance of Chargaff’s rule was not immediately realised. The double helical structure of DNA derives its strength from Chargaff’s rule. Single-stranded DNA, and RNAs which are usually single-stranded, do not obey Chargaff’s rule. However, double-stranded RNA, which is the genetic material in certain viruses, satisfies Chargaff’s rule.
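Chargaff's rule can be checked on a sequence simply by counting bases. The sketch below is illustrative only: the sequence is hypothetical, the function names are made up for this example, and the duplex is built using the A-T and G-C pairing described in the double helix section.

```python
from collections import Counter

# Watson-Crick partners for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def duplex_base_counts(strand: str) -> Counter:
    """Count bases over both strands of the duplex formed by `strand`."""
    strand = strand.upper()
    partner = "".join(COMPLEMENT[base] for base in strand)
    return Counter(strand + partner)


def obeys_chargaff(counts: Counter) -> bool:
    """True when A = T and G = C in the given base counts."""
    return counts["A"] == counts["T"] and counts["G"] == counts["C"]


single = "ATGGCGTAAC"  # hypothetical single-stranded sequence

print(obeys_chargaff(Counter(single)))             # False: 3 A vs 2 T on one strand
print(obeys_chargaff(duplex_base_counts(single)))  # True: pairing forces A = T, G = C
```

A single strand obeys the rule only by coincidence, whereas any fully base-paired duplex obeys it by construction.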
DNA Double Helix:
The double helical structure of DNA was proposed by James Watson and Francis Crick in 1953 (Nobel Prize, 1962). The elucidation of DNA structure is considered as a milestone in the era of modern biology. The structure of DNA double helix is comparable to a twisted ladder. The salient features of Watson-Crick model of DNA (now known as B-DNA) are given below (Fig. 2.7).
1. The DNA is a right handed double helix. It consists of two polydeoxyribonucleotide chains (strands) twisted around each other on a common axis.
2. The two strands are antiparallel, i.e., one strand runs in the 5′ to 3′ direction while the other runs in the 3′ to 5′ direction. This is comparable to two parallel adjacent roads carrying traffic in opposite directions.
3. The width (or diameter) of a double helix is 20 Å (2 nm).
4. Each turn (pitch) of the helix is 34 Å (3.4 nm) with 10 pairs of nucleotides, each pair placed at a distance of about 3.4 Å.
5. Each strand of DNA has a hydrophilic deoxyribose phosphate backbone (3′-5′ phosphodiester bonds) on the outside (periphery) of the molecule while the hydrophobic bases are stacked inside (core).
6. The two polynucleotide chains are not identical but complementary to each other due to base pairing.
7. The two strands are held together by hydrogen bonds formed by complementary base pairs (Fig. 2.8). The A-T pair has 2 hydrogen bonds while G-C pair has 3 hydrogen bonds. The G ≡ C is stronger by about 50% than A = T.
8. The hydrogen bonds are formed between a purine and a pyrimidine only. If two purines face each other, they would not fit into the allowable space. And two pyrimidines would be too far apart to form hydrogen bonds. The only base arrangements possible in DNA structure, from spatial considerations, are A-T, T-A, G-C and C-G.
9. The complementary base pairing in the DNA helix accounts for Chargaff’s rule. The content of adenine equals that of thymine (A = T) and the content of guanine equals that of cytosine (G = C).
10. The genetic information resides on one of the two strands, known as the coding strand or sense strand. The opposite strand, which serves as the template for RNA synthesis, is the antisense strand. The double helix has (wide) major grooves and (narrow) minor grooves along the phosphodiester backbone. Proteins interact with DNA at these grooves, without disrupting the base pairs and double helix.
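Features 2, 6 and 7 above (antiparallel strands, complementary pairing, and 2 or 3 hydrogen bonds per pair) can be made concrete with a short Python sketch. The function names are hypothetical and the example assumes an unmodified sequence containing only A, T, G and C.

```python
# Watson-Crick partners and hydrogen bonds per base pair (A-T: 2, G-C: 3).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def partner_strand(strand_5_to_3: str) -> str:
    """Return the complementary strand, also written 5' to 3'.

    Because the strands are antiparallel, the complement is reversed
    so that it, too, reads in the 5' to 3' direction.
    """
    paired = [COMPLEMENT[base] for base in strand_5_to_3.upper()]
    return "".join(reversed(paired))


def hydrogen_bonds(strand: str) -> int:
    """Total hydrogen bonds holding the duplex formed by `strand` together."""
    return sum(H_BONDS[base] for base in strand.upper())


print(partner_strand("ATGC"))   # GCAT
print(hydrogen_bonds("ATGC"))   # 10 (2 + 2 + 3 + 3)
```

Reversing the complement is what distinguishes the true partner strand from a simple base-by-base substitution.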
Conformations of DNA Double Helix:
Variation in the conformation of the nucleotides of DNA is associated with conformational variants of DNA. The double helical structure of DNA exists in at least 6 different forms: A to E and Z. Among these, the B, A and Z forms are important (Table 2.2). The B-form of the DNA double helix, described by Watson and Crick, is the predominant form under physiological conditions. Each turn of the B-form has 10 base pairs spanning a distance of 3.4 nm. The width of the double helix is 2 nm.
The A-form is also a right-handed helix. It contains 11 base pairs per turn. There is a tilting of the base pairs by 20° away from the central axis. The Z-form (Z-DNA) is a left-handed helix and contains 12 base pairs per turn. The polynucleotide strands of DNA move in a somewhat ‘zigzag’ fashion, hence the name Z-DNA. It is believed that transition between different helical forms of DNA plays a significant role in regulating gene expression.
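The base pairs per turn quoted above for the A, B and Z forms translate directly into the number of helical turns for a given stretch of DNA and the average rotation per base pair. The sketch below is a simple arithmetic illustration; the function names are hypothetical, and the rotation is an average (360° divided by the base pairs per turn), not a measured value for any particular sequence.

```python
# Base pairs per helical turn and handedness, as given in the text.
BP_PER_TURN = {"A": 11, "B": 10, "Z": 12}
HANDEDNESS = {"A": "right", "B": "right", "Z": "left"}


def helical_turns(base_pairs: int, form: str = "B") -> float:
    """Number of complete-or-partial turns made by `base_pairs` of DNA."""
    return base_pairs / BP_PER_TURN[form]


def average_twist_deg(form: str) -> float:
    """Average rotation per base pair, in degrees (magnitude only)."""
    return 360 / BP_PER_TURN[form]


for form in ("A", "B", "Z"):
    print(form, HANDEDNESS[form],
          f"{helical_turns(660, form):.1f} turns",
          f"{average_twist_deg(form):.1f} deg/bp")
```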
Other Types of DNA Structure:
It is now recognized that besides double helical structure, DNA also exists in certain unusual structures. It is believed that such structures are important for molecular recognition of DNA by proteins and enzymes. This is in fact needed for the DNA to discharge its functions in an appropriate manner. Some selected unusual structures of DNA are briefly described.
In general, adenine-containing DNA tracts are rigid and straight. A bent conformation of DNA occurs when A-tracts are replaced by other bases, or when the helix collapses into the minor groove of an A-tract. Bending in DNA structure has also been reported due to photochemical damage or mis-pairing of bases. Certain antitumor drugs (e.g. cisplatin) produce a bent structure in DNA. Such an altered structure can bind proteins that damage the DNA.
Triple-stranded DNA formation may occur due to additional hydrogen bonds between the bases. Thus, a thymine can selectively form two Hoogsteen hydrogen bonds to the adenine of A-T pair to form T-A-T. Likewise, a protonated cytosine can also form two hydrogen bonds with guanine of G-C pairs that results in C+-G-C. An outline of Hoogsteen triple helix is depicted in Fig. 2.9.
The triple-helical structure is less stable than the double helix. This is because the three negatively charged backbone strands in a triple helix result in increased electrostatic repulsion.
Polynucleotides with very high contents of guanine can form a novel tetrameric structure called G-quartets. These structures are planar and are connected by Hoogsteen hydrogen bonds (Fig. 2.10A). Antiparallel four-stranded DNA structures, referred to as G-tetraplexes, have also been reported (Fig. 2.10B).
The ends of eukaryotic chromosomes, namely telomeres, are rich in guanine and therefore form G-tetraplexes. In recent years, telomeres have become targets for anticancer chemotherapies. G-tetraplexes have been implicated in the recombination of immunoglobulin genes, and in the dimerization of the genomic RNA of the human immunodeficiency virus (HIV).
The Size of DNA Molecule-Units of Length:
DNA molecules are huge in size. On average, a base pair of B-DNA with a thickness of 0.34 nm has a molecular weight of 660 daltons. For the measurement of lengths, the double-stranded structure of DNA is considered, and length is expressed in base pairs (bp). A kilobase pair (kb) is 10³ bp, a megabase pair (Mb) is 10⁶ bp and a gigabase pair (Gb) is 10⁹ bp.
The kb, Mb and Gb relations may be summarized as follows:
1 kb = 1000 bp
1 Mb = 1000 kb = 1,000,000 bp
1 Gb = 1000 Mb = 1,000,000,000 bp
It may be noted here that the lengths of RNA molecules (unlike those of DNA molecules) cannot be expressed in bp, since most RNAs are single-stranded. The length of DNA varies from species to species, and is usually expressed in terms of base pair composition and contour length. Contour length represents the total length of the genomic DNA in a cell. Some examples of organisms with bp and contour lengths are listed.
i. λ phage virus: 4.8 × 10⁴ bp, contour length 16.5 µm.
ii. E. coli: 4.6 × 10⁶ bp, contour length 1.5 mm.
iii. Diploid human cell (46 chromosomes): 6.0 × 10⁹ bp, contour length 2 meters.
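The contour lengths listed above follow from the 0.34 nm spacing per base pair of B-DNA. As a rough check, the sketch below multiplies the base pair counts by that spacing and by the average mass of 660 daltons per base pair; the function and constant names are hypothetical, and the calculation assumes the whole genome is in the B-form.

```python
RISE_PER_BP_NM = 0.34   # spacing between adjacent base pairs in B-DNA
DALTONS_PER_BP = 660    # average molecular weight of one base pair
UNITS = {"bp": 1, "kb": 10**3, "Mb": 10**6, "Gb": 10**9}


def to_bp(value: float, unit: str) -> float:
    """Convert a length given in bp, kb, Mb or Gb to base pairs."""
    return value * UNITS[unit]


def contour_length_m(base_pairs: float) -> float:
    """Contour length in meters, assuming B-DNA throughout."""
    return base_pairs * RISE_PER_BP_NM * 1e-9


def molecular_weight_da(base_pairs: float) -> float:
    """Approximate molecular weight in daltons."""
    return base_pairs * DALTONS_PER_BP


genomes = {
    "lambda phage": to_bp(48, "kb"),
    "E. coli": to_bp(4.6, "Mb"),
    "diploid human cell": to_bp(6.0, "Gb"),
}

for name, bp in genomes.items():
    print(f"{name}: {contour_length_m(bp):.3g} m")
```

The results come out at roughly 16 micrometers for the phage, about 1.6 mm for E. coli and about 2 m for a diploid human cell.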
It may be noted that the genomic DNA is usually much longer than the cell or nucleus containing it. For instance, in humans, a 2-meter long DNA is packed compactly in a nucleus of about 10 µm diameter. The genomic DNA may exist in linear or circular forms. Most DNAs in bacteria exist as closed circles. This includes the DNA of bacterial chromosomes and the extrachromosomal DNA of plasmids. Mitochondria and chloroplasts of eukaryotic cells also contain circular DNA.
Chromosomal DNAs in higher organisms are mostly linear. Individual human chromosomes contain a single DNA molecule with variable sizes compactly packed. Thus the smallest chromosome contains 34 Mb while the largest one has 263 Mb.
Denaturation of DNA Strands:
The two strands of DNA helix are held together by hydrogen bonds. Disruption of hydrogen bonds (by change in pH or increase in temperature) results in the separation of polynucleotide strands. This phenomenon of loss of helical structure of DNA is known as denaturation (Fig. 2.11). The phosphodiester bonds are not broken by denaturation. Loss of helical structure can be measured by increase in absorbance at 260 nm (in a spectrophotometer).
Melting temperature (Tm) is defined as the temperature at which half of the helical structure of DNA is lost. Since G-C base pairs are more stable (due to 3 hydrogen bonds) than A-T base pairs (2 hydrogen bonds), the Tm is greater for DNAs with higher G-C content.
Thus, the Tm is 65°C for 35% G-C content while it is 70°C for 50% G-C content. Formamide destabilizes the hydrogen bonds of base pairs and, therefore, lowers the Tm. This chemical compound is effectively used in recombinant DNA experiments. Renaturation (or re-annealing) is the process in which the separated complementary DNA strands re-form a double helix.
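The dependence of Tm on G-C content can be illustrated with a small calculation. The sketch below computes the G-C percentage of a sequence and then draws a straight line through the two values quoted above (65°C at 35% and 70°C at 50%). The linear relation is an assumption made for illustration; real melting temperatures also depend on salt concentration, DNA length and denaturants such as formamide, none of which are modelled here. The function names are hypothetical.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C residues in a sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


# The two (G-C %, Tm in deg C) points quoted in the text.
REF_LOW = (35.0, 65.0)
REF_HIGH = (50.0, 70.0)


def estimate_tm(gc: float) -> float:
    """Tm from a straight line through the two quoted points (assumed linear)."""
    (x0, y0), (x1, y1) = REF_LOW, REF_HIGH
    slope = (y1 - y0) / (x1 - x0)  # deg C per percent G-C
    return y0 + slope * (gc - x0)


seq = "ATGGCGTAAC"  # hypothetical sequence, 50% G-C
print(gc_percent(seq))
print(round(estimate_tm(gc_percent(seq)), 1))
```

Values far outside the 35 to 50% range are extrapolations and should be treated with particular caution.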
Organization of DNA in the Cell:
As already stated, the double-stranded DNA helix in each chromosome has a length that is thousands of times the diameter of the nucleus. For instance, in humans, a 2-meter long DNA is packed in a nucleus of about 10 µm diameter! This is made possible by the compact and marvelous packaging and organization of DNA inside the cell.
Organization of prokaryotic DNA:
In prokaryotic cells, the DNA is organized as a single chromosome in the form of a double-stranded circle. These bacterial chromosomes are packed in the form of nucleoids, by interaction with proteins and certain cations (polyamines).
Organization of eukaryotic DNA:
In the eukaryotic cells, the DNA is associated with various proteins to form chromatin which then gets organized into compact structures namely chromosomes (Fig. 2.12).
The DNA double helix is wrapped around core proteins, namely histones, which are basic in nature. The core is composed of two molecules each of histones H2A, H2B, H3 and H4. Each core with two turns of DNA wrapped around it (approximately 150 bp) is termed a nucleosome, the basic unit of chromatin. Nucleosomes are separated by spacer DNA to which histone H1 is attached (Fig. 2.13).
This continuous string of nucleosomes, representing beads-on-a string form of chromatin is termed as 10 nm fiber. The length of the DNA is considerably reduced by the formation of 10 nm fiber. This 10-nm fiber is further coiled to produce 30-nm fiber which has a solenoid structure with six nucleosomes to every turn.
These 30-nm fibers are further organized into loops by anchoring the fiber at A/T-rich regions, namely scaffold-associated regions (SARs), to a protein scaffold. During the course of mitosis, the loops are further coiled, and the chromosomes condense and become visible.
Structure of RNA:
RNA is a polymer of ribonucleotides held together by 3′, 5′-phosphodiester bridges. Although RNA has certain similarities with DNA in structure, the two differ in several specific ways:
1. Pentose sugar:
The sugar in RNA is ribose in contrast to deoxyribose in DNA.
2. Pyrimidine:
RNA contains the pyrimidine uracil in place of thymine (in DNA).
3. Single strand:
RNA is usually a single-stranded polynucleotide. However, this strand may fold at certain places to give a double-stranded structure, if complementary base pairs are in close proximity.
4. Chargaff’s rule—not obeyed:
Due to the single-stranded nature, there is no specific relation between purine and pyrimidine contents. Thus the guanine content is not equal to cytosine (as is the case in DNA).
5. Susceptibility to alkali hydrolysis:
Alkali can hydrolyse RNA to 2′, 3′-cyclic diesters. This is possible due to the presence of a hydroxyl group at 2′ position. DNA cannot be subjected to alkali hydrolysis due to lack of this group.
6. Orcinol colour reaction:
RNAs can be histologically identified by orcinol colour reaction due to the presence of ribose.
Types of RNA:
The three major types of RNAs with their respective cellular composition are given below
1. Messenger RNA (mRNA): 5-10%
2. Transfer RNA (tRNA): 10-20%
3. Ribosomal RNA (rRNA): 50-80%
Besides the three RNAs referred above, other RNAs are also present in the cells. These include heterogeneous nuclear RNA (hnRNA), small nuclear RNA (snRNA), small nucleolar RNA (snoRNA) and small cytoplasmic RNA (scRNA). The major functions of these RNAs are given in Table 2.3.
The RNAs are synthesized from DNA, and are primarily involved in the process of protein biosynthesis. The RNAs vary in their structure and function. A brief description on the major RNAs is given.
Messenger RNA (mRNA):
The mRNA is synthesized in the nucleus (in eukaryotes) as heterogeneous nuclear RNA (hnRNA). hnRNA, on processing, liberates the functional mRNA which enters the cytoplasm to participate in protein synthesis. mRNA has high molecular weight with a short half-life.
The eukaryotic mRNA is capped at the 5′-terminal end by 7-methylguanosine triphosphate. It is believed that this cap helps to prevent the hydrolysis of mRNA by 5′-exonucleases. Further, the cap may be also involved in the recognition of mRNA for protein synthesis.
The 3′-terminal end of mRNA contains a polymer of adenylate residues (20-250 nucleotides) which is known as the poly(A) tail. This tail may provide stability to mRNA, besides protecting it from attack by 3′-exonucleases. mRNA molecules often contain certain modified bases such as 6-methyladenylates in the internal structure.
Transfer RNA (tRNA):
Transfer RNA (soluble RNA) molecule contains 71-80 nucleotides (mostly 75) with a molecular weight of about 25,000. There are at least 20 species of tRNAs corresponding to 20 amino acids present in protein structure. The structure of tRNA (for alanine) was first elucidated by Holley. The structure of tRNA depicted in Fig. 2.14 resembles that of a clover leaf. tRNA contains mainly four arms, each arm with a base paired stem.
1. The acceptor arm:
This arm is capped with a sequence CCA (5′ to 3′). The amino acid is attached to the acceptor arm.
2. The anticodon arm:
This arm, with the three specific nucleotide bases (anticodon), is responsible for the recognition of triplet codon of mRNA. The codon and anticodon are complementary to each other.
3. The D arm:
It is so named due to the presence of dihydrouridine.
4. The TΨC arm:
This arm contains a sequence of T, pseudouridine (represented by psi, Ψ) and C.
5. The variable arm:
This arm is the most variable in tRNA. Based on this variability, tRNAs are classified into 2 categories:
(a) Class I tRNAs:
The most predominant (about 75%) form with 3-5 base pairs length.
(b) Class II tRNAs:
They contain 13-20 base pair long arm.
Base pairs in tRNA:
The structure of tRNA is maintained due to the complementary base pairing in the arms.
The four arms with their respective base pairs are given below:
The acceptor arm – 7 bp
The TΨC arm – 5 bp
The anticodon arm – 5 bp
The D arm – 4 bp
Ribosomal RNA (rRNA):
The ribosomes are the factories of protein synthesis. The eukaryotic ribosomes are composed of two major nucleoprotein complexes, the 60S subunit and the 40S subunit. The 60S subunit contains 28S rRNA, 5S rRNA and 5.8S rRNA while the 40S subunit contains 18S rRNA. The function of rRNAs in ribosomes is not clearly known. It is believed that they play a significant role in the binding of mRNA to ribosomes and in protein synthesis.
The various other RNAs and their functions are summarised in Table 2.3.
In certain instances, the RNA component of a ribonucleoprotein (RNA in association with protein) is catalytically active. Such RNAs are termed ribozymes. At least five distinct species of RNA that act as catalysts have been identified. Three are involved in the self-processing reactions of RNAs while the other two are regarded as true catalysts (RNase P and rRNA). Ribonuclease P (RNase P) is a ribozyme containing protein and RNA components. It cleaves tRNA precursors to generate mature tRNA molecules.
RNA molecules are known to adopt tertiary structure just like proteins (i.e. enzymes). The specific conformation of RNA may be responsible for its function as a biocatalyst. It is believed that ribozymes (RNAs) were functioning as catalysts before the occurrence of protein enzymes, during the course of evolution.