Nucleic acid is one of the principal foundations of biology. It is a collective term used to describe all DNA (deoxyribonucleic acid) and RNA (ribonucleic acid), which together compose the "messages" directing cellular development and homeostasis. Nucleic acid gets its name from nucleus, as DNA is located in the cell's nucleus, and acid, as DNA and RNA molecules are composed of chains of acidic residues.



Nucleosides are precursors to the fundamental units of nucleic acid. They are composed of two key elements:

1. Ribose

The backbone of nucleic acid is the monosaccharide ribose, which is composed of a 5-membered ring. The carbon atoms in and attached to this ring are numbered from 1 to 5. The standard nomenclature is to number them with a "prime" ('), hence 1' to 5' (read "one prime to five prime"). This practice distinguishes them from the numbered atoms on the attached functional groups.

2. Purine or Pyrimidine

To create a nucleoside, a ribose ring is covalently bound to another functional group at its 1' position. There are two classes of functional groups used to create a nucleoside:

The purines are bicyclic compounds derived from the basic purine structure, which is a six-membered ring fused to a five-membered ring. The various functional groups that are attached to the ring give each purine its name. Adenine and guanine are the two purines found in nucleic acid. When covalently attached to the 1' position of a ribose ring, they become known as adenosine and guanosine, respectively.

Pyrimidines are monocyclic compounds composed of a six-membered ring. Like the purines, the covalently attached functional groups give them their names: cytosine, thymine, and uracil. When attached to a ribose ring at its 1' position, the molecules become known as cytidine, thymidine, and uridine, respectively.

All purines and pyrimidines are, by nature, basic. This brought about the collective nomenclature of "base", which is often used to refer to the molecule occupying a given position. This is due to the fact that the presence of the purine or pyrimidine (or other) basic residue at the 1' position of the ribose ring is what defines its unique functions and shape, and therefore is regarded as the "point of interest" on the nucleic acid molecule. The ribose rings and phosphodiester bond become secondary, as they provide the repeating backbone.


Nucleotides are derivatives of nucleosides, and are composed of a ribose ring, a purine or pyrimidine covalently attached at the 1' position, and a phosphate group covalently attached at the 5' position. The five common nucleotides are ATP (adenosine triphosphate), GTP (guanosine triphosphate), CTP (cytidine triphosphate), TTP (thymidine triphosphate), and UTP (uridine triphosphate).
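The naming scheme described so far (base, then nucleoside, then nucleotide triphosphate) can be summarized as a small lookup table. The following is only an illustrative sketch; the names `NAMES`, `PURINES`, and `describe` are hypothetical and chosen for this example.

```python
# Base -> (nucleoside, nucleotide triphosphate abbreviation), as named in the text.
NAMES = {
    "adenine": ("adenosine", "ATP"),
    "guanine": ("guanosine", "GTP"),
    "cytosine": ("cytidine", "CTP"),
    "thymine": ("thymidine", "TTP"),
    "uracil": ("uridine", "UTP"),
}

# The two bicyclic bases; the remaining three are monocyclic pyrimidines.
PURINES = {"adenine", "guanine"}


def describe(base):
    """Return a one-line summary of a base, its ring class, and its derivatives."""
    nucleoside, triphosphate = NAMES[base]
    ring_class = "purine" if base in PURINES else "pyrimidine"
    return f"{base} ({ring_class}) -> {nucleoside} -> {triphosphate}"


for base in NAMES:
    print(describe(base))
```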

DNA versus RNA

The presence or absence of another hydroxyl on the ribose ring (this time at the 2' position) determines whether the molecule built from it will be DNA or RNA. The deoxy in deoxyribonucleic acid (DNA) indicates the absence of the 2' hydroxyl, which is present in ribonucleic acid (RNA). Furthermore, RNA prefers uracil to thymine, and uses uridine as a "replacement" for thymidine in the RNA transcript that is read from the DNA template.


Nucleotides utilize the available 5' phosphate and 3' hydroxyl groups on separate molecules to polymerize into long chains of nucleotides, collectively known as nucleic acid. This architecture is fundamental to the genetic code, because it endows the DNA or RNA molecule with the ability to encode information via unique sequences defined by the relative positioning of the five available nucleotides in the chain. Because individual genes can often be in the range of several thousand nucleotides in length, the possibilities for variation in the code are virtually endless.
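To give a sense of scale for "virtually endless", the number of distinct sequences of a given length can be computed directly. This sketch assumes that each position in a single DNA (or RNA) chain is filled by one of four bases, since thymine and uracil do not normally occur together in the same molecule; the function name `count_sequences` is hypothetical.

```python
def count_sequences(length, alphabet_size=4):
    """Number of distinct chains of the given length.

    Assumes every position can independently hold any of
    `alphabet_size` nucleotides (4 for a DNA or an RNA chain).
    """
    return alphabet_size ** length


print(count_sequences(3))               # 64 possible three-nucleotide sequences
print(len(str(count_sequences(1000))))  # 603 decimal digits for a 1000-nucleotide chain
```

Even a modest 1000-nucleotide chain therefore has far more possible sequences than could ever be enumerated.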

Base Pairing

Base pairing is a key aspect of nucleic acid structure. Each base leaves hydrogen bond acceptors and donors exposed, and these provide "anchoring points" for a complementary strand of nucleic acid. This interaction is termed base pairing because of the way in which the two molecules of nucleic acid interact, i.e. by "pairing" individual bases with each other.

The pairing is very specific, and is driven by thermodynamics. The pairing rules are as follows:

Base One-letter
Nucleotide Pairs with
T, U, I
G, I
A, I
A, C, U
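As an illustration of how pairing rules determine a complementary strand, the sketch below uses only the standard Watson-Crick pairs for DNA (A with T, G with C). This is a deliberate simplification: it ignores uracil and inosine (I), and the names `DNA_COMPLEMENT`, `complement`, and `reverse_complement` are hypothetical.

```python
# Standard Watson-Crick pairs for DNA only (simplifying assumption).
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    """Return the base that pairs with each position of the strand."""
    return "".join(DNA_COMPLEMENT[base] for base in strand)


def reverse_complement(strand):
    """Return the complementary strand read in its own 5' to 3' direction."""
    return complement(strand)[::-1]


print(complement("ATGC"))          # TACG
print(reverse_complement("ATGC"))  # GCAT
```

The reversal reflects the fact that the two paired strands run in opposite directions.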

Gene Expression

Gene expression is the physical act of converting the information contained in nucleic acid into actual specific functions within the cell. This process is highly regulated, highly conserved between species, and extremely precise.


Transcription is the act of transcribing DNA into an RNA copy. The DNA sequence is faithfully preserved in the RNA copy, with only one variance: thymidines (T) are replaced with uridines (U). The "code" contained within the DNA template is therefore "transcribed", as it is a verbatim representation of the original message in a new molecule. The resulting RNA is then either used to direct protein synthesis or integrated into complexes that direct transcriptional regulation.
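At the level of sequence text, the substitution described above is a single replacement. The sketch below assumes the input is the DNA strand whose sequence matches the RNA product (often called the coding strand); the function name `transcribe` is hypothetical.

```python
def transcribe(coding_strand):
    """Return the RNA sequence corresponding to a DNA coding strand.

    The sequence is copied verbatim except that every T becomes U.
    """
    return coding_strand.upper().replace("T", "U")


print(transcribe("ATGGCT"))  # AUGGCU
```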


Translation is the act of converting the information contained in an RNA template (that resulted from transcription) into a protein of specified sequence. Because the code contained within the RNA molecule must be read by machinery that decides which amino acid to put in each position within the polypeptide chain, the RNA message must be "translated". This is done by reading the nucleotides in the RNA molecule in consecutive sets of three, called codons. Each codon specifies a particular amino acid, as defined by the genetic code.
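Reading a message in codons can be sketched as follows. The table here is a small excerpt of the standard genetic code, included only for illustration; the full code has 64 codons. The names `CODON_TABLE`, `codons`, and `translate` are hypothetical, and unknown codons are reported as "?" rather than guessed.

```python
# A few entries from the standard genetic code (excerpt, not the full table).
CODON_TABLE = {
    "AUG": "Met",
    "UUU": "Phe", "UUC": "Phe",
    "GGU": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "UGG": "Trp",
    "AAA": "Lys",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}


def codons(rna):
    """Split an RNA sequence into consecutive triplets, dropping any remainder."""
    usable = len(rna) - len(rna) % 3
    return [rna[i:i + 3] for i in range(0, usable, 3)]


def translate(rna):
    """Map each codon to an amino acid, halting at the first stop codon."""
    peptide = []
    for codon in codons(rna):
        residue = CODON_TABLE.get(codon, "?")  # "?" marks codons outside the excerpt
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


print(codons("AUGUUUGGCUAA"))     # ['AUG', 'UUU', 'GGC', 'UAA']
print(translate("AUGUUUGGCUAA"))  # ['Met', 'Phe', 'Gly']
```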


Because of the negative charge carried by the phosphodiester bonds comprising the nucleic acid backbone, it is possible to force nucleic acid to physically migrate through an applied electric field. Using a gel matrix containing pores of a specific diameter, DNA molecules of different size can be separated in such an electric field by applying a constant voltage for a defined period of time. Larger DNA molecules will move more slowly than smaller ones, thereby defining a "banding pattern", with each distinct band representing a particular size. This process is known as electrophoresis, and is a key aspect of molecular biological analysis.
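The relationship between fragment size and band position can be illustrated with a very simple model. This sketch only orders fragments by length and makes no attempt to predict actual migration distances, which depend on the gel, the voltage, and the run time; the function name `banding_pattern` is hypothetical.

```python
def banding_pattern(fragment_lengths):
    """Return the distinct band sizes, ordered from the loading well outward.

    Fragments of equal length run together as one band. Larger fragments
    move more slowly, so they stay closest to the well and are listed first.
    """
    return sorted(set(fragment_lengths), reverse=True)


# Four fragments, two of them the same size, give three bands.
print(banding_pattern([500, 100, 500, 2000]))  # [2000, 500, 100]
```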