Protein is a general term for polypeptides, which are polymers of amino acids joined by peptide bonds. Proteins take part in the vast majority of cellular processes and are often described as the building blocks of life.


The synthesis of proteins is controlled by the cell's transcription and translation machinery. DNA in the cell's nucleus is transcribed into messenger RNA (mRNA), which is then translated into protein. This process is highly regulated and evolutionarily conserved, as it relies on "codes" embedded in sequences contained in the DNA and RNA molecules.
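The transcription step can be illustrated with a minimal Python sketch. It assumes the DNA template strand is supplied as a 5'→3' string of A, C, G and T, and it ignores promoters, introns, and RNA processing entirely. The function name `transcribe` and the example sequence are hypothetical, chosen only for illustration.

```python
# Base-pairing rules for copying a DNA template into RNA
# (adenine in the template pairs with uracil in the RNA).
DNA_TO_RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_5to3):
    """Return the mRNA (written 5'->3') copied from a DNA template strand.

    The template is given 5'->3', so the mRNA is its reverse complement,
    with uracil (U) used in place of thymine (T).
    """
    template = template_5to3.upper()
    return "".join(DNA_TO_RNA_COMPLEMENT[base] for base in reversed(template))


# Hypothetical 12-nucleotide template strand
print(transcribe("TTAAAAGGCCAT"))  # AUGGCCUUUUAA
```

Any character other than A, C, G or T raises a `KeyError`, which keeps malformed input from passing silently in this sketch.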

Protein synthesis occurs in the cytoplasm, where ribosomes translate mRNA sequences into chains of amino acids. The amino acid that is incorporated into each position in the growing chain is determined by the codes in the mRNA, termed codons. Codons contain a triplet of nucleotides, and each triplet combination encodes a particular amino acid, as dictated by the genetic code. In this way, the sequence that was copied from the original DNA template is faithfully converted into the appropriate polypeptide.
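The codon-by-codon reading described above can be sketched in Python. This is a simplified illustration, not a model of the ribosome: it assumes the standard genetic code, starts at the first AUG it finds, and stops at the first in-frame stop codon. The names `CODON_TABLE` and `translate` and the example mRNA are hypothetical.

```python
from itertools import product

# Standard genetic code, written compactly: codons are enumerated with each
# base position running through U, C, A, G (first position varying slowest),
# and AMINO_ACIDS lists the one-letter amino acid for each codon in that order.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate an mRNA string into a one-letter amino acid sequence.

    Reading begins at the first AUG (start codon) and proceeds one triplet
    at a time until a stop codon or the end of the sequence is reached.
    """
    mrna = mrna.upper()
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so nothing is translated
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


# Hypothetical mRNA: AUG GCC UUU UAA, flanked by untranslated bases
print(translate("GGAUGGCCUUUUAAGG"))  # MAF
```

Because each amino acid is looked up from a fixed triplet table, changing a single nucleotide can change the amino acid at that position or introduce a premature stop.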


Proteins interact with numerous different molecules in the cell, and do so in order to regulate their functions in very specific and evolutionarily conserved manners. Even a single change in the amino acid sequence of a protein can seriously disrupt its ability to interact with its cofactors, which can have lethal consequences. This specificity also makes a molecular biologist's life easier by allowing a particular protein to be used as a highly specific "bait" for its cofactors, opening the door for such methods as immunohistochemistry, immunoprecipitation, yeast two-hybrid systems, electrophoretic mobility shift assays (EMSA), and inducible expression.

Protein:Nucleic Acid

Many proteins are designed to bind to nucleic acid in order to regulate the processes of transcription and translation. Oftentimes, proteins will recognize very specific sequences of nucleotides, making them an essential factor in gene expression. This interaction is often dependent upon the ability of the protein to interact with cofactors, which can be other nucleic acid sequences, other proteins, or both.

The protein:nucleic acid interaction is driven by the favorable creation of binding sites in both the protein and the nucleic acid sequences. This is mediated by interactions between hydrogen bond donors and acceptors in each, where the nucleic acid's nucleotide bases create a unique shape and chemistry, while the protein's amino acid side chains do the same. Where the two coincide favorably, energy is released, which oftentimes is enough to bend DNA by as much as 90°, creating favorable conditions for transcription machinery.


In addition to DNA and RNA, proteins often bind to other proteins. This occurs largely through noncovalent contacts, such as hydrogen bonding, between amino acids in separate proteins. Because proteins often form large complexes and polymers, this is an essential aspect of protein function, and defines an enormous number of interactions.

Just as with the protein:nucleic acid interaction, the protein:protein interaction relies upon specific pairing between hydrogen bond donors and acceptors. Here, though, both sides of the interface are made up of amino acids. These interactions are often very strong, and mediate some of the basic processes in cellular movement and shape.


Because of the sheer number of proteins generated by just one cell in a short period of time, processes have evolved to recycle the amino acids contained in "used" proteins in order to generate new proteins with minimal energy expenditure. Many proteins are degraded by large protein complexes known as proteasomes, which contain enzymes (proteases) that cleave peptide bonds. The resulting peptide fragments are broken down further into individual amino acids, which can then be used to create new proteins.


Because proteins are so numerous and diverse, they can be classified according to their general functions within a cell.


Enzymes are the best known type of protein. They are typically large molecules (often several hundred amino acids) with catalytic domains that are of the correct shape, size, and chemistry to carry out the chemical reactions they catalyze. Enzymes perform myriad processes, from transcription to proteolysis, and from cellular respiration to apoptosis.

No enzyme functions independently, however. An enzyme's catalytic domain must be occupied by the proper substrate, and the enzyme must be maintained under appropriate conditions and with the appropriate cofactors in order to function. Furthermore, because many enzymes are composed of distinct subunits held together by noncovalent interactions such as hydrogen bonding, the enzyme cannot function without all subunits in place, which oftentimes requires specific ranges of pH, salt concentration, and temperature.

Architectural Proteins

Some proteins are destined to become components of cellular architecture. The cell's cytoskeleton provides physical support to the cell and to its myriad processes by providing anchoring points and mechanisms for numerous interactions. Mutations in architectural proteins can be severely detrimental to cell function, and are the cause of many myopathies and skeletal defects in humans.

Transcription Factors

Transcription factors (TFs) are proteins that associate with DNA (through direct binding or through binding to other proteins) in order to regulate its transcription. TFs often bind to highly conserved and unique recognition sequences in DNA, adding a great deal of specificity to their functions. The effect of the TF-DNA interaction can be to repress, activate, enhance, or attenuate transcription of a gene. This normally occurs through interactions between the TF and cofactors that induce conformational changes in the chromatin and/or recruit further cofactors to do the same.
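Searching a DNA sequence for a recognition sequence can be sketched as a simple pattern scan. The example below uses the E-box consensus CANNTG (bound by basic helix-loop-helix transcription factors) purely as an illustration; it is not taken from this article. The function name `find_sites`, the partial IUPAC table, and the test sequence are all hypothetical. Real binding depends on much more than a consensus match, including chromatin state and cofactors.

```python
import re

# Subset of IUPAC nucleotide codes, mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "N": "[ACGT]",
}


def find_sites(dna, motif):
    """Return (position, matched_sequence) for every motif match in dna.

    Positions are 0-based. A lookahead is used so that overlapping matches
    are also reported. Only the strand supplied is scanned.
    """
    regex = "".join(IUPAC[base] for base in motif.upper())
    pattern = re.compile("(?=(" + regex + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(dna.upper())]


# Hypothetical sequence containing two E-box-like sites
print(find_sites("GGCACGTGTTCAGCTGAA", "CANNTG"))
# [(2, 'CACGTG'), (10, 'CAGCTG')]
```

The CANNTG consensus reads the same on both strands, so scanning one strand is enough here; a non-palindromic motif would also need a scan of the reverse complement.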


Antibodies are a special class of proteins that contain highly specific domains that recognize particular features (epitopes) of target molecules (antigens), often short amino acid sequences in other proteins. The purpose of this interaction is to provide the organism a means of defending itself against invasion by recognizing protein sequences on invading pathogens in order to target them for destruction without affecting other cellular processes. Antibodies against a given antigen are produced in large quantities by the host organism only after exposure to that antigen (for example, upon infection), reducing the need to re-process unused antibodies.