Genetic Code
Some forensic identification techniques that detect living organisms or their products (e.g., toxins) rely on the detection of genetic sequences within the organism's genetic material. These tests can be exquisitely sensitive, allowing the detection of only a few organisms.
For example, tests have established that fewer than a dozen Escherichia coli bacteria can be detected in samples such as food and water. To put that in perspective, over a million bacterial cells will fit into the period at the end of this sentence.
A basic grasp of the genetic code is essential to understanding the molecular basis of the advanced deoxyribonucleic acid (DNA) and genetic tests that are increasingly important in forensic science and identification technology.
The genetic information that is passed on from parent to offspring is carried by the DNA of a cell. The genes on the DNA code for specific proteins that determine all aspects of the organism. For a gene to produce its protein, the gene must first be copied from DNA into RNA (specifically, a type of RNA called messenger RNA, or mRNA) in a process known as transcription. Translation is the process in which the genetic information carried by the mRNA directs the synthesis of proteins from amino acids. The primary structure of the protein is determined by the nucleotide sequence in the mRNA.
The elements of the encoding system, the nucleotides, differ only in which of four bases they carry. In DNA these are adenine (A), guanine (G), thymine (T), and cytosine (C); in RNA, uracil (U) is used instead of thymine. Thus RNA contains U in the place of T.
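The substitution of uracil for thymine can be illustrated with a minimal Python sketch. It assumes the input is the coding (sense) strand of the DNA, so the mRNA has the same sequence with U written for T; the function name transcribe and the example sequence are hypothetical, chosen only for illustration.

```python
# Minimal sketch: derive an mRNA sequence from a DNA coding strand.
# Assumption: the input is the coding (sense) strand, so the mRNA is
# the same sequence with uracil (U) written in place of thymine (T).
def transcribe(coding_strand: str) -> str:
    dna = coding_strand.upper()
    unexpected = set(dna) - set("ACGT")
    if unexpected:
        raise ValueError(f"not a DNA base: {sorted(unexpected)}")
    return dna.replace("T", "U")


print(transcribe("ATGTTTCCCTAA"))  # AUGUUUCCCUAA
```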
Proteins found in nature consist of 20 naturally occurring amino acids. One important question is, how can four nucleotides code for 20 amino acids? If a single nucleotide coded for one amino acid, then only four amino acids could be provided for. Alternatively, if two nucleotides specified one amino acid, then there could be a maximum of 16 (4²) possible arrangements. If, however, three nucleotides coded for one amino acid, then there would be 64 (4³) possible arrangements, more than enough to account for all 20 naturally occurring amino acids. This last scheme, proposed by the Russian-born physicist George Gamow (1904–1968), proved to be correct.
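The counting argument can be checked by listing every possible "word" of one, two, and three bases. The short Python sketch below is only an illustration; the names BASES and count_words are hypothetical.

```python
from itertools import product

BASES = "ACGU"  # the four RNA bases


def count_words(length: int) -> int:
    """Count every ordered arrangement of `length` bases."""
    return sum(1 for _ in product(BASES, repeat=length))


for n in (1, 2, 3):
    total = count_words(n)
    # 20 amino acids need at least 20 distinct code words.
    print(n, total, "enough" if total >= 20 else "too few")
```

Only the three-base arrangement yields at least 20 distinct words, which is the reasoning behind the triplet proposal.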
It is now well known that every amino acid is coded by at least one nucleotide triplet, or codon, and that some triplets function as instructions for the initiation or termination of translation. Three codons in mRNA, UAA, UGA, and UAG, are termination codons, while AUG is the translation start codon.
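How start and stop codons bracket a protein-coding message can be sketched in a few lines of Python. The example builds the standard codon table from a compact 64-letter string (one-letter amino acid codes, with * marking a stop) and reads triplets from the first AUG until a termination codon. The function name translate and the sample mRNA are hypothetical, and treating the first AUG as the start site is a simplifying assumption; real cells use additional signals to choose it.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
# One-letter amino acid codes; "*" marks a termination codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) a stop codon."""
    start = mrna.find("AUG")  # simplifying assumption: first AUG is the start
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # UAA, UAG, or UGA
            break
        protein.append(amino_acid)
    return "".join(protein)


# Codons read: AUG UUU CCC UCU UAA -> Met, Phe, Pro, Ser, stop
print(translate("GGAUGUUUCCCUCUUAAGG"))  # MFPS
```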
The genetic code was solved between 1961 and 1963. The American scientist Marshall Nirenberg (1927–2010), working with his colleague Heinrich Matthaei, made the first breakthrough when they discovered how to make synthetic mRNA. They found that if the nucleotides of RNA carrying the four bases A, G, C, and U were mixed in the presence of the enzyme polynucleotide phosphorylase, a single-stranded RNA was formed in the reaction, with the nucleotides being incorporated at random. This offered the possibility of creating specific mRNA sequences and then seeing which amino acids they would specify. The first synthetic mRNA polymer obtained contained only uracil (U), and when mixed in vitro with the protein-synthesizing machinery of Escherichia coli it produced polyphenylalanine, a chain consisting only of phenylalanine. From this it was concluded that the triplet UUU coded for phenylalanine. Similarly, a pure cytosine (C) RNA polymer produced only the amino acid proline, so the corresponding codon for proline had to be CCC.
This type of analysis was refined when nucleotides were mixed in different proportions in the synthetic mRNA and a statistical analysis was used to determine the amino acids produced. It was quickly found that a particular amino acid could be specified by more than one codon. Thus, the amino acid serine could be produced from any one of the combinations UCU, UCC, UCA, or UCG. For this reason the genetic code is said to be degenerate, meaning that each of the 64 possible triplets has some meaning within the code and that several codons may encode a single amino acid.
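The degeneracy of the code can be made concrete by grouping the 64 codons by the amino acid they specify. The Python sketch below uses the standard codon table written as a compact 64-letter string (one-letter amino acid codes, * for stop); the variable names are hypothetical. In the standard table serine is also specified by AGU and AGC, in addition to the four UC-codons named above.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group codons by the amino acid (or stop signal) they encode.
codons_for = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    codons_for[amino_acid].append(codon)

print(CODON_TABLE["UUU"], CODON_TABLE["CCC"])  # F P (phenylalanine, proline)
print(codons_for["S"])  # ['UCU', 'UCC', 'UCA', 'UCG', 'AGU', 'AGC']

# Amino acids with only a single codon in the standard table.
print(sorted(aa for aa, cs in codons_for.items() if len(cs) == 1))  # ['M', 'W']
```

Every triplet appears in the table, and all amino acids except methionine (M) and tryptophan (W) have more than one codon.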
This work confirmed the ideas of the British scientists Francis Crick (1916–2004) and Sydney Brenner (1927–2019). Brenner and Crick were working with mutations in the bacterial virus bacteriophage T4 and found that the deletion of a single nucleotide could abolish the function of a specific gene. However, a second mutation, in which a nucleotide was inserted at a different but nearby position, restored the function of that gene. These two mutations are said to be suppressors of each other, meaning that they cancel each other's mutant properties. It was concluded from this that the genetic code is read sequentially, starting from a fixed point in the gene. The insertion or deletion of a nucleotide shifts the reading frame in which succeeding nucleotides are read as codons, and such a change was thus termed a frameshift mutation. It was also found that whereas two closely spaced deletions, or two closely spaced insertions, could not suppress each other, three closely spaced deletions or insertions could do so. These observations established the triplet nature of the genetic code.
The reading frame of a sequence is the way in which the sequence is divided into triplets, and it is determined by the precise point at which translation is initiated. For example, the sequence CATCATCAT can be read CAT CAT CAT or C ATC ATC AT or CA TCA TCA T in the three possible reading frames. Sometimes, as in particular bacterial viruses, genes have been found that are contained within other genes. These are translated in different reading frames, so the amino acid sequences of the proteins encoded by them are different. Such economy of genetic material is, however, quite rare.
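The effect of reading frames and frameshift mutations can be reproduced with a short Python sketch. It splits a sequence into triplets from a chosen offset, then shows that a single deletion shifts every downstream codon, that a nearby insertion restores the original frame, and that three nearby deletions do the same. The helper names codons and delete, the repeating CAT test sequence, and the mutation positions are hypothetical choices for illustration, not a model of the actual T4 experiments.

```python
def codons(seq: str, frame: int = 0) -> list[str]:
    """Split seq into complete triplets, starting at offset `frame`."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def delete(seq: str, position: int) -> str:
    """Return seq with the nucleotide at `position` removed."""
    return seq[:position] + seq[position + 1:]


# The three reading frames of CATCATCAT (incomplete triplets are dropped).
for frame in range(3):
    print(frame, codons("CATCATCAT", frame))

gene = "CATCATCATCATCAT"

# One deletion shifts every codon after it.
one_deletion = delete(gene, 3)
print(codons(one_deletion))  # ['CAT', 'ATC', 'ATC', 'ATC']

# A nearby insertion restores the frame for the rest of the gene.
suppressed = one_deletion[:6] + "G" + one_deletion[6:]
print(codons(suppressed))  # ['CAT', 'ATC', 'GAT', 'CAT', 'CAT']

# Three nearby deletions also bring the reading back into frame.
three_deletions = gene
for position in (3, 5, 7):
    three_deletions = delete(three_deletions, position)
print(codons(three_deletions))  # ['CAT', 'ATA', 'TAT', 'CAT']
```

In both rescued sequences only the codons between the mutations are altered; everything downstream is read as CAT again, which is why such a gene can regain its function.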
The same genetic code appears to operate in all living things, but exceptions are known. In human mitochondrial mRNA, AGA and AGG are termination or stop codons. Other differences also exist in the correspondences between certain codon sequences and amino acids.
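Such variant codes can be represented as a small set of overrides applied to the standard table. In the Python sketch below, the AGA and AGG reassignments follow the text above; the two further entries (UGA read as tryptophan and AUA read as methionine) are taken from the commonly tabulated vertebrate mitochondrial code and are included here as an assumption about which "other differences" are meant. The variable names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Overrides for the vertebrate mitochondrial code ("*" = stop).
# AGA/AGG come from the text; UGA and AUA are an added assumption.
MITO_OVERRIDES = {"AGA": "*", "AGG": "*", "UGA": "W", "AUA": "M"}
MITOCHONDRIAL = {**STANDARD, **MITO_OVERRIDES}

# List every codon whose meaning differs between the two tables.
for codon in sorted(STANDARD):
    if STANDARD[codon] != MITOCHONDRIAL[codon]:
        print(codon, STANDARD[codon], "->", MITOCHONDRIAL[codon])
```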