Recombinant deoxyribonucleic acid (DNA) technology allows the creation and manipulation of DNA sequences that come from different sources, even different species. The development of recombinant DNA technology in the 1970s was hailed as the most exciting invention since the development of transistors some twenty to thirty years earlier. The transistor changed people's lives forever by creating the microelectronics revolution and enabling
Recombinant DNA technology has had to create its place instead of entering an existing market. As a result, recombinant DNA technology has probably consumed more finances than it has yet generated, although this discounts the long-term value of increasing knowledge. Where recombinant DNA technology has made the biggest economic impact is in the pharmaceutical industry, allowing the production of single human proteins for therapeutic use or to generate specific antibodies. Harvesting human insulin created in bacterial cells is far easier than isolating it from pig or human cadaver pituitary glands, for instance. The financial base for recombinant DNA technology should continue to improve as genetically modified organisms are becoming widely used in agriculture; more than half the U.S. soybean crop now consists of a strain genetically modified to reduce the amount of herbicides necessary to bring in a good yield.
A clone is a collection of organisms that are genetically identical, and a recombinant DNA clone is a collection of genetically identical organisms (most often bacteria) that each carry a specific foreign (from another source) DNA molecule. "Clone" also refers to the foreign DNA itself after being
The process of DNA cloning has two components. One is the use of restriction enzymes in vitro to cut DNA into a unique set of fragments. Restriction enzymes are endonucleases that bacteria naturally use to defend against DNA viruses by cleaving DNA at specific sites. The enzyme EcoRI, for example, from E. coli, cleaves every site with the six-nucleotide sequence of GAATTC, found on average every 4,100 nucleotides in DNA. (A companion methylase enzyme modifies the bacterium's own GAATTC sites so they are not targets of EcoRI.) Researchers have isolated many different restriction enzymes from bacterial species. The enzymes differ in the sequences of the target sites that they cut, in the locations of the cleavage sites, and by whether modified target sites are cleaved (in some cases modification is required for cleavage). The collection of restriction enzymes with these different properties provides an invaluable toolbox for cutting and joining DNA molecules from different sources.
The other component of DNA cloning technology is the use of vectors to ensure that the host organism carries and replicates the foreign DNA.
The original vectors used were based on naturally occurring small, circular DNA molecules distinct from the bacterial chromosome , called plasmids . The most widely used vector of the late 1970s to early 1980s was the plasmid pBR322, which contained an origin of replication, a gene that confers resistance to the antibiotic ampicillin, and a second gene that confers resistance to the antibiotic tetracycline. Each of these antibiotic resistance genes contains the recognition sequence for a restriction endonuclease. Opening the vector at one of those sites by restriction digestion in vitro and ligating (splicing) the foreign DNA into that site destroys the resistance encoded by that gene but leaves the other resistance factor intact. Plasmid DNA can then be put back into bacterial host cells (by transfection) where it can replicate up to several hundred copies per bacterium. The bacteria are then grown in media containing one or the other antibiotic. This facilitates selection and identification of bacteria receiving the ligation product. The cloned DNA then replicates along with the rest of the plasmid DNA to which it is joined.
Later improved editions of plasmid vectors incorporated such features as polylinker sequences that consist of several unique restriction sites close together to form a specific cloning site and inducible promoter sequences adjacent to the polylinker used to transcribe into RNA, or "express," the cloned DNA as desired.
cDNAs and Gene Analysis
Plasmid vectors are limited in the size of the cloned DNA that can be incorporated and successfully reintroduced into the bacterium, typically holding a maximum of about 15 kb (kilobases [1 kb equals 1,000 bases]) of foreign DNA. One common use for plasmid vectors is to make cDNA (complementary DNA) libraries; cDNA molecules are DNA copies of messenger ribonucleic acid (mRNA) molecules, produced in vitro by action of the enzyme reverse transcriptase . Because cDNAs represent only the portions of eukaryotic genes that are transcribed into the mRNA, cDNA clones are particularly useful for analysis of gene expression and cell specialization. The existence of a cDNA is also evidence that the gene is active, or transcribed, in the cells or tissues from which the mRNA was isolated. Such information can be used to compare gene activities in healthy versus diseased cells, for instance.
Frequently the simpler sequence of a cDNA is easier to analyze than the corresponding genomic sequence since it will not contain noncoding, or intervening, sequences (introns). Another advantage of cDNA is that generally the sequence does not include enhancers or regulatory sequences to direct their transcription. As a result, they can be combined with other regulatory systems in the clone to direct their expression.
Genome sequencing projects typically generate sequence information from many different cDNA clones. The cDNA cloned sequence is termed an "expressed sequence tag" (EST), and, when correlated with the whole genomic DNA sequence, EST information can help determine the locations and sizes of genes.
In order to obtain the cDNA for a specific gene, it is first necessary to construct a cDNA "library." This is a collection of bacteria that contain all the cDNAs from the cell or tissue type of interest. To make a library, the thousands of different mRNAs are first harvested from the cell of interest, and cDNA is made using reverse transcriptase. The cDNA is then cloned into plasmids, and introduced into bacteria. Under the right conditions, each bacterium will take up only one cDNA. The bacteria are then grown in Petri dishes on a solid medium. A library therefore consists of a mixed population of bacteria, each carrying one type of cDNA. To find the bacterium containing a particular type of cDNA, one can either search for the gene itself with a nucleotide probe or for its protein product with an antibody .
Screening a library depends either on having a probe bearing part of the nucleotide sequence or an antibody or other way of recognizing the protein coded by the gene. Screening by nucleotide probes (labeled with radioactive or chemical tags for detection) depends on base pair complementarity between the single-stranded target DNA and the probe DNA; this allows the label to mark the cell with the desired cDNA. Screening by labeled antibody depends on binding of the antibody to the protein encoded by the gene. Literally thousands of cloned genes have been isolated this way from libraries of many different species. One of the most powerful observations in biology is that the same or similar gene sequences can be isolated from different species, ranging from bacteria to humans.
Human insulin was the first medicine to be created through recombinant DNA technology. Insulin is a protein hormone produced by the pancreas that is vital for regulation of blood sugar. In the disease insulin-dependent diabetes mellitus (IDDM), the immune system attacks and destroys the insulin-producing cells. A person with IDDM requires daily injections of insulin to control blood sugar. Before 1980, insulin was isolated from pigs or other animals. Animal insulin has a slightly different amino acid sequence from the human form. In the early 1980s, recombinant DNA technology was used to splice the human insulin gene into bacteria, which were grown in vats to make large amounts of the human protein. Recombinant human insulin was the first recombinant drug approved for human use. Since then more than two dozen other drugs have been created in this way, including growth hormone, blood clotting factors, and tissue plasminogen activator, used to break up blood clots following a stroke. Gene sequence similarities indicate that all living organisms have descended from shared common ancestors, back to the beginning of life.
Cloned DNA can also be incorporated into the genomes of multicellular organisms to create a transgenic organism. This makes possible a new approach to designing genotypes by adding genes (gene-coded functions) to species where those genes (functions) do not exist. Genetically modified organisms (GMOs) created by modifying a gene or adding one from another species frequently offer the most direct way to improve the way people use organisms for food or chemistry.
One example of a GMO is the development of "golden rice," designed to reduce blindness caused by vitamin A deficiency in rice-consuming areas of the world. A polished rice grain, which is the portion of the seed that provides nourishment (the endosperm ) does not contain beta-carotene, the substance the human body converts into vitamin A, yet many plants with yellow/orange colored leaves or flowers produce it in abundance. To convert rice endosperm into a beta-carotene-rich food, a transgene was constructed with the genes required for beta-carotene production and inserted into rice cells. The transgene consists of a cDNA for phytoene synthase, from a daffodil flower library, plus other sequences. Rice with these extra genes show a rich "golden" color from the beta-carotene that accumulates in the rice grain. If golden rice can be bred into commercial strains and enough can be provided into the diet to reduce the incidence of vitamin A–related blindness, current agitation against GMO crops may evolve into enthusiasm for their application.
Genome Libraries: Sequencing Genomes
Recall that cDNAs do not contain introns . Comparing a cDNA sequence with its corresponding DNA sequence on a chromosome (the genomic sequence) reveals the locations of introns in the genomic sequences. Genomic DNA libraries, in which the cloned DNA originates from fragments of the chromosomal DNA, carry intronic sequences, as well as the DNA between genes. In the more complex eukaryotes the same genomic region may correspond to several different cDNAs. This reveals the existence of alternative splicing, in which different sets of exons are used to make separate mRNA transcripts from one gene region. This expands the diversity of the protein, encoded by a single gene to include slightly different protein forms, called isoforms. Tissue-specific regulation of splicing indicates that these isoforms contribute important nuances to creating developmental differences between tissues.
Genomic DNA libraries have also proved invaluable for isolating genes that are poorly expressed (that is, make little mRNA) and for mapping disease-causing genes to specific chromosomal sites. The vectors used in genomic libraries are designed to incorporate greater lengths of cloned DNA than plasmids can carry. The first of these vectors was the lambda bacterial virus, which could hold an insert of 15 kb, followed by the cosmid, a hybrid between a plasmid and a phage (a virus that infects bacteria) with a DNA insert size of 45 kb. Development of linear yeast artificial chromosomes (YACs), which include a yeast centromere , origin of replication, and ends (telomeres), which successfully grow in the yeast Saccharomyces cerevisiae, carry clones of 200 kb to more than 2,000 kb. Subsequent development of bacterial artificial chromosomes (BACs) that contain 100 kb of insert DNA and are relatively easy to culture has put genomic cloning within reach of almost every molecular biology laboratory. (Clones are harder to work with as they get larger.)
BACs provided one route to sequencing the human genome, where their large capacity was critical. All the different genome sequencing projects start with a large number of BAC clones for that species, subclone 1 kb fragments of the DNA from each BAC into plasmids, and determine their sequence using high-speed machines. Computer-based comparisons of the results then assemble the nucleotide sequences into a coherent order by aligning the regions where they overlap.
A library of genomic DNA contains many clones with inserts that partially overlap each other because random breakage of chromosomal DNA is used to produce fragments for cloning. The order of fragments in the original chromosome can be determined by "chromosome walking." In this technique, a portion (subclone) from one clone is used as a probe to identify another clone that also carries that sequence. The two clones are then compared, and the nonoverlapping end of the second clone is subcloned for use as the next probe. In this way, a "walk" is carried out over many steps to identify adjacent DNA on the same chromosome, allowing the fragments to be placed in sequence. A series of sequential, partially overlapping clones is termed a "contig" (for contiguous sequence); the goal of genome mapping is to make a separate contig for all the DNA clones from one chromosome (a continuous covalent molecule). Contigs made large genome sequencing feasible since a minimum number of BACs could be chosen from their order in the map.
Finding Disease Genes
Locating a human disease gene on a chromosome map is now equivalent to locating the gene (approximately) on a contig and the DNA sequence map. This speeds gene identification through cloning the gene and determining what protein the gene encodes. The positional approach is important for single-gene (Mendelian) disease traits that are well known clinically but not at a biochemical level.
Cystic fibrosis (CF) was one such disease. It is the most common severe autosomal recessive disorder among European populations and their descendants in the New World. Patients suffer from mucus accumulation and frequent bacterial infections in their lungs. In the United States, CF patients are the single largest group receiving transplants to replace damaged lungs. However, the clinical studies failed to determine which gene product is defective in the patients. Extensive studies on families with CF led to identification of the causative gene on chromosome 7.
Initially recombination studies placed the gene within a small region of the chromosome of approximately one million base pairs. Starting at DNA clones from both ends of this region, the researchers used chromosome walking to clone all of the interval; several candidate genes were identified within the region but rejected as the cause of CF. Finally, one gene was identified within these clones that had the right properties: It was normally expressed in the lungs but not the brain, and it encoded a protein that made sense for the cause of the disease. In addition, patients with CF had specific mutations in this gene. The functional CF gene encodes a chloride channel transmembrane regulatory protein (CFTR) that controls transport of certain ions in and out of epithelial (surface) cells. The most common mutation encodes a CFTR protein that is missing one amino acid and cannot reach its site of function in the cell membrane. As a result, ions become too concentrated inside the cell, and water moves in. The result is dried secretions , such as very sticky mucus.
Gene therapy to add a functional copy of the CFTR to lung cells has not been successful, in part because the patients develop an immune response to reject the vector, and, in some cases, the normal protein. Mild improvements have been short-lived, or affect only small patches of cells in the respiratory tract. Alternative approaches to better understanding the physiology of the disease to direct drug design seem more viable. To this end, a mouse model with an inactivated CFTR gene is used to test potential drugs.
SEE ALSO Bacterial Genetics ; Bioinformatics ; Clone ; DNA Sequencing ; Forensic DNA Analysis ; Gene Therapy ; Genetic Diseases ; Genome ; Genomics ; Human Genome Project ; Radiation Hybrid Mapping ; Reverse Transcriptase ; Transgenic Techniques
Alberts, Bruce, et al. Molecular Biology of the Cell, 4th ed. New York: Garland Publishing, 2000.
Felsenfeld, Gary. "DNA." Scientific American 253 (1985): 58–67.
Levin, Benjamin. Genes VII. New York: Oxford University Press, 1999.
Watson, James D., and Francis H. Crick. "A Structure for Deoxyribose Nucleic Acid." Nature 171 (1953): 737.
Watson, James D., Michael Gilman, Jan Witkowski, and Mark Zoller. Recombinant DNA, 2nd ed. New York: Scientific American Books, 1992.