Linkage and Gene Mapping
Linkage refers to the presence of two different genes on the same chromosome. Two genes that occur on the same chromosome are said to be linked, and those that occur very close together are tightly linked. Study of linkage provides information about the relative position of genes on chromosomes, allowing the construction of chromosome maps.
Different forms of the same gene, called alleles, are present on matching, or homologous, chromosomes in similar positions, or loci. For instance, in Gregor Mendel's experiments with peas, green and yellow are two alleles for pod color. In a heterozygote, which has both alleles, the two alleles occupy the same locus on the two homologous chromosomes. Similarly, round and wrinkled are alleles for seed texture. In the pea, these two genes (pod color and seed texture) are on different pairs of homologs and are therefore not linked. When gametes form in double heterozygotes (for example, a green/yellow–round/wrinkled plant), these genes assort independently, because the two chromosomes that bear them assort independently. Therefore, meiosis will create equal numbers of green-round, green-wrinkled, yellow-round, and yellow-wrinkled gametes. Mating between double heterozygotes (called a dihybrid cross) will give a characteristic ratio of the different possible plant types: 9:3:3:1 when one allele of each gene is fully dominant.
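The ratio produced by independent assortment can be checked by enumerating every pairing of gametes. The sketch below is illustrative only: the allele symbols (G/g for green/yellow pod color, R/r for round/wrinkled seed texture) and the function names are assumptions made for this example, and it assumes complete dominance of green and round.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Assumed notation for illustration: uppercase = dominant allele.
# G/g = green/yellow pod color, R/r = round/wrinkled seed texture.

def gametes(genotype):
    """Equally likely gametes of an unlinked two-gene genotype, e.g. ("Gg", "Rr")."""
    return [a + b for a, b in product(*genotype)]

def phenotype(gamete1, gamete2):
    """Phenotype of the offspring formed by two gametes, assuming complete dominance."""
    color = "green" if "G" in (gamete1[0], gamete2[0]) else "yellow"
    texture = "round" if "R" in (gamete1[1], gamete2[1]) else "wrinkled"
    return f"{color}-{texture}"

def dihybrid_cross(parent1=("Gg", "Rr"), parent2=("Gg", "Rr")):
    """Phenotype frequencies among offspring of two double heterozygotes."""
    pairings = product(gametes(parent1), gametes(parent2))
    counts = Counter(phenotype(g1, g2) for g1, g2 in pairings)
    total = sum(counts.values())
    return {p: Fraction(n, total) for p, n in counts.items()}

if __name__ == "__main__":
    for pheno, freq in sorted(dihybrid_cross().items(), key=lambda kv: -kv[1]):
        print(pheno, freq)
```

Each parent contributes four equally likely gamete types, so there are sixteen equally likely pairings; grouping them by phenotype gives 9/16, 3/16, 3/16, and 1/16.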
However, if the genes for the two traits are located close to one another on the same chromosome (in other words, if they are linked), the observed ratio will be quite different from that seen for unlinked genes. Allele combinations that began together (for instance, round-green) will tend to stay together, and the offspring will show a skewed ratio reflecting the original combinations.
Despite being on the same chromosome, the round and green alleles could become separated during meiosis by crossing over, a form of genetic recombination. During crossing over, homologous chromosomes exchange segments. This could allow the yellow allele to switch places with the green allele and lead to a round-yellow gamete. If the loci for the two genes are very close, crossing over is unlikely to separate alleles, whereas if they are far apart, crossing over is much more likely to separate them. Therefore, the frequency of crossing over is related to the physical distance between the loci for the two genes.
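The effect of crossing over on gamete types can be made concrete with a small calculation. In this sketch, r is the recombination fraction (the proportion of gametes in which a crossover has separated the two loci); the function name and the example allele combinations are assumptions for illustration. A double heterozygote is assumed to carry round-green on one homolog and wrinkled-yellow on the other.

```python
def gamete_frequencies(r,
                       parental=("round-green", "wrinkled-yellow"),
                       recombinant=("round-yellow", "wrinkled-green")):
    """Expected gamete frequencies for two linked loci.

    r is the recombination fraction: 0 means the loci are never separated,
    0.5 is indistinguishable from independent assortment.
    """
    if not 0.0 <= r <= 0.5:
        raise ValueError("recombination fraction must lie between 0 and 0.5")
    freqs = {}
    for g in parental:
        freqs[g] = (1 - r) / 2   # the two parental types share 1 - r equally
    for g in recombinant:
        freqs[g] = r / 2         # the two recombinant types share r equally
    return freqs

if __name__ == "__main__":
    for r in (0.0, 0.1, 0.5):
        print(r, gamete_frequencies(r))
```

With r = 0.5 all four gamete types are equally frequent, as for unlinked genes; as r approaches 0 the recombinant types disappear.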
The particular combination of alleles on the homologous chromosomes in the dihybrid parent (for example, round-green) is known as linkage phase. Separation of this combination by crossing over is said to be a change in phase. The two alleles of a particular gene are said to be markers for that site of the chromosome.
Linkage in Fruit Flies
An example of using linkage to explore gene position is provided by the inheritance of eye color and body color in fruit flies; the genes for both traits are located on the X chromosome. This example begins with purebred (homozygous) parents, one yellow-bodied and red-eyed, the other grey-bodied and white-eyed. They mate to produce daughters that are all heterozygous, carrying the yellow-red combination on one homologous chromosome and the grey-white combination on the other. When the heterozygotes create gametes, the eye-color alleles cannot assort independently from the body-color alleles because they are linked. Some crossing over can occur, though. As in humans, male fruit flies carry only one X chromosome, and so will show exactly what alleles are present on their X. When one counts the male offspring of these daughters, approximately 49.5 percent are yellow-bodied and red-eyed, 49.5 percent are grey-bodied and white-eyed, and the remaining 1 percent or so carry the new combinations (yellow-bodied and white-eyed, or grey-bodied and red-eyed).
In this example, the yellow-body allele and the white-eye allele are said to be "out of phase" in the parental strains. The most frequent pair of gamete types are described as "parental types" because they retain the alleles for the two genes as transmitted by the original parent strains. The two gamete types that are less frequent are the "recombinant types," which result only from an exchange, or crossover, between homologous chromosomes in the interval between the genes.
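Classifying offspring as parental or recombinant leads directly to a recombination frequency. The sketch below uses hypothetical counts for 1,000 male offspring, chosen only to illustrate the arithmetic; the function name and the counts are assumptions, not data from an actual cross.

```python
def recombination_frequency(counts, parental_types):
    """Fraction of offspring whose allele combination differs from both parental types."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no offspring counted")
    recombinants = sum(n for kind, n in counts.items() if kind not in parental_types)
    return recombinants / total

# Hypothetical counts of male offspring (illustrative numbers only).
male_offspring = {
    "yellow-red": 495,    # parental type
    "grey-white": 495,    # parental type
    "yellow-white": 5,    # recombinant type
    "grey-red": 5,        # recombinant type
}

if __name__ == "__main__":
    rf = recombination_frequency(male_offspring, {"yellow-red", "grey-white"})
    print(f"recombination frequency: {rf:.1%}")
```

Because males show every allele on their single X chromosome, each male offspring can be classified by appearance alone, which is why the counts can be read off directly.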
As an undergraduate in 1913, A. H. Sturtevant wrote a brilliant paper that extended linkage analysis into gene mapping. Sturtevant analyzed numerous linkage experiments in the fruit fly, each using two genes. For instance, a similar experiment with body color and wing shape shows many more recombinant offspring, indicating that the wing-shape gene is farther from the body-color gene than the eye-color gene is.
Extension of this technique allowed the distance between genes to be expressed as map units. One map unit is defined as the effective distance needed to obtain a 1 percent recombination between linked alleles. The map unit is also called the centiMorgan (cM), to honor T. H. Morgan, Sturtevant's teacher and one of the founders of chromosomal genetics. Because crossing over is not equally likely between any two points, map units do not correspond directly to number of nucleotides along the DNA double helix.
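A minimal sketch of how pairwise recombination data translate into map units and gene order follows. The distances are hypothetical values chosen for illustration, and the function names are assumptions. The ordering rule used here is simply that the two genes with the largest pairwise distance lie at the ends, with the third gene between them.

```python
def map_distance_cm(recombinants, total):
    """Map distance in centiMorgans: 1 cM per 1 percent recombinant offspring."""
    return 100 * recombinants / total

def pair(a, b):
    """Unordered pair of gene names, usable as a dictionary key."""
    return frozenset((a, b))

def order_three_genes(distances):
    """Order three genes given their three pairwise distances in cM.

    The pair with the largest distance forms the two ends of the map;
    the remaining gene must lie between them.
    """
    outer = max(distances, key=distances.get)
    genes = set().union(*distances)
    (middle,) = genes - outer
    left, right = sorted(outer)
    return [left, middle, right]

# Hypothetical pairwise distances in cM (illustrative values only).
distances = {
    pair("body-color", "eye-color"): 1.0,
    pair("eye-color", "wing-shape"): 29.0,
    pair("body-color", "wing-shape"): 30.0,
}

if __name__ == "__main__":
    print(map_distance_cm(10, 1000), "cM")
    print(order_three_genes(distances))
```

Observed recombination percentages add up cleanly like this only over short intervals; over longer intervals, multiple crossovers make the measured value smaller than the sum of the intervening distances.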
Sturtevant's work helped show that the chromosome is a linear sequence of genes. Gene mapping determines the position and order of genes relative to other genes along the chromosome. A well-marked linkage group extends from markers located at one end of the chromosome to those in the middle, and on to markers located at the other end. The number of linkage groups for an organism is equal to its number of homologous chromosome pairs.
Sturtevant's discovery led to the golden age of chromosome transmission genetics, with an emphasis on identifying genes through alleles with visible phenotypes, and using them as markers for determining their position on the linkage map. Since then the emphasis in genetics has shifted to understanding the functions of genes. Linkage and gene mapping studies have progressed to being a critical tool in cloning genes and providing more description of their roles in the organism. These approaches include:
- Using map locations to distinguish different genes with similar sequences, mutant phenotypes, or functions. Examples are the cell division cycle mutants of the yeast Saccharomyces cerevisiae or the uncoordinated mutants of the roundworm C. elegans. In some cases mutants with different phenotypes have been shown to be due to different mutations in the same gene, as is the case with the Drosophila circadian rhythm period mutants termed short, long, and none (per[S], per[L], and per[0]).
- Using map locations to track down genes in order to clone their deoxyribonucleic acid (DNA) by chromosome position. Examples are the human cystic fibrosis transmembrane conductance regulator (CFTR) gene mutated in cystic fibrosis, or the polyglutamine repeat gene that is mutated in Huntington's disease. With genome sequences available in databases, mapping a mutant phenotype points to candidate loci for the gene at that chromosome position.
New classes of markers in linkage analysis are based on naturally occurring DNA variation in the genome, and have many advantages. These variations are usually harmless and don't interrupt a gene, so there is no selection against them, meaning they persist over many generations. They are quite numerous and are distributed throughout the genome. Individuals are likely to be heterozygous for many of them, and therefore the markers are informative for linkage. If the DNA variant is present in the heterozygous state, can be detected, and shows Mendelian segregation, it is as good a linkage marker as yellow bodies or white eyes. The disadvantage is that analysis to detect the variant is sometimes more laborious and requires the techniques of molecular biology.
The common types of DNA markers and the molecular techniques used to follow their inheritance are:
- Restriction fragment length polymorphisms (RFLPs) are derived from sequence variation that results in the loss of a restriction enzyme digestion site. The result is a longer fragment of the DNA from that location following digestion with that enzyme. A heterozygous parent will transmit either the allele specifying the long fragment or the allele specifying the short fragment to each child. After size separation of DNA fragments by gel electrophoresis and transfer to a Southern blot, the DNA fragments of interest can be identified with a specific DNA or ribonucleic acid (RNA) probe that also comes from that location. If the long fragment, for example, is linked to a disease gene, the child's DNA can reveal if he or she is likely to develop the disease.
- Randomly amplified polymorphic DNAs (RAPDs) are derived from sequence variation that results in the loss of the complementary site to a primer necessary to initiate chain amplification by polymerase chain reaction (PCR). If the DNA used as template contains complementary sites for both primers, a PCR product is obtained that can be detected by gel electrophoresis. If either site is absent or changed in the template, no product will be obtained from the reaction.
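The way a lost restriction site produces a longer fragment can be sketched with a toy digest. The two sequences below are invented for illustration and are not real genomic DNA; the function and variable names are likewise assumptions. The enzyme modeled is EcoRI, which recognizes GAATTC and cuts after the first G.

```python
SITE = "GAATTC"  # EcoRI recognition sequence; the cut falls between G and AATTC

def digest(seq, site=SITE, cut_offset=1):
    """Return the fragment lengths produced by cutting seq at every occurrence of site."""
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(len(seq) - start)  # the piece after the last cut
    return fragments

# Toy alleles (invented sequences). They differ by a single base in the middle site.
allele_with_site = "A" * 10 + SITE + "T" * 20 + SITE + "C" * 30 + SITE + "A" * 10
allele_without_site = "A" * 10 + SITE + "T" * 20 + "GAGTTC" + "C" * 30 + SITE + "A" * 10

if __name__ == "__main__":
    print("site present:", digest(allele_with_site))
    print("site absent: ", digest(allele_without_site))
```

In the allele that has lost the middle site, the two inner fragments are replaced by one fragment whose length is their sum. That size difference is what gel electrophoresis and a probe from the region would detect.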
Human Disease Genes
Human families pose some of the greatest challenges to linkage analysis. Human families are small, and matings are not designed around the needs of genetic analysis. Mapping a mutation that causes a disease usually requires assembling enough families that transmit the mutation, in the hope that some of them will be heterozygous, or informative, at some RFLP, RAPD, or other marker that is near enough to the disease gene to show linkage. Instead of determining linkage by counting crossover numbers as Sturtevant did, human genetics uses an alternative means to estimate whether linkage is present between marker and disease gene. This approach is called LOD score analysis, after Log of the Odds for or against linkage. Each child from informative parents is scored as recombinant (R) or parental (P). The total numbers of R and P results for each family are used to calculate "scores" for the odds that the results are due to linkage, evaluated over a table of recombination distances (1 cM, 10 cM, 20 cM, etc.), relative to the chance that the results came from independent assortment.
The logs of the odds scores for each family are added to the log scores of other families to increase the number of independent observations. A LOD score of 3, which corresponds to odds of 1,000 to 1 in favor of linkage and represents no more than a 5 percent chance of mistakenly declaring linkage, is the minimum acceptable score for assuming true linkage between marker and disease gene. The recombination value that gives the highest LOD score over all the families is the presumptive linkage distance of the disease gene mutation from the adjacent markers. The first human disease gene mapped this way was the gene for Huntington's disease, which had a LOD score of over 6 for a recombination distance from its marker of between 5 and 10 cM. Once a linked marker has been found, it can be used to predict whether any particular family member has inherited the marker and therefore is likely to have inherited the disease gene.
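The LOD calculation described above can be sketched as follows. This is a simplified version that assumes every child can be scored unambiguously as recombinant or parental (that is, the linkage phase in the parent is known); the family counts are hypothetical and the function names are assumptions. For R recombinants and P parentals at recombination fraction θ, the likelihood under linkage is θ^R (1 − θ)^P, compared with 0.5^(R+P) under independent assortment.

```python
import math

def lod_score(recombinants, parentals, theta):
    """Log10 odds of linkage at recombination fraction theta versus independent assortment."""
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be greater than 0 and at most 0.5")
    n = recombinants + parentals
    log_linked = recombinants * math.log10(theta) + parentals * math.log10(1 - theta)
    log_unlinked = n * math.log10(0.5)
    return log_linked - log_unlinked

def combined_lod(families, theta):
    """Sum the LOD scores of independent families at one value of theta."""
    return sum(lod_score(r, p, theta) for r, p in families)

def best_theta(families, thetas=(0.01, 0.05, 0.10, 0.20, 0.30, 0.40)):
    """Recombination fraction from the table that gives the highest combined LOD."""
    return max(thetas, key=lambda t: combined_lod(families, t))

# Hypothetical families as (recombinant children, parental children).
families = [(0, 6), (1, 7), (0, 5)]

if __name__ == "__main__":
    for theta in (0.01, 0.05, 0.10, 0.20, 0.30, 0.40):
        print(f"theta={theta:.2f}  LOD={combined_lod(families, theta):.2f}")
    print("best theta:", best_theta(families))
```

Adding logarithms across families is equivalent to multiplying their odds, which is why small families that are individually inconclusive can together exceed the threshold of 3.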