Considering the central role that genes play in the understanding of biology, it is surprising that no single, simple definition of a gene exists. This is partly because genes are under multiple evolutionary constraints, and partly because the concept of a gene has both structural and functional aspects that do not always align perfectly. A modern description of a gene must consider not only its structure, as a length of DNA, but also its function: in heredity, as a unit transmitted from one generation to the next, and in development, as a carrier of coded information for the sequence of a protein or RNA molecule. In addition, the description should recognize the multiple roles a single gene can play in different tissues, during various stages of development, and over the course of evolution.

The table "Ways of Investigating Genes" lists different sorts of geneticists along with the aspects of genes on which they focus and the kinds of phenomena they investigate. To understand someone who is discussing genes, the listener or reader needs enough context to work out which of the possible interpretations of "gene" in this list is most likely intended.

Units of Heredity

The modern conception of genes begins with the work of Gregor Mendel (1822–1884), who showed that inheritance involved discrete factors passed from parent to offspring. (While Mendel is given credit as the originator of modern genetics, the word "gene" was not coined until well after his death.) In this view, genes are those elements responsible for the "phenotype," the set of observable traits that make up the organism. In the original Mendelian conception, genes came in pairs, as did possible phenotypes. Classic examples include round versus wrinkled seeds in peas, or presence or absence of hairs on the middle section of the fingers in humans.

The competing school of thought for the first thirty years of the twentieth century was Darwinism, which considered characters with a continuous distribution, such as speed, strength, skin color, height, weight, and number of progeny, for which no simple paired set of elements could account. By 1930, these seemingly incompatible views had been combined in the "neo-Darwinian synthesis," which incorporated features of both sides of the debate. This involved a transformation of the "one gene, one trait" relationship to a recognition that single inheritable genes could influence many different observable traits (called pleiotropy), and that a single definable trait could be influenced by many different genes, called polygenes.

Ways of Investigating Genes

Kind of Biologist | Aspects of Genes of Major Concern | Phenomena Investigated
Molecular Biologist | A piece of DNA | Physical isolation; knock-out experiments
Classical Geneticist | A mapped position on a chromosome; a new mutant; a functional unit | % recombination; mappable and unique; satisfy complementation or cis-trans test
Cytogeneticist | A band or knob on a stained chromosome (insertions, deletions, translocations) | Presence or absence of genetic function and occurrence of physical chromosome feature
Quantitative Geneticist | Contributing alleles in an additive or multiplicative fashion | Polygenic ratios; inbreeding effects; path analysis
Population Geneticist | Selection, mutation, migration, genetic drift | Multigenerational change in allele or genotype frequency, polymorphism, heterozygosity
Molecular Evolutionist or Phylogenetic Systematicist | Evolutionary tree of changes in DNA sequence | A traceable molecular character inherited by all progeny
Bioinformatician | One of six reading frames of DNA with a particular pattern | "Gene finding" by computer algorithms and heuristics
Developmental Geneticist | Homeotic mutant | Embryonic changes
Genetic Epidemiologist | Marker | Can be studied for spatial distribution and diffusion
Sociobiologist | Selfish genes; "junk DNA" | Replication without function
X-ray Crystallographer | Geometry | Relationship of three-dimensional structure to function
Mathematical Biologist | Topology | Knots, catenanes
Biotechnology Entrepreneur | Commodity | Commercial value
Genetic Therapist | Surgically insertable piece or "fixable" DNA | Alleviation of cause of symptoms

Pleiotropy is a one-to-many genetic phenomenon. If a human has two copies of the allele for hemoglobin S, the individual is very likely to develop the broad constellation of symptoms that constitute sickle cell disease. Complications such as an enlarged heart, ulcerated skin, spleen failure, and shortness of breath are all associated with this single gene.

On the other hand, polygenic inheritance, epistasis, gene interaction, operons, and regulatory circuits all involve a many-to-one relationship between genotype and phenotype. Wheat color provides a good example of polygenic inheritance, the contribution of more than one gene to a single trait. When a very dark red, completely homozygous individual is crossed with a white, completely homozygous individual, all of their progeny are phenotypically red. When these red progeny are self-crossed, their offspring include individuals that are very dark red, dark red, red, light red, and white, in a ratio of 1:4:6:4:1. The inference drawn by geneticists is that two independently assorting genes are interacting to determine color, and that each gene has two alleles, one that contributes red color and the other that does not. Hence, the genotypes range from four contributing alleles (making very dark red) to zero (making white). Involvement of more genes can give even more complex and more continuous distributions.
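The 1:4:6:4:1 ratio in the wheat example can be reproduced with a short calculation. The sketch below is only an illustration under the assumptions stated in the text (two alleles per gene, independent assortment, purely additive effects, and parents heterozygous at every gene); the function name additive_class_ratio is made up for this example.

```python
from math import comb


def additive_class_ratio(num_genes: int) -> list[int]:
    """Count offspring classes by number of contributing alleles.

    Assumes every parent is heterozygous at each of `num_genes`
    independently assorting genes, so each of the 2 * num_genes allele
    slots in an offspring is "contributing" with probability 1/2.
    """
    slots = 2 * num_genes
    # comb(slots, k) counts the equally likely ways to inherit exactly
    # k contributing alleles; list from most to fewest.
    return [comb(slots, k) for k in range(slots, -1, -1)]


wheat_ratio = additive_class_ratio(2)
labels = ["very dark red", "dark red", "red", "light red", "white"]
for label, count in zip(labels, wheat_ratio):
    print(f"{label}: {count}/{sum(wheat_ratio)}")
```

Calling additive_class_ratio with a larger number of genes gives more classes with smaller steps between them, which is one way to see why traits influenced by many genes look nearly continuous.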

It is important to realize that in none of these cases is any information provided about the physical nature of the gene. In classical genetics, a gene is a unit of heredity, and understanding inheritance patterns does not require knowledge of gene structure.

Without an understanding of structure, however, it is tempting to think of genes as being "for" the trait they influence, in the sense that a hammer is "for" pounding nails or a CD player is "for" listening to music. Most research biologists reject this notion of "for," because it connotes a determinism that is inconsistent with our understanding of the complexities of cellular processes. There is no gene for intelligence, although many genes influence intelligence through their actions within individual cells. Intelligence, like any other complex trait, arises as the result of many genes interacting.

Genes Are Carried on Chromosomes

Long before the discovery that genes were made of DNA, geneticists realized that hereditary factors (genes) were carried on chromosomes. Unlike genes themselves, chromosomes can be easily seen under the microscope, and their movements can be followed during the processes of mitosis and meiosis. Beginning around 1910, Thomas Hunt Morgan and colleagues showed that the patterns of Mendelian inheritance could be correlated with the patterns of movement and recombination of the chromosomes. Morgan's group showed that one of the central events of meiosis is crossing over, in which genes trade places between maternal and paternal chromosomes. In this way, Morgan and colleagues developed the chromosomal theory of inheritance and gave a physical reality to the abstract concept of the gene.

From this point, much work was devoted to discovering the physical nature of the gene. Throughout the next several decades, a series of experiments showed that genes were made of DNA (deoxyribonucleic acid), and finally that the double-helical structure of DNA accounted for the faithful replication and inheritance of genes.

Genes Encode Enzymes and Other Proteins

Parallel to the growing understanding of the structure of the gene came discoveries about how genes affect the phenotype. From patients who suffered from Mendelian diseases and from experiments on bread mold, early researchers inferred that mutant genes were frequently associated with dysfunctional enzymes that could not catalyze particular metabolic steps. Thus, they concluded that enzymes perform the actual functions in a cell that lead to phenotype. These observations led to the first definition of a gene that combined structure and function, stated as "one gene, one enzyme." In this formulation, a gene was thought to be enough DNA to bring about the production of one enzyme. This view had to be modified slightly with the realization that many enzymes are composed of several subunits, called polypeptides, whose corresponding DNA sequences (genes) may be on entirely different chromosomes. In addition, not all proteins are enzymes; there are structural proteins, transcription factors, and other types. This led to the reformulation "one gene, one polypeptide."

Figure 1. This simplified gene is composed of four regions. The promoter binds to an RNA polymerase in an on-off fashion and controls whether mRNA can be made. The beginning stretch of RNA is not ultimately translated into protein at the ribosome, and neither is the terminal region.

Information Sequences that Code for Production of RNA

The discovery of the structure of DNA led quickly to an unraveling of the means by which it controls protein production. RNA was discovered to be an intermediate between DNA and protein, and this led Francis Crick to formulate the "central dogma of molecular genetics":

DNA → RNA → Protein

The sequence of DNA subunits, called nucleotides, was found to correspond to the sequence of amino acids in the resulting protein. This led to the explicit formulation of a gene as a coded instruction.

Three major aspects of DNA as a code (a sequence of symbols that carry information) are widely employed. First, molecular biologists describe genes as messages that can be decoded or translated. The letters in the DNA alphabet (A, C, G, T) are transcribed into an RNA alphabet (A, C, G, U), which in turn is translated at the ribosome into a protein alphabet (twenty amino acids). A word in DNA or RNA is a sequence of three nucleotides that corresponds to a particular amino acid. Thus, translating the messenger RNA word AUG via the standard genetic code yields the amino acid methionine.
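The decoding steps just described can be sketched in a few lines of code. This is a simplified illustration, not a model of the cell: it assumes the input is the coding (sense) strand, so transcription reduces to replacing T with U, and it reads codons from the first letter with no search for a start site. The function names and the example sequence are invented for this sketch; the codon table is the standard genetic code written in its usual compact T, C, A, G order.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order;
# "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
# Map each RNA codon (U instead of T) to its one-letter amino acid.
CODON_TABLE = {
    "".join(codon).replace("T", "U"): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_strand: str) -> str:
    """DNA alphabet to RNA alphabet (assumes the coding strand is given)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read three-letter words from position 0 until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


dna = "ATGGCCTGA"  # hypothetical example sequence
mrna = transcribe(dna)
print(mrna, "->", translate(mrna))
```

Here M is the one-letter symbol for methionine, so the first word AUG decodes as the text describes.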

In this conception, the gene is a DNA molecule with instructions written within it. The analogy to words, books, and libraries has been drawn repeatedly, because it offers a way to understand the hierarchy of information contained in the genome.

Further work showed that not all DNA sequences are ultimately translated into protein. Some are used only for production of RNA molecules, including transfer RNA (tRNA) and ribosomal RNA (rRNA). This led to yet another formulation of the gene definition, as the code for an RNA molecule. This encompasses tRNA, rRNA, and the mRNA that ultimately is used to make proteins.

Genes Have Complex Structures

A surprising fact about gene structure was revealed in 1977 with the discovery of introns. Introns are segments of DNA within the gene that are not ultimately translated into protein. The introns alternate with exons, segments that are translated. The entire gene is first transcribed to make RNA, but then the intronic sections are removed, and the RNA exons are spliced together to form mature mRNA. The transcribed DNA of a gene is also flanked by nontranslated and nontranscribed regions that are essential to its function. These include the promoter region, a section of "upstream" DNA that binds RNA polymerase, the enzyme that forms the RNA copy. In Figure 1, an overly simplified version of a genetic message is presented. Other DNA segments called enhancers also regulate gene transcription, and these may be located upstream, downstream, within the gene, or far from it.

Figure 2. The dystrophin gene codes for slightly different proteins (isoforms) in a variety of differentiated cell types. A simplified version is illustrated above. The dystrophin gene is thought to have eight promoters, each with its own initial exon and as many as seventy-eight downstream exons.
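The removal of introns and joining of exons described in this section can be pictured as simple string slicing. In the sketch below, the transcript sequence and the exon coordinates are hypothetical values chosen for illustration, and the splice function is a made-up helper, not a model of the cellular splicing machinery.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon segments, given as (start, end) half-open index pairs."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical primary transcript: exon, intron, exon.
exon_1 = "AUGGCC"
intron = "GUAAGUUUUCAG"
exon_2 = "UUUUGA"
pre_mrna = exon_1 + intron + exon_2

# Exon coordinates within the primary transcript (assumed known).
exons = [(0, 6), (18, 24)]

mature_mrna = splice(pre_mrna, exons)
print(mature_mrna)
```

Only the exon segments survive in the mature message; the intron is transcribed but never reaches the ribosome.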

Genes Have Complex Functions

Further complexity arose with the discovery of alternative splicing and multiple promoters. In many eukaryotic genes, the exons can be combined in different ways to make closely related but slightly different proteins, called isoforms. There can be multiple promoters, some within the gene, that begin transcription at different sites within the gene. Such an example is illustrated in Figure 2. The dystrophin gene codes for a muscle protein that, when absent, causes Duchenne muscular dystrophy. Other isoforms of dystrophin are expressed in white blood cells, neurons, and the Schwann cells that wrap neurons with insulation.

Thus, it is difficult to speak of "the" dystrophin gene because the alternative splicing of noncontiguous pieces of RNA produces a variety of different proteins. Isoforms help generate the differences between tissues, and are thus partly responsible for the complexity of the fully differentiated organism. Similarly, the vast variety of antibodies we produce is coded for by a much smaller number of exons, shuffled and expressed in a combinatorial fashion.

Figure 3. Gene tree illustrating the transfer of genes from one biological ancestor to descendants.

With these complications, defining a gene becomes yet more difficult. While it would be possible to describe the set of dystrophin isoforms as arising from an equal-numbered set of genes, most biologists find that unnecessarily complex. Instead, the gene is defined as a DNA sequence that is transcribed as a single unit, and one that encodes one set of closely related polypeptides or RNA molecules. Thus there is one dystrophin gene, which at various times in various tissues codes for each of the known dystrophin isoforms. This has been summarized as "one gene, many polypeptides."
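The "one gene, many polypeptides" idea can be illustrated by counting the transcripts a single gene could in principle produce. The sketch below is a toy model under loud assumptions: each transcript starts with one promoter-specific first exon, and any subset of the exons marked optional may be skipped. The exon labels and the enumerate_isoforms helper are hypothetical, and real genes do not necessarily use every combination.

```python
from itertools import combinations


def enumerate_isoforms(
    first_exons: list[str],
    downstream_exons: list[str],
    optional: set[str],
) -> list[tuple[str, ...]]:
    """List possible transcripts as tuples of exon labels in genomic order."""
    optional_in_order = [e for e in downstream_exons if e in optional]
    isoforms = []
    for first in first_exons:
        # Try skipping every subset of the optional exons.
        for n_skipped in range(len(optional_in_order) + 1):
            for skipped in combinations(optional_in_order, n_skipped):
                body = tuple(e for e in downstream_exons if e not in skipped)
                isoforms.append((first,) + body)
    return isoforms


# Hypothetical gene: two promoters, three downstream exons, one optional.
variants = enumerate_isoforms(["P1", "P2"], ["E2", "E3", "E4"], {"E3"})
for variant in variants:
    print("-".join(variant))
```

Even this small toy gene yields several distinct messages from one transcribed unit, which is why counting one gene per isoform quickly becomes unwieldy.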

Genes Act in Evolution, Heredity, and Development

Finally, some fruitful connections can be made by looking at genes in three different contexts and from three different points of view. First, developmental biologists focus on the action of genes at different times and places over the life history of an individual from conception to death. Over time, a particular gene will be expressed or silenced depending on stage of development and the tissue it is in. Second, geneticists focus on transmission of information, assortment and recombination of markers, and reproduction within families and populations within one species. Over time, a particular gene will be copied and transmitted to offspring and may accumulate mutations in the process. Third, evolutionary biologists focus on history, mutation, variability, and gene duplication. Over time in different species, as mutation and natural selection have their effects, there is divergence of each duplicate's structure and function.

Figure 4. Gene tree illustrating the different cell types that arise by division of one original cell (a zygote; fertilized egg) and differentiation of subsequent daughter cells.

These perspectives can be understood by displaying multiple views as graphs called trees. In Figures 3 and 4, the general form of the tree, representing the transfer of genes from one biological ancestor to descendants, can be identical, yet the diagrams illustrate a passage of genes with a variety of spatial, temporal, and biological changes in different contexts.

A gene is a unit of both structure and function, whose exact meaning and boundaries are defined by the scientist in relation to the experiment being done. Despite the inability to define a gene precisely, the concept of the gene has been a fruitful one for a century. In fact, these ambiguities have helped scientists develop a robust concept of "gene." This dynamic richness of meaning has contributed to the endurance of "the gene" in biologists' vocabulary. All of these meanings will have value as we face genetic problems in the future and try to establish wise policy in using our knowledge of genes.

SEE ALSO Gene Therapy; Genetic Analysis; Genetic Code; Genetic Control of Development; Genetic Diseases; History of Biology: Inheritance; Mendel, Gregor; Protein Synthesis

John R. Jungck



WEISMANN, AUGUST (1834–1914)

German biologist who kept alive English naturalist Charles Darwin's theory of natural selection as the mechanism for evolution, when most biologists were looking for other mechanisms. Weismann also predicted the existence of deoxyribonucleic acid (DNA), arguing that parents pass traits, such as eye color, to their children by means of molecules of some kind.


CHASE, MARTHA (1927–2003)

American biologist who, with Alfred Hershey, used a friend's blender to show that genes are made of deoxyribonucleic acid (DNA). In their ingenious experiment, Chase and Hershey labeled virus proteins with one radioactive label and virus DNA with another label. When the viruses then infected bacteria, Hershey and Chase found DNA, not protein, inside the bacteria.


TONEGAWA, SUSUMU (1939– )

Japanese molecular biologist and immunologist who won the 1987 Nobel Prize in Physiology or Medicine for discovering how the immune system makes billions of unique antibodies to fight disease and other unwanted intruders of the human body. Tonegawa showed that white blood cells mix and match a few genes to make billions of combinations that are then translated into billions of unique antibodies.
