Creating Databases and Tools for Future Research
ZmDB: Maize Genome Database
ZmDB is a maize genome database that collects all maize genome information from GenBank on a regular basis and organizes it for maize researchers. It was created for the MGDP by collaborator Volker Brendel's bioinformatics group at Iowa State University.
GenBank, the federally funded catalog of genetic information for all organisms, stores its data in chronological order. To make these data more useful, researchers create specialized databases containing genetic information about each organism. ZmDB provides that service for maize researchers. It gathers related DNA sequences and provides annotations describing a gene's likely function based on the scientific literature. ZmDB provides additional information not found in GenBank, such as links to the pictures of mutant plants discovered by the MGDP team.
PhenotypeDB, which is integrated with ZmDB, describes and catalogues the many mutant plants produced by MGDP. It allows researchers to identify plants and buy seed carrying particular mutant traits. The database also links the traits to RescueMu insertion mutations that might cause them. To walk through a PhenotypeDB exercise, see "A Tour Through PhenotypeDB" below.
Plant Genome Database (PlantGDB) http://www.plantgdb.org/
PlantGDB was also created by MGDP collaborator Volker Brendel, although it is funded under a separate grant. This database of plant genomic and EST sequences allows cross-species comparisons between about 20 major crop and model species. The database provides snapshots of the current knowledge of plant gene composition and facilitates researchers' understanding of plant genetics and evolution.
ZmDBAssembler
This tool evaluates new ESTs to see whether their sequences overlap with other ESTs already listed in ZmDB. It then assembles matching ESTs into clusters corresponding to unique gene fragments. (learn more about ZmDBAssembler)
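The overlap-and-cluster idea can be illustrated with a toy sketch. It is not ZmDBAssembler's actual algorithm: it assumes exact, error-free end-to-end overlaps on the same strand, and the function names and the `min_len` threshold are hypothetical. Real EST assemblers also tolerate sequencing errors and check reverse complements.

```python
def overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` equal to a prefix of `b` (0 if shorter than min_len)."""
    for n in range(min(len(a), len(b)), max(min_len, 1) - 1, -1):
        if a[-n:] == b[:n]:
            return n
    return 0


def cluster_ests(ests: dict[str, str], min_len: int = 8) -> list[set[str]]:
    """Group ESTs so that any two that overlap (or contain one another) share a cluster."""
    parent = {name: name for name in ests}

    def find(x: str) -> str:
        # Union-find lookup with path compression.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    names = list(ests)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            sa, sb = ests[a], ests[b]
            if (overlap(sa, sb, min_len) or overlap(sb, sa, min_len)
                    or sa in sb or sb in sa):
                parent[find(a)] = find(b)  # merge the two clusters

    clusters: dict[str, set[str]] = {}
    for name in names:
        clusters.setdefault(find(name), set()).add(name)
    return list(clusters.values())


# Invented sequences: est1 and est2 share the 8-base overlap GGCCTTAA.
example = {
    "est1": "ACGTACGTGGCCTTAA",
    "est2": "GGCCTTAATTTTCCCC",
    "est3": "GATTACAGATTACAGG",
}
print(cluster_ests(example))
```

Each resulting cluster stands in for one "unique gene fragment" in the sense used above.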
MuSeqBox: Multi-query Sequence Blast Output Examination
MuSeqBox allows researchers to easily compare many genetic sequences to one another or to protein sequences. The data for MuSeqBox come from BLAST, the most widely used sequence similarity search program. BLAST performs the researcher's desired comparison, but produces a complex output that is difficult to read -- especially when more than a handful of sequences are submitted. MuSeqBox converts that output into a user-friendly table containing the most important information.
Thus, for example, a researcher can generate a table comparing all maize ESTs to known proteins. The table will list each EST and the three proteins whose amino acid sequences most closely match the protein that would be produced by the EST's genetic sequence. Indeed, Brendel's group uses this approach to annotate ESTs in ZmDB.
Researchers can also refine the table further by setting their own criteria for what constitutes a close match. Or they can set criteria for determining whether a gene sequence covers the entire protein or only part of it (due to splicing).
MuSeqBox can be used to display genetic data about any organism, not only maize. (learn more about MuSeqBox)
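As a rough illustration of turning BLAST output into a "best three hits per EST" table, here is a minimal sketch. It does not use MuSeqBox or reproduce its options. It assumes the input is BLAST+ tabular output (`-outfmt 6`), and the function name, the E-value cutoff, and the sample rows are all invented for the example.

```python
import csv
import io
from collections import defaultdict

# Default column order of BLAST+ tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def top_hits(blast_tabular: str, max_hits: int = 3, max_evalue: float = 1e-5):
    """Return {query: [(subject, pident, evalue, bitscore), ...]}, best bit score first."""
    hits = defaultdict(list)
    reader = csv.DictReader(io.StringIO(blast_tabular),
                            fieldnames=COLUMNS, delimiter="\t")
    for row in reader:
        evalue = float(row["evalue"])
        if evalue <= max_evalue:  # user-set criterion for a "close match"
            hits[row["qseqid"]].append(
                (row["sseqid"], float(row["pident"]), evalue, float(row["bitscore"])))
    return {query: sorted(found, key=lambda hit: -hit[3])[:max_hits]
            for query, found in hits.items()}


# Invented sample rows (hypothetical EST and protein identifiers).
SAMPLE = "\n".join("\t".join(fields) for fields in [
    ["EST_A", "P1", "92.0", "120", "9", "0", "1", "360", "5", "124", "1e-60", "200"],
    ["EST_A", "P2", "88.0", "118", "14", "0", "4", "357", "9", "126", "1e-52", "180"],
    ["EST_A", "P3", "61.0", "100", "39", "1", "10", "309", "2", "101", "1e-20", "90"],
    ["EST_A", "P4", "80.0", "115", "23", "0", "1", "345", "7", "121", "1e-40", "150"],
    ["EST_A", "P5", "35.0", "40", "26", "2", "50", "169", "3", "42", "0.5", "30"],
    ["EST_B", "P9", "75.0", "90", "22", "1", "2", "271", "11", "100", "1e-30", "120"],
])

for query, best in top_hits(SAMPLE).items():
    print(query, [subject for subject, *_ in best])
```

Tightening `max_evalue` or changing `max_hits` corresponds to the kind of user-defined match criteria mentioned above.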
GeneSeqer and SplicePredictor
MGDP collaborator Volker Brendel's group created GeneSeqer and SplicePredictor to help locate introns -- portions of genes that get spliced out when a gene is transcribed into messenger RNA (mRNA). Intron identification boosts researchers' understanding of gene structure and may have practical value for genetic engineering.
SplicePredictor looks for probable splice sites in genomic DNA using rules about what makes a good splice site. The program further evaluates the context in which sites occur by examining neighboring sequences for known intron/exon characteristics; determining whether the site has a complementary pair such that the two sites define a likely intron or exon; and assessing whether other possible site pairs constitute more likely splice sites than the ones under consideration.
In addition, SplicePredictor can look for introns by finding stretches of genomic DNA that do not have matching ESTs. ESTs correspond to stretches of mRNA and do not contain the introns spliced out from the direct RNA copy of genomic DNA. This additional search function greatly increases SplicePredictor's accuracy in detecting splice sites.
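The site-pairing step can be illustrated with a deliberately simple sketch. It relies only on the well-known rule that most introns begin with GT and end with AG, and it enumerates every donor/acceptor pair within a length range. SplicePredictor's actual scoring model is not described here and is not reproduced. The function name and the `min_len`/`max_len` defaults are illustrative assumptions.

```python
def candidate_introns(dna: str, min_len: int = 60, max_len: int = 5000):
    """Yield (start, end) half-open intervals that begin with GT and end with AG."""
    dna = dna.upper()
    # Donor sites: positions where a candidate intron could start (GT).
    donors = [i for i in range(len(dna) - 1) if dna[i:i + 2] == "GT"]
    # Acceptor sites: positions just past an AG, where a candidate intron could end.
    acceptors = [i + 2 for i in range(len(dna) - 1) if dna[i:i + 2] == "AG"]
    for donor in donors:
        for acceptor in acceptors:
            if min_len <= acceptor - donor <= max_len:
                yield donor, acceptor


# Invented toy sequence with a single GT...AG pair.
toy = "ATGGCC" + "GT" + "C" * 10 + "AG" + "CCATAA"
for start, end in candidate_introns(toy, min_len=10):
    print(start, end, toy[start:end])
```

On real genomic DNA this naive scan produces far too many candidates, which is why a program needs context rules and EST evidence to rank the pairs.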
GeneSeqer is similar to SplicePredictor, but it is specifically designed to look at longer stretches of genomic DNA, and it only aligns ESTs that closely match the genomic DNA sequence. GeneSeqer displays its output graphically at ZmDB. Gaps in the EST sequence that GeneSeqer fills in with genomic sequence are presumed to be introns. (Learn more about GeneSeqer).
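The "gaps are presumed introns" idea can be shown in a few lines. This sketch assumes the hard part, aligning the EST to the genomic DNA, has already been done and that the result is a list of genomic intervals matched by one EST. The function name and the coordinates are hypothetical, and the sketch is not GeneSeqer's implementation.

```python
def infer_introns(exon_blocks):
    """Given half-open genomic (start, end) intervals matched by one EST,
    return the unmatched gaps between consecutive intervals."""
    blocks = sorted(exon_blocks)
    introns = []
    for (_, prev_end), (next_start, _) in zip(blocks, blocks[1:]):
        if next_start > prev_end:  # genomic sequence with no EST counterpart
            introns.append((prev_end, next_start))
    return introns


# Invented coordinates: one EST matching three stretches of genomic DNA.
print(infer_introns([(100, 250), (400, 520), (900, 1000)]))
```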
Genomic DNA assembler (Still in progress)
When MGDP finds genes by RescueMu tagging (learn more), the output consists of multiple, overlapping copies of the same stretches of DNA -- called genomic survey sequences, or GSSs. By repeatedly sequencing the same pool of DNA, MGDP researchers make sure they find the genes they are looking for. But this redundancy also fills ZmDB and GenBank with numerous copies of the same or overlapping sequences. Bioinformatics tools can distill this mass of data down to its essence: the gene and all sources of information supporting its existence.
To extract constituent genes from MGDP's GSS data, Brendel's group is building a genomic DNA assembler like the ZmDB assembler for ESTs. The output, like that from GeneSeqer, will graphically display the various GSSs and ESTs that make up the gene, along with their GenBank numbers and information about the gene's possible function.
A Tour Through PhenotypeDB
A visit to PhenotypeDB is like a tour through a seed catalogue full of bizarre and unusual maize mutants. But, to maize geneticists, these mutants are more than curiosities; they are the tools of their trade. At PhenotypeDB, researchers can shop for the kinds of plants that will further their particular research goals. To see how a researcher might patronize this online store, try the following exercises:
Check out some tassel mutants:
A researcher interested in maize reproduction might want to peruse the bizarre male flowers photographed by the MGDP team. First, go to the ZmDB Mutant Phenotype Browser page. Scroll down to the area titled "Adult Mutant Phenotype Categories." Under "phenotype," select tassel; check the "picture" box, and click "Search." Up pops a table with photos of plants with mutant tassels. Click on a photo to enlarge it. The tassel of the Grid A, row 60, column 5 plant droops; that from Grid C, row 33, column 18 has no branches low down; the one from Grid C, row 34, column 4 has small or short tassel branches; and the Grid E, row 19, column 35 tassel is unbranched. The genes for these kinds of traits can help researchers understand maize flowering.
Find all dwarf mutants:
Rather than researching mutations in one plant part, a researcher might want to find all plants exhibiting a very specific mutation. To do so, she would start at the Phenotype Lists page of PhenotypeDB. This index defines the terms and abbreviations used to describe all the mutant phenotypes in the database. Select Adult Vegetative Abbreviations, and find the dwarf phenotypes. Clicking on the highlighted phenotype "dwf" transports you to the Locations Search page. Clicking on "start search" builds a table of all the plants carrying the dwf descriptor. From this result, a researcher could order seed for dwarf plants or refine the search to only specific types of dwarf plants.
Find plants with mutant ear, seedling and adult traits:
Some plants exhibit unusual traits at every stage of development: seed, seedling, and adult. Such plants interest researchers because, if a single mutation causes all the observed traits, the altered gene is likely important throughout the plant's lifetime. To find such plants, start at the Location Search page. Check the boxes for "Seed URM," "Seedling URM," and "Adult URM." A table pops up listing several plants that have unique recessive mutations at each of these stages. Click on the grid letter to see a description of the mutations. For definitions of the various abbreviations, hop to the Phenotype Lists page.
Access genetic sequence data for a mutant:
PhenotypeDB also connects genetic sequences to rows and columns of grid plants. In the Mutant Browser, under Seed and Ear Mutant Phenotype Categories, scroll to "Defective Size or Kernel," and Grid "G." Check the "picture" box, and click search. The resulting table shows defective kernels in Grid G (one of the grids for which sequencing has been completed). Clicking on the grid letter for one of the items opens a table describing that plant. At the bottom of that table, hit "Click to search all RescueMu GSS generated from this row." Up pops a list of several hundred GenBank identifiers. These are the genomic survey sequences (GSSs) found when the MGDP sequenced RescueMu-tagged DNA from leaves of plants in this row of Grid G (learn more).
Clicking on any of the GenBank numbers will bring up a table describing the source of that data. Click on the GenBank number in that page and you'll get the actual genetic sequence (Cs, Gs, As and Ts). This sequence can be copied and pasted into various database tools that will compare it to other sequences of interest. For example, a researcher interested in all waxy seeds might compare all the sequences for such seeds found in PhenotypeDB. Such a comparison might turn up a common sequence that explains the mutant trait.
In cases where columns have been sequenced (in addition to rows), researchers can match RescueMu-tagged DNA to a specific plant (a row/column address) -- useful information if the RescueMu tag caused an observed mutant trait (learn more).
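The row/column matching logic amounts to an intersection: a tagged sequence recovered from both a row pool and a column pool points to the plant where that row and column cross. The sketch below assumes that sequences can be compared by a simple identifier. The data layout, function name, and identifiers are hypothetical and do not reflect PhenotypeDB's actual schema.

```python
def locate_plants(row_pools, col_pools):
    """Map each tag seen in both a row pool and a column pool to its
    candidate (row, column) plant addresses."""
    rows_by_tag, cols_by_tag = {}, {}
    for row, tags in row_pools.items():
        for tag in tags:
            rows_by_tag.setdefault(tag, set()).add(row)
    for col, tags in col_pools.items():
        for tag in tags:
            cols_by_tag.setdefault(tag, set()).add(col)
    shared = rows_by_tag.keys() & cols_by_tag.keys()
    return {tag: sorted((r, c) for r in rows_by_tag[tag] for c in cols_by_tag[tag])
            for tag in shared}


# Invented pools: keys are row or column numbers, values are sequence identifiers.
rows = {19: {"gssA", "gssB"}, 33: {"gssC"}}
cols = {35: {"gssA"}, 18: {"gssC", "gssD"}}
print(locate_plants(rows, cols))
```

A tag found in several rows or several columns yields more than one candidate address, so the result lists every combination rather than guessing.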
Katherine Miller, a freelance science writer, contributed the text for this page to the Maize Gene Discovery Project.