Friday, January 12, 2007 - 1:30 PM

Survey of 42,000 Gossypium hirsutum cv. Maxxa BAC-End Sequences and Frequency, Type, and Annotation of BAC-derived SSRs

Michael B. Palmer1, James Frelichowski2, Dorrie Main1, Mauricio Ulloa3, Roy Cantrell4, Stephen Ficklin1, and Jeffrey P. Tomkins1. (1) Clemson University, 51 New Cherry St, 300 BRC, Clemson, SC 29634, (2) USDA-ARS, 17053 N. Shafter Ave., Shafter, CA 93263, (3) USDA-ARS, WICS Res. Unit, Cotton Enhancement Program, 17053 N. Shafter Ave., Shafter, CA 93263, (4) Cotton Incorporated, 6399 Weston Parkway, Cary, NC 27513

The search for additional molecular markers is a major initiative in cotton, which lags behind crops such as soybean, maize, and rice in marker development. In an effort to increase the number of microsatellite markers in Gossypium, BAC-end sequences from a publicly available Gossypium hirsutum cv. Maxxa BAC library (Tomkins 2001) were mined for microsatellites, or SSRs. Mononucleotide repeats were not included in the analysis. The minimum number of repeats accepted for each motif class was as follows: 5 for dinucleotide repeats, 4 for trinucleotide repeats, 3 for tetranucleotide repeats, 3 for pentanucleotide repeats, and 3 for hexanucleotide repeats. BAC clones were sequenced from both ends, tested for redundancy, and screened against the GenBank non-redundant protein and MIPS Arabidopsis databases. Further annotation was performed using the gene ontology terms associated with matching sequences in the SwissProt database, and by scanning the Integrated resource of Protein Families, Domains and Sites (InterPro). The sequences were then submitted to GenBank. The GenBank-submitted sequences were re-analyzed at a higher Phred stringency, and the resulting high-quality sequences were mined for microsatellites.
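The repeat thresholds above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study, which is not named here; the function names find_ssrs and is_reducible are hypothetical, and the regular-expression approach is an assumption chosen for brevity. Motifs that are themselves repeats of a shorter unit (for example ATAT, or any homopolymer) are skipped so that mononucleotide runs are excluded and a dinucleotide repeat is not counted again as a tetranucleotide or hexanucleotide repeat.

```python
import re

# Minimum repeat counts per motif length, as stated in the abstract.
MIN_REPEATS = {2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_reducible(motif):
    """Return True if the motif is a repeat of a shorter unit (e.g. 'ATAT', 'AAA')."""
    n = len(motif)
    return any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq):
    """Return a sorted list of (start, motif, repeat_count) for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # One motif of the given size followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_reducible(motif):
                continue  # homopolymer, or already covered by a shorter motif class
            hits.append((m.start(), motif, len(m.group(0)) // size))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical read, not data from the study.
    print(find_ssrs("GGC" + "AT" * 5 + "CCG"))
```

This sketch reports perfect repeats only and scans each motif size independently, so a repeat that begins inside a skipped homopolymer run can be missed; a production tool would also handle compound and interrupted SSRs.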

From 38,000 high-quality sequences, approximately 7,000 microsatellites were developed. These sequences were analyzed for the type of repeat motif, GC content, frequency, the frequency of each motif among all microsatellites, and the presence of open reading frames. Primer3 was used to design primers in the sequences flanking the SSRs. Dinucleotide and tetranucleotide repeats made up 68% of the microsatellites. The genomic SSRs should improve marker saturation of the cotton genome and allow SSRs to be anchored to genomic clones, aiding in the reconciliation of extant genetic maps with physical maps. The data are stored at and accessible from two websites: the Clemson University Genomics Institute (http://www.genome.clemson.edu/projects/cotton/) and the Cotton Microsatellite Database (http://www.cottonssr.org).
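As an illustration of the summary statistics mentioned above, the following sketch computes GC content for a sequence and the share of each motif class among a set of detected SSR motifs. The helper names gc_content and motif_class_shares are hypothetical, and the example input is made up for demonstration; it is not data from the study.

```python
from collections import Counter

# Motif length mapped to the repeat class names used in the abstract.
CLASS_NAMES = {2: "di", 3: "tri", 4: "tetra", 5: "penta", 6: "hexa"}


def gc_content(seq):
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def motif_class_shares(motifs):
    """Fraction of SSRs in each motif class, given a list of motif strings."""
    counts = Counter(CLASS_NAMES[len(m)] for m in motifs)
    total = sum(counts.values())
    return {name: counts[name] / total for name in counts}


if __name__ == "__main__":
    # Made-up motifs for demonstration only.
    example = ["AT", "AT", "GAA", "AAAT"]
    print(motif_class_shares(example))
    print(gc_content("ATGC"))
```

A combined figure such as the share of dinucleotide plus tetranucleotide repeats is then the sum of the corresponding entries in the returned dictionary.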

