Thursday, January 5, 2017: 10:30 AM
Reunion C (Hyatt Regency Dallas)
Single nucleotide polymorphism (SNP) development and marker-assisted selection for the advancement of important agronomic traits in cotton cultivars are hindered by sequence redundancy, including polyploidy and paleopolyploidy, repetitive sequence elements, and the low genetic diversity of G. hirsutum, especially among domesticated genotypes. Homeologous regions within the cotton genome make the identification of true SNPs challenging because short-read sequences are difficult to map unambiguously to their correct location, which can lead to the detection of false SNPs. In this study, we used a targeted sequencing approach (Capture-Seq) to preferentially sequence a 9.6 Mbp target region of the genome in 27 cotton cultivars, and aligned the resulting reads to Sanger SNP-containing BAC-end sequences. SNPs were called from the aligned sequences using a previously proven bioinformatic pipeline, and putative homeo-SNPs were removed by requiring 100% homozygous SNP identification in at least one of the 27 cultivar lines. SNP positions with read depths more than 2 standard deviations from the average read depth were removed to reduce the possibility of calling false SNPs in repetitive sequences. A total of 285,001 SNPs were identified among the 24 G. hirsutum cultivars, while 48,647 SNPs were found among the 3 G. barbadense cultivars. A total of 10,037 (3.0%) SNPs were found to be heterozygous. Of these SNPs, 3,571 (14.5%) were previously identified and are currently on the CottonSNP63K SNP array. These previously validated, in silico derived SNPs, along with the low heterozygosity, highlight the efficacy of this SNP-mining and genotyping approach in cotton.
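
The two filters described above (the read-depth cutoff and the homozygosity requirement) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study; the function name `filter_snps`, the record layout (`pos`, `depth`, `calls`), and the choice to treat the depth cutoff as symmetric around the mean are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def filter_snps(sites, n_sd=2.0):
    """Apply two illustrative filters to candidate SNP sites.

    Each site is a dict (hypothetical layout) with:
      'pos'   : position identifier
      'depth' : total read depth at the site
      'calls' : list of (alt_reads, total_reads) tuples, one per cultivar
    """
    depths = [s["depth"] for s in sites]
    mu = mean(depths)
    sd = pstdev(depths)  # population standard deviation over all sites

    kept = []
    for s in sites:
        # Filter 1: drop sites whose depth lies more than n_sd standard
        # deviations from the mean (assumed symmetric; unusually deep
        # sites may indicate collapsed repetitive sequence).
        if abs(s["depth"] - mu) > n_sd * sd:
            continue
        # Filter 2: keep the site only if at least one cultivar supports
        # the alternate allele with 100% of its reads. A site where every
        # cultivar shows a mix of alleles is treated as a putative
        # homeo-SNP and removed.
        if not any(total > 0 and alt == total for alt, total in s["calls"]):
            continue
        kept.append(s)
    return kept
```

A site where every cultivar shows mixed reads, such as `[(5, 10), (6, 12), (4, 9)]`, fails the second filter, while a site with one fully homozygous cultivar, such as `[(10, 10), (0, 11), (0, 9)]`, passes it provided its depth is within the cutoff.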