The Sclerotinia sclerotiorum Whole Genome Sequencing Project

The genome of the filamentous ascomycete Sclerotinia sclerotiorum has been sequenced by the Broad Institute through a project sponsored by the USDA Microbial Genomics Program. The Principal Investigators on this project are Drs. Christina Cuomo at MIT’s Broad Institute, Martin Dickman at Texas A&M University, Linda Kohn at the University of Toronto, and Jeffrey Rollins at the University of Florida.

The 8X genome sequence assembly was first released publicly to the NCBI trace archives on April 25, 2005. The first draft assembly with annotation was released on October 1, 2005 on the Broad Web Server.

This is the first public release of an assembled genome from the Leotiales, and the first from a broad-host-range necrotrophic phytopathogenic fungus. Recent efforts sponsored by the European Community to sequence the genome of Botryotinia fuckeliana (Botrytis cinerea) offer an unprecedented opportunity for comparative genomics in evolutionary and functional biology. A joint community manual annotation project is under development to maximize the resources of the Botrytis and Sclerotinia communities.

The specific objectives of this project were to:

  1. Produce a high quality draft (HQD) sequence of the Sclerotinia sclerotiorum genome with an average depth of ~7X in Q20 bases and ~62X physical coverage in the assembly.
  2. Assemble the Sclerotinia genome sequence using the Arachne assembly program (Batzoglou et al. 2002).
  3. Sequence both ends of 12,000 cDNAs (6,000 from each of two different libraries). Align EST sequences to the genome and use this information as a training set to refine the gene prediction model and computational annotation of all genes to increase the accuracy of the S. sclerotiorum annotation.
  4. Make the assembly and annotation of Sclerotinia publicly available through the CGR web site at http://www.broad.mit.edu, which will allow users to:
    • Download the entire genome, protein set, or portions
    • Perform nucleotide or protein BLAST searches
    • Interactively search for predicted genes based on name, location, homology information, protein domain, and multigene family
    • Graphically view the sequence annotated with genes, protein families, and regions of similarity to known sequences

To date we have:

  1. Sequenced the genome to a depth of 8.8X. This included 394,368 paired reads from a 4 kb genomic library, 117,504 paired reads from a 10 kb genomic library, and 69,120 paired reads from a BAC library.
  2. Sequenced both the 5’ and 3’ ends of 31,298 clones from three independent cDNA libraries. For the developing sclerotia library (SS_G781_seq.fasta), there are 17,533 sequences from 9,497 clones (8,036 clones have both ends sequenced). For the mycelia library (SS_G786_seq.fasta), there are 18,885 sequences from 10,566 clones (8,319 clones have both ends sequenced). For the developing apothecia library (SS_G787_seq.fasta), there are 21,333 sequences from 11,235 clones (10,098 clones have both ends sequenced). These sequence reads are available for download from the Sclerotinia Genome Web Site.
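
The cDNA tallies above can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not part of the project's pipeline. It assumes that every clone contributed at least one read and that a clone with both ends sequenced contributed exactly two, so the number of sequences in a library should equal the number of clones plus the number of both-end clones. The dictionary labels and the helper name expected_sequences are made up for this example; the counts are copied from the text.

```python
# Per-library counts quoted in the text:
# (sequences, clones, clones with both ends sequenced)
libraries = {
    "SS_G781 developing sclerotia": (17533, 9497, 8036),
    "SS_G786 mycelia": (18885, 10566, 8319),
    "SS_G787 developing apothecia": (21333, 11235, 10098),
}


def expected_sequences(clones, both_ends):
    """Assumption: one read per clone, plus a second read for each
    clone that was sequenced from both ends."""
    return clones + both_ends


for name, (sequences, clones, both_ends) in libraries.items():
    # The reported sequence count should match the assumed relationship.
    assert expected_sequences(clones, both_ends) == sequences, name
    fraction = both_ends / clones
    print(f"{name}: {fraction:.1%} of clones have both ends sequenced")

total_clones = sum(clones for _, clones, _ in libraries.values())
total_sequences = sum(sequences for sequences, _, _ in libraries.values())
print("total clones:", total_clones)        # 31298, as stated in the text
print("total sequences:", total_sequences)
```

Under that assumption the three libraries are internally consistent, and the per-library clone counts sum to the 31,298 clones reported above.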

Updated 28 May 2007 by Jeffrey Rollins.