Xrate Pipeline
xrate comparative genome annotation pipeline
- (MANUAL STEP) repeat detection, softmasking
- (TODO) build Repeat Masker
- (TODO) install Piler
- (MANUAL STEP) gene prediction; ortholog assignment; anchoring; multiple alignment
- (SEMI-AUTOMATED STEP) xrate model training
- (AUTOMATED STEP) xrate sliding-window scan using Window Licker
- (IN DEVELOPMENT) estimate false positive rate via genome transducer simulation
- (AUTOMATED STEP) map scan back to genome; store in database
- Mercator Perl again
- (TODO) populate Bio Store or Chado Database
- (MANUAL STEP) interface to Generic Genome Browser
- (AUTOMATED STEP) fly-specific filters (Affy transfrags)
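The sliding-window scan and the map-back step listed above can be sketched roughly as follows. This is only an illustration of the idea, not the Window Licker code: the function names, the window size and step, the `-` gap character, and the forward-strand, 0-based coordinate convention are all assumptions made for the example.

```python
from typing import Iterator, List, Optional, Tuple


def sliding_windows(aln_len: int, size: int, step: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open [start, end) column ranges that tile an alignment.

    The last window is truncated at the alignment end (an assumption; a real
    scan might instead drop or pad short trailing windows).
    """
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    start = 0
    while start < aln_len:
        yield start, min(start + size, aln_len)
        if start + size >= aln_len:
            break
        start += step


def column_to_genome(row: str, seq_start: int, gap: str = "-") -> List[Optional[int]]:
    """Map each alignment column of one row to a 0-based genomic position.

    Gap columns map to None. Assumes the row lies on the forward strand and
    that its first residue sits at genomic position seq_start.
    """
    coords: List[Optional[int]] = []
    pos = seq_start
    for ch in row:
        if ch == gap:
            coords.append(None)
        else:
            coords.append(pos)
            pos += 1
    return coords


def window_to_genome(row: str, seq_start: int, start: int, end: int) -> Optional[Tuple[int, int]]:
    """Return the half-open genomic interval covered by columns [start, end).

    Returns None when the window contains only gaps in this row.
    """
    hits = [c for c in column_to_genome(row, seq_start)[start:end] if c is not None]
    if not hits:
        return None
    return hits[0], hits[-1] + 1
```

For example, `window_to_genome("AC--GT", 100, 1, 5)` covers columns 1 to 4 of the row and returns the genomic interval `(101, 103)`.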
Documentation
Related analysis
- RNA gene family analysis
- Homology-based predictions (Infernal Software, Rfam Database)
- Rate measurement (e.g. Fly Nc Rna)
- UTR regulatory elements (e.g. Fly Zipcodes)
-- Ian Holmes, Andrew Uzilov, Robert Bradley
Meeting notes
agenda 10/18/2007
preliminary outline (+timeline?) of the Drosophila screen paper
- experimental followup:
- PCR/sequencing/etc experiments; imminent visit to Sue Celniker
- targeted expression: Eric Lai, Pavel Tomancak
- modENCODE
- informatics followup:
- 100/300 prediction set: GFF, FASTA, Stockholm; put into GBrowse
- broad-trend GO analysis of gene-overlapping hits
- stemloc alignment/clustering of hit regions
- Avinash's simulator & false positive estimates
- estimate write/submit/publish schedule for Drosophila screen
- miscellaneous analysis questions: tetraloops/triloops? RNA-derived repeats? phylogenetic distribution? population variation? Drosophila vs. Anopheles, Bombyx?
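For the GFF export of the prediction set mentioned in the informatics followup, a minimal sketch of writing one hit as a GFF line might look like this. The source and feature labels (`xrate_scan`, `predicted_ncRNA`), the function name, and the 0-based half-open input convention are hypothetical placeholders, not names used by the pipeline; only the nine tab-separated GFF columns and the 1-based inclusive coordinates come from the format itself.

```python
from typing import Optional


def gff_line(
    seqid: str,
    start: int,
    end: int,
    score: float,
    strand: str = "+",
    source: str = "xrate_scan",        # hypothetical source label
    feature: str = "predicted_ncRNA",  # hypothetical feature type
    name: Optional[str] = None,
) -> str:
    """Format one hit as a tab-separated GFF line.

    Input coordinates are 0-based half-open [start, end); GFF uses 1-based
    inclusive coordinates, so only the start is shifted by one.
    """
    if not 0 <= start < end:
        raise ValueError("expected 0 <= start < end")
    if strand not in ("+", "-", "."):
        raise ValueError("strand must be '+', '-' or '.'")
    attributes = f"Name={name}" if name else "."
    fields = [
        seqid,
        source,
        feature,
        str(start + 1),
        str(end),
        f"{score:.2f}",
        strand,
        ".",  # phase is not meaningful for non-coding features
        attributes,
    ]
    return "\t".join(fields)
```

A hit on arm `2L` spanning the 0-based interval `[101, 103)` would be written with start 102 and end 103, which is the form a genome browser loader would typically expect.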
ideas for future runs of the pipeline
- desired enhancements:
- grammars & modeling
- pipeline & infrastructure
- browser
- applications to other species clades:
- fungal genomes
- nematodes
- mammals/vertebrates/eutheria
meeting with Sue & Joe, 10/18/2007
- enrich for longer hits, separate out putative short miRNAs
- post pipeline results on wiki