Xrate Pipeline

From Biowiki

xrate comparative genome annotation pipeline

  1. (MANUAL STEP) repeat detection, softmasking
  2. (MANUAL STEP) gene prediction; ortholog assignment; anchoring; multiple alignment
  3. (SEMI-AUTOMATED STEP) xrate model training
  4. (AUTOMATED STEP) xrate sliding-window scan using Window Licker
  5. (IN DEVELOPMENT) estimate false positive rate via genome transducer simulation
  6. (AUTOMATED STEP) map scan back to genome; store in database
  7. (AUTOMATED STEP) fly-specific filters (Affy transfrags)
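
The sliding-window scan and the step that maps the scan back to the genome can be illustrated with a minimal Python sketch. This is not Window Licker's actual code or interface: the function names, the half-open column ranges, the gap characters, and the forward-strand, 1-based coordinate convention are all assumptions made for illustration.

```python
from typing import Iterator, List, Optional, Tuple


def sliding_windows(aln_len: int, width: int, step: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open [start, end) alignment-column ranges covering the alignment."""
    if width <= 0 or step <= 0:
        raise ValueError("width and step must be positive")
    start = 0
    while start < aln_len:
        yield start, min(start + width, aln_len)
        if start + width >= aln_len:
            break  # the last window already reaches the end of the alignment
        start += step


def column_to_genome(ref_row: str, ref_start: int,
                     gap_chars: str = "-.") -> List[Optional[int]]:
    """Map each alignment column to a 1-based genome coordinate of the
    reference row (None for gap columns). Assumes the forward strand."""
    coords: List[Optional[int]] = []
    pos = ref_start
    for ch in ref_row:
        if ch in gap_chars:
            coords.append(None)
        else:
            coords.append(pos)
            pos += 1
    return coords


def window_to_genome(coords: List[Optional[int]], start: int,
                     end: int) -> Optional[Tuple[int, int]]:
    """Return the inclusive genome interval spanned by a window of columns,
    or None if the reference row is all gaps in that window."""
    hit = [c for c in coords[start:end] if c is not None]
    if not hit:
        return None
    return hit[0], hit[-1]


# Hypothetical reference row of a multiple alignment, starting at genome position 101.
ref_row = "AC--GTAC-G"
coords = column_to_genome(ref_row, ref_start=101)
windows = list(sliding_windows(len(ref_row), width=4, step=3))
intervals = [window_to_genome(coords, s, e) for s, e in windows]
```

Keeping a per-column coordinate table makes the back-mapping a simple lookup, at the cost of memory proportional to the alignment length; a reverse-strand reference would need the coordinates to decrease instead.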


Related analysis

-- Ian Holmes, Andrew Uzilov, Robert Bradley

Meeting notes

agenda 10/18/2007

preliminary outline (+timeline?) of the Drosophila screen paper

  1. experimental follow-up:
    • PCR/sequencing/etc. experiments; imminent visit to Sue Celniker
    • targeted expression: Eric Lai, Pavel Tomancak
    • modENCODE
  2. informatics follow-up:
    • 100/300 prediction set: GFF, FASTA, Stockholm; put into GBrowse
    • broad-trend GO analysis of gene-overlapping hits
    • stemloc alignment/clustering of hit regions
    • Avinash's simulator & false positive estimates
  3. estimate write/submit/publish schedule for Drosophila screen
    • misc analysis questions: tetra/triloops? RNA-derived repeats? phylogenetic distribution? population variation? Drosophila vs Anopheles, Bombyx?
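
For the GFF export of the prediction set listed under the informatics items above, a hit could be written as one nine-column, tab-separated line. The sketch below is only an illustration: the source label "xrate", the feature type "region", and the ID attribute are placeholder choices, not values taken from the pipeline.

```python
def gff_line(seqid: str, start: int, end: int, score: float,
             strand: str, hit_id: str,
             source: str = "xrate", ftype: str = "region") -> str:
    """Format one hit as a GFF3-style line (1-based, inclusive coordinates).
    The source and ftype defaults are hypothetical placeholders."""
    if start > end:
        raise ValueError("start must not exceed end")
    if strand not in ("+", "-", "."):
        raise ValueError("strand must be '+', '-' or '.'")
    fields = [
        seqid, source, ftype,
        str(start), str(end),
        f"{score:g}",   # compact numeric formatting
        strand,
        ".",            # phase is only meaningful for CDS features
        f"ID={hit_id}",
    ]
    return "\t".join(fields)


# Hypothetical hit on chromosome arm 2L.
line = gff_line("chr2L", 100, 250, 12.5, "+", "hit1")
```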

ideas for future runs of the pipeline

  1. desired enhancements:
    • grammars & modeling
    • pipeline & infrastructure
    • browser
  2. applications to other species clades:
    • fungal genomes
    • nematodes
    • mammals/vertebrates/eutheria

meeting with Sue & Joe, 10/18/2007

  • enrich for longer hits, separate out putative short miRNAs
  • post pipeline results on wiki
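
Separating longer hits from putative short miRNAs could be done with a simple length split, sketched below. The Hit record, the function name, and the 50-nt cutoff in the example are hypothetical; the notes do not specify a threshold.

```python
from typing import List, NamedTuple, Tuple


class Hit(NamedTuple):
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def split_by_length(hits: List[Hit],
                    min_long_len: int) -> Tuple[List[Hit], List[Hit]]:
    """Partition hits into (long, short) by genomic length.
    Hits of at least min_long_len nucleotides count as long."""
    long_hits: List[Hit] = []
    short_hits: List[Hit] = []
    for h in hits:
        length = h.end - h.start + 1
        if length >= min_long_len:
            long_hits.append(h)
        else:
            short_hits.append(h)
    return long_hits, short_hits


# Made-up example hits; the cutoff of 50 is a placeholder, not a value from the meeting.
hits = [Hit("a", 1, 30), Hit("b", 100, 250), Hit("c", 10, 59)]
long_hits, short_hits = split_by_length(hits, min_long_len=50)
```

Length alone cannot tell a miRNA from any other short hit, so the short set is best treated as a candidate pool for further filtering.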