Assembly Scavenger Hunt
=======================

This exercise is meant to bring together knowledge from the whole week,
and also just be fun. I've taken some pain text, embedded it in
DNA with a simple algorithm along with some random sequence,
and fragmented it to produce reads. Your job will be to assemble
the reads and put the results back through the script to retrieve the 
original message.

We'll need a few things::

    cd /mnt
    mkdir scavenger-hunt
    cd scavenger-hunt

    curl -O http://athyra.idyll.org/~cswelcher/assembly-scavenger-hunt/reads/reads.svZjxD/scavenger_reads.fa
    curl -O http://athyra.idyll.org/~cswelcher/dna2text.py
    curl -O http://athyra.idyll.org/~cswelcher/dnatextutils.py

You should have velvet already, but if not::
    
    cd /root
    curl -O http://www.ebi.ac.uk/~zerbino/velvet/velvet_1.2.10.tgz
    tar xzf velvet_1.2.10.tgz
    cd velvet_1.2.10
    make MAXKMERLENGTH=51
    cp velvet? /usr/local/bin

    cd /mnt/scavenger-hunt

Of course, sequencing chemistry is always improving, so you may want to use::

    curl -O http://athyra.idyll.org/~cswelcher/assembly-scavenger-hunt/reads/reads.Q7XSSZ/scavenger_reads.fa

You'll want to use velvet to assemble ``scavenger_reads.fa``. They're 36-base singled-ended
reads, and like an actual metagenome, have variable coverage. This means that you might need
to do some parameter exploration to get the contigs you want out of it; I would recommend
looking at the ``exp_cov`` parameter of ``velvetg`` in particular.

To decode your results, make use of the dna2text.py script. It's usage is::

    python dna2text.py contigs.fa > contigs.text

Which you can then look at with ``less``::
    
    less contigs.text

Further, you might want to make use of an ipython notebook which plots
k-mer abundance distributions, and could help you with parameters::

    curl http://2013-caltech-workshop.readthedocs.org/en/latest/_static/caltech-2013-scavenger-hunt.ipynb > /usr/local/notebooks/2013-caltech-scavenger-hunt.ipynb

Which you can then access by going to https://ec2-???????????.compute-1.amazonaws.com.

Happy hunting!