Some exercises to tryΒΆ

  1. Finish or re-run one of the tutorials that you’re particularly interested in.

  2. Download and assemble some reads from the Short Read Archive (SRA) or elsewhere.

    Remember that the ENA at EBI offers FASTQ downloads of SRA sequence.

    (see these tutorials: Understanding Read Formats and Quality Controlling Data and assembly-lab)

  3. Browse the NCBI Bacterial genomes site, download a proteome of interest (you’ll want the ‘.faa’ files), and run the various BLASTs on ‘em (e.g. find reciprocal best hits).

    See the instructions here: Basic EC2, command line, and BLAST.

  4. Map raw reads for the E. coli 0104 genome back to your assembly – e.g. take the assembly at http://athyra.idyll.org/~t/ecoli-v41.fa and map the QC reads to it (see Mapping with bwa).

    Bonus: load the resulting read mapping into Tablet.

    Bonus x 2: use the Prokka annotation .gff file to define features in Tablet. (If you didn’t get all the way through the Prokka run, you can download a zip file of the results here: http://athyra.idyll.org/~t/ecoli0104-prokka.zip)

  5. Assemble the provided ecoli reads without digital normalization.

  6. Install BLAST and the ngs-scripts, and then try comparing an assembly with a genome by using the following commands:

    blastall -i assembly.fa -d genome.fa -p blastn -e 1e-20 -o assembly.x.genome.blastn
    blastall -d assembly.fa -i genome.fa -p blastn -e 1e-20 -o genome.x.assembly.blastn
    
    python /usr/local/share/ngs-scripts/blast/calc-blast-cover.py genome.fa assembly.x.genome.blastn 300 assembly.fa
    python /usr/local/share/ngs-scripts/blast/calc-blast-cover.py assembly.fa genome.x.assembly.blastn 300 genome.fa
    
  7. Try taking one of the assemblies (e.g. http://athyra.idyll.org/~t/ecoli-v41.fa) and mapping the QC reads used to assemble it back to it, to assess assembly completeness.

comments powered by Disqus