There are two directories here, each with data files and the commands to run ABySS and
then BWA, mapping the reads back against the ABySS contigs.  'staph' has a single set of
paired reads while 'staph-2' has two sets of reads.  See the *.sh batch files for how to
work with the data.  There is an ABySS batch file and several BWA batch files that use
different BWA algorithms and map against either the ABySS contigs or the originating genome.
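
As a rough orientation only (the *.sh batch files remain the reference), the overall
flow of assembling with ABySS and then mapping the reads back with BWA can be sketched
as below.  The file extensions, the assembly name and k=64 are assumptions made for
illustration, not values taken from the batch files.  The sketch only builds and prints
the command lines; it does not run anything.

```python
import shlex

# Hypothetical names: the extensions, assembly name and k-mer size are
# illustrative assumptions, not values taken from the *.sh batch files.
R1 = "2_test_R1.fastq"
R2 = "2_test_R2.fastq"
NAME = "staph"
K = 64


def abyss_command(name, k, r1, r2):
    """Command line for a paired-end ABySS assembly (abyss-pe)."""
    return ["abyss-pe", f"name={name}", f"k={k}", f"in={r1} {r2}"]


def bwa_commands(reference, r1, r2):
    """Index a reference, then align the read pair to it with 'bwa mem'."""
    return [
        ["bwa", "index", reference],
        ["bwa", "mem", reference, r1, r2],  # SAM output goes to stdout
    ]


if __name__ == "__main__":
    contigs = f"{NAME}-contigs.fa"  # abyss-pe writes contigs as <name>-contigs.fa
    for cmd in [abyss_command(NAME, K, R1, R2)] + bwa_commands(contigs, R1, R2):
        print(shlex.join(cmd))
```

Other BWA algorithms (for example 'bwa aln' followed by 'bwa sampe') would need their
own command lists, and mapping against the originating genome only changes the
reference argument.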

The data is "artificial" (it does not come from an actual experiment) but should mimic
real data.  The reads should assemble into ABySS contigs that resemble
Staphylococcus/aureus_USA300_FPR3757.


In the 'staph' directory:

2_test_R1 file: #=4,000,000    BPs=396,146,474    Range=30-101
2_test_R2 file: #=4,000,000    BPs=386,269,887    Range=30-101
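
Per-file summaries like the ones above could be recomputed with a small script such as
the following.  This is a minimal sketch, assuming that '#' is the number of reads, 'BPs'
the total number of bases and 'Range' the shortest and longest read length, and that the
files are plain four-line FASTQ.  The function name and the demo data are made up for
illustration.

```python
import io


def fastq_stats(handle):
    """Return (read count, total bases, shortest read, longest read).

    Assumes plain four-line FASTQ records (no wrapped sequence lines).
    """
    count = total = 0
    shortest = longest = None
    for line_number, line in enumerate(handle):
        if line_number % 4 != 1:  # the sequence is the 2nd line of each record
            continue
        length = len(line.strip())
        count += 1
        total += length
        shortest = length if shortest is None else min(shortest, length)
        longest = length if longest is None else max(longest, length)
    return count, total, shortest, longest


if __name__ == "__main__":
    # Tiny made-up example; for a real file pass an open file handle instead.
    demo = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGTAC\n+\nIIIIII\n")
    count, total, shortest, longest = fastq_stats(demo)
    print(f"#={count:,}    BPs={total:,}    Range={shortest}-{longest}")
    # prints: #=2    BPs=10    Range=4-6
```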

ABySS produced 181 contigs totaling 2,905,890 bases, including one contig of 883,493 bases.

7,969,807 (99.62%) of the reads back-mapped to the ABySS contigs.
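
The percentage follows from the read counts listed above: the two files hold 4,000,000
reads each, so 8,000,000 reads in total.  A short check of the arithmetic (the function
name is only illustrative):

```python
def percent_mapped(mapped, total):
    """Percentage of reads that mapped, as a float."""
    return 100.0 * mapped / total


if __name__ == "__main__":
    total_reads = 4_000_000 + 4_000_000  # R1 + R2 read counts
    print(f"{percent_mapped(7_969_807, total_reads):.2f}%")  # prints 99.62%
```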


In the 'staph-2' directory:

4_test_R1 file: #=8,000,000    BPs=792,369,346    Range=30-101
4_test_R2 file: #=8,000,000    BPs=771,713,765    Range=30-101

15,999,253 of these 16,000,000 reads mapped to the genome.

8_test_R1 file: #=4,000,000    BPs=397,125,146    Range=30-101
8_test_R2 file: #=4,000,000    BPs=388,109,886    Range=30-101

7,999,680 of these 8,000,000 reads mapped to the genome.

ABySS produced 46 contigs totaling 2,932,043 bases, including one contig of 885,328 bases.
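
For a sense of scale, the sequencing depth can be estimated from the base counts listed
above.  This is a rough sketch that uses the total assembly length as a stand-in for the
genome size, which is an assumption, so the result is approximate.

```python
def approx_coverage(base_counts, assembly_length):
    """Total sequenced bases divided by the assembly length."""
    return sum(base_counts) / assembly_length


if __name__ == "__main__":
    # Base counts of the four read files in this directory.
    bases = [792_369_346, 771_713_765, 397_125_146, 388_109_886]
    depth = approx_coverage(bases, 2_932_043)
    print(f"approximately {depth:.0f}x")  # about 800-fold
```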

23,994,168 (99.98%) of the reads back-mapped to the ABySS contigs.


Send me email at westerman@purdue.edu or call 40505 if you have questions.