SNP/INDEL discovery pipeline based on CAP3 assembly

Step 19: Generation of tab-delimited file with consensus sequences

The seqs_processor_007.py Python script takes EST or contig sequences in FASTA format as input and generates three output files:

1. a FASTA file in which every sequence is written on a single line (this type of output is useful for pattern searches with UNIX grep)
2. a file with the GC content of every sequence
3. a tab-delimited file with one sequence per row, where the first column is the sequence ID, the second column is the sequence length and the third column is the sequence itself.
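
The three kinds of output listed above can be sketched in Python as follows. This is only a minimal illustration and not the actual code of seqs_processor_007.py: the function names (parse_fasta, gc_percent, to_tab_row) are hypothetical, the sequence ID is assumed to be the first whitespace-delimited word of the FASTA header, and GC content is assumed to be reported as a percentage of sequence length.

```python
def parse_fasta(lines):
    """Yield (seq_id, sequence) pairs from an iterable of FASTA lines.

    Multi-line sequences are joined into a single string, which is
    what the "one line per sequence" output needs.
    """
    seq_id, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seq_id is not None:
                yield seq_id, "".join(chunks)
            # Assumption: the ID is the first word of the header line.
            parts = line[1:].split()
            seq_id, chunks = (parts[0] if parts else ""), []
        elif seq_id is not None:
            chunks.append(line)
    if seq_id is not None:
        yield seq_id, "".join(chunks)


def gc_percent(seq):
    """Return the percentage of G and C bases (assumed definition)."""
    if not seq:
        return 0.0
    s = seq.upper()
    return 100.0 * (s.count("G") + s.count("C")) / len(s)


def to_tab_row(seq_id, seq):
    """Format one row: sequence ID, sequence length, sequence."""
    return "%s\t%d\t%s" % (seq_id, len(seq), seq)


if __name__ == "__main__":
    example = [">Contig1 example header", "ACGT", "GGCC", ">Contig2", "AATT"]
    for seq_id, seq in parse_fasta(example):
        print(">%s\n%s" % (seq_id, seq))          # single-line FASTA record
        print("%s\t%.1f" % (seq_id, gc_percent(seq)))  # GC content
        print(to_tab_row(seq_id, seq))            # tab-delimited row
```

Keeping the parser separate from the formatters means the same records can feed all three outputs in one pass over the input.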

The contig_target_May_01_2003.pl Perl script and the Primer3 program need only the third, tab-delimited file.

In a UNIX shell, execute:

$ python seqs_processor_007.py tomato_ABC.fasta.cap.contigs tomato_ABC.fasta.cap.contigs.one_line DNA

This generates the file tomato_ABC.fasta.cap.contigs.one_line.tab, which can be used as input for contig_target_May_01_2003.pl and the Primer3 pipeline.
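
Before passing the tab-delimited file on, it may be worth checking that every row has the expected layout. The sketch below is an optional sanity check and not part of the original pipeline; the function name check_tab_lines is hypothetical, and it assumes the three-column layout described above (sequence ID, sequence length, sequence).

```python
def check_tab_lines(lines):
    """Return a list of (line_number, reason) for rows that look malformed."""
    problems = []
    for n, line in enumerate(lines, 1):
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != 3:
            problems.append((n, "expected 3 columns, found %d" % len(fields)))
            continue
        seq_id, length, seq = fields
        # The second column should be the length of the third column.
        if not length.isdigit() or int(length) != len(seq):
            problems.append((n, "length column does not match sequence"))
    return problems
```

To use it on the file generated in this step, open tomato_ABC.fasta.cap.contigs.one_line.tab and pass the file object to check_tab_lines; an empty list means no problems were found.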