Posts

Showing posts from August 2020
Taken from: https://playingwithgenomics.wordpress.com/2015/07/15/producing-a-bam-file-and-extracting-unique-reads-from-bowtie2-results/

I am always looking for ways to keep my disk usage down, especially since I’ve been mapping close to 100 ChIP-Seq files. I find that 1) piping Bowtie2 output into samtools to create a BAM file and 2) keeping only the uniquely mapped reads help a lot. Here is how I do those.

I’m dealing with single-end data here. You can modify the command if you are dealing with paired-end data. Placeholders in capitals (shown in red in the original post) should be substituted according to your data.

Converting Bowtie2 output to a BAM file

bowtie2 -x BOWTIE2_INDEX \
  -p NUM_THREADS \
  -U INPUT_FILE.fastq.gz | samtools view -bS -t \
  YOUR_GENOME_INDEX.fa.fai - > output.bam

Counting unique reads

You can exploit the XS tag, which Bowtie2 sets for reads that can be mapped in multiple places. Uniquely mapped reads therefore lack the XS tag. The following command excludes unmapped (
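The XS-tag idea described above can be sketched on plain SAM text. This is a minimal illustration, not the command from the original post: `filter_unique_sam` is a hypothetical helper name, and it assumes tab-separated SAM records on standard input, with the FLAG in column 2 and optional tags from column 12 onward.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): reads SAM text on stdin,
# keeps header lines, and drops reads that are unmapped or carry an XS:i: tag.
filter_unique_sam() {
    awk -F '\t' '
        /^@/ { print; next }              # header lines pass through unchanged
        int($2 / 4) % 2 == 1 { next }     # FLAG bit 0x4 set: read is unmapped
        {
            for (i = 12; i <= NF; i++)    # optional tags start at column 12
                if ($i ~ /^XS:i:/) next   # XS present: not uniquely mapped
            print
        }
    '
}

# Possible use with a BAM file (assumes samtools is installed):
#   samtools view -h output.bam | filter_unique_sam | samtools view -b - > unique.bam
```

Working on the text stream keeps the filter independent of any particular samtools version; the trade-off is that the BAM file has to be decoded to SAM and re-encoded around it.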

Platanus

platanus_trim xxx_1.fastq xxx_2.fastq

platanus assemble -o Pxut -f ./DRR02167[34]_[12].fastq 2> assemble.log

platanus scaffold -o Pxut -c Pxut_contig.fa -b Pxut_contigBubble.fa \
  -IP1 ./DRR021673_1.fastq ./DRR021673_2.fastq \
  -IP2 ./DRR021674_1.fastq ./DRR021674_2.fastq \
  -OP3 ./DRR021675_1.fastq ./DRR021675_2.fastq ./DRR021676_1.fastq ./DRR021676_2.fastq \
  2> scaffold.log

platanus gap_close -o Pxut -c Pxut_scaffold.fa \
  -IP1 ./DRR021673_1.fastq ./DRR021673_2.fastq \
  -IP2 ./DRR021674_1.fastq ./DRR021674_2.fastq \
  -OP3 ./DRR021675_1.fastq ./DRR021675_2.fastq ./DRR021676_1.fastq ./DRR021676_2.fastq \
  2> gapclose.log

Rearranging columns

 awk '{print $1,$2,$3,$4,$5,$6,$7,$8,$6}' clt_Uvol_align.sh > clt_Uvol_align2.sh
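To see what this kind of awk call does, it can help to try it on a single line first. The sketch below wraps the same field list in a function; `reorder_columns` is a hypothetical name used only for illustration, and the input is assumed to be whitespace-separated.

```shell
#!/bin/sh
# Hypothetical helper: prints fields 1 to 8 of each input line,
# then field 6 a second time, joined by single spaces.
reorder_columns() {
    awk '{ print $1, $2, $3, $4, $5, $6, $7, $8, $6 }'
}

# Try it on one line before running it on a whole file.
printf 'a b c d e f g h\n' | reorder_columns
# prints: a b c d e f g h f
```

Note that awk rejoins the fields with single spaces, so any original tabs or runs of spaces are not preserved, and fields beyond the eighth are dropped.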