MitoBim results
Hi there,
I've been using mitobim1.8 to reconstruct mitochondrial genomes from a bird using a related species.
It seems to have worked; however I don't quite understand the output.
In my last iteration file, there are a number of padded and unpadded fasta files... named, for example:
sample-reference-out-AllStrains.padded.fasta
sample-reference-out-AllStrains.unpadded.fasta
sample-reference-out-sample.padded.fasta
sample-reference-out-sample.unpadded.fasta
sample-reference-out-reference.padded.fasta
sample-reference-out-reference.unpadded.fasta
I understand the difference between the unpadded and padded, but not AllStrains vs sample vs reference. Could you please explain this to me? I think I've worked out that the AllStrains is being used as the backbone for the next iteration. However this confuses me, as it also appears that in the info folder from mira, in the assembly.txt file all the tags relate to the sample version.
Also, I was wondering if you have any tips for how to check the quality of the resulting assembly / finish the assembly? After converting to bam and examining it in a viewer, it *looks* ok, however is there a more rigorous way to do this?
Hi Kerensa,
I completely missed your question - terribly sorry! I guess it's a bit late, but I'll answer it now anyway.
Normally you will be interested in `sample-reference-out-sample.unpadded.fasta`.
Files with the `AllStrains` tag represent a consensus between your sample and the backbone used in the current iteration. This is default MIRA behaviour. With MITObim, the `AllStrains` and `sample` files should be pretty much identical. `sample-reference-out-sample.unpadded.fasta` from iteration n serves as the backbone for iteration n+1. The only potential difference is that any IUPAC ambiguity characters in the result are resolved using majority-rule base calls before the sequence is used as the backbone.
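If you want to confirm that the two files really are near-identical, a small script can do it. The following is only a minimal sketch in plain Python, not part of MITObim or MIRA; the function names `parse_fasta`, `count_ambiguities` and `first_differences` are made up for illustration. It reads FASTA records, counts IUPAC ambiguity characters, and lists the first mismatching positions between two sequences. The position-by-position comparison is only meaningful if both sequences have the same length.

```python
from collections import Counter

# IUPAC nucleotide codes that stand for more than one base
AMBIGUITY_CODES = set("RYSWKMBDHVN")


def parse_fasta(lines):
    """Return {header: sequence} from an iterable of FASTA lines.

    Sequences are upper-cased; the header is the text after '>'.
    """
    records = {}
    name = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {header: "".join(parts) for header, parts in records.items()}


def count_ambiguities(seq):
    """Count each IUPAC ambiguity character that occurs in seq."""
    return Counter(base for base in seq.upper() if base in AMBIGUITY_CODES)


def first_differences(seq_a, seq_b, limit=10):
    """Return up to `limit` mismatches as (1-based position, base_a, base_b).

    Only the overlapping part of the two sequences is compared.
    """
    diffs = []
    for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1):
        if a != b:
            diffs.append((pos, a, b))
            if len(diffs) == limit:
                break
    return diffs
```

To use it, open the `AllStrains` and `sample` unpadded files from your last iteration, pass each file handle to `parse_fasta`, and then compare the resulting sequences with `first_differences` and `count_ambiguities`. If the lengths differ, a proper pairwise alignment is the better tool.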
In terms of quality checks: a manual check for obvious conflicts in the assembly is a good start and is usually feasible for relatively small animal mt genomes. You might also want to check circularity; the next MITObim version will do that automatically. More formal checks could be:
- Run automated annotation (e.g. with MITOS or DOGMA) and check that all expected mt genes are present and intact.
- Use assembly validation tools such as REAPR or QUAST.
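For the circularity check, one simple approach is to look for a stretch at the end of the contig that repeats its beginning, which is what you would expect if the assembly has extended once around a circular genome. The following is a rough sketch in plain Python under that assumption, not the method MITObim itself uses; `find_circular_overlap`, `trim_circular` and the `min_overlap` default are hypothetical names and values chosen for illustration. It only finds exact matches, so a single mismatch or ambiguity character in the overlap will hide it, and it only searches overlaps up to half the sequence length.

```python
def find_circular_overlap(seq, min_overlap=20):
    """Return the length of the longest suffix of seq that exactly equals
    a prefix of seq, or 0 if no such overlap is at least min_overlap long.

    Only overlaps up to half the sequence length are considered.
    """
    seq = seq.upper()
    max_len = len(seq) // 2
    # Try the longest candidate first so the first hit is the longest overlap
    for length in range(max_len, min_overlap - 1, -1):
        if seq[-length:] == seq[:length]:
            return length
    return 0


def trim_circular(seq, overlap):
    """Drop the redundant end so that each base of the circle occurs once."""
    return seq[: len(seq) - overlap] if overlap > 0 else seq
```

If `find_circular_overlap` returns a non-zero length for your final contig, the contig likely closes on itself and `trim_circular` gives a single-copy sequence. A result of 0 does not prove the genome is incomplete; it may just mean the ends overlap imperfectly, so it is worth checking the read mapping across the junction by eye as well.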
Hope that helps! Sorry again for the long delay!
Thanks!
Christoph