Questions and Answers
Q.: Where can I get the latest version of
Satsuma/SatsumaSynteny?
A.: Please check out a copy of our svn repository.
Q.: The executables won't run on my system. It complains
that glibc is missing.
A.: Please be sure to follow the instructions
for compilation and setup. If the executables you download do not
work on your system, do a 'make clean' followed by 'make', which
will rebuild the system using your server configuration. Also note
that you need gcc and csh installed.
Q.: Satsuma runs forever, even though my genomes are
small. What should I do?
A.: Please make sure to run SatsumaSynteny (not
Satsuma, which is the exhaustive version), and give it enough CPU
cores using the -n option.
Q.: How does Satsuma deal with N's in sequences? Will it
generate false hits?
A.: No, N's are ignored during alignment. Note, however,
that large chunks of N's will show up as 'white spots' in the
synteny map, as though they were specific to one genome.
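To anticipate where such 'white spots' might appear, one option is to list long runs of N's in a sequence before running the alignment. The following is a minimal sketch, not part of Satsuma: the function name `find_n_runs` and the `min_length` threshold are assumptions chosen for illustration, and the sequence is expected as a plain string.

```python
import re

def find_n_runs(sequence, min_length=1000):
    """Return (start, end) pairs (0-based, end-exclusive) for every run of
    N/n that is at least min_length bases long.

    min_length is a hypothetical threshold; pick one that suits your data.
    """
    pattern = re.compile(r"[Nn]{%d,}" % min_length)
    return [(m.start(), m.end()) for m in pattern.finditer(sequence)]

# Toy example with a low threshold so the run is easy to see.
toy = "ACGT" + "N" * 12 + "ACGTNNAC"
print(find_n_runs(toy, min_length=10))  # prints [(4, 16)]
```

The short run of two N's is below the threshold and is not reported.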
Q.: If I align a genome to itself, it does not come out
as a single alignment. Why is that?
A.: This is because SatsumaSynteny estimates the coordinates
of syntenic blocks, rather than performing a full detailed
alignment.
Q.: Do I have to repeat mask sequences? Does Satsuma do that
for me?
A.: You do not have to repeat mask sequences. One of the
strengths of Satsuma is that it can resolve
non-syntenic/non-orthologous sequences without any soft or hard
masking.
Q.: Is SatsumaSynteny only for genomes? I am looking
for orthologs of genes in transcript sequences that I generated from
RNA-Seq data. Can Satsuma do that also?
A.: For large-scale searches, you might be better off with
blast, but if you only have a limited number of sequences that you
want to align taking advantage of Satsuma's sensitivity and ease of
use, run Satsuma (not SatsumaSynteny), which will perform an
exhaustive search. Be aware that the runtime scales with n times m, with n
and m being the sizes of the sequences to be aligned.
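The n times m scaling can be made concrete with a little arithmetic. The sketch below only computes the size of the search space, taking "size" to mean the total number of bases on each side (an assumption). It does not predict wall-clock time, and all lengths are made-up illustration values.

```python
def search_space(query_lengths, target_lengths):
    """Return n * m, where n and m are the total bases in query and target."""
    n = sum(query_lengths)
    m = sum(target_lengths)
    return n * m

transcripts = [1_500, 2_200, 900]   # hypothetical transcript lengths (bp)
genome = [5_000_000]                # hypothetical 5 Mb target

small = search_space(transcripts, genome)
# Doubling both inputs quadruples the search space.
doubled = search_space([2 * x for x in transcripts], [2 * x for x in genome])
print(small, doubled // small)  # prints 23000000000 4
```

This is why the exhaustive search is best reserved for a limited number of sequences.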
Q.: I am planning on conducting a study involving dozens of
large genomes. Do I have to run all pairwise synteny alignments?
A.: Absolutely not! Please check out our Kraken software.
Q.: Does Satsuma come with any visualization software?
A.: Yes, absolutely. You can use MicroSyntenyPlot to
generate dot-plots, and ChromosomePaint to generate plots that show
synteny color-coded by the chromosome it hits.
Q.: What should I use as the target, and what as the query?
And does it matter?
A.: Yes, it matters in two ways. SatsumaSynteny searches the
space trying to maximize coverage in the target sequence, so that,
in principle, the target should be the more complete genome. In
practice, however, we found little difference, and it is perfectly
fine to use a finished genome, such as human or mouse, as target,
and an NGS draft assembly as the query. For the second, more important
issue, see the next question.
Q.: I am looking for duplicated regions. How should I do
that?
A.: By default, SatsumaSynteny matches sequences in the target
genome with sequences in the query genome, allowing for
one-to-many mappings from target to query, but not vice versa. If
you have duplications in your query, SatsumaSynteny will find them.
If you have duplications in both, please use the -dups option.
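The one-to-many behaviour described above can be illustrated with a small sketch. It assumes the matches have already been reduced to hypothetical (target block, query block) label pairs. This is not Satsuma's output format, and `one_to_many` is an illustrative helper, not part of the package.

```python
from collections import defaultdict

def one_to_many(pairs):
    """Group hypothetical (target_block, query_block) pairs by target block
    and keep only target blocks matched by more than one query block."""
    by_target = defaultdict(set)
    for target_block, query_block in pairs:
        by_target[target_block].add(query_block)
    return {t: sorted(q) for t, q in by_target.items() if len(q) > 1}

# Made-up labels: target block t1 is matched by two query blocks.
pairs = [("t1", "q1"), ("t1", "q7"), ("t2", "q2")]
print(one_to_many(pairs))  # prints {'t1': ['q1', 'q7']}
```

A target block that appears with several query blocks is a candidate duplication in the query.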
Q.: Can I align across a whole genome duplication event?
A.: Absolutely, but make sure that the duplicated genome is
the query, and the non-duplicated one is the target.
Q.: When I use Mizbee, the coordinates all seem off by a
factor of 10. Why is that?
A.: This is a workaround for an old bug in Mizbee, which
limits genome sizes to 2 Gb. Thus, coordinates will in some
cases be scaled down by a factor of 10.
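If you need to relate a scaled coordinate back to the original sequence, a rough conversion might look like the sketch below. It assumes the scaling was an integer division by 10, which is not confirmed here, so the original position can only be recovered as a range. The helper name `restore_coordinate` is hypothetical.

```python
def restore_coordinate(scaled, factor=10):
    """Map a scaled-down coordinate back to the approximate original range.

    Assumes the scaling was an integer division by `factor`, so the original
    position lies in [scaled * factor, scaled * factor + factor - 1].
    """
    low = scaled * factor
    return low, low + factor - 1

print(restore_coordinate(123_456))  # prints (1234560, 1234569)
```

Only apply this to files where the scaling actually took place.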