Highly sensitive whole-genome synteny alignments.

version 3.1.0

Satsuma is a whole-genome synteny alignment program. It takes two genomes, computes alignments, and then keeps only the parts that are orthologous, i.e. following the conserved order and orientation of features, such as protein coding genes, non-coding genes, or neutral sequences. Satsuma does not require any pre-processing, such as repeat masking, since it will automatically detect ambiguous mappings.

Satsuma has built-in parallelization and is designed to run on multi-core architectures. Aligning two bird-size genomes (~1.2 Gb) takes around two days on 24 CPUs.

If you use Satsuma in your research, please cite:
Grabherr, M. G., Russell, P., Meyer, M., Mauceli, E., Alföldi, J., Di Palma, F., & Lindblad-Toh, K. (2010). Genome-wide synteny through highly sensitive sequence alignment: Satsuma. Bioinformatics, 26(9), 1145-51.

Recent additions and developments

NEW: the Chromosembler!

Map your scaffolds or contigs onto chromosome coordinates via synteny! To do so, run

./Chromosemble -t <reference> -q <your_scaffolds> -o <output_dir>

The full list of options is:

-t<string> : target fasta file (in chromosome coordinates)
-q<string> : query fasta file (the assembly)
-o<string> : output directory
-n<int> : number of CPUs (for full Satsuma run) (def=25)
-thorough<bool> : runs a full Satsuma alignment (slow!!) (def=0)
-pseudochr<bool> : maps scaffolds into chromosomes (def=0)
-s<bool> : run SatsumaSynteny at the end (def=0)
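
For scripted runs, the options listed above can be assembled programmatically. The following is only a sketch and not part of Satsuma: the helper name build_chromosemble_cmd and the file names are hypothetical, and it assumes that boolean options take an explicit 1 after the flag (as the <bool> notation suggests). Check the program's own usage message before relying on it.

```python
def build_chromosemble_cmd(target, query, out_dir, cpus=None,
                           thorough=False, pseudochr=False, synteny=False,
                           binary="./Chromosemble"):
    """Return the argument list for a Chromosemble run (hypothetical helper)."""
    # Required options, as in the basic invocation shown above.
    cmd = [binary, "-t", target, "-q", query, "-o", out_dir]
    # -n: number of CPUs; omitted so the program default applies unless set.
    if cpus is not None:
        cmd += ["-n", str(cpus)]
    # Assumption: boolean options are switched on by passing the value 1.
    for flag, enabled in (("-thorough", thorough),
                          ("-pseudochr", pseudochr),
                          ("-s", synteny)):
        if enabled:
            cmd += [flag, "1"]
    return cmd


if __name__ == "__main__":
    # Placeholder file names; replace with your own reference and assembly.
    cmd = build_chromosemble_cmd("reference.fasta", "scaffolds.fasta",
                                 "chromosemble_out", cpus=8, pseudochr=True)
    print(" ".join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) to start the run. Options left at their defaults are omitted from the command line.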

By chromosembling your assembly, you will assign putative chromosome coordinates to your sequences, while preserving rearrangements to the degree possible. Publication and more details coming soon.

Production-quality chromosome figures in pretty colors: we added a tool, ./ChromosomePaint, to generate chromosomes painted by synteny. This is what one example based on two fish species looks like (in low resolution):
[Figure: Colored synteny]

Universal Genomic Coordinate Translator: to exploit the power of synteny to resolve orthologous genes and transcripts, check out our software Kraken, which maps GTF files across genomes without requiring all-to-all synteny maps.

Improve the connectivity of your assembly through synteny: the recently added (and still very experimental) tool ./MergeScaffoldsBySynteny maps scaffolds onto a reference genome while preserving inter- and intrachromosomal rearrangements. In preliminary tests, we could increase the N50 of contiguous sequences ('superscaffolds') by up to orders of magnitude for NGS-based genome assemblies.
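
For readers unfamiliar with the metric: N50 is the length L such that sequences of length L or longer contain at least half of all assembled bases. The sketch below illustrates that definition on made-up lengths. It is independent of MergeScaffoldsBySynteny and only shows why joining scaffolds raises the value.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("need at least one sequence length")


if __name__ == "__main__":
    # Made-up scaffold lengths, for illustration only.
    scaffolds = [100, 80, 60, 40, 20]
    # Same bases after joining the two longest scaffolds into one superscaffold.
    merged = [180, 60, 40, 20]
    print(n50(scaffolds), n50(merged))
```

Joining sequences leaves the total length unchanged but concentrates more bases in fewer, longer sequences, which is why N50 rises.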

Population genomics: if you are interested in finding regions under specific selective forces within or across populations, you might find a solution in another software package that we developed, Saguaro.

(c) 2010, 2014 by Manfred Grabherr, Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University, Uppsala, Sweden, in association with the Broad Institute, Cambridge, MA, USA.

