Online Help for SiSeeR

Search Settings

Simple Sequence Repeats (SSRs) are tandem repeats of the same motif. How many copies of a motif must follow each other (Min Rep) is not fixed by definition, so this constraint is a user setting that depends on the motif length (Mot Len). In general, the longer the motif, the fewer repetitions are required. In the GUI, these settings are shown in the table on the main screen. All cells in the table are editable, so the settings can easily be changed. In addition, the add Row button adds further motif length / minimal repetition constraints (no limit is imposed on their number). If only specific motif length / minimal repetition pairs are to be searched for, the del Row button can be used to delete all unwanted constraints.
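As a rough illustration of how a Mot Len / Min Rep table can drive the search for perfect SSRs, the following Python sketch uses a regular expression with a back-reference. It is not SiSeeR's implementation: the table values, the function name find_perfect_ssrs, and the choice to skip motifs that are themselves repeats of a shorter unit are assumptions made for this example.

```python
import re

# Hypothetical settings table: motif length (Mot Len) -> minimal repeats (Min Rep).
MIN_REPEATS = {2: 6, 3: 4, 4: 3}

def find_perfect_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return (motif, start, end, repeats) tuples, 1-based inclusive coordinates."""
    sequence = sequence.upper()
    hits = []
    for mot_len, min_rep in sorted(min_repeats.items()):
        # One motif of mot_len letters followed by at least min_rep - 1 exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (mot_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Assumption: skip motifs that are repeats of a shorter unit (e.g. ACAC).
            if any(motif == motif[:k] * (mot_len // k)
                   for k in range(1, mot_len) if mot_len % k == 0):
                continue
            repeats = (match.end() - match.start()) // mot_len
            hits.append((motif, match.start() + 1, match.end(), repeats))
    return hits
```

With the table above, find_perfect_ssrs("GGACACACACACACACTTG") reports a single AC repeat with 7 copies at positions 3 to 16.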

The checkbox Search for approximate SSRs enables searching for imperfect SSRs, i.e. ones in which the tandem repeats may deviate from each other. We believe that the appropriate settings depend on motif length and required repeats, and therefore allow customization in the same table as for perfect SSRs.

Max Sub/Mot The maximal number of substitutions that a single repeat may have in comparison to the consensus repeat.

Max Sub/SSR The maximal number of deviations from the consensus repeat that all repeats in one SSR may have in total.

Max Gap Insertions and deletions are handled by treating potential partial or extended repeats as gaps. Max Gap is the maximal number of nucleotides that may lie between partial SSRs, each of which has at least Min Seed repeats.

Min Seed To save processing time, every part of a gapped SSR must have at least Min Seed repeats. Parts are only combined into a gapped SSR if their overall composition conforms to the Mot Len and Min Rep requirements.
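The four settings above can be sketched in Python as two small checks. This is only one possible reading of the descriptions, not SiSeeR's algorithm: the function names are hypothetical, the consensus motif is passed in rather than derived, and a merged SSR is assumed to conform when the summed repeats of its parts reach Min Rep.

```python
def within_substitution_limits(region, motif, max_sub_mot, max_sub_ssr):
    """Check a candidate region, cut into motif-sized repeats, against both limits."""
    mot_len = len(motif)
    total = 0
    for i in range(0, len(region) - mot_len + 1, mot_len):
        repeat = region[i:i + mot_len]
        subs = sum(1 for a, b in zip(repeat, motif) if a != b)
        if subs > max_sub_mot:      # Max Sub/Mot: limit for one repeat
            return False
        total += subs
    return total <= max_sub_ssr     # Max Sub/SSR: limit for the whole SSR

def merge_seeds(seeds, max_gap, min_seed, min_rep):
    """Merge seeds of one motif, given as (start, end, repeats), sorted by start.

    Coordinates are 1-based and inclusive. Seeds with fewer than min_seed
    repeats are ignored; neighbours separated by at most max_gap nucleotides
    are joined; merged SSRs with fewer than min_rep repeats are dropped.
    """
    merged = []
    for start, end, repeats in seeds:
        if repeats < min_seed:
            continue
        if merged and start - merged[-1][1] - 1 <= max_gap:
            prev_start, _, prev_repeats = merged[-1]
            merged[-1] = (prev_start, end, prev_repeats + repeats)
        else:
            merged.append((start, end, repeats))
    return [m for m in merged if m[2] >= min_rep]
```

For example, the region ACTACGACTAAT compared with the consensus ACT has one substitution in each of two repeats, so it passes with Max Sub/Mot 1 and Max Sub/SSR 2 but fails with Max Sub/SSR 1.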

Result Summary

For summarizing statistics or other output, the following settings may be useful:

Save only longest SSR When selected, only the longest SSR of each FASTA section is written to the _SSRs.fa and _results.tab files, instead of all SSRs found in that section.

Exclude SSRs ... When a number larger than 1 is entered, all SSRs that occur fewer than the given number of times in the available sequences are summarized under the motif 'Other' in _result_summary.tab.

Summarize complementary motives Not all motifs are actually distinct, even if their sequences look completely different. For example, AC and TG are equivalent because they are complementary. If this option is selected, such motifs are summarized together in _result_summary.tab.

Summarize circular motives Other motifs are equal up to a small shift. For example, ACT cannot be distinguished from CTA or TAC in an SSR with several repeats of any of these motifs, e.g. ACTACTACTACTAC. Selecting this option summarizes such motifs together; the results affect the file _result_summary.tab.
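The three summary options can be illustrated with a short Python sketch that maps each motif to a canonical key before counting. This is an assumption about how such a summary could work, not a description of SiSeeR's code: the names summary_key and summarize are hypothetical, the plain complement is used because the example above pairs AC with TG, and the 'Other' threshold is applied after equivalent motifs have been merged.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def rotations(motif):
    """All circular shifts of a motif, e.g. ACT -> ACT, CTA, TAC."""
    return [motif[i:] + motif[:i] for i in range(len(motif))]

def summary_key(motif, complementary=False, circular=False):
    """Pick the alphabetically smallest equivalent motif as the canonical key."""
    candidates = [motif]
    if complementary:
        candidates.append(motif.translate(COMPLEMENT))
    if circular:
        candidates = [r for c in candidates for r in rotations(c)]
    return min(candidates)

def summarize(motifs, min_count=1, complementary=False, circular=False):
    """Count motifs per canonical key; keys seen fewer than min_count times go to 'Other'."""
    counts = Counter(summary_key(m, complementary, circular) for m in motifs)
    summary = Counter()
    for key, n in counts.items():
        summary[key if n >= min_count else "Other"] += n
    return dict(summary)
```

With both options enabled and a minimum count of 3, the motifs AC, TG, AC, CTA, TAC, GGT collapse to three AC entries and three entries under 'Other'.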

Input

As input, either a FASTA formatted sequence can be pasted into the empty field, or one or more FASTA formatted files can be opened (CTRL-s; File-Open).

If the sequence was pasted into the field, a FASTA formatted file named after the date and time is created in the output directory.

Output

By default, the current directory is used and a warning is displayed if no directory has been selected. A directory can be selected by CTRL-s or File-Save.

Three files are created in the selected directory:

_results.tab Contains all detected SSRs in all sections of the FASTA formatted sequence. No summary or filtering is performed on this data.

_result_summary.tab Contains one summary per FASTA formatted file: all SSRs in that file are summarized and only statistics are presented.

_SSRs.fa Not strictly a FASTA formatted file, but apparently useful for downstream primer generation. It has the following format:

>Definition line as in the original sequence file

Motif: ACTT - SSR start: 1 - SSR end: 64 - Total length: 64 - Score: 64 - Mismatches: 0 - Search Mode: PerfectLength

The sequence as it appears in the original sequence file
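For downstream scripts, the annotation line shown above can be split into its fields with a few lines of Python. This is a minimal sketch that assumes the line always follows the "Key: value - Key: value" layout of the example; the function name parse_ssr_line is hypothetical.

```python
def parse_ssr_line(line):
    """Turn one annotation line from _SSRs.fa into a dict of field name -> value."""
    fields = {}
    for part in line.strip().split(" - "):
        key, _, value = part.partition(": ")
        # Numeric fields (start, end, length, score, mismatches) become integers.
        fields[key] = int(value) if value.isdigit() else value
    return fields

example = ("Motif: ACTT - SSR start: 1 - SSR end: 64 - Total length: 64 - "
           "Score: 64 - Mismatches: 0 - Search Mode: PerfectLength")
record = parse_ssr_line(example)
```

Here record["Motif"] is the string ACTT and record["SSR end"] is the integer 64.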

Running the Analysis

Pressing CTRL-a, choosing Actions-Analyze, or clicking the Analyze button runs the analysis.

When the analysis finishes, a dialog window appears that reports the duration in seconds, indicates whether any results were found, and tells you where to find them.