Alignment

From Unofficial BOINC Wiki


General

A (pairwise) alignment measures the similarity between two biological sequences under a mathematical scoring model. It also yields a topological result: an arrangement of the two sequences that shows the most probable evolutionary history between them, according to the model. The simplest assumption is that the two sequences evolved with as few editing steps as possible. Algorithms for computing sequence alignments are therefore closely related to minimal edit distance algorithms, with some adaptations: in particular, the differing substitution probabilities between amino acids enter the model via the Substitution Matrix, and gap penalties are introduced to avoid long or numerous stretches of gaps.
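To make the scoring model concrete, here is a minimal Python sketch that scores an alignment that has already been written out with gaps. The function name `score_alignment`, the `TOY` score table, and the gap penalty values are assumptions made up for illustration; they are not real BLOSUM or PAM values and not the parameters of any particular tool.

```python
def score_alignment(row_a, row_b, substitution, gap_open=-10, gap_extend=-1):
    """Score two equal-length gapped rows; '-' marks a gap position."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    total = 0
    gap_in = None  # which row the current gap run is in: "a", "b", or None
    for a, b in zip(row_a, row_b):
        if a == "-" and b == "-":
            raise ValueError("a column cannot be a gap in both rows")
        if a == "-" or b == "-":
            side = "a" if a == "-" else "b"
            # Opening a gap costs more than extending an existing one,
            # which discourages many short, scattered gaps.
            total += gap_extend if gap_in == side else gap_open
            gap_in = side
        else:
            total += substitution[(a, b)]
            gap_in = None
    return total


# Toy symmetric substitution scores (hypothetical values for illustration only).
TOY = {}
for x, y, s in [("A", "A", 4), ("K", "K", 5), ("R", "R", 5),
                ("K", "R", 2), ("A", "K", -1), ("A", "R", -1)]:
    TOY[(x, y)] = s
    TOY[(y, x)] = s

print(score_alignment("AKR", "AKR", TOY))
print(score_alignment("A--R", "AKKR", TOY))
```

Identical residues score highest, a plausible substitution (here K for R) still scores positively, and each run of gaps pays one opening penalty plus a smaller penalty per additional position.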

The alignment can be either local or global. In the former case the score is maximized over a local overlap of the two sequences; in the latter, over the full length of both. The optimal solution can be computed in O(n²) time for two sequences of length n. The optimal local alignment is computed by the Smith-Waterman algorithm, the optimal global alignment by the Needleman-Wunsch algorithm.
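The difference between the global and local case can be sketched with two short dynamic programming routines in Python. This is a simplified illustration, not a reference implementation: it assumes a plain match/mismatch score instead of a substitution matrix and a linear gap penalty, and the names `global_score` and `local_score` and the default values are chosen here for the example. Both routines fill a table of size (len(a)+1) x (len(b)+1), which is where the quadratic running time comes from.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch style score: both sequences are aligned end to end."""
    # Row 0: aligning a prefix of b against nothing costs one gap per residue.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            cur.append(max(prev[j - 1] + s,   # align a[i-1] with b[j-1]
                           prev[j] + gap,     # gap in b
                           cur[j - 1] + gap)) # gap in a
        prev = cur
    return prev[-1]


def local_score(a, b, match=1, mismatch=-1, gap=-2):
    """Smith-Waterman style score: best-scoring pair of substrings."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            # The extra 0 lets an alignment restart anywhere, so poorly
            # matching flanks do not drag the score down.
            cell = max(0, prev[j - 1] + s, prev[j] + gap, cur[j - 1] + gap)
            cur.append(cell)
            best = max(best, cell)
        prev = cur
    return best


print(global_score("XXGATTACAYY", "ZZGATTACAWW"))
print(local_score("XXGATTACAYY", "ZZGATTACAWW"))
```

For two sequences that share only a central region, the local score counts just that region, while the global score is reduced by the mismatching ends.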

An alignment basically comes with a topological overview and a raw alignment score, which may be processed further, for example into E-Values. An example alignment from the SIMAP Database:

966452           301 ERDKQSRISTLERLGNYGDASLGNLGSSQAGTPGVAGTRYRSTPGHRSNT    350
                     |||||:||||:|:   :|.::..:|      :.|. | ||||...||||.
1026209          301 ERDKQARISTFEK---FGSSTAASL------SEGN-G-RYRSNSAHRSNA    339