📄 clustalx.html

📁 clustalw, a classic bioinformatics multiple sequence alignment tool
<P>RESIDUE EXCEPTION CUTOFF: this is a scalar value from 1 to 10, which can be used to change the number of residue exceptions which are highlighted in the alignment display. (For an explanation of this cutoff, see the CALCULATION OF RESIDUE EXCEPTIONS section below.)</P>
<P>PROTEIN WEIGHT MATRIX: the scoring table which describes the similarity of each amino acid to every other.</P>
<P>DNA WEIGHT MATRIX: two hard-coded matrices are available: IUB and CLUSTALW(1.6).</P>
<P>For more information about the weight matrices, see the help above for the Low-scoring Segments Weight Matrix.</P>
<P>For details of the quality score calculations, see the CALCULATION section below.</P>
<P></P>
<P><STRONG>SHOW LOW-SCORING SEGMENTS</STRONG></P>
<P>The low-scoring segment display can be toggled on or off. This option does not recalculate the profile scores.</P>
<P></P>
<P><STRONG>SHOW EXCEPTIONAL RESIDUES</STRONG></P>
<P>This option highlights individual residues which score badly in the alignment quality calculations. Residues which score exceptionally low are highlighted by using a white character on a grey background.</P>
<P><STRONG>SAVE QUALITY SCORES TO FILE</STRONG></P>
<P>The quality scores that are plotted underneath the alignment display can also be saved in a text file. Each column in the alignment is written on one line in the output file, with the value of the quality score at the end of the line. Only the sequences currently selected in the display are written to the file. One use for quality scores is to color residues in a protein structure by sequence conservation. In this way conserved surface residues can be highlighted to locate functional regions such as ligand-binding sites.</P>
<P></P>
<P><H3>CALCULATION OF QUALITY SCORES</H3></P>
<P>Suppose we have an alignment of m sequences of length n. Then, the alignment can be written as:</P>
<P><PRE>        A11 A12 A13 .......... A1n
        A21 A22 A23 .......... A2n
        .
        .
        Am1 Am2 Am3 .......... Amn</PRE></P>
<P>We also have a residue comparison matrix of size R where C(i,j) is the score for aligning residue i with residue j.</P>
<P>We want to calculate a score for the conservation of the jth position in the alignment.</P>
<P>To do this, we define an R-dimensional sequence space. For the jth position in the alignment, each sequence consists of a single residue which is assigned a point S in the space. S has R dimensions, and for sequence i, the rth dimension is defined as:</P>
<P><PRE>	Sr =    C(r,Aij)</PRE></P>
<P>We then calculate a consensus value for the jth position in the alignment. This value X also has R dimensions, and the rth dimension is defined as:</P>
<P><PRE>	Xr = (   SUM   (Fij * C(i,r)) ) / m
	       1<=i<=R</PRE></P>
<P>where Fij is the count of residues i at position j in the alignment.</P>
<P>Now we can calculate the distance Di between each sequence i and the consensus position X in the R-dimensional space.</P>
<P><PRE>	Di = SQRT   (   SUM   (Xr - Sr)(Xr - Sr) )
	              1<=r<=R</PRE></P>
<P>The quality score for the jth position in the alignment is defined as the mean of the sequence distances Di.</P>
<P>The score is normalised by multiplying by the percentage of sequences which have residues (and not gaps) at this position.</P>
<P><H3>CALCULATION OF RESIDUE EXCEPTIONS</H3></P>
<P>The jth residue of the ith sequence is considered as an exception if the distance Di of the sequence from the consensus value X is greater than (Upper Quartile + Inter Quartile Range * Cutoff). The value used as a cutoff for displaying exceptions can be set from the SCORE PARAMETERS menu. A high cutoff value will only display very significant exceptions; a low value will allow more, less significant, exceptions to be highlighted.</P>
<P>(NB.
Sequences which contain gaps at this position are not included in the exception calculation.)</P>
<P></P>
<P><H3>CALCULATION OF LOW-SCORING SEGMENTS</H3></P>
<P>Suppose we have an alignment of m sequences of length n. Then, the alignment can be written as:</P>
<P><PRE>        A11 A12 A13 .......... A1n
        A21 A22 A23 .......... A2n
        .
        .
        Am1 Am2 Am3 .......... Amn</PRE></P>
<P>We also have a residue comparison matrix of size R where C(i,j) is the score for aligning residue i with residue j.</P>
<P>We calculate sequence weights by building a neighbour-joining tree, in which branch lengths are proportional to divergence. Summing the branches by branch ownership provides the weights. See Thompson et al., CABIOS, 10, 19 (1994) and Henikoff et al., JMB, 243, 574 (1994).</P>
<P>To find the low-scoring segments in a sequence Si, we build a weighted profile of the remaining sequences in the alignment. Suppose we find residue r at position j in the sequence; then the score for the jth position in the sequence is defined as</P>
<P><PRE>	Score(Si,j) = Profile(j,r)   where Profile(j,r) is the profile score
	                             for residue r at position j in the
	                             alignment.</PRE></P>
<P>These residue scores are summed along the sequence in both forward and backward directions.
If the sum of the scores is positive, then it is reset to zero. Segments which score negatively in both directions are considered as 'low-scoring' and will be highlighted in the alignment display.</P>
<P></P>
<P></P>
<A HREF="#INDEX"> <EM>Back to Index</EM> </A>
<CENTER><H2><A NAME="9">Command Line Parameters</A></H2></CENTER>
<CENTER><H3>DATA (sequences)</H3></CENTER>
<CENTER><TABLE ALIGN=ABSCENTER BORDER=1 CELLSPACING=1 CELLPADDING=5>
<TR><TD><STRONG>Parameter</STRONG></TD><TD><STRONG><EM>Description</EM></STRONG></TD></TR>
<TR><TD><TT>-PROFILE1=file.ext  and  -PROFILE2=file.ext</TT></TD><TD><EM>profiles (aligned sequences)</EM></TD></TR>
</TABLE></CENTER>
<CENTER><H3>VERBS (do things)</H3></CENTER>
<CENTER><TABLE ALIGN=ABSCENTER BORDER=1 CELLSPACING=1 CELLPADDING=5>
<TR><TD><STRONG>Parameter</STRONG></TD><TD><STRONG><EM>Description</EM></STRONG></TD></TR>
<TR><TD><TT>-HELP  or -CHECK</TT></TD><TD><EM>outline the command line parameters</EM></TD></TR>
<TR><TD><TT>-ALIGN</TT></TD><TD><EM>do full multiple alignment</EM></TD></TR>
<TR><TD><TT>-TREE</TT></TD><TD><EM>calculate NJ tree</EM></TD></TR>
<TR><TD><TT>-BOOTSTRAP(=n)</TT></TD><TD><EM>bootstrap a NJ tree (n= number of bootstraps; def.
= 1000)</EM></TD></TR>
<TR><TD><TT>-CONVERT</TT></TD><TD><EM>output the input sequences in a different file format</EM></TD></TR>
</TABLE></CENTER>
<CENTER><H3>PARAMETERS (set things)</H3></CENTER>
<CENTER><P><STRONG>***General settings:***</STRONG></P></CENTER>
<CENTER><TABLE ALIGN=ABSCENTER BORDER=1 CELLSPACING=1 CELLPADDING=5>
<TR><TD><STRONG>Parameter</STRONG></TD><TD><STRONG><EM>Description</EM></STRONG></TD></TR>
<TR><TD><TT>-INTERACTIVE</TT></TD><TD><EM>read command line, then enter normal interactive menus</EM></TD></TR>
<TR><TD><TT>-QUICKTREE</TT></TD><TD><EM>use FAST algorithm for the alignment guide tree</EM></TD></TR>
<TR><TD><TT>-TYPE=</TT></TD><TD><EM>PROTEIN or DNA sequences</EM></TD></TR>
<TR><TD><TT>-NEGATIVE</TT></TD><TD><EM>protein alignment with negative values in matrix</EM></TD></TR>
<TR><TD><TT>-OUTFILE=</TT></TD><TD><EM>sequence alignment file name</EM></TD></TR>
<TR><TD><TT>-OUTPUT=</TT></TD><TD><EM>GCG, GDE, PHYLIP, PIR or NEXUS</EM></TD></TR>
<TR><TD><TT>-OUTORDER=</TT></TD><TD><EM>INPUT or ALIGNED</EM></TD></TR>
<TR><TD><TT>-CASE=</TT></TD><TD><EM>LOWER or UPPER (for GDE output only)</EM></TD></TR>
<TR><TD><TT>-SEQNOS=</TT></TD><TD><EM>OFF or ON (for Clustal output only)</EM></TD></TR>
</TABLE></CENTER>
<CENTER><H3>***Fast Pairwise Alignments:***</H3></CENTER>
<CENTER><TABLE ALIGN=ABSCENTER BORDER=1 CELLSPACING=1 CELLPADDING=5>
<TR><TD><STRONG>Parameter</STRONG></TD><TD><STRONG><EM>Description</EM></STRONG></TD></TR>
<TR><TD><TT>-TOPDIAGS=n</TT></TD><TD><EM>number of best diagonals</EM></TD></TR>
<TR><TD><TT>-WINDOW=n</TT></TD><TD><EM>window around best diagonals</EM></TD></TR>
<TR><TD><TT>-PAIRGAP=n</TT></TD><TD><EM>gap penalty</EM></TD></TR>
<TR><TD><TT>-SCORE=</TT></TD><TD><EM>PERCENT or ABSOLUTE</EM></TD></TR>
</TABLE></CENTER>
<CENTER><H3>***Slow Pairwise Alignments:***</H3></CENTER>
<CENTER><TABLE ALIGN=ABSCENTER BORDER=1 CELLSPACING=1
CELLPADDING=5>
<TR><TD><STRONG>Parameter</STRONG></TD><TD><STRONG><EM>Description</EM></STRONG></TD></TR>
<TR><TD><TT>-PWDNAMATRIX=</TT></TD><TD><EM>DNA weight matrix=IUB, CLUSTALW or filename</EM></TD></TR>
<TR><TD><TT>-PWGAPOPEN=f</TT></TD><TD><EM>gap opening penalty</EM></TD></TR>
<TR><TD><TT>-PWGAPEXT=f</TT></TD><TD><EM>gap extension penalty</EM></TD></TR>
</TABLE></CENTER>
<CENTER><H3>***Multiple Alignments:***</H3></CENTER>
<CENTER><TABLE ALIGN=ABSCENTER BORDER=1 CELLSPACING=1 CELLPADDING=5>
<TR><TD><STRONG>Parameter</STRONG></TD><TD><STRONG><EM>Description</EM></STRONG></TD></TR>
<TR><TD><TT>-USETREE=</TT></TD><TD><EM>file for old guide tree</EM></TD></TR>
<TR><TD><TT>-MATRIX=</TT></TD><TD><EM>Protein weight matrix=BLOSUM, PAM, GONNET, ID or filename</EM></TD></TR>
<TR><TD><TT>-DNAMATRIX=</TT></TD><TD><EM>DNA weight matrix=IUB, CLUSTALW or filename</EM></TD></TR>
<TR><TD><TT>-GAPOPEN=f</TT></TD><TD><EM>gap opening penalty</EM></TD></TR>
<TR><TD><TT>-GAPEXT=f</TT></TD><TD><EM>gap extension penalty</EM></TD></TR>
<TR><TD><TT>-ENDGAPS</TT></TD><TD><EM>no end gap separation penalty</EM></TD></TR>
<TR><TD><TT>-GAPDIST=n</TT></TD><TD><EM>gap separation penalty range</EM></TD></TR>
<TR><TD><TT>-NOPGAP</TT></TD><TD><EM>residue-specific gaps off</EM></TD></TR>
<TR><TD><TT>-NOHGAP</TT></TD><TD><EM>hydrophilic gaps off</EM></TD></TR>
<TR
