emma.ihelp

From "EMBOSS source code (Linux version)" · IHELP code · 255 lines · page 1 of 2

   Standard (Mandatory) qualifiers:
  [-sequence]          seqall     (Gapped) sequence(s) filename and optional
                                  format, or reference (input USA)
  [-outseq]            seqoutset  [<sequence>.<format>] Sequence set filename
                                  and optional format (output USA)
  [-dendoutfile]       outfile    [*.emma] Dendrogram (tree file) from
                                  clustalw output file

   Additional (Optional) qualifiers (* if not always prompted):
   -onlydend           toggle     [N] Only produce dendrogram file
*  -dend               toggle     [N] Do alignment using an old dendrogram
*  -dendfile           infile     Dendrogram (tree file) from clustalw file
                                  (optional)
*  -pwmatrix           menu       [b] The scoring table which describes the
                                  similarity of each amino acid to each other.
                                  There are three 'in-built' series of weight
                                  matrices offered. Each consists of several
                                  matrices which work differently at different
                                  evolutionary distances. To see the exact
                                  details, read the documentation. Crudely, we
                                  store several matrices in memory, spanning
                                  the full range of amino acid distance (from
                                  almost identical sequences to highly
                                  divergent ones). For very similar sequences,
                                  it is best to use a strict weight matrix
                                  which only gives a high score to identities
                                  and the most favoured conservative
                                  substitutions. For more divergent sequences,
                                  it is appropriate to use 'softer' matrices
                                  which give a high score to many other
                                  frequent substitutions.
                                  1) BLOSUM (Henikoff). These matrices appear
                                  to be the best available for carrying out
                                  data base similarity (homology searches).
                                  The matrices used are: Blosum80, 62, 45 and
                                  30.
                                  2) PAM (Dayhoff). These have been extremely
                                  widely used since the late '70s. We use the
                                  PAM 120, 160, 250 and 350 matrices.
                                  3) GONNET. These matrices were derived
                                  using almost the same procedure as the
                                  Dayhoff one (above) but are much more up to
                                  date and are based on a far larger data set.
                                  They appear to be more sensitive than the
                                  Dayhoff series. We use the GONNET 40, 80,
                                  120, 160, 250 and 350 matrices.
                                  We also supply an identity matrix which
                                  gives a score of 1.0 to two identical amino
                                  acids and a score of zero otherwise. This
                                  matrix is not very useful. (Values: b
                                  (blosum); p (pam); g (gonnet); i (id); o
                                  (own))
*  -pwdnamatrix        menu       [i] The scoring table which describes the
                                  scores assigned to matches and mismatches
                                  (including IUB ambiguity codes). (Values: i
                                  (iub); c (clustalw); o (own))
*  -pairwisedatafile   infile     Comparison matrix file (optional)
*  -matrix             menu       [b] This gives a menu where you are offered
                                  a choice of weight matrices. The default for
                                  proteins is the PAM series derived by
                                  Gonnet and colleagues. Note, a series is
                                  used! The actual matrix that is used depends
                                  on how similar the sequences to be aligned
                                  at this alignment step are. Different
                                  matrices work differently at each
                                  evolutionary distance.
                                  There are three 'in-built' series of weight
                                  matrices offered. Each consists of several
                                  matrices which work differently at different
                                  evolutionary distances. To see the exact
                                  details, read the documentation. Crudely, we
                                  store several matrices in memory, spanning
                                  the full range of amino acid distance (from
                                  almost identical sequences to highly
                                  divergent ones). For very similar sequences,
                                  it is best to use a strict weight matrix
                                  which only gives a high score to identities
                                  and the most favoured conservative
                                  substitutions. For more divergent sequences,
                                  it is appropriate to use 'softer' matrices
                                  which give a high score to many other
                                  frequent substitutions.
                                  1) BLOSUM (Henikoff). These matrices appear
                                  to be the best available for carrying out
                                  data base similarity (homology searches).
                                  The matrices used are: Blosum80, 62, 45 and
                                  30.
                                  2) PAM (Dayhoff). These have been extremely
                                  widely used since the late '70s. We use the
                                  PAM 120, 160, 250 and 350 matrices.
                                  3) GONNET. These matrices were derived
                                  using almost the same procedure as the
                                  Dayhoff one (above) but are much more up to
                                  date and are based on a far larger data set.
                                  They appear to be more sensitive than the
                                  Dayhoff series. We use the GONNET 40, 80,
                                  120, 160, 250 and 350 matrices.
                                  We also supply an identity matrix which
                                  gives a score of 1.0 to two identical amino
                                  acids and a score of zero otherwise. This
                                  matrix is not very useful. Alternatively,
                                  you can read in your own (just one matrix,
                                  not a series). (Values: b (blosum); p (pam);
                                  g (gonnet); i (id); o (own))
*  -dnamatrix          menu       [i] This gives a menu where a single matrix
                                  (not a series) can be selected. (Values: i
                                  (iub); c (clustalw); o (own))
*  -mamatrixfile       infile     Comparison matrix file (optional)
   -[no]slow           toggle     [Y] A distance is calculated between every
                                  pair of sequences and these are used to
                                  construct the dendrogram which guides the
                                  final multiple alignment. The scores are
                                  calculated from separate pairwise
                                  alignments. These can be calculated using 2
                                  methods: dynamic programming (slow but
                                  accurate) or by the method of Wilbur and
                                  Lipman (extremely fast but approximate).
                                  The slow-accurate method is fine for short
                                  sequences but will be VERY SLOW for many
                                  (e.g. >100) long (e.g. >1000 residue)
                                  sequences.
*  -pwgapopen          float      [10.0] The penalty for opening a gap in the
                                  pairwise alignments. (Number 0.000 or more)
*  -pwgapextend        float      [0.1] The penalty for extending a gap by 1
                                  residue in the pairwise alignments. (Number
                                  0.000 or more)
*  -ktup               integer    [1 for protein, 2 for nucleic] This is the
                                  size of exactly matching fragment that is
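
The qualifiers listed so far can be combined on a single command line. The Python sketch below is only an illustration of that, not part of EMBOSS: the helper name build_emma_args, the lookup tables, and the file names are hypothetical. It assumes each qualifier is passed as "-name value", that toggles are passed as a bare "-name", and that the "no" prefix is used only for -slow, the one qualifier this listing shows in -[no] form. The menu values and the "0.000 or more" limits are taken from the listing above. It builds the argument list without running the program.

```python
# Allowed single-letter menu values, as given in the help listing.
MENU_VALUES = {
    "pwmatrix": {"b", "p", "g", "i", "o"},
    "matrix": {"b", "p", "g", "i", "o"},
    "pwdnamatrix": {"i", "c", "o"},
    "dnamatrix": {"i", "c", "o"},
}
# Float qualifiers documented as "Number 0.000 or more".
NON_NEGATIVE = {"pwgapopen", "pwgapextend"}
# Only -slow is shown with the -[no] form in the listing (assumption:
# other toggles are simply omitted when they should stay off).
NEGATABLE = {"slow"}


def build_emma_args(sequence, outseq, dendoutfile, **options):
    """Return an argument list for emma (hypothetical helper, does not run it)."""
    args = ["emma", "-sequence", sequence, "-outseq", outseq,
            "-dendoutfile", dendoutfile]
    for name, value in options.items():
        if isinstance(value, bool):
            if value:
                args.append(f"-{name}")
            elif name in NEGATABLE:
                args.append(f"-no{name}")
            continue
        if name in MENU_VALUES and value not in MENU_VALUES[name]:
            raise ValueError(f"-{name} must be one of {sorted(MENU_VALUES[name])}")
        if name in NON_NEGATIVE and float(value) < 0:
            raise ValueError(f"-{name} must be 0.000 or more")
        args.extend([f"-{name}", str(value)])
    return args


if __name__ == "__main__":
    # File names are placeholders chosen for this example.
    print(" ".join(build_emma_args("globins.fasta", "globins.aln", "globins.dnd",
                                   pwmatrix="g", slow=False, pwgapopen=10.0)))
```

Checking menu letters and gap penalties before launching the program gives an early error for values the listing rules out, such as an unknown matrix letter or a negative penalty.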
