emma.ihelp
From "EMBOSS source code (Linux version)" · IHELP code · 255 lines · page 1 of 2
   Standard (Mandatory) qualifiers:
  [-sequence]          seqall     (Gapped) sequence(s) filename and optional
                                  format, or reference (input USA)
  [-outseq]            seqoutset  [<sequence>.<format>] Sequence set filename
                                  and optional format (output USA)
  [-dendoutfile]       outfile    [*.emma] Dendrogram (tree file) from
                                  clustalw output file

   Additional (Optional) qualifiers (* if not always prompted):
   -onlydend           toggle     [N] Only produce dendrogram file
*  -dend               toggle     [N] Do alignment using an old dendrogram
*  -dendfile           infile     Dendrogram (tree file) from clustalw file
                                  (optional)
*  -pwmatrix           menu       [b] The scoring table which describes the
                                  similarity of each amino acid to each
                                  other. There are three 'in-built' series
                                  of weight matrices offered. Each consists
                                  of several matrices which work differently
                                  at different evolutionary distances. To
                                  see the exact details, read the
                                  documentation. Crudely, we store several
                                  matrices in memory, spanning the full
                                  range of amino acid distance (from almost
                                  identical sequences to highly divergent
                                  ones). For very similar sequences, it is
                                  best to use a strict weight matrix which
                                  only gives a high score to identities and
                                  the most favoured conservative
                                  substitutions. For more divergent
                                  sequences, it is appropriate to use
                                  'softer' matrices which give a high score
                                  to many other frequent substitutions.
                                  1) BLOSUM (Henikoff). These matrices
                                  appear to be the best available for
                                  carrying out database similarity
                                  (homology) searches. The matrices used
                                  are: Blosum80, 62, 45 and 30.
                                  2) PAM (Dayhoff). These have been
                                  extremely widely used since the late
                                  '70s. We use the PAM 120, 160, 250 and
                                  350 matrices.
                                  3) GONNET. These matrices were derived
                                  using almost the same procedure as the
                                  Dayhoff one (above) but are much more up
                                  to date and are based on a far larger
                                  data set. They appear to be more
                                  sensitive than the Dayhoff series. We use
                                  the GONNET 40, 80, 120, 160, 250 and 350
                                  matrices.
                                  We also supply an identity matrix which
                                  gives a score of 1.0 to two identical
                                  amino acids and a score of zero
                                  otherwise. This matrix is not very
                                  useful. (Values: b (blosum); p (pam); g
                                  (gonnet); i (id); o (own))
*  -pwdnamatrix        menu       [i] The scoring table which describes the
                                  scores assigned to matches and mismatches
                                  (including IUB ambiguity codes). (Values:
                                  i (iub); c (clustalw); o (own))
*  -pairwisedatafile   infile     Comparison matrix file (optional)
*  -matrix             menu       [b] This gives a menu where you are
                                  offered a choice of weight matrices. The
                                  default for proteins is the PAM series
                                  derived by Gonnet and colleagues. Note, a
                                  series is used! The actual matrix that is
                                  used depends on how similar the sequences
                                  to be aligned at this alignment step are.
                                  Different matrices work differently at
                                  each evolutionary distance. There are
                                  three 'in-built' series of weight
                                  matrices offered. Each consists of
                                  several matrices which work differently
                                  at different evolutionary distances. To
                                  see the exact details, read the
                                  documentation. Crudely, we store several
                                  matrices in memory, spanning the full
                                  range of amino acid distance (from almost
                                  identical sequences to highly divergent
                                  ones). For very similar sequences, it is
                                  best to use a strict weight matrix which
                                  only gives a high score to identities and
                                  the most favoured conservative
                                  substitutions. For more divergent
                                  sequences, it is appropriate to use
                                  'softer' matrices which give a high score
                                  to many other frequent substitutions.
                                  1) BLOSUM (Henikoff). These matrices
                                  appear to be the best available for
                                  carrying out database similarity
                                  (homology) searches. The matrices used
                                  are: Blosum80, 62, 45 and 30.
                                  2) PAM (Dayhoff). These have been
                                  extremely widely used since the late
                                  '70s. We use the PAM 120, 160, 250 and
                                  350 matrices.
                                  3) GONNET. These matrices were derived
                                  using almost the same procedure as the
                                  Dayhoff one (above) but are much more up
                                  to date and are based on a far larger
                                  data set. They appear to be more
                                  sensitive than the Dayhoff series. We use
                                  the GONNET 40, 80, 120, 160, 250 and 350
                                  matrices.
                                  We also supply an identity matrix which
                                  gives a score of 1.0 to two identical
                                  amino acids and a score of zero
                                  otherwise. This matrix is not very
                                  useful. Alternatively, you can read in
                                  your own (just one matrix, not a series).
                                  (Values: b (blosum); p (pam); g (gonnet);
                                  i (id); o (own))
*  -dnamatrix          menu       [i] This gives a menu where a single
                                  matrix (not a series) can be selected.
                                  (Values: i (iub); c (clustalw); o (own))
*  -mamatrixfile       infile     Comparison matrix file (optional)
*  -[no]slow           toggle     [Y] A distance is calculated between
                                  every pair of sequences and these are
                                  used to construct the dendrogram which
                                  guides the final multiple alignment. The
                                  scores are calculated from separate
                                  pairwise alignments. These can be
                                  calculated using 2 methods: dynamic
                                  programming (slow but accurate) or by the
                                  method of Wilbur and Lipman (extremely
                                  fast but approximate). The slow-accurate
                                  method is fine for short sequences but
                                  will be VERY SLOW for many (e.g. >100)
                                  long (e.g. >1000 residue) sequences.
*  -pwgapopen          float      [10.0] The penalty for opening a gap in
                                  the pairwise alignments. (Number 0.000 or
                                  more)
*  -pwgapextend        float      [0.1] The penalty for extending a gap by
                                  1 residue in the pairwise alignments.
                                  (Number 0.000 or more)
*  -ktup               integer    [1 for protein, 2 for nucleic] This is
                                  the size of exactly matching fragment
                                  that is