dotmatcher

Function
Displays a thresholded dotplot of two sequences

Description
A dotplot is a graphical representation of the regions of similarity between two sequences. The two sequences are placed on the axes of a rectangular image and (subject to threshold conditions) wherever there is a similarity between the sequences a dot is placed on the image.

Where the two sequences have substantial regions of similarity, many dots align to form diagonal lines. It is therefore possible to see at a glance where there are local regions of similarity as these will have long diagonal lines. It is also easy to see other features such as repeats (which form parallel diagonal lines), and insertions or deletions (which form breaks or discontinuities in the diagonal lines).

dotmatcher uses a threshold to define whether a match is plotted (calculated from the substitution matrix). A window of specified length is moved up all possible diagonals and a score is calculated within each window for each position along the diagonals. The score is the sum of the comparisons of the two sequences using the given similarity matrix along the window.
If the score is above the threshold, then a line is plotted on the image over the position of the window.

Usage
Here is a sample session with dotmatcher

   % dotmatcher tsw:hba_human tsw:hbb_human -graph cps
   Displays a thresholded dotplot of two sequences

   Created dotmatcher.ps

Command line arguments

   Standard (Mandatory) qualifiers (* if not always prompted):
  [-asequence]         sequence   Sequence filename and optional format, or
                                  reference (input USA)
  [-bsequence]         sequence   Sequence filename and optional format, or
                                  reference (input USA)
*  -graph              graph      [$EMBOSS_GRAPHICS value, or x11] Graph type
                                  (ps, hpgl, hp7470, hp7580, meta, cps, x11,
                                  tekt, tek, none, data, xterm, png)
*  -xygraph            xygraph    [$EMBOSS_GRAPHICS value, or x11] Graph type
                                  (ps, hpgl, hp7470, hp7580, meta, cps, x11,
                                  tekt, tek, none, data, xterm, png)

   Additional (Optional) qualifiers:
   -matrixfile         matrix     [EBLOSUM62 for protein, EDNAFULL for DNA]
                                  This is the scoring matrix file used when
                                  comparing sequences. By default it is the
                                  file 'EBLOSUM62' (for proteins) or the file
                                  'EDNAFULL' (for nucleic sequences). These
                                  files are found in the 'data' directory of
                                  the EMBOSS installation.
   -windowsize         integer    [10] Window size over which to test
                                  threshold (Integer 3 or more)
   -threshold          integer    [23] Threshold (Integer 0 or more)

   Advanced (Unprompted) qualifiers:
   -stretch            toggle     [N] Display a non-proportional graph

   Associated qualifiers:

   "-asequence" associated qualifiers
   -sbegin1            integer    Start of the sequence to be used
   -send1              integer    End of the sequence to be used
   -sreverse1          boolean    Reverse (if DNA)
   -sask1              boolean    Ask for begin/end/reverse
   -snucleotide1       boolean    Sequence is nucleotide
   -sprotein1          boolean    Sequence is protein
   -slower1            boolean    Make lower case
   -supper1            boolean    Make upper case
   -sformat1           string     Input sequence format
   -sdbname1           string     Database name
   -sid1               string     Entryname
   -ufo1               string     UFO features
   -fformat1           string     Features format
   -fopenfile1         string     Features file name

   "-bsequence" associated qualifiers
   -sbegin2            integer    Start of the sequence to be used
   -send2              integer    End of the sequence to be used
   -sreverse2          boolean    Reverse (if DNA)
   -sask2              boolean    Ask for begin/end/reverse
   -snucleotide2       boolean    Sequence is nucleotide
   -sprotein2          boolean    Sequence is protein
   -slower2            boolean    Make lower case
   -supper2            boolean    Make upper case
   -sformat2           string     Input sequence format
   -sdbname2           string     Database name
   -sid2               string     Entryname
   -ufo2               string     UFO features
   -fformat2           string     Features format
   -fopenfile2         string     Features file name

   "-graph" associated qualifiers
   -gprompt            boolean    Graph prompting
   -gdesc              string     Graph description
   -gtitle             string     Graph title
   -gsubtitle          string     Graph subtitle
   -gxtitle            string     Graph x axis title
   -gytitle            string     Graph y axis title
   -goutfile           string     Output file for non interactive displays
   -gdirectory         string     Output directory

   "-xygraph" associated qualifiers
   -gprompt            boolean    Graph prompting
   -gdesc              string     Graph description
   -gtitle             string     Graph title
   -gsubtitle          string     Graph subtitle
   -gxtitle            string     Graph x axis title
   -gytitle            string     Graph y axis title
   -goutfile           string     Output file for non interactive displays
   -gdirectory         string     Output directory

   General qualifiers:
   -auto               boolean    Turn off prompts
   -stdout             boolean    Write standard output
   -filter             boolean    Read standard input, write standard output
   -options            boolean    Prompt for standard and additional values
   -debug              boolean    Write debug output to program.dbg
   -verbose            boolean    Report some/full command line options
   -help               boolean    Report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   -warning            boolean    Report warnings
   -error              boolean    Report errors
   -fatal              boolean    Report fatal errors
   -die                boolean    Report dying program messages

Input file format
Any 2 sequence USAs of the same type (DNA or protein).

Input files for usage example
'tsw:hba_human' is a sequence entry in the example protein database 'tsw'

Database entry: tsw:hba_human

ID   HBA_HUMAN      STANDARD;      PRT;   141 AA.
AC   P01922;
DT   21-JUL-1986 (Rel. 01, Created)
DT   21-JUL-1986 (Rel. 01, Last sequence update)
DT   15-JUL-1999 (Rel. 38, Last annotation update)
DE   HEMOGLOBIN ALPHA CHAIN.
GN   HBA1 AND HBA2.
OS   Homo sapiens (Human), Pan troglodytes (Chimpanzee), and
OS   Pan paniscus (Pygmy chimpanzee) (Bonobo).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Mammalia;
OC   Eutheria; Primates; Catarrhini; Hominidae; Homo.
RN   [1]
RP   SEQUENCE FROM N.A. (ALPHA-1).
RX   MEDLINE; 81088339.
RA   MICHELSON A.M., ORKIN S.H.;
RT   "The 3' untranslated regions of the duplicated human alpha-globin
RT   genes are unexpectedly divergent.";
RL   Cell 22:371-377(1980).
RN   [2]
RP   SEQUENCE FROM N.A. (ALPHA-2).
RX   MEDLINE; 81175088.
RA   LIEBHABER S.A., GOOSSENS M.J., KAN Y.W.;
RT   "Cloning and complete nucleotide sequence of human 5'-alpha-globin
RT   gene.";
RL   Proc. Natl. Acad. Sci. U.S.A. 77:7054-7058(1980).
RN   [3]
RP   SEQUENCE FROM N.A. (ALPHA-2).
RX   MEDLINE; 80137531.
RA   WILSON J.T., WILSON L.B., REDDY V.B., CAVALLESCO C., GHOSH P.K.,
RA   DERIEL J.K., FORGET B.G., WEISSMAN S.M.;
RT   "Nucleotide sequence of the coding portion of human alpha globin
RT   messenger RNA.";
RL   J. Biol. Chem. 255:2807-2815(1980).
RN   [4]
RP   SEQUENCE FROM N.A. (ALPHA-1 AND ALPHA-2).
RA   FLINT J., HIGGS D.R.;
RL   Submitted (JAN-1997) to the EMBL/GenBank/DDBJ databases.
RN   [5]
RP   SEQUENCE.
RA   BRAUNITZER G., GEHRING-MULLER R., HILSCHMANN N., HILSE K., HOBOM G.,
RA   RUDLOFF V., WITTMANN-LIEBOLD B.;
RT   "The constitution of normal adult human haemoglobin.";
RL   Hoppe-Seyler's Z. Physiol. Chem. 325:283-286(1961).
RN   [6]
RP   SEQUENCE.
RA   HILL R.J., KONIGSBERG W.;
RT   "The structure of human hemoglobin: IV. The chymotryptic digestion of
RT   the alpha chain of human hemoglobin.";
RL   J. Biol. Chem. 237:3151-3156(1962).
RN   [7]

  [Part of this file has been deleted for brevity]

FT                   /FTId=VAR_002841.
FT   VARIANT     130    130       A -> D (IN YUDA; O2 AFFINITY DOWN).
FT                   /FTId=VAR_002842.
FT   VARIANT     131    131       S -> P (IN QUESTEMBERT; HIGHLY UNSTABLE;
FT                                CAUSES ALPHA-THALASSEMIA).
FT                   /FTId=VAR_002843.
FT   VARIANT     133    133       S -> R (IN VAL DE MARNE; O2 AFFINITY UP).
FT                   /FTId=VAR_002844.
FT   VARIANT     135    135       V -> E (IN PAVIE).
FT                   /FTId=VAR_002845.
FT   VARIANT     136    136       L -> M (IN CHICAGO).
FT                   /FTId=VAR_002846.
FT   VARIANT     136    136       L -> P (IN BIBBA; UNSTABLE;
FT                                CAUSES ALPHA-THALASSEMIA).
FT                   /FTId=VAR_002847.
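
Illustration of the window scoring
The window scoring outlined under Description can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions, not dotmatcher's actual source: the names window_hits and toy_score are hypothetical, the toy scoring function stands in for a real matrix such as EBLOSUM62 or EDNAFULL, and "above the threshold" is read as strictly greater than. The sketch reports the start coordinates of each qualifying window instead of drawing lines.

```python
def window_hits(seq_a, seq_b, score, windowsize=10, threshold=23):
    """Return (i, j) start positions of windows whose summed score
    exceeds the threshold (hypothetical helper, not an EMBOSS API)."""
    hits = []
    # Visiting every (i, j) start pair covers every window position
    # on every diagonal of the comparison.
    for i in range(len(seq_a) - windowsize + 1):
        for j in range(len(seq_b) - windowsize + 1):
            # Sum the pairwise scores along the diagonal window.
            total = sum(
                score(seq_a[i + k], seq_b[j + k]) for k in range(windowsize)
            )
            if total > threshold:
                hits.append((i, j))
    return hits


def toy_score(x, y):
    # Assumption: a toy stand-in for a substitution matrix lookup.
    return 5 if x == y else -4


hits = window_hits("ACGTACGT", "TTACGTAA", toy_score, windowsize=4, threshold=15)
print(hits)  # [(0, 2), (1, 3), (3, 1), (4, 2)]
```

With these toy values a window of four scores 20 for a perfect match and at most 11 with one mismatch, so only exact four-residue matches clear a threshold of 15. A smaller threshold or a more permissive matrix would let partially matching windows through, which is the trade-off the -windowsize and -threshold qualifiers control.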