mast.adh.zoops
********************************************************************************
MAST - Motif Alignment and Search Tool
********************************************************************************
  MAST version 3.5.1 (Release date: 2006/02/01 02:08:55)

  For further information on how to interpret these results or to get
  a copy of the MAST software please access http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
  If you use this program in your research, please cite:

  Timothy L. Bailey and Michael Gribskov,
  "Combining evidence using p-values: application to sequence homology
  searches", Bioinformatics, 14(48-54), 1998.
********************************************************************************


********************************************************************************
DATABASE AND MOTIFS
********************************************************************************
  DATABASE /home/meme/TEST/tests/adh.s (peptide)
  Last updated on Tue Jan 31 18:16:52 2006
  Database contains 33 sequences, 9996 residues

  MOTIFS /home/meme/TEST/tests/meme.adh.zoops (peptide)
  MOTIF WIDTH BEST POSSIBLE MATCH
  ----- ----- -------------------
    1    27   YSASKFAVRMLTRSMRREYAPHGIRVN
    2    25   QGKVVLITGCSSGIGKATAKHLHKE

  PAIRWISE MOTIF CORRELATIONS:
  MOTIF     1
  ----- -----
     2   0.27
  No overly similar pairs (correlation > 0.60) found.

  Random model letter frequencies (from non-redundant database):
  A 0.073 C 0.018 D 0.052 E 0.062 F 0.040 G 0.069 H 0.022 I 0.056 K 0.058
  L 0.092 M 0.023 N 0.046 P 0.051 Q 0.041 R 0.052 S 0.074 T 0.059 V 0.064
  W 0.013 Y 0.033
********************************************************************************


********************************************************************************
SECTION I: HIGH-SCORING SEQUENCES
********************************************************************************
  - Each of the following 33 sequences has E-value less than 10.
  - The E-value of a sequence is the expected number of sequences
    in a random database of the same size that would match the motifs as
    well as the sequence does and is equal to the combined p-value of the
    sequence times the number of sequences in the database.
  - The combined p-value of a sequence measures the strength of the
    match of the sequence to all the motifs and is calculated by
      o finding the score of the single best match of each motif
        to the sequence (best matches may overlap),
      o calculating the sequence p-value of each score,
      o forming the product of the p-values,
      o taking the p-value of the product.
  - The sequence p-value of a score is defined as the
    probability of a random sequence of the same length containing
    some match with as good or better a score.
  - The score for the match of a position in a sequence to a motif
    is computed by summing the appropriate entry from each column of
    the position-dependent scoring matrix that represents the motif.
  - Sequences shorter than one or more of the motifs are skipped.
  - The table is sorted by increasing E-value.
********************************************************************************

SEQUENCE NAME  DESCRIPTION                   E-VALUE  LENGTH
-------------  -----------                   -------- ------
YRTP_BACSU     HYPOTHETICAL 25.3 KD PROT...   5.2e-34    238
BUDC_KLETE     ACETOIN(DIACETYL) REDUCTA...   7.4e-34    241
AP27_MOUSE     ADIPOCYTE P27 PROTEIN (AP...   3.7e-28    244
FIXR_BRAJA     FIXR PROTEIN                     2e-27    278
DHII_HUMAN     CORTICOSTEROID 11-BETA-DE...   9.7e-27    292
DHGB_BACME     GLUCOSE 1-DEHYDROGENASE B...     1e-26    262
HDE_CANTR      HYDRATASE-DEHYDROGENASE-E...   2.2e-26    906
HDHA_ECOLI     7-ALPHA-HYDROXYSTEROID DE...   2.3e-26    255
NODG_RHIME     NODULATION PROTEIN G (HOS...   6.1e-26    245
YINL_LISMO     HYPOTHETICAL 26.8 KD PROT...   1.2e-25    248
RIDH_KLEAE     RIBITOL 2-DEHYDROGENASE (...   6.7e-25    249
HMTR_LEIMA     no comment                     7.7e-25    287
FVT1_HUMAN     no comment                     1.8e-24    332
DHMA_FLAS1     N-ACYLMANNOSAMINE 1-DEHYD...   4.3e-24    270
DHB2_HUMAN     no comment                     5.7e-24    387
3BHD_COMTE     3-BETA-HYDROXYSTEROID DEH...   1.2e-23    253
ENTA_ECOLI     2,3-DIHYDRO-2,3-DIHYDROXY...   2.4e-23    248
BA72_EUBSP     7-ALPHA-HYDROXYSTEROID DE...   2.6e-23    249
2BHD_STREX     20-BETA-HYDROXYSTEROID DE...   3.7e-23    255
BDH_HUMAN      D-BETA-HYDROXYBUTYRATE DE...   7.5e-23    343
DHES_HUMAN     ESTRADIOL 17 BETA-DEHYDRO...   3.2e-22    327
GUTD_ECOLI     SORBITOL-6-PHOSPHATE 2-DE...   5.9e-22    259
DHB3_HUMAN     no comment                     7.5e-22    310
BPHB_PSEPS     BIPHENYL-CIS-DIOL DEHYDRO...   5.4e-20    275
LIGD_PSEPA     C ALPHA-DEHYDROGENASE (EC...   1.1e-19    305
RFBB_NEIGO     no comment                     1.1e-18    346
DHCA_HUMAN     no comment                       7e-17    276
MAS1_AGRRA     no comment                     8.5e-16    476
PCR_PEA        no comment                     1.3e-15    399
ADH_DROME      ALCOHOL DEHYDROGENASE (EC...   8.2e-15    255
YURA_MYXXA     no comment                     2.6e-12    258
FABI_ECOLI     no comment                       9e-10    262
CSGA_MYXXA     no comment                       6e-08    166

********************************************************************************


********************************************************************************
SECTION II: MOTIF DIAGRAMS
********************************************************************************
  - The ordering and spacing of all non-overlapping motif occurrences
    are shown for each high-scoring sequence listed in Section I.
  - A motif occurrence is defined as a position in the sequence whose
    match to the motif has POSITION p-value less than 0.0001.
  - The POSITION p-value of a match is the probability of
    a single random subsequence of the length of the motif
    scoring at least as well as the observed match.
  - For each sequence, all motif occurrences are shown unless there
    are overlaps.  In that case, a motif occurrence is shown only if its
    p-value is less than the product of the p-values of the other
    (lower-numbered) motif occurrences that it overlaps.
  - The table also shows the E-value of each sequence.
  - Spacers and motif occurrences are indicated by
      o -d-    `d' residues separate the end of the preceding motif
               occurrence and the start of the following motif occurrence
      o [n]    occurrence of motif `n' with p-value less than 0.0001.
********************************************************************************

SEQUENCE NAME  E-VALUE   MOTIF DIAGRAM
-------------  --------  -------------
YRTP_BACSU      5.2e-34  4_[2]_125_[1]_57
BUDC_KLETE      7.4e-34  [2]_33_[1]_66_[1]_63
AP27_MOUSE      3.7e-28  5_[2]_118_[1]_69
FIXR_BRAJA        2e-27  34_[2]_129_[1]_63
DHII_HUMAN      9.7e-27  32_[2]_125_[1]_83
DHGB_BACME        1e-26  5_[2]_129_[1]_76
HDE_CANTR       2.2e-26  6_[2]_131_[1]_131_[2]_121_[1]_
                         235_[2]_153
HDHA_ECOLI      2.3e-26  9_[2]_80_[1]_17_[1]_70
NODG_RHIME      6.1e-26  4_[2]_122_[1]_67
YINL_LISMO      1.2e-25  3_[2]_125_[1]_68
RIDH_KLEAE      6.7e-25  12_[2]_122_[1]_63
HMTR_LEIMA      7.7e-25  4_[2]_163_[1]_68
FVT1_HUMAN      1.8e-24  30_[2]_130_[1]_120
DHMA_FLAS1      4.3e-24  12_[2]_127_[1]_79
DHB2_HUMAN      5.7e-24  80_[2]_126_[1]_129
3BHD_COMTE      1.2e-23  4_[2]_30_[1]_64_[1]_76
ENTA_ECOLI      2.4e-23  3_[2]_115_[1]_78
BA72_EUBSP      2.6e-23  4_[2]_35_[1]_65_[1]_66
2BHD_STREX      3.7e-23  4_[2]_122_[1]_77
BDH_HUMAN       7.5e-23  53_[2]_129_[1]_109
DHES_HUMAN      3.2e-22  [2]_129_[1]_146
GUTD_ECOLI      5.9e-22  [2]_128_[1]_79
DHB3_HUMAN      7.5e-22  46_[2]_126_[1]_86
BPHB_PSEPS      5.4e-20  3_[2]_124_[1]_96
LIGD_PSEPA      1.1e-19  4_[2]_127_[1]_122
RFBB_NEIGO      1.1e-18  4_[2]_135_[1]_155
DHCA_HUMAN        7e-17  2_[2]_34_[1]_104_[1]_57
MAS1_AGRRA      8.5e-16  243_[2]_123_[1]_58
PCR_PEA         1.3e-15  84_[2]_34_[1]_229
ADH_DROME       8.2e-15  4_[2]_122_[1]_77
YURA_MYXXA      2.6e-12  159_[1]_72
FABI_ECOLI        9e-10  4_[2]_129_[1]_77
CSGA_MYXXA        6e-08  87_[1]_52

********************************************************************************


********************************************************************************
SECTION III: ANNOTATED SEQUENCES
********************************************************************************
  - The positions and p-values of the non-overlapping motif occurrences
    are shown above the actual sequence for each of the high-scoring
    sequences from Section I.
  - A motif occurrence is defined as a position in the sequence whose
    match to the motif has POSITION p-value less than 0.0001 as
    defined in Section II.
  - For each sequence, the first line specifies the name of the sequence.
  - The second (and possibly more) lines give a description of the
    sequence.
  - Following the description line(s) is a line giving the length,
    combined p-value, and E-value of the sequence as defined in Section I.
  - The next line reproduces the motif diagram from Section II.
  - The entire sequence is printed on the following lines.
  - Motif occurrences are indicated directly above their positions in the
    sequence on lines showing
      o the motif number of the occurrence,
      o the position p-value of the occurrence,
      o the best possible match to the motif, and
      o columns whose match to the motif has a positive score (indicated
        by a plus sign).
********************************************************************************


YRTP_BACSU
  HYPOTHETICAL 25.3 KD PROTEIN IN RTP 5'REGION (ORF238)
  LENGTH = 238  COMBINED P-VALUE = 1.57e-35  E-VALUE =  5.2e-34
  DIAGRAM: 4_[2]_125_[1]_57

         [2]
         2.0e-20
         QGKVVLITGCSSGIGKATAKHLHKE
         +++ +++++++++++++++++++++
1    MQSLQHKTALITGGGRGIGRATALALAKEGVNIGLIGRTSANVEKVAEEVKALGVKAAFAAADVKDADQVNQAVA

         [1]
         2.0e-22
         YSASKFAVRMLTRSMRREYAPHGIRVN
         ++++++++ ++++++++++++++++++
151  VTSAYSASKFAVLGLTESLMQEVRKHNIRVSALTPSTVASDMSIELNLTDGNPEKVMQPEDLAEYMVAQLKLDPR


BUDC_KLETE
  ACETOIN(DIACETYL) REDUCTASE (EC 1.1.1.5) (ACETOIN DEHYDROGENASE)
  LENGTH = 241  COMBINED P-VALUE = 2.24e-35  E-VALUE =  7.4e-34
  DIAGRAM: [2]_33_[1]_66_[1]_63

     [2]                                                       [1]
     1.5e-20                                                   7.7e-05
     QGKVVLITGCSSGIGKATAKHLHKE                                 YSASKFAVRMLTRSMRR
     +++++++++++++++++++++++++                                 + + + ++ +++
1    MQKVALVTGAGQGIGKAIALRLVKDGFAVAIADYNDATATAVAAEINQAGGRAVAIKVDVSRRDQVFAAVEQARK

     EYAPHGIRVN
     ++ ++ ++
76   ALGGFNVIVNNAGIAPSTPIESITEEIVDRVYNINVKGVIWGMQAAVEAFKKEGHGGKIVNACSQAGHVGNPELA

      [1]
      3.9e-22
      YSASKFAVRMLTRSMRREYAPHGIRVN
      +++++++++++++++++++++++++++
151  VYSSSKFAVRGLTQTAARDLAPLGITVNGFCPGIVKTPMWAEIDRQCRKRRANRWATARLNLPNASPLAACRSLK


AP27_MOUSE
  ADIPOCYTE P27 PROTEIN (AP27)
  LENGTH = 244  COMBINED P-VALUE = 1.14e-29  E-VALUE =  3.7e-28
  DIAGRAM: 5_[2]_118_[1]_69

          [2]
          7.6e-16
          QGKVVLITGCSSGIGKATAKHLHKE
          ++ ++++++++++++ ++++++++
1    MKLNFSGLRALVTGAGKGIGRDTVKALHASGAKVVAVTRTNSDLVSLAKECPGIEPVCVDLGDWDATEKALGGIG

                                                                              [1
                                                                              4.
                                                                              YS
                                                                              ++
76   PVDLLVNNAALVIMQPFLEVTKEAFDRSFSVNLRSVFQVSQMVARDMINRGVPGSIVNVSSMVAHVTFPNLITYS

     ]
     3e-21
     ASKFAVRMLTRSMRREYAPHGIRVN
     ++++++++++++++++++++ ++++
151  STKGAMTMLTKAMAMELGPHKIRVNSVNPTVVLTDMGKKVSADPEFARKLKERHPLRKFAEVEDVVNSILFLLSD


FIXR_BRAJA
  FIXR PROTEIN
  LENGTH = 278  COMBINED P-VALUE = 6.20e-29  E-VALUE =    2e-27
  DIAGRAM: 34_[2]_129_[1]_63

                                       [2]
                                       1.7e-16
                                       QGKVVLITGCSSGIGKATAKHLHKE
                                       + ++++ +++++++++++++ + ++
1    MGLDLPNDNLIRGPLPEAHLDRLVDAVNARVDRGEPKVMLLTGASRGIGHATAKLFSEAGWRIISCARQPFDGER

                                           [1]
                                           8.0e-20
                                           YSASKFAVRMLTRSMRREYAPHGIRVN
                                           + +++++ ++++++++++++++++++
151  APILLAQGLFDELRAASGSIVNVTSIAGSRVHPFAGSAYATSKAALASLTRELAHDYAPHGIRVNAIAPGEIRTD


DHII_HUMAN
  CORTICOSTEROID 11-BETA-DEHYDROGENASE (EC 1.1.1.146) (11-DH) (11-BETA-
  HYDROXYSTEROID DEHYDROGENASE) (11-BETA-HSD)
  LENGTH = 292  COMBINED P-VALUE = 2.94e-28  E-VALUE =  9.7e-27
  DIAGRAM: 32_[2]_125_[1]_83

                                     [2]
                                     7.5e-20
                                     QGKVVLITGCSSGIGKATAKHLHKE
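
The relationships stated in Sections I and II can be checked against the
reported values. The Python sketch below is not part of MAST: the helper names
`evalue` and `parse_diagram` are hypothetical, the motif widths are copied from
the DATABASE AND MOTIFS table, and the parser assumes the plain peptide diagram
syntax seen in this report (spacer counts and `[n]` tokens joined by
underscores, with any wrapped diagram rejoined onto one line first).

```python
import re

# Motif widths as listed in the DATABASE AND MOTIFS table of this report.
MOTIF_WIDTHS = {1: 27, 2: 25}


def evalue(combined_p, n_sequences):
    """E-value as defined in Section I: combined p-value times database size."""
    return combined_p * n_sequences


def parse_diagram(diagram, widths=MOTIF_WIDTHS):
    """Parse a diagram such as '4_[2]_125_[1]_57'.

    Returns (occurrences, total_length), where occurrences is a list of
    (motif_number, start, end) tuples in 1-based inclusive coordinates and
    total_length is the sequence length implied by the diagram.
    Assumption: only spacer counts and '[n]' tokens appear.
    """
    pos = 0  # number of residues consumed so far
    occurrences = []
    for token in diagram.split("_"):
        match = re.fullmatch(r"\[(\d+)\]", token)
        if match:
            motif = int(match.group(1))
            occurrences.append((motif, pos + 1, pos + widths[motif]))
            pos += widths[motif]
        else:
            pos += int(token)  # spacer of `d` residues
    return occurrences, pos
```

For YRTP_BACSU, `evalue(1.57e-35, 33)` gives about 5.2e-34, and
`parse_diagram("4_[2]_125_[1]_57")` returns `([(2, 5, 29), (1, 155, 181)], 238)`,
which agrees with the E-value, the annotated motif positions, and the sequence
length listed above.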