
📄 mast.adh.tcm

********************************************************************************
MAST - Motif Alignment and Search Tool
********************************************************************************
        MAST version 3.5.1 (Release date: 2006/02/01 02:08:55)

        For further information on how to interpret these results or to get
        a copy of the MAST software please access http://meme.nbcr.net.
********************************************************************************


********************************************************************************
REFERENCE
********************************************************************************
        If you use this program in your research, please cite:

        Timothy L. Bailey and Michael Gribskov,
        "Combining evidence using p-values: application to sequence homology
        searches", Bioinformatics, 14(48-54), 1998.
********************************************************************************


********************************************************************************
DATABASE AND MOTIFS
********************************************************************************
        DATABASE /home/meme/TEST/tests/adh.s (peptide)
        Last updated on Tue Jan 31 18:16:52 2006
        Database contains 33 sequences, 9996 residues

        MOTIFS /home/meme/TEST/tests/meme.adh.tcm (peptide)
        MOTIF WIDTH BEST POSSIBLE MATCH
        ----- ----- -------------------
          1    22   YCASKFAVRGFTRSMAMEYAPY
          2    25   QGKVVLITGCSSGIGKATAKHFHKE

        PAIRWISE MOTIF CORRELATIONS:
        MOTIF     1
        ----- -----
           2   0.34
        No overly similar pairs (correlation > 0.60) found.

        Random model letter frequencies (from non-redundant database):
        A 0.073 C 0.018 D 0.052 E 0.062 F 0.040 G 0.069 H 0.022 I 0.056 K 0.058
        L 0.092 M 0.023 N 0.046 P 0.051 Q 0.041 R 0.052 S 0.074 T 0.059 V 0.064
        W 0.013 Y 0.033
********************************************************************************


********************************************************************************
SECTION I: HIGH-SCORING SEQUENCES
********************************************************************************
        - Each of the following 33 sequences has E-value less than 10.
        - The E-value of a sequence is the expected number of sequences
          in a random database of the same size that would match the motifs as
          well as the sequence does and is equal to the combined p-value of the
          sequence times the number of sequences in the database.
        - The combined p-value of a sequence measures the strength of the
          match of the sequence to all the motifs and is calculated by
            o finding the score of the single best match of each motif
              to the sequence (best matches may overlap),
            o calculating the sequence p-value of each score,
            o forming the product of the p-values,
            o taking the p-value of the product.
        - The sequence p-value of a score is defined as the
          probability of a random sequence of the same length containing
          some match with as good or better a score.
        - The score for the match of a position in a sequence to a motif
          is computed by summing the appropriate entry from each column of
          the position-dependent scoring matrix that represents the motif.
        - Sequences shorter than one or more of the motifs are skipped.
        - The table is sorted by increasing E-value.
********************************************************************************

SEQUENCE NAME                      DESCRIPTION                   E-VALUE  LENGTH
-------------                      -----------                   -------- ------
YRTP_BACSU                         HYPOTHETICAL 25.3 KD PROT...   2.3e-30    238
BUDC_KLETE                         ACETOIN(DIACETYL) REDUCTA...   8.5e-29    241
YINL_LISMO                         HYPOTHETICAL 26.8 KD PROT...   1.1e-25    248
DHII_HUMAN                         CORTICOSTEROID 11-BETA-DE...     1e-24    292
FVT1_HUMAN                         no comment                     3.4e-24    332
HDE_CANTR                          HYDRATASE-DEHYDROGENASE-E...   5.8e-24    906
AP27_MOUSE                         ADIPOCYTE P27 PROTEIN (AP...   1.1e-23    244
DHGB_BACME                         GLUCOSE 1-DEHYDROGENASE B...   1.2e-22    262
3BHD_COMTE                         3-BETA-HYDROXYSTEROID DEH...   1.8e-22    253
BPHB_PSEPS                         BIPHENYL-CIS-DIOL DEHYDRO...   1.8e-22    275
FIXR_BRAJA                         FIXR PROTEIN                   1.9e-22    278
DHB3_HUMAN                         no comment                     3.2e-22    310
DHES_HUMAN                         ESTRADIOL 17 BETA-DEHYDRO...   6.8e-22    327
RIDH_KLEAE                         RIBITOL 2-DEHYDROGENASE (...   8.2e-22    249
HDHA_ECOLI                         7-ALPHA-HYDROXYSTEROID DE...   8.5e-22    255
NODG_RHIME                         NODULATION PROTEIN G (HOS...   9.2e-22    245
DHB2_HUMAN                         no comment                     1.6e-21    387
ENTA_ECOLI                         2,3-DIHYDRO-2,3-DIHYDROXY...   4.7e-21    248
DHMA_FLAS1                         N-ACYLMANNOSAMINE 1-DEHYD...   8.9e-21    270
HMTR_LEIMA                         no comment                     1.7e-20    287
BA72_EUBSP                         7-ALPHA-HYDROXYSTEROID DE...   3.3e-20    249
BDH_HUMAN                          D-BETA-HYDROXYBUTYRATE DE...   6.2e-20    343
GUTD_ECOLI                         SORBITOL-6-PHOSPHATE 2-DE...     7e-19    259
2BHD_STREX                         20-BETA-HYDROXYSTEROID DE...   8.4e-19    255
LIGD_PSEPA                         C ALPHA-DEHYDROGENASE (EC...   2.4e-17    305
RFBB_NEIGO                         no comment                     4.3e-17    346
ADH_DROME                          ALCOHOL DEHYDROGENASE (EC...   2.8e-14    255
DHCA_HUMAN                         no comment                     1.1e-13    276
MAS1_AGRRA                         no comment                     3.6e-11    476
PCR_PEA                            no comment                     9.6e-11    399
YURA_MYXXA                         no comment                     1.1e-09    258
CSGA_MYXXA                         no comment                     3.5e-08    166
FABI_ECOLI                         no comment                     6.4e-07    262

********************************************************************************


********************************************************************************
SECTION II: MOTIF DIAGRAMS
********************************************************************************
        - The ordering and spacing of all non-overlapping motif occurrences
          are shown for each high-scoring sequence listed in Section I.
        - A motif occurrence is defined as a position in the sequence whose
          match to the motif has POSITION p-value less than 0.0001.
        - The POSITION p-value of a match is the probability of
          a single random subsequence of the length of the motif
          scoring at least as well as the observed match.
        - For each sequence, all motif occurrences are shown unless there
          are overlaps.  In that case, a motif occurrence is shown only if its
          p-value is less than the product of the p-values of the other
          (lower-numbered) motif occurrences that it overlaps.
        - The table also shows the E-value of each sequence.
        - Spacers and motif occurrences are indicated by
           o -d-    `d' residues separate the end of the preceding motif
                    occurrence and the start of the following motif occurrence
           o [n]    occurrence of motif `n' with p-value less than 0.0001.
********************************************************************************

SEQUENCE NAME                      E-VALUE   MOTIF DIAGRAM
-------------                      --------  -------------
YRTP_BACSU                          2.3e-30  4_[2]_125_[1]_62
BUDC_KLETE                          8.5e-29  [2]_126_[1]_68
YINL_LISMO                          1.1e-25  3_[2]_125_[1]_73
DHII_HUMAN                            1e-24  32_[2]_125_[1]_88
FVT1_HUMAN                          3.4e-24  30_[2]_130_[1]_125
HDE_CANTR                           5.8e-24  6_[2]_131_[1]_136_[2]_121_[1]_
                                             240_[2]_153
AP27_MOUSE                          1.1e-23  5_[2]_118_[1]_74
DHGB_BACME                          1.2e-22  5_[2]_129_[1]_81
3BHD_COMTE                          1.8e-22  4_[2]_121_[1]_81
BPHB_PSEPS                          1.8e-22  3_[2]_124_[1]_101
FIXR_BRAJA                          1.9e-22  34_[2]_129_[1]_68
DHB3_HUMAN                          3.2e-22  46_[2]_126_[1]_91
DHES_HUMAN                          6.8e-22  [2]_129_[1]_151
RIDH_KLEAE                          8.2e-22  12_[2]_122_[1]_68
HDHA_ECOLI                          8.5e-22  9_[2]_124_[1]_75
NODG_RHIME                          9.2e-22  4_[2]_122_[1]_72
DHB2_HUMAN                          1.6e-21  80_[2]_126_[1]_134
ENTA_ECOLI                          4.7e-21  3_[2]_115_[1]_83
DHMA_FLAS1                          8.9e-21  12_[2]_127_[1]_84
HMTR_LEIMA                          1.7e-20  4_[2]_163_[1]_73
BA72_EUBSP                          3.3e-20  4_[2]_127_[1]_71
BDH_HUMAN                           6.2e-20  53_[2]_129_[1]_114
GUTD_ECOLI                            7e-19  [2]_128_[1]_84
2BHD_STREX                          8.4e-19  4_[2]_122_[1]_82
LIGD_PSEPA                          2.4e-17  4_[2]_127_[1]_127
RFBB_NEIGO                          4.3e-17  4_[2]_135_[1]_160
ADH_DROME                           2.8e-14  4_[2]_122_[1]_82
DHCA_HUMAN                          1.1e-13  2_[2]_165_[1]_62
MAS1_AGRRA                          3.6e-11  243_[2]_123_[1]_63
PCR_PEA                             9.6e-11  84_[2]_290
YURA_MYXXA                          1.1e-09  159_[1]_77
CSGA_MYXXA                          3.5e-08  87_[1]_57
FABI_ECOLI                          6.4e-07  4_[2]_129_[1]_82

********************************************************************************


********************************************************************************
SECTION III: ANNOTATED SEQUENCES
********************************************************************************
        - The positions and p-values of the non-overlapping motif occurrences
          are shown above the actual sequence for each of the high-scoring
          sequences from Section I.
        - A motif occurrence is defined as a position in the sequence whose
          match to the motif has POSITION p-value less than 0.0001 as
          defined in Section II.
        - For each sequence, the first line specifies the name of the sequence.
        - The second (and possibly more) lines give a description of the
          sequence.
        - Following the description line(s) is a line giving the length,
          combined p-value, and E-value of the sequence as defined in Section I.
        - The next line reproduces the motif diagram from Section II.
        - The entire sequence is printed on the following lines.
        - Motif occurrences are indicated directly above their positions in the
          sequence on lines showing
           o the motif number of the occurrence,
           o the position p-value of the occurrence,
           o the best possible match to the motif, and
           o columns whose match to the motif has a positive score (indicated
             by a plus sign).
********************************************************************************


YRTP_BACSU
  HYPOTHETICAL 25.3 KD PROTEIN IN RTP 5'REGION (ORF238)
  LENGTH = 238  COMBINED P-VALUE = 6.86e-32  E-VALUE =  2.3e-30
  DIAGRAM: 4_[2]_125_[1]_62

         [2]
         2.9e-20
         QGKVVLITGCSSGIGKATAKHFHKE
         +++ +++++++++++++++++++++
1    MQSLQHKTALITGGGRGIGRATALALAKEGVNIGLIGRTSANVEKVAEEVKALGVKAAFAAADVKDADQVNQAVA

         [1]
         6.6e-19
         YCASKFAVRGFTRSMAMEYAPY
         ++++++++ +++++++++++++
151  VTSAYSASKFAVLGLTESLMQEVRKHNIRVSALTPSTVASDMSIELNLTDGNPEKVMQPEDLAEYMVAQLKLDPR


BUDC_KLETE
  ACETOIN(DIACETYL) REDUCTASE (EC 1.1.1.5) (ACETOIN DEHYDROGENASE)
  LENGTH = 241  COMBINED P-VALUE = 2.57e-30  E-VALUE =  8.5e-29
  DIAGRAM: [2]_126_[1]_68

     [2]
     1.6e-20
     QGKVVLITGCSSGIGKATAKHFHKE
     +++++++++++++++++++++++++
1    MQKVALVTGAGQGIGKAIALRLVKDGFAVAIADYNDATATAVAAEINQAGGRAVAIKVDVSRRDQVFAAVEQARK

      [1]
      4.7e-17
      YCASKFAVRGFTRSMAMEYAPY
      +++++++++++++++++++++
151  VYSSSKFAVRGLTQTAARDLAPLGITVNGFCPGIVKTPMWAEIDRQCRKRRANRWATARLNLPNASPLAACRSLK


YINL_LISMO
  HYPOTHETICAL 26.8 KD PROTEIN IN INLA 5'REGION (ORFA)
  LENGTH = 248  COMBINED P-VALUE = 3.19e-27  E-VALUE =  1.1e-25
  DIAGRAM: 3_[2]_125_[1]_73

        [2]
        2.9e-20
        QGKVVLITGCSSGIGKATAKHFHKE
        ++++++++++++++++++++ ++++
1    MTIKNKVIIITGASSGIGKATALLLAEKGAKLVLAARRVEKLEKIVQIIKANSGEAIFAKTDVTKREDNKKLVEL

        [1]
        3.3e-14
        YCASKFAVRGFTRSMAMEYAPY
        +++++++++ +++ ++++ +++
151  GAVYGATKWAVRDLMEVLRMESAQEGTNIRTATIYPAAINTELLETITDKETEQGMTSLYKQYGITPDRIASIVA


DHII_HUMAN
  CORTICOSTEROID 11-BETA-DEHYDROGENASE (EC 1.1.1.146) (11-DH) (11-BETA-
   HYDROXYSTEROID DEHYDROGENASE) (11-BETA-HSD)
  LENGTH = 292  COMBINED P-VALUE = 3.04e-26  E-VALUE =    1e-24
  DIAGRAM: 32_[2]_125_[1]_88

                                     [2]
                                     1.1e-19
                                     QGKVVLITGCSSGIGKATAKHFHKE
                                     +++ +++++++++++++++++++++
1    MAFMKKYLLPILGLFMAYYYYSANEEFRPEMLQGKKVIVTGASKGIGREMAYHLAKMGAHVVVTARSKETLQKVV

                                     [1]
                                     6.1e-14
                                     YCASKFAVRGFTRSMAMEYAPY
                                     +++++++++++ ++ +++++ +
151  TVAALPMLKQSNGSIVVVSSLAGKVAYPMVAAYSASKFALDGFFSSIRKEYSVSRVNVSITLCVLGLIDTETAMK


FVT1_HUMAN
  no comment
  LENGTH = 332  COMBINED P-VALUE = 1.03e-25  E-VALUE =  3.4e-24
  DIAGRAM: 30_[2]_130_[1]_125

                                   [2]
                                   7.5e-16
                                   QGKVVLITGCSSGIGKATAKHFHKE
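
The Section I definitions of the combined p-value and E-value can be checked against the YRTP_BACSU record. The sketch below is not part of the MAST output and is not MAST's own code. It assumes that the sequence p-value of a motif can be approximated as the position p-value times the number of possible start positions, and it uses the standard closed form for the p-value of a product of n independent uniform p-values. The function names are hypothetical.

```python
import math

def product_p_value(product, n):
    """P-value of a product of n independent Uniform(0,1) p-values:
    product * sum_{i<n} (-ln product)^i / i!"""
    if product <= 0.0:
        return 0.0
    log_term = -math.log(product)
    total, term = 0.0, 1.0
    for i in range(n):
        if i > 0:
            term *= log_term / i
        total += term
    return min(1.0, product * total)

def e_value(combined_p, n_sequences):
    """E-value = combined p-value times the number of database sequences."""
    return combined_p * n_sequences

# YRTP_BACSU: LENGTH = 238, 33 sequences in the database.
length = 238
motifs = [(25, 2.9e-20), (22, 6.6e-19)]  # (motif width, position p-value)

# Assumption: sequence p-value ~ position p-value * number of start positions.
seq_p = [min(1.0, p * (length - width + 1)) for width, p in motifs]

combined = product_p_value(math.prod(seq_p), len(seq_p))
print(f"combined p-value ~ {combined:.2e}")
print(f"E-value          ~ {e_value(combined, 33):.1e}")
```

With the rounded position p-values printed in the record, this comes out close to the reported COMBINED P-VALUE of 6.86e-32 and rounds to the reported E-VALUE of 2.3e-30.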

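The motif diagrams in Section II can be turned back into sequence coordinates. The sketch below is a hypothetical helper, not part of MAST. It assumes the peptide-style diagram syntax used in this report (`d` spacers and unsigned `[n]` motif tags joined by underscores) and takes the motif widths from the DATABASE AND MOTIFS table. A diagram wrapped over two lines, as for HDE_CANTR, has to be joined first.

```python
import re

# Motif widths from the DATABASE AND MOTIFS table of this report.
MOTIF_WIDTHS = {1: 22, 2: 25}

def parse_diagram(diagram):
    """Split a diagram such as '4_[2]_125_[1]_62' into (kind, value) tokens."""
    tokens = []
    for part in diagram.split("_"):
        match = re.fullmatch(r"\[(\d+)\]", part)
        if match:
            tokens.append(("motif", int(match.group(1))))
        elif part:
            tokens.append(("spacer", int(part)))
    return tokens

def motif_starts(diagram, widths=MOTIF_WIDTHS):
    """Return ([(motif, 1-based start), ...], implied sequence length)."""
    position, starts = 1, []
    for kind, value in parse_diagram(diagram):
        if kind == "motif":
            starts.append((value, position))
            position += widths[value]
        else:
            position += value
    return starts, position - 1

print(motif_starts("4_[2]_125_[1]_62"))  # YRTP_BACSU
```

For YRTP_BACSU this places motif 2 at residue 5 and motif 1 at residue 155, and the implied length of 238 agrees with the LENGTH column in Section I.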