                                 epestfind

Function

   Finds PEST motifs as potential proteolytic cleavage sites

Description

   epestfind allows rapid and objective identification of PEST motifs in
   protein target sequences. Briefly, the PEST hypothesis was based on a
   literature survey that combined information on protein stability with
   protein primary sequence information. Initially, the study relied on
   12 short-lived proteins with well-known properties [1], but it was
   continually extended later [2,3]. The initial group of proteins
   included E1A, c-myc, p53, c-fos, v-myb, P730 phytochrome, heat shock
   protein 70 (HSP 70), HMG-CoA reductase, tyrosine aminotransferase
   (TAT), ornithine decarboxylase (ODC), alpha-Casein and beta-Casein.
   Although these proteins perform different cellular functions, it
   became apparent that they share high local concentrations of the
   amino acids proline (P), glutamic acid (E), serine (S), threonine (T)
   and, to a lesser extent, aspartic acid (D). From this it was concluded
   that PEST motifs dramatically reduce the half-lives of proteins and
   hence target them for proteolytic degradation.

   'Pest' means plague (the Black Death) in German, so the name of this
   programme sounds a bit strange, at least to our ears.

Algorithm

   PEST motifs were defined as hydrophilic stretches of at least 12 amino
   acids in length with a high local concentration of critical amino
   acids. Remarkably, negatively charged amino acids are clustered within
   these motifs, while the positively charged amino acids arginine (R),
   histidine (H) and lysine (K) are generally forbidden.

   The epestfind algorithm defines the last criterion even more
   stringently in that PEST motifs are required to be flanked by
   positively charged amino acids. Although this requirement greatly
   facilitates computer scanning, a few PEST sequences might be missed.
   Sequences with a high local concentration of critical amino acids but
   a long distance between positively charged amino acids are especially
   error prone. Due to their length, these PEST motifs might become
   diluted, which results in scores lower than initially expected.
   Another side effect of scanning for positively charged amino acids is
   that very long PEST motifs are sub-divided into adjacent smaller ones.

   Identification of PEST motifs starts with an initial scan for the
   positively charged amino acids arginine (R), histidine (H) and lysine
   (K) within the specified protein sequence. All amino acids between the
   positively charged flanks are counted, and only those motifs that
   contain a number of amino acids equal to or higher than the
   window-size parameter are considered further. Additionally, all
   'valid' PEST regions are required to contain at least one proline (P),
   one aspartate (D) or glutamate (E) and at least one serine (S) or
   threonine (T). Sequences that do not meet the above criteria are
   classified as 'invalid' PEST motifs and excluded from further
   analysis.

   The quality of 'valid' PEST motifs is refined by means of a scoring
   parameter based on the local enrichment of critical amino acids as
   well as the motif's hydrophobicity. Enrichment of D, E, P, S and T is
   expressed in mass percent (w/w) and corrected for one equivalent of D
   or E, one of P and one of S or T. Calculation of hydrophobicity
   follows in principle the method of J. Kyte and R.F. Doolittle [4]. To
   simplify calculations, Kyte-Doolittle hydropathy indices, which
   originally range from -4.5 for arginine to +4.5 for isoleucine, were
   converted to positive integers. This was achieved by the following
   linear transformation, which yields values from 0 for arginine to 90
   for isoleucine.
        Hydropathy index = 10 * Kyte-Doolittle hydropathy index + 45

   The motif's hydrophobicity is calculated as the sum over the products
   of mole percent and hydrophobicity index for each amino acid species.
   The desired PEST score is obtained as a combination of the local
   enrichment term and the hydrophobicity term, as expressed by the
   following equation:

          PEST score = 0.55 * DEPST - 0.5 * hydrophobicity index

   Although the formula above differs from the publication [1], it is in
   fact the correct one, which was also implemented in the original BASIC
   programme (personal communication). In addition, the programme
   includes a correction for the hydropathy index of tyrosine, introduced
   by Robert H. Stellwagen from the University of Southern California.
   PEST scores can range from -45 for poly-isoleucine to about +50 for
   poly-aspartate plus one proline and one serine. 'Valid' PEST motifs
   below the threshold score (5.0) are considered 'poor', while PEST
   scores above the threshold score are of real biological interest. The
   higher the PEST score, the more likely it is that degradation of the
   protein is mediated via 'potential' PEST motifs in eukaryotic cells.

   Presently, all modified Kyte-Doolittle hydropathy indices are
   hard-coded into the programme, which might change in future.

   The array of linearly transformed Kyte-Doolittle hydropathy indices
   (ltkdhi) is listed in alphabetical order below.
   (A-M and N-Z as well as N-terminus and C-terminus)

   63, 10, 70, 10, 10, 72, 41, 13, 90, 0, 6, 82, 64, 10, 0, 29, 10, 0,
   36, 38, 0, 87, 36, 45, 58, 10, 0, 0

   The linear transformation was ltkdhi = 10 * kdhi + 45
   All values range from Arginine R = 0 to Isoleucine I = 90
   B=(N|D)=10 since N=10 and D=10
   Z=(Q|E)=10 since Q=10 and E=10
   X=10*0+45=45

Usage

   Here is a sample session with epestfind

% epestfind -graph cps -invalid
Finds PEST motifs as potential proteolytic cleavage sites
Input protein sequence: exu2_drops.embl
Window length [10]:
Sort order of results
      1 : length
      2 : position
      3 : score
Sort order of results [score]:
Output file [exu2_drops.epestfind]:

Created epestfind.ps

Command line arguments

   Standard (Mandatory) qualifiers:
  [-sequence]          sequence   Protein sequence USA to be analysed.
   -window             integer    [10] Minimal distance between positively
                                  charged amino acids. (Integer 2 or more)
   -order              selection  [score] Sort order of the results. Results
                                  may be sorted by length, position or score.
  [-outfile]           outfile    [*.epestfind] Name of file to which results
                                  will be written.
   -graph              xygraph    [$EMBOSS_GRAPHICS value, or x11] Graph type
                                  (ps, hpgl, hp7470, hp7580, meta, cps, x11,
                                  tekt, tek, none, data, xterm, png)

   Additional (Optional) qualifiers:
   -aadata             datafile   [Eamino.dat] Amino acids properties and
                                  molecular weight data file
   -threshold          float      [+5.0] Threshold value to discriminate weak
                                  from potential PEST motifs.
                                  Valid PEST motifs are discriminated into
                                  'poor' and 'potential' motifs depending on
                                  this threshold score. The default value of
                                  +5.0 is based on experimental data.
                                  Alterations are not recommended since
                                  significance is a matter of biology, not
                                  mathematics. (Number from -55.00 to 55.00)

   Advanced (Unprompted) qualifiers:
   -[no]potential      boolean    [Y] Decide whether potential PEST motifs
                                  should be printed.
   -[no]poor           boolean    [Y] Decide whether poor PEST motifs should
                                  be printed.
   -invalid            boolean    [N] Decide whether invalid PEST motifs
                                  should be printed.
   -[no]map            boolean    [Y] Decide whether PEST motifs should be
                                  mapped to sequence.
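
   As an illustration only, the interplay of -threshold with the
   -potential, -poor and -invalid switches described above can be
   sketched in Python. This is not the EMBOSS source code: the function
   names pest_score, classify and motifs_to_print are hypothetical, the
   motif tuples are made-up inputs, and treating a score exactly equal to
   the threshold as 'poor' is an assumption, since the text only
   distinguishes scores below and above the threshold.

```python
def pest_score(depst_mass_percent, hydrophobicity_index):
    """PEST score = 0.55 * DEPST - 0.5 * hydrophobicity index."""
    return 0.55 * depst_mass_percent - 0.5 * hydrophobicity_index


def classify(is_valid, score, threshold=5.0):
    """Return 'invalid', 'poor' or 'potential' for one candidate motif."""
    if not is_valid:
        return "invalid"
    # Assumption: a score exactly equal to the threshold counts as 'poor'.
    return "potential" if score > threshold else "poor"


def motifs_to_print(motifs, threshold=5.0,
                    potential=True, poor=True, invalid=False):
    """Filter (name, is_valid, score) tuples by the three print switches.

    The keyword defaults mirror the documented defaults of
    -[no]potential (Y), -[no]poor (Y) and -invalid (N).
    """
    wanted = {"potential": potential, "poor": poor, "invalid": invalid}
    selected = []
    for name, is_valid, score in motifs:
        category = classify(is_valid, score, threshold)
        if wanted[category]:
            selected.append((name, category))
    return selected


if __name__ == "__main__":
    # Hypothetical candidates: (name, passed validity checks, PEST score).
    candidates = [
        ("motif_a", True, pest_score(60.0, 20.0)),  # well above +5.0
        ("motif_b", True, pest_score(30.0, 40.0)),  # below +5.0
        ("motif_c", False, 0.0),                    # failed validity checks
    ]
    print(motifs_to_print(candidates))                # default switches
    print(motifs_to_print(candidates, invalid=True))  # as with -invalid
```

   With the default switches only 'potential' and 'poor' motifs are
   reported; passing invalid=True corresponds to the -invalid option used
   in the sample session.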

   Associated qualifiers:

   "-sequence" associated qualifiers
   -sbegin1            integer    Start of the sequence to be used
   -send1              integer    End of the sequence to be used
   -sreverse1          boolean    Reverse (if DNA)
   -sask1              boolean    Ask for begin/end/reverse
   -snucleotide1       boolean    Sequence is nucleotide
   -sprotein1          boolean    Sequence is protein
   -slower1            boolean    Make lower case
   -supper1            boolean    Make upper case
   -sformat1           string     Input sequence format
   -sdbname1           string     Database name
   -sid1               string     Entryname
   -ufo1               string     UFO features
   -fformat1           string     Features format
   -fopenfile1         string     Features file name

   "-outfile" associated qualifiers
   -odirectory2        string     Output directory

   "-graph" associated qualifiers
   -gprompt            boolean    Graph prompting
   -gdesc              string     Graph description
   -gtitle             string     Graph title
   -gsubtitle          string     Graph subtitle
   -gxtitle            string     Graph x axis title
   -gytitle            string     Graph y axis title
   -goutfile           string     Output file for non interactive displays
   -gdirectory         string     Output directory

   General qualifiers:
   -auto               boolean    Turn off prompts
   -stdout             boolean    Write standard output
   -filter             boolean    Read standard input, write standard output
   -options            boolean    Prompt for standard and additional values
   -debug              boolean    Write debug output to program.dbg
   -verbose            boolean    Report some/full command line options
   -help               boolean    Report command line options.
                                  More information on associated and general
                                  qualifiers can be found with -help -verbose
   -warning            boolean    Report warnings
   -error              boolean    Report errors
   -fatal              boolean    Report fatal errors
   -die                boolean    Report dying program messages

Input file format

   epestfind reads any normal protein sequence USA.

  Input files for usage example

  File: exu2_drops.embl

ID   EXU2_DROPS     STANDARD;      PRT;   477 AA.
AC   Q24617;
DT   01-NOV-1997 (Rel. 35, Created)
DT   01-NOV-1997 (Rel. 35, Last sequence update)
DT   01-NOV-1997 (Rel. 35, Last annotation update)
DE   Maternal exuperantia 2 protein.
GN   EXU2.
OS   Drosophila pseudoobscura (Fruit fly).
OC   Eukaryota; Metazoa; Arthropoda; Tracheata; Hexapoda; Insecta;
OC   Pterygota; Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;
OC   Ephydroidea; Drosophilidae; Drosophila.
OX   NCBI_TaxID=7237;
RN   [1]
RP   SEQUENCE FROM N.A.
RX   MEDLINE=94350208; PubMed=8070663;
RA   Luk S.K.-S., Kilpatrick M., Kerr K., Macdonald P.M.;
RT   "Components acting in localization of bicoid mRNA are conserved among
RT   Drosophila species.";
RL   Genetics 137:521-530(1994).
CC   -!- FUNCTION: ENSURES THE PROPER LOCALIZATION OF THE MRNA OF THE
CC       BICOID GENE TO THE ANTERIOR REGIONS OF THE OOCYTE THUS PLAYING
CC       A FUNDAMENTAL ROLE IN THE ESTABLISHMENT OF THE POLARITY OF THE
CC       OOCYTE. MAY BIND THE BCD MRNA (BY SIMILARITY).
CC   --------------------------------------------------------------------------
CC   This SWISS-PROT entry is copyright. It is produced through a collaboration
CC   between  the Swiss Institute of Bioinformatics  and the  EMBL outstation -
CC   the European Bioinformatics Institute.  There are no  restrictions on  its
CC   use  by  non-profit  institutions as long  as its content  is  in  no  way
