transeq.txt

GGPVVEVMHDDIRVGANRFGRLPADLEHAARIDFRDARRQAAVVVLALHPDRPLWRVDVDVVQVPLHVFHPPVACGLRGAAEQQGLPVGRLGPGGDGRVLREETMAGLDEGPAGGRIDAGEVRRDHHLPLCHVTLHLRHLRLAGGQAGDRRPPAVAVATGDGAIQLGGAGAHHGGEDHVGARLVDALDGALQVVVGGIQRNVDFLEHRAAVLAIQVAHHMVAFPRIDVVRADEHHPLAVVANQVRRQRRTVLVRRRTAVDDVRRILEALVGGRVAEQRVGALDHRHHRLARVRHVAAHEEPYPPVANEVLGAQPIAVRVAAGVLGQRFDRATADAALAVQLLDREQCAIRVRALDIGGDAGFGEQQADQRPLLVRSHPFPLL*LSLSAVRSEPARAPGSPGRSCSSRRPAGGDRADARPAVP  Output files for usage example 7  File: amir.pep>PAAMIR_1 Pseudomonas aeruginosa amiC and amiR gene for aliphatic amidase regulationVPLAEHLLDHHQPGEASLEHERVRFVRR*ATVTGEETDGIAPGAAADRPAVLRNRRHRRYVRRGLHSQPGGTVPRGLLHVAHAQGGDAGGRARRRAALLPDPLRGLRVFAEHRLRRSGAEX  Output files for usage example 8  File: mito.pep>NC_001321.1_1 Balaenoptera physalus mitochondrion, complete genomeVNY*SAHDHNMTEVSYIWYFFIFFGGLARTPLWP**VSSQSDKL*LGLDVFVIWLAQPTCAVKLMVTGHSTPLFPPGSKNCMS**TKPPSFHTMLTLCLDIHHPP*QARP*I*KPFYL*INTKSDTSPMMKMHERHPYPMRWCSLNTYKA*HWKCLDGSSQPHWH**FGPSLSISS*QTYTCKYPHPSENAL*IMKIK*SGYQAR*H*QLTTPRLATPPRDTAVMKIKL*TKVRLSHVNL*VGKLRASHRGHTIDPN**KHGVKSVKEPHEMKSNLN*AVKSPN*N*AKLRKWL*YNLITRQL*SKLGLDTPLCLVVNPNSHKT*LFA*VLLATA*NSKDLAVPHTHLEEPVL*PMNPDQPHQPLLLQSMYRHLQQTLKGEK*A*PSYMKTLGQGVTHGLGSNGLHFLS*EHPLYSHESFYET*KLKEDLVVNQEQSAWLNKAM*ARTHRPSPSSSTPAMNPSSLTQAKQLYE**QVVT**AYRKVCLDKT*YSLNKACSLHLEDSTARVYLELALAHTLPTSTTTNQSNKTFTIPSKY***KFKYQWRY*DSTV*KDE*KT*K**KAKLTTCTFCMMT*LVMN*Q*DLKLNYPKPDELLMSST*NELIYVAK*WEDL*VEVKSLTSLVMAGCPWKESQFNIK*Y*KPMPSLNVYLTVNLK*YSFLEMGTTLT*E*NQT*T*LA*KQPSIKKAFKLDNKMMF*FQH*VNQLLAWLLD*SMQM*KQYC*YE*QEIFLLAQAYTSNW*YTDN*QQMNKTQH*IIY*NTVNPTQACIKE*LKKVKGTRQTQTPPVYQKHHL*HNQY*STACPVTNR*TAAVSWPCKGSMITCSLI*DLYEWPHEGFTVSYF*SVKLTSPW*GGDNKM*REDPMELQLINPKTMTLNHQGMTKPYMGWQFRLGWPRSTKNPPSD*NLGPLAKVQYHLLIQSFDQRNKLP*G*QRNPILESMSTMGFTTSMLDQDILMVQLLL*VRLFND*SPTWSEF*PE*S*SVSIYYAFLPVRKDK*NKANFKQAPSNN*WPSLNLMIKRKQTCP*PGPCWGG*VR*LHKT*TFTP*GSNPLPNKMFMINILTLILPILLAVAFLTLVERKILGYMQFRKGPNIVGPHGLLQPFADAIKLFTKEPLRPATSSTTMFIIAPVLALTLALTMWSPLPMPYPLINMNLGVLFMLAMSSLAVYSILWSGWASNSKYALIGALRAVAQTISYE
VTLAIILLSVLLMNGSYTLSTLATTQEQLWLLFPSWPLAMMWFISTLAETNRAPFDLTEGESELVSGFNVEYAAGPFALFFLAEYANIIMMNMLTAILFLGTFHNPHNPELYTANLIIKTLLLTMSFLWIRASYPRFRYDQLMHLLWKNFLPLTLALCMWHISLPIMTASIPPQT*EMCLMKELLW*SK***PKSSYF*NN*NRTYP*EFKVLRATMLHYNLQ*GQLNKLSGPYPENVGSYPSHTNKPINPYYPPDNPYP*YNNGSHQLSLTISLNWLRNEHNSLHPYHNKKSYSPGH*SFYQVPPNTSHCFRTPHNSSHH*LNAL*PMNYYKTI*PNSIHTHNSSPSHQTGISPLPLLSS*SNT*YPPNH*PNPINMTKTSSLINPMPNFTIN*PTPNINHILTFHLN**L*WTKPNTTSKNHSLLINCPH*MNNGHSTM*SNPNITKSTNLHHNNLHHIYIIYPKLNYHYIVTVSNLK*NTRHHNPYHTHFTLD**TPTTIGVYTQMNNYS*TNKKWYTHCTNIHSHYSITQPMLLYTSYLLHSTNTISLHK*YKNKMTIQLHKTNSSPTNSNRNFHYATTPHTNTLNPTMGV*VKP*P*AFKALSKYNLLNSCPM*TA*LYLTSIECKSNALIKLNPH*IGGMHLPRIFS*QLNTLINWLQSTSPAA*KK*REKSRQDLKLLPWICNSKWSFTTGLGKK*TQPLSLDLQSNTYSAILPMFMNRWLFSTNHKDIGTLYLLFGAWAGMVGTGLSLLIRAELGQPGTLIGDDQVYNVLVTAHAFVMIFFMVMPIMIGGFGNWLVPLMIGAPDMAFPRMNNMSFWLLPPSFLLLMASSMIEAGAGTGWTVYPPLAGNLAHAGASVDLTIFSLHLAGVSSILGAINFITTIINMKPPAMTQYQTPLFVWSVLVTAVLLLLSLPVLAAGITMLLTDRNLNTTFFDPAGGGDPILYQHLFWFFGHPEVYILILPGFGMISHIVTYYSGKKEPFGYMGMVWAMVSIGFLGFIVWAHHMFTVGMDVDTRAYFTSATMIIAIPTGVKVFSWLATLHGGNIKWSPALMWALGFIFLFTVGGLTGIVLANSSLDIVLHDTYYVVAHFHYVLSMGAVFAIMGGFVHWFPLFSGYTLNTTWAKIHFMIMFVGVNLTFFPQHFLGLSGMPRRYSDYPDAYTTWNTISSMGSFISLTAVMLMIFIIWEAFTSKREVLAVDLTSTNLEWLNGCPPPYHTFEEPAFVNPKWS*KEGIEPSPIGFKPTS*LLCLSL*T*Y**NLM*LCQS*VTSENPVYLHGMSIPT*FP*CSITHH**APTLSRSYTNNRFSN*LFSSLHYYPNAYNQINTY*YN*RP*S*NCLNYPPSHYLNFNCLAFITDPLHN*RSQ*PLPHCKNN*SPMMLKLWVYRLR*PKLRLLYNPNI*PKA**TTII*S**PSCLTY*NNNPNISLI**RTPLMGRTLLGPKN*CNP*TPKPNNLNINTT*PILWTML*DLRLKPQFHTNCP*TSTP*SLWKMICINTMTSL*S*ISINLLS**L*VYNSP*WYATI*YINMTPYYSINTLNPLCIIPIKNLKALLFP*PQTSTYQNTKTTSSLKHHMNENLFAPFMIPVMLGIPITTLIIILPSMLFPAPNRLINNRTIAIQQWLTKLTSKQLMNVHSPKGQTWSLMLISLFLFIASTNLLGMLPHSFTPTTQLSMNVGMAIPLWAGTVTTGFRNKTKMSLAHLLPQGTPTFLIPMLVIIETISLFIQPVAWAVRLTANITAGHLLMHLIGETTLALMNINLFSAFITFTILALLTILEFAVALIQAYVFTLLVSLYLHDNT*WPTKPTHTT**TPALDPSPELYQPF**HQA*LYDFTSTQYSY*L*ACQQMF*QYTNDGEMSSEKAPSKAIMHQPSK*AYDTE*FYLSSQKSYFSQASSEPSTTQALPLLQN*ADVDHQQASAL*IP*KFPFSTPPYY*PLAYLLPEPTMAW*KETANTYFKHSSSQLH*ASTSPYYKHQSTTK
PLSQSQTESTAPPSL*PQAFMGYM*SLDLLSLSSVSYVK*NSTSHQTTTLALNVPLDTDIS*TSYDYFFMYLSIDEVPSPFSINKYNWLPIS*FRCTPKKNNKPSTNTTNKYNTSPTTRIHRLLTSTTKRMR*KNKPMWMRIWPH*ISPPTLLHKILLGGHYFPSLWL*NRSLTPPSLSNSVKQPKHNTHNSLILNLPTSSQPSLWMNS**P*MSWMWYLV*DKTSDFDPLDCDQIHNYQMTLIHMNILMAFSMSLMGLLMYRSHLMSALLCLEGMMLSLFVLAALTILSSHFTLANMMPIILLVFAACEAAIGLALLVMVSNTYGTDYVQNLNLLQC*NLLFLQSY*YP*PDYQKMT*SELTPQPTVY*LASQAFSSSINSTTTALTTH*YSSPTPFLPHSWS*QYDSFP*Y**QVNPISSKNHQSEKNSTLRY*SHYKPS*L*HLLPLN*SYFMSYLKPH*SLPLSLSLAGATKQNDSMPDYTSYSMH*LDLSHY**H*YIYKMQQDP*TFYSYNTELNHYLRPDPTSSYD*PA**PS**KYLSMDYTFDCPKHT*KPPLQAP*SLQPYY*NLEAMAYYELHPYSIP*QNT*HTHFLYSLFEE*S*PALSVYVKQT*NH*LHIPQLVT*HSSSQLSSSKPPEAM*GPLP**LPTASHPPYYSVWQTRTTNAFMAEP*FCPEAYKSFYH**PVDDY*QA*QILHYPQPST*SENYS*SCRSSHDQIPLFS**EQML*LLLSTLYMY*S*HNVANTHTTSMMSPLPSHESMP**PYTLFPSCSYH*TLKSS*ALSTVSMV*K*R*FVKLTMEDQNFLLTEKVLQELLIHAPTPNSCGFFKLLQDSSYPLVLGAKKLVQLQMKVMNLFTSFTLLTLLILTTPIMMSHTGSHVNNKYQSYVKNIVFCAFITSLVPAMVYLHTNQETLISNWHWITIQTLKLTLSFKMDYFSLMFMPVALFITWSIMEFSMWYMHSDPYINQFFKYLLLFLITMLILVTANNLFQLFIGWEGVGIMSFLLIGWWFGRTDANTAALQAILYNRIGDIGLLASMAWFLSNMNTWDLEQIFMLNQNPLNFPLMGLVLAAAGKSAQFGLHPWLPSAMEGPTPVSALLHSSTMVVAGIFLLVRFYPLMENNKLIQTVTLCLGAITTLFTAICALTQNDIKKIIAFSTSSQLGLMMVTIGLNQPYLAFLHICTHAFFKAMLFLCSGSIIHNLNNEQDIRKMGGLFKALPFTTTALIIGCLALTGMPFLTGFYSKDPIIEAATSSYTNAWALLLTLIATSLTAVYSTRIIFFALLGQPRFPPSTTINENNPLLINPIKRLLVGSIFAGFILSNSIPPMTTPLMTMPLHLKLTALAMTTLGFIIAFEINLDTQNLKHKHPSNSFKFSTLLGYFPTIMHRLPPHLDLLMSQKLATSLLDLTWLETILPKTTALIQLKASTLTSNQQGLIKLYFLSFLITITLSMILFNYPE*SP**LQH**MKTNP*QSPTKHHNYMMPQSL*PPH*KPQNPQYHKQPSPLVHQTQT*SSPPHSSKHKSQLKTPPPTLKQMLLVQLY*KPKPQDTVQ*P*LLYNQMLPAFPPNKSKTPLTPKTNHQNSK*LHIQHHHPQSTLNPHK*VKALKKPPQN*LQK*YLKWKQYTLSLFSHGLQPWPMTWKIIVVIQLQEHQWPTSEKHTH**KSSTTHSSISPPHQMSLHDGTSAPYSASA*LYKS*QAYS*QYTTHQTQQPPSHQSHTSAETWITAELSDTYMQMGLLYSSSASTLT*DEAYTTAPTPSEKHEMLELFYYSQL*PPHS*ATSCPEDKYHSEAQL*SLTSYQQSHTLVPP*SNESEAVSL*MKQH*HAFLPFTLSSPSSS*H*QLSTLFSFTKQDPTTPQASHPT*MKSHSTPTTQLKTF*VPYY*S*SY*Y*PYSHPTYLETQTTMPQQTHSVPQHTLNQNGIFYSHTQSYDQSPTN*AES*PYYSQS*S*PSSQYSTHPINEA*YFDPLASSCSES*SQIY*P*
HGSAANQ*NTPT*L*ANSHPSSISS*F*Y*YQ*LVLS*TNL*NEESL*YN*MPRFCKPEKET*HTSL*LKEEVLHSTISTQSWSST*TIPWKSMLYNNH*TTVLCPYWK*LALLDIIM*LVHACTST*LMASFHGYEQMYMLCMIVHSIIFTTSSWSSY*ILLILHIT*YVLMVQ*RMFLCIP*SI*IKWFLWPLH*ITSLVSMPRETSNPLG*DPSSRTGPITRGGSYLMIFM*HLVLTSGPY*LKIAHSFPLNKTSRW

   One or more peptide sequences are written out.

   The names of the resulting protein sequences are formed from the name
   of the input nucleic acid sequence with '_' and the translation frame
   appended to it. Thus a nucleic acid sequence with the name 'XYZ'
   translated in all 6 frames would produce protein sequences with the
   names: 'XYZ_1', 'XYZ_2', 'XYZ_3', 'XYZ_4', 'XYZ_5', 'XYZ_6'.

   If regions are specified, they are taken to be translated in frame 1
   and so the output name would be 'XYZ_1'.

Data files

   EMBOSS data files are distributed with the application and stored in
   the standard EMBOSS data directory, which is defined by the EMBOSS
   environment variable EMBOSS_DATA.

   To see the available EMBOSS data files, run:

% embossdata -showall

   To fetch one of the data files (for example 'Exxx.dat') into your
   current directory for you to inspect or modify, run:

% embossdata -fetch -file Exxx.dat

   Users can provide their own data files in their own directories.
   Project-specific files can be put in the current directory, or, for
   tidier directory listings, in a subdirectory called ".embossdata".
   Files for all EMBOSS runs can be put in the user's home directory, or
   again in a subdirectory called ".embossdata".

   The directories are searched in the following order:
     * . (your current directory)
     * .embossdata (under your current directory)
     * ~/ (your home directory)
     * ~/.embossdata

   The EMBOSS REBASE restriction enzyme data files are stored in
   directory 'data/REBASE/*' under the EMBOSS installation directory.

   These files must first be set up using the program 'rebaseextract'.
   Running 'rebaseextract' may be the job of your system manager.

   The data files are stored in the REBASE directory of the standard
   EMBOSS data directory. The names are:
     * embossre.enz Cleavage information
     * embossre.ref Reference/methylation information
     * embossre.sup Supplier information

   The column information is described at the top of the data files.

   The reported enzyme from any one group of isoschizomers (the
   prototype) is specified in the REBASE database and the information is
   held in the data file 'embossre.equ'. You may edit this file to set
   your own preferred prototype, if you wish.

   The format of the file "embossre.equ" is

   Enzyme-name Prototype-name

   i.e. two columns of enzyme names separated by a space. The first name
   of the pair of enzymes is the name that is not preferred and the
   second is the preferred (prototype) name.

Notes

   The reverse frame '-1' is defined as the translation you get when you
   use the reverse-complement of the sequence with the same codon phase
   as the codon in frame '1'.

   Thus the sequence ACTGG in frame 1 is the translation of the codons
   ACT,GG; the translation of frame -1 uses these same codons, reverse
   complemented:

  forward sense          ACT GG
  reverse sense          TGA CC
  reverse-complement     CC AGT
  frame -1 translation       S

   Frame -1 is the translation of CCAGT (the reverse complement of ACTGG)
   using the codon 'AGT' (the first bases 'CC' are ignored). The result
   is the peptide 'S'.

   Similarly frame -2 is the phase used by frame 2, 'CAG T' (the first
   base 'C' is ignored). The last base cannot be successfully translated
   and is output as the unknown residue 'X'. The result is the peptide
   'QX'.

   Frame -3 is the phase used by frame 3, 'CCA GT'. The last two bases
   will translate to 'V' as it does not matter what the next base is
   (GTA, GTC, GTG, GTT all code for 'V'). The result is the peptide 'PV'.

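   The frame definitions above can be checked with a small sketch. This
   is an illustration only, not the EMBOSS implementation: the function
   names (reverse_complement, translate_codon, reverse_frame) are
   hypothetical, the standard genetic code is assumed, and the handling
   of a trailing partial codon follows the description in these notes
   (a lone base gives 'X'; two bases give a residue only when every
   possible third base yields the same one). The 'alternative' argument
   mimics the other definition, where frame -1 is simply frame 1 of the
   reverse complement.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (assumption:
# the default translation table is the one wanted here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate_codon(codon):
    """Translate a full codon, or a trailing partial codon as described above."""
    if len(codon) == 3:
        return CODON_TABLE.get(codon, "X")
    if len(codon) == 2:
        # Two bases are enough only if all four completions agree.
        residues = {CODON_TABLE.get(codon + b, "X") for b in BASES}
        return residues.pop() if len(residues) == 1 else "X"
    return "X"  # a single leftover base cannot be translated


def translate(seq, offset):
    """Translate seq in steps of three bases, starting at offset."""
    return "".join(translate_codon(seq[i:i + 3]) for i in range(offset, len(seq), 3))


def reverse_frame(seq, frame, alternative=False):
    """Translate reverse frame -1, -2 or -3 (hypothetical helper).

    Default: keep the codon phase of the matching forward frame, so the
    bases skipped at the start of the reverse complement depend on the
    sequence length. alternative=True: start at base 1, 2 or 3 of the
    reverse complement instead.
    """
    k = abs(frame)
    rc = reverse_complement(seq)
    offset = (k - 1) if alternative else (len(seq) - (k - 1)) % 3
    return translate(rc, offset)


if __name__ == "__main__":
    for frame in (-1, -2, -3):
        print(frame, reverse_frame("ACTGG", frame))
```

   For ACTGG this reproduces the peptides given above for frames -1,
   -2 and -3. With alternative=True, frame -1 instead starts at the
   first base of CCAGT, so the codons no longer line up with those of
   frame 1.
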
   The alternative way of generating the reverse translation frames used
   by some people is that frame -1 is made by taking frame '1' of the
   reverse complement. In that case there is no fixed correspondence
   between the codons used in frames 1 and -1, 2 and -2, 3 and -3; the
   codons used change with the sequence length modulo 3.

   There does not appear to be a convention on which definition to use.

   The Staden package uses the same convention as this program.

   The GCG package sneakily avoids the problem by naming the frames using
   letters (a, b, c, d, e, f).

   If you really need to define frame -1 as the frame given when you
   reverse complement the sequence and then start translating at the
   first frame in the resulting sequence, then use the '-alternative'
   qualifier.

   (Reverse sense translations are biological nonsense, really, but can
   be very useful in practice.)

References

   None.

Warnings

   When translating using a non-standard genetic code table, always check
   the table carefully for deviations from your particular organism's
   code.

   When using the '-regions' option, you should always leave the
   '-frames' option at the default of frame '1'. If you change the frame
   while specifying a region to translate, then the regions will be
   offset by 1 or 2 bases, which is not what you want.

Diagnostic Error Messages

   Several warning messages about malformed region specifications:
     * Non-digit found in region ...
     * Unpaired start of a region found in ...
     * The start of a pair of region positions must be smaller than the
       end in ...

Exit status

   It exits with status 0, unless a region is badly constructed.

Known bugs

   When using the '-regions' option, you should always leave the
   '-frames' option at the default of frame '1'. If you change the frame
   while specifying a region to translate, then the regions will be
   offset by 1 or 2 bases, which is not what you want.

See also

   Program name  Description
   backtranambig Back translate a protein sequence to ambiguous codons
   backtranseq   Back translate a protein sequence
   coderet       Extract CDS, mRNA and translations from feature tables
   plotorf       Plot potential open reading frames
   prettyseq     Output sequence with translated ranges
   remap         Display sequence with restriction sites, translation etc
   showorf       Pretty output of DNA translations
   showseq       Display a sequence with features, translation etc
   sixpack       Display a DNA sequence with 6-frame translation and ORFs

Author(s)

   Gary Williams (gwilliam