extractseq.txt
Input files for usage example 2

  Database entry: tembl:hsfau1

ID   HSFAU1     standard; DNA; HUM; 2016 BP.
XX
AC   X65921; S45242;
XX
SV   X65921.1
XX
DT   13-MAY-1992 (Rel. 31, Created)
DT   21-JUL-1993 (Rel. 36, Last updated, Version 5)
XX
DE   H.sapiens fau 1 gene
XX
KW   fau 1 gene.
XX
OS   Homo sapiens (human)
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC   Eutheria; Primates; Catarrhini; Hominidae; Homo.
XX
RN   [1]
RP   1-2016
RA   Kas K.;
RT   ;
RL   Submitted (29-APR-1992) to the EMBL/GenBank/DDBJ databases.
RL   K. Kas, University of Antwerp, Dept of Biochemistry T3.22,
RL   Universiteitsplein 1, 2610 Wilrijk, BELGIUM
XX
RN   [2]
RP   1-2016
RX   MEDLINE; 92412144.
RA   Kas K., Michiels L., Merregaert J.;
RT   "Genomic structure and expression of the human fau gene: encoding the
RT   ribosomal protein S30 fused to a ubiquitin-like protein.";
RL   Biochem. Biophys. Res. Commun. 187:927-933(1992).
XX
DR   SWISS-PROT; P35544; UBIM_HUMAN.
DR   SWISS-PROT; Q05472; RS30_HUMAN.
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..2016
FT                   /db_xref="taxon:9606"
FT                   /organism="Homo sapiens"
FT                   /clone_lib="CML cosmid"
FT                   /clone="15.1"
FT   mRNA            join(408..504,774..856,951..1095,1557..1612,1787..>1912)
FT                   /gene="fau 1"
FT   exon            408..504
FT                   /number=1
FT   intron          505..773
FT                   /number=1
FT   exon            774..856

  [Part of this file has been deleted for brevity]

FT                   RAKRRMQYNRRFVNVVPTFGKKKGPNANS"
FT   intron          857..950
FT                   /number=2
FT   exon            951..1095
FT                   /number=3
FT   intron          1096..1556
FT                   /number=3
FT   exon            1557..1612
FT                   /number=4
FT   intron          1613..1786
FT                   /number=4
FT   exon            1787..>1912
FT                   /number=5
FT   polyA_signal    1938..1943
XX
SQ   Sequence 2016 BP; 421 A; 562 C; 538 G; 495 T; 0 other;
     ctaccatttt ccctctcgat tctatatgta cactcgggac aagttctcct gatcgaaaac        60
     ggcaaaacta aggccccaag taggaatgcc ttagttttcg gggttaacaa tgattaacac       120
     tgagcctcac acccacgcga tgccctcagc tcctcgctca gcgctctcac caacagccgt       180
     agcccgcagc cccgctggac accggttctc catccccgca gcgtagcccg gaacatggta       240
     gctgccatct ttacctgcta cgccagcctt ctgtgcgcgc aactgtctgg tcccgccccg       300
     tcctgcgcga gctgctgccc aggcaggttc gccggtgcga gcgtaaaggg gcggagctag       360
     gactgccttg ggcggtacaa atagcaggga accgcgcggt cgctcagcag tgacgtgaca       420
     cgcagcccac ggtctgtact gacgcgccct cgcttcttcc tctttctcga ctccatcttc       480
     gcggtagctg ggaccgccgt tcaggtaaga atggggcctt ggctggatcc gaagggcttg       540
     tagcaggttg gctgcggggt cagaaggcgc ggggggaacc gaagaacggg gcctgctccg       600
     tggccctgct ccagtcccta tccgaactcc ttgggaggca ctggccttcc gcacgtgagc       660
     cgccgcgacc accatcccgt cgcgatcgtt tctggaccgc tttccactcc caaatctcct       720
     ttatcccaga gcatttcttg gcttctctta caagccgtct tttctttact cagtcgccaa       780
     tatgcagctc tttgtccgcg cccaggagct acacaccttc gaggtgaccg gccaggaaac       840
     ggtcgcccag atcaaggtaa ggctgcttgg tgcgccctgg gttccatttt cttgtgctct       900
     tcactctcgc ggcccgaggg aacgcttacg agccttatct ttccctgtag gctcatgtag       960
     cctcactgga gggcattgcc ccggaagatc aagtcgtgct cctggcaggc gcgcccctgg      1020
     aggatgaggc cactctgggc cagtgcgggg tggaggccct gactaccctg gaagtagcag      1080
     gccgcatgct tggaggtgag tgagagagga atgttctttg aagtaccggt aagcgtctag      1140
     tgagtgtggg gtgcatagtc ctgacagctg agtgtcacac ctatggtaat agagtacttc      1200
     tcactgtctt cagttcagag tgattcttcc tgtttacatc cctcatgttg aacacagacg      1260
     tccatgggag actgagccag agtgtagttg tatttcagtc acatcacgag atcctagtct      1320
     ggttatcagc ttccacacta aaaattaggt cagaccaggc cccaaagtgc tctataaatt      1380
     agaagctgga agatcctgaa atgaaactta agatttcaag gtcaaatatc tgcaactttg      1440
     ttctcattac ctattgggcg cagcttctct ttaaaggctt gaattgagaa aagaggggtt      1500
     ctgctgggtg gcaccttctt gctcttacct gctggtgcct tcctttccca ctacaggtaa      1560
     agtccatggt tccctggccc gtgctggaaa agtgagaggt cagactccta aggtgagtga      1620
     gagtattagt ggtcatggtg ttaggacttt ttttcctttc acagctaaac caagtccctg      1680
     ggctcttact cggtttgcct tctccctccc tggagatgag cctgagggaa gggatgctag      1740
     gtgtggaaga caggaaccag ggcctgatta accttccctt ctccaggtgg ccaaacagga      1800
     gaagaagaag aagaagacag gtcgggctaa gcggcggatg cagtacaacc ggcgctttgt      1860
     caacgttgtg cccacctttg gcaagaagaa gggccccaat gccaactctt aagtcttttg      1920
     taattctggc tttctctaat aaaaaagcca cttagttcag tcatcgcatt gtttcatctt      1980
     tacttgcaag gcctcaggga gaggtgtgct tctcgg                                2016
//

   You can specify a file of
   ranges to extract by giving the '-regions' qualifier the value '@'
   followed by the name of the file containing the ranges (e.g.
   '-regions @myfile').

   The format of the range file is:
     * Comment lines start with '#' in the first column.
     * Comment lines and blank lines are ignored.
     * The line may start with white-space.
     * There are two positive (integer) numbers per line separated by
       one or more space or TAB characters.
     * The second number must be greater than or equal to the first
       number.
     * There can be optional text after the two numbers to annotate
       the line.
     * White-space before or after the text is removed.

   An example range file is:

# this is my set of ranges
12   23
 4   5       this is like 12-23, but smaller
67   10348   interesting region

Output file format

   The output is a normal sequence file.

  Output files for usage example

  File: result.seq

>HSFAU X65923.1 H.sapiens fau mRNA
ctcgactccat

  Output files for usage example 2

  File: result2.seq

>HSFAU1 X65921.1 H.sapiens fau 1 gene
tccctctcgatacactcgggacaagttagggc

   If the option '-separate' is used then each specified region is
   written to the output file as a separate sequence. The name of each
   sequence is the name of the original sequence with the start and end
   positions of the range appended, separated by underscore characters.
   For example, "XYZ region 2 to 34" is written as "XYZ_2_34".

Data files

   None.

Notes

   None.

References

   None.

Warnings

   None.

Diagnostic Error Messages

   Several warning messages about malformed region specifications:
     * Non-digit found in region ...
     * Unpaired start of a region found in ...
     * Non-digit found in region ...
     * The start of a pair of region positions must be smaller than the
       end in ...

Exit status

   It exits with status 0, unless a region is badly constructed.

Known bugs

   None noted.

Comments

See also

   Program name   Description
   biosed         Replace or delete sequence sections
   codcopy        Reads and writes a codon usage table
   cutseq         Removes a specified section from a sequence
   degapseq       Removes gap characters from sequences
   descseq        Alter the name or description of a sequence
   entret         Reads and writes (returns) flatfile entries
   extractalign   Extract regions from a sequence alignment
   extractfeat    Extract features from a sequence
   listor         Write a list file of the logical OR of two sets of
                  sequences
   makenucseq     Creates random nucleotide sequences
   makeprotseq    Creates random protein sequences
   maskfeat       Mask off features of a sequence
   maskseq        Mask off regions of a sequence
   newseq         Type in a short new sequence
   noreturn       Removes carriage return from ASCII files
   notseq         Exclude a set of sequences and write out the
                  remaining ones
   nthseq         Writes one sequence from a multiple set of sequences
   pasteseq       Insert one sequence into another
   revseq         Reverse and complement a sequence
   seqret         Reads and writes (returns) sequences
   seqretsplit    Reads and writes (returns) sequences in individual
                  files
   skipseq        Reads and writes (returns) sequences, skipping first
                  few
   splitter       Split a sequence into (overlapping) smaller sequences
   trimest        Trim poly-A tails off EST sequences
   trimseq        Trim ambiguous bits off the ends of sequences
   union          Reads sequence fragments and builds one sequence
   vectorstrip    Strips out DNA between a pair of vector sequences
   yank           Reads a sequence range, appends the full USA to a
                  list file

Author(s)

   Gary Williams (gwilliam
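
Illustrative sketch (not part of extractseq)

   The range-file rules listed earlier in this document can be
   illustrated with a small Python sketch. This is not how extractseq
   itself is implemented. The names parse_ranges, extract and
   separate_name are hypothetical, and raising ValueError on a bad line
   is an assumption made here for brevity (the program reports its own
   diagnostic messages instead). Positions are treated as 1-based and
   inclusive, which is consistent with the "XYZ_2_34" naming example.

```python
import re
from typing import List, NamedTuple


class Region(NamedTuple):
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    note: str   # optional annotation text ("" when absent)


# Optional leading white-space, two integers separated by spaces/TABs,
# then optional free text.
_LINE = re.compile(r"^\s*(\d+)[ \t]+(\d+)(.*)$")


def parse_ranges(text: str) -> List[Region]:
    """Parse range-file text into a list of regions (hypothetical helper)."""
    regions = []
    for raw in text.splitlines():
        # Comment lines have '#' in the first column; blank lines are skipped.
        if raw.startswith("#") or not raw.strip():
            continue
        match = _LINE.match(raw)
        if match is None:
            raise ValueError(f"malformed region line: {raw!r}")
        start, end = int(match.group(1)), int(match.group(2))
        if start < 1 or end < start:
            raise ValueError(f"bad region positions: {raw!r}")
        # White-space around the annotation text is removed.
        regions.append(Region(start, end, match.group(3).strip()))
    return regions


def extract(seq: str, regions: List[Region]) -> str:
    """Concatenate the listed regions of seq, in the order given."""
    return "".join(seq[r.start - 1:r.end] for r in regions)


def separate_name(name: str, region: Region) -> str:
    """Build a per-region name in the style of the '-separate' example."""
    return f"{name}_{region.start}_{region.end}"


if __name__ == "__main__":
    example = (
        "# this is my set of ranges\n"
        "12   23\n"
        " 4   5       this is like 12-23, but smaller\n"
        "67   10348   interesting region\n"
    )
    for region in parse_ranges(example):
        print(region)

    # First 50 bases of HSFAU1; the two regions are chosen for illustration.
    seq = "ctaccattttccctctcgattctatatgtacactcgggacaagttctcct"
    print(extract(seq, [Region(10, 20, ""), Region(30, 45, "")]))
    print(separate_name("XYZ", Region(2, 34, "")))
```

   With the first 50 bases of HSFAU1 and the illustrative regions 10-20
   and 30-45, extract returns "tccctctcgatacactcgggacaagtt", which is
   the same as the first 27 bases of result2.seq.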