                                extractseq

Function

   Extract regions from a sequence

Description

   extractseq allows you to specify one or more regions of a sequence to
   extract sub-sequences from to build up a contiguous resulting
   sequence.

   This is modelled on the cell's process of splicing out exons from
   mRNA, but the program is generally applicable to any cutting and
   splicing or editing operation on a single sequence.

   extractseq reads in a sequence and a set of regions of that sequence
   as specified by pairs of start and end positions (either on the
   command-line or contained in a file) and writes out the specified
   regions of the input sequence in the order in which they have been
   specified. Thus, if the sequence "AAAGGGTTT" has been input and the
   regions: "7-9, 3-4" have been specified, then the output sequence will
   be: "TTTAG".

Usage

   Here is a sample session with extractseq

   Extract the region from position 10 to 20:

% extractseq tembl:hsfau result.seq -regions '10-20'
Extract regions from a sequence

   Go to the input files for this example
   Go to the output files for this example

   Example 2

   Extract the regions 10 to 20, 30 to 45, 533 to 537:

% extractseq tembl:hsfau1 result2.seq -regions '10-20 30-45 533-537'
Extract regions from a sequence

   Go to the input files for this example
   Go to the output files for this example

   Example 3

   Extract the regions 782-856, 951-1095, 1557-1612 and 1787-1912:

% extractseq tembl:hsfau1 -reg "782..856,951..1095,1557..1612,1787..1912" stdout
Extract regions from a sequence
>HSFAU1 X65921.1 H.sapiens fau 1 gene
atgcagctctttgtccgcgcccaggagctacacaccttcgaggtgaccggccaggaaacg
gtcgcccagatcaaggctcatgtagcctcactggagggcattgccccggaagatcaagtc
gtgctcctggcaggcgcgcccctggaggatgaggccactctgggccagtgcggggtggag
gccctgactaccctggaagtagcaggccgcatgcttggaggtaaagtccatggttccctg
gcccgtgctggaaaagtgagaggtcagactcctaaggtggccaaacaggagaagaagaag
aagaagacaggtcgggctaagcggcggatgcagtacaaccggcgctttgtcaacgttgtg
cccacctttggcaagaagaagggccccaatgccaactcttaa

   Example 4

   Extract the regions 782-856, 951-1095, 1557-1612 and 1787-1912 all to
   separate output sequences:

% extractseq tembl:hsfau1 -reg "782..856,951..1095,1557..1612,1787..1912" stdout -separate
Extract regions from a sequence
>HSFAU1_782_856 H.sapiens fau 1 gene
atgcagctctttgtccgcgcccaggagctacacaccttcgaggtgaccggccaggaaacg
gtcgcccagatcaag
>HSFAU1_951_1095 H.sapiens fau 1 gene
gctcatgtagcctcactggagggcattgccccggaagatcaagtcgtgctcctggcaggc
gcgcccctggaggatgaggccactctgggccagtgcggggtggaggccctgactaccctg
gaagtagcaggccgcatgcttggag
>HSFAU1_1557_1612 H.sapiens fau 1 gene
gtaaagtccatggttccctggcccgtgctggaaaagtgagaggtcagactcctaag
>HSFAU1_1787_1912 H.sapiens fau 1 gene
gtggccaaacaggagaagaagaagaagaagacaggtcgggctaagcggcggatgcagtac
aaccggcgctttgtcaacgttgtgcccacctttggcaagaagaagggccccaatgccaac
tcttaa

Command line arguments

   Standard (Mandatory) qualifiers:
  [-sequence]          sequence   Sequence filename and optional format, or
                                  reference (input USA)
   -regions            range      [Whole sequence] Regions to extract.
                                  A set of regions is specified by a set of
                                  pairs of positions.
                                  The positions are integers.
                                  They are separated by any non-digit,
                                  non-alpha character.
                                  Examples of region specifications are:
                                  24-45, 56-78
                                  1:45, 67=99;765..888
                                  1,5,8,10,23,45,57,99
  [-outseq]            seqoutall  [.] Sequence set(s)
                                  filename and optional format (output USA)

   Additional (Optional) qualifiers:
   -separate           boolean    [N] If this is set true then each specified
                                  region is written out as a separate
                                  sequence. The name of the sequence is
                                  created from the name of the original
                                  sequence with the start and end positions of
                                  the range appended with underscore
                                  characters between them, eg: XYZ region 2 to
                                  34 is written as: XYZ_2_34

   Advanced (Unprompted) qualifiers: (none)

   Associated qualifiers:

   "-sequence" associated qualifiers
   -sbegin1            integer    Start of the sequence to be used
   -send1              integer    End of the sequence to be used
   -sreverse1          boolean    Reverse (if DNA)
   -sask1              boolean    Ask for begin/end/reverse
   -snucleotide1       boolean    Sequence is nucleotide
   -sprotein1          boolean    Sequence is protein
   -slower1            boolean    Make lower case
   -supper1            boolean    Make upper case
   -sformat1           string     Input sequence format
   -sdbname1           string     Database name
   -sid1               string     Entryname
   -ufo1               string     UFO features
   -fformat1           string     Features format
   -fopenfile1         string     Features file name

   "-outseq" associated qualifiers
   -osformat2          string     Output seq format
   -osextension2       string     File name extension
   -osname2            string     Base file name
   -osdirectory2       string     Output directory
   -osdbname2          string     Database name to add
   -ossingle2          boolean    Separate file for each entry
   -oufo2              string     UFO features
   -offormat2          string     Features format
   -ofname2            string     Features file name
   -ofdirectory2       string     Output directory

   General qualifiers:
   -auto               boolean    Turn off prompts
   -stdout             boolean    Write standard output
   -filter             boolean    Read standard input, write standard output
   -options            boolean    Prompt for standard and additional values
   -debug              boolean    Write debug output to program.dbg
   -verbose            boolean    Report some/full command line options
   -help               boolean    Report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   -warning            boolean    Report warnings
   -error              boolean    Report errors
   -fatal              boolean    Report fatal errors
   -die                boolean    Report dying program messages

Input file format

   extractseq reads a normal sequence USA.

  Input files for usage example

   'tembl:hsfau' is a sequence entry in the example nucleic acid database
   'tembl'

  Database entry: tembl:hsfau

ID   HSFAU      standard; RNA; HUM; 518 BP.
XX
AC   X65923;
XX
SV   X65923.1
XX
DT   13-MAY-1992 (Rel. 31, Created)
DT   23-SEP-1993 (Rel. 37, Last updated, Version 10)
XX
DE   H.sapiens fau mRNA
XX
KW   fau gene.
XX
OS   Homo sapiens (human)
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC   Eutheria; Primates; Catarrhini; Hominidae; Homo.
XX
RN   [1]
RP   1-518
RA   Michiels L.M.R.;
RT   ;
RL   Submitted (29-APR-1992) to the EMBL/GenBank/DDBJ databases.
RL   L.M.R. Michiels, University of Antwerp, Dept of Biochemistry,
RL   Universiteisplein 1, 2610 Wilrijk, BELGIUM
XX
RN   [2]
RP   1-518
RX   MEDLINE; 93368957.
RA   Michiels L., Van der Rauwelaert E., Van Hasselt F., Kas K., Merregaert J.;
RT   " fau cDNA encodes a ubiquitin-like-S30 fusion protein and is expressed as
RT   an antisense sequences in the Finkel-Biskis-Reilly murine sarcoma virus";
RL   Oncogene 8:2537-2546(1993).
XX
DR   SWISS-PROT; P35544; UBIM_HUMAN.
DR   SWISS-PROT; Q05472; RS30_HUMAN.
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..518
FT                   /chromosome="11q"
FT                   /db_xref="taxon:9606"
FT                   /organism="Homo sapiens"
FT                   /tissue_type="placenta"
FT                   /clone_lib="cDNA"
FT                   /clone="pUIA 631"
FT                   /map="13"
FT   misc_feature    57..278
FT                   /note="ubiquitin like part"
FT   CDS             57..458
FT                   /db_xref="SWISS-PROT:P35544"
FT                   /db_xref="SWISS-PROT:Q05472"
FT                   /gene="fau"
FT                   /protein_id="CAA46716.1"
FT                   /translation="MQLFVRAQELHTFEVTGQETVAQIKAHVASLEGIAPEDQVVLLAG
FT                   APLEDEATLGQCGVEALTTLEVAGRMLGGKVHGSLARAGKVRGQTPKVAKQEKKKKKTG
FT                   RAKRRMQYNRRFVNVVPTFGKKKGPNANS"
FT   misc_feature    98..102
FT                   /note="nucleolar localization signal"
FT   misc_feature    279..458
FT                   /note="S30 part"
FT   polyA_signal    484..489
FT   polyA_site      509
XX
SQ   Sequence 518 BP; 125 A; 139 C; 148 G; 106 T; 0 other;
     ttcctctttc tcgactccat cttcgcggta gctgggaccg ccgttcagtc gccaatatgc        60
     agctctttgt ccgcgcccag gagctacaca ccttcgaggt gaccggccag gaaacggtcg       120
     cccagatcaa ggctcatgta gcctcactgg agggcattgc cccggaagat caagtcgtgc       180
     tcctggcagg cgcgcccctg gaggatgagg ccactctggg ccagtgcggg gtggaggccc       240
     tgactaccct ggaagtagca ggccgcatgc ttggaggtaa agttcatggt tccctggccc       300
     gtgctggaaa agtgagaggt cagactccta aggtggccaa acaggagaag aagaagaaga       360
     agacaggtcg ggctaagcgg cggatgcagt acaaccggcg ctttgtcaac gttgtgccca       420
     cctttggcaa gaagaagggc cccaatgcca actcttaagt cttttgtaat tctggctttc       480
     tctaataaaa aagccactta gttcagtcaa aaaaaaaa                               518
//
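
Illustrative sketch

   The region handling described under Description and under the
   -regions and -separate qualifiers can be illustrated with a short
   Python sketch. This is not extractseq's source code. The function
   names parse_regions, extract_regions and extract_separate are
   hypothetical, and the error handling (rejecting letters, odd numbers
   of positions and out-of-range regions) is an assumption made for the
   sketch; the real program may treat those cases differently.

```python
import re
from typing import List, Tuple

Region = Tuple[int, int]


def parse_regions(spec: str) -> List[Region]:
    """Parse a region specification into (start, end) pairs.

    Positions are integers separated by any non-digit, non-alpha
    characters, e.g. "24-45, 56-78" or "1:45, 67=99;765..888".
    """
    # Assumption: letters are not valid separators, so reject them.
    if re.search(r"[A-Za-z]", spec):
        raise ValueError("letters are not allowed in a region specification")
    positions = [int(p) for p in re.findall(r"\d+", spec)]
    # Assumption: an unpaired trailing position is treated as an error.
    if len(positions) % 2 != 0:
        raise ValueError("positions must come in start/end pairs")
    return list(zip(positions[0::2], positions[1::2]))


def _slice(sequence: str, region: Region) -> str:
    """Return one region; positions are 1-based and inclusive."""
    start, end = region
    # Assumption: regions outside the sequence are rejected.
    if not 1 <= start <= end <= len(sequence):
        raise ValueError(f"region {start}-{end} is outside the sequence")
    return sequence[start - 1:end]


def extract_regions(sequence: str, regions: List[Region]) -> str:
    """Join the regions, in the order given, into one sequence."""
    return "".join(_slice(sequence, r) for r in regions)


def extract_separate(name: str, sequence: str,
                     regions: List[Region]) -> List[Tuple[str, str]]:
    """Return one (name, subsequence) pair per region.

    Each name is built as NAME_START_END, as described for -separate.
    """
    return [(f"{name}_{s}_{e}", _slice(sequence, (s, e)))
            for s, e in regions]


if __name__ == "__main__":
    regions = parse_regions("7-9, 3-4")
    print(extract_regions("AAAGGGTTT", regions))  # TTTAG
```

   Run as a script, the sketch prints "TTTAG" for the sequence
   "AAAGGGTTT" and the regions "7-9, 3-4", matching the example in the
   Description. Regions are joined in the order specified rather than
   sorted by position.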