yank

Function

   Reads a sequence range, appends the full USA to a list file

Description

   yank is a simple utility to add a specified sequence name to a list
   file. In fact, it writes out not just the name of the sequence, but
   also the start and end positions of a region within that sequence
   and, if the sequence is nucleic, it can specify whether the sequence
   is the reverse complement.

   Without yank you would need to use a text editor such as pico to
   create the appropriate list files. yank makes this process easy.

List Files

   Many EMBOSS programs can read in a set of sequences (emma and union
   are two examples). There are many ways of specifying these
   sequences, including wildcarded sequence file names, wildcarded
   database entry names and list files. List files (files of file
   names) are the most flexible. yank is a utility to add sequences to
   a list file.

   Instead of containing the sequences themselves, a list file contains
   "references" to sequences - so, for example, you might include
   database entries, the names of files containing sequences, or even
   the names of other list files. For example, here's a valid list
   file, called seq.list:

   unix % more seq.list
   opsd_abyko.fasta
   sw:opsd_xenla
   sw:opsd_c*
   @another_list

   This looks a bit odd, but it's really very straightforward; the file
   contains:

   * opsd_abyko.fasta - this is the name of a sequence file. The file
     is read in from the current directory.
   * sw:opsd_xenla - this is a reference to a specific sequence in the
     SwissProt database.
   * sw:opsd_c* - this represents all the sequences in SwissProt whose
     identifiers start with "opsd_c".
   * another_list - this is the name of a second list file.

   Notice the @ in front of the last entry.
   This is the way you tell EMBOSS that this file is a list file, not a
   regular sequence file.

Usage

   Here is a sample session with yank.

   This is an example of adding an entry for the part of tembl:hsfau1
   between positions 1913 and 1915 to the existing list file
   'cds.list':

   % yank
   Reads a sequence range, appends the full USA to a list file
   Input (gapped) sequence: tembl:hsfau1
   Begin at position [start]: 1913
   End at position [end]: 1915
   Reverse strand [N]:
   List of USAs output file [hsfau1.yank]: cds.list

Command line arguments

   Standard (Mandatory) qualifiers:
  [-sequence]          sequence   (Gapped) sequence filename and optional
                                  format, or reference (input USA)
  [-outfile]           outfile    [*.yank] List of USAs output file

   Additional (Optional) qualifiers: (none)

   Advanced (Unprompted) qualifiers:
   -newfile            boolean    [N] Overwrite existing output file

   Associated qualifiers:

   "-sequence" associated qualifiers
   -sbegin1            integer    Start of the sequence to be used
   -send1              integer    End of the sequence to be used
   -sreverse1          boolean    Reverse (if DNA)
   -sask1              boolean    Ask for begin/end/reverse
   -snucleotide1       boolean    Sequence is nucleotide
   -sprotein1          boolean    Sequence is protein
   -slower1            boolean    Make lower case
   -supper1            boolean    Make upper case
   -sformat1           string     Input sequence format
   -sdbname1           string     Database name
   -sid1               string     Entryname
   -ufo1               string     UFO features
   -fformat1           string     Features format
   -fopenfile1         string     Features file name

   "-outfile" associated qualifiers
   -odirectory2        string     Output directory

   General qualifiers:
   -auto               boolean    Turn off prompts
   -stdout             boolean    Write standard output
   -filter             boolean    Read standard input, write standard output
   -options            boolean    Prompt for standard and additional values
   -debug              boolean    Write debug output to program.dbg
   -verbose            boolean    Report some/full command line options
   -help               boolean    Report command line options.
                                  More information on associated and general
                                  qualifiers can be found with -help -verbose
   -warning            boolean    Report warnings
   -error              boolean    Report errors
   -fatal              boolean    Report fatal errors
   -die                boolean    Report dying program messages

Input file format

   yank reads any valid sequence USA.

   You will be prompted for the start and end positions you wish to
   use. If the sequence is nucleic, you will be prompted whether you
   wish to use the reverse complement of the sequence.

  Input files for usage example

   'tembl:hsfau1' is a sequence entry in the example nucleic acid
   database 'tembl'.

  Database entry: tembl:hsfau1

ID   HSFAU1     standard; DNA; HUM; 2016 BP.
XX
AC   X65921; S45242;
XX
SV   X65921.1
XX
DT   13-MAY-1992 (Rel. 31, Created)
DT   21-JUL-1993 (Rel. 36, Last updated, Version 5)
XX
DE   H.sapiens fau 1 gene
XX
KW   fau 1 gene.
XX
OS   Homo sapiens (human)
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC   Eutheria; Primates; Catarrhini; Hominidae; Homo.
XX
RN   [1]
RP   1-2016
RA   Kas K.;
RT   ;
RL   Submitted (29-APR-1992) to the EMBL/GenBank/DDBJ databases.
RL   K. Kas, University of Antwerp, Dept of Biochemistry T3.22,
RL   Universiteitsplein 1, 2610 Wilrijk, BELGIUM
XX
RN   [2]
RP   1-2016
RX   MEDLINE; 92412144.
RA   Kas K., Michiels L., Merregaert J.;
RT   "Genomic structure and expression of the human fau gene: encoding the
RT   ribosomal protein S30 fused to a ubiquitin-like protein.";
RL   Biochem. Biophys. Res. Commun. 187:927-933(1992).
XX
DR   SWISS-PROT; P35544; UBIM_HUMAN.
DR   SWISS-PROT; Q05472; RS30_HUMAN.
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..2016
FT                   /db_xref="taxon:9606"
FT                   /organism="Homo sapiens"
FT                   /clone_lib="CML cosmid"
FT                   /clone="15.1"
FT   mRNA            join(408..504,774..856,951..1095,1557..1612,1787..>1912)
FT                   /gene="fau 1"
FT   exon            408..504
FT                   /number=1
FT   intron          505..773
FT                   /number=1
FT   exon            774..856

  [Part of this file has been deleted for brevity]

FT                   RAKRRMQYNRRFVNVVPTFGKKKGPNANS"
FT   intron          857..950
FT                   /number=2
FT   exon            951..1095
FT                   /number=3
FT   intron          1096..1556
FT                   /number=3
FT   exon            1557..1612
FT                   /number=4
FT   intron          1613..1786
FT                   /number=4
FT   exon            1787..>1912
FT                   /number=5
FT   polyA_signal    1938..1943
XX
SQ   Sequence 2016 BP; 421 A; 562 C; 538 G; 495 T; 0 other;
     ctaccatttt ccctctcgat tctatatgta cactcgggac aagttctcct gatcgaaaac        60
     ggcaaaacta aggccccaag taggaatgcc ttagttttcg gggttaacaa tgattaacac       120
     tgagcctcac acccacgcga tgccctcagc tcctcgctca gcgctctcac caacagccgt       180
     agcccgcagc cccgctggac accggttctc catccccgca gcgtagcccg gaacatggta       240
     gctgccatct ttacctgcta cgccagcctt ctgtgcgcgc aactgtctgg tcccgccccg       300
     tcctgcgcga gctgctgccc aggcaggttc gccggtgcga gcgtaaaggg gcggagctag       360
     gactgccttg ggcggtacaa atagcaggga accgcgcggt cgctcagcag tgacgtgaca       420
     cgcagcccac ggtctgtact gacgcgccct cgcttcttcc tctttctcga ctccatcttc       480
     gcggtagctg ggaccgccgt tcaggtaaga atggggcctt ggctggatcc gaagggcttg       540
     tagcaggttg gctgcggggt cagaaggcgc ggggggaacc gaagaacggg gcctgctccg       600
     tggccctgct ccagtcccta tccgaactcc ttgggaggca ctggccttcc gcacgtgagc       660
     cgccgcgacc accatcccgt cgcgatcgtt tctggaccgc tttccactcc caaatctcct       720
     ttatcccaga gcatttcttg gcttctctta caagccgtct tttctttact cagtcgccaa       780
     tatgcagctc tttgtccgcg cccaggagct acacaccttc gaggtgaccg gccaggaaac       840
     ggtcgcccag atcaaggtaa ggctgcttgg tgcgccctgg gttccatttt cttgtgctct       900
     tcactctcgc ggcccgaggg aacgcttacg agccttatct ttccctgtag gctcatgtag       960
     cctcactgga gggcattgcc ccggaagatc aagtcgtgct cctggcaggc gcgcccctgg      1020
     aggatgaggc cactctgggc cagtgcgggg tggaggccct gactaccctg gaagtagcag      1080
     gccgcatgct tggaggtgag tgagagagga atgttctttg aagtaccggt aagcgtctag      1140
     tgagtgtggg gtgcatagtc ctgacagctg agtgtcacac ctatggtaat agagtacttc      1200
     tcactgtctt cagttcagag tgattcttcc tgtttacatc cctcatgttg aacacagacg      1260
     tccatgggag actgagccag agtgtagttg tatttcagtc acatcacgag atcctagtct      1320
     ggttatcagc ttccacacta aaaattaggt cagaccaggc cccaaagtgc tctataaatt      1380
     agaagctgga agatcctgaa atgaaactta agatttcaag gtcaaatatc tgcaactttg      1440
     ttctcattac ctattgggcg cagcttctct ttaaaggctt gaattgagaa aagaggggtt      1500
     ctgctgggtg gcaccttctt gctcttacct gctggtgcct tcctttccca ctacaggtaa      1560
     agtccatggt tccctggccc gtgctggaaa agtgagaggt cagactccta aggtgagtga      1620
     gagtattagt ggtcatggtg ttaggacttt ttttcctttc acagctaaac caagtccctg      1680
     ggctcttact cggtttgcct tctccctccc tggagatgag cctgagggaa gggatgctag      1740
     gtgtggaaga caggaaccag ggcctgatta accttccctt ctccaggtgg ccaaacagga      1800
     gaagaagaag aagaagacag gtcgggctaa gcggcggatg cagtacaacc ggcgctttgt      1860
     caacgttgtg cccacctttg gcaagaagaa gggccccaat gccaactctt aagtcttttg      1920
     taattctggc tttctctaat aaaaaagcca cttagttcag tcatcgcatt gtttcatctt      1980
     tacttgcaag gcctcaggga gaggtgtgct tctcgg                                2016
//

Output file format

  Output files for usage example

  File: cds.list

tembl-id:HSFAU1[782:856]
tembl-id:HSFAU1[951:1095]
tembl-id:HSFAU1[1557:1612]
tembl-id:HSFAU1[1787:1912]
tembl-id:HSFAU1[1913:1915]

   The output list file can now be read in by a program such as union
   by specifying the list file as '@cds.list' when union prompts for
   input.

Data files

   None.

Notes

   None.

References

   None.

Warnings

   None.

Diagnostic Error Messages

   None.

Exit status

   It always exits with status 0.

Known bugs

   None.

See also

   Program name     Description
   biosed           Replace or delete sequence sections
   codcopy          Reads and writes a codon usage table
   cutseq           Removes a specified section from a sequence
   degapseq         Removes gap characters from sequences
   descseq          Alter the name or description of a sequence
   entret           Reads and writes (returns) flatfile entries
   extractalign     Extract regions from a sequence alignment
   extractfeat      Extract features from a sequence
   extractseq       Extract regions from a sequence
   listor           Write a list file of the logical OR of two sets of
                    sequences
   makenucseq       Creates random nucleotide sequences
   makeprotseq      Creates random protein sequences
   maskfeat         Mask off features of a sequence
   maskseq          Mask off regions of a sequence
   newseq           Type in a short new sequence
   noreturn         Removes carriage return from ASCII files
   notseq           Exclude a set of sequences and write out the
                    remaining ones
   nthseq           Writes one sequence from a multiple set of sequences
   pasteseq         Insert one sequence into another
   revseq           Reverse and complement a sequence
   seqret           Reads and writes (returns) sequences
   seqretsplit      Reads and writes (returns) sequences in individual
                    files
   skipseq          Reads and writes (returns) sequences, skipping first
                    few
   splitter         Split a sequence into (overlapping) smaller sequences
   trimest          Trim poly-A tails off EST sequences
   trimseq          Trim ambiguous bits off the ends of sequences
   union            Reads sequence fragments and builds one sequence
   vectorstrip      Strips out DNA between a pair of vector sequences

   The program extract does not make list files, but creates a sequence
   from sub-regions of a single other sequence.

Author(s)

   Peter Rice (pmr
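
Example: reading list-file entries

   The list-file entries described in this document (plain file names,
   database references, '@'-prefixed list files, and the
   'tembl-id:HSFAU1[1913:1915]' style written by yank) can be picked
   apart with a few lines of Python. This is only an illustrative
   sketch and not part of EMBOSS: the names ListEntry, parse_entry and
   parse_list_file are hypothetical, and the optional ':r' suffix for a
   reverse-strand range is an assumption that does not appear in the
   example output here.

```python
import re
from typing import List, NamedTuple, Optional


class ListEntry(NamedTuple):
    """One line of a list file (hypothetical helper type, not an EMBOSS API)."""
    reference: str          # file name, database reference, or nested list file name
    start: Optional[int]    # start of the range, if the entry has one
    end: Optional[int]      # end of the range, if the entry has one
    reverse: bool           # True if the range is flagged as reverse strand
    is_list_file: bool      # True if the entry was prefixed with '@'


# Matches e.g. "tembl-id:HSFAU1[1913:1915]".
# ASSUMPTION: a reverse-strand range is written with a trailing ":r".
_RANGE = re.compile(
    r"^(?P<ref>[^\[\]]+)\[(?P<start>\d+):(?P<end>\d+)(?P<rev>:r)?\]$",
    re.IGNORECASE,
)


def parse_entry(line: str) -> ListEntry:
    """Classify a single list-file line."""
    text = line.strip()
    if text.startswith("@"):
        # '@' marks the name of another list file
        return ListEntry(text[1:], None, None, False, True)
    match = _RANGE.match(text)
    if match:
        return ListEntry(
            match.group("ref"),
            int(match.group("start")),
            int(match.group("end")),
            match.group("rev") is not None,
            False,
        )
    # Plain file name or (possibly wildcarded) database reference
    return ListEntry(text, None, None, False, False)


def parse_list_file(text: str) -> List[ListEntry]:
    """Parse every non-blank line of a list file's contents."""
    return [parse_entry(line) for line in text.splitlines() if line.strip()]
```

   For example, parse_entry("tembl-id:HSFAU1[1913:1915]") yields the
   reference 'tembl-id:HSFAU1' with start 1913 and end 1915, while
   parse_entry("@another_list") is flagged as a nested list file.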