📄 extractfeat.txt
>AP000504_66908_67344 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
cccaggcagc
>AP000504_71741_72164 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
actgtggaat
>AP000504_72744_73649 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
tttaccataa
>AP000504_73962_74192 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
ctcatggcct
>AP000504_74520_74709 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
cctacctggg
>AP000504_74856_74931 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
ctcaccaatg
>AP000504_75374_75489 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
ctcaccatct
>AP000504_76058_76160 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
ctcaccaggc
>AP000504_77125_77207 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
ctcacttcac
>AP000504_77820_78148 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
ctcaccggct
>AP000504_79023_79187 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
tgattagaat
>AP000504_79451_80175 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
agcaggtctc
>AP000504_81318_81943 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
tgcaggtctc
>AP000504_83295_85730 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
ggctccccaa
>AP000504_85819_85964 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
gagatgagga
>AP000504_86305_86403 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
cttaccttga
>AP000504_86550_86648 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
ctcactgtag
>AP000504_86730_86803 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
ccaaccttca
>AP000504_87402_87556 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
ctcacatgcg
>AP000504_87948_88090 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
catacctcct
>AP000504_91393_91628 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
ccccgggcga
>AP000504_92264_92384 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
tttagagacc
>AP000504_94413_94530 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
ttcagatgaa
>AP000504_94645_94841 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
tgcagatgtt
>AP000504_95076_95129 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
cccagagtga
>AP000504_95289_95363 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
tgtagagtga
>AP000504_96214_96449 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
ttcagtgtcg
>AP000504_97518_97647 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
cacagccatc
>AP000504_98437_98634 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
cgcagcacga
>AP000504_98843_99095 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
ttcagatcaa
>AP000504_99439_99516 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
cccagattct
>AP000504_99847_99959 [exon] Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region, section 3/20.
cacaggacag
>AB000360_808_2266 [exon] Homo sapiens PIGC gene, complete cds.
cttttagtaa
>HSHBB_19289_19632 [exon] Human beta globin region on chromosome 11.
ggagccaaca
>HSHBB_19755_19977 [exon] Human beta globin region on chromosome 11.
catagactcc
>HSHBB_20833_21080 [exon] Human beta globin region on chromosome 11.
aacagctcct
>HSHBB_34478_34622 [exon] Human beta globin region on chromosome 11.
tccacacact
>HSHBB_34745_34967 [exon] Human beta globin region on chromosome 11.
cacaggctcc
>HSHBB_35854_36069 [exon] Human beta globin region on chromosome 11.
aacagctcct
>HSHBB_39414_39558 [exon] Human beta globin region on chromosome 11.
tccacacact
>HSHBB_39681_39903 [exon] Human beta globin region on chromosome 11.
cacaggctcc
>HSHBB_40770_40985 [exon] Human beta globin region on chromosome 11.
aacagctcct
>HSHBB_45710_45800 [exon] Human beta globin region on chromosome 11.
acactgtagt
>HSHBB_45922_46145 [exon] Human beta globin region on chromosome 11.
cacagtctcc
>HSHBB_46997_47124 [exon] Human beta globin region on chromosome 11.
cccagctctt
>HSHBB_54740_54881 [exon] Human beta globin region on chromosome 11.
tgcttacact
>HSHBB_55010_55232 [exon] Human beta globin region on chromosome 11.
ctcagattac
>HSHBB_56131_56389 [exon] Human beta globin region on chromosome 11.
cgcagctctt
>HSHBB_62137_62278 [exon] Human beta globin region on chromosome 11.
tgcttacatt
>HSHBB_62187_62278 [exon] Human beta globin region on chromosome 11.
acaccatggt
>HSHBB_62390_62408 [exon] Human beta globin region on chromosome 11.
attggtctat
>HSHBB_62409_62631 [exon] Human beta globin region on chromosome 11.
cttaggctgc
>HSHBB_63482_63742 [exon] Human beta globin region on chromosome 11.
cacagctcct
>GMGL01_363_460 [exon] Glycine max leghemoglobin gene or pseudogene (no mRNA detected).
gaaatatggg
>GMGL01_555_663 [exon] Glycine max leghemoglobin gene or pseudogene (no mRNA detected).
aataggatat
>GMGL01_2182_2286 [exon] Glycine max leghemoglobin gene or pseudogene (no mRNA detected).
tgtaggtgcg
>GMGL01_3065_3208 [exon] Glycine max leghemoglobin gene or pseudogene (no mRNA detected).
cgtaggtggt

Example 4

To extract the CDS region with the exons joined into one sequence:

% extractfeat tembl:hsfau1 -type CDS -join stdout
Extract features from a sequence
>HSFAU1_782_1912 [CDS] H.sapiens fau 1 gene
atgcagctctttgtccgcgcccaggagctacacaccttcgaggtgaccggccaggaaacg
gtcgcccagatcaaggctcatgtagcctcactggagggcattgccccggaagatcaagtc
gtgctcctggcaggcgcgcccctggaggatgaggccactctgggccagtgcggggtggag
gccctgactaccctggaagtagcaggccgcatgcttggaggtaaagtccatggttccctg
gcccgtgctggaaaagtgagaggtcagactcctaaggtggccaaacaggagaagaagaag
aagaagacaggtcgggctaagcggcggatgcagtacaaccggcgctttgtcaacgttgtg
cccacctttggcaagaagaagggccccaatgccaactcttaa

Example 5

To write out the 7 residues around all phosphorylated residues in the tsw database:

% extractfeat tsw:* -type mod_res -value phosphorylation* -before 3 -after -4 stdout
Extract features from a sequence
>OPSD_HUMAN_343_343 [mod_res] RHODOPSIN.
TETSQVA
>PAXI_HUMAN_118_118 [mod_res] PAXILLIN.
EHVYSFP

Command line arguments

Standard (Mandatory) qualifiers:
  [-sequence] seqall
      Sequence(s) filename and optional format, or reference (input USA)
  [-outseq] seqout [.]
      Sequence filename and optional format (output USA)

Additional (Optional) qualifiers:
  -before integer [0]
      If this value is greater than 0 then that number of bases or residues
      before the feature are included in the extracted sequence. This allows
      you to get the context of the feature. If this value is negative then
      the start of the extracted sequence will be this number of
      bases/residues before the end of the feature. So a value of '10' will
      start the extraction 10 bases/residues before the start of the feature,
      and a value of '-10' will start the extraction 10 bases/residues before
      the end of the feature. The output sequence will be padded with 'N' or
      'X' characters if the sequence starts after the required start of the
      extraction. (Any integer value)
  -after integer [0]
      If this value is greater than 0 then that number of bases or residues
      after the feature are included in the extracted sequence. This allows
      you to get the context of the feature.
      If this value is negative then the end of the extracted sequence will
      be this number of bases/residues after the start of the feature. So a
      value of '10' will end the extraction 10 bases/residues after the end
      of the feature, and a value of '-10' will end the extraction 10
      bases/residues after the start of the feature. The output sequence will
      be padded with 'N' or 'X' characters if the sequence ends before the
      required end of the extraction. (Any integer value)
  -source string [*]
      By default any feature source in the feature table is shown. You can
      set this to match any feature source you wish to show. The source name
      is usually either the name of the program that detected the feature or
      the feature table (eg: EMBL) that the feature came from. The source may
      be wildcarded by using '*'. If you wish to show more than one source,
      separate their names with the character '|', eg: gene* | embl
      (Any string is accepted)
  -type string [*]
      By default every feature in the feature table is extracted. You can set
      this to be any feature type you wish to extract. See
      http://www3.ebi.ac.uk/Services/WebFeat/ for a list of the EMBL feature
      types and see Appendix A of the Swissprot user manual in
      http://www.expasy.ch/txt/userman.txt for a list of the Swissprot
      feature types. The type may be wildcarded by using '*'. If you wish to
      extract more than one type, separate their names with the character
      '|', eg: *UTR | intron (Any string is accepted)
  -sense integer [0 - any sense, 1 - forward sense, -1 - reverse sense]
      By default features of any sense in the feature table are extracted.
      You can set this to match any feature sense you wish.
      0 - any sense, 1 - forward sense, -1 - reverse sense
      (Any integer value)
  -minscore float [0.0]
      If this is greater than or equal to the maximum score, then any score
      is permitted (Any numeric value)
  -maxscore float [0.0]
      If this is less than or equal to the minimum score, then any score is
      permitted (Any numeric value)
  -tag string [*]
      Tags are the types of extra values that a feature may have. For example
      in the EMBL feature table, a 'CDS' type of feature may have the tags
      '/codon', '/codon_start', '/db_xref', '/EC_number', '/evidence',
      '/exception', '/function', '/gene', '/label', '/map', '/note',
      '/number', '/partial', '/product', '/protein_id', '/pseudo',
      '/standard_name', '/translation', '/transl_except', '/transl_table',
      or '/usedin'. Some of these tags also have values, for example '/gene'
      can have the value of the gene name. By default any feature tag in the
      feature table is extracted. You can set this to match any feature tag
      you wish to show. The tag may be wildcarded by using '*'. If you wish
      to extract more than one tag, separate their names with the character
      '|', eg: gene | label (Any string is accepted)
  -value string [*]
      Tag values are the values associated with a feature tag. Tags are the
      types of extra values that a feature may have. For example in the EMBL
      feature table, a 'CDS' type of feature may have the tags '/codon',
      '/codon_start', '/db_xref', '/EC_number', '/evidence', '/exception',
      '/function', '/gene', '/label', '/map', '/note', '/number', '/partial',
      '/product', '/protein_id', '/pseudo', '/standard_name', '/translation',
      '/transl_except', '/transl_table', or '/usedin'. Only some of these
      tags can have values, for example '/gene' can have the value of the
      gene name. By default any feature tag value in the feature table is
      shown. You can set this to match any feature tag value you wish to
      show. The tag value may be wildcarded by using '*'.
      If you wish to show more than one tag value, separate their names with
      a space or the character '|', eg: pax* | 10 (Any string is accepted)
  -join boolean [N]
      Some features, such as CDS (coding sequence) and mRNA, are composed of
      exons concatenated together. There may be other forms of 'joined'
      sequence, depending on the feature table. If this option is set TRUE,
      then any group of these features will be output as a single sequence.
      If the 'before' and 'after' qualifiers have been set, then only the
      sequence before the first feature and after the last feature is added.
  -featinname boolean [N]
      To aid you in identifying the type of feature that has been output, the
      type of feature is added to the start of the description of the output
      sequence. Sometimes the description of a sequence is lost in subsequent
      processing of the sequences file, so it is useful for the type to be a
      part of the sequence ID name. If you set this to be TRUE then the
      feature type is added to the ID name of the output sequence.
  -describe string
      To aid you in identifying some further
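
The rules for the -before and -after qualifiers described earlier can be sketched in a few lines of Python. This is an illustrative sketch, not extractfeat's own code: the function names `extraction_range` and `extract` are hypothetical, coordinates are taken to be 1-based and inclusive, and negative values are assumed to count inclusively from the opposite feature boundary. That last assumption is inferred from Example 5, where '-before 3 -after -4' on a single-residue feature yields 7 residues.

```python
def extraction_range(feat_start, feat_end, before=0, after=0):
    """Return the 1-based inclusive (start, end) of the region to extract.

    Hypothetical helper mirroring the documented -before/-after rules.
    """
    if before >= 0:
        # Positive: include this many positions before the feature start.
        start = feat_start - before
    else:
        # Negative: start this many positions before the feature end
        # (inclusive counting is an assumption, see the lead-in).
        start = feat_end + before + 1
    if after >= 0:
        # Positive: include this many positions after the feature end.
        end = feat_end + after
    else:
        # Negative: end this many positions after the feature start
        # (inclusive counting is an assumption, see the lead-in).
        end = feat_start - after - 1
    return start, end


def extract(seq, feat_start, feat_end, before=0, after=0, pad="N"):
    """Extract the region, padding where it runs off either end of seq."""
    start, end = extraction_range(feat_start, feat_end, before, after)
    left_pad = pad * max(0, 1 - start)         # positions before residue 1
    right_pad = pad * max(0, end - len(seq))   # positions past the last residue
    lo = max(start, 1)
    hi = min(end, len(seq))
    return left_pad + seq[lo - 1:hi] + right_pad


if __name__ == "__main__":
    # Toy protein whose positions 340-346 spell the Example 5 output.
    toy = "X" * 339 + "TETSQVAPA"
    print(extraction_range(343, 343, before=3, after=-4))
    print(extract(toy, 343, 343, before=3, after=-4, pad="X"))
```

Under these assumptions the single-residue feature at position 343 maps to the range 340 to 346, which matches the coordinates implied by the Example 5 output. Use pad="N" for nucleotide input and pad="X" for protein input, following the padding rule in the qualifier descriptions.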