wordcount

Function

Counts words of a specified size in a DNA sequence

Description

Displays all the words of the specified length with the number of times it occurs.

Usage

Here is a sample session with wordcount

% wordcount tembl:rnu68037 -wordsize=3
Counts words of a specified size in a DNA sequence
Output file [rnu68037.wordcount]:

Command line arguments

   Standard (Mandatory) qualifiers:
  [-sequence]          seqall     Nucleotide sequence(s) filename and optional
                                  format, or reference (input USA)
   -wordsize           integer    [4] Word size (Integer 2 or more)
  [-outfile]           outfile    [*.wordcount] Output file name

   Additional (Optional) qualifiers:
   -mincount           integer    [1] Minimum word count to report (Integer 1
                                  or more)

   Advanced (Unprompted) qualifiers: (none)

   Associated qualifiers:

   "-sequence" associated qualifiers
   -sbegin1            integer    Start of each sequence to be used
   -send1              integer    End of each sequence to be used
   -sreverse1          boolean    Reverse (if DNA)
   -sask1              boolean    Ask for begin/end/reverse
   -snucleotide1       boolean    Sequence is nucleotide
   -sprotein1          boolean    Sequence is protein
   -slower1            boolean    Make lower case
   -supper1            boolean    Make upper case
   -sformat1           string     Input sequence format
   -sdbname1           string     Database name
   -sid1               string     Entryname
   -ufo1               string     UFO features
   -fformat1           string     Features format
   -fopenfile1         string     Features file name

   "-outfile" associated qualifiers
   -odirectory2        string     Output directory

   General qualifiers:
   -auto               boolean    Turn off prompts
   -stdout             boolean    Write standard output
   -filter             boolean    Read standard input, write standard output
   -options            boolean    Prompt for standard and additional values
   -debug              boolean    Write debug output to program.dbg
   -verbose            boolean    Report some/full command line options
   -help               boolean    Report command line options. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   -warning            boolean    Report warnings
   -error              boolean    Report errors
   -fatal              boolean    Report fatal errors
   -die                boolean    Report dying program messages

Input file format

wordcount reads any sequence USA.

Input files for usage example

'tembl:rnu68037' is a sequence entry in the example nucleic acid database 'tembl'

Database entry: tembl:rnu68037

ID   RNU68037   standard; RNA; ROD; 1218 BP.
XX
AC   U68037;
XX
SV   U68037.1
XX
DT   23-SEP-1996 (Rel. 49, Created)
DT   04-MAR-2000 (Rel. 63, Last updated, Version 2)
XX
DE   Rattus norvegicus EP1 prostanoid receptor mRNA, complete cds.
XX
KW   .
XX
OS   Rattus norvegicus (Norway rat)
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC   Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
XX
RN   [1]
RP   1-1218
RA   Abramovitz M., Boie Y.;
RT   "Cloning of the rat EP1 prostanoid receptor";
RL   Unpublished.
XX
RN   [2]
RP   1-1218
RA   Abramovitz M., Boie Y.;
RT   ;
RL   Submitted (26-AUG-1996) to the EMBL/GenBank/DDBJ databases.
RL   Biochemistry & Molecular Biology, Merck Frosst Center for Therapeutic
RL   Research, P. O. Box 1005, Pointe Claire - Dorval, Quebec H9R 4P8, Canada
XX
DR   SWISS-PROT; P70597; PE21_RAT.
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..1218
FT                   /db_xref="taxon:10116"
FT                   /organism="Rattus norvegicus"
FT                   /strain="Sprague-Dawley"
FT   CDS             1..1218
FT                   /codon_start=1
FT                   /db_xref="SWISS-PROT:P70597"
FT                   /note="family 1 G-protein coupled receptor"
FT                   /product="EP1 prostanoid receptor"
FT                   /protein_id="AAB07735.1"
FT                   /translation="MSPYGLNLSLVDEATTCVTPRVPNTSVVLPTGGNGTSPALPIFSM
FT                   TLGAVSNVLALALLAQVAGRLRRRRSTATFLLFVASLLAIDLAGHVIPGALVLRLYTAG
FT                   RAPAGGACHFLGGCMVFFGLCPLLLGCGMAVERCVGVTQPLIHAARVSVARARLALALL
FT                   AAMALAVALLPLVHVGHYELQYPGTWCFISLGPPGGWRQALLAGLFAGLGLAALLAALV
FT                   CNTLSGLALLRARWRRRRSRRFRENAGPDDRRRWGSRGLRLASASSASSITSTTAALRS
FT                   SRGGGSARRVHAHDVEMVGQLVGIMVVSCICWSPLLVLVVLAIGGWNSNSLQRPLFLAV
FT                   RLASWNQILDPWVYILLRQAMLRQLLRLLPLRVSAKGGPTELSLTKSAWEASSLRSSRH
FT                   SGFSHL"
XX
SQ   Sequence 1218 BP; 162 A; 397 C; 387 G; 272 T; 0 other;
     atgagcccct acgggcttaa cctgagccta gtggatgagg caacaacgtg tgtaacaccc        60
     agggtcccca atacatctgt ggtgctgcca acaggcggta acggcacatc accagcgctg       120
     cctatcttct ccatgacgct gggtgctgtg tccaacgtgc tggcgctggc gctgctggcc       180
     caggttgcag gcagactgcg gcgccgccgc tcgactgcca ccttcctgtt gttcgtcgcc       240
     agcctgcttg ccatcgacct agcaggccat gtgatcccgg gcgccttggt gcttcgcctg       300
     tatactgcag gacgtgcgcc cgctggcggg gcctgtcatt tcctgggcgg ctgtatggtc       360
     ttctttggcc tgtgcccact tttgcttggc tgtggcatgg ccgtggagcg ctgcgtgggt       420
     gtcacgcagc cgctgatcca cgcggcgcgc gtgtccgtag cccgcgcacg cctggcacta       480
     gccctgctgg ccgccatggc tttggcagtg gcgctgctgc cactagtgca cgtgggtcac       540
     tacgagctac agtaccctgg cacttggtgt ttcattagcc ttgggcctcc tggaggttgg       600
     cgccaggcgt tgcttgcggg cctcttcgcc ggccttggcc tggctgcgct ccttgccgca       660
     ctagtgtgta atacgctcag cggcctggcg ctccttcgtg cccgctggag gcggcgtcgc       720
     tctcgacgtt tccgagagaa cgcaggtccc gatgatcgcc ggcgctgggg gtcccgtgga       780
     ctccgcttgg cctccgcctc gtctgcgtca tccatcactt caaccacagc tgccctccgc       840
     agctctcggg gaggcggctc cgcgcgcagg gttcacgcac acgacgtgga aatggtgggc       900
     cagctcgtgg gcatcatggt ggtgtcgtgc atctgctgga gccccctgct ggtattggtg       960
     gtgttggcca tcgggggctg gaactctaac tccctgcagc ggccgctctt tctggctgta      1020
     cgcctcgcgt cgtggaacca gatcctggac ccatgggtgt acatcctgct gcgccaggct      1080
     atgctgcgcc aacttcttcg cctcctaccc ctgagggtta gtgccaaggg tggtccaacg      1140
     gagctgagcc taaccaagag tgcctgggag gccagttcac tgcgtagctc ccggcacagt      1200
     ggcttcagcc acttgtga                                                    1218
//

Output file format

Output files for usage example

File: rnu68037.wordcount

ctg	54
gcc	53
tgg	53
ggc	51
gct	47
cgc	47
gtg	40
tgc	39
cct	38
gcg	36
cca	29
ggg	26
tcc	25
ctt	25
cag	25
ccc	24
ggt	24
ctc	23
tgt	23
ccg	22
gca	22
cgt	22
cac	22
agc	21
ttg	19
acg	19
cgg	19
tcg	18
ttc	17
cat	17
agg	17
gag	16
act	16
gtc	16
aac	15
tct	14
atc	14
gga	14
tca	13
cta	13
atg	12
acc	11
gta	11
gtt	11
aca	10
tga	10
caa	10
tac	10
gac	9
tag	9
agt	9
ttt	8
cga	7
gat	6
taa	6
aga	5
tat	5
gaa	4
aat	3
tta	3
ata	3
att	3
aag	2
aaa	1

The file simply consists of two columns, separated by spaces or TAB characters. The first column consists of all the possible words of size wordsize. The second column consists of the count of those words in the input sequence.

Data files

None.

Notes

None.

References

None.

Warnings

None.

Diagnostic Error Messages

None.

Exit status

0 if successful.

Known bugs

None.

See also

Program name   Description
banana         Bending and curvature plot in B-DNA
btwisted       Calculates the twisting in a B-DNA sequence
chaos          Create a chaos game representation plot for a sequence
compseq        Count composition of dimer/trimer/etc words in a sequence
dan            Calculates DNA RNA/DNA melting temperature
freak          Residue/base frequency table or plot
isochore       Plots isochores in large DNA sequences
sirna          Finds siRNA duplexes in mRNA

Author(s)

Ian Longden (il
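
Illustrative sketch (not part of EMBOSS)

The counting described under Description and Output file format can be approximated in a few lines of Python. This is a minimal sketch, not the wordcount source: the function name count_words is hypothetical, the parameters simply mirror the -wordsize and -mincount qualifiers, and it assumes words are counted at every overlapping position. Ties are broken alphabetically here for determinism, which does not reproduce the tie order seen in the sample output.

```python
from collections import Counter


def count_words(sequence, wordsize=4, mincount=1):
    """Count overlapping words of length `wordsize` in `sequence`.

    Hypothetical helper for illustration; mirrors -wordsize and -mincount.
    Returns (word, count) pairs, most frequent first.
    """
    if wordsize < 2:
        raise ValueError("wordsize must be 2 or more")
    if mincount < 1:
        raise ValueError("mincount must be 1 or more")

    seq = sequence.lower()
    # Slide a window of `wordsize` bases along the sequence, one base at a time.
    counts = Counter(seq[i:i + wordsize] for i in range(len(seq) - wordsize + 1))

    # Sort by descending count, then alphabetically (assumed tie-break).
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(word, n) for word, n in ordered if n >= mincount]


if __name__ == "__main__":
    # First 20 bases of the example entry, written without spaces.
    for word, n in count_words("atgagcccctacgggcttaa", wordsize=3):
        print(f"{word}\t{n}")
```

Under the overlapping-window assumption, a sequence of length N yields N - wordsize + 1 words. The counts in the sample output above sum to 1216, which equals 1218 - 3 + 1 and is consistent with that assumption.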