einverted

Function

   Finds DNA inverted repeats

Description

   einverted looks for inverted repeats (stem loops) in a nucleotide
   sequence. It will find inverted repeats that include a proportion of
   mismatches and gaps (bulges in the stem loop).

Algorithm

   It works by finding alignments between the sequence and its reverse
   complement that exceed a threshold score. Gaps and mismatches are
   assigned a penalty (negative) score. Matches are assigned a positive
   score. The score is calculated by summing the values of each match,
   the penalties of each mismatch and the large penalties of any gaps.
   Any region whose score exceeds the threshold is reported.

   einverted uses dynamic programming and thus is guaranteed to find the
   optimal alignment, but is slower than, for example, a self-by-self
   BLAST. It can find multiple inverted repeats in a sequence.

   einverted does not report overlapping matches. The original "inverted"
   program was written to annotate the nematode genome. Excluding
   overlapping repeats avoided problems with simple repeat sequences in
   this genome.

Usage

   Here is a sample session with einverted:

% einverted tembl:hsts1
Finds DNA inverted repeats
Gap penalty [12]:
Minimum score threshold [50]:
Match score [3]:
Mismatch score [-4]:
Sanger Centre program inverted output file [hsts1.inv]:
File for sequence of regions of inverted repeats. [hsts1.fasta]:

Command line arguments

   Standard (Mandatory) qualifiers:
  [-sequence]          seqall     Nucleotide sequence(s) filename and optional
                                  format, or reference (input USA)
   -gap                integer    [12] Gap penalty (Any integer value)
   -threshold          integer    [50] Minimum score threshold (Any integer
                                  value)
   -match              integer    [3] Match score (Any integer value)
   -mismatch           integer    [-4] Mismatch score (Any integer value)
  [-outfile]           outfile    [*.einverted] Sanger Centre program inverted
                                  output file
  [-outseq]            seqout     [.] The sequence of the inverted repeat
                                  regions without gap characters.

   Additional (Optional) qualifiers:
   -maxrepeat          integer    [2000] Maximum separation between the start
                                  of repeat and the end of the inverted
                                  repeat (the default is 2000 bases). (Any
                                  integer value)

   Advanced (Unprompted) qualifiers: (none)

   Associated qualifiers:

   "-sequence" associated qualifiers
   -sbegin1            integer    Start of each sequence to be used
   -send1              integer    End of each sequence to be used
   -sreverse1          boolean    Reverse (if DNA)
   -sask1              boolean    Ask for begin/end/reverse
   -snucleotide1       boolean    Sequence is nucleotide
   -sprotein1          boolean    Sequence is protein
   -slower1            boolean    Make lower case
   -supper1            boolean    Make upper case
   -sformat1           string     Input sequence format
   -sdbname1           string     Database name
   -sid1               string     Entryname
   -ufo1               string     UFO features
   -fformat1           string     Features format
   -fopenfile1         string     Features file name

   "-outfile" associated qualifiers
   -odirectory2        string     Output directory

   "-outseq" associated qualifiers
   -osformat3          string     Output seq format
   -osextension3       string     File name extension
   -osname3            string     Base file name
   -osdirectory3       string     Output directory
   -osdbname3          string     Database name to add
   -ossingle3          boolean    Separate file for each entry
   -oufo3              string     UFO features
   -offormat3          string     Features format
   -ofname3            string     Features file name
   -ofdirectory3       string     Output directory

   General qualifiers:
   -auto               boolean    Turn off prompts
   -stdout             boolean    Write standard output
   -filter             boolean    Read standard input, write standard output
   -options            boolean    Prompt for standard and additional values
   -debug              boolean    Write debug output to program.dbg
   -verbose            boolean    Report some/full command line options
   -help               boolean    Report command line options.
                                  More information on associated and general
                                  qualifiers can be found with -help -verbose
   -warning            boolean    Report warnings
   -error              boolean    Report errors
   -fatal              boolean    Report fatal errors
   -die                boolean    Report dying program messages

Input file format

   The input for einverted is a nucleotide sequence.

  Input files for usage example

   'tembl:hsts1' is a sequence entry in the example nucleic acid database
   'tembl'.

  Database entry: tembl:hsts1

ID   HSTS1      standard; DNA; HUM; 18596 BP.
XX
AC   D00596;
XX
SV   D00596.1
XX
DT   17-JUL-1991 (Rel. 28, Created)
DT   27-OCT-1998 (Rel. 57, Last updated, Version 2)
XX
DE   Homo sapiens gene for thymidylate synthase, exons 1, 2, 3, 4, 5, 6, 7,
DE   complete cds.
XX
KW   thymidylate syntase.
XX
OS   Homo sapiens (human)
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC   Eutheria; Primates; Catarrhini; Hominidae; Homo.
XX
RN   [1]
RP   1-18596
RX   MEDLINE; 91056070.
RA   Kaneda S., Nalbantoglu J., Takeishi K., Shimizu K., Gotoh O., Seno T.,
RA   Ayusawa D.;
RT   "Structural and Functional Analysis of the Human Thymidylate Synthase
RT   Gene";
RL   J. Biol. Chem. 265:20277-20284(1990).
XX
DR   SWISS-PROT; P04818; TYSY_HUMAN.
XX
CC   These data kindly submitted in computer readable form by:
CC   Sumiko Kaneda
CC   National Institute of Genetics
CC   1111 Yata
CC   Mishima 411
CC   Japan
CC   Phone:  +81-559-72-2732
CC   Fax:    +81-559-71-3651
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..18596
FT                   /chromosome="18"
FT                   /db_xref="taxon:9606"
FT                   /sequenced_mol="DNA"
FT                   /organism="Homo sapiens"
FT                   /clone="lambdaHTS-1 and lambdaHTS-3"
FT                   /map="18p11.32"
FT   repeat_unit     1..148
FT                   /note="Alu sequence"
FT   repeat_unit     202..477

  [Part of this file has been deleted for brevity]

     ttttgttttt agcttcagcg agaacccaga cctttcccaa agctcaggat tcttcgaaaa     15660
     gttgagaaaa ttgatgactt caaagctgaa gactttcaga ttgaagggta caatccgcat     15720
     ccaactatta aaatggaaat ggctgtttag ggtgctttca aaggagctcg aaggatattg     15780
     tcagtcttta ggggttgggc tggatgccga ggtaaaagtt ctttttgctc taaaagaaaa     15840
     aggaactagg tcaaaaatct gtccgtgacc tatcagttat taatttttaa ggatgttgcc     15900
     actggcaaat gtaactgtgc cagttctttc cataataaaa ggctttgagt taactcactg     15960
     agggtatctg acaatgctga ggttatgaac aaagtgagga gaatgaaatg tatgtgctct     16020
     tagcaaaaac atgtatgtgc atttcaatcc cacgtactta taaagaaggt tggtgaattt     16080
     cacaagctat ttttggaata tttttagaat attttaagaa tttcacaagc tattccctca     16140
     aatctgaggg agctgagtaa caccatcgat catgatgtag agtgtggtta tgaactttaa     16200
     agttatagtt gttttatatg ttgctataat aaagaagtgt tctgcattcg tccacgcttt     16260
     gttcattctg tactgccact tatctgctca gttccttcct aaaatagatt aaagaactct     16320
     ccttaagtaa acatgtgctg tattctggtt tggatgctac ttaaaagagt atattttaga     16380
     aataatagtg aatatatttt gccctatttt tctcatttta actgcatctt atcctcaaaa     16440
     tataatgacc atttaggata gagttttttt tttttttttt taaactttta taaccttaaa     16500
     gggttatttt aaaataatct atggactacc attttgccct cattagcttc agcatggtgt     16560
     gacttctcta ataatatgct tagattaagc aaggaaaaga tgcaaaacca cttcggggtt     16620
     aatcagtgaa atatttttcc cttcgttgca taccagatac ccccggtgtt gcacgactat     16680
     ttttattctg ctaatttatg acaagtgtta aacagaacaa ggaattattc caacaagtta     1674
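
Illustrative sketch of the scoring scheme

   The scoring described under Algorithm (a positive score per match, a
   penalty per mismatch, a larger penalty per gap, and a report threshold)
   can be sketched in Python as below. This is only a minimal illustration
   under stated assumptions and is not the einverted source code: the
   function name best_inverted_repeat_score is hypothetical, it returns
   just the best score rather than the aligned regions, it treats the gap
   value as a penalty to subtract, and it ignores -maxrepeat and the
   exclusion of overlapping matches. The default values are the ones shown
   in the sample session (match 3, mismatch -4, gap 12).

```python
# Watson-Crick complements for lower-case nucleotide codes.
COMPLEMENT = {"a": "t", "c": "g", "g": "c", "t": "a"}


def best_inverted_repeat_score(seq, match=3, mismatch=-4, gap=12):
    """Best local score of a stem formed by pairing position i with a
    later position j (i < j), working from the outer pair inwards.

    Hypothetical helper for illustration; not part of EMBOSS.
    """
    s = seq.lower()
    n = len(s)
    # h[i + 1][j] is the best score of an alignment ending at positions
    # i (left arm) and j (right arm). Row 0 and column n stay at zero.
    h = [[0] * (n + 2) for _ in range(n + 1)]
    best = 0
    for i in range(n):
        for j in range(n - 1, i, -1):
            pair = match if COMPLEMENT.get(s[i]) == s[j] else mismatch
            score = max(
                0,                      # start a new local alignment
                h[i][j + 1] + pair,     # pair base i with base j
                h[i][j] - gap,          # base i unpaired (bulge, left arm)
                h[i + 1][j + 1] - gap,  # base j unpaired (bulge, right arm)
            )
            h[i + 1][j] = score
            best = max(best, score)
    return best


def exceeds_threshold(seq, threshold=50, **scores):
    """True if the best stem score reaches the reporting threshold."""
    return best_inverted_repeat_score(seq, **scores) >= threshold
```

   In this sketch the six-base palindrome gaattc scores 9 (three matched
   pairs at 3 each), well below the default threshold of 50, whereas a
   stem of twenty g-c pairs scores 60 and would pass.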