📄 merger.txt
RX   MEDLINE; 86016712.
RA   Hediger M.A, Johnson D.F., Nierlich D.P., Zabin I.;
RT   "DNA sequence of the lactose operon: The lacA gene and the transcriptional
RT   termination region";
RL   Proc. Natl. Acad. Sci. U.S.A. 82:6414-6418(1985).
XX
DR   REMTREMBL; CAA36161; CAA36161.
DR   SWISS-PROT; P07464; THGA_ECOLI.
XX
FH   Key             Location/Qualifiers
FH
FT   source          1..1832
FT                   /db_xref="taxon:562"
FT                   /organism="Escherichia coli"
FT   CDS             <1..18
FT                   /codon_start=1
FT                   /db_xref="REMTREMBL:CAA36161"
FT                   /transl_table=11
FT                   /product="lacY gene product"
FT                   /protein_id="CAA36161.1"
FT                   /translation="VNEVA"
FT   CDS             82..693
FT                   /db_xref="SWISS-PROT:P07464"
FT                   /transl_table=11
FT                   /product="thiogalactoside transacetylase"
FT                   /gene="lacA"
FT                   /protein_id="CAA36162.1"
FT                   /translation="MNMPMTERIRAGKLFTDMCEGLPEKRLRGKTLMYEFNHSHPSEVE
FT                   KRESLIKEMFATVGENAWVEPPVYFSYGSNIHIGRNFYANFNLTIVDDYTVTIGDNVLI
FT                   APNVTLSVTGHPVHHELRKNGEMYSFPITIGNNVWIGSHVVINPGVTIGDNSVIGAGSI
FT                   VTKDIPPNVVAAGVPCRVIREINDRDKHYYFKDYKVESSV"
XX
SQ   Sequence 1832 BP; 519 A; 510 C; 450 G; 353 T; 0 other;
     gtgaatgaag tcgcttaagc aatcaatgtc ggatgcggcg cgacgcttat ccgaccaaca        60
     tatcataacg gagtgatcgc attgaacatg ccaatgaccg aaagaataag agcaggcaag       120
     ctatttaccg atatgtgcga aggcttaccg gaaaaaagac ttcgtgggaa aacgttaatg       180
     tatgagttta atcactcgca tccatcagaa gttgaaaaaa gagaaagcct gattaaagaa       240
     atgtttgcca cggtagggga aaacgcctgg gtagaaccgc ctgtctattt ctcttacggt       300
     tccaacatcc atataggccg caatttttat gcaaatttca atttaaccat tgtcgatgac       360
     tacacggtaa caatcggtga taacgtactg attgcaccca acgttactct ttccgttacg       420
     ggacaccctg tacaccatga attgagaaaa aacggcgaga tgtactcttt tccgataacg       480
     attggcaata acgtctggat cggaagtcat gtggttatta atccaggcgt caccatcggg       540
     gataattctg ttattggcgc gggtagtatc gtcacaaaag acattccacc aaacgtcgtg       600
     gcggctggcg ttccttgtcg ggttattcgc gaaataaacg accgggataa gcactattat       660
     ttcaaagatt ataaagttga atcgtcagtt taaattataa aaattgcctg atacgctgcg       720
     cttatcaggc ctacaagttc agcgatctac attagccgca tccggcatga acaaagcgca       780
     ggaacaagcg tcgcatcatg cctctttgac ccacagctgc ggaaaacgta ctggtgcaaa       840
     acgcagggtt atgatcatca gcccaacgac gcacagcgca tgaaatgccc agtccatcag       900
     gtaattgccg ctgatactac gcagcacgcc agaaaaccac ggggcaagcc cggcgatgat       960
     aaaaccgatt ccctgcataa acgccaccag cttgccagca atagccggtt gcacagagtg      1020
     atcgagcgcc agcagcaaac agagcggaaa cgcgccgccc agacctaacc cacacaccat      1080
     cgcccacaat accggcaatt gcatcggcag ccagataaag ccgcagaacc ccaccagttg      1140
     taacaccagc gccagcatta acagtttgcg ccgatcctga tggcgagcca tagcaggcat      1200
     cagcaaagct cctgcggctt gcccaagcgt catcaatgcc agtaaggaac cgctgtactg      1260
     cgcgctggca ccaatctcaa tatagaaagc gggtaaccag gcaatcaggc tggcgtaacc      1320
     gccgttaatc agaccgaagt aaacacccag cgtccacgcg cggggagtga ataccacgcg      1380
     aaccggagtg gttgttgtct tgtgggaaga ggcgacctcg cgggcgcttt gccaccacca      1440
     ggcaaagagc gcaacaacgg caggcagcgc caccaggcga gtgtttgata ccaggtttcg      1500
     ctatgttgaa ctaaccaggg cgttatggcg gcaccaagcc caccgccgcc catcagagcc      1560
     gcggaccaca gccccatcac cagtggcgtg cgctgctgaa accgccgttt aatcaccgaa      1620
     gcatcaccgc ctgaatgatg ccgatcccca ccccaccaag cagtgcgctg ctaagcagca      1680
     gcgcactttg cgggtaaagc tcacgcatca atgcaccgac ggcaatcagc aacagactga      1740
     tggcgacact gcgacgttcg ctgacatgct gatgaagcca gcttccggcc agcgccagcc      1800
     cgcccatggt aaccaccggc agagcggtcg ac                                    1832
//

Output file format

The output sequence file contains the joined sequence, by default in FASTA format. Where there is a mismatch in the alignment, the chosen base is written to the output sequence in uppercase.

The output report is a standard EMBOSS alignment file. The alignment can be written in one of several styles by using the command-line qualifier -aformat xxx, where 'xxx' is replaced by the name of the required format. Some of the alignment formats can cope with an unlimited number of sequences, while others are only for pairs of sequences.
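
As a rough illustration of the uppercase convention described above, the following Python sketch joins two strings that are already aligned to the same length. It is not merger's implementation: the function name join_aligned is hypothetical, and the rule used at a mismatch (always take the base from the first sequence) is an assumption made only for this example, since the text here does not say how the base is chosen.

```python
def join_aligned(a: str, b: str, gap: str = "-") -> str:
    """Join two equal-length aligned sequences into one sequence.

    Matching bases are written in lowercase. At a mismatch the chosen
    base is written in uppercase (assumption: the base from `a` wins).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    out = []
    for x, y in zip(a.lower(), b.lower()):
        if x == gap and y == gap:
            continue               # nothing to emit at a double gap
        if x == gap:
            out.append(y)          # only b covers this column
        elif y == gap:
            out.append(x)          # only a covers this column
        elif x == y:
            out.append(x)          # agreement stays lowercase
        else:
            out.append(x.upper())  # mismatch: chosen base in uppercase
    return "".join(out)


# Two toy fragments overlapping in the middle three columns,
# with one mismatch (a/g) inside the overlap.
print(join_aligned("acgtac---", "---tgcgga"))
```

Scanning the joined sequence for uppercase letters is then enough to locate the positions where the two inputs disagreed.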
The available multiple alignment format names are: unknown, multiple, simple, fasta, msf, trace, srs

The available pairwise alignment format names are: pair, markx0, markx1, markx2, markx3, markx10, srspair, score

See: http://emboss.sf.net/docs/themes/AlignFormats.html for further information on alignment formats.

The output report file contains descriptions of the positions where there is a mismatch in the alignment and shows the alignment. Where there is a mismatch in the alignment, the chosen base is written in uppercase.

Output files for usage example

File: eclacy.merger

########################################
# Program: merger
# Rundate: Sat 15 Jul 2006 12:00:00
# Commandline: merger
#    -asequence tembl:eclacy
#    -bsequence tembl:eclaca
# Align_format: simple
# Report_file: eclacy.merger
########################################

#=======================================
#
# Aligned_sequences: 2
# 1: ECLACY
# 2: ECLACA
# Matrix: EDNAFULL
# Gap_penalty: 50.0
# Extend_penalty: 5.0
#
# Length: 3173
# Identity:     159/3173 ( 5.0%)
# Similarity:   159/3173 ( 5.0%)
# Gaps:        3014/3173 (95.0%)
# Score: 795.0
#
#
#=======================================

ECLACY    1 ttccagctgagcgccggtcgctaccattaccagttggtctggtgtcaaaa   50
ECLACA    1 --------------------------------------------------    0

ECLACY   51 ataataataaccgggcaggccatgtctgcccgtatttcgcgtaaggaaat  100
ECLACA    1 --------------------------------------------------    0

ECLACY  101 ccattatgtactatttaaaaaacacaaacttttggatgttcggtttattc  150
ECLACA    1 --------------------------------------------------    0

ECLACY  151 tttttcttttacttttttatcatgggagcctacttcccgtttttcccgat  200
ECLACA    1 --------------------------------------------------    0

ECLACY  201 ttggctacatgacatcaaccatatcagcaaaagtgatacgggtattattt  250
ECLACA    1 --------------------------------------------------    0

ECLACY  251 ttgccgctatttctctgttctcgctattattccaaccgctgtttggtctg  300

  [Part of this file has been deleted for brevity]

ECLACA 1310 ctggcgtaaccgccgttaatcagaccgaagtaaacacccagcgtccacgc 1359

ECLACY 1501 -------------------------------------------------- 1500
ECLACA 1360 gcggggagtgaataccacgcgaaccggagtggttgttgtcttgtgggaag 1409

ECLACY 1501 -------------------------------------------------- 1500
ECLACA 1410 aggcgacctcgcgggcgctttgccaccaccaggcaaagagcgcaacaacg 1459

ECLACY 1501 -------------------------------------------------- 1500
ECLACA 1460 gcaggcagcgccaccaggcgagtgtttgataccaggtttcgctatgttga 1509

ECLACY 1501 -------------------------------------------------- 1500
ECLACA 1510 actaaccagggcgttatggcggcaccaagcccaccgccgcccatcagagc 1559

ECLACY 1501 -------------------------------------------------- 1500
ECLACA 1560 cgcggaccacagccccatcaccagtggcgtgcgctgctgaaaccgccgtt 1609

ECLACY 1501 -------------------------------------------------- 1500
ECLACA 1610 taatcaccgaagcatcaccgcctgaatgatgccgatccccaccccaccaa 1659

ECLACY 1501 -------------------------------------------------- 1500
ECLACA 1660 gcagtgcgctgctaagcagcagcgcactttgcgggtaaagctcacgcatc 1709

ECLACY 1501 -------------------------------------------------- 1500
ECLACA 1710 aatgcaccgacggcaatcagcaacagactgatggcgacactgcgacgttc 1759

ECLACY 1501 -------------------------------------------------- 1500
ECLACA 1760 gctgacatgctgatgaagccagcttccggccagcgccagcccgcccatgg 1809

ECLACY 1501 -----------------------                            1500
ECLACA 1810 taaccaccggcagagcggtcgac                            1832

#---------------------------------------
#
# Conflicts:        ECLACY          ECLACA
#             position base   position base Using
#
#
#---------------------------------------

File: eclacy.fasta

>ECLACY V00295.1 E.
coli lacY gene (codes for lactose permease).ttccagctgagcgccggtcgctaccattaccagttggtctggtgtcaaaaataataataaccgggcaggccatgtctgcccgtatttcgcgtaaggaaatccattatgtactatttaaaaaacacaaacttttggatgttcggtttattctttttcttttacttttttatcatgggagcctacttcccgtttttcccgatttggctacatgacatcaaccatatcagcaaaagtgatacgggtattatttttgccgctatttctctgttctcgctattattccaaccgctgtttggtctgctttctgacaaactcgggctgcgcaaatacctgctgtggattattaccggcatgttagtgatgtttgcgccgttctttatttttatcttcgggccactgttacaatacaacattttagtaggatcgattgttggtggtatttatctaggcttttgttttaacgccggtgcgccagcagtagaggcatttattgagaaagtcagccgtcgcagtaatttcgaatttggtcgcgcgcggatgtttggctgtgttggctgggcgctgtgtgcctcgattgtcggcatcatgttcaccatcaataatcagtttgttttctggctgggctctggctgtgcactcatcctcgccgttttactctttttcgccaaaacggatgcgccctcttctgccacggttgccaatgcggtaggtgccaaccattcggcatttagccttaagctggcactggaactgttcagacagccaaaactgtggtttttgtcactgtatgttattggcgtttcctgcacctacgatgtttttgaccaacagtttgctaatttctttacttcgttctttgctaccggtgaacagggtacgcgggtatttggctacgtaacgacaatgggcgaattacttaacgcctcgattatgttctttgcgccactgatcattaatcgcatcggtgggaaaaacgccctgctgctggctggcactattatgtctgtacgtattattggctcatcgttcgccacctcagcgctggaagtggttattctgaaaacgctgcatatgtttgaagtaccgttcctgctggtgggctgctttaaatatattaccagccagtttgaagtgcgtttttcagcgacgatttatctggtctgtttctgcttctttaagcaactggcgatgatttttatgtctgtactggcgggcaatatgtatgaaagcatcggtttccagggcgcttatctggtgctgggtctggtggcgctgggcttcaccttaatttccgtgttcacgcttagcggccccggcccgctttccctgctgcgtcgtcaggtgaatgaagtcgcttaagcaatcaatgtcggatgcggcgcgacgcttatccgaccaacatatcataacggagtgatcgcattgaacatgccaatgaccgaaagaataagagcaggcaagctatttaccgatatgtgcgaaggcttaccggaaaaaagacttcgtgggaaaacgttaatgtatgagtttaatcactcgcatccatcagaagttgaaaaaagagaaagcctgattaaagaaatgtttgccacggtaggggaaaacgcctgggtagaaccgcctgtctatttctcttacggttccaacatccatataggccgcaatttttatgcaaatttcaatttaaccattgtcgatgactacacggtaacaatcggtgataacgtactgattgcacccaacgttactctttccgttacgggacaccctgtacaccatgaattgagaaaaaacggcgagatgtactcttttccgataacgattggcaataacgtctggatcggaagtcatgtggttattaatccaggcgtcaccatcggggataattctgttattggcgcgggtagtatcgtcacaaaagacattccaccaaacgtcgtggcggctggcgttcct
tgtcgggttattcgcgaaataaacgaccgggataagcactattatttcaaagattataaagttgaatcgtcagtttaaattataaaaattgcctgatacgctgcgcttatcaggcctacaagttcagcgatctacattagccgcatccggcatgaacaaagcgcaggaacaagcgtcgcatcatgcctctttgacccacagctgcggaaaacgtactggtgcaaaacgcagggttatgatcatcagcccaacgacgcacagcgcatgaaatgcccagtccatcaggtaattgccgctgatactacgcagcacgccagaaaaccacggggcaagcccggcgatgataaaaccgattccctgcataaacgccaccagcttgccagcaatagccggttgcacagagtgatcgagcgccagcagcaaacagagcggaaacgcgccgcccagacctaacccacacaccatcgcccacaataccggcaattgcatcggcagccagataaagccgcagaaccccaccagttgtaacaccagcgccagcattaacagtttgcgccgatcctgatggcgagccatagcaggcatcagcaaagctcctgcggcttgcccaagcgtcatcaatgccagtaaggaaccgctgtactgcgcgctggcaccaatctcaatatagaaagcgggtaaccaggcaatcaggctggcgtaaccgccgttaatcagaccgaagtaaacacccagcgtccacgcgcggggagtgaataccacgcgaaccggagtggttgttgtcttgtgggaagaggcgacctcgcgggcgctttgccaccaccaggcaaagagcgcaacaacggcaggcagcgccaccaggcgagtgtttgataccaggtttcgctatgttgaactaaccagggcgttatggcggcaccaagcccaccgccgcccatcagagccgcggaccacagccccatcaccagtggcgtgcgctgctgaaaccgccgtttaatcaccgaagcatcaccgcctgaatgatgccgatccccaccccaccaagcagtgcgctgctaagcagcagcgcactttgcgggtaaagctcacgcatcaatgcaccgacggcaatcagcaacagactgatggcgacactgcgacgttcgctgacatgctgatgaagccagcttccggccagcgccagcccgcccatggtaaccaccggcagagcggtcgac

Data files

It reads the scoring matrix for the alignment from the standard EMBOSS 'data' directory. By default it is the file 'EBLOSUM62' (for proteins) or the file 'EDNAFULL' (for nucleic sequences).

Notes

None.

References

None.

Warnings

None.

Diagnostic Error Messages

None.

Exit status

It exits with a status of 0.

Known bugs

None.

See also

Program name   Description
cons           Creates a consensus from multiple alignments
megamerger     Merge two large overlapping nucleic acid sequences

Author(s)

Gary Williams (gwilliam