                                   sirna

Function

   Finds siRNA duplexes in mRNA

Description

   RNA interference, or RNAi, is a phenomenon in which double-stranded
   RNA (dsRNA) silences the expression of genes that are highly
   homologous to either of the RNA strands in the duplex. Gene silencing
   in RNAi results from the degradation of mRNA sequences, and the
   effect has been used to determine the function of many genes in
   Drosophila, C. elegans, and many plant species.

   Knockdown by siRNA typically lasts for 7-10 days and has been shown
   to transfer to daughter cells. siRNAs are also effective at
   quantities much lower than alternative gene silencing methodologies,
   including antisense and ribozyme based strategies.

   Due to various mechanisms of antiviral response to long dsRNA, RNAi at
   first proved more difficult to establish in mammalian species. Then
   Tuschl, Elbashir, and others discovered that RNAi can be elicited very
   effectively by well-defined 21-base duplex RNAs. When these small
   interfering RNAs, or siRNAs, are added in duplex form with a
   transfection agent to mammalian cell cultures, the 21-base-pair RNA
   acts in concert with cellular components to silence the gene with
   sequence homology to one of the siRNA sequences. Strategies for the
   design of effective siRNA sequences have been documented, most
   notably by Sayda Elbashir, Thomas Tuschl, et al.

   Their studies of mammalian RNAi suggest that the most efficient
   gene-silencing effect is achieved using double-stranded siRNA having a
   19-nucleotide complementary region and a 2-nucleotide 3' overhang at
   each end. Current models of the RNAi mechanism suggest that the
   antisense siRNA strand recognizes the specific gene target.

   In gene-specific RNAi, the coding region (CDS) of the mRNA is usually
   targeted. The search for an appropriate target sequence should begin
   50-100 nucleotides downstream of the start codon.
   UTR-binding proteins and/or translation initiation complexes may
   interfere with the binding of the siRNP endonuclease complex. Tuschl,
   Elbashir et al. say that they have successfully used siRNAs targeting
   the 3' UTR.

   To avoid interference from mRNA regulatory proteins, sequences in the
   5' untranslated region or near the start codon should not be targeted.

   A set of rules for the design of siRNA has been suggested
   (http://www.mpibpc.gwdg.de/abteilungen/100/105/sirna.html) based on
   the work of Tuschl, Elbashir et al.

   They suggest searching for the 23-nt sequence motif AA(N19)TT (N, any
   nucleotide) and selecting hits with approx. 50% G/C-content (30% to
   70% has also worked for them). If no suitable sequences are found, the
   search is extended using the motif NA(N21). The sequence of the sense
   siRNA corresponds to (N19)TT or N21 (position 3 to 23 of the 23-nt
   motif), respectively. In the latter case, they convert the 3' end of
   the sense siRNA to TT.

   The rationale for this sequence conversion is to generate a symmetric
   duplex with respect to the sequence composition of the sense and
   antisense 3' overhangs. The antisense siRNA is synthesized as the
   complement to position 1 to 21 of the 23-nt motif. Because position 1
   of the 23-nt motif is not recognized sequence-specifically by the
   antisense siRNA, the 3'-most nucleotide residue of the antisense
   siRNA can be chosen deliberately. However, the penultimate nucleotide
   of the antisense siRNA (complementary to position 2 of the 23-nt
   motif) should always be complementary to the targeted sequence. To
   simplify chemical synthesis, they always use TT.

   More recently, they preferentially select siRNAs corresponding to the
   target motif NAR(N17)YNN, where R is purine (A, G) and Y is pyrimidine
   (C, U).
   The respective 21-nt sense and antisense siRNAs therefore begin with
   a purine nucleotide and can also be expressed from pol III expression
   vectors without a change in targeting site; expression of RNAs from
   pol III promoters is only efficient when the first transcribed
   nucleotide is a purine.

   They always design siRNAs with symmetric 3' TT overhangs, believing
   that symmetric 3' overhangs help to ensure that the siRNPs are formed
   with approximately equal ratios of sense and antisense target
   RNA-cleaving siRNPs. Please note that the modification of the overhang
   of the sense sequence of the siRNA duplex is not expected to affect
   targeted mRNA recognition, as the antisense siRNA strand guides target
   recognition. In summary, no matter what you do to your overhangs,
   siRNAs should still function to a reasonable extent. However, using TT
   in the 3' overhang will always help your RNA synthesis company to let
   you know when you accidentally order an siRNA sequence 3' to 5' rather
   than in the recommended format of 5' to 3'.

   sirna reports both the sense and antisense siRNAs as 5' to 3'.

   Xeragon.com also suggests that choosing a region of the mRNA with a GC
   content as close as possible to 50% is a more important consideration
   than choosing a target sequence that begins with AA. They also suggest
   that a key consideration in target selection is to avoid having more
   than three guanosines in a row, since poly G sequences can hyperstack
   and form agglomerates that potentially interfere with the siRNA
   silencing mechanism.

   siRNAs appear to effectively silence genes in more than 80% of cases.
   Current data indicate that there are regions of some mRNAs where gene
   silencing does not work. To help ensure that a given target gene is
   silenced, it is advised that at least two target sequences as far
   apart on the gene as possible be chosen.

   mRNA secondary structure does not appear to have a significant effect
   on gene silencing.
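
   The design rules above (the three 23-nt target motifs, a G/C content
   near 50%, and no more than three guanosines in a row) can be
   illustrated with a short Python sketch. It is not part of sirna; the
   names MOTIFS, gc_fraction and describe_window are hypothetical, the
   input is assumed to be a single 23-nt window, and measuring G/C over
   the whole window is an assumption made here for simplicity.

```python
import re

# The 23-nt target motifs quoted in the text, written for DNA-style input
# (T rather than U). R = purine (A/G), Y = pyrimidine (C/T).
MOTIFS = {
    "AA(N19)TT": re.compile(r"AA[ACGT]{19}TT"),
    "NA(N21)": re.compile(r"[ACGT]A[ACGT]{21}"),
    "NAR(N17)YNN": re.compile(r"[ACGT]A[AG][ACGT]{17}[CT][ACGT]{2}"),
}


def gc_fraction(seq):
    """Return the fraction of G and C bases in seq."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def describe_window(window):
    """Report which motifs a 23-nt window matches, plus simple checks."""
    window = window.upper().replace("U", "T")
    if len(window) != 23:
        raise ValueError("expected a 23-nt window")
    return {
        # Names of every motif that the whole window satisfies.
        "motifs": [n for n, pat in MOTIFS.items() if pat.fullmatch(window)],
        # Assumption: G/C fraction taken over all 23 bases.
        "gc": gc_fraction(window),
        # True when more than three guanosines occur in a row.
        "poly_g": "GGGG" in window,
    }


if __name__ == "__main__":
    print(describe_window("AAGCATGCATGCATGCATGCATT"))
```

   A window that matches AA(N19)TT necessarily matches NA(N21) as well,
   which is why the sketch returns a list of motif names rather than a
   single one.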

  Coding region specification

   It is possible (although the evidence is not clear) that regulatory
   protein binding to regions in and near the untranslated 5' region
   might interfere with the RNAi process.

   Therefore, this program avoids choosing siRNA probes from the 5' UTR
   and from the first 50 bases of the coding region. The second 50 bases
   of the coding region have a penalty associated with them to reduce
   the reporting of possible siRNA probes in this region.

   If the input sequence has a feature table specifying a coding region,
   then this will be used. Otherwise, where the start of the coding
   region is known, you can specify it with the '-sbegin' command-line
   qualifier (which is normally used in all EMBOSS programs to specify
   the start of the region of a sequence that should be analysed).

   sirna looks at the feature table of the input mRNA sequence to find
   the coding regions (CDS). It will ignore the 5' UTR and the first 50
   bases of the CDS. It will assign a penalty of 2 points to any siRNA in
   positions 51 to 100 in the CDS. If there is no CDS in the feature
   table, you can specify the CDS by using the command-line qualifier
   '-sbegin' to indicate where the CDS should start. If there is no CDS
   in the feature table and you do not use the command-line qualifier
   '-sbegin', then sirna will assume that the CDS region is not known and
   will look for siRNAs in the whole of the sequence with no penalties
   associated with the location within the sequence.

  All these confusing regions

   There are a lot of references to 23 base regions, 21 base regions, 19
   base regions, etc. in any description of siRNA.

   Perhaps an example with a sequence would be clearer?

   The 23 base region, in this case starting with an 'AA', might
   typically look like:
5' AAGUGAGAGGUCAGACUCCUATC

   The sense siRNA is made from the 19 bases of positions 3 to 21 of the
   23 base target region, so:
5'   GUGAGAGGUCAGACUCCUA

   and then typically d(TT) is added, so:
5'   GUGAGAGGUCAGACUCCUAdTdT

   The antisense siRNA sequence is made from the complement of bases 3
   to 21 of the target region, so:
5'   GUGAGAGGUCAGACUCCUA sense
3'   CACUCUCCAGUCUGAGGAU antisense 3' -> 5'

   so the antisense sequence that should be ordered with d(TT) added is:
5'   UAGGAGUCUGACCUCUCACdTdT antisense 5' -> 3'

Algorithm

for each input sequence:
    find the start position of the CDS in the feature table
    if there is no such CDS, take the -sbegin position as the CDS start
    for each 23 base window along the sequence:
        set the score for this window = 0
        if base 2 of the window is not 'a': ignore this window
        if the window is within 50 bases of the CDS start: ignore this window
        if the window is within 100 bases of the CDS start: score = -2
        measure the %GC of the 20 bases from position 2 to 21 of the window
        for the following %GC values change the score:
                %GC <= 25% (<= 5 bases): ignore this window
                %GC 30% (6 bases): score + 0
                %GC 35% (7 bases): score + 2
                %GC 40% (8 bases): score + 4
                %GC 45% (9 bases): score + 5
                %GC 50% (10 bases): score + 6
                %GC 55% (11 bases): score + 5
                %GC 60% (12 bases): score + 4
                %GC 65% (13 bases): score + 2
                %GC 70% (14 bases): score + 0
                %GC >= 75% (>= 15 bases): ignore this window
        if the window starts with 'AA': score + 3
        if the window does not start with 'AA' and it is required: ignore this window
        if the window ends with 'TT': score + 1
        if the window does not end with 'TT' and it is required: ignore this window
        if 4 G's in a row are found: ignore this window
        if any base is repeated 4 or more times in a row and -nopolybase is set: ignore this window
        if PolIII probes are required and the window is not NAR(N17)YNN: ignore this window
        if the score is > 0: store this window for output
    sort the windows found by their score
    output the 23-base windows to the sequence file
    if the 'context' qualifier is specified, output window bases 1 and 2 in brackets to the report file
    take the window bases 3 to 21, add 'dTdT', output to the report file
    take the window bases 3 to 21, reverse complement, add 'dTdT', output to the report file

Usage

   Here is a sample session with sirna

% sirna
Finds siRNA duplexes in mRNA
Input nucleotide sequence(s): tembl:hsfau
Output report [hsfau.sirna]:
output sequence(s) [hsfau.fasta]:

   Go to the input files for this example
   Go to the output files for this example

   Example 2

   Show the first two bases of the 23 base target region in brackets.
   These do not form part of the sequence to be ordered, but it is useful
   to see if the 23 base region starts with an 'AA'.

% sirna -context
Finds siRNA duplexes in mRNA
Input nucleotide sequence(s): tembl:hsfau
Output report [hsfau.sirna]:
output sequence(s) [hsfau.fasta]:

   Go to the output files for this example

Command line arguments

   Standard (Mandatory) qualifiers:
  [-sequence]          seqall     Nucleotide sequence(s) filename and optional
                                  format, or reference (input USA)
  [-outfile]           report     [*.sirna] The output is a table of the
                                  forward and reverse parts of the 21 base
                                  siRNA duplex. Both the forward and reverse
                                  sequences are written 5' to 3', ready to be
                                  ordered. The last two bases have been
                                  replaced by 'dTdT'.
                                  The starting position of the 23 base region
                                  and the %GC content are also given. If you
                                  wish to see the complete 23 base sequence,
                                  then either look at the sequence in the
                                  other output file, or use the qualifier
                                  '-context', which will display the 23 bases
                                  of the forward sequence in this report with
                                  the first two bases in brackets. These first
                                  two bases do not form part of the siRNA
                                  probe to be ordered.
  [-outseq]            seqoutall  [<sequence>.<format>] This is a file of the
                                  sequences of the 23 base regions that the
                                  siRNAs are selected from. You may use it to
                                  do searches of mRNA databases (e.g. REFSEQ)
                                  to confirm that the probes are unique to the
                                  gene you wish to use it on.

   Additional (Optional) qualifiers:
   -poliii             boolean    [N] This option allows you to select only
                                  the 21 base probes that start with a purine
                                  and so can be expressed from Pol III
                                  expression vectors. This is the NAR(N17)YNN
                                  pattern that has been suggested by Tuschl et
                                  al.
   -aa                 boolean    [N] This option allows you to select only
                                  those 23 base regions that start with AA. If
                                  this option is not selected then regions
                                  that start with AA will be favoured by
                                  giving them a higher score, but regions that
                                  do not start with AA will also be reported.
   -tt                 boolean    [N] This option allows you to select only
                                  those 23 base regions that end with TT. If
                                  this option is not selected then regions
                                  that end with TT will be favoured by giving
                                  them a higher score, but regions that do not
                                  end with TT will also be reported.
   -[no]polybase       boolean    [Y] If this option is FALSE then only those
                                  23 base regions that have no repeat of 4 or
                                  more of any bases in a row will be reported.
                                  No regions will ever be reported that have
                                  4 or more G's in a row.
   -context            boolean    [N] The output report file gives the
                                  sequences of the 21 base siRNA regions ready
                                  to be ordered. This does not give you an
                                  indication of the 2 bases before the 21
                                  bases. It is often interesting to see which
                                  of the suggested possible probe regions have
                                  an 'AA' in front of them (i.e. it is useful
                                  to see which of the 23 base regions start
                                  with an 'AA'). This option displays the
                                  whole 23 bases of the region with the first
                                  two bases in brackets, e.g. '(AA)', to give
                                  you some context for the probe region. YOU
                                  SHOULD NOT INCLUDE THE TWO BASES IN BRACKETS
                                  WHEN YOU PLACE AN ORDER FOR THE PROBES.
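
   The scoring steps listed under Algorithm, and the way the qualifiers
   above turn preferences into hard requirements, can be sketched in
   Python as follows. This is an illustration of the pseudocode only, not
   the sirna source. The function names and the offset_from_cds parameter
   are hypothetical. Three points are assumptions: "within 50/100 bases
   of the CDS start" is measured from the first base of the window, a
   window that starts in the 5' UTR is ignored, and runs of repeated
   bases are looked for across the whole 23-base window.

```python
# Score contribution for each G+C count among window positions 2 to 21.
GC_SCORES = {6: 0, 7: 2, 8: 4, 9: 5, 10: 6, 11: 5, 12: 4, 13: 2, 14: 0}
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def score_window(window, offset_from_cds=None, aa=False, tt=False,
                 polybase=True, poliii=False):
    """Return the score of a 23-base window, or None if it is ignored.

    offset_from_cds is the 0-based distance of the window start from the
    CDS start (hypothetical parameter); None means the CDS is unknown.
    """
    w = window.upper().replace("U", "T")
    score = 0
    if w[1] != "A":                      # base 2 must be 'A'
        return None
    if offset_from_cds is not None:
        if offset_from_cds < 50:         # 5' UTR or first 50 bases of CDS
            return None
        if offset_from_cds < 100:        # second 50 bases: penalty
            score -= 2
    gc = sum(base in "GC" for base in w[1:21])   # positions 2 to 21
    if gc not in GC_SCORES:              # <= 25% or >= 75% G+C
        return None
    score += GC_SCORES[gc]
    if w.startswith("AA"):
        score += 3
    elif aa:                             # -aa makes 'AA' a requirement
        return None
    if w.endswith("TT"):
        score += 1
    elif tt:                             # -tt makes 'TT' a requirement
        return None
    if "GGGG" in w:                      # never report 4 G's in a row
        return None
    if not polybase and any(b * 4 in w for b in "ACT"):
        return None
    if poliii and not (w[2] in "AG" and w[20] in "CT"):   # NAR(N17)YNN
        return None
    return score if score > 0 else None


def sirna_pair(window):
    """Return (sense, antisense) 5' to 3' for a 23-base window."""
    core = window.upper().replace("T", "U")[2:21]         # bases 3 to 21
    antisense = core.translate(RNA_COMPLEMENT)[::-1]      # reverse complement
    return core + "dTdT", antisense + "dTdT"


if __name__ == "__main__":
    example = "AAGUGAGAGGUCAGACUCCUATC"
    print(score_window(example), sirna_pair(example))
```

   For the example window used earlier in this document, sirna_pair
   returns GUGAGAGGUCAGACUCCUAdTdT and UAGGAGUCUGACCUCUCACdTdT, the same
   sense and antisense sequences derived by hand in that example.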