.TH PVCOMPFA/PVCOMPSW/v3.4 1 "January, 2003"
.SH NAME
.B pv34compfa
\- scan a protein or DNA sequence library for similar
sequences using the FASTA algorithm in parallel on a network of
machines running pvm3.
.br
.B pv34compsw
\- scan a protein or DNA sequence library for similar
sequences using the Smith-Waterman algorithm in parallel on a network
of machines running pvm3.
.br
.B ps34compfa
\- evaluate sequence comparison parameters using the FASTA
algorithm and super-family-annotated libraries.
.br
.B ps34compsw
\- evaluate sequence comparison parameters using the
Smith-Waterman algorithm and super-family-annotated libraries.
.SH SYNOPSIS
.B pv34compfa
[-Q|q -B -b # -d # -E # -f # -g # -H -i -J # -n -o -p #
\& -R
.I STATFILE
\& -r "+n/-m"
\& -S -s
.I SMATRIX
\& -w # -1 ] query-library reference-library [
.I ktup
]
.br
.B pv34compfa
[\-QBbcefgHiJnopRrSsw1] \- interactive mode
.br
.B pv34compsw
[-Q|q -B -b # -e -f delval -g gapval -i
\& -n -p # -R
.I STATFILE
\& -r "+n/-m"
\& -S -s
.I SMATRIX
] query-library reference-library [
.I ktup
]
.br
.B pv34compsw
[\-QBbefgnpRrsS] \- interactive mode
.SH DESCRIPTION
.B pv34compfa
and
.B pv34compsw
compare all of the sequences in one DNA or protein sequence library
(the query library) to all of the entries in a reference sequence
library using the FASTA (pv34compfa) or Smith-Waterman (pv34compsw)
algorithms.  For example,
.B pv34compfa
can compare a library of protein sequences to all of the sequences in
the NBRF PIR protein sequence database.
.B pv34compfa
and
.B pv34compsw
are designed to run in parallel on networks of unix workstations using
the PVM parallel programming system. (For more information on PVM,
send email to "netlib@ornl.gov" with the message "send index for pvm3").
.PP
.B pv34compfa
uses the rapid sequence comparison algorithm
described in Pearson and Lipman, Proc. Natl. Acad. Sci. USA (1988) 85:2444.
The program can be invoked either with command line arguments or in
interactive mode.  The optional third argument,
.I ktup\c
\&, sets the sensitivity and speed of the search.
If
.I ktup=2,
similar regions in the two sequences being compared are found by
looking at pairs of aligned residues; if
.I ktup=1,
single aligned amino acids are examined.
.I ktup
can be set to 2 or 1 for protein sequences, or from 1 to 6 for DNA sequences.
The default if
.I ktup
is not specified is 2 for proteins and 6 for DNA.
.PP
.B pv34compfa
compares a library of query sequences (there need be only one) to a
reference sequence library.  Normally
.B pv34compfa
sorts the output by the
.I initn
score.  By using the
.I \-1
option, sequences are ranked by their
.B init1
score.  Alternatively, the
.I \-o
option causes optimized scores to be calculated for every sequence
with a score greater than a threshold and the output to be sorted by
the optimized scores.
.PP
.B pv34compsw
uses the rigorous Smith-Waterman algorithm to compare protein or
DNA sequences. The gap penalties and scoring matrices can be
modified with the
.I -f\c
\&,
.I -g\c
\&, and
.I -s
options.
.PP
.B pv34compfa
(and
.B pv34compsw\c
\&) will automatically decide whether the query sequence is DNA or
protein by reading the query sequence as protein and determining
whether the `amino-acid composition' is more than 85% A+C+G+T.
.PP
.B ps34compfa
and
.B ps34compsw
are versions of
.B pv34compfa
and
.B pv34compsw
that evaluate the quality of a search by reporting how many
high-scoring related sequences and low-scoring unrelated sequences
were found.  These programs require that both the query library and
the reference library be annotated with superfamily numbers for every
sequence in the library.
.SH OPTIONS
.LP
.B Pv34compfa
and
.B pv34compsw
now support all the options of the fasta3(_t) programs.
.TP
\-B
Report z-score, rather than bit-score, in list of best hits.
.TP
\-b #
The number of similarity scores to be shown (10 by default).
.TP
\-E #
Expectation value limit for displaying best scores.
.TP
\-d #
The number of alignments to be shown.
.TP
\-f #
(delval) penalty for the first residue in a gap.
-12 by default for proteins.
.TP
\-g #
(gapval) penalty for additional residues in a gap after the first. -2
by default for proteins.
.TP
\-H
turn on histogram display (off by default).
.TP
\-i
invert (reverse complement) DNA sequence.
.TP
\-J M:N
start at the M-th sequence in the query library and continue to the
N-th.  By default, J=1 and the search begins with the first sequence
and ends with the last, but sometimes it makes sense to start in the
middle of the query library if a run partially completed, and to
finish "early" if the analysis will be run on several parallel
clusters.
.TP
\-n
Force the program to use DNA sequence parameters.
.TP
\-p #
Number of "slave" processors to use.  With
.B pv34compfa\c
\&, this is typically one less than the number of processors available,
so that one processor can be used to collate results.  With
.B pv34compsw\c
\&, it is more efficient to use every processor as a slave and
not use this option.
.TP
\-Q \-q
Quiet option.  The programs will not prompt for input.
.TP
\-R file
(STATFILE) Causes
.B pv34compfa
and
.B pv34compsw
to write out the sequence identifier, superfamily number (if available),
and similarity scores to
.I STATFILE
for every sequence in the library.  These results are not sorted.
.TP
\-r
specify DNA match/mismatch ratio, e.g. "+3/-2".  Default is "+5/-4".
The "+" and "-" are required.
.TP
\-S
Treat lower case residues as low complexity regions.
.TP
\-s file
the filename of an alternative scoring matrix file.
.LP
.B pv34compfa
only
.TP
\-1
sort similarity scores by
.I init1
scores instead of
.I initn
scores.
.TP
\-c #
(OPTCUT) the threshold for optimization with the
.B -o
option.
.TP
\-o
(no-optimize); causes
.B pv34compfa
not to perform the default optimization on all of the sequences in the library
with
.B initn
scores greater than
.B OPTCUT\c
\&.
.TP
\-y #
Width for limited optimization (32 by default).
.SH FILES
.LP
Query library files must be in Pearson/FASTA format, e.g.
.in +0.5i
.nf
>seq-id | sfnum descriptive line
tmlyrghi... (sequence)
.fi
.in -0.5i
.PP
.B pv34compfa
and
.B pv34compsw
recognize the following library formats: 0 - Pearson/FASTA; 1 - Genbank tape;
2 - NBRF/PIR Codata; 3 - EMBL/SWISS-PROT; 5 - NBRF/PIR VMS.
.PP
.I Scoring matrices \-
These programs use a different scoring (PAM) matrix file format
from FASTA; they use the PAM matrix file that is used by BLASTP
and produced by Altschul's "pam.c" program in the BLAST package.
.SH BUGS
The program has been tested extensively only with type 0 and type 5
files.  This documentation file may not be up to date.
.SH AUTHOR
Bill Pearson
.br
wrp@virginia.EDU
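
The automatic DNA/protein decision described under DESCRIPTION (treat the query as DNA when more than 85% of its residues are A, C, G or T) can be sketched roughly as follows. This is an illustrative Python sketch, not code from the programs themselves: the function name looks_like_dna, the case folding, and the skipping of non-letter characters are all assumptions made for the example.

```python
def looks_like_dna(sequence: str, threshold: float = 0.85) -> bool:
    """Guess whether a sequence is DNA from its residue composition.

    Hypothetical helper: returns True when the fraction of A, C, G and T
    among the alphabetic residues is strictly greater than `threshold`.
    """
    # Assumption: ignore whitespace, digits and punctuation; fold to upper case.
    residues = [ch for ch in sequence.upper() if ch.isalpha()]
    if not residues:
        return False  # nothing to judge; treat an empty sequence as non-DNA
    acgt = sum(1 for ch in residues if ch in "ACGT")
    return acgt / len(residues) > threshold


if __name__ == "__main__":
    print(looks_like_dna("ACGTACGTTTGACA"))          # nucleotide-only input
    print(looks_like_dna("MKTAYIAKQRQISFVKSHFSRQ"))  # typical protein residues
```

When a sequence is ambiguous under a rule like this, the \-n option listed above forces DNA parameters regardless of composition.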