📄 readme.v34t0

>>July 19, 2002
Fix an old bug in the calculation of E()-values in DNA databases
longer than 2147483647 residues on machines with 32-bit longs.

>>July 28-31, 2002
(1) The various Makefiles have been "normalized".  The fast*34[_t]
    (Makefile.34m.common[_sql]), Makefile.pvm4[_sql], and
    Makefile.mpi4[_sql] make files all use a common set of filenames,
    described in Makefile.fcom.  This greatly simplifies adding
    programs, but requires that all *.o files be deleted when moving
    from fast*34* to pv34comp* to mp34comp*.

(2) showalign.c/p_showalign.c have been merged into mshowalign.c;
    showbest.c/manshowbest.c have been merged into mshowbest.c.  Some
    of the related files (showun.c, manshowun.c) have not been merged
    or tested.

(3) Code for ranking scores with valid e_values has been incorporated.

(4) Bug fixes in p2_complib.c, so that fasts34/fasts34_t/pvcompfs
    provide identical statistics.

>>July 26, 2002
Makefile.pvm4_sql and Makefile.pvm4 have been substantially simplified
by providing the worker program name from the h_init() function in the
initfa.c/initsw.c files.

>>July 24, 2002
Substantial modifications to param.h, structs.h to ensure that no
sequence specific information is kept in struct pstruct.  This
structure now holds the pam[] matrix, and other scoring parameters,
but nothing that is dependent on aa0.  The aa0 dependent stuff (nm0,
Lambda, K, etc) is now stored in struct mngmsg.
This was mostly done to support the pv34comp* programs, which have
separate mngmsg structures but the same pstructs.

The fasts34, fasts34_t, and pv34compfs/c34.workfs have all been tested
successfully.

>>July 8, 2002
Modifications to comp_lib.c, initfa.c and new scaleswt.c, tatstats.c
to support FASTS with Tatusov statistics.

last_params() has been introduced to allow aa0 dependent changes in
m_msg/pstr.

sortbest() has been moved into initfa.c/initsw.c to make it function
specific.

find_z() takes an additional parameter, escore.

The do_work() results structure, beststr, and stat_str all accommodate
escores as well as integer scores (stat_str also saves segn and segl
but doesn't need them).

In scaleswt.c, process_hist() now knows much more about Tatusov
statistics.  last_stats() provided to accommodate rank-based
statistical corrections.

scale_scores() is the last function to modify the beststr scores
(final calculation of E-value).  Some sortbest*() calls and some
bptr[i]->zscore=find_zp() loops have been moved into scale_scores().

>>July 3,5, 2002
Modifications to allow mySQL comments (--) in "library.sql 16" files.
Thus, a first line of:

	--host seqdb user password;

is read by FASTA as the login information to a mySQL server, but is
ignored by mySQL.  "DO" commands in FASTA mySQL files can also be
rendered invisible to mySQL in this way.  See "do.sql".

Modifications to mysql_lib.c to allow very long SQL statements.  The
buffer is now dynamically reallocated in 4Kb chunks.

The fasta3.1 man page has been updated and re-organized.

>>June 26, 2002
Minor modifications to nmgetaa.c (openlib()) to use the same arguments
for searching and PRSS.  PRSS needs access to all of m_msg, but
searches do not.  Other small fixes to comp_mlib.c, towards the goal
of merging comp_mlib.c and comp_lib.c.

>>June 25, 2002
Modify the statistical estimation strategy to sample all the sequences
in the database, not just the first 60,000.
The histogram is still based only on the first 60,000 scores and
lengths, though all scores and lengths are shown.  The fit to the data
may be better than the histogram indicates, but it should not be worse.

Currently, this modification is available only if the -DSAMPLE_STATS
option is defined.

>>June 23, 2002	CVS fa34t11d4
Fix a very long-standing bug in fasty/tfasty that caused 'NNN' to be
translated as 'S', rather than 'X'.  fastx/tfastx has done this
correctly for many years, but the fasty/tfasty code that I received
from Zheng Zhang was not implemented correctly (my fault, his code was
fine).

>>June 19, 2002
Added "-C #" option, where 6 <= # <= MAX_UID (20), to specify the
length of the sequence name display on the alignment labels.  Until
now, only 6 characters were ever displayed.  Now, up to MAX_UID
characters are available.

>>May 30, 2002	CVS fa34t11d3
Fixed problem with programs using the default -E cutoff when -b was
provided.  With this implementation, -E can override -b, but -b
overrides the default -E.

Fixed problem with 64-bit file offsets in param.h (change USE_FSEEK0
-> USE_FSEEKO, include -D_LARGEFILE_SOURCE and -D_LARGEFILE64_SOURCE
in Makefile.linux_sql).  Put limits on alignment display length (200
chars).  More checks for null returns from SQL queries.

>>Apr 17, 2002	CVS fa34t11d2
Fixed bug in mm_file.h/ncbl2_mlib.c that caused the SGI version to be
unable to read blast2 format files.

Changed "mp_*" tags to "pg_*" for -m 10 option.

>>Mar 30, 2002
Fix embarrassing bug in revcomp() (getseq.c) that failed to complement
the central nucleotide in a sequence with an odd number of residues.

Small changes to dropfs.c for more segments.

>>Mar 16, 2002
Added create_seq_demo.sql, nt_to_sql.pl to show how to build an SQL
protein sequence database that can be used with the mySQL versions of
the fasta34 programs.
Once the mySQL seq_demo database has been installed, it can be
searched using the command:

	fasta34 -q mgstm1.aa "seq_demo.sql 16"

mysql_lib.c has been modified to remove the restriction that mySQL
protein sequence unique identifiers be integers.  This allows the
program to be used with the PIRPSD database.  The RANLIB() function
call has been changed to include "libstr", to support SQL text keys.
Due to the size of libstr[], unique ID's must be < MAX_UID (20)
characters.

A "pirpsd.sql" file is available for searching the mySQL distribution
of the PIRPSD database.  PIRPSD is available from
ftp://nbrfa.georgetown.edu/pir_databases/psd/mysql.

>>Mar 6, 2002
Fix showbest.c showbest() to report pst.zdb_size as database size.
Fix dropnfa.c spam() to address off-by-one on end of run, and double
counting on backwards scan.  Fix dropnfa.c do_fasta() to fix another
problem introduced by -S.  Changes to comp_lib.c to ensure that both
the beginning and end of the query and library sequence have '\0'
present.  Changes to initfa.c, initsw.c to ensure that a match to a
lower-case letter with -S gets exactly the same score as a match to an
'X'.  Changes to mmgetlib.c to work with 64-bit longs in *.xin files.

>>Feb 26, 2002
Fixes to doinit.c, initfa.c, initsw.c to allow DNA matrices using the
"-s dna.mat" option.  A new matrix, "d50ry.mat" is available that
scores +5 for a match, -2 for a transition, and -5 for a
transversion. "d50ry.mat" corresponds to DNA PAM50 with transitions
twice as common as transversions.  When "-s dna.mat" is used, "-n"
MUST be used as well.

Query sequence names ("aa", "nt") should be more accurate.

>>Feb 22, 2002
Fix to getseq.c to allow "plain" sequence files.

>>Feb 12, 2002
Minor fix to res_stats.c.

>>Jan 28, 2002
Fixes to resurrect res_stats.c.
res_stats (cc -o res_stats res_stats.c scaleswn.c -lm) takes the
output from a current "-R file.res" file and calculates statistical
significance - this allows one to take exactly the same set of scores
(and lengths) and calculate statistical estimates using different
strategies.

>>Jan 24, 2002
Modifications to mmgetlib.c, ncbl2_mlib.c to more robustly read memory
mapped files (*.xin, map_db) on machines lacking "native" 64-bit
longs.  If the machine provides some definition for a 64-bit long
(e.g. "long long", "int64_t"), things should work. 64-bit offsets into
memory mapped files work properly on Alpha, SGI, i386 Linux, and
MacOSX.  The current implementation depends either on 64 bit longs
(Compaq Alpha's pre 4.0G) or the <sys/inttype.h> file.  Makefile,
Makefile.alpha, and Makefile.linux have been modified.

Modifications to nmgetlib.c, mmgetlib.c to provide GI numbers and
Accession versions for Genbank searches.  If the GI:123456 number is
available, it will be used and the description line will be formatted:

	gi|123456|gb|ACC1234.1|LOCUS description

This should help FAST_PAN runs, where the version of a sequence
changes frequently.

>>Jan 10, 2002
Modifications to p2_complib.c, p2_workcomp.c to more reliably allocate
space for library sequence descriptions on the master and workers.

>>Jan 2-3, 2002		CVS fa34t10c/fa34t10d3
Fixes to comp_lib.c to support Macintosh and Windows/Turbo-C
compilation.  New Makefile.tc.  Macintosh version supports both
"Classic" and "Carbon" environments.

"<values.h>" has been replaced with the more modern "<limits.h>".

Fixes to p2_complib.c to support n_libstr (libstr length) in GETLIB().

comp_thr.c, complib.c removed.

>>Dec 16, 2001
Complete integration of comp_mlib.c with both the unthreaded and
threaded programs.  Comp_mlib allows fasta34 and fasta34_t to compare
a database with a second database, just as pv34compfa does.
Using multiple queries with fasta34_t is not as efficient as
pv34compfa (and it cannot use networks of Unix workstations), but it
is much easier to use and install.

With the comp_mlib.c option, fasta34 cannot automatically recognize
DNA sequences, just as pv34compfa no longer recognizes DNA sequences.
You must use the "-n" option to search with DNA sequences.  The other
programs (fastx34, tfastx34, etc) "know" the type of the query and
database sequences, so "-n" is only required for fasta34(_t).

>>Dec 14, 2001		CVS tag fa34t10b
Fix problems reading DNA databases in blast2 format.

>>Dec 11, 2001
Changes to spam() in dropnfa.c so that, for DNA sequences, the
boundaries of a local alignment region are found using the same
algorithm as previous versions of fasta.  For protein sequences, the
algorithm will extend the local region beyond the "ktup" boundaries if
a better score can be found.  For DNA sequences, this raises the noise
rather than increasing sensitivity, so it is turned off and "ktup"
boundaries are respected.  The old, "ktup" boundary algorithm is
available with -DNOSPAM_EXT.

This version also includes a working res_stats.c, which can be used to
test various statistical estimates on exactly the same set of scores.

Fixed problems with -m 9 percent identity for
fastx/fasty/tfastx/tfasty.  These errors have been present since -m 9
was implemented.

>>Dec 10, 2001
Fix to map_db.c to work correctly with files > 2 Gb when 64-bit longs
are available.  It is not yet designed to work with ftello() and other
offset types.

>>Nov 11,21, 2001	CVS tag fa34t10a, fa34t10d1
Substantial changes to revcomp(), getseq(), and other functions to
correct problems with -S on DNA sequences.  Sequences with lower case
nucleotides were not recognized or reverse complemented properly.

Fix to dropnfa.c (v34t07, Nov 21, 2001) bg_align() to re-initialize
static globals - this fixes a problem encountered with pv34compfa.
A new main program, comp_mlib.c has been added to the CVS archive,
although it is not referenced in any of the Makefiles.  comp_mlib.c
works like p2_complib.c and compares a library against another
library.

>>Nov 4, 2001
Change to dropnfa.c spam () while(1) -> while(lpos <= dmax->stop).
This fixes a problem with ktup=1 on Suns only, so far.

>>Oct 4, 2001		CVS tag fa34t10
Add comp_lib.c file, which merges complib.c (unthreaded) and
comp_thr.c (threaded) code into one file.

Modifications to nmgetlib.c, mmgetaa.c to allow Genbank flatfile
format without DESCRIPTION or ACCESSION lines.

Additional fix for -S with ktup=1.

>>Sept. 24, 2001
Fix to have correct gap-penalties for short scoring matrices with
tfastx/fastx.

>>Sept. 10, 2001	CVS tag fa34t05d6
Fix a bug introduced by -S fix in fa34t05d5.  Also, try to remove
changes in p34compfa compared to pv4compfa output.

>>Sept. 6, 2001		CVS tag fa34t05d5
Fix the -S dropnfa/fx/fz2 bug that was not actually fixed in
fa34t05d4.  Incorporate the correct scaleswn.c referred to in
fa34t05d4.

>>Sept. 5, 2001		CVS tag fa34t05d4
Fix problem with m_msg.quiet that prevented interactive prompts for
ktup, file name, etc with threaded programs.
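
Illustration for the revcomp() entries above (Mar 30, 2002 and Nov
11,21, 2001).  The sketch below is NOT the code in getseq.c; it is a
minimal, hypothetical example of the class of bug those entries
describe.  It assumes plain ASCII nucleotide strings (FASTA itself
works on encoded residue arrays), and the names complement_base() and
revcomp_inplace() are invented for this example.  A swap-from-both-ends
loop stops when the two indices meet, so in a sequence with an odd
number of residues the central nucleotide is never visited by the loop
and must be complemented separately.  Lower-case letters (as used with
-S) are complemented to lower-case letters.

```c
#include <stddef.h>
#include <string.h>

/* Complement one nucleotide, preserving case.  Any other code
   (for example 'N' or 'X') is returned unchanged.  Hypothetical helper. */
static char complement_base(char c)
{
    switch (c) {
    case 'A': return 'T';
    case 'a': return 't';
    case 'C': return 'G';
    case 'c': return 'g';
    case 'G': return 'C';
    case 'g': return 'c';
    case 'T': return 'A';
    case 't': return 'a';
    default:  return c;
    }
}

/* Reverse-complement a NUL-terminated ASCII sequence in place. */
void revcomp_inplace(char *seq)
{
    size_t n = strlen(seq);
    size_t i, j;

    if (n == 0)
        return;

    /* Swap and complement residues pairwise from both ends. */
    for (i = 0, j = n - 1; i < j; i++, j--) {
        char left = complement_base(seq[i]);
        seq[i] = complement_base(seq[j]);
        seq[j] = left;
    }

    /* With an odd number of residues the indices meet on the central
       nucleotide, which the loop above never touches. */
    if (i == j)
        seq[i] = complement_base(seq[i]);
}
```

Called on a writable buffer holding "ACGTA", revcomp_inplace() leaves
"TACGT"; dropping the final if-statement would leave the central 'G'
uncomplemented.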