
📄 tune.so

📁 Source code of Berkeley DB 4.6.21. Berkeley DB is a simple database management system.
m4_comment([$Id: tune.so,v 10.10 2006/08/25 23:25:17 bostic Exp $])

m4_ref_title(Access Methods,
    Access method tuning,
    [@access method tuning, access method @tuning],
    am_misc/diskspace, am_misc/faq)

m4_p([dnl
There are a few different issues to consider when tuning the performance
of m4_db access method applications.])

m4_tagbegin

m4_tag(access method, [dnl
An application's choice of a database access method can significantly
affect performance.  Applications using fixed-length records and integer
keys are likely to get better performance from the Queue access method.
Applications using variable-length records are likely to get better
performance from the Btree access method, as it tends to be faster for
most applications than either the Hash or Recno access methods.  Because
the access method APIs are largely identical between the m4_db access
methods, it is easy for applications to benchmark the different access
methods against each other.  See m4_link(M4RELDIR/ref/am_conf/select,
[Selecting an access method]) for more information.])

m4_tag(cache size, [dnl
The m4_db database cache defaults to a fairly small size, and most
applications concerned with performance will want to set it explicitly.
Using a too-small cache will result in horrible performance.  The first
step in tuning the cache size is to use the db_stat utility (or the
statistics returned by the m4_ref(dbh_stat) function) to measure the
effectiveness of the cache.  The goal is to maximize the cache's hit
rate.  Typically, increasing the size of the cache until the hit rate
reaches 100% or levels off will yield the best performance.  However,
if your working set is sufficiently large, you will be limited by the
system's available physical memory.  Depending on the virtual memory
and file system buffering policies of your system, and the requirements
of other applications, the maximum cache size will be some amount
smaller than the size of physical memory.
If you find that m4_ref(db_stat) shows that increasing the cache size
improves your hit rate, but performance is not improving (or is getting
worse), then it's likely you've hit other system limitations.  At this
point, you should review the system's swapping/paging activity and limit
the size of the cache to the maximum size possible without triggering
paging activity.  Finally, always remember to make your measurements
under conditions as close as possible to the conditions your deployed
application will run under, and to test your final choices under
worst-case conditions.])

m4_tag(shared memory, [dnl
By default, m4_db creates its database environment shared regions in
filesystem backed memory.  Some systems do not distinguish between
regular filesystem pages and memory-mapped pages backed by the
filesystem, when selecting dirty pages to be flushed back to disk.  For
this reason, dirtying pages in the m4_db cache may cause intense
filesystem activity, typically when the filesystem sync thread or
process is run.  In some cases, this can dramatically affect application
throughput.  The workaround to this problem is to create the shared
regions in system shared memory (m4_ref(DB_SYSTEM_MEM)) or application
private memory (m4_ref(DB_PRIVATE)), or, in cases where this behavior
is configurable, to turn off the operating system's flushing of
memory-mapped pages.])

m4_tag(large key/data items, [dnl
Storing large key/data items in a database can alter the performance
characteristics of Btree, Hash and Recno databases.  The first parameter
to consider is the database page size.  When a key/data item is too
large to be placed on a database page, it is stored on "overflow" pages
that are maintained outside of the normal database structure (typically,
items that are larger than one-quarter of the page size are deemed to
be too large).
Accessing these overflow pages requires at least one
additional page reference over a normal access, so it is usually better
to increase the page size than to create a database with a large number
of overflow pages.  Use the m4_ref(db_stat) utility (or the statistics
returned by the m4_refT(dbh_stat)) to review the number of overflow
pages in the database.

m4_p([dnl
The second issue is using large key/data items instead of duplicate data
items.  While this can offer performance gains to some applications
(because it is possible to retrieve several data items in a single get
call), once the key/data items are large enough to be pushed off-page,
they will slow the application down.  Using duplicate data items is
usually the better choice in the long run.])])

m4_tagend

m4_p([dnl
A common question when tuning m4_db applications is scalability.  For
example, people will ask why, when adding additional threads or
processes to an application, the overall database throughput decreases,
even when all of the operations are read-only queries.])

m4_p([dnl
First, while read-only operations are logically concurrent, they still
have to acquire mutexes on internal m4_db data structures.  For example,
when searching a linked list and looking for a database page, the linked
list has to be locked against other threads of control attempting to add
or remove pages from the linked list.  The more threads of control you
add, the more contention there will be for those shared data structure
resources.])

m4_p([dnl
Second, once contention starts happening, applications will also start
to see threads of control convoy behind locks (especially on
architectures supporting only test-and-set spin mutexes, rather than
blocking mutexes).  On test-and-set architectures, threads of control
waiting for locks must attempt to acquire the mutex, sleep, check the
mutex again, and so on.
Each failed check of the mutex and subsequent
sleep wastes CPU and decreases the overall throughput of the system.])

m4_p([dnl
Third, every time a thread acquires a shared mutex, it has to shoot down
other references to that memory in every other CPU on the system.  Many
modern snoopy cache architectures have slow shoot down characteristics.])

m4_p([dnl
Fourth, schedulers don't care what application-specific mutexes a thread
of control might hold when de-scheduling a thread.  If a thread of
control is descheduled while holding a shared data structure mutex,
other threads of control will be blocked until the scheduler decides to
run the blocking thread of control again.  The more threads of control
that are running, the smaller their quanta of CPU time, and the more
likely they will be descheduled while holding a m4_db mutex.])

m4_p([dnl
The effect of adding new threads of control to an application, on the
application's throughput, is application and hardware specific and
almost entirely dependent on the application's data access pattern.
In general, using operating systems that support blocking mutexes will
often make a tremendous difference, and limiting threads of control to
some small multiple of the number of CPUs is usually the right choice
to make.])

m4_page_footer
