/* dcacheCbio.c - Disk Cache Driver */

/* Copyright 1999 Wind River Systems, Inc. */

/*
modification history
--------------------
01o,31aug99,jkf  changes for new CBIO API.
01n,31jul99,jkf  T2 merge, tidiness & spelling.
01i,17nov98,lrn  zero-fill blocks allocated with CBIO_CACHE_NEWBLK
01h,29oct98,lrn  pass along 3rd arg on CBIO_RESET, removed old mod history
01g,20oct98,lrn  fixed SPR#22553, SPR#22731
01f,14sep98,lrn  refined error handling, made updTask unconditional
01e,08sep98,lrn  add hash for speed (SPR#21972), size change (SPR#21975)
01d,06sep98,lrn  modify to work on top of CBIO (SPR#21974), and use
                 wrapper for block devices b.c.
01c,30jul98,wlf  partial doc cleanup
01b,01jul98,lrn  written.
01a,28jan98,lrn  written, preliminary
*/

/*
DESCRIPTION
This module implements a disk cache mechanism via the CBIO API.
This is intended for use by the VxWorks DOS file system, to store
frequently used disk blocks in memory.  The disk cache is unaware of
the particular file system format on the disk, and handles the
disk as a collection of blocks of a fixed size, typically the sector
size of 512 bytes.

The disk cache may be used with SCSI, IDE, ATA, Floppy or any other
type of disk controller.  The underlying device driver may either
comply with the CBIO API or with the older block device API.

This library interfaces to device drivers implementing the block
device API via the basic CBIO BLK_DEV wrapper provided by cbioLib.

Because the disk cache complies with the CBIO programming interface on
both its upper and lower layers, it is both an optional and a stackable
module.  It can be used or omitted depending on resources available and
performance required.

The disk cache module implements the CBIO API, which is used by the file
system module to access the disk blocks, or to access bytes within a
particular disk block.
This allows the file system to use the disk cache
to store file data as well as Directory and File Allocation Table blocks,
on a Most Recently Used basis, thus keeping a controllable subset of these
disk structures in memory.  This results in minimized memory requirements
for the file system, while avoiding any significant performance degradation.

The size of the disk cache, and thus the memory consumption of the disk
subsystem, is configured at the time of initialization (see
dcacheDevCreate()), allowing the user to trade off memory consumption
versus performance.  Additional performance tuning capabilities are
available through dcacheDevTune().

Briefly, here are the main techniques deployed by the disk cache:
.IP
Least Recently Used block re-use policy
.IP
Read-ahead
.IP
Write-behind with sorting and grouping
.IP
Hidden writes
.IP
Disk cache bypass for large requests
.IP
Background disk updating (flushing changes to disk) with an adjustable
update period
.LP
Some of these techniques are discussed in more detail below; others
are described in various professional and academic publications.

DISK CACHE ALGORITHM
The disk cache is composed internally of a number of cache blocks, of
the same size as the disk physical block (sector).  These cache blocks
are maintained in a list in "Most Recently Used" order, that is, blocks
which are used are moved to the top of this list.  When a block needs to
be relinquished, and made available to contain a new disk block, the
Least Recently Used block will be used for this purpose.

In addition to the regular cache blocks, some of the memory allocated
for cache is set aside for a "big buffer", which may range from 1/4 of
the overall cache size up to 64KB.
This buffer is used for:
.IP
Combining cache blocks with adjacent disk block numbers, in order to
write them to disk in groups, and save on latency and overhead
.IP
Reading ahead a group of blocks, and then converting them to normal
cache blocks.
.LP
Because there is significant overhead involved in accessing the disk
drive, read-ahead improves performance significantly by reading groups
of blocks at once.

TUNABLE PARAMETERS
There are certain operational parameters that control the disk cache
operation which are tunable.  A number of
.I preset
parameter sets is provided, dependent on the size of the cache.  These
should suffice for most purposes, but under certain types of workload,
it may be desirable to tune these parameters to better suit the
particular workload patterns.

See dcacheDevTune() for a description of the tunable parameters.  It is
recommended to call dcacheShow() after calling dcacheDevTune() in order
to verify that the parameters were set as requested, and to inspect
the cache statistics, which may change dramatically.  Note that the hit
ratio is a principal indicator of cache efficiency, and should be inspected
during such tuning.

BACKGROUND UPDATING
A dedicated task will be created to take care of updating the disk with
blocks that have been modified in cache.  The time period between updates
is controlled with the tunable parameter syncInterval.  The task's priority
should be set above the priority of any CPU-bound tasks so as to assure
it can wake up frequently enough to keep the disk synchronized with the
cache.  There is only one such task for all cache devices configured.
The task name is tUpdTask.

The updating task also has the responsibility to invalidate disk cache
blocks for removable devices which have not been used for 2 seconds or more.

There are a few global variables which control the parameters of this
task, namely:
.IP <dcacheUpdTaskPriority>
controls the default priority of the update task, and is set by default
to 250.
.IP <dcacheUpdTaskStack>
is used to set the update task stack size.
.IP <dcacheUpdTaskOptions>
controls the task options for the update task.
.LP
All the above global parameters must be set prior to calling
dcacheDevCreate() for the first time, with the exception of
dcacheUpdTaskPriority, which may be modified at run-time, and takes
effect almost immediately.  It should be noted that this priority is not
entirely fixed: at times when critical disk operations are performed,
and the FIOFLUSH ioctl is called, the caller task will temporarily
.I loan
its priority to the update task, to ensure the completion of the flushing
operation.

REMOVABLE DEVICES
For removable devices, disk cache provides these additional features:
.IP "disk updating"
is performed such that modified blocks will be written to disk within
one second, so as to minimize the risk of losing data in case of a
failure or disk removal.
.IP "error handling"
includes a test for disk removal, so that if a disk is removed from the
drive while an I/O operation is in progress, the disk removal event will
be set immediately.
.IP "disk signature"
which is a checksum of the disk's boot block, is maintained by the cache
control structure, and it will be verified against the disk if it was
idle for 2 seconds or more.  Hence if during that idle time a disk was
replaced, the change will be detected on the next disk access, and the
condition will be flagged to the file system.
.IP NOTE
It is very important that removable disks should all have a unique
volume label, or volume serial number, which are stored in the disk's
boot sector during formatting.
Changing disks which have an identical
boot sector may result in failure to detect the change, resulting in
unpredictable behavior and possible file system corruption.
.LP

CACHE IMPLEMENTATION
Most Recently Used (MRU) disk blocks are stored in a collection of memory
buffers called the disk cache.  The purpose of the disk cache is to reduce
the number of disk accesses and to accelerate disk read and write
operations, by means of the following techniques:
.IP
Most Recently Used blocks are stored in RAM, which results in the most
frequently accessed data being retrieved from memory rather than from disk.
.IP
Reading data from disk is performed in large units, relying on the read-ahead
feature, one of the disk cache's tunable parameters.
.IP
Write operations are optimized because they occur to memory first.  Then
updating the disk happens in an orderly manner, by delayed write, another
tunable parameter.
.LP
Overall, the main performance advantage arises from a dramatic
reduction in the amount of time spent by the disk drive seeking,
thus maximizing the time available for the disk to read and write
actual data.  In other words, you get efficient use of the disk
drive's available throughput.  The disk cache offers a number of
operational parameters that can be tuned by the user to suit a
particular file system workload pattern, for example, delayed write,
read ahead, and bypass threshold.

The technique of delaying writes to disk means that if the
system is turned off unexpectedly, updates that have not yet
been written to the disk are lost.  To minimize the effect of a
possible crash, the disk cache periodically updates the disk.
Modified blocks of data are not kept in memory more than a specified
period of time.  By specifying a small update period, the possible
worst-case loss of data from a crash is the sum of changes possible
during that specified period.
For example, it is assumed that an update period
of 2 seconds is sufficiently large to effectively optimize disk writes,
yet small enough to make the potential loss of data a reasonably minor
concern.  It is possible to set the update period to 0, in which case
all updates are flushed to disk immediately.  This is essentially the
equivalent of using the DOS_OPT_AUTOSYNC option in earlier dosFsLib
implementations.

The disk cache allows you to negotiate between disk performance and
memory consumption: the more memory allocated to the disk cache, the
higher the "hit ratio" observed, which means increasingly better
performance of file system operations.

Another tunable parameter is the bypass threshold, which defines
how much data constitutes a request large enough to justify bypassing
the disk cache.  When significantly large read or write requests are
made by the application, the disk cache is circumvented and there is
a direct transfer of data between the disk controller and the user
data buffer.  The use of bypassing, in conjunction with support for
contiguous file allocation and access (via the FIOCONTIG ioctl()
command and the DOS_O_CONTIG open() flag), should provide performance
equivalent to that offered by the raw file system (rawFs).

SEE ALSO: dosFsLib, cbioLib, dpartCbio

INTERNAL

State Machine
-------------
Each cache block can be in one of four different states at any time, while
the state transitions may occur only when the mutex is taken.

The three basic states are:
.IP EMPTY
a block does not contain any disk data
.IP CLEAN
a block contains an unmodified copy of a certain disk block
.IP DIRTY
a block contains a disk block which has been modified in memory.

There is also an UNSTABLE state, used between mutex locks
to indicate that a block is being modified in memory
and its data is not valid.
This state is never used after mutex is released.

Removable Device Support Details

It is worth noting that we don't trust the block driver's ability
to set its readyChanged flag correctly.  Some drivers set it without
need, others fail to set it when indeed a disk is replaced.
Hence we devised an independent approach to this issue: we are
assuming that while the device is active and a disk is replaced,
we will get an error, and we also assume it takes at least 2 seconds
to replace a disk.  Hence, if the disk has been idle for more than 2 seconds,
we check the checksum of its boot block against a previously registered
signature.

Issues to revisit or implement:
 + boot block number is hardcoded.
 + separate removable detection into a separate CBIO module below dcache
*/

/* includes */

#include "vxWorks.h"
#include "stdlib.h"
#include "stdio.h"
#include "semLib.h"
#include "ioLib.h"
#include "classLib.h"
#include "objLib.h"
#include "taskLib.h"
#include "tickLib.h"
#include "string.h"
#include "errno.h"
#include "dllLib.h"
#include "logLib.h"
#include "private/classLibP.h"
#include "private/semLibP.h"
#include "private/taskLibP.h"
#include "sysLib.h"

/* START - CBIO private header */
#define CBIO_DEV_EXTRA  struct dcache_ctrl
#include "private/cbioLibP.h"
/* END - CBIO private header */

#include "dcacheCbio.h"

#ifndef DCACHE_MAX_DEVS
#define DCACHE_MAX_DEVS 16
#endif

/* Cache block states */

typedef enum
    {
    CB_STATE_EMPTY,     /* no valid block is assigned */
    CB_STATE_CLEAN,     /* contains a valid block, unmodified */
    CB_STATE_DIRTY,     /* contains a valid block, modified in memory */
    CB_STATE_UNSTABLE   /* block data is undefined or being modified */
    } CB_STATE ;

/* Cache descriptor block */

typedef struct dcache_desc
    {
    DL_NODE     lruList;        /* element in LRU list */
    block_t     block;          /* block number contained */
    caddr_t     data;           /* actual data pointer */
    struct dcache_desc * hashNext; /* offset of next desc in hash slot */
    CB_STATE    state:4;        /* current state */
    unsigned    busy:1;         /* descriptor busy (unused) */
    } DCACHE_DESC ;

struct
dcache_ctrl
    {
    CBIO_DEV_ID cbio_dev ;      /* main device handle */
    CBIO_DEV_ID dc_subDev ;     /* subordinate CBIO device */
    DL_LIST     dc_LRU ;        /* LRU list head */
    char *      dc_desc ;       /* description */
    u_long      dc_numCacheBlocks ;
    caddr_t     dc_BigBufPtr;
    u_long      dc_BigBufSize;
    block_t     dc_lastAccBlock ;
    u_long      dc_updTick ;
    u_long      dc_actTick ;
    u_long      dc_diskSignature;

    /* Hash table params */
    DCACHE_DESC ** dc_hashBase;
    u_long      dc_hashSize;
    u_long      dc_hashHits;
    u_long      dc_hashMisses;

    /* Tunable Parameters */
    u_long      dc_bypassCount;
    u_long      dc_dirtyMax;
    u_long      dc_readAhead;
    u_long      dc_syncInterval;
    u_long      dc_cylinderSize;

    /* Statistic Counters */
    u_long      dc_dirtyCount;
    u_long      dc_cookieHits ;
    u_long      dc_cookieMisses ;
    u_long      dc_lruHits ;
    u_long      dc_lruMisses ;
    u_long      dc_writesForeground ;
    u_long      dc_writesBackground ;
    u_long      dc_writesHidden ;
    u_long      dc_writesForced ;
    } DCACHE_CTRL ;

LOCAL const struct tunablePresets
    {
    u_long      dc_numCacheBlocks ;
    u_long      dc_bypassCount;
    u_long      dc_dirtyMax;
    u_long      dc_readAhead;
    u_long      dc_hashSize;
    u_long      dc_syncInterval;
    } dcacheTunablePresets[] =
    {
    /* <= nblks  bypass  dirty  read-ahead  hash-sz  sync */
    {     16,       4,      7,      1,         0,     0 },
    {     32,       8,     15,      4,        37,     0 },
    {     64,       8,     25,      7,        67,     0 },
    {    128,      16,     50,     10,       131,     1 },
    {    256,      24,    100,     22,       269,     1 },
    {    512,      32,    200,     28,       547,     2 },
    {   1024,      64,    500,     32,      1033,     5 },
    {   NONE,     128,   1000,     64,      2221,    15 },
    };

#ifdef DEBUG
int dcacheCbStateSize = sizeof(CB_STATE);
int dcacheDescSize = sizeof( DCACHE_DESC);
int dcacheDebug = 0;
#endif

/* forward declarations */

LOCAL STATUS dacacheDevInit ( CBIO_DEV_ID dev );
STATUS dcacheUpd ( void ) ;
LOCAL DCACHE_DESC * dcacheBlockLocate( CBIO_DEV_ID dev, block_t block );
LOCAL DCACHE_DESC * dcacheBlockGet( CBIO_DEV_ID dev, block_t block,
    cookie_t *cookie, BOOL readData );
LOCAL STATUS dcacheBlkRW ( CBIO_DEV_ID dev, block_t start_block,
    block_t num_blocks, addr_t buffer, enum cbio_rw rw, cookie_t *cookie);
LOCAL STATUS dcacheBytesRW ( CBIO_DEV_ID dev, block_t start_block,
    off_t offset, addr_t buffer, size_t nBytes, enum cbio_rw rw,
    cookie_t
    *cookie);
LOCAL STATUS dcacheBlkCopy ( CBIO_DEV_ID dev, block_t src_block,
    block_t dst_block, block_t num_blocks);
LOCAL STATUS dcacheIoctl ( CBIO_DEV_ID dev, int command, addr_t arg);

struct dcache_ctrl dcacheCtrl[ DCACHE_MAX_DEVS ] ;

int dcacheUpdTaskId = 0 ;       /* updater task, one for all devices */
int dcacheUpdTaskPriority = 250 ;
int dcacheUpdTaskStack = 5000 ;
/* tuneable by user to save space */

/* CBIO_FUNCS, one per cbio driver */

LOCAL CBIO_FUNCS cbioFuncs =
    {
    (FUNCPTR) dcacheBlkRW,
    (FUNCPTR) dcacheBytesRW,
    (FUNCPTR) dcacheBlkCopy,
    (FUNCPTR) dcacheIoctl
    };

#define DCACHE_UPD_TASK_GRANULARITY 2   /* 1/4 sec granularity */
#define DCACHE_BOOT_BLOCK_NUM       0   /* sec # where signature is */
#define DCACHE_IDLE_SECS            2   /* after 2 secs idle checksum */

#ifdef DEBUG
#undef NDEBUG
#define DEBUG_MSG(fmt,a1,a2,a3,a4,a5,a6) \
    { if(dcacheDebug) logMsg(fmt,a1,a2,a3,a4,a5,a6); }
int dcacheUpdTaskOptions = VX_SUPERVISOR_MODE ;
#else
#define NDEBUG
#define DEBUG_MSG(fmt,a1,a2,a3,a4,a5,a6) {}
int dcacheUpdTaskOptions = VX_SUPERVISOR_MODE | VX_UNBREAKABLE ;
#endif /* DEBUG */

#define strdup(s) (strcpy(malloc(strlen(s)+1), (s)))

#define INFO_MSG(fmt,a1,a2,a3,a4,a5,a6) \