m4_comment([$Id: terrain.so,v 10.4 2001/05/05 01:49:26 bostic Exp $])

m4_ref_title(Introduction,
    Mapping the terrain: theory and practice,, intro/data, intro/dbis)

m4_p([dnl
The first step in selecting a database system is figuring out what the
choices are. Decades of research and real-world deployment have produced
countless systems. We need to organize them somehow to reduce the number
of options.])

m4_p([dnl
One obvious way to group systems is to use the common labels that
vendors apply to them. The buzzwords here include "network,"
"relational," "object-oriented," and "embedded," with some
cross-fertilization like "object-relational" and "embedded network".
Understanding the buzzwords is important. Each has some grounding in
theory, but has also evolved into a practical label for categorizing
systems that work in a certain way.])

m4_p([dnl
All database systems, regardless of the buzzwords that apply to them,
provide a few common services. All of them store data, for example.
We'll begin by exploring the common services that all systems provide,
and then examine the differences among the different kinds of systems.])

m4_section([Data access and data management])

m4_p([dnl
Fundamentally, database systems provide two services.])

m4_p([dnl
The first service is m4_italic(data access). Data access means adding
new data to the database (inserting), finding data of interest
(searching), changing data already stored (updating), and removing data
from the database (deleting). All databases provide these services. How
they work varies from category to category, and depends on the record
structure that the database supports.])

m4_p([dnl
Each record in a database is a collection of values. For example, the
record for a Web site customer might include a name, email address,
shipping address, and payment information. Records are usually stored
in tables. Each table holds records of the same kind.
For example, the
m4_bold(customer) table at an e-commerce Web site might store the
customer records for every person who shopped at the site. Often,
database records have a different structure from the structures or
instances supported by the programming language in which an application
is written. As a result, working with records can mean:])

m4_bulletbegin
m4_bullet([dnl
using database operations like searches and updates on records; and])
m4_bullet([dnl
converting between programming language structures and database record
types in the application.])
m4_bulletend

m4_p([dnl
The second service is m4_italic(data management). Data management is
more complicated than data access. Providing good data management
services is the hard part of building a database system. When you
choose a database system to use in an application you build, making sure
it supports the data management services you need is critical.])

m4_p([dnl
Data management services include allowing multiple users to work on the
database simultaneously (concurrency), allowing multiple records to be
changed instantaneously (transactions), and surviving application and
system crashes (recovery). Different database systems offer different
data management services. Data management services are entirely
independent of the data access services listed above. For example,
nothing about relational database theory requires that the system
support transactions, but most commercial relational systems do.])

m4_p([dnl
Concurrency means that multiple users can operate on the database at
the same time. Support for concurrency ranges from none (single-user
access only) to complete (many readers and writers working
simultaneously).])

m4_p([dnl
Transactions permit users to make multiple changes appear at once. For
example, a transfer of funds between bank accounts needs to be a
transaction because the balance in one account is reduced and the
balance in the other increases.
If the reduction happened before the
increase, then a poorly-timed system crash could leave the customer
poorer; if the bank used the opposite order, then the same system crash
could make the customer richer. Obviously, both the customer and the
bank are best served if both operations happen at the same instant.])

m4_p([dnl
Transactions have well-defined properties in database systems. They are
m4_italic(atomic), so that the changes happen all at once or not at all.
They are m4_italic(consistent), so that the database is in a legal state
when the transaction begins and when it ends. They are typically
m4_italic(isolated), which means that any other users in the database
cannot interfere with them while they are in progress. And they are
m4_italic(durable), so that if the system or application crashes after
a transaction finishes, the changes are not lost. Together, the
properties of m4_italic(atomicity), m4_italic(consistency),
m4_italic(isolation), and m4_italic(durability) are known as the ACID
properties.])

m4_p([dnl
As is the case for concurrency, support for transactions varies among
databases. Some offer atomicity without making guarantees about
durability. Some ignore isolation, especially in single-user
systems; there's no need to isolate other users from the effects of
changes when there are no other users.])

m4_p([dnl
Another important data management service is recovery. Strictly
speaking, recovery is a procedure that the system carries out when it
starts up. The purpose of recovery is to guarantee that the database is
complete and usable. This is most important after a system or
application crash, when the database may have been damaged. The recovery
process guarantees that the internal structure of the database is good.
Recovery usually means that any completed transactions are checked, and
any lost changes are reapplied to the database.
At the end of the
recovery process, applications can use the database as if there had been
no interruption in service.])

m4_p([dnl
Finally, there are a number of data management services that permit
copying of data. For example, most database systems are able to import
data from other sources, and to export it for use elsewhere. Also, most
systems provide some way to back up databases and to restore them in the
event of a system failure that damages the database. Many commercial
systems allow m4_italic(hot backups), so that users can back up
databases while they are in use. Many applications must run without
interruption, and cannot be shut down for backups.])

m4_p([dnl
A particular database system may provide other data management services.
Some provide browsers that show database structure and contents. Some
include tools that enforce data integrity rules, such as the rule that
no employee can have a negative salary. These data management services
are not common to all systems, however. Concurrency, recovery, and
transactions are the data management services that most database vendors
support.])

m4_p([dnl
Deciding what kind of database to use means understanding the data
access and data management services that your application needs. m4_db
is an embedded database that supports fairly simple data access with a
rich set of data management services. To highlight its strengths and
weaknesses, we can compare it to other database system categories.])

m4_section([Relational databases])

m4_p([dnl
Relational databases are probably the best-known database variant,
because of the success of companies like Oracle. Relational databases
are based on the mathematical field of set theory. The term "relation"
is really just a synonym for "set" -- a relation is just a set of
records or, in our terminology, a table. One of the main innovations in
early relational systems was to insulate the programmer from the
physical organization of the database.
Rather than walking through
arrays of records or traversing pointers, programmers make statements
about tables in a high-level language, and the system executes those
statements.])

m4_p([dnl
Relational databases operate on m4_italic(tuples), or records, composed
of values of several different data types, including integers, character
strings, and others. Operations include searching for records whose
values satisfy some criteria, updating records, and so on.])

m4_p([dnl
Virtually all relational databases use the Structured Query Language,
or SQL. This language permits people and computer programs to work with
the database by writing simple statements. The database engine reads
those statements and determines how to satisfy them on the tables in
the database.])

m4_p([dnl
SQL is the main practical advantage of relational database systems.
Rather than writing a computer program to find records of interest, the
relational system user can just type a query in a simple syntax, and
let the engine do the work. This gives users enormous flexibility; they
do not need to decide in advance what kind of searches they want to do,
and they do not need expensive programmers to find the data they need.
Learning SQL requires some effort, but it's much simpler than a
full-blown high-level programming language for most purposes. And there
are a lot of programmers who have already learned SQL.])

m4_section([Object-oriented databases])

m4_p([dnl
Object-oriented databases are less common than relational systems, but
are still fairly widespread. Most object-oriented databases were
originally conceived as persistent storage systems closely wedded to
particular high-level programming languages like C++.
With the spread
of Java, most now support more than one programming language, but
object-oriented database systems fundamentally provide the same class
and method abstractions as do object-oriented programming languages.])

m4_p([dnl
Many object-oriented systems allow applications to operate on objects
uniformly, whether they are in memory or on disk. These systems create
the illusion that all objects are in memory all the time. The advantage
to object-oriented programmers who simply want object storage and
retrieval is clear. They need never be aware of whether an object is in
memory or not. The application simply uses objects, and the database
system moves them between disk and memory transparently. All of the
operations on an object, and all its behavior, are determined by the
programming language.])

m4_p([dnl
Object-oriented databases aren't nearly as widely deployed as relational
systems. In order to attract developers who understand relational
systems, many of the object-oriented systems have added support for
query languages very much like SQL. In practice, though, object-oriented
databases are mostly used for persistent storage of objects in C++ and
Java programs.])

m4_section([Network databases])

m4_p([dnl
The "network model" is a fairly old technique for managing and
navigating application data. Network databases are designed to make
pointer traversal very fast. Every record stored in a network database
is allowed to contain pointers to other records. These pointers are
generally physical addresses, so fetching the record to which a pointer
refers just means reading it from disk by its disk address.])

m4_p([dnl
Network database systems generally permit records to contain integers,
floating point numbers, and character strings, as well as references to
other records. An application can search for records of interest.
After
retrieving a record, the application can fetch any record to which it
refers, quickly.])

m4_p([dnl
Pointer traversal is fast because most network systems use physical disk
addresses as pointers. When the application wants to fetch a record,
the database system uses the address to fetch exactly the right string
of bytes from the disk. This requires only a single disk access in all
cases. Other systems, by contrast, often must do more than one disk read
to find a particular record.])

m4_p([dnl
The key advantage of the network model is also its main drawback. The
fact that pointer traversal is so fast means that applications that do
it will run well. On the other hand, storing pointers all over the
database makes it very hard to reorganize the database. In effect, once
you store a pointer to a record, it is difficult to move that record
elsewhere. Some network databases handle this by leaving forwarding
pointers behind, but this defeats the speed advantage of doing a single
disk access in the first place. Other network databases find, and fix,
all the pointers to a record when it moves, but this makes
reorganization very expensive. Reorganization is often necessary in
databases, since adding and deleting records over time will consume
space that cannot be reclaimed without reorganizing. Without periodic
reorganization to compact network databases, they can end up with a
considerable amount of wasted space.])

m4_section([Clients and servers])

m4_p([dnl
Database vendors have two choices for system architecture. They can
build a server to which remote clients connect, and do all the database
management inside the server. Alternatively, they can provide a module
that links directly into the application, and does all database
management locally.
In either case, the application developer needs
some way of communicating with the database (generally, an Application
Programming Interface (API) that does work in the process or that
communicates with a server to get work done).])

m4_p([dnl
Almost all commercial database products are implemented as servers, and
applications connect to them as clients. Servers have several features
that make them attractive.])

m4_p([dnl
First, because all of the data is managed by a separate process, and
possibly on a separate machine, it's easy to isolate the database server
from bugs and crashes in the application.])

m4_p([dnl
Second, because some database products (particularly relational engines)
are quite large, splitting them off as separate server processes keeps
applications small, which uses less disk space and memory. Relational
engines include code to parse SQL statements, to analyze them and
produce plans for execution, to optimize the plans, and to execute
them.])

m4_p([dnl
Finally, by storing all the data in one place and managing it with a
single server, it's easier for organizations to back up, protect, and
set policies on their databases. The enterprise databases for large
companies often have several full-time administrators caring for them,
making certain that applications run quickly, granting and denying
access to users, and making backups.])

m4_p([dnl
However, centralized administration can be a disadvantage in some cases.
In particular, if a programmer wants to build an application that uses
a database for storage of important information, then shipping and
supporting the application is much harder. The end user needs to install
and administer a separate database server, and the programmer must
support not just one product, but two. Adding a server process to the
application creates new opportunity for installation mistakes and
run-time problems.])

m4_page_footer