📄 vcg04.htm
字号:
<BR>
<UL>
<LI>Business model generation: The first step in the use of a high-end CASE tool is to create an operational model of an enterprise, which consists of defining virtually every activity involved in operating the organization. Accurately modeling the operations of a large firm as they relate to information requirements is an extraordinarily difficult and time-consuming process.
<BR>
<BR>
<LI>Schema development: A database schema is a diagram that describes the entire information system pictorially, usually with less detail than that offered by E-R diagrams. The schema for a large information system with a multiplicity of databases might cover an entire wall of a large office or conference room.
<BR>
<BR>
<LI>Relation diagramming: Some CASE tools support several methods of diagramming relations between tables. Most, but not all, CASE tools support E-R diagrams, as well as other pictorial methods, such as Bachman diagrams.
<BR>
<BR>
<LI>Data dictionary development: A data dictionary is a full description of each table in the database and each field of every table. Other properties of tables and fields, such as primary keys, foreign keys, indexes, field data types, and constraints on field values, are included. Creating data dictionaries is one of the subjects of Chapter 22, "Documenting Your Database Applications."
<BR>
<BR>
<LI>Repository creation: A repository is a database that is part of the CASE tool. It contains all the details of the structure and composition of the database. Data in the repository is used to create schema, relation diagrams, and data dictionaries. The repository is also responsible for maintaining version control when you change the database's design. When this book was written, Microsoft and Texas Instruments had joined forces to develop an object-oriented repository for 32-bit Windows database development.
<BR>
<BR>
<LI>Database generation: After you've designed the database, the CASE tool creates the SQL Data Definition Language (DDL) statements necessary to create the database and its tables. You then send the statements to the RDBMS, which builds the database for you.
<BR>
<BR>
<LI>Data flow diagramming: Some database CASE tools include the capability to create data flow diagrams that describe how data is added to tables and how tables are updated. However, data flow diagrams are application-related, not database-design-related. Thus, data flow diagramming capability is a prerequisite for qualification as a CASE database tool.
<BR>
<BR>
</UL>
<P>Mainframe database developers have a variety of CASE tools from which to choose. Several CASE tools serve the client-server market, such as Popkin Software's Data Architect. Developers using desktop DBMs haven't been so fortunate. No commercial CASE tools with the basic features in the preceding list were available for xBase and Paradox developers at the time this book was written.
<BR>
<P>Database modeling tools are a simplified version of CASE tools for information systems development. Modeling tools omit the business modeling aspects of CASE tools but implement at least the majority of the features described in the preceding list. An example of a database modeling tool designed specifically for Jet databases is Asymetrix Inc.'s InfoModeler. InfoModeler is a full-fledged database design system that you can use to create Jet databases from structured English statements that define entity and attribute classes. InfoModeler is described more fully in Chapter 22.
<BR>
<BR>
<A NAME="E68E30"></A>
<H3 ALIGN=CENTER>
<CENTER>
<FONT SIZE=5 COLOR="#FF0000"><B>Rules for Relational Database Design</B></FONT></CENTER></H3>
<BR>
<P>If you use a modeling tool to create your database structure, the modeling tool automatically creates tables that comply with the basic rules of relational database design. However, database developers often are faced with the task of importing or using existing data that isn't in a format suitable for a relational database. It's quite common for database developers to be faced with the task of transforming data contained in spreadsheet "databases" to tables of a relational database. Another scenario is the conversion of a poorly designed, dysfunctional database or a CODASYL network database that contains repeating groups into proper relational structure. (COBOL permits the use of the <I>GroupName</I> OCCURS <I>Several</I> TIMES statement to create repeating groups in network databases.)
<BR>
<P>The following sections describe the methods you use to transform nonrelational data to fully relational form.
<BR>
<BR>
<A NAME="E69E112"></A>
<H4 ALIGN=CENTER>
<CENTER>
<FONT SIZE=4 COLOR="#FF0000"><B>Organizing Entity Classes</B></FONT></CENTER></H4>
<BR>
<P>The section "Flat-File Databases" noted that the worksheet data model often contains inconsistent entities in rows. The stock prices example, shown in Figure 4.6, shows an Excel worksheet whose structure violates every rule applicable to relational database tables except attribute atomicity. STOCKS is a worksheet that lists the New York Stock Exchange's (NYSE) closing, high, and low price for shares and the sales volume of 25 stocks for a five-day period. Rows contain different entities, and columns B through F are repeating groups. The Stocks5.xls workbook in Excel 5.0/7.0 format, which is included on the CD-ROM that comes with this book, is used in the following examples. You'll find Stocks5.xls in the CHAPTR04 folder on the CD that comes with this book.
<BR>
<P><B><A HREF="04vcg09.gif" tppabs="http://202.113.16.101/%7eeb%7e/Database%20Developer's%20Guide%20with%20Visual%20C++%204,%20Second%20Edition/04vcg09.gif">Figure 4.6. A worksheet whose structure is the antithesis of proper database table design.</A></B>
<BR>
<P>You need to separate the entity classes according to the object each entity class represents. The four entity classes of the STOCKS worksheet of the Stocks5.xls workbook are the closing price, the highest transaction price, the lowest transaction price, and the trading volume of a particular stock on a given day. To separate the entity classes, you need to add a column so that the stock is identified by its abbreviation in each row. You can identify the data entities by their classes—Close, High, Low, and Volume—plus the abbreviation for the stock, which is added to the new column with a simple recorded Excel VBA macro. Then, you sort the data with the Entity and Key columns. The result of this process appears as shown for the Stocks1 worksheet, shown in Figure 4.7.
<BR>
<P><B><A HREF="04vcg10.gif" tppabs="http://202.113.16.101/%7eeb%7e/Database%20Developer's%20Guide%20with%20Visual%20C++%204,%20Second%20Edition/04vcg10.gif">Figure 4.7. The STOCKS worksheet with entities sorted by entity class.</A></B>
<BR>
<BLOCKQUOTE>
<BLOCKQUOTE>
<HR ALIGN=CENTER>
<BR>
<NOTE><B>NOTE</B>
<BR>
<BR>The dates column represents a mixed entity type (three prices in dollars and the volume in shares), but each entity is now identified by its type. Thus, you can divide the entities into separate tables at any point in the transformation process.</NOTE>
<BR>
<HR ALIGN=CENTER>
</BLOCKQUOTE></BLOCKQUOTE>
<P>Now you have a table that contains entities with consistent attribute values, because you moved the inconsistent stock name abbreviation to its own attribute class, Key, and replaced the stock abbreviation in the Entity column A to a value consistent with the Entity attribute class Close. However, the repeating-groups problem remains.
<BR>
<BLOCKQUOTE>
<BLOCKQUOTE>
<HR ALIGN=CENTER>
<BR>
<NOTE><B>NOTE</B>
<BR>
<BR>This chapter uses manual worksheet methods of manipulating tabular data because worksheet techniques such as selecting, cutting, and pasting groups of cells represent the easiest and fastest way of changing the structure of tabular data. If you need to transform a large amount of worksheet data into relational tables, you should use Visual C++ OLE Automation methods to automate the transformation process.</NOTE>
<BR>
<HR ALIGN=CENTER>
</BLOCKQUOTE></BLOCKQUOTE>
<BR>
<A NAME="E69E113"></A>
<H4 ALIGN=CENTER>
<CENTER>
<FONT SIZE=4 COLOR="#FF0000"><B>Normalizing Table Data</B></FONT></CENTER></H4>
<BR>
<P>The process of transforming existing data into relational form is called <I>normalization</I>. Normalization of data is based on the assumption that you've organized your data into a tabular structure wherein the tables contain only a single entity class. The objectives of normalization of data include the following:
<BR>
<UL>
<LI>Eliminating duplicated information contained in tables
<BR>
<BR>
<LI>Accommodating future changes to the structure of tables
<BR>
<BR>
<LI>Minimizing the impact of changes to database structure on the front-end applications that process the data
<BR>
<BR>
</UL>
<P>The following sections describe the five steps that constitute full normalization of relational tables. In most cases, you can halt the normalization process at third normal form, or model. Many developers bypass fourth and fifth normal forms because these normalization rules appear arcane and inapplicable to everyday database design.
<BR>
<BR>
<A NAME="E70E27"></A>
<H5 ALIGN=CENTER>
<CENTER>
<FONT SIZE=4 COLOR="#FF0000"><B>First Normal Form</B></FONT></CENTER></H5>
<BR>
<P>First normal form requires that tables be flat and contain no repeating groups. A data cell of a flat table may contain only one atomic (indivisible) data value. If your imported data contains multiple data items in a single field, you need to add one or more new fields to contain each data item and then move the multiple data items into the new field.
<BR>
<P>The Northwind.mdb sample database included with Access 95 has a Customers table whose Address field contains data that violates first normal form—some cells contain a two-line address. Figure 4.8 shows the Customers table of Northwind.mdb (which was NWIND.MDB in earlier versions of Access) in datasheet mode. The multiline addresses for Hungry Coyote Import Store and Island Trading violate the atomicity rule. Thus, you need another field, such as Location, to contain the second line of two-line entries in the Address field. For parcel delivery services such as Federal Express, you need the physical address in the Location field for firms that use post office boxes to receive their mail.
<BR>
<P><B><A HREF="04vcg11.gif" tppabs="http://202.113.16.101/%7eeb%7e/Database%20Developer's%20Guide%20with%20Visual%20C++%204,%20Second%20Edition/04vcg11.gif">Figure 4.8. First normal form violations in the Customers table of Access 95's Northwind.mdb sample database.</A></B>
<BR>
<BLOCKQUOTE>
<BLOCKQUOTE>
<HR ALIGN=CENTER>
<BR>
<NOTE><B>NOTE</B>
<BR>
<BR>If you're an xBase or Paradox developer, you might think that adding a field to contain the physical location portion of the address causes unwarranted expansion of the size of the Customers table. This isn't the case with Jet tables, because Jet databases use variable-length fields for the Text field data type. If an entry is missing in the Location field for a customer, the field contains only the Null value in databases tha
⌨️ 快捷键说明
复制代码
Ctrl + C
搜索代码
Ctrl + F
全屏模式
F11
切换主题
Ctrl + Shift + D
显示快捷键
?
增大字号
Ctrl + =
减小字号
Ctrl + -