also include an <b>idref</b> attribute to the appropriate database record, provided the value is not DB_NULL.</font></p>
<p><a name="Importing" id="Importing"></a>10.3 Importing Data</p>
<p><font size="2">This section describes the usage of the <b><i>db.*</i></b> import utility, <b>dbimp</b>. The purpose of <b>dbimp</b> is to import data from an ASCII-formatted file into a <b><i>db.*</i></b> database. The import utility is simple enough to allow great quantities of raw ASCII data to be entered into a <b><i>db.*</i></b> database with very little specification. With more complex requirements, <b>dbimp</b> can be instructed to perform mappings between the input data and <b><i>db.*</i></b> records and to make set connections based upon matching field values.</font></p>
<p>All instructions to <b>dbimp</b> are placed into a text file in the form of an import specification language (ISL). An ISL specification identifies the database and the files containing ASCII data to be imported. It also contains mappings between the ASCII data fields and the <b><i>db.*</i></b> record fields. Figure 10-2 shows the general operation of <b>dbimp</b>.</p>
<p align="center"><b><img alt="Fig. 10-2. Import Utility Usage" src="dbstar_10-2.gif"><br><br>Fig. 10-2. Import Utility Usage</b></p>
<p>ASCII text files can be created by a wide variety of software programs, including <b>dbexp</b>, text editors, and most other DBMS products.</p>
<p>Databases that are built with certain restrictions (see section 10.4, "Using Export and Import Together") can be exported, then imported again without losing any information. This allows simple restructuring of a database.
Also, a database in ASCII format can be transferred to another computer and then imported into a <b><i>db.*</i></b> database, even though the source and target computers have different processors and internal representations of data.</p>
<h3><a name="Usage" id="Usage"></a>10.3.1 Import Program Usage</h3>
<p><font size="2">The full definition of the usage of <b>dbimp</b> is shown below.</font></p>
<pre><font color="#0000FF">dbimp [-n] [-s "&lt;<i>char</i>&gt;"] [-e "&lt;<i>char</i>&gt;"] [-k<i>n</i>] [-p<i>n</i>] <i>impspec</i></font></pre>
<p><font size="2">The <b>-s</b> and <b>-e</b> options specify alternate separator characters and escape characters for the ASCII file. For examples, refer to the discussion of these characters in section 10.2. If <b>dbexp</b> has been used to create the text files that are to be imported, any alternate separator or escape characters specified on the <b>dbexp</b> command line should also be used on the <b>dbimp</b> command line.</font></p>
<p>The <b>-n</b> option is a "no print" request. By default, <b>dbimp</b> will display each record of ASCII text and any warning messages to standard output. If no such output is desired, this option will turn it off.</p>
<p>The <b>-k<i>n</i></b> option makes <b><i>n</i></b> the key size in the created record index (see section 10.3.2, example 2). By default, <b><i>n</i></b> is 25. If your data requires more characters to be unique, you can specify up to 228 characters. Note that the larger the number, the larger the temporary key file needs to be. To decrease the size of the CRI keys, use a value of <b><i>n</i></b> less than 25.</p>
<p>The <b>-p<i>n</i></b> option makes <b>dbimp</b> call <b>d_setpages</b> with <b><i>n</i></b> as the number of database pages. By default, <b><i>n</i></b> is 17. To increase performance, increase <b><i>n</i></b>. To save on memory, <b><i>n</i></b> may be set as low as five.</p>
<p>The import specification is contained in the file <b>impspec</b>.
This is a text file containing the specification that has been written by the <b>dbimp</b> user. It may have any legal file name, but we recommend the following naming convention:</p>
<pre><font color="#0000FF">dbname.imp</font></pre>
<p><font size="2">The suffix of <b>.imp</b> will distinguish this file as an import specification for the database <i>dbname</i>.</font></p>
<h3><a name="Language" id="Language"></a>10.3.2 Import Specification Language</h3>
<p><font size="2">This section fully defines the usage and syntax of the import specification language. The examples in the following paragraphs introduce the language's features in a stepwise manner, with growing complexity. The complete grammar is given at the end of this section.</font></p>
<h4>Example One: Simple family information database</h4>
<p><font size="2">The first example contains the following elements: a set of person names in the ASCII text file <b>person.asc</b>:</font></p>
<pre><font color="#0000FF">"Warner","Micah Wayne",9761101
"Warner","Jesse David",9800706
"Warner","Wayne Lawrence",9540530
"Wood","Jennifer Ann",9531214
"Warner","Thomas James",9850404
"Warner","Paul Russell",9860904</font></pre>
<p><font size="2">and a <b><i>db.*</i></b> record definition in a database named <b>tree</b>:</font></p>
<pre><font color="#0000FF">record person {
    char p_last[20];
    char p_first[30];
    long p_birth;
    compound key p_name {
        p_first;
        p_last;
    }
}</font></pre>
<p><font size="2">The goal is to import the ASCII records into the <b><i>db.*</i></b> records. The first ASCII field is a last name, the second is a first and middle name, and the third is an encoded birthdate (format YYYMMDD, where 1000 must be added to the year). The fields in the <b><i>db.*</i></b> record happen to be in the same order.
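</font></p>
<p><font size="2">The birthdate encoding described above can be unpacked with plain integer arithmetic. The following C fragment is a minimal sketch and is not part of <b>dbimp</b> or the <b><i>db.*</i></b> runtime; the <b>birth_date</b> structure and the <b>decode_birth</b> function are hypothetical names introduced only for illustration.</font></p>

```c
/* Calendar date unpacked from a p_birth value (hypothetical helper type). */
struct birth_date {
    int year;
    int month;
    int day;
};

/* Unpack a value stored as YYYMMDD, where 1000 must be added to the
 * three-digit year. decode_birth() is an illustrative name, not a db.* call. */
struct birth_date decode_birth(long packed)
{
    struct birth_date d;

    d.day   = (int)(packed % 100);          /* last two digits          */
    d.month = (int)((packed / 100) % 100);  /* middle two digits        */
    d.year  = (int)(packed / 10000) + 1000; /* leading digits plus 1000 */
    return d;
}
```

<p><font size="2">For the first input record, <b>decode_birth(9761101L)</b> gives year 1976, month 11, and day 1.</font></p>
<p><font size="2">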
Our import specification will be stored in file <b>tree.imp</b>, and will contain the following statements:</font></p>
<pre><font color="#0000FF">database tree;
foreach "person.asc" {
    record person {
        field p_last = 1;
        field p_first = 2;
        field p_birth = 3;
    }
}
end;</font></pre>
<p><font size="2">The import utility is invoked as follows:</font></p>
<pre><font color="#0000FF">dbimp tree.imp</font></pre>
<p><font size="2">The output would look something like the box below:</font></p>
<pre><font color="#0000FF">Database Import Utility For db.* Copyright (C) 1984-2000 Centura Corporation, All Rights Reserved.
Compilation complete
Starting data import
"Warner","Micah Wayne",9761101
"Warner","Jesse David",9800706
"Warner","Wayne Lawrence",9540530
"Wood","Jennifer Ann",9531214
"Warner","Thomas James",9850404
"Warner","Paul Russell",9860904
Successful import</font></pre>
<p><font size="2">Note that there are two phases in the execution of the import utility. The first phase is the compilation phase, where the import specification is read and compiled. If there are any errors or warnings in the specification, messages will be printed before the "Compilation complete" message, and the import will be terminated. The specification must compile correctly before the utility will open and update the database. This second phase is the import phase. Its activity is logged between the "Starting data import" and "Successful import" messages. Normal output is a copy of the input records. If there are any problems during the import, error and warning messages will appear between the input records. Often, the warnings can be ignored, but all should be examined for potential problems in the input data.</font></p>
<p>This first example has illustrated the use of five import specification statements: <b>database</b>, <b>foreach</b>, <b>record</b>, <b>field</b>, and <b>end</b>.</p>
<p>The <b>database</b> statement identifies the database to be opened and updated.
Records may already exist in the database, provided they will not collide with unique keys that are being imported. Existing records will not be altered by the import, nor can they be accessed; the new data will be completely disjoint from existing data.</p>
<p>The <b>foreach</b> statement specifies an ASCII text file (within quotation marks), followed by a block of statements enclosed in braces. It may be interpreted as saying, "for each line in this file, perform the following operations." During the import phase, the utility will repeatedly read one line from the named file and process its contents according to the enclosed statements. When the last line of text has been read, the statement following the closing brace will be executed. In this simple example, the next statement is an <b>end</b> statement.</p>
<p>The <b>record</b> statement names a <b><i>db.*</i></b> record type, followed by a block of statements enclosed in braces. Each time the record statement is executed, a record of the named type is conditionally created. There is a way to skip the creation of a record if a record with the same contents has already been created. This capability will be discussed below. In this example, a record will be created for each input line of text.</p>
<p>A <b>field</b> statement defines a mapping between a <b><i>db.*</i></b> record field (named left of the equals sign), and an ASCII field (identified by its numeric position in the record). The input field is converted into the type of the <b><i>db.*</i></b> field. In this example, the first and second fields in the input record are converted into strings (by adding a null terminator following the last character), and placed into the <b>p_last</b> and <b>p_first</b> fields. Then the third input field is converted into a long integer, and placed into the <b>p_birth</b> field.
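</p>
<p>The conversions described above can be pictured in C roughly as follows. This is a sketch under stated assumptions, not the actual <b>dbimp</b> source: the <b>person</b> structure mirrors the record definition from this example, <b>set_string_field</b> and <b>fill_person</b> are hypothetical helper names, the input line is assumed to be already split into fields, and over-long text is simply truncated to leave room for the null terminator.</p>

```c
#include <stdlib.h>
#include <string.h>

/* C view of the person record from the example schema. */
struct person {
    char p_last[20];
    char p_first[30];
    long p_birth;
};

/* Copy an ASCII field into a fixed-size character field and add the null
 * terminator. Truncating over-long text is an assumption of this sketch. */
void set_string_field(char *dest, size_t dest_size, const char *text)
{
    strncpy(dest, text, dest_size - 1);
    dest[dest_size - 1] = '\0';
}

/* Apply the mappings p_last = 1, p_first = 2, p_birth = 3.
 * fields[] holds the already-split ASCII fields of one input line,
 * so fields[0] is ASCII field 1. */
void fill_person(struct person *p, const char *fields[])
{
    memset(p, 0, sizeof *p);                  /* start from a zeroed record */
    set_string_field(p->p_last, sizeof p->p_last, fields[0]);
    set_string_field(p->p_first, sizeof p->p_first, fields[1]);
    p->p_birth = strtol(fields[2], NULL, 10); /* text to long integer */
}
```

<p>Calling <b>fill_person</b> with the fields of the first input line produces a structure whose <b>p_last</b> is "Warner", whose <b>p_first</b> is "Micah Wayne", and whose <b>p_birth</b> is 9761101.</p>
<p>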
The record is actually created with the <b>d_fillnew</b> function, so that all keys are automatically created along with the record.</p>
<p>The <b>field</b> statement implements the mapping between the ASCII and <b><i>db.*</i></b> records. For each record statement, zero or more field statements may be used, not to exceed the number of fields defined in the record. Not all fields in the <b><i>db.*</i></b> record require a <b>field</b> statement. Unspecified fields are zero-filled. Likewise, it is not necessary to use all ASCII fields in a <b>field</b> statement. The ordering of the fields is irrelevant.</p>
<p>If <b>dbimp</b> is unable to fully convert the input data into the <b><i>db.*</i></b> field type, it will print a warning message, do the best it can, and go on. All other fields are filled. For example, the length of the text in an input field may be longer than the space allowed for it in the record. The utility will fill the <b><i>db.*</i></b> field to its maximum length, and ignore any remaining characters.</p>
<h4>Example Two: Simple library database</h4>
<p><font size="2">Example two will illustrate the conditional creation of records. Suppose that the input text contains redundant data, as follows:</font></p>
<pre><font color="#0000FF">"Knuth, D.",1968,"Fundamental Algorithms"
"Ullman, J.",1982,"Principles of Database Systems"
"Knuth, D.",1969,"Seminumerical Algorithms"
"Knuth, D.",1973,"Searching and Sorting"</font></pre>
<p><font size="2">Your schema will not contain redundant data but will instead utilize a set construct:</font></p>
<pre><font color="#0000FF">record author {
    key char name[32];
}
record book {
    char title[52];
    long pub_date;
}
set published {
    order ascending;
    owner author;
    member book by pub_date;
}</font></pre>
<p><font size="2">In this example, each line of text contains parts of two record types. This implies that two record statements should be included within the one <b>foreach</b> statement.
The resulting database should contain four <b>book</b> records, but only two <b>author</b> records. The first <b>author</b> record should be connected, via the published set, to three <b>book</b> records, while the second <b>author</b> record should be connected to only one. The following import specification will do just that:</font></p>
<pre><font color="#0000FF">database books;
foreach "books.asc" {
    record author {
        create on 1;
        field name = 1;
    }
    record book {
        field title = 3;
        field pub_date = 2;
    }
    connect published;
}
end;</font></pre>
<p><font size="2">In this example, two records are conditionally created for each ASCII text line. The <b>create on</b> statement is used to make sure that a new <b>author</b> record is created only if an identical record has not already been created. Another statement, <b>connect</b>, is used to perform a <b>d_connect</b> function between the two records created in the loop.</font></p>
<p>The <b>create on</b> statement causes <b>dbimp</b> to search its internal created record index (CRI) for the existence of a record with the same record type and field value. The field value stored in the CRI may or may not be stored in the record itself. In this example, the field used in the <b>create on</b> statement is also used in the record. Other import specifications may use the field values only to establish connections, and not for storage in the <b><i>db.*</i></b> record.</p>
<p>If the <b>create on</b> statement searches the CRI for a matching record type and field value and does not find it, it will create the record and an entry in the CRI, storing the database address of the created record in the CRI entry. If it finds a match, it does not create a new record, but uses the database address stored in the entry to represent the record that would have been created.</p>
<p>The import utility maintains another internal list, called current of record type (CRT).
This is a currency table similar to the current set, owner, and member tables maintained by the <b><i>db.*</i></b> runtime functions. Whenever a record is created, the database address of that record is stored as the current record of its type. The <b>connect</b> statement will search the schema tables for the correct owner and member types, then extract the database addresses of those types from the CRT table. Unless a current record exists for both types, <b>dbimp</b> will not attempt a connection, and will skip to the next iteration of the <b>foreach</b> loop. Because only one record of each type is being tracked, <b>dbimp</b> cannot connect recursive sets; that would require two records of that one type to be tracked.</p>
<p>We will now walk through the above example, showing the CRI and CRT lists at key points. The statement to be executed is shown, followed by a description of the actions performed by <b>dbimp</b> in processing the statement.</p>
<p><b>Statement:</b></p>
<pre><font color="#0000FF"> foreach "books.asc" {</font></pre>
<p><font size="2">Read the first line of <strong>books.asc</strong> into memory:</font></p>
<pre><font color="#0000FF"> "Knuth, D.",1968,"Fundamental Algorithms"</font></pre>
<p><b><font size="2">Statement:</font></b></p>
<pre><font color="#0000FF"> record author {
     create on 1;</font></pre>
<p><font size="2">Search CRI (which is empty) for match.<br>Create the record (database address = [0:1]).<br>Create entry in CRI:</font></p>
<pre><font color="#0000FF"> key={author,"Knuth, D."}, data=[0:1]</font></pre>
<p><font size="2">Save CRT:</font></p>
<pre><font color="#0000FF"> author: [0:1]
 book: NULL_DBA</font></pre>
<p><b><font size="2">Statement:</font></b></p>