little or no structure, but their common use dictates that they be
available in a common, always accessible place.  It is taking advantage
of the economics of the datacomputer, more than anything else, since
most of these services are available on any file system.

This mode of use is mentioned here because it may account for a large
percentage of datalanguage requests.  It requires only capabilities
which would be present in datalanguage in any case; the only special
requirement is to make sure it is easy and simple to accomplish these
tasks.

2.4.4 Use of Datacomputer for File Archiving

This is another economics-oriented application.  The basic idea is to
store on the datacomputer everything that you intend to read rarely, if
ever.  This could include backup files, audit trails, and the like.

An interesting idea related to archiving is incremental archiving.  A
typical practice, with regard to backing up data stored online in a
time-sharing system, is to write out all the pages which are different
than they were in the last dump.  It is then possible to recover by
restoring the last full dump, and then restoring all incremental dumps
up to the version desired.  This system offers a lower cost for dumping
and storage, and a higher cost for recovery; it is appropriate when the
probability of needing a recovery is low.  Datalanguage, then, should be
designed to permit convenient incremental archiving.

As in the case of the previous application (file system), archiving is
important as a design consideration because of its expected frequency
and economics, not because it necessarily requires any extra generality
at the language level.  It may dictate that specialized mechanisms for
archiving be built into the system.

2.5 Data Sharing

Controlled sharing of data is a central concern of the project.
Three major sub-problems in data sharing are: (1) concurrent use, (2)
independent concepts of the same database, and (3) varying
representations of the same database.

Concurrent use of a resource by multiple independent processes is
commonly implemented for data on the file level in systems in which
files are regarded as disjoint, unrelated objects.  It is sometimes
implemented on the page level.

Winter, Hill & Greiff                                          [Page 12]

RFC 610           Further Datalanguage Design Concepts     December 1973

Considerable work on this problem has already been done within the
datacomputer project.  When this work is complete, it will have some
impact on the language design; by and large however, we do not consider
this aspect of concurrent use to be a language problem.

Other aspects of the concurrent use problem, however, may require more
conscious participation by the user.  They relate to the semantics of
collections of data objects, when such collections span the boundaries
of files known to the internal operating system.  Here the question of
what constitutes an update conflict is more complex.  Related questions
arise in backup and recovery.  If two files are related, then perhaps it
is meaningless to recover an earlier state of one without recovering the
corresponding state of the other.  These problems are yet to be
investigated.

Another problem in data sharing is that not all users of a database
should have the same concept of that database.
Examples: (1) for privacy reasons, some users should be aware of only
part of the database (e.g., scientists doing statistical studies on
medical files do not need access to name and address), (2) for
program-data independence, payroll programs should access only data of
concern in writing paychecks, even though skill inventories may be
stored in the same database, (3) for global control of efficiency,
simplicity in application programming, and program-data independence,
each application program should "see" a data organization that is best
for its job.

To further analyze example (3), consider a database which contains
information about students, teachers, subjects and also indicates which
students have which teachers for which subjects.  Depending on the
problem to be solved, an application program may have a strong
requirement for one of the following organizations:

(1) entries of the form (student,teacher,subject) with no concern about
    redundancy.  In this organization an object of any of the three
    types may occur many times.

(2) entries of the form

             (student,       (teacher,subject),
                             (teacher,subject),
                             .
                             .
                             .
                             (teacher,subject))

(3) entries of the form

             (teacher,       subject,(student...student),
                             subject,(student...student),
                             subject,(student...student))

and other organizations are certainly possible.

One approach to this problem is to choose an organization for stored
data, and then have application programs write requests which organize
output in the form they want.
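The approach just described can be sketched as follows.  This is a
minimal illustration in Python, not datalanguage.  It assumes the stored
organization is form (1), a flat list of (student,teacher,subject)
entries; the helper names `by_student` and `by_teacher` and the sample
data are hypothetical.  Each helper stands in for a request that
reorganizes the stored entries into form (2) or form (3) as it
retrieves them.

```python
from collections import defaultdict

# Assumed stored organization, form (1): flat (student, teacher, subject)
# entries.  The sample data is invented for illustration.
STORED = [
    ("ann", "smith", "math"),
    ("ann", "jones", "physics"),
    ("bob", "smith", "math"),
    ("bob", "smith", "logic"),
]


def by_student(entries):
    """Form (2): map each student to a list of (teacher, subject) pairs."""
    out = defaultdict(list)
    for student, teacher, subject in entries:
        out[student].append((teacher, subject))
    return dict(out)


def by_teacher(entries):
    """Form (3): map each teacher to {subject: [students]}."""
    out = defaultdict(lambda: defaultdict(list))
    for student, teacher, subject in entries:
        out[teacher][subject].append(student)
    # Convert the nested defaultdicts to plain dicts for the caller.
    return {teacher: dict(subjects) for teacher, subjects in out.items()}
```

Both helpers read the same stored entries; only the organization of the
output differs.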
The application programmer applies his
ingenuity in stating the request so that the process of reorganization
is combined with the process of retrieval, and the result is relatively
efficient.  There are important, practical situations in which this
approach is adequate; in fact there are situations in which it is
desirable.  In particular, if efficiency or cost is an overriding
consideration, it may be necessary for every application programmer to
be aware of all the data access and organization factors.  This may be
the case for a massive file, in which each retrieval must be tuned to
the access strategy and organization; any other mode of operation would
result in unacceptable costs or response times.

However, dependence between application programs and data organization
or access strategy is not a good policy in general.  In a widely-shared
database, it can mean enormous cost in the event of database
reorganization, changes to access software, or even changes in the
storage medium.  Such a change may require reprogramming in hundreds of
application programs distributed throughout the network.

As a result, we see a need for a language which supports a spectrum of
operating modes, including: (1) application program is completely
independent of storage structure, access technique, and reorganization
strategy, (2) application program parametrically controls these, (3)
application program entirely controls them.  For a widely-shared
database, mode (1) would be the preferred policy, except when (a) the
application programmer could do a better job than the system in making
decisions, and (b) the need for this increment of efficiency outweighed
the benefits of program-data independence.

In evaluating this question for a particular application, it is
important to realize the role of global efficiency analysis.  When there
are many users of a database, in some sense the best mode of operation
is that which minimizes the total cost of processing all requests and
the total cost of storing the data.
When applications come and go, as
real-world needs change, then the advantages of centralized control are
more likely to outweigh the advantages of optimization for a particular
application program.

The third major sub-problem arises in connection with item level
representations.  Because of the environment in which it executes, each
application program has a preferred set of formatting concepts, length
indicators, padding and alignment conventions, word sizes, character
representations, and so on.  Once again it is better policy for the
application program to be concerned only with the representations it
wants and not with the stored data representation.  However, there will
be cases in which efficiency for a given request overrides all other
factors.

At this level of representation, there is at least one additional
consideration: potential loss of information when conversion takes
place.  Whoever initiates a type conversion (and this will sometimes be
the datacomputer and sometimes the application program) must also be
responsible for seeing that the intent of the request is preserved.
Since the datacomputer must always be responsible for the consistency
and the meaning of a shared database, there are some conflicts to be
resolved here.

To summarize, it seems that the result of wide sharing of databases is
that a larger system must be considered in choosing a data management
policy for a particular database.  This larger system, in the case of
the datacomputer, consists of a network of geographically distributed
applications programs, a centralized database, and a centralized data
management system.  The requirement for datalanguage is to provide
flexibility in the management of this larger system.
In particular, it
must be possible to control when and where conversions and data
reorganizations are performed and access strategies are chosen.

2.6 Need for High Level Communication

All of the above considerations point to the need for high level
communication between the datacomputer and its users.  The complex and
distinct nature of datacomputer hardware makes it imperative that
requests be put to the datacomputer so that it can make major decisions
regarding the access strategies to be used.  At the same time, the large
amounts of data stored and the demand of some users for extremely high
transmission bandwidths make it necessary to provide for user control of
some storage and transmission schemes.  The fact that databases will be
used by applications which desire different views of the same data and
with different constraints means that the datacomputer must be capable
of mapping one user's request onto another user's data.  Interprocess
use of the datacomputer means that data sharing must be completely
controllable to avoid the need for human intervention.  Extensive
facilities for ensuring data integrity and controlling access must be
provided.

2.6.1 Data Description

Basic to all these needs is the requirement that the data stored at the
datacomputer be completely described in both functional and physical
parameters.  A high level description of the data is especially
important to provide the sharing and control of data.  The datacomputer
must be able to map between different hardware and different
applications.  In its most trivial form this means being able to convert
between floating point number representations on different machines.
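As an illustration of this trivial case, the sketch below converts a
single-precision number between two floating point representations.
The choice of formats is an assumption made for the example, since the
text names none: IBM System/360 hexadecimal floating point as the source
and IEEE 754 binary32 as the target.  The function names
`ibm32_to_float` and `to_ieee32_bytes` are hypothetical.

```python
import struct


def ibm32_to_float(word: int) -> float:
    """Decode a 32-bit IBM System/360 hexadecimal float (assumed source).

    Layout: 1 sign bit, 7-bit excess-64 exponent (base 16), 24-bit fraction.
    """
    sign = -1.0 if (word >> 31) & 1 else 1.0
    exponent = (word >> 24) & 0x7F
    fraction = (word & 0xFFFFFF) / float(1 << 24)  # 0 <= fraction < 1
    return sign * fraction * 16.0 ** (exponent - 64)


def to_ieee32_bytes(value: float) -> bytes:
    """Encode a value as big-endian IEEE 754 binary32 (assumed target)."""
    return struct.pack(">f", value)
```

For example, `ibm32_to_float(0x42640000)` yields 100.0, and
`to_ieee32_bytes(100.0)` yields the bytes `42 C8 00 00`.  The assumed
source format has a wider exponent range than the assumed target, so
some values cannot be represented after conversion; `struct.pack`
raises `OverflowError` in that case.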
On
the other extreme it means being able to provide matrix data for the
ILLIAC IV as well as being able to provide answers to queries from a
natural language program, both addressed to the same weather data base.
Data descriptions must provide the ability to specify the bit level
representations and the logical properties and relationships of data.

2.6.2 Data Integrity and Access Control

In the environment we have been describing, the problems of maintaining
data integrity and controlling use of data assume extreme importance.
Shared use of datacomputer files depends on the ability of the
datacomputer to guarantee that the restrictions on data-access are
strictly enforced.  Since different users will have different
descriptions, the access control mechanism must be associated with the
descriptions themselves.  One can control access to data by controlling
access to its various descriptors.  A user can be constrained to access
a given data base only through one specific description which limits the
data he can access.  In a system where the updaters of a database may be
unknown to each other, and possibly have different views of the data,
only the datacomputer can assure data integrity.  For this reason, all
restrictions on possible values of data objects, and on possible or
necessary relationships between objects must be stated in the data
description.

2.6.3 Optimization

The decisions regarding data access strategy must ordinarily be made at
the datacomputer, where knowledge of the physical considerations is
available.
These decisions cannot be made intelligently unless the
requests for data access are made at a high level.

For example, compare the following two situations: (1) a request calls
for output of _all_ weather observations made in California exhibiting
certain wind and pressure conditions, (2) a series of requests is sent,
each one retrieving California weather observations; when a request
finds an observation with the required wind and pressure conditions, it
transmits this observation to a remote system.  Both sessions achieve
the same result: the transmission of a certain set of observations to a
remote site for processing.  In the first session, however, the
datacomputer receives, at the outset, a description of the data that is
needed; in the second, it processes a series of requests, each one of
which is a surprise.

In the first case, a smart datacomputer has the option of retrieving all
of the needed data in one access to the mass storage device.  It can
then buffer this data on disk until the user is ready to accept it.  In
the second case, the datacomputer lacks the information it needs to make
such an optimization.

The language should permit and encourage users to provide the
information needed to do optimization.  The cost of not doing it is much
higher with mass storage devices and large files than it is in
conventional systems.

2.7 Application Oriented Concerns

In the above sections we have described a number of features which the
datacomputer system must provide.  In this section we focus on what is
necessary to make these features readily available to users of the
datacomputer.

2.7.1 Datacomputer-user Interaction

An application interacts with the datacomputer in a _session_.  A
session consists of a series of requests.
Each session involves
connecting to the datacomputer via the network, establishing identities,
and setting up transmission paths for both data and datalanguage.
Datalanguage is transmitted in character mode (using network standard
ASCII) over the datalanguage connection.  Error and status messages are
sent over this connection to the application program.

The data connection (called a PORT) is viewed as a bit stream and is
