for the construction of hierarchical relationships of data.  The
aggregates which will definitely be available are classified as
_structs_, _arrays_, _strings_, _lists_, and _directories_.

A struct is a static aggregate of data items (called _components_).  A
struct is static in the sense that components cannot be added to or
deleted from the struct; they are inextricably bound to the
struct.  Associated with each component of the struct is a name by which
that component may be referenced relative to the struct.  The struct
aggregate may be used to model what is often thought of as a record,



Winter, Hill & Greiff                                          [Page 22]

RFC 610           Further Datalanguage Design Concepts     December 1973


with each component being a field of that record.  A struct can also be
used to group components of a record which are more strongly related,
conceptually, than other components and may be operated on together.

Arrays allow for repetition in data structures.  An array, like a
struct, is a static aggregate of data items (called _members_). Each
member of an array is of the same type.  Associated with each member is
an index by which that member can be referenced relative to the array.
Arrays can be used to model repeating data in a record (repeating
groups).
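
The two static aggregates can be approximated in a present-day
language.  The following Python sketch is an illustration only and is
not datalanguage syntax; the names Name, Employee, ssn and monthly_hours
are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Name:
    # A struct grouping two strongly related components.
    __slots__ = ("first", "last")
    first: str
    last: str


@dataclass
class Employee:
    # __slots__ fixes the set of component names; assigning an unknown
    # name raises AttributeError.
    __slots__ = ("ssn", "name", "monthly_hours")
    ssn: str
    name: Name                      # a struct used as a component
    monthly_hours: Tuple[int, ...]  # an array: same-typed members, referenced by index


emp = Employee("123-45-6789", Name("Ada", "Hill"), (160, 152, 168))
second_month = emp.monthly_hours[1]  # member referenced relative to the array
last_name = emp.name.last            # component referenced by name
```

A tuple stands in for the array because its length cannot change after
construction, which mirrors the static nature described above.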

The concept of string is actually a hybrid of basic data and data
aggregates.  Strings are aggregates in that they are compositions
(similar to arrays) of more primitive data (e.g., characters). They are,
however, generally conceived of as basic in that they are mostly viewed
as a unit rather than as a collection of items, where each item has
individual importance. Also the meaning of a string is highly dependent
on the order of the individual components.  In more concrete terms,
there are operations which are defined on specific types of strings.
For example, the logical operators (_and_, _or_, etc.) are defined to
operate on strings of bits.  However, there are no operations which are
defined on arrays of bits, although there are operations defined on both
arrays, in general, and on bits.  Strings of characters, bits, and
uninterpreted data will be available in datalanguage.
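
As a hedged illustration of operations defined on a specific type of
string, the Python sketch below defines _and_ and _or_ on strings of
bits.  The class name BitString and the requirement that both operands
have the same length are assumptions made for the example.

```python
class BitString:
    """A string of bits treated as a unit, with logical operators defined on it."""

    def __init__(self, bits: str):
        if set(bits) - {"0", "1"}:
            raise ValueError("a bit string may contain only 0 and 1")
        self.bits = bits

    def _combine(self, other: "BitString", op) -> "BitString":
        # Assumption of this sketch: operands must be equally long.
        if len(self.bits) != len(other.bits):
            raise ValueError("operands must have the same length")
        pairs = zip(self.bits, other.bits)
        return BitString("".join(str(op(int(a), int(b))) for a, b in pairs))

    def __and__(self, other: "BitString") -> "BitString":
        return self._combine(other, lambda a, b: a & b)

    def __or__(self, other: "BitString") -> "BitString":
        return self._combine(other, lambda a, b: a | b)

    def __eq__(self, other: object) -> bool:
        # Order of the individual bits is significant.
        return isinstance(other, BitString) and self.bits == other.bits


both = BitString("1100") & BitString("1010")
either = BitString("1100") | BitString("1010")
```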

Lists are like arrays in that they are collections of similar members.
However, lists are dynamic rather than static.  Members of a list can be
added to and deleted from the list.  Although the members of a list are
ordered (in fact more than one ordering can be defined on a list), the
list is not intended to be referenced via an index, as is the case with
an array.  Members of a list can be referenced via some method of
sequencing through the list.  A list member, or set (see discussion
under virtual data) of members, can also be referenced by some method
of identification by content.  The list structure can be used to model
the common notion of a file.  Also restrictive use of lists as
components of structs provides power with respect to the construction of
dynamic hierarchical data relationships below the file level.  For
example, the members of a list may themselves be, in part, composed of
lists, as in a list of families, where each family contains a list of
children as well as other information.
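
The list of families can be sketched in Python as follows.  This is an
illustration under assumed names (Child, Family, select), not a
datalanguage definition; it shows a list nested in a struct, sequencing
through members, and identification by content.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List


@dataclass
class Child:
    name: str
    age: int


@dataclass
class Family:
    surname: str
    children: List[Child] = field(default_factory=list)  # a list nested in a struct


families: List[Family] = []  # models a file: a dynamic list of similar members
families.append(Family("Smith"))
families.append(Family("Jones", [Child("Sam", 7)]))


def select(members: List[Family], pred: Callable[[Family], bool]) -> Iterator[Family]:
    """Sequence through the list, yielding members identified by content."""
    return (m for m in members if pred(m))


# Identification by content: find the Smith family, then extend its nested list.
for fam in select(families, lambda f: f.surname == "Smith"):
    fam.children.append(Child("Lee", 3))

with_toddlers = [
    f.surname
    for f in select(families, lambda f: any(c.age < 5 for c in f.children))
]

# Members can also be deleted; the list shrinks accordingly.
for fam in list(select(families, lambda f: f.surname == "Jones")):
    families.remove(fam)
```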

Directories are dynamic data aggregates which may contain any type of
data item.  Data items contained in a directory are called _nodes_.
Associated with each node of a directory is a name by which that data
item can be referenced relative to the directory. As with lists, items
may be dynamically added to and deleted from a directory.  The primary
motivation behind providing the directory capability is to allow the
user to group conceptually related data together.  Since directories
need not contain only file type information, "auxiliary" data can be
kept as part of the directory.  For example, "constant" information,
like salary range tables for a corporation data base; or user defined
operations and data types (see below) can be maintained in a directory
along with the data which may use this information.  Also directories
may themselves be part of a directory, allowing for a hierarchy of data
grouping.

Directories will also be defined so that system controlled information
can be maintained with some of the subordinate items (e.g. time of
creation, time of update, privacy locks, etc.).  It may also be possible
to allow the data user to define and control his own information which
would be maintained with the data. At the least, the design of
datalanguage will allow for parametric control over the information
managed by the system.

Directories are the most general and dynamic type of aggregate data.
Both the name and description (see below) of directory nodes exist with
the nodes themselves, rather than as part of the description of the
directory.  Also the level of nesting of a directory is dynamic since
directories can be dynamically added to directories.  Directories are
the only aggregate for which this is true.
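
A minimal Python sketch of a directory, under stated assumptions, is
given below.  The class Directory, its methods, the dotted path
notation, and the sample node names are all hypothetical; the point
shown is that name, description, and system controlled information live
with each node, and that directories nest dynamically.

```python
import time
from typing import Any, Dict


class Directory:
    """Dynamic aggregate of named nodes; each node carries its own description."""

    def __init__(self) -> None:
        self._nodes: Dict[str, Dict[str, Any]] = {}

    def add(self, name: str, item: Any, description: str) -> None:
        # Name and description are stored with the node, together with
        # system controlled information such as time of creation.
        self._nodes[name] = {
            "item": item,
            "description": description,
            "created": time.time(),
        }

    def delete(self, name: str) -> None:
        del self._nodes[name]

    def lookup(self, path: str) -> Any:
        """Resolve a dotted path such as 'payroll.salary_ranges'."""
        item: Any = self
        for part in path.split("."):
            item = item._nodes[part]["item"]
        return item


corp = Directory()
payroll = Directory()
corp.add("payroll", payroll, "directory")  # a directory within a directory
payroll.add("salary_ranges", {"clerk": (400, 700)}, "constant table")
payroll.add("employees", [], "list of employee structs")
```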

Datalanguage will also provide some specific and useful variations of
the above data aggregates.  Structs will be available which allow for
optional components. In this case the existence of a component would be
based on the contents of other components.  It may also be possible to
allow for the existence to be based on information found at a higher
level of data hierarchy.  Similarly, components with _unresolved_ type
will be provided.  That is, the component may be one of a fixed number of
types.  The type of the component would be based on the contents of
other components of the struct.  It is also desirable to allow the type
or existence of a component to be based on information other than the
contents of other components.  For instance, the type of one component
might be based on the type of another component.  In general, we would
like for datalanguage to allow for the attributes (see below) of one
item to be a function of the attributes of other items.
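
The dependence of one component's existence or type on the contents of
another can be sketched as follows.  The struct Person, its components,
and the values 'ssn' and 'badge' are assumptions chosen for the example;
datalanguage itself does not define them here.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Person:
    married: bool
    id_kind: str                       # 'ssn' or 'badge' (hypothetical values)
    ident: Union[str, int]             # unresolved type, resolved by id_kind
    spouse_name: Optional[str] = None  # optional: exists only when married is True

    def __post_init__(self) -> None:
        # The type of ident is a function of the contents of id_kind.
        expected = {"ssn": str, "badge": int}[self.id_kind]
        if not isinstance(self.ident, expected):
            raise TypeError(
                f"ident must be {expected.__name__} when id_kind is {self.id_kind!r}"
            )
        # The existence of spouse_name is a function of the contents of married.
        if (self.spouse_name is not None) != self.married:
            raise ValueError("spouse_name exists exactly when married is True")


p = Person(married=True, id_kind="ssn", ident="111-11-1111", spouse_name="Pat")
q = Person(married=False, id_kind="badge", ident=42)
```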

We would also like to provide mixed lists.  Mixed lists are lists which
contain more than one type of member.  In this case the members would
have to be self-defining.  That is, the types of all members would have
to be "alike" to the degree that the information which defines the type
of each member could be found.

Similar to components whose type is unresolved are arrays with
unresolved length.  In this case, information defining the length of the
array must be carried with the array or perhaps with other components of
an aggregate which encompasses the array.

In all of the above cases the type of an item is unresolved to some
degree and information which totally resolves the type is carried with
the item.  It is possible that in some or perhaps all of these cases the
datacomputer system could be responsible for the maintenance of this
information, making it invisible to the data user.
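
One way to carry resolving information with an item is a type tag plus
a length prefix.  The Python sketch below is an assumption about how
this might look, not the datacomputer's representation: the one-byte
tags "I" and "S", the 8-byte integer, and the 2-byte length field are
all invented for the example.

```python
import struct
from typing import List, Union

Item = Union[int, str]


def encode(item: Item) -> bytes:
    """Prefix each item with a one-byte type tag; strings also carry their length."""
    if isinstance(item, int):
        return b"I" + struct.pack(">q", item)
    data = item.encode("utf-8")
    return b"S" + struct.pack(">H", len(data)) + data


def decode_all(buf: bytes) -> List[Item]:
    """Recover a mixed list using only the information stored with each member."""
    items: List[Item] = []
    pos = 0
    while pos < len(buf):
        tag = buf[pos:pos + 1]
        pos += 1
        if tag == b"I":
            (value,) = struct.unpack_from(">q", buf, pos)
            items.append(value)
            pos += 8
        elif tag == b"S":
            (length,) = struct.unpack_from(">H", buf, pos)
            pos += 2
            items.append(buf[pos:pos + length].decode("utf-8"))
            pos += length
        else:
            raise ValueError(f"unknown type tag {tag!r}")
    return items


mixed = [7, "seven", -1, ""]
stored = b"".join(encode(x) for x in mixed)
```

A user of decode_all never sees the tags or lengths, which corresponds
to the case where the system maintains this information invisibly.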


3.3 General Relational Capabilities

The data aggregates described above allow for the modeling of various
relationships among data.  All relationships which can be constructed
are hierarchical.

Two approaches can be taken to provide the capability of modeling
non-hierarchical relationships.  New types of data aggregates can be
introduced which will broaden the range of data relationships
expressible in datalanguage.  Or, a basic data type of "pointer" can be
introduced which will serve as a primitive out of which relations can be
represented.  Pointer would be a data type which establishes some kind
of correspondence from one item to another.  That is, it would be a
method of finding one item, given another.  Providing the ability to
have items of type pointer does not necessitate the introduction of the
concept of address, which we deem to be a dangerous step.  For example,
an item defined to point to a record in a personnel file could contain a
social security number which is contained in each record of the file and
uniquely identifies that record.  In general a pointer is an item of
information which can be used to uniquely identify another item.
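
The social security number example can be sketched in Python.  The
record layout, the sample values, and the function name follow are
hypothetical; the sketch only shows that a pointer holding an
identifying value keeps working when records move, which would not hold
for a storage address.

```python
from typing import Dict, List, Optional

personnel: List[Dict[str, str]] = [
    {"ssn": "111-11-1111", "name": "Adams"},
    {"ssn": "222-22-2222", "name": "Baker"},
]

# The pointer item holds an identifying value, not a storage address.
project = {"title": "datalanguage", "leader_ssn": "222-22-2222"}


def follow(
    pointer_value: str, file: List[Dict[str, str]], key: str = "ssn"
) -> Optional[Dict[str, str]]:
    """Find the record whose key component equals the pointer value."""
    for record in file:
        if record[key] == pointer_value:
            return record
    return None


leader_before = follow(project["leader_ssn"], personnel)
personnel.reverse()  # records change position; the pointer is unaffected
leader_after = follow(project["leader_ssn"], personnel)
```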

While the pointer approach provides the greater degree of flexibility,
it does this at the price of relegating much of the work to the user as
well as severely limiting the amount of control the datacomputer system
has over the data.  A hybrid solution is possible, where some new
aggregate data types are provided as well as a restricted form of
pointer data type.  While the approach to be taken is still being
studied, the datalanguage design will include some method of expressing
non-hierarchical data structures.


3.4 Ordering of Data

Lists are generally viewed as ordered.  It is possible, however, that a
list can be used to model a dynamic collection of similar items which
are not seen as ordered.  The unordered case is important in that,
given this information the datacomputer can be more efficient since new
members can be added wherever it is convenient.

There are a number of ways a list can be ordered.  For instance, the
ordering of a list can be based on the contents of its members.  In the
simplest case this involves the contents of a basic data item.  For
example, a list of structs containing information on employees of a
company may be ordered on the component which contains the employee's
social security number. More complex ordering criteria are possible.
For example, the same list could be ordered alphabetically with respect
to the employee's name.  In this case the ordering relation is a
function of two items, the last and first names.  The user might also
want to define his own ordering scheme, even for orderings based on
basic data items.  An ordering could be based on an employee's job title
which might even utilize auxiliary data (i.e. data external to the
list).  It is also possible to maintain a list in order of insertion.
In the most general case, the user could dynamically define his ordering
by specification of where an item is to be placed as part of his
insertion requests.  In all of the above cases, data could be maintained
in ascending or descending order.
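
A list maintained in order on insertion can be sketched as below.  The
class OrderedList, the JOB_RANK table, and the employee components are
assumptions for illustration; the sketch covers a key that is a function
of two items, a key that uses auxiliary data, and descending order.

```python
import bisect
from typing import Any, Callable, List


class OrderedList:
    """A list kept in order on every insertion, by a user-supplied key."""

    def __init__(self, key: Callable[[Any], Any], descending: bool = False):
        self._key = key
        self._descending = descending
        self._keys: List[Any] = []   # kept ascending, parallel to _items
        self._items: List[Any] = []

    def insert(self, item: Any) -> None:
        k = self._key(item)
        pos = bisect.bisect_right(self._keys, k)  # position that preserves order
        self._keys.insert(pos, k)
        self._items.insert(pos, item)

    def members(self) -> List[Any]:
        return self._items[::-1] if self._descending else list(self._items)


# Auxiliary data external to the list, used by a user-defined ordering.
JOB_RANK = {"clerk": 0, "engineer": 1, "manager": 2}

by_name = OrderedList(key=lambda e: (e["last"], e["first"]))
by_title = OrderedList(key=lambda e: JOB_RANK[e["title"]], descending=True)
for e in (
    {"first": "Ann", "last": "Smith", "title": "engineer"},
    {"first": "Al", "last": "Smith", "title": "manager"},
    {"first": "Bo", "last": "Jones", "title": "clerk"},
):
    by_name.insert(e)
    by_title.insert(e)
```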

In addition to maintenance of a list in some order, it is possible to
define one or more orderings "imposed" on a list.  These orderings must
be based on the contents of a list's members.  This situation is similar
to the concept of virtual data (see below) in that the list is not
physically maintained in a given order, but retrieved as if it were.
Orderings of this type can be dynamically formed (see discussion of set
under virtual data).  Imposed orderings can be accomplished via the
maintenance of auxiliary structures (see discussion under internal
representation) or by utilization of a sorting strategy on retrievals.
Much work has been done with regard to effective implementation of the
maintenance and imposition of orderings on lists.  This work is
described in working paper number 2.


3.5 Data Integrity

An important feature of any data management system is the ability to
have the system ensure the integrity of the data.  Data needs to be
protected against erroneous manipulation by people and against system
failure.

Datalanguage will provide automatic validity checks.  Many flavors need
to be provided so that appropriate trade-offs can be made between the
degree of assurance and the cost of validation.  The datalanguage user
will be able to request constant validation: where validity checks are
made whenever the data is updated; validation on access: where validity
checks are performed when data is referenced but before it is retrieved;
regularly scheduled validation: where the data is checked at regular
intervals; background validation: where the system will run checks in
its spare time; and validation on demand.  Constant validation and
validation on access are actually special cases of the more general
concept of event triggered validation.  In this case the user specifies
an event which will cause data validation procedures to be invoked. This
feature can be used to accomplish such things as validation following a
"batch" of updates.  Also, some mechanism for specifying combinations of
these types would be useful.

In order for some of the data validation techniques to be effective, it
may be necessary to keep some data validation "bookkeeping" information
with the data.  For example, information which can be used to determine
whether an item has been checked since it was last updated might be used
to cause validation on access if there has not been a recent background
validation.  The datacomputer may provide for optional automatic
maintenance of such special kinds of information.
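
The bookkeeping idea can be sketched with a pair of counters kept with
the item.  The class CheckedItem and its fields are assumptions made for
this example; a real system might record times rather than counters.

```python
from typing import Any, Callable


class CheckedItem:
    """Keeps validation bookkeeping (update and check counters) with the data."""

    def __init__(self, value: Any, is_valid: Callable[[Any], bool]):
        self._value = value
        self._is_valid = is_valid
        self._version = 0           # incremented on each update
        self._checked_version = -1  # version last validated; -1 means never
        self.checks_run = 0

    def update(self, value: Any) -> None:
        self._value = value
        self._version += 1

    def validate(self) -> None:
        """Called by a background pass, on demand, or on access."""
        self.checks_run += 1
        if not self._is_valid(self._value):
            raise ValueError(f"invalid value: {self._value!r}")
        self._checked_version = self._version

    def read(self) -> Any:
        # Validate on access only if no check has run since the last update.
        if self._checked_version != self._version:
            self.validate()
        return self._value
```

With this arrangement a read that follows a recent background check
costs nothing extra, while a read of freshly updated data triggers one
check.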

In order for the datacomputer system to ensure data validity, the user
must define what valid is.  Two types of validation can be requested. In
the first case the user can tell the datacomputer that a specific data
item may only assume one of a specific set of values.  For example, the
color component of a struct may only assume the values 'red', 'green',
or 'blue'.  The other case is where some relation must hold between
members of an aggregate.  For example, if the sex component of a struct
is 'male' then the number of pregnancies component must be 0.
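
The two kinds of validity definition can be written as predicates over
a record.  The helper names one_of and implies, and the representation
of a struct as a Python dictionary, are assumptions of this sketch.

```python
from typing import Any, Callable, Dict, List, Set

Record = Dict[str, Any]
Rule = Callable[[Record], bool]


def one_of(component: str, allowed: Set[Any]) -> Rule:
    """First kind of rule: a component may assume only one of a set of values."""
    return lambda rec: rec[component] in allowed


def implies(condition: Rule, consequence: Rule) -> Rule:
    """Second kind of rule: a relation that must hold between components."""
    return lambda rec: (not condition(rec)) or consequence(rec)


RULES: List[Rule] = [
    one_of("color", {"red", "green", "blue"}),
    implies(lambda r: r["sex"] == "male", lambda r: r["pregnancies"] == 0),
]


def is_valid(rec: Record) -> bool:
    """A record is valid when every rule holds."""
    return all(rule(rec) for rule in RULES)
```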

Data validation is only half of the data integrity picture. Data
integrity also involves methods of restoring damaged data.  This requires
maintenance of redundant information.  Features will be provided which
will make the datacomputer system responsible for the maintenance of
redundant data and possibly even automatic restoration of damaged data.
In section 2 we discussed possible uses of the datacomputer for file
backup.  All features which are provided for this purpose will also be
available as methods of maintaining backup information for restoration
of files residing at the datacomputer.


3.6 Privacy

Datalanguage will have to provide extensive privacy and protection
capabilities.  In its simplest form a privacy lock is provided at the
file level.  The lock is opened with a password key.  Associated with
this key is a set of privileges (reading, updating, etc.). Two degrees
of generality are sought.  Privacy should be available at all levels of
data.  Therefore, groups of related data, including groups of files
could be made private by creating private directories.  Also, specific
fields of records could be made private by having private components of
a struct where other components of the struct are visible to a wider (or
different) class of users.  We would also like the user to be able to
define his own mechanism.  In this way, very personalized, complex, and
hence secure mechanisms can be defined.  Also, features such as 'everyone
can see his own salary' might be possible.
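
A password key with an associated privilege set, applied at the level
of struct components, can be sketched as follows.  The class
PrivateStruct, the privilege names 'read' and 'update', and the sample
keys are hypothetical, and plain-text passwords are used only to keep
the example short; this is not a secure design.

```python
from typing import Any, Dict, FrozenSet, Set


class PrivateStruct:
    """Struct whose components are guarded by password keys with privilege sets."""

    def __init__(self, components: Dict[str, Any], private: Set[str]):
        self._components = dict(components)
        self._private = set(private)                # names of private components
        self._keys: Dict[str, FrozenSet[str]] = {}  # password -> privileges

    def grant(self, password: str, privileges: Set[str]) -> None:
        self._keys[password] = frozenset(privileges)

    def read(self, name: str, password: str = "") -> Any:
        # Public components are visible to all; private ones need 'read'.
        privileges = self._keys.get(password, frozenset())
        if name in self._private and "read" not in privileges:
            raise PermissionError(f"component {name!r} is private")
        return self._components[name]

    def update(self, name: str, value: Any, password: str = "") -> None:
        if "update" not in self._keys.get(password, frozenset()):
            raise PermissionError("key does not carry the update privilege")
        self._components[name] = value


record = PrivateStruct({"name": "Adams", "salary": 900}, private={"salary"})
record.grant("k-read", {"read"})
record.grant("k-full", {"read", "update"})
```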



3.7 Conversion

Many types of data are related in that some or all of the possible
values of one type of data have an "obvious" translation to the values
of another.  F