📄 fieldhash.pm
package Hash::Util::FieldHash;

use 5.009004;
use strict;
use warnings;
use Scalar::Util qw( reftype);

our $VERSION = '1.03';

require Exporter;
our @ISA = qw(Exporter);
our %EXPORT_TAGS = (
    'all' => [ qw(
        fieldhash
        fieldhashes
        idhash
        idhashes
        id
        id_2obj
        register
    )],
);
our @EXPORT_OK = ( @{ $EXPORT_TAGS{'all'} } );

{
    require XSLoader;
    my %ob_reg; # private object registry
    sub _ob_reg { \ %ob_reg }
    XSLoader::load('Hash::Util::FieldHash', $VERSION);
}

sub fieldhash (\%) {
    for ( shift ) {
        return unless ref() && reftype( $_) eq 'HASH';
        return $_ if Hash::Util::FieldHash::_fieldhash( $_, 0);
        return $_ if Hash::Util::FieldHash::_fieldhash( $_, 2) == 2;
        return;
    }
}

sub idhash (\%) {
    for ( shift ) {
        return unless ref() && reftype( $_) eq 'HASH';
        return $_ if Hash::Util::FieldHash::_fieldhash( $_, 0);
        return $_ if Hash::Util::FieldHash::_fieldhash( $_, 1) == 1;
        return;
    }
}

sub fieldhashes { map &fieldhash( $_), @_ }
sub idhashes { map &idhash( $_), @_ }

1;
__END__

=head1 NAME

Hash::Util::FieldHash - Support for Inside-Out Classes

=head1 SYNOPSIS

  ### Create fieldhashes
  use Hash::Util qw(fieldhash fieldhashes);

  # Create a single field hash
  fieldhash my %foo;

  # Create three at once...
  fieldhashes \ my(%foo, %bar, %baz);
  # ...or any number
  fieldhashes @hashrefs;

  ### Create an idhash and register it for garbage collection
  use Hash::Util::FieldHash qw(idhash register);
  idhash my %name;
  my $object = \ do { my $o };
  # register the idhash for garbage collection with $object
  register($object, \ %name);
  # the following entry will be deleted when $object goes out of scope
  $name{$object} = 'John Doe';

  ### Register an ordinary hash for garbage collection
  use Hash::Util::FieldHash qw(id register);
  my %name;
  my $object = \ do { my $o };
  # register the hash %name for garbage collection of $object's id
  register $object, \ %name;
  # the following entry will be deleted when $object goes out of scope
  $name{id $object} = 'John Doe';

=head1 FUNCTIONS

C<Hash::Util::FieldHash> offers a number of functions in support of
L<The Inside-out Technique> of class construction.

=over

=item id

    id($obj)

Returns the reference address of a reference $obj. If $obj is
not a reference, returns $obj.

This function is a stand-in replacement for
L<Scalar::Util::refaddr|Scalar::Util/refaddr>, that is, it returns
the reference address of its argument as a numeric value. The only
difference is that C<refaddr()> returns C<undef> when given a
non-reference while C<id()> returns its argument unchanged.

C<id()> also uses a caching technique that makes it faster when
the id of an object is requested often, but slower if it is needed
only once or twice.

=item id_2obj

    $obj = id_2obj($id)

If C<$id> is the id of a registered object (see L</register>), returns
the object, otherwise an undefined value. For registered objects this
is the inverse function of C<id()>.

=item register

    register($obj)
    register($obj, @hashrefs)

In the first form, registers an object to work with for the function
C<id_2obj()>. In the second form, it additionally marks the given
hashrefs down for garbage collection.
This means that when the object
goes out of scope, any entries in the given hashes under the key of
C<id($obj)> will be deleted from the hashes.

It is a fatal error to register a non-reference $obj. Any non-hashrefs
among the following arguments are silently ignored.

It is I<not> an error to register the same object multiple times with
varying sets of hashrefs. Any hashrefs that are not registered yet
will be added, others ignored.

Registry also implies thread support. When a new thread is created,
all references are replaced with new ones, including all objects.
If a hash uses the reference address of an object as a key, that
connection would be broken. With a registered object, its id will
be updated in all hashes registered with it.

=item idhash

    idhash my %hash

Makes an idhash from the argument, which must be a hash.

An I<idhash> works like a normal hash, except that it stringifies a
I<reference used as a key> differently. A reference is stringified
as if the C<id()> function had been invoked on it, that is, its
reference address in decimal is used as the key.

=item idhashes

    idhashes \ my(%hash, %gnash, %trash)
    idhashes @hashrefs

Creates many idhashes from its hashref arguments. Returns those
arguments that could be converted or their number in scalar context.

=item fieldhash

    fieldhash %hash;

Creates a single fieldhash. The argument must be a hash. Returns
a reference to the given hash if successful, otherwise nothing.

A I<fieldhash> is, in short, an idhash with auto-registry. When an
object (or, indeed, any reference) is used as a fieldhash key, the
fieldhash is automatically registered for garbage collection with
the object, as if C<register $obj, \ %fieldhash> had been called.

=item fieldhashes

    fieldhashes @hashrefs;

Creates any number of field hashes.
Arguments must be hash references.
Returns the converted hashrefs in list context, their number in scalar
context.

=back

=head1 DESCRIPTION

A word on terminology: I shall use the term I<field> for a scalar
piece of data that a class associates with an object. Other terms that
have been used for this concept are "object variable", "(object) property",
"(object) attribute" and more. Especially "attribute" has some currency
among Perl programmers, but that clashes with the C<attributes> pragma. The
term "field" also has some currency in this sense and doesn't seem
to conflict with other Perl terminology.

In Perl, an object is a blessed reference. The standard way of associating
data with an object is to store the data inside the object's body, that is,
the piece of data pointed to by the reference.

In consequence, if two or more classes want to access an object they
I<must> agree on the type of reference and also on the organization of
data within the object body. Failure to agree on the type results in
immediate death when the wrong method tries to access an object. Failure
to agree on data organization may lead to one class trampling over the
data of another.

This object model leads to a tight coupling between subclasses.
If one class wants to inherit from another (and both classes access
object data), the classes must agree about implementation details.
Inheritance can only be used among classes that are maintained together,
in a single source or not.

In particular, it is not possible to write general-purpose classes
in this technique, classes that can advertise themselves as "Put me
on your @ISA list and use my methods". If the other class has different
ideas about how the object body is used, there is trouble.

For reference L<Name_hash> in L<Example 1> shows the standard implementation of
a simple class C<Name> in the well-known hash based way.
It also demonstrates
the predictable failure to construct a common subclass C<NamedFile>
of C<Name> and the class C<IO::File> (whose objects I<must> be globrefs).

Thus, techniques are of interest that store object data I<not> in
the object body but some other place.

=head2 The Inside-out Technique

With I<inside-out> classes, each class declares a (typically lexical)
hash for each field it wants to use. The reference address of an
object is used as the hash key. By definition, the reference address
is unique to each object so this guarantees a place for each field that
is private to the class and unique to each object. See L<Name_id> in
L<Example 1> for a simple example.

In comparison to the standard implementation where the object is a
hash and the fields correspond to hash keys, here the fields correspond
to hashes, and the object determines the hash key. Thus the hashes
appear to be turned I<inside out>.

The body of an object is never examined by an inside-out class, only
its reference address is used. This allows for the body of an actual
object to be I<anything at all> while the object methods of the class
still work as designed. This is a key feature of inside-out classes.

=head2 Problems of Inside-out

Inside-out classes give us freedom of inheritance, but as usual there
is a price.

Most obviously, there is the necessity of retrieving the reference
address of an object for each data access. It's a minor inconvenience,
but it does clutter the code.

More important (and less obvious) is the necessity of garbage
collection. When a normal object dies, anything stored in the
object body is garbage-collected by perl. With inside-out objects,
Perl knows nothing about the data stored in field hashes by a class,
but these must be deleted when the object goes out of scope. Thus
the class must provide a C<DESTROY> method to take care of that.

In the presence of multiple classes it can be non-trivial
to make sure that every relevant destructor is called for
every object.
Perl calls the first one it finds on the
inheritance tree (if any) and that's it.

A related issue is thread-safety. When a new thread is created,
the Perl interpreter is cloned, which implies that all reference
addresses in use will be replaced with new ones. Thus, if a class
tries to access a field of a cloned object its (cloned) data will
still be stored under the now invalid reference address of the
original in the parent thread. A general C<CLONE> method must
be provided to re-establish the association.

=head2 Solutions

C<Hash::Util::FieldHash> addresses these issues on several
levels.

The C<id()> function is provided in addition to the
existing C<Scalar::Util::refaddr()>. Besides its short name
it can be a little faster under some circumstances (and a
bit slower under others). Benchmark if it matters. The
working of C<id()> also allows the use of the class name
as a I<generic object> as described L<further down|/"The Generic Object">.

The C<id()> function is incorporated in I<id hashes> in the sense
that it is called automatically on every key that is used with
the hash. No explicit call is necessary.

The problems of garbage collection and thread safety are both
addressed by the function C<register()>. It registers an object
together with any number of hashes. Registry means that when the
object dies, an entry in any of the hashes under the reference
address of this object will be deleted. This guarantees garbage
collection in these hashes. It also means that on thread
cloning the object's entries in registered hashes will be
replaced with updated entries whose key is the cloned object's
reference address. Thus the object-data association becomes
thread-safe.

Object registry is best done when the object is initialized
for use with a class. That way, garbage collection and thread
safety are established for every object and every field that is
initialized.

Finally, I<field hashes> incorporate all these functions in one
package.
Besides automatically calling the C<id()> function
on every object used as a key, the object is registered with
the field hash on first use. Classes based on field hashes
are fully garbage-collected and thread safe without further
measures.

=head2 More Problems

Another problem that occurs with inside-out classes is serialization.
Since the object data is not in its usual place, standard routines
like C<Storable::freeze()>, C<Storable::thaw()> and
C<Data::Dumper::Dumper()> can't deal with it on their own. Both
C<Data::Dumper> and C<Storable> provide the necessary hooks to
make things work, but the functions or methods used by the hooks
must be provided by each inside-out class.

A general solution to the serialization problem would require another
level of registry, one that associates I<classes> and fields.
So far, the functions of C<Hash::Util::FieldHash> are unaware of
any classes, which I consider a feature. Therefore C<Hash::Util::FieldHash>
doesn't address the serialization problems.

=head2 The Generic Object

Classes based on the C<id()> function (and hence classes based on
C<idhash()> and C<fieldhash()>) show a peculiar behavior in that
the class name can be used like an object. Specifically, methods
that set or read data associated with an object continue to work as
class methods, just as if the class name were an object, distinct from
all other objects, with its own data.
This object may be called
the I<generic object> of the class.

This works because field hashes respond to keys that are not references
like a normal hash would and use the string offered as the hash key.
Thus, if a method is called as a class method, the field hash is presented
with the class name instead of an object and blithely uses it as a key.
Since the keys of real objects are decimal numbers, there is no
conflict and the slot in the field hash can be used like any other.
The C<id()> function behaves correspondingly with respect to non-reference
arguments.

Two possible uses (besides ignoring the property) come to mind.
A singleton class could be implemented using the generic object.
If necessary, an C<init()> method could die or ignore calls with
actual objects (references), so only the generic object will ever exist.

Another use of the generic object would be as a template. It is
a convenient place to store class-specific defaults for various
fields to be used in actual object initialization.

Usually, the feature can be entirely ignored. Calling I<object
methods> as I<class methods> normally leads to an error and isn't used
routinely anywhere. It may be a problem that this error isn't
indicated by a class with a generic object.

=head2 How to use Field Hashes

Traditionally, the definition of an inside-out class contains a bare
block inside which a number of lexical hashes are declared and the
basic accessor methods defined, usually through C<Scalar::Util::refaddr>.
Further methods may be defined outside this block. There has to be
a DESTROY method and, for thread support, a CLONE method.

When field hashes are used, the basic structure remains the same.
Each lexical hash will be made a field hash. The call to C<refaddr>
can be omitted from the accessor methods. DESTROY and CLONE methods
are not necessary.

If you have an existing inside-out class, simply making all hashes
field hashes with no other change should make no difference.
Throughthe calls to C<refaddr> or equivalent, the field hashes never get tosee a reference and work like normal hashes. Your DESTROY (andCLONE) methods are still needed.To make the field hashes kick in, it is easiest to redefine C<refaddr>as sub refaddr { shift }instead of importing it from C<Scalar::Util>. It should now be possibleto disable DESTROY and CLONE. Note that while it isn't disabled,DESTROY will be called before the garbage collection of field hashes,so it will be invoked with a functional object and will continue tofunction.It is not desirable to import the functions C<fieldhash> and/orC<fieldhashes> into every class that is going to use them. Theyare only used once to set up the class. When the class is up and running,these functions serve no more purpose.If there are only a few field hashes to declare, it is simplest to use Hash::Util::FieldHash;early and call the functions qualified: Hash::Util::FieldHash::fieldhash my %foo;Otherwise, import the functions into a convenient package likeC<HUF> or, more general, C<Aux> { package Aux; use Hash::Util::FieldHash ':all'; }and call Aux::fieldhash my %foo;as needed.=head2 Garbage-Collected HashesGarbage collection in a field hash means that entries will "spontaneously"disappear when the object that created them disappears. That must beborne in mind, especially when looping over a field hash. If anythingyou do inside the loop could cause an object to go out of scope, arandom key may be deleted from the hash you are looping over. Thatcan throw the loop iterator, so it's best to cache a consistent snapshotof the keys and/or values and loop over that. You will still have tocheck that a cached entry still exists when you get to it.