=head1 NAME
perlguts - Perl's Internal Functions
=head1 DESCRIPTION
This document attempts to describe some of the internal functions of the
Perl executable. It is far from complete and probably contains many errors.
Please refer any questions or comments to the author below.
=head1 Variables
=head2 Datatypes
Perl has three typedefs that handle Perl's three main data types:
SV Scalar Value
AV Array Value
HV Hash Value
Each typedef has specific routines that manipulate the various data types.
=head2 What is an "IV"?
Perl uses a special typedef IV which is a simple integer type that is
guaranteed to be large enough to hold a pointer (as well as an integer).
Perl also uses two special typedefs, I32 and I16, which will always be at
least 32-bits and 16-bits long, respectively.
=head2 Working with SVs
An SV can be created and loaded with one command. There are four types of
values that can be loaded: an integer value (IV), a double (NV), a string
(PV), and another scalar (SV).
The six routines are:
SV* newSViv(IV);
SV* newSVnv(double);
SV* newSVpv(char*, int);
SV* newSVpvn(char*, int);
SV* newSVpvf(const char*, ...);
SV* newSVsv(SV*);
To change the value of an *already-existing* SV, there are eight routines:
void sv_setiv(SV*, IV);
void sv_setuv(SV*, UV);
void sv_setnv(SV*, double);
void sv_setpv(SV*, char*);
void sv_setpvn(SV*, char*, int);
void sv_setpvf(SV*, const char*, ...);
void sv_setpvfn(SV*, const char*, STRLEN, va_list *, SV **, I32, bool *);
void sv_setsv(SV*, SV*);
Notice that you can choose to specify the length of the string to be
assigned by using C<sv_setpvn>, C<newSVpvn>, or C<newSVpv>, or you may
allow Perl to calculate the length by using C<sv_setpv> or by specifying
0 as the second argument to C<newSVpv>. Be warned, though, that Perl will
determine the string's length by using C<strlen>, which depends on the
string terminating with a NUL character.
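The following is a minimal sketch in plain C, not Perl's implementation, of why the explicit length matters. The helper C<toy_effective_len> and the other C<toy_> names are hypothetical; they only mimic the convention described above, where a length of 0 means that C<strlen> does the measuring.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: a length of 0 means "measure with strlen",
   mirroring the convention described for newSVpv. */
size_t toy_effective_len(const char *s, size_t len)
{
    assert(s != NULL);
    return len == 0 ? strlen(s) : len;
}

/* Five bytes of data with a NUL in the middle, plus a terminator. */
static const char toy_data[] = { 'a', 'b', '\0', 'c', 'd', '\0' };

/* Length as strlen sees it: stops at the embedded NUL. */
size_t toy_measured(void) { return toy_effective_len(toy_data, 0); }

/* Length as the caller knows it: covers all five data bytes. */
size_t toy_explicit(void) { return toy_effective_len(toy_data, 5); }
```

Here C<toy_measured()> sees only the first two bytes, while C<toy_explicit()> covers all five, so data with embedded NULs needs the explicit-length routines.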
The arguments of C<sv_setpvf> are processed like C<sprintf>, and the
formatted output becomes the value.
C<sv_setpvfn> is an analogue of C<vsprintf>, but it allows you to specify
either a pointer to a variable argument list or the address and length of
an array of SVs. The last argument points to a boolean; on return, if that
boolean is true, then locale-specific information has been used to format
the string, and the string's contents are therefore untrustworthy (see
L<perlsec>). This pointer may be NULL if that information is not
important. Note that this function requires you to specify the length of
the format.
The C<sv_set*()> functions are not generic enough to operate on values
that have "magic". See L<Magic Virtual Tables> later in this document.
All SVs that contain strings should be terminated with a NUL character.
If a string is not NUL-terminated, there is a risk of
core dumps and corruption from code that passes the string to C
functions or system calls that expect a NUL-terminated string.
Perl's own functions typically add a trailing NUL for this reason.
Nevertheless, you should be very careful when you pass a string stored
in an SV to a C function or system call.
To access the actual value that an SV points to, you can use the macros:
SvIV(SV*)
SvNV(SV*)
SvPV(SV*, STRLEN len)
which will automatically coerce the actual scalar type into an IV, double,
or string.
In the C<SvPV> macro, the length of the string returned is placed into the
variable C<len> (this is a macro, so you do I<not> use C<&len>). If you do not
care what the length of the data is, use the global variable C<PL_na>. Remember,
however, that Perl allows arbitrary strings of data that may both contain
NULs and might not be terminated by a NUL.
If you want to know if the scalar value is TRUE, you can use:
SvTRUE(SV*)
Although Perl will automatically grow strings for you, if you need to force
Perl to allocate more memory for your SV, you can use the macro
SvGROW(SV*, STRLEN newlen)
which will determine if more memory needs to be allocated. If so, it will
call the function C<sv_grow>. Note that C<SvGROW> can only increase, not
decrease, the allocated memory of an SV and that it does not automatically
add a byte for a trailing NUL (perl's own string functions typically do
C<SvGROW(sv, len + 1)>).
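To make the two caveats concrete, here is a small plain C model of a grow-only buffer. It is a sketch under stated assumptions rather than Perl's code: C<toy_sv>, C<toy_grow>, and C<toy_set> are hypothetical names, and C<realloc> stands in for whatever allocator Perl uses.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy string buffer: cur is the string length, len the allocated size. */
typedef struct { char *pv; size_t cur; size_t len; } toy_sv;

/* Grow-only, like the behaviour described for SvGROW: a smaller
   request leaves the allocation alone. Returns NULL on failure. */
char *toy_grow(toy_sv *sv, size_t newlen)
{
    assert(sv != NULL);
    if (newlen > sv->len) {
        char *p = realloc(sv->pv, newlen);
        if (p == NULL)
            return NULL;
        sv->pv = p;
        sv->len = newlen;
    }
    return sv->pv;
}

/* Copy n bytes in; the caller asks for n + 1 to leave room for the NUL,
   because the grow step does not add that byte itself. */
int toy_set(toy_sv *sv, const char *src, size_t n)
{
    if (toy_grow(sv, n + 1) == NULL)
        return 0;
    memcpy(sv->pv, src, n);
    sv->pv[n] = '\0';
    sv->cur = n;
    return 1;
}

/* Demo: returns the allocated size after storing "hello" and then
   requesting a smaller size, or 0 if an allocation failed. */
size_t toy_grow_demo(void)
{
    toy_sv sv = { NULL, 0, 0 };
    size_t len = 0;
    if (toy_set(&sv, "hello", 5) && toy_grow(&sv, 3) != NULL)
        len = sv.len;
    free(sv.pv);
    return len;
}
```

In this model the allocation stays at six bytes (five characters plus the NUL) even after the later request for three.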
If you have an SV and want to know what kind of data Perl thinks is stored
in it, you can use the following macros to check the type of SV you have.
SvIOK(SV*)
SvNOK(SV*)
SvPOK(SV*)
You can get and set the current length of the string stored in an SV with
the following macros:
SvCUR(SV*)
SvCUR_set(SV*, I32 val)
You can also get a pointer to the end of the string stored in the SV
with the macro:
SvEND(SV*)
But note that these last three macros are valid only if C<SvPOK()> is true.
If you want to append something to the end of the string stored in an C<SV*>,
you can use the following functions:
void sv_catpv(SV*, char*);
void sv_catpvn(SV*, char*, int);
void sv_catpvf(SV*, const char*, ...);
void sv_catpvfn(SV*, const char*, STRLEN, va_list *, SV **, I32, bool *);
void sv_catsv(SV*, SV*);
The first function calculates the length of the string to be appended by
using C<strlen>. In the second, you specify the length of the string
yourself. The third function processes its arguments like C<sprintf> and
appends the formatted output. The fourth function works like C<vsprintf>.
You can specify the address and length of an array of SVs instead of the
va_list argument. The fifth function extends the string stored in the first
SV with the string stored in the second SV. It also forces the second SV
to be interpreted as a string.
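The relationship between the current length, the end pointer, and the first two append routines can be sketched in plain C. Everything named C<toy_> below is hypothetical and only models the behaviour described above; it does not handle magic and is not how Perl implements C<sv_catpv> or C<sv_catpvn>.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy string buffer: cur is the string length, len the allocated size. */
typedef struct { char *pv; size_t cur; size_t len; } toy_sv;

/* Pointer just past the last string byte, as described for SvEND. */
char *toy_end(const toy_sv *sv) { return sv->pv + sv->cur; }

/* Append exactly n bytes (the sv_catpvn style). Returns 0 on failure. */
int toy_catpvn(toy_sv *sv, const char *src, size_t n)
{
    size_t need;
    assert(sv != NULL && src != NULL);
    need = sv->cur + n + 1;                 /* +1 for the trailing NUL */
    if (need > sv->len) {
        char *p = realloc(sv->pv, need);
        if (p == NULL)
            return 0;
        sv->pv = p;
        sv->len = need;
    }
    memcpy(toy_end(sv), src, n);
    sv->cur += n;                           /* update the current length */
    *toy_end(sv) = '\0';
    return 1;
}

/* Append a NUL-terminated string, measured with strlen (sv_catpv style). */
int toy_catpv(toy_sv *sv, const char *src)
{
    return toy_catpvn(sv, src, strlen(src));
}

/* Demo: build "foobar" and report its length, or 0 on any failure. */
size_t toy_cat_demo(void)
{
    toy_sv sv = { NULL, 0, 0 };
    size_t result = 0;
    if (toy_catpv(&sv, "foo") && toy_catpvn(&sv, "barbaz", 3)
        && strcmp(sv.pv, "foobar") == 0)
        result = sv.cur;
    free(sv.pv);
    return result;
}
```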
The C<sv_cat*()> functions are not generic enough to operate on values that
have "magic". See L<Magic Virtual Tables> later in this document.
If you know the name of a scalar variable, you can get a pointer to its SV
by using the following:
SV* perl_get_sv("package::varname", FALSE);
This returns NULL if the variable does not exist.
If you want to know if this variable (or any other SV) is actually C<defined>,
you can call:
SvOK(SV*)
The scalar C<undef> value is stored in an SV instance called C<PL_sv_undef>. Its
address can be used whenever an C<SV*> is needed.
There are also the two values C<PL_sv_yes> and C<PL_sv_no>, which contain Boolean
TRUE and FALSE values, respectively. Like C<PL_sv_undef>, their addresses can
be used whenever an C<SV*> is needed.
Do not be fooled into thinking that C<(SV *) 0> is the same as C<&PL_sv_undef>.
Take this code:
SV* sv = (SV*) 0;
if (I-am-to-return-a-real-value) {
sv = sv_2mortal(newSViv(42));
}
sv_setsv(ST(0), sv);
This code tries to return a new SV (which contains the value 42) if it should
return a real value, or undef otherwise. Instead it has returned a NULL
pointer which, somewhere down the line, will cause a segmentation violation,
bus error, or just weird results. Change the zero to C<&PL_sv_undef> in the first
line and all will be well.
To free an SV that you've created, call C<SvREFCNT_dec(SV*)>. Normally this
call is not necessary (see L<Reference Counts and Mortality>).
=head2 What's Really Stored in an SV?
Recall that the usual method of determining the type of scalar you have is
to use C<Sv*OK> macros. Because a scalar can be both a number and a string,
these macros will usually return TRUE, and calling the C<Sv*V>
macros will do the appropriate conversion of string to integer/double or
integer/double to string.
If you I<really> need to know if you have an integer, double, or string
pointer in an SV, you can use the following three macros instead:
SvIOKp(SV*)
SvNOKp(SV*)
SvPOKp(SV*)
These will tell you if you truly have an integer, double, or string pointer
stored in your SV. The "p" stands for private.
In general, though, it's best to use the C<Sv*V> macros.
=head2 Working with AVs
There are two ways to create and load an AV. The first method creates an
empty AV:
AV* newAV();
The second method both creates the AV and initially populates it with SVs:
AV* av_make(I32 num, SV **ptr);
The second argument points to an array containing C<num> C<SV*>'s. Once the
AV has been created, the SVs can be destroyed, if so desired.
Once the AV has been created, the following operations are possible on AVs:
void av_push(AV*, SV*);
SV* av_pop(AV*);
SV* av_shift(AV*);
void av_unshift(AV*, I32 num);
These should be familiar operations, with the exception of C<av_unshift>.
This routine adds C<num> elements at the front of the array with the C<undef>
value. You must then use C<av_store> (described below) to assign values
to these new elements.
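The two-step pattern (make room at the front, then assign) can be modelled in plain C. This is only a sketch: C<toy_av>, C<toy_unshift>, C<toy_store>, and the C<TOY_UNDEF> marker are hypothetical, and an array of C<int> stands in for an array of C<SV*>.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_UNDEF (-1)   /* hypothetical marker standing in for undef */

typedef struct { int *elem; size_t fill; } toy_av;   /* fill = element count */

/* Make room for num elements at the front and mark them undefined. */
int toy_unshift(toy_av *av, size_t num)
{
    int *p;
    size_t i;
    assert(av != NULL);
    if (num == 0)
        return 1;
    p = realloc(av->elem, (av->fill + num) * sizeof *p);
    if (p == NULL)
        return 0;
    memmove(p + num, p, av->fill * sizeof *p);   /* shift old elements up */
    for (i = 0; i < num; i++)
        p[i] = TOY_UNDEF;
    av->elem = p;
    av->fill += num;
    return 1;
}

/* Assign to an existing slot, as one would with av_store afterwards. */
int toy_store(toy_av *av, size_t key, int val)
{
    if (key >= av->fill)
        return 0;
    av->elem[key] = val;
    return 1;
}

/* Demo: start with [30], unshift two slots, fill only the first.
   The result [10, undef, 30] is encoded as 10 * 100 + 30. */
int toy_unshift_demo(void)
{
    toy_av av = { NULL, 0 };
    int result = 0;
    if (toy_unshift(&av, 1) && toy_store(&av, 0, 30)
        && toy_unshift(&av, 2)
        && toy_store(&av, 0, 10)
        && av.elem[1] == TOY_UNDEF)
        result = av.elem[0] * 100 + av.elem[2];
    free(av.elem);
    return result;
}
```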
Here are some other functions:
I32 av_len(AV*);
SV** av_fetch(AV*, I32 key, I32 lval);
SV** av_store(AV*, I32 key, SV* val);
The C<av_len> function returns the highest index value in the array (just
like $#array in Perl). If the array is empty, -1 is returned. The
C<av_fetch> function returns the value at index C<key>, but if C<lval>
is non-zero, then C<av_fetch> will store an undef value at that index.
The C<av_store> function stores the value C<val> at index C<key>, and does
not increment the reference count of C<val>. Thus the caller is responsible
for taking care of that, and if C<av_store> returns NULL, the caller will
have to decrement the reference count to avoid a memory leak. Note that
C<av_fetch> and C<av_store> both return C<SV**>'s, not C<SV*>'s as their
return value.
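The ownership rule for a failed store is easy to get wrong, so here is a plain C model of it. This is a sketch with hypothetical C<toy_> names and a fixed-size array, not Perl's reference-counting code; the point is only that a store which returns NULL leaves the caller holding the reference.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy reference-counted value. */
typedef struct { int refcnt; long iv; } toy_sv;

#define TOY_MAX 4
typedef struct { toy_sv *slot[TOY_MAX]; } toy_av;

/* Create a value with a reference count of 1, or NULL on failure. */
toy_sv *toy_new_iv(long iv)
{
    toy_sv *sv = malloc(sizeof *sv);
    if (sv != NULL) { sv->refcnt = 1; sv->iv = iv; }
    return sv;
}

/* Drop one reference; free at zero. Returns the remaining count. */
int toy_refcnt_dec(toy_sv *sv)
{
    int left;
    assert(sv != NULL && sv->refcnt > 0);
    left = --sv->refcnt;
    if (left == 0)
        free(sv);
    return left;
}

/* Store without touching the reference count: on success the array takes
   over the caller's reference. Returns NULL if the store fails. */
toy_sv **toy_store(toy_av *av, int key, toy_sv *val)
{
    if (key < 0 || key >= TOY_MAX)
        return NULL;
    av->slot[key] = val;
    return &av->slot[key];
}

/* Demo: returns the count left after cleanup; 10 is added when the
   store succeeded so the two paths can be told apart. */
int toy_store_demo(int key)
{
    toy_av av = { { NULL } };
    toy_sv *val = toy_new_iv(42);
    toy_sv **slot;
    if (val == NULL)
        return -1;
    slot = toy_store(&av, key, val);
    if (slot == NULL)
        return toy_refcnt_dec(val);    /* store failed: caller releases val */
    return 10 + toy_refcnt_dec(*slot); /* array owned it: release via slot */
}
```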
void av_clear(AV*);
void av_undef(AV*);
void av_extend(AV*, I32 key);
The C<av_clear> function deletes all the elements in the AV* array, but
does not actually delete the array itself. The C<av_undef> function will
delete all the elements in the array plus the array itself. The
C<av_extend> function extends the array so that it contains at least
C<key+1> elements. If C<key+1> is less than the current length of the
array, then nothing is done.
If you know the name of an array variable, you can get a pointer to its AV
by using the following:
AV* perl_get_av("package::varname", FALSE);
This returns NULL if the variable does not exist.
See L<Understanding the Magic of Tied Hashes and Arrays> for more
information on how to use the array access functions on tied arrays.
=head2 Working with HVs
To create an HV, you use the following routine:
HV* newHV();
Once the HV has been created, the following operations are possible on HVs:
SV** hv_store(HV*, char* key, U32 klen, SV* val, U32 hash);
SV** hv_fetch(HV*, char* key, U32 klen, I32 lval);
The C<klen> parameter is the length of the key being passed in (Note that
you cannot pass 0 in as a value of C<klen> to tell Perl to measure the
length of the key). The C<val> argument contains the SV pointer to the
scalar being stored, and C<hash> is the precomputed hash value (zero if
you want C<hv_store> to calculate it for you). The C<lval> parameter
indicates whether this fetch is actually a part of a store operation, in
which case a new undefined value will be added to the HV with the supplied
key and C<hv_fetch> will return as if the value had already existed.
Remember that C<hv_store> and C<hv_fetch> return C<SV**>'s and not just
C<SV*>. To access the scalar value, you must first dereference the return
value. However, you should check to make sure that the return value is
not NULL before dereferencing it.
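A plain C model may help show both the NULL check and the effect of C<lval>. It is a sketch only: the C<toy_> names and C<TOY_UNDEF> are hypothetical, the table is a small linear array rather than a real hash, and C<int> values stand in for C<SV*>.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_SLOTS 8
#define TOY_UNDEF (-1)   /* hypothetical marker standing in for undef */

typedef struct { const char *key; size_t klen; int val; } toy_he;
typedef struct { toy_he ent[TOY_SLOTS]; size_t used; } toy_hv;

/* Return a pointer to the value for key, or NULL if it is absent.
   With lval set, a missing key is first added with an undefined value. */
int *toy_fetch(toy_hv *hv, const char *key, size_t klen, int lval)
{
    size_t i;
    assert(hv != NULL && key != NULL);
    for (i = 0; i < hv->used; i++)
        if (hv->ent[i].klen == klen
            && memcmp(hv->ent[i].key, key, klen) == 0)
            return &hv->ent[i].val;
    if (!lval || hv->used == TOY_SLOTS)
        return NULL;
    hv->ent[hv->used].key = key;     /* toy: borrows the caller's key */
    hv->ent[hv->used].klen = klen;
    hv->ent[hv->used].val = TOY_UNDEF;
    return &hv->ent[hv->used++].val;
}

/* Read a value, checking for NULL before dereferencing. */
int toy_get(toy_hv *hv, const char *key, int fallback)
{
    int *valp = toy_fetch(hv, key, strlen(key), 0);
    return valp != NULL ? *valp : fallback;
}

/* Demo: create "answer" through an lval fetch, set it to 42, then read
   it back along with a key that was never stored. */
int toy_fetch_demo(void)
{
    toy_hv hv = { { { NULL, 0, 0 } }, 0 };
    int *valp = toy_fetch(&hv, "answer", 6, 1);   /* created as undef */
    if (valp == NULL || *valp != TOY_UNDEF)
        return 0;
    *valp = 42;
    return toy_get(&hv, "answer", -2) + toy_get(&hv, "missing", 0);
}
```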
These two functions check whether a hash table entry exists and delete it, respectively.
bool hv_exists(HV*, char* key, U32 klen);
SV* hv_delete(HV*, char* key, U32 klen, I32 flags);
If C<flags> does not include the C<G_DISCARD> flag then C<hv_delete> will
create and return a mortal copy of the deleted value.
And more miscellaneous functions:
void hv_clear(HV*);
void hv_undef(HV*);
Like their AV counterparts, C<hv_clear> deletes all the entries in the hash
table but does not actually delete the hash table. The C<hv_undef> deletes
both the entries and the hash table itself.
Perl keeps the actual data in a linked list of structures with a typedef of HE.
These contain the actual key and value pointers (plus extra administrative
overhead). The key is a string pointer; the value is an C<SV*>. However,
once you have an C<HE*>, to get the actual key and value, use the routines
specified below.
I32 hv_iterinit(HV*);
/* Prepares starting point to traverse hash table */
HE* hv_iternext(HV*);
/* Get the next entry, and return a pointer to a
structure that has both the key and value */
char* hv_iterkey(HE* entry, I32* retlen);
/* Get the key from an HE structure and also return
the length of the key string */
SV* hv_iterval(HV*, HE* entry);
/* Return an SV pointer to the value of the HE
structure */
SV* hv_iternextsv(HV*, char** key, I32* retlen);
/* This convenience routine combines hv_iternext,
hv_iterkey, and hv_iterval. The key and retlen
arguments are return values for the key and its
length. The value is returned as the SV* result */
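The init-then-next traversal pattern described above can be sketched in plain C. The C<toy_> names are hypothetical, and a single linked list stands in for the hash's internal storage, so this shows only the shape of the loop and not Perl's actual data layout.

```c
#include <assert.h>
#include <stddef.h>

typedef struct toy_he {
    const char *key;
    int val;
    struct toy_he *next;
} toy_he;

typedef struct { toy_he *head; toy_he *iter; } toy_hv;

/* Reset the traversal to the first entry. */
void toy_iterinit(toy_hv *hv)
{
    assert(hv != NULL);
    hv->iter = hv->head;
}

/* Return the current entry and advance, or NULL when exhausted. */
toy_he *toy_iternext(toy_hv *hv)
{
    toy_he *he = hv->iter;
    if (he != NULL)
        hv->iter = he->next;
    return he;
}

/* Demo: sum the values of a three-entry list. */
int toy_iter_demo(void)
{
    static toy_he c = { "c", 3, NULL };
    static toy_he b = { "b", 2, &c };
    static toy_he a = { "a", 1, &b };
    toy_hv hv = { &a, NULL };
    toy_he *he;
    int sum = 0;
    toy_iterinit(&hv);
    while ((he = toy_iternext(&hv)) != NULL)
        sum += he->val;
    return sum;
}
```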
If you know the name of a hash variable, you can get a pointer to its HV
by using the following: