{smcl}
{* 08feb2005}{...}
{cmd:help byprog}, {cmd:help byable}
{hline}
{title:Title}
{p2colset 5 19 21 2}{...}
{p2col :{hi:[P] byable} {hline 2}}Make programs byable{p_end}
{p2colreset}{...}
{title:Syntax}
{p 8 16 2}
{cmdab:pr:ogram} {cmdab:de:fine} {it:...} [{cmd:,} {it:...}
{cmdab:by:able:(}{c -(}{cmdab:r:ecall}[{cmd:,}{cmdab:noh:eader}] |
{cmdab:o:necall}{c )-}{cmd:)} {cmdab:sort:preserve} {it:...} ]
{title:Description}
{pstd}
Most Stata commands allow the use of the {cmd:by} prefix; see {helpb by}.
For example, the syntax diagram for the {cmd:regress} command could be
presented as
{phang2}[{cmd:by} {it:varlist}{cmd::}] {cmdab:reg:ress} {it:...}
{pstd}
This entry discusses how to write programs (ado-files) so that the program
can be used with the {cmd:by} prefix.
{title:Options}
{phang}
{cmd:byable(}{c -(}{cmd:recall}[{cmd:,noheader}] | {cmd:onecall}{c )-}{cmd:)} specifies that the program is to allow the {cmd:by} prefix to be used with it and specifies the style in which the program is coded.
{pmore}
There are two supported styles, known as {cmd:byable(recall)} and
{cmd:byable(onecall)}. {cmd:byable(recall)} programs are usually, though not
always, easier to write, and {cmd:byable(onecall)} programs are usually,
though not always, faster.
{pmore}
{cmd:byable(recall)} programs are executed repeatedly, once per by-group.
{cmd:byable(onecall)} programs are executed only once, and it is the
program's responsibility to handle the implications of the {cmd:by} prefix if
it is specified.
{pmore}
If you wrote program {it:myprog} in the {cmd:byable(recall)} style,
then were the user to type
{phang3}{cmd:. by pid : myprog} {it:...}
{pmore}
{it:myprog} would be executed repeatedly just as if the user had typed
{phang3}{cmd:. myprog} {it:...} {cmd:if pid==1}{p_end}
{phang3}{cmd:. myprog} {it:...} {cmd:if pid==2}{p_end}
{phang3}{cmd:.} {it:etc...}{p_end}
{pmore}
except that an {cmd:if} condition is not used to communicate to which
subsample {it:myprog} should restrict its calculations. Rather, the sample is
automatically restricted to the appropriate subsample when {it:myprog} uses
the {cmd:mark} or {cmd:marksample} commands; see {helpb mark}.
{pmore}
In addition, the following local macros are defined:
{p 12 26 2}{hi:`_byindex'}{space 4}contains the name of a temporary variable
containing 1, 2, ... denoting the by-groups
{p 12 26 2}{hi:`_byvars'}{space 5}contains the names of the actual by-variables
{p 12 26 2}{hi:`_byrc0'}{space 6}contains ", rc0" if the user specified
{cmd:by} {it:...}{cmd::} with the {cmd:rc0} option and contains nothing
otherwise.
{pmore}
and the following functions are also available for use in expressions:
{p 12 26 2}{cmd:_by()}{space 9}returns 1 if {cmd:by} {it:...}{cmd::} was
specified, and 0 otherwise.
{p 12 26 2}{cmd:_byindex()}{space 4}returns 1, 2, ..., reflecting the by-group
currently being executed; returns 1 if {cmd:_by()}==0.
{p 12 26 2}{cmd:_bylastcall()}{space 1}returns 1 if this is the last by-group
and 0 otherwise; returns 1 if {cmd:_by()}==0.
{pmore}
Thus, the by-group being executed can be obtained by restricting
calculations to the subsample {cmd:`_byindex'==_byindex()}, but that is not
how it is usually done. Instead, the program uses {cmd:mark} or
{cmd:marksample} because there may be other restrictions that apply as well
and {cmd:mark} and {cmd:marksample} will consider all of them.
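{pmore}
As a minimal sketch of the direct approach (the program name {it:mycount} is
hypothetical and not part of Stata), a {cmd:byable(recall)} program could
count the observations in the current by-group itself. The {cmd:_by()} guard
reflects an assumption of this sketch: that {cmd:`_byindex'} should not be
relied upon when no {cmd:by} prefix was specified.{p_end}

            {cmd:program define mycount, byable(recall)}
            {cmd:    if _by() {c -(}}
            {cmd:        * restrict to the by-group currently being executed}
            {cmd:        quietly count if `_byindex' == _byindex()}
            {cmd:    {c )-}}
            {cmd:    else {c -(}}
            {cmd:        * no by prefix: use every observation}
            {cmd:        quietly count}
            {cmd:    {c )-}}
            {cmd:    display as text "observations: " as result r(N)}
            {cmd:end}

{pmore}
This sketch ignores any {cmd:if} or {cmd:in} restriction, which is one reason
{cmd:marksample} is normally the better choice.{p_end}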
{pmore}
{cmd:byable(recall,noheader)} programs are distinguished from
{cmd:byable(recall)} programs in that {cmd:by} will not display a by-group
header before each call to the program.
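{pmore}
The following sketch suggests how {cmd:noheader} might be combined with the
{cmd:_byindex()} and {cmd:_bylastcall()} functions to produce one compact
table instead of a header per by-group. The program name {it:mymean} and the
layout of the output are illustrative assumptions only.{p_end}

            {cmd:program define mymean, byable(recall, noheader)}
            {cmd:    syntax varname [if] [in]}
            {cmd:    marksample touse}
            {cmd:    * print the column titles once, on the first call}
            {cmd:    if _byindex() == 1 {c -(}}
            {cmd:        display as text "group" _col(12) "mean"}
            {cmd:    {c )-}}
            {cmd:    quietly summarize `varlist' if `touse', meanonly}
            {cmd:    display as result %5.0f _byindex() _col(12) %9.0g r(mean)}
            {cmd:    * print a closing line once, on the last call}
            {cmd:    if _bylastcall() {c -(}}
            {cmd:        display as text "(end of table)"}
            {cmd:    {c )-}}
            {cmd:end}

{pmore}
Because {cmd:_byindex()} and {cmd:_bylastcall()} both return 1 when
{cmd:_by()}==0, the same code prints a one-row table when the program is run
without the {cmd:by} prefix.{p_end}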
{pmore}
{cmd:byable(onecall)} programs are required to handle the {cmd:by}
{it:...}{cmd::} prefix themselves, including displaying the header should they
wish that. See {hi:[P] byable} for details.
{phang}
{cmd:sortpreserve} specifies that the program, during its execution,
will re-sort the data and that Stata itself should therefore take action to
preserve the order of the data so that the order can be reestablished
afterward.
{pmore}
{cmd:sortpreserve} is in fact independent of whether a program is
{cmd:byable()} but {cmd:byable()} programs often specify this option.
{pmore}
Pretend you are writing the program {it:myprog} and that, in performing
its calculations, it needs to sort the data. It is very jolting for a user to
experience,
{phang3}{cmd:. by pid: myprog} {it:...}
{phang3}{cmd:. by pid: sum newvar}{p_end}
{err:not sorted}
{search r(5):r(5);}
{pmore}
Specifying {cmd:sortpreserve} will prevent this and still allow {it:myprog}
to sort the data freely. {cmd:byable()} programs that sort the data should
specify {cmd:sortpreserve}. It is not necessary to specify
{cmd:sortpreserve} if your program does not change the sort order of the
data and, in that case, the program runs slightly faster if you do not
specify {cmd:sortpreserve}.
{pmore}
{cmd:sortpreserve} takes time, although less than you might suspect.
{cmd:sortpreserve} does not actually have to re-sort the data at the
conclusion of your program, which would be an O(n ln n) operation. Instead, it
arranges things so that it can reassert the original order of the data in
O(n) time, and it is, in fact, very quick about it. Nonetheless, there is no
reason to waste the time if the data never got out of order.
{pmore}
Concerning sort order, when your {cmd:byable()} program is invoked for
the first time, the data will be sorted on the variables in {cmd:`_byvars'}
but, in subsequent calls (in the case of {cmd:byable(recall)} programs), the
sort order will be just as your program leaves it, even if you specify
{cmd:sortpreserve}. {cmd:sortpreserve} restores the original order only after
your program has been called for the last time.
{title:Example 1:}
        {cmd:program define myprog1, byable(recall)}
        {cmd:    syntax [varlist] [if] [in]}
        {cmd:    marksample touse}
        {cmd:    summarize `varlist' if `touse'}
        {cmd:end}
{pstd}
In the above program, it would be a mistake to code it
        {cmd:program define myprog1, byable(recall)}
        {cmd:    syntax [varlist] [if] [in]}
        {cmd:    summarize `varlist' `if' `in'}
        {cmd:end}
{pstd}
because in that case, the sample would not be restricted to the appropriate
by-group when the user specified the {cmd:by} {it:...}{cmd::} prefix.
{cmd:marksample}, however, knows when a program is being by'd and so will set
the {cmd:touse} variable to reflect whatever restrictions the user specified
and the by-group restriction.
{pstd}
{cmd:syntax}, too, knows about {cmd:by} and it will automatically issue an
error message when the user specifies {cmd:by} {it:...}{cmd::} and an {cmd:in}
{it:range} together even though {cmd:in} {it:range} will be allowed when not
combined with {cmd:by}.
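{pstd}
As an illustrative session, assuming the {cmd:auto} dataset shipped with Stata
is available and that {cmd:myprog1} has been defined as in the first version
above, the behavior just described could be checked as follows:{p_end}

        {cmd:. sysuse auto, clear}
        {cmd:. sort foreign}
        {cmd:. myprog1 price in 1/10}
        {cmd:. by foreign: myprog1 price if mpg > 20}
        {cmd:. by foreign: myprog1 price in 1/10}

{pstd}
The first two calls to {cmd:myprog1} should run normally. The last one
combines {cmd:by} with {cmd:in} {it:range}, so {cmd:syntax} is expected to
reject it with an error before {cmd:summarize} is ever reached.{p_end}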
{title:Example 2:}
        {cmd:program define myprog2, byable(recall) sortpreserve}
        {cmd:    syntax varname [if] [in]}
        {cmd:    marksample touse}
        {cmd:    sort `touse' `varlist'}
        {cmd:    }{it:...}
        {cmd:end}
{pstd}
This program specifies {cmd:sortpreserve} because it changes the sort order
of the data in order to make its calculations.
{title:Example 3:}
        {cmd:program define myprog3, byable(onecall) sortpreserve}
        {cmd:    syntax newvarname =exp [if] [in]}
        {cmd:    marksample touse}
        {cmd:    tempvar rhs}
        {cmd:    quietly {c -(}}
        {cmd:        gen double `rhs' `exp' if `touse'}
        {cmd:        sort `touse' `_byvars' `rhs'}
        {cmd:        by `touse' `_byvars': gen `typlist' `varlist' = /*}
        {cmd:            */ `rhs' - `rhs'[_n-1] if `touse'}
        {cmd:    {c )-}}
        {cmd:end}
{pstd}
This program specifies {cmd:sortpreserve} because it changes the sort order
of the data.
{pstd}
In addition, this program is {cmd:byable(onecall)} and, were we to change
{cmd:byable(onecall)} to {cmd:byable(recall)}, we would break the program.
This program creates a new variable and a variable can only be {cmd:generate}d
once; after that we would have to use {cmd:replace}.{p_end}
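{pstd}
A possible session using {cmd:myprog3}, again assuming the {cmd:auto} dataset
shipped with Stata (the new variable name {cmd:dprice} is hypothetical):{p_end}

        {cmd:. sysuse auto, clear}
        {cmd:. sort foreign}
        {cmd:. by foreign: myprog3 dprice = price}
        {cmd:. by foreign: summarize dprice}

{pstd}
Because the program is {cmd:byable(onecall)}, it runs once for the whole
dataset and handles the by-groups itself through {cmd:`_byvars'}, so
{cmd:dprice} is generated only once. Because it also specifies
{cmd:sortpreserve}, the data should be back in their original order afterward,
and the final {cmd:by foreign:} command should not fail with
{err:not sorted}.{p_end}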
{title:Also see}
{psee}
Manual: {bf:[P] byable}
{psee}
Online: {helpb by}, {helpb program}
{p_end}