{smcl}
{* 28mar2005}{...}
{cmd:help mata st_view()}
{hline}
{* index view matrix}{...}
{* index st_view()}{...}
{* index data matrix}{...}
{title:Title}
{p 4 4 2}
{bf:[M-5] st_view() -- Make matrix that is a view onto current Stata dataset}
{title:Syntax}
{p 8 12 2}
{it:void}
{cmd:st_view(}{it:V}{cmd:,}
{it:real matrix i}{cmd:,}
{it:rowvector j}{cmd:)}
{p 8 12 2}
{it:void}
{cmd:st_view(}{it:V}{cmd:,}
{it:real matrix i}{cmd:,}
{it:rowvector j}{cmd:,}
{it:scalar selectvar}{cmd:)}
{p 8 12 2}
{it:void}
{cmd:st_sview(}{it:V}{cmd:,}
{it:real matrix i}{cmd:,}
{it:rowvector j}{cmd:)}
{p 8 12 2}
{it:void}
{cmd:st_sview(}{it:V}{cmd:,}
{it:real matrix i}{cmd:,}
{it:rowvector j}{cmd:,}
{it:scalar selectvar}{cmd:)}
{p 4 4 2}
where
{p 7 11 2}
1. The type of {it:V} does not matter; it is replaced.
{p 7 11 2}
2. {it:i} may be specified in the same way as with
{bf:{help mf_st_data:st_data()}}.
{p 7 11 2}
3. {it:j} may be specified in the same way as with
{bf:{help mf_st_data:st_data()}} except that time-series operators
may not be specified. If time-series operated variables are needed,
see {bf:{help mf_st_tsrevar:st_tsrevar()}}.
{p 7 11 2}
4. {it:selectvar} may be specified in the same way as with
{bf:{help mf_st_data:st_data()}}.
{title:Description}
{p 4 4 2}
{cmd:st_view()} and {cmd:st_sview()} create a matrix that is a view
onto the current Stata dataset.
{title:Remarks}
{p 4 4 2}
Remarks are presented under the headings
{bf:Overview}
{bf:Advantages and disadvantages of views}
{bf:When not to use views}
{bf:Cautions when using views 1: conserving memory}
{bf:Cautions when using views 2: assignment}
{bf:Efficiency}
{title:Overview}
{p 4 4 2}
{cmd:st_view()} does the same thing as {cmd:st_data()} -- and {cmd:st_sview()}
does the same thing as {cmd:st_sdata()} -- except that, rather than returning
a copy of the underlying values, {cmd:st_view()} and {cmd:st_sview()} create a
matrix that is a view onto the Stata dataset itself.
{p 4 4 2}
To understand the distinction, consider
{cmd:X = st_data(., ("mpg", "displ", "weight"))}
{p 4 4 2}
and
{cmd:st_view(X, ., ("mpg", "displ", "weight"))}
{p 4 4 2}
Both commands fill in matrix {cmd:X} with the same data. However,
were you to code
{cmd:X[2,1] = 123}
{p 4 4 2}
after the {cmd:st_data()} setup, you would change the value in the matrix
{cmd:X}, but the Stata dataset would remain unchanged. After the
{cmd:st_view()} setup, changing the value in the matrix would cause the
value of {cmd:mpg} in the 2nd observation to change to 123.
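{p 4 4 2}
As a minimal sketch of the distinction, the following do-file fragment
assumes that Stata's shipped {cmd:auto} dataset is available (an assumption
made here for illustration only). {cmd:_st_data()} reads the stored value
straight from the dataset after each assignment.
{p_end}

        {cmd:sysuse auto, clear}
        {cmd:mata:}
        {cmd:// assumption: auto.dta is in memory; "displ" abbreviates displacement}
        {cmd:X = st_data(., ("mpg", "displ", "weight"))}
        {cmd:X[2,1] = 123}
        {cmd:_st_data(2, st_varindex("mpg"))     // unchanged, because X is a copy}
        {cmd:st_view(V, ., ("mpg", "displ", "weight"))}
        {cmd:V[2,1] = 123}
        {cmd:_st_data(2, st_varindex("mpg"))     // now 123, because V is a view}
        {cmd:end}
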
{title:Advantages and disadvantages of views}
{p 4 4 2}
Views make it easy to change the dataset, and that can be
an advantage or a disadvantage, depending on your desires.
{p 4 4 2}
Putting that aside,
views are, in general, better than copies because (1) they take less time
to set up and (2) they consume less memory.
Reason (2) is the important one. Consider a 100,000-observation dataset on
30 variables. Coding
{cmd:X = st_data(., .)}
{p 4 4 2}
creates a new matrix that is 24 megabytes in size. Meanwhile, the total
storage requirement for
{cmd:st_view(X, ., .)}
{p 4 4 2}
is roughly 128 bytes!
{p 4 4 2}
There is a cost: when you use the matrix {cmd:X}, it takes your computer
longer to access the individual elements. You have to do a lot of calculation
with {cmd:X}, however, before that extra time equals the initial savings
in setup time, and even then, the extra time is probably worth it to save
the extra memory.
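{p 4 4 2}
The 24-megabyte figure can be checked by hand, given that each element of a
real Mata matrix is stored as an 8-byte double:
{p_end}

        {cmd:100000 * 30 * 8     // observations x variables x bytes = 24,000,000 bytes}
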
{title:When not to use views}
{p 4 4 2}
Do not use views as a substitute for scalars. If you are going to loop
through the data an observation at a time, and if every usage you will
make of {it:X} is in scalar calculations, use {cmd:_st_data()}. There is
nothing faster for that problem.
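{p 4 4 2}
A minimal sketch of such an observation-at-a-time loop follows. The variable
name {cmd:mpg} and the accumulator {cmd:total} are illustrative assumptions,
not part of the function's definition. Note that a missing value in any
observation would make {cmd:total} missing.
{p_end}

        {cmd:j = st_varindex("mpg")          // assumption: a variable named mpg exists}
        {cmd:total = 0}
        {cmd:for (i=1; i<=st_nobs(); i++) {c -(}}
        {cmd:        total = total + _st_data(i, j)     // one scalar fetched per observation}
        {cmd:{c )-}}
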
{p 4 4 2}
Putting aside that extreme, views become more efficient relative to
copies the larger they are, which is to say, it is more efficient to
use {cmd:st_data()} for small amounts of data, especially if you are
going to make computationally intensive calculations with it.
{title:Cautions when using views 1: conserving memory}
{p 4 4 2}
If you are using views, it is probably because you are concerned about
memory, and if you are, you want to be careful to avoid making copies of
views. Copies of views are not views; they are copies. For instance,
{cmd:st_view(V, ., .)}
{cmd:Y = V}
{p 4 4 2}
That innocuous-looking {cmd:Y = V} just made a copy of the entire dataset,
meaning that if the dataset had 100,000 observations on 30 variables,
{cmd:Y} now consumes 24 megabytes. Coding {cmd:Y = V} is okay in
some circumstances, but in general, it is better to set up another view.
{p 4 4 2}
Similarly, watch out for subscripts. Consider the following code fragment
{cmd:st_view(V, ., .)}
{cmd:for (i=1; i<=cols(V); i++) {c -(}}
{cmd:sum = colsum(V[,i])}
...
{cmd:{c )-}}
{p 4 4 2}
The problem in the above code is the {cmd:V[,i]}. That creates a new
column vector containing the values from the {cmd:i}th column of {cmd:V}.
Given 100,000 observations, that new column vector needs 800 kilobytes of memory.
It would be better to code
{cmd:for (i=1; i<=cols(V); i++) {c -(}}
{cmd:st_view(v, ., i)}
{cmd:sum = colsum(v)}
...
{cmd:{c )-}}
{p 4 4 2}
If you also need {cmd:V}, that is okay. You can have many views of the
data set up simultaneously.
{p 4 4 2}
Similarly, be careful using views with operators. {cmd:X'X} makes a
copy of {cmd:X} in the process of creating the transpose.
Use functions such as {bf:{help mf_cross:cross()}}
designed to minimize the use of memory.
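{p 4 4 2}
For instance, a cross-product matrix might be formed from a view as sketched
below. The variable names are illustrative assumptions. Because
{cmd:cross()} omits rows that contain missing values, the result matches
{cmd:X'X} only when the selected data contain no missing values.
{p_end}

        {cmd:st_view(X, ., ("mpg", "weight"))    // assumption: these variables exist}
        {cmd:XX = cross(X, X)                    // X'X without an explicit transposed copy}
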
{p 4 4 2}
That said, do not be overly concerned about this issue. Making a copy of
a column of a view amounts to the same thing as introducing a temporary
variable in a Stata program -- something that is done all the time.
{title:Cautions when using views 2: assignment}
{p 4 4 2}
As mentioned earlier, the ability to assign to a view and so change the
underlying data can be either convenient or dangerous, depending on your
goals.
{p 4 4 2}
When making such assignments, there are two things of which you need to be
aware.
{p 4 4 2}
The first is more of a Stata issue than it is a Mata issue. Assignment
does not cause promotion. Coding
{cmd:V[1,2] = 4059.125}
{p 4 4 2}
might store 4059.125 in the first observation of the second variable of the
view. Or, if that second variable is an {cmd:int}, what will be stored is
4059, or if it is a {cmd:byte}, what will be stored is missing.
{p 4 4 2}
The second caution is a Mata issue. To reassign all the values of the
view, code
{cmd:V[.,.] = }{it:matrix_expression}
{p 4 4 2}
Do not code
{cmd:V = }{it:matrix_expression}
{p 4 4 2}
The second expression does not assign to the underlying dataset; it
redefines {cmd:V} to be a regular matrix.
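{p 4 4 2}
A minimal sketch of the two forms, assuming that a variable {cmd:mpg} exists
and that the name {cmd:mpg2} is unused (both are hypothetical choices made
for illustration). The views are set up after the new variable is added.
{p_end}

        {cmd:idx = st_addvar("double", "mpg2")   // hypothetical new variable}
        {cmd:st_view(M, ., "mpg")}
        {cmd:st_view(Z, ., idx)}
        {cmd:Z[.,.] = M:^2       // stores the squares in mpg2 in the dataset}
        {cmd:Z = M:^2            // Z becomes a regular matrix; the dataset is not touched}
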
{title:Efficiency}
{p 4 4 2}
Whenever possible,
specify argument {it:i} of
{cmd:st_view(}{it:V}{cmd:,} {it:i}{cmd:,} {it:j}{cmd:)}
and
{cmd:st_sview(}{it:V}{cmd:,} {it:i}{cmd:,} {it:j}{cmd:)}
as {cmd:.} (missing value) or as a rowvector range (e.g.,
{cmd:(}{it:i1}{cmd:,}{it:i2}{cmd:)}) rather than as a
colvector list.
{p 4 4 2}
Specify argument {it:j} as a real rowvector rather than as a string
rowvector whenever
{cmd:st_view()} and
{cmd:st_sview()} are used inside of loops with the same variables.
This prevents Mata from having to look up the names over and over again.
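{p 4 4 2}
Both recommendations can be sketched together as follows. The variable names,
the block size of 1,000 observations, and the accumulator {cmd:total} are
illustrative assumptions.
{p_end}

        {cmd:idx = st_varindex(("mpg", "weight"))   // look up the names once, outside the loop}
        {cmd:n = st_nobs()}
        {cmd:total = J(1, cols(idx), 0)}
        {cmd:for (i1=1; i1<=n; i1=i1+1000) {c -(}}
        {cmd:        i2 = min((i1+999, n))}
        {cmd:        st_view(V, (i1, i2), idx)      // rowvector range for i, real indices for j}
        {cmd:        total = total + colsum(V)}
        {cmd:{c )-}}
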
{title:Conformability}
{p 4 4 2}
{cmd:st_view(}{it:V}{cmd:,} {it:i}{cmd:,} {it:j}{cmd:)},
{cmd:st_sview(}{it:V}{cmd:,} {it:i}{cmd:,} {it:j}{cmd:)}:
{p_end}
{it:input:}
{it:i}: {it:n x} 1 or {it:n2 x} 2
{it:j}: 1 {it:x k}
{it:output:}
{it:V}: {it:n x k}
{p 4 4 2}
{cmd:st_view(}{it:V}{cmd:,} {it:i}{cmd:,} {it:j}{cmd:,} {it:selectvar}{cmd:)},
{cmd:st_sview(}{it:V}{cmd:,} {it:i}{cmd:,} {it:j}{cmd:,} {it:selectvar}{cmd:)}:
{p_end}
{it:input:}
{it:i}: {it:n x} 1 or {it:n2 x} 2
{it:j}: 1 {it:x k}
{it:selectvar}: 1 {it:x} 1
{it:output:}
{it:V}: ({it:n}-{it:e}) {it:x k}, where {it:e} is number of
observations excluded by {it:selectvar}
{title:Diagnostics}
{p 4 4 2}
{cmd:st_view(}{it:V}{cmd:,} {it:i}{cmd:,} {it:j}[{cmd:,} {it:selectvar}]{cmd:)}
and
{cmd:st_sview(}{it:V}{cmd:,} {it:i}{cmd:,} {it:j}[{cmd:,} {it:selectvar}]{cmd:)}
abort with error if any element of {it:i} is outside the range of observations
or if a variable name or index recorded in {it:j} is not found.
Note that variable-name abbreviations are allowed. If you do not want
this, use {bf:{help mf_st_varindex:[M-5] st_varindex()}} to translate
variable names into variable indices.
{p 4 4 2}
{cmd:st_view()} and {cmd:st_sview()}
abort with error if any element
of {it:i} is out of range as described under the heading
{it:Details of observation subscripting using st_data() and st_sdata()}
in {bf:{help mf_st_data:[M-5] st_data()}}.
{p 4 4 2}
Some functions do not allow views as arguments. If
{cmd:example(}{it:X}{cmd:)} does not allow views, you can still use it by
coding
... {cmd:example(X=V)} ...
{p 4 4 2}
because that will make a copy of view {cmd:V} in {cmd:X}. Most functions
that do not allow views mention that in their {bf:Diagnostics} section,
but some do not because it was unexpected that anyone would want to use a
view in that case. If a function does not allow a view, you will see
in the traceback log:
: {cmd:myfunction(}...{cmd:)}
{err}example(): 3103 view found where array required
mysub(): - function returned error
myfunction(): - function returned error
<istmt>: - function returned error{txt}
r(3103);
{p 4 4 2}
The above means that function {cmd:example()} does not allow views.
{title:Source code}
{p 4 4 2}
Functions are built-in.
{title:Also see}
{p 4 13 2}
Manual: {hi:[M-5] st_view()}
{p 4 13 2}
Online: help for
{bf:{help mf_st_subview:[M-5] st_subview()}},
{bf:{help mf_st_viewvars:[M-5] st_viewvars()}},
{bf:{help mf_st_data:[M-5] st_data()}};
{bf:{help m4_stata:[M-4] stata}}
{p_end}