{smcl}
{* 31mar2005}{...}
{cmd:help m1 interactive}
{hline}
{* index interactive use}{...}

{title:Title}

{p 4 4 2}
{bf:[M-1] interactive -- Using Mata interactively}


{title:Description}

{p 4 4 2}
With Mata, you simply type matrix formulas to obtain the desired
results.  Below we provide guidelines for doing this with statistical 
formulas.


{title:Remarks}

{p 4 4 2}
You have data and statistical formulas that you wish to calculate, such as
{bind:{bf:b} = ({bf:X}'{bf:X})^(-1){bf:X}'{bf:y}}.  Perform the following nine
steps:

{p 8 12 2}
    1.  Start in Stata.  Load the data.

{p 8 12 2}
    2.  If you are doing time-series analysis, generate new variables
        containing any {it:op}{cmd:.}{it:varname} variables you need, such as
        {cmd:l.gnp}, {cmd:d.r}, etc.

{p 8 12 2}
    3.  Create a constant variable (. {cmd:gen} {cmd:cons} {cmd:=} {cmd:1}).
        In most statistical formulas, you will find it useful.

{p 8 12 2}
    4.  Drop variables that you will not need.
        This saves memory and makes some things easier because you can 
        just refer to all the variables.

{p 8 12 2}
    5.  Drop observations with missing values.  
        Mata understands missing values, but Mata is a matrix language, not a
        statistical system, so Mata does not always ignore observations with
        missing values.

{p 8 12 2}
    6.  Put variables on roughly the same numeric scale.
        This is optional, but we recommend it.  We explain what we mean and
        how to do this below.

{p 8 12 2}
    7.  Enter Mata.
        Do that by typing {cmd:mata} at the Stata command prompt.  Do not type
        a colon after the {cmd:mata}.  This way, when you make a mistake, you
        will stay in Mata.

{p 8 12 2}
    8.  Use Mata's {bf:{help mf_st_view:[M-5] st_view()}} function to create 
        matrices based on your Stata dataset.  Create all the matrices you want 
        or find convenient.  The matrices created by {cmd:st_view()} 
        are in fact views onto a single copy of the data.

{p 8 12 2}
    9.  Perform your matrix calculations.
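
{p 4 4 2}
As a rough sketch of how the steps fit together, a condensed session for the
regression of {cmd:mpg} on {cmd:weight} and {cmd:foreign} in the auto data
might look as follows.  Steps 2, 5, and 6 are skipped here on the assumption
that the data contain no time-series variables or missing values and that 
scaling is not needed.  The names {cmd:y}, {cmd:X}, {cmd:b}, {cmd:e},
{cmd:s2}, and {cmd:V} are illustrative choices, and the use of
{cmd:invsym()} for the inverse is an assumption of this sketch.

	. {cmd:sysuse auto, clear}
	. {cmd:gen cons = 1}
	. {cmd:keep mpg weight foreign cons}
	. {cmd:mata}
	: {cmd:// y, X, b, e, s2, V are illustrative names, not required ones}
	: {cmd:st_view(y=., ., "mpg")}
	: {cmd:st_view(X=., ., ("weight", "foreign", "cons"))}
	: {cmd:b = invsym(X'X)*X'y}
	: {cmd:e = y - X*b}
	: {cmd:s2 = (e'e)/(rows(X)-cols(X))}
	: {cmd:V = s2*invsym(X'X)}
	: {cmd:end}

{p 4 4 2}
Typing {cmd:b} or {cmd:V} by itself at the colon prompt displays the result.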


{title:1.  Start in Stata; load the data}

{p 4 4 2}
We will use the auto data and will fit the regression 

		{cmd:mpg}_{it:j} = {it:b0} + {it:b1}*{cmd:weight}_{it:j} + {it:b2}*{cmd:foreign}_{it:j} + {it:e}_{it:j}

{p 4 4 2}
using the formulas

	        {bf:b} = ({bf:X}'{bf:X})^(-1){bf:X}'{bf:y}

	        {bf:V} = {it:s}^2*({bf:X}'{bf:X})^(-1)

	where

              {it:s}^2 = {bf:e}'{bf:e}/({it:n}-{it:k})

		{bf:e} = {bf:y} - {bf:X}*{bf:b}

		{it:n} = rows({bf:X})

		{it:k} = cols({bf:X})

{p 4 4 2}
We begin by typing

	. {cmd:sysuse auto}
	(1978 Automobile data)


{title:2.  Create any time-series variables}

{p 4 4 2}
We do not have any time-series variables but, just for a minute, let's 
pretend we did.  If our model contained 
lagged {cmd:gnp}, we would type

	. {cmd:gen lgnp = l.gnp}

{p 4 4 2}
so that we would have a new variable {cmd:lgnp} that we would use in place 
of {cmd:l.gnp} in the subsequent steps.
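
{p 4 4 2}
Time-series operators work only after the data have been declared to be 
time series with {cmd:tsset}.  A minimal sketch, assuming a hypothetical 
time variable named {cmd:year} and the variables {cmd:gnp} and {cmd:r}
mentioned earlier (none of which are in the auto data), might be

	. {cmd:* year, gnp, and r are hypothetical variables}
	. {cmd:tsset year}
	. {cmd:gen lgnp = l.gnp}
	. {cmd:gen dr = d.r}

{p 4 4 2}
The new variables are missing in the first observation, where no lagged value
exists, which is one reason the later step of dropping observations with
missing values matters.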


{title:3.  Create a constant variable}

	. {cmd:gen cons = 1}


{title:4.  Drop unnecessary variables}

{p 4 4 2}
We will need the variables {cmd:mpg}, {cmd:weight}, {cmd:foreign}, and
{cmd:cons}, so it is easier for us to type {cmd:keep} instead of {cmd:drop}:

	. {cmd:keep mpg weight foreign cons}


{title:5.  Drop observations with missing values}

{p 4 4 2}
We do not have any missing values in our data, but let's pretend we did, 
or let's pretend we are uncertain.  Here is an easy trick for getting 
rid of observations with missing values:

	. {cmd:regress mpg weight foreign cons}
	{it:(output omitted)}

	. {cmd:keep if e(sample)}

{p 4 4 2}
We estimated a regression using all the variables and then 
kept the observations {cmd:regress} chose to use.  It does not 
matter which variable you choose as the dependent variable, nor does the 
order of the independent variables matter, so we 
could just as well have typed

	. {cmd:regress weight mpg foreign cons}
	{it:(output omitted)}

	. {cmd:keep if e(sample)}

{p 4 4 2}
or even 

	. {cmd:regress cons mpg weight foreign}
	{it:(output omitted)}

	. {cmd:keep if e(sample)}

{p 4 4 2}
The output produced by {cmd:regress} is irrelevant, even if some variables are
dropped.  We are merely borrowing {cmd:regress}'s ability to identify the
subsample with no missing values.

{p 4 4 2}
Using {cmd:regress} causes Stata to make a lot of unnecessary calculations and,
if that offends you, here is a more sophisticated alternative:

	. {cmd:local 0 "mpg weight foreign cons"}

	. {cmd:syntax varlist}

	. {cmd:marksample touse}

	. {cmd:keep if `touse'}

	. {cmd:drop `touse'}

{p 4 4 2}
Using {cmd:regress} is easier.
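
{p 4 4 2}
Another possibility, offered here only as a sketch and not as part of the
procedure above, is to count the missing values in each observation with 
{cmd:egen}'s {cmd:rowmiss()} function and drop the observations for which
the count is nonzero.  The variable name {cmd:nmiss} is a hypothetical 
choice.

	. {cmd:* nmiss is a hypothetical temporary variable}
	. {cmd:egen nmiss = rowmiss(mpg weight foreign cons)}
	. {cmd:drop if nmiss > 0}
	. {cmd:drop nmiss}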


{title:6.  Put variables on roughly the same numeric scale}

{p 4 4 2}
This step is optional, but we recommend it.  You are about to use formulas
that have been derived by people who assumed that the usual rules of arithmetic
hold, such as ({it:a}+{it:b})-{it:c} == {it:a}+({it:b}-{it:c}).  Many of the
standard rules, such as the one shown, are violated when arithmetic is
performed in finite precision, and this leads to roundoff error in the final,
calculated results.
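
{p 4 4 2}
To see the violation for yourself, you might try the rule shown above with 
{it:a} = 1 and {it:b} = {it:c} = 1e16, values chosen here purely for
illustration:

	. {cmd:* a = 1, b = c = 1e16 are illustrative values}
	. {cmd:display (1 + 1e16) - 1e16}
	. {cmd:display 1 + (1e16 - 1e16)}

{p 4 4 2}
In IEEE double precision, 1 + 1e16 cannot be represented exactly and rounds
to 1e16, so the first expression evaluates to 0, whereas the second evaluates
to 1.  The two sides of the rule differ even though they are algebraically
identical.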

{p 4 4 2}
You can obtain a lot of protection by making sure that your variables are on 
roughly the same scale, by which we mean their means and standard deviations
are all roughly equal.  By roughly equal, we mean equal up to a factor of
1,000 or so.  So let's look at our data:

	. {cmd:summarize}

        {txt}    Variable {c |}       Obs        Mean    Std. Dev.       Min        Max
        {hline 13}{c +}{hline 56}
                 mpg {c |}{res}        74     21.2973    5.785503         12         41
              {txt}weight {c |}{res}        74    3019.459    777.1936       1760       4840
             {txt}foreign {c |}{res}        74    .2972973    .4601885          0          1
                {txt}cons {c |}{res}        74           1           0          1          1{txt}

{p 4 4 2}
Nothing we see here bothers us much.  Variable {cmd:weight} is the largest,
with a standard deviation roughly 1,700 times and a mean roughly 10,000 times
those of the smallest variable, {cmd:foreign}.  We would feel comfortable, but only 
barely, ignoring scale differences.  If {cmd:weight} were ten times larger,
we would begin to be concerned, and our concern would grow as {cmd:weight}
grew.

{p 4 4 2}
The easiest way to address our concern is to divide {cmd:weight} so
that, rather than measuring weight in pounds, it measures weight in 
thousands of pounds:

	. {cmd:replace weight = weight / 1000}

	. {cmd:summarize}

        {txt}    Variable {c |}       Obs        Mean    Std. Dev.       Min        Max
        {hline 13}{c +}{hline 56}
                 mpg {c |}{res}        74     21.2973    5.785503         12         41
              {txt}weight {c |}{res}        74    3.019459    .7771936       1.76       4.84
             {txt}foreign {c |}{res}        74    .2972973    .4601885          0          1
                {txt}cons {c |}{res}        74           1           0          1          1{txt}

{p 4 4 2}
What you are supposed to do is make the means and standard deviations of the
variables roughly equal.  If {cmd:weight} had a large mean and reasonable
standard deviation, we would have subtracted, so that we would have had a
variable measuring weight in excess of some number of pounds.  Or we could do
both, subtracting say 2,000 and then dividing by 100, so we would have weight
in excess of 2,000 pounds, measured in 100-pound units.

{p 4 4 2}
Remember, the definition of roughly equal allows lots of leeway, so you 
do not have to give up easy interpretation.
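
{p 4 4 2}
As a sketch of the subtract-and-divide approach, assuming {cmd:weight} is 
still recorded in pounds (that is, before the division by 1,000 shown above),
one could create a rescaled copy.  The name {cmd:weight100} is a
hypothetical choice.

	. {cmd:* weight100 is a hypothetical name; weight is assumed to be in pounds}
	. {cmd:gen weight100 = (weight - 2000)/100}
	. {cmd:summarize weight100}

{p 4 4 2}
The new variable measures weight in excess of 2,000 pounds in 100-pound units,
so a coefficient on it is the effect of an additional 100 pounds.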


{title:7.  Enter Mata}

{p 4 4 2}
We type 

	. {cmd:mata}
        {txt}{hline 38} mata (type {cmd:end} to exit) {hline}
	: {cmd:_}
	
{p 4 4 2}
Note that Mata uses a colon prompt, whereas Stata uses a period.


{title:8.  Use st_view() to access your data}

{p 4 4 2}
Our matrix formulas are 

	        {bf:b} = ({bf:X}'{bf:X})^(-1){bf:X}'{bf:y}

	        {bf:V} = {it:s}^2*({bf:X}'{bf:X})^(-1)

	where

              {it:s}^2 = {bf:e}'{bf:e}/({it:n}-{it:k})

		{bf:e} = {bf:y} - {bf:X}*{bf:b}

		{it:n} = rows({bf:X})

		{it:k} = cols({bf:X})
	        
{p 4 4 2} 
so we are going to need {bf:y} and {bf:X}.  {bf:y} is an {it:n} {it:x} 1
column vector of dependent-variable values, and {bf:X} is an {it:n}
{it:x} {it:k} matrix of the {it:k} independent variables, including the
constant.  Rows are observations; columns are variables.

{p 4 4 2}
We make the vector and matrix as follows: