{smcl}
{* 04apr2005}{...}
{cmd:help merge} {right:dialogs:  {bf:{dialog merge}}{space 9}}
{right:{dialog merge_multiple:merge multiple}}
{hline}

{title:Title}

{p2colset 5 18 20 2}{...}
{p2col :{hi:[D] merge} {hline 2}}Merge datasets{p_end}
{p2colreset}{...}


{title:Syntax}

{p 8 15 2}
{opt mer:ge} [{varlist}] {cmd:using} {it:filename} [{it:filename} {cmd:...}]
[{cmd:,} {it:options}]

{synoptset 18 tabbed}{...}
{synopthdr}
{synoptline}
{syntab :Options}
{synopt :{opth keep(varlist)}}keep only the specified variables from data in
{it:filename}{p_end}
{synopt :{opth _merge(newvar)}}{it:newvar} marks source of resulting
observation; default is {opt _merge}{p_end}
{synopt :{opt nol:abel}}do not copy value label definitions from {it:filename}{p_end}
{synopt :{opt nonote:s}}do not copy notes from {it:filename}{p_end}
{synopt :{opt update}}replace missing data in memory with data from
{it:filename}{p_end}
{synopt :{opt replace}}replace nonmissing data in memory with data from
{it:filename}{p_end}
{synopt :{opt nok:eep}}drop observations in using dataset that do not match{p_end}
{synopt :{opt nos:ummary}}drop summary variables when multiple {it:filenames}
are specified{p_end}
{p2coldent :* {opt uniq:ue}}match variables uniquely identify observations in both
the data in memory and {it:filename}{p_end}
{p2coldent :* {opt uniqm:aster}}match variables uniquely identify observations in
memory{p_end}
{p2coldent :* {opt uniqu:sing}}match variables uniquely identify observations in
{it:filename}{p_end}
{p2coldent :* {opt sort}}sort master and using datasets by match
variables before merge if they are not already sorted by the specified match
variables{p_end}
{synoptline}
{p2colreset}{...}
{p 4 6 2}* {opt unique}, {opt uniqmaster}, {opt uniqusing}, and {opt sort}
require match variables to be specified.{p_end}
{p 4 6 2}{opt sort} implies {opt unique}.{p_end}
{p 4 6 2}If {it:filename} is specified without an extension, {cmd:.dta} is
assumed.{p_end}


{title:Description}

{pstd}
{cmd:merge} joins corresponding observations from the dataset currently
in memory (called the {it:master} dataset) with those from Stata-format
datasets stored as {it:filename} (called the {it:using} datasets) into
single observations.  If {it:filename} is specified without an extension,
{cmd:.dta} is assumed.

{pstd}
{cmd:merge} can perform both one-to-one and match merges.


{title:Options}

{dlgtab:Options}

{phang}
{opth keep(varlist)} specifies the variables to be kept from the using data.
If {opt keep()} is not specified, all variables are kept.

{pmore}
The {varlist} in {opt keep(varlist)} differs from standard Stata varlists in
two ways: variable names in {it:varlist} may not be abbreviated, except by the
use of wildcard characters; and you may not refer to a range of variables,
such as price-weight. 

{phang}
{opt _merge(newvar)} specifies the name of the variable to be created that
will mark the source of the resulting observation.  The default is
{cmd:_merge(_merge)}; that is, if you do not specify this option, the new
variable will be named {opt _merge}.

{phang}
{opt nolabel} prevents Stata from copying the value label definitions from the
disk dataset into the dataset in memory.  Even if you do not specify this
option, label definitions from the disk dataset do not replace the label
definitions  already in memory. 

{phang}
{opt nonotes} prevents {help notes} in the using data from being incorporated
into the result.  The default is to incorporate notes from the using data that
do not already appear in the master.

{phang}
{opt update} specifies that the values from the using dataset be retained in
cases where the master dataset contains missing values.  By default, the master
dataset is held inviolate{hline 2}values from the master dataset are retained
when the variables are found in both datasets.

{phang}
{opt replace}, allowed with {opt update} only, specifies that even when the
master dataset contains nonmissing values, they are to be replaced with
corresponding values from the using dataset when the corresponding values are
not equal.  A nonmissing value, however, will never be replaced with a missing
value.

{phang}
{opt nokeep} causes {cmd:merge} to ignore observations in the using dataset
that have no corresponding observation in the master.  The default is to add
these observations to the merged result and mark such observations with
{opt _merge}==2.

{phang}
{opt nosummary} causes {cmd:merge} to drop the summary variables created when
multiple using datasets are specified.

{phang}
{opt unique}, {opt uniqmaster}, and {opt uniqusing} specify that the match
variables in a match-merge uniquely identify the observations.  Match variables
are required with {opt unique}, {opt uniqmaster}, and {opt uniqusing}.

{pmore}
{opt unique} specifies that the match variables uniquely identify the
observations in the master data and in the using data.  For most match-merges,
you should specify {opt unique}.  {cmd:merge} does nothing differently when
you specify the option, unless the assumption you are making is false.  In that
case, an error message is issued, and the data are not merged.

{pmore}
{opt uniqmaster} specifies that the match variables uniquely identify the
observations in memory, the master data, but not necessarily the ones in the
using data.

{pmore}
{opt uniqusing} specifies that the match variables uniquely identify the
observations in the using data, but not necessarily the ones in the master
data.

{pmore}
{opt unique} is thus equivalent to specifying {opt uniqmaster} and
{opt uniqusing}.

{pmore}
Things are more complicated when multiple using datasets are specified.
{opt unique} still means unique in all datasets, and {opt uniqusing} still
means unique in each of the using datasets, just as you would expect, but
{opt uniqmaster} takes on a whole new meaning:  {opt uniqmaster} means unique
in the master and in all using datasets except the last!  That is because what
is being asserted is that the match variables uniquely identify observations
in the master at each step, meaning when the master is merged with the first
using dataset, then when the (new) master (equal to original plus first
using) is merged with the second using dataset, and so on.
In summary, {opt uniqmaster} is simply not useful when multiple 
using datasets are specified.

{pmore}
If none of the three unique options is specified, observations in neither the
master nor the using data are required to be unique, although they could be.
If they are not unique, records that have the same values of the match
variables are joined by observation until all the records on one side or the
other are matched.  After that, the final record on the shorter side is
duplicated as many times as needed to match the remaining records on the
longer side.

{phang}
{opt sort} specifies that the master and using datasets are to be sorted by the
match variables, before the datasets are merged, if they are not already
sorted by them.  Match variables are required with {opt sort}.


{title:Remarks}

{pstd}
{cmd:merge} can perform both one-to-one and match merges.  In either case,
the variable {cmd:_merge} (or the variable specified in {cmd:_merge()} if
provided) is added to the data containing

{center:_merge==1    obs. from master data                            }
{center:_merge==2    obs. from only one using dataset                 }
{center:_merge==3    obs. from at least two datasets, master or using }

{pstd}
{cmd:update} can be used only when there is a single using file.  When
{cmd:update} is specified, the codes for {cmd:_merge} are

{center:_merge==1    obs. from master data                            }
{center:_merge==2    obs. from using data                             }
{center:_merge==3    obs. from both, master agrees with using         }
{center:_merge==4    obs. from both, missing in master updated        }
{center:_merge==5    obs. from both, master disagrees with using      }

{pstd}
When multiple using files are specified, a set of summary variables is created,
as long as {cmd:nosummary} is not used.  These summary variables are named
{cmd:_merge1} (related to the first using dataset), {cmd:_merge2} (related to
the second using dataset), etc. (or, once again, the variable specified in
{cmd:_merge()} if provided, followed by the number of the using file).  These
variables will contain 

{center:_merge{it:k}==0   obs. not present in corresponding using dataset  }
{center:_merge{it:k}==1   obs. present in corresponding using dataset      }

{pstd}
Variable labels identifying the dataset associated with each summary variable
are attached to these summary variables.


{title:Example:  one-to-one merge}

{phang}{cmd:. use ds1}{p_end}
{phang}{cmd:. merge using ds2}

{phang}{cmd:. use ds1}{p_end}
{phang}{cmd:. merge using ds2 ds3 ds4}

{title:Example:  match merge}

{phang}{cmd:. use ds2}{p_end}
{phang}{cmd:. sort recid}{p_end}
{phang}{cmd:. save ds2, replace}{p_end}
{phang}{cmd:. use ds1}{p_end}
{phang}{cmd:. sort recid}{p_end}
{phang}{cmd:. merge recid using ds2}{p_end}
{phang}{cmd:. tabulate _merge}
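
{pstd}
If {cmd:recid} uniquely identifies the observations in both datasets (an
assumption made here, because {opt sort} implies {opt unique}), the explicit
{cmd:sort} and {cmd:save} steps above can be left to the {opt sort} option.
A sketch under that assumption:

{phang}{cmd:. use ds1}{p_end}
{phang}{cmd:. merge recid using ds2, sort}{p_end}
{phang}{cmd:. tabulate _merge}

{pstd}
If {cmd:recid} is not unique in either dataset, an error message is issued
and the data are not merged.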

{phang}{cmd:. use ds2}{p_end}
{phang}{cmd:. sort recid}{p_end}
{phang}{cmd:. save ds2, replace}{p_end}
{phang}{cmd:. use ds3}{p_end}
{phang}{cmd:. sort recid}{p_end}
{phang}{cmd:. save ds3, replace}{p_end}
{phang}{cmd:. use ds4}{p_end}
{phang}{cmd:. sort recid}{p_end}
{phang}{cmd:. save ds4, replace}{p_end}
{phang}{cmd:. use ds1}{p_end}
{phang}{cmd:. sort recid}{p_end}
{phang}{cmd:. merge recid using ds2 ds3 ds4}{p_end}
{phang}{cmd:. tabulate _merge}{p_end}
{phang}{cmd:. list recid _merge*}
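
{pstd}
Options can be combined in a match merge.  As a sketch, assuming {cmd:ds2}
has been saved sorted by {cmd:recid} as above and that {cmd:recid} is unique
in both datasets, the following drops observations found only in the using
data and stores the source codes in {cmd:src} (a name chosen here for
illustration) instead of {cmd:_merge}:

{phang}{cmd:. use ds1}{p_end}
{phang}{cmd:. sort recid}{p_end}
{phang}{cmd:. merge recid using ds2, unique nokeep _merge(src)}{p_end}
{phang}{cmd:. tabulate src}

{pstd}
Because {opt nokeep} is specified, no observation should be coded 2 in
{cmd:src}.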

{title:Example:  update match merge}

{phang}{cmd:. use original, clear}{p_end}
{phang}{cmd:. merge make using updata, update}{p_end}
{phang}{cmd:. tabulate _merge}
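
{pstd}
To overwrite differing nonmissing values in the master as well, {opt replace}
can be added to {opt update}.  A sketch using the same datasets as above,
assuming both are already sorted by {cmd:make}:

{phang}{cmd:. use original, clear}{p_end}
{phang}{cmd:. merge make using updata, update replace}{p_end}
{phang}{cmd:. tabulate _merge}{p_end}
{phang}{cmd:. list make if _merge==5}

{pstd}
The last command lists the observations in which the master disagreed with
the using data.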


{title:Also see}

{psee}
Manual:  {bf:[D] merge}

{psee}
Online:  {helpb append}, {helpb cross}, {helpb joinby},
{helpb save}, {helpb sort}
