{smcl}
{* *! version 1.0.0 01jul2005}{...}
{cmd:help split}{right:dialog: {bf:{dialog split}}}
{hline}
{title:Title}
{p2colset 5 18 20 2}{...}
{p2col :{hi:[D] split} {hline 2}}Split string variables into parts{p_end}
{p2colreset}{...}
{title:Syntax}
{p 4 10 2}
{cmd:split}
{it:strvar}
{ifin}
[{cmd:,}
{it:options}]
{synoptset 24 tabbed}{...}
{synopthdr}
{synoptline}
{syntab :Main}
{synopt :{opt g:enerate(stub)}}begin new variable names with {it:stub};
default is {it:strvar}{p_end}
{synopt :{opt p:arse(parse_strings)}}parse on specified strings; default is to
parse on spaces{p_end}
{synopt :{opt l:imit(#)}}create a maximum of {it:#} new variables{p_end}
{synopt :{opt not:rim}}do not trim leading or trailing spaces of original
variable{p_end}
{syntab :Destring}
{synopt :{opt destring}}apply {opt destring} to new string variables, replacing
initial string variables with numeric variables when possible{p_end}
{synopt :{cmdab:i:gnore("}{it:chars}{cmd:")}}remove specified non-numeric
characters{p_end}
{synopt :{opt force}}convert non-numeric strings to missing values{p_end}
{synopt :{opt float}}generate numeric variables as type {opt float}{p_end}
{synopt :{opt percent}}convert percent variables to fractional form{p_end}
{synoptline}
{title:Description}
{pstd}
{opt split} splits the contents of a string variable {it:strvar} into one or
more parts, using one or more {it:parse_strings} (by default, blank spaces),
so that new string variables are generated. Thus {opt split} is useful for
separating "words" or other parts of a string variable. {it:strvar} itself is
not modified.
{title:Options}
{dlgtab:Main}
{phang}
{opt generate(stub)} specifies the beginning characters of the new
variable names so that new variables {it:stub}{cmd:1}, {it:stub}{cmd:2},
etc., are produced. {it:stub} defaults to {it:strvar}.
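{pmore}
As an illustrative sketch, assuming a hypothetical string variable
{cmd:address} that holds email addresses, the following would be expected to
name the new variables {cmd:part1}, {cmd:part2}, etc., rather than
{cmd:address1}, {cmd:address2}, etc. The stub {cmd:part} is arbitrary.
{p 12 16 2}
{cmd:. * address and part are hypothetical names chosen for illustration}{p_end}
{p 12 16 2}
{cmd:. split address, parse(@) generate(part)}{p_end}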
{phang}
{opt parse(parse_strings)} specifies that, instead of using spaces,
parsing use one or more {it:parse_strings}. Most commonly,
one string that is a single punctuation character will be specified. For
example, if {cmd:parse(,)} is specified, then {cmd:{bind:"1,2,3"}} is split
into {cmd:"1"}, {cmd:"2"} and {cmd:"3"}.
{pmore}
You can also specify (1) two or more strings that are alternative
separators of "words" and (2) strings that consist of two or more
characters. Alternative strings should be separated by spaces. Strings
that include spaces should be bound by {cmd:{bind:" "}}. Thus if
{cmd:{bind:parse(, " ")}} is specified, then {cmd:{bind:"1,2 3"}} is also
split into {cmd:"1"}, {cmd:"2"} and {cmd:"3"}. Note particularly the
difference between, say, {cmd:{bind:parse(a b)}} and {cmd:parse(ab)}: with the
first, {cmd:a} and {cmd:b} are both acceptable as separators, while with
the second, only the string {cmd:ab} is acceptable.
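{pmore}
A minimal sketch of that difference, assuming a hypothetical string variable
{cmd:s}; the stubs {cmd:x} and {cmd:y} are also illustrative and are used only
so that the two sets of new variables do not clash:
{p 12 16 2}
{cmd:. * either a or b acts as a separator}{p_end}
{p 12 16 2}
{cmd:. split s, parse(a b) generate(x)}{p_end}
{p 12 16 2}
{cmd:. * only the two-character string ab acts as a separator}{p_end}
{p 12 16 2}
{cmd:. split s, parse(ab) generate(y)}{p_end}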
{phang}
{opt limit(#)} specifies an upper limit to the number of new
variables to be created. Thus {cmd:limit(2)} specifies that, at most, two new
variables be created.
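{pmore}
For instance, assuming a hypothetical string variable {cmd:fullname}, this
sketch would create no more than {cmd:fullname1} and {cmd:fullname2}, however
many spaces a value contains:
{p 12 16 2}
{cmd:. * fullname is a hypothetical variable name}{p_end}
{p 12 16 2}
{cmd:. split fullname, limit(2)}{p_end}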
{phang}
{opt notrim} specifies that the original string variable not be trimmed
of leading and trailing spaces before being parsed. {opt notrim} is not
compatible with parsing on spaces, as the latter implies that spaces
in a string are to be discarded. Either specify a parsing string other than a
space or accept the default, in which the original variable is trimmed.
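{pmore}
A sketch of a compatible combination, assuming a hypothetical comma-separated
string variable {cmd:s} that should not be trimmed before being parsed:
{p 12 16 2}
{cmd:. * s is a hypothetical variable name; the separator is a comma, not a space}{p_end}
{p 12 16 2}
{cmd:. split s, parse(,) notrim}{p_end}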
{dlgtab:Destring}
{phang}
{opt destring} applies {helpb destring} to the new string variables, replacing
the variables initially created as strings by numeric variables where possible.
{phang}
{opt float}, {opt force}, {opt ignore()}, {opt percent};
see {helpb destring}.
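{pmore}
As a sketch of combining these options, assume a hypothetical string variable
{cmd:amounts} that holds semicolon-separated numbers written with commas as
thousands separators, e.g. {cmd:"1,234;5,678"}. Parsing on the semicolon and
ignoring the commas should allow the parts to be converted to numeric:
{p 12 16 2}
{cmd:. * amounts is a hypothetical variable name}{p_end}
{p 12 16 2}
{cmd:. split amounts, parse(;) destring ignore(",")}{p_end}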
{title:Examples}
{phang}
1. Suppose that input is somehow misread as one string variable, say when
you copy and paste into the data editor, but data are space-separated:
{p 12 16 2}
{cmd:. split var1, destring}
{phang}
2. Email addresses split at {cmd:"@"}:
{p 12 16 2}
{cmd:. split address, p(@)}
{phang}
3. Suppose a string variable holds names of legal cases which should be split
into variables for plaintiff and defendant. The separators could be
{cmd:{bind:" V "}}, {cmd:{bind:" V. "}}, {cmd:{bind:" VS "}} and
{cmd:{bind:" VS. "}}. Note particularly the leading and trailing spaces:
parsing on {cmd:"V"} alone, for example, would incorrectly split
{cmd:{bind:"GOLIATH V DAVID"}} at the {cmd:V} in {cmd:DAVID} as well.
{p 12 16 2}
{cmd:. split case, p(" V " " V. " " VS " " VS. ")}
{pmore}Signs of problems would be the creation of more than two
variables and any variable having blank values, so check:
{p 12 16 2}
{cmd:. list case if case2 == ""}
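{pmore}
A complementary sketch of the same check: {cmd:describe} with a wildcard shows
how many new variables were created, and {cmd:count} reports how many
observations lack a second part. Both assume the default stub, so that the new
variables are named {cmd:case1}, {cmd:case2}, etc.
{p 12 16 2}
{cmd:. * more than two new variables would suggest an unexpected extra split}{p_end}
{p 12 16 2}
{cmd:. describe case*}{p_end}
{p 12 16 2}
{cmd:. count if case2 == ""}{p_end}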
{phang}
4. Suppose a string variable holds time of day in the form "hh:mm:ss",
e.g. {cmd:"12:34:56"}.
{p 12 16 2}
{cmd:. split hms, p(:) destring}{p_end}
{p 12 16 2}
{cmd:. gen timeofday = hms1 + hms2/60 + hms3/3600}
{pmore}
Or suppose a string variable holds time of day in the form
{bind:"hh:mm:ss am"} or {bind:"hh:mm:ss pm"},
e.g. {cmd:"06:54:32 am"}, {cmd:"11:22:33 pm"}.
{p 12 16 2}
{cmd:. split hms, p(: " ") destring}{p_end}
{p 12 16 2}
{cmd:. gen timeofday = hms1 + hms2/60 + hms3/3600 + cond(hms4 == "pm",12,0)}
{phang}
5. Suppose a string variable contains fields separated by tabs. For example,
{helpb insheet} leaves tabs unchanged. Knowing that a tab is {cmd:char(9)}, we
can type
{p 12 16 2}
{cmd:. split data, p(`=char(9)') destring}{p_end}
{pmore}
Note that {cmd:p(char(9))} would not work. The argument to {cmd:parse()} is
taken literally, but evaluation of functions on the fly can be forced as part
of macro substitution.
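{pmore}
An equivalent sketch stores the tab character in a local macro first and then
substitutes the macro; the macro name {cmd:tab} is illustrative:
{p 12 16 2}
{cmd:. * tab is a hypothetical macro name; char(9) is evaluated when the macro is defined}{p_end}
{p 12 16 2}
{cmd:. local tab = char(9)}{p_end}
{p 12 16 2}
{cmd:. split data, p(`tab') destring}{p_end}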
{title:Also see}
{psee}
Manual: {bf:[D] split}
{psee}
Online: {helpb destring},
{helpb egen},
{help functions},
{helpb rename},
{helpb separate}
{p_end}