{p 4 8 8}
{cmd:graph box bp_before bp_after}{break}
Two boxes, one showing the distribution of blood pressure before, and the other, after.
{p 4 8 8}
{cmd:graph box bp, over(agegrp)}{break}
{it:#_of_agegrp} boxes showing blood pressure for each age group.
{p 4 8 8}
{cmd:graph box bp_before bp_after, over(agegrp)}{break}
2*{it:#_of_agegrp} boxes showing blood pressure, before and after,
for each age group.
The grouping would look like this (assuming 3 age groups):

        {c |}
        {c |}
        {c |}    {c TT}           {c TT}           {c TT}
        {c |}   {c TLC}{c BT}{c TRC}  {c TT}      {c TLC}{c BT}{c TRC}  {c TT}      {c TLC}{c BT}{c TRC}  {c TT}
        {c |}   {c |}-{c |} {c TLC}{c BT}{c TRC}     {c |}-{c |} {c TLC}{c BT}{c TRC}     {c |}-{c |} {c TLC}{c BT}{c TRC}
        {c |}   {c BLC}{c TT}{c BRC} {c |}-{c |}     {c BLC}{c TT}{c BRC} {c |}-{c |}     {c BLC}{c TT}{c BRC} {c |}-{c |}
        {c |}    {c BT}  {c BLC}{c TT}{c BRC}      {c BT}  {c BLC}{c TT}{c BRC}      {c BT}  {c BLC}{c TT}{c BRC}
        {c |}        {c BT}           {c BT}           {c BT}
        {c BLC}{hline 36}
           agegrp 1    agegrp 2    agegrp 3

{p 4 8 8}
{cmd:graph box bp, over(agegrp) over(sex)}{break}
{it:#_of_agegrps}*{it:#_of_sexes} boxes showing blood pressure
for each age group, repeated for each sex. The grouping would
look like this:

        {c |}
        {c |}    {c TT}           {c TT}        {c TT}           {c TT}
        {c |}   {c TLC}{c BT}{c TRC}    {c TT}    {c TLC}{c BT}{c TRC}      {c TLC}{c BT}{c TRC}    {c TT}    {c TLC}{c BT}{c TRC}
        {c |}   {c |}-{c |}   {c TLC}{c BT}{c TRC}   {c |}-{c |}      {c |}-{c |}   {c TLC}{c BT}{c TRC}   {c |}-{c |}
        {c |}   {c BLC}{c TT}{c BRC}   {c |}-{c |}   {c BLC}{c TT}{c BRC}      {c BLC}{c TT}{c BRC}   {c |}-{c |}   {c BLC}{c TT}{c BRC}
        {c |}    {c BT}    {c BLC}{c TT}{c BRC}    {c BT}        {c BT}    {c BLC}{c TT}{c BRC}    {c BT}
        {c |}          {c BT}                    {c BT}
        {c BLC}{hline 43}
           age_1 age_2 age_3    age_1 age_2 age_3
                 males               females

{p 4 8 8}
{cmd:graph box bp, over(sex) over(agegrp)}{break}
Same as above, but ordered differently. In the previous example we
typed {cmd:over(agegrp)} {cmd:over(sex)}. This time, we reverse it:

        {c |}
        {c |}    {c TT}      {c TT}                      {c TT}      {c TT}
        {c |}   {c TLC}{c BT}{c TRC}    {c TLC}{c BT}{c TRC}      {c TT}      {c TT}      {c TLC}{c BT}{c TRC}    {c TLC}{c BT}{c TRC}
        {c |}   {c |}-{c |}    {c |}-{c |}     {c TLC}{c BT}{c TRC}    {c TLC}{c BT}{c TRC}     {c |}-{c |}    {c |}-{c |}
        {c |}   {c BLC}{c TT}{c BRC}    {c BLC}{c TT}{c BRC}     {c |}-{c |}    {c |}-{c |}     {c BLC}{c TT}{c BRC}    {c BLC}{c TT}{c BRC}
        {c |}    {c BT}      {c BT}      {c BLC}{c TT}{c BRC}    {c BLC}{c TT}{c BRC}      {c BT}      {c BT}
        {c |}                   {c BT}      {c BT}
        {c BLC}{hline 46}
            male  female   male  female   male  female
               age_1          age_2          age_3

{p 4 8 8}
{cmd:graph box bp_before bp_after, over(agegrp) over(sex)}{break}
2*{it:#_of_agegrps}*{it:#_of_sexes} boxes showing blood pressure,
before and after, for each age group, repeated for each sex. The grouping
would look like this:

        {c |}
        {c |}  {c TT}                   {c TT}          {c TT}                   {c TT}
        {c |} {c TLC}{c BT}{c TRC}  {c TT}     {c TT}   {c TT}    {c TLC}{c BT}{c TRC}  {c TT}     {c TLC}{c BT}{c TRC}  {c TT}     {c TT}   {c TT}    {c TLC}{c BT}{c TRC}  {c TT}
        {c |} {c |}-{c |} {c TLC}{c BT}{c TRC}   {c TLC}{c BT}{c TRC} {c TLC}{c BT}{c TRC}   {c |}-{c |} {c TLC}{c BT}{c TRC}    {c |}-{c |} {c TLC}{c BT}{c TRC}   {c TLC}{c BT}{c TRC} {c TLC}{c BT}{c TRC}   {c |}-{c |} {c TLC}{c BT}{c TRC}
        {c |} {c BLC}{c TT}{c BRC} {c |}-{c |}   {c |}-{c |} {c |}-{c |}   {c BLC}{c TT}{c BRC} {c |}-{c |}    {c BLC}{c TT}{c BRC} {c |}-{c |}   {c |}-{c |} {c |}-{c |}   {c BLC}{c TT}{c BRC} {c |}-{c |}
        {c |}  {c BT}  {c BLC}{c TT}{c BRC}   {c BLC}{c TT}{c BRC} {c BLC}{c TT}{c BRC}    {c BT}  {c BLC}{c TT}{c BRC}     {c BT}  {c BLC}{c TT}{c BRC}   {c BLC}{c TT}{c BRC} {c BLC}{c TT}{c BRC}    {c BT}  {c BLC}{c TT}{c BRC}
        {c |}      {c BT}     {c BT}   {c BT}         {c BT}          {c BT}     {c BT}   {c BT}         {c BT}
        {c BLC}{hline 62}
           age_1     age_2     age_3      age_1     age_2     age_3
                     males                         females

{marker remarks3}{...}
{title:Treatment of multiple yvars versus treatment of over() groups}
{pstd}
Consider two datasets containing the same data but organized differently.
The datasets contain blood pressure, before and after an intervention.
In the first dataset, the data are organized the wide way; each patient
is an observation. A little bit of the data is

    {txt}{c TLC}{hline 9}{c -}{hline 6}{c -}{hline 8}{c -}{hline 11}{c -}{hline 10}{c TRC}
    {c |} {res}patient    sex   agegrp   bp_before   bp_after {txt}{c |}
    {c LT}{hline 9}{c -}{hline 6}{c -}{hline 8}{c -}{hline 11}{c -}{hline 10}{c RT}
    {c |} {res}      1   Male    30-45         143        153 {txt}{c |}
    {c |} {res}      2   Male    30-45         163        170 {txt}{c |}
    {c |} {res}      3   Male    30-45         153        168 {txt}{c |}
    {c BLC}{hline 9}{c -}{hline 6}{c -}{hline 8}{c -}{hline 11}{c -}{hline 10}{c BRC}{txt}

{pstd}
In the second dataset, the data are organized the long way; each patient is
a pair of observations. The corresponding observations in the second dataset
are

    {txt}{c TLC}{hline 9}{c -}{hline 6}{c -}{hline 8}{c -}{hline 8}{c -}{hline 5}{c TRC}
    {c |} {res}patient    sex   agegrp     when    bp {txt}{c |}
    {c LT}{hline 9}{c -}{hline 6}{c -}{hline 8}{c -}{hline 8}{c -}{hline 5}{c RT}
    {c |} {res}      1   Male    30-45   Before   143 {txt}{c |}
    {c |} {res}      1   Male    30-45    After   153 {txt}{c |}
    {c LT}{hline 9}{c -}{hline 6}{c -}{hline 8}{c -}{hline 8}{c -}{hline 5}{c RT}
    {c |} {res}      2   Male    30-45   Before   163 {txt}{c |}
    {c |} {res}      2   Male    30-45    After   170 {txt}{c |}
    {c LT}{hline 9}{c -}{hline 6}{c -}{hline 8}{c -}{hline 8}{c -}{hline 5}{c RT}
    {c |} {res}      3   Male    30-45   Before   153 {txt}{c |}
    {c |} {res}      3   Male    30-45    After   168 {txt}{c |}
    {c BLC}{hline 9}{c -}{hline 6}{c -}{hline 8}{c -}{hline 8}{c -}{hline 5}{c BRC}{txt}

{pstd}
Using the first dataset, we might type

        {cmd:. sysuse bpwide, clear}

        {cmd:. graph box bp_before bp_after, over(sex)}
          {it:({stata "gr_example bpwide: graph box bp_before bp_after, over(sex)":click to run})}
{* graph boxwide}{...}

{pstd}
Using the second dataset, we could type

        {cmd:. sysuse bplong, clear}

        {cmd:. graph box bp, over(when) over(sex)}
          {it:({stata "gr_example bplong: graph box bp, over(when) over(sex)":click to run})}
{* graph boxlong}{...}

{pstd}
The two graphs are virtually identical. They differ in that

                                      multiple {it:yvars}    {cmd:over()} groups
        {hline 61}
        boxes different colors             yes               no
        boxes identified via ...          legend         axis label
        {hline 61}

{pstd}
Option {cmd:ascategory} will cause multiple {it:yvars} to be presented as if
they were the first {cmd:over()} group, and option {cmd:asyvars} will cause
the first {cmd:over()} group to be presented as if it were multiple
{it:yvars}. Thus

{phang2}
{cmd:. graph box bp, over(when) over(sex) asyvars}

{pstd}
would produce the first chart and
{phang2}
{cmd:. graph box bp_before bp_after, over(sex) ascategory}
{pstd}
would produce the second.
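
{pstd}
The wide and long layouts described above are related by {helpb reshape}.
As a rough sketch (not one of the original examples), the wide dataset could be
converted to a long layout as follows. This assumes bpwide.dta contains the
variables patient, bp_before, and bp_after shown earlier; the stub {cmd:bp_}
and the lowercase string values "before" and "after" follow from those
variable names and are not values found in bplong.dta:

{phang2}
{cmd:. sysuse bpwide, clear}

{phang2}
{cmd:. reshape long bp_, i(patient) j(when) string}

{phang2}
{cmd:. rename bp_ bp}

{phang2}
{cmd:. graph box bp, over(when) over(sex)}

{pstd}
Because when is a string in this sketch, its groups would be ordered
alphabetically, with "after" placed before "before".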
{marker remarks4}{...}
{title:How boxes are ordered}
{pstd}
The default is to place the boxes in the order of the {it:yvars} and to order
each {cmd:over(}{it:varname}{cmd:)} group according to the values of
{it:varname}. Let us consider some examples:
{phang}
{cmd:graph box bp_before bp_after}{break}
Boxes appear in the order specified, bp_before and bp_after.
{phang}
{cmd:graph box bp, over(when)}{break}
Boxes are ordered according to the values of variable when.
{pmore}
If variable when is numeric, the lowest when number comes first,
followed by the next lowest, and so on. This is true even if variable
when has a value label. Say that when=1 has been labeled "Before" and
when=2, labeled "After". The boxes will be in the order Before followed
by After.
{pmore}
If variable when is a string, the boxes will be ordered by the sort
order of the values of the variable (which is to say, alphabetically, but
with capital letters placed before lowercase letters). If variable
when contains "Before" and "After", the boxes will be in the order After
followed by Before.
{phang}
{cmd:graph box bp_before bp_after, over(sex)}{break}
Boxes appear in the order specified, bp_before and bp_after, and are
repeated for each sex, which will be ordered as explained above.
{phang}
{cmd:graph box bp_before bp_after, over(sex) over(agegrp)}{break}
Boxes appear in the order specified, bp_before and bp_after, repeated
for sex ordered on the values of variable sex, repeated
for agegrp ordered on the values of variable agegrp.
{marker remarks5}{...}
{title:Reordering the boxes}
{pstd}
There are two ways you may wish to reorder the boxes:
{phang2}
1. You want to control the order in which the elements of each {cmd:over()}
group appear. String variable when might contain "After" and "Before",
but you want the boxes to
appear in the order Before and After.
{phang2}
2. You wish to order the boxes according to their median values.
For example, you wish to draw the graph

{col 16}{cmd:. graph box wage, over(industry)}

{pmore2}
and you want the industries ordered by median wage.
{pstd}
We will consider each of these desires separately.
{marker remarks6}{...}
{title:Putting the boxes in a prespecified order}
{pstd}
You have drawn the graph

{phang2}
{cmd:. graph box bp, over(when) over(sex)}

{pstd}
Variable when is a string containing "Before" and "After". You wish the boxes
to be in that order.
{pstd}
To do that, you create a new numeric variable that orders the group as you
would like:

{phang2}
{cmd:. gen order = 1 if when=="Before"}

{phang2}
{cmd:. replace order = 2 if when=="After"}

{pstd}
You may name the variable and create it however you wish, but be sure that
there is a one-to-one correspondence between the new variable and the
{cmd:over()} group's values. You then specify the {cmd:over()}'s
{cmd:sort(}{it:varname}{cmd:)} option:
{phang2}
{cmd:. graph box bp, over(when, sort(order)) over(sex)}
{pstd}
If you want to reverse the order, you may specify the {cmd:descending}
suboption:
{phang2}
{cmd:. graph box bp, over(when, sort(order) descending) over(sex)}
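
{pstd}
An alternative sketch, not part of the original example: numeric
{cmd:over()} variables are ordered by value even when they carry a value
label, so you could instead convert the string variable into a labeled numeric
one with {helpb encode}. The names {cmd:whenlbl} and {cmd:when_n} are
hypothetical, and the commands assume, as in this section, that when is a
string containing "Before" and "After":

{phang2}
{cmd:. label define whenlbl 1 "Before" 2 "After"}

{phang2}
{cmd:. encode when, generate(when_n) label(whenlbl)}

{phang2}
{cmd:. graph box bp, over(when_n) over(sex)}

{pstd}
The boxes should then appear in the order Before followed by After without a
separate {cmd:sort()} variable.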
{marker remarks7}{...}
{title:Putting the boxes in median order}
{pstd}
You have drawn the graph

{phang2}
{cmd:. graph hbox wage, over(industry)}

{pstd}
and now wish to put the boxes in median order, lowest first. You type
{phang2}
{cmd:. graph hbox wage, over( industry, sort(1) )}
{pstd}
If you wanted the largest first, you would type
{phang2}
{cmd:. graph hbox wage, over(industry, sort(1) descending)}
{pstd}
The {cmd:1} in {cmd:sort(1)} refers to the first (and in this case, only)
{it:yvar}. If you had multiple {it:yvars}, you might type
{phang2}
{cmd:. graph hbox wage benefits, over( industry, sort(1) )}
{pstd}
and you would have a chart showing wage and benefits sorted on wage.
If you typed
{phang2}
{cmd:. graph hbox wage benefits, over( industry, sort(2) )}
{pstd}
the graph would be sorted on benefits.
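
{pstd}
To try these orderings on data shipped with Stata, here is a small sketch
using nlsw88.dta. That dataset has no benefits variable, so hours stands in
as the second {it:yvar} purely for illustration:

{phang2}
{cmd:. sysuse nlsw88, clear}

{phang2}
{cmd:. graph hbox wage hours, over(industry, sort(2))}

{pstd}
If the median is not the statistic you want, the
{cmd:sort((}{it:stat}{cmd:)} {it:varname}{cmd:)} suboption described in
{bf:[G] graph box} should allow other orderings; for instance, ordering by
mean wage might be typed as

{phang2}
{cmd:. graph hbox wage, over(industry, sort((mean) wage))}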
{marker remarks8}{...}
{title:Use with by()}
{pstd}
{cmd:graph} {cmd:box} and {cmd:graph} {cmd:hbox} may be used with {cmd:by()},
but in general, you will want to use {cmd:over()} in preference to {cmd:by()}.
Box charts are explicitly categorical and do an excellent job of presenting
summary statistics for multiple groups in a single chart.
{pstd}
A good use of {cmd:by()}, however, is when the graph would otherwise
be very long. Consider the graph

{phang2}
{cmd:. sysuse nlsw88, clear}

{phang2}
{cmd:. graph hbox wage, over(ind) over(union)}

{pstd}
In the above graph, there are 12 industry categories and 2 union categories,
resulting in 24 separate boxes. The graph, presented at normal size,
would be virtually unreadable. One way around that problem would be to
make the graph longer than usual,
{phang2}
{cmd:. graph hbox wage, over(ind) over(union) ysize(7)}
{pstd}
See {hi:Charts with lots of categories} in {helpb graph bar} for
more information about that solution. The other solution would be to
introduce union as a {cmd:by()} category rather than an {cmd:over()}
category:

{phang2}
{cmd:. graph hbox wage, over(ind) by(union)}

{pstd}
Below we do precisely that, adding some extra options to produce a good-looking
chart:

        {cmd}. graph hbox wage, over(ind, sort(1)) nooutside
                ytitle("")
                by(
                     union,
                     title("Hourly wage, 1988, women aged 34-46", span)
                     subtitle(" ")
                     note("Source: 1988 data from NLS, U.S. Dept. of Labor,
                           Bureau of Labor Statistics", span)
                  ){txt}
          {it:({stata "gr_example2 grboxby":click to run})}
{* graph grboxby}{...}

{pstd}
The title options were specified inside the {cmd:by()} so that they would
not be applied to each graph separately; see {it:{help by_option}}.
{* index histories}{...}
{* index Crowe}{...}
{marker remarks9}{...}
{title:History}
{pstd}
Box plots have been used in geography and climatology, under the name
"dispersion diagrams", since at least 1933; see Crowe (1933). His figure 1
shows all the data points, medians, quartiles, and octiles by month for
monthly rainfalls for Glasgow, 1868-1917. His figure 2, a map of Europe with
several climatic stations, shows monthly medians, quartiles, and octiles.
{title:Also see}
{psee}
Manual: {bf:[G] graph box}
{psee}
Online: {helpb graph bar};
{helpb lv},
{helpb summarize}
{p_end}