	bars identified via ...   legend                  axis label
	{hline 61}

{pstd}
Option {cmd:ascategory} causes multiple {it:yvars} to be presented as if they
were {cmd:over()} groups, and option {cmd:asyvars} causes {cmd:over()} groups
to be presented as if they were multiple {it:yvars}.  Thus

	{cmd:. graph bar (asis) tempjan, over(region)}

{pstd}
would produce the first chart and

	{cmd:. graph bar (asis) ne nc south west, ascategory}

{pstd}
would produce the second.
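
{pstd}
The reverse switch can be sketched the same way.  Assuming the
four-observation dataset with {cmd:tempjan} and {cmd:region} from the examples
above is in memory, adding {cmd:asyvars} should present the regions as if they
were four {it:yvars}, so that the bars are identified in the legend rather
than by axis labels:

	{cmd:. * sketch only: over() groups presented as if they were yvars}
	{cmd:. graph bar (asis) tempjan, over(region) asyvars}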


{marker remarks4}{...}
{title:Treatment of data}

{pstd}
In the previous two examples, we already had the statistics we wanted to plot:
27.9 (Northeast), 21.7 (North Central), 46.1 (South), and 46.2 (West).
We entered the data, and we typed

	{cmd:. graph bar (asis) ne nc south west}
    or
	{cmd:. graph bar (asis) tempjan, over(region)}

{pstd}
We do not have to know the statistics ahead of time:  {cmd:graph} {cmd:bar}
and {cmd:graph} {cmd:hbar} can calculate statistics for us.  If we had
datasets with lots of observations (say cities of the U.S.), we could type

	{cmd:. graph bar (mean) ne nc south west}
    or
	{cmd:. graph bar (mean) tempjan, over(region)}

{pstd}
and obtain the same graphs.  All we need do is change {cmd:(asis)} to
{cmd:(mean)}.  In the first example, the data would be organized the wide way:

	cityname           ne     nc     south     west
	{hline 47}
	{it:name of city}       42      .         .        .
	{it:another city}        .     28         .        .
	...
	{hline 47}

{pstd}
and, in the second example, the data would be organized the long way:


	cityname        region        tempjan
	{hline 37}
	{it:name of city}      ne          42
	{it:another city}      nc          28
	...
	{hline 37}

{pstd}
We have such a dataset, organized the long way.  In citytemp.dta, we have
information on 956 U.S. cities, including the region in which each is located
and its average January temperature:

	{cmd:. sysuse citytemp, clear}

	{cmd}. list region tempjan if _n < 3 | _n > 954
	{txt}
	     {c TLC}{hline 8}{c -}{hline 9}{c TRC}
	     {c |} {res}region   tempjan {txt}{c |}
	     {c LT}{hline 8}{c -}{hline 9}{c RT}
	  1. {c |} {res}    NE      16.6 {txt}{c |}
	  2. {c |} {res}    NE      18.2 {txt}{c |}
	955. {c |} {res}  West      72.6 {txt}{c |}
	956. {c |} {res}  West      72.6 {txt}{c |}
	     {c BLC}{hline 8}{c -}{hline 9}{c BRC}

{pstd}
With these data, we can type

	{cmd:. graph bar (mean) tempjan, over(region)}
	  {it:({stata "gr_example citytemp: gr bar (mean) tempjan, over(region)":click to run})}
{* graph barct}{...}

{pstd}
We just produced the same bar chart we previously produced when we entered
the statistics 27.9 (Northeast), 21.7 (North Central), 46.1 (South), and 46.2
(West) and typed

	{cmd:. graph bar (asis) tempjan, over(region)}
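
{pstd}
One way to see the equivalence, offered as a sketch rather than as part of
the original example, is to compute the regional means ourselves with
{cmd:collapse} and then plot them with {cmd:(asis)}.  Note that {cmd:collapse}
replaces the data in memory with one observation per region, so we reload
citytemp.dta afterward:

	{cmd:. sysuse citytemp, clear}
	{cmd:. * reduce the data to one mean per region}
	{cmd:. collapse (mean) tempjan, by(region)}
	{cmd:. list region tempjan}
	{cmd:. graph bar (asis) tempjan, over(region)}
	{cmd:. sysuse citytemp, clear}

{pstd}
The listed means should correspond to the four statistics quoted above, and
the chart should match the one drawn with {cmd:(mean)}.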

{pstd}
When we do not specify {cmd:(asis)} or {cmd:(mean)} (or {cmd:(median)} or
{cmd:(sum)} or {cmd:(p1)} or any of the other {it:stats} allowed),
{cmd:(mean)} is assumed.  Thus {cmd:(...)} is often omitted when
{cmd:(mean)} is desired, and we could have drawn the previous graph by
typing

	{cmd:. graph bar tempjan, over(region)}

{pstd}
Some users even omit typing {cmd:(...)} in the {cmd:(asis)} case because
calculating the mean of a single observation results in the number itself.
Thus in the previous section, rather than typing

	{cmd:. graph bar (asis) ne nc south west}
    and
	{cmd:. graph bar (asis) tempjan, over(region)}

{pstd}
we could have typed

	{cmd:. graph bar ne nc south west}
    and
	{cmd:. graph bar tempjan, over(region)}
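
{pstd}
The shortcut applies only to the mean.  For any other statistic, the
{cmd:(}{it:stat}{cmd:)} prefix must be typed.  As an illustration with
citytemp.dta (this chart is not one of the original examples), the median
January temperature by region could be drawn by typing

	{cmd:. sysuse citytemp, clear}
	{cmd:. * (median) must be given explicitly; omitting it would plot means}
	{cmd:. graph bar (median) tempjan, over(region)}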


{marker remarks5}{...}
{title:Multiple bars (overlapping the bars)}

{pstd}
In citytemp.dta, in addition to variable {cmd:tempjan}, there is also variable
{cmd:tempjuly}, which is the average July temperature.  We can include both
averages in a single chart, by region:

	{cmd:. sysuse citytemp, clear}

	{cmd:. graph bar (mean) tempjuly tempjan, over(region)}
	  {it:({stata "gr_example citytemp: graph bar (mean) tempjuly tempjan, over(region)":click to run})}
{* graph barct2}{...}

{pstd}
We can improve the look of the chart by

{phang2}
    1.  including the {it:legend_option} {cmd:legend(label())}
	to change the text of the legend; see {it:{help legend_option}};

{phang2}
    2.  including the {it:axis_title_option} {cmd:ytitle()} to add a
	title saying "Degrees Fahrenheit"; see 
	{it:{help axis_title_options}};

{phang2}
    3.  including the {it:title_options} {cmd:title()}, {cmd:subtitle()},
	and {cmd:note()} to say what the graph is about and from where the
	data came; see {it:{help title_options}}.

{pstd}
Doing all of that produces

	{cmd}. graph bar (mean) tempjuly tempjan, over(region)
		legend( label(1 "July") label(2 "January") )
		ytitle("Degrees Fahrenheit")
		title("Average July and January temperatures")
		subtitle("by regions of the United States")
		note("Source:  U.S. Census Bureau, U.S. Dept. of Commerce"){txt}
	  {it:({stata gr_example2 grbar1a:click to run})}
{* graph grbar1a}{...}

{pstd}
We can make one more improvement to this chart by overlapping the bars.
Below we add the option {cmd:bargap(-30)}:

	{cmd}. graph bar (mean) tempjuly tempjan, over(region)
  {txt:{it:new} ->}        bargap(-30)
		legend( label(1 "July") label(2 "January") )
		ytitle("Degrees Fahrenheit")
		title("Average July and January temperatures")
		subtitle("by regions of the United States")
		note("Source:  U.S. Census Bureau, U.S. Dept. of Commerce"){txt}
	  {it:({stata gr_example2 grbar1:click to run})}
{* graph grbar1}{...}

{pstd}
{cmd:bargap(}{it:#}{cmd:)} specifies the distance between the {it:yvar} bars
(i.e., between the bars for tempjuly and tempjan); {it:#} is in
percentage-of-bar-width units, so {cmd:bargap(-30)} means the bars overlap
by 30%.  {cmd:bargap()} may be positive or negative; its default is 0.
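
{pstd}
For contrast, a positive value should pull the bars apart instead.  The
following sketch, with 20 chosen arbitrarily for illustration, leaves a gap of
20% of a bar width between the July and January bars within each region:

	{cmd:. sysuse citytemp, clear}
	{cmd:. * positive bargap() separates the yvar bars; 20 is an arbitrary value}
	{cmd:. graph bar (mean) tempjuly tempjan, over(region) bargap(20)}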


{marker remarks6}{...}
{title:Controlling the text of the legend}

{pstd}
In the above example, we changed the text of the legend by specifying the
legend option:

		{cmd:legend( label(1 "July") label(2 "January") )}

{pstd}
We could just as well have changed the text of the legend by typing

		{cmd:yvaroptions( relabel(1 "July" 2 "January") )}

{pstd}
Which you use makes no difference, but we prefer {cmd:legend(label())}
to {cmd:yvaroptions(relabel())} because {cmd:legend(label())} is the way to
modify the contents of a legend in a twoway graph; so why do bar charts
differently?
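
{pstd}
For completeness, here is how the alternative would look in a full command.
This is a sketch that simply places the {cmd:yvaroptions(relabel())} option
shown above into the earlier chart; it should produce the same legend text as
the {cmd:legend(label())} version:

	{cmd}. sysuse citytemp, clear

	. graph bar (mean) tempjuly tempjan, over(region)
		yvaroptions( relabel(1 "July" 2 "January") )
		ytitle("Degrees Fahrenheit"){txt}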


{marker remarks7}{...}
{title:Multiple over()s (repeating the bars)}

{pstd}
Option {cmd:over(}{it:varname}{cmd:)} repeats the {it:yvar} bars for each unique
value of {it:varname}.  Using citytemp.dta, if we typed

	{cmd:. graph bar (mean) tempjuly tempjan}

{pstd}
we would obtain two (fat) bars.  When we type

	{cmd:. graph bar (mean) tempjuly tempjan, over(region)}

{pstd}
we obtain two (thinner) bars for each of the four regions.  (We typed exactly
this command in {hi:Multiple bars} above.)

{pstd}
You may repeat the {cmd:over()} option.  You may specify {cmd:over()} twice
when you specify two or more {it:yvars} and up to three times when you
specify just one {it:yvar}.

{pstd}
In dataset nlsw88.dta we have information on 2,246 women:

	{cmd}. sysuse nlsw88, clear

	. graph bar (mean) wage, over(smsa) over(married) over(collgrad)
		title("Average Hourly Wage, 1988, Women Aged 34-46")
		subtitle("by College Graduation, Marital Status,
			  and SMSA residence")
		note("Source:  1988 data from NLS, U.S. Dept. of Labor,
		      Bureau of Labor Statistics"){txt}
	  {it:({stata gr_example2 grbar5:click to run})}
{* graph grbar5}{...}

{pstd}
If you strip away the {it:title_options}, the above command reads

	{cmd:. graph bar (mean) wage, over(smsa) over(married) over(collgrad)}

{pstd}
Note that in this three-{cmd:over()} case, the first {cmd:over()} is treated
as multiple {it:yvars}:  the bars touch, the bars are assigned different
colors, and the meaning of the bars is revealed in the legend.  This means
that if we wanted to separate the bars, we could specify option
{cmd:bargap(}{it:#}{cmd:)}, {it:#}>0, and if we wanted them to overlap, we
could specify {cmd:bargap(}{it:#}{cmd:)}, {it:#}<0.
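
{pstd}
As a sketch of the first case, with 10 chosen arbitrarily, the following
should leave a small gap between the two {cmd:smsa} bars within each group:

	{cmd:. sysuse nlsw88, clear}
	{cmd:. * bargap() acts on the first over() group here; 10 is an arbitrary value}
	{cmd:. graph bar (mean) wage, over(smsa) over(married) over(collgrad) bargap(10)}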


{marker remarks8}{...}
{title:Nested over()s}

{pstd}
Sometimes you have multiple {cmd:over()} groups with one group explicitly
nested within the other.  In the citytemp.dta dataset, we have variables
{cmd:region} and {cmd:division}, and {cmd:division} is nested within
{cmd:region}.  The Census Bureau divides the U.S. into four regions and into
nine divisions, which work like this:

	{hline 53}
	{it:Region}                         {it:Division}
	{hline 53}
	1.  Northeast                  1.  New England
				       2.  Mid Atlantic
	{hline 53}
	2.  North Central              3.  East North Central
				       4.  West North Central
	{hline 53}
	3.  South                      5.  South Atlantic
				       6.  East South Central
				       7.  West South Central
	{hline 53}
	4.  West                       8.  Mountain
				       9.  Pacific
	{hline 53}
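
{pstd}
A quick way to check this nesting in the data themselves, offered here as a
suggestion rather than as part of the original example, is to cross-tabulate
the two variables:

	{cmd:. sysuse citytemp, clear}
	{cmd:. * each division should have cities in only one region}
	{cmd:. tabulate division region}

{pstd}
If {cmd:division} is nested within {cmd:region} as described, each row of the
resulting table should have a nonzero count in only one column.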

{pstd}
Were we to type

{phang2} 
	{cmd:. graph bar (mean) tempjuly tempjan, over(division) over(region)}

{pstd}
we would obtain a chart with space allocated for 9*4 = 36 groups, of which
only 9 would be used:

	 {c TLC}{c -}{c TRC}                     {c TLC}{c -}{c TRC}                                {c TLC}{c -}{c TRC}
	 {c |} {c |}                     {c |} {c |}                                {c |} {c |}
	 {c |} {c LT}{c -}{c TRC}                   {c |} {c LT}{c -}{c TRC}                              {c |} {c LT}{c -}{c TRC}
	 {c |} {c |} {c |}                   {c |} {c |} {c |}                              {c |} {c |} {c |}
	{c -}{c BT}{c -}{c BT}{c -}{c BT}{hline 19}{c BT}{c -}{c BT}{c -}{c BT}{hline 30}{c BT}{c -}{c BT}{c -}{c BT}{hline 2}
	  1 2 3 4 5 6 7 8 9   1 2 3 4 5 6 7 8 9   ...  1 2 3 4 5 6 7 8 9
	       region 1            region 2                 region 4


{pstd}
The {cmd:nofill} option prevents the chart from including the unused
categories:

	{cmd}. sysuse citytemp, clear

	. graph bar tempjuly tempjan, over(division) over(region) nofill
		bargap(-30)
		ytitle("Degrees Fahrenheit")
		legend( label(1 "July") label(2 "January") )
		title("Average July and January temperatures")
		subtitle("by region and division of the United States")
		note("Source:  U.S. Census Bureau, U.S. Dept. of Commerce"){txt}
	  {it:({stata gr_example2 grbar2:click to run})}
{* graph grbar2}{...}

{pstd}
The above chart, if we omit one of the temperatures, also looks good
horizontally:

	{cmd}. graph hbar (mean) tempjan, over(division) over(region) nofill
		ytitle("Degrees Fahrenheit")
		title("Average January temperature")
		subtitle("by region and division of the United States")
		note("Source:  U.S. Census Bureau, U.S. Dept. of Commerce"){txt}
	  {it:({stata gr_example2 grbar3:click to run})}
{* graph grbar3}{...}


{marker remarks9}{...}
{title:Charts with many categories}

{pstd}
Using nlsw88.dta, we want to draw the chart

	{cmd:. graph bar wage, over(industry) over(collgrad)}

{pstd}
Variable {cmd:industry} records industry of employment in 12 categories, and
variable {cmd:collgrad} records whether the woman is a college graduate.
Thus we will have 24 bars.  We draw the above and quickly discover that the
long labels associated with industry result in considerable amounts of
overprinting along the horizontal {it:x} axis.

{pstd}
Horizontal bar charts work better than vertical bar charts when labels are
long.  We change our command to read

	{cmd:. graph hbar wage, over(ind) over(collgrad)}

{pstd}
That works better, but now we have overprinting problems of a different
sort:  the letters of one line are touching the letters of the next.

{pstd}
Graphs are by default 4{it:x}5:  4 inches tall by 5 inches wide.