📄 sqlfun09.sas

📁 SAS是功能強大的統計軟體
💻 SAS
字号:
 /****************************************************************/ /*          S A S   S A M P L E   L I B R A R Y                 */ /*                                                              */ /*    NAME: SQLFUN09                                            */ /*   TITLE: fun/interesting applications of PROC SQL. (fun09)   */ /* PRODUCT: BASE                                                */ /*  SYSTEM: ALL                                                 */ /*    KEYS: SQL DATMAN SELECT WHERE GROUP ORDER COUNT MISSING   */ /*          TRANSPOSE PRINT DISTINCT                            */ /*   PROCS: SQL                                                 */ /*    DATA:                                                     */ /*                                                              */ /* SUPPORT: pmk                         UPDATE:                 */ /*     REF:                                                     */ /*    MISC: this example was contributed by Howard Schreier     */ /*          of the US Dept. of Commerce, via BITNET             */ /*                                                              */ /*          you can contribute your interesting samples.        */ /*          send internet email to KENT@UNX.SAS.COM or          */ /*          USmail to SAS Institute.                            */ /*                                                              */ /****************************************************************/ title1 'SAS SAMPLE LIBRARY, MEMBER(SQLFUN09)'; /*  * A User asks:  *  * I have a survey on which firms listed their five most  * important suppliers and their suppliers' locations.  Now I  * want to find out what percent of firms in the survey get  * supplies from different places (e.g., what percent  * get supplies from N.Y., from Boston, etc.).  *  * The five suppliers are separate variables.  If I collapse  * them by creating a new file with a separate OUTPUT statement  * for each of the five variables, I get up to five times as many  * firms as I really have (not all respondents listed as many as  * five firms), so if a place appears only once, its relative  * frequency should be 1/N where N is the actual number of firms,  * but its computed relative frequency will be 1/M where M is the  * number of supplier locations reported.  *  * On the other hand, if I do not create a separate file, the  * five original variables are not compatible.  * Firm 1 may have NY as the first variable, Firm 2 may have NY  * as its third in the list, Firm 3 may list five different  * cities, and Firm 4 may list five completely different cities.  *  * Does anybody have a suggestion for a quick and easy way to  * compute what percent of all firms get their supplies from  * the various cities?  *  */ /*  * There are undoubtedly a number of different ways to do this,  * but  I think that transposing the data so that there is just  * one city variable but multiple observations  for  each  firm  * will make it easier in any case.  *  * Here is a test data set, created in your configuration, then  * rearranged by PROC TRANSPOSE:  *  */ data survey;   infile cards;   input company $6. +1 (city1-city5) ($8.);   cards;First  Chicago London  ParisSecond Paris   Tokyo   Sydney  Cairo   TokyoThird  Caracas Manila  TokyoFourth Boston  Prague  Tokyo   London   ; proc transpose data=survey out=for_sql(rename=(col1=city));   by company notsorted;   var city1-city5;   run; proc print;   title2 'Original survey data';   run; /*  * The key to this query is the  embedded  subquery  (the  long  * expression  in parentheses).  It creates a one-row temporary  * table containing the number  of  companies,  which  is  then  * joined to the main table to set up the calculation you want.  *  * Notice  that  inclusion  of  the  DISTINCT  keyword  in  the  * expression  for  CITY_PCT  takes  care  of the repetition of  * Tokyo by the Second Company.  *  * The WHERE clause gets rid of  missing  values  generated  by  * PROC  TRANSPOSE  in  cases  where fewer than five cities are  * named.  *  * The GROUP clause controls the  rows  spanned  by  the  COUNT  * summary function in the main (outer) query.  *  */ proc sql feedback;   title2 'Supplier Percentages';   select city,          100 * count(distinct company)/co_count as city_pct    from for_sql,         (select count(distinct company) as co_count from for_sql)    where city is not missing    group by city    order by city_pct desc, city;   quit;
⌨️ 快捷键说明

复制代码 Ctrl + C
搜索代码 Ctrl + F
全屏模式 F11
切换主题 Ctrl + Shift + D
显示快捷键 ?
增大字号 Ctrl + =
减小字号 Ctrl + -