sas中的sql(1) 基本文法

　　Sas proc sql與尋常sas語句的的不同之處

1：The PROC SQL step does not require a RUN statement. PROC SQL executes each query automatically

2：Unlike many other SAS procedures, PROC SQL continues to run after you submit a step. To end the procedure, you must submit another PROC step, a DATA step, or a QUIT statement

PROC SQL; *sql過程中的變量不區分大小寫，表與表之間，變量與變量之間都必須要用逗号分開；

CREATE TABLE table-name AS

SELECT column-1<, . . . column-n>

FROM table-1 | view-1<, ... table-n | view-n>

having experssion

<ORDER BY column-1<, ... column-n>>;

Unlike other SAS procedures, the order of clauses within a SELECT statement in PROC SQL is important

select:In the SELECT clause, you can both specify existing columns (columns that are already stored in a table) and create new columns,和input後面插入自己想要的變量一樣，new columns exist only for the duration of the query, unless a table or a view is created（建立的列隻在查詢中）。

where:In the WHERE clause, you can specify any column(s) from the underlying table(s). The columns specified in the WHERE clause do not have to be specified in the SELECT clause(where中的變量可以為表中的任意一個變量)。

order by:後面可跟多個變量，用逗号分隔，對升序或降序的指定的關鍵字放在變量後，desc asc,也可以指定不在select clause中的變量,對于在clause中的變量比如

select a,b,c from table; order by中可用名字代替也可以數字代替 order by a=order by 1,對于多個變量的排序，用分号間隔，數值和字母可以混用 order by 1, age;。

group by:和Summary Functions用在一起，才有分組功能，否則sas會将其自動轉換為order by功能。

proc sql;    
     select membertype, sum(milestraveled) as TotalMiles   
                 from sasuser.frequentflyers        
        group by membertype;      /*group by後面接select中沒有經過summary函數的變量才有用，否則不能得出分組效果*/
quit;

Having:A HAVING clause works with the GROUP BY clause to restrict the groups that are displayed in the output, based on one or more specified conditions。和group by連用，作為限制性條件，控制輸出内容

proc sql;      
select jobcode,     
       avg(salary) as AvgSalary     format=dollar11.2        
        from sasuser.payrollmaster     
        group by jobcode        
        having avg(salary) > (select avg(salary) from sasuser.payrollmaster);
quit; 

/*先執行子查詢中的語句，計算出出所有salary的平均值，然後在having子句中，用每個group的avg工資與子查詢得出的總的avg工資進行比較，篩選出想要的結果*/

單表操作

proc sql;
    create table class as 
    select /*select和from中間全部是要擷取的變量,分為兩種，一種是資料集本身已有的，另一種是由其衍生的變量*/ 
    name,
    case when sex eq '女' then '1'
         when sex eq '男' then '2'
         else '3'
    end
    as sex_tran label="性别轉換", /*as做為關鍵字，形成一個變量，label是sas中的關鍵字*/
    sum(weight-20) as new1 format=16.2,
    mean(height-20) as new2 format=16.2,/*一般來說sum mean後面一定會出現group by 因為是要根據組别進行彙總*/
    (calculated new1-calculated new2) as new /*calculated為關鍵字，表示new1是衍生的變量，如果需要對其進行操作的話要加上calculated*/
    from sashelp.class(where=(height between 50 and 70)) /*sas中between and是閉區間*/
    group by name,calculated sex_tran/*除開group by後面的字段外，select中其他字段必須是數值型，不然無法進行group*/
    /*group系統預設按照升序排列後面的字段*/
    order by calculated sex_tran desc/*對想要進行降序的字段可額外進行order by*/
;
quit;

having的用法，實際中很少應用，能用data步的就用data步

/*having語句一般跟在group by後面，如果單獨來用，其作用和where差不多*/
proc sql;
    create table have1 as
    select 
        id,
        max(txn_seq) as max_txn,
        sum(amount) as sum_amount
    from temp.having1
    group by id
    having calculated sum_amount ge 70
;    
quit;

/*目标是去除每一組的最後一條觀測，但是使用having在分組相關變量有重複的情況下不保險，是以能用data步處理盡量用data步*/
proc sql;
    create table have2 as
    select 
        id,
        txn_seq,
        amount
    from temp.having2
    group by id
    having txn_seq eq max(txn_seq)
;    
quit;

建立和更新表