Apache Mod/Filter Development

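Contents

0. Introduction
1. Developing Apache modules on Windows
2. Advanced modules: an echo module that receives client data
3. Advanced modules: a configurable echo module
4. Advanced modules: filters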

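0. Introduction

Since version 2.0, Apache httpd has been more than an HTTP server: it is a mature, flexible, robust, and easily extensible development platform. By writing custom Apache modules, developers can extend httpd almost without limit and fit it closely to their application, without having to deal with low-level network transport details. This improves development efficiency, lowers development cost, and makes full use of Apache's own robustness and reliability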

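0x1: Overview of the Apache httpd development platform

Starting with 2.0, Apache httpd was substantially reworked compared with the 1.x series: the core was restructured, the extension mechanism was improved, and APR (Apache Portable Runtime) was split out, which made Apache a truly cross-platform server. As a result, Apache 2.0 became a platform that is easy to extend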

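Apache ships with a large number of extension modules: mod_cgi handles CGI scripts, mod_perl embeds Perl and combines its capabilities with Apache httpd, and so on. Users can follow the module interface conventions to develop modules for their own business scenarios and load them into Apache dynamically. Apache locates and invokes modules according to the rules in its configuration file in order to serve client requests. In short, Apache extensions fall into the following categories

1. Apache httpd modules: writing a module is usually somewhat more complex than writing a filter. It can be seen as the original way of extending Apache, and it offers the most flexibility
2. Output filters: Apache wraps the module mechanism in a filter API, which makes filter code simpler to write
3. Input filters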

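0x2: The Apache httpd module mechanism

Apache httpd consists of a core and a large number of modules; even the unit that loads other modules is itself a module. Modules are the basic extension mechanism of Apache (in principle a module can implement any functionality). In general, an HTTP server processes a request in the following sequence

1. Accept the client request
The request may be for a file that the HTTP server program can access, in which case the server reads the file and returns it as the response. When we enter a URL such as http://host/index.html in the browser's address bar, the browser tries to connect to port 80 of the HTTP server named by host. If the connection succeeds, it sends an HTTP request for the index.html page, and if that succeeds as well, it parses and renders the HTML file
This works well for static pages, but real applications usually interact with a database and generate page content dynamically, for example with server-side CGI/PHP scripts. That requires a more advanced and more flexible content generator

2. Preprocessing
    1) Permission checks
    2) Parsing the HTTP headers, and so on

3. Content generation
Dynamic content is generated by interacting with other operating system resources (file I/O, database access, and so on)

4. Cleanup
Logging, releasing resources, and similar operations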

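Modules are usually registered to handle particular file types, or other such criteria, named in the configuration file

AddHandler cgi-script .cgi
//PHP and Python are hooked up the same way: the configuration maps a file extension to the module that handles it

Apache calls every handler for each request, so each handler should decide quickly whether the request is meant for it. For that reason, most handlers begin with a statement like the following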

if (!req->handler || strcmp(req->handler, "target-module"))
    return DECLINED;      
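
To make the "decline quickly" convention concrete, here is a small standalone model of the dispatch loop. It deliberately does not use the httpd headers: `fake_request`, `run_handlers`, and the two sample handlers are hypothetical stand-ins invented for this sketch, and only the numeric values of `OK` and `DECLINED` mirror httpd.h.

```c
#include <stddef.h>
#include <string.h>

/* Same numeric values as httpd.h; everything else here is a stand-in. */
#define OK        0
#define DECLINED -1

/* Hypothetical, stripped-down request: only the handler name matters here. */
typedef struct {
    const char *handler;   /* e.g. chosen by SetHandler / AddHandler */
    const char *served_by; /* filled in by whichever handler accepts */
} fake_request;

typedef int (*handler_fn)(fake_request *req);

static int cgi_like_handler(fake_request *req)
{
    /* Bail out first: not our request, or no handler configured at all. */
    if (!req->handler || strcmp(req->handler, "cgi-script") != 0)
        return DECLINED;
    req->served_by = "cgi";
    return OK;
}

static int hello_handler(fake_request *req)
{
    if (!req->handler || strcmp(req->handler, "helloworld") != 0)
        return DECLINED;
    req->served_by = "hello";
    return OK;
}

/* Walk the handlers in order until one stops declining. */
static int run_handlers(fake_request *req, const handler_fn *handlers, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        int rc = handlers[i](req);
        if (rc != DECLINED)
            return rc;
    }
    return DECLINED; /* nobody claimed the request */
}
```

Because every handler is consulted in turn, the cheap string comparison at the top is what keeps an unrelated module from costing anything on requests it does not own.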

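0x3: The filter mechanism in Apache 2.0 and later

Most of the time, Apache extension developers do not need elaborate logic for receiving and sending data; they only need to inspect the HTTP headers and body. To make this efficient, the Apache developers wrapped the module mechanism in a dedicated API

Apache 2.0 has a dedicated API for modules that only need to modify the content of the response sent to the user, or only need to modify details of the user's HTTP request. These are called

1. Output filters
Output filters are the most common kind. A good example is the standard Apache 2.0 module that computes the length of the content returned to the user so that the appropriate headers and log entries can be updated. Another example is a module that automatically spell-checks outgoing content

2. Input filters
A typical example is a web application firewall (WAF)

Strictly speaking, a WAF implemented as a web server module is a mechanism for inspecting, filtering, blocking, and logging across the whole HTTP life cycle, so it needs to combine both output filters and input filters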

Relevant Link:

http://www.ibm.com/developerworks/cn/opensource/os-cn-apachehttpd/      

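1. Developing Apache modules on Windows

0x1: Installing Apache on Windows

http://apache.dataguru.cn//httpd/binaries/win32/
//Choose the custom setup and install everything; otherwise the include and lib directories will be missing

0x2: Installing Perl

The apxs package used below is configured and run with Perl, so Perl has to be installed first

http://www.activestate.com/activeperl

0x3: Installing apxs

1. http://www.apachelounge.com/download/apxs_win32.zip
2. Unzip to: D:\apxs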
3. pushd D:\apxs
4. perl Configure.pl --with-apache2=D:\wamp\bin\apache\apache2.4.9 --with-apache-prog=httpd.exe
/*
apxs.bat has been created under D:\wamp\bin\apache\APACHE~1.9\bin.
apr-1-config.pl.bat has been created under D:\wamp\bin\apache\APACHE~1.9\bin.
apu-1-config.pl.bat has been created under D:\wamp\bin\apache\APACHE~1.9\bin.
*/

5. pushd D:\wamp\bin\apache\apache2.4.9\bin
6. apxs (output like the following means the installation succeeded)
/*
Use of assignment to $[ is deprecated at apxs.bat line 120.
Usage: apxs -g [-S <var>=<val>] -n <modname>
       apxs -q [-v] [-S <var>=<val>] <query> ...
       apxs -c [-S <var>=<val>] [-o <dsofile>] [-D <name>[=<value>]]
               [-I <incdir>] [-L <libdir>] [-l <libname>] [-Wc,<flags>]
               [-Wl,<flags>] [-p] <files> ...
       apxs -i [-S <var>=<val>] [-a] [-A] [-n <modname>] <dsofile> ...
       apxs -e [-S <var>=<val>] [-a] [-A] [-n <modname>] <dsofile> ...
*/

7. Configure the apxs build environment
D:\wamp\bin\apache\apache2.4.9\build\config_vars.mk
CC = D:\CODEBL~1\MinGW\bin\gcc.exe change to: CC = cl.exe
LD = D:\CODEBL~1\MinGW\bin\g++.exe change to: LD = link.exe
CPP = gcc -E change to: CPP =
LDFLAGS = kernel32.lib /nologo /subsystem:windows /dll /machine:I386 /libpath:"D:\wamp\bin\apache\APACHE~1.9\lib" change to: LDFLAGS = kernel32.lib /nologo /subsystem:windows /dll /machine:X64 /libpath:"D:\wamp\bin\apache\APACHE~1.9\lib"

8. Use apxs to generate a module skeleton
Visual Studio Command Prompt (2010)
pushd D:\安全部工作\伺服器waf子產品mod研究
D:\wamp\bin\apache\apache2.4.9\bin\apxs -g -n helloworld
/*
Use of assignment to $[ is deprecated at D:\wamp\bin\apache\apache2.4.9\bin\apxs.bat line 120.
Creating [DIR]  helloworld
Creating [FILE] helloworld/Makefile
Creating [FILE] helloworld/mod_helloworld.c
Creating [FILE] helloworld/.deps
*/

9. Enter the helloworld directory and edit mod_helloworld.c (this is the code we will develop), then compile and install it
cd helloworld
D:\wamp\bin\apache\apache2.4.9\bin\apxs -c -i -a mod_helloworld.c libapr-1.lib libaprutil-1.lib libapriconv-1.lib libhttpd.lib 

10. Copy mod_helloworld.so into Apache's modules directory if apxs -i has not already installed it there
11. Open httpd.conf in the conf directory and add
/*
LoadModule helloworld_module  "D:/wamp/bin/apache/APACHE~1.9/modules/mod_helloworld.so"
<Location /helloworld>
    SetHandler helloworld
</Location>
*/

12. Restart Apache
13. http://localhost/helloworld      

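0x4: A generic module skeleton

#include <string.h>
#include "httpd.h"
#include "http_config.h"
#include "http_protocol.h"
#include "ap_config.h"

/* 
The sample content handler 
First we need a function that actually handles client requests (the handler). It is conventionally named "<module name>_handler", takes a pointer to a request_rec, and returns an int status value
The request_rec pointer gives access to all information about the client connection as well as to Apache internals such as the connection record, tables, and memory pools. It plays a role similar to the HttpRequest and HttpResponse objects of a servlet in J2EE development. Through request_rec we can read the client's request data, write response data, and obtain details of the request (client browser type, encoding, and so on)
*/
static int helloworld_handler(request_rec *r)
{
    /* r->handler can be NULL when no handler is configured, so check before strcmp */
    if (!r->handler || strcmp(r->handler, "helloworld")) {
        return DECLINED;
    }
    r->content_type = "text/html";

    /* For a HEAD request only the headers are sent, so skip the body */
    if (!r->header_only)
        ap_rputs("The sample page from mod_helloworld.c\n", r);
    return OK;
}

//The registration function, conventionally named "<module name>_register_hooks", receives an Apache memory pool pointer. It tells Apache when and how the handler function should be invoked
static void helloworld_register_hooks(apr_pool_t *p)
{
    ap_hook_handler(helloworld_handler, NULL, NULL, APR_HOOK_MIDDLE);
}

/* 
Dispatch list for API hooks 
The module definition. Apache's module loader uses this structure to call the right function at the right time while processing a request. Note that the first member is always STANDARD20_MODULE_STUFF and the last member is the registration function
*/
module AP_MODULE_DECLARE_DATA helloworld_module = {
    STANDARD20_MODULE_STUFF, 
    NULL,                  /* create per-dir    config structures */
    NULL,                  /* merge  per-dir    config structures */
    NULL,                  /* create per-server config structures */
    NULL,                  /* merge  per-server config structures */
    NULL,                  /* table of config file commands       */
    helloworld_register_hooks  /* register hooks                      */
};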
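
The last argument of ap_hook_handler, APR_HOOK_MIDDLE, is an ordering hint. The sketch below is a rough standalone model of what that hint means, not APR's actual implementation: `hook_entry` and `sort_hooks` are hypothetical names, and real APR additionally honors the predecessor and successor lists (the two NULL arguments above). Only the five numeric constants are taken from apr_hooks.h.

```c
#include <stddef.h>

/* Same numeric values as apr_hooks.h. */
#define APR_HOOK_REALLY_FIRST (-10)
#define APR_HOOK_FIRST          0
#define APR_HOOK_MIDDLE        10
#define APR_HOOK_LAST          20
#define APR_HOOK_REALLY_LAST   30

/* Hypothetical registry entry: a module name plus its ordering hint. */
typedef struct {
    const char *name;
    int order;
} hook_entry;

/* Stable insertion sort: lower order values run earlier, and hooks with
 * the same order keep their registration order. */
static void sort_hooks(hook_entry *hooks, size_t n)
{
    size_t i, j;
    for (i = 1; i < n; i++) {
        hook_entry cur = hooks[i];
        for (j = i; j > 0 && hooks[j - 1].order > cur.order; j--)
            hooks[j] = hooks[j - 1];
        hooks[j] = cur;
    }
}
```

Under this model, a module registered with APR_HOOK_FIRST gets the chance to claim a request before one registered with APR_HOOK_MIDDLE, which is why most content handlers simply use the middle value.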

0x5: The core data structure of an Apache module: request_rec

The request_rec request record is the heart and soul of the Apache API. It contains everything you could ever want to know about the current request and then some.

\apache2.4.9\include\httpd.h

/**
 * @brief A structure that represents the current request
 */
struct request_rec 
{
    /** 
    The pool associated with the request 
    This is a resource pool that is valid for the lifetime of the request. Your request-time handlers should allocate memory from this pool.
    */
    apr_pool_t *pool;

    /** 
    The connection to the client 
    This is a pointer to the connection record for the current request, 
    from which you can derive information about the local and remote host addresses, as well as the username used during authentication
    */
    conn_rec *connection;

    /** 
    The virtual host for this request 
    This is a pointer to a server record server_rec structure, from which you can gather information about the current server.
    */
    server_rec *server;

    /*
    Under various circumstances, including subrequests and internal redirects, 
    Apache will generate one or more subrequests that are identical in all respects to an ordinary request. 
    When this happens, these fields are used to chain the subrequests into a linked list. 
    1. The next field points to the more recent request (or NULL, if there is none), 
    2. and the prev field points to the immediate ancestor of the request. 
    3. main points back to the top-level request.
    */
    /** Pointer to the redirected request if this is an external redirect */
    request_rec *next;
    /** Pointer to the previous request if this is an internal redirect */
    request_rec *prev; 
    /** Pointer to the main request if this is a sub-request
     * (see http_request.h) */
    request_rec *main;

    /* Info about the request itself... we begin with stuff that only
     * protocol.c should ever touch...
     */
    /** First line of request This contains the first line of the request, for logging purposes. */
    char *the_request;

    /** HTTP/0.9, "simple" request (e.g. GET /foo\n w/no headers) */
    int assbackwards;

    /** 
    A proxy request (calculated during post_read_request/translate_name) possible values PROXYREQ_NONE, PROXYREQ_PROXY, PROXYREQ_REVERSE, PROXYREQ_RESPONSE
    If the current request is a proxy request, then this field will be set to a true (nonzero) value. Note that mod_proxy or mod_perl must be configured with the server for automatic proxy request detection. You can also set it yourself in order to activate Apache's proxy mechanism
    */
    int proxyreq;

    /** 
    HEAD request, as opposed to GET 
    This field will be true if the remote client made a head-only request (i.e., HEAD). You should not change the value of this field. 
    */
    int header_only;

    /** Protocol version number of protocol; 1.1 = 1001 */
    int proto_num;
    /** 
    Protocol string, as given to us, or HTTP/0.9 
    This field contains the name and version number of the protocol requested by the browser, for example HTTP/1.0.
    */
    char *protocol;

    /** 
    Host, as set by full URI or Host: 
    This contains the name of the host requested by the client, either within the URI (during proxy requests) or in the Host header. 
    The value of this field may not correspond to the canonical name of your server or the current virtual host but can be any of its DNS aliases. 
    For this reason, it is better to use the ap_get_server_name() API function call described under "Processing Requests."
    The hostname may simply be whichever DNS name the client used to reach the server
    */
    const char *hostname;

    /** 
    Time when the request started 
    This is the time that the request started, stored as an apr_time_t (microseconds since the Unix epoch) rather than a C time_t. 
    */
    apr_time_t request_time;

    /** 
    Status line, if set by script 
    This field holds the full text of the status line returned from Apache to the remote browser, for example 200 OK. 
    Ordinarily you will not want to change this directly but will allow Apache to set it based on the return value from your handler. 
    However, you can change it directly in the rare instance that you want your handler to lie to Apache about its intentions 
    (e.g., tell Apache that the handler processed the transaction OK, but send an error message to the browser).
    */
    const char *status_line;
    /** Status line */
    int status;

    /* Request method, two ways; also, protocol, etc..  Outside of protocol.c,
     * look, but don't touch.
     */

    /** M_GET, M_POST, etc. */
    int method_number;
    /** Request method (eg. GET, HEAD, POST, etc.) */
    const char *method;

    /**
     *  'allowed' is a bitvector of the allowed methods.
     *
     *  A handler must ensure that the request method is one that
     *  it is capable of handling.  Generally modules should DECLINE
     *  any request methods they do not handle.  Prior to aborting the
     *  handler like this the handler should set r->allowed to the list
     *  of methods that it is willing to handle.  This bitvector is used
     *  to construct the "Allow:" header required for OPTIONS requests,
     *  and HTTP_METHOD_NOT_ALLOWED and HTTP_NOT_IMPLEMENTED status codes.
     *
     *  Since the default_handler deals with OPTIONS, all modules can
     *  usually decline to deal with OPTIONS.  TRACE is always allowed,
     *  modules don't need to set it explicitly.
     *
     *  Since the default_handler will always handle a GET, a
     *  module which does *not* implement GET should probably return
     *  HTTP_METHOD_NOT_ALLOWED.  Unfortunately this means that a Script GET
     *  handler can't be installed by mod_actions.
     */
    apr_int64_t allowed;
    /** Array of extension methods */
    apr_array_header_t *allowed_xmethods;
    /** List of allowed methods */
    ap_method_list_t *allowed_methods;

    /** byte count in stream is for body */
    apr_off_t sent_bodyct;
    /** body byte count, for easy access */
    apr_off_t bytes_sent;
    /** Last modified time of the requested resource */
    apr_time_t mtime;

    /* HTTP/1.1 connection-level features */

    /** The Range: header */
    const char *range;
    /** The "real" content length */
    apr_off_t clength;
    /** sending chunked transfer-coding */
    int chunked;

    /** Method for reading the request body
     * (eg. REQUEST_CHUNKED_ERROR, REQUEST_NO_BODY,
     *  REQUEST_CHUNKED_DECHUNK, etc...) */
    int read_body;
    /** reading chunked transfer-coding */
    int read_chunked;
    /** is client waiting for a 100 response? */
    unsigned expecting_100;
    /** The optional kept body of the request. */
    apr_bucket_brigade *kept_body;
    /** For ap_body_to_table(): parsed body */
    /* XXX: ap_body_to_table has been removed. Remove body_table too or
     * XXX: keep it to reintroduce ap_body_to_table without major bump? */
    apr_table_t *body_table;
    /** Remaining bytes left to read from the request body */
    apr_off_t remaining;
    /** Number of bytes that have been read  from the request body */
    apr_off_t read_length;

    /* MIME header environments, in and out.  Also, an array containing
     * environment variables to be passed to subprocesses, so people can
     * write modules to add to that environment.
     *
     * The difference between headers_out and err_headers_out is that the
     * latter are printed even on error, and persist across internal redirects
     * (so the headers printed for ErrorDocument handlers will have them).
     *
     * The 'notes' apr_table_t is for notes from one module to another, with no
     * other set purpose in mind...
     */

    /** MIME header environment from the request */
    apr_table_t *headers_in;
    /** MIME header environment for the response */
    apr_table_t *headers_out;
    /** MIME header environment for the response, printed even on errors and
     * persist across internal redirects */
    apr_table_t *err_headers_out;
    /** Array of environment variables to be used for sub processes */
    apr_table_t *subprocess_env;
    /** Notes from one module to another */
    apr_table_t *notes;

    /* content_type, handler, content_encoding, and all content_languages
     * MUST be lowercased strings.  They may be pointers to static strings;
     * they should not be modified in place.
     */
    /** The content-type for the current request */
    const char *content_type;   /* Break these out --- we dispatch on 'em */
    /** The handler string that we use to call a handler function */
    const char *handler;        /* What we *really* dispatch on */

    /** How to encode the data */
    const char *content_encoding;
    /** Array of strings representing the content languages */
    apr_array_header_t *content_languages;

    /** variant list validator (if negotiated) */
    char *vlist_validator;

    /** If an authentication check was made, this gets set to the user name. */
    char *user;
    /** If an authentication check was made, this gets set to the auth type. */
    char *ap_auth_type;

    /* What object is being requested (either directly, or via include
     * or content-negotiation mapping).
     */

    /** The URI without any parsing performed */
    char *unparsed_uri;
    /** The path portion of the URI, or "/" if no path provided */
    char *uri;
    /** The filename on disk corresponding to this response */
    char *filename;
    /* XXX: What does this mean? Please define "canonicalize" -aaron */
    /** The true filename, we canonicalize r->filename if these don't match */
    char *canonical_filename;
    /** The PATH_INFO extracted from this request */
    char *path_info;
    /** The QUERY_ARGS extracted from this request */
    char *args;

    /**
     * Flag for the handler to accept or reject path_info on
     * the current request.  All modules should respect the
     * AP_REQ_ACCEPT_PATH_INFO and AP_REQ_REJECT_PATH_INFO
     * values, while AP_REQ_DEFAULT_PATH_INFO indicates they
     * may follow existing conventions.  This is set to the
     * user's preference upon HOOK_VERY_FIRST of the fixups.
     */
    int used_path_info;

    /** A flag to determine if the eos bucket has been sent yet */
    int eos_sent;

    /* Various other config info which may change with .htaccess files
     * These are config vectors, with one void* pointer for each module
     * (the thing pointed to being the module's business).
     */

    /** Options set in config files, etc. */
    struct ap_conf_vector_t *per_dir_config;
    /** Notes on *this* request */
    struct ap_conf_vector_t *request_config;

    /** Optional request log level configuration. Will usually point
     *  to a server or per_dir config, i.e. must be copied before
     *  modifying */
    const struct ap_logconf *log;

    /** Id to identify request in access and error log. Set when the first
     *  error log entry for this request is generated.
     */
    const char *log_id;

    /**
     * A linked list of the .htaccess configuration directives
     * accessed by this request.
     * N.B. always add to the head of the list, _never_ to the end.
     * that way, a sub request's list can (temporarily) point to a parent's list
     */
    const struct htaccess_result *htaccess;

    /** A list of output filters to be used for this request */
    struct ap_filter_t *output_filters;
    /** A list of input filters to be used for this request */
    struct ap_filter_t *input_filters;

    /** A list of protocol level output filters to be used for this
     *  request */
    struct ap_filter_t *proto_output_filters;
    /** A list of protocol level input filters to be used for this
     *  request */
    struct ap_filter_t *proto_input_filters;

    /** This response can not be cached */
    int no_cache;
    /** There is no local copy of this response */
    int no_local_copy;

    /** Mutex protect callbacks registered with ap_mpm_register_timed_callback
     * from being run before the original handler finishes running
     */
    apr_thread_mutex_t *invoke_mtx;

    /** A struct containing the components of URI */
    apr_uri_t parsed_uri;
    /**  finfo.protection (st_mode) set to zero if no such file */
    apr_finfo_t finfo;

    /** remote address information from conn_rec, can be overridden if
     * necessary by a module.
     * This is the address that originated the request.
     */
    apr_sockaddr_t *useragent_addr;
    char *useragent_ip;
};      
http://www.cnblogs.com/QRcode/p/3193397.html
http://blog.csdn.net/hxsstar/article/details/19820029
https://publib.boulder.ibm.com/iseries/v5r1/ic2924/info/rzaie/APR/structrequest__rec.html
http://docstore.mik.ua/orelly/apache_mod/128.htm
http://blog.csdn.net/wind_cludy/article/details/6557776
http://docstore.mik.ua/orelly/apache_mod/128.htm#listing10_1      
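
Two of the fields above use compact encodings that are easy to misread: proto_num ("1.1 = 1001") and the allowed bitvector. The following standalone sketch spells both out. The helper function names are hypothetical; the encoding rule and the first four method numbers follow httpd.h.

```c
#include <stdint.h>

/* Method numbers as defined in httpd.h (first few only). A HEAD request
 * is reported as M_GET with header_only set. */
enum { M_GET = 0, M_PUT = 1, M_POST = 2, M_DELETE = 3 };

/* proto_num packs "major.minor" as 1000 * major + minor. */
static int http_version(int major, int minor) { return 1000 * major + minor; }
static int http_version_major(int proto_num)  { return proto_num / 1000; }
static int http_version_minor(int proto_num)  { return proto_num % 1000; }

/* 'allowed' holds one bit per method number. */
static int64_t method_bit(int method_number)
{
    return (int64_t)1 << method_number;
}

static int method_allowed(int64_t allowed, int method_number)
{
    return (allowed & method_bit(method_number)) != 0;
}
```

A handler that supports only GET and POST would therefore set allowed to `method_bit(M_GET) | method_bit(M_POST)` before returning HTTP_METHOD_NOT_ALLOWED for anything else.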

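2. Advanced modules: an echo module that receives client data

If an Apache module could only generate content, a plain HTML file (served by httpd's default content generator) would do the job just as well. The point of a module is that it can easily process the data sent by the client, transform it, and use the result to answer the request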

/* 
**  mod_helloworld.c -- Apache sample helloworld module
**  [Autogenerated via ``apxs -n helloworld -g'']
**
**  To play with this sample module first compile it into a
**  DSO file and install it into Apache's modules directory 
**  by running:
**
**    $ apxs -c -i mod_helloworld.c
**
**  Then activate it in Apache's httpd.conf file for instance
**  for the URL /helloworld in as follows:
**
**    #   httpd.conf
**    LoadModule helloworld_module modules/mod_helloworld.so
**    <Location /helloworld>
**    SetHandler helloworld
**    </Location>
**
**  Then after restarting Apache via
**
**    $ apachectl restart
**
**  you immediately can request the URL /helloworld and watch for the
**  output of this module. This can be achieved for instance via:
**
**    $ lynx -mime_header http://localhost/helloworld 
**
**  The output should be similar to the following one:
**
**    HTTP/1.1 200 OK
**    Date: Tue, 31 Mar 1998 14:42:22 GMT
**    Server: Apache/1.3.4 (Unix)
**    Connection: close
**    Content-Type: text/html
**  
**    The sample page from mod_helloworld.c
*/ 

#include <stdlib.h>
#include <string.h>
#include "httpd.h"
#include "http_config.h"
#include "http_protocol.h"
#include "ap_config.h" 

#define DFT_BUF_SIZE 1024

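/** 
 * @brief read_post_data reads the POST data of a request into a buffer
 * 
 * @param req         Apache request_rec object
 * @param post        receive buffer (malloc'ed by the caller, may be reallocated here)
 * @param post_size   in: capacity of the receive buffer; out: number of bytes read
 * 
 * @return OK on success, otherwise an HTTP error status
 */ 
 static int read_post_data(request_rec *req, char **post, size_t *post_size)
 { 
    char buffer[DFT_BUF_SIZE] = {0}; 
    long bytes = 0;                 /* ap_get_client_block returns a long, negative on error */
    size_t count = 0, offset = 0; 
    size_t capacity = *post_size; 

    if(ap_setup_client_block(req, REQUEST_CHUNKED_DECHUNK) != OK)
    { 
        return HTTP_BAD_REQUEST; 
    } 

    if(ap_should_client_block(req))
    { 
        //Read the POSTed data block by block with the Apache API ap_get_client_block
        for(bytes = ap_get_client_block(req, buffer, DFT_BUF_SIZE);  bytes > 0;  bytes = ap_get_client_block(req, buffer, DFT_BUF_SIZE))
        { 
            //If the preallocated buffer is too small, grow it and remember the new capacity
            count += (size_t)bytes; 
            if(count > capacity)
            { 
                //Keep the old pointer until realloc succeeds so the caller can still free it
                char *grown = (char *)realloc(*post, count); 
                if(grown == NULL)
                { 
                    return HTTP_INTERNAL_SERVER_ERROR; 
                } 
                *post = grown; 
                capacity = count; 
            } 
            offset = count - (size_t)bytes; 
            memcpy(*post + offset, buffer, (size_t)bytes); 
        } 
        if(bytes < 0)
        { 
            return HTTP_BAD_REQUEST; 
        } 
        *post_size = count; 
    }
    else
    { 
        *post_size = 0; 
        return OK; 
    } 

    return OK; 
 }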

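/* 
The sample content handler 
First we need a function that actually handles client requests (the handler). It is conventionally named "<module name>_handler", takes a pointer to a request_rec, and returns an int status value
The request_rec pointer gives access to all information about the client connection as well as to Apache internals such as the connection record, tables, and memory pools. It plays a role similar to the HttpRequest and HttpResponse objects of a servlet in J2EE development. Through request_rec we can read the client's request data, write response data, and obtain details of the request (client browser type, encoding, and so on)
*/
static int helloworld_handler(request_rec *r)
{
    int ret;
    char *post = NULL;
    size_t post_size = 0;

    /* r->handler can be NULL when no handler is configured, so check before strcmp */
    if (!r->handler || strcmp(r->handler, "helloworld")) 
    {
        return DECLINED;
    }
    //Only accept GET and POST requests
    if((r->method_number != M_GET) && (r->method_number != M_POST))
    { 
        return HTTP_METHOD_NOT_ALLOWED; 
    } 

    post = (char *)malloc(sizeof(char) * DFT_BUF_SIZE); 
    post_size = DFT_BUF_SIZE; 
    if(post == NULL)
    { 
        return HTTP_INTERNAL_SERVER_ERROR; 
    }
    memset(post, '\0', post_size); 

    //Read the POST data
    ret = read_post_data(r, &post, &post_size); 
    if(ret != OK)
    { 
        free(post); 
        post = NULL; 
        post_size = 0; 
        return ret; 
    }  

    ap_set_content_type(r, "text/html;charset=utf-8"); 

    if(post_size == 0)
    { 
        //Release the buffer on this path too, and let Apache work out the length of the notice
        free(post); 
        ap_rputs("no post data found", r); 
        return OK; 
    } 

    ap_set_content_length(r, post_size); 

    //The buffer is not NUL-terminated, so write exactly post_size bytes
    ap_rwrite(post, (int)post_size, r); 

    free(post); 
    post = NULL; 
    post_size = 0; 

    return OK; 
}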

//The registration function, conventionally named "<module name>_register_hooks", receives an Apache memory pool pointer. It tells Apache when and how the handler function should be invoked
static void helloworld_register_hooks(apr_pool_t *p)
{
    ap_hook_handler(helloworld_handler, NULL, NULL, APR_HOOK_MIDDLE);
}

/* 
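Dispatch list for API hooks 
The module definition. Apache's module loader uses this structure to call the right function at the right time while processing a request. Note that the first member is always STANDARD20_MODULE_STUFF and the last member is the registration function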
*/
module AP_MODULE_DECLARE_DATA helloworld_module = {
    STANDARD20_MODULE_STUFF, 
    NULL,                  /* create per-dir    config structures */
    NULL,                  /* merge  per-dir    config structures */
    NULL,                  /* create per-server config structures */
    NULL,                  /* merge  per-server config structures */
    NULL,                  /* table of config file commands       */
    helloworld_register_hooks  /* register hooks                      */
};      
http://www.ibm.com/developerworks/cn/opensource/os-cn-apachehttpd/      
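
The core of read_post_data is a "read a block, grow the buffer, append" loop. The standalone sketch below isolates that loop so it can be tried without a running server. `fake_body`, `fake_get_block`, and `read_all` are hypothetical stand-ins (the real module reads through ap_get_client_block), and the doubling growth policy is a choice made for this sketch rather than what the module above does.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a request body that is handed out in pieces. */
typedef struct {
    const char *body;
    size_t len;
    size_t pos;
} fake_body;

/* Copies up to 'cap' bytes and returns how many were copied, 0 at the end.
 * (ap_get_client_block can additionally return a negative value on error.) */
static long fake_get_block(fake_body *src, char *buf, size_t cap)
{
    size_t left = src->len - src->pos;
    size_t n = left < cap ? left : cap;
    memcpy(buf, src->body + src->pos, n);
    src->pos += n;
    return (long)n;
}

/* Reads the whole body into a malloc'ed, NUL-terminated buffer.
 * 'chunk' must be greater than zero. Returns NULL on failure; the
 * caller frees the result. */
static char *read_all(fake_body *src, size_t chunk, size_t *out_len)
{
    size_t used = 0, cap = chunk + 1;   /* +1 leaves room for the terminator */
    char *data = malloc(cap);
    char *block = malloc(chunk);
    long n;

    if (!data || !block) { free(data); free(block); return NULL; }

    while ((n = fake_get_block(src, block, chunk)) > 0) {
        if (used + (size_t)n + 1 > cap) {
            char *bigger;
            cap = (used + (size_t)n) * 2 + 1;
            bigger = realloc(data, cap);   /* old pointer stays valid on failure */
            if (!bigger) { free(data); free(block); return NULL; }
            data = bigger;
        }
        memcpy(data + used, block, (size_t)n);
        used += (size_t)n;
    }
    free(block);
    if (n < 0) { free(data); return NULL; }

    data[used] = '\0';
    *out_len = used;
    return data;
}
```

Assigning the result of realloc to a temporary first is the important detail: overwriting the only pointer to the buffer with NULL would leak the data already read.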

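3. Advanced modules: a configurable echo module

We now extend the echo module from the previous example (called echo_post here) into a configurable module: by changing the value of ConvertType in the configuration file httpd.conf, the module's run-time behavior can be changed

0x1: Reading configuration data

typedef struct{ 
    int convert_type; // conversion type
 }cust_config_t;

This structure has a single member, convert_type, which selects the conversion. If the value is set to 0 in the configuration file, the data POSTed by the client is converted to upper case; if it is 1, the data is converted to lower case. The module's run-time behavior can thus be changed through configuration alone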
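
The conversion rule itself is independent of Apache and can be sketched as a small standalone function. `convert_case` and the two enum names are hypothetical, introduced only for this illustration; the 0 and 1 values follow the ConvertType convention described here.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical names for the two ConvertType values. */
enum { CONVERT_UPPER = 0, CONVERT_LOWER = 1 };

/* Converts buf[0..len) in place. Any other convert_type leaves the
 * data untouched. */
static void convert_case(char *buf, size_t len, int convert_type)
{
    size_t i;
    for (i = 0; i < len; i++) {
        /* Go through unsigned char: passing a negative char value to
         * toupper/tolower is undefined behavior. */
        unsigned char c = (unsigned char)buf[i];
        if (convert_type == CONVERT_UPPER)
            buf[i] = (char)toupper(c);
        else if (convert_type == CONVERT_LOWER)
            buf[i] = (char)tolower(c);
    }
}
```

Working on an explicit length rather than a NUL terminator matters here, because a request body is raw bytes and is not guaranteed to be terminated.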

//create_config creates the user-defined configuration structure
static void *create_config(apr_pool_t *pool, server_rec *server); 

//set_mod_config sets the members of the configuration structure; it is registered in the command_rec array
static const char *set_mod_config(cmd_parms *params, void *config, const char *arg);

//The command_rec array in turn is stored in the module declaration structure: define an array of command_rec entries
static const command_rec cust_echo_cmds[] = 
{ 
    AP_INIT_TAKE1("ConvertType", 
             set_mod_config, 
             NULL, 
             RSRC_CONF, 
            "convert type of post data"), 
    {0} 
};

//Register the module callbacks
 /* Dispatch list for API hooks */ 
 module AP_MODULE_DECLARE_DATA cust_echo_post_module = { 
    STANDARD20_MODULE_STUFF, 
    NULL,                  /* create per-dir    config structures */ 
    NULL,                  /* merge  per-dir    config structures */ 
    create_config,       /* create per-server config structures */ 
    NULL,                  /* merge  per-server config structures */ 
    cust_echo_cmds,      /* table of config file commands       */ 
    cust_echo_post_register_hooks  /* register hooks                      */ 
 };      

0x2: Code

/* 
**  mod_cust_echo_post.c -- Apache sample cust_echo_post module
**  [Autogenerated via ``apxs -n cust_echo_post -g'']
**
**  To play with this sample module first compile it into a
**  DSO file and install it into Apache's modules directory 
**  by running:
**
**    $ apxs -c -i mod_cust_echo_post.c
**
**  Then activate it in Apache's httpd.conf file for instance
**  for the URL /cust_echo_post in as follows:
**
**    #   httpd.conf
**    LoadModule cust_echo_post_module modules/mod_cust_echo_post.so
**    <Location /cust_echo_post>
**    SetHandler cust_echo_post
**    </Location>
**
**  Then after restarting Apache via
**
**    $ apachectl restart
**
**  you immediately can request the URL /cust_echo_post and watch for the
**  output of this module. This can be achieved for instance via:
**
**    $ lynx -mime_header http://localhost/cust_echo_post 
**
**  The output should be similar to the following one:
**
**    HTTP/1.1 200 OK
**    Date: Tue, 31 Mar 1998 14:42:22 GMT
**    Server: Apache/1.3.4 (Unix)
**    Connection: close
**    Content-Type: text/html
**  
**    The sample page from mod_cust_echo_post.c
*/ 

#include <ctype.h>
#include <stdlib.h>
#include <string.h>
#include "httpd.h"
#include "http_config.h"
#include "http_protocol.h"
#include "ap_config.h"

#define DFT_BUF_SIZE 4096

module AP_MODULE_DECLARE_DATA cust_echo_post_module;

static void *create_config(apr_pool_t *pool, server_rec *server);
static const char *set_mod_config(cmd_parms *params, void *config, const char *arg);

typedef struct
{
    int convert_type; //conversion type: 0 = upper case, 1 = lower case
}cust_config_t;

static const command_rec cust_echo_cmds[] = 
{
    AP_INIT_TAKE1("ConvertType", set_mod_config, NULL, RSRC_CONF, "convert type of post data"), {0}
};

static void *create_config(apr_pool_t *pool, server_rec *server)
{
    cust_config_t *config;
    config = (cust_config_t *)apr_pcalloc(pool, sizeof(cust_config_t));
    return (void *)config;
}

static const char *set_mod_config(cmd_parms *params, void *conf, const char *arg)
{
    cust_config_t *config = ap_get_module_config(params->server->module_config, &cust_echo_post_module);

    if(strcmp(params->cmd->name, "ConvertType") == 0)
    {
        config->convert_type = atoi((char *)arg);
    }

    return NULL;
}

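/**
 * @brief read_post_data reads the POST data of a request into a buffer
 *
 * @param req         Apache request_rec object
 * @param post        receive buffer (malloc'ed by the caller, may be reallocated here)
 * @param post_size   in: capacity of the receive buffer; out: number of bytes read
 *
 * @return OK on success, otherwise an HTTP error status
 */
static int read_post_data(request_rec *req, char **post, size_t *post_size){
    char buffer[DFT_BUF_SIZE] = {0};
    long bytes = 0;                 /* ap_get_client_block returns a long, negative on error */
    size_t count = 0, offset = 0;
    size_t capacity = *post_size;

    if(ap_setup_client_block(req, REQUEST_CHUNKED_DECHUNK) != OK){
        return HTTP_BAD_REQUEST;
    }

    if(ap_should_client_block(req)){
        for(bytes = ap_get_client_block(req, buffer, DFT_BUF_SIZE);
                bytes > 0;
                bytes = ap_get_client_block(req, buffer, DFT_BUF_SIZE)){
            count += (size_t)bytes;
            if(count > capacity){
                /* keep the old pointer until realloc succeeds so the caller can still free it */
                char *grown = (char *)realloc(*post, count);
                if(grown == NULL){
                    return HTTP_INTERNAL_SERVER_ERROR;
                }
                *post = grown;
                capacity = count;
            }
            offset = count - (size_t)bytes;
            memcpy(*post + offset, buffer, (size_t)bytes);
        }
        if(bytes < 0){
            return HTTP_BAD_REQUEST;
        }
        *post_size = count;
    }else{
        *post_size = 0;
        return OK; 
    }

    return OK;
}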

/* The sample content handler */
static int cust_echo_post_handler(request_rec *req)
{
    /* declarations first, so the file also builds with C89 compilers such as cl.exe from VS2010 */
    char *post = NULL;
    size_t post_size = DFT_BUF_SIZE;
    size_t i = 0;
    int ret;
    cust_config_t *config;

    /* req->handler can be NULL when no handler is configured, so check before strcmp */
    if (!req->handler || strcmp(req->handler, "cust_echo_post")) 
    {
        return DECLINED;
    }

    if((req->method_number != M_GET) && (req->method_number != M_POST))
    {
        return HTTP_METHOD_NOT_ALLOWED;
    }

    post = (char *)malloc(sizeof(char)*DFT_BUF_SIZE);
    if(post == NULL)
    {
        return HTTP_INTERNAL_SERVER_ERROR;
    }
    
    memset(post, '\0', post_size);

    ret = read_post_data(req, &post, &post_size);
    if(ret != OK)
    {
        free(post);
        return ret;
    }

    ap_set_content_type(req, "text/html;charset=utf-8");

    if(post_size == 0)
    {
        /* release the buffer on this path too */
        free(post);
        ap_rputs("no post data found", req);
        return OK;
    }
    
    config = ap_get_module_config(req->server->module_config, &cust_echo_post_module);
    if(config == NULL)
    {
        free(post);
        return HTTP_INTERNAL_SERVER_ERROR;
    }
    
    //convert the data in place according to convert_type
    switch(config->convert_type)
    {
        case 0:
            for(i = 0; i < post_size; i++)
            {
                post[i] = (char)toupper((unsigned char)post[i]);
            }
            break;
        case 1:
            for(i = 0; i < post_size; i++)
            {
                post[i] = (char)tolower((unsigned char)post[i]);
            }
            break;
        default:
            break;
    }

    ap_set_content_length(req, post_size);

    //the buffer is not NUL-terminated, so write exactly post_size bytes
    ap_rwrite(post, (int)post_size, req);

    free(post);
    post = NULL;
    post_size = 0;
    
    return OK;
}

static void cust_echo_post_register_hooks(apr_pool_t *p)
{
    ap_hook_handler(cust_echo_post_handler, NULL, NULL, APR_HOOK_MIDDLE);
}

/* Dispatch list for API hooks */
module AP_MODULE_DECLARE_DATA cust_echo_post_module = {
    STANDARD20_MODULE_STUFF, 
    NULL,                  /* create per-dir    config structures */
    NULL,                  /* merge  per-dir    config structures */
    create_config,                  /* create per-server config structures */
    NULL,                  /* merge  per-server config structures */
    cust_echo_cmds,                  /* table of config file commands       */
    cust_echo_post_register_hooks  /* register hooks                      */
};      

Running the configurable echo module

LoadModule cust_echo_post_module "D:/wamp/bin/apache/APACHE~1.9/modules/mod_cust_echo_post.so"
<Location /cust_echo_post> 
 SetHandler cust_echo_post 
 </Location> 

 #configure for cust_echo_post 
 ConvertType 0      

Setting ConvertType to 1 instead switches the module to lower-case conversion

LoadModule cust_echo_post_module "D:/wamp/bin/apache/APACHE~1.9/modules/mod_cust_echo_post.so"
<Location /cust_echo_post> 
 SetHandler cust_echo_post 
 </Location> 

 #configure for cust_echo_post 
 ConvertType 1      
http://www.ibm.com/developerworks/cn/opensource/os-cn-apachehttpd/      

4. Advanced mod: filters

A filter is in fact just another form of module. Apache wraps its common data structures and ships them as a library, the APR (Apache Portable Runtime). Two data structures matter most in a filter

1. apr_bucket
2. apr_bucket_brigade: an apr_bucket_brigade is essentially a ring-shaped queue, and each apr_bucket is an element of that queue
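
To make the "ring-shaped queue" idea concrete, here is a simplified stand-alone model. It is only a sketch: model_bucket, model_brigade and the three functions are hypothetical names invented for illustration, and APR's real ring macros differ in detail. What it does show is the general technique of a circular doubly linked list closed by a sentinel node.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for apr_bucket: one chunk of data plus ring links. */
typedef struct model_bucket {
    struct model_bucket *next;
    struct model_bucket *prev;
    const char *data;
    size_t len;
} model_bucket;

/* Simplified stand-in for apr_bucket_brigade: the sentinel closes the ring. */
typedef struct {
    model_bucket sentinel;
} model_brigade;

/* An empty brigade is a sentinel that points at itself. */
static void brigade_init(model_brigade *bb)
{
    assert(bb != NULL);
    bb->sentinel.next = &bb->sentinel;
    bb->sentinel.prev = &bb->sentinel;
    bb->sentinel.data = NULL;
    bb->sentinel.len = 0;
}

/* Append a bucket at the tail, i.e. just before the sentinel. */
static void brigade_insert_tail(model_brigade *bb, model_bucket *b)
{
    assert(bb != NULL && b != NULL);
    b->prev = bb->sentinel.prev;
    b->next = &bb->sentinel;
    bb->sentinel.prev->next = b;
    bb->sentinel.prev = b;
}

/* Walk from the first bucket until the sentinel is reached again. */
static size_t brigade_length(const model_brigade *bb)
{
    size_t total = 0;
    const model_bucket *b;

    for (b = bb->sentinel.next; b != &bb->sentinel; b = b->next)
        total += b->len;
    return total;
}
```

The loop in brigade_length has the same shape as iterating a real brigade from APR_BRIGADE_FIRST to APR_BRIGADE_SENTINEL with APR_BUCKET_NEXT.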

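All filters form one long chain: data flows in from the previous filter, is filtered, and the processed data then flows on to the next filter. While a single HTTP transaction is being handled, a given filter may be called several times, as successive chunks of data arrive in separate bucket brigades. For all but the most trivial filters, this means the filter must be able to keep some kind of context between calls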
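
Why context matters is easiest to see with a transformation that depends on what came before. The sketch below is a stand-alone illustration rather than Apache code: title_ctx and title_filter are hypothetical names, and the chunks are plain character buffers instead of bucket brigades. In a real filter the context would typically be stored in filter->ctx.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical per-request context that survives between calls. */
typedef struct {
    int in_word;   /* non-zero if the previous chunk ended inside a word */
} title_ctx;

/* Capitalize the first letter of every word in one chunk, in place.
 * The data may be split at arbitrary points, so the function needs
 * ctx to know whether a chunk starts in the middle of a word. */
static void title_filter(title_ctx *ctx, char *chunk, size_t len)
{
    size_t i;

    assert(ctx != NULL && chunk != NULL);
    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)chunk[i];

        if (isalpha(c)) {
            if (!ctx->in_word)
                chunk[i] = (char)toupper(c);
            ctx->in_word = 1;
        } else {
            ctx->in_word = 0;
        }
    }
}
```

If "hello world" arrives as the three chunks "hel", "lo wor" and "ld", the saved in_word flag keeps the second and third chunks from being capitalized at their first letter. The plain uppercase filter in this section needs no such state, which is why it can get away without a context.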

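Our filter is very simple: it reads the data from the previous filter, converts the characters to uppercase, and passes the buckets (apr_bucket) on to the next filter. Apache provides a rich API for this series of operations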

0x1: Case-conversion filter

static apr_status_t case_filter(ap_filter_t *filter, apr_bucket_brigade *bbin)
{
    request_rec *req = filter->r;
    conn_rec *con = req->connection;

    apr_bucket *bucket;
    apr_bucket_brigade *bbout;

    //create the output brigade
    bbout = apr_brigade_create(req->pool, con->bucket_alloc);

    //consume the input brigade from its head; APR_BRIGADE_FOREACH is not
    //available in APR 1.x and is unsafe when buckets are removed while iterating
    while(!APR_BRIGADE_EMPTY(bbin))
    {
        const char *data;
        char *buffer;
        apr_size_t data_len, i;
        apr_status_t rv;
        apr_bucket *temp_bucket;

        bucket = APR_BRIGADE_FIRST(bbin);

        //metadata buckets (EOS, FLUSH, ...) carry no data: move them over unchanged
        if(APR_BUCKET_IS_METADATA(bucket))
        {
            APR_BUCKET_REMOVE(bucket);
            APR_BRIGADE_INSERT_TAIL(bbout, bucket);
            continue;
        }

        //read content of current bucket in brigade
        rv = apr_bucket_read(bucket, &data, &data_len, APR_BLOCK_READ);
        if(rv != APR_SUCCESS)
        {
            return rv;
        }

        buffer = apr_bucket_alloc(data_len, con->bucket_alloc);
        for(i = 0; i < data_len; i++)
        {
            //convert
            buffer[i] = apr_toupper(data[i]);
        }

        temp_bucket = apr_bucket_heap_create(buffer, data_len, apr_bucket_free, con->bucket_alloc);
        APR_BRIGADE_INSERT_TAIL(bbout, temp_bucket);

        //the original bucket has been copied, so release it
        apr_bucket_delete(bucket);
    }

    //always hand the converted data to the next filter in the chain
    return ap_pass_brigade(filter->next, bbout);
}

0x2: Registering the filter

static void filter_echo_post_register_hooks(apr_pool_t *p) 
{ 
    ap_register_output_filter(filter_name, case_filter, NULL, AP_FTYPE_RESOURCE); 
}      

0x3: Running the filter module

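Configuring a filter is slightly more involved. In httpd.conf it is not enough to load the filter module with the LoadModule directive; a directive such as AddOutputFilter (or SetOutputFilter) is also needed to specify where the filter applies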

LoadModule filter_echo_post_module modules/mod_filter_echo_post.so 
AddOutputFilter CaseFilter .cf
# This directive applies the CaseFilter filter only to requests for URLs with the .cf extension; other requests are not filtered

0x4: Code Example

/* 
**  mod_filter_echo_post.c -- Apache sample filter_echo_post module
**  [Autogenerated via ``apxs -n filter_echo_post -g'']
**
**  To play with this sample module first compile it into a
**  DSO file and install it into Apache's modules directory 
**  by running:
**
**    $ apxs -c -i mod_filter_echo_post.c
**
**  Then activate it in Apache's httpd.conf file for instance
**  for the URL /filter_echo_post in as follows:
**
**    #   httpd.conf
**    LoadModule filter_echo_post_module modules/mod_filter_echo_post.so
**    <Location /filter_echo_post>
**    SetHandler filter_echo_post
**    </Location>
**
**  Then after restarting Apache via
**
**    $ apachectl restart
**
**  you immediately can request the URL /filter_echo_post and watch for the
**  output of this module. This can be achieved for instance via:
**
**    $ lynx -mime_header http://localhost/filter_echo_post 
**
**  The output should be similar to the following one:
**
**    HTTP/1.1 200 OK
**    Date: Tue, 31 Mar 1998 14:42:22 GMT
**    Server: Apache/1.3.4 (Unix)
**    Connection: close
**    Content-Type: text/html
**  
**    The sample page from mod_filter_echo_post.c
*/ 

#include "httpd.h"
#include "http_config.h"
#include "http_request.h"
#include "http_protocol.h"
#include "ap_config.h"

#include "apr_general.h"
#include "apr_buckets.h"
#include "apr_lib.h"

#include "util_filter.h"

static const char *filter_name = "CaseFilter";

static apr_status_t case_filter(ap_filter_t *filter, 
        apr_bucket_brigade *bbin){
    request_rec *req = filter->r;
    conn_rec *con = req->connection;

    apr_bucket *bucket;
    apr_bucket_brigade *bbout;
    
    bbout = apr_brigade_create(req->pool, con->bucket_alloc);

    /* Consume the input brigade from its head. APR_BRIGADE_FOREACH is not
       available in APR 1.x and is unsafe when buckets are removed. */
    while(!APR_BRIGADE_EMPTY(bbin)){
        const char *data;
        char *buffer;
        apr_size_t data_len, i;
        apr_status_t rv;
        apr_bucket *temp_bucket;

        bucket = APR_BRIGADE_FIRST(bbin);

        /* Metadata buckets (EOS, FLUSH, ...) carry no data: move them over. */
        if(APR_BUCKET_IS_METADATA(bucket)){
            APR_BUCKET_REMOVE(bucket);
            APR_BRIGADE_INSERT_TAIL(bbout, bucket);
            continue;
        }

        rv = apr_bucket_read(bucket, &data, &data_len, APR_BLOCK_READ);
        if(rv != APR_SUCCESS){
            return rv;
        }

        buffer = apr_bucket_alloc(data_len, con->bucket_alloc);
        for(i = 0; i < data_len; i++){
            buffer[i] = apr_toupper(data[i]);    
        }

        temp_bucket = apr_bucket_heap_create(
                buffer, data_len, apr_bucket_free, con->bucket_alloc);
        APR_BRIGADE_INSERT_TAIL(bbout, temp_bucket);

        /* The original bucket has been copied, so release it. */
        apr_bucket_delete(bucket);
    }

    /* Always hand the converted data to the next filter in the chain. */
    return ap_pass_brigade(filter->next, bbout);
}

/*
static apr_status_t case_filter(ap_filter_t *filter, apr_bucket_brigade *bbin){
    request_rec *req = filter->r;
    conn_rec *con = req->connection;

    apr_bucket *bucket; //the bucket of data
    apr_bucket_brigade *bbout;
    
    bbout = apr_brigade_create(req->pool, con->bucket_alloc);

    for(bucket = APR_BRIGADE_FIRST(bbin);
       bucket != APR_BRIGADE_SENTINEL(bbin);
       bucket = APR_BUCKET_NEXT(bucket)){
        char *data, *buffer;
        apr_size_t data_len;
        
        apr_bucket *temp_bucket;

        if(APR_BUCKET_IS_EOS(bucket)){
            apr_bucket *eos = apr_bucket_eos_create(con->bucket_alloc);
            APR_BRIGADE_INSERT_TAIL(bbout, eos);
            continue;
        }

        apr_bucket_read(bucket, &data, &data_len, APR_BLOCK_READ);
        buffer = apr_bucket_alloc(data_len, con->bucket_alloc);
        int i;
        for(i = 0; i < data_len; i++){
            buffer[i] = apr_toupper(data[i]);
        }

        temp_bucket = apr_bucket_heap_create(
                buffer, data_len, apr_bucket_free, con->bucket_alloc);
        APR_BRIGADE_INSERT_TAIL(bbout, temp_bucket);
    }
    
    return ap_pass_brigade(filter->next, bbout);
}
*/

static void filter_echo_post_register_hooks(apr_pool_t *p)
{
    ap_register_output_filter(filter_name, case_filter, NULL, AP_FTYPE_RESOURCE);
}

/* Dispatch list for API hooks */
module AP_MODULE_DECLARE_DATA filter_echo_post_module = {
    STANDARD20_MODULE_STUFF, 
    NULL,                  /* create per-dir    config structures */
    NULL,                  /* merge  per-dir    config structures */
    NULL,                  /* create per-server config structures */
    NULL,                  /* merge  per-server config structures */
    NULL,                  /* table of config file commands       */
    filter_echo_post_register_hooks  /* register hooks                      */
};      

The filter converts the text of the requested .cf file to uppercase before it is sent to the client

http://www.ibm.com/developerworks/cn/opensource/os-cn-apachehttpd/
http://www.ibm.com/developerworks/cn/linux/middleware/l-apache/      

Copyright (c) 2015 LittleHann All rights reserved
