my CouchDB tutorial

  1. History
  2. Features:
  3. Examples
  4. The Futon interface:
  5. About views / Map Reduce:
    1. Two kinds of view
    2. Creating a view:
    3. Map & Reduce
    4. Notes:
    5. Reduce Vs reReduce
    6. group
    7. Debugging Views
  6. Installation:
      1. ubuntu
      2. Building from source:
  7. Erlang Client
  8. benchmark
  9. Good documentation:
    1. Books:

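History

    "Couch" is an acronym for "Cluster Of Unreliable Commodity Hardware", reflecting CouchDB's goal of being highly scalable while offering high availability and reliability, even when running on hardware that is prone to failure. CouchDB was originally written in C++, but in April 2008 the project moved to the Erlang OTP platform for its emphasis on fault tolerance.

    In an interview, Katz said that CouchDB is essentially the core of Lotus Notes, stripped out and distilled to its essentials.

    IBM once funded CouchDB, allowing Katz to work on the project full time.

    In 2009, Katz, Chris Anderson, and some colleagues founded a company called Relaxed. In November of the same year the company received 2 million US dollars in venture capital and was renamed Couchio.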

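Features:

  • NoSQL, document-oriented database
  • Append-only database
  • No central node???
  • Multi-Version Concurrency Control (MVCC)
    • It gives each client a snapshot of the latest version of the database. This means other users cannot see changes until the transaction is committed. Many modern databases have moved from locking to MVCC, including Oracle (since V7) and Microsoft® SQL Server 2005 and later.
  • HTTP interface, accessed through a JSON API
  • Powerful B-tree storage engine
  • Map/Reduce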
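
The MVCC bullet can be illustrated without a server. The sketch below is a toy in-memory store written in Python 3, not CouchDB's implementation: the names `TinyStore` and `ConflictError` and the plain integer revisions are assumptions made for illustration. It shows the one rule a client sees in practice: an update must carry the current revision, otherwise it is rejected as a conflict.

```python
import copy


class ConflictError(Exception):
    """Raised when an update does not carry the current revision (hypothetical name)."""


class TinyStore:
    """Toy in-memory document store with optimistic revision checks.

    Illustration only; this is not how CouchDB stores data.
    """

    def __init__(self):
        self._docs = {}  # doc id -> (revision number, body)

    def get(self, doc_id):
        rev, body = self._docs[doc_id]
        # Return a copy so each reader keeps a stable snapshot of what it read.
        snapshot = copy.deepcopy(body)
        snapshot["_id"] = doc_id
        snapshot["_rev"] = str(rev)
        return snapshot

    def put(self, doc_id, doc):
        body = {k: v for k, v in doc.items() if k not in ("_id", "_rev")}
        if doc_id not in self._docs:
            self._docs[doc_id] = (1, body)
            return "1"
        current_rev, _ = self._docs[doc_id]
        # Reject writers that based their change on an outdated revision.
        if doc.get("_rev") != str(current_rev):
            raise ConflictError("document update conflict")
        self._docs[doc_id] = (current_rev + 1, body)
        return str(current_rev + 1)
```

Two clients that read the same revision can both try to write, but only the first write succeeds; the second has to re-read the document and retry. No reader is ever blocked by a lock.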

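    PS. Everything described above is evolving quickly... including authentication, filtered replication, URL mapping, and so on; whatever is needed is being added rapidly. (It is all but certain that CouchDB will evolve into an AppServer.)

I don't like this. I think it should be a database with a Map Reduce framework, not an AppServer (CouchApp). -Lin Yang 7/23/10 7:57 PM 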

Examples

Get the CouchDB server info:

curl http://127.0.0.1:5984/

{"couchdb":"Welcome","version":"0.11.0"}

Create a DB:

curl -X PUT http://127.0.0.1:5984/wiki

    CouchDB will reply with the following message, if the database does not exist:

{"ok":true}

    or, with a different response message, if the database already exists:

{"error":"file_exists","reason":"The database could not be created, the file already exists."}

Get DB info:

curl -X GET http://127.0.0.1:5984/wiki

{"db_name":"wiki","doc_count":0,"doc_del_count":0,"update_seq":0,

"purge_seq":0,"compact_running":false,"disk_size":79,

"instance_start_time":"1272453873691070","disk_format_version":5}

Delete a DB:

curl -X DELETE http://127.0.0.1:5984/wiki

{"ok":true}

Create a document named apple in the wiki DB:

curl -X PUT http://127.0.0.1:5984/wiki/apple -H "Content-Type: application/json" -d '{}'

{"ok":true,"id":"apple","rev":"1801185866"}

Get the document:

curl -X GET http://127.0.0.1:5984/wiki/apple

{"_id":"apple","_rev":"1801185866"}

Update the document:

curl -X PUT http://localhost:5984/wiki/apple -H "Content-Type: application/json" -d '{ "_rev":"1801185866" ,"a":3}'

{"ok":true,"id":"apple","rev":"2-b5be0b773091"}
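
The curl calls above map directly onto any HTTP library. Here is a minimal Python 3 sketch using only the standard library; the helper names `build_request` and `send` are made up for this example, and the revision string is simply copied from the sample output above. Building the requests needs no server, while `send` assumes CouchDB is listening on 127.0.0.1:5984.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:5984"  # same address as the curl examples


def build_request(method, path, body=None):
    """Build (but do not send) an HTTP request for a CouchDB path."""
    data = None
    headers = {}
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


def send(request):
    """Send a request and decode the JSON reply. Needs a running CouchDB."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# The same steps as the curl examples, expressed as request objects.
create_db = build_request("PUT", "/wiki")
create_doc = build_request("PUT", "/wiki/apple", {})
update_doc = build_request("PUT", "/wiki/apple", {"_rev": "1801185866", "a": 3})
```

With a server running, `send(create_db)` should return the same `{"ok": true}` body shown above. Note that `urlopen` raises `urllib.error.HTTPError` for error statuses, so the `file_exists` reply would arrive as an exception rather than a return value.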

A typical CouchDB view (query):

map: function(doc) {

    if (doc._attachments) {

         emit("with attachment", 1);

     }

     else {

         emit("without attachment", 1);

     }

}

reduce: function(keys, values) {

    return sum(values);

}

curl -s -i -X POST -H 'Content-Type: application/json' \
  -d '{"map": "function(doc){if(doc._attachments) {emit(\"with\",1);} else {emit(\"without\",1);}}",
       "reduce": "function(keys, values) {return sum(values);}"}' \
  'http://localhost:5984/somedb/_temp_view?group=true'
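
Escaping the JavaScript by hand, as in the curl command, is easy to get wrong. One option, sketched here in Python 3, is to keep the functions as ordinary strings and let `json.dumps` do the escaping. The helper name `view_payload` is an assumption for this example; the two functions are the ones from the curl command.

```python
import json

MAP_SOURCE = """function(doc) {
    if (doc._attachments) { emit("with", 1); }
    else { emit("without", 1); }
}"""

REDUCE_SOURCE = "function(keys, values) { return sum(values); }"


def view_payload(map_source, reduce_source=None):
    """Serialize view functions into the JSON body a view request expects."""
    body = {"map": map_source}
    if reduce_source is not None:
        body["reduce"] = reduce_source
    # json.dumps escapes the quotes and newlines inside the JavaScript source.
    return json.dumps(body)


payload = view_payload(MAP_SOURCE, REDUCE_SOURCE)
```

The resulting `payload` string can be sent as the POST body to `/somedb/_temp_view?group=true`. Because newlines are escaped too, the JavaScript can be written across several lines and still produce valid JSON.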

The Futon interface:

http://localhost:5984/_utils/


About views / Map Reduce:

A view is defined by a JavaScript function that maps view keys to values.

Two kinds of view

  • permanent view
    • stored inside special documents called design documents
    • create: PUT a document to http://localhost:5984/{dbname}/_design/my_views
    • query: GET /{dbname}/{docid}/_view/{viewname}, for example /{dbname}/_design/my_views/_view/all_docs
  • temporary view
    • Slow, very expensive to compute
    • POST URI /{dbname}/_temp_view
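
To make the permanent-view layout concrete, here is a small Python 3 sketch that builds a design document body and the matching query path. The helper names `design_doc` and `view_path` are invented for this example; the `language` and `views` fields follow the design document used in the test code at the end of this post.

```python
import json
from urllib.parse import quote, urlencode


def design_doc(views):
    """Wrap {view name: {"map": ..., "reduce": ...}} in a design document body."""
    return {"language": "javascript", "views": views}


def view_path(db_name, design_name, view_name, **params):
    """Return the path for querying a permanent view."""
    path = "/%s/_design/%s/_view/%s" % (
        quote(db_name, safe=""),
        quote(design_name, safe=""),
        quote(view_name, safe=""),
    )
    if params:
        # View query parameters are JSON-encoded values (true, 2, "abc", ...).
        path += "?" + urlencode({k: json.dumps(v) for k, v in params.items()})
    return path


doc = design_doc({
    "all_docs": {"map": "function(doc) { emit(null, doc); }"},
})
```

The design document would be stored with a PUT to `/{dbname}/_design/my_views`, and `view_path("wiki", "my_views", "all_docs")` gives the path to GET afterwards.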

Creating a view:

In the Futon interface, go to Overview > wiki.

In the top-right corner, select view: Temporary View.

(If your Futon web client acts funny, clear the cookies Futon created.)

Map & Reduce

map

function(doc) {

    emit(null, doc);

}

    A view function should accept a single argument: the document object. To produce results, it should call the implicitly available emit(key, value) function. For every invocation of that function, a result row is added to the view.

    To be able to filter or sort the view by some document property, you would use that property for the key. For example, the following view would allow you to look up customer documents by the LastName or FirstName fields:

function(doc) {

    if (doc.Type == "customer") {

        emit(doc.LastName, {FirstName: doc.FirstName, Address: doc.Address});

        emit(doc.FirstName, {LastName: doc.LastName, Address: doc.Address});

    }

}
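
What this map function produces can be simulated offline. The Python 3 sketch below re-implements the customer view on a few made-up documents: each `emit` call adds one row, and the rows are kept sorted by key so they can be looked up. The sample documents and helper names are assumptions, and plain Python string ordering stands in for CouchDB's own collation rules.

```python
def customer_map(doc, emit):
    """Python rendering of the JavaScript map function above."""
    if doc.get("Type") == "customer":
        emit(doc["LastName"], {"FirstName": doc["FirstName"], "Address": doc["Address"]})
        emit(doc["FirstName"], {"LastName": doc["LastName"], "Address": doc["Address"]})


def build_view(docs, map_fn):
    """Run map_fn over every document and return rows sorted by key."""
    rows = []
    for doc in docs:
        def emit(key, value, doc_id=doc["_id"]):
            rows.append({"id": doc_id, "key": key, "value": value})
        map_fn(doc, emit)
    # Sorting by key (then id) is what makes key lookups and ranges possible.
    return sorted(rows, key=lambda row: (row["key"], row["id"]))


def lookup(rows, key):
    """Return the rows whose key equals the requested key."""
    return [row for row in rows if row["key"] == key]


docs = [
    {"_id": "c1", "Type": "customer", "FirstName": "Ada", "LastName": "Smith", "Address": "1 Main St"},
    {"_id": "c2", "Type": "customer", "FirstName": "Smith", "LastName": "Jones", "Address": "2 Side St"},
    {"_id": "o1", "Type": "order", "Total": 10},
]
rows = build_view(docs, customer_map)
```

Because every customer emits two rows, `lookup(rows, "Smith")` finds both the customer whose last name is Smith and the one whose first name is Smith, while the order document contributes nothing.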

reduce

function (key, values, rereduce) {

    return sum(values);

}

Reduce functions must handle two cases:

1. When rereduce is false:

  • key will be an array whose elements are arrays of the form [key,id], where key is a key emitted by the map function and id is that of the document from which the key was generated.
  • values will be an array of the values emitted for the respective elements in keys
  • i.e. reduce([ [key1,id1], [key2,id2], [key3,id3] ], [value1,value2,value3], false)

2. When rereduce is true:

  • key will be null
  • values will be an array of values returned by previous calls to the reduce function
  • i.e. reduce(null, [intermediate1,intermediate2,intermediate3], true)
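
A sum hides the difference between the two cases because it works the same way on raw values and on partial sums. Counting rows does not, which makes it a handy illustration. The following Python 3 sketch mirrors the two call shapes listed above; the function name and the sample keys and values are placeholders.

```python
def count_reduce(keys, values, rereduce):
    """Count rows, following the two call shapes described above."""
    if rereduce:
        # values holds counts returned by earlier calls, so add them up.
        return sum(values)
    # values holds the raw emitted values, one per row, so count them.
    return len(values)


# First pass: two separate batches of rows (rereduce is False).
partial_a = count_reduce([["k1", "id1"], ["k2", "id2"]], ["v1", "v2"], False)
partial_b = count_reduce([["k3", "id3"]], ["v3"], False)

# Second pass: combine the partial results (keys is None, rereduce is True).
total = count_reduce(None, [partial_a, partial_b], True)
```

If the function ignored `rereduce` and always returned `len(values)`, the second pass would report 2 (the number of partial results) instead of 3 (the number of rows).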

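Notes:

  • A view can have only a map function
  • A reduce function must reduce the input values to a smaller output value (the reduce function has to handle the emitted results as well as its own return values)
  • The key passed to emit can be an array, e.g. [X,Y,1]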

Reduce Vs reReduce

Reportedly, suppose a map function produces the following key->value pairs:

[X, Y, 0] -> Object_A

[X, Y, 1] -> Object_B

[X, Y, 2] -> Object_C

[X, Y, 3] -> Object_D

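The reduce function may then receive the following 3 calls, where Object_AB and Object_CD are the values returned by the first two calls:

reduce( [  [ [X,Y,0] , id0] , [ [X,Y,1] , id1]  ], [Object_A, Object_B], false)

reduce( [  [ [X,Y,2] , id2] , [ [X,Y,3] , id3]  ], [Object_C, Object_D], false)

reduce( null                                  , [Object_AB, Object_CD], true)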

    I still don't understand what this rereduce is for, because I would expect one reduce call per key, with no second reduce after that.
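
One way to see what rereduce is for: view rows live in a B-tree, and the rows for a key range are not necessarily handed to reduce in a single batch. Each batch is reduced on its own, and the partial results are then combined with rereduce set to true. The Python 3 sketch below imitates that with fixed-size chunks. The chunk size, the helper names, and the numeric values standing in for Object_A to Object_D are all assumptions; real batch boundaries depend on how the B-tree is laid out.

```python
def sum_reduce(keys, values, rereduce):
    """Sum works unchanged in both passes."""
    return sum(values)


def run_reduce(rows, reduce_fn, chunk_size=2):
    """Reduce rows in chunks, then rereduce the partial results.

    rows is a list of (key, doc_id, value) tuples. Returns the final value
    and a trace of every call made to reduce_fn.
    """
    trace = []
    partials = []
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        keys = [[key, doc_id] for key, doc_id, _ in chunk]
        values = [value for _, _, value in chunk]
        trace.append((keys, values, False))
        partials.append(reduce_fn(keys, values, False))
    if len(partials) == 1:
        # Everything fit in one batch, so no rereduce step is needed.
        return partials[0], trace
    trace.append((None, partials, True))
    return reduce_fn(None, partials, True), trace


rows = [
    (["X", "Y", 0], "id0", 10),
    (["X", "Y", 1], "id1", 20),
    (["X", "Y", 2], "id2", 30),
    (["X", "Y", 3], "id3", 40),
]
result, trace = run_reduce(rows, sum_reduce)
for keys, values, rereduce in trace:
    print("reduce(%r, %r, %r)" % (keys, values, rereduce))
```

The printed trace has the same shape as the three calls listed above: two first-pass calls with keys and raw values, then one call with null keys and the two partial results.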

group

group=true makes the reduce function group its results by the keys emitted by the map function.

See the test code at the end for the effect.
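
The effect of grouping can also be reproduced offline. This Python 3 sketch applies a reduce function to sorted (key, value) rows in three modes: no grouping, group=true, and group_level=N for array keys. The helper name `reduce_view` is made up, the reduce function is simplified to take only the list of values, and the rows imitate what the phone examples in the test code would emit.

```python
from itertools import groupby


def reduce_view(rows, reduce_fn, group=False, group_level=None):
    """Apply reduce_fn to sorted (key, value) rows the way grouping does.

    group=False reduces everything to one row; group=True reduces per
    distinct key; group_level=N groups array keys by their first N elements.
    """
    rows = sorted(rows, key=lambda row: row[0])
    if group_level is not None:
        def key_of(row):
            key = row[0]
            # Only array keys are truncated; other keys group as a whole.
            return key[:group_level] if isinstance(key, list) else key
    elif group:
        def key_of(row):
            return row[0]
    else:
        return [(None, reduce_fn([value for _, value in rows]))]
    return [
        (key, reduce_fn([value for _, value in members]))
        for key, members in groupby(rows, key=key_of)
    ]


# One (os, 1) row per phone, as emit(doc.os, 1) would produce.
os_rows = [("s40", 1), ("s40", 1), ("s60", 1), ("Android", 1),
           ("BlackBerry-OS", 1), ("Android", 1), ("Mac", 1)]
grouped = reduce_view(os_rows, sum, group=True)
ungrouped = reduce_view(os_rows, sum, group=False)

# Array keys, as emit(['p', 'min'], doc.price) and emit(['p', 'max'], doc.price) would produce.
price_rows = [(["p", "min"], 100), (["p", "max"], 100),
              (["p", "min"], 32.5), (["p", "max"], 32.5)]
level_1 = reduce_view(price_rows, min, group_level=1)
level_2 = reduce_view(price_rows, min, group_level=2)
```

With group=true there is one result row per operating system; without it all seven rows collapse into a single total. At group_level=1 the `["p", "min"]` and `["p", "max"]` rows fall into the same group, while at group_level=2 they are reduced separately.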

Debugging Views

There is a log function that can be used to print debug information:

{

     "map": "function(doc) { log(doc); }"

}

tail -f /var/log/couchdb/couch.log

Main documents on views (I find these three really hard to follow...)

  • Introduction_to_CouchDB_views An introduction to views
  • HTTP_view_API How to use views
  • View_Snippets The most useful one. -Lin Yang 7/23/10 11:58 AM

Here is a decent online demo (decent, not the best). -Lin Yang 7/23/10 7:47 PM 

http://labs.mudynamics.com/wp-content/uploads/2009/04/icouch.html

Installation:

ubuntu

Available in the package repositories.

Building from source:

So many dependencies...

yum install js-devel icu libicu libicu-devel

wget http://curl.haxx.se/download/curl-7.21.0.tar.gz && tar vzxf curl-7.21.0.tar.gz && cd curl-7.21.0 && ./configure && make  && make install

#/usr/local/bin/couchdb -b (background)

Apache CouchDB has started, time to relax.

If yum does not have ... install SpiderMonkey

Erlang Client

It is an HTTP interface, so no dedicated client is needed, but having one is nicer...

couchbeam

eCouch

erlcouch ..

benchmark

Someone else's results:

CouchDB inserts ~2-3k documents / second in a >100k documents database

-------- 0.3-0.4ms / doc

CouchDB inserts get slower on bigger databases

On my 8-core machine with 8 GB of RAM (4 GB free):

ab -n 10000 -c 100 http://localhost:5984/wiki/apple   (read requests)

Server Software:        CouchDB/0.11.0

Server Hostname:        localhost

Document Length:        118 bytes

Requests per second:    3066.17 [#/sec] (mean)

Time per request:       32.614 [ms] (mean)

Time per request:       0.326 [ms] (mean, across all concurrent requests)
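
The three ab figures are different views of the same measurement, which a few lines of Python 3 can confirm. The function names are made up for this example; the inputs are the numbers quoted above.

```python
def time_per_request_ms(requests_per_second, concurrency=1):
    """Mean time per request in milliseconds implied by a throughput figure."""
    return concurrency * 1000.0 / requests_per_second


def requests_per_second(ms_per_request):
    """Throughput implied by a per-request time in milliseconds."""
    return 1000.0 / ms_per_request


# ab reported 3066.17 requests/s at concurrency 100.
per_client = time_per_request_ms(3066.17, concurrency=100)  # as seen by one client
across_all = time_per_request_ms(3066.17)                   # averaged over all clients

# The quoted insert benchmark: 0.3-0.4 ms per document.
insert_high = requests_per_second(0.3)
insert_low = requests_per_second(0.4)

print("per client: %.3f ms, across all: %.3f ms" % (per_client, across_all))
print("inserts: %.0f to %.0f docs/s" % (insert_low, insert_high))
```

The 32.614 ms figure is what each of the 100 concurrent clients waits on average, while 0.326 ms is the average spacing between completed requests. By the same arithmetic, 0.3 to 0.4 ms per document works out to roughly 2,500 to 3,300 inserts per second, in line with the quoted 2-3k.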

Good documentation:

http://wiki.apache.org/couchdb/Getting_started_with_Erlang

http://en.wikipedia.org/wiki/CouchDB  has examples

http://www.ibm.com/developerworks/cn/opensource/os-couchdb/  introduction + examples

http://www.ibm.com/developerworks/cn/opensource/os-cn-couchdb/index.html  (long, ready)

Translation blog of CouchDB: The Definitive Guide, by 時之刻痕

http://wiki.apache.org/couchdb/FrontPage  the main documentation!!!

clients:

http://wiki.apache.org/couchdb/Getting_started_with_Python  Python sample code

http://wiki.apache.org/couchdb/API_Cheatsheet  API

http://news.csdn.net/a/20100714/219109.html

Books:

http://books.couchdb.org/relax/

Appendix: my own test code:

#!/usr/bin/python
#coding:utf-8

# example:
# do_request2('10.99.60.91:8080', '/home', 'PUT', '', {"Content-type": "application/x-www-form-urlencoded"})
def do_request2(netloc, path, method, data='', headers=None):
    # Send one HTTP request. Return the body on a 2xx reply;
    # otherwise print the error and return None.
    import httplib
    conn = httplib.HTTPConnection(netloc)
    conn.request(method, path, data, headers or {})
    response = conn.getresponse()
    if response.status // 100 == 2:
        data = response.read()
        return data
    print 'ERROR: response.status = %d' % response.status
    print 'response data is ', response.read()

def do_request(url, method, data='', headers={}):
    print '>>>>>>>>>>>>> do_request: ', url
    from urlparse import urlparse 
    o = urlparse(url)
    path = o.path
    if o.query:
        path = path + '?' + o.query
    return do_request2(o.netloc, path, method, data, headers)


def delete_db(db_name):
    return do_request('http://127.0.0.1:5984/%s'%db_name, 'DELETE')

def create_db(db_name):
    return do_request('http://127.0.0.1:5984/%s'%db_name, 'PUT')

def create_doc(db_name, doc_id, doc):
    print 'create_doc %s ' % doc_id
    return do_request('http://127.0.0.1:5984/%s/%s'%(db_name, doc_id), 'PUT', doc, {'Content-Type': 'application/json'})

def get_doc(db_name, doc_id):
    return do_request('http://127.0.0.1:5984/%s/%s'%(db_name, doc_id), 'GET', '', {})
#query_string='?group=false'
def query_temp_view(db_name, doc, query_string=''):
    url = 'http://127.0.0.1:5984/%s/_temp_view%s'%(db_name, query_string)
    return do_request(url, 'POST', doc, {'Content-Type': 'application/json; charset=UTF-8'})

# This call creates several views at once, in a single design document
def create_permanent_view(db_name, view_name, views_json):
    return create_doc(db_name, view_name, views_json) 



def test():
    print delete_db('phone')
    print create_db('phone')
    print create_doc('phone', 'Nokia-5200','''
        {"make": "Nokia", 
        "price": 100, 
        "os": "s40"}
            ''')

    print create_doc('phone', 'Nokia-1661','''
        {"make": "Nokia", 
        "price": 32.5, 
        "os": "s40"}
            ''')

    print create_doc('phone', 'Nokia-E63','''
        {"make": "Nokia", 
        "price": 500, 
        "os": "s60"}
            ''')

    print create_doc('phone', 'HTC-Wildfire','''
        {"make": "HTC", 
        "price": 200, 
        "os": "Android"}
            ''')

    print create_doc('phone', 'BlackBerry-Bold','''
        {"make": "BlackBerry", 
        "price": 300, 
        "os": "BlackBerry-OS"}
            ''')

    print create_doc('phone', 'Samsung-Galaxy-S','''
        {"make": "Samsung", 
        "price": 400, 
        "os": "Android"}
            ''')
    print create_doc('phone', 'iPhone4','''
        {"make": "Apple", 
        "price": 1000, 
        "os": "Mac"}
            ''')
    ############################################################################
    # get all docs
    view = ''' 
{
    "map" : "function(doc){
        emit(doc.price, doc);
    }"
}
    '''
    print query_temp_view('phone', view);


    ############################################################################
    # SELECT sum(price) FROM phone
    view = ''' 
{
    "map" : "function(doc){
        emit('all-price', doc.price);
    }",
    "reduce" : "function(key, values, rereduce){
        return sum(values);
    }"
}
    '''
    print query_temp_view('phone', view);


    ############################################################################
    # test permanent_view
    views = '''
{
    "language": "javascript",
    "views": {
        "all_phones": {
            "map" : "function(doc){
                emit(doc.price, doc);
            }"
        },
        "sum_price": {
            "map" : "function(doc){
                emit('all-price', doc.price);
            }",
            "reduce" : "function(key, values, rereduce){
                return sum(values);
            }"
        }
    }
}
    '''
    create_permanent_view('phone', '_design/my_views', views)
    print ':::: retrieve the views doc: '
    print get_doc('phone', '_design/my_views')
    print ':::: now let us try to query on this views "all_phones"'
    print get_doc('phone', '_design/my_views/_view/all_phones')

    print ':::: And query on this views "sum_price"'
    print get_doc('phone', '_design/my_views/_view/sum_price')

    ############################################################################
    print ':::: we use temp view for test'
    print ':::: get all phones of Nokia: '
    view = ''' 
{
    "map" : "function(doc){
        //log(doc); // debug fun
        if (doc.make == 'Nokia')
            emit(null, doc);
    }"
}
    '''
    print query_temp_view('phone', view);


    ############################################################################
    print ':::: get phones count of every os : '
    view = ''' 
{
    "map" : "function(doc){
        emit(doc.os, 1);
    }",
    "reduce" : "function(key, values, rereduce){
        log('reduce called!!!!');
        log(key);
        log(values);
        log(rereduce);
        return sum(values); 
    }"
}
    '''
    print query_temp_view('phone', view, '?group=true');

    print ':::: let us look at the group param.. if we set group=false : '
    print query_temp_view('phone', view, '?group=false');

    ############################################################################
    print ':::: let us get a list of unique os , just like SQL: SELECT DISTINCT(os) FROM phone'
    view = ''' 
{
    "map" : "function(doc){
        emit(doc.os, null);
    }",
    "reduce" : "function(key, values, rereduce){
        return null;
    }"
}
    '''
    print query_temp_view('phone', view, '?group=true');
   

    ############################################################################
    print ':::: let us get all phones sorted by price || SELECT _id FROM phone ORDER BY price'
    view = ''' 
{
    "map" : "function(doc){
        emit(doc.price, doc._id);
    }"
}
    '''
    print query_temp_view('phone', view);


    ############################################################################
    print ':::: let us get min price || SELECT min(price) FROM phone '
    view = ''' 
{
    "map" : "function(doc){
        emit('p', doc.price);
    }",
    "reduce" : "function(key, values, rereduce){
        return Math.min.apply( Math, values);
            // the computing min width/height example (JS simulation) at
            // http://labs.mudynamics.com/wp-content/uploads/2009/04/icouch.html does not work for me
    }"
}
    '''
    print query_temp_view('phone', view, '?group=true');



    ############################################################################
    print ':::: I have to try emitting an array like emit([a,b,c], value)'

    view = ''' 
{
    "map" : "function(doc){
        emit(['p', 'min'], doc.price);
        emit(['p', 'max'], doc.price);
    }",
    "reduce" : "function(key, values, rereduce){
        log('reduce called!!!!');
        log(key);
        log(values);
        log(rereduce);

        return Math.min.apply( Math, values);
    }"
}
    '''
    print query_temp_view('phone', view, '?group_level=2');

    print ':::: if group_level==1 '
    print ':::: [p, min] and [p, max] will come together to a reduce fun, like this'
    print ':::: [[["p","max"],"BlackBerry-Bold"],[["p","max"],"HTC-Wildfire"],[["p","max"],"iPhone4"],[["p","max"],"Nokia-1661"],[["p","max"],"Nokia-5200"],[["p","max"],"Nokia-E63"],[["p","max"],"Samsung-Galaxy-S"],[["p","min"],"BlackBerry-Bold"],[["p","min"],"HTC-Wildfire"],[["p","min"],"iPhone4"],[["p","min"],"Nokia-1661"],[["p","min"],"Nokia-5200"],[["p","min"],"Nokia-E63"],[["p","min"],"Samsung-Galaxy-S"]]'

    print ':::: if group_level==2 '
    print ':::: [p, min] and [p, max] will come separate.......... like this'
    print '[[["p","min"],"Samsung-Galaxy-S"],[["p","min"],"Nokia-E63"],[["p","min"],"Nokia-5200"],[["p","min"],"Nokia-1661"],[["p","min"],"iPhone4"],[["p","min"],"HTC-Wildfire"],[["p","min"],"BlackBerry-Bold"]]' 

    # TODO ############################################################################
    print ':::: let us retrieve top N os'


if __name__ == "__main__":
    test()