Non-relational databases: Cassandra

Cassandra was open-sourced by Facebook and can be regarded as an open-source counterpart of BigTable. It is currently used by Twitter and digg.com.

Cassandra's main characteristics are:

1. There's no schema enforced at the ColumnFamily level. The Rows do not have a predefined list of Columns that they contain.

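In other words, there is no hard schema constraint at the ColumnFamily level, so adding or removing fields is easy. In a relational table every tuple has the same fixed set of attribute columns, whereas the rows in a ColumnFamily may each have different columns.

2. The design relies on nodes holding replicas of one another's data, which makes it highly available and scalable: a single node failure does not interrupt the cluster's service, and capacity scales linearly as nodes are added.

The following sections describe Cassandra's building blocks: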

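1. Column

The lowest-level, smallest unit of data in Cassandra. It is a tuple of three parts: name, value, and timestamp. For simplicity, the timestamp is omitted in some of the later examples. Here is an example of a Column: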

{ // this is a column
    name: "emailAddress",
    value: "[email protected]",
    timestamp: 123456789
}

Both name and value are of type byte[] and can be of any length.
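As a rough illustration, a Column could be modeled in Python as shown here. This is only a sketch: the `Column` class, the `newer` helper, and the example email values are hypothetical and do not belong to any Cassandra client library. The helper shows the role the timestamp plays, namely deciding which of two writes to the same column is the more recent one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    """Hypothetical model of a Cassandra Column: name, value, timestamp."""
    name: bytes       # byte[] in Cassandra, any length
    value: bytes      # byte[] in Cassandra, any length
    timestamp: int    # supplied by the client on write


def newer(a: Column, b: Column) -> Column:
    """Return whichever write of the same column carries the later timestamp."""
    if a.name != b.name:
        raise ValueError("columns must share a name")
    return a if a.timestamp >= b.timestamp else b


# Two writes to the same column; the values are made up for illustration.
old = Column(b"emailAddress", b"old@example.com", 123456789)
new = Column(b"emailAddress", b"new@example.com", 123456790)
latest = newer(old, new)
```

Here `latest` is the `new` column, because its timestamp is larger.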

2. SuperColumn

A tuple consisting of a name and a value. The value can be viewed as a map that holds any number of Column objects, each keyed by that Column's name. Here is an example of a SuperColumn:

{   // this is a SuperColumn
    name: "homeAddress",
    // with an unbounded number of Columns
    value: {
        // note that each key is the name of the Column
        street: {name: "street", value: "1234 x street", timestamp: 123456789},
        city: {name: "city", value: "san francisco", timestamp: 123456789},
        zip: {name: "zip", value: "94107", timestamp: 123456789},
    }
}      
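Column versus SuperColumn: both can be seen as a tuple of a name and a value. The main difference is that a Column's value is a single string-like value (stored as byte[]), whereas a SuperColumn's value is a map<name, Column>. In addition, a SuperColumn has no timestamp.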
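The difference can be sketched in Python as follows. The class names and the `add` method are hypothetical, and `str` is used instead of bytes purely for readability. The point is that a SuperColumn holds a map of Columns keyed by each Column's name and carries no timestamp of its own.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Column:
    name: str
    value: str
    timestamp: int


@dataclass
class SuperColumn:
    """Hypothetical model: a name plus a map of Columns; no timestamp field."""
    name: str
    value: Dict[str, Column] = field(default_factory=dict)

    def add(self, column: Column) -> None:
        # The map key is always the Column's own name.
        self.value[column.name] = column


home = SuperColumn("homeAddress")
home.add(Column("street", "1234 x street", 123456789))
home.add(Column("city", "san francisco", 123456789))
home.add(Column("zip", "94107", 123456789))
```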
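3. ColumnFamily

Similar to a table in a relational database: a structure made up of an unlimited number of rows. Each row is a data item resembling a map<name, Column> and is identified by a row key. The format looks like this: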
UserProfile = { // this is a ColumnFamily
    phatduckk: {   // this is the key to this Row inside the CF
        // this row can hold an unbounded number of columns
        username: "phatduckk",
        email: "[email protected]",
        phone: "(900) 976-6666"
    }, // end row
    ieure: {   // this is the key to another row in the CF
        // another unbounded set of columns, not the same set as the row above
        username: "ieure",
        email: "[email protected]",
        phone: "(888) 555-1212",
        age: "66",
        gender: "undecided"
    },
}
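To make the "no fixed column list" point concrete, here is a minimal in-memory sketch in Python. It uses plain nested dicts as a stand-in for a ColumnFamily and is not a Cassandra API. The `nickname` column is a made-up example, and timestamps and the email columns are left out for brevity.

```python
# Hypothetical stand-in for a ColumnFamily: row key -> {column name -> value}.
user_profile = {
    "phatduckk": {
        "username": "phatduckk",
        "phone": "(900) 976-6666",
    },
    "ieure": {
        "username": "ieure",
        "phone": "(888) 555-1212",
        "age": "66",
        "gender": "undecided",
    },
}

# Rows need not share a column list: adding a column touches one row only.
user_profile["phatduckk"]["nickname"] = "duck"

# Columns that appear in one row but not in the other.
only_in_ieure = set(user_profile["ieure"]) - set(user_profile["phatduckk"])
```

No schema change is needed for the new column, and the other row is unaffected.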
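4. Keyspace

The largest organizational unit in Cassandra. It contains a set of ColumnFamilies and is usually named after the application. Personally, I think of it as the counterpart of a database (DB) in a relational system.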

 

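5. Sorting

In a relational database you define the ordering at query time with ORDER BY. In Cassandra the order in which data comes back is always fixed: columns are stored in sorted order, according to a rule defined up front, at the moment they are written, so the order on read is already determined. This is a significant performance advantage. Interestingly, Cassandra sorts by column name, not by column value. It defines the following CompareWith options: BytesType, UTF8Type, LexicalUUIDType, TimeUUIDType, AsciiType, and LongType, which specify how column names are compared. In effect, the column name is interpreted as a particular type, and that is what makes the ordering flexible. UTF8Type sorts column names as UTF-8 encoded strings, LongType interprets them as 64-bit longs, and TimeUUIDType sorts them as time-based UUIDs.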

 

Result of sorting with LongType:

 

    <!--
    ColumnFamily definition from storage-conf.xml
    -->
    <ColumnFamily CompareWith="LongType" Name="CF_NAME_HERE"/>

    // Each Column's name is treated as a 64-bit long,
    // in effect ordering our Columns numerically by name
    {name: 3, value: "101010101010"},
    {name: 123, value: "hello there"},
    {name: 976, value: "kjjkbcjkcbbd"},
    {name: 832416, value: "kjjkbcjkcbbd"}

 

Result of sorting with UTF8Type:

 

    <!--
    ColumnFamily definition from storage-conf.xml
    -->
    <ColumnFamily CompareWith="UTF8Type" Name="CF_NAME_HERE"/>

    // Each Column name is treated as a UTF8 string
    {name: 123, value: "hello there"},
    {name: 3, value: "101010101010"},
    {name: 832416, value: "kjjkbcjkcbbd"},
    {name: 976, value: "kjjkbcjkcbbd"}
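The two orderings above can be reproduced with a small Python sketch. It only imitates the comparators and does not call Cassandra. It assumes that LongType behaves like integer comparison and that UTF8Type behaves like byte-wise comparison of the encoded names; for these ASCII digit names any string comparison gives the same result.

```python
names = ["832416", "3", "976", "123"]

# LongType-like ordering: interpret each name as an integer.
long_order = sorted(names, key=int)

# UTF8Type-like ordering: compare the UTF-8 encoded names byte by byte.
utf8_order = sorted(names, key=lambda n: n.encode("utf-8"))

print(long_order)  # ['3', '123', '976', '832416']
print(utf8_order)  # ['123', '3', '832416', '976']
```

The same four names come back in a different order depending only on how the name is interpreted, which is why the comparator has to be chosen when the ColumnFamily is defined.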

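Those are Cassandra's basic building blocks. To make them easier to understand, they can perhaps be compared with relational database concepts as follows: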

1) a “Column” is a key-value pair plus timestamp (=attribute)

2) a “Super Column” is a map of attributes (=row)

3) a “Standard Column Family” is a map of rows (=table)

4) a “Super Column Family” is a map of tables (=table of nested tables)

5) a “Keyspace” is a map of “Column Families” (=database)
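The analogy for the standard (non-super) case can be written down as nested maps. The following Python sketch is an assumption-laden illustration: the type aliases, the `my_app` keyspace, and the `get_column` helper are all hypothetical, and timestamps are omitted.

```python
from typing import Dict

Row = Dict[str, str]                    # column name -> value (timestamps omitted)
ColumnFamily = Dict[str, Row]           # row key -> row
Keyspace = Dict[str, ColumnFamily]      # column family name -> column family

# Hypothetical keyspace named after an application.
my_app: Keyspace = {
    "UserProfile": {
        "ieure": {"username": "ieure", "age": "66"},
    },
}


def get_column(ks: Keyspace, cf: str, row_key: str, column: str) -> str:
    """Walk keyspace -> column family -> row -> column."""
    return ks[cf][row_key][column]


age = get_column(my_app, "UserProfile", "ieure", "age")
```

A Super Column Family would add one more level of nesting between the row and its columns.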

These are my modest notes from studying Cassandra; corrections and criticism are very welcome!

Reference: http://wiki.apache.org/cassandra/FrontPage

For an introduction to some of Cassandra's APIs, see: http://www.rackspacecloud.com/blog/2010/05/12/cassandra-by-example/
