天天看點

标準非STL容器 : bitset

1. 概念

什麼是“标準非STL容器”?标準非STL容器是指“可以認為它們是容器,但是他們并不滿足STL容器的所有要求”。前文提到的容器擴充卡stack、queue及priority_queue都是标準非STL容器的一部分。此外,valarray也是标準非STL容器。

bitset:一種 高效位集合操作容器。

2. API

bitset提供的api:

(constructor)    Construct bitset (public member function)

operator[]    Access bit (public member function)

set    Set bits (public member function)

reset    Reset bits (public member function )

flip    Flip bits (public member function)

to_ulong    Convert to unsigned long integer (public member function)

to_string    Convert to string (public member function)

count    Count bits set (public member function)

size    Return size (public member function)

test    Return bit value (public member function )

any    Test if any bit is set (public member function)

none    Test if no bit is set (public member function)

3. 源碼剖析

SGI bitset部分實作源碼

template<size_t _Nb>
class bitset : private _Base_bitset<__BITSET_WORDS(_Nb)>
{
private:
  typedef _Base_bitset<__BITSET_WORDS(_Nb)> _Base;
  typedef unsigned long _WordT;

private:
  void _M_do_sanitize() {
    _Sanitize<_Nb%__BITS_PER_WORD>::_M_do_sanitize(this->_M_hiword());
  }
.....
}
           
#define __BITS_PER_WORD (CHAR_BIT*sizeof(unsigned long))
#define __BITSET_WORDS(__n) \
 ((__n) < 1 ? 1 : ((__n) + __BITS_PER_WORD - 1)/__BITS_PER_WORD)
           
template<size_t _Nw>
struct _Base_bitset {
  typedef unsigned long _WordT;

  _WordT _M_w[_Nw];                // 0 is the least significant word.

  _Base_bitset( void ) { _M_do_reset(); }
  _Base_bitset(unsigned long __val) {
    _M_do_reset();
    _M_w[0] = __val;
  }

  static size_t _S_whichword( size_t __pos )
    { return __pos / __BITS_PER_WORD; }
  static size_t _S_whichbyte( size_t __pos )
    { return (__pos % __BITS_PER_WORD) / CHAR_BIT; }
  static size_t _S_whichbit( size_t __pos )
    { return __pos % __BITS_PER_WORD; }
  static _WordT _S_maskbit( size_t __pos )
    { return (static_cast<_WordT>(1)) << _S_whichbit(__pos); }

  _WordT& _M_getword(size_t __pos)       { return _M_w[_S_whichword(__pos)]; }
  _WordT  _M_getword(size_t __pos) const { return _M_w[_S_whichword(__pos)]; }

  _WordT& _M_hiword()       { return _M_w[_Nw - 1]; }
  _WordT  _M_hiword() const { return _M_w[_Nw - 1]; }

  void _M_do_and(const _Base_bitset<_Nw>& __x) {
    for ( size_t __i = 0; __i < _Nw; __i++ ) {
      _M_w[__i] &= __x._M_w[__i];
    }
  }

  void _M_do_or(const _Base_bitset<_Nw>& __x) {
    for ( size_t __i = 0; __i < _Nw; __i++ ) {
      _M_w[__i] |= __x._M_w[__i];
    }
  }

  void _M_do_xor(const _Base_bitset<_Nw>& __x) {
    for ( size_t __i = 0; __i < _Nw; __i++ ) {
      _M_w[__i] ^= __x._M_w[__i];
    }
  }
           

節選上述代碼,可以得到:

1. bitset繼承_Base_bitset,具體操作封裝在_Base_bitset中

2. bitset 的size作為模闆參數(非類型模闆參數的一個要求是,編譯器能在編譯期就能把參數确定下來),是以, bitset大小在編譯期固定,不支援插入和删除元素

3. 各種位操作, 性能高

4._Base_bitset使unsigned long作為底層存儲, 不支援指針、引用、疊代器

5. 使用 _WordT _M_w[_Nw];配置設定記憶體, 是以在棧中定義bitset需要注意大小(和STL标準容器堆記憶體配置設定差別開)。

     eg,下面的代碼将棧溢出(測試機器棧記憶體10M) 

void fun()
{
        const int n = 800000000;

        bitset<n> a;
        cout << a.size() << endl;
}
int main(int argc, char** argv)
{
        fun();

        return 0;
}
           

大記憶體配置設定可以配置設定在堆中,如下:

const int n = 800000000;

        bitset<n> *a = new(std::nothrow) bitset<n>;

        if(a)
        {
                cout << a->size() << endl;
                delete a;
                a = NULL;
        }
           

4. vector<bool>及deque<bool>

bitset高效,但是size必須在編譯器确定,不支援插入和删除。是以,一個可能的替代品是vector<bool>和deque<bool>

兩者的差別:

vector<bool>不是一個STL容器,并且不容納bool(like bitse底層t機制)

deque<bool>是一個STL容器,它儲存真正的bool值

分别運作

deque<bool> a;
        a[0] = 0;
        bool* b = &a[0];
        cout << *b << endl;
           

vector<bool> a;
        a[0] = 0;
        bool* b = &a[0];
        cout << *b << endl;
           

将會發現:

使用deque<bool>正确,而是用vector<bool>會報錯:“cannot convert `std::_Bit_reference*' to `bool*' in initialization“

但是,deque簡直是在踐踏記憶體。

使用deque<bool>

int main(int argc, char** argv)
{
        deque<bool> a(10000000000);
        sleep(100);
        return 0;
}
           

記憶體使用:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                               

23612 work      25   0 9990m 9.8g  720 S  0.0 65.0   0:39.35 test

使用vector<bool>

int main(int argc, char** argv)
{
        vector<bool> a(10000000000);
        sleep(100);
        return 0;
}
           

記憶體使用:

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                               

23909 work      25   0 1198m 1.2g  716 S  0.0  7.8   0:01.31 test

使用bitset

int main(int argc, char** argv)
{
        const unsigned long int n = 10000000000;
        bitset<n> *a = new(std::nothrow) bitset<n>;
        sleep(100);

        return 0;
}
           

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                               

24439 work      25   0 1198m 1.2g  712 S 30.7  7.8   0:00.92 test  

10億個bool,vector<bool>和bitset使用記憶體1198M,deque<bool>則是9990M

5. 總結

在需要對位集合進行操作的時候,如何操作集合大小比較固定,優先選擇高效的bitset;

如果需要動态增删元素,或者編譯期間無法确定集合大小,則可以考慮vector<bool>,deque<bool>記憶體開銷太大,基本上不考慮。

參考:

http://www.sgi.com/tech/stl/download.html

http://www.cplusplus.com/reference/stl/vector/

http://www.cplusplus.com/reference/stl/bitset/

擴充閱讀:

Vector specialization: vector<bool>

The vector class template has a special template specialization for the bool type.

This specialization is provided to optimize for space allocation: In this template specialization, each element occupies only one bit (which is eight times less than the smallest type in C++: char).

The references to elements of a bool vector returned by the vector members are not references to bool objects, but a special member type which is a reference to a single bit, defined inside the vector<bool> class specialization as:

class vector<bool>::reference {
  friend class vector;
  reference();                                 // no public constructor
public:
  ~reference();
  operator bool () const;                      // convert to bool
  reference& operator= ( const bool x );       // assign from bool
  reference& operator= ( const reference& x );  // assign from bit
  void flip();                                 // flip bit value.
}
           

繼續閱讀