1. 概念
什麼是“标準非STL容器”?标準非STL容器是指“可以認為它們是容器,但是他們并不滿足STL容器的所有要求”。前文提到的容器擴充卡stack、queue及priority_queue都是标準非STL容器的一部分。此外,valarray也是标準非STL容器。
bitset:一種 高效位集合操作容器。
2. API
bitset提供的api:
(constructor) Construct bitset (public member function)
operator[] Access bit (public member function)
set Set bits (public member function)
reset Reset bits (public member function )
flip Flip bits (public member function)
to_ulong Convert to unsigned long integer (public member function)
to_string Convert to string (public member function)
count Count bits set (public member function)
size Return size (public member function)
test Return bit value (public member function )
any Test if any bit is set (public member function)
none Test if no bit is set (public member function)
3. 源碼剖析
SGI bitset部分實作源碼
template<size_t _Nb>
class bitset : private _Base_bitset<__BITSET_WORDS(_Nb)>
{
private:
typedef _Base_bitset<__BITSET_WORDS(_Nb)> _Base;
typedef unsigned long _WordT;
private:
void _M_do_sanitize() {
_Sanitize<_Nb%__BITS_PER_WORD>::_M_do_sanitize(this->_M_hiword());
}
.....
}
#define __BITS_PER_WORD (CHAR_BIT*sizeof(unsigned long))
#define __BITSET_WORDS(__n) \
((__n) < 1 ? 1 : ((__n) + __BITS_PER_WORD - 1)/__BITS_PER_WORD)
template<size_t _Nw>
struct _Base_bitset {
typedef unsigned long _WordT;
_WordT _M_w[_Nw]; // 0 is the least significant word.
_Base_bitset( void ) { _M_do_reset(); }
_Base_bitset(unsigned long __val) {
_M_do_reset();
_M_w[0] = __val;
}
static size_t _S_whichword( size_t __pos )
{ return __pos / __BITS_PER_WORD; }
static size_t _S_whichbyte( size_t __pos )
{ return (__pos % __BITS_PER_WORD) / CHAR_BIT; }
static size_t _S_whichbit( size_t __pos )
{ return __pos % __BITS_PER_WORD; }
static _WordT _S_maskbit( size_t __pos )
{ return (static_cast<_WordT>(1)) << _S_whichbit(__pos); }
_WordT& _M_getword(size_t __pos) { return _M_w[_S_whichword(__pos)]; }
_WordT _M_getword(size_t __pos) const { return _M_w[_S_whichword(__pos)]; }
_WordT& _M_hiword() { return _M_w[_Nw - 1]; }
_WordT _M_hiword() const { return _M_w[_Nw - 1]; }
void _M_do_and(const _Base_bitset<_Nw>& __x) {
for ( size_t __i = 0; __i < _Nw; __i++ ) {
_M_w[__i] &= __x._M_w[__i];
}
}
void _M_do_or(const _Base_bitset<_Nw>& __x) {
for ( size_t __i = 0; __i < _Nw; __i++ ) {
_M_w[__i] |= __x._M_w[__i];
}
}
void _M_do_xor(const _Base_bitset<_Nw>& __x) {
for ( size_t __i = 0; __i < _Nw; __i++ ) {
_M_w[__i] ^= __x._M_w[__i];
}
}
節選上述代碼,可以得到:
1. bitset繼承_Base_bitset,具體操作封裝在_Base_bitset中
2. bitset 的size作為模闆參數(非類型模闆參數的一個要求是,編譯器能在編譯期就能把參數确定下來),是以, bitset大小在編譯期固定,不支援插入和删除元素
3. 各種位操作, 性能高
4._Base_bitset使unsigned long作為底層存儲, 不支援指針、引用、疊代器
5. 使用 _WordT _M_w[_Nw];配置設定記憶體, 是以在棧中定義bitset需要注意大小(和STL标準容器堆記憶體配置設定差別開)。
eg,下面的代碼将棧溢出(測試機器棧記憶體10M)
void fun()
{
const int n = 800000000;
bitset<n> a;
cout << a.size() << endl;
}
int main(int argc, char** argv)
{
fun();
return 0;
}
大記憶體配置設定可以配置設定在堆中,如下:
const int n = 800000000;
bitset<n> *a = new(std::nothrow) bitset<n>;
if(a)
{
cout << a->size() << endl;
delete a;
a = NULL;
}
4. vector<bool>及deque<bool>
bitset高效,但是size必須在編譯器确定,不支援插入和删除。是以,一個可能的替代品是vector<bool>和deque<bool>
兩者的差別:
vector<bool>不是一個STL容器,并且不容納bool(like bitse底層t機制)
deque<bool>是一個STL容器,它儲存真正的bool值
分别運作
deque<bool> a;
a[0] = 0;
bool* b = &a[0];
cout << *b << endl;
和
vector<bool> a;
a[0] = 0;
bool* b = &a[0];
cout << *b << endl;
将會發現:
使用deque<bool>正确,而是用vector<bool>會報錯:“cannot convert `std::_Bit_reference*' to `bool*' in initialization“
但是,deque簡直是在踐踏記憶體。
使用deque<bool>
int main(int argc, char** argv)
{
deque<bool> a(10000000000);
sleep(100);
return 0;
}
記憶體使用:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
23612 work 25 0 9990m 9.8g 720 S 0.0 65.0 0:39.35 test
使用vector<bool>
int main(int argc, char** argv)
{
vector<bool> a(10000000000);
sleep(100);
return 0;
}
記憶體使用:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
23909 work 25 0 1198m 1.2g 716 S 0.0 7.8 0:01.31 test
使用bitset
int main(int argc, char** argv)
{
const unsigned long int n = 10000000000;
bitset<n> *a = new(std::nothrow) bitset<n>;
sleep(100);
return 0;
}
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
24439 work 25 0 1198m 1.2g 712 S 30.7 7.8 0:00.92 test
10億個bool,vector<bool>和bitset使用記憶體1198M,deque<bool>則是9990M
5. 總結
在需要對位集合進行操作的時候,如何操作集合大小比較固定,優先選擇高效的bitset;
如果需要動态增删元素,或者編譯期間無法确定集合大小,則可以考慮vector<bool>,deque<bool>記憶體開銷太大,基本上不考慮。
參考:
http://www.sgi.com/tech/stl/download.html
http://www.cplusplus.com/reference/stl/vector/
http://www.cplusplus.com/reference/stl/bitset/
擴充閱讀:
Vector specialization: vector<bool>
The vector class template has a special template specialization for the bool type.
This specialization is provided to optimize for space allocation: In this template specialization, each element occupies only one bit (which is eight times less than the smallest type in C++: char).
The references to elements of a bool vector returned by the vector members are not references to bool objects, but a special member type which is a reference to a single bit, defined inside the vector<bool> class specialization as:
class vector<bool>::reference {
friend class vector;
reference(); // no public constructor
public:
~reference();
operator bool () const; // convert to bool
reference& operator= ( const bool x ); // assign from bool
reference& operator= ( const reference& x ); // assign from bit
void flip(); // flip bit value.
}