update1:添加了remove,removeall()方法以及getsize()方法
update2:添加了keyset()方法用于疊代
update3:經過測試,sttable類在存儲integer類型key時,put的速度比hashmap快了接近3倍,而remove、get卻比hashmap慢;而在存儲string類型的key時,put比hashmap慢,但是get、remove卻快不少。
讀ruby hacking guide,其中專門辟了一個章節介紹了st.c中的st_table,這個資料結構也就是類似java中的hashmap,基本原理是利用數組存儲,數組的每一個元素是一個單向連結清單,連結清單中再存儲具體的元素,如下圖所示的結構

ruby中利用這個結構來存儲對象變量、類方法、常量、全局變量等資訊,因為在c ruby中,方法、變量都是用一個整型作為鍵值來存儲在st_table中,是以這個資料結構對于以整性為鍵值的map類型來說速度非常不錯(我沒有測試記憶體的占用情況)。
源碼如下:
//接口,用于定義hash函數
//hashfunction.java
public interface hashfunction<t> {
public int hash(t key);
}
連結清單元素類:
public class sttableentry<t, v> {
protected int hash; //hash值
protected t key; //鍵
protected v value; //存儲值
protected sttableentry<t, v> next; //下一節點
public sttableentry() {
}
public sttableentry(int hash, t key, v value, sttableentry<t, v> next) {
super();
this.hash = hash;
this.key = key;
this.value = value;
this.next = next;
public int gethash() {
return hash;
public void sethash(int hash) {
public t getkey() {
return key;
public void setkey(t key) {
public sttableentry<t, v> getnext() {
return next;
public void setnext(sttableentry<t, v> next) {
public v getvalue() {
return value;
public void setvalue(v value) {
完整的sttable實作,沒有實作remove,(update:添加了remove,removeall()方法以及getsize()方法):
public final class sttable<t, v> {
private hashfunction<t> hashfunction;
private int num_bins;
int num_entries;
sttableentry<t, v>[] bins;
public static int default_size = 11;
private static int default_max_density = 5;
private static int default_min_size = 8;
private static long primes[] = { 8 + 3, 16 + 3, 32 + 5, 64 + 3, 128 + 3,
256 + 27, 512 + 9, 1024 + 9, 2048 + 5, 4096 + 3, 8192 + 27,
16384 + 43, 32768 + 3, 65536 + 45, 131072 + 29, 262144 + 3,
524288 + 21, 1048576 + 7, 2097152 + 17, 4194304 + 15, 8388608 + 9,
16777216 + 43, 33554432 + 35, 67108864 + 15, 134217728 + 29,
268435456 + 3, 536870912 + 11, 1073741824 + 85, 0 };
public sttable(hashfunction<t> hashfunction) {
this.hashfunction = hashfunction;
this.num_bins = default_size;
this.num_entries = 0;
this.bins = new sttableentry[this.num_bins];
public sttable(hashfunction<t> hashfunction, int size) {
if (size == 0)
throw new illegalargumentexception(
"the size could not less than zero:" + size);
this.num_bins = size;
private long newsize(int size) {
for (int i = 0, newsize = default_min_size; i < primes.length; i++, newsize <<= 1) {
if (newsize > size)
return primes[i];
}
/* ran out of polynomials */
return -1; /* should raise exception */
public v get(t key) {
int hash_val = dohash(key);
sttableentry<t, v> entry = findentry(hash_val, key);
if (entry == null)
return null;
else
return entry.getvalue();
public v put(t key, v value) {
if (entry == null) {
// 未有鍵值,直接添加
adddirect(key, value);
return value;
} else {
v v = entry.value;
entry.value = value;
return v;
public v remove(t key) {
int bin_pos = hash_val % this.num_bins;
sttableentry<t, v> entry = this.bins[bin_pos];
// 記錄前一節點,考慮修改采用雙向連結清單也可
sttableentry<t, v> prev = null;
if (entrynotequal(entry, key, hash_val)) {
prev = entry;
entry = entry.next;
while (entrynotequal(entry, key, hash_val)) {
prev = entry;
entry = entry.next;
}
else {
if (prev != null)
prev.next = entry.next; // 前一節點的next連接配接到下一節點
else
this.bins[bin_pos] = entry.next; // entry恰好是第一個節點,将數組元素設定為next
entry = null; // gc友好
this.num_entries=0;
public void removeall() {
for (int i = 0; i < this.bins.length; i++) {
sttableentry<t, v> entry = this.bins[i];
this.bins[i] = null;
sttableentry<t, v> temp = entry;
if (entry == null)
continue;
while (entry != null) {
entry = null;
this.num_entries--;
entry = temp.next;
temp = entry;
temp = null;
entry = null;
public int getsize() {
return this.num_entries;
public set<t> keyset() {
set<t> keys = new hashset<t>(this.num_entries);
for (int i = 0; i < this.bins.length; i++) {
sttableentry<t, v> entry = this.bins[i];
if (entry == null)
continue;
while (entry != null) {
keys.add(entry.key);
entry = entry.next;
}
}
return keys;
}
// hash函數,調用hashfunction的hash方法
private int dohash(t key) {
if (hashfunction.hash(key) < 0)
"hash value could not less than zero:"
+ hashfunction.hash(key));
return hashfunction.hash(key);
// 過于擁擠,重新分布
private void rehash() {
int new_size = (int) newsize(this.num_bins);
sttableentry<t, v>[] new_bins = (sttableentry<t, v>[]) new sttableentry[new_size];
for (int i = 0; i < this.num_bins; i++) {
sttableentry<t, v> next = entry.next;
int hash_val = entry.hash % new_size;
entry.next = new_bins[hash_val];
new_bins[hash_val] = entry;
entry = next;
this.bins = null;// gc友好
this.num_bins = new_size;
this.bins = new_bins;
private void adddirect(t key, v value) {
if ((this.num_entries / this.num_bins) > default_max_density) {
rehash();
bin_pos = hash_val % this.num_bins;
sttableentry<t, v> entry = new sttableentry<t, v>();
entry.sethash(hash_val);
entry.setkey(key);
entry.setvalue(value);
entry.setnext(this.bins[bin_pos]);
this.bins[bin_pos] = entry;
this.num_entries++;
private sttableentry<t, v> findentry(int hash_val, t key) {
return entry;
// 判斷元素是否相同
private boolean entrynotequal(sttableentry<t, v> entry, t key, int hash_val) {
return entry != null
&& (entry.gethash() != hash_val || (!key.equals(entry.getkey())));
單元測試類就不列了,給一個與hashmap的簡單性能對比,以整型為鍵,顯然sttable快多了,對于字元串型,關鍵是hashfunction的定義,我直接調用string的hashcode方法,不知道有沒有其他更好的方法讓元素分布的更均勻些:
import java.util.hashmap;
import java.util.map;
public class benchmark {
public static void main(string args[]) {
long map_cost = teststringmap();
long table_cost = teststringtable();
if (map_cost <= table_cost)
system.out.println("map is faster than table ");
system.out.println("table is faster than map ");
map_cost = testintegermap();
table_cost = testintegertable();
public static long testintegermap() {
map<integer, integer> map = new hashmap<integer, integer>();
long start = system.nanotime();
for (int i = 0; i < 10000; i++)
map.put(i, i);
long result = 0;
result += map.get(i);
long end = system.nanotime();
system.out.println("result:" + result);
system.out.println("map:" + (end - start));
return (end - start);
public static long testintegertable() {
hashfunction<integer> inthash = new hashfunction<integer>() {
public int hash(integer key) {
return key;
};
sttable<integer, integer> table = new sttable<integer, integer>(inthash);
table.put(i, i);
result += table.get(i);
system.out.println("table:" + (end - start));
public static long teststringmap() {
map<string, string> map = new hashmap<string, string>();
map.put(string.valueof(i), string.valueof(i));
result += integer.parseint(map.get(string.valueof(i)));
public static long teststringtable() {
hashfunction<string> inthash = new hashfunction<string>() {
int i = 0;
public int hash(string key) {
int hashcode = key.hashcode();
return hashcode < 0 ? -hashcode : hashcode;
sttable<string, string> table = new sttable<string, string>(inthash);
table.put(string.valueof(i), string.valueof(i));
result += integer.parseint(table.get(string.valueof(i)));
結果為:
result:49995000
map:55501468
table:60999652
map is faster than table
map:44634444
table:26209477
table is faster than map
将get換成remove方法,結果也與上面的類似。
文章轉自莊周夢蝶 ,原文釋出時間 2007-09-18 <b></b>