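Let's start exploring HBase internals with table creation. Assume the HBase root directory is /new and the test table is named t1.
The client calls the createTable interface of HBaseAdmin. The process is as follows:
1. Establish an RPC connection to HMaster and invoke it. Because HMaster creates the table asynchronously, this call is asynchronous. If no split keys are specified, a single empty region is created by default.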
getMaster().createTable(desc, splitKeys);
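To make the relationship between split keys and initial regions concrete, here is a minimal sketch. It assumes that n sorted split keys produce n + 1 regions and that an empty key marks an open boundary. `SplitKeySketch`, `RegionRange` and `regionsFor` are hypothetical names for illustration, not HBase API.

```java
import java.util.ArrayList;
import java.util.List;

class SplitKeySketch {
    /** Hypothetical holder for a region's [startKey, endKey) range; "" means unbounded. */
    static final class RegionRange {
        final String startKey;
        final String endKey;

        RegionRange(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }
    }

    /** Builds the initial ranges for sorted split keys; null or empty yields one region. */
    static List<RegionRange> regionsFor(String[] splitKeys) {
        List<RegionRange> regions = new ArrayList<>();
        if (splitKeys == null || splitKeys.length == 0) {
            // No split rule: a single region covering the whole key space.
            regions.add(new RegionRange("", ""));
            return regions;
        }
        String start = "";
        for (String split : splitKeys) {
            regions.add(new RegionRange(start, split));
            start = split;
        }
        // The last region runs from the final split key to the end of the key space.
        regions.add(new RegionRange(start, ""));
        return regions;
    }

    public static void main(String[] args) {
        for (RegionRange r : regionsFor(new String[] {"g", "n"})) {
            System.out.println("[" + r.startKey + ", " + r.endKey + ")");
        }
    }
}
```

With no split keys the sketch returns one range with empty start and end keys, which corresponds to the single empty region mentioned above.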
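2. The client thread does a full scan of the META table and checks whether all regions of t1 have been assigned. By default it retries up to 100 times, sleeping after each failed attempt.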
MetaScannerVisitor visitor = new MetaScannerVisitorBase() {
  @Override
  public boolean processRow(Result rowResult) throws IOException {
    HRegionInfo info = Writables.getHRegionInfoOrNull(
        rowResult.getValue(HConstants.CATALOG_FAMILY,
            HConstants.REGIONINFO_QUALIFIER));
    ......
    // Read the 'server' column; if it has a value, the region is considered assigned
    byte [] value = rowResult.getValue(HConstants.CATALOG_FAMILY,
        HConstants.SERVER_QUALIFIER);
    // Make sure that regions are assigned to server
    if (value != null && value.length > 0) {
      hostAndPort = Bytes.toString(value);
    }
    if (!(info.isOffline() || info.isSplit()) && hostAndPort != null) {
      actualRegCount.incrementAndGet();
    }
    return true;
  }
};
MetaScanner.metaScan(conf, visitor, desc.getName());
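The retry loop around this scan is not shown in the excerpt, so the following is only a sketch of how such a wait could look. It assumes the META rows are modelled as a plain map from region name to the 'server' value. `AssignmentPollSketch`, `countAssigned`, `waitForAssignment` and their parameters are hypothetical, not the actual client code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

class AssignmentPollSketch {
    /** Counts rows whose 'server' value is non-empty, mirroring the visitor's check. */
    static int countAssigned(Map<String, String> metaRows) {
        int assigned = 0;
        for (String server : metaRows.values()) {
            if (server != null && !server.isEmpty()) {
                assigned++;
            }
        }
        return assigned;
    }

    /** Re-scans until all expected regions are assigned or the retries run out. */
    static boolean waitForAssignment(Supplier<Map<String, String>> metaScan,
            int expectedRegions, int maxRetries, long pauseMillis) {
        for (int attempt = 0; attempt < maxRetries; attempt++) {
            if (countAssigned(metaScan.get()) >= expectedRegions) {
                return true;
            }
            try {
                Thread.sleep(pauseMillis); // sleep after a failed attempt
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, String> rows = new HashMap<>();
        rows.put("t1,,1", "host1:60020"); // made-up region name and server
        System.out.println(waitForAssignment(() -> rows, 1, 100, 1000L));
    }
}
```

The caller passes the number of regions it asked for, so a table created with split keys is only reported as ready once every region has a server recorded.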
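Now let's look at the create table RPC interface on HMaster.
1. Construct the CreateTableHandler
1.1 Wait for the META table to become available. Once it is, get the location of the first META region and establish an RPC connection to it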
public ServerName waitForMeta(long timeout)
throws InterruptedException, IOException, NotAllMetaRegionsOnlineException {
  long stop = System.currentTimeMillis() + timeout;
  long waitTime = Math.min(50, timeout);
  synchronized (metaAvailable) {
    while(!stopped && (timeout == 0 || System.currentTimeMillis() < stop)) {
      if (getMetaServerConnection() != null) {
        return metaLocation;
      }
      // perhaps -ROOT- region isn't available, let us wait a bit and retry.
      metaAvailable.wait(waitTime);
    }
    if (getMetaServerConnection() == null) {
      throw new NotAllMetaRegionsOnlineException("Timed out (" + timeout + "ms)");
    }
    return metaLocation;
  }
}
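The wait-with-timeout pattern used by waitForMeta can be isolated in a small, self-contained sketch. Here a plain string stands in for the META location. Unlike the method above, the sketch returns null on timeout instead of throwing. `AvailabilityWaitSketch`, `setLocation` and `waitForLocation` are hypothetical names.

```java
class AvailabilityWaitSketch {
    private final Object lock = new Object();
    private String location; // null until the resource becomes available

    /** Publishes the location and wakes up every waiting thread. */
    void setLocation(String newLocation) {
        synchronized (lock) {
            location = newLocation;
            lock.notifyAll();
        }
    }

    /** Waits up to timeoutMillis (0 means wait indefinitely); returns null on timeout. */
    String waitForLocation(long timeoutMillis) {
        long stop = System.currentTimeMillis() + timeoutMillis;
        synchronized (lock) {
            while (location == null) {
                long remaining = (timeoutMillis == 0) ? 50 : stop - System.currentTimeMillis();
                if (remaining <= 0) {
                    return null;
                }
                try {
                    // Wake up at least every 50 ms to re-check the condition.
                    lock.wait(Math.min(50, remaining));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return null;
                }
            }
            return location;
        }
    }

    public static void main(String[] args) {
        AvailabilityWaitSketch meta = new AvailabilityWaitSketch();
        new Thread(() -> meta.setLocation("host1:60020")).start();
        System.out.println(meta.waitForLocation(1000));
    }
}
```

Re-checking the condition inside a loop guards against spurious wakeups. The short wait interval bounds how long a waiter sleeps if a notification is missed.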
1.2 Check whether table t1 already exists
1.3 Create the znode for t1 in ZooKeeper and set its state to 'enabling'. The znode path is /hbase/table/t1
private void setTableState(final String tableName, final TableState state)
throws KeeperException {
  String znode = ZKUtil.joinZNode(this.watcher.tableZNode, tableName);
  if (ZKUtil.checkExists(this.watcher, znode) == -1) {
    ZKUtil.createAndFailSilent(this.watcher, znode);
  }
  synchronized (this.cache) {
    ZKUtil.setData(this.watcher, znode, Bytes.toBytes(state.toString()));
    this.cache.put(tableName, state);
  }
}
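To show the create-if-missing, set-data, update-cache sequence without a ZooKeeper cluster, the sketch below replaces ZooKeeper with an in-memory map. `TableStateSketch`, its `TableState` enum and the accessor methods are assumptions made for illustration and do not reproduce the real classes.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

class TableStateSketch {
    /** Hypothetical states; the text above only mentions 'enabling' and 'enabled'. */
    enum TableState { ENABLING, ENABLED }

    private final Map<String, byte[]> znodes = new HashMap<>(); // stands in for ZooKeeper
    private final Map<String, TableState> cache = new HashMap<>();
    private final String tableZNode;

    TableStateSketch(String tableZNode) {
        this.tableZNode = tableZNode;
    }

    /** Creates the table znode if missing, stores the state, then updates the cache. */
    void setTableState(String tableName, TableState state) {
        String znode = tableZNode + "/" + tableName;
        synchronized (cache) {
            znodes.putIfAbsent(znode, new byte[0]);
            znodes.put(znode, state.toString().getBytes(StandardCharsets.UTF_8));
            cache.put(tableName, state);
        }
    }

    /** Returns the locally cached state, or null if the table is unknown. */
    TableState cachedState(String tableName) {
        synchronized (cache) {
            return cache.get(tableName);
        }
    }

    /** Returns the data stored at the znode path, or null if it does not exist. */
    String storedData(String znode) {
        synchronized (cache) {
            byte[] data = znodes.get(znode);
            return data == null ? null : new String(data, StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) {
        TableStateSketch states = new TableStateSketch("/hbase/table");
        states.setTableState("t1", TableState.ENABLING);
        System.out.println(states.storedData("/hbase/table/t1"));
    }
}
```

Writing the stored value and the cache entry under the same lock keeps the two views consistent for readers of the cache.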
2. Submit the CreateTableHandler asynchronously
this.executorService.submit(new CreateTableHandler(this,
this.fileSystemManager, this.serverManager, hTableDescriptor, conf,
newRegions, catalogTracker, assignmentManager));
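The effect of submitting the handler to an executor is that the RPC can return before the table actually exists. The sketch below makes that ordering visible with a latch. `AsyncCreateSketch` and `createTable` are hypothetical, and the latch is there only to make the demonstration deterministic.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class AsyncCreateSketch {
    /** Submits a stand-in handler and records the order in which things happen. */
    static List<String> createTable(String tableName) {
        List<String> events = Collections.synchronizedList(new ArrayList<>());
        ExecutorService executor = Executors.newSingleThreadExecutor();
        CountDownLatch rpcReturned = new CountDownLatch(1);
        try {
            Future<?> handler = executor.submit(() -> {
                try {
                    rpcReturned.await(); // hold the handler back so the ordering is visible
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                events.add("handler finished: " + tableName);
            });
            events.add("rpc returned"); // submit() did not wait for the handler
            rpcReturned.countDown();
            handler.get(); // only the demo waits here; a real caller would poll instead
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(createTable("t1"));
    }
}
```

This is why the client has to scan the META table afterwards: a successful return from the RPC only means the work was queued.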
3. The CreateTableHandler runs
3.1 Write the table's metadata to a .tableinfo file on HDFS. The file path is /new/t1/.tableinfo.0000000001.
private static Path writeTableDescriptor(final FileSystem fs,
    final HTableDescriptor hTableDescriptor, final Path tableDir,
    final FileStatus status)
throws IOException {
  // Get temporary dir into which we'll first write a file to avoid
  // half-written file phenomenon.
  // Write into the tmp directory first
  Path tmpTableDir = new Path(tableDir, ".tmp");
  // Sequence id, starting from 0
  int currentSequenceid =
    status == null? 0: getTableInfoSequenceid(status.getPath());
  int sequenceid = currentSequenceid;
  // Put arbitrary upperbound on how often we retry
  int retries = 10;
  int retrymax = currentSequenceid + retries;
  Path tableInfoPath = null;
  do {
    sequenceid += 1;
    // HDFS file name, e.g. .tableinfo.0000000001
    Path p = getTableInfoFileName(tmpTableDir, sequenceid);
    if (fs.exists(p)) {
      LOG.debug(p + " exists; retrying up to " + retries + " times");
      continue;
    }
    try {
      // Write the content
      writeHTD(fs, p, hTableDescriptor);
      tableInfoPath = getTableInfoFileName(tableDir, sequenceid);
      // Rename to the final file
      if (!fs.rename(p, tableInfoPath)) {
        throw new IOException("Failed rename of " + p + " to " + tableInfoPath);
      }
    }
    .......
    break;
  } while (sequenceid < retrymax);
  return tableInfoPath;
}
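The write-to-tmp-then-rename idea can be tried on a local file system with java.nio.file instead of the Hadoop FileSystem API. This is only a sketch of the pattern: `TableInfoWriteSketch`, `tableInfoFileName` and `writeDescriptor` are hypothetical names, and the retry loop over sequence ids is left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

class TableInfoWriteSketch {
    /** Formats a name such as .tableinfo.0000000001 (zero-padded to 10 digits). */
    static String tableInfoFileName(int sequenceId) {
        return String.format(Locale.ROOT, ".tableinfo.%010d", sequenceId);
    }

    /** Writes the content under tableDir/.tmp first, then moves it into tableDir. */
    static Path writeDescriptor(Path tableDir, int sequenceId, String content) {
        try {
            Path tmpDir = Files.createDirectories(tableDir.resolve(".tmp"));
            Path tmpFile = tmpDir.resolve(tableInfoFileName(sequenceId));
            Files.write(tmpFile, content.getBytes(StandardCharsets.UTF_8));
            Path finalFile = tableDir.resolve(tableInfoFileName(sequenceId));
            // Readers of tableDir never observe a half-written file.
            return Files.move(tmpFile, finalFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path tableDir = Paths.get(System.getProperty("java.io.tmpdir"),
                "tableinfo-sketch-" + System.nanoTime(), "t1");
        System.out.println(writeDescriptor(tableDir, 1, "descriptor of t1"));
    }
}
```

Because the file only appears under its final name after the content is complete, a crash in the middle of the write leaves at most a stray file in .tmp.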
3.2 Create the region
HRegion region = HRegion.createHRegion(newRegion,
this.fileSystemManager.getRootDir(), this.conf,
this.hTableDescriptor, null, false, true);
3.3 Add a row to the META table, writing the regioninfo column
private static Put makePutFromRegionInfo(HRegionInfo regionInfo)
throws IOException {
  Put put = new Put(regionInfo.getRegionName());
  put.add(HConstants.CATALOG_FAMILY, HConstants.REGIONINFO_QUALIFIER,
      Writables.getBytes(regionInfo));
  return put;
}
3.4 Close the region
3.5 Get the live region servers from ZooKeeper
// Read the servers under /hbase/rs and filter out the dead ones
List<ServerName> servers = serverManager.getOnlineServersList();
// Remove the deadNotExpired servers from the server list.
assignmentManager.removeDeadNotExpiredServers(servers);
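3.6 Assign the regions. By default they are distributed randomly and evenly across the servers, using multiple threads for bulk assignment. The calling thread waits until all regions have been assigned. The assignment process will be covered in detail in the next post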
this.assignmentManager.assignUserRegions(Arrays.asList(newRegions),
servers);
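As a rough illustration of "random but even" placement, the sketch below shuffles the live servers and then deals the regions out round-robin. This is a stand-in technique chosen for clarity, not the balancer code that assignUserRegions actually calls. `RandomAssignmentSketch` and `assign` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

class RandomAssignmentSketch {
    /** Shuffles the servers, then deals regions round-robin so per-server counts differ by at most one. */
    static Map<String, List<String>> assign(List<String> regions, List<String> servers,
            Random random) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("no live servers to assign to");
        }
        List<String> shuffled = new ArrayList<>(servers);
        Collections.shuffle(shuffled, random);
        Map<String, List<String>> plan = new LinkedHashMap<>();
        for (String server : shuffled) {
            plan.put(server, new ArrayList<>());
        }
        for (int i = 0; i < regions.size(); i++) {
            String server = shuffled.get(i % shuffled.size());
            plan.get(server).add(regions.get(i));
        }
        return plan;
    }

    public static void main(String[] args) {
        List<String> regions = Arrays.asList("r1", "r2", "r3", "r4", "r5");
        List<String> servers = Arrays.asList("s1", "s2");
        System.out.println(assign(regions, servers, new Random()));
    }
}
```

The shuffle decides which servers receive the extra regions when the count does not divide evenly, while the round-robin pass keeps the load balanced.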
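3.7 Set the state of t1's znode in ZooKeeper to 'enabled'. The znode path is /hbase/table/t1
Summary
Creating a table mainly involves writing the table metadata, assigning regions, updating the state information in ZooKeeper, and modifying and checking the META table.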