Atomic Upsert
To support atomic upsert, an optional ON DUPLICATE KEY clause, similar to the MySQL syntax, has been incorporated into the UPSERT VALUES command as of Phoenix 4.9. The general syntax is described in the UPSERT VALUES section of the Phoenix grammar reference. This feature provides a superset of the HBase Increment and CheckAndPut functionality to enable atomic upserts. On the server side, when the commit is processed, the row being updated is locked while the current column values are read and the ON DUPLICATE KEY clause is executed. Because the row must be locked and read whenever the ON DUPLICATE KEY clause is used, there is a performance penalty (much like the one for an HBase CheckAndPut compared with a plain Put).
In the presence of the ON DUPLICATE KEY clause, if the row already exists, the VALUES specified will be ignored and instead either:
the row will not be updated if ON DUPLICATE KEY IGNORE is specified or
the row will be updated (under lock) by executing the expressions following the ON DUPLICATE KEY UPDATE clause.
Multiple UPSERT statements for the same row in the same commit batch will be processed in the order of their execution. Thus the same result will be produced when auto commit is on or off.
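The semantics described above can be illustrated with a small in-memory model. This is only a sketch and not Phoenix code: the Table class, its upsert method, and the way the clause is passed in (None, the string "IGNORE", or a dict of column functions) are all hypothetical. It also assumes that every UPDATE expression is evaluated against the row as it was read under the lock, and it uses a single lock where the text describes a per-row lock.

```python
import threading


class Table:
    """Toy stand-in for a table keyed by primary key (hypothetical, not Phoenix)."""

    def __init__(self):
        self._rows = {}
        # One lock for the whole table keeps the sketch short;
        # the text above describes locking only the row being updated.
        self._lock = threading.Lock()

    def upsert(self, key, values, on_duplicate=None):
        """Apply one UPSERT.

        on_duplicate is None (plain upsert), the string "IGNORE", or a dict
        mapping a column name to a function of the current row.
        """
        with self._lock:
            current = self._rows.get(key)
            if current is None or on_duplicate is None:
                # New row, or a plain upsert: the VALUES are written.
                row = dict(current or {})
                row.update(values)
                self._rows[key] = row
            elif on_duplicate == "IGNORE":
                # Row exists: VALUES are ignored and nothing is written.
                pass
            else:
                # Row exists: VALUES are ignored; expressions run under the lock.
                # Assumption of this sketch: each one sees the row as read.
                updates = {col: fn(current) for col, fn in on_duplicate.items()}
                current.update(updates)

    def get(self, key):
        with self._lock:
            return dict(self._rows[key])


table = Table()
increment = {
    "counter1": lambda row: row["counter1"] + 1,
    "counter2": lambda row: row["counter2"] + 1,
}
# Three statements for the same row, applied in execution order.
for _ in range(3):
    table.upsert("abc", {"counter1": 0, "counter2": 0}, increment)
print(table.get("abc"))  # {'counter1': 2, 'counter2': 2}
```

The first statement finds no row and writes the VALUES (0, 0); the next two find the row, ignore the VALUES, and increment each counter, which is why the result is 2 rather than 3.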
For example, to atomically increment two counter columns, you would execute the following command:
UPSERT INTO my_table(id, counter1, counter2) VALUES ('abc', 0, 0) ON DUPLICATE KEY UPDATE counter1 = counter1 + 1, counter2 = counter2 + 1;
To set a column only if the row does not yet exist:
UPSERT INTO my_table(id, my_col) VALUES ('abc', 100) ON DUPLICATE KEY IGNORE;
Note that arbitrarily complex expressions may be used in this new clause:
UPSERT INTO my_table(id, total_deal_size, deal_size) VALUES ('abc', 0, 100)
ON DUPLICATE KEY UPDATE
    total_deal_size = total_deal_size + deal_size,
    approval_reqd = CASE WHEN total_deal_size < 100 THEN 'NONE'
    WHEN total_deal_size < 1000 THEN 'MANAGER APPROVAL'
    ELSE 'VP APPROVAL' END;
The following limitations are enforced for the ON DUPLICATE KEY clause usage:
Primary key columns may not be updated, since this would essentially be creating a new row.
Transactional tables may not use this clause, as atomic upserts are already possible through exception handling when a conflict occurs.
Immutable tables may not use this clause, as by definition there should be no updates to existing rows.
The CURRENT_SCN property may not be set on the connection when this clause is used, as HBase does not handle atomicity unless the latest value is being updated.
The same column should not be updated more than once in the same statement.
No aggregation or references to sequences are allowed within the clause.
Global indexes on columns being atomically updated are not supported, as potentially a separate RPC across the wire would be made while the row is under lock to maintain the secondary index.
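The limitations above lend themselves to a client-side pre-flight check before a statement is sent. The sketch below is one possible shape for such a check and is not part of Phoenix: TableInfo and check_on_duplicate_key are hypothetical names, and the caller is assumed to know the table's metadata already. The rule about aggregation and sequences is left out because it would require parsing the expressions.

```python
from dataclasses import dataclass, field


@dataclass
class TableInfo:
    """Hypothetical summary of table metadata; Phoenix does not expose this class."""
    primary_key: set
    transactional: bool = False
    immutable: bool = False
    globally_indexed: set = field(default_factory=set)


def check_on_duplicate_key(table, updated_columns, current_scn_set=False):
    """Return the limitations violated by an ON DUPLICATE KEY UPDATE clause.

    updated_columns lists the assigned column names in order, repeats included.
    """
    problems = []
    if table.transactional:
        problems.append("table is transactional")
    if table.immutable:
        problems.append("table is immutable")
    if current_scn_set:
        problems.append("CURRENT_SCN is set on the connection")
    seen = set()
    for col in updated_columns:
        if col in table.primary_key:
            problems.append(f"primary key column updated: {col}")
        if col in table.globally_indexed:
            problems.append(f"column has a global index: {col}")
        if col in seen:
            problems.append(f"column updated more than once: {col}")
        seen.add(col)
    return problems


info = TableInfo(primary_key={"id"}, globally_indexed={"deal_size"})
print(check_on_duplicate_key(info, ["counter1", "counter2"]))  # []
print(check_on_duplicate_key(info, ["id", "counter1", "counter1"]))
```

An empty list means none of the checked limitations is violated; the server remains the authority on whether the statement is accepted.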