laitimes

OceanBase global index and local index exploration

author:Acerson open source community

OceanBase global index and local index exploration results in local and cross-region lookups.

Author: The screen name big data model, knows a little more about manufacturing, banking, and communications, and cares about the application practice of domestic database technology sermon and data asset construction.

Produced by the Aikesheng open source community, the original content shall not be used arbitrarily without authorization, please contact the editor and indicate the source for reprinting.

This article is about 1200 words and is expected to take 4 minutes to read.

OceanBase index

Indexes and partitions are the key core basic functions of the database, OceanBase is a monolithic distributed architecture, with high performance, high scalability, high availability characteristics, indexing and partitioning has made a great contribution.

OceanBase indexes have local indexes and global indexes. What is the difference between indexes for local indexes and global indexes? The following is a practical example to demonstrate how to optimize OceanBase. When reading, pay attention to the following key reference indicators for optimization.

  • is_index_back: Indicates whether the query SQL has returned to the table, and the numeric value is off.
  • is_global_index: Indicates whether the global index has been activated.
  • physical_range_rows: Represents the physical range row read, the lower the number, the better.
  • logical_range_rows represents the logical range row read, with a smaller number being better.
  • Plan type: There are 3 options, the best is LOCAL for local calls, REMOTE for remote calls, and worst case DISTRIBUTED.

Prepare the environment

Make a user1 table and populate it with some data.

CREATE TABLE `user1` (
  `id` int(11) DEFAULT NULL,
  `name` varchar(100) DEFAULT NULL,
  `phone` int(12) DEFAULT NULL,
  `address` varchar(100) DEFAULT NULL
) partition by hash(id+1) partitions 3;
obclient [tpch]> select count(*) from  user1
    -> ;
+----------+
| count(*) |
+----------+
|    79993 |
+----------+
1 row in set (0.025 sec)


delimiter //
create procedure bulk_user()
begin
declare i int;
declare phone int;
 set i=100000;
 set phone=1592014273;
 while i<1000001 do
   insert INTO user (id,name ,phone,address) values (i,'yang',phone+i,'address');
 set i=i+1;
 end while;
end
//
delimiter ;
           

Index usage scenarios

Scenario 1: No indexing

obclient [tpch]> explain extended  select phone ,name  from  user1 where   phone = 1592014286;
           
OceanBase global index and local index exploration
OceanBase global index and local index exploration

Index test, before adding indexes, the target object was found by scanning the whole disk, and no return table was generated during the operation.

Scenario 2: Add a local index

obclient [tpch]> create index idx_user1_phone on user1 (phone) local;
Query OK, 0 rows affected (3.152 sec)

explain extended select phone,name from user1 where phone = 1592014286;
           
OceanBase global index and local index exploration
OceanBase global index and local index exploration

Index test, after adding a partial index, only 791 rows were found in the hard disk scan, and a table return operation was actually generated during the operation.

Scenario 3: Partition search

obclient [tpch]> explain extended select name, phone  from user1  where  id= 5000;
           
OceanBase global index and local index exploration
OceanBase global index and local index exploration

Partition test, by partitioning for keyword lookup, by partition [26664] lookup, because there is no index setting, traverse all 26664, no return table is generated.

Scenario 4: Partition and index for search

obclient [tpch]>  create index idx_user1_id on user1 (id) local;
Query OK, 0 rows affected (3.379 sec)
obclient [tpch]> explain extended select name, phone  from user1  where  id= 5000;
           
OceanBase global index and local index exploration
OceanBase global index and local index exploration

Partition plus index test, id is both an index and a partition. After setting the index, the search by partition [26664] is good, and the physical_range_rows and logical_range_rows results are gratifying, but the operation of returning to the table occurs.

Why is the table back? The main statements are select name, phone, id is indexed, but name and phone are projections, thus making a return table.

Scenario 5: Create a unique index to eliminate back to the table

obclient [tpch]> create  unique index  idx_user_phone_name  on  user1 (phone,name) local ;
ERROR 1503 (HY000): A UNIQUE INDEX must include all columns in the table's partitioning function

提示 1503 错误,创建唯一索引必须指定分区的指定

obclient [tpch]> create  unique index  idx_user_id_phone_name  on  user1 (phone,name,id) local ;
Query OK, 0 rows affected (3.352 sec)

explain extended  select phone ,name  from  user1 where   phone = 1592014286;
           
OceanBase global index and local index exploration
OceanBase global index and local index exploration

Why must a unique index contain an embedded partition ID, must be unique, must include a primary key column, and is based on a local index plus a unique index, and does not produce a return table.

Scenario 6: Create a global index to eliminate back tables

create unique index   global_idx_phone  on  user1(phone,name) global ;

explain extended  select phone ,name  from  user1 where   phone = 1592014286;
           
OceanBase global index and local index exploration
OceanBase global index and local index exploration

Global indexes follow phone, name can also eliminate back and forth tables.

summary

OceanBase is a database with a single distributed architecture, the first principle of tuning follows the characteristics of first monolithic and then distributed, in short, it is best to use up the performance of the stand-alone in the inner cycle, and then use the distributed in the outer cycle, and strive for LOCAL priority, REMOTE second, DISTRUBTE is the worst, and the comprehensive execution status should be combined with the scan data range and the return to the table situation.

LOCAL INDEXES SHOULD BE USED IN SCENARIOS THAT STRIVE FOR LOCAL AND AVOID DISTRUPTE. SCENARIO 2, SCENE 3, SCENE 4 USE LOCAL, BUT SCENARIO FIVE USES DISTRUPTE. The thought-off is that partition keys must be combined to complete unique index creation. The internal logic here is that local indexes must be bound to partition keys to complete cross-domain.

The global index can also implement the LOCAL scenario, see Scenario 6. The author will do OceanBase's distributed environment in the future. ASSUMING THAT IN A BUSINESS SCENARIO WITH A DISTRIBUTED ENVIRONMENT AND A LOT OF DATA, THE AUTHOR SPECULATES THAT THERE IS A GREATER OPPORTUNITY FOR DISTRUPTE.

For more technical articles, please visit: https://opensource.actionsky.com/