ULID: Universally Unique Lexicographically Sortable Identifier
UUID: Universally Unique Identifier
Why not choose UUID
The original UUID specification (RFC 4122) lists five versions:
Version 1: Impractical in many environments because it requires access to a unique, stable MAC address, and embedding that address in the ID makes it vulnerable to attack;
Version 2: Replaces the low 32 bits of the version 1 timestamp with a POSIX UID or GID, so it has the same problems as version 1;
Version 3: Based on the MD5 hashing algorithm; it requires a unique seed and produces randomly distributed IDs, which can cause fragmentation in many data structures;
Version 4: Generated from random or pseudo-random numbers; it carries no information other than randomness;
Version 5: Based on the SHA-1 hashing algorithm; it requires a unique seed and produces randomly distributed IDs, which can cause fragmentation in many data structures.
UUID version 4 is the usual choice, but because it is purely random, collisions remain possible, however unlikely.
Unlike UUIDs, which are based on either random numbers or timestamps, ULIDs combine both. The timestamp has millisecond precision, and each millisecond has room for 1.21e+24 (2^80) random values, so the risk of collision is negligible in practice. The string form is also friendlier than a UUID's.
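The figure 1.21e+24 is simply 2^80, the number of values an 80-bit random component can take. As a rough check, the sketch below also estimates the chance of at least one collision among n IDs generated in the same millisecond, using the standard birthday approximation. `collision_probability` is a hypothetical helper name, and the estimate assumes uniformly distributed random values.

```python
import math

RANDOM_BITS = 80
VALUES_PER_MS = 2 ** RANDOM_BITS  # distinct random components per millisecond


def collision_probability(n: int) -> float:
    """Birthday approximation: P(at least one collision) among n random values."""
    # 1 - exp(-n(n-1) / 2N), computed with expm1 to keep precision for tiny results.
    return -math.expm1(-n * (n - 1) / (2 * VALUES_PER_MS))


print(f"{VALUES_PER_MS:.2e}")            # 1.21e+24
print(collision_probability(1_000_000))  # on the order of 4e-13
```

Even a million IDs generated in a single millisecond collide with a probability far below one in a trillion, which is why the risk is treated as negligible rather than literally zero.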
ULID characteristics
ulid() # 01ARZ3NDEKTSV4RRFFQ69G5FAV
- 128-bit compatibility with UUID
- 1.21e+24 unique ULIDs per millisecond
- Lexicographically sortable (i.e. in dictionary order)
- Canonically encoded as a 26-character string, as opposed to the 36-character UUID
- Uses Crockford's Base32 for better efficiency and readability (5 bits per character)
- Case insensitive
- No special characters (URL safe)
- Monotonic sort order (correctly detects and handles IDs generated in the same millisecond)
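Monotonic ordering within one millisecond is commonly achieved by reusing the previous timestamp and incrementing the previous random component by one instead of drawing fresh randomness. The following is a minimal sketch of that idea under stated assumptions: `MonotonicGenerator` and `next_value` are hypothetical names, not ulid-py's API, and the generator returns the raw 128-bit integer rather than the encoded string.

```python
import os
import time


class MonotonicGenerator:
    """Hypothetical helper that yields strictly increasing 128-bit ULID values."""

    def __init__(self) -> None:
        self._last_ms = -1
        self._last_rand = 0

    def next_value(self, now_ms=None) -> int:
        if now_ms is None:
            now_ms = time.time_ns() // 1_000_000
        if now_ms <= self._last_ms:
            # Same millisecond (or the clock moved backwards):
            # keep the old timestamp and bump the random part by one.
            self._last_rand += 1
            if self._last_rand >= 1 << 80:
                raise OverflowError("randomness exhausted within one millisecond")
        else:
            # New millisecond: draw 80 fresh random bits.
            self._last_ms = now_ms
            self._last_rand = int.from_bytes(os.urandom(10), "big")
        # Timestamp in the high 48 bits, randomness in the low 80 bits.
        return (self._last_ms << 80) | self._last_rand


gen = MonotonicGenerator()
first = gen.next_value(1000)
second = gen.next_value(1000)  # same millisecond as the first call
print(second == first + 1)
```

The trade-off is that IDs created in the same millisecond become predictable from one another, so this approach suits ordering guarantees better than secrecy.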
ULID specification
The following is the current ULID specification as implemented in Python (ulid-py). The binary format has been implemented.
 01AN4Z07BY      79KA1307SR9X4MV3
|----------|    |----------------|
 Timestamp         Randomness
  10 chars          16 chars
  48 bits           80 bits
Components
Timestamp
- 48-bit integer
- UNIX time in milliseconds
- Will not run out of space until the year 10889 AD
Randomness
- 80 bits of randomness
- Uses a cryptographically secure source of randomness when possible
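To make the two components concrete, here is a minimal sketch that builds the raw 128-bit value by hand and checks the overflow year. It uses only the standard library; `new_ulid_int` is a hypothetical helper rather than a ulid-py function, and the year estimate assumes an average Gregorian year of 365.2425 days.

```python
import os
import time

TIMESTAMP_BITS = 48
RANDOMNESS_BITS = 80


def new_ulid_int() -> int:
    """Combine a millisecond UNIX timestamp and 80 random bits into one integer."""
    timestamp_ms = time.time_ns() // 1_000_000
    randomness = int.from_bytes(os.urandom(RANDOMNESS_BITS // 8), "big")
    # The timestamp occupies the high 48 bits, the randomness the low 80 bits.
    return (timestamp_ms << RANDOMNESS_BITS) | randomness


# Year in which a 48-bit millisecond counter starting at 1970 overflows.
MS_PER_YEAR = 365.2425 * 24 * 60 * 60 * 1000
overflow_year = int(1970 + 2 ** TIMESTAMP_BITS / MS_PER_YEAR)
print(overflow_year)  # 10889
```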
Sorting
The leftmost character must be sorted first and the rightmost character last (lexical order). The default ASCII character set must be used. Within the same millisecond, sort order is not guaranteed.
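Because the encoding alphabet is in ascending ASCII order and the timestamp occupies the leading characters, plain string sorting orders ULIDs by creation time. A small sketch, assuming a hypothetical `make_ulid` helper that encodes a given millisecond timestamp with fresh randomness:

```python
import os

ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"  # Crockford's Base32


def make_ulid(timestamp_ms: int) -> str:
    """Encode a 48-bit timestamp plus 80 fresh random bits as 26 characters."""
    value = (timestamp_ms << 80) | int.from_bytes(os.urandom(10), "big")
    # Read 5 bits at a time, most significant group first.
    return "".join(ALPHABET[(value >> shift) & 0x1F] for shift in range(125, -1, -5))


# Generate IDs out of chronological order, then sort them as plain strings.
by_time = {ms: make_ulid(ms) for ms in (3000, 1000, 2000)}
print(sorted(by_time.values()) == [by_time[1000], by_time[2000], by_time[3000]])  # True
```

Two IDs created with the same timestamp would differ only in their random suffix, which is why their relative order is not guaranteed.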
Encoding
As shown below, Crockford's Base32 is used. The alphabet excludes the letters I, L, O, and U to avoid confusion and abuse.
0123456789ABCDEFGHJKMNPQRSTVWXYZ
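The encoding itself is a plain base conversion, 5 bits per character. A minimal sketch of both directions follows; `encode` and `decode` are hypothetical helper names, not ulid-py functions, and the decoder only upper-cases its input rather than applying Crockford's optional character substitutions.

```python
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"  # Crockford's Base32


def encode(value: int, length: int = 26) -> str:
    """Encode an integer as a fixed-length Crockford Base32 string."""
    chars = []
    for _ in range(length):
        chars.append(ALPHABET[value & 0x1F])  # take the lowest 5 bits
        value >>= 5
    return "".join(reversed(chars))


def decode(text: str) -> int:
    """Decode a Crockford Base32 string (case insensitive) into an integer."""
    value = 0
    for ch in text.upper():
        value = (value << 5) | ALPHABET.index(ch)
    return value


sample = "01ARZ3NDEKTSV4RRFFQ69G5FAV"
print(encode(decode(sample)) == sample)  # True
print(encode(2 ** 128 - 1))              # 7ZZZZZZZZZZZZZZZZZZZZZZZZZ
```

Since 26 characters carry 130 bits but a ULID has only 128, the first character never exceeds 7.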
Binary layout and byte order
The components are encoded as 16 octets. Each component is encoded with the most significant byte first (network byte order).
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     32_bit_uint_time_high                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     16_bit_uint_time_low      |      16_bit_uint_random       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      32_bit_uint_random                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      32_bit_uint_random                       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
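The layout above maps directly onto a big-endian `struct` format. The sketch below packs example values field by field; the timestamp and randomness are arbitrary illustrative inputs, not taken from a real ULID.

```python
import struct

timestamp_ms = 0x0123456789AB      # example 48-bit timestamp
randomness = bytes(range(10))      # example 80-bit randomness: 00 01 ... 09

time_high = timestamp_ms >> 16     # upper 32 bits of the timestamp
time_low = timestamp_ms & 0xFFFF   # lower 16 bits of the timestamp
rand_16 = int.from_bytes(randomness[:2], "big")
rand_32a = int.from_bytes(randomness[2:6], "big")
rand_32b = int.from_bytes(randomness[6:], "big")

# ">" selects big-endian (network) byte order with no padding.
packed = struct.pack(">IHHII", time_high, time_low, rand_16, rand_32a, rand_32b)
print(packed.hex())  # 0123456789ab00010203040506070809
```

The result is the same 16 bytes you would get from writing the combined 128-bit integer with `int.to_bytes(16, "big")`.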
Application scenarios
- Replaces database auto-increment IDs, so the database no longer has to take part in generating primary keys
- Replaces UUIDs in distributed environments: IDs are globally unique and ordered with millisecond precision
- When partitioning a database by date, for example, the timestamp embedded in the ULID can be used to select the correct partition table
- If millisecond precision is acceptable (order within a millisecond is not guaranteed), you can sort by ULID instead of keeping a separate created_at field
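As an illustration of the partitioning scenario, the sketch below decodes the timestamp from a ULID string and derives a partition table name from it. The monthly `events_YYYY_MM` naming scheme and the helper names `ulid_datetime` and `partition_for` are assumptions made for this example; only the standard library is used.

```python
from datetime import datetime, timezone

ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"  # Crockford's Base32


def ulid_datetime(ulid_str: str) -> datetime:
    """Decode the first 10 characters (48-bit millisecond timestamp) as UTC."""
    timestamp_ms = 0
    for ch in ulid_str[:10].upper():
        timestamp_ms = (timestamp_ms << 5) | ALPHABET.index(ch)
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)


def partition_for(ulid_str: str, prefix: str = "events") -> str:
    """Hypothetical naming scheme: one table per month, e.g. events_1999_01."""
    created = ulid_datetime(ulid_str)
    return f"{prefix}_{created:%Y_%m}"


print(partition_for("00TM9HX0008S220A3PWSFVNFEH"))  # events_1999_01
```

The same decoded timestamp can stand in for a created_at column when millisecond precision is enough.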
Usage (Python)
Installation
pip install ulid-py
Create a brand new ULID.
The timestamp value (48 bits) comes from time.time() with millisecond precision.
The randomness value (80 bits) comes from os.urandom().
>>> import ulid
>>> ulid.new()
<ULID('01BJQE4QTHMFP0S5J153XCFSP9')>
Create a new ULID based on an existing 128-bit value (e.g. UUID).
Supported ULID value types are int, bytes, str, and UUID.
>>> import ulid, uuid
>>> value = uuid.uuid4()
>>> value
UUID('0983d0a2-ff15-4d83-8f37-7dd945b5aa39')
>>> ulid.from_uuid(value)
<ULID('09GF8A5ZRN9P1RYDVXV52VBAHS')>
Create a new ULID from an existing timestamp value (e.g. datetime object).
Supported timestamp value types are int, float, str, bytes, bytearray, memoryview, datetime, Timestamp, and ULID.
>>> import datetime, ulid
>>> ulid.from_timestamp(datetime.datetime(1999, 1, 1))
<ULID('00TM9HX0008S220A3PWSFVNFEH')>
Create a new ULID from an existing randomness value.
Supported randomness value types are int, float, str, bytes, bytearray, memoryview, Randomness, and ULID.
>>> import os, ulid
>>> randomness = os.urandom(10)
>>> ulid.from_randomness(randomness)
<ULID('01BJQHX2XEDK0VN0GMYWT9JN8S')>
Once you have a ULID object, there are multiple ways to interact with it.
The timestamp() method returns a snapshot of the first 48 bits of the ULID (the timestamp), while the randomness() method returns a snapshot of the last 80 bits (the randomness).
>>> import ulid
>>> u = ulid.new()
>>> u
<ULID('01BJQM7SC7D5VVTG3J68ABFQ3N')>
>>> u.timestamp()
<Timestamp('01BJQM7SC7')>
>>> u.randomness()
<Randomness('D5VVTG3J68ABFQ3N')>
GitHub: https://github.com/ahawker/ulid
Author丨pushiqiang
Source丨Website: blog.csdn.net/pushiqiang/article/details/117365290