Seriously! ULID is much nicer to use than UUID...

Author: DBAplus Community

ULID: Universally Unique Lexicographically Sortable Identifier

UUID: Universally Unique Identifier

Why not choose UUID

There are currently 5 versions of UUID:

Version 1: impractical in many environments because it requires access to a unique, stable MAC address, and it is vulnerable to attack;

Version 2: replaces the low 32 bits of the version 1 timestamp with a POSIX UID or GID, so it has the same drawbacks as version 1;

Version 3: based on the MD5 hashing algorithm; it requires a unique seed and produces randomly distributed IDs, which can lead to fragmentation in many data structures;

Version 4: generated from random or pseudo-random numbers, carrying no information other than randomness;

Version 5: based on the SHA-1 hashing algorithm; it requires a unique seed and produces randomly distributed IDs, which can lead to fragmentation in many data structures.

UUIDv4 is the version most commonly used, but since it is purely random there is still a small risk of collisions.

UUIDs are based on either random numbers or timestamps, whereas ULIDs combine both: a timestamp with millisecond precision plus 80 random bits, which gives about 1.21e+24 possible values per millisecond. The risk of collision is therefore negligible, and the string representation is friendlier than a UUID's.

ULID characteristics

ulid() # 01ARZ3NDEKTSV4RRFFQ69G5FAV           
  • 128-bit compatibility with UUID
  • 1.21e+24 unique ULIDs per millisecond
  • Lexicographically sortable (i.e. in dictionary order)!
  • Canonically encoded as a 26-character string, as opposed to the 36 characters of a UUID
  • Better efficiency and readability with Crockford's base32 (5 bits per character)
  • Not case-sensitive
  • No special characters (URL safe)
  • Monotonic sort order (correctly detects and handles IDs generated within the same millisecond)

ULID specification

The following is the current ULID specification, as implemented in Python by ulid-py. The binary format is implemented as well.

 01AN4Z07BY      79KA1307SR9X4MV3

|----------|    |----------------|
 Timestamp          Randomness
  10 chars           16 chars
  48 bits            80 bits

Components

Timestamp

  • 48-bit integer
  • UNIX time (in milliseconds)
  • Will not run out of space until the year 10889 AD

Randomness

  • 80-bit random number
  • Generated from a cryptographically secure source of randomness, if possible
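
The two capacity figures quoted in this article (about 1.21e+24 random values per millisecond, and a timestamp that lasts until the year 10889) can be checked with a few lines of arithmetic. This is only a rough sketch: the constant names are made up for illustration, and the year estimate assumes an average Gregorian year of 365.2425 days.

```python
# Capacity of the two ULID components, using plain arithmetic.
TIMESTAMP_BITS = 48
RANDOMNESS_BITS = 80

# Number of distinct random values available within a single millisecond.
random_values_per_ms = 2 ** RANDOMNESS_BITS

# Largest representable timestamp, converted from milliseconds to years.
# Assumption: an average Gregorian year of 365.2425 days, which is precise
# enough to identify the calendar year.
max_timestamp_ms = 2 ** TIMESTAMP_BITS - 1
ms_per_year = 365.2425 * 24 * 60 * 60 * 1000
last_year = 1970 + int(max_timestamp_ms / ms_per_year)

print(f"{random_values_per_ms:.2e}")  # 1.21e+24
print(last_year)                      # 10889
```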

Sorting

The left-most character must be sorted first, and the right-most character sorted last (lexical order). The default ASCII character set must be used. Within the same millisecond, sort order is not guaranteed unless the generator is monotonic.
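
One way to keep IDs ordered inside a single millisecond is to reuse the previous random value and add one to it whenever the timestamp has not changed. The sketch below illustrates that idea on raw 128-bit integers. MonotonicULIDGenerator and its clock_ms parameter are hypothetical names, and this is not claimed to be how ulid-py implements it.

```python
import os
import time

RANDOMNESS_BITS = 80
MAX_RANDOMNESS = (1 << RANDOMNESS_BITS) - 1


class MonotonicULIDGenerator:
    """Hypothetical helper returning 128-bit ULID integers that strictly
    increase, provided the clock never goes backwards."""

    def __init__(self, clock_ms=None):
        # clock_ms is an injectable millisecond clock, mainly for testing.
        self._clock_ms = clock_ms or (lambda: time.time_ns() // 1_000_000)
        self._last_timestamp = -1
        self._last_randomness = 0

    def next_int(self):
        timestamp = self._clock_ms()
        if timestamp == self._last_timestamp:
            # Same millisecond: bump the previous random value by one.
            if self._last_randomness == MAX_RANDOMNESS:
                raise OverflowError("randomness exhausted within this millisecond")
            self._last_randomness += 1
        else:
            # New millisecond: draw 80 fresh random bits.
            self._last_randomness = int.from_bytes(os.urandom(10), "big")
            self._last_timestamp = timestamp
        # Timestamp in the high 48 bits, randomness in the low 80 bits.
        return (timestamp << RANDOMNESS_BITS) | self._last_randomness


# Freeze the clock so every ID falls into the same millisecond.
gen = MonotonicULIDGenerator(clock_ms=lambda: 1_469_922_850_259)
ids = [gen.next_int() for _ in range(5)]
print(ids == sorted(ids))  # True
```

The trade-off is that IDs created in the same millisecond become predictable from one another, so this approach may not suit cases where the random part is expected to be unguessable.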

Encoding

Crockford's Base32 is used, with the alphabet shown below. It excludes the letters I, L, O, and U to avoid confusion and abuse.

0123456789ABCDEFGHJKMNPQRSTVWXYZ           
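
To make the encoding concrete, here is a minimal sketch that turns a 128-bit integer into the 26-character form by reading it five bits at a time. The function name encode_ulid is an assumption for illustration, not part of ulid-py. It is checked against the UUID-to-ULID pair shown in the usage section of this article.

```python
import uuid

CROCKFORD_ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def encode_ulid(value: int) -> str:
    """Encode a 128-bit integer as a 26-character Crockford Base32 string."""
    if not 0 <= value < (1 << 128):
        raise ValueError("value must fit in 128 bits")
    chars = []
    for _ in range(26):
        # Take the lowest 5 bits, then shift them away.
        chars.append(CROCKFORD_ALPHABET[value & 0x1F])
        value >>= 5
    # Characters were produced least significant first, so reverse them.
    return "".join(reversed(chars))


print(encode_ulid(uuid.UUID("0983d0a2-ff15-4d83-8f37-7dd945b5aa39").int))
# 09GF8A5ZRN9P1RYDVXV52VBAHS
print(encode_ulid((1 << 128) - 1))
# 7ZZZZZZZZZZZZZZZZZZZZZZZZZ
```

Because 26 characters hold 130 bits and a ULID only has 128, the first character never exceeds 7.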

Binary layout and byte order

The components are encoded into 16 octets. Each component is encoded with the most significant byte first (network byte order).

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      32_bit_uint_time_high                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|     16_bit_uint_time_low      |       16_bit_uint_random      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       32_bit_uint_random                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       32_bit_uint_random                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+           
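
The layout above can be reproduced with the standard struct module. The following is a sketch under the assumption that the caller supplies a millisecond timestamp and exactly 10 random bytes. pack_ulid and unpack_timestamp are illustrative names only.

```python
import os
import struct


def pack_ulid(timestamp_ms: int, randomness: bytes) -> bytes:
    """Pack a 48-bit timestamp and 10 bytes of randomness into 16 octets."""
    if not 0 <= timestamp_ms < (1 << 48):
        raise ValueError("timestamp must fit in 48 bits")
    if len(randomness) != 10:
        raise ValueError("randomness must be exactly 10 bytes (80 bits)")
    time_high = timestamp_ms >> 16     # upper 32 bits of the timestamp
    time_low = timestamp_ms & 0xFFFF   # lower 16 bits of the timestamp
    # ">IH" = big-endian unsigned 32-bit int followed by unsigned 16-bit int.
    return struct.pack(">IH", time_high, time_low) + randomness


def unpack_timestamp(raw: bytes) -> int:
    """Recover the millisecond timestamp from the first 6 octets."""
    time_high, time_low = struct.unpack(">IH", raw[:6])
    return (time_high << 16) | time_low


raw = pack_ulid(915_148_800_000, os.urandom(10))
print(len(raw))               # 16
print(unpack_timestamp(raw))  # 915148800000
```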

Application scenarios

  • Replace database auto-increment IDs, so the database no longer has to take part in generating primary keys
  • Replace UUIDs in distributed environments with IDs that are globally unique and sortable to millisecond precision
  • Partition a database by date: the timestamp embedded in the ULID can be used to select the correct partition table
  • Sort by ULID instead of a separate created_at field, if millisecond precision is acceptable (ordering within the same millisecond is not guaranteed)
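
As an illustration of the partitioning scenario, the sketch below decodes the first 10 characters of a ULID string back into a millisecond timestamp and derives a partition name from it. The monthly naming scheme and the "events" prefix are purely hypothetical. The sample ULID is the one produced from datetime(1999, 1, 1) in the usage section of this article.

```python
from datetime import datetime, timezone

CROCKFORD_ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def ulid_timestamp_ms(ulid_str: str) -> int:
    """Decode the first 10 characters of a ULID into a millisecond timestamp."""
    if len(ulid_str) != 26:
        raise ValueError("a ULID string has exactly 26 characters")
    value = 0
    for char in ulid_str[:10].upper():
        # str.index raises ValueError for characters outside the alphabet.
        value = value * 32 + CROCKFORD_ALPHABET.index(char)
    return value


def partition_for(ulid_str: str, prefix: str = "events") -> str:
    """Hypothetical scheme: one table per UTC month, e.g. events_1999_01."""
    created = datetime.fromtimestamp(ulid_timestamp_ms(ulid_str) / 1000, tz=timezone.utc)
    return f"{prefix}_{created:%Y_%m}"


print(ulid_timestamp_ms("00TM9HX0008S220A3PWSFVNFEH"))  # 915148800000
print(partition_for("00TM9HX0008S220A3PWSFVNFEH"))      # events_1999_01
```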

Usage (Python)

Installation

pip install ulid-py           

Create a brand new ULID.

The timestamp value (48 bits) comes from time.time(), with millisecond precision.

The randomness value (80 bits) comes from os.urandom().

>>> import ulid
>>> ulid.new()
<ULID('01BJQE4QTHMFP0S5J153XCFSP9')>           

Create a new ULID based on an existing 128-bit value (e.g. UUID).

Supported ULID value types are int, bytes, str, and UUID.

>>> import ulid, uuid
>>> value = uuid.uuid4()
>>> value
UUID('0983d0a2-ff15-4d83-8f37-7dd945b5aa39')
>>> ulid.from_uuid(value)
<ULID('09GF8A5ZRN9P1RYDVXV52VBAHS')>           

Create a new ULID from an existing timestamp value (e.g. datetime object).

Supported timestamp value types are int, float, str, bytes, bytearray, memoryview, datetime, Timestamp, and ULID.

>>> import datetime, ulid
>>> ulid.from_timestamp(datetime.datetime(1999, 1, 1))
<ULID('00TM9HX0008S220A3PWSFVNFEH')>           

Create a new ULID from an existing randomness value.

Supported randomness value types are int, float, str, bytes, bytearray, memoryview, Randomness, and ULID.

>>> import os, ulid
>>> randomness = os.urandom(10)
>>> ulid.from_randomness(randomness)
<ULID('01BJQHX2XEDK0VN0GMYWT9JN8S')>

Once you have a ULID object, there are multiple ways to interact with it.

The timestamp() method returns a snapshot of the timestamp held in the first 48 bits of the ULID, while the randomness() method returns a snapshot of the random value held in the last 80 bits.

>>> import ulid
>>> u = ulid.new()
>>> u
<ULID('01BJQM7SC7D5VVTG3J68ABFQ3N')>
>>> u.timestamp()
<Timestamp('01BJQM7SC7')>
>>> u.randomness()
<Randomness('D5VVTG3J68ABFQ3N')>           

GitHub: https://github.com/ahawker/ulid

Author丨pushiqiang

Source丨Website: blog.csdn.net/pushiqiang/article/details/117365290
