
Rust for Linux source code guide | Ref reference counting container

author:CRMEB

Introduction

Experimental Rust support in the Linux kernel is likely to become mainstream in 2022. On the morning of December 6, 2021, an updated patch series was posted that introduces the initial support and infrastructure for handling Rust in the kernel.

This update includes:

  1. Upgraded to the latest stable compiler and the Rust 2021 edition. As a result, the previously unstable features const_fn_transmute, const_panic, const_unreachable_unchecked, core_panic, and try_reserve are no longer needed.
  2. Customized core and alloc. More modular options were added to alloc so that unneeded features can be disabled: no_rc and no_sync. These are mainly intended for the upstream Rust project.
  3. Stricter code, documentation, and new lints.
  4. Abstractions and driver updates. Abstractions were added for sequence locks, power management callbacks, io memory (readX/writeX), irq chips and high-level flow handlers, gpio chips (including irq chips), devices, amba devices and drivers, and credentials. In addition, the Ref object (backed by refcount_t) has been improved and simplified, and it now replaces every use of Rust's Arc. Arc and Rc are completely removed from the alloc crate.

From now on, the Rust for Linux team will start submitting patches regularly, every two weeks or so.

In addition to support from Arm, Google, and Microsoft, the team received another letter, this time from Red Hat, stating that there is interest in using Rust for kernel work that Red Hat is considering.

  • v2 Patch: https://lore.kernel.org/lkml/[email protected]/
  • www.phoronix.com/scan.php?pa…
  • Kernel crate documentation

Why Ref was needed instead of Arc

The kernel crate in Rust for Linux previously used Arc, which has now been replaced by Ref. According to the related PR, "rust: update Ref to use the kernel's refcount_t", there are two main reasons:

  1. Reuse existing C code as much as possible and eliminate panics. The kernel already has a reference count implementation, refcount_t, and when the count would exceed its limit it stays at the maximum value (saturating addition) instead of panicking (aborting). For the same reason, RBTree (red-black tree) is used instead of BTreeMap.
  2. Weak references are not required.

Arc has a MAX_REFCOUNT limit of isize::MAX as usize, and an increment that pushes the reference count past that limit aborts.

So the difference between the final implementation of Ref and Arc is:

  1. Ref is backed by the kernel's refcount_t.
  2. It does not support weak references, so the size is reduced by half.
  3. When the count crosses the threshold, it saturates instead of aborting.
  4. It does not provide a get_mut method, so the reference-counted object is pinned.
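The size difference in point 2 can be made concrete with a small sketch. The two structs below are hypothetical stand-ins written for this illustration (they are not the real definitions from std or the kernel crate): one carries a strong and a weak counter in the style of Arc, the other a single 32-bit counter in the style of Ref.

```rust
use std::mem::size_of;
use std::sync::atomic::{AtomicU32, AtomicUsize};

// Hypothetical header in the style of std's Arc: strong and weak counters.
#[repr(C)]
struct ArcLikeInner<T> {
    strong: AtomicUsize,
    weak: AtomicUsize,
    data: T,
}

// Hypothetical header in the style of Ref: one 32-bit counter, no weak count.
#[repr(C)]
struct RefLikeInner<T> {
    refcount: AtomicU32,
    data: T,
}

/// Bytes of bookkeeping stored in front of a zero-sized payload,
/// returned as (Arc-like, Ref-like).
fn header_sizes() -> (usize, usize) {
    (size_of::<ArcLikeInner<()>>(), size_of::<RefLikeInner<()>>())
}
```

With a zero-sized payload, the sizes returned by `header_sizes` are exactly the bookkeeping overhead of each layout.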

Ref source analysis

Let's analyze the implementation of Ref.

Ref struct

The Ref struct is defined as follows:

/// A reference-counted pointer to an instance of `T`.
///
/// The reference count is incremented when new instances of [`Ref`] are created, and decremented
/// when they are dropped. When the count reaches zero, the underlying `T` is also dropped.
///
/// # Invariants
///
/// The reference count on an instance of [`Ref`] is always non-zero.
/// The object pointed to by [`Ref`] is always pinned.
pub struct Ref<T: ?Sized> {
    ptr: NonNull<RefInner<T>>,
    _p: PhantomData<RefInner<T>>,
}

Ref maintains two invariants: the reference count of a Ref instance is always non-zero, and the object a Ref points to is always pinned (it cannot be moved).

The struct uses NonNull<T> instead of *mut T because covariance is needed here, whereas *mut T is invariant. Consider the following example:

use std::ptr::NonNull;

struct Ref<T: ?Sized> {
    x: NonNull<T>,
    // x: *mut T, // with *mut T instead, this example would not compile
}

fn take<'a>(r: Ref<&'a u32>, y: &'a u32) {}

fn give() -> Ref<&'static u32> { todo!() }

fn test() {
    let y = 1;
    // Covariance: `take` expects Ref<&'a u32> but also accepts Ref<&'static u32>,
    // because 'static: 'a makes &'static u32 a subtype of &'a u32.
    take(give(), &y); 
}

NonNull<T> is a covariant version of *mut T that is also guaranteed to be non-null. That fits a reference-counted object: the pointer is never null, because the allocation is only released once the count reaches zero.

PhantomData is used here for the drop check: it tells the compiler that Ref owns a RefInner<T>, so dropping a Ref may also drop the RefInner<T>.
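The non-null guarantee can be seen with the standard library alone. The following is a minimal sketch; the helper names `checked` and `option_is_free` are made up for this illustration.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

/// Wraps a raw pointer, returning `None` when it is null.
fn checked(p: *mut u32) -> Option<NonNull<u32>> {
    NonNull::new(p)
}

/// `Option<NonNull<T>>` uses the null value to represent `None`,
/// so it takes no more space than a plain raw pointer.
fn option_is_free() -> bool {
    size_of::<Option<NonNull<u32>>>() == size_of::<*mut u32>()
}
```

Because null is ruled out at construction time, code holding a `NonNull` never needs a null check before dereferencing, although it must still ensure the pointee is alive.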

RefInner structure

Let's look at the RefInner structure:

#[repr(C)]
struct RefInner<T: ?Sized> {
    refcount: Opaque<bindings::refcount_t>,
    data: T,
}

RefInner contains refcount_t, the reference-counting struct implemented in the kernel, so that the existing C code can be reused.

Opaque is a wrapper type provided by the kernel crate specifically for interoperating with C. It is defined as follows:

pub struct Opaque<T>(MaybeUninit<UnsafeCell<T>>);

impl<T> Opaque<T> {
    /// Creates a new opaque value.
    pub fn new(value: T) -> Self {
        Self(MaybeUninit::new(UnsafeCell::new(value)))
    }

    /// Creates an uninitialised value.
    pub fn uninit() -> Self {
        Self(MaybeUninit::uninit())
    }

    /// Returns a raw pointer to the opaque data.
    pub fn get(&self) -> *mut T {
        UnsafeCell::raw_get(self.0.as_ptr())
    }
}

Opaque marks an FFI object whose contents Rust code never needs to interpret. To reuse the reference-counting struct that already exists in the kernel, the field therefore has the type Opaque<bindings::refcount_t>.
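To show how such a wrapper is used, here is a user-space sketch that rebuilds the same shape on top of std. `fake_c_counter_inc` is a hypothetical stand-in for a C function that mutates the object through a pointer, and `bump_twice` is a made-up helper; neither exists in the kernel crate.

```rust
use std::cell::UnsafeCell;
use std::mem::MaybeUninit;

// Same shape as the kernel crate's `Opaque`, rebuilt on std for illustration.
pub struct Opaque<T>(MaybeUninit<UnsafeCell<T>>);

impl<T> Opaque<T> {
    pub fn new(value: T) -> Self {
        Self(MaybeUninit::new(UnsafeCell::new(value)))
    }

    /// Returns a raw pointer that foreign code may read and write through.
    pub fn get(&self) -> *mut T {
        UnsafeCell::raw_get(self.0.as_ptr())
    }
}

/// Hypothetical stand-in for a C function that updates a counter in place.
///
/// # Safety
///
/// `counter` must point to an initialised `u32`.
unsafe fn fake_c_counter_inc(counter: *mut u32) {
    unsafe { *counter += 1 };
}

/// Rust only passes the pointer along; it never inspects the value itself
/// until the very end, where the test reads it back.
fn bump_twice(counter: &Opaque<u32>) -> u32 {
    // SAFETY: `counter` was created with `Opaque::new`, so it is initialised.
    unsafe {
        fake_c_counter_inc(counter.get());
        fake_c_counter_inc(counter.get());
        *counter.get()
    }
}
```

Note that `bump_twice` takes a shared reference: the `UnsafeCell` inside `Opaque` is what makes mutation through `&Opaque<T>` permissible.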

About refcount_t

The refcount_t struct in the Linux kernel is defined as follows:

// from: https://github.com/torvalds/linux/blob/master/tools/include/linux/refcount.h
typedef struct refcount_struct {
	atomic_t refs;
} refcount_t;

The goal of the refcount_t API is to provide a minimal API for implementing reference counters for objects. Although it uses atomic operations internally, the refcount_*() functions differ considerably from the atomic_*() functions in their memory ordering guarantees.

A reference count overflow vulnerability was found in 2018: once the reference count reached its maximum, adding one more made it wrap around to zero, so the referenced object was released incorrectly. The result was a use-after-free (UAF) vulnerability that was easy to exploit.

So refcount_t now includes overflow detection for the reference count:

// from: https://github.com/torvalds/linux/blob/master/tools/include/linux/refcount.h#L69

static inline __refcount_check
bool refcount_inc_not_zero(refcount_t *r)
{
	unsigned int old, new, val = atomic_read(&r->refs);

	for (;;) {
		new = val + 1;

		if (!val)
			return false;

		if (unlikely(!new))
			return true;

		old = atomic_cmpxchg_relaxed(&r->refs, val, new);
		if (old == val)
			break;

		val = old;
	}

	REFCOUNT_WARN(new == UINT_MAX, "refcount_t: saturated; leaking memory.\n");

	return true;
}

When the count reaches its maximum, saturating addition is used: the counter stays at the largest value instead of wrapping around to zero. Note that the compare-and-swap here is a relaxed atomic operation, so it provides no memory ordering guarantees.
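For readers more comfortable with Rust than C, the same loop can be sketched with std atomics. This is a user-space illustration of the technique, not kernel code; the function name `inc_not_zero` simply mirrors the C one.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Increments `refs` unless it is zero, and sticks at `u32::MAX` instead of
/// wrapping. Returns `false` only when the count was already zero.
fn inc_not_zero(refs: &AtomicU32) -> bool {
    let mut val = refs.load(Ordering::Relaxed);
    loop {
        if val == 0 {
            return false; // no live reference left to build on
        }
        let new = val.wrapping_add(1);
        if new == 0 {
            return true; // saturated: leave the counter at u32::MAX
        }
        match refs.compare_exchange_weak(val, new, Ordering::Relaxed, Ordering::Relaxed) {
            Ok(_) => return true,
            Err(current) => val = current, // lost a race, retry with the fresh value
        }
    }
}
```

A saturated counter can never reach zero again, so the object is leaked rather than freed too early, which is the trade-off the C warning message describes.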

Some traits implemented for Ref

To give Ref<T> Arc<T>-like behavior, several built-in traits are implemented for it.

// This is to allow [`Ref`] (and variants) to be used as the type of `self`.
impl<T: ?Sized> core::ops::Receiver for Ref<T> {}

// This is to allow [`RefBorrow`] (and variants) to be used as the type of `self`.
impl<T: ?Sized> core::ops::Receiver for RefBorrow<'_, T> {}

// This is to allow coercion from `Ref<T>` to `Ref<U>` if `T` can be converted to the
// dynamically-sized type (DST) `U`.
impl<T: ?Sized + Unsize<U>, U: ?Sized> core::ops::CoerceUnsized<Ref<U>> for Ref<T> {}

// This is to allow `Ref<U>` to be dispatched on when `Ref<T>` can be coerced into `Ref<U>`.
impl<T: ?Sized + Unsize<U>, U: ?Sized> core::ops::DispatchFromDyn<Ref<U>> for Ref<T> {}

// SAFETY: It is safe to send `Ref<T>` to another thread when the underlying `T` is `Sync` because
// it effectively means sharing `&T` (which is safe because `T` is `Sync`); additionally, it needs
// `T` to be `Send` because any thread that has a `Ref<T>` may ultimately access `T` directly, for
// example, when the reference count reaches zero and `T` is dropped.
unsafe impl<T: ?Sized + Sync + Send> Send for Ref<T> {}

// SAFETY: It is safe to send `&Ref<T>` to another thread when the underlying `T` is `Sync` for
// the same reason as above. `T` needs to be `Send` as well because a thread can clone a `&Ref<T>`
// into a `Ref<T>`, which may lead to `T` being accessed by the same reasoning as above.
unsafe impl<T: ?Sized + Sync + Send> Sync for Ref<T> {}

As the code above shows, the traits used are:

  • core::ops::Receiver: an unstable feature (receiver_trait) indicating that a type can act as a method receiver without requiring the arbitrary_self_types feature. Smart pointers in the standard library implement this trait, such as Box<T>, Rc<T>, Arc<T>, &T, and Pin<P>.
  • core::ops::CoerceUnsized: also an unstable feature (coerce_unsized), which allows a pointer to a Sized type to be converted into a pointer to a DST.
  • core::ops::DispatchFromDyn: also an unstable feature (dispatch_from_dyn), used in object safety (dyn safety) checks. Types that implement DispatchFromDyn can safely be used as the self type in object-safe methods.
  • Send/Sync: stable marker traits in Rust for types that can be safely transferred to, or shared between, threads.

With these traits implemented, Ref<T> gains the corresponding behavior. In general, Ref<T> behaves much like Arc<T>, apart from the differences listed above.
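Since Ref<T> is meant to behave like Arc<T>, the effect of these traits can be observed on stable Rust with std's Arc. The sketch below uses made-up names (`Device`, `Shutdown`, `Gpio`, `as_dyn`, `assert_send_sync`); it demonstrates the behaviors, not the kernel crate's API.

```rust
use std::sync::Arc;

trait Device {
    fn name(&self) -> String;
}

trait Shutdown {
    // Using `Arc<Self>` as the receiver is what a `Receiver`-style impl allows.
    fn shutdown(self: Arc<Self>) -> usize;
}

struct Gpio;

impl Device for Gpio {
    fn name(&self) -> String {
        "gpio".to_string()
    }
}

impl Shutdown for Gpio {
    fn shutdown(self: Arc<Self>) -> usize {
        Arc::strong_count(&self)
    }
}

/// Unsized coercion from `Arc<Gpio>` to `Arc<dyn Device>`, the kind of
/// conversion `CoerceUnsized` enables for a smart pointer.
fn as_dyn(dev: Arc<Gpio>) -> Arc<dyn Device> {
    dev
}

/// Compiles only if `T` can be sent to and shared between threads.
fn assert_send_sync<T: Send + Sync>() {}

fn thread_safety_check() {
    assert_send_sync::<Arc<Gpio>>();
}
```

Each item corresponds to one bullet above: the `self: Arc<Self>` receiver, the coercion to a trait object, and the Send/Sync bounds.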

Reference count management

Because Ref<T> reuses the kernel's C code, reference count management only requires implementing the appropriate traits.

For example, Clone should increment the reference count, and Drop should decrement the reference count. So, let's look at these two implementations separately.

// Implementing the Clone trait
impl<T: ?Sized> Clone for Ref<T> {
    fn clone(&self) -> Self {
        // INVARIANT: C `refcount_inc` saturates the refcount, so it cannot overflow to zero.
        // SAFETY: By the type invariant, there is necessarily a reference to the object, so it is
        // safe to increment the refcount.
        unsafe { bindings::refcount_inc(self.ptr.as_ref().refcount.get()) };

        // SAFETY: We just incremented the refcount. This increment is now owned by the new `Ref`.
        unsafe { Self::from_inner(self.ptr) }
    }
}

Implementing Clone is simple: it calls refcount_inc, the kernel's increment function for refcount_t, directly through bindings::refcount_inc.

Because refcount_inc already detects overflow and uses saturating addition, there is no risk of the count wrapping to zero.

// Implementing the Drop trait
impl<T: ?Sized> Drop for Ref<T> {
    fn drop(&mut self) {
        // SAFETY: By the type invariant, there is necessarily a reference to the object. We cannot
        // touch `refcount` after it's decremented to a non-zero value because another thread/CPU
        // may concurrently decrement it to zero and free it. It is ok to have a raw pointer to
        // freed/invalid memory as long as it is never dereferenced.
        let refcount = unsafe { self.ptr.as_ref() }.refcount.get();

        // INVARIANT: If the refcount reaches zero, there are no other instances of `Ref`, and
        // this instance is being dropped, so the broken invariant is not observable.
        // SAFETY: Also by the type invariant, we are allowed to decrement the refcount.
        let is_zero = unsafe { bindings::refcount_dec_and_test(refcount) };
        if is_zero {
            // The count reached zero, we must free the memory.

            // SAFETY: This thread holds the only remaining reference to `self`, so it is safe to
            // get a mutable reference to it.
            let inner = unsafe { self.ptr.as_mut() };
            let layout = Layout::for_value(inner);
            // SAFETY: The value stored in inner is valid.
            unsafe { core::ptr::drop_in_place(inner) };
            // SAFETY: The pointer was initialised from the result of a call to `alloc`.
            unsafe { dealloc(self.ptr.cast().as_ptr(), layout) };
        }
    }
}

Drop is implemented the same way, by calling the kernel's refcount_dec_and_test function directly through bindings::refcount_dec_and_test, which likewise includes the kernel's reference count checks. When the reference count drops to zero, however, the memory must be freed.

Note that the Clone and Drop implementations above are a classic example of wrapping unsafe Rust in a safe abstraction. In particular, the SAFETY comments spell out the safety boundaries that were considered.
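The same Clone/Drop pairing can be tried outside the kernel. The sketch below is a hypothetical user-space analogue: `MiniRef`, `Inner`, and `Tracker` are invented names, an `AtomicU32` stands in for refcount_t, and `Box` stands in for the kernel allocator. Unlike refcount_inc, the plain `fetch_add` here does not saturate.

```rust
use std::ptr::NonNull;
use std::sync::atomic::{fence, AtomicU32, AtomicUsize, Ordering};

struct Inner<T> {
    refcount: AtomicU32,
    data: T,
}

/// Hypothetical user-space analogue of `Ref<T>`.
struct MiniRef<T> {
    ptr: NonNull<Inner<T>>,
}

impl<T> MiniRef<T> {
    fn new(data: T) -> Self {
        let boxed = Box::new(Inner { refcount: AtomicU32::new(1), data });
        MiniRef { ptr: NonNull::from(Box::leak(boxed)) }
    }

    fn count(&self) -> u32 {
        // SAFETY: `self` holds a reference, so the allocation is alive.
        unsafe { self.ptr.as_ref() }.refcount.load(Ordering::Relaxed)
    }
}

impl<T> Clone for MiniRef<T> {
    fn clone(&self) -> Self {
        // SAFETY: `self` holds a reference, so the allocation is alive.
        // Assumption: overflow is ignored here; the kernel version saturates.
        unsafe { self.ptr.as_ref() }.refcount.fetch_add(1, Ordering::Relaxed);
        MiniRef { ptr: self.ptr }
    }
}

impl<T> Drop for MiniRef<T> {
    fn drop(&mut self) {
        // SAFETY: the count is still non-zero before this decrement.
        let refs = unsafe { &self.ptr.as_ref().refcount };
        if refs.fetch_sub(1, Ordering::Release) == 1 {
            fence(Ordering::Acquire);
            // SAFETY: this was the last reference, and the pointer came from a `Box`.
            drop(unsafe { Box::from_raw(self.ptr.as_ptr()) });
        }
    }
}

/// Test helper: counts how many times a value is dropped.
struct Tracker<'a>(&'a AtomicUsize);

impl Drop for Tracker<'_> {
    fn drop(&mut self) {
        self.0.fetch_add(1, Ordering::Relaxed);
    }
}
```

Wrapping a `Tracker` in a `MiniRef` makes it easy to check that the payload is dropped exactly once, and only after the last clone goes away.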

Creating a new reference-counted object

Next, let's look at how Ref<T> creates a new reference-counted object.

impl<T> Ref<T> {
    /// Constructs a new reference counted instance of `T`.
    pub fn try_new(contents: T) -> Result<Self> {
        let layout = Layout::new::<RefInner<T>>();
        // SAFETY: The layout size is guaranteed to be non-zero because `RefInner` contains the
        // reference count.
        let inner = NonNull::new(unsafe { alloc(layout) })
            .ok_or(Error::ENOMEM)?
            .cast::<RefInner<T>>();

        // INVARIANT: The refcount is initialised to a non-zero value.
        let value = RefInner {
            // SAFETY: Just an FFI call that returns a `refcount_t` initialised to 1.
            refcount: Opaque::new(unsafe { bindings::REFCOUNT_INIT(1) }),
            data: contents,
        };
        // SAFETY: `inner` is writable and properly aligned.
        unsafe { inner.as_ptr().write(value) };

        // SAFETY: We just created `inner` with a reference count of 1, which is owned by the new
        // `Ref` object.
        Ok(unsafe { Self::from_inner(inner) })
    }
}


The try_new method uses the core::alloc::Layout struct to describe the memory layout.

It allocates new memory with the custom alloc function, checks the result with NonNull::new, casts the pointer to RefInner<T>, and initializes the reference count to 1 by calling the kernel C function through bindings::REFCOUNT_INIT. The custom allocation module is expected to be synced back to upstream Rust in the future.
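The allocation steps can be reproduced in user space with std's allocator. In this sketch, `Counted`, `OutOfMemory`, `allocate_counted`, and `free_counted` are hypothetical names; `OutOfMemory` plays the role of Error::ENOMEM and a plain `u32` replaces refcount_t.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::ptr::NonNull;

/// Hypothetical stand-in for `Error::ENOMEM`.
#[derive(Debug, PartialEq)]
struct OutOfMemory;

#[repr(C)]
struct Counted<T> {
    refcount: u32,
    data: T,
}

fn allocate_counted<T>(data: T) -> Result<NonNull<Counted<T>>, OutOfMemory> {
    let layout = Layout::new::<Counted<T>>();
    // SAFETY: the layout is never zero-sized because `Counted` contains a `u32`.
    let ptr = NonNull::new(unsafe { alloc(layout) })
        .ok_or(OutOfMemory)?
        .cast::<Counted<T>>();
    // SAFETY: `ptr` is freshly allocated, writable, and aligned for `Counted<T>`.
    unsafe { ptr.as_ptr().write(Counted { refcount: 1, data }) };
    Ok(ptr)
}

/// # Safety
///
/// `ptr` must come from `allocate_counted` and must not be used afterwards.
unsafe fn free_counted<T>(ptr: NonNull<Counted<T>>) {
    let layout = Layout::new::<Counted<T>>();
    unsafe {
        std::ptr::drop_in_place(ptr.as_ptr());
        dealloc(ptr.as_ptr().cast::<u8>(), layout);
    }
}
```

Writing with `write` rather than assignment matters: the freshly allocated memory is uninitialised, so there is no old value that may be dropped.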

Error::ENOMEM represents an out-of-memory (OOM) error. Many kernel error codes are defined in kernel/error.rs.

The Linux kernel defines many error codes as integers. The kernel crate wraps them using the newtype pattern instead of using the integer error codes directly:

macro_rules! declare_err {
    ($err:tt) => {
        pub const $err: Self = Error(-(bindings::$err as i32));
    };
    ($err:tt, $($doc:expr),+) => {
        $(
        #[doc = $doc]
        )*
        pub const $err: Self = Error(-(bindings::$err as i32));
    };
}

#[derive(Clone, Copy, PartialEq, Eq)]
pub struct Error(c_types::c_int);

impl Error {
    declare_err!(EPERM, "Operation not permitted.");

    declare_err!(ENOENT, "No such file or directory.");

    declare_err!(ESRCH, "No such process.");

    declare_err!(ENOMEM, "Out of memory.");

    // ...

}
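The benefit of the newtype pattern is easier to see in a self-contained form. The following sketch does not use the kernel's bindings: `Errno`, `to_raw`, and `reserve` are hypothetical, and the two constants are written out by hand using the usual Linux values (EPERM is 1, ENOMEM is 12).

```rust
/// Hypothetical newtype over a C-style negative errno value.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Errno(i32);

impl Errno {
    /// Operation not permitted.
    pub const EPERM: Self = Errno(-1);
    /// Out of memory.
    pub const ENOMEM: Self = Errno(-12);

    /// Returns the raw integer that would be handed back to C callers.
    pub fn to_raw(self) -> i32 {
        self.0
    }
}

/// Made-up example function: fails with `ENOMEM` when the request is too large,
/// otherwise returns the remaining capacity.
fn reserve(bytes: usize, available: usize) -> Result<usize, Errno> {
    if bytes > available {
        Err(Errno::ENOMEM)
    } else {
        Ok(available - bytes)
    }
}
```

Because the inner integer is private, callers cannot construct arbitrary error values or confuse an error code with an ordinary number; they can only use the named constants.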

Constructing Ref<T> from an existing RefInner<T>

As the try_new method above shows, its final step uses from_inner to wrap the raw pointer into the final Ref<T>. from_inner is an internal method, not a public API.

Note that it is an unsafe method, because the caller must guarantee that the inner pointer is valid and that the reference count is non-zero. Its documentation comment states this clearly.

impl<T: ?Sized> Ref<T> {
    /// Constructs a new [`Ref`] from an existing [`RefInner`].
    ///
    /// # Safety
    ///
    /// The caller must ensure that `inner` points to a valid location and has a non-zero reference
    /// count, one of which will be owned by the new [`Ref`] instance.
    unsafe fn from_inner(inner: NonNull<RefInner<T>>) -> Self {
        // INVARIANT: By the safety requirements, the invariants hold.
        Ref {
            ptr: inner,
            _p: PhantomData,
        }
    }

}

RefBorrow<T>

Ref provides no mutable borrow of the underlying reference-counted struct, but it does provide an immutable borrow, RefBorrow, whose lifetime is managed manually.

/// A borrowed [`Ref`] with manually-managed lifetime.
///
/// # Invariants
///
/// There are no mutable references to the underlying [`Ref`], and it remains valid for the lifetime
/// of the [`RefBorrow`] instance.
pub struct RefBorrow<'a, T: ?Sized + 'a> {
    inner: NonNull<RefInner<T>>,
    _p: PhantomData<&'a ()>,
}

impl<T: ?Sized> Clone for RefBorrow<'_, T> {
    fn clone(&self) -> Self {
        *self
    }
}

impl<T: ?Sized> Copy for RefBorrow<'_, T> {}

The RefBorrow struct uses PhantomData<&'a ()> to carry the lifetime parameter and implements the Copy trait, so it behaves like an ordinary immutable reference.

An as_ref_borrow method is then implemented on Ref<T> to obtain a RefBorrow<T> from a Ref<T>.
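The role of PhantomData<&'a ()> can be illustrated with a stripped-down handle. `Borrowed` and `sum_twice` below are hypothetical names for this sketch; the handle stores only a raw pointer, and the phantom field is what ties it to the lifetime of the value it points to.

```rust
use std::marker::PhantomData;
use std::ops::Deref;
use std::ptr::NonNull;

/// Hypothetical borrowed handle: a raw pointer tagged with lifetime `'a`.
struct Borrowed<'a, T> {
    ptr: NonNull<T>,
    _p: PhantomData<&'a ()>,
}

impl<'a, T> Borrowed<'a, T> {
    fn new(value: &'a T) -> Self {
        Borrowed { ptr: NonNull::from(value), _p: PhantomData }
    }
}

impl<T> Clone for Borrowed<'_, T> {
    fn clone(&self) -> Self {
        *self
    }
}

impl<T> Copy for Borrowed<'_, T> {}

impl<T> Deref for Borrowed<'_, T> {
    type Target = T;

    fn deref(&self) -> &T {
        // SAFETY: `'a` guarantees the pointee outlives this handle, and only
        // shared access is ever handed out.
        unsafe { self.ptr.as_ref() }
    }
}

fn sum_twice(b: Borrowed<'_, u32>) -> u32 {
    let copy = b; // `Copy`: `b` stays usable after this line
    *b + *copy
}
```

Without the phantom field the struct would have no lifetime parameter to constrain, and the compiler could not stop a handle from outliving the value behind it.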

impl<T> Ref<T> {

    /// Returns a [`RefBorrow`] from the given [`Ref`].
    ///
    /// This is useful when the argument of a function call is a [`RefBorrow`] (e.g., in a method
    /// receiver), but we have a [`Ref`] instead. Getting a [`RefBorrow`] is free when optimised.
    #[inline]
    pub fn as_ref_borrow(&self) -> RefBorrow<'_, T> {
        // SAFETY: The constraint that lifetime of the shared reference must outlive that of
        // the returned `RefBorrow` ensures that the object remains alive.
        unsafe { RefBorrow::new(self.ptr) }
    }

}


By Rust naming conventions, as_ref would arguably be a better name than as_ref_borrow, but as_ref already serves another purpose:

impl<T: ?Sized> AsRef<T> for Ref<T> {
    fn as_ref(&self) -> &T {
        // SAFETY: By the type invariant, there is necessarily a reference to the object, so it is
        // safe to dereference it.
        unsafe { &self.ptr.as_ref().data }
    }
}

The as_ref method returns a &T from a Ref<T>.

The Deref trait is then implemented for RefBorrow<T>, so a &T can also be obtained from a RefBorrow<T>.

impl<T: ?Sized> Deref for RefBorrow<'_, T> {
    type Target = T;

    fn deref(&self) -> &Self::Target {
        // SAFETY: By the type invariant, the underlying object is still alive with no mutable
        // references to it, so it is safe to create a shared reference.
        unsafe { &self.inner.as_ref().data }
    }
}

Unique reference type UniqueRef<T>

In addition to Ref<T>, a UniqueRef<T> type is implemented. As the name suggests, it represents the case where the reference count is exactly one.

pub struct UniqueRef<T: ?Sized> {
    inner: Ref<T>,
}

impl<T> UniqueRef<T> {
    /// Tries to allocate a new [`UniqueRef`] instance.
    pub fn try_new(value: T) -> Result<Self> {
        Ok(Self {
            // INVARIANT: The newly-created object has a ref-count of 1.
            inner: Ref::try_new(value)?,
        })
    }

    /// Tries to allocate a new [`UniqueRef`] instance whose contents are not initialised yet.
    pub fn try_new_uninit() -> Result<UniqueRef<MaybeUninit<T>>> {
        Ok(UniqueRef::<MaybeUninit<T>> {
            // INVARIANT: The newly-created object has a ref-count of 1.
            inner: Ref::try_new(MaybeUninit::uninit())?,
        })
    }
}

UniqueRef does not implement Clone, and it has no Drop implementation of its own (dropping it simply drops the inner Ref), so it can only ever hold a single reference. Introducing this type may make kernel development more convenient.
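One plausible convenience of a uniquely owned pointer, offered here as an assumption rather than something the article states, is that mutable access becomes safe while there is provably only one owner. The sketch below models that idea on top of std's Arc; `Unique` and `share` are invented names.

```rust
use std::ops::{Deref, DerefMut};
use std::sync::Arc;

/// Hypothetical wrapper: an `Arc` that is known to have exactly one owner.
struct Unique<T>(Arc<T>);

impl<T> Unique<T> {
    fn new(value: T) -> Self {
        Unique(Arc::new(value))
    }

    /// Gives up uniqueness and returns the shareable pointer.
    fn share(self) -> Arc<T> {
        self.0
    }
}

impl<T> Deref for Unique<T> {
    type Target = T;

    fn deref(&self) -> &T {
        &self.0
    }
}

impl<T> DerefMut for Unique<T> {
    fn deref_mut(&mut self) -> &mut T {
        // `Unique` is not `Clone` and never hands out the inner `Arc` by
        // reference, so the count is always 1 and `get_mut` succeeds.
        Arc::get_mut(&mut self.0).expect("unique by construction")
    }
}
```

A typical flow under this model is to build and mutate the value through the unique handle first, then call `share` once it is ready to be handed to several owners.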

Other

Ref<T> also implements other traits, such as From/TryFrom, and can be converted to and from a raw pointer.

One notable point is:

impl<T> Ref<T> {
    /// Deconstructs a [`Ref`] object into a raw pointer.
    ///
    /// It can be reconstructed once via [`Ref::from_raw`].
    pub fn into_raw(obj: Self) -> *const T {
        let ret = &*obj as *const T;
        core::mem::forget(obj);
        ret
    }
}

When converting a Ref<T> into a raw pointer, core::mem::forget(obj) must be called so that obj's Drop does not run; otherwise the reference count would be decremented and cause problems.

Summary

The Rust for Linux source code teaches many unsafe Rust techniques, especially good practices for interoperating with C. If you are interested, it is also a way to learn a little about the Linux kernel and to prepare for writing Linux kernel drivers in Rust in the future.
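std's Arc offers the same into_raw/from_raw pair, so the effect on the count can be checked directly. This is a sketch using Arc rather than Ref, and `round_trip` is a made-up helper.

```rust
use std::sync::Arc;

/// Returns the strong count observed before the conversion, while only the
/// raw pointer exists, and after the reconstructed `Arc` has been dropped.
fn round_trip() -> (usize, usize, usize) {
    let first = Arc::new(5u32);
    let second = Arc::clone(&first);
    let before = Arc::strong_count(&first);

    // `into_raw` gives up `second` without running its destructor, so the
    // count is not decremented.
    let raw: *const u32 = Arc::into_raw(second);
    let while_raw = Arc::strong_count(&first);

    // SAFETY: `raw` came from `Arc::into_raw` and is converted back exactly once.
    let restored = unsafe { Arc::from_raw(raw) };
    drop(restored);
    let after = Arc::strong_count(&first);

    (before, while_raw, after)
}
```

The raw pointer effectively carries one unit of the reference count with it; that unit is only released when the pointer is turned back into a smart pointer and dropped.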
