ptr::sentinel: ptr::dangling that cannot alias valid pointers #773

@CAD97

Proposal

Problem statement

All variants of pointer::dangling point out that

the address of the returned pointer may potentially be that of a valid pointer, which means this must not be used as a “not yet initialized” sentinel value.

For non-zero-sized values, however, we can do better: a pointer with the address 0usize.wrapping_sub(size_of::<T>()) (or any higher address) cannot be a valid pointer. This is because, for a pointer to be valid, Rust requires that ptr.add(1) is sound to call (to get the one-past-the-end pointer). Such an offset cannot be validly performed on a pointer with such a sentinel address, as that would overflow the address space.
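The address arithmetic behind this argument can be checked directly. The following is a minimal sketch working on plain `usize` addresses; `sentinel_addr` is a hypothetical helper name used only for illustration, not an existing or proposed std API.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper (not a std API): the lowest sentinel address for `T`.
fn sentinel_addr<T>() -> usize {
    0usize.wrapping_sub(size_of::<T>())
}

fn main() {
    let addr = sentinel_addr::<u64>();

    // For an 8-byte type this is the last 8-byte slot of the address space.
    assert_eq!(addr, usize::MAX - 7);

    // Stepping to the one-past-the-end address overflows the address space...
    assert_eq!(addr.checked_add(size_of::<u64>()), None);
    // ...and the wrapped result is address 0, i.e. null.
    assert_eq!(addr.wrapping_add(size_of::<u64>()), 0);

    // A type's size is a multiple of its alignment, so the address is aligned.
    assert_eq!(addr % align_of::<u64>(), 0);
}
```

Any address above `sentinel_addr::<T>()` overflows in the same way, which is why the text allows "any higher address" as well.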

For ZSTs, we should panic instead of returning a potentially-aliased sentinel. A post-mono const assertion error is not ideal for this function, as it isn't guaranteed to be dead-code eliminated when collections branch for handling ZSTs around a call. (If we had functionality to guarantee non-monomorphization of untaken branches, then a post-mono error would be much more appealing here.)
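To illustrate the branching pattern mentioned above, here is a rough sketch of a collection that handles ZSTs separately and only asks for a sentinel in the non-ZST branch. The names `sentinel_addr` and `unallocated_addr` are hypothetical, and the run-time `assert!` stands in for the proposed panic.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for the proposed sentinel: panics at run time for ZSTs.
fn sentinel_addr<T>() -> usize {
    let addr = 0usize.wrapping_sub(size_of::<T>());
    assert!(addr != 0, "no sentinel address exists for zero-sized types");
    addr
}

/// How a collection might pick its "not yet allocated" address.
/// Returns `None` for ZSTs, which never need an allocation.
fn unallocated_addr<T>() -> Option<usize> {
    if size_of::<T>() == 0 {
        None
    } else {
        // This call is still compiled for `T = ()`, but it is never executed
        // there, so a run-time panic in `sentinel_addr` cannot fire.
        Some(sentinel_addr::<T>())
    }
}

fn main() {
    assert_eq!(unallocated_addr::<()>(), None);
    assert_eq!(unallocated_addr::<u32>(), Some(usize::MAX - 3));
}
```

With a post-mono const assertion in place of the `assert!`, the `else` branch would have to be reliably skipped during monomorphization for `T = ()` to compile, which is the guarantee the proposal says is missing today.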

Motivating examples or use cases

The obvious way to implement lazy allocation in a container is to store an Option<NonNull<_>> and initialize it to Some the first time it's needed. Alternatively, if the allocation state can be inferred from other state that isn't allocated this way, the container can always store a NonNull<_> that defaults to a dangling pointer.
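As a point of reference, the first approach might look like the sketch below. `LazySlot` is a hypothetical single-value container invented for this example; it allocates through `Box` to keep the code short.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

/// Hypothetical single-slot container that allocates on first use.
struct LazySlot<T> {
    ptr: Option<NonNull<T>>, // `None` means "not yet allocated"
}

impl<T> LazySlot<T> {
    const fn new() -> Self {
        Self { ptr: None }
    }

    fn is_allocated(&self) -> bool {
        self.ptr.is_some()
    }

    fn get_or_insert_with(&mut self, init: impl FnOnce() -> T) -> &mut T {
        let ptr = *self
            .ptr
            .get_or_insert_with(|| NonNull::from(Box::leak(Box::new(init()))));
        // SAFETY: `ptr` came from a live `Box` that this slot owns exclusively.
        unsafe { &mut *ptr.as_ptr() }
    }
}

impl<T> Drop for LazySlot<T> {
    fn drop(&mut self) {
        if let Some(ptr) = self.ptr.take() {
            // SAFETY: the pointer was produced by `Box::leak` and not yet freed.
            drop(unsafe { Box::from_raw(ptr.as_ptr()) });
        }
    }
}

fn main() {
    let mut slot = LazySlot::<u32>::new();
    assert!(!slot.is_allocated());

    *slot.get_or_insert_with(|| 41) += 1;
    assert_eq!(*slot.get_or_insert_with(|| 0), 42);

    // `Option<NonNull<T>>` is pointer-sized because `None` is stored as null,
    // which uses up the zero niche.
    assert_eq!(size_of::<Option<NonNull<u32>>>(), size_of::<usize>());
}
```

The `Option` itself is free, but it occupies the null value, so that niche is no longer available to an enclosing type.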

With sentinel, containers can preserve the zero niche in both cases, as long as they either can entirely rule out the possibility of zero-sized allocation or

The Weak reference-counted pointers in the standard library currently do this, except with a sentinel of usize::MAX IIRC. Using the sentinel described here would allow Weak to have an alignment niche and to guarantee into_raw returning an aligned pointer[1], if desired.
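The two properties claimed here, a preserved zero niche and an aligned sentinel, can be sketched in user code. `MaybeAllocated` and `sentinel_addr` are hypothetical names for this example, and the integer-to-pointer cast is a stand-in for whatever constructor the real API would use.

```rust
use std::mem::{align_of, size_of};
use std::ptr::NonNull;

/// Lowest sentinel address from the proposal (hypothetical helper, not a std API).
fn sentinel_addr<T>() -> usize {
    0usize.wrapping_sub(size_of::<T>())
}

/// Hypothetical pointer field of a lazily allocating container.
/// `repr(transparent)` keeps the null niche of `NonNull` available.
#[repr(transparent)]
struct MaybeAllocated<T>(NonNull<T>);

impl<T> MaybeAllocated<T> {
    /// Panics for zero-sized `T`, where the computed address would be null.
    fn unallocated() -> Self {
        let ptr = NonNull::new(sentinel_addr::<T>() as *mut T)
            .expect("no sentinel exists for zero-sized types");
        Self(ptr)
    }

    fn is_allocated(&self) -> bool {
        self.0.as_ptr() as usize != sentinel_addr::<T>()
    }
}

fn main() {
    let empty = MaybeAllocated::<u64>::unallocated();
    assert!(!empty.is_allocated());

    let mut value = 7u64;
    let live = MaybeAllocated(NonNull::from(&mut value));
    assert!(live.is_allocated());

    // The zero niche is still free, so an outer `Option` adds no space.
    assert_eq!(size_of::<Option<MaybeAllocated<u64>>>(), size_of::<usize>());

    // `usize::MAX` is odd, so it is misaligned for any alignment above 1,
    // while the proposed address is a multiple of the alignment.
    assert_ne!(usize::MAX % align_of::<u64>(), 0);
    assert_eq!(sentinel_addr::<u64>() % align_of::<u64>(), 0);
}
```

The equality check in `is_allocated` relies on the argument from the problem statement: no valid pointer to a non-zero-sized `T` can have the sentinel address.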

Solution sketch

impl<T> core::ptr::NonNull<T> {
    /// Creates an aligned sentinel pointer that cannot alias a valid pointer to `T`.
    ///
    /// This is achieved by using an address `>= 0usize.wrapping_sub(size_of::<T>())`.
    /// Offsetting such a pointer to its one-past-the-end pointer would overflow
    /// the address space, which Rust guarantees must not happen for
    /// dereferenceable pointers. A null one-past-the-end is sufficient; you
    /// could use such a reference to create a null pointer in safe code easily
    /// by converting the `&T` into `&[T; 1]` ([`core::array::from_ref`]).
    ///
    /// # Panics
    ///
    /// Panics if `T` is zero-sized, as all non-null addresses can be used for
    /// valid dereferenceable pointers to zero-sized types. Typically containers
    /// treat pointers-to-ZST as always valid, such as a `Vec<()>` always having
    /// a capacity of `usize::MAX`.
    #[inline]
    #[must_use]
    #[track_caller]
    pub const fn sentinel() -> Self {
        let Some(addr) = NonZero::new(0usize.wrapping_sub(size_of::<T>())) else {
            panic!(...)
        };
        Self::without_provenance(addr)
    }
}

pub const fn core::alloc::Layout::sentinel_ptr<T>(&self) -> NonNull<u8>;
pub fn core::ptr::sentinel<T>() -> *const T;
pub fn core::ptr::sentinel_mut<T>() -> *mut T;
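For experimentation outside the standard library, the pointer-returning functions above can be approximated as free functions. This is only a sketch under stated assumptions: `sentinel_non_null` is a hypothetical name for the `NonNull` constructor, the functions are not `const`, and a plain integer-to-pointer cast replaces `without_provenance`. The `Layout` method is left out because the proposal does not spell out its behavior.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

/// User-space approximation of the proposed `NonNull::<T>::sentinel()`.
/// Panics if `T` is zero-sized.
#[track_caller]
fn sentinel_non_null<T>() -> NonNull<T> {
    let addr = 0usize.wrapping_sub(size_of::<T>());
    // Assumption: a plain cast is used here; the proposal itself calls
    // `without_provenance` instead.
    match NonNull::new(addr as *mut T) {
        Some(ptr) => ptr,
        None => panic!("sentinel pointers do not exist for zero-sized types"),
    }
}

/// Approximation of the proposed `core::ptr::sentinel_mut`.
fn sentinel_mut<T>() -> *mut T {
    sentinel_non_null::<T>().as_ptr()
}

/// Approximation of the proposed `core::ptr::sentinel`.
fn sentinel<T>() -> *const T {
    sentinel_mut::<T>() as *const T
}

fn main() {
    let p = sentinel::<[u8; 16]>();
    assert_eq!(p as usize, usize::MAX - 15);

    // `wrapping_add` is a safe method; the one-past-the-end pointer is null.
    assert!(p.wrapping_add(1).is_null());

    // Zero-sized types have no sentinel, so the constructor panics.
    assert!(std::panic::catch_unwind(sentinel_non_null::<()>).is_err());
}
```

The sentinel must never be dereferenced or offset with `add`; it is only meant to be compared against or replaced by a real allocation.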

Alternatives

We can always choose to add nothing and libraries can continue to track initialization separately from a placeholder dangling pointer. Particularly clever libraries might note the possibility of using this sentinel on their own, hopefully also being clever enough to remember to safeguard against ZSTs.

Do note that adding this will provide a straightforward way to break the convention that invalid dangling pointers are in the zero page, so it could potentially mean UB from dereferencing is easier to exploit. As it's UB either way, though, this is only a weak caveat.

Links and related work

What happens now?

This issue contains an API change proposal (or ACP) and is part of the libs-api team feature lifecycle. Once this issue is filed, the libs-api team will review open proposals as capability becomes available. Current response times do not have a clear estimate, but may be up to several months.

Possible responses

The libs team may respond in various different ways. First, the team will consider the problem (this doesn't require any concrete solution or alternatives to have been proposed):

  • We think this problem seems worth solving, and the standard library might be the right place to solve it.
  • We think that this probably doesn't belong in the standard library.

Second, if there's a concrete solution:

  • We think this specific solution looks roughly right, approved, you or someone else should implement this. (Further review will still happen on the subsequent implementation PR.)
  • We're not sure this is the right solution, and the alternatives or other materials don't give us enough information to be sure about that. Here are some questions we have that aren't answered, or rough ideas about alternatives we'd want to see discussed.

Footnotes

  1. The current scheme for conversion to/from raw pointer provides the sentinel address unchanged, but a potential change that's been kept open is to always wrapping_offset the pointer, which would produce a null pointer with this sentinel. This is explicitly allowed as a possibility by the Weak::into_raw docs, and could potentially even be desirable.
