feat: HashTable::try_insert_unique_within_capacity #621

Open
morrisonlevi wants to merge 1 commit into rust-lang:master from
morrisonlevi:safe-insert-within-capacity

Conversation

@morrisonlevi (Contributor) commented May 4, 2025

This implements #618 with this signature:

    pub fn try_insert_unique_within_capacity(
        &mut self,
        hash: u64,
        value: T,
    ) -> Result<OccupiedEntry<'_, T, A>, T>

This is the safe version suggested in the issue feedback, rather than the originally proposed unsafe version.

My personal motivation is to avoid panics. HashTable::insert_unique calls RawTable::insert, which calls self.reserve(1, hasher), which must panic if allocation fails.

Today, you can call HashTable::try_reserve before HashTable::insert_unique to avoid the panic in practice. However, this is not compatible with techniques like no-panic, which detects at compile/link time that a function cannot panic. I have verified that with this patch my code is no_panic::no_panic compatible. That code is not public yet, so I can't share the source for you to verify yourself, but it's nothing too interesting.
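The try_reserve-then-insert pattern described above can be sketched with std's HashMap (whose stable try_reserve API mirrors hashbrown's) standing in for HashTable. Note this is a minimal illustration of the pattern, not hashbrown's code: reserving first turns allocation failure into a Result instead of a panic, but the later insert still carries its own (now unreachable) growth path, which is what defeats no-panic verification.

```rust
use std::collections::HashMap;
use std::hash::Hash;

// Hypothetical fallible insert built on `HashMap::try_reserve` (stable
// since Rust 1.57). On allocation failure, the key/value pair is handed
// back to the caller instead of panicking.
fn insert_no_panic<K: Hash + Eq, V>(
    map: &mut HashMap<K, V>,
    key: K,
    value: V,
) -> Result<(), (K, V)> {
    if map.try_reserve(1).is_err() {
        return Err((key, value));
    }
    // `insert` still contains a growth branch internally, even though
    // the reservation above guarantees it will not be taken.
    map.insert(key, value);
    Ok(())
}

fn main() {
    let mut map = HashMap::new();
    assert!(insert_no_panic(&mut map, "a", 1).is_ok());
    assert_eq!(map["a"], 1);
}
```

The new try_insert_unique_within_capacity sidesteps this by having no growth branch at all, so the whole call is statically panic-free.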

This also has an impact on code size. I don't have concrete, trustworthy numbers, but the compiler does not optimize away the memory-growth path in insert_unique after a try_reserve, so the code should get smaller; it's just a matter of how much.

Comment thread src/raw/mod.rs Outdated
/// This does not check if the given element already exists in the table.
#[cfg_attr(feature = "inline-more", inline)]
pub fn insert(&mut self, hash: u64, value: T, hasher: impl Fn(&T) -> u64) -> Bucket<T> {
#[inline(always)]
Contributor Author:
I used #[inline(always)] unconditionally because this is a helper method which was previously written in line, so I didn't want to worry about abstraction overhead.

Member:
A simple #[inline] would be sufficient here. We don't want to over-inline in debug builds since this can hurt build times.

Comment thread src/raw/mod.rs Outdated
///
/// This does not check if the given element already exists in the table.
#[inline]
pub(crate) fn insert_within_capacity(&mut self, hash: u64, value: T) -> Option<Bucket<T>> {
Contributor Author:
I used pub(crate) because I wasn't sure if we wanted to grow the RawTable API surface at all.

@morrisonlevi morrisonlevi force-pushed the safe-insert-within-capacity branch 3 times, most recently from 2ce56de to 4188129 Compare May 5, 2025 14:43
@morrisonlevi morrisonlevi changed the title feat: HashTable::insert_unique_within_capacity feat: HashTable::try_insert_unique_within_capacity May 6, 2025
Add a panic-free insert path for `HashTable` that returns the value
back to the caller when the table is at its load-factor limit and
would need to grow.

Implementation:
- Extract `find_insert_slot` helper on `RawTable` that probes for a
  slot and returns `None` when growth would be required.
- Refactor `insert` to use the new helper, with `cold_path()` on the
  growth branch.
- Add `try_insert_within_capacity` on `RawTable` (pub(crate)).
- Add `try_insert_unique_within_capacity` on `HashTable` (public API).

Motivation: `insert_unique` internally calls `reserve(1, hasher)` which
panics on allocation failure.  Callers can call `try_reserve` beforehand,
but the compiler cannot eliminate the growth path in `insert_unique`,
making it incompatible with `#[no_panic]` verification.

Closes rust-lang#618

Co-authored-by: Cursor <cursoragent@cursor.com>
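The "within capacity" contract from the commit message above can be sketched as follows. This is a deliberately simplified stand-in (a flat Vec-backed store, not hashbrown's SIMD probing table): the key property is that insertion never allocates, and a full table returns the value to the caller rather than growing or panicking.

```rust
// Minimal sketch of the `try_insert_within_capacity` contract; the names
// `FixedTable` and `cap` are illustrative, not hashbrown internals.
struct FixedTable<T> {
    slots: Vec<T>,
    cap: usize,
}

impl<T> FixedTable<T> {
    fn with_capacity(cap: usize) -> Self {
        Self {
            slots: Vec::with_capacity(cap),
            cap,
        }
    }

    /// Inserts without ever allocating. Returns `Err(value)` when the
    /// table is at capacity and insertion would require growth.
    fn try_insert_within_capacity(&mut self, value: T) -> Result<&mut T, T> {
        if self.slots.len() == self.cap {
            return Err(value); // hand the value back; no reallocation
        }
        self.slots.push(value);
        Ok(self.slots.last_mut().unwrap())
    }
}

fn main() {
    let mut t = FixedTable::with_capacity(1);
    assert!(t.try_insert_within_capacity(10).is_ok());
    // Second insert exceeds capacity: the value comes back unchanged.
    assert_eq!(t.try_insert_within_capacity(20), Err(20));
}
```

Returning the value in the Err variant matters because T may not be Copy or Clone; the caller regains ownership and can decide to grow, retry, or drop it.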
@morrisonlevi morrisonlevi force-pushed the safe-insert-within-capacity branch from 7237c29 to cf7e06f Compare February 17, 2026 07:16
@morrisonlevi (Contributor Author) commented:
@Amanieu I had AI help me fix conflicts and apply your review feedback. I've gone over the code myself and it looks good to me, but again, I do not yet know hashbrown well (a bit more with each PR).

Comment thread src/raw.rs
/// Returns `None` if the table is at its load-factor limit and would need
/// to grow.
#[inline]
fn find_insert_slot(&self, hash: u64) -> Option<usize> {
Member:
This function doesn't depend on T and should therefore be moved to RawTableInner.

Comment thread src/raw.rs
fn find_insert_slot(&self, hash: u64) -> Option<usize> {
unsafe {
// SAFETY:
// 1. The [`RawTableInner`] must already have properly initialized control bytes since
Member:
"1." is no longer needed in this comment.

Comment thread src/raw.rs
let old_ctrl = *self.table.ctrl(index);
if unlikely(self.table.growth_left == 0 && old_ctrl.special_is_empty()) {
if self.table.growth_left == 0 && old_ctrl.special_is_empty() {
None
Member:
I feel that cold_path should go here since it's unlikely in practice that capacity is full.
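The reviewer's suggestion can be illustrated in stable Rust with a `#[cold]` helper function, which gives the optimizer the same "this branch is rarely taken" hint as `cold_path()`: the full-table branch is outlined, keeping the hot insert path compact. The function and parameter names below are illustrative, not hashbrown's.

```rust
// `#[cold]` marks the function as unlikely to be called, so the
// compiler moves this branch out of the hot path.
#[cold]
fn table_full(value: u32) -> Result<usize, u32> {
    Err(value)
}

// Hypothetical insert: when no growth budget remains, take the cold
// path and hand the value back instead of growing.
fn try_insert(growth_left: usize, slot: usize, value: u32) -> Result<usize, u32> {
    if growth_left == 0 {
        return table_full(value);
    }
    Ok(slot)
}

fn main() {
    assert_eq!(try_insert(0, 3, 7), Err(7));
    assert_eq!(try_insert(2, 3, 7), Ok(3));
}
```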
