fix(hll): update silently dropped after deserializing a compact List sketch#117

Merged
notfilippo merged 1 commit into apache:main from notfilippo:filippo.rossi/push-vnmtmlowxknr
Apr 22, 2026

@notfilippo
Member

When an `HllSketch` in List mode is serialized, it uses the compact format: only the live coupons are written to disk, with no trailing `COUPON_EMPTY` sentinels. On deserialization, the container was allocated with exactly `coupon_count` slots, leaving the array fully packed.

`list::update()` scans linearly for a `COUPON_EMPTY` (`0`) sentinel to find an insertion slot. Finding none, it fell through silently, discarding every value added after the round-trip.

This change always allocates `1 << lg_arr` slots (initialized to `COUPON_EMPTY`) in `List::deserialize()` and reads only `coupon_count` elements from the byte stream in compact mode. The trailing empty slots are then available as sentinels for subsequent `update()` calls.

Fixes #115.
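
For readers who want to see the failure mode in isolation, here is a minimal sketch of the behaviour described above. It is not the crate's actual code: `CouponList`, `serialize_compact`, `deserialize_packed`, and `deserialize_padded` are hypothetical names, coupons are modelled as plain `u32` values, and the "byte stream" is simplified to a slice of coupons. Only `COUPON_EMPTY`, `lg_arr`, and the linear scan in `update()` come from the description.

```rust
/// Sentinel for an unused slot; the description gives its value as 0.
const COUPON_EMPTY: u32 = 0;

/// Hypothetical stand-in for the List-mode coupon container.
pub struct CouponList {
    coupons: Vec<u32>,
}

impl CouponList {
    /// Fresh list with `1 << lg_arr` empty slots.
    pub fn new(lg_arr: u8) -> Self {
        CouponList { coupons: vec![COUPON_EMPTY; 1usize << lg_arr] }
    }

    /// Linear scan for a duplicate or the first empty slot.
    /// Returns `false` when no slot was found and the coupon was dropped.
    /// Assumes real coupons are never equal to `COUPON_EMPTY`.
    pub fn update(&mut self, coupon: u32) -> bool {
        for slot in self.coupons.iter_mut() {
            if *slot == coupon {
                return true; // already stored
            }
            if *slot == COUPON_EMPTY {
                *slot = coupon;
                return true;
            }
        }
        false
    }

    /// Compact form: only the live coupons, no trailing sentinels.
    pub fn serialize_compact(&self) -> Vec<u32> {
        self.coupons
            .iter()
            .copied()
            .filter(|&c| c != COUPON_EMPTY)
            .collect()
    }

    /// Behaviour before the fix: exactly `coupon_count` slots, fully packed.
    pub fn deserialize_packed(stored: &[u32]) -> Self {
        CouponList { coupons: stored.to_vec() }
    }

    /// Behaviour after the fix: always `1 << lg_arr` slots, with the stored
    /// coupons copied to the front. Panics if `stored` exceeds that capacity.
    pub fn deserialize_padded(lg_arr: u8, stored: &[u32]) -> Self {
        let mut coupons = vec![COUPON_EMPTY; 1usize << lg_arr];
        coupons[..stored.len()].copy_from_slice(stored);
        CouponList { coupons }
    }
}
```

In this model, calling `update()` on a list built by `deserialize_packed` returns `false` for any new coupon, while the same call on a list built by `deserialize_padded` finds a trailing empty slot. Returning a `bool` is a choice made here to make the drop observable; the description only says the original code fell through silently.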

@notfilippo notfilippo force-pushed the filippo.rossi/push-vnmtmlowxknr branch from d9ffa83 to dc02681 Compare April 18, 2026 08:46
@notfilippo
Member Author

cc @tisonkun @fulmicoton

@tisonkun tisonkun self-requested a review April 22, 2026 07:18
Member

@tisonkun tisonkun left a comment

Thanks for the fix!

@notfilippo notfilippo merged commit b1544aa into apache:main Apr 22, 2026
9 checks passed

Development

Successfully merging this pull request may close these issues.

bug upgrading a deserialized hll sketch
