Use LRU eviction to prevent session key data loss in the gateway #2836
Open
badgersrus
reviewed
Mar 3, 2026
func (t *sessionKeyActivityTracker) Stop() {
    t.stopOnce.Do(func() {
        close(t.stopChan)
        if t.persistQueue != nil {
Contributor
I don't think we need to close this as well. Doesn't the stopChan already call close in stop_control.go?
// store all activities in the database to make them persistent and recoverable in case of restart
allActivities := s.activityTracker.ListAll()
_ = s.activityStorage.Save(allActivities)
Contributor
I might be misunderstanding the code, but won't this overwrite entries that were evicted from memory but have not yet reached the expiration threshold?
- session key A gets evicted from memory (because the tracker hit its limit) and is sent to Cosmos via the persist queue
- key A is now in Cosmos and not in the in-memory tracker
- the expiration loop fires sessionKeyExpirations(), but key A isn't past the expiration threshold
- ListAll() returns only the in-memory entries (key A missing)
- Save writes that list, which overwrites the stored data and deletes key A?
Why this change is needed
The SessionKeyActivityTracker has a hard limit of 100k entries. When this limit is reached, new session key activities are silently dropped (those keys will never expire and funds).
What changes were made as part of this PR
I replaced the simple map with an LRU cache. Instead of dropping new entries at capacity, we now evict the least recently used entry and persist it to CosmosDB via an async batch writer.
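For readers who have not seen the diff, the eviction behaviour described above could look roughly like the following. This is a minimal sketch, not the PR's code: the `activity` type, `lruTracker`, `Touch`, and the `onEvict` callback are hypothetical names chosen for illustration, and the real tracker presumably also needs locking.

```go
package main

import (
	"container/list"
	"fmt"
)

// activity is a hypothetical record; the PR's real type is not shown here.
type activity struct {
	Key      string
	LastSeen int64
}

// lruTracker keeps at most capacity entries and evicts the least recently
// touched one. It is NOT safe for concurrent use; a real tracker would
// guard these fields with a mutex.
type lruTracker struct {
	capacity int
	order    *list.List               // front = most recent, back = least recent
	items    map[string]*list.Element // key -> element in order
	onEvict  func(activity)           // e.g. hand the entry to a persist queue
}

func newLRUTracker(capacity int, onEvict func(activity)) *lruTracker {
	if capacity < 1 {
		capacity = 1 // avoid evicting from an empty list
	}
	return &lruTracker{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[string]*list.Element),
		onEvict:  onEvict,
	}
}

// Touch records activity for key. Existing keys move to the front; a new
// key at capacity first evicts the entry at the back. Both paths are O(1).
func (t *lruTracker) Touch(key string, now int64) {
	if el, ok := t.items[key]; ok {
		el.Value.(*activity).LastSeen = now
		t.order.MoveToFront(el)
		return
	}
	if t.order.Len() >= t.capacity {
		victim := t.order.Remove(t.order.Back()).(*activity)
		delete(t.items, victim.Key)
		if t.onEvict != nil {
			t.onEvict(*victim)
		}
	}
	t.items[key] = t.order.PushFront(&activity{Key: key, LastSeen: now})
}

// Len reports how many entries are currently held in memory.
func (t *lruTracker) Len() int { return t.order.Len() }

func main() {
	var evicted []string
	tr := newLRUTracker(2, func(a activity) { evicted = append(evicted, a.Key) })
	tr.Touch("A", 1)
	tr.Touch("B", 2)
	tr.Touch("A", 3) // A becomes the most recent entry
	tr.Touch("C", 4) // at capacity: B is the least recent and is evicted
	fmt.Println(evicted, tr.Len()) // prints: [B] 2
}
```

The point of the callback is that an evicted entry is handed off instead of being lost, which is the property the old map with a hard limit did not have.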
Changes:
- session_key_activity.go: LRU cache using container/list with O(1) eviction; background goroutine batches evicted entries (100 items or 5s flush interval) and writes to CosmosDB
- session_key_expiration.go: Expiration check now queries both in-memory cache AND CosmosDB, merging results to catch entries that were evicted from memory
- session_key_activity_storage.go: Added SaveBatch(), ListOlderThan(), and Delete() to storage interface
- cosmosdb/session_key_activity_storage.go: Implemented incremental upsert (read-merge-write) for batched persistence
- Graceful shutdown flushes pending writes before exit
Write frequency impact: No additional CosmosDB writes unless cache exceeds 100k entries. When evictions occur, writes are batched efficiently (~1-2 shard writes per 100 evictions).