feat: add restart_count to HostInfo proto #232
Merged
Add restart_count field (uint32, field 14) to the HostInfo protobuf message. This allows hadron to report its systemd restart counter to ion during the session hello, enabling crash-loop detection in the debug machines endpoint.

- gravity_session.proto: added restart_count = 14
- gravity_session.pb.go: regenerated
- grpc_client.go: populate RestartCount from GravityConfig
- types.go: added RestartCount to GravityConfig
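The wiring described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `GravityConfig` and `HostInfo` types here are local stand-ins that show only the new field, and `buildHostInfo` is a hypothetical helper name, since the actual function in grpc_client.go is not shown in this PR description.

```go
package main

import "fmt"

// GravityConfig is a stand-in for the client config in types.go.
// Only the field added by this PR is shown.
type GravityConfig struct {
	RestartCount uint32
}

// HostInfo is a stand-in for the generated protobuf struct in
// gravity_session.pb.go. The real type carries many more fields.
type HostInfo struct {
	RestartCount uint32
}

// buildHostInfo is a hypothetical helper that copies the restart
// counter from the config into the message sent in the session hello.
func buildHostInfo(cfg GravityConfig) *HostInfo {
	return &HostInfo{RestartCount: cfg.RestartCount}
}

func main() {
	info := buildHostInfo(GravityConfig{RestartCount: 25})
	fmt.Println(info.RestartCount)
}
```

A caller that never sets `RestartCount` sends the Go zero value, which matches the proto default of 0 for an unset uint32 field.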
Walkthrough: Added systemd restart counter support throughout the codebase.
Summary

Adds `restart_count` (uint32, field 14) to the `HostInfo` protobuf message so hadron can report its systemd restart counter during the session hello.

Why

We discovered `hadron-s6xljlx8` in us-west crash-looping every ~5 minutes (restart counter at 25+) due to a false-positive stall detection bug on idle machines with 0 containers. The companion hadron PR fixes the stall detection, but we need visibility into restart counts via the ion debug API so we can detect crash loops during health checks without SSH'ing into individual machines.

Changes

- `gravity_session.proto`: Added `uint32 restart_count = 14` to `HostInfo`
- `gravity_session.pb.go`: Regenerated
- `gravity/grpc_client.go`: Populates `RestartCount` from `GravityConfig`
- `gravity/types.go`: Added `RestartCount uint32` to `GravityConfig`

Backward compatible: old hadrons send 0 (proto default), new hadrons send the systemd `NRestarts` value.
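For illustration, here is one way the two ends of this could look. It is a sketch under stated assumptions, not the hadron or ion implementation: it assumes the counter is obtained as the text output of `systemctl show <unit> --property=NRestarts --value`, and both `crashLoopThreshold` and `looksLikeCrashLoop` are hypothetical names with an arbitrary threshold chosen for the example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseNRestarts converts the text printed by
// `systemctl show <unit> --property=NRestarts --value`
// into the uint32 carried by restart_count.
func parseNRestarts(out string) (uint32, error) {
	n, err := strconv.ParseUint(strings.TrimSpace(out), 10, 32)
	if err != nil {
		return 0, fmt.Errorf("parse NRestarts %q: %w", out, err)
	}
	return uint32(n), nil
}

// crashLoopThreshold is an assumed value for this sketch only.
const crashLoopThreshold = 5

// looksLikeCrashLoop is a hypothetical check a health probe could
// apply to the reported counter. A value of 0 can also mean an old
// hadron that does not set the field at all.
func looksLikeCrashLoop(restartCount uint32) bool {
	return restartCount >= crashLoopThreshold
}

func main() {
	n, err := parseNRestarts("25\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(n, looksLikeCrashLoop(n))
}
```

Because the counter is cumulative, a fixed threshold says nothing about how quickly restarts happened; a real check would likely compare successive readings over time.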