log "Checking ${#ENDPOINTS[@]} endpoint(s)..."

for endpoint in "${ENDPOINTS[@]}"; do
  endpoint=$(echo "$endpoint" | xargs)
Empty endpoints aren’t skipped after trim. This can call curl with no URL and create a blank endpoint_id (e.g., .../instance/). Consider continuing the loop when endpoint is empty.
- endpoint=$(echo "$endpoint" | xargs)
+ endpoint=$(echo "$endpoint" | xargs)
+ if [ -z "$endpoint" ]; then
+   continue
+ fi
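The skip-after-trim idea can be sketched in isolation. This is a minimal illustration, not the script's actual code: the `trim` helper, the `ENDPOINTS_RAW` variable, and the comma-separated input format are all assumptions made for the example. It trims with bash parameter expansion instead of `xargs`, since `xargs` also interprets quotes and backslashes in its input.

```shell
#!/usr/bin/env bash

# trim (hypothetical helper): strip leading and trailing whitespace
# using parameter expansion only, so quotes and backslashes pass through untouched.
trim() {
  local s="$1"
  s="${s#"${s%%[![:space:]]*}"}"   # drop leading whitespace
  s="${s%"${s##*[![:space:]]}"}"   # drop trailing whitespace
  printf '%s' "$s"
}

# Assumed input format: a comma-separated list that may contain blank entries.
ENDPOINTS_RAW=" https://a.example/health , ,https://b.example/ready "
IFS=',' read -r -a ENDPOINTS <<< "$ENDPOINTS_RAW"

checked=()
for endpoint in "${ENDPOINTS[@]}"; do
  endpoint=$(trim "$endpoint")
  # Skip entries that are empty after trimming so curl is never called without a URL.
  if [ -z "$endpoint" ]; then
    continue
  fi
  checked+=("$endpoint")
done

printf '%s\n' "${checked[@]}"
```

In the real loop, the `checked+=` line would be replaced by the curl check and metric push.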
metrics_payload="# HELP ${metric_name} Health status of web endpoints (1 = healthy, 0 = unhealthy)
# TYPE ${metric_name} gauge
${metric_name}{endpoint=\"${endpoint}\",endpoint_id=\"${endpoint_id}\",http_code=\"${http_code}\"} ${health_status}
Label values in metrics_payload are not escaped, so an endpoint containing \, " or newlines can produce invalid Prometheus exposition and be rejected by Pushgateway. Consider escaping backslashes, double-quotes and newlines for endpoint (and endpoint_id) before building the metric label string.
- metrics_payload="# HELP ${metric_name} Health status of web endpoints (1 = healthy, 0 = unhealthy)
-# TYPE ${metric_name} gauge
-${metric_name}{endpoint=\"${endpoint}\",endpoint_id=\"${endpoint_id}\",http_code=\"${http_code}\"} ${health_status}
-"
+ # escape label values per Prometheus exposition format
+ escaped_endpoint=$(printf '%s' "$endpoint" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' -e ':a;N;s/\n/\\n/;ta')
+ escaped_endpoint_id=$(printf '%s' "$endpoint_id" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' -e ':a;N;s/\n/\\n/;ta')
+ metrics_payload="# HELP ${metric_name} Health status of web endpoints (1 = healthy, 0 = unhealthy)
+# TYPE ${metric_name} gauge
+${metric_name}{endpoint=\"${escaped_endpoint}\",endpoint_id=\"${escaped_endpoint_id}\",http_code=\"${http_code}\"} ${health_status}
+"
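One caveat on the suggested `sed` pipeline: the `:a;N;...;ta` loop appears to rely on GNU sed behavior (semicolon-terminated labels, printing the pattern space when `N` hits end of input), and lines appended by `N` never pass through the earlier backslash and quote substitutions. A pure-bash alternative is sketched below. The function name `escape_label_value` is hypothetical; it escapes the same three characters the comment names (backslash, double quote, newline).

```shell
#!/usr/bin/env bash

# escape_label_value (hypothetical helper): escape a string for use as a
# Prometheus label value. Backslashes must be escaped first so that the
# backslashes introduced by the later replacements are not doubled.
escape_label_value() {
  local v="$1"
  v=${v//\\/\\\\}      # \       -> \\
  v=${v//\"/\\\"}      # "       -> \"
  v=${v//$'\n'/\\n}    # newline -> \n (two characters)
  printf '%s' "$v"
}

# Example with a value containing all three special characters.
raw=$'https://a.example/?q="x"\\y\nsecond line'
escape_label_value "$raw"
printf '\n'
```

The result could then be interpolated into `metrics_payload` in place of the raw `endpoint` and `endpoint_id` values.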
: "${PUSHGATEWAY_URL:=http://localhost:9091}"
: "${HEALTHCHECK_TIMEOUT:=10}"
HEALTHCHECK_TIMEOUT isn’t validated. If it’s non‑numeric or out of range, curl -m fails and health is misreported. Consider validating it as a positive integer within a sane range and defaulting when invalid.
+if ! [[ "$HEALTHCHECK_TIMEOUT" =~ ^[0-9]+$ ]] || [ "$HEALTHCHECK_TIMEOUT" -le 0 ] || [ "$HEALTHCHECK_TIMEOUT" -gt 300 ]; then
+ log "Invalid HEALTHCHECK_TIMEOUT: $HEALTHCHECK_TIMEOUT; using 10"
+ HEALTHCHECK_TIMEOUT=10
+fi
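A slightly stricter variant of this validation is sketched below, under the assumption that the 1 to 300 second range from the suggestion is the intended bound. The function name `sanitize_timeout` is hypothetical. Limiting the regex to at most three digits keeps very long digit strings away from the `[ ... -le ... ]` test, which reports an error rather than a plain false when the value does not fit in an integer.

```shell
#!/usr/bin/env bash

# sanitize_timeout (hypothetical helper): print the value if it is an integer
# in 1..300, otherwise print the default (second argument, 10 if omitted).
sanitize_timeout() {
  local value="$1" default="${2:-10}"
  # No leading zero and at most three digits, so the numeric comparison is always safe.
  if [[ "$value" =~ ^[1-9][0-9]{0,2}$ ]] && [ "$value" -le 300 ]; then
    printf '%s' "$value"
  else
    printf '%s' "$default"
  fi
}

# Example: fall back to the default when the environment value is unusable.
HEALTHCHECK_TIMEOUT="${HEALTHCHECK_TIMEOUT:-10}"
HEALTHCHECK_TIMEOUT=$(sanitize_timeout "$HEALTHCHECK_TIMEOUT" 10)
echo "timeout: ${HEALTHCHECK_TIMEOUT}s"
```

Unlike the suggestion, this sketch does not log when it falls back; a call to the script's `log` function could be added in the `else` branch.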
Add web health check metrics to Hypernova/webmon by installing curl, copying web_healthcheck.sh to /usr/local/bin/web_healthcheck.sh, and invoking it each loop in newrunner.sh with MONITOR_LOOP_PAUSE control
Introduce a web_healthcheck.sh script that runs curl checks on configured endpoints and pushes Prometheus metrics, add curl to the runtime image, and switch loop control to MONITOR_LOOP_PAUSE in docker/Dockerfile and xmtp_debug/newrunner.sh.
📍 Where to Start
Start with the health check flow in xmtp_debug/web_healthcheck.sh, then review its invocation inside the loop in xmtp_debug/newrunner.sh.
📊 Macroscope summarized f89c4e3. 2 files reviewed, 9 issues evaluated, 5 issues filtered, 3 comments posted
🗂️ Filtered Issues
newrunner.sh — 0 comments posted, 3 evaluated, 2 filtered
[...] XDBG_LOOP_PAUSE to MONITOR_LOOP_PAUSE (line 3) and the child-process overrides were changed accordingly at lines 22, 23, 24, 29, 33, and 37. If callers or the xdbg tool expect XDBG_LOOP_PAUSE, the new script will ignore that input and xdbg may also ignore the MONITOR_LOOP_PAUSE override, changing behavior (e.g., failing to suppress internal pauses). The visible contract for controlling pause duration and the per-command override has changed without guard or compatibility. [ Low confidence ]
xdbg invocations (lines 22–25, 29, 33, 37) and the health check bash "$(dirname "$0")/web_healthcheck.sh" (line 39) are executed without checking exit status. Failures silently proceed to subsequent steps, potentially leaving the environment partially reset or tests running against a bad state. Use guards such as set -e, || exit, or && chaining to ensure a defined terminal outcome on failure. [ Low confidence ]
web_healthcheck.sh — 3 comments posted, 6 evaluated, 3 filtered
[...] 200 as success. Healthy endpoints frequently return other 2xx codes (e.g., 204), or require following redirects (e.g., 301/302). Without -L, many endpoints will be marked unhealthy despite being reachable. Consider treating any 2xx as healthy and using curl -L when appropriate or making this configurable. [ Low confidence ]
PUSHGATEWAY_URL is not validated. If empty or malformed, push_url becomes invalid (e.g., starting with /metrics/...), and curl will fail or target a local file path. Add a guard to ensure PUSHGATEWAY_URL is a non-empty valid URL before pushing metrics. [ Low confidence ]
push_http_code extraction is fragile: grep "HTTP_CODE:" | cut -d: -f2 will return multiple lines if the response body contains HTTP_CODE: anywhere, causing ambiguous comparison and potential false negatives in success detection. Safer to read only the last line or use tail -n1 to select the appended status line. [ Low confidence ]
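The three filtered web_healthcheck.sh issues could each be addressed with a small helper. The sketch below is illustrative only: the function names are hypothetical, and it assumes the script appends the status to the push response as a final `HTTP_CODE:<code>` line, which is what the `grep "HTTP_CODE:"` extraction implies. It makes no network calls, so redirect handling with `curl -L` is not shown.

```shell
#!/usr/bin/env bash

# is_healthy_code (hypothetical): treat any 2xx status as healthy, not only 200.
is_healthy_code() {
  [[ "$1" =~ ^2[0-9][0-9]$ ]]
}

# is_valid_base_url (hypothetical): require a non-empty http(s) URL with a host,
# so an empty PUSHGATEWAY_URL cannot produce a push URL starting with /metrics/...
is_valid_base_url() {
  local re='^https?://[^[:space:]/]+'
  [[ "$1" =~ $re ]]
}

# extract_push_code (hypothetical): read the status from the LAST line only,
# so an "HTTP_CODE:" string inside the response body is ignored.
extract_push_code() {
  local response="$1" last_line
  last_line="${response##*$'\n'}"   # everything after the final newline
  case "$last_line" in
    HTTP_CODE:*) printf '%s' "${last_line#HTTP_CODE:}" ;;
    *) return 1 ;;                  # no status line found
  esac
}

# Example: the body mentions HTTP_CODE:, but only the appended line counts.
response=$'error: upstream said HTTP_CODE:500\nHTTP_CODE:202'
if code=$(extract_push_code "$response") && is_healthy_code "$code"; then
  echo "push ok ($code)"
else
  echo "push failed"
fi
```

Whether redirects should count as healthy is a policy choice; the filtered comment suggests making that configurable.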