
[Feature] App Routing Gateway API (approuting-istio) — Support Envoy access log / Telemetry / MeshConfig customization #5728

@ghjkllasdf-fake


Is your feature request related to a problem? Please describe.

On AKS managed clusters using the App Routing Gateway API (GatewayClass: approuting-istio), there is currently no supported way to enable Envoy access logs (or any other MeshConfig / Telemetry customization) on the generated ingress gateway Pods.

Customers troubleshooting HTTP 4xx/5xx against their Gateway-fronted workloads cannot obtain per-request access logs from istio-proxy, which is a baseline observability capability for any L7 ingress.

What we've tried / confirmed doesn't work on a test AKS cluster running the App Routing Istio addon (recent version):

  1. Editing the istio ConfigMap in aks-istio-system to add accessLogFile: /dev/stdout — istiod hot-reloads and access logs start flowing, but the AKS Helm reconciler (Flux azure-service-mesh-istio-discovery-helmrelease) reverts the ConfigMap in ~100 seconds consistently (observed on multiple independent runs). The Helm release secret version bumps, istiod emits a fresh XDS Push, and the access log output disappears.

  2. Adding a meshConfig / telemetry / extensionProvider key to the istio-gateway-class-defaults ConfigMap (the per-GatewayClass infra-customization ConfigMap that the AKS PG pointed us to as not being reconciled):

    Error from server: admission webhook
      "istio-kube-gateway-class-configmap-webhook.azmk8s.io" denied the request:
      ConfigMap aks-istio-system/istio-gateway-class-defaults is invalid for
      Istio Kube Gateway Class: ConfigMap validation failed:
      disallowed key "meshConfig"
    

    Empirical whitelist via admission webhook (managed-gateway-api-ccp-validating-webhook):

    | Key | Upstream Istio | AKS App Routing |
    | --- | --- | --- |
    | horizontalPodAutoscaler | allowed | ✅ allowed |
    | podDisruptionBudget | allowed | ✅ allowed |
    | deployment | allowed | ✅ allowed |
    | service | allowed | ✅ allowed |
    | serviceAccount | allowed | ❌ disallowed |
    | meshConfig | n/a (not an upstream infra key) | ❌ disallowed |
    | telemetry | n/a | ❌ disallowed |
    | extensionProvider | n/a | ❌ disallowed |
    | proxyConfig | n/a | ❌ disallowed |
    | envoyFilter | n/a | ❌ disallowed |

  3. Telemetry API (telemetry.istio.io/v1) — CRD is not installed in the App Routing Istio addon, so kubectl apply fails with no matches for kind "Telemetry".

  4. Switching to Istio Service Mesh Add-on (az aks mesh enable) is the only currently-supported path, per feedback from the App Routing PG:

    App Routing Istio is not designed to support any Istio-specific functionality. It's intentionally a lightweight control plane designed to only support Gateway API based functionality.
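
For anyone reproducing attempts 1 and 3 above, the manifests involved look roughly like the following. This is a sketch rather than a dump from the test cluster. It assumes the AKS-managed `istio` ConfigMap keeps its mesh settings under the `mesh` data key, as upstream Istio does. The Telemetry name `gateway-access-logs` and its namespace are placeholders, and `envoy` is the upstream built-in access log provider.

```yaml
# Attempt 1: fragment of the istio ConfigMap after the manual edit.
# Only the added setting is shown; do not apply this as a full replacement.
# Assumption: mesh settings live under data.mesh, as in upstream Istio.
# Observed result: the setting is reverted by the Helm reconciler after ~100s.
apiVersion: v1
kind: ConfigMap
metadata:
  name: istio
  namespace: aks-istio-system
data:
  mesh: |-
    accessLogFile: /dev/stdout
---
# Attempt 3: a standard upstream Telemetry resource.
# Observed result: rejected at apply time because the Telemetry CRD is not installed.
apiVersion: telemetry.istio.io/v1
kind: Telemetry
metadata:
  name: gateway-access-logs     # hypothetical name
  namespace: aks-istio-system   # assumed namespace
spec:
  accessLogging:
    - providers:
        - name: envoy           # upstream built-in provider
```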

This is a significant migration for customers who have already adopted App Routing Gateway API, and forces them off a first-class AKS feature just to get access logs — which most L7 proxies expose by default.

Describe the solution you'd like

Extend the App Routing Gateway API to natively support the minimum set of Envoy/mesh configurability that real-world production ingress workloads need. In rough priority order:

  1. Access log configuration — either:

    • Allow a meshConfig key in istio-gateway-class-defaults / per-Gateway parametersRef ConfigMap. Scope could be limited to the access-log-related subset (accessLogFile, accessLogFormat, accessLogEncoding, extensionProviders) to keep blast radius small. Example:
      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: istio-gateway-class-defaults
        namespace: aks-istio-system
        labels:
          gateway.istio.io/defaults-for-class: approuting-istio
      data:
        meshConfig: |
          accessLogFile: /dev/stdout
          accessLogEncoding: TEXT
    • OR install the Telemetry CRD and the minimal controller pieces so customers can use the standard upstream telemetry.istio.io/v1 resource.
  2. Per-GatewayClass default Envoy log level / log format surfaced through the existing ConfigMap pattern.

  3. (Stretch) Allow a curated subset of extensionProviders for OTLP / Prometheus so customers can ship access logs and metrics to their own collectors without bringing up a full service mesh.
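
To make the Telemetry-CRD option and the stretch goal concrete, here is a sketch of what a customer might write if both were supported. None of this works on App Routing Istio today: the `meshConfig` key is the proposed (currently rejected) extension, and the Telemetry resource needs the CRD that is not installed. The shapes follow upstream Istio (`envoyOtelAls` extension provider, `accessLogging.filter.expression`). The provider name `otel-access-logs`, the collector address, the Gateway name `my-gateway`, and the `default` namespace are all placeholders.

```yaml
# Hypothetical: a curated extensionProviders subset in the class-defaults ConfigMap.
apiVersion: v1
kind: ConfigMap
metadata:
  name: istio-gateway-class-defaults
  namespace: aks-istio-system
  labels:
    gateway.istio.io/defaults-for-class: approuting-istio
data:
  # The inner document mirrors upstream MeshConfig.extensionProviders.
  meshConfig: |
    extensionProviders:
      - name: otel-access-logs
        envoyOtelAls:
          service: otel-collector.observability.svc.cluster.local
          port: 4317
---
# Hypothetical: upstream Telemetry resource scoped to one Gateway,
# emitting access logs only for 4xx/5xx responses.
apiVersion: telemetry.istio.io/v1
kind: Telemetry
metadata:
  name: ingress-error-logs
  namespace: default            # must match the Gateway's namespace
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: Gateway
      name: my-gateway
  accessLogging:
    - providers:
        - name: otel-access-logs
      filter:
        expression: "response.code >= 400"
```

Swapping the provider name for the upstream built-in `envoy` provider would send the same filtered logs to the gateway container's stdout, with no extension provider needed.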

The existing istio-gateway-class-defaults ConfigMap is the most natural extension point since AKS already exempts it from Helm reconciliation — this was confirmed by modifying horizontalPodAutoscaler.maxReplicas from 5 to 7 and verifying the value survived past the normal ~100s reconcile window with no revert.
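
For reference, the change used for that check would look roughly like this. It is a sketch assuming the ConfigMap follows the upstream Istio convention, where each allowed key holds a YAML patch for the generated resource. The label is taken from the proposal above and is assumed to be present on the existing ConfigMap.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: istio-gateway-class-defaults
  namespace: aks-istio-system
  labels:
    gateway.istio.io/defaults-for-class: approuting-istio
data:
  # Patch merged into the HorizontalPodAutoscaler generated for each Gateway.
  # maxReplicas was changed from 5 to 7 and survived the ~100s reconcile window.
  horizontalPodAutoscaler: |
    spec:
      maxReplicas: 7
```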

Describe alternatives you've considered

  • Using Istio Service Mesh Add-on — works (Telemetry API + istio-shared-configmap-asm-1-xx), but requires: enabling the full mesh control plane, migrating from GatewayClass: approuting-istio to istio, accepting sidecar injection semantics, and re-testing all existing Gateway / HTTPRoute manifests against a different Istio revision. High migration cost for customers whose only unmet requirement is access logs.
  • Directly editing the istio ConfigMap — reverted by Helm every ~100s, not supportable.
  • Editing istio-gateway-class-defaults with meshConfig — blocked by the validating admission webhook as shown above.
  • Attaching a custom EnvoyFilter to the generated Deployment: the EnvoyFilter CRD isn't installed in App Routing Istio.
  • Agent-based log shipping (for example, a DaemonSet tailing the gateway container logs): requires access logs to already be emitted, which is exactly the gap.

Additional context
