Authenticating Consumers

Introduction

Envoy AI Gateway authenticates every inference request at the edge and propagates the caller's identity to downstream policies. Authentication is configured with the Envoy Gateway SecurityPolicy resource, which attaches to the HTTPRoute generated by an AIGatewayRoute. After the caller is identified, selected claims are copied into request headers that token quotas and usage metering consume as the per-tenant key.

Authentication turns a per-consumer credential, such as an SSO (Single Sign-On) token or an API key, into an identity that quota and metering policies can act on. It is the foundation of multi-tenant model serving on a shared gateway.

Use Cases

  • A developer obtains a JWT (JSON Web Token) from the platform identity provider and calls the gateway with it, so the gateway enforces per-user token quotas.
  • A CI job presents a service-account token so that automated traffic is attributed to a team rather than an individual.
  • A machine consumer that cannot run an interactive login presents a static API key that maps to a known tenant.

Prerequisites

  1. Envoy AI Gateway is installed. See Install Envoy AI Gateway.
  2. An AIGatewayRoute already routes requests to one or more backends.
  3. For the OIDC/JWT path: an OIDC issuer with a reachable JWKS endpoint. The platform's built-in identity provider, Dex, is the default; any other OIDC issuer (Keycloak, Auth0, Okta, GitHub OIDC, an enterprise Entra ID tenant) also works as long as the gateway can reach its /.well-known/openid-configuration and JWKS URL.
  4. For the API-key path: cluster permission to create Secret objects in the gateway's namespace.
NOTE

Create the Gateway and AIGatewayRoute in a dedicated namespace (for example maas-system), not in the Envoy Gateway control-plane namespace envoy-gateway-system. A gateway placed in the control-plane namespace may not have the AI Gateway request-processing filter and SecurityPolicy applied to its listener, which silently breaks routing and policy enforcement. See Envoy AI Gateway.

Steps

Authenticate with OIDC or JWT

Validate tokens issued by an OIDC issuer. The platform's built-in Dex is the default issuer. Dex can also broker external identity sources, such as LDAP or another OIDC provider, so that their users obtain platform tokens. Those connectors are configured in platform IdP (Identity Provider) management; see Identity Providers.

Any OIDC issuer with a reachable JWKS endpoint can be used. When consumers are not platform users (for example, they authenticate against an enterprise Keycloak realm or a SaaS IdP), replace issuer and remoteJWKS.uri below with the values published by that issuer, so the gateway accepts their tokens without requiring a platform account.
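Before editing the policy, it can help to confirm that the issuer is reachable and to read the JWKS URL from its discovery document. The following is a minimal shell sketch, not part of the product: the function name oidc_jwks_uri is hypothetical, it assumes curl and sed are available, and it assumes the discovery JSON keeps "jwks_uri" and its value on the same line. Run it from a host or pod that shares the gateway's network path.

```shell
# Hypothetical helper: print the jwks_uri advertised by an OIDC issuer.
# $1 = issuer URL without a trailing slash, e.g. https://idp.example.com/dex
oidc_jwks_uri() {
  # -f makes curl emit nothing on HTTP errors, so a bad issuer prints an empty result
  curl -fsS "$1/.well-known/openid-configuration" |
    sed -n 's/.*"jwks_uri"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}
```

The printed URL is the value to place in remoteJWKS.uri. An empty result suggests the discovery endpoint is unreachable from where the command ran.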

Point the gateway at the OIDC issuer and map its claims to identity headers:

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: SecurityPolicy
metadata:
  name: maas-oidc-auth
  namespace: <your-namespace>
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      name: <aigatewayroute-name>  # HTTPRoute generated by your AIGatewayRoute
  jwt:
    providers:
      - name: platform-idp
        issuer: https://<platform-address>/dex
        audiences:
          - <gateway-client-id>   # reject tokens minted for other clients of the same issuer
        remoteJWKS:
          uri: https://<platform-address>/dex/keys
        claimToHeaders:
          - claim: sub      # caller identity, used as the per-user quota and metering key
            header: x-user-id
          - claim: groups    # single-valued group/department claim (see note: array claims are unsupported)
            header: x-user-group
          - claim: namespace  # custom scalar claim for per-namespace chargeback; absent by default (see note)
            header: x-user-namespace
          - claim: email
            header: x-user-email
  • <platform-address>: the platform access address. Dex publishes its issuer at /dex and its JWKS at /dex/keys.
  • <aigatewayroute-name>: the name of the HTTPRoute generated by your AIGatewayRoute.
  • audiences: the token audience(s) the gateway accepts. On a shared IdP, omitting this accepts any valid token from the same issuer — including tokens minted for other clients — which would still resolve to an x-user-id and consume quota. Set it to the client ID(s) the gateway's tokens are issued for; if your issuer does not set a distinguishing aud, register a dedicated client for the gateway.
  • claimToHeaders: the bridge between identity and policy. The emitted headers (x-user-id, x-user-group, x-user-namespace) become the selector keys for token quotas and the label values for usage metering and chargeback.
NOTE

claimToHeaders only supports scalar claims (string, int, double, bool); array-typed claims are not supported and will not populate the header. The standard OIDC groups claim is usually an array. To use a group or department as x-user-group, expose a single-valued claim from the IdP connector (for example, a primary-group claim or a dedicated department claim) and map that claim instead. If x-user-group stays empty, per-department metering, quotas, and tiers silently fall back to no grouping.

namespace is not a standard OIDC claim: it must be added in the upstream IdP connector and is absent by default. x-user-namespace is the per-namespace chargeback key consumed by Metering Token Usage; map it only when billing by namespace or tenant. To key policies on any other attribute the platform does not emit by default, such as a subscription tier, add the claim in the connector and map it with an extra claimToHeaders entry.
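Because array claims do not populate headers, it is worth checking what a real token contains before mapping it. The sketch below decodes the payload segment of a JWT locally so the types of sub, aud, groups, and any custom claim can be inspected. It is illustrative only: jwt_payload is a hypothetical helper name, it assumes a base64 command that accepts -d, and it does not verify the signature, so it must never be used for an authorization decision.

```shell
# Hypothetical helper: print the JSON payload of a JWT (no signature check).
# $1 = the compact token, header.payload.signature
jwt_payload() {
  # Take the second dot-separated segment and map base64url to standard base64
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url drops padding; restore it so base64 -d accepts the input
  case $(( ${#seg} % 4 )) in
    2) seg="${seg}==" ;;
    3) seg="${seg}=" ;;
  esac
  printf '%s' "$seg" | base64 -d
}
```

In the printed JSON, a groups value that starts with [ is an array and needs a single-valued replacement claim before it can feed x-user-group.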

TIP

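To roll out without blocking traffic, set jwt.optional: true first and observe. While it is set, requests without a token pass through unauthenticated and carry no identity headers, but requests with an invalid token are still rejected. Remove the setting once all consumers present valid tokens.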
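One way to toggle the setting without re-applying the whole manifest is a merge patch. This is a sketch under assumptions: the function name set_jwt_optional is hypothetical, and the caller is assumed to have permission to patch SecurityPolicy objects in the namespace.

```shell
# Hypothetical helper: switch spec.jwt.optional on or off for a SecurityPolicy.
# $1 = namespace, $2 = policy name, $3 = true|false
set_jwt_optional() {
  case $3 in
    true|false) ;;
    *) echo "third argument must be true or false" >&2; return 2 ;;
  esac
  # A JSON merge patch changes only the named field and leaves providers untouched
  kubectl -n "$1" patch securitypolicy "$2" --type merge \
    -p "{\"spec\":{\"jwt\":{\"optional\":$3}}}"
}
```

Calling it with true starts the observation phase for a policy such as maas-oidc-auth. Calling it with false afterwards restores strict enforcement, which has the same effect as removing the field.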

Authenticate with an API key

NOTE

If the Gateway is created in the Envoy Gateway control-plane namespace envoy-gateway-system, apiKeyAuth (like model routing) is silently not enforced: the SecurityPolicy reports Accepted=True, but a wrong or missing key still returns 200 and no x-user-id is injected. This is the same control-plane listener-skip issue described in the prerequisites note above — not an Envoy Gateway version bug. The fix is to create the Gateway and AIGatewayRoute in a dedicated namespace (for example maas-system), where apiKeyAuth enforces natively with no patch. Verify with a single no-key request: a dedicated-namespace gateway returns 401. If the gateway must stay in envoy-gateway-system, see the supported remedies at the end of this section.

For machine consumers that cannot perform an OIDC flow, validate a static API key instead. There is no issuance service: the cluster administrator generates a random string per consumer, stores it in a Secret, and shares it out of band. The gateway's data plane validates each request by looking the presented value up in that Secret.

Generate one key per consumer and store them in a single Opaque Secret. Each data-map key is the client identifier that downstream policies see; each value is the API key the consumer presents:

kubectl -n <your-namespace> create secret generic maas-api-keys \
  --from-literal=alice="$(openssl rand -hex 32)" \
  --from-literal=ci-runner="$(openssl rand -hex 32)"

Bind the Secret to the route with a SecurityPolicy:

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: SecurityPolicy
metadata:
  name: maas-apikey-auth
  namespace: <your-namespace>
spec:
  targetRefs:
    - group: gateway.networking.k8s.io
      kind: HTTPRoute
      name: <aigatewayroute-name>  # HTTPRoute generated by your AIGatewayRoute
  apiKeyAuth:
    credentialRefs:
      - name: maas-api-keys     # Secret whose data keys are the client identifiers
    extractFrom:
      - headers:
          - X-API-Key           # dedicated header avoids the "Bearer " prefix problem of Authorization
    forwardClientIDHeader: x-user-id  # matched client identifier is injected as this header for downstream policies
    sanitize: true              # strip the raw API key from the request before it reaches the model backend
  • credentialRefs: one or more Opaque Secrets holding the credentials. Each data-map key is the client identifier, each value is the literal API key. Adding a consumer is a kubectl patch of one entry; revoking is a single key deletion.
  • extractFrom: where Envoy reads the presented key from. The filter performs a literal string comparison, so prefer a dedicated header such as X-API-Key. Reusing Authorization requires storing each value with its Bearer prefix, which conflicts with the OIDC path on the same gateway.
  • forwardClientIDHeader: the header that carries the matched client identifier to the upstream and to later filters. Use the same name as the OIDC claimToHeaders target (x-user-id) so token quotas and usage metering see one consistent key across both auth paths.
  • sanitize: prevents the raw API key from leaking to the model backend or being logged downstream.
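Adding and revoking consumers can be scripted as two small patches against the Secret. The helpers below are a sketch, not a shipped tool: the names add_api_key and revoke_api_key are hypothetical, and the client identifier is assumed to contain only characters that need no JSON or JSON Pointer escaping (letters, digits, ".", "-", "_").

```shell
# Hypothetical helpers for a Secret whose data keys are client identifiers.
# $1 = namespace, $2 = Secret name, $3 = client identifier

# Add (or rotate) one consumer; prints the new key once for out-of-band sharing.
add_api_key() {
  key=$(openssl rand -hex 32) || return 1
  # stringData lets the API server handle the base64 encoding
  kubectl -n "$1" patch secret "$2" --type merge \
    -p "{\"stringData\":{\"$3\":\"$key\"}}" >/dev/null || return 1
  printf '%s\n' "$key"
}

# Revoke one consumer by removing its entry from the Secret's data map.
revoke_api_key() {
  kubectl -n "$1" patch secret "$2" --type json \
    -p "[{\"op\":\"remove\",\"path\":\"/data/$3\"}]"
}
```

Keeping the key out of shell history and logs is left to the caller. After a revocation, a request with the old key is a quick way to confirm the change has taken effect.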

If the gateway must stay in envoy-gateway-system, apiKeyAuth enforcement is defeated by the same control-plane listener-skip described above, and no per-route SecurityPolicy change fixes it. The supported remedies are to move the Gateway and AIGatewayRoute to a dedicated namespace (preferred), or to upgrade Envoy AI Gateway to a release that narrows the listener-skip (v0.6.0 / Alauda release-0.6.0-alauda). A hand-rolled EnvoyPatchPolicy that edits the listener filter chain may serve as a temporary stopgap, but it is version-specific and fragile — it depends on the exact filter layout of the running Envoy build — and is not recommended for production.

Verification

Confirm the policy is accepted. SecurityPolicy status is ancestor-scoped, so the jsonpath looks one level deeper than for most resources:

kubectl get securitypolicy <policy-name> -n <your-namespace> \
  -o jsonpath='{.status.ancestors[*].conditions[?(@.type=="Accepted")].status}'

The command returns True when the policy is programmed.

For the OIDC path, send a request with a valid token and confirm the upstream service receives the x-user-id and x-user-email headers. x-user-group and x-user-namespace appear only if the token carries the corresponding single-valued claims.

For the API-key path, send the matching X-API-Key and confirm the upstream sees x-user-id set to the matched client identifier:

curl -sS -H "X-API-Key: <alice-key>" \
  -H "Content-Type: application/json" \
  https://<gateway-host>/v1/chat/completions \
  -d '{"model":"<model>","messages":[{"role":"user","content":"ping"}]}'

A wrong or missing key returns 401 Unauthorized from the gateway before the request reaches any backend.
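The negative case is easy to script. The sketch below prints only the HTTP status code of a chat-completions request, so a no-key probe can be compared against 401. probe_status is a hypothetical helper name, and the request body mirrors the example above.

```shell
# Hypothetical helper: print the HTTP status code of a test request.
# $1 = gateway base URL, $2 = model name, remaining args = extra curl options
probe_status() {
  url=$1; model=$2; shift 2
  # -o /dev/null discards the body; -w prints just the status code
  curl -sS -o /dev/null -w '%{http_code}' "$@" \
    -H 'Content-Type: application/json' \
    -d "{\"model\":\"$model\",\"messages\":[{\"role\":\"user\",\"content\":\"ping\"}]}" \
    "$url/v1/chat/completions"
}
```

Called with only the gateway URL and a model name, it is expected to print 401 on a correctly enforcing gateway. Passing -H "X-API-Key: <alice-key>" as an extra argument exercises the positive case. A 200 on the no-key probe points to the control-plane namespace issue described earlier.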

Learn More

Next Steps

After identity headers are propagated, see Configuring Token Quotas to enforce per-tenant token budgets.