Enterprise SSO

This page is for IT admins standing up Parleq’s LLM cleanup access behind one corporate OIDC sign-in instead of handing each user an API key. The user signs in once with their company identity provider; Parleq federates that sign-in into your cloud — AWS Bedrock and/or Google Cloud Vertex AI — and mints short-lived per-cloud credentials at call time. No long-lived cloud secret is stored on the device.

Setting this up at home without an IT department? The same engine works for a single user on a free identity provider (Cognito) or a personal Google account, though the pitfalls differ. See the DIY: SSO & Sign-in with Google guide for help choosing a path and for the gotchas that typically cost an hour.

What you get

One engine, several legs. The user runs a single OIDC sign-in (PKCE); Parleq trades the resulting token for short-lived cloud credentials on demand. The properties that matter for a security review:

No per-user API keys. Nothing to distribute, store, or rotate per employee — identity comes from your IdP.
Short-lived cloud credentials, memory only. Only the refresh token touches the Keychain; the cloud credentials live in process memory and expire on their own.
Fail-closed. If any leg fails, Parleq pastes the raw on-device transcript rather than falling back to a personal credential — the cleanup pass is skipped, the dictation is not lost.
A connection doctor. Settings → Company Account runs token-free discovery plus a silent refresh and reports which hop (IdP / token exchange / provider) last succeeded or failed — the place to confirm a new setup end-to-end.
Per-user audit attribution. The AWS leg carries the signed-in user’s email as the role session name, so CloudTrail attributes each call to the person.
A revocation caveat. Offboarding (disabling the user at the IdP) takes effect at the next refresh, but already-issued STS credentials are not revocable and live to their session maximum — keep the AWS session duration short if you need tight offboarding.

How it works

Parleq runs an OIDC authorization-code flow with PKCE in a system web view (no embedded browser, no password ever touches Parleq). The flow yields a refresh token, which Parleq stores in the macOS Keychain and uses to mint short-lived ID tokens on demand. Each cloud leg then trades that ID token for its own credentials:

AWS Bedrock — AssumeRoleWithWebIdentity exchanges the ID token for temporary STS credentials scoped to an IAM role you control.
Google Cloud Vertex AI — Workforce Identity Federation exchanges the ID token for a federated access token used directly as the Vertex bearer.

Two principles bound the data handling:

What crosses which boundary. Only OIDC tokens reach the identity provider and the cloud STS endpoints (sts.<region>.amazonaws.com, sts.googleapis.com) — never transcript text. Transcript text only ever reaches the cleanup LLM endpoint you already configured, exactly as with any other auth mode.
Fail-closed. If sign-in, refresh, or the per-cloud exchange fails, Parleq does not silently fall back to a personal credential or a different provider. It fails closed and pastes the raw on-device ASR transcript — the cleanup pass is skipped, but your dictation is never lost. When the failure is just an expired session, the overlay’s sign-in notice is tappable — sign in right there and Parleq offers to re-clean the transcript you just dictated, no trip to Settings required.
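
The fail-closed rule can be pictured as a single guard around the cleanup pass. This is an illustrative sketch only: `finalize_transcript`, `get_credentials`, and `cleanup` are hypothetical names, not Parleq internals.

```python
from typing import Callable


def finalize_transcript(
    raw_transcript: str,
    get_credentials: Callable[[], str],
    cleanup: Callable[[str, str], str],
) -> str:
    """Return the cleaned transcript, or the raw one if any hop fails."""
    try:
        # Covers sign-in state, token refresh, and the per-cloud exchange.
        credentials = get_credentials()
        return cleanup(raw_transcript, credentials)
    except Exception:
        # Fail closed: no personal-credential fallback, no alternate provider.
        # The dictation itself is preserved by returning the raw text.
        return raw_transcript
```

The point of the shape is that there is exactly one fallback, and it never involves a different credential.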

Refresh tokens are rotated where the IdP supports it: Parleq persists each newly-issued refresh token to the Keychain the instant it arrives, before doing anything else, so a rotated token is never lost. The access and ID tokens stay in process memory only. Offboarding (disabling the user at the IdP) takes effect at Parleq’s next token refresh; note that already-issued STS credentials are not revocable and live to their session maximum, so keep the AWS session duration short if you need tight offboarding (see the managed-configuration reference and security review).
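
The persist-first ordering matters because a rotated refresh token that is dropped cannot be recovered. A minimal sketch of that ordering, assuming a hypothetical `InMemoryTokenStore` standing in for the Keychain and a caller-supplied `call_token_endpoint` function:

```python
from typing import Callable, Dict, Optional


class InMemoryTokenStore:
    """Hypothetical stand-in for the Keychain; holds only the refresh token."""

    def __init__(self, refresh_token: str) -> None:
        self.refresh_token = refresh_token


def refresh_session(
    store: InMemoryTokenStore,
    call_token_endpoint: Callable[[str], Dict[str, str]],
) -> Dict[str, Optional[str]]:
    """Redeem the stored refresh token and persist any rotated one first."""
    response = call_token_endpoint(store.refresh_token)
    rotated = response.get("refresh_token")
    if rotated:
        # Persist before doing anything else, so a crash after this point
        # cannot strand the session on an already-consumed token.
        store.refresh_token = rotated
    # Access and ID tokens are handed back to the caller and never stored.
    return {
        "id_token": response.get("id_token"),
        "access_token": response.get("access_token"),
    }
```

If the IdP does not rotate, the response carries no new refresh token and the stored one is left untouched.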

Azure OpenAI note: OIDC federation for Azure is not part of this release. The two federated legs today are AWS Bedrock and Google Cloud Vertex AI; the Azure OpenAI provider keeps its existing Microsoft Entra ID (via az login) and API-key auth modes.

Enabling it in Parleq

Set the shared OIDC section, then turn on whichever cloud leg(s) you need. All of these can be pinned fleet-wide via MDM (see the Enterprise OIDC federation keys). To set them by hand instead, edit ~/.parleq/config.json directly:

{
  "oidc": {
    "issuer": "https://acme.okta.com",
    "client_id": "0oaEXAMPLEclientid",
    "scopes": ["openid", "profile", "email", "offline_access"],
    "ephemeral_browser": false
  },
  "aws": {
    "region": "us-east-1",
    "auth_mode": "oidc",
    "role_arn": "arn:aws:iam::123456789012:role/ParleqBedrock",
    "session_duration_seconds": 3600
  },
  "vertex": {
    "project": "my-gcp-project",
    "region": "us-central1",
    "auth_mode": "oidcFederation",
    "workforce_provider": "locations/global/workforcePools/parleq-pool/providers/oidc"
  }
}

Then open Settings → Company Account and sign in. That section shows the signed-in identity and a connection doctor that runs token-free discovery plus a silent refresh and reports each cloud leg’s last success/failure — the place to confirm a new IdP or cloud setup end-to-end.
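
Before signing in, a hand-edited config can be sanity-checked offline. The sketch below is an assumption-laden helper, not part of Parleq: `check_config` is a hypothetical name, and the rules it applies are only the ones stated on this page (HTTPS issuer for the cloud legs, a role ARN for the AWS leg, a workforce provider and project for the Vertex leg).

```python
import json
from typing import Any, Dict, List


def check_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems: List[str] = []
    oidc = config.get("oidc", {})
    if not str(oidc.get("issuer", "")).startswith("https://"):
        problems.append("oidc.issuer should be an https:// URL for the cloud legs")
    if not oidc.get("client_id"):
        problems.append("oidc.client_id is missing")
    if "openid" not in oidc.get("scopes", []):
        problems.append("oidc.scopes must include 'openid'")

    aws = config.get("aws", {})
    if aws.get("auth_mode") == "oidc":
        arn = str(aws.get("role_arn", ""))
        if not (arn.startswith("arn:") and ":role/" in arn):
            problems.append("aws.role_arn must be an IAM role ARN")

    vertex = config.get("vertex", {})
    if vertex.get("auth_mode") == "oidcFederation":
        provider = str(vertex.get("workforce_provider", ""))
        if not provider.startswith("locations/global/workforcePools/"):
            problems.append("vertex.workforce_provider must be the full resource name")
    if vertex.get("auth_mode") in ("oidcFederation", "googleOAuth"):
        if not vertex.get("project"):
            problems.append("vertex.project is required as the billing/quota project")
    return problems


if __name__ == "__main__":
    import os

    path = os.path.expanduser("~/.parleq/config.json")
    with open(path, encoding="utf-8") as handle:
        for problem in check_config(json.load(handle)):
            print("WARN:", problem)
```

This catches only shape errors; the connection doctor remains the place to confirm that the IdP and cloud legs actually work.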

Identity-provider playbooks

Okta

• Application type: Native. Grant types: Authorization Code and Refresh Token. PKCE: required.
• Sign-in redirect URI: parleq-auth://oidc/callback.
• Scopes: openid profile email offline_access (offline_access is what gets you a refresh token).
• Issuer: your org / custom authorization-server URL (e.g. https://acme.okta.com). Client ID: the app’s client ID.

Refresh rotation: Okta rotates refresh tokens and runs replay detection — reusing an old refresh token after rotation revokes the session. Parleq handles this by persisting the rotated token on receipt, but it means a lost rotated token forces a fresh sign-in. Because Okta is a public issuer with a publicly-fetchable JWKS, it works for both the AWS and GCP legs.

Microsoft Entra ID

Entra requires the loopback redirect. Unlike Okta/Cognito (custom scheme) or Google (either), Entra rejects custom-scheme redirects for apps that implement OAuth directly, so Parleq’s loopback sign-in (default browser + a transient 127.0.0.1 listener) is mandatory here, not optional. Register the redirect as a public-client (“Mobile and desktop applications”) redirect; the portal’s “Web” platform is the wrong type. If you edit the app manifest instead, the property is publicClient.redirectUris in the Microsoft Graph manifest format and replyUrlsWithType (type InstalledClient) in the older Azure AD Graph format. The Entra ID → AWS Bedrock leg below is live-validated; the GCP Vertex leg uses the same workforce-federation shape as the validated Okta leg (confirm the oid vs sub principal claim against your tenant).

Register Parleq under App registrations → New registration. Pick single-tenant (“Accounts in this organizational directory only”) for a corporate deployment — it pins the token issuer to your tenant GUID and is the form both cloud legs federate against most cleanly:

• Redirect URI: add a Mobile and desktop applications platform (not the iOS/macOS platform — that one is MSAL-only and forces the msauth.<bundle-id>://auth scheme, which Parleq’s own OIDC engine doesn’t emit). Entra accepts only https and loopback-http redirects for an app implementing OAuth directly — a custom scheme like parleq-auth://oidc/callback is rejected. Register http://127.0.0.1/oidc/callback — the path must match the redirect_uri path you set in Parleq’s config (below). Entra ignores only the port when matching loopback redirects (not the path), and Parleq always sends the configured path with a kernel-chosen ephemeral port, so http://127.0.0.1:<port>/oidc/callback matches the registered http://127.0.0.1/oidc/callback. The portal’s redirect-URI text box may refuse an http loopback value — if so, add it via the app manifest’s replyUrlsWithType (type InstalledClient).
• Public client: no client secret. Under Authentication → Advanced settings, “Allow public client flows” can stay at its default; PKCE is automatic for the auth-code flow.
• Scopes: openid profile email offline_access (offline_access is what yields a refresh token). These are delegated, user-consentable permissions; if your tenant requires admin consent, grant it once on the app registration so users aren’t prompted.
• Issuer: https://login.microsoftonline.com/<tenant-id>/v2.0 (the v2.0 endpoint — not the v1.0 sts.windows.net form). Its discovery doc lives at .../v2.0/.well-known/openid-configuration. Tokens carry an iss of exactly that form with your tenant GUID, a matching tid claim, and an aud equal to the app’s Application (client) ID. Client ID: the registration’s Application (client) ID.
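
When a leg rejects an Entra token, it helps to look at the iss, tid, and aud claims described above. The sketch below decodes a JWT payload for inspection only; it deliberately does not verify the signature, and `check_entra_claims` is a hypothetical helper name.

```python
import base64
import json
from typing import Any, Dict, List


def decode_jwt_payload(token: str) -> Dict[str, Any]:
    """Decode the payload segment of a JWT WITHOUT verifying it (debugging only)."""
    payload_b64 = token.split(".")[1]
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


def check_entra_claims(claims: Dict[str, Any], tenant_id: str, client_id: str) -> List[str]:
    """Compare iss / tid / aud against the values the cloud legs expect."""
    expected_iss = f"https://login.microsoftonline.com/{tenant_id}/v2.0"
    problems: List[str] = []
    if claims.get("iss") != expected_iss:
        problems.append(f"iss is {claims.get('iss')!r}, expected {expected_iss!r}")
    if claims.get("tid") != tenant_id:
        problems.append("tid does not match the tenant ID")
    if claims.get("aud") != client_id:
        problems.append("aud does not match the Application (client) ID")
    return problems
```

A v1.0-style issuer (the sts.windows.net form) shows up here as an iss mismatch, which is the usual sign that the wrong endpoint version was configured.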

Parleq config — note the loopback redirect_uri, which routes sign-in through the default-browser + transient-127.0.0.1-listener path instead of the custom-scheme web view:

{
  "oidc": {
    "issuer": "https://login.microsoftonline.com/<tenant-id>/v2.0",
    "client_id": "<application-client-id>",
    "scopes": ["openid", "profile", "email", "offline_access"],
    "redirect_uri": "http://127.0.0.1/oidc/callback",
    "ephemeral_browser": false
  }
}
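
The loopback matching rule described above (port ignored, path compared) can be expressed as a small predicate. This is a sketch of the rule as stated on this page, not Entra's implementation; `loopback_redirect_matches` is an illustrative name.

```python
from urllib.parse import urlsplit


def loopback_redirect_matches(registered: str, sent: str) -> bool:
    """True if `sent` matches `registered` when the port is ignored."""
    reg, got = urlsplit(registered), urlsplit(sent)
    return (
        reg.scheme == got.scheme == "http"
        and reg.hostname == got.hostname == "127.0.0.1"
        and reg.path == got.path  # the path is still compared exactly
    )


if __name__ == "__main__":
    registered = "http://127.0.0.1/oidc/callback"
    print(loopback_redirect_matches(registered, "http://127.0.0.1:51234/oidc/callback"))
    print(loopback_redirect_matches(registered, "http://127.0.0.1:51234/callback"))
```

The first call matches because only the ephemeral port differs; the second does not, because the path differs.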

Conditional access & device trust: Parleq signs in through the system browser, which shares the Safari session — so a device already enrolled / compliant in Entra carries that state into the sign-in, and conditional-access policies (MFA, compliant-device, managed-network) apply as usual without Parleq handling any of it. Refresh rotation: Entra issues rotating refresh tokens; Parleq persists each rotated token on receipt, so a normal refresh is silent and a lost rotated token forces a fresh sign-in. Because the v2.0 issuer is public with a publicly-fetchable JWKS, Entra works for both the AWS and GCP legs.

The two cloud legs reuse the same IAM-OIDC-provider / role-trust and Workforce-pool steps as the other IdPs (see AWS Bedrock setup and Google Cloud Vertex AI setup below), with the Entra-specific values noted here. For the AWS leg, the IAM OIDC provider URL is the v2.0 issuer https://login.microsoftonline.com/<tenant-id>/v2.0 and the role’s trust-policy aud condition is the Application (client) ID:

{
  "Effect": "Allow",
  "Principal": {
    "Federated": "arn:aws:iam::<account>:oidc-provider/login.microsoftonline.com/<tenant-id>/v2.0"
  },
  "Action": "sts:AssumeRoleWithWebIdentity",
  "Condition": {
    "StringEquals": {
      "login.microsoftonline.com/<tenant-id>/v2.0:aud": "<application-client-id>"
    }
  }
}

Set aws.role_arn to the role (e.g. arn:aws:iam::<account>:role/ParleqDictation) and aws.auth_mode to oidc; the permissions policy and per-region model-access notes in the AWS section apply unchanged.

For the GCP leg, create the Workforce Identity Federation OIDC provider with the same issuer and the Application (client) ID as the allowed audience. Entra v2.0 tokens federate as a web-SSO provider; map the subject from Entra’s immutable object-ID claim and request the auth-code response type:

gcloud iam workforce-pools providers create-oidc entra \
  --workforce-pool="<pool>" \
  --location="global" \
  --issuer-uri="https://login.microsoftonline.com/<tenant-id>/v2.0" \
  --client-id="<application-client-id>" \
  --attribute-mapping="google.subject=assertion.oid,google.display_name=assertion.name" \
  --web-sso-response-type="code" \
  --web-sso-assertion-claims-behavior="merge-user-info-over-id-token-claims"

Pick the federated principal claim carefully. Entra’s oid (object ID) is the stable, immutable per-user GUID within a tenant and is the recommended subject. sub is also stable, but it is pairwise: the same user gets a different sub in each app registration. Use assertion.oid unless you have a specific reason to scope identities to a single app with sub. Then put the provider’s full resource name (locations/global/workforcePools/<pool>/providers/entra) in vertex.workforce_provider, set vertex.auth_mode to oidcFederation, and follow the rest of the Vertex AI setup (role binding, x-goog-user-project).

Keycloak

Create a public client in your realm with the standard (authorization-code) flow enabled and PKCE (S256) required:

• Client authentication: off (public client). Standard flow: on. PKCE method: S256.
• Valid redirect URI: parleq-auth://oidc/callback.
• Realm refresh rotation: revokeRefreshToken on with refreshTokenMaxReuse 0 to rotate on every refresh.
• Issuer: https://<host>/realms/<realm>.

A ready-made dev rig — docker-compose Keycloak with this exact client and a test user — lives in the repo at scripts/dev/keycloak/. One caveat for end-to-end cloud testing: AWS STS fetches the issuer’s JWKS from its own side, so a localhost issuer can’t pass AssumeRoleWithWebIdentity. GCP accepts an inline-uploaded JWKS (no fetch), but requires the issuer URI itself to use HTTPS — so a plain-http localhost issuer works for neither cloud leg. For live cloud tests, use a hosted free issuer (Cognito or an Okta Integrator org) or front Keycloak with a stable-hostname HTTPS tunnel.

Generic OIDC

Any spec-compliant OIDC provider works if it offers:

• A discovery document at <issuer>/.well-known/openid-configuration with authorization_endpoint and token_endpoint (a revocation_endpoint enables clean sign-out).
• A public client (no secret) supporting authorization-code + PKCE (S256).
• The custom-scheme redirect URI parleq-auth://oidc/callback.
• Refresh tokens (request offline_access) so Parleq can refresh silently.
• An HTTPS issuer URI; the AWS leg additionally needs the JWKS publicly fetchable (GCP can take an inline-uploaded JWKS instead).
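
The checklist above can be applied to a provider's discovery document before touching Parleq. A minimal sketch, assuming the document has already been fetched and parsed into a dict; `check_discovery` is a hypothetical helper, and the field names are the standard OIDC discovery metadata names.

```python
from typing import Any, Dict, List


def discovery_url(issuer: str) -> str:
    """Build the well-known discovery URL for an issuer."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


def check_discovery(issuer: str, doc: Dict[str, Any]) -> List[str]:
    """Return problems found in a parsed discovery document."""
    problems: List[str] = []
    if not issuer.startswith("https://"):
        problems.append("issuer is not HTTPS (neither cloud leg will accept it)")
    if doc.get("issuer") != issuer:
        problems.append("discovery 'issuer' does not exactly match the configured issuer")
    for field in ("authorization_endpoint", "token_endpoint", "jwks_uri"):
        if not doc.get(field):
            problems.append(f"missing {field}")
    methods = doc.get("code_challenge_methods_supported")
    if methods is not None and "S256" not in methods:
        problems.append("PKCE S256 is not advertised")
    if not doc.get("revocation_endpoint"):
        problems.append("no revocation_endpoint (sign-out cannot revoke the token)")
    return problems
```

A missing revocation_endpoint is reported for information only; by the list above it affects clean sign-out, not sign-in.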

Federate into your cloud

Once your IdP is set up, wire it into the cloud that runs cleanup. Pick the leg you need — you can enable more than one.

AWS Bedrock setup

In IAM, register the IdP as an OpenID Connect identity provider (the issuer URL plus the client ID as an audience), then create a role whose trust policy admits tokens from that provider. AssumeRoleWithWebIdentity is the gate: the call is not signed with AWS credentials (the ID token is the only proof of identity), so the role’s trust policy is what authorizes access. Constrain it with an aud condition.

Example role trust policy (placeholder account ID and client ID):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::123456789012:oidc-provider/acme.okta.com"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "acme.okta.com:aud": "0oaEXAMPLEclientid"
        }
      }
    }
  ]
}
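
The trust policy follows mechanically from the account ID, the issuer URL, and the client ID: the provider name is the issuer with the https:// prefix removed. A small generator, offered as a sketch (`build_trust_policy` is an illustrative name), keeps the Federated ARN and the aud condition key in step:

```python
import json
from typing import Any, Dict


def build_trust_policy(account_id: str, issuer_url: str, client_id: str) -> Dict[str, Any]:
    """Build a role trust policy that admits tokens from one OIDC issuer and audience."""
    provider = issuer_url
    if provider.startswith("https://"):
        provider = provider[len("https://"):]
    provider = provider.rstrip("/")  # e.g. acme.okta.com or login.microsoftonline.com/<tenant-id>/v2.0
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {"StringEquals": {f"{provider}:aud": client_id}},
            }
        ],
    }


if __name__ == "__main__":
    policy = build_trust_policy("123456789012", "https://acme.okta.com", "0oaEXAMPLEclientid")
    print(json.dumps(policy, indent=2))
```

Generating both strings from one value avoids the common mismatch where the principal names one issuer host and the condition key names another.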

Attach a permissions policy granting Bedrock model invocation:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"],
      "Resource": "*"
    }
  ]
}

"Resource": "*" keeps the example short — scope it to the specific model or inference-profile ARNs you have approved for least privilege.

Put the role’s ARN in aws.role_arn and set aws.auth_mode to oidc. The requested STS session duration is aws.session_duration_seconds (default 3600, clamped to 900–43200). Bedrock model access is per-region, so make sure the chosen region has the model enabled. The role’s session name carries the signed-in user’s email for CloudTrail attribution.
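
To make the duration clamp and the session-name attribution concrete, here is a sketch of how the AssumeRoleWithWebIdentity request parameters could be assembled. The parameter names are those of the STS API; the helper names and the way the email is sanitized are assumptions for illustration, not Parleq's code.

```python
import re
from typing import Dict

MIN_DURATION_SECONDS = 900
MAX_DURATION_SECONDS = 43200


def clamp_duration(seconds: int) -> int:
    """Clamp a requested session duration to the 900-43200 second range."""
    return max(MIN_DURATION_SECONDS, min(MAX_DURATION_SECONDS, seconds))


def session_name_from_email(email: str) -> str:
    """Reduce an email to characters STS accepts in RoleSessionName (max 64)."""
    cleaned = re.sub(r"[^\w+=,.@-]", "-", email, flags=re.ASCII)
    return cleaned[:64]


def build_sts_params(role_arn: str, id_token: str, email: str, duration: int = 3600) -> Dict[str, str]:
    """Build the query parameters for an AssumeRoleWithWebIdentity call."""
    return {
        "Action": "AssumeRoleWithWebIdentity",
        "Version": "2011-06-15",
        "RoleArn": role_arn,
        "RoleSessionName": session_name_from_email(email),
        "WebIdentityToken": id_token,
        "DurationSeconds": str(clamp_duration(duration)),
    }
```

STS also caps the duration at the role's own maximum session duration, so a request above that limit is rejected even when it is inside the clamped range.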

Google Cloud Vertex AI setup

Create a Workforce Identity Pool and add an OIDC provider for your IdP (issuer URL + the client ID as the allowed audience). For a publicly-reachable issuer GCP fetches the JWKS automatically; for a non-public issuer you can upload the JWKS inline (jwks_json) — but note the issuer URI must use HTTPS either way. Workforce Identity Pools are an Organization-level resource — creating one requires a GCP Organization (Cloud Identity) and iam.workforcePools.create, not just project access.

• Attribute mapping: google.subject = assertion.sub.
• Grant access: bind roles/aiplatform.user to the workforce-pool principal set so federated identities can call Vertex AI. Scope the binding tighter than the whole pool if you can (e.g. by group attribute).
• Workforce provider name: put the full resource name in vertex.workforce_provider (locations/global/workforcePools/<pool>/providers/<provider>) and set vertex.auth_mode to oidcFederation.

Workforce-federated calls must specify a billing/quota project: Parleq sends your configured vertex.project as the x-goog-user-project header on every Vertex request, so that project needs the Vertex AI API enabled and the federated principal needs serviceusage.services.use on it.
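
For orientation, the Workforce Identity Federation exchange is an OAuth token-exchange request to sts.googleapis.com, with the workforce provider as the audience. The sketch below builds the form fields and the Vertex request headers; the helper names are illustrative, and passing the project through the userProject option is an assumption about how a client would supply the billing project at exchange time.

```python
import json
from typing import Dict

GCP_STS_TOKEN_URL = "https://sts.googleapis.com/v1/token"


def build_sts_exchange_form(id_token: str, workforce_provider: str, user_project: str) -> Dict[str, str]:
    """Form fields for exchanging an IdP ID token for a federated access token."""
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "audience": f"//iam.googleapis.com/{workforce_provider}",
        "scope": "https://www.googleapis.com/auth/cloud-platform",
        "requested_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "subject_token_type": "urn:ietf:params:oauth:token-type:id_token",
        "subject_token": id_token,
        # Assumption: billing/quota project supplied as a JSON-encoded option.
        "options": json.dumps({"userProject": user_project}),
    }


def vertex_headers(access_token: str, project: str) -> Dict[str, str]:
    """Headers for a Vertex request made with the federated token."""
    return {
        "Authorization": f"Bearer {access_token}",
        "x-goog-user-project": project,
    }
```

Note that the audience is the vertex.workforce_provider value prefixed with //iam.googleapis.com/, which is why the config needs the full resource name rather than just the provider ID.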

Google account → Vertex AI directly

The third leg skips federation entirely. GCP refuses to accept Google itself as a workforce IdP, so the workforce-pool path can’t do “sign in with Google, dictate with Gemini.” Instead, Parleq’s own Google sign-in requests the cloud-platform scope, and the resulting OAuth access token is a valid Vertex bearer directly — exactly what gcloud Application Default Credentials produces, done in-app, with no broker and no gcloud install. This is the googleOAuth Vertex mode.

• In the Google Cloud console, create an OAuth client ID of the Desktop app type (public, no usable secret for this flow) and set oidc.redirect_uri to http://127.0.0.1/oauth2redirect — Google’s current desktop guidance. Parleq signs in via the default browser and answers the callback on a temporary 127.0.0.1 listener (ephemeral port; present only during sign-in). Alternate: an iOS-type client also works — use its reversed-client-ID redirect (com.googleusercontent.apps.<client-id>:/oauth2redirect), which Parleq intercepts in-process with no local listener.
• Admin note (managed Macs): the loopback sign-in only activates for an http://127.0.0.1 redirect. Pin oidcRedirectURI to your custom-scheme value (e.g. parleq-auth://oidc/callback or the iOS-type reversed-client-ID scheme) to ensure Parleq never binds a local listener. The redirect URI is the complete control — there is no separate “disable loopback” key because pinning a custom-scheme redirect already prevents the listener from ever activating.
• Scopes: ["openid", "email", "https://www.googleapis.com/auth/cloud-platform"] — the cloud-platform scope is what makes the access token usable against Vertex.
• Force a refresh token: set oidc.extra_auth_params to {"access_type": "offline", "prompt": "select_account consent"}. Google only re-issues a refresh token when it shows the consent screen.
• Set vertex.auth_mode to googleOAuth and vertex.project to the GCP project that owns the Vertex AI quota. Every call carries x-goog-user-project, so the signing account needs both roles/aiplatform.user and roles/serviceusage.serviceUsageConsumer on it (a project Owner already has both).
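
Put together, the settings in the list above determine the authorization URL that opens in the browser. A sketch of how such a URL could be assembled, assuming Google's standard authorization endpoint; `build_google_auth_url` is a hypothetical helper, and the client ID, port, and state values in the usage lines are placeholders.

```python
from typing import Dict, List
from urllib.parse import urlencode

GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"


def build_google_auth_url(
    client_id: str,
    redirect_uri: str,
    scopes: List[str],
    code_challenge: str,
    state: str,
    extra_auth_params: Dict[str, str],
) -> str:
    """Assemble an authorization-code + PKCE request URL."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),  # scopes are space-delimited
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",
        "state": state,
    }
    # oidc.extra_auth_params: access_type=offline plus a consent prompt
    # is what makes Google issue a refresh token.
    params.update(extra_auth_params)
    return f"{GOOGLE_AUTH_ENDPOINT}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_google_auth_url(
        client_id="<client-id>",
        redirect_uri="http://127.0.0.1:53123/oauth2redirect",
        scopes=["openid", "email", "https://www.googleapis.com/auth/cloud-platform"],
        code_challenge="<s256-challenge>",
        state="<random-state>",
        extra_auth_params={"access_type": "offline", "prompt": "select_account consent"},
    ))
```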

For org-owned Google identities this works the same way — with no “unverified app” warning and no refresh-token cap if the OAuth client lives in your Google Cloud organization. The personal-@gmail variant, with its consent-screen trade-offs and the granular-consent gotcha, is covered in the DIY guide.

Pin it fleet-wide

Every value above can be pushed via MDM so users sign in but can’t re-point the app at a personal tenant. The nine enterprise OIDC keys — oidcIssuer, oidcClientID, oidcScopes, oidcRedirectURI, oidcExtraAuthParams, oidcEphemeralBrowserSession, awsRoleArn, awsSessionDurationSeconds, and vertexWorkforceProvider — sit alongside the existing managed-config surface. Enable a leg by also pinning awsAuthMode: oidc and/or vertexAuthMode: oidcFederation (for Workforce federation) or vertexAuthMode: googleOAuth (for the Sign-in-with-Google direct leg). See the Managed Configuration reference → Enterprise OIDC federation for the full table, and the Admin Guide for the deployment workflow.

Troubleshooting

Start at Settings → Company Account → Test connection. The connection doctor reports which hop failed (IdP discovery / token exchange / cloud provider) with the server’s reason, which usually points straight at the fix below.

| Symptom | Cause / fix |
| --- | --- |
| `tokenCacheNotFound` / SSO token error on AWS | If your IAM OIDC provider URL or the IdP issuer ends in `/#`, drop the trailing `#`; the INI parser strips it mid-value. Confirm the issuer in the doctor matches the IAM OIDC provider exactly. |
| `AssumeRoleWithWebIdentity` rejected (audience / not authorized) | The role’s trust policy `aud` condition must equal the OIDC client ID, and the `Federated` principal must name the same issuer host you registered. The STS call is unauthenticated, so the trust policy is the only gate. |
| Entra sign-in fails immediately (`AADSTS50011` redirect mismatch) | The redirect URI Parleq sends isn’t registered. Entra rejects custom schemes for an app implementing OAuth directly: set `oidc.redirect_uri` to `http://127.0.0.1/oidc/callback` and register the same value `http://127.0.0.1/oidc/callback` under the Mobile and desktop applications platform (via the manifest’s `replyUrlsWithType` if the portal text box refuses http-loopback). Entra ignores the port for loopback matching but still compares the path, so the registered path must match. |
| AWS leg fails with a JWKS / issuer error on a self-hosted IdP | STS fetches the issuer’s JWKS server-side, so a `localhost` issuer cannot pass. Front the IdP with a stable-hostname HTTPS URL (or use a hosted issuer such as Cognito). |
| GCP workforce provider create fails: “issuer must be HTTPS” | GCP can take an inline-uploaded JWKS (no fetch), but the issuer URI itself must still be HTTPS; a plain-http localhost issuer works for neither cloud leg. |
| Sign-in window flashes open and closes (`invalid_scope`) | The IdP rejected a requested scope the client doesn’t have enabled. On a Cognito app client this is usually `profile`: request only `["openid", "email"]` or enable Profile in the app client. Cognito also rejects `offline_access`. |
| Signed out shortly after signing in (refresh rejected) | A rotated refresh token was lost, or the IdP enforces replay detection. Sign in again (from Settings → Company Account, or by tapping the overlay’s sign-in notice after a dictation); for IdPs that rotate on every refresh, ensure nothing else is reusing the token. |