
Google Gemini

Google AI Studio is Google's direct, lightweight way to use the Gemini family of models. You sign in with any Google account, click a button, and get back an API key — no Google Cloud project to set up, no enterprise contract, no IAM roles to configure. A free tier is available without a credit card; how far it stretches depends on Google's current per-minute and per-day quotas (see ai.google.dev/pricing) and on how heavily you dictate. This is the path to pick when you're using Parleq for personal projects, side work, or anywhere you don't have specific compliance constraints on where AI traffic goes.

Setup

  1.

    Get an API key.

    Go to Google AI Studio → API keys, sign in with any Google account, and click Create API key. The key starts with AIza.

  2.

    Paste it into Parleq's setup wizard.

    On first launch the wizard will prompt for a provider. Pick Google Gemini, paste the key, click Continue, then Finish & Restart.

    Already past the wizard? Open Settings… → Cleanup provider, switch to Gemini, paste the key, and save. Parleq picks up the new key on the next dictation — a restart is only needed when you switch the provider itself, not when you change the key.

  3.

    Test a dictation.

    Hold right Option, say a sentence, release. The overlay should show your raw words being replaced by a punctuated, capitalized version within ~500–700 ms.
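If the test dictation fails, it can help to confirm the key works outside Parleq entirely. The sketch below makes one minimal `generateContent` call using only the Python standard library; the endpoint and payload shape follow Google's public Gemini REST API, and the model name mirrors Parleq's default (if the API shape drifts, check ai.google.dev for the current reference):

```python
import json
import urllib.request

# Public Gemini REST endpoint; gemini-2.5-flash is Parleq's default model.
ENDPOINT = "https://generativelanguage.googleapis.com/v1beta/models/{model}:generateContent"


def build_request(api_key: str, text: str, model: str = "gemini-2.5-flash"):
    """Return (url, body, headers) for a minimal generateContent call."""
    url = ENDPOINT.format(model=model)
    payload = {"contents": [{"parts": [{"text": text}]}]}
    headers = {"Content-Type": "application/json", "x-goog-api-key": api_key}
    return url, json.dumps(payload).encode("utf-8"), headers


def check_key(api_key: str) -> str:
    """Send one tiny prompt; raises urllib.error.HTTPError on 403/429."""
    url, body, headers = build_request(api_key, "say ok")
    req = urllib.request.Request(url, data=body, headers=headers)
    with urllib.request.urlopen(req) as resp:
        data = json.loads(resp.read())
    return data["candidates"][0]["content"]["parts"][0]["text"]
```

Call `check_key("AIza…")` with your key: a 403 here means the key itself is bad, independent of anything Parleq is doing.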

Choosing a model

gemini-2.5-flash is the default and is the right choice for almost every dictation use case. Cleanup is a low-stakes, low-token task — Flash matches Pro for it, at a fraction of the latency.

To pick a different model, open Settings… → LLM. The Model dropdown carries the recommended Gemini variants — Flash (default, balanced), Flash-Lite (faster + cheaper, occasionally drops detail), and Pro (overkill for cleanup but available for long, technical transcripts where accuracy matters more than latency). Pick Custom (enter below) to type any other Gemini model identifier, e.g. an experimental preview model. Changes take effect on the next dictation; no restart needed.

Power-user note: every Settings field maps to a key in ~/.parleq/config.json and can also be edited there directly. Settings is the recommended path for most users.
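For orientation, a provider section in that file might look like the fragment below. The key names here are illustrative, not Parleq's actual schema — Settings remains the source of truth for what fields exist:

```json
{
  "provider": "gemini",
  "api_key": "AIza...",
  "model": "gemini-2.5-flash"
}
```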

Pricing & quotas

Google's free tier on AI Studio API keys has per-minute and per-day quotas that change over time and vary by model — check ai.google.dev/pricing for current numbers. Cleanup prompts are short (a few hundred input tokens, a few hundred output tokens), so light personal dictation tends to fit comfortably inside the free quotas. Heavy or back-to-back use can hit the per-minute cap, and there's a daily ceiling either way.

If you do exceed the free tier, AI Studio prompts you to enable billing on your Google Cloud project. Pricing is per-token and (as of 2026) typically lands at fractions of a cent per dictation on Flash, but rates and quotas are Google's to set — always check the pricing page directly for what applies to your account.
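To see why paid usage lands at fractions of a cent, here is the back-of-envelope arithmetic. The token counts come from the "few hundred tokens each way" figure above; the per-million-token rates are placeholders — substitute current numbers from ai.google.dev/pricing:

```python
# Rough cost per dictation on a paid AI Studio key.
input_tokens = 300          # typical cleanup prompt (assumption)
output_tokens = 300         # typical cleaned transcript (assumption)
price_per_m_input = 0.30    # USD per 1M input tokens -- PLACEHOLDER rate
price_per_m_output = 2.50   # USD per 1M output tokens -- PLACEHOLDER rate

cost = (input_tokens * price_per_m_input
        + output_tokens * price_per_m_output) / 1_000_000
print(f"${cost:.6f} per dictation")
print(f"${cost * 200 * 22:.2f} per month at 200 dictations/workday")
```

Even with rates several times these placeholders, heavy daily dictation stays in single-digit dollars per month on Flash.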

Compliance notes

Direct Gemini API requests go to Google's consumer AI endpoints. By default, prompts and responses on the free tier may be used to improve Google's products; paid tier traffic is excluded from training per Google's terms. If you have data-residency requirements or need formal audit logging, use Vertex AI instead — same Gemini models, on your GCP project, with regional deployment and IAM.

Audio never reaches Google. Only the text transcript output by the on-device speech model is sent.

Troubleshooting

403 Forbidden on first dictation

The key is rejected. Most common causes:

  - The key was copy-pasted with surrounding whitespace — re-paste it.
  - The key was deleted or rotated in AI Studio — generate a new one.
  - The key's project does not have the Generative Language API enabled. AI Studio enables this automatically when it creates a key, so check the Google Cloud console for the project only if you've changed defaults.

429 Too Many Requests

You've hit the free-tier quota. Either wait for the per-minute window to reset, or enable billing on the underlying Google Cloud project to lift the cap.
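Parleq handles 429s for you during normal dictation, but if you script against the same key (say, batch-cleaning old transcripts), back off and retry rather than hammering the per-minute window. A generic sketch — `RateLimitError` is a stand-in for however your HTTP client surfaces a 429:

```python
import time


class RateLimitError(Exception):
    """Stand-in for a 429 response from your HTTP client."""


def with_backoff(fn, max_tries=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on a rate limit, wait 1s, 2s, 4s, ... and retry."""
    for attempt in range(max_tries):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_tries - 1:
                raise  # out of retries; let the caller see the 429
            sleep(base_delay * (2 ** attempt))
```

The doubling delay keeps you under the per-minute cap without guessing its exact value; the `sleep` parameter exists so the logic is testable without real waits.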

Cleanup feels slower than 700 ms

Geographic distance from a Google datacenter dominates here — the API picks an endpoint based on your IP. Typical North American + European users see 500–700 ms time-to-first-token; users far from a Google region may see 1–1.5 s. Switching to gemini-2.5-flash-lite shaves 100–200 ms off.