Claude Code v2.1.108 — 4 New Environment Variables: REPL Variants, Skill Discovery, and the Limits Reckoning#
Published on April 14, 2026
Part of the Claude Code Version Tracker series. | Official Env Vars | Official Changelog
Claude Code v2.1.108 adds 4 new environment variables since v2.1.107. But the headline story of this release isn't the binary delta — it's the changelog, which reads like a direct response to the usage-limit crisis that's consumed the Claude Code community since early March. Prompt cache TTL is now user-configurable. Rate limits and plan limits get separate error messages. A warning fires at startup if caching is accidentally disabled. These aren't routine polish — they're damage control for a month in which Max 20x subscribers watched 5-hour quotas evaporate in as little as 70 minutes.
The Context: Six Weeks of Limit Pain#
To understand why v2.1.108's changelog matters, you need the backstory. Starting in early March 2026, users discovered the default prompt cache TTL had silently changed from 1 hour to 5 minutes (GitHub #46829) — a regression that community analysis estimated increased cache creation costs by 20–32% (SmartScope). By March 23, reports flooded GitHub: users across Pro, Max 5x, and Max 20x tiers saw single prompts consuming 3–7% of their session quota (GitHub #41930). Max 20x users ($200/month) reported quota exhaustion in 70 minutes instead of 5 hours (GitHub #41788).
On March 26, Anthropic's Thariq Shihipar confirmed intentional peak-hour throttling (weekdays 5am–11am PT), affecting ~7% of users (MacRumors). On March 31, the company acknowledged limits were draining "way faster than expected" and called it a top priority (The Register). Around the same time, community analysis of Claude Code's source identified two independent prompt-caching bugs inflating token consumption 10–20x (SmartScope). A 2x usage promotion also quietly ended on March 28 (Claude Help Center), compounding the perception of sudden limit reductions. Later, v2.1.100 and v2.1.101 were found to silently add ~20,000 invisible tokens to every request (GitHub #46917).
By April 13, Anthropic pushed back on community analysis, stating the quota drain was "not caused by cache tweaks" (The Register). Against this backdrop, v2.1.108 drops — and nearly half its changelog addresses cache economics, limit transparency, and error messaging.
What Changed#
| Metric | v2.1.107 | v2.1.108 |
|---|---|---|
| Environment variables | 230 | 234 (+4) |
| Model IDs | 16 | 16 |
| Feature gates | 41 | 41 |
| Dynamic configs | 29 | 29 |
| Slash commands | 24 | 24 |
Changelog Highlights#
Prompt Cache TTL Is Now User-Configurable#
The most consequential change: ENABLE_PROMPT_CACHING_1H lets API key, Bedrock, Vertex, and Foundry users opt into 1-hour prompt cache TTL. FORCE_PROMPT_CACHING_5M forces the 5-minute TTL. The old Bedrock-specific flag ENABLE_PROMPT_CACHING_1H_BEDROCK is deprecated but still honored. This is a direct response to the cache TTL regression (GitHub #46829) — users now have explicit levers instead of being subject to silent backend changes. A new startup warning fires if DISABLE_PROMPT_CACHING* environment variables are set, so users can't accidentally run uncached without knowing it.
Previously, subscribers who set DISABLE_TELEMETRY were silently falling back to the 5-minute TTL instead of the 1-hour TTL they were entitled to. That's now fixed. In practice, this means privacy-conscious Max subscribers were paying $200/month and silently getting 12x worse cache economics — for months.
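Taken together, the caching flags imply a precedence order. The sketch below is an assumption pieced together from the changelog, not Anthropic's actual code; only the variable names (ENABLE_PROMPT_CACHING_1H, FORCE_PROMPT_CACHING_5M, the deprecated ENABLE_PROMPT_CACHING_1H_BEDROCK, and the DISABLE_PROMPT_CACHING* family) come from the release notes:

```python
import warnings

def resolve_cache_ttl(env: dict) -> str:
    """Plausible TTL resolution for v2.1.108's caching flags.

    The precedence order here is an assumption; only the variable
    names appear in the release notes.
    """
    # New startup warning: flag any DISABLE_PROMPT_CACHING* variable
    # so users can't accidentally run uncached without knowing it.
    disabled = [k for k in env if k.startswith("DISABLE_PROMPT_CACHING")]
    if disabled:
        warnings.warn(f"prompt caching disabled via {disabled}")
        return "off"
    if env.get("FORCE_PROMPT_CACHING_5M"):           # explicit 5-minute pin
        return "5m"
    if env.get("ENABLE_PROMPT_CACHING_1H"):          # new cross-provider opt-in
        return "1h"
    if env.get("ENABLE_PROMPT_CACHING_1H_BEDROCK"):  # deprecated, still honored
        return "1h"
    return "5m"                                      # current default TTL
```

Under this reading, the deprecated Bedrock-only flag keeps working, but the new cross-provider opt-in is the lever users are expected to reach for.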
Rate Limits vs. Plan Limits Get Separate Error Messages#
Server rate limits (429s from API throttling) are now distinguished from plan usage limits (you've burned through your quota). Previously, both produced the same generic error, leaving users unable to tell whether they'd hit a temporary API backpressure wall or exhausted their 5-hour session allotment. Given the chaos of March — where users couldn't tell if they were being throttled, bugged, or legitimately out of quota (GitHub #41930) — this is overdue transparency. Additionally, 5xx and 529 errors now include a link to status.claude.com, so users can check whether the issue is on Anthropic's end.
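The split boils down to routing the same HTTP status to different messages. The classifier below is purely illustrative — the `error_type` values are invented, and only the 429/5xx/529 distinction and the status-page link come from the changelog:

```python
def classify_limit_error(status: int, error_type: str) -> str:
    """Illustrative split of the previously conflated limit errors.

    The error_type values are hypothetical; the real API's error
    schema may differ.
    """
    if status == 429:
        if error_type == "rate_limit_error":
            # Temporary API backpressure, not a quota problem.
            return "Server rate limit hit (API backpressure); retry shortly."
        # Quota exhaustion: the 5-hour session allotment is spent.
        return "Plan usage limit reached; wait for your session window to reset."
    if status == 529 or 500 <= status <= 599:
        # New in v2.1.108: server errors link to the status page.
        return "Server-side error; check https://status.claude.com"
    return "Unexpected error."
```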
Recap / Away Summary Feature#
A new recap feature provides context when returning to a session after being away. Configurable in /config and manually invocable with /recap. For users who have telemetry disabled, force it with CLAUDE_CODE_ENABLE_AWAY_SUMMARY (introduced in v2.1.107). This formalizes a pattern that's become common with background agents and /loop workflows — kick off work, step away, come back to a summary instead of scrolling.
Skill Discovery via the Skill Tool#
The model can now discover and invoke built-in slash commands (/init, /review, /security-review) via the Skill tool. Instead of users needing to know every slash command by name, the model can find and trigger them contextually. Unknown slash commands now suggest the closest match, reducing friction when users misremember command names.
Other Notable Changes#
- `/undo` is now an alias for `/rewind` — quality of life for users who reach for the more intuitive name
- `/model` warns before switching mid-conversation — the next response re-reads the full history uncached, which, given the cache TTL context, directly impacts quota burn
- `/resume` defaults to current-directory sessions — press `Ctrl+A` to show all projects. Fixes for `--resume` losing custom session names/colors, truncating sessions with self-referencing messages, and precondition errors (dirty git tree, session not found) exiting silently
- Reduced memory footprint — language grammars for file reads, edits, and syntax highlighting now load on demand instead of all at startup
- "Verbose" indicator when viewing the detailed transcript via `Ctrl+O`
- Fixed paste in `/login` — regression from v2.1.105
- Fixed Agent tool permissions in auto mode — was prompting when the safety classifier's transcript exceeded its context window
- Fixed Bash tool silent failures — no output when `CLAUDE_ENV_FILE` (e.g. `~/.zprofile`) ends with a `#` comment line
- Fixed session titles — no more placeholder text when the first message is a short greeting; Remote Control titles set in the web UI are no longer overwritten after the third message
- Fixed transcript write failures — disk-full errors were being silently dropped instead of logged
- Fixed diacritical marks — accents, umlauts, and cedillas were being dropped when the `language` setting was configured
- Fixed policy-managed plugins — they never auto-updated when running from a different project than the one where they were first installed
- Fixed terminal escape codes — garbage text appeared in the prompt input after `--teleport`
- Fixed `/feedback` retry — pressing Enter to resubmit after a failure now works
New Environment Variables#
| Variable | Likely Purpose |
|---|---|
| CLAUDE_CODE_REPL | Enables or configures an interactive REPL (Read-Eval-Print Loop) mode within Claude Code. This likely allows Claude to execute code snippets in a persistent runtime — think Jupyter-style cell execution rather than one-shot Bash commands. A REPL would let Claude build up state across multiple evaluations: define a function, call it in the next cell, inspect a variable. This is the kind of tool that makes Claude Code dramatically more useful for data science, prototyping, and exploratory programming. The existence of a companion CLAUDE_REPL_VARIANT variable confirms this is actively being experimented with, not just scaffolding. |
| CLAUDE_REPL_VARIANT | Selects which REPL implementation variant to use when the REPL mode is active. Classic A/B testing infrastructure: one variant might use an in-process Node.js REPL, another might sandbox execution in a container, a third might use a language-server-backed evaluation loop. The variant pattern lets Anthropic measure which approach performs best across execution speed, safety, and user satisfaction before committing to a single implementation. |
| CLAUDE_API_SKILL_DESCRIPTION | Provides or overrides the description text for the claude-api skill when presented to the model via the Skill tool. The v2.1.108 changelog introduces skill discovery — models can now find and invoke built-in slash commands dynamically. This variable controls how the API-focused skill describes itself in that discovery context, making skills self-documenting rather than requiring users to memorize command names. |
| CLAUDE_INTERNAL_ASSISTANT_TEAM_NAME | Sets the internal team identifier when Claude Code is used within Anthropic's own infrastructure. An internal routing variable — determines which model configuration, rate limits, system prompts, or feature flags apply based on which Anthropic team is running the assistant. End users won't set this, but its presence confirms Claude Code's internal deployment uses the same codebase as the public release, differentiated by configuration rather than separate builds. |
What These Tell Us#
REPL execution is being prototyped as a new tool modality. Two of the four new variables — CLAUDE_CODE_REPL and CLAUDE_REPL_VARIANT — point to an interactive REPL capability beyond the existing Bash tool. Today, every code execution in Claude Code is a stateless shell command. A REPL would add stateful execution: define a function, call it next, inspect a variable. The variant flag confirms active A/B testing — Anthropic is evaluating multiple implementations before shipping publicly.
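The stateless-vs-stateful distinction is easy to demonstrate. The sketch below is purely illustrative — a persistent namespace driven by `exec()` — and says nothing about how Claude Code's REPL is actually implemented:

```python
import contextlib
import io

session: dict = {}  # persistent namespace: the REPL's accumulated state

def run_cell(src: str) -> str:
    """Execute one 'cell' against the shared namespace, capturing stdout."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(src, session)  # state written here survives into later cells
    return buf.getvalue()

# Under the Bash tool, each of these would be a fresh process and
# `double` would be gone by the second call. Here, state persists:
run_cell("def double(x):\n    return 2 * x")
run_cell("total = double(21)")
output = run_cell("print(total)")  # the earlier definitions are still live
```

A production version would of course need sandboxing and timeouts, which is presumably exactly what the variant flag is being used to evaluate.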
Skills are becoming a self-describing ecosystem. CLAUDE_API_SKILL_DESCRIPTION, combined with the changelog's skill discovery feature, reveals a shift in how Claude Code's capabilities are organized. Rather than hard-coded knowledge of every slash command, skills now describe themselves and are discovered dynamically — the architecture pattern behind plugin ecosystems. Third-party or organization-specific skills could plug into the same discovery mechanism.
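As a rough illustration of what a self-describing skill registry could look like, here is a minimal sketch; every name, field, and description in it is invented, since the Skill tool's actual schema isn't public:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Skill:
    name: str         # slash-command name
    description: str  # self-description surfaced to the model

# Hypothetical registry; descriptions are invented for illustration.
REGISTRY: list[Skill] = [
    Skill("/init", "Generate a CLAUDE.md project memory file"),
    Skill("/review", "Review the current changes for correctness"),
    Skill("/security-review", "Audit the current changes for security issues"),
]

def discover(query: str) -> list[Skill]:
    """Return skills whose self-descriptions mention the query term —
    the way a model might locate a capability it wasn't hard-coded to know."""
    q = query.lower()
    return [s for s in REGISTRY if q in s.description.lower()]
```

The point of the pattern is that nothing outside the registry needs to know the skill list in advance; a third-party skill that registers itself with a good description becomes discoverable for free.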
The limits crisis is reshaping the product. This is the release where the usage-limit turmoil of March–April 2026 visibly hits the codebase. Prompt cache TTL goes from an opaque backend setting to a user-configurable parameter. Error messages stop conflating rate limits with plan limits. A startup warning catches disabled caching. The DISABLE_TELEMETRY → 5-minute TTL bug gets fixed. Taken together, these changes acknowledge that users need transparency and control over the economic levers that determine how fast their quota drains. Whether the root cause was bugs, intentional throttling, or both — this release gives users tools to diagnose and mitigate it themselves.
Sources#
- Claude Code Official Changelog — v2.1.108 release notes
- Cache TTL silently regressed from 1h to 5min — GitHub Issue #46829
- Critical: Widespread abnormal usage limit drain — GitHub Issue #41930
- Max 20 plan exhausted in ~70 minutes — GitHub Issue #41788
- CC v2.1.100+ inflates cache_creation by ~20K tokens — GitHub Issue #46917
- Claude Code Users Report Rapid Rate Limit Drain — MacRumors, March 26, 2026
- Anthropic admits Claude Code quotas running out too fast — The Register, March 31, 2026
- Anthropic: quota drain not caused by cache tweaks — The Register, April 13, 2026
- Claude Code token drain bug analysis — SmartScope
- Claude March 2026 usage promotion — Claude Help Center
This analysis is conducted for independent security research and interoperability purposes under fair use principles. All trademarks belong to their respective owners. The information presented here documents publicly observable behavior of installed software and is not intended to circumvent any technological protection measures, infringe on intellectual property rights, or encourage unauthorized use. Use these findings at your own discretion.
Related Versions#
- Claude Code v2.1.107 — 6 New Environment Variables — Worktrees, away mode, stream resilience
- Claude Code v2.1.104 — 2 New Environment Variables — Enterprise TLS and SDK OAuth
- Claude Code v2.1.100 — 3 New Environment Variables — Context token limits, Perforce VCS, script caps
- Claude Code v2.1.96 — No New Environment Variables — Bedrock auth hotfix
- Claude Code v2.1.94 — 5 New Environment Variables — Mantle auth, MCP sandboxing, team onboarding
Related: Context Window Management Guide | Claude Code Productivity Tips | The Agentic Engineering Playbook