Claude Code v2.1.108 — 4 New Environment Variables: REPL Variants, Skill Discovery, and the Limits Reckoning#
Published on April 14, 2026
Part of the Claude Code Version Tracker series. | Official Env Vars | Official Changelog
Claude Code v2.1.108 adds 4 new environment variables since v2.1.107. But the headline story of this release isn't the binary delta — it's the changelog, which reads like a direct response to the usage-limit crisis that's consumed the Claude Code community since early March. Prompt cache TTL is now user-configurable. Rate limits and plan limits get separate error messages. A warning fires at startup if caching is accidentally disabled. These aren't routine polish — they're damage control for a month in which Max 20x subscribers watched 5-hour quotas evaporate in as little as 70 minutes.
The Context: Six Weeks of Limit Pain#
To understand why v2.1.108's changelog matters, you need the backstory. Starting in early March 2026, users discovered the default prompt cache TTL had silently changed from 1 hour to 5 minutes (GitHub #46829) — a regression that community analysis estimated increased cache creation costs by 20–32% (SmartScope). By March 23, reports flooded GitHub: users across Pro, Max 5x, and Max 20x tiers saw single prompts consuming 3–7% of their session quota (GitHub #41930). Max 20x users ($200/month) reported quota exhaustion in 70 minutes instead of 5 hours (GitHub #41788).
On March 26, Anthropic's Thariq Shihipar confirmed intentional peak-hour throttling (weekdays 5am–11am PT), affecting ~7% of users (MacRumors). On March 31, the company acknowledged limits were draining "way faster than expected" and called it a top priority (The Register). Around the same time, community analysis of Claude Code's source identified two independent prompt-caching bugs inflating token consumption 10–20x (SmartScope). A 2x usage promotion also quietly ended on March 28 (Claude Help Center), compounding the perception of sudden limit reductions. Later, v2.1.100 and v2.1.101 were found to silently add ~20,000 invisible tokens to every request (GitHub #46917).
By April 13, Anthropic pushed back on community analysis, stating the quota drain was "not caused by cache tweaks" (The Register). Against this backdrop, v2.1.108 drops — and nearly half its changelog addresses cache economics, limit transparency, and error messaging.
What Changed#
| Metric | v2.1.107 | v2.1.108 |
|---|---|---|
| Environment variables | 230 | 234 (+4) |
| Model IDs | 16 | 16 |
| Feature gates | 41 | 41 |
| Dynamic configs | 29 | 29 |
| Slash commands | 24 | 24 |
Changelog Highlights#
Prompt Cache TTL Is Now User-Configurable#
The most consequential change: ENABLE_PROMPT_CACHING_1H lets API key, Bedrock, Vertex, and Foundry users opt into 1-hour prompt cache TTL. FORCE_PROMPT_CACHING_5M forces the 5-minute TTL. The old Bedrock-specific flag ENABLE_PROMPT_CACHING_1H_BEDROCK is deprecated but still honored. This is a direct response to the cache TTL regression (GitHub #46829) — users now have explicit levers instead of being subject to silent backend changes. A new startup warning fires if DISABLE_PROMPT_CACHING* environment variables are set, so users can't accidentally run uncached without knowing it.
Previously, subscribers who set DISABLE_TELEMETRY were silently falling back to the 5-minute TTL instead of the 1-hour TTL they were entitled to. That's now fixed. In practice, this means privacy-conscious Max subscribers were paying $200/month and silently getting 12x worse cache economics — for months.
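Taken together, the caching flags imply a precedence order. The sketch below is an assumption pieced together from the changelog, not Anthropic's actual code; only the variable names (ENABLE_PROMPT_CACHING_1H, FORCE_PROMPT_CACHING_5M, the deprecated ENABLE_PROMPT_CACHING_1H_BEDROCK, and the DISABLE_PROMPT_CACHING* family) come from the release notes:

```python
import warnings

def resolve_cache_ttl(env: dict) -> str:
    """Plausible TTL resolution for v2.1.108's caching flags.

    The precedence order here is an assumption; only the variable
    names appear in the release notes.
    """
    # New startup warning: flag any DISABLE_PROMPT_CACHING* variable
    # so users can't accidentally run uncached without knowing it.
    disabled = [k for k in env if k.startswith("DISABLE_PROMPT_CACHING")]
    if disabled:
        warnings.warn(f"prompt caching disabled via {disabled}")
        return "off"
    if env.get("FORCE_PROMPT_CACHING_5M"):           # explicit 5-minute pin
        return "5m"
    if env.get("ENABLE_PROMPT_CACHING_1H"):          # new cross-provider opt-in
        return "1h"
    if env.get("ENABLE_PROMPT_CACHING_1H_BEDROCK"):  # deprecated, still honored
        return "1h"
    return "5m"                                      # current default TTL
```

Under this reading, the deprecated Bedrock-only flag keeps working, but the new cross-provider opt-in is the lever users are expected to reach for.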
Rate Limits vs. Plan Limits Get Separate Error Messages#
Server rate limits (429s from API throttling) are now distinguished from plan usage limits (you've burned through your quota). Previously, both produced the same generic error, leaving users unable to tell whether they'd hit a temporary API backpressure wall or exhausted their 5-hour session allotment. Given the chaos of March — where users couldn't tell if they were being throttled, bugged, or legitimately out of quota (GitHub #41930) — this is overdue transparency. Additionally, 5xx and 529 errors now include a link to status.claude.com, so users can check whether the issue is on Anthropic's end.
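The split boils down to routing the same HTTP status to different messages. The classifier below is purely illustrative — the `error_type` values are invented, and only the 429/5xx/529 distinction and the status-page link come from the changelog:

```python
def classify_limit_error(status: int, error_type: str) -> str:
    """Illustrative split of the previously conflated limit errors.

    The error_type values are hypothetical; the real API's error
    schema may differ.
    """
    if status == 429:
        if error_type == "rate_limit_error":
            # Temporary API backpressure, not a quota problem.
            return "Server rate limit hit (API backpressure); retry shortly."
        # Quota exhaustion: the 5-hour session allotment is spent.
        return "Plan usage limit reached; wait for your session window to reset."
    if status == 529 or 500 <= status <= 599:
        # New in v2.1.108: server errors link to the status page.
        return "Server-side error; check https://status.claude.com"
    return "Unexpected error."
```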
Recap / Away Summary Feature#
A new recap feature provides context when returning to a session after being away. Configurable in /config and manually invocable with /recap. For users who have telemetry disabled, force it with CLAUDE_CODE_ENABLE_AWAY_SUMMARY (introduced in v2.1.107). This formalizes a pattern that's become common with background agents and /loop workflows — kick off work, step away, come back to a summary instead of scrolling.
Skill Discovery via the Skill Tool#
The model can now discover and invoke built-in slash commands (/init, /review, /security-review) via the Skill tool. Instead of users needing to know every slash command by name, the model can find and trigger them contextually. Unknown slash commands now suggest the closest match, reducing friction when users misremember command names.
Other Notable Changes#
- `/undo` is now an alias for `/rewind` — quality of life for users who reach for the more intuitive name
- `/model` warns before switching mid-conversation — the next response re-reads the full history uncached, which, given the cache TTL context, directly impacts quota burn
- `/resume` defaults to current-directory sessions — press `Ctrl+A` to show all projects. Fixes for `--resume` losing custom session names/colors, truncating sessions with self-referencing messages, and precondition errors (dirty git tree, session not found) exiting silently
- Reduced memory footprint — language grammars for file reads, edits, and syntax highlighting now load on demand instead of all at startup
- "Verbose" indicator when viewing the detailed transcript via `Ctrl+O`
- Fixed paste in `/login` — regression from v2.1.105
- Fixed Agent tool permissions in auto mode — was prompting when the safety classifier's transcript exceeded its context window
- Fixed Bash tool silent failures — no output when `CLAUDE_ENV_FILE` (e.g. `~/.zprofile`) ends with a `#` comment line
- Fixed session titles — no more placeholder text when the first message is a short greeting; Remote Control titles set in the web UI are no longer overwritten after the third message
- Fixed transcript write failures — disk-full errors were being silently dropped instead of logged
- Fixed diacritical marks — accents, umlauts, and cedillas were being dropped when the `language` setting was configured
- Fixed policy-managed plugins — they never auto-updated when running from a different project than the one where they were first installed
- Fixed terminal escape codes — garbage text appeared in the prompt input after `--teleport`
- Fixed `/feedback` retry — pressing Enter to resubmit after a failure now works
New Environment Variables#
| Variable | Likely Purpose |
|---|---|
| CLAUDE_CODE_REPL | Enables or configures an interactive REPL (Read-Eval-Print Loop) mode within Claude Code. This likely allows Claude to execute code snippets in a persistent runtime — think Jupyter-style cell execution rather than one-shot Bash commands. A REPL would let Claude build up state across multiple evaluations: define a function, call it in the next cell, inspect a variable. This is the kind of tool that makes Claude Code dramatically more useful for data science, prototyping, and exploratory programming. The existence of a companion CLAUDE_REPL_VARIANT variable confirms this is actively being experimented with, not just scaffolding. |
| CLAUDE_REPL_VARIANT | Selects which REPL implementation variant to use when the REPL mode is active. Classic A/B testing infrastructure: one variant might use an in-process Node.js REPL, another might sandbox execution in a container, a third might use a language-server-backed evaluation loop. The variant pattern lets Anthropic measure which approach performs best across execution speed, safety, and user satisfaction before committing to a single implementation. |
| CLAUDE_API_SKILL_DESCRIPTION | Provides or overrides the description text for the claude-api skill when presented to the model via the Skill tool. The v2.1.108 changelog introduces skill discovery — models can now find and invoke built-in slash commands dynamically. This variable controls how the API-focused skill describes itself in that discovery context, making skills self-documenting rather than requiring users to memorize command names. |
| CLAUDE_INTERNAL_ASSISTANT_TEAM_NAME | Sets the internal team identifier when Claude Code is used within Anthropic's own infrastructure. An internal routing variable — determines which model configuration, rate limits, system prompts, or feature flags apply based on which Anthropic team is running the assistant. End users won't set this, but its presence confirms Claude Code's internal deployment uses the same codebase as the public release, differentiated by configuration rather than separate builds. |
What These Tell Us#
REPL execution is being prototyped as a new tool modality. Two of the four new variables — CLAUDE_CODE_REPL and CLAUDE_REPL_VARIANT — point to an interactive REPL capability beyond the existing Bash tool. Today, every code execution in Claude Code is a stateless shell command. A REPL would add stateful execution: define a function, call it next, inspect a variable. The variant flag confirms active A/B testing — Anthropic is evaluating multiple implementations before shipping publicly.
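The stateless-vs-stateful distinction is easy to demonstrate. The sketch below is purely illustrative — a persistent namespace driven by `exec()` — and says nothing about how Claude Code's REPL is actually implemented:

```python
import contextlib
import io

session: dict = {}  # persistent namespace: the REPL's accumulated state

def run_cell(src: str) -> str:
    """Execute one 'cell' against the shared namespace, capturing stdout."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(src, session)  # state written here survives into later cells
    return buf.getvalue()

# Under the Bash tool, each of these would be a fresh process and
# `double` would be gone by the second call. Here, state persists:
run_cell("def double(x):\n    return 2 * x")
run_cell("total = double(21)")
output = run_cell("print(total)")  # the earlier definitions are still live
```

A production version would of course need sandboxing and timeouts, which is presumably exactly what the variant flag is being used to evaluate.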
Skills are becoming a self-describing ecosystem. CLAUDE_API_SKILL_DESCRIPTION, combined with the changelog's skill discovery feature, reveals a shift in how Claude Code's capabilities are organized. Rather than hard-coded knowledge of every slash command, skills now describe themselves and are discovered dynamically — the architecture pattern behind plugin ecosystems. Third-party or organization-specific skills could plug into the same discovery mechanism.
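As a rough illustration of what a self-describing skill registry could look like, here is a minimal sketch; every name, field, and description in it is invented, since the Skill tool's actual schema isn't public:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Skill:
    name: str         # slash-command name
    description: str  # self-description surfaced to the model

# Hypothetical registry; descriptions are invented for illustration.
REGISTRY: list[Skill] = [
    Skill("/init", "Generate a CLAUDE.md project memory file"),
    Skill("/review", "Review the current changes for correctness"),
    Skill("/security-review", "Audit the current changes for security issues"),
]

def discover(query: str) -> list[Skill]:
    """Return skills whose self-descriptions mention the query term —
    the way a model might locate a capability it wasn't hard-coded to know."""
    q = query.lower()
    return [s for s in REGISTRY if q in s.description.lower()]
```

The point of the pattern is that nothing outside the registry needs to know the skill list in advance; a third-party skill that registers itself with a good description becomes discoverable for free.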
The limits crisis is reshaping the product. This is the release where the usage-limit turmoil of March–April 2026 visibly hits the codebase. Prompt cache TTL goes from an opaque backend setting to a user-configurable parameter. Error messages stop conflating rate limits with plan limits. A startup warning catches disabled caching. The DISABLE_TELEMETRY → 5-minute TTL bug gets fixed. Taken together, these changes acknowledge that users need transparency and control over the economic levers that determine how fast their quota drains. Whether the root cause was bugs, intentional throttling, or both — this release gives users tools to diagnose and mitigate it themselves.
Sources#
- Claude Code Official Changelog — v2.1.108 release notes
- Cache TTL silently regressed from 1h to 5min — GitHub Issue #46829
- Critical: Widespread abnormal usage limit drain — GitHub Issue #41930
- Max 20 plan exhausted in ~70 minutes — GitHub Issue #41788
- CC v2.1.100+ inflates cache_creation by ~20K tokens — GitHub Issue #46917
- Claude Code Users Report Rapid Rate Limit Drain — MacRumors, March 26, 2026
- Anthropic admits Claude Code quotas running out too fast — The Register, March 31, 2026
- Anthropic: quota drain not caused by cache tweaks — The Register, April 13, 2026
- Claude Code token drain bug analysis — SmartScope
- Claude March 2026 usage promotion — Claude Help Center
This analysis is conducted for independent security research and interoperability purposes under fair use principles. All trademarks belong to their respective owners. The information presented here documents publicly observable behavior of installed software and is not intended to circumvent any technological protection measures, infringe on intellectual property rights, or encourage unauthorized use. Use these findings at your own discretion.
Related Versions#
- Claude Code v2.1.107 — 6 New Environment Variables — Worktrees, away mode, stream resilience
- Claude Code v2.1.104 — 2 New Environment Variables — Enterprise TLS and SDK OAuth
- Claude Code v2.1.100 — 3 New Environment Variables — Context token limits, Perforce VCS, script caps
- Claude Code v2.1.96 — No New Environment Variables — Bedrock auth hotfix
- Claude Code v2.1.94 — 5 New Environment Variables — Mantle auth, MCP sandboxing, team onboarding
Related: Context Window Management Guide | Claude Code Productivity Tips | The Agentic Engineering Playbook