Register-aware prompt translation. Normalize linguistic register to maximize LLM output quality.
GitHub · Releases · Install · API
LLM output quality varies dramatically by the linguistic register of your input. Same question, different phrasing, different accuracy:
| Model | Spread | Best | Worst | Status |
|---|---|---|---|---|
| Claude Opus 4 | 18.8pp | direct (93.8%) | casual (75.0%) | sensitive |
| Gemini 2.5 Flash | 56.2pp | direct (56.2%) | narrative (0%) | critical |
| Claude Haiku 4.5 | 6.2pp | direct (93.8%) | casual (87.5%) | mild |
| GPT-4o Mini | 0pp | (any) | (any) | invariant |
Casual register costs Claude Opus nearly 19 points of accuracy; narrative framing drops Gemini 2.5 Flash from 56.2% to 0%. Transmogrifier detects the input register and normalizes it before the prompt reaches the model.
- Level 1: system prompt injection. Prepends a normalization instruction.
- Level 2: rule-based rewriting. Strips filler, hedging, and framing via regex.
- Level 3: LLM translation. A separate-context rewrite for heavy-lift cases.
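The Level 2 rule-based pass can be sketched with a couple of regexes. This is a minimal illustration, not the shipped rule set; the `FILLER`/`HEDGES` patterns and the `normalize` helper are assumptions:

```python
import re

# Hypothetical filler/hedging patterns -- illustrative, not the library's actual rules.
FILLER = re.compile(r"\b(yo|so|like|basically|kinda|um+)\b[,.!?]*\s*", re.IGNORECASE)
HEDGES = re.compile(r"\b(i guess|i mean|you know|what's the deal with)\b\s*", re.IGNORECASE)

def normalize(text: str) -> str:
    """Strip casual filler and hedging, then collapse whitespace."""
    text = FILLER.sub("", text)
    text = HEDGES.sub("", text)
    return re.sub(r"\s+", " ", text).strip()

print(normalize("yo so like, what's the deal with TCP"))  # TCP
```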
```
pip install git+https://github.com/jmcentire/transmogrifier.git
```

Optional extras:

```
pip install "transmogrifier[anthropic]"   # Claude backend (Level 3)
pip install "transmogrifier[openai]"      # OpenAI backend
pip install "transmogrifier[gemini]"      # Gemini backend
pip install "transmogrifier[validation]"  # Semantic equivalence checking
pip install "transmogrifier[all]"         # Everything
```
```python
from transmogrifier.core import Transmogrifier

t = Transmogrifier()
result = t.translate("yo what's the deal with TCP", model="claude-opus-4")

result.output_text        # "TCP" (rewritten)
result.system_prompt      # Level 1 normalization instruction
result.detected_register  # Register.casual
result.target_register    # Register.direct
result.elapsed_ms         # ~2ms
result.skipped            # False
```
Invariant models are skipped automatically:

```python
result = t.translate("yo what's TCP", model="gpt-4o-mini")
result.skipped      # True
result.skip_reason  # "invariant model (0.0pp spread)"
```
```
# Detect register
transmogrify detect "yo what's the deal with TCP"
# {"register": "casual", "confidence": 1.0}

# Classify task type
transmogrify classify "Write a Python function to sort a list"
# {"task_type": "code", "confidence": 1.0}

# Translate (task-aware routing)
transmogrify translate "yo what's the difference between TCP and UDP" --model gemini-2-5-flash
# Detected: casual | Task: analysis | Target: technical (per-task optimal)

# List profiles with per-task breakdown
transmogrify profile list

# Show detailed profile
transmogrify profile show claude-opus-4

# Calibrate a new model (50 tasks x 5 registers = 250 API calls)
transmogrify profile calibrate my-model --provider anthropic
```
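Under the hood, calibration is just a grid over tasks and registers. A sketch of the shape, assuming caller-supplied `ask` and `score` callables; this is not the CLI's internal code:

```python
REGISTERS = ["direct", "casual", "technical", "academic", "narrative"]

def calibrate(tasks, ask, score):
    """Measure per-register accuracy for one model.

    tasks: list of (prompt, expected) pairs.
    ask(prompt, register) -> answer; score(answer, expected) -> bool.
    Both are stand-ins for real API plumbing. With 50 tasks this is
    50 x 5 = 250 calls, matching the CLI's cost note.
    """
    accuracy = {}
    for register in REGISTERS:
        hits = sum(score(ask(prompt, register), expected) for prompt, expected in tasks)
        accuracy[register] = hits / len(tasks)
    return accuracy
```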
```
claude mcp add --scope user --transport stdio transmogrifier -- transmog-mcp
```

Tools: `transmog_translate`, `transmog_detect`, `transmog_profiles`
GPT-4o Mini shows zero sensitivity. Claude Opus shows 18.8pp. Gemini shows 56.2pp. A one-size-fits-all approach wastes effort on invariant models and under-serves sensitive ones. Transmogrifier uses per-model profiles to adapt.
Academic register halves reasoning accuracy on Claude Opus (40% vs 80% direct). Meanwhile, Gemini scores 0% on analysis with direct register but 33% with technical. The optimal register depends on both model and task type. Transmogrifier v0.2.0 includes a task classifier (factual, reasoning, code, analysis, creative, instruction) and routes to the per-(model, task) optimal register.
| Model | Category | Best Register | Spread |
|---|---|---|---|
| Claude Opus | reasoning | direct (80%) | 40pp (academic=40%) |
| Claude Opus | factual/code/analysis | (all invariant) | 0pp |
| Gemini Flash | factual | direct/technical (100%) | 100pp (narrative=0%) |
| Gemini Flash | analysis | technical/academic (33%) | 33pp (direct=0%) |
| Gemini Flash | overall | technical (62.5%) | 62.5pp |
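The per-(model, task) routing reduces to a lookup with a model-wide fallback. The dict below mirrors the calibration numbers above, but its structure and the `target_register` function are illustrative, not Transmogrifier's internals:

```python
# Hypothetical per-(model, task) table; values mirror the calibration
# findings above, the structure is illustrative only.
OPTIMAL = {
    ("claude-opus-4", "reasoning"): "direct",
    ("gemini-2-5-flash", "factual"): "direct",
    ("gemini-2-5-flash", "analysis"): "technical",
}
# Model-wide best register, used when no per-task entry exists.
DEFAULT = {"claude-opus-4": "direct", "gemini-2-5-flash": "technical"}

def target_register(model: str, task: str) -> str:
    return OPTIMAL.get((model, task), DEFAULT.get(model, "direct"))

print(target_register("gemini-2-5-flash", "analysis"))  # technical
```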
Casual input + Level 1 system prompt scored 68.8% on Gemini 2.5 Flash, beating raw direct register (56.2%) by 12.5pp. The normalization instruction provides a genuine quality boost beyond just recovering the register gap.
Asking the model to "reinterpret this casually-worded question, then answer it" in a single turn scored 0% on Gemini. Translation must happen in a separate context. This is architecturally enforced in Transmogrifier.
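The separate-context rule amounts to two independent backend calls. A minimal sketch, assuming a caller-supplied `complete`-style callable (a stand-in, not a Transmogrifier API):

```python
def answer(question: str, complete) -> str:
    """Answer a casually-worded question via two independent contexts.

    `complete` is any callable prompt -> text, standing in for a real backend.
    """
    # Call 1: rewrite in its own context; the model never sees the answer task.
    rewritten = complete(f"Rewrite in a direct technical register: {question}")
    # Call 2: answer the normalized question in a fresh context.
    return complete(rewritten)

# Demo with an echo backend just to show the two-call shape.
calls = []
def echo(prompt):
    calls.append(prompt)
    return prompt

answer("yo what's TCP", echo)  # makes exactly two backend calls
```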
Claude Opus (18.8pp) is more register-sensitive than Claude Haiku (6.2pp). RLHF doesn't uniformly solve this — the effect is architecture and training dependent.
| Register | Example | Detection Markers |
|---|---|---|
| direct | What is TCP? | Short, unframed, no markers |
| casual | yo so like, what's the deal with TCP | Slang, filler, contractions |
| technical | Provide a precise technical answer: What is TCP? | Imperative verbs, structured |
| academic | In the context of established knowledge, TCP... | Hedging, passive voice, latinate |
| narrative | Explain this as if telling a story: How does TCP work? | Storytelling framing, analogies |
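A marker-based detector in the spirit of the table might look like this; the marker lists are simplified assumptions, not the shipped heuristics:

```python
import re

# Simplified marker sets per register -- illustrative, not the shipped heuristics.
MARKERS = {
    "casual": re.compile(r"\b(yo|gonna|kinda|what's the deal)\b", re.IGNORECASE),
    "academic": re.compile(r"\b(in the context of|it is widely held)\b", re.IGNORECASE),
    "narrative": re.compile(r"\b(tell(ing)? a story|once upon a time|imagine)\b", re.IGNORECASE),
    "technical": re.compile(r"\b(provide a precise|specify|enumerate)\b", re.IGNORECASE),
}

def detect(text: str) -> str:
    """Return the first register whose markers fire; default to direct."""
    for register, pattern in MARKERS.items():
        if pattern.search(text):
            return register
    return "direct"  # short, unframed input with no markers

print(detect("yo so like, what's the deal with TCP"))  # casual
print(detect("What is TCP?"))  # direct
```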
Register Detector → Profile Lookup → Translation Router → Level 1 + Level 2 (+ optional Level 3) → Result
Invariants:
Designed as middleware for:
Transmogrifier · MIT License · Jeremy McEntire · Source