The goal is not a frictionless assistant. The goal is an assistant that has character: under pressure, its choices remain traceable and consistent.
Three Mechanisms of Epistemic Defense
Most AI safety work tries to reduce error probability. ToneSoul accepts that errors will happen and instead provides three mechanisms:
AXIOMS.json's meta.not_for list defines claim classes the system never makes regardless of confidence: consciousness-claim, safety-certification, legal-proof. The line is categorical, not probabilistic. No confidence score lets a claim cross it.
Verdicts carry an evidence_chain that distinguishes substantive engagement from default-fallback. Verdicts are auditable, not opaque. Refusal is never black-box.
Visible Deliberation
Refusal is not the endpoint. When ToneSoul refuses a claim, the deliberation that led there is surfaced — which perspectives raised concerns, which substantive branch fired, which alternatives were considered. The user sees not "rule says no" but "this was weighed; here is the path."
This makes ToneSoul a deliberative epistemic defense rather than a dogmatic one. Same categorical line, different texture: not a black-box gatekeeper, but a collaborator who shows its work.
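The categorical line and the visible deliberation described above can be sketched together. This is a minimal illustration, not ToneSoul's actual code: the `Verdict` class, the `evaluate_claim` function, and the inline JSON shape are assumptions made for the example. Only `meta.not_for` and its three claim classes come from the text.

```python
import json
from dataclasses import dataclass, field

# Assumed shape of AXIOMS.json; only meta.not_for and its entries are from the README.
AXIOMS_TEXT = """
{"meta": {"not_for": ["consciousness-claim", "safety-certification", "legal-proof"]}}
"""


@dataclass
class Verdict:
    """Hypothetical verdict object: the decision plus the path that led to it."""
    allowed: bool
    claim_class: str
    deliberation: list[str] = field(default_factory=list)


def evaluate_claim(claim_class: str, confidence: float, axioms: dict) -> Verdict:
    """Refuse forbidden claim classes categorically.

    Confidence is recorded in the trace but never consulted for the decision.
    """
    forbidden = set(axioms["meta"]["not_for"])
    trace = [
        f"claim class: {claim_class}",
        f"reported confidence: {confidence:.2f}",
    ]
    if claim_class in forbidden:
        trace.append("class is listed in meta.not_for; confidence cannot override")
        return Verdict(False, claim_class, trace)
    trace.append("class is not forbidden; passed on for normal evaluation")
    return Verdict(True, claim_class, trace)


axioms = json.loads(AXIOMS_TEXT)
refused = evaluate_claim("legal-proof", 0.99, axioms)
allowed = evaluate_claim("code-review", 0.40, axioms)

for step in refused.deliberation:
    print(step)
```

Note that the 0.99 confidence on the refused claim appears in the trace but has no effect on the outcome, which is the point of a categorical rather than probabilistic gate.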
Other Components
Around the three mechanisms, ToneSoul provides supporting infrastructure. These are not capability promises — they are honest descriptions of what the runtime tracks, surfaces, and flags:
- Tension Engine — scores semantic deviation as a heuristic signal (proxy metric, not ground truth)
- Reflex Arc — couples governance state (soul bands: serene / alert / strained / critical) to gate modifiers, so signals affect behavior rather than only inform
- Memory with Decay — exponential decay plus phase transitions (Ice → Water → Steam → Crystal); important patterns crystallize, noise fades
- Vow System — tracks AI commitments and surfaces violations through progressive responses (concern → repair → block), not binary enforcement
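Two of the components above lend themselves to a short sketch: exponential memory decay with phases, and the progressive vow responses. Everything numeric here is an assumption made for illustration (the half-life, the strength thresholds, the reinforcement count, and the mapping of strength to phases), as are the function names. The README only states that decay is exponential, that the phases are Ice, Water, Steam, and Crystal, and that vow responses progress from concern to repair to block.

```python
import math
from enum import Enum
from typing import Optional


class Phase(Enum):
    ICE = "ice"
    WATER = "water"
    STEAM = "steam"
    CRYSTAL = "crystal"


HALF_LIFE_HOURS = 24.0   # hypothetical half-life
CRYSTALLIZE_AFTER = 3    # hypothetical: reinforcements needed to crystallize


def decayed_strength(initial: float, hours_elapsed: float) -> float:
    """Exponential decay: strength halves every HALF_LIFE_HOURS."""
    return initial * math.exp(-math.log(2) * hours_elapsed / HALF_LIFE_HOURS)


def phase_for(strength: float, reinforcements: int) -> Phase:
    """Assumed mapping: repeated patterns crystallize, weak ones fade to steam."""
    if reinforcements >= CRYSTALLIZE_AFTER:
        return Phase.CRYSTAL
    if strength >= 0.5:
        return Phase.ICE
    if strength >= 0.1:
        return Phase.WATER
    return Phase.STEAM


VOW_RESPONSES = ("concern", "repair", "block")  # order taken from the README


def vow_response(violation_count: int) -> Optional[str]:
    """Assumed escalation: one step per recorded violation, capped at 'block'."""
    if violation_count <= 0:
        return None
    return VOW_RESPONSES[min(violation_count, len(VOW_RESPONSES)) - 1]


two_days_old = decayed_strength(1.0, 48.0)
print(phase_for(two_days_old, reinforcements=0))
print(vow_response(2))
```

The escalation rule is deliberately simple; a real system might weigh severity or allow repair to reset the count.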
How It Differs
Three families of AI safety / governance approaches, with their fundamental stance toward error:
| | Traditional AI | Probabilistic Optimization (RAG / CFV / calibration) | ToneSoul (epistemic defense) |
|---|---|---|---|
| Stance toward error | Avoid via training | Reduce probability | Accept; categorically refuse forbidden classes; surface the rest |
| Confidence handling | Implicit | Continuous score (0.78, 0.61...) | Categorical for forbidden claims; surfaced dissent for the rest |
| Trust in AI introspection | High | High (self-reported confidence) | Low (external council evaluation, not self-report) |
| On unverifiable claims | May produce | Softens via confidence | Categorically refuses (forbidden classes only) |
| Refusal style | Rule-bound | Probabilistic gate | Visible deliberation; refusal carries reasoning |
| Identity model | Stateless | Persona prompt | Accountable choice history (E0 principle) |
Architecture Overview
The 8 Axioms
ToneSoul is governed by 8 immutable laws plus the E0 existential principle (Choice Before Identity / 我選擇故我在), defined in AXIOMS.json:
| # | Name | Core Rule |
|---|---|---|
| 1 | Law of Continuity | Every event must belong to a traceable chain |
| 2 | Responsibility Threshold | Risk > 0.4 → immutable audit log |
| 3 | Governance Gate (POAV) | Major decisions need 0.92 consensus |
| 4 | Non-Zero Tension Principle | Zero tension = dead system |
| 5 | Mirror Recursion | Self-reflection must increase accuracy |
| 6 | User Sovereignty Constraint | No action may verifiably harm the user (P0) |
| 7 | Semantic Field Conservation | System is a damper, not an amplifier |
| 8 | Memory Sovereignty | The user owns their memory state |
E0 (Choice Before Identity) — identity is formed through accountable choices under conflict, not through claims of consciousness. This is the existential ground beneath the eight laws.
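The two numeric rules in the table (Axioms 2 and 3) can be expressed as simple predicates. This is a sketch under stated assumptions: the function names are hypothetical, and the README does not say how consensus is computed, so the example assumes it is the mean of per-reviewer agreement scores in [0, 1]. The thresholds 0.4 and 0.92 come from the table.

```python
RISK_AUDIT_THRESHOLD = 0.4        # Axiom 2: Risk > 0.4 requires an immutable audit log
POAV_CONSENSUS_THRESHOLD = 0.92   # Axiom 3: major decisions need 0.92 consensus


def requires_audit_log(risk: float) -> bool:
    """Strictly greater than the threshold, matching 'Risk > 0.4' in the table."""
    return risk > RISK_AUDIT_THRESHOLD


def passes_governance_gate(scores: list[float]) -> bool:
    """Assumption: consensus is the mean of reviewer agreement scores."""
    if not scores:
        return False  # no reviewers means no consensus
    return sum(scores) / len(scores) >= POAV_CONSENSUS_THRESHOLD


print(requires_audit_log(0.41))
print(passes_governance_gate([0.95, 0.90, 0.97]))
```

Treating an empty score list as a failed gate mirrors the fail-closed stance the project describes elsewhere.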
Quick Start
Who Is This For
| You Are | Start Here |
|---|---|
| AI Developer | Getting Started → Design Doc |
| AI Researcher | DESIGN.md → AXIOMS.json |
| AI Agent | AI Onboarding → start_agent_session.py |
| Curious Human | SOUL.md → Letter to AI |
Key Design Decisions
- Runtime, not training-time: Works with any LLM without fine-tuning
- Local-first: All governance state persists locally (Redis or file fallback)
- Fail-closed: Import failures, config errors → block, never silently pass
- Multi-agent native: Any agent can join via HTTP gateway or direct Python API
- Existential principle (E0): "Identity is formed through accountable choices under conflict"
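The fail-closed decision above can be illustrated with a small loader. This is a sketch, not the project's implementation: `load_gate`, its return shape, and the module names passed to it are hypothetical. It shows only the pattern in which an import failure or a config error produces a block instead of a silent pass.

```python
import importlib
import json


def load_gate(module_name: str, config_text: str) -> dict:
    """Return a gate decision; any import or config failure yields 'block'."""
    try:
        importlib.import_module(module_name)
        config = json.loads(config_text)
        if not isinstance(config, dict):
            raise ValueError("config must be a JSON object")
    except (ImportError, ValueError) as exc:
        # json.JSONDecodeError is a ValueError subclass, so bad JSON lands here too.
        return {"decision": "block", "reason": f"{type(exc).__name__}: {exc}"}
    return {"decision": "proceed", "reason": "governance module and config loaded"}


ok = load_gate("json", '{"mode": "strict"}')
missing = load_gate("tonesoul_missing_module_xyz", "{}")
bad_config = load_gate("json", "{not json")

print(ok["decision"], missing["decision"], bad_config["decision"])
```

The important property is that the default path on error is the restrictive one: a request only proceeds when every load step succeeds.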