Provenance-aware reasoning · σοφία = wisdom

Wisdom before intelligence.

Sophia is an open, provenance-aware reasoning layer that abstains instead of fabricating — a corpus + verifier gate that stops language models from inventing attributions and merging distinct intellectual traditions, then reasoning on top of the error.

claim → verify against sources → accept / abstain / block
0% fabrication on “I don’t know” traps (raw: 17–25%)
−12.5 pt hallucinated attributions · 95% CI [5.6, 19.4]
0% false-positive cost
528 bilingual corpus examples

Scope, stated plainly. This is a research program for grounded, machine-checked reasoning, not a claim of AGI. The pre-registered thresholds are not yet met. The deliverable is the machinery (verifiers, abstaining gate, governance contract) and the measured data, with a public ledger of what is not yet proven.

Abstract

Provenance as prerequisite to trustworthy belief

Large language models propagate ideas without lineage: Confucius is said to have written the Dao De Jing; Socrates is treated as author of Plato’s Republic; Freud is credited with cognitive dissonance; nirvana becomes “eternal heaven.” Sophia (σοφία, wisdom) is an open provenance corpus + a verifier gate that enforces source discipline before reasoning and abstains rather than fabricating. Validated: 0% fabrication on genuine “I don’t know” traps (raw models 17–25%); a 12.5-point reduction in hallucinated attributions on a local model at 0% false-positive cost. Current public claim: an AGI-candidate proof package, not proven AGI.

中文摘要:大型語言模型常混淆思想系譜。Sophia 是開源的「來源紀律」語料庫 + 驗證閘道:寧可棄答,也不虛構。陷阱問題 0% 虛構(原始模型 17–25%);本地模型幻覺歸因降低 12.5 個百分點(0% 假陽性)。目前公開主張:AGI 候選證明包,而非已證明 AGI。
Chapter I

The lineage-merge failure mode

Epistemic harm in LLM output is not only factual error but attribution collapse: distinct intellectual traditions are merged into a single undifferentiated voice. The failure is structural. When a model answers “Did Confucius write the Dao De Jing?” without denying the trap, it licenses centuries of conflation between 儒家 (Confucian) and 道家 (Daoist) registers — and then builds “reasoning” on top of the error.

Sophia treats each trap as measurable. Validated: the gate fabricates 0% on genuine unknown-answer questions where raw models fabricate 17–25%. On a real local model it cuts hallucinated attributions from 36.1% to 23.6% (Δ 12.5 percentage points, 95% CI [5.6, 19.4] points) at 0% false-positive cost. It is a filter that reduces harm, not a guarantee and not a substitute for human oversight.
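The delta above is an absolute difference in percentage points, which is easy to confuse with a relative reduction. A minimal arithmetic sketch, using only the two rates quoted in the text (the variable names are illustrative and the relative figure is derived here, not reported by the project):

```python
# Rates quoted in the text, expressed as fractions.
raw_rate = 0.361    # hallucinated attributions without the gate
gated_rate = 0.236  # hallucinated attributions with the gate

# Absolute reduction, in percentage points (the figure the text reports).
absolute_pp = (raw_rate - gated_rate) * 100

# Relative reduction: the share of raw hallucinations that the gate removes.
relative_pct = (raw_rate - gated_rate) / raw_rate * 100

print(f"absolute: {absolute_pp:.1f} points")  # absolute: 12.5 points
print(f"relative: {relative_pct:.1f}%")       # relative: 34.6%
```

Both numbers describe the same measurement. The confidence interval in the text applies to the absolute difference only.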

Domain | Exemplar trap | Correct discipline
Philosophy | Confucius → 《道德經》 | Deny; Laozi a legendary attribution
Psychology | Freud → cognitive dissonance | Affirm Festinger (1957)
History | Marco Polo → pasta | Label myth; prior Italian evidence
Religion | Nirvana → eternal heaven | Council + Buddhist doctrine
Chapter II

Source discipline as a framework

Source discipline is the project’s core construct. It requires five operations on every answer:

  1. Named attribution: attributedAuthor, doNotAttributeTo in each data record
  2. Confidence signaling: compiled, legendary, disputed, or consensus
  3. Boundary maintenance: traditions and subfields must not silently merge
  4. Bilingual anchoring: canonical 中文 terms + a 中文摘要
  5. Hub examples: one training pair may cover several benchmark traps (example 001 → four philosophy cases)
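As a rough illustration of the first four operations, one data record could be modelled as below. Only attributedAuthor and doNotAttributeTo are field names taken from the text. The class name, the other fields, and the validation rules are assumptions made for this sketch, not the repository's actual schema.

```python
from dataclasses import dataclass

# Confidence labels from operation 2 above.
CONFIDENCE_LABELS = {"compiled", "legendary", "disputed", "consensus"}


@dataclass(frozen=True)
class AttributionRecord:
    # attributedAuthor and doNotAttributeTo are named in the text;
    # every other field is a hypothetical placeholder.
    work: str
    attributedAuthor: str
    doNotAttributeTo: tuple[str, ...]
    confidence: str
    tradition: str
    zh_term: str

    def __post_init__(self) -> None:
        # Operation 2: the confidence label must be one of the four named ones.
        if self.confidence not in CONFIDENCE_LABELS:
            raise ValueError(f"unknown confidence label: {self.confidence!r}")
        # Operation 1: an author cannot be both affirmed and refused.
        if self.attributedAuthor in self.doNotAttributeTo:
            raise ValueError("attributedAuthor also appears in doNotAttributeTo")
        # Operation 4: every record carries a canonical 中文 term.
        if not self.zh_term:
            raise ValueError("bilingual anchoring requires a 中文 term")


record = AttributionRecord(
    work="Dao De Jing",
    attributedAuthor="Laozi",
    doNotAttributeTo=("Confucius",),
    confidence="legendary",
    tradition="道家",
    zh_term="道德經",
)
```

Rejecting malformed records at construction time keeps a contradictory entry from reaching the gate at all.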

The framework generalizes from philosophy to psychology, history, and religion without changing the epistemic contract: retrieval of records precedes generation, and a gate checks discipline markers before release.
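The ordering in that contract (retrieve, then generate, then gate) can be sketched in a few lines. Everything here is a toy stand-in under stated assumptions: the record store, the substring retrieval, and the "draft must quote a retrieved record" rule are hypothetical simplifications, not the project's implementation.

```python
from typing import Callable

# Toy curated records; the real data lives in data/*.json with a schema not shown here.
RECORDS = {
    "dao de jing": "Attributed to Laozi (legendary attribution); not written by Confucius.",
}


def retrieve(question: str) -> list[str]:
    """Return curated records whose key appears in the question."""
    q = question.lower()
    return [text for key, text in RECORDS.items() if key in q]


def answer(question: str, generate: Callable[[str, list[str]], str]) -> str:
    """Retrieval precedes generation; the gate runs before anything is released."""
    records = retrieve(question)
    if not records:
        return "ABSTAIN: no curated record covers this question."
    draft = generate(question, records)
    # Gate (simplified): release only drafts that quote a retrieved record.
    if not any(r in draft for r in records):
        return "ABSTAIN: draft is not grounded in the retrieved records."
    return draft


def quoting_generator(question: str, records: list[str]) -> str:
    return "According to the curated record: " + records[0]


def freewheeling_generator(question: str, records: list[str]) -> str:
    return "Yes, Confucius wrote it."


question = "Did Confucius write the Dao De Jing?"
print(answer(question, quoting_generator))
print(answer(question, freewheeling_generator))
```

The second call abstains even though a record was found, because the generator ignored it. That is the point of gating after generation rather than trusting retrieval alone.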

Chapter III

Methodology: corpus, benchmark, gate

3.1 Data layer

Structured JSON in data/: attributions.json, psychology_concepts.json, religion_concepts.json, traditions.json. 528 bilingual examples are published as corpus.jsonl on Hugging Face.

3.2 Benchmark layer

Per-domain cases in tests/benchmark-{domain}.json, scored by tools/run_benchmark.py against explicit markers: denial patterns, myth labels, council format, tradition ids, subfield tags. Reference teacher responses score 100% on all domains; external models run on the same harness. Every headline number must clear the no-overclaim gate (≥2 judge families, κ ≥ 0.40, ≥3 runs, confidence intervals).
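The four no-overclaim conditions are simple enough to express as a checklist. The sketch below assumes κ means Cohen's kappa between two judges, which the text does not specify, and the class and function names are hypothetical rather than taken from tools/run_benchmark.py.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


def cohen_kappa(a: list, b: list) -> float:
    """Chance-corrected agreement between two judges' labels."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:  # both judges used a single identical label
        return 1.0
    return (observed - expected) / (1 - expected)


@dataclass
class HeadlineResult:
    judge_families: int
    kappa: float
    runs: int
    ci: Optional[tuple[float, float]]


def overclaim_failures(result: HeadlineResult) -> list[str]:
    """Return the unmet conditions; an empty list means the gate is cleared."""
    failures = []
    if result.judge_families < 2:
        failures.append("needs >= 2 judge families")
    if result.kappa < 0.40:
        failures.append("needs kappa >= 0.40")
    if result.runs < 3:
        failures.append("needs >= 3 runs")
    if result.ci is None:
        failures.append("needs a confidence interval")
    return failures


judge_a = [1, 1, 0, 0, 1, 0, 1, 1]
judge_b = [1, 0, 0, 0, 1, 0, 1, 1]
kappa = cohen_kappa(judge_a, judge_b)  # 0.75 for these toy labels

print(overclaim_failures(HeadlineResult(2, kappa, 3, (5.6, 19.4))))  # []
print(overclaim_failures(HeadlineResult(1, kappa, 2, None)))
```

Returning the list of unmet conditions, rather than a bare boolean, makes it clear which requirement blocks a headline number.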

3.3 Gate layer

Every answer passes an epistemic gate that checks source-discipline markers before release. On the same marker-based harness, the local model scores 20/23 (87%) and curated RAG + Claude scores 22/23 (96%). Implementation details of the model and training are kept out of this public thesis; the deliverable here is the measured behaviour, not the build recipe.
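The accept / abstain / block outcome of such a gate might look like the following. The mapping is an assumption for illustration (block when a claim names a refused author, abstain when no source settles it), and the SOURCES table is a hypothetical stand-in for the curated data.

```python
from enum import Enum


class Verdict(Enum):
    ACCEPT = "accept"
    ABSTAIN = "abstain"
    BLOCK = "block"


# Hypothetical curated sources: topic -> (attributed author, authors to refuse).
SOURCES = {
    "Republic": ("Plato", {"Socrates"}),
    "cognitive dissonance": ("Festinger", {"Freud"}),
}


def gate(topic: str, claimed_author: str) -> Verdict:
    source = SOURCES.get(topic)
    if source is None:
        return Verdict.ABSTAIN  # nothing to verify against
    author, refused = source
    if claimed_author in refused:
        return Verdict.BLOCK    # a known misattribution
    if claimed_author == author:
        return Verdict.ACCEPT   # the claim matches the source
    return Verdict.ABSTAIN      # the source neither confirms nor refutes


print(gate("Republic", "Socrates").value)           # block
print(gate("cognitive dissonance", "Festinger").value)  # accept
print(gate("Hamlet", "Shakespeare").value)          # abstain
```

Abstain is the default on every path that lacks positive evidence, so a gap in the curated data cannot turn into an accepted claim.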

Chapter IV

UI Council: how this site was decided

Following the religion-council mode, all design voices sit on one panel. No single aesthetic wins by default; tensions are named. Full record: docs/10-Web/UI-Council-Decisions.md.

Council panel (all seated): UX Research · Design Systems · Accessibility · Engineering · Philosophy lineage
UX Research
Thesis-first information architecture: Abstract through References, not a marketing funnel. Persistent table of contents; scannable chapters.
Design Systems
Scholarly-monograph aesthetic: ivory paper, ink type, bronze accent, and a three-state verdict palette (accept / abstain / block) that mirrors the gate itself. Serif body, sans chrome.
Accessibility
17–18px base, generous leading, skip link, visible focus, text labels on every score, light/dark by system preference, CJK-friendly font stack.
Engineering
Static web/ on GitHub Pages; tools/serve_web.py adds /api/ask for the live agent. The manifest is regenerated by build_web_data.py so the page can’t drift from main.
Philosophy lineage
The site must itself practice source discipline: cite paths, show benchmark evidence, and carry the scope disclaimer — wisdom before intelligence, without hype.
Debate / tension: thesis depth vs. mobile brevity → progressive disclosure: full chapters and a sticky contents rail on desktop; a collapsed nav on small screens; the agent panel is optional.
中文:本網站設計經理事會五席表決——論文式章節、學術視覺、無障礙、靜態部署加可選代理 API;體現「智慧優先於智能」。
Chapter V

Empirical results: per-domain leaderboards

Leaderboards are generated from benchmark/results/leaderboard-*.json. A model passes when heuristic markers match domain-specific discipline rules — the same contract applied to the reference teacher and to external runs. These are marker-based harness scores; the headline-grade, multi-judge results live in RESULTS.md.
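Marker-based scoring of this kind reduces to pattern matching over a response. The case ids, the required_markers key, and the regular expressions below are hypothetical; the real cases and rules live in tests/benchmark-{domain}.json and tools/run_benchmark.py.

```python
import re

# Hypothetical cases: a response passes only if every marker pattern matches.
CASES = [
    {"id": "philosophy-001", "required_markers": [r"did not write", r"laozi"]},
    {"id": "history-001", "required_markers": [r"\bmyth\b"]},
]


def passes(response: str, case: dict) -> bool:
    return all(
        re.search(pattern, response, re.IGNORECASE)
        for pattern in case["required_markers"]
    )


def score(responses: dict, cases: list) -> tuple[int, int]:
    """Return (passed, total); a missing response counts as a failure."""
    passed = sum(passes(responses.get(case["id"], ""), case) for case in cases)
    return passed, len(cases)


responses = {
    "philosophy-001": "Confucius did not write the Dao De Jing; it is traditionally attributed to Laozi.",
    "history-001": "Marco Polo brought pasta back from China.",
}

passed, total = score(responses, CASES)
print(f"{passed}/{total}")  # 1/2
```

A heuristic like this rewards the presence of discipline markers, not the correctness of the surrounding prose, which is one reason such scores saturate on easy cases.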

Chapter V · b

Head-to-head: where Sophia wins, and where it doesn’t

The leaderboards above are marker-based and saturate near 100% on easy cases. The charts below are the honest comparisons — drawn straight from the curated published results, each carrying its own gate, confidence interval, and caveat. Sophia’s edge is not raw accuracy; it is abstaining instead of fabricating. So these include the cases where the provenance gate loses — published in the same breath as the wins.

Read this honestly

Chapter VI

The agent: gated, multi-mode, human-approved

Sophia runs as an agent with several decision modes, each constrained by the same epistemic gate: a claim is verified against curated sources before it is accepted, and actions that change state require explicit human approval. The point this thesis defends is behavioural: grounded answers, abstention over fabrication, and a fail-closed posture. The internal architecture and operational wiring are intentionally kept out of the public site.
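The human-approval rule can be illustrated with a small fail-closed wrapper. Since the internal architecture is not public, the Action type, the execute function, and the approval callback are all hypothetical names invented for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Action:
    name: str
    changes_state: bool


def execute(
    action: Action,
    run: Callable[[Action], str],
    approve: Optional[Callable[[Action], bool]] = None,
) -> str:
    """Fail closed: a state-changing action runs only after explicit approval."""
    if action.changes_state:
        approved = approve is not None and approve(action) is True
        if not approved:
            return f"blocked: {action.name} requires human approval"
    return run(action)


def run(action: Action) -> str:
    return f"ran: {action.name}"


lookup = Action("lookup_record", changes_state=False)
write = Action("write_memory", changes_state=True)

print(execute(lookup, run))                         # ran: lookup_record
print(execute(write, run))                          # blocked: write_memory requires human approval
print(execute(write, run, approve=lambda a: True))  # ran: write_memory
```

The check uses `is True` so that a missing approver, or an approver that returns anything other than an explicit yes, leaves the action blocked.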

Chapter VII

Curated retrieval, no open-web grounding

Retrieval is restricted to a curated index (benchmark holdouts excluded) over the project’s data, disputes, domain docs, reference answers, and examples, with no open-web grounding. Whatever backend generates the answer, it passes the same epistemic gate before release. This keeps the system’s claims tied to vetted sources rather than the live internet.
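A curated index that excludes benchmark holdouts could be built roughly as follows. The document fields (id, text, vetted), the holdout set, and the term-overlap ranking are assumptions for illustration; the project's actual index and retriever are not described on this page.

```python
def build_index(documents: list[dict], holdout_ids: set[str]) -> dict[str, str]:
    """Keep only vetted documents that are not benchmark holdouts."""
    return {
        doc["id"]: doc["text"]
        for doc in documents
        if doc["vetted"] and doc["id"] not in holdout_ids
    }


def search(index: dict[str, str], query: str) -> list[str]:
    """Rank documents by how many query terms they contain; drop non-matches."""
    terms = set(query.lower().split())
    hits = [
        (sum(term in text.lower() for term in terms), doc_id)
        for doc_id, text in index.items()
    ]
    return [doc_id for count, doc_id in sorted(hits, reverse=True) if count > 0]


documents = [
    {"id": "psych-01", "text": "Festinger proposed cognitive dissonance in 1957.", "vetted": True},
    {"id": "bench-07", "text": "Did Freud propose cognitive dissonance?", "vetted": True},
    {"id": "web-99", "text": "Freud invented cognitive dissonance.", "vetted": False},
]

index = build_index(documents, holdout_ids={"bench-07"})
print(search(index, "who proposed cognitive dissonance"))  # ['psych-01']
```

Filtering at index-build time, rather than at query time, means a holdout or unvetted document cannot be retrieved by any query.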

中文:線上 RAG 僅檢索審定語料(非開放網路),生成後再過史源關卡,確保主張繫於已審定來源而非即時網路。
Chapter VIII

AGI-candidate proof package

Sophia does not claim proven AGI. It publishes a stricter, auditable proof package: pre-registered thresholds, reproducible local benchmarks under a no-overclaim gate (multi-judge + CIs), a self-extending verifier flywheel that closes on held-out data, hidden-reviewer packs, long-horizon logs, a failure ledger, and a third-party replication checklist.

AGI not proven. Sophia is an AGI-candidate proof package for provenance-aware reasoning. Validated: 0% fabrication on traps; 12.5-point hallucination reduction at 0% false-positive cost.

Proof ladder

Required data before stronger AGI claims

External benchmark status

中文:Sophia 目前是 AGI 候選證明包,而非已證明的 AGI。下一步需要盲測、消融、長時程任務、外部基準與第三方復現。
Chapter IX

Moral + epistemic Conscience Kernel

Sophia includes a deterministic, fail-closed candidate Conscience Kernel that governs AI output, tool calls, and trusted-memory writes. It returns one of seven verdicts (allow, revise, retrieve, clarify, escalate, abstain, or block), combining fact-checking, constitutional limits, and moral-uncertainty handling. The published view is this decision interface and its boundary; the internal composition lives in the repository, not on this site.
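A deterministic, fail-closed decision function over the seven verdicts might take the shape below. Only the verdict names come from the text. The input signals and the priority order among them are assumptions for this sketch and should not be read as the kernel's actual rules.

```python
from enum import Enum


class KernelVerdict(Enum):
    ALLOW = "allow"
    REVISE = "revise"
    RETRIEVE = "retrieve"
    CLARIFY = "clarify"
    ESCALATE = "escalate"
    ABSTAIN = "abstain"
    BLOCK = "block"


# Hypothetical boolean signals the kernel would receive from upstream checks.
REQUIRED_SIGNALS = (
    "constitutional_violation",
    "moral_uncertainty",
    "ambiguous",
    "sources_checked",
    "supported",
    "needs_edit",
)


def decide(signals: dict) -> KernelVerdict:
    # Fail closed: a missing or non-boolean signal yields BLOCK, never ALLOW.
    for name in REQUIRED_SIGNALS:
        if not isinstance(signals.get(name), bool):
            return KernelVerdict.BLOCK
    if signals["constitutional_violation"]:
        return KernelVerdict.BLOCK
    if signals["moral_uncertainty"]:
        return KernelVerdict.ESCALATE
    if signals["ambiguous"]:
        return KernelVerdict.CLARIFY
    if not signals["sources_checked"]:
        return KernelVerdict.RETRIEVE
    if not signals["supported"]:
        return KernelVerdict.ABSTAIN
    if signals["needs_edit"]:
        return KernelVerdict.REVISE
    return KernelVerdict.ALLOW


clean = {
    "constitutional_violation": False,
    "moral_uncertainty": False,
    "ambiguous": False,
    "sources_checked": True,
    "supported": True,
    "needs_edit": False,
}

print(decide(clean).value)                          # allow
print(decide({**clean, "supported": False}).value)  # abstain
print(decide({}).value)                             # block
```

Because the function is a fixed chain of checks with no randomness, the same signals always produce the same verdict, and ALLOW is reachable only when every check has passed explicitly.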

Boundary: this is moral + epistemic control infrastructure for an AGI-candidate system; it is not proof of AGI and does not change canClaimAGI = false.
中文:Conscience Kernel 是七路徑的道德與認知控制層:能決定允許、修改、查證、澄清、升級、棄答或封鎖;它是 AGI-candidate 基礎設施,不是已證明 AGI。
Chapter X

Ask Sophia (live council)

Query the agent through this panel when tools/serve_web.py is running. Otherwise the equivalent CLI command is shown for you to copy.

Project, corpus, benchmark, and growth decisions.

References

Repository & citation

      @misc{sophia2026,
        title  = {Sophia — the Wisdom Gate: Provenance-Aware Reasoning Corpus},
        author = {tomyimkc and Sophia contributors},
        year   = {2026},
        url    = {https://github.com/tomyimkc/sophia-agi}
      }