三元组不是事实，而是主张

三元组不是事实，而是主张 Image: AI generated

维基数据的沉默

维基数据中有这样的三元组：

(tomato, instance_of, vegetable) — rank: preferred
(tomato, instance_of, fruit)     — rank: normal

谁决定了 preferred？为什么是 preferred？在什么上下文中是 preferred？

维基数据对这些问题保持沉默。编辑者做出决定，系统只是存储决定。

但番茄是蔬菜还是水果，这不是物理常数。问厨师，是蔬菜。问植物学家，是水果。问美国最高法院，是蔬菜（1893年，Nix v. Hedden）。同一个问题有三个答案，三个都没错。

知识图谱的三元组不是事实。它们是主张。

主张需要论证

要存储主张，需要结构。Toulmin 的论证模型提供了这个结构。

要素	角色	番茄示例
Claim	主张	“番茄是蔬菜”
Ground	直接依据	“在烹饪分类体系中被视为蔬菜”
Backing	来源/权威	“Le Guide Culinaire (1903)”
Qualifier	适用范围	“在烹饪语境中”（置信度 0.8）
Rebuttal	反驳条件	“在植物学语境中是水果——子房结构”
Warrant	连接逻辑	“传统食材分类以烹饪用途为标准”

不是对一个三元组强加单一的 truth value，而是将三元组提升为论证的对象。有主张，有依据，有反驳条件，有来源。而判定——不在存储时发生，而在查询时发生。

这个想法本身并不新颖。学术界已有 Dung 的抽象论证框架（1995）、ASPIC+（2010）、nanopublication 等研究探讨知识图谱上的论证。区别只有一个——我们提供的不是论文，而是可执行的代码。用 go install 安装，用 Go 函数编写规则，现在就能运行。

上下文决定真相

存储是论证结构。判定在运行时。

ctx.Set("domain", "cooking")
results, _ := g.Evaluate(ctx)
// verdict: +0.8 → vegetable

ctx.Set("domain", "botany")
results, _ = g.Evaluate(ctx)
// verdict: -0.9 → not a vegetable (fruit)

同一个图谱，同一个论证结构，同一份代码。只改变了上下文。在烹饪语境中查询得到 +0.8（蔬菜），在植物学语境中查询得到 -0.9（水果）。判定跟随上下文。

这就是与维基数据静态 rank 的根本区别。不是编辑者决定 preferred，而是查询者的上下文产生判定。

冥王星是行星吗

Triple: (Pluto, instance_of, planet)

Claim:
  Ground: "Solar orbit, spherical, sufficient mass"
  Backing: IAU 1930 classification resolution

Rebuttal:
  Ground: "Orbit-clearing criterion not met"
  Backing: IAU 2006 Resolution 5A

Exception to rebuttal:
  Ground: "Still taught as a planet in education curricula"
  Backing: NASA Education Resources 2024

ctx.Set("domain", "astronomy_formal")
// verdict: -0.8 (not a planet)

ctx.Set("domain", "education")
// verdict: +0.3 (can be treated as planet in context)

2006年之前上小学的人，冥王星就是行星。对 IAU 来说，冥王星是矮行星。两者都有依据，两者都有来源。系统要做的不是选择其一，而是两者都存储，根据上下文进行判定。

当来源被攻击时

在学术争论中，来源本身被攻击是常事。

// Claim: Drug X is effective
claim := g.Rule(drugXEffective).
    With(&TripleSpec{
        Ground:  "Significant effect in Phase 3 clinical trial",
        Backing: "Smith et al. (2020), NEJM",
    })

// Rebuttal: Independent replication failed
counter := g.Counter(replicationFailed).
    With(&TripleSpec{
        Ground:  "Effect not confirmed in independent replication",
        Backing: "Jones et al. (2022), Lancet",
    })
counter.Attacks(claim)

// Undercutter: conflict of interest
undercutter := g.Counter(conflictOfInterest).
    With(&TripleSpec{
        Ground:  "Smith's funding came from the pharmaceutical company",
        Backing: "Retraction Watch (2023)",
    })
undercutter.Attacks(claim)

Smith 的论文发表在 NEJM 上。权威来源。但当研究经费来源被揭露，基于该论文的所有主张都会被削弱。counter 正面反驳主张，undercutter 则削弱主张的依据本身。两者都攻击主张，但方式不同。h-Categoriser 综合这些攻击的强度，计算最终 verdict。

真相以光速消逝，只有主张留存。系统管理的是主张，而不是宣告真相。

所有三元组都需要论证吗

不是。

(water, chemical_formula, H2O)   — Settled. No argumentation needed.
(tomato, instance_of, ?)         — Contested. Attach argumentation.
(Pluto, instance_of, ?)          — Contested. Attach argumentation.
(Jerusalem, capital_of, ?)       — Contested. Attach argumentation.

标准很简单：如果同一 subject + predicate 存在多个 object，或 rank 有分歧，或 reference 相互冲突——就是争议性三元组。其余的保持普通三元组即可。

给水的化学式附加论证是浪费。对耶路撒冷的首都地位不附加论证是谎言。

判定引擎：h-Categoriser

论证图谱的判定由 Amgoud 的 h-Categoriser 执行。它为每个节点计算 [-1, +1] 区间的可接受度，攻击者的可接受度越高，被攻击者的可接受度就越低。递归迭代直到收敛。

性能：即使 10 万个争议性三元组各自拥有论证图谱，查询时只需 evaluate 该三元组的图谱。与整个知识图谱的规模无关。

Triple lookup:        milliseconds (existing index)
Argumentation eval:   milliseconds (h-Categoriser)
Verdict trace:        milliseconds (graph traversal)

不是扩大模型，而是扩大论证。

与维基数据 rank 的对应

Wikidata	toulmin extension
preferred rank	verdict > +0.5 (in current context)
normal rank	0 < verdict ≤ +0.5
deprecated rank	verdict ≤ 0
reference	Backing
qualifier	Qualifier + context function conditions

区别：维基数据的 rank 是静态的，由编辑者决定。toulmin 的 verdict 是动态的，由上下文和论证结构决定。

更大的图景

这个系统不依赖于特定领域。

AI Prefrontal Cortex (feature selection):
  triple = (feature, should_include_in, app_type)
  context = current project's app tags
  verdict = should this feature be included?

Knowledge graph general:
  triple = (subject, predicate, object)
  context = querier's domain/perspective
  verdict = is this triple true in current context?

同一个引擎。同一个结构。不同的领域。规则是 Go 函数，例外是 defeats 图谱，判定是 h-Categoriser。没有 DSL。

为什么需要这个

LLM 将知识融入权重。提问就有答案。但那个答案在什么上下文中为真、基于什么来源、是否存在反驳——无法从结构上追踪。幻觉源于这种结构的缺失。

这个系统不能阻止所有幻觉。LLM 生成开放式输出，不可能预先注册所有可能的主张。但对于已经在论证图谱中注册的主张，可以将 LLM 生成的答案与之对照，评估可信度。“这个主张的 Backing 是什么。有没有攻击该 Backing 的 Counter。当前上下文中 verdict 是正数吗。”

这不是通用真相判定器。而是在积累的论证之上运作的可信度评估系统。

不是存储事实的系统，而是管理主张的系统。不是宣告真相，而是追踪判定。这就是知识图谱的下一步。

toulmin — Go Rule Engine — 基于 Toulmin 论证模型的规则引擎。本文背后的判定引擎。
Ratchet Pattern — 确定性验证与棘轮锁定。

Code: github.com/park-jun-woo/toulmin

go install github.com/park-jun-woo/toulmin@latest

The code examples in this article represent a design vision based on the toulmin library’s current API. The knowledge graph extension (TripleSpec, context-based evaluation) is under active development. The core judgment engine (h-Categoriser, defeats graph, Rule/Counter) works today.

References

Toulmin, S. (1958). The Uses of Argument. Cambridge University Press.
Dung, P.M. (1995). “On the Acceptability of Arguments and its Fundamental Role in Nonmonotonic Reasoning, Logic Programming and n-Person Games.” Artificial Intelligence, 77(2), 321–357.
Amgoud, L. & Ben-Naim, J. (2013). “Ranking-based semantics for argumentation frameworks.” SUM 2013, LNCS 8078, 134–147.
Modgil, S. & Prakken, H. (2014). “The ASPIC+ framework for structured argumentation: a tutorial.” Argument & Computation, 5(1), 31–62.
Groth, P. et al. (2010). “Anatomy of a Nanopublication.” Information Services & Use, 30(1-2), 51–56.
Nix v. Hedden, 149 U.S. 304 (1893). United States Supreme Court.