Rule engines have stood on the same premise for 60 years: the validation target is a “fact.”
Drools puts Java objects as “facts” into working memory. Rego treats input as already-true data. JSON Schema assumes the document structure is given. It’s all the same assumption — incoming data is fact.
But what is a rule engine for? Validating whether data satisfies rules. Calling something that needs validation “already true” is a contradiction.
Not Facts, but Claims
Validation targets are not facts — they are claims. Assertions that may be true or false. Their validity must be judged by rules.
JWT already follows this principle. It calls sub, exp, iss not “facts” but “claims.” They are the token issuer’s assertions. Only after verifying the signature, checking expiration, and matching the issuer can they be trusted.
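As a rough illustration of that claim-then-verify order, the sketch below uses only the Go standard library. The `Claims` struct, the `sign` helper, the `demo` function, and the pipe-delimited payload are assumptions made for this example. A real JWT carries base64url-encoded JSON and should be handled by a dedicated library.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"errors"
	"fmt"
	"time"
)

// Claims are the issuer's assertions. Nothing here is trusted yet.
type Claims struct {
	Sub string
	Iss string
	Exp time.Time
}

// sign computes an HMAC-SHA256 tag over the claims.
// Hypothetical stand-in for a real JWT signature.
func sign(c Claims, key []byte) []byte {
	mac := hmac.New(sha256.New, key)
	fmt.Fprintf(mac, "%s|%s|%d", c.Sub, c.Iss, c.Exp.Unix())
	return mac.Sum(nil)
}

// verify returns nil only when signature, expiration, and issuer all check out.
func verify(c Claims, sig, key []byte, wantIss string, now time.Time) error {
	switch {
	case !hmac.Equal(sig, sign(c, key)):
		return errors.New("signature mismatch")
	case !now.Before(c.Exp):
		return errors.New("token expired")
	case c.Iss != wantIss:
		return errors.New("unexpected issuer")
	}
	return nil
}

// demo verifies an untouched token and a tampered copy of it.
func demo() (untouched, tampered error) {
	key := []byte("demo-key")
	now := time.Unix(1700000000, 0)
	c := Claims{Sub: "alice", Iss: "auth.example", Exp: now.Add(time.Hour)}
	sig := sign(c, key)

	untouched = verify(c, sig, key, "auth.example", now)
	c.Sub = "mallory" // altered after signing
	tampered = verify(c, sig, key, "auth.example", now)
	return untouched, tampered
}

func main() {
	untouched, tampered := demo()
	fmt.Println("untouched:", untouched) // untouched: <nil>
	fmt.Println("tampered: ", tampered)  // tampered:  signature mismatch
}
```

The same `Claims` value is rejected or accepted depending on the checks, which is what separates a claim from a fact.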
This structure was already established in 1958.
Toulmin’s Argumentation Model
Stephen Toulmin analyzed the structure of argumentation into six elements in 1958:
- Claim: The target of judgment. What must be verified as true or false.
- Ground: The evidence data used for judgment.
- Warrant: The rule that determines whether the ground supports the claim.
- Backing: The justification for why the rule is valid.
- Qualifier: The degree of confidence in the judgment.
- Rebuttal: The exception conditions under which the claim does not hold.
Formal logic says “if the premises are true, the conclusion is true.” Toulmin was different. “A claim is supported by grounds and warrants, but overturned if exception conditions exist.” Every argument is defeasible.
Rule engines have stood on the formal logic side for 60 years. Input is fact, output is allow/deny, exceptions are a separate mechanism. Toulmin stood on the opposite side. Input is claim, output is degree, exceptions are built-in.
The problem was — Toulmin’s book sat on the philosophy shelf. It was invisible from the rule engine shelf. A 60-year missing link.
So I Built a Rule Engine
toulmin implements Toulmin’s argumentation model as a Go rule engine.
Requirements Evolve
Let’s see how if-else and toulmin respond to the same evolution of requirements.
// Monday: "Only authenticated users, IP blocking applied, internal network exempt from blocking"
g := toulmin.NewGraph("api:access")
auth := g.Warrant(isAuthenticated, nil, 1.0)
blocked := g.Rebuttal(isIPBlocked, nil, 1.0)
exempt := g.Defeater(isInternalIP, nil, 1.0)
g.Defeat(blocked, auth)
g.Defeat(exempt, blocked)
// Tuesday: "Add rate limiting"
limited := g.Rebuttal(isRateLimited, nil, 1.0)
g.Defeat(limited, auth)
// Wednesday: "Premium users are exempt from rate limits"
premium := g.Defeater(isPremiumUser, nil, 1.0)
g.Defeat(premium, limited)
// Thursday: "During incident response, even premium users are limited"
incident := g.Rebuttal(isIncidentMode, nil, 1.0)
g.Defeat(incident, premium)
From Tuesday on, each new requirement adds two lines and changes no existing code. The same evolution with if-else:
// Monday
if user != nil {
    if blockedIPs[ip] {
        if strings.HasPrefix(ip, "10.") {
            allow = true
        }
    } else {
        allow = true
    }
}
// Thursday: 4 levels of nesting, structure unreadable
if user != nil {
    if blockedIPs[ip] {
        if strings.HasPrefix(ip, "10.") {
            allow = true
        }
    } else if isRateLimited(ip) {
        if isPremium(user) {
            if !incidentMode {
                allow = true
            }
        }
    } else {
        allow = true
    }
}
toulmin: 2 lines per requirement, structure unchanged. if-else: Rewrite the entire structure every time.
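To see what the Thursday graph might compute, here is a self-contained stand-in that does not import toulmin. It assumes that a rule which does not fire contributes nothing, that every qualifier is 1.0, and that a positive verdict means the access warrant stands. The `Request`, `rule`, `defeat`, and `strength` names are hypothetical, and the scoring follows the formula given later in this article.

```go
package main

import "fmt"

// Request is the per-request ground; the fields are hypothetical.
type Request struct {
	Authenticated, IPBlocked, InternalIP bool
	RateLimited, Premium, Incident       bool
}

// rule is one graph node: a predicate, a weight, and its attackers.
type rule struct {
	fires     func(Request) bool
	weight    float64
	attackers []*rule
}

func newRule(fires func(Request) bool) *rule { return &rule{fires: fires, weight: 1.0} }

// defeat records that from attacks to.
func defeat(from, to *rule) { to.attackers = append(to.attackers, from) }

// strength is w / (1 + sum of attacker strengths). A rule that does not
// fire contributes nothing. Assumes the graph has no cycles.
func strength(r *rule, req Request) float64 {
	if !r.fires(req) {
		return 0
	}
	sum := 0.0
	for _, a := range r.attackers {
		sum += strength(a, req)
	}
	return r.weight / (1 + sum)
}

// thursday rebuilds the Monday-to-Thursday graph and returns its warrant.
func thursday() *rule {
	auth := newRule(func(r Request) bool { return r.Authenticated })
	blocked := newRule(func(r Request) bool { return r.IPBlocked })
	exempt := newRule(func(r Request) bool { return r.InternalIP })
	limited := newRule(func(r Request) bool { return r.RateLimited })
	premium := newRule(func(r Request) bool { return r.Premium })
	incident := newRule(func(r Request) bool { return r.Incident })
	defeat(blocked, auth)
	defeat(exempt, blocked)
	defeat(limited, auth)
	defeat(premium, limited)
	defeat(incident, premium)
	return auth
}

// verdict maps strength from [0, 1] to [-1, +1].
func verdict(req Request) float64 { return 2*strength(thursday(), req) - 1 }

func main() {
	cases := []struct {
		name string
		req  Request
	}{
		{"anonymous", Request{}},                                                     // -1.000
		{"authenticated", Request{Authenticated: true}},                              // +1.000
		{"rate limited", Request{Authenticated: true, RateLimited: true}},            // +0.000
		{"premium", Request{Authenticated: true, RateLimited: true, Premium: true}},  // +0.333
		{"premium, incident", Request{Authenticated: true, RateLimited: true, Premium: true, Incident: true}}, // +0.200
	}
	for _, c := range cases {
		fmt.Printf("%-18s %+.3f\n", c.name, verdict(c.req))
	}
}
```

Each day's requirement maps to one `newRule` call and one `defeat` call in `thursday`, and the evaluation code never changes.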
Rules Are Go Functions
func(claim any, ground any, backing any) (bool, any)
- ground = judgment material that varies per request (user, IP, context)
- backing = judgment criteria fixed at graph declaration time (thresholds, role names, config)
- Return = (judgment result, evidence). Evidence is a domain-specific free type.
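// CheckOneFileOneFunc fires (returns true) when a file declares more than one function.
func CheckOneFileOneFunc(claim, ground, backing any) (bool, any) {
    g := ground.(*FileGround) // panics if ground is not a *FileGround
    if len(g.Funcs) > 1 {
        // Evidence records what was found versus what the rule expects.
        return true, &Evidence{Got: len(g.Funcs), Expected: 1}
    }
    return false, nil
}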
No need to learn a new language like Rego. Just write Go functions.
backing — Same Function, Different Judgment Criteria
backing passes judgment criteria to rules as runtime values. Registering the same function with different backings creates separate rules:
// Same function, two role names
access := toulmin.NewGraph("access")
admin := access.Warrant(isInRole, "admin", 1.0)
editor := access.Warrant(isInRole, "editor", 0.8)

// Same function, two thresholds; the relaxed limit defeats the strict one
limits := toulmin.NewGraph("line-limit")
strict := limits.Warrant(CheckLineCount, &LineLimit{Max: 100}, 0.7)
relaxed := limits.Warrant(CheckLineCount, &LineLimit{Max: 200}, 0.5)
limits.Defeat(relaxed, strict)
When backing is nil, it means the rule needs no judgment criteria.
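The rule functions themselves are not shown in this article, so the following is a minimal sketch of how `isInRole` and `CheckLineCount` could read their backing. The `User` and `SourceFile` ground types and the string evidence are assumptions for illustration. Only the three-argument signature and `LineLimit{Max}` come from the text.

```go
package main

import "fmt"

// User is a hypothetical ground type for this sketch.
type User struct{ Roles []string }

// isInRole reads the role name from backing, so one function serves every role.
func isInRole(claim, ground, backing any) (bool, any) {
	user, okUser := ground.(*User)
	role, okRole := backing.(string)
	if !okUser || !okRole {
		return false, nil // wrong types: the rule does not fire
	}
	for _, r := range user.Roles {
		if r == role {
			return true, role
		}
	}
	return false, nil
}

// LineLimit is the backing for CheckLineCount.
type LineLimit struct{ Max int }

// SourceFile is a hypothetical ground type for this sketch.
type SourceFile struct{ Lines int }

// CheckLineCount fires when the file exceeds the limit carried in backing.
func CheckLineCount(claim, ground, backing any) (bool, any) {
	file, okFile := ground.(*SourceFile)
	limit, okLimit := backing.(*LineLimit)
	if !okFile || !okLimit {
		return false, nil
	}
	if file.Lines > limit.Max {
		return true, fmt.Sprintf("%d lines, limit %d", file.Lines, limit.Max)
	}
	return false, nil
}

func main() {
	editor := &User{Roles: []string{"editor"}}
	asAdmin, _ := isInRole(nil, editor, "admin")
	asEditor, _ := isInRole(nil, editor, "editor")
	fmt.Println(asAdmin, asEditor) // false true

	file := &SourceFile{Lines: 150}
	strict, evidence := CheckLineCount(nil, file, &LineLimit{Max: 100})
	relaxed, _ := CheckLineCount(nil, file, &LineLimit{Max: 200})
	fmt.Println(strict, relaxed, evidence) // true false 150 lines, limit 100
}
```

The same 150-line file violates the strict limit and satisfies the relaxed one, with no change to the function body.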
Exceptions Are Declared as a Graph
Declare relationships between rules with the Graph Builder API and the engine handles the rest. Functions are identifiers. No string names needed.
g := toulmin.NewGraph("filefunc")
w := g.Warrant(CheckOneFileOneFunc, nil, 1.0)
d := g.Defeater(TestFileException, nil, 1.0)
g.Defeat(d, w)
results, _ := g.Evaluate(claim, ground)
The same function can be reused in different graphs with different defeat relationships:
strictGraph := toulmin.NewGraph("strict")
strictGraph.Warrant(CheckOneFileOneFunc, nil, 1.0)
// No exceptions — test files not allowed either
lenientGraph := toulmin.NewGraph("lenient")
w := lenientGraph.Warrant(CheckOneFileOneFunc, nil, 1.0)
r1 := lenientGraph.Rebuttal(TestFileException, nil, 1.0)
r2 := lenientGraph.Rebuttal(GeneratedFileException, nil, 0.8)
lenientGraph.Defeat(r1, w)
lenientGraph.Defeat(r2, w)
// Both test + generated files are exceptions
Judgment Rationale Is Traced
EvaluateTrace tracks not just the verdict but which rules activated and which rules defeated which:
traced := g.EvaluateTrace(claim, ground)
// traced[0].Verdict: 0.0 (warrant 1.0 attacked by rebuttal 1.0: 2 × 1/(1+1) - 1)
// traced[0].Trace: [
// {Name: "CheckOneFileOneFunc", Role: "warrant", Activated: true, Qualifier: 1.0},
// {Name: "TestFileException", Role: "rebuttal", Activated: true, Qualifier: 1.0},
// ]
Even with dozens of rules, the answer to “why did this verdict come out?” stays human-readable.
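One way to turn such a trace into an explanation is sketched below. `TraceEntry` only mirrors the four fields visible in the trace output and is not the library's own type, and `explain` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// TraceEntry mirrors the fields shown in the trace output.
// The type itself is an assumption, not the library's definition.
type TraceEntry struct {
	Name      string
	Role      string
	Activated bool
	Qualifier float64
}

// explain renders one line per activated rule and skips the rest.
func explain(entries []TraceEntry) string {
	var b strings.Builder
	for _, e := range entries {
		if !e.Activated {
			continue
		}
		fmt.Fprintf(&b, "%s fired as %s (qualifier %.1f)\n", e.Name, e.Role, e.Qualifier)
	}
	return b.String()
}

func main() {
	fmt.Print(explain([]TraceEntry{
		{Name: "CheckOneFileOneFunc", Role: "warrant", Activated: true, Qualifier: 1.0},
		{Name: "TestFileException", Role: "rebuttal", Activated: true, Qualifier: 1.0},
		{Name: "GeneratedFileException", Role: "rebuttal", Activated: false, Qualifier: 0.8},
	}))
	// CheckOneFileOneFunc fired as warrant (qualifier 1.0)
	// TestFileException fired as rebuttal (qualifier 1.0)
}
```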
The Verdict Is Computed by a Single Formula
Amgoud’s h-Categoriser (2013) is applied:
raw = w / (1 + Σ raw(attackers))
verdict = 2 × raw - 1
- +1.0: violation confirmed
- 0.0: undecidable
- -1.0: rebuttal confirmed
When a rule fires, it becomes a warrant. When an exception fires, it becomes an attacker. The formula computes the balance of power between them to produce a verdict. What about exceptions to exceptions? They become attackers of attackers and partially restore the original rule. This is the compensation principle, a property that only h-Categoriser satisfies.
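The formula is short enough to work through by hand. The sketch below is a direct recursive reading of it over a hypothetical `Node` type, assuming an acyclic graph in which every listed node has fired. It is not the engine's own code.

```go
package main

import "fmt"

// Node is one activated rule: its weight and the rules attacking it.
type Node struct {
	Weight    float64
	Attackers []*Node
}

// raw computes w / (1 + Σ raw(attackers)). Assumes no cycles.
func raw(n *Node) float64 {
	sum := 0.0
	for _, a := range n.Attackers {
		sum += raw(a)
	}
	return n.Weight / (1 + sum)
}

// verdict rescales raw from [0, 1] to [-1, +1].
func verdict(n *Node) float64 { return 2*raw(n) - 1 }

func main() {
	// A warrant with no attackers.
	alone := &Node{Weight: 1}
	fmt.Printf("%+.3f\n", verdict(alone)) // +1.000

	// The same warrant attacked by one exception: raw = 1/(1+1).
	exception := &Node{Weight: 1}
	attacked := &Node{Weight: 1, Attackers: []*Node{exception}}
	fmt.Printf("%+.3f\n", verdict(attacked)) // +0.000

	// An exception to the exception: raw = 1/(1+0.5).
	counter := &Node{Weight: 1}
	weakened := &Node{Weight: 1, Attackers: []*Node{counter}}
	restored := &Node{Weight: 1, Attackers: []*Node{weakened}}
	fmt.Printf("%+.3f\n", verdict(restored)) // +0.333
}
```

The third case shows the attacker of an attacker pulling the verdict back above zero without returning it all the way to +1.0.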
Rules Have Three Strengths
Nute’s (1994) classification is applied:
| Strength | Meaning | Example |
|---|---|---|
| Strict | Can never be defeated | “No admin API access without authentication” |
| Defeasible | Can be defeated by exceptions | “One function per file” |
| Defeater | Only blocks other rules, makes no claim of its own | “Test files are exceptions” |
Strict rules reject attack edges. Defeaters only attack and have no judgment of their own. This structurally expresses the enforcement level of rules.
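How a graph might enforce these strengths can be sketched as follows. The `Rule` type, the `Defeat` function that returns an error, and the `Judges` method are hypothetical. Whether the engine reports a rejected edge through an error, a panic, or some other mechanism is not stated in this article.

```go
package main

import (
	"errors"
	"fmt"
)

// Strength follows the three-way classification in the table.
type Strength int

const (
	Strict Strength = iota
	Defeasible
	Defeater
)

// Rule is a hypothetical node type for this sketch.
type Rule struct {
	Name      string
	Strength  Strength
	attackers []*Rule
}

// ErrStrictTarget reports an attack edge aimed at a strict rule.
var ErrStrictTarget = errors.New("strict rule cannot be attacked")

// Defeat adds an attack edge unless the target is strict.
func Defeat(from, to *Rule) error {
	if to.Strength == Strict {
		return fmt.Errorf("%s -> %s: %w", from.Name, to.Name, ErrStrictTarget)
	}
	to.attackers = append(to.attackers, from)
	return nil
}

// Judges reports whether the rule makes a claim of its own.
func (r *Rule) Judges() bool { return r.Strength != Defeater }

func main() {
	authRequired := &Rule{Name: "AuthRequired", Strength: Strict}
	oneFunc := &Rule{Name: "CheckOneFileOneFunc", Strength: Defeasible}
	testFile := &Rule{Name: "TestFileException", Strength: Defeater}

	fmt.Println(Defeat(testFile, oneFunc))           // <nil>
	fmt.Println(Defeat(testFile, authRequired))      // TestFileException -> AuthRequired: strict rule cannot be attacked
	fmt.Println(oneFunc.Judges(), testFile.Judges()) // true false
}
```

Rejecting the edge at declaration time means a strict rule cannot be weakened by a later requirement.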
How Is It Different from Rego?
| | Rego | toulmin |
|---|---|---|
| Rule authoring | Must learn Rego DSL | Go functions |
| Exception handling | Manual default/else patterns | Declarative defeats graph |
| Judgment | Binary allow/deny | Continuous [-1, +1] |
| Rule justification | # METADATA (ignored by engine) | backing (part of the structure) |
| Rule strength | None | strict/defeasible/defeater |
| Engine size | Tens of thousands of lines | Hundreds of lines |
| Speed | Interpreter (parse -> AST -> evaluate) | Direct Go function calls |
Rego is broad — it has a Kubernetes, Terraform, and Envoy integration ecosystem. toulmin is deep — it has what Rego lacks (defeasibility, qualifier, backing).
Repositioning the Qualifier
In Toulmin’s original model, the Qualifier is attached to the Claim. “This patient probably should be given penicillin” — a modal qualifier expressing the confidence of the claim.
The toulmin engine repositions the Qualifier from the Claim to each Rule. In a rule engine, a claim is merely the validation target. “This file has 3 functions” — it’s a factual check, not something that needs a confidence level. What determines the quality of judgment is the rule’s confidence:
- “One function per file” — qualifier 1.0 (certain rule)
- “Recommended under 100 lines” — qualifier 0.7 (flexible rule)
Each Rule’s qualifier becomes the initial weight w(a) in h-Categoriser, and the final verdict takes over the role that the Qualifier played in Toulmin’s original model — the confidence of the judgment.
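A few lines of arithmetic show what the initial weight does to the verdict. This sketch applies the article's formula to a single rule whose attacker strengths are already known. The `verdict` helper is hypothetical, and the 0.7 and 0.5 weights reuse the qualifiers from the line-limit example.

```go
package main

import "fmt"

// verdict applies 2 × w/(1 + Σ attackers) - 1 to one rule with weight w
// whose attackers have the given, already computed, strengths.
func verdict(w float64, attackers ...float64) float64 {
	sum := 0.0
	for _, a := range attackers {
		sum += a
	}
	return 2*(w/(1+sum)) - 1
}

func main() {
	fmt.Printf("%+.3f\n", verdict(1.0))      // +1.000: certain rule, unattacked
	fmt.Printf("%+.3f\n", verdict(0.7))      // +0.400: flexible rule, unattacked
	fmt.Printf("%+.3f\n", verdict(0.7, 0.5)) // -0.067: flexible rule attacked by a 0.5 rule
}
```

Under this formula an unattacked rule can never score above 2q - 1, so a qualifier of 0.7 caps the verdict at +0.4 before any exception is considered.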
Empirical Validation: Converting filefunc’s 22 Rules to Toulmin
filefunc is a code structure convention tool for LLM-native Go development. All 22 rules were converted to Toulmin warrants.
Strength Classification
| Strength | Count | Ratio | Examples |
|---|---|---|---|
| Strict | 15 | 68% | F1, F2, F3, F4, A1-A3, A6-A16 |
| Defeasible | 4 | 18% | Q1, Q2, Q3, C4 |
| Defeater | 3 | 14% | F5, F6, test file exception |
Most are strict — code structure conventions inherently minimize exceptions.
Quantitative Results
| Project | Files (before -> after) | Avg LOC/file (before -> after) | SRP violations resolved | Depth violations resolved |
|---|---|---|---|---|
| filefunc | — (compliant from start) | 25.1 | 0 | 0 |
| fullend | 87 -> 1,260 | 244 -> 25.4 | 66 -> 0 | 148 -> 0 |
| whyso | 12 -> 99 | 147.8 -> 24.4 | 12 -> 0 | 23 -> 0 |
fullend went from 87 files to 1,260. The number of files exploded, but average LOC dropped from 244 to 25.4. All 66 SRP violations and 148 depth violations went to 0.
Theoretical Foundation
There is no original theory. It’s all existing research:
| Element | Original Work |
|---|---|
| 6-element structure | Toulmin (1958) |
| strict/defeasible/defeater | Nute (1994) |
| h-Categoriser | Amgoud & Ben-Naim (2013) |
The originality lies in the discovery that these connect. Things that existed separately in philosophy (Toulmin), logic (Nute), and argumentation theory (Amgoud) for 60 years meet at a single point: the software rule engine.
Computing Contracts
The rule of law works not because judges are smart, but because the structure forces judgment. Rules exist, exceptions are declared, and verdicts are computed based on evidence.
toulmin moved this structure into code.
- Warrant = statute
- Backing = legislative intent
- Strength = mandatory vs. discretionary provision
- Rebuttal = exception clause
- Claim = case
- Ground = evidence
- h-Categoriser = verdict
Declare contracts (warrants), declare exceptions (rebuttals), supply evidence (grounds), and the verdict is computed.
Not by human judgment. By formula.
Acc(a) = w(a) / (1 + Σ Acc(attackers))
Graphs Can Be Defined in YAML
Declare graph structure in YAML without Go code and generate the code:
graph: filefunc
rules:
  - name: CheckOneFileOneFunc
    role: warrant
    qualifier: 1.0
  - name: TestFileException
    role: rebuttal
    qualifier: 1.0
defeats:
  - from: TestFileException
    to: CheckOneFileOneFunc
toulmin graph filefunc.yaml # generates graph_gen.go
Just write the rule functions in Go. The graph structure is declared in YAML.
MIT License. github.com/park-jun-woo/toulmin