
Quick Tips — Just Know This and You Can Command AI
We structured code (Class 8) and systems (Class 9). What remains is data. Data is the most dangerous. When code is wrong, tests catch it. When the system is wrong, /health catches it. When data is wrong, nobody knows. It’s discovered 3 months later in a quarterly report.
To the agent: “Make the schema with explicit columns and constraints instead of JSONB. amount must be greater than 0, status must only accept defined values.”
Putting anything into JSONB means 100 different formats mixed together after 6 months. Explicit columns and constraints are data’s law. The DB immediately rejects constraint violations.
To the agent: “Import this Excel into the DB. DDL constraints must be respected. Report rows that violate constraints in a separate file.”
The agent executes the conversion, and DB constraints validate. Rejected rows are reported with reasons. You only check the rejected data.
To the agent: “Modify the DDL and pass yongol validate. Generate migration files, rollback on failure.”
The ratchet works even when schemas change. Pass and proceed to next step, fail and revert.
If you don’t know databases, all you do is decide “what to store.” State decisions like “phone numbers must start with 010” or “emails must be unique” in natural language, and the agent translates them to DDL.
Hands-on Try
You can try without a DB. In Claude Code:
“Create CSV data: 10 customers with name, email, phone, signup date. Intentionally mix in problems: 2 with bad email format, 1 with empty phone, 1 with future signup date.”
Once CSV is generated:
“Find the problematic rows in this CSV.”
See how many the AI finds. Most likely it won’t find all of them: some rows slip through as “looks fine.”
Now:
“First define a schema for this CSV. Email must contain @, phone is NOT NULL, signup date is before today. Validate again with that schema.”
With a schema declared first, AI mechanically catches everything. Ask for opinions and it misses; give rules and it catches — the principle from Class 7 applies identically to data.
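The difference between "ask for opinions" and "give rules" can be sketched in a few lines. This is a minimal illustration, assuming a CSV with the hypothetical columns `name`, `email`, `phone`, `signup_date` and ISO-formatted dates; the three rules are the ones from the prompt above, and the function names are made up for this example.

```python
import csv
import io
from datetime import date

def build_rules(today: date):
    """Each rule is a (description, predicate) pair over one CSV row."""
    return [
        ("email must contain @", lambda r: "@" in r["email"]),
        ("phone is NOT NULL", lambda r: r["phone"].strip() != ""),
        ("signup date is before today",
         lambda r: date.fromisoformat(r["signup_date"]) < today),
    ]

def validate_csv(text: str, today: date):
    """Return (line_number, violated_rule) for every rule a row breaks."""
    violations = []
    reader = csv.DictReader(io.StringIO(text))
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        for description, check in build_rules(today):
            if not check(row):
                violations.append((line_no, description))
    return violations

# Hypothetical sample data with three planted problems.
SAMPLE = """name,email,phone,signup_date
Kim,kim@example.com,010-1111-2222,2024-01-05
Lee,lee.example.com,010-3333-4444,2024-02-10
Park,park@example.com,,2024-03-01
Choi,choi@example.com,010-5555-6666,2999-01-01
"""
```

Calling `validate_csv(SAMPLE, date.today())` applies every rule to every row, so nothing depends on whether a row "looks fine."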
Why You Need to Command This Way
Introduction: Data Corrupts Before Code
We structured code (Class 8). We structured the system (Class 9). What remains is data.
But data is fundamentally different from code or systems.
When code is wrong, tests catch it. Run go test and in 1 second “here’s what broke.” When the system is wrong, monitoring catches it. /health returns 500 and an alarm sounds immediately.
When data is wrong, nobody knows.
A customer phone number should start with 010 but someone entered one starting with 02. An order amount is negative. Delivery status is “shipping” but the shipping date is null. These errors aren’t caught by tests. Not caught by monitoring. Discovered 3 months later in a quarterly report: “why is revenue negative?”
Imagine building an app with vibe coding. “Make an order management app” produces code fast. Users enter data. The agent adds features and data formats change. Migrations don’t run cleanly, so old and new data get mixed. The code is fine but the data is corrupted.
Code drift is visible. Data drift is invisible.
This is why data is more dangerous than code.
Three Types of Data Corruption
Common data corruption in vibe coding falls into three types.
1. No Schema — Trading Without a Contract
If you just tell the agent “make an order table”:
-- If you make it like this
CREATE TABLE orders (
id SERIAL,
data JSONB -- anything goes in here
);
These few lines are the root cause of 100 different formats mixed together after 6 months.
JSONB columns accept anything. That is convenient at first. Six months later, some orders have amount, others have price. Some values are numbers, some are strings. For agents to handle this data, they must guess at 100 formats.
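To see how quickly formats diverge, you can count the distinct "shapes" hiding in a schemaless column. The sketch below is an assumption-laden illustration: the rows are invented, and `shape` and `count_shapes` are hypothetical helper names, not part of any tool mentioned in this course.

```python
import json
from collections import Counter

def shape(record: dict) -> tuple:
    """Describe a record by its sorted (key, value type) pairs."""
    return tuple(sorted((key, type(value).__name__) for key, value in record.items()))

def count_shapes(rows: list[str]) -> Counter:
    """Count how many rows share each shape."""
    return Counter(shape(json.loads(row)) for row in rows)

# Invented examples of what ends up in a "data JSONB" column.
ROWS = [
    '{"amount": 50000}',    # number under "amount"
    '{"amount": "50000"}',  # same key, but a string
    '{"price": 50000}',     # different key for the same idea
    '{"amount": 12000}',
]
```

Four rows already produce three shapes. Every extra shape is one more format an agent has to guess.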
2. Migration Failure — Past vs Present Collision
You tell the agent “add email field to user table.” Agent modifies DDL and runs migration. New users have email. Existing 100,000 users have email as null. Code assumes email always exists. Existing users log in and get 500 errors.
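The collision is easy to reproduce. This sketch uses an in-memory SQLite database as a stand-in for the real one (an assumption made so the example runs anywhere), with a hypothetical `users` table of three existing rows.

```python
import sqlite3

def add_email_column(conn: sqlite3.Connection) -> None:
    """The migration: add the new field without backfilling old rows."""
    conn.execute("ALTER TABLE users ADD COLUMN email TEXT")

def count_missing_email(conn: sqlite3.Connection) -> int:
    """How many rows would break code that assumes email always exists."""
    return conn.execute("SELECT COUNT(*) FROM users WHERE email IS NULL").fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("Kim",), ("Lee",), ("Park",)])

add_email_column(conn)  # past data now has email = NULL
conn.execute(
    "INSERT INTO users (name, email) VALUES (?, ?)", ("Choi", "choi@example.com")
)
```

One query after the migration tells you how many existing users are about to hit the error. Running that check before deploying is the whole point.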
3. Business Rule Violation — Data That Shouldn’t Be Allowed
“Discount rate must be between 0-50%.” If this rule exists only in code, the agent can eliminate it during refactoring. Without a CHECK (discount >= 0 AND discount <= 50) constraint in the DB, a 200% discount goes in and nobody knows. Discovered 3 months later during settlement.
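Here is a minimal sketch of that CHECK constraint doing its job. It runs on in-memory SQLite rather than your production DB, and the `coupons` table and `try_insert` helper are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE coupons (
        id INTEGER PRIMARY KEY,
        discount INTEGER NOT NULL CHECK (discount >= 0 AND discount <= 50)
    )
""")

def try_insert(conn: sqlite3.Connection, discount: int) -> str:
    """Attempt the insert and report what the database decided."""
    try:
        conn.execute("INSERT INTO coupons (discount) VALUES (?)", (discount,))
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"
```

`try_insert(conn, 200)` is rejected no matter what the application code looks like after refactoring, because the rule lives in the table definition.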
4 Conditions for Agent Operable Data
Four conditions are needed for agents to safely handle data.
Condition 1. Schema Is Declared — DDL Is the Contract for Data
Below is a database blueprint (DDL). It looks like a programming language, but each line is one rule. You don’t need to read it. Each line is explained right below.
CREATE TABLE orders (
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
customer_id BIGINT NOT NULL REFERENCES customers(id),
amount DECIMAL(12,2) NOT NULL CHECK (amount > 0),
status TEXT NOT NULL DEFAULT 'pending'
CHECK (status IN ('pending', 'confirmed', 'shipped', 'delivered', 'cancelled')),
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
shipped_at TIMESTAMPTZ,
CONSTRAINT shipped_requires_date
CHECK (status != 'shipped' OR shipped_at IS NOT NULL)
);
Let’s read this DDL. Humans can read it too, not only agents.
- amount must be greater than 0: negative orders are impossible
- status must be one of 5 values: arbitrary values like “processing” are rejected
- shipped_at must exist when status is ‘shipped’: prevents “shipping but no ship date”
- customer_id must exist in the customers table: prevents ghost customers
DDL is the contract for data. Types, constraints, relationships are explicit. Agents don’t need to guess interpretations. Rules are declared in the DB.
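You can watch the contract reject bad data without setting up PostgreSQL. The sketch below adapts the orders DDL to in-memory SQLite, which is an assumption for portability: column types are simplified and foreign keys must be switched on explicitly. The `open_db` and `accepts` helpers are hypothetical.

```python
import sqlite3

# SQLite adaptation of the orders DDL (types simplified for this sketch).
DDL = """
CREATE TABLE customers (id INTEGER PRIMARY KEY);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id),
    amount NUMERIC NOT NULL CHECK (amount > 0),
    status TEXT NOT NULL DEFAULT 'pending'
        CHECK (status IN ('pending', 'confirmed', 'shipped', 'delivered', 'cancelled')),
    shipped_at TEXT,
    CONSTRAINT shipped_requires_date
        CHECK (status != 'shipped' OR shipped_at IS NOT NULL)
);
"""

def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript(DDL)
    conn.execute("INSERT INTO customers (id) VALUES (1)")
    return conn

def accepts(conn, customer_id, amount, status, shipped_at=None) -> bool:
    """True if the database stores the order, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO orders (customer_id, amount, status, shipped_at) "
            "VALUES (?, ?, ?, ?)",
            (customer_id, amount, status, shipped_at),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

Each rule in the list above maps to one rejected call: a negative amount, the status "processing", "shipped" without a date, and customer 999 who does not exist.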
Condition 2. Transformations Are Verifiable
Data transforms. From CSV to DB. From one table to another. From raw data to reports.
Transformation rules must be declarative and results mechanically checkable.
# Unverifiable transformation
Tell agent: "Put this Excel in the DB"
→ Agent maps columns on its own
→ 3,000 of 100,000 rows wrongly mapped
→ Nobody knows
# Verifiable transformation
Tell agent: "Put this Excel in the DB.
Mapping rules per transform.yaml.
After import, compare row counts,
verify amount totals match the original."
Write transformation rules in a declaration file and verify invariants before and after. This is verifiable transformation.
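A sketch of what "verify invariants before and after" can look like. Everything here is illustrative: the `MAPPING` dict stands in for a transform.yaml whose real format the source does not show, the column names are invented, and in-memory SQLite replaces the real database.

```python
import csv
import io
import sqlite3
from decimal import Decimal

# Hypothetical stand-in for transform.yaml: source header -> DB column.
MAPPING = {"Customer Name": "customer", "Order Amount": "amount"}

def import_and_verify(csv_text: str, conn: sqlite3.Connection) -> dict:
    """Import mapped rows, then compare row count and amount total to the source."""
    source = [
        {MAPPING[header]: value for header, value in row.items()}  # unmapped header fails loudly
        for row in csv.DictReader(io.StringIO(csv_text))
    ]
    source_total = sum(Decimal(r["amount"]) for r in source)

    # Amounts are stored as text here so the Decimal comparison stays exact.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (customer TEXT NOT NULL, amount TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer, amount) VALUES (:customer, :amount)", source
    )

    stored = [Decimal(a) for (a,) in conn.execute("SELECT amount FROM orders")]
    return {
        "row_count_matches": len(stored) == len(source),
        "amount_total_matches": sum(stored) == source_total,
    }

SAMPLE = """Customer Name,Order Amount
Kim,50000.00
Lee,12000.50
Park,8000.25
"""
```

If the same file is accidentally imported twice, both checks turn false. That is exactly the kind of silent error the invariants exist to surface.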
yongol’s DDL → sqlc chaining is an example of this principle. Declare schema in DDL, sqlc generates type-safe Go code. If drift between DDL and sqlc occurs, yongol validate catches it. Verified end-to-end from schema to code.
Condition 3. Source and Timestamp Are Tracked
“When, where, and why was this data created?”
Machines must be able to answer this question.
CREATE TABLE orders (
...
source TEXT NOT NULL, -- 'web', 'api', 'import', 'migration'
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
created_by BIGINT REFERENCES members(id),
updated_at TIMESTAMPTZ,
updated_by BIGINT REFERENCES members(id)
);
Source, timestamp (created_at, updated_at), and actor (created_by, updated_by) are recorded in DB. When the agent investigates “why is this order negative?”:
SELECT source, created_at, created_by FROM orders WHERE amount < 0;
-- source: 'import', created_at: 2026-02-15, created_by: NULL
“Negative amounts came from data imported on Feb 15, with no actor recorded.” With this information, the cause can be traced. Without it, it’s a mystery.
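The same investigation, as a runnable sketch. The table is a cut-down SQLite version of the one above, deliberately without the amount check so that bad rows can exist, and the four rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        amount NUMERIC NOT NULL,   -- no CHECK here, so bad data can get in
        source TEXT NOT NULL,      -- 'web', 'api', 'import', 'migration'
        created_at TEXT NOT NULL,
        created_by INTEGER
    )
""")
conn.executemany(
    "INSERT INTO orders (amount, source, created_at, created_by) VALUES (?, ?, ?, ?)",
    [
        (50000, "web", "2026-02-14", 7),
        (-3000, "import", "2026-02-15", None),
        (-1200, "import", "2026-02-15", None),
        (8000, "api", "2026-02-16", 3),
    ],
)

def negative_amounts_by_source(conn: sqlite3.Connection) -> list:
    """Group the bad rows by where and when they entered the system."""
    return conn.execute("""
        SELECT source, created_at, COUNT(*)
        FROM orders
        WHERE amount < 0
        GROUP BY source, created_at
        ORDER BY source, created_at
    """).fetchall()
```

Grouping by source and date turns "some orders are negative" into "one import batch on one day," which is something you can actually act on.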
Just as whyso tracks “why” for code, data’s “why” must also be tracked. Code has whyso. Data has source/timestamp columns for that role.
Condition 4. Ratchet Applies to Data Changes Too
The Ratchet Pattern from Class 6 doesn’t apply only to code. It applies to data too.
Migration ratchet:
Schema change request
→ DDL modification
→ yongol validate (cross-validation passes)
→ Migration file auto-generated (up + down)
→ Apply to staging DB
→ Verify existing data integrity
→ Pass → Apply to production (approval gate)
→ Fail → Rollback with down file
yongol already implements this. yongol generate detects DDL changes and auto-generates migration files. Up and down files come in pairs. No irreversible migrations.
What the ratchet guarantees: if the migration succeeds, proceed to the next step. If it fails, revert to the previous state. It never stops midway. Same principle as code’s ratchet.
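The pass-or-revert behavior can be sketched independently of any tool. Note the substitution: this example reverts with a database transaction rollback instead of a generated down file, and it is not a description of how yongol works. It assumes in-memory SQLite and a hypothetical `users` table.

```python
import sqlite3

def apply_migration(conn, up_sql: str, integrity_check_sql: str) -> bool:
    """Apply up_sql, then keep it only if the check query counts 0 bad rows."""
    conn.execute("BEGIN")
    try:
        conn.execute(up_sql)
        bad_rows = conn.execute(integrity_check_sql).fetchall()[0][0]
        if bad_rows:
            conn.execute("ROLLBACK")  # fail: revert to the previous state
            return False
        conn.execute("COMMIT")        # pass: lock in and proceed
        return True
    except sqlite3.Error:
        conn.execute("ROLLBACK")
        raise

def columns(conn) -> list:
    """Current column names of the users table."""
    return [row[1] for row in conn.execute("PRAGMA table_info(users)")]

# isolation_level=None lets this sketch control transactions explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users (name) VALUES ('Kim')")
```

A migration that leaves existing rows with a NULL email fails the check and leaves no trace. One that supplies a default for every existing row passes and stays.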
Schema Is the Law I Establish
Here the philosophy running through the entire course appears.
In Class 5 we learned “constraints are contracts.” The three conditions of rule of law — verifiable, violation is defined, enforceable — apply identically to code.
In data, this principle manifests more directly.
Databases have schemas.
Schemas define what valid data is and what isn’t. NOT NULL, FOREIGN KEY, CHECK — data must pass these constraints to be stored. Regardless of who inserts the data. Whether human, program, or AI — if it satisfies the schema, it enters; if not, it’s rejected. A pattern that’s worked since 1970.
Schema is law. Law I establish.
Let’s map rule of law principles again:
| Rule of Law | Data Schema |
|---|---|
| Verifiable | CHECK (amount > 0) — DB verifies automatically |
| Violation is defined | NOT NULL violation, FOREIGN KEY violation — discrete |
| Enforceable | INSERT is rejected on violation |
This connects to the author’s worldview:
Law is not justice (正義) but definition (定義).
Law doesn’t guarantee justice. Schemas don’t guarantee data’s “truth” either. But law guarantees definition. Schemas guarantee validity.
This minimal guarantee — knowable in advance, mechanically verifiable, violations are rejected — is what humanity spent thousands of years winning in blood, and what databases have proven over 50 years.
Data without schema is a society without law. Anyone can put any data in. Wrong data goes unnoticed. Discovered 3 months later.
Data with schema is a rule-of-law society. Break the rules and you’re immediately rejected. Reasons are stated. You can fix and retry.
From Unstructured to Structured
Most real-world data is unstructured.
- Excel files — different formats per sheet
- Call recordings — audio files
- Meeting notes — free-form text
- PDF documents — semi-structured
- Email — natural language
For agents to handle this data, structuring must come first.
Unstructured data → Decide schema → Transform → DB
Excel → Declare DDL → import → PostgreSQL
Recordings → STT → Structure → Summary DB
Notes → Parse → Extract action items → Task DB
PDF → OCR + Parse → Extract fields → Document DB
Notice the key: Humans decide the schema (structure), agents execute transformation.
“Separation of decisions and implementation” from Class 5 applies identically here.
- Decision (human): “Customer info needs name, phone, email, signup date. Phone must start with 010.”
- Implementation (agent): Read Excel, map columns, put data matching constraints into DB.
Agent executes transformation, DB constraints validate. Constraint-violating data is rejected with reasons reported. Human checks rejected data and decides: fix and re-insert, or discard.
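The division of labor above, where the agent inserts, the DB judges, and the human reviews only the rejects, might look like this in miniature. It is a sketch on in-memory SQLite with a hypothetical `customers` table and invented rows. The exact wording of the reasons comes from the database engine and differs between engines and versions.

```python
import sqlite3

def import_rows(conn: sqlite3.Connection, rows: list) -> tuple:
    """Insert rows one by one; return (accepted_count, rejected_with_reasons)."""
    accepted, rejected = 0, []
    for row in rows:
        try:
            conn.execute(
                "INSERT INTO customers (name, phone, email) VALUES (?, ?, ?)", row
            )
            accepted += 1
        except sqlite3.IntegrityError as err:
            # The database states which constraint was violated.
            rejected.append({"row": row, "reason": str(err)})
    return accepted, rejected

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customers (
        name  TEXT NOT NULL,
        phone TEXT NOT NULL CHECK (phone LIKE '010%'),
        email TEXT NOT NULL
    )
""")

ROWS = [
    ("Kim", "010-1111-2222", "kim@example.com"),
    ("Lee", "02-123-4567", "lee@example.com"),    # wrong prefix
    ("Park", None, "park@example.com"),           # missing phone
    ("Choi", "010-3333-4444", "choi@example.com"),
]
```

The `rejected` list is the report: two rows, each with the reason the database gave. Those are the only rows a human needs to look at.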
Data Validation Pattern: 3-Layer Defense
Data validation isn’t one layer but three layers of defense.
1st Defense — DB Constraints (Most Powerful)
NOT NULL -- Prevent empty values
UNIQUE -- Prevent duplicates
CHECK (amount > 0) -- Range restriction
FOREIGN KEY -- Referential integrity
DEFAULT -- Guarantee defaults
DB constraints cannot be bypassed by application code. Whatever code you write, whatever agent you use, data violating constraints doesn’t enter. That’s why it’s the 1st defense. The most important rules must be declared as DB constraints.
2nd Defense — Business Rules (Rego)
Some rules can’t be expressed as DB constraints. “Discounts over 30% require manager approval,” “3 or more orders per day from the same customer is a suspicious transaction.” These rules are declared in Rego.
You don’t need to read this either. State rules in natural language and the agent translates to Rego:
# Order validation rules
deny[msg] {
input.order.discount > 30
not input.approver.role == "manager"
msg := "Discounts over 30% require manager approval"
}
warn[msg] {
count(input.customer.orders_today) >= 3
msg := "Same customer 3+ orders today — verify for suspicious transaction"
}
In natural language:
- “Discount over 30% and manager didn’t approve → deny” = first rule
- “Same customer ordered 3+ times today → warn” = second rule
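If Rego is unfamiliar, the same two rules can be mirrored in plain Python to make the logic concrete. This is not OPA and not how the policy would be deployed. It is a hand-written equivalent for reading purposes, with a hypothetical function name and input shape.

```python
def evaluate_order(order: dict, approver: dict, orders_today: list) -> dict:
    """Mirror of the two rules: deny blocks the order, warn only flags it."""
    deny, warn = [], []

    # First rule: discount over 30% without a manager as approver.
    if order["discount"] > 30 and approver.get("role") != "manager":
        deny.append("Discounts over 30% require manager approval")

    # Second rule: 3 or more orders from the same customer today.
    if len(orders_today) >= 3:
        warn.append("Same customer 3+ orders today")

    return {"deny": deny, "warn": warn}
```

An order with a 40% discount approved by regular staff comes back with one deny message. The same order approved by a manager comes back clean.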
Rego rules are one of yongol’s SSOTs. Class 4’s cross-validation works here too: if SSaC declares @auth, Rego must have a corresponding rule. If not, yongol validate catches it.
3rd Defense — Migration Ratchet
The migration ratchet verifies that existing data is compatible with the new schema whenever the schema changes.
Division of responsibility across the three defense lines:
| Defense | Handles | On violation |
|---|---|---|
| 1st: DB constraints | Data integrity | INSERT/UPDATE rejected |
| 2nd: Rego rules | Business logic | Warning or block |
| 3rd: Migration ratchet | Schema evolution | Rollback or backfill |
Same Pattern, Different Domains
See the common pattern across Classes 8, 9, 10?
| | Class 8: Code | Class 9: System | Class 10: Data |
|---|---|---|---|
| Readable? | filefunc (1 file 1 concept) | /health (structured JSON) | DDL (declarative schema) |
| Verifiable? | go test + tsma | CI/CD + health check | DB constraints + Rego |
| Reversible? | git revert | Previous image rollback | migration down |
| Progress persists? | session.json (tsma) | Terraform state | migration history |
| Decisions separated from implementation? | SSOT → code generation | Declarative config → execution | Schema → data |
All the same structure:
Declare, verify, lock, persist.
The principle that works in code works identically in systems and data. Not a new invention. Applying the same principle to new domains.
“Constraints are contracts” from Class 5 runs through all three domains:
- Code’s contract: filefunc 22 rules, yongol 287 cross-validation rules
- System’s contract: Docker Compose, Terraform, CI/CD pipelines
- Data’s contract: DDL constraints, Rego rules, migration ratchet
When constraints are reasonable, verifiable, and enforceable, and violations are defined, any domain converges.
DDL → sqlc → Code: Seamless Chaining
Let’s see concretely how data chaining works in yongol.
You don’t need to read the code either. Understanding the flow, from DDL all the way to auto-generated code, is sufficient.
1. Declare schema in DDL
2. Declare queries in sqlc
3. yongol validate cross-validates
4. yongol generate produces type-safe Go code
From schema (DDL) to code generation, there’s no gap for human interpretation. DDL changes → sqlc changes → generated code changes → tests catch what broke. This is the structure that keeps drift out of data-driven development.
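To make "drift" concrete, here is a toy check that compares a table's columns against the fields of a code-side model. It is purely illustrative and does not reproduce what yongol validate or sqlc do internally. The `Order` dataclass stands in for generated code, and in-memory SQLite stands in for the real schema.

```python
import sqlite3
from dataclasses import dataclass, fields

@dataclass
class Order:
    """Hypothetical stand-in for a generated model. Note: no status field."""
    id: int
    customer_id: int
    amount: float

def schema_drift(conn: sqlite3.Connection, table: str, model) -> dict:
    """Report columns that exist on only one side (table name is trusted input)."""
    db_columns = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    model_fields = {f.name for f in fields(model)}
    return {
        "missing_in_code": sorted(db_columns - model_fields),
        "missing_in_db": sorted(model_fields - db_columns),
    }

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL,
        amount NUMERIC NOT NULL,
        status TEXT NOT NULL DEFAULT 'pending'
    )
""")
```

Here the schema gained a `status` column that the model never learned about. A mechanical comparison reports it immediately, with no human rereading of either side.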
The Vibe Coder’s Data Practice
“I don’t know DB — how do I do this?”
You don’t need to know. Agent writes the DDL. All you do is decide what to store.
To the agent: "Make a customer management table.
- Need name, phone, email, signup date
- Phone must start with 010
- Email must be unique
- Signup date auto-populated
Make it as DDL with constraints."
Agent writes DDL. yongol validate cross-validates with other SSOTs. When it passes, migration is generated. Even without reading DDL, you can make the decision (“phone must start with 010”).
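For a sense of what the agent might hand back, here is one possible translation of those four decisions, written for in-memory SQLite so it runs as-is. The exact DDL an agent produces for PostgreSQL will differ, and `new_db` and `try_add` are hypothetical helpers.

```python
import sqlite3

# One possible DDL for the four decisions (SQLite dialect, for illustration).
CUSTOMERS_DDL = """
CREATE TABLE customers (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    phone TEXT NOT NULL CHECK (phone LIKE '010%'),   -- must start with 010
    email TEXT NOT NULL UNIQUE,                      -- must be unique
    signup_date TEXT NOT NULL DEFAULT CURRENT_DATE   -- auto-populated
);
"""

def new_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript(CUSTOMERS_DDL)
    return conn

def try_add(conn, name: str, phone: str, email: str) -> bool:
    """True if the customer is stored, False if a constraint rejects the row."""
    try:
        conn.execute(
            "INSERT INTO customers (name, phone, email) VALUES (?, ?, ?)",
            (name, phone, email),
        )
        return True
    except sqlite3.IntegrityError:
        return False
```

Each bullet in the prompt became one clause in the table definition. You can check the decisions by trying a phone number starting with 02 or a duplicate email, without reading the DDL itself.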
Decisions in natural language. Realization of decisions in DDL. Verification of DDL by machine.
Truth Vanishes at the Speed of Light
Here we draw one final philosophy from this course.
Physics tells a cold fact. The moment an event happens, its truth vanishes at the speed of light. The moon you see now is the moon from about 1.3 seconds ago. A galaxy 10 billion light-years away shows its appearance from 10 billion years ago.
Truth physically vanishes. What remains are only claims — fragments of truth.
“I saw this.” “This measurement read this.” “This source said this.” — All claims. Claims with sources, timestamps, and reliability.
Data is the same. “Order amount 50,000 won” in the DB isn’t truth. It’s a claim. A claim someone put in at some time through some path. That’s why source and timestamp matter. Data without source is a claim without evidence. Data without timestamp is a newspaper without a date.
What schema does is give structure to claims. “This claim must be in this form, must satisfy these constraints, must come from this source.” This structure is data’s law.
What has no source is not my data. What has no timestamp is not my record. What has no schema is not my system’s data.
Class 10 Vision: Speak and It’s Built — Code, System, Data
In Class 1 we started here.
“Make a todo list app.”
Code appeared. Worked up to 3 features. Crumbled at 5.
At the end of Class 10, where we stand:
Class 1's world:
"Make an app" → Code appears → Crumbles at 5 features
Class 10's world:
"Make an order management SaaS"
→ Decisions: Define schema, declare features, declare rules
→ Code: yongol generates from SSOT (Class 8)
→ System: CI/CD automates build-deploy-monitor (Class 9)
→ Data: DDL enforces schema, Rego validates rules (Class 10)
→ Approval: Human just presses "approve"
→ Doesn't crumble even at 200 endpoints
| Class | What we learned | Result |
|---|---|---|
| 1 | Vibe coding’s present | “Speak and code appears” |
| 2 | Why it crumbles | Drift, context evaporation, sycophancy |
| 3 | How to prevent | Hurl, Git, CI/CD |
| 4 | The 200-endpoint wall | yongol — declarative SSOT |
| 5 | AI with reins | Reins Engineering 3 pillars |
| 6 | Lock and progress | Ratchet Pattern — one-directional ratchet |
| 7 | Reverse-engineer sycophancy | IFEval — feedback creates convergence |
| 8 | Structure the code | filefunc + tsma — Agent Operable Codebase |
| 9 | Structure the system | 4 conditions — Agent Operable System |
| 10 | Structure the data | Schema is law — Agent Operable Data |
Code → System → Data. The same principle works across all three domains.
Declare, verify, lock, persist. Decisions by humans, implementation and verification by machines. Not rule by man but rule of law.
From Class 1’s “make a todo list app” to Class 10’s “make an order management SaaS” — what changed isn’t model size. It’s structure. We put reins on the agent, laid tracks, and established law.
Speak and it’s built. Not just code, but system and data too. For that to be possible, there must be reins, there must be tracks, there must be law. Designing those reins, tracks, and law is Reins Engineering.
You started this course unable to read a single line of code. Having finished Class 10, what changed isn’t that you can now read code. You now know what to tell agents, why to tell them, and how to verify their reports. This is the capability of a decision-maker.
Related Articles
Reins Engineering Full Course
| Class | Title |
|---|---|
| Class 1 | How to Command AI |
| Class 2 | How to Distrust AI |
| Class 3 | Unbreakable Apps |
| Class 4 | Decisions Outside Code |
| Class 5 | AI with Reins |
| Class 6 | Lock When It Passes |
| Class 7 | Flipping Sycophancy |
| Class 8 | Agent Factory |
| Class 9 | Automation Beyond Code |
| Class 10 | Law of Data |
Sources
- Stanford, “Lost in the Middle: How Language Models Use Long Contexts” (2024) — 30%+ performance drop when relevant info buried in context middle (re-referenced from Class 8)
- Amazon, “Context Length Alone Hurts LLM Performance” (2025) — 13.9-85% performance drop even with whitespace tokens (re-referenced from Class 8)
- E.F. Codd, “A Relational Model of Data for Large Shared Data Banks” (1970) — Relational database model, theoretical foundation for schema-based data integrity
- OPA (Open Policy Agent) / Rego — Declarative policy language for verifying business rules outside code
- yongol DDL → sqlc chaining — Seamless cross-validation structure from schema to type-safe code
- Rule of Law principle — Three conditions of verifiability, violation definition, and enforceability apply identically to code/system/data
- “Law is not justice (正義) but definition (定義)” — Digital rule of law philosophy, presenting schema as the analogue of law
- “Truth vanishes at the speed of light” — Foundation for data source/timestamp tracking from limitations of physical observation