// the markdown language model

Markdown Marty

Markdown is the model. Zero LLM at inference.

A runtime that walks markdown files — no neural net, no GPU, no embeddings, no API calls — and composes responses from cited files on disk. Hallucination is not a UX problem to mitigate. It is a category that does not exist.

0 LLM calls at inference. The runtime imports no model library. Enforced by static tests.
6 architectural lie detectors. No LLM imports. No embeddings. No subprocess. Cited files exist. Deleting a file changes behavior. The API serves provenance.
~30ms per-query latency with a cold cache, ~2ms warm. Bible-scale corpus on a laptop, no GPU.
100% audit traceability. Every word in every response cites the markdown file it came from. Delete the file, the answer changes.
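
A minimal sketch of the traceability idea, assuming a corpus of `.md` files under one directory. The `answer` function, its match-every-term rule, and the sample files are hypothetical illustrations, not Marty's actual runtime. It returns matching paragraphs together with the files they came from, and `demo` shows that deleting a file changes the result.

```python
import re
import tempfile
from pathlib import Path


def answer(query, corpus):
    """Return paragraphs containing every query term, plus their source files.

    Hypothetical matching rule for illustration: a paragraph is included only
    if it contains all of the words in the query.
    """
    terms = set(re.findall(r"\w+", query.lower()))
    passages, provenance = [], []
    for md_file in sorted(corpus.rglob("*.md")):
        text = md_file.read_text(encoding="utf-8")
        for paragraph in text.split("\n\n"):
            words = set(re.findall(r"\w+", paragraph.lower()))
            if terms and terms <= words:
                passages.append(paragraph.strip())
                # Record the path relative to the corpus root as provenance.
                provenance.append(str(md_file.relative_to(corpus)))
    return {"answer": "\n\n".join(passages), "provenance": provenance}


def demo():
    """Ask the same question before and after deleting the cited file."""
    with tempfile.TemporaryDirectory() as tmp:
        corpus = Path(tmp)
        (corpus / "tea.md").write_text("Green tea steeps at 80 C.", encoding="utf-8")
        (corpus / "coffee.md").write_text("Coffee brews at 93 C.", encoding="utf-8")
        before = answer("tea steeps", corpus)
        (corpus / "tea.md").unlink()  # remove the only file that was cited
        after = answer("tea steeps", corpus)
    return before, after
```

In this sketch the response text is copied verbatim from disk, so nothing can appear in the answer that is not in a cited file. Once `tea.md` is deleted, the same query returns an empty answer and an empty provenance list.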

One framework. Many domains.

If the knowledge in your domain is structured enough to write down, it is structured enough to live in markdown — and Marty can walk it.

Six tests. If any fails, the architecture is a lie.

Most "we don't use an LLM at runtime" claims rot the moment someone adds a fallback. Marty's claim is enforced by tests that run in CI — not by promises in a README.

01 No LLM imports. The runtime's Python imports must come from a stdlib + framework allowlist. Anything else fails the test.
02 No embedding or vector calls. No .encode(), cosine_similarity, faiss, chroma, or any of their cousins.
03 No subprocess. The runtime cannot shell out. Models can't sneak back in through a binary.
04 Every response cites real files. Provenance is a list of paths. Each path must exist on disk inside corpus/ or orchestration/.
05 Deleting a file changes behavior. If you can rip a node out of the corpus and the output is identical, the corpus wasn't load-bearing.
06 The HTTP API serves provenance. Every API response includes the cited corpus files alongside the answer.
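
Tests 01 and 03 can be approximated with a static scan of the runtime's source, sketched here with Python's `ast` module. The `ALLOWED_IMPORTS` set and the function names are assumptions made for illustration. The project's real allowlist and test code are not shown here.

```python
import ast

# Hypothetical allowlist: a few stdlib modules plus an assumed framework
# package name. `subprocess` and `os` are deliberately left out.
ALLOWED_IMPORTS = {"re", "json", "pathlib", "http", "marty"}


def imported_modules(source):
    """Collect the top-level package name of every absolute import."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import; relative imports stay
            # inside the package and are skipped.
            found.add(node.module.split(".")[0])
    return found


def violations(source):
    """Return the imported packages that are not on the allowlist."""
    return imported_modules(source) - ALLOWED_IMPORTS
```

A CI test would read each runtime file and assert that `violations(...)` is empty. A scan like this only sees literal `import` statements, so dynamic imports through `importlib` or `__import__` would need a separate check.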