सिद्ध — that which is realized
Grown, not trained.
Siddha builds machine-learning models that are synthesized directly from the statistics of data, with every weight computed in closed form. No backpropagation. No gradient descent. No GPUs. Seconds of CPU time, where conventional training takes hours of datacenter time.
Built on the Reverse Synthetic Network (RSN) architecture. Every number we publish — including the ones we lose — is reproducible from the public benchmark code.
The idea
The model is already in the data.
Modern AI fixes an architecture first, fills it with random numbers, and spends megawatt-hours of gradient descent forcing those numbers to fit the data. Siddha inverts the flow: the data comes first, and it constructs the network — features discovered by discriminant analysis, neurons grown as prototypes, connections computed from spectral structure.
The result is a model that exists the moment the statistics are computed — the way a leaf's venation exists the moment it unfurls. Deterministic, inspectable, and cheap enough to rebuild every hour on a CPU.
Neuron synthesis
Features are discovered, not designed.
Candidate features are scored by Fisher discriminant ratio; prototypes grow by recursive principal-direction splitting. The data decides what the network's neurons are.
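As a rough illustration of the scoring step, here is a minimal sketch of a per-feature Fisher discriminant ratio (between-class scatter divided by within-class scatter). The function name `fisher_scores`, the `eps` guard, and the toy data are assumptions made for this example; the RSN pipeline's actual scoring code may differ, and prototype splitting is not shown.

```python
from collections import defaultdict


def fisher_scores(X, y, eps=1e-12):
    """Return one Fisher discriminant ratio per feature column.

    X is a list of equal-length feature rows, y the matching class labels.
    eps is an assumed guard against division by zero.
    """
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        between = 0.0  # spread of class means around the overall mean
        within = 0.0   # spread of samples around their own class mean
        for rows in by_class.values():
            values = [row[j] for row in rows]
            class_mean = sum(values) / len(values)
            between += len(values) * (class_mean - overall_mean) ** 2
            within += sum((v - class_mean) ** 2 for v in values)
        scores.append(between / (within + eps))
    return scores


# Toy data (hypothetical): feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [1.2, 1.0], [3.0, 4.0], [3.2, 2.0]]
y = ["a", "a", "b", "b"]
scores = fisher_scores(X, y)
ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
```

In this toy case the first feature scores far higher than the second, so it would be kept first when candidates are ranked.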
Interconnection
Topology from statistics.
An 11-stage pipeline — similarity graphs, label propagation, spectral communities, hub/authority scores — computes how neurons connect and how much each one's vote counts.
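The eleven stages are not spelled out here, so the following is only a sketch of one ingredient named above: hub and authority scores. It assumes the standard HITS power iteration (Kleinberg) over a directed graph; the edge list, node count, and iteration budget are hypothetical, and how Siddha turns these scores into vote weights is not shown.

```python
def hits(edges, n_nodes, iterations=50):
    """Hub and authority scores by power iteration (standard HITS).

    edges is a list of (source, target) pairs over nodes 0..n_nodes-1.
    """
    hubs = [1.0] * n_nodes
    auths = [1.0] * n_nodes
    for _ in range(iterations):
        # Authority of a node: sum of hub scores of the nodes pointing at it.
        new_auths = [0.0] * n_nodes
        for src, dst in edges:
            new_auths[dst] += hubs[src]
        # Hub score of a node: sum of authority scores of the nodes it points to.
        new_hubs = [0.0] * n_nodes
        for src, dst in edges:
            new_hubs[src] += new_auths[dst]
        # Normalize to unit length so the scores stay bounded.
        a_norm = sum(a * a for a in new_auths) ** 0.5 or 1.0
        h_norm = sum(h * h for h in new_hubs) ** 0.5 or 1.0
        auths = [a / a_norm for a in new_auths]
        hubs = [h / h_norm for h in new_hubs]
    return hubs, auths


# Hypothetical 4-node graph: node 2 is pointed at by every other node.
edges = [(0, 2), (1, 2), (3, 2), (0, 1)]
hubs, auths = hits(edges, 4)
top_authority = max(range(4), key=lambda i: auths[i])
top_hub = max(range(4), key=lambda i: hubs[i])
```

Here node 2 ends up with the highest authority and node 0, which links to both scored nodes, with the highest hub score.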
Transformer synthesis
Every weight in closed form.
Embeddings from PPMI + SVD, attention heads from discriminant directions and cross-covariance, feed-forward maps from PCA — a complete transformer with no loss function and no gradient.
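To make the embedding step concrete, here is a minimal sketch of PPMI followed by truncated SVD, written with NumPy. The word-by-context counts, the function name `ppmi_svd_embeddings`, and the choice to scale by the square root of the singular values are assumptions for illustration; attention heads and feed-forward maps are not covered.

```python
import numpy as np


def ppmi_svd_embeddings(counts, dim):
    """Embed rows of a word-by-context count matrix via PPMI + truncated SVD."""
    counts = np.asarray(counts, dtype=float)
    total = counts.sum()
    row_sums = counts.sum(axis=1, keepdims=True)
    col_sums = counts.sum(axis=0, keepdims=True)
    # Counts expected if words and contexts were independent.
    expected = row_sums @ col_sums / total
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log2(counts / expected)
        # Positive PMI: clip negatives, -inf and NaN to zero.
        ppmi = np.where(np.isfinite(pmi) & (pmi > 0), pmi, 0.0)
    u, s, _ = np.linalg.svd(ppmi, full_matrices=False)
    # Keep the top `dim` directions; sqrt scaling is an assumed convention.
    return u[:, :dim] * np.sqrt(s[:dim])


def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical counts: rows are words, columns are context features.
words = ["cat", "dog", "car", "bus"]
counts = [[10, 8, 0, 1],
          [9, 7, 1, 0],
          [0, 1, 9, 8],
          [1, 0, 8, 10]]
emb = ppmi_svd_embeddings(counts, dim=2)
sim_pets = cosine(emb[0], emb[1])   # cat vs dog
sim_cross = cosine(emb[0], emb[2])  # cat vs car
```

No loss function or gradient appears anywhere: the embedding is a deterministic function of the counts, so words that share contexts land close together.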
Convergence
Refinement without training.
Statistical re-weighting, spectral error correction, and an ensemble across embedding geometries — the layer that lifts 20 Newsgroups from 86.3% to 87.0%.
The evidence
Measured, not promised.
Training-free synthesis, against trained baselines, on public benchmarks. One of these rows is a loss — we publish it anyway, because that is what honest research looks like.
| Benchmark | Siddha RSN (zero training) | Trained baseline | Synthesis time | Verdict |
|---|---|---|---|---|
| 20 Newsgroups (4-class), 1,732 train / 1,126 test docs | 87.0% ± 0.5% ensemble, 9/10 seeds above baseline; 86.4% single-model | 86.1% (TF-IDF + tuned linear SVM) | ≈60 s CPU | ahead |
| Topical text (5-class), 105 train / 45 test docs | 68.9% (embedding-only synthesis) | 60.0% (TF-IDF + cosine) | ≈15 s CPU | ahead |
| Digit recognition (8×8), 1,437 train / 360 test images | 90.8% (zero gradient steps) | 96.4% (MLP trained with backpropagation) | < 1 s CPU | behind (published) |
† Sources: RSN paper, Tables 1–4; seed distributions and scripts in the public benchmark package. Full results, including the negative ones →
The honest line
Where synthesis wins — and where it doesn't.
Classification & routing
For discriminative problems — routing documents, classifying text, triaging records — closed-form synthesis is competitive with trained baselines at a vanishing fraction of the compute: 87.0% on 20 Newsgroups against a tuned SVM's 86.1%, beating it on 9 of 10 seeds, in about a minute of CPU.
Generative frontier models
Our research proved — carefully, at 100-million-token scale — that zero-training generation hits a hard statistical ceiling: ~1.9× more bits per byte than GPT-2. We published that negative result with its mechanism instead of burying it. Siddha will not sell you a "zero-training LLM," because there isn't one.
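For readers unfamiliar with the metric, bits per byte is the total code length a model assigns to a text, in bits, divided by the text's size in bytes; lower is better. The sketch below shows only the arithmetic. The probabilities and the helper name `bits_per_byte` are made up for illustration and are not results from the paper.

```python
import math


def bits_per_byte(token_probs, text):
    """Average code length in bits per UTF-8 byte of `text`.

    token_probs holds the probability the model assigned to each
    token that actually occurred.
    """
    total_bits = -sum(math.log2(p) for p in token_probs)
    return total_bits / len(text.encode("utf-8"))


# Hypothetical example: four one-byte tokens with made-up probabilities.
text = "abcd"
probs = [0.5, 0.25, 0.5, 0.25]
bpb = bits_per_byte(probs, text)  # (1 + 2 + 1 + 2) bits over 4 bytes
```

Comparing two models on the same text is then a matter of taking the ratio of their bits-per-byte values.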
Where it serves
Honest use cases.
Document & ticket routing
Route support tickets, emails, claims, and complaints to the right team. Synthesized from your labeled history in seconds; re-synthesized every night, or every hour, for the cost of a CPU coffee break.
On-device & edge classification
Closed-form synthesis needs no training infrastructure, so models can be built on the device that uses them: field sensors, kiosks, clinics with no connectivity. No cloud round-trips.
Privacy-preserving ML
When data cannot leave the premises — health records, legal files, student records — the model can be grown where the data lives. Nothing is shipped to a GPU cluster.
Nature-friendly by construction
AI that fits inside nature's budget.
A model that synthesizes in sixty seconds on one CPU core does not need a datacenter, a cooling tower, or a megawatt. Frugality isn't our marketing — it is the arithmetic of the method.
The energy arithmetic
Illustrative magnitudes for comparable small classification workloads; synthesis numbers are measured, GPU figures come from public hardware power ratings.
The Siddha Lab
Synthesize a working model in your browser. Right now.
Because synthesis is closed-form, it runs anywhere — including this page. Paste labeled examples, press once, and watch a classifier exist milliseconds later. Your data never leaves the tab.
Open the Lab