Ambition: Realising AGI through Information Theory

Information-Theoretic Thinking Machines.

Hassana Labs operationalises representations in bits/nats and couples models to verifiers so systems act only when the information budget clears.

Strawberry (Budget Auditor)

Evidence budgets & claim certificates from standard API logprobs.

Strawberry

Information-budget certificates for claims

Audit whether an answer actually used its evidence—measured in bits/nats—then gate the output when the information budget doesn’t clear. Try the demo →

strawberry

detect_hallucination

Quick, automatic verification

  • Auto-splits text into claims
  • Extracts citations automatically
result = detect_hallucination(
  answer="TLS 1.3 is enabled [S0]",
  spans=[{"sid": "S0", "text": "..."}]
)
Best for: Quick checks

audit_trace_budget

Lower-level, precise control

  • You provide atomic claims
  • Explicit cite IDs for precision
result = audit_trace_budget(
  steps=[{"idx": 0, "claim": "...",
          "cites": ["S0"], "confidence": 0.95}],
  spans=[{"sid": "S0", "text": "..."}]
)
Best for: CI/CD pipelines
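For CI gating, a minimal sketch under stated assumptions: the result is assumed to expose the certificate label and budget gap shown in the certificate cards further down (certificate, budget_gap_bits); that return schema is illustrative, not a documented API.

result = audit_trace_budget(
    steps=[{"idx": 0, "claim": "TLS 1.3 is enabled",
            "cites": ["S0"], "confidence": 0.95}],
    spans=[{"sid": "S0", "text": "..."}],
)

# Illustrative gate: field names mirror the certificate cards below, not a documented schema.
if result["certificate"] != "SUPPORTED":
    raise SystemExit(f"budget gap {result['budget_gap_bits']} bits -> abstain")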
Why "Hassana"?
"My grandma never went to high school, but she taught me that learning has no gates."

Named for our founder's grandmother, Hassana Labs opens doors for underrepresented researchers worldwide. Science isn't meant to be gatekept.

Read Hassana's story
We believe talent is everywhere but opportunity isn't. Through open-source tools, global workshops, and collaborative research programs, we're building pathways for underrepresented researchers to contribute to AI safety and verifiable AI.
The Stack

How we build information-theoretic thinking machines

Budgets (bits/nats) + verifiers + certificates—applied to truthfulness, security, reasoning, and formal proof.

01

Claim Certificates

Split answers into atomic claims, then certify whether each claim is supported—robust to evidence ordering and measured as an information budget.

  • Claim extraction: splits answers into atomic, verifiable claims
  • Budget gaps (bits/nats): quantifies how much supporting information is missing
  • QMV permutation probing: tests across evidence orderings for robustness
  • Confidence metrics: q_bar (mean) and q_lo (robust) support scores (sketch below)
  • Certificate output: supported / needs more evidence / abstain
Certificate rule q_lo ≥ 0.95
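To make the rule concrete, a minimal sketch assuming per-claim support scores have already been estimated across evidence orderings (QMV permutation probing); the aggregation choices here, the mean for q_bar and the worst-ordering minimum for q_lo, are illustrative rather than the exact estimators.

import statistics

def certify_claim(q_scores, threshold=0.95):
    """Aggregate per-ordering support scores for one claim into a certificate.

    q_scores: estimated support probabilities, one per evidence permutation.
    """
    q_bar = statistics.mean(q_scores)   # mean support across orderings
    q_lo = min(q_scores)                # robust (worst-ordering) support, illustrative choice
    verdict = "supported" if q_lo >= threshold else "needs more evidence"
    return {"q_bar": q_bar, "q_lo": q_lo, "certificate": verdict}

# A claim that looks fine on average but is fragile under one ordering is not certified.
certify_claim([0.97, 0.98, 0.91])   # q_lo = 0.91 < 0.95 -> "needs more evidence"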
02

Prompt Injection Audit

Estimate policy violation probability under a distribution of prompt serializations—wrappers, placements, and permutations.

  • Threat model coverage: plain, quote, codeblock, XML, JSON wrappers
  • Placement variations: before user, after user, and tool-output positions
  • Baseline vs attack comparison: measures the delta_q shift from payload injection
  • Vulnerability identification: pinpoints the weakest serialization combinations
Risk certificate attack.q_lo → minimize
Learn more →
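A sketch of the audit loop under stated assumptions: serialize and judge are hypothetical callables (a prompt builder and a violation-probability estimator); the wrapper and placement families follow the list above.

import itertools
import statistics

WRAPPERS = ["plain", "quote", "codeblock", "xml", "json"]
PLACEMENTS = ["before_user", "after_user", "tool_output"]

def audit_injection(serialize, judge, payload):
    """Estimate violation probability over a distribution of prompt serializations."""
    per_combo = {}
    for wrapper, placement in itertools.product(WRAPPERS, PLACEMENTS):
        base = judge(serialize(wrapper, placement, payload=None))    # baseline, no payload
        atk = judge(serialize(wrapper, placement, payload=payload))  # with injected payload
        per_combo[(wrapper, placement)] = {"baseline": base, "attack": atk, "delta_q": atk - base}
    attack_q = [r["attack"] for r in per_combo.values()]
    weakest = max(per_combo, key=lambda k: per_combo[k]["attack"])   # weakest serialization combo
    return {"q_bar": statistics.mean(attack_q), "q_lo": min(attack_q),
            "weakest": weakest, "per_combo": per_combo}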
03

Reasoning Budgeting

Allocate the smallest stable reasoning budget based on context size and target error probability—because longer chains can increase selection competition.

  • Heuristic formula: k ≈ c · √n · ln(1/ε) tokens (sketch below)
  • Calibration from data: fit the constant c from observed runs
  • Context-aware scaling: budget grows with √n, not linearly
  • Error targeting: explicit ε parameter for reliability goals
Example budget n=600, ε=0.05 → k≈73
Learn more →
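The heuristic in code: a direct transcription of k ≈ c · √n · ln(1/ε), with c = 1 as a placeholder default to be calibrated from observed runs; rounding up is a choice made here.

import math

def reasoning_budget(n, eps, c=1.0):
    """Smallest stable reasoning budget in tokens: k ≈ c · sqrt(n) · ln(1/eps)."""
    return math.ceil(c * math.sqrt(n) * math.log(1.0 / eps))

reasoning_budget(n=600, eps=0.05)   # ≈ 73 before rounding, 74 after ceil, with c = 1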
04

AI Scientist Wrapper (Proofs)

Wrap any LLM with a formal checker so it cannot claim “proof closed” unless the verifier accepts. Then keep an audit trail of what was actually used.

  • Checker-gated progress: Lean4 is the source of truth, no “trust me” steps
  • Honest failure boundaries: “sorry” stays visible; partial progress is still progress
  • Audited tool use: trace-bound evidence prevents “looked right” claims
  • Proof-carrying artifacts: outputs are checkable certificates, not prose
Certificate Lean accepts (0 sorries)
Learn more →
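A minimal sketch of the gate, assuming the Lean 4 CLI is on PATH (a real project would build through lake); the sorry scan is deliberately naive and only illustrates the honesty rule.

import subprocess
from pathlib import Path

def proof_closed(lean_file: str) -> bool:
    """'Proof closed' only if Lean accepts the file and it contains no sorry."""
    if "sorry" in Path(lean_file).read_text():    # naive scan; keeps partial progress visible
        return False
    check = subprocess.run(["lean", lean_file], capture_output=True, text=True)
    return check.returncode == 0                  # the checker, not the model, decides

# The wrapper only emits a proof certificate when proof_closed(...) returns True.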
Certificates

Outputs you can recompute, not just read

Thinking machines earn trust by producing small, checkable artifacts: information budgets, risk bounds, and verifier acceptance.

Claim certificate

Strawberry

Measure whether an answer actually used its evidence. If the information budget doesn’t clear, the system abstains.

budget_gap_bits: +8.4
certificate: NEEDS_EVIDENCE
action: abstain
See how it works →

Security certificate

Injection Audit

Treat prompt injection as a distributional problem. Emit a lower bound on violation probability across attack families.

q_bar: 0.23
q_lo: 0.18
verdict: VULNERABLE (q_lo > 0.05)
Explore the methodology →

Proof certificate

Nala + Lean4

Couple the model to a checker. “Proof closed” is a verifier statement, not a model claim.

verifier: Lean4
status: ACCEPTED
artifact: machine-checkable proof
See proof-carrying agents →
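For concreteness, the kind of artifact such a certificate points at: a tiny, illustrative Lean 4 file the checker accepts with 0 sorries (not output from Nala).

-- Machine-checkable artifact: Lean 4 accepts this file with 0 sorries.
theorem two_add_two : 2 + 2 = 4 := rfl

theorem nat_add_zero (n : Nat) : n + 0 = n := rfl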
"

Transformers minimize expected description length across permutations—this explains why they look Bayesian in expectation.

"
Bayesian in Expectation Hassana Labs, 2025
From the Blog

Latest research notes


Join us in building information-theoretic thinking machines.

Bring information budgets and verifier coupling into high-stakes AI systems.