Submit a Proof

Submit a Lean 4 proof for any of the 300 benchmark conjectures. Human proofs are welcome. For AI model evaluation runs, contact us directly.

Requirements

Proofs must be formalized in Lean 4

We only accept proofs written in Lean 4 against the formal-conjectures repository. Proofs must compile with lake lean using Lean 4.27.0 and Mathlib. Natural language or informal proofs are not accepted. All submissions are run through our zero-trust Comparator verification pipeline before appearing on the leaderboard. We will follow up by email with results.

How It Works
1. Download the benchmark: 300 open conjectures with Lean statements and benchmark IDs.
2. Write and verify locally: use lake lean with the formal-conjectures repo, and confirm your proof compiles before submitting.
3. Email us your proof: fill out the form below to open an email draft with your proof and metadata.
4. We verify and publish: we run your proof through our zero-trust pipeline and follow up by email. Verified results appear on the leaderboard.
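
As a sketch of steps 2 and 3, a local proof file might look like the following (the theorem name and statement here are hypothetical, not taken from the benchmark):

```lean
import Mathlib

-- Hypothetical statement; real statements ship in the
-- formal-conjectures repository with their benchmark IDs.
theorem example_conjecture (n : ℕ) : n + 0 = n := by
  -- Everything after `:= by` is the tactic body that goes
  -- into the submission form.
  simp
```

Run lake lean on the file from inside the repository checkout to confirm it compiles before emailing the tactic body.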
Email Submission

Email Us Your Proof

Fill out the form below and click submit. This will open an email draft in your mail client with your proof attached. Nothing is stored on our servers until we verify it.

Prover: the AI model that generated the proof, or "Human" for manual submissions.
Email: we will email you when verification is complete.
Proof: paste the tactic body (the lines after := by). It must be valid Lean 4 that compiles with lake lean.
Submitting opens your email client with a pre-filled draft. Your proof is not stored on our servers; we verify it on our infrastructure and follow up via email.

What happens after you submit?

We run your proof through our V2 zero-trust pipeline, which uses Lean's official Comparator for sandboxed kernel replay: exact signature matching, axiom auditing (rejecting sorryAx), and scanning for 19 banned tokens. This prevents adversarial redefinition attacks. If your proof verifies, it appears on the leaderboard. We will email you the results either way.
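
The axiom audit can be previewed locally with Lean's #print axioms command: a proof closed with sorry depends on the sorryAx axiom and is rejected. The theorem below is purely illustrative:

```lean
theorem illustrative : 2 + 2 = 4 := by sorry

-- `#print axioms illustrative` lists `sorryAx` among the axioms
-- this proof depends on, so the verifier would reject it.
#print axioms illustrative
```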

Pipeline

Verification Pipeline

V2 Comparator architecture: sandboxed kernel replay, exact signature matching, axiom auditing (sorryAx rejection).
Scoring: verified (proof compiles cleanly), partial (unsolved goals only), or failed.
Error taxonomy: categorized failures such as tactic_failed, unknown_identifier, unsolved_goals, and type_mismatch.
19 banned tokens: sorry, admit, native_decide, unsafe, and 15 more. NFKC-normalized to defeat homoglyph attacks.
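
A minimal sketch of the banned-token scan, assuming a Python harness; the token set here is abbreviated, and the real verifier's list has 19 entries:

```python
import unicodedata

# Abbreviated token set; the real verifier bans 19 tokens.
BANNED = {"sorry", "admit", "native_decide", "unsafe"}

def find_banned(src: str) -> set[str]:
    """Return banned tokens found in src after NFKC normalization."""
    # NFKC folds homoglyphs such as the long s 'ſ' (U+017F) into
    # their ASCII equivalents, so 'ſorry' is caught as 'sorry'.
    normalized = unicodedata.normalize("NFKC", src)
    return {tok for tok in BANNED if tok in normalized}

print(find_banned("theorem t : 1 = 1 := by ſorry"))  # → {'sorry'}
```

Scanning the normalized text rather than the raw bytes is what closes the homoglyph loophole: the banned substring need not appear literally in the submitted source.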
Resources

Benchmark Download

300 open conjectures: structured JSON with benchmark IDs and Lean statements.
Banned token policy: the full disallow list used by the verifier.
Verifier quickstart: local lake lean examples and agent harness commands.
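
The download is described as structured JSON; a hypothetical entry might look like this (the field names are illustrative, not the actual schema):

```json
{
  "benchmark_id": "conj-001",
  "statement": "theorem conj_001 : ∀ n : ℕ, ...",
  "source": "formal-conjectures"
}
```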
Contact

Model Evaluation

Want your AI model evaluated on the full benchmark? We run evaluations on our infrastructure. Contact us with your model name, organization, and intended scope.

Contact Us
Docs

Full Documentation

Benchmark format, API reference, and verification details.

Read docs →