Lean 4 project
Annotation Guidelines V2
Introduction
The goal of this task is to create a high-quality dataset of mathematical problems translated into Lean
4. Your role involves formalizing mathematical problems and accurately converting them into Lean 4
code.
From an existing dataset of math problems and proofs written in natural language, we need to:
- Verify that the informal proof is correct, or modify it.
- Check whether the auto-formalization is correct.
- Translate the problem and the proof into Lean 4 according to our guidelines.
- Check the quality of the Lean 4 code.
Annotation Guide
Here is a step-by-step guide on how to proceed to do the task efficiently. For each asset you'll have
access to the question and an informal solution to start from. PLEASE respect the format
expectations:
1. Read the question and its available informal solution. In case there are mistakes in the informal
solution, please do your best to correct them on the formalization platform and only then use the correct
solution to annotate the proof. Provide the modification if relevant.
2. When writing the Lean 4 solution, please copy and paste the problem statement into the Lean 4 file as a
comment between /- and -/ brackets, after the import and open statements.
3. Assign a name to the problem following the naming convention: type_number(subproblem). The
'type_number' corresponds to the asset name in Kili. If applicable, include the subproblem number as
part of the name. For example, a Number Theory question #4516 with two subproblems would be
named number_theory_4516_1 and number_theory_4516_2, respectively. Please declare the
formalized problems using the theorem keyword.
4. For these formalizations, we aim to capture reasoning traces within proofs rather than verify the
statements. Hence, please avoid overly relying on methods such as decide to compute the problem
solution when a human problem-solver would not be able to do so in a reasonable amount of time. We
encourage you to follow one of the official solutions provided but recognize that some differences may
occur due to the nature of the formal setting.
5. Using auxiliary definitions and lemmas in the formalization is strongly discouraged. If absolutely
needed, you may introduce them via the def/lemma keywords, for example to encode the answer to a
multiple-choice question, a function or predicate, or a common lemma.
Kili
Numina Project
Version: V0_01
Date: 09.01.2025
6. Before submitting your answer please make sure it compiles. You can use your usual IDE/Compiler to
check.
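The layout described in steps 2 to 4 can be sketched as follows. This is only an illustration: the problem text is invented and is not the real asset #4516, and the theorem names simply reuse the naming example from step 3.

```lean
import Mathlib

/- (Hypothetical problem text, not the real asset #4516)
1. Show that the sum of two even integers is even.
2. Show that the product of an even integer and any integer is even. -/

theorem number_theory_4516_1 (a b : ℤ) (ha : Even a) (hb : Even b) :
    Even (a + b) := by
  -- Write a = r + r and b = s + s for some integers r and s.
  obtain ⟨r, hr⟩ := ha
  obtain ⟨s, hs⟩ := hb
  -- Then a + b = (r + s) + (r + s), which is even.
  exact ⟨r + s, by rw [hr, hs]; ring⟩

theorem number_theory_4516_2 (a b : ℤ) (ha : Even a) : Even (a * b) := by
  -- Write a = r + r for some integer r.
  obtain ⟨r, hr⟩ := ha
  -- Then a * b = r * b + r * b, which is even.
  exact ⟨r * b, by rw [hr]; ring⟩
```

A single library call (for example Mathlib's Even.add for the first statement) would probably also close the goal, but it would record less of the reasoning trace that step 4 asks for.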
The formal proofs should contain all key mathematical reasoning steps from the informal solutions, but we
are aware that time constraints will require some problems to be formalized only partially. To provide such
partial solutions, we expect annotators to use have and suffices statements. Each of these should be
commented with an appropriate snippet taken from the informal proof or with the annotator's own description.
Naturally, some of the proofs of these sub-statements will be completed only partially and will contain sorry
tactics. For this, we provide a quick check-list to follow before using a sorry within a proof:
1. Does there exist a valid Lean proof from the current goal state? If not certain, please proceed with the
formalization. Else, continue with the check-list.
2. Does the proof from the current goal state require mathematically non-trivial reasoning? If not, please
continue with the formalization. Else, continue with the check-list.
3. Does the proof from the current goal state require only routine and tedious proof steps that cannot be
completed within 2-3 lines of Lean code? If yes, then sorry may be used here to save time. Else, please
provide a short proof from the current goal state.
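A minimal sketch of a partial formalization built from have and suffices statements is given below. The problem text and the theorem name number_theory_0000 are hypothetical. The three case analyses left as sorry are assumed, for illustration only, to be routine but longer than 2-3 lines; in a real submission each one would have to pass the check-list above.

```lean
import Mathlib

/- (Hypothetical problem text, for illustration only)
Prove that n^5 - n is divisible by 30 for every integer n. -/
theorem number_theory_0000 (n : ℤ) : 30 ∣ n ^ 5 - n := by
  -- Since 30 = 2 * 3 * 5, it suffices to show divisibility by 2, 3 and 5 separately.
  suffices h : 2 ∣ n ^ 5 - n ∧ 3 ∣ n ^ 5 - n ∧ 5 ∣ n ^ 5 - n by
    obtain ⟨h2, h3, h5⟩ := h
    omega
  -- Factor the expression: n^5 - n = (n - 1) n (n + 1) (n^2 + 1).
  -- This step is short, so it is proved rather than left as sorry.
  have hfac : n ^ 5 - n = (n - 1) * n * (n + 1) * (n ^ 2 + 1) := by ring
  refine ⟨?_, ?_, ?_⟩
  · -- Divisibility by 2: n - 1 and n are consecutive, so one of them is even.
    rw [hfac]
    sorry
  · -- Divisibility by 3: one of n - 1, n, n + 1 is a multiple of 3.
    rw [hfac]
    sorry
  · -- Divisibility by 5: check the five residues of n modulo 5.
    rw [hfac]
    sorry
```

Each sorry sits under a comment that states the sub-claim taken from the informal argument, so the structure of the proof stays readable even where the Lean details are missing.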
A formalization of a solution following these guidelines can be found in the folder
/AnnotationExamples/Solutions
Example:
Labeling Rules
⚠️(Very Important)
● The Lean 4 / Mathlib version used must be 4.15.0.
● Generate a self-contained solution of the problem using imports from Mathlib. In particular,
please do not use auxiliary def and lemma statements when formalizing the solution.
● If you must use auxiliary def or lemma in your formalization, please wrap your submission in a
namespace following the convention (problem_type)_(problem_number). An example of this is
given in the /AnnotationExamples/Statements/namespace_example.lean file
● Contributors are required to intersperse steps in the informal solution between Lean 4 code
snippets as comments, aligning the informal and the formal solutions. If the formal solution is
significantly more detailed, do not hesitate to add extra comments.
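The namespace convention and the interleaved comments can be sketched as follows. This is not the content of the namespace_example.lean file mentioned above; the problem, the name algebra_0000, the definition a and the lemma a_succ are all hypothetical.

```lean
import Mathlib

namespace algebra_0000

/-- Auxiliary definition: the sequence a_n = 2n + 3 from the (hypothetical) problem. -/
def a (n : ℕ) : ℕ := 2 * n + 3

/-- Auxiliary lemma: consecutive terms differ by 2. -/
lemma a_succ (n : ℕ) : a (n + 1) = a n + 2 := by
  unfold a
  omega

/- (Hypothetical problem text)
Let a_n = 2n + 3. Show that the sequence (a_n) is strictly increasing. -/
theorem algebra_0000 (n : ℕ) : a n < a (n + 1) := by
  -- By the recurrence, a_{n+1} = a_n + 2.
  rw [a_succ]
  -- Adding 2 strictly increases a natural number.
  omega

end algebra_0000
```

Wrapping the submission in a namespace keeps short auxiliary names such as a from clashing with names defined in other submissions or in Mathlib.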
Further resources:
You can find examples of formalizations and more detailed guidelines here: Resources
Using Kili
Ontology
The primary task is to complete the Lean 4 formalization. Since the
Lean 4 format isn't natively supported, you can use a different
predefined language setting instead.
The response itself must strictly be in Lean 4 and in the code
format.
Tag and comment sections are then available to explain
difficulties or specific situations.
You will receive an invitation to Kili by email. Please create an account with this same email. You’ll
have direct access to the project when entering it.
You can then start by clicking on Start Labeling.