Grammar

Python's grammar has been a PEG (Parsing Expression Grammar) since 3.9. The canonical definition lives in Grammar/python.gram. This page indexes every non-terminal, lists the entry points, and explains the lookahead conventions.

Entry points

| Mode | Start symbol | Used by |
| --- | --- | --- |
| exec / file | file | pythonrun.RunSimpleString, .py files. |
| eval | eval | parser.ModeEval, eval(). |
| single / REPL | interactive | parser.ModeSingle, gopy -c. |
| Function type | func_type | typing.get_type_hints. |

The start symbol consumes input until ENDMARKER. Anything left over is a SyntaxError.
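
The same four start symbols are reachable through the mode argument of CPython's ast.parse, which makes them easy to try out. The sketch below uses CPython's standard library rather than gopy's own API, and assumes gopy accepts the same inputs; leftover_is_error is a helper name made up for this example.

```python
import ast

# Each mode selects a different start symbol of the grammar.
module = ast.parse("x = 1\nprint(x)\n", mode="exec")      # file
expr = ast.parse("x + 1", mode="eval")                     # eval
repl = ast.parse("x = 1\n", mode="single")                 # interactive
sig = ast.parse("(int, str) -> bool", mode="func_type")    # func_type

root_types = [type(tree).__name__ for tree in (module, expr, repl, sig)]


def leftover_is_error(source: str, mode: str) -> bool:
    """Return True if input left over after the start symbol is rejected."""
    try:
        ast.parse(source, mode=mode)
    except SyntaxError:
        return True
    return False
```

Each mode produces a different root node type, and leftover_is_error("1 2", "eval") reports the second token as an error because eval must reach ENDMARKER after one expression list.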

Top-level structure

file: [statements] ENDMARKER
interactive: statement_newline
eval: expressions NEWLINE* ENDMARKER
func_type: '(' [type_expressions] ')' '->' expression NEWLINE* ENDMARKER

Statement classes

| Non-terminal | Covers |
| --- | --- |
| statements | One or more statement. |
| statement | A compound statement or a sequence of simple statements. |
| statement_newline | Statement followed by NEWLINE (interactive). |
| simple_stmts | ;-separated simple_stmt, terminated by NEWLINE. |
| simple_stmt | One of the simple statements (see below). |
| compound_stmt | if, while, for, try, with, match, function_def, class_def, async_stmt. |
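
As a quick illustration of the split between simple_stmts and compound_stmt, the sketch below parses one ;-separated line followed by an if block. It uses CPython's ast module as a stand-in and assumes gopy builds the same statement list.

```python
import ast

source = "x = 1; y = 2; del x\nif y:\n    pass\n"
tree = ast.parse(source)

# Three simple statements from the first line, then one compound statement.
kinds = [type(node).__name__ for node in tree.body]
```

The ; separator leaves no trace in the tree: each simple_stmt becomes its own entry in the module body, next to the compound statement.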

Simple statement productions

| Production | Syntax |
| --- | --- |
| assignment | Targets, augmented, annotated. |
| type_alias | type Name = expression |
| star_expressions | Bare expression statement (top-level). |
| return_stmt | 'return' [star_expressions] |
| import_stmt | import_name or import_from. |
| raise_stmt | 'raise' [expression ['from' expression]] |
| pass_stmt | 'pass' |
| del_stmt | 'del' del_targets |
| yield_stmt | yield_expr |
| assert_stmt | 'assert' expression [',' expression] |
| break_stmt | 'break' |
| continue_stmt | 'continue' |
| global_stmt | 'global' NAME (',' NAME)* |
| nonlocal_stmt | 'nonlocal' NAME (',' NAME)* |

Compound statement productions

| Production | Heads |
| --- | --- |
| function_def | def, async def, with @decorator stack. |
| class_def | class, with decorators and PEP 695 type params. |
| if_stmt | if, elif, else. |
| while_stmt | while, optional else. |
| for_stmt | for / async for, optional else. |
| with_stmt | with / async with, parenthesised items. |
| try_stmt | try with any of except, except*, else, finally. |
| match_stmt | match with one or more case clauses. |
| async_stmt | async def, async for, async with. |

Expressions

Expression hierarchy

The grammar is layered by precedence. The list below runs from loosest binding to tightest:

expressions -> expression (',' expression)*
expression -> conditional | lambdef
conditional -> disjunction ['if' disjunction 'else' expression]
disjunction -> conjunction ('or' conjunction)*
conjunction -> inversion ('and' inversion)*
inversion -> 'not' inversion | comparison
comparison -> bitwise_or (comp_op bitwise_or)*
bitwise_or -> bitwise_xor ('|' bitwise_xor)*
bitwise_xor -> bitwise_and ('^' bitwise_and)*
bitwise_and -> shift_expr ('&' shift_expr)*
shift_expr -> sum (('<<' | '>>') sum)*
sum -> term (('+' | '-') term)*
term -> factor (('*'|'/'|'//'|'%'|'@') factor)*
factor -> ('+'|'-'|'~') factor | power
power -> await_primary ['**' factor]
await_primary -> 'await' primary | primary
primary -> atom (call_suffix | subscript | attribute_ref | ...)*
atom -> NAME | literal | group | list | dict | set | gen | comprehension
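
The layering can be checked by looking at which node ends up outermost in the parsed tree: the looser-binding rule always wraps the tighter one. This is a sketch using CPython's ast module, not gopy's API; root is a helper name made up for this example.

```python
import ast


def root(source: str) -> ast.expr:
    """Parse an expression and return its outermost AST node."""
    return ast.parse(source, mode="eval").body


# factor is looser than power: -2 ** 2 parses as -(2 ** 2).
neg = root("-2 ** 2")
# inversion is looser than comparison: not a == b parses as not (a == b).
inv = root("not a == b")
# The right operand of ** is a factor, so a unary operator is allowed there.
pw = root("2 ** -1")
# disjunction is looser than conjunction: a or b and c is a or (b and c).
boolean = root("a or b and c")
```

Here neg and inv are both UnaryOp nodes wrapping a BinOp and a Compare respectively, while pw is a BinOp whose right operand is a UnaryOp.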

Atoms

| Form | Meaning |
| --- | --- |
| NAME | Name reference. |
| True / False / None | Singletons. |
| ... (Ellipsis) | Singleton. |
| NUMBER | Numeric literal. |
| strings | String/bytes/f-string/t-string literal. |
| '(' yield_expr ')' | Parenthesised yield. |
| '(' tuple ')' | Parenthesised tuple or grouping. |
| '[' list ']' | List display or comprehension. |
| '{' set '}' | Set display or comprehension. |
| '{' dict '}' | Dict display or comprehension. |

Comprehensions

Comprehension syntax:

comp_for -> 'async'? 'for' targets 'in' disjunction ('if' disjunction)*

Each subsequent for or if clause nests inside the one before it, exactly as the equivalent nested for and if statements would. A comprehension has its own scope, so its iteration variables do not leak into the enclosing namespace.
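
A minimal sketch of both points: the clause nesting order and the scoping rule. The variable names are arbitrary.

```python
# Clauses read left to right, outermost first.
pairs = [(a, b) for a in range(3) if a for b in range(a)]

# The same clauses written as nested statements.
expected = []
for a_ in range(3):
    if a_:
        for b_ in range(a_):
            expected.append((a_, b_))

# The comprehension's own x does not overwrite the enclosing x.
x = "outer"
squares = [x * x for x in range(4)]
```

After this runs, pairs and expected hold the same list, and x is still "outer".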

Call syntax

call -> primary '(' [arguments] ')'
arguments -> args [',' kwargs] | kwargs
args -> (starred_expression | (assignment_expression | expression !':=') !'=') (',' ...)*
kwargs -> kwarg_or_starred (',' ...)*
kwarg_or_starred -> NAME '=' expression | '**' expression

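Positional arguments come first, then keyword arguments. *expr unpacks an iterable into positional arguments; **expr unpacks a mapping into keyword arguments. The walrus operator (:=) is allowed in a positional argument; the !'=' lookahead keeps NAME '=' expression from being consumed as a positional argument.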
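
The sketch below exercises these rules in plain Python. show and parses are helper names made up for this example; the syntax check uses CPython's built-in compile and assumes gopy rejects the same inputs.

```python
def show(*args, **kwargs):
    """Return the arguments exactly as the callee receives them."""
    return args, kwargs


# *expr unpacks into positional arguments, **expr into keyword arguments.
unpacked = show(1, *[2, 3], key="v", **{"other": 4})

# A walrus is accepted as a positional argument and binds n in the caller.
walrus = show(n := 10)


def parses(source: str) -> bool:
    """Return True if source is a syntactically valid expression."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return False
    return True
```

With these helpers, parses("f(a=1, 2)") is False because a plain positional argument follows a keyword argument, while parses("f(a=1, *b)") is True.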

Patterns (match)

| Pattern | Form |
| --- | --- |
| Capture | name (a single NAME that is not a value pattern). |
| Wildcard | _ |
| Value | dotted.name (constant lookup). |
| Literal | numeric, string, True, False, None. |
| Group | '(' pattern ')' |
| Sequence | '[' [pattern (',' pattern)*] ']' or the same with (). |
| Mapping | '{' [mapping_items] '}' |
| Class | dotted.name '(' [pattern_args] ')' |
| *-rest | *name in sequence patterns. |
| **-rest | **name in mapping patterns. |
| OR | pattern ('\|' pattern)* |
| AS | pattern 'as' NAME |

In a class pattern, positional sub-patterns are matched against the attributes named in the class's __match_args__, in order.

Type parameters (PEP 695)

Generic syntax:

type_params -> '[' type_param (',' type_param)* ']'
type_param -> NAME [':' bound] ['=' default]
            | '*' NAME ['=' default]
            | '**' NAME ['=' default]

def f[T](x: T) -> T: ... and class C[T]: ... desugar into implicit TypeVar, TypeVarTuple, or ParamSpec creation at runtime, depending on whether the parameter is written NAME, *NAME, or **NAME.
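
To make the desugaring concrete, the sketch below spells out an approximate pre-3.12 equivalent by hand. It is only an approximation: with the PEP 695 syntax the type variable lives in a hidden scope rather than at module level, and bounds are evaluated lazily. first and Box are names made up for this example.

```python
from typing import Generic, List, TypeVar

# PEP 695 form (Python 3.12+):
#     def first[T](items: list[T]) -> T: ...
#     class Box[T]: ...
# Approximate older spelling, with the type variable created explicitly:
T = TypeVar("T")


def first(items: List[T]) -> T:
    return items[0]


class Box(Generic[T]):
    def __init__(self, value: T) -> None:
        self.value = value
```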

Annotations

annotation -> expression

In 3.14, annotations are evaluated lazily by default (PEP 649 / 749). The compiler wraps them in an __annotate__ function on the module, class, or function, and they are evaluated only on demand.

Terminals

| Terminal | Definition |
| --- | --- |
| NAME | Identifier (see Lexical). |
| NUMBER | Numeric literal. |
| STRING | String literal (any prefix/quote form). |
| FSTRING_* | Composite f-string tokens emitted by the lexer. |
| NEWLINE | Logical line break. |
| INDENT | Increase in indent depth. |
| DEDENT | Decrease in indent depth. |
| ENDMARKER | End of input. |
| OP | Operator/delimiter token (the parser narrows by text). |
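
Most of these terminals can be observed with CPython's tokenize module. The sketch below is an illustration with the standard library, not gopy's lexer, and assumes gopy emits the same token stream for this input.

```python
import io
import tokenize

source = "if x:\n    y = 1\n"
tokens = tokenize.generate_tokens(io.StringIO(source).readline)

# Keywords arrive as NAME, and ':' and '=' both arrive as the generic OP.
names = [tokenize.tok_name[tok.type] for tok in tokens]
```

The indented block shows up as an INDENT and DEDENT pair around the body, and the stream always ends with ENDMARKER.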

PEG features used

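| Feature | Meaning |
| --- | --- |
| e? | Optional. |
| e* / e+ | Zero or more / one or more repetitions. |
| &e | Positive lookahead (require e, consume nothing). |
| !e | Negative lookahead (require that e fails, consume nothing). |
| ~ | Commit (cut): once this point is passed, the remaining alternatives of the current rule are not tried. |
| e1 \| e2 | Ordered choice (the first alternative that matches wins). |
| [ e ] | Optional; equivalent to e?. |
| RULE[type] | Result-type annotation on a rule in the canonical grammar; sub-expressions are captured with name=e. |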

The parser memoizes results per (rule, position). A failed rule at a position is also memoized to skip rework.
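
The memoization scheme can be sketched with a toy packrat parser. The grammar, the class name TinyPackrat, and the calls counter are all made up for this illustration and are unrelated to gopy's generated parser; the point is only the (rule, position) cache, which stores failures as well as successes.

```python
from typing import Callable, Dict, Optional, Tuple

Result = Optional[Tuple[int, int]]  # (value, next position), or None on failure


class TinyPackrat:
    """Toy grammar: sum -> term ('+' term)* ; term -> NUMBER | '(' sum ')'."""

    def __init__(self, text: str) -> None:
        self.text = text
        self.memo: Dict[Tuple[str, int], Result] = {}
        self.calls = 0  # rule bodies actually executed (cache misses)

    def apply(self, rule: Callable[[int], Result], pos: int) -> Result:
        key = (rule.__name__, pos)
        if key not in self.memo:  # failures (None) are cached too
            self.calls += 1
            self.memo[key] = rule(pos)
        return self.memo[key]

    def number(self, pos: int) -> Result:
        end = pos
        while end < len(self.text) and self.text[end].isdigit():
            end += 1
        return (int(self.text[pos:end]), end) if end > pos else None

    def term(self, pos: int) -> Result:
        # Ordered choice: try NUMBER first, then the parenthesised form.
        hit = self.apply(self.number, pos)
        if hit is not None:
            return hit
        if self.text.startswith("(", pos):
            inner = self.apply(self.sum, pos + 1)
            if inner is not None and self.text.startswith(")", inner[1]):
                return inner[0], inner[1] + 1
        return None

    def sum(self, pos: int) -> Result:
        hit = self.apply(self.term, pos)
        if hit is None:
            return None
        total, pos = hit
        while self.text.startswith("+", pos):
            nxt = self.apply(self.term, pos + 1)
            if nxt is None:
                break
            total, pos = total + nxt[0], nxt[1]
        return total, pos

    def parse(self) -> int:
        # Start symbol: a sum followed by end of input (the ENDMARKER analogue).
        hit = self.apply(self.sum, 0)
        if hit is None or hit[1] != len(self.text):
            raise SyntaxError("invalid input")
        return hit[0]
```

Calling parse() a second time on the same instance executes no rule bodies, because the result for ("sum", 0) is already cached. On the input "1+", the failed term at position 2 is stored as None under ("term", 2).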

Gopy status

| Area | State |
| --- | --- |
| Whole grammar | Complete. Generated from Grammar/python.gram. |
| Error recovery and friendly errors | Complete; matches CPython messages. |
| Tokeniser-parser feedback | Complete. |
| PEG memoization | Complete. |
| match statement | Complete. |
| PEP 695 type params | Complete. |

Reference