tokenize

Produces a token stream for Python source. Useful for linters, IDEs, and codemods that need exact whitespace and comments.

Source-of-record: Lib/tokenize.py, Parser/tokenizer.c, tokenize docs.

Functions

Function                     Returns
tokenize(readline)           Iterator of TokenInfo; readline must yield bytes, and the first token is ENCODING.
generate_tokens(readline)    Same stream, but readline yields str (no ENCODING token).
untokenize(iterable)         Source text reconstructed from the tokens.
detect_encoding(readline)    (encoding, lines): the detected encoding plus the raw lines consumed to detect it.
open(filename)               The file opened read-only in text mode with the detected encoding.
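A minimal round trip through generate_tokens and untokenize (the sample source string is just an illustration):

```python
import io
import tokenize

src = "x = 1  # answer\n"
# generate_tokens reads str lines; tokenize() would need a bytes readline.
toks = list(tokenize.generate_tokens(io.StringIO(src).readline))
# Given full 5-tuples, untokenize reproduces the source exactly,
# including spacing and the comment.
print(tokenize.untokenize(toks) == src)  # True
```

Note that the exact round trip relies on passing complete TokenInfo tuples; feeding only (type, string) pairs makes untokenize fall back to a looser, whitespace-normalizing mode.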

TokenInfo(type, string, start, end, line) is a named tuple; start and end are (row, col) pairs with rows 1-based and columns 0-based, and line is the full physical line the token appears on.
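A quick look at the fields on the first token of a line (the sample input is arbitrary):

```python
import io
import tokenize

tok = next(tokenize.generate_tokens(io.StringIO("answer = 42\n").readline))
print(tokenize.tok_name[tok.type])  # NAME
print(tok.string)                   # answer
print(tok.start, tok.end)           # (1, 0) (1, 6) -- rows 1-based, cols 0-based
print(repr(tok.line))               # 'answer = 42\n', the full physical line
```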

Token types

NAME, NUMBER, STRING, FSTRING_START, FSTRING_MIDDLE, FSTRING_END (3.12+), OP, NEWLINE, NL, INDENT, DEDENT, COMMENT, ENCODING, ENDMARKER, TYPE_COMMENT, SOFT_KEYWORD, ERRORTOKEN. The EXACT_TOKEN_TYPES dict maps operator strings (e.g. '=') to their exact token types.
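The distinction between NL and NEWLINE, and the synthetic INDENT/DEDENT tokens, can be seen on a small snippet (the input is just a sketch):

```python
import io
import tokenize

src = "# header\nif True:\n    pass\n"
names = [tokenize.tok_name[t.type]
         for t in tokenize.generate_tokens(io.StringIO(src).readline)]
# NL terminates the comment-only line, NEWLINE ends each logical line,
# and INDENT/DEDENT bracket the indented suite.
print(names)
```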

CLI

python -m tokenize [-e] [source] prints one token per line with positions. With -e, OP tokens are reported with their exact types (e.g. EQUAL instead of OP).
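Driving the CLI from Python shows the effect of -e (the temporary file is only for the demo; any .py path works):

```python
import subprocess
import sys
import tempfile

with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
    f.write("x = 1\n")
out = subprocess.run([sys.executable, "-m", "tokenize", "-e", f.name],
                     capture_output=True, text=True, check=True).stdout
print(out)  # with -e, the '=' row reads EQUAL rather than OP
```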

Gopy status

Area                            State
tokenize, generate_tokens       Complete
untokenize                      Complete
F-string token split (3.12+)    Complete
CLI                             Complete

Reference

  • CPython 3.14 docs: tokenize module.
  • Lib/tokenize.py, Parser/tokenizer.c (CPython source).
  • module/tokenize/ (gopy port).
  • PEP 701 (f-string tokenization).