v0.12.3 - The io subsystem and deferred annotations
Released May 15, 2026.
v0.12.2 was the import-chain release. It shimmed less and ported more,
but the deepest single dependency in that chain, the _io C
extension, still carried a stack of partial ports. bytesio.c,
stringio.c, fileio.c, bufferedio.c, iobase.c, and the
3,500-line textio.c had each been audited once and patched up to
the surface area we needed for import unittest to work. Not enough
to claim the subsystem.
v0.12.3 is the cleanup pass. Every file under Modules/_io/ is now
a 1:1 Go port with a citation per function. The codec layer
underneath TextIOWrapper is real, stateful, and snapshot-aware,
which means tell and seek finally produce the 21-byte cookies
CPython produces rather than int64 stand-ins that broke as soon as a
utf-16 stream went mid-character.
Two other themes land alongside the io drop.
PEP 649 and PEP 749 deferred annotations. Class, function, and
module bodies now compile __annotate__ functions the way CPython
3.14 does. Lib/annotationlib.py is vendored end to end and the
lazy __annotations__ descriptor resolves through it. This was
spec 1706, eight phases, all in this cut.
Object protocol full port, phases 2 through 8. v0.12.2 closed
Phase 1 (Objects/object.c). v0.12.3 closes the rest:
funcobject.c (classmethod, staticmethod, function), classobject.c
(bound method), typeobject.c (type_new pipeline and
inherit_slots), and the STORE_NAME / LOAD_NAME / DELETE_NAME
opcodes in Python/ceval.c. Issues #544 (enum import), #543 (re
import end to end), #542 (fnmatch delegate), and #510 (the
re/_sre port) all close as a result.
A handful of CI and VM correctness fixes ride along: exceptions now
carry a real __traceback__ chain that traceback.format_exc() can
walk, the bytecode assembler emits a per-instruction location table
that matches Python/assemble.c, and a 3.14.5 sync audit (spec
1707) pulls the upstream pin to the latest patch release.
Highlights
Three pieces of work define this release.
_io in full
Every file under Modules/_io/ ships as a 1:1 port. Citations live
inline as // CPython: Modules/_io/foo.c:NNN function_name and the
spec table at website/docs/specs/1700/1702_*.md tracks the
coverage row by row.
| File | Port | Notes |
|---|---|---|
| bytesio.c | module/io/bytesio.go | BytesIO with the upstream buffer growth policy. |
| stringio.c | module/io/stringio.go | StringIO with universal-newline translation. |
| fileio.c | module/io/fileio.go | FileIO with raw read, readinto, write, truncate, isatty, seek, tell. |
| bufferedio.c | module/io/bufferedio.go | The unified-slab buffered model: BufferedReader, BufferedWriter, BufferedRandom, BufferedRWPair. Read-to-write transition invalidates the read buffer the way CPython does. |
| iobase.c | module/io/iobase.go | _IOBase, _RawIOBase, _BufferedIOBase abstract bases, including the __del__ warning emission for unclosed files. |
| textio.c | module/io/textiowrapper.go | _TextIOBase, IncrementalNewlineDecoder, TextIOWrapper plus the codec / read-chunk / tell-seek / reconfigure internals from spec 1709. |
| _iomodule.c | module/io/iomodule.go | io.open dispatch, _UnsupportedOperation, BlockingIOError wiring. |
The interesting one is textio.c. Spec 1709 broke its internals
into four phases that landed back to back in PR #57:
- Stateful codec layer. `IncrementalDecoder` / `IncrementalEncoder` interfaces with utf-8 (carrying a 4-byte tail for partial sequences), ascii, latin-1, utf-16 and utf-32 with BOM sniffing and endianness encoded into `dec_flags`, and 8-bit charmap variants. `GetState` and `SetState` mirror CPython's tuple protocol so a decoder mid-stream can be snapshotted, serialized into a `tell` cookie, and restored from a later `seek`. CPython: `Modules/_io/textio.c:912 _PyCodecInfo_GetIncrementalDecoder`.
- Read pipeline. `readChunk` drives one read1 / decode cycle, snapshotting `(buf pos, decoder buffer, dec_flags, newline pendingcr/seennl)` before feeding bytes through. The bytes-to-chars ratio smooths as `b2cratio = 0.625 * ratio + 0.375 * prev`, matching the CPython adaptive sizing exactly. `read`, `read1`, and `readline` all go through `drainAll` / `drainN` / `drainLine` helpers that consume the snapshot-protected chunk stream. CPython: `Modules/_io/textio.c:1853 _textiowrapper_read_chunk`.
- Tell / seek cookie. A real 21-byte cookie struct, packed little-endian via `math/big.Int`:

  | Field | Type | Meaning |
  |---|---|---|
  | `start_pos` | `u64` | Underlying stream offset at chunk start. |
  | `dec_flags` | `i32` | Decoder-specific snapshot (BOM state, endianness). |
  | `bytes_to_feed` | `i32` | Bytes to replay through the decoder. |
  | `chars_to_skip` | `i32` | Decoded chars to discard from the front. |
  | `need_eof` | `u8` | Whether to push final = true on the replay. |

  `TellCookie` returns `*big.Int`; `SeekCookie` accepts one. The int64 `Seek` / `Tell` stay as convenience wrappers, but the Python-level `seek` / `tell` route through `objects.NewIntFromBig` and `BigInt()`. CPython: `Modules/_io/textio.c:2387 cookie_type`.
- Reconfigure. `_io_TextIOWrapper_reconfigure_impl` (`textio.c:1370`) raises `ValueError` if you change codec or newline policy mid-stream while there is read-ahead or write-ahead pending. The new Go method enforces the same invariants and rebuilds the decoder, encoder, and newline decoder when the swap is legal.
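To make the cookie layout concrete, here is a minimal Python sketch that packs the five fields listed above into 21 little-endian bytes and reads them back as one unbounded integer. The helper names `pack_cookie` and `unpack_cookie` are illustrative only; they are not part of the port or of CPython, and the field widths are taken from the field list above.

```python
import struct

# Little-endian, no padding: u64 + i32 + i32 + i32 + u8 = 21 bytes.
COOKIE_FORMAT = "<QiiiB"
COOKIE_SIZE = struct.calcsize(COOKIE_FORMAT)


def pack_cookie(start_pos, dec_flags=0, bytes_to_feed=0, chars_to_skip=0, need_eof=False):
    """Pack the five fields into one non-negative int (hypothetical helper)."""
    raw = struct.pack(
        COOKIE_FORMAT, start_pos, dec_flags, bytes_to_feed, chars_to_skip, int(need_eof)
    )
    return int.from_bytes(raw, "little")


def unpack_cookie(cookie):
    """Inverse of pack_cookie (hypothetical helper)."""
    raw = cookie.to_bytes(COOKIE_SIZE, "little")
    start_pos, dec_flags, bytes_to_feed, chars_to_skip, need_eof = struct.unpack(
        COOKIE_FORMAT, raw
    )
    return start_pos, dec_flags, bytes_to_feed, chars_to_skip, bool(need_eof)


plain = pack_cookie(1024)
mid_char = pack_cookie(1024, dec_flags=1, bytes_to_feed=3, chars_to_skip=1)

print(plain)                    # 1024
print(mid_char > 2**63 - 1)     # True
```

With every field but `start_pos` at zero the cookie is just the byte offset, which is why an int64 stand-in appears to work until some decoder state has to be carried along.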
The final gate is TestTextIOTellSeekRoundTripUTF16: open a utf-16
BytesIO, read "ab", call TellCookie, drain "cdef", call
SeekCookie back to the saved position, re-read and assert "cdef".
That sequence used to silently desync mid-stream because the
old shim discarded the decoder state across the seek; it now matches
CPython byte for byte.
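The same scenario can be reproduced against Python's own `io` module, which is a convenient way to see the behavior the gate checks. This is a sketch of the sequence in Python, not the Go test itself, and the variable names are illustrative.

```python
import io

# "abcdef" as utf-16: a 2-byte BOM followed by six 2-byte code units.
raw = io.BytesIO("abcdef".encode("utf-16"))
text = io.TextIOWrapper(raw, encoding="utf-16")

first = text.read(2)    # consume "ab"
cookie = text.tell()    # opaque position; may carry decoder state
rest = text.read()      # drain the remainder
text.seek(cookie)       # restore the byte offset and the decoder state
again = text.read()     # should replay exactly what `rest` held

print(first, rest, again)
```

The value returned by `tell` should be treated as opaque: it is only meaningful when handed back to `seek` on the same stream.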
Two smaller pieces of textio polish landed earlier in the cycle.
PR #50 ported
_textiowrapper_writeflush and the pending-bytes batching so
write accumulates short writes into one underlying flush rather
than one syscall per character. PR
#55 gave every _io
instance a real per-instance __dict__ and put __enter__ /
__exit__ as descriptors on the type, which is what
stdlib/contextlib.py was reaching for when with open(...) as f:
used to silently bypass the context manager protocol.
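The descriptors have to live on the type because the `with` statement looks up `__enter__` and `__exit__` on the class, not on the instance. A small Python sketch of that lookup rule, using a made-up `Resource` class:

```python
class Resource:
    def __enter__(self):
        return "from type"

    def __exit__(self, exc_type, exc, tb):
        return False


r = Resource()
# An instance attribute with the same name is ignored by the with statement.
r.__enter__ = lambda: "from instance"

with r as value:
    pass

print(value)  # from type
```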
PEP 649 / 749 deferred annotations (spec 1706)
CPython 3.14 reshaped how annotations work. The old model was
__annotations__ = {'x': int}, evaluated eagerly when the class or
function body ran. The new model is __annotate__ = lambda fmt: {'x': int}, evaluated lazily, with three formats
(VALUE, FORWARDREF, STRING) the consumer picks at access time.
This unblocks forward references, postponed evaluation, and the
from __future__ import annotations retirement.
Spec 1706 ports the whole pipeline in eight phases. All eight ship in v0.12.3.
- Phase 1. `Python/symtable.c` learns annotation blocks. `ann_scope` is tracked separately so a name referenced only in an annotation does not pollute the enclosing function's `co_names`.
- Phase 2. `Python/codegen.c codegen_annassign` rewrites annotated-assignment statements to record-only form. The annotation expression compiles into the `__annotate__` body rather than into the surrounding scope.
- Phase 3. `Python/codegen.c` learns to build the `__annotate__` function: a code object whose body is the deferred annotation expressions, with one parameter (the format selector) and one return value (the resolved annotations dict).
- Phase 4. `Python/codegen.c` body hook plus the `CO_FUTURE_ANNOTATIONS` short-circuit. If the module has the retired future-flag, we still honor the old eager-string behavior for one more release.
- Phase 5. `Objects/typeobject.c` gets the lazy `__annotations__` getset that triggers `__annotate__(VALUE)` on first read and caches the result on the instance dict.
- Phase 6. `Objects/funcobject.c` mirrors the type-level getset for function objects.
- Phase 7. `Objects/moduleobject.c` mirrors it again for module objects.
- Phase 8. `Lib/annotationlib.py` is vendored byte for byte. The `ForwardRef`, `Format`, `get_annotations`, `call_evaluate_function`, `call_annotate_function` surface lands as upstream-identical Python, including the t-string pipeline for `STRING` format.
Net effect:
from __future__ import annotations # no longer required
class Tree:
left: Tree # forward reference, no NameError
right: Tree
parent: Tree | None
# Eager access still works.
print(Tree.__annotations__)
# {'left': <class '__main__.Tree'>, 'right': ..., 'parent': ...}
# Lazy access through annotationlib works too.
import annotationlib
print(annotationlib.get_annotations(Tree, format=annotationlib.Format.STRING))
# {'left': 'Tree', 'right': 'Tree', 'parent': 'Tree | None'}
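The lazy getsets from Phases 5 through 7 follow a compute-once pattern. The rough Python sketch below illustrates that pattern with hypothetical names (`LazyAnnotations`, `Holder`, `annotate`) instead of the real dunder attributes, and assumes the VALUE format is the integer 1 as in `annotationlib.Format`. It is not the port's implementation.

```python
VALUE = 1  # assumed to mirror annotationlib.Format.VALUE


class LazyAnnotations:
    """Non-data descriptor: compute once through annotate(), then cache."""

    def __get__(self, obj, owner=None):
        if obj is None:
            return self
        result = obj.annotate(VALUE)
        # Storing into the instance dict means later reads skip this descriptor.
        obj.__dict__["annotations"] = result
        return result


class Holder:
    annotations = LazyAnnotations()

    def __init__(self, annotate):
        self.annotate = annotate


calls = []


def annotate(fmt):
    calls.append(fmt)
    return {"x": int}


h = Holder(annotate)
first = h.annotations
second = h.annotations

print(first is second, calls)  # True [1]
```

The second read never reaches `annotate`, which is also why a dict patched in before the first read can be replaced once the lazy path runs.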
Object protocol full port, phases 2 through 8 (spec 1704)
v0.12.2 closed Phase 1 (the Objects/object.c method and getset
tables). v0.12.3 closes the rest of the spec. The deliverable, as
laid out in v0.12.2, is "every function in the C file has a Go
counterpart with a citation". After this release we never have to
come back to these files looking for a missing slot.
- Phase 2. `Objects/funcobject.c PyClassMethod_Type`. Full port of `cm_init`, `cm_descr_get`, `cm_repr`, `cm_traverse`, the member list for `__func__` and `__wrapped__`, and the getset list for `__isabstractmethod__`, `__dict__`, `__annotations__`, `__annotate__`. `__set_name__` forwarding lands here so `classmethod` decorating a method picks up the owning class's name correctly.
- Phase 3. `Objects/funcobject.c PyStaticMethod_Type`. Mirror of Phase 2 plus `sm_call` so `staticmethod(f)(x)` works directly without bouncing through `.__func__`.
- Phase 4. `Objects/funcobject.c PyFunction_Type`. Audit of the full `func_*` table and getset list. `__qualname__`, `__defaults__`, `__kwdefaults__`, `__closure__`, `__module__`, `__globals__`, `__code__`, `__dict__`, `__doc__`, `__annotations__`, `__annotate__`, `__type_params__` all resolve through real descriptors. `func.__type_params__` in particular was missing entirely.
- Phase 5. `Objects/classobject.c PyMethod_Type`. Bound-method port including `method_richcompare`, `method_hash`, `method_repr`, and the full getset table for `__func__`, `__self__`, `__doc__`, `__name__`, `__module__`, `__qualname__`. `__func__` is now a getset rather than a member, which means subclasses that try to override the slot work the same way they do in CPython.
- Phase 6. `Objects/typeobject.c type_new` pipeline. Every function in the `type_new_*` family: `type_new_set_attrs`, `type_new_set_bases`, `type_new_set_names`, `type_new_init_subclass`, `type_new_alloc`, `type_new_impl`. This is the codegen-side of `class C(Base, metaclass=M, **kw): ...` and getting it right means every metaclass-related corner case (overridden `__init_subclass__`, `__set_name__` hooks, the `__class_getitem__` propagation) follows CPython exactly.
- Phase 7. `Objects/typeobject.c inherit_slots`. The audit. Every slot edge (`tp_as_number`, `tp_as_sequence`, `tp_as_mapping`, `tp_as_async`, `tp_as_buffer`) is walked and inherited through the same propagation rules CPython uses. The pre-existing code worked for the common cases but had stale logic on the rare slots; the audit replaces it wholesale.
- Phase 8. `Python/ceval.c STORE_NAME / LOAD_NAME / DELETE_NAME`. The fast-path-vs-protocol split CPython does for dict subclasses used as the class namespace. Class bodies whose namespace was a `dict` subclass with a custom `__setitem__` (think `enum.EnumDict`) used to silently skip the subclass hook. Now the path matches CPython exactly: an exact dict gets the inline fast path, anything else gets `PyObject_SetItem`.
This phase set is what closes
#544 (enum import). The
enum.EnumDict class is a dict subclass with a __setitem__ that
intercepts member assignment, and the broken STORE_NAME path was
the last piece silently dropping its work.
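The behavior Phase 8 restores can be observed from plain Python. In this sketch, `RecordingNamespace` and `Meta` are hypothetical names: the metaclass hands the class body a dict subclass, and every name bound in the body is expected to go through its `__setitem__`.

```python
class RecordingNamespace(dict):
    """dict subclass that records the order in which names are bound."""

    def __init__(self):
        super().__init__()
        self.order = []

    def __setitem__(self, key, value):
        self.order.append(key)
        super().__setitem__(key, value)


class Meta(type):
    @classmethod
    def __prepare__(mcls, name, bases, **kwargs):
        return RecordingNamespace()

    def __new__(mcls, name, bases, namespace, **kwargs):
        cls = super().__new__(mcls, name, bases, dict(namespace))
        # Keep only user-visible names; the compiler also stores dunders.
        cls.seen = [k for k in namespace.order if not k.startswith("__")]
        return cls


class Color(metaclass=Meta):
    RED = 1
    GREEN = 2


print(Color.seen)  # ['RED', 'GREEN']
```

If the name-store path bypassed the subclass hook, `Color.seen` would come back empty even though `Color.RED` still resolved, which is the kind of silent loss the text describes.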
What's new
The full breakdown, grouped by where it landed.
module/io subsystem (spec 1702 finalization)
The io subsystem entered v0.12.2 with the surface area unittest
needed and a backlog of half-ported functions. v0.12.3 closes the
backlog. The spec table at
website/docs/specs/1700/1702_*.md was scrubbed of false-positive
"done" flips and walked row by row.
- `bytesio.c` -> `module/io/bytesio.go` (PR #28). Buffer growth policy follows `Modules/_io/bytesio.c:_io_BytesIO_write_impl` so capacity doubles on overflow with a 256-byte floor.
- `stringio.c` -> `module/io/stringio.go` (PR #29). Universal newline translation through the `IncrementalNewlineDecoder` from `textio.c`, not a separate translator.
- `fileio.c` -> `module/io/fileio.go` (PR #30). Raw FD-backed IO. The CPython port treats the platform `blksize` hint as an optimization signal only; we dropped the platform helpers because Go's `os.File.Read` already chooses a sane block size.
- `bufferedio.c` -> `module/io/bufferedio.go` (PR #31). This is the big one. Rewritten on CPython's unified-slab buffer model: one allocation per stream, `read_end` / `write_end` / `read_pos` / `raw_pos` cursors, the read-to-write transition rule (a read that follows a write must seek the underlying raw back to `raw_pos`, and a write that follows a read invalidates the read buffer). Adds `repr`, iter protocol, context manager exits.
- `iobase.c` -> `module/io/iobase.go` (PR #32). `_IOBase`, `_RawIOBase`, `_BufferedIOBase` abstract bases. `__del__` emits a `ResourceWarning` when an unclosed stream is collected, matching the upstream warning text exactly.
- `textio.c` -> `module/io/textiowrapper.go` (PR #34 and follow-ups #50, #55, #56, #57). The internals rework that spec 1709 covers.
- `_iomodule.c` -> `module/io/iomodule.go` (PR #35). `io.open` dispatch, `_UnsupportedOperation`, `BlockingIOError` wiring, the `open` argument parser including `closefd` and `opener`.
The codec layer underneath TextIOWrapper got its own ports:
- Real charmap codecs (PR #46). `Modules/_codecsmodule.c` `charmap_encode` and `charmap_decode` ported in full. `codecs.make_encoding_map` builds the inverse table the way CPython does.
- utf-16, utf-32, cp1252, cp1250, cp1251, cp437, mac-roman (PR #44). Each carries a CPython reference vector in the test table so future codec edits cannot silently regress. utf-16 / utf-32 honor BE / LE variants and BOM sniffing; the 8-bit code pages use real lookup tables generated from `Lib/encodings/cp*.py`.
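For readers unfamiliar with the charmap entry points, the short Python sketch below shows how a decoding table, its inverse, and the two codec functions fit together. The two-entry table is a made-up code page for illustration, not one of the shipped tables.

```python
import codecs

# Hypothetical two-entry code page: byte 0x41 is "A", byte 0x80 is the euro sign.
decoding_map = {0x41: 0x0041, 0x80: 0x20AC}

# Build the inverse (code point -> byte) table.
encoding_map = codecs.make_encoding_map(decoding_map)

text, consumed = codecs.charmap_decode(b"A\x80", "strict", decoding_map)
data, written = codecs.charmap_encode(text, "strict", encoding_map)

print(encoding_map)
print(text == "A\u20ac", data)
```

A round trip like this one, bytes to text and back, is the shape a per-codec reference vector can take.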
Object protocol (spec 1704)
Tracked in PR #26 as the phases-2-through-8 bundle.
- `objects/classmethod.go`, `objects/staticmethod.go`, `objects/function.go` rebuilt against the upstream member / getset tables. The CPython names appear inline as comments so the grep target is "given `func_get_qualname`, where does it live in Go" rather than scrolling.
- `objects/method.go` reworked. `PyMethod_Type` getset goes through `objects.NewGetSet` with the upstream getter / setter shape.
- `objects/type_new.go` is new. The CPython `type_new` pipeline was previously distributed across half a dozen `NewUserType` call sites; the rewrite collapses them into a single ordered pass that mirrors `Objects/typeobject.c:3300 type_new_impl`.
- `objects/inherit_slots.go` is new. Walks every slot edge once at class-creation time and inherits through the same propagation rules CPython uses.
- `vm/store_name.go`, `vm/load_name.go`, `vm/delete_name.go` refactored to take the same dict-fast-path-vs-protocol-call split CPython's ceval does.
PEP 649 / 749 (spec 1706)
Tracked across commits ef648d2, 4c295f2, 5bb581a, 2d312cf,
78240a3, fa00aa8. Eight phases inline.
- `compile/symtable.go` learns annotation scopes.
- `compile/codegen.go` rewrites annotated-assignment statements into `__annotate__` body emission.
- `objects/type_annotate.go`, `objects/function_annotate.go`, `objects/module_annotate.go` provide the lazy `__annotations__` getsets.
- `stdlib/annotationlib.py` vendored byte for byte.
The t-string pipeline (PEP 750) rides along because
annotationlib.Format.STRING reuses it. Lazy annotations that
contain forward references format as t-strings and resolve at
access time the way CPython does.
VM and compile
- `exc.__traceback__` carries a real frame chain (PR #52). Every raised exception now has a populated `__traceback__` from the moment it leaves the raising frame. `traceback.format_exc()` walks the chain and produces a multi-frame render that matches CPython byte for byte. The earlier behavior was a single-frame stub.
- Bare Go errors get a traceback too (commit `921343a`). When the VM surfaces a Go error that bubbled up from a builtin (the classic `1 / 0` case), the raise path now synthesizes a frame stack from the current evaluation frame so the user sees the full call chain rather than a one-line `ZeroDivisionError` with no context.
- Per-instruction location table (spec 1708, PR #53). `Python/assemble.c emit_location_info` ported in full. Every bytecode instruction carries its source `(line, end_line, col, end_col)` quadruple in the same wire format CPython uses, which means `code.co_positions()` returns identical iterators to upstream and `dis.dis` highlights the same source ranges.
- `CodeType` binding in lift helpers (PR #51). Builtin code objects now expose every `co_*` attribute: `co_argcount`, `co_posonlyargcount`, `co_kwonlyargcount`, `co_nlocals`, `co_stacksize`, `co_flags`, `co_code`, `co_consts`, `co_names`, `co_varnames`, `co_freevars`, `co_cellvars`, `co_filename`, `co_name`, `co_qualname`, `co_firstlineno`, `co_linetable`, `co_exceptiontable`. Reflection tools (`inspect.signature`, `dis.code_info`) work against any code object now, not just the ones we happened to bind by hand.
- `traceback` module frame renders as `<module>` (PR #54). The outermost frame in a traceback used to render with the source filename in the `name` slot; it now renders as `<module>` to match CPython. `print(file=...)` against a stdlib `File` instance no longer raises spurious type errors.
- `SEND` / `END_SEND` stack discipline (commit `e767443`). Generators ran with a one-slot stack imbalance because the `END_SEND` opcode was popping the value it should have left for the consuming frame. Fixed; generator goroutines now leave the exact stack shape CPython's `_PyEval_EvalFrameDefault` does.
- `sys.exception()` (commit `d801664`). The Python 3.11+ except-block introspection helper. Returns the currently-handled exception or `None`, reading the same handled-exception slot `sys.exc_info()` does.
- `IMPORT_NAME` builtins propagation (commit `d797ab1`). Imported modules inherit the importing frame's `__builtins__`, which means a module imported from a sandboxed builtins namespace stays sandboxed. The old code reached for the global `__builtins__` unconditionally.
- `tuple.__mul__` (commit `816a321`). The `sq_repeat` slot was missing for tuple, so `(1, 2) * 3` raised `TypeError` instead of returning `(1, 2, 1, 2, 1, 2)`. Surfaced by the argparse vendor.
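The frame chain from the first item in this list is easy to check from user code. The sketch below, with made-up function names, collects the function name of every frame on an exception's `__traceback__`; a single-frame stub would yield only one entry.

```python
import traceback


def inner():
    return 1 / 0


def outer():
    return inner()


def frame_names():
    """Return the function names along the traceback of a failing call."""
    try:
        outer()
    except ZeroDivisionError as exc:
        return [entry.name for entry in traceback.extract_tb(exc.__traceback__)]
    return []


names = frame_names()
print(names)  # ['frame_names', 'outer', 'inner']
```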
Modules and ports
- `_socket` Windows port (PR #43). The POSIX-only socket entry points (`fileno`-based descriptor passing, `socketpair`) are gated behind `//go:build !windows`, and a `_socket_windows.go` file publishes the same public surface backed by `golang.org/x/sys/windows`. The Windows test lane is now green for `_socket`.
- Vendor `Lib/socket.py` (PR #48). The Python-level socket wrapper drops to upstream verbatim. The bogus `_socket.makefile` stub goes; the real `socket.SocketIO` class handles `makefile` against the unified buffered IO layer.
- `_thread._local` real per-thread storage (PR #45). The previous implementation backed `threading.local` onto a single process-wide dict. Now each goroutine carries its own dict, and `local.__init__` replays against each new thread the way CPython does. The `args` / `kwargs` captured at construction time are stored on the local object and re-run lazily on first access in each thread.
- `dataclasses.make_dataclass` (PR #47). Ports the procedural dataclass builder. Clears the corresponding CI debt rows in the spec table.
- `functools` PathFinder (PR #49). Drops the dead `time` / `_colorize` / `pprint` shims that were left over from the v0.12.0 import-chain workaround. `functools` resolves entirely through the `PathFinder` now.
- `_collections` deque and defaultdict surface (PR #37). Rounds out the methods `Modules/_collectionsmodule.c` ships: `deque.copy`, `deque.__reduce__`, `deque.insert`, `deque.maxlen`, `defaultdict.__missing__`, `defaultdict.copy`, `defaultdict.__reduce__`.
- `signal.ItimerError` (PR #36). Registered as a public exception class. `setitimer` / `getitimer` raise it rather than a bare `OSError` now.
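The per-thread replay of `local.__init__` described for `_thread._local` looks like this from Python. The `Context` class and `worker` function are illustrative names; the point is that a second thread sees a fresh attribute set built from the original constructor arguments, not the first thread's data.

```python
import threading


class Context(threading.local):
    def __init__(self, tag):
        # Runs once per thread, with the arguments captured at construction.
        self.tag = tag
        self.items = []


ctx = Context("base")
ctx.items.append("main-only")

seen = {}


def worker():
    # First access in this thread replays __init__("base").
    seen["tag"] = ctx.tag
    seen["items"] = list(ctx.items)


t = threading.Thread(target=worker)
t.start()
t.join()

print(seen)       # {'tag': 'base', 'items': []}
print(ctx.items)  # ['main-only']
```

With a single process-wide dict behind `threading.local`, the worker would have observed `['main-only']` instead of an empty list.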
Test infrastructure
- Regrtest smoke test (PR #41). The `TestRunSmokeTest` gate runs a known-passing slice of `Lib/test/test_*.py` through the runner on every CI lane. A regression in the import chain trips this gate before the individual test ports notice.
- Vendor `Lib/test/support` helpers (PR #40). Drops the CPython support helpers byte for byte: `test.support`, `test.support.os_helper`, `test.support.threading_helper`, `test.support.warnings_helper`. The shims that previously stood in for these go away.
- Argparse resync to 3.14.5 (PR #39). The vendored `argparse.py` was at 3.14.0; this resyncs to 3.14.5. Picks up the upstream `BooleanOptionalAction` bugfix.
- `os` `posixmodule.c` surface (PR #38). Fills in remaining `posixmodule.c` entries: `pathconf`, `pathconf_names`, `confstr`, `confstr_names`, `sysconf`, `sysconf_names`, `WCOREDUMP` and the rest of the `W*` wait macros.
- Unittest import unblock (PR #42). Three small but load-bearing fixes: `bytearray += bytes` (the `sq_inplace_concat` slot was missing for bytearray when the right operand was `bytes`); `memoryview` growing its method list to include `tobytes`, `tolist`, `hex`, `cast`, `release`, `__enter__`, `__exit__`; and the `linecache` closure capturing the wrong `filename` variable.
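Two of the three unittest-unblock fixes can be exercised in a few lines. This is a quick Python check of the expected behavior, with illustrative variable names, rather than an excerpt from the test suite.

```python
buf = bytearray(b"ab")
buf += b"cd"  # in-place concat with a bytes right operand

# memoryview as a context manager, plus three of the listed methods.
with memoryview(buf) as view:
    as_bytes = view.tobytes()
    as_list = view.tolist()
    as_hex = view.hex()

print(buf, as_list, as_hex)
```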
CI and lint
- Golangci-lint 21-issue scrub (commit `b172636`). Every issue on the lint run as of v0.12.2 fixed. The lane is now configured to fail PRs that introduce new findings.
- gofumpt and gofmt enforcement (commits `d4e6442`, `91e3640`). `module/io` is now gofumpt-clean. The PR template assumes lint is green before review.
- CPython 3.14.5 sync audit (spec 1707, commit `100447b`). The upstream pin moves from 3.14.0 to 3.14.5. Net diff in our ports is tiny because the upstream patch releases are mostly stdlib bugfixes, but the audit catches the ones we care about (`argparse.BooleanOptionalAction`, `traceback.TracebackException` edge cases).
Compatibility
A few user-visible changes are worth flagging.
- `tell` and `seek` on text streams take big ints. CPython has always returned a 21-byte cookie that overflows int64 once the stream is large or the codec is mid-character. Code that captured `f.tell()` into a typed `int64` variable and round-tripped it through `f.seek(pos)` now gets a real `int` back; the Go-side bridging passes through `*big.Int`. Most user code is fine because Python `int` is unbounded.
- `exc.__traceback__` is populated. Code that defensively checked `if e.__traceback__ is None:` will find it is never `None` now (unless explicitly cleared). `traceback.format_exc()` produces multi-frame output where it previously produced one-line stubs.
- `__annotations__` is lazy. Reading `Cls.__annotations__` triggers a call to `Cls.__annotate__(VALUE)` on first access. Code that monkey-patched `cls.__annotations__` to a dict before the class body finished now needs to write through the `__annotate__` function or accept that the lazy getset will overwrite the patch on first read.
- `bytearray += bytes` works. If you had a workaround that spelled this as `bytearray.extend(b)` because the inplace operator raised, that workaround can go.
- `tuple * int` works against any int. The `sq_repeat` fix means `(1, 2) * N` honors `N` for any int rather than the limited subset the broken path accepted.
What's next
The big remaining work for v0.12.4 onwards is the test corpus.
- The CPython `Lib/test/test_*.py` gate at scale. The smoke test that landed in PR #41 runs a curated slice. Expanding it to the full manifest requires the remaining shimmed modules (`sqlite3`, `tkinter`, `curses`, the multiprocessing / asyncio chains) to land, and then porting the test-runner-shaped helpers (`test.support.bytecode_helper`, `test.support.script_helper`) so the individual test files run.
- Spec 1703 `_sre` polish. The regex engine is real but a few rare-opcode paths still take the slow interpreter loop instead of the specialized one CPython falls into; the gap is small (single digits of percent on micro-benchmarks) but worth closing.
- Spec 1709 follow-up. A second pass on the `TextIOWrapper` state model will look at whether we can hold the snapshot in a smaller struct than the current per-chunk allocation.
Networking, multiprocessing, asyncio, sqlite3, ctypes, tk / curses, GUI tests, gdb / dtrace remain out of scope.
Acknowledgments
This release closes work tracked across these public-facing items:
- Spec 1702 (io subsystem full port). All seven file ports closed.
- Spec 1704 (object protocol full port). Phases 2 through 8 shipped.
- Spec 1706 (PEP 649 / 749 deferred annotations). All eight phases shipped.
- Spec 1707 (CPython 3.14.5 sync audit). Pin moved, diff reviewed.
- Spec 1708 (Python/assemble.c location-emission). Full port.
- Spec 1709 (textio.c internals). Four phases plus final gate.
- Issue #544 (enum import). Closed by object protocol Phase 8.
- Issue #543 (re import end to end). Closed by the v0.12.2 regex engine plus the v0.12.3 codec layer.
- Issue #542 (fnmatch delegate). Closed.
- Issue #510 (re / _sre full port). Closed.
The commit log covering everything since v0.12.2 is at compare v0.12.2..v0.12.3. The pull request bundle covering the spec 1702 finalization is PR #27; the object protocol phase set is PR #26; the textio internals final phase is PR #57.