Skip to content

Implement #254: preserve TOML table syntax on round-trip#670

Open
rajulbhatnagar wants to merge 3 commits into
FasterXML:3.xfrom
rajulbhatnagar:feature/254-toml-roundtrip
Open

Implement #254: preserve TOML table syntax on round-trip#670
rajulbhatnagar wants to merge 3 commits into
FasterXML:3.xfrom
rajulbhatnagar:feature/254-toml-roundtrip

Conversation

@rajulbhatnagar

@rajulbhatnagar rajulbhatnagar commented May 17, 2026

Copy link
Copy Markdown

Summary

Resolves #254. Parsing a TOML document with [section] headers or
[[array of tables]] and writing the resulting tree back now preserves
those forms instead of collapsing to dotted keys.

How it works

  • Origin-flag node subclasses + factory. Package-private
    TomlObjectNode / TomlArrayNode carry fromTableHeader /
    fromArrayOfTables flags set by the parser. TomlNodeFactory
    produces them so mapper.createObjectNode() and deepCopy()
    round-trip the metadata. TomlMapper._readTreeAndClose
    short-circuits the deserializer reconstruction when the parser is
    one we created (a TomlTreeTraversingParser marker), preserving
    flags that the standard JsonNodeDeserializer path would have
    dropped. The short-circuit is disabled when the user has overridden
    the node factory, so a user-configured factory is honored (at the
    cost of flag preservation, the explicit opt-out trade-off).

  • Strict-ordering enforcement. TOML rejects scalars after a
    sub-table at the same level. TomlObjectNode.serialize reorders
    children up front, deferring any branch that contains a table-
    flagged descendant. TomlGenerator also enforces the rule on the
    explicit-API path: scalars after writeStartTable() at any
    ancestor scope (not just the immediate context) throw
    TomlStreamWriteException.

  • Explicit generator API. New public methods on TomlGenerator:
    writeStartTable(), writeStartArrayOfTables(). Callers building
    TOML from scratch can emit [path] and [[path]] directly without
    parsing first.

  • BigDecimal trailing-zero handling. Honors
    JsonNodeFeature.STRIP_TRAILING_BIGDECIMAL_ZEROES on the
    short-circuit path so the override matches what
    JsonNodeDeserializer would have applied.

Backward compatibility

Trees built via mapper.createObjectNode() with only scalar children
(no parsed [section] origin) still serialize as dotted keys. Verified
by unflaggedNestedObjectStillUsesDottedKeys. Existing users on
upgrade paths see no behavior change.

This branch is rebased on the current 3.x HEAD, so it composes with
the recently-merged #669 fix (dotted key cannot extend an explicitly
defined table). The explicitlyDefined flag now lives on the promoted
TomlObjectNode; two tests
(dottedKeyExtendingExplicitTableStillRejected,
explicitTableThenExplicitSubtableRoundTrips) confirm the two features
coexist.

Test plan

  • ./mvnw -B -ntp -pl toml test — all green (TomlRoundTripTest: 33
    cases; full module: 195 run, 2 skipped — the pre-existing @Disabled
    compliance stubs, untouched by this PR)
    • parsed -> write preservation, deepCopy preservation parameterized
      over [table]/[[aot]], explicit-API emission, strict-ordering
      parameterized over 5 illegal-write cases, ancestor-scope scalar
      lockout, inline-scope fallthrough, custom-factory honor, BigDecimal
      feature on/off, TreeTraversingParser source-exposure contract,
      end-to-end fixture, TOML parser: invalid dotted-key extension of explicit tables / AOT not rejected #669 interaction

Notes for reviewers

  • The TomlTreeTraversingParser is a no-op marker subclass used only
    to discriminate parsers we created from caller-supplied
    TreeTraversingParser instances in _readTreeAndClose. Caller-
    supplied parsers go through the standard deserializer path so their
    source tree is not mutated (verified by
    externalTreeTraversingParserDoesNotMutateSource).
  • The _scalarsClosed propagation walks ancestors up to the nearest
    enclosing TABLE / ARRAY_OF_TABLES, fixing a subtle issue where
    a writeStartTable() invoked from inside a DOTTED_OBJECT would
    not lock the root scope, allowing later root-level scalars to
    serialize inside the just-emitted section. Regression test:
    scalarAtAncestorAfterNestedTableThrows.
  • The TOML 1.0.0 compliance test suite (toml-test) that surfaced
    several of the tracked parser bugs is intentionally not in this
    PR; it will be submitted separately.

@yawkat

yawkat commented May 18, 2026

Copy link
Copy Markdown
Member

Very intrusive changes for what is frankly a minor issue

@cowtowncoder

Copy link
Copy Markdown
Member

Definitely work needs to be split into smaller prs, 900+ file pr is a no-go.

@github-actions

Copy link
Copy Markdown

🧪 Code Coverage Report

Metric Coverage Change
Instructions coverage 71.59% 📈 +0.950%
Branches branches 66.25% 📈 +1.120%

Coverage data generated from JaCoCo test results

@rfscholte

Copy link
Copy Markdown

This topic has my interest. As stated in the description commit 1 is the compatibility test, which should be removed from this PR. (it makes more sense when there's a separate project that runs this testset again all toml parser, so it can make comparison statistics)

@cowtowncoder

cowtowncoder commented May 24, 2026

Copy link
Copy Markdown
Member

@rfscholte There's now #678 which I think adds compatibility testing suite (once fixes are in).

Or if not, separate PR for testing part would make sense.

@cowtowncoder

Copy link
Copy Markdown
Member

Compliancy test added for CI; compliancy deviations fixed.

This PR needs merging from 3.x head; some conflicts to resolve.

@rajulbhatnagar

rajulbhatnagar commented May 30, 2026

Copy link
Copy Markdown
Author

Ya makes sense for the compatibility suite to run as part of CI. Will remove that commit and rebase. Also happy to break this into more than 1 PR if you folks prefer

Parsing a TOML document with [section] headers or [[array of tables]]
and writing the resulting tree back now preserves those forms instead
of collapsing to dotted keys.

- Origin-flag node subclasses (TomlObjectNode, TomlArrayNode) carry
  fromTableHeader / fromArrayOfTables flags set by the parser.
  TomlNodeFactory produces them so createObjectNode and deepCopy
  round-trip the metadata. TomlMapper._readTreeAndClose short-circuits
  the deserializer reconstruction when the parser is one we created
  (a TomlTreeTraversingParser marker), preserving flags that the
  standard JsonNodeDeserializer path would have dropped. The
  short-circuit is disabled when the user has overridden the node
  factory, so a user-configured factory is honored.

- Strict-ordering enforcement: TOML rejects scalars after a sub-table
  at the same level. TomlObjectNode.serialize reorders children up
  front, deferring any branch that contains a table-flagged
  descendant. TomlGenerator also enforces the rule on the explicit-API
  path: scalars after a writeStartTable() at any ancestor scope (not
  just the immediate context) throw TomlStreamWriteException.

- Explicit generator API: writeStartTable() and
  writeStartArrayOfTables() emit [path] / [[path]] directly without
  requiring a parsed origin tree.

- BigDecimal trailing-zero handling honors
  JsonNodeFeature.STRIP_TRAILING_BIGDECIMAL_ZEROES on the short-circuit
  path so the override matches what JsonNodeDeserializer would have
  applied.

Backward compatibility: trees built via createObjectNode + scalar
children only (no parsed origin) still serialize as dotted keys, so
existing users on upgrade paths see no behavior change.
@rajulbhatnagar rajulbhatnagar force-pushed the feature/254-toml-roundtrip branch from a6cfbb8 to a8593bb Compare May 30, 2026 02:50
@github-actions

github-actions Bot commented Jun 4, 2026

Copy link
Copy Markdown

🧪 Code Coverage Report

Metric Coverage Change
Instructions coverage 71.93% 📈 +0.650%
Branches branches 66.65% 📈 +0.700%

Coverage data generated from JaCoCo test results

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

More configurable TOML serialization

4 participants