Skip to content

fix(links): use pydash.get + decode body to fix analyze chain (CON-573)#184

Merged
pavanputhra merged 1 commit into
mainfrom
pavankumar/con-573-fix-analyze-link
May 26, 2026
Merged

fix(links): use pydash.get + decode body to fix analyze chain (CON-573)#184
pavanputhra merged 1 commit into
mainfrom
pavankumar/con-573-fix-analyze-link

Conversation

@pavanputhra
Copy link
Copy Markdown
Contributor

Summary

  • Closes CON-573TypeError: string indices must be integers, not 'str' thrown from the analyze link's navigate_dict, killing the chain and DLQing vCons.
  • Root cause: navigate_dict was copy-pasted into 5 link modules with key in current traversal. When an intermediate step landed on a JSON-encoded body string (the spec-current shape after fix(vcon): decode body per spec on read, canonicalize on write #182), key in current did a substring check, passed, and the next current[key] raised on the string-indexed access.
  • Replaced all 5 in-house copies with pydash.get (already a core dep — pydash>=7.0.7 in pyproject.toml). pydash.get handles every edge case we relied on (None input, missing key, non-dict mid-path, list-index-via-dot) and returns None cleanly instead of raising.
  • Additionally wrapped source with Vcon.with_decoded_body(source) in analyze and analyze_and_label so a dotted text_location like body.paragraphs.transcript can still drill through a JSON-encoded body — check_and_tag and detect_engagement already do this; analyze/analyze_and_label had silently broken after fix(vcon): decode body per spec on read, canonicalize on write #182.
  • Dropped 4 duplicate test_navigate_dict* blocks (testing in-house code that no longer exists). Kept one TestPydashGetContract::test_returns_none_when_midpath_is_non_dict in test_analyze.py that pins the CON-573 behaviour we depend on.
  • Net diff: +52 / -131 across 9 files.

Side-effect (positive)

5 tests that were already failing on main with the same CON-573 TypeError now pass:

  • conserver/tests/test_link_metrics.py::TestAnalyzeMetrics::test_success_records_analysis_time
  • conserver/tests/test_link_metrics.py::TestAnalyzeMetrics::test_failure_increments_counter
  • conserver/tests/test_link_metrics.py::TestAnalyzeAndLabelMetrics::test_success_records_labels_added
  • conserver/tests/test_link_metrics.py::TestAnalyzeAndLabelMetrics::test_success_records_analysis_time
  • conserver/tests/test_link_metrics.py::TestAnalyzeAndLabelMetrics::test_failure_increments_counter

Plus the 6 analyze_and_label/tests/test_analyze_and_label.py::test_run_* tests that were silently broken on main for the same reason.

The remaining test failures on this branch (TestCheckAndTagMetrics x3, TestOpenAITranscribeMetrics x2, groq_whisper::test_get_transcription) all pre-exist on main and are unrelated to this change.

Test plan

  • pytest conserver/links/{analyze,analyze_and_label,check_and_tag,detect_engagement,analyze_vcon}/38 passed, 10 skipped, 0 failures
  • Reproduced the original TypeError against the pre-fix navigate_dict in a one-liner and confirmed the new code returns None instead
  • Wider pytest conserver/ sweep — no new failures introduced
  • Deploy to test env and confirm the original DLQ'd vCons re-process cleanly

The in-house navigate_dict, copy-pasted into five link modules, used
`key in current` for traversal — which on a non-dict (e.g. an analysis
whose body is a JSON-encoded string after #182) does a substring check
and then crashes with TypeError on `current[key]`.

Replace all five copies with pydash.get (already a core dep), and in
analyze + analyze_and_label also wrap source with Vcon.with_decoded_body
so a dotted text_location can still drill through a JSON-encoded body —
matching what check_and_tag and detect_engagement already do.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@pavanputhra pavanputhra merged commit 123dde9 into main May 26, 2026
1 check failed
pavanputhra added a commit that referenced this pull request May 27, 2026
A production conserver has been emitting a steady stream of
`conserver.link.count{link_name=tag, outcome=error}` with this exception:

    TypeError: string indices must be integers, not 'str'
      File "conserver/links/tag/__init__.py", in run
        vCon.add_tag(tag_name=tag, tag_value=tag)
      File "common/vcon.py", in add_tag
        tags_attachment = self.find_attachment_by_purpose("tags")
      File "common/vcon.py", in find_attachment_by_purpose
        for a in self.vcon_dict["attachments"]
    TypeError: string indices must be integers, not 'str'

`self.vcon_dict` is a `str` at failure time, yet every sampled vCon in
Redis is `JSON.TYPE = object` and a local repro with the same dict
shape succeeds. The deployed code is current with #182 and #184.
So a caller is reaching `Vcon.__init__` with a JSON-encoded string
instead of a parsed dict; the static call graph doesn't show one.

`Vcon.__init__` does `json.loads(json.dumps(vcon_dict))`, which
silently round-trips a `str` — leaving `self.vcon_dict` as a `str` so
the crash only surfaces on the first dict-style access (typically
`find_attachment_by_purpose` from the tag link).

This change:

- Detects `isinstance(vcon_dict, str)` at the boundary and logs an
  ERROR with `stack_info=True` so the originating call site is visible
  in the next failure that lands. The message echoes a head of the
  payload so we can see what flavor of bad input it is.
- Defensively `json.loads`'s the string when it's valid JSON, so the
  link doesn't crash and the vCon doesn't get parked in the DLQ for
  the duration of the investigation. Non-JSON input falls back to `{}`
  rather than poisoning downstream access.

The defensive coercion is intended to stay only as long as it takes to
find and fix the upstream caller. Two regression tests cover both
branches and assert that the caller stack is captured.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
pavanputhra added a commit that referenced this pull request May 27, 2026
…#186)

* investigate(vcon): log caller stack + coerce str arg in Vcon.__init__

A production conserver has been emitting a steady stream of
`conserver.link.count{link_name=tag, outcome=error}` with this exception:

    TypeError: string indices must be integers, not 'str'
      File "conserver/links/tag/__init__.py", in run
        vCon.add_tag(tag_name=tag, tag_value=tag)
      File "common/vcon.py", in add_tag
        tags_attachment = self.find_attachment_by_purpose("tags")
      File "common/vcon.py", in find_attachment_by_purpose
        for a in self.vcon_dict["attachments"]
    TypeError: string indices must be integers, not 'str'

`self.vcon_dict` is a `str` at failure time, yet every sampled vCon in
Redis is `JSON.TYPE = object` and a local repro with the same dict
shape succeeds. The deployed code is current with #182 and #184.
So a caller is reaching `Vcon.__init__` with a JSON-encoded string
instead of a parsed dict; the static call graph doesn't show one.

`Vcon.__init__` does `json.loads(json.dumps(vcon_dict))`, which
silently round-trips a `str` — leaving `self.vcon_dict` as a `str` so
the crash only surfaces on the first dict-style access (typically
`find_attachment_by_purpose` from the tag link).

This change:

- Detects `isinstance(vcon_dict, str)` at the boundary and logs an
  ERROR with `stack_info=True` so the originating call site is visible
  in the next failure that lands. The message echoes a head of the
  payload so we can see what flavor of bad input it is.
- Defensively `json.loads`'s the string when it's valid JSON, so the
  link doesn't crash and the vCon doesn't get parked in the DLQ for
  the duration of the investigation. Non-JSON input falls back to `{}`
  rather than poisoning downstream access.

The defensive coercion is intended to stay only as long as it takes to
find and fix the upstream caller. Two regression tests cover both
branches and assert that the caller stack is captured.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(vcon): force propagate=True on vcon logger so caplog sees records

CI failed the two new __init__ tests:

    tests/core/test_vcon.py::test_init_coerces_json_string_arg_and_logs_caller FAILED
    tests/core/test_vcon.py::test_init_bails_to_empty_dict_for_non_json_string FAILED

Captured stdout showed the ERROR record being emitted by the project's
JSON stdout handler, but caplog.records was empty so the assertion
`any("received a str" in rec.message for rec in caplog.records)` was
False.

Root cause: `common/logging.conf` (the default when LOGGING_CONFIG_FILE
is unset, loaded transitively whenever any module imports
lib.logging_utils) sets `[logger_vcon] propagate = 0`. With propagation
off, records from the "vcon" logger flow only to that logger's own
handlers and never reach the root logger that pytest's caplog attaches
to — so caplog.records stays empty even though the record is real.

A local pytest run that imports only `from vcon import Vcon` never
loads `lib.logging_utils`, so the config never applies, propagate
stays True (default), and caplog sees the record — which is why the
tests passed locally but failed in CI.

Fix: a small fixture flips `propagate = True` (and `disabled = False`)
on the vcon logger for the duration of each test and restores them
after. The two __init__ tests now use the fixture and pass under both
local and CI configurations (verified by pre-importing
lib.logging_utils to force the CI shape).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant