fix(links): use pydash.get + decode body to fix analyze chain (CON-573)#184
Merged
Merged
Conversation
The in-house navigate_dict, copy-pasted into five link modules, used `key in current` for traversal — which on a non-dict (e.g. an analysis whose body is a JSON-encoded string after #182) does a substring check and then crashes with TypeError on `current[key]`. Replace all five copies with pydash.get (already a core dep), and in analyze + analyze_and_label also wrap source with Vcon.with_decoded_body so a dotted text_location can still drill through a JSON-encoded body — matching what check_and_tag and detect_engagement already do. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
3 tasks
pavanputhra
added a commit
that referenced
this pull request
May 27, 2026
A production conserver has been emitting a steady stream of
`conserver.link.count{link_name=tag, outcome=error}` with this exception:
TypeError: string indices must be integers, not 'str'
File "conserver/links/tag/__init__.py", in run
vCon.add_tag(tag_name=tag, tag_value=tag)
File "common/vcon.py", in add_tag
tags_attachment = self.find_attachment_by_purpose("tags")
File "common/vcon.py", in find_attachment_by_purpose
for a in self.vcon_dict["attachments"]
TypeError: string indices must be integers, not 'str'
`self.vcon_dict` is a `str` at failure time, yet every sampled vCon in
Redis is `JSON.TYPE = object` and a local repro with the same dict
shape succeeds. The deployed code is current with #182 and #184.
So a caller is reaching `Vcon.__init__` with a JSON-encoded string
instead of a parsed dict; the static call graph doesn't show one.
`Vcon.__init__` does `json.loads(json.dumps(vcon_dict))`, which
silently round-trips a `str` — leaving `self.vcon_dict` as a `str` so
the crash only surfaces on the first dict-style access (typically
`find_attachment_by_purpose` from the tag link).
This change:
- Detects `isinstance(vcon_dict, str)` at the boundary and logs an
ERROR with `stack_info=True` so the originating call site is visible
in the next failure that lands. The message echoes a head of the
payload so we can see what flavor of bad input it is.
- Defensively `json.loads`'s the string when it's valid JSON, so the
link doesn't crash and the vCon doesn't get parked in the DLQ for
the duration of the investigation. Non-JSON input falls back to `{}`
rather than poisoning downstream access.
The defensive coercion is intended to stay only as long as it takes to
find and fix the upstream caller. Two regression tests cover both
branches and assert that the caller stack is captured.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
pavanputhra
added a commit
that referenced
this pull request
May 27, 2026
…#186) * investigate(vcon): log caller stack + coerce str arg in Vcon.__init__ A production conserver has been emitting a steady stream of `conserver.link.count{link_name=tag, outcome=error}` with this exception: TypeError: string indices must be integers, not 'str' File "conserver/links/tag/__init__.py", in run vCon.add_tag(tag_name=tag, tag_value=tag) File "common/vcon.py", in add_tag tags_attachment = self.find_attachment_by_purpose("tags") File "common/vcon.py", in find_attachment_by_purpose for a in self.vcon_dict["attachments"] TypeError: string indices must be integers, not 'str' `self.vcon_dict` is a `str` at failure time, yet every sampled vCon in Redis is `JSON.TYPE = object` and a local repro with the same dict shape succeeds. The deployed code is current with #182 and #184. So a caller is reaching `Vcon.__init__` with a JSON-encoded string instead of a parsed dict; the static call graph doesn't show one. `Vcon.__init__` does `json.loads(json.dumps(vcon_dict))`, which silently round-trips a `str` — leaving `self.vcon_dict` as a `str` so the crash only surfaces on the first dict-style access (typically `find_attachment_by_purpose` from the tag link). This change: - Detects `isinstance(vcon_dict, str)` at the boundary and logs an ERROR with `stack_info=True` so the originating call site is visible in the next failure that lands. The message echoes a head of the payload so we can see what flavor of bad input it is. - Defensively `json.loads`'s the string when it's valid JSON, so the link doesn't crash and the vCon doesn't get parked in the DLQ for the duration of the investigation. Non-JSON input falls back to `{}` rather than poisoning downstream access. The defensive coercion is intended to stay only as long as it takes to find and fix the upstream caller. Two regression tests cover both branches and assert that the caller stack is captured. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(vcon): force propagate=True on vcon logger so caplog sees records CI failed the two new __init__ tests: tests/core/test_vcon.py::test_init_coerces_json_string_arg_and_logs_caller FAILED tests/core/test_vcon.py::test_init_bails_to_empty_dict_for_non_json_string FAILED Captured stdout showed the ERROR record being emitted by the project's JSON stdout handler, but caplog.records was empty so the assertion `any("received a str" in rec.message for rec in caplog.records)` was False. Root cause: `common/logging.conf` (the default when LOGGING_CONFIG_FILE is unset, loaded transitively whenever any module imports lib.logging_utils) sets `[logger_vcon] propagate = 0`. With propagation off, records from the "vcon" logger flow only to that logger's own handlers and never reach the root logger that pytest's caplog attaches to — so caplog.records stays empty even though the record is real. A local pytest run that imports only `from vcon import Vcon` never loads `lib.logging_utils`, so the config never applies, propagate stays True (default), and caplog sees the record — which is why the tests passed locally but failed in CI. Fix: a small fixture flips `propagate = True` (and `disabled = False`) on the vcon logger for the duration of each test and restores them after. The two __init__ tests now use the fixture and pass under both local and CI configurations (verified by pre-importing lib.logging_utils to force the CI shape). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
TypeError: string indices must be integers, not 'str'thrown from theanalyzelink'snavigate_dict, killing the chain and DLQing vCons.navigate_dictwas copy-pasted into 5 link modules withkey in currenttraversal. When an intermediate step landed on a JSON-encodedbodystring (the spec-current shape after fix(vcon): decode body per spec on read, canonicalize on write #182),key in currentdid a substring check, passed, and the nextcurrent[key]raised on the string-indexed access.pydash.get(already a core dep —pydash>=7.0.7inpyproject.toml).pydash.gethandles every edge case we relied on (None input, missing key, non-dict mid-path, list-index-via-dot) and returnsNonecleanly instead of raising.sourcewithVcon.with_decoded_body(source)inanalyzeandanalyze_and_labelso a dottedtext_locationlikebody.paragraphs.transcriptcan still drill through a JSON-encoded body —check_and_taganddetect_engagementalready do this;analyze/analyze_and_labelhad silently broken after fix(vcon): decode body per spec on read, canonicalize on write #182.test_navigate_dict*blocks (testing in-house code that no longer exists). Kept oneTestPydashGetContract::test_returns_none_when_midpath_is_non_dictintest_analyze.pythat pins the CON-573 behaviour we depend on.Side-effect (positive)
5 tests that were already failing on
mainwith the same CON-573 TypeError now pass:conserver/tests/test_link_metrics.py::TestAnalyzeMetrics::test_success_records_analysis_timeconserver/tests/test_link_metrics.py::TestAnalyzeMetrics::test_failure_increments_counterconserver/tests/test_link_metrics.py::TestAnalyzeAndLabelMetrics::test_success_records_labels_addedconserver/tests/test_link_metrics.py::TestAnalyzeAndLabelMetrics::test_success_records_analysis_timeconserver/tests/test_link_metrics.py::TestAnalyzeAndLabelMetrics::test_failure_increments_counterPlus the 6
analyze_and_label/tests/test_analyze_and_label.py::test_run_*tests that were silently broken onmainfor the same reason.The remaining test failures on this branch (
TestCheckAndTagMetricsx3,TestOpenAITranscribeMetricsx2,groq_whisper::test_get_transcription) all pre-exist onmainand are unrelated to this change.Test plan
pytest conserver/links/{analyze,analyze_and_label,check_and_tag,detect_engagement,analyze_vcon}/— 38 passed, 10 skipped, 0 failuresTypeErroragainst the pre-fixnavigate_dictin a one-liner and confirmed the new code returnsNoneinsteadpytest conserver/sweep — no new failures introduced