[BugFix] Fix bin/chart NPE with null values (#5174)#5334
[BugFix] Fix bin/chart NPE with null values (#5174)#5334qianheng-aws wants to merge 5 commits intoopensearch-project:mainfrom
Conversation
Bin bucket functions declared non-nullable VARCHAR return type, causing Calcite to optimize away IS NOT NULL filters and allowing null group keys to reach TreeMap which crashes in compareTo. Fix: declare return type as nullable using FORCE_NULLABLE and add nullsLast to chart command sorts as a defensive measure. Signed-off-by: Heng Qian <qianheng@amazon.com>
PR Reviewer Guide 🔍(Review updated until commit ccb3283)Here are some key observations to aid the review process:
|
PR Code Suggestions ✨Latest suggestions up to ccb3283 Explore these optional code suggestions:
Previous suggestionsSuggestions up to commit d41e598
Suggestions up to commit 1f7053d
Suggestions up to commit 7edbc20
Suggestions up to commit ac9092c
|
Decision LogRoot Cause: Bin bucket functions ( Approach:
Alternatives Rejected:
Pitfalls: The non-nullable return type was inherited from Things to Watch: Other UDFs using |
The dedupe command and scripts/comment-on-duplicates.sh were removed from main, but CLAUDE_GUIDE.md still referenced them, causing linkchecker CI failure. Signed-off-by: Heng Qian <qianheng@amazon.com>
|
Persistent review updated to latest commit 7edbc20 |
- Fix total count from 3 to 2 in YAML REST test (bin+chart with category produces 2 groups, not 3) - Add datarows assertions to both YAML test cases for row-level validation - Correct table statistics row count in unit test Signed-off-by: Heng Qian <qianheng@amazon.com>
|
Persistent review updated to latest commit 1f7053d |
YAML REST tests should assert actual row content, not just count. Learned from opensearch-project#5174 where total-only assertion missed incorrect values. Signed-off-by: Heng Qian <qianheng@amazon.com>
|
Persistent review updated to latest commit d41e598 |
- Add pwd/branch reporting at harness start so caller knows the working directory - Add Completion Gate with git status check to prevent uncommitted changes from being left behind - Skill Step 2B reuses existing worktree if PR branch is already checked out, instead of always creating a new one - Step 3 requires main agent to record worktree→PR mapping for directing future operations to the correct directory Signed-off-by: Heng Qian <qianheng@amazon.com>
|
Persistent review updated to latest commit ccb3283 |
| --- | ||
|
|
||
| ## `/dedupe` | ||
|
|
||
| Find duplicate GitHub issues for a given issue. | ||
|
|
There was a problem hiding this comment.
Yes, #5319 has migrate to reuse the workflow in opensearch-build. The original dedupe workflow has been migrated to that repo for reusing across openseach projects.
| ppl: { body: { query: "source=test_issue_<ISSUE> | <your PPL>" } } | ||
| - match: { total: <expected> } | ||
| - length: { datarows: <expected> } | ||
| - match: { datarows: [ [ <row1_val1>, <row1_val2> ], [ <row2_val1>, <row2_val2> ] ] } |
There was a problem hiding this comment.
This change is from skill self-improving after receiving the feedback of PR_REVEW workflow comments.
|
Should we add |
Description
Fix NullPointerException when using
binthenchartwith null values in the binned field.Root cause: The bin bucket functions (
SpanBucketFunction,MinspanBucketFunction,RangeBucketFunction) declared a non-nullableVARCHAR(2000)return type viaReturnTypes.VARCHAR_2000, even though they returnnullfor null inputs. This caused Calcite's optimizer to remove theIS NOT NULLfilters (added byvisitAggregation'snonNullGroupMasklogic) as trivially true, allowing null group keys to reach the Enumerable aggregation'sTreeMap, which crashed with NPE incompareTo.Fix:
ReturnTypes.VARCHAR_2000.andThen(SqlTypeTransforms.FORCE_NULLABLE)so that the null-filtering logic invisitAggregationworks correctlynullsLastto chart command sort operations as a defensive measureRelated Issues
Resolves #5174
Check List
-s)spotlessCheckpassedTest plan
CalcitePPLChartNullTest) verifying logical plan includes null filter and NULLS LASTCalciteBinChartNullIT) verifying bin+chart with null values works end-to-end5174.yml) reproducing the exact scenario from the issue