Fix #263: Add box_id to ORDER BY for deterministic sorting #274
Open
algsoch wants to merge 1 commit into
- Added o.box_id to ORDER BY clause in getMainUnspentByErgoTree (line 253)
- Added o.box_id to ORDER BY clause in getMainUnspentByErgoTreeFiltered (line 289)
- Fixes PostgreSQL DISTINCT ON requirement: columns must match ORDER BY prefix
- Resolves non-deterministic results when querying addresses with many boxes
- Ensures sortDirection parameter works correctly with pagination
Pull request overview
This PR fixes a critical PostgreSQL query bug in the Explorer API that caused non-deterministic results when querying unspent boxes by address. The root cause was a mismatch between DISTINCT ON and ORDER BY clauses, violating PostgreSQL's requirement that ORDER BY must start with the same columns in the same order as DISTINCT ON.
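The rule the overview refers to can be modelled outside the database. The following is a minimal Python sketch, not the Explorer code: `distinct_on` is a hypothetical helper that sorts rows by the ORDER BY columns and keeps the first row of each DISTINCT ON group, which is the behaviour the PostgreSQL documentation describes for this construct. The sample rows and the `value` column are invented for illustration.

```python
from itertools import permutations

def distinct_on(rows, distinct_cols, order_cols):
    """Model of SELECT DISTINCT ON (distinct_cols) ... ORDER BY order_cols."""
    # Enforce the requirement as this PR states it:
    # ORDER BY must begin with the DISTINCT ON columns.
    if list(order_cols[:len(distinct_cols)]) != list(distinct_cols):
        raise ValueError("DISTINCT ON columns must match the leading ORDER BY columns")
    # sorted() is stable, so rows that tie on the key keep their input order.
    ordered = sorted(rows, key=lambda r: tuple(r[c] for c in order_cols))
    seen, kept = set(), []
    for row in ordered:
        key = tuple(row[c] for c in distinct_cols)
        if key not in seen:  # the first row of each group is kept
            seen.add(key)
            kept.append(row)
    return kept

# Hypothetical rows: box "b1" appears twice, e.g. duplicated by a join.
ROWS = [
    {"box_id": "b2", "global_index": 7, "value": 10},
    {"box_id": "b1", "global_index": 5, "value": 30},
    {"box_id": "b1", "global_index": 5, "value": 20},
]

def results_for_every_input_order(order_cols):
    """Run the model on every permutation of ROWS and collect the distinct outputs."""
    outputs = set()
    for perm in permutations(ROWS):
        kept = distinct_on(list(perm), ["box_id", "global_index"], order_cols)
        outputs.add(tuple((r["box_id"], r["global_index"], r["value"]) for r in kept))
    return outputs
```

In this model, `results_for_every_input_order(["box_id", "global_index", "value"])` yields a single output no matter how the input is arranged, because the sort key leaves no ties. With only `["box_id", "global_index"]` the two `b1` rows tie, so which one is kept depends on input order.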
Key Changes:
- Added `o.box_id` to ORDER BY clause in two query methods to match their DISTINCT ON clauses
- Ensures deterministic, consistent query results for address-based box queries
- Fixes pagination and sortDirection parameter functionality
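Why a deterministic order matters for pagination can be illustrated with a small Python sketch. `page` and `all_pages` are hypothetical stand-ins for `OFFSET ? LIMIT ?` applied after sorting, and the box data is invented. The sketch assumes the sort key is unique per row; under that assumption, consecutive pages cover every box exactly once.

```python
def page(rows, offset, limit, descending=False):
    """Model of ORDER BY box_id, global_index OFFSET offset LIMIT limit."""
    # `descending` reverses the whole key here, a simplification of
    # SQL's per-column ASC/DESC modifiers.
    ordered = sorted(rows, key=lambda r: (r["box_id"], r["global_index"]),
                     reverse=descending)
    return ordered[offset:offset + limit]

def all_pages(rows, limit, descending=False):
    """Concatenate consecutive pages until an empty page is returned."""
    collected, offset = [], 0
    while True:
        chunk = page(rows, offset, limit, descending)
        if not chunk:
            return collected
        collected.extend(chunk)
        offset += limit

# Hypothetical boxes with unique ids and indexes.
BOXES = [{"box_id": f"b{i}", "global_index": 100 + i} for i in range(7)]
```

Because the key is a total order over the rows, the concatenated pages are the same whatever order the rows arrive in, and the descending result is the exact reverse of the ascending one.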
Fix #263: Add box_id to ORDER BY for deterministic sorting
🎯 Summary
Fixes Issue #263 - sortDirection parameter not working correctly
This PR fixes a PostgreSQL query bug that caused non-deterministic results when querying unspent boxes by address. The issue was a mismatch between `DISTINCT ON` columns and `ORDER BY` columns, which violates PostgreSQL requirements and leads to inconsistent query results.
🐛 Problem Description
What Was Broken
The API endpoint `/api/v1/boxes/byAddress/{address}` was returning different results on repeated queries with the same parameters. This affected:
- `sortDirection` parameter (asc/desc) didn't work correctly
Root Cause
The queries in `OutputQuerySet.scala` used:
PostgreSQL Requirement: When using `DISTINCT ON (col1, col2, ...)`, the `ORDER BY` clause must start with the same columns in the same order.
What we had: `DISTINCT ON (o.box_id, o.global_index)` but `ORDER BY o.global_index` ❌
What we need: `DISTINCT ON (o.box_id, o.global_index)` and `ORDER BY o.box_id, o.global_index` ✅
Impact
✅ Solution
Added `o.box_id` to the `ORDER BY` clause in two query methods to match the `DISTINCT ON` clause.
Changes Made
File: `modules/explorer-core/src/main/scala/org/ergoplatform/explorer/db/queries/OutputQuerySet.scala`
Change 1: Line 253 (getMainUnspentByErgoTree)
Change 2: Line 289 (getMainUnspentByErgoTreeFiltered)
🧪 Testing
Manual Testing
Test with an address that has many boxes:
Before Fix: Different boxes returned on each iteration ❌
After Fix: Identical boxes returned on every iteration ✅
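The repeated-query check described above could be scripted roughly as follows. This is a sketch under assumptions rather than the test the author ran: the base URL is a placeholder, and the response shape (an object with an `items` list whose entries have a `boxId` field) is assumed, not taken from this PR. `is_stable` accepts any zero-argument callable, so it can be exercised without a running Explorer instance.

```python
import json
import urllib.request

def fetch_box_ids(base_url, address, sort_direction="desc"):
    """Fetch one page from the endpoint and return the box ids in response order."""
    url = f"{base_url}/api/v1/boxes/byAddress/{address}?sortDirection={sort_direction}"
    with urllib.request.urlopen(url) as response:
        payload = json.load(response)
    # Assumption: the response is an object with an "items" list whose
    # entries carry a "boxId" field; adjust if the actual schema differs.
    return [item["boxId"] for item in payload["items"]]

def is_stable(fetch, iterations=5):
    """Return True if fetch() yields the same result on every call."""
    first = fetch()
    return all(fetch() == first for _ in range(iterations - 1))
```

A call such as `is_stable(lambda: fetch_box_ids("https://explorer.example", address))` (hypothetical host) should return `False` when iterations differ and `True` when every iteration returns the same boxes in the same order.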
Expected Behavior After Fix
📊 Performance Impact
Performance: ✅ No negative impact
- `box_id` column is already part of the index used by `DISTINCT ON`
- `ORDER BY` doesn't require additional sorting
🔍 Code Quality
📝 Checklist
🔗 References
💡 Technical Details
Why This Works
PostgreSQL's `DISTINCT ON` removes duplicate rows based on the specified columns, but it needs to know which duplicate to keep. The `ORDER BY` clause tells PostgreSQL how to sort the rows before selecting the first one from each group.
When the `ORDER BY` doesn't start with the same columns as `DISTINCT ON`, PostgreSQL can't determine which row to keep deterministically, leading to non-deterministic results across query executions.
Query Flow
1. `WHERE o.main_chain = true AND i.box_id IS NULL AND o.ergo_tree = ?`
2. `ORDER BY o.box_id, o.global_index DESC`
3. `DISTINCT ON (o.box_id, o.global_index)` - keeps first row per group
4. `OFFSET ? LIMIT ?`
🎁 Additional Benefits
🚀 Deployment
This fix can be deployed immediately: