Skip to content

Add whitelist to block non-query statements in unified SQL path#5330

Open
dai-chen wants to merge 1 commit intoopensearch-project:mainfrom
dai-chen:security/unified-sql-query-type-whitelist
Open

Add whitelist to block non-query statements in unified SQL path#5330
dai-chen wants to merge 1 commit intoopensearch-project:mainfrom
dai-chen:security/unified-sql-query-type-whitelist

Conversation

@dai-chen
Copy link
Copy Markdown
Collaborator

@dai-chen dai-chen commented Apr 9, 2026

Description

The unified SQL path uses Calcite's SqlParser.parseQuery() which, despite its name, accepts any SQL statement type including DML and DDL. The SQL V2 grammar restricts input to SELECT/SHOW/DESCRIBE at the parser level, but the unified path had no equivalent protection.

Changes

Validate the parsed SqlNode against SqlKind.QUERY in CalciteNativeStrategy.plan() before proceeding to planning. Non-query statements are rejected with UnsupportedOperationException.

Statement type SQL V2/Legacy Unified SQL Blocked by
SELECT (with JOIN, subquery, etc.) ✅ Allowed
UNION, EXCEPT, INTERSECT ✅ UNION/EXCEPT ✅ Allowed
CTE (WITH), VALUES ✅ Allowed
DDL (CREATE MV, etc.) ❌ Blocked Calcite parser
SHOW / DESCRIBE ❌ Blocked Calcite parser
EXPLAIN Separate /_explain path ❌ Blocked Whitelist
INSERT, DELETE, UPDATE, MERGE ❌ Blocked Whitelist

Note: EXPLAIN, SHOW, and DESCRIBE have separate execution paths in the current SQL plugin and are out of scope for the unified query API — they should be handled by the API caller for now.

Related Issues

Part of #5248

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Validate parsed SQL statements against Calcite's SqlKind.QUERY set
before planning, rejecting DML (INSERT, DELETE, UPDATE, MERGE) and DDL
statements. This brings the unified path in line with the legacy
grammar's parser-level restriction to read-only queries.

Whitelist: SqlKind.QUERY (SELECT, UNION, INTERSECT, EXCEPT, VALUES,
WITH, ORDER_BY, EXPLICIT_TABLE).

Note: EXPLAIN is also blocked because Calcite's SqlToRelConverter does
not handle SqlExplain in convertQueryRecursive. Supporting EXPLAIN
requires unwrapping the inner query and formatting the plan separately.

Signed-off-by: Chen Dai <daichen@amazon.com>
@dai-chen dai-chen self-assigned this Apr 9, 2026
@dai-chen dai-chen added enhancement New feature or request SQL labels Apr 9, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 9, 2026

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 Multiple PR themes

Sub-PR theme: Add SQL query type whitelist in CalciteNativeStrategy

Relevant files:

  • api/src/main/java/org/opensearch/sql/api/UnifiedQueryPlanner.java

Sub-PR theme: Add tests for non-query statement blocking

Relevant files:

  • api/src/test/java/org/opensearch/sql/api/UnifiedQueryPlannerSqlTest.java
  • api/src/test/java/org/opensearch/sql/api/UnifiedQueryPlannerTest.java

⚡ Recommended focus areas for review

Error Message Leakage

The UnsupportedOperationException message includes parsed.getKind() which exposes internal Calcite SqlKind enum values to callers. This may reveal implementation details and could be confusing to end users. Consider mapping the kind to a user-friendly description or omitting it.

  throw new UnsupportedOperationException(
      "Only query statements are supported. Got: " + parsed.getKind());
}
Missing Assertion

The test testNonQueryStatementsBlockedByWhitelist iterates over a list and calls givenInvalidQuery(sql).assertErrorMessage(...) but does not assert that an exception is actually thrown (i.e., there is no assertThrows wrapping). If givenInvalidQuery silently swallows exceptions, the test could pass without actually validating the blocking behavior.

public void testNonQueryStatementsBlockedByWhitelist() {
  List.of(
          """
          INSERT INTO catalog.employees (id, name, age, department)
          VALUES (99, 'injected', 0, 'hacked')\
          """,
          """
          DELETE FROM catalog.employees
          WHERE age > 30\
          """,
          """
          UPDATE catalog.employees
          SET department = 'Fired'
          WHERE age > 50\
          """,
          """
          EXPLAIN PLAN FOR
          SELECT * FROM catalog.employees\
          """,
          """
          MERGE INTO catalog.employees AS t
          USING (SELECT 99 AS id) AS s ON t.id = s.id
          WHEN MATCHED THEN UPDATE SET name = 'hacked'\
          """)
      .forEach(
          sql ->
              givenInvalidQuery(sql).assertErrorMessage("Only query statements are supported"));
Missing @Test Annotation

The testUnsupportedStatementType method has @Test(expected = UnsupportedOperationException.class) added, but the old diff shows the method previously had no @Test annotation at all. Verify that the test was previously not being run and that the new annotation is correctly placed and the test now executes as expected.

@Test(expected = UnsupportedOperationException.class)
public void testUnsupportedStatementType() {
  planner.plan("explain source = catalog.employees"); // explain statement
}

@github-actions
Copy link
Copy Markdown
Contributor

github-actions bot commented Apr 9, 2026

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
General
Ensure CTE queries are not blocked

SqlKind.QUERY is a set that includes SELECT, UNION, INTERSECT, EXCEPT, and ORDER_BY,
but may not cover all valid read-only query forms (e.g., WITH / CTEs map to WITH
kind). Verify that SqlKind.QUERY covers all intended query types, or explicitly
enumerate the allowed kinds to avoid accidentally blocking valid queries or allowing
unintended ones.

api/src/main/java/org/opensearch/sql/api/UnifiedQueryPlanner.java [85-88]

-if (!parsed.isA(SqlKind.QUERY)) {
+if (!parsed.isA(SqlKind.QUERY) && parsed.getKind() != SqlKind.WITH) {
       throw new UnsupportedOperationException(
           "Only query statements are supported. Got: " + parsed.getKind());
     }
Suggestion importance[1-10]: 5

__

Why: This is a valid concern about SqlKind.QUERY potentially not covering WITH (CTE) queries. However, in Apache Calcite, SqlKind.QUERY does include WITH as part of its set, so the suggestion may be based on an incorrect assumption. The improvement could still be valuable as a defensive measure, but it's not clearly necessary.

Low
Possible issue
Assert exception is thrown per statement

The test iterates over statements but does not assert that any exception is actually
thrown. If givenInvalidQuery(sql).assertErrorMessage(...) silently passes for a
valid query or swallows exceptions, the test would give a false positive. Ensure
each SQL in the list actually triggers an UnsupportedOperationException by using
assertThrows per statement, similar to testInvalidSqlThrowsException.

api/src/test/java/org/opensearch/sql/api/UnifiedQueryPlannerSqlTest.java [241-243]

 List.of(
         ...
         )
     .forEach(
         sql ->
-            givenInvalidQuery(sql).assertErrorMessage("Only query statements are supported"));
+            assertThrows(
+                UnsupportedOperationException.class,
+                () -> planner.plan(sql)));
Suggestion importance[1-10]: 4

__

Why: The suggestion raises a valid concern about whether givenInvalidQuery(sql).assertErrorMessage(...) properly asserts exceptions are thrown. However, the givenInvalidQuery helper method likely already handles exception assertion internally (it's a test helper pattern), and the improved_code changes the assertion approach without knowing the full implementation of givenInvalidQuery. The suggestion is speculative without seeing the helper method's implementation.

Low

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request SQL

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant