# RFC: OpenSearch SQL/PPL Telemetry Integration (#5300)

## Problem Statement

The OpenSearch SQL/PPL plugin has no integration with OpenSearch's core telemetry framework. There is no distributed tracing for query execution, making it difficult to diagnose latency issues across the parse → analyze → optimize → compile → execute → materialize pipeline. The existing profiling framework (`QueryProfiling`) provides wall-clock timing but is disconnected from the standard OpenSearch telemetry export pipeline (OTel SDK → OTLP → observability backends).

## Goals

- **P0**: Add distributed tracing spans to the PPL Calcite query execution pipeline
- Follow OTel semantic conventions for database spans (Elasticsearch for `db.namespace`/`db.collection.name`, PostgreSQL/MySQL for `db.operation.name`)
- Integrate with OpenSearch's `TelemetryAwarePlugin` interface
- Graceful degradation via `NoopTracer` when telemetry is disabled

## Non-Goals

- Metrics migration (P1 — separate follow-up)
- Tracing the legacy v2 engine path (SQL currently uses v2 exclusively; `QueryService.shouldUseCalcite()` gates Calcite to PPL only)
- SQL query tracing (future work when SQL migrates to Calcite)

## Background

### OpenSearch Telemetry Framework

OpenSearch provides a backend-agnostic telemetry framework:

- **`libs/telemetry/`** — interfaces: `Tracer`, `Span`, `SpanScope`, `MetricsRegistry`
- **`plugins/telemetry-otel/`** — OTel SDK implementation that exports via `BatchSpanProcessor`
- Plugins access the framework by implementing `TelemetryAwarePlugin`, which provides `Tracer` and `MetricsRegistry`

The framework is gated behind `opensearch.experimental.feature.telemetry.enabled` (defaults to `false`). When disabled, all tracing operations are no-ops via `NoopTracer`.

### PPL Query Execution Pipeline (Calcite Path)

```
PPL text
  → Parse (PPLSyntaxParser → UnresolvedPlan AST)
  → Analyze (CalciteRelNodeVisitor → RelNode logical plan)
  → Optimize (HepPlanner → optimized RelNode)
  → Compile (RelRunner.prepareStatement → PreparedStatement)
  → Execute (PreparedStatement.executeQuery → ResultSet)
  → Materialize (ResultSet → QueryResponse)
```

---

## Design

### Plugin Interface

`SQLPlugin` implements `TelemetryAwarePlugin`. OpenSearch calls two `createComponents()` methods separately:

1. `TelemetryAwarePlugin.createComponents(... Tracer, MetricsRegistry)`: called first, and only when telemetry is enabled. It stores the `Tracer` reference and returns an empty collection because it creates no components.
2. `Plugin.createComponents(...)`: always called. It creates all plugin components and passes the stored `Tracer` (the real tracer, or the `NoopTracer` default) to `OpenSearchPluginModule` for Guice binding.

```java
public class SQLPlugin extends Plugin
    implements ActionPlugin, ScriptPlugin, SystemIndexPlugin,
               JobSchedulerExtension, ExtensiblePlugin, TelemetryAwarePlugin {

    private Tracer tracer = NoopTracer.INSTANCE;

    // Telemetry-enabled: stores Tracer, returns empty
    @Override
    public Collection<Object> createComponents(
        ..., Tracer tracer, MetricsRegistry metricsRegistry) {
        this.tracer = tracer;
        return Collections.emptyList();
    }

    // Always called: creates all components with stored Tracer
    @Override
    public Collection<Object> createComponents(...) {
        modules.add(new OpenSearchPluginModule(executionEngineExtensions, tracer));
        // ... component creation ...
    }
}
```

`Tracer` is bound via Guice `@Provides @Singleton` in `OpenSearchPluginModule` and injected into all instrumented components.

### Span Hierarchy

Span names are language-agnostic: the query language is recorded as an attribute (`db.query.type`), not encoded in the span name.

#### EXECUTE Request (7 spans)

```
[OpenSearch Transport Action]                 ← SERVER (auto, OpenSearch core)
  └── opensearch.query                        ← CLIENT (root span)
        ├── opensearch.query.parse            ← INTERNAL
        ├── opensearch.query.analyze          ← INTERNAL
        ├── opensearch.query.optimize         ← INTERNAL
        ├── opensearch.query.compile          ← INTERNAL
        ├── opensearch.query.execute          ← INTERNAL
        │     └── transport indices:data/read/search  ← auto, OpenSearch core
        │           └── [phase/query] → [phase/fetch] → [phase/expand]
        └── opensearch.query.materialize      ← INTERNAL
```

#### EXPLAIN Request (4-5 spans)

SIMPLE mode skips compile. Non-SIMPLE modes (STANDARD, EXTENDED, COST) include compile since `OpenSearchRelRunners.run()` calls `prepareStatement()` to capture the physical plan.

```
opensearch.query (db.operation.name="EXPLAIN")
  ├── opensearch.query.parse
  ├── opensearch.query.analyze
  ├── opensearch.query.optimize
  └── opensearch.query.compile              ← non-SIMPLE modes only
```
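
The two span sets above can be read as a small lookup from request type and explain mode to expected span names. The sketch below is illustrative only: `SpanPlanSketch`, `RequestType`, and `ExplainMode` are hypothetical names, not classes from the plugin. Only the span names and mode names come from this RFC.

```java
import java.util.ArrayList;
import java.util.List;

final class SpanPlanSketch {
    enum RequestType { EXECUTE, EXPLAIN }
    enum ExplainMode { SIMPLE, STANDARD, EXTENDED, COST }

    /** Returns the span names expected for one request, root span first. */
    static List<String> expectedSpans(RequestType type, ExplainMode mode) {
        List<String> spans = new ArrayList<>(List.of(
            "opensearch.query",
            "opensearch.query.parse",
            "opensearch.query.analyze",
            "opensearch.query.optimize"));
        // EXECUTE always compiles; EXPLAIN compiles in every mode except SIMPLE.
        boolean compiles = type == RequestType.EXECUTE || mode != ExplainMode.SIMPLE;
        if (compiles) {
            spans.add("opensearch.query.compile");
        }
        // Only EXECUTE runs the statement and materializes a response.
        if (type == RequestType.EXECUTE) {
            spans.add("opensearch.query.execute");
            spans.add("opensearch.query.materialize");
        }
        return spans;
    }
}
```

A lookup like this could back an integration test that asserts the emitted span names for each mode. For `EXECUTE` the `mode` argument is ignored.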

### Span Attributes

#### Root Span (`opensearch.query`)

Following OTel DB semantic conventions (Elasticsearch for cluster/index, PostgreSQL/MySQL for `db.operation.name`):

| Attribute | Convention | Value |
|-----------|-----------|-------|
| `db.system.name` | ES semconv | `"opensearch"` |
| `db.namespace` | ES semconv | Cluster name |
| `db.operation.name` | PG/MySQL semconv | `"EXECUTE"` or `"EXPLAIN"` |
| `db.query.text` | DB semconv | Raw PPL query |
| `db.query.summary` | DB semconv | Command structure (e.g., `"source \| where \| stats"`) |
| `db.query.type` | Custom | `"ppl"` |
| `db.query.id` | Custom | `QueryContext.getRequestId()` (UUID) |
| `server.address` | DB semconv | Node host address |
| `server.port` | DB semconv | Node transport port |

`db.query.summary` is extracted by `QuerySummaryExtractor`, a regex-based utility that produces a low-cardinality pipe-delimited command structure suitable for grouping in observability backends.
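
To make the idea of a regex-based, low-cardinality summary concrete, here is a minimal sketch. It is not the actual `QuerySummaryExtractor`: the class name `QuerySummarySketch`, the method `summarize`, and the quoting rules it handles are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class QuerySummarySketch {
    // Quoted literals and backtick identifiers may contain '|', so drop them before splitting.
    private static final Pattern QUOTED =
        Pattern.compile("'[^']*'|\"[^\"]*\"|`[^`]*`");
    // The command name is assumed to be the first word of each pipe segment.
    private static final Pattern LEADING_WORD = Pattern.compile("^\\s*([A-Za-z_]+)");

    /** Reduces a PPL query to its pipe-delimited command names. */
    static String summarize(String query) {
        String stripped = QUOTED.matcher(query).replaceAll("");
        List<String> commands = new ArrayList<>();
        for (String segment : stripped.split("\\|")) {
            Matcher m = LEADING_WORD.matcher(segment);
            if (m.find()) {
                commands.add(m.group(1).toLowerCase(Locale.ROOT));
            }
        }
        return String.join(" | ", commands);
    }
}
```

Under these assumptions, `summarize("source=logs | where status=500 | stats count() by host")` yields `source | where | stats`. Literal values and field names are discarded, which keeps cardinality low.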

`db.query.id` uses `QueryContext.getRequestId()` — a UUID generated at the start of `doExecute()`, before any query processing. It propagates via Log4j ThreadContext and is already used for log correlation.

#### Phase Span Attributes

Phase spans (`INTERNAL`) carry `error` and `error.type` on failure. Additional phase-specific attributes (e.g., `opensearch.query.plan.node_count`, `opensearch.query.result.rows`) are defined in the design spec but not yet implemented — they will be added as the instrumentation matures.

### Instrumentation Points

Each component that owns a phase receives `Tracer` via Guice and creates its own span. Parent-child relationships propagate automatically via `ThreadContextBasedTracerContextStorage`.

| Span | Component | Method |
|------|-----------|--------|
| `opensearch.query` | `TransportPPLQueryAction` | `doExecute()` |
| `opensearch.query.parse` | `PPLService` | `plan()` |
| `opensearch.query.analyze` | `QueryService` | `executeWithCalcite()` |
| `opensearch.query.optimize` | `CalciteToolsHelper.OpenSearchRelRunners` | `run()` |
| `opensearch.query.compile` | `CalciteToolsHelper.OpenSearchRelRunners` | `run()` |
| `opensearch.query.execute` | `OpenSearchExecutionEngine` | `execute(RelNode, ...)` |
| `opensearch.query.materialize` | `OpenSearchExecutionEngine` | `execute(RelNode, ...)` |

### Async Span Lifecycle

The execution model is asynchronous: `doExecute()` returns before query execution finishes. The actual work runs on the `sql-worker` thread pool.

**Key insight:** `SpanScope` and `Span` have different lifecycles.

- **`SpanScope`** is thread-local. Opened and closed on the **transport thread** via `try/finally`. Its only job is to be active during `pplService.execute()` so `OpenSearchQueryManager.withCurrentContext()` captures the span context in ThreadContext for worker thread propagation.

- **`Span`** is thread-safe. It is created on the transport thread and ended on the **worker thread** in the async listener callback. An `AtomicBoolean` guard ensures `endSpan()` runs exactly once.

```java
// Transport thread
Span rootSpan = tracer.startSpan(SpanCreationContext.client().name("opensearch.query")...);
SpanScope spanScope = tracer.withSpanInScope(rootSpan);

// Declared before the try block so the synchronous failure path reuses the same guard
ActionListener<...> tracedListener = new ActionListener<>() {
    private final AtomicBoolean ended = new AtomicBoolean(false);

    @Override public void onResponse(...) {
        try { listener.onResponse(response); }
        finally { if (ended.compareAndSet(false, true)) rootSpan.endSpan(); }
    }

    @Override public void onFailure(Exception e) {
        try { rootSpan.setError(e); listener.onFailure(e); }
        finally { if (ended.compareAndSet(false, true)) rootSpan.endSpan(); }
    }
};

try {
    // The scope is active here so the span context is captured for the worker thread
    pplService.execute(request, tracedListener, ...);
} catch (Exception e) {
    // Synchronous failure: go through the guarded listener so the span cannot end twice
    tracedListener.onFailure(e);
} finally {
    spanScope.close(); // Close scope on transport thread
}
```

| Object | Created On | Closed/Ended On | Thread-Safe? |
|--------|-----------|----------------|-------------|
| `Span` | Transport thread | Worker thread (async callback) | Yes |
| `SpanScope` | Transport thread | Transport thread (finally block) | Must be same thread |
| ThreadContext snapshot | Captured at submit time | Restored on worker thread | Yes (immutable) |
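
The "captured at submit time, restored on worker thread" row can be illustrated with plain JDK classes. This is a simplified stand-in, not the OpenSearch implementation: a `ThreadLocal<String>` plays the role of the ThreadContext slot, and this `withCurrentContext` is a hypothetical helper that only mirrors the name used above.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ContextPropagationSketch {
    // Stand-in for the thread-local "current span" slot.
    private static final ThreadLocal<String> CURRENT_SPAN = new ThreadLocal<>();

    /** Captures the caller's current span now and restores it around the task later. */
    static Runnable withCurrentContext(Runnable task) {
        final String captured = CURRENT_SPAN.get();   // snapshot at submit time
        return () -> {
            String previous = CURRENT_SPAN.get();
            CURRENT_SPAN.set(captured);               // restore on the worker thread
            try {
                task.run();
            } finally {
                CURRENT_SPAN.set(previous);           // do not leak into pooled threads
            }
        };
    }

    /** Returns the span name the worker thread observed as its parent. */
    static String runDemo() {
        ExecutorService worker = Executors.newSingleThreadExecutor();
        String[] seenOnWorker = new String[1];
        try {
            CURRENT_SPAN.set("opensearch.query");     // scope opened on the submitting thread
            Future<?> done;
            try {
                done = worker.submit(
                    withCurrentContext(() -> seenOnWorker[0] = CURRENT_SPAN.get()));
            } finally {
                CURRENT_SPAN.remove();                // scope closed on the submitting thread
            }
            done.get();                               // wait for the worker to finish
            return seenOnWorker[0];
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            worker.shutdown();
        }
    }
}
```

The scope on the submitting thread is closed as soon as the task is submitted, yet the worker still sees `opensearch.query` as its parent because the value was captured when the task was wrapped.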

### Error Handling

#### Synchronous Phases (parse, analyze, optimize, compile)

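Each synchronous phase uses a standard `try/catch/finally`: the `catch` block calls `span.setError(e)` and re-throws, and the `finally` block calls `span.endSpan()` so the span ends on both the success and the failure path.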
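
A minimal sketch of that pattern follows. It assumes a hypothetical `SimpleSpan` interface with only the two calls named above, because the real OpenSearch `Span` cannot be instantiated standalone. `RecordingSpan` and `runPhase` are illustrative names as well.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

final class PhaseSpanSketch {
    /** Hypothetical minimal span; the real OpenSearch Span has a richer API. */
    interface SimpleSpan {
        void setError(Exception e);
        void endSpan();
    }

    /** Test double that records the calls made on it, in order. */
    static final class RecordingSpan implements SimpleSpan {
        final List<String> events = new ArrayList<>();

        @Override public void setError(Exception e) {
            events.add("error:" + e.getClass().getSimpleName());
        }

        @Override public void endSpan() {
            events.add("end");
        }
    }

    /** Runs one synchronous phase: record the error, re-throw, always end the span. */
    static <T> T runPhase(SimpleSpan span, Supplier<T> phase) {
        try {
            return phase.get();
        } catch (RuntimeException e) {
            span.setError(e);   // mark the span as failed
            throw e;            // the caller still sees the original exception
        } finally {
            span.endSpan();     // runs on success and on failure
        }
    }
}
```

Because `endSpan()` sits in `finally`, a failing phase records the error first and then ends the span, and the exception reaches the caller unchanged.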

#### Async Boundaries (execute, materialize)

Inside `client.schedule()` lambdas, exceptions **must** be routed to `listener.onFailure()` and never re-thrown. Re-throwing bypasses the listener chain, so the root span is never ended and leaks.

```java
client.schedule(() -> {
    try (...) {
        // ... phase work ...
        listener.onResponse(response);
    } catch (Throwable t) {
        if (t instanceof VirtualMachineError) throw (VirtualMachineError) t;
        Exception e = (t instanceof Exception) ? (Exception) t : new RuntimeException(t);
        listener.onFailure(e);
    }
});
```

**Pre-existing bug fixed:** `OpenSearchExecutionEngine.execute(RelNode, ...)` previously caught `SQLException` and re-threw as `RuntimeException` without calling `listener.onFailure()`. Fixed to catch `Throwable`, re-throw only `VirtualMachineError`, and route everything else to `listener.onFailure()`.

### Telemetry Control

The SQL plugin adds no toggle of its own. Tracing is controlled entirely by OpenSearch core settings:

| Level | Setting | Default | Effect |
|-------|---------|---------|--------|
| Feature Flag | `opensearch.experimental.feature.telemetry.enabled` | `false` | Gates all telemetry settings |
| Tracer Feature | `telemetry.feature.tracer.enabled` | `false` | Enables tracer infrastructure |
| Tracer Toggle | `telemetry.tracer.enabled` | `false` | Dynamic on/off for tracing |
| Sampling | `telemetry.tracer.sampler.probability` | `0.01` | Fraction of traces exported |

When telemetry is disabled, `Tracer` is `NoopTracer` — all span operations are no-ops with near-zero overhead. No conditional checks needed in application code.
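
The reason no conditional checks are needed is the Null Object pattern. The sketch below shows it with hypothetical `MiniTracer` and `MiniSpan` interfaces, which are simplified stand-ins and not the OpenSearch `Tracer`/`Span` API.

```java
final class NoopTracerSketch {
    interface MiniSpan { void endSpan(); }
    interface MiniTracer { MiniSpan startSpan(String name); }

    // Null Object: every operation does nothing and one shared span instance is reused.
    static final MiniSpan NOOP_SPAN = () -> {};
    static final MiniTracer NOOP_TRACER = name -> NOOP_SPAN;

    /** A "real" tracer stand-in that counts span starts and ends. */
    static final class CountingTracer implements MiniTracer {
        int started;
        int ended;

        @Override public MiniSpan startSpan(String name) {
            started++;
            return () -> ended++;
        }
    }

    /** Instrumented code is identical whether tracing is on or off. */
    static String parse(MiniTracer tracer, String query) {
        MiniSpan span = tracer.startSpan("opensearch.query.parse");
        try {
            return query.trim();   // placeholder for the real phase work
        } finally {
            span.endSpan();
        }
    }
}
```

`parse` has no `if (enabled)` branch. Passing `NOOP_TRACER` turns both span calls into empty method invocations, while passing `CountingTracer` records one start and one end.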
