Feat: Storage as resources #1003

Draft
RobertCrupa wants to merge 4 commits into master from feat/storage-as-resources

Conversation

@RobertCrupa RobertCrupa commented Jun 17, 2026

Contributor

WIP for adding storage as a resource interface

RobertCrupa and others added 2 commits June 17, 2026 12:41
Add an MCP resources surface for Apify storage data reads alongside the
existing storage tools (additive; no tool changed). Three `apify://`
templates (dataset items, KVS key listing, KVS record) plus a bounded
recent-list of the user's datasets/stores. Resource handlers resolve the
per-request token and build an ApifyClient like the CallTool handler;
no-token sessions skip storage and read returns an explanatory text block.

Closes #1002

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
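
For illustration, the three `apify://` URI shapes named in this commit message could be built as shown below. This is a minimal sketch, not code from the PR: the helper names are hypothetical, and percent-encoding each ID or key with `encodeURIComponent` is an assumption chosen so that the read path can split the URI on `/` and decode each segment.

```typescript
// Hypothetical helpers (not from the PR): build the three storage resource URIs.
// Assumption: IDs and keys are percent-encoded so that a '/' or a space inside
// a record key cannot change the number of path segments.
function datasetItemsUri(datasetId: string): string {
    return `apify://datasets/${encodeURIComponent(datasetId)}/items`;
}

function keyValueStoreKeysUri(storeId: string): string {
    return `apify://key-value-stores/${encodeURIComponent(storeId)}/keys`;
}

function keyValueStoreRecordUri(storeId: string, recordKey: string): string {
    const store = encodeURIComponent(storeId);
    const key = encodeURIComponent(recordKey);
    return `apify://key-value-stores/${store}/records/${key}`;
}
```

With these helpers, a record key such as `a/b c` stays a single path segment (`a%2Fb%20c`) instead of splitting into two.
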
@github-actions (bot) added the labels t-ai (Issues owned by the AI team.) and tested (Temporary label used only programmatically for some analytics.) on Jun 17, 2026
@RobertCrupa RobertCrupa requested a review from Copilot June 17, 2026 11:52

Copilot AI left a comment

Contributor


⚠️ Not ready to approve

There are correctness/contract issues in the new resources layer (unsafe typing for blob contents, URI decode error handling, and MIME/doc mismatches) that should be fixed before approval.

Pull request overview

Implements Issue #1002 by adding an MCP resources surface for Apify storage reads (datasets, KVS keys, KVS records) alongside existing storage tools, wiring the per-request token into the resources handlers and extending unit/integration coverage.

Changes:

  • Add apify://… storage resource templates, recent-resource listing, and resource reads for dataset items / KVS keys / KVS records.
  • Thread an optional ApifyClient through resource_service and build it per request in ActorsMcpServer.
  • Add unit + integration tests for the new resources surface, and cover empty KVS record bodies in the existing tool.

File summaries

| File | Description |
| --- | --- |
| tests/unit/tools.get_key_value_store_record.test.ts | Adds unit coverage for empty KVS record bodies to keep structured output schema-conforming. |
| tests/unit/resources.storage.test.ts | New unit tests for storage resource templates, listing, and read behavior across datasets/KVS. |
| tests/unit/resources.service.test.ts | Updates resource service template listing expectations to include storage templates. |
| tests/integration/suite.ts | Adds integration coverage for resources/templates/list, resources/list, and resources/read for storage. |
| src/tools/common/get_key_value_store_record.ts | Ensures schema-required value is present even when apify-client returns undefined for empty bodies. |
| src/resources/storage_resources.ts | Introduces apify:// storage resource parsing, recent listing, and read implementations. |
| src/resources/resource_service.ts | Extends resource service to delegate apify:// URIs to storage resources and to accept an optional ApifyClient. |
| src/resources/AGENTS.md | Documents the new storage resources surface and how it’s wired. |
| src/mcp/server.ts | Wires ListResources/ReadResource to build an ApifyClient from per-request token metadata. |

Copilot's findings

  • Files reviewed: 9/9 changed files
  • Comments generated: 4


Comment on lines 23 to 30
type ExtendedResourceContents = TextResourceContents & {
    html?: string;
    _meta?: AvailableWidget['meta'];
};

type ExtendedReadResourceResult = Omit<ReadResourceResult, 'contents'> & {
    contents: ExtendedResourceContents[];
};
Comment thread src/resources/storage_resources.ts Outdated
Comment on lines +135 to +156
    // apify://datasets/{datasetId}/items
    if (segments[0] === 'datasets' && segments[2] === 'items' && segments.length === 3) {
        return readDatasetItems(uri, apifyClient, decodeURIComponent(segments[1]), query);
    }

    // apify://key-value-stores/{keyValueStoreId}/keys
    if (segments[0] === 'key-value-stores' && segments[2] === 'keys' && segments.length === 3) {
        return readKeyValueStoreKeys(uri, apifyClient, decodeURIComponent(segments[1]), query);
    }

    // apify://key-value-stores/{keyValueStoreId}/records/{recordKey}
    if (segments[0] === 'key-value-stores' && segments[2] === 'records' && segments.length === 4) {
        return readKeyValueStoreRecord(
            uri,
            apifyClient,
            decodeURIComponent(segments[1]),
            decodeURIComponent(segments[3]),
        );
    }

    return buildTextResult(uri, `Resource ${uri} is not a recognized Apify storage URI.`);
}
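
The review summary flags URI decode error handling in this area. `decodeURIComponent` throws a `URIError` on a malformed percent-escape, so a bad segment would surface as an exception rather than as an explanatory text block. One possible guard is sketched below; the helper name `safeDecodeSegment` is hypothetical and not part of the PR.

```typescript
// Hypothetical helper: decode one URI path segment without throwing.
// decodeURIComponent throws URIError on malformed escapes such as '%E0%A4%A'.
function safeDecodeSegment(segment: string): string | undefined {
    try {
        return decodeURIComponent(segment);
    } catch {
        // Signal "undecodable" to the caller instead of propagating the error.
        return undefined;
    }
}
```

A caller could treat an `undefined` result the same way as an unrecognized URI and return the explanatory text result instead of failing the read.
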
Comment thread src/resources/storage_resources.ts Outdated
Comment on lines +262 to +265
const { value, contentType } = record;
// apify-client maps an empty record body to `undefined`; emit empty text (an empty OUTPUT is legitimate).
if (value === undefined) return buildTextResult(uri, '');
if (Buffer.isBuffer(value)) {
Comment thread src/resources/AGENTS.md Outdated
Comment on lines +26 to +31
`resources/list` adds concrete URIs for the user's recent datasets/stores
(`desc: true`, bounded). Contents are `application/json` for items/keys; records keep
their `contentType` (binary → base64 `blob`). Best-effort: no token / API error →
list omits storage; an unreadable read returns an explanatory `text` block, never an
error. Reuses the storage tools' arg-parsing helpers and 404→soft-fail pattern; it
does **not** share their response builders (resources need `ReadResourceResult`).
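
The content rules in this excerpt (empty body → empty text, records keep their `contentType`, binary → base64 `blob`) can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `recordToContents`, the local type aliases, and the fallback MIME types are all hypothetical, and the small base64 encoder only exists to keep the sketch dependency-free.

```typescript
// Local stand-ins for the MCP text and blob resource content shapes.
type TextContents = { uri: string; mimeType: string; text: string };
type BlobContents = { uri: string; mimeType: string; blob: string };

const B64 = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Plain base64 encoder so the sketch has no runtime dependencies.
// In Node, Buffer.from(bytes).toString('base64') would do the same job.
function toBase64(bytes: Uint8Array): string {
    let out = '';
    for (let i = 0; i < bytes.length; i += 3) {
        const b0 = bytes[i];
        const b1 = i + 1 < bytes.length ? bytes[i + 1] : 0;
        const b2 = i + 2 < bytes.length ? bytes[i + 2] : 0;
        out += B64[b0 >> 2];
        out += B64[((b0 & 3) << 4) | (b1 >> 4)];
        out += i + 1 < bytes.length ? B64[((b1 & 15) << 2) | (b2 >> 6)] : '=';
        out += i + 2 < bytes.length ? B64[b2 & 63] : '=';
    }
    return out;
}

// Hypothetical mapping from a record value to resource contents.
// Assumption: the fallback MIME types are illustrative defaults only.
function recordToContents(
    uri: string,
    value: string | Uint8Array | undefined,
    contentType?: string,
): TextContents | BlobContents {
    // An empty record body is legitimate: emit empty text, not an error.
    if (value === undefined) return { uri, mimeType: contentType ?? 'text/plain', text: '' };
    if (typeof value === 'string') return { uri, mimeType: contentType ?? 'text/plain', text: value };
    // Binary payloads keep their content type and are base64-encoded.
    return { uri, mimeType: contentType ?? 'application/octet-stream', blob: toBase64(value) };
}
```

Returning a union of the two shapes, rather than casting to a text-only type, lets the compiler verify that binary records never end up in a `text` field.
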
@RobertCrupa

Contributor Author

Currently, large binaries are inlined in the resource interface. @jirispilka, we will need to decide whether to mirror the same approach with the tools, serving the URI when binaries are > 256 KB.

I am in favor of letting the LLM process the data from a resource link using a script, rather than dumping it into the context.
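
The trade-off described in this comment could look roughly like the sketch below. It is illustrative only: `chooseDelivery` and the `Delivery` type are hypothetical, and reading the threshold as 256 × 1024 bytes is an assumption.

```typescript
// Assumption: the threshold discussed in this thread, read as 256 KiB.
const INLINE_LIMIT_BYTES = 256 * 1024;

type Delivery =
    | { kind: 'inline'; bytes: Uint8Array }
    | { kind: 'link'; uri: string; sizeBytes: number };

// Hypothetical decision helper: small payloads are inlined, larger ones are
// replaced by a link that the client can fetch outside the model context.
function chooseDelivery(bytes: Uint8Array, linkUri: string): Delivery {
    if (bytes.byteLength > INLINE_LIMIT_BYTES) {
        return { kind: 'link', uri: linkUri, sizeBytes: bytes.byteLength };
    }
    return { kind: 'inline', bytes };
}
```

Keeping the limit in a single constant would let the tools and the resources surface share one threshold if the two are aligned later.
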

resources are now pulled from the api.apify.com, and large binaries
return signed resource links
- **Key-value store:** Flexible storage for unstructured data or auxiliary files.

## Apify API resources
- Any Apify API GET endpoint can be read as an MCP resource. Pass the full \`https://api.apify.com/v2/...\` URL to resources/read; the server injects authentication and returns the response body.

Member


Suggested change
- Any Apify API GET endpoint can be read as an MCP resource. Pass the full \`https://api.apify.com/v2/...\` URL to resources/read; the server injects authentication and returns the response body.
- Any Apify API GET endpoint can be read as an MCP resource. Pass the full \`https://api.apify.com/v2/...\` URL to `resources/read`; the server injects authentication and returns the response body.

        });
    }
} catch {
    // Ignore: best-effort listing.

Member


Perhaps at least log the error?
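
One way to act on this suggestion while keeping the listing best-effort is sketched below. The `Logger` shape and the `listBestEffort` helper are hypothetical; the project's actual logger and listing code may differ.

```typescript
// Hypothetical logger shape; the project's real logger may differ.
type Logger = { warning: (message: string, data?: Record<string, unknown>) => void };

// Best-effort listing: a failure still yields an empty list, but it is logged
// so that API or token problems remain visible to operators.
async function listBestEffort<T>(fetchItems: () => Promise<T[]>, log: Logger): Promise<T[]> {
    try {
        return await fetchItems();
    } catch (error) {
        log.warning('Storage resource listing failed; omitting storage resources', {
            error: error instanceof Error ? error.message : String(error),
        });
        return [];
    }
}
```

The caller's behavior is unchanged (storage is simply omitted from the list), so this stays consistent with the best-effort contract described in `AGENTS.md`.
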

Labels

t-ai: Issues owned by the AI team.
tested: Temporary label used only programmatically for some analytics.

4 participants