Add collection export commands (create, get, cancel). #159

Open
jfrancoa wants to merge 12 commits into main from jose/collection-export
Conversation

@jfrancoa
Collaborator

Implements CLI support for exporting collections to external storage backends (S3, GCS, Azure, filesystem) in Parquet format. Includes unit tests, integration tests, and skill documentation updates.

Closes #158

@orca-security-eu orca-security-eu Bot left a comment

Orca Security Scan Summary

| Status | Check | Issues by priority |
| --- | --- | --- |
| Passed | Infrastructure as Code | high 0, medium 0, low 0, info 0 |
| Passed | SAST | high 0, medium 0, low 0, info 0 |
| Passed | Secrets | high 0, medium 0, low 0, info 0 |
| Passed | Vulnerabilities | high 0, medium 0, low 0, info 0 |

Contributor

Copilot AI left a comment

Pull request overview

Adds Weaviate CLI support for collection exports (create/get/cancel) via a new ExportManager, along with defaults, tests, and documentation/skill reference updates to cover export workflows and output formats.

Changes:

  • Introduces ExportManager with create/status/cancel functionality and output formatting (text/JSON).
  • Adds new CLI subcommands: create export-collection, get export-collection, cancel export-collection.
  • Adds unit + (non-CI-selected) integration tests and documentation/skill references for export usage.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 14 comments.

Show a summary per file
| File | Description |
| --- | --- |
| weaviate_cli/managers/export_manager.py | Implements export create/get-status/cancel and printing logic. |
| weaviate_cli/defaults.py | Adds defaults dataclasses for export command parameters. |
| weaviate_cli/commands/create.py | Adds create export-collection CLI entrypoint. |
| weaviate_cli/commands/get.py | Adds get export-collection CLI entrypoint. |
| weaviate_cli/commands/cancel.py | Adds cancel export-collection CLI entrypoint. |
| test/unittests/test_managers/test_export_manager.py | Unit tests for ExportManager behavior and output. |
| test/integration/test_export_integration.py | New integration tests for end-to-end export flow (not currently executed in CI workflow selection). |
| setup.cfg | Updates runtime dependency on weaviate-client (currently to a VCS branch ref). |
| requirements-dev.txt | Updates dev dependency on weaviate-client (currently to a VCS branch ref). |
| .claude/skills/operating-weaviate-cli/references/exports.md | Adds export reference documentation (currently mentions --bucket which CLI doesn’t implement). |
| .claude/skills/operating-weaviate-cli/SKILL.md | Updates skill docs to include export commands (also mentions --bucket). |
| .claude/skills/contributing-to-weaviate-cli/references/architecture.md | Documents export_manager.py as a manager module. |

Comment thread test/integration/test_export_integration.py Outdated
Comment thread requirements-dev.txt Outdated
Comment thread weaviate_cli/managers/export_manager.py
Comment thread weaviate_cli/managers/export_manager.py
Comment thread weaviate_cli/managers/export_manager.py Outdated
Comment thread setup.cfg Outdated
Comment thread weaviate_cli/commands/create.py Outdated
Comment thread .claude/skills/operating-weaviate-cli/SKILL.md
Comment thread test/unittests/test_managers/test_export_manager.py Outdated
Comment thread test/unittests/test_managers/test_export_manager.py Outdated
jfrancoa and others added 2 commits April 20, 2026 10:09
Implements CLI support for exporting collections to external storage
backends (S3, GCS, Azure, filesystem) in Parquet format. Includes
unit tests, integration tests, and skill documentation updates.

Closes #158

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

The bucket argument was removed from weaviate-python-client, so the
CLI code was adapted accordingly. The path is now also passed as a
config option in get export-collection.
Contributor

Copilot AI left a comment

Pull request overview

Adds Weaviate CLI support for collection export operations, wiring new create/get/cancel export-collection commands through a new ExportManager, plus tests and skill/reference documentation updates.

Changes:

  • Introduces ExportManager to create exports, query status, and cancel exports (including JSON output and shard-level status formatting).
  • Adds CLI subcommands for create export-collection, get export-collection, and cancel export-collection, with new defaults.
  • Adds unit/integration tests and updates skill/reference docs (and updates weaviate-client dependency to a Git ref).

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 12 comments.

Show a summary per file
| File | Description |
| --- | --- |
| weaviate_cli/managers/export_manager.py | Implements export create/status/cancel logic and output formatting. |
| weaviate_cli/defaults.py | Adds default dataclasses for export commands. |
| weaviate_cli/commands/create.py | Adds create export-collection command and options. |
| weaviate_cli/commands/get.py | Adds get export-collection command and options. |
| weaviate_cli/commands/cancel.py | Adds cancel export-collection command and options. |
| test/unittests/test_managers/test_export_manager.py | Unit tests for ExportManager behavior and output. |
| test/integration/test_export_integration.py | Integration tests covering export create/get/cancel flows. |
| setup.cfg | Changes runtime dependency to weaviate-client via Git URL. |
| requirements-dev.txt | Changes dev dependency to the same Git URL. |
| .claude/skills/operating-weaviate-cli/references/exports.md | Adds export reference documentation and examples. |
| .claude/skills/operating-weaviate-cli/SKILL.md | Adds export usage section and workflow notes. |
| .claude/skills/contributing-to-weaviate-cli/references/architecture.md | Mentions the new manager file in the architecture reference. |

Comment thread .claude/skills/operating-weaviate-cli/references/exports.md
Comment thread .claude/skills/operating-weaviate-cli/references/exports.md
Comment thread weaviate_cli/commands/get.py
Comment thread weaviate_cli/commands/cancel.py
Comment thread weaviate_cli/managers/export_manager.py
Comment thread requirements-dev.txt Outdated
Comment thread weaviate_cli/commands/create.py
Comment thread weaviate_cli/managers/export_manager.py
Comment thread test/unittests/test_managers/test_export_manager.py Outdated
Comment thread .claude/skills/operating-weaviate-cli/SKILL.md
The path is now passed via environment variables, so the Python
client was adapted accordingly. This commit removes all references
to path from the CLI code.
@jfrancoa force-pushed the jose/collection-export branch from f15f3c7 to d1a13b1 on April 20, 2026 08:21
- export_manager: raise ClickException when wait_for_completion finishes
  with non-SUCCESS status (matches BackupManager behavior so the CLI
  exits non-zero on FAILED/CANCELED)
- CI: run test_export_integration.py and enable collection-export input
  on the weaviate-local-k8s action (COLLECTION_EXPORT=true provisions
  MinIO and the weaviate-export bucket automatically)
- docs: replace ENABLE_BACKUP references with COLLECTION_EXPORT for the
  export feature prerequisite in SKILL.md and exports.md
- tests: add coverage for the wait+non-SUCCESS raise path and for the
  wait=False happy path with a non-terminal status

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
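
The first bullet above (fail the command when a waited-for export ends in a non-SUCCESS state) can be sketched roughly as follows. This is a minimal illustration, not the code from export_manager.py: `ExportStatus`, `ExportResult`, and `check_export_result` are hypothetical stand-ins, and the fallback `ClickException` class exists only so the sketch runs without click installed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

try:
    from click import ClickException
except ImportError:  # stand-in so the sketch runs without click installed
    class ClickException(Exception):
        pass


class ExportStatus(Enum):
    # Hypothetical stand-in for the client's export status enum.
    STARTED = "STARTED"
    SUCCESS = "SUCCESS"
    FAILED = "FAILED"
    CANCELED = "CANCELED"


@dataclass
class ExportResult:
    # Hypothetical stand-in for the object returned by the client.
    export_id: str
    status: ExportStatus


def check_export_result(result: Optional[ExportResult], wait: bool) -> None:
    """Raise if the caller waited for completion and the export did not succeed."""
    if wait and result and result.status.value != "SUCCESS":
        raise ClickException(
            f"Export '{result.export_id}' finished with status {result.status.value}"
        )


# wait=False with a non-terminal status is not an error.
check_export_result(ExportResult("my-export", ExportStatus.STARTED), wait=False)
```

Because click turns an uncaught `ClickException` into a non-zero exit code, raising here is what lets shell scripts detect FAILED or CANCELED exports.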
Contributor

Copilot AI left a comment

Pull request overview

Adds export-collection support to the CLI, wiring through a new ExportManager and exposing create/get/cancel commands, plus tests and documentation to support collection exports to external storage.

Changes:

  • Added ExportManager with create/status/cancel operations and JSON/text output formatting.
  • Added CLI commands: create export-collection, get export-collection, cancel export-collection.
  • Added unit + integration tests, CI workflow updates, and CLI skill/reference documentation for exports.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 7 comments.

Show a summary per file
| File | Description |
| --- | --- |
| weaviate_cli/managers/export_manager.py | Implements export create/status/cancel logic and output formatting. |
| weaviate_cli/defaults.py | Adds default values for export commands. |
| weaviate_cli/commands/create.py | Adds create export-collection command and options. |
| weaviate_cli/commands/get.py | Adds get export-collection command and options. |
| weaviate_cli/commands/cancel.py | Adds cancel export-collection command and options. |
| test/unittests/test_managers/test_export_manager.py | Unit tests for ExportManager. |
| test/integration/test_export_integration.py | End-to-end export integration coverage. |
| .github/workflows/main.yaml | Enables export feature in integration environment and runs new integration test file. |
| setup.cfg | Changes weaviate-client dependency to a VCS URL. |
| requirements-dev.txt | Changes dev dependency to the same VCS URL. |
| .claude/skills/operating-weaviate-cli/references/exports.md | Adds export command reference documentation. |
| .claude/skills/operating-weaviate-cli/SKILL.md | Adds exports to CLI skill docs and workflows. |
| .claude/skills/contributing-to-weaviate-cli/references/architecture.md | Documents export manager as part of the manager layer. |

Comment thread test/unittests/test_managers/test_export_manager.py
Comment thread setup.cfg Outdated
  python_requires = >=3.9
  install_requires =
-     weaviate-client>=4.20.4
+     weaviate-client @ git+https://github.com/weaviate/weaviate-python-client.git@dev/1.37
Comment thread requirements-dev.txt Outdated
@@ -1,4 +1,4 @@
- weaviate-client>=4.20.4
+ weaviate-client @ git+https://github.com/weaviate/weaviate-python-client.git@dev/1.37
Comment thread weaviate_cli/commands/create.py
Comment thread weaviate_cli/managers/export_manager.py Outdated
Comment on lines +48 to +50
backend_enum = BACKEND_MAP[backend]
file_format_enum = FILE_FORMAT_MAP[file_format]

Comment on lines +92 to +103
export_manager.create_export(
    export_id="my-export",
    backend="filesystem",
    file_format="parquet",
    json_output=True,
)

out = capsys.readouterr().out
data = json.loads(out)
assert data["status"] == "success"
assert data["export_id"] == "test-export"
assert data["collections"] == ["Movies", "Books"]
Comment on lines +235 to +244
export_manager.get_export_status(
    export_id="my-export",
    backend="filesystem",
    json_output=False,
)

out = capsys.readouterr().out
assert "test-export" in out
assert "SUCCESS" in out
assert "1234" in out
jfrancoa and others added 3 commits April 21, 2026 11:04
The previous test excluded the only collection in the fixture, which
the Weaviate server rejects with 422 'no exportable classes'. Create a
secondary collection inside the test and exclude that one instead, so
EXPORT_COLLECTION remains exportable and we can verify both the
excluded and included sides of the filter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
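
The failure mode described in this commit can be modelled in a few lines. The sketch below only imitates the behaviour reported above (excluding every collection leaves nothing to export); it is not Weaviate server code, and `select_exportable` and `NoExportableCollections` are hypothetical names introduced for illustration.

```python
from typing import Iterable, List


class NoExportableCollections(Exception):
    """Hypothetical stand-in for the server's 422 'no exportable classes' response."""


def select_exportable(existing: Iterable[str], exclude: Iterable[str]) -> List[str]:
    """Return the collections that remain after applying the exclude filter."""
    excluded = set(exclude)
    remaining = [name for name in existing if name not in excluded]
    if not remaining:
        # Mirrors the rejection the original test ran into.
        raise NoExportableCollections("no exportable classes")
    return remaining


# With a secondary collection present, excluding it leaves the primary one exportable.
remaining = select_exportable(["ExportCollection", "Secondary"], exclude=["Secondary"])
```

Creating the secondary collection inside the test keeps the fixture collection exportable, so both sides of the filter can be asserted.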
Contributor

Copilot AI left a comment

Pull request overview

Adds export-collection support to the CLI, wiring through to the weaviate-client collection export APIs and documenting/testing the new workflow.

Changes:

  • Introduce ExportManager with create/get-status/cancel operations and JSON/text output.
  • Add create|get|cancel export-collection CLI commands plus defaults for new options.
  • Add unit + integration tests, enable export in CI local-k8s workflow, bump weaviate-client minimum version, and update skill/docs.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.

Show a summary per file
| File | Description |
| --- | --- |
| weaviate_cli/managers/export_manager.py | New manager implementing export create/status/cancel + output formatting. |
| weaviate_cli/defaults.py | Adds defaults dataclasses for export commands. |
| weaviate_cli/commands/create.py | Adds create export-collection command/options. |
| weaviate_cli/commands/get.py | Adds get export-collection command/options. |
| weaviate_cli/commands/cancel.py | Adds cancel export-collection command/options. |
| test/unittests/test_managers/test_export_manager.py | Unit tests for export manager behaviors and argument passing. |
| test/integration/test_export_integration.py | Integration coverage for create/get/cancel export flows. |
| setup.cfg | Bumps minimum weaviate-client version required. |
| requirements-dev.txt | Aligns dev dependency on weaviate-client min version. |
| .github/workflows/main.yaml | Enables collection export in local-k8s and runs new integration test. |
| .claude/skills/operating-weaviate-cli/references/exports.md | New reference docs for export commands/options. |
| .claude/skills/operating-weaviate-cli/SKILL.md | Adds export commands to CLI skill docs and workflow section. |
| .claude/skills/contributing-to-weaviate-cli/references/architecture.md | Documents export_manager.py as a manager module. |

Comment thread weaviate_cli/managers/export_manager.py Outdated
Replace direct dict indexing on BACKEND_MAP / FILE_FORMAT_MAP with
helper functions that raise click.ClickException listing the allowed
values. The CLI is already constrained by click.Choice, but programmatic
callers (or future default drift) would otherwise hit a raw KeyError.
Reused across create/get/cancel and covered by new unit tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
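
A helper of the kind this commit describes might look like the sketch below. The `Backend` enum, the lowercase keys, the error wording, and the name `resolve_backend` are assumptions made for illustration; only `BACKEND_MAP` and the use of `click.ClickException` come from the commit message. The fallback exception class is there so the sketch runs without click installed.

```python
from enum import Enum
from typing import Dict

try:
    from click import ClickException
except ImportError:  # stand-in so the sketch runs without click installed
    class ClickException(Exception):
        pass


class Backend(Enum):
    # Hypothetical stand-in for the client's backend enum.
    FILESYSTEM = "filesystem"
    S3 = "s3"
    GCS = "gcs"
    AZURE = "azure"


BACKEND_MAP: Dict[str, Backend] = {member.value: member for member in Backend}


def resolve_backend(backend: str) -> Backend:
    """Look up a backend by name, failing with a readable error instead of a KeyError."""
    try:
        return BACKEND_MAP[backend]
    except KeyError:
        allowed = ", ".join(sorted(BACKEND_MAP))
        raise ClickException(
            f"Invalid backend '{backend}'. Allowed values: {allowed}"
        ) from None


resolved = resolve_backend("s3")
```

The same pattern would apply to `FILE_FORMAT_MAP`; sharing one lookup helper across create/get/cancel keeps the error message consistent for programmatic callers that bypass `click.Choice`.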
@jfrancoa
Collaborator Author

@copilot review

Copilot AI commented Apr 30, 2026

@copilot review

Addressed the remaining open review items in commit 9c8c8c5:

  • result.backend enum serialization (export_manager.py): switched to .value in both JSON output and text output so the field always emits a stable lowercase string (e.g. "filesystem") instead of the enum repr.
  • Test fixture mocks: updated backend to use MagicMock(value="filesystem") to be consistent with how status is mocked.
  • Export-id mismatch in tests: test_create_export_json_output and test_get_export_status_json_output now override the mock's export_id return to match the value passed in ("my-export"), making the assertions validate the actual round-trip rather than the fixture default.

All 301 unit tests pass and CodeQL finds no issues.
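
The export-id point above is easiest to see with a small mock-based test. The sketch below is illustrative only: `render_create_json` is a hypothetical stand-in for the manager's JSON rendering, and the `client.export.create` method name is an assumption (a `MagicMock` accepts any attribute).

```python
import json
from unittest.mock import MagicMock


def render_create_json(client, export_id: str) -> str:
    # Hypothetical stand-in for the manager's JSON rendering.
    result = client.export.create(export_id=export_id)
    return json.dumps({"export_id": result.export_id})


client = MagicMock()
# Make the mock echo the ID under test instead of a fixture default,
# so the assertion checks the value that was actually passed in.
client.export.create.return_value.export_id = "my-export"

data = json.loads(render_create_json(client, "my-export"))
assert data["export_id"] == "my-export"
client.export.create.assert_called_once_with(export_id="my-export")
```

If the fixture kept returning a different default ID, the output assertion would pass or fail regardless of what the caller supplied, which is why the return value is overridden per test.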

jfrancoa and others added 2 commits April 30, 2026 16:35
Commit 9c8c8c5 added .value lookups on result.backend in three places
(create JSON output, status JSON output, status text output). The
weaviate-client returns ExportCreateReturn.backend as plain str, so
.value raised AttributeError in the integration tests. Use
result.backend directly and switch the test fixtures back to plain
strings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
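
The regression described in this commit is a general hazard of mocking: a `MagicMock` fixture happily provides a `.value` attribute that the real return type does not have. A minimal sketch, with `PlainResult` as a hypothetical stand-in for a result whose `backend` is a plain string:

```python
from unittest.mock import MagicMock


def backend_label_with_value(result) -> str:
    # Variant that assumes an enum-like backend.
    return result.backend.value


def backend_label_direct(result) -> str:
    # Variant that treats backend as a plain string.
    return result.backend


class PlainResult:
    # Hypothetical stand-in for a result whose backend is a plain str.
    backend = "filesystem"


# Against a mock shaped like an enum, the .value variant appears to work.
mocked = MagicMock()
mocked.backend = MagicMock(value="filesystem")
mock_label = backend_label_with_value(mocked)

# Against a plain string it fails, which is what integration tests surfaced.
try:
    backend_label_with_value(PlainResult())
    raised = False
except AttributeError:
    raised = True

direct_label = backend_label_direct(PlainResult())
```

Keeping fixtures in the same shape as the real return type (plain strings here) lets unit tests catch this class of error before the integration run does.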
The JSON output previously hardcoded "status": "success" alongside an
"export_status" field carrying the real state. That conflated the API
call result with the export's terminal state and made automation hard
to write — a wait=False export with status STARTED was indistinguishable
from a finished SUCCESS one.

Drop the hardcoded "success" field and surface result.status.value as
the top-level "status" key, matching the get_export_status JSON
contract. Failures during wait_for_completion still raise ClickException
(non-zero exit) so consumers parsing JSON only see actual export states.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
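
The JSON contract described in this commit can be sketched as follows. The payload keys other than "status", and the `ExportStatus` / `ExportResult` types, are assumptions for illustration rather than the actual output of the CLI.

```python
import json
from dataclasses import dataclass
from enum import Enum


class ExportStatus(Enum):
    # Hypothetical stand-in for the client's export status enum.
    STARTED = "STARTED"
    SUCCESS = "SUCCESS"


@dataclass
class ExportResult:
    # Hypothetical stand-in for the object returned by the client.
    export_id: str
    backend: str
    status: ExportStatus


def create_export_json(result: ExportResult) -> str:
    """Render the result with the real export state as the top-level status."""
    return json.dumps(
        {
            "export_id": result.export_id,
            "backend": result.backend,
            "status": result.status.value,  # no hardcoded "success"
        }
    )


started = json.loads(
    create_export_json(ExportResult("my-export", "filesystem", ExportStatus.STARTED))
)
finished = json.loads(
    create_export_json(ExportResult("my-export", "filesystem", ExportStatus.SUCCESS))
)
```

With this shape, a script can branch on a single "status" key to tell a still-running export (STARTED) from a completed one (SUCCESS).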
Contributor

Copilot AI left a comment

Pull request overview

Adds Weaviate CLI support for collection export operations (create/status/cancel) backed by the newer weaviate-client export APIs, plus tests and documentation to cover the new workflow.

Changes:

  • Introduces ExportManager with backend/file-format resolution and CLI-friendly output (text/JSON).
  • Adds create|get|cancel export-collection CLI subcommands and defaults.
  • Adds unit + integration test coverage, updates docs, bumps weaviate-client minimum version, and enables export in CI integration runs.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.

Show a summary per file
| File | Description |
| --- | --- |
| weaviate_cli/managers/export_manager.py | New manager wrapping client.export APIs, validation, and JSON/text rendering. |
| weaviate_cli/defaults.py | Adds defaults dataclasses for export create/get/cancel commands. |
| weaviate_cli/commands/create.py | Adds create export-collection command wiring to ExportManager. |
| weaviate_cli/commands/get.py | Adds get export-collection command wiring to ExportManager. |
| weaviate_cli/commands/cancel.py | Adds cancel export-collection command wiring to ExportManager. |
| test/unittests/test_managers/test_export_manager.py | Unit tests for validation, arg passing, and output formatting. |
| test/integration/test_export_integration.py | Integration tests covering export create/get/cancel behavior end-to-end. |
| setup.cfg | Bumps minimum weaviate-client version to include export functionality. |
| requirements-dev.txt | Mirrors the weaviate-client version bump for dev/test environments. |
| .github/workflows/main.yaml | Enables collection export in integration environment and runs new integration test file. |
| .claude/skills/operating-weaviate-cli/references/exports.md | New reference doc for export commands and options. |
| .claude/skills/operating-weaviate-cli/SKILL.md | Documents export commands in the operating skill guide. |
| .claude/skills/contributing-to-weaviate-cli/references/architecture.md | Adds export_manager.py to the managers list in architecture docs. |

Comment on lines +85 to +86
if wait and result and result.status.value != "SUCCESS":
    raise click.ClickException(
Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add support for collection export commands

3 participants