
Web hosted client (api key based access) #480

Open

atomic wants to merge 8 commits into NVIDIA:release/26.02 from atomic:web-hosted-client

Conversation

@atomic

@atomic atomic commented Oct 9, 2025

Description

  • Add CuOptServiceWebHostedClient for connecting to web-hosted cuOpt services (e.g. NVIDIA API endpoints) with endpoint URL and API key authentication
  • Add create_client() factory function that returns the appropriate client based on parameters (see the usage sketch below)
  • Extends CuOptServiceSelfHostClient with endpoint URL parsing, auth headers, and URL construction for web-hosted APIs
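
A minimal usage sketch of the new client and factory (the endpoint URL is illustrative and the api_key parameter name is an assumption; create_client, CuOptServiceWebHostedClient, and the CUOPT_API_KEY fallback come from this PR):

    import os

    from cuopt_sh_client import CuOptServiceWebHostedClient, create_client

    # Explicit construction: endpoint plus API key (falls back to CUOPT_API_KEY when omitted)
    client = CuOptServiceWebHostedClient(
        endpoint="https://api.nvidia.com/cuopt/v1",
        api_key=os.environ.get("CUOPT_API_KEY"),
    )

    # Factory construction: returns a web-hosted client when an endpoint is given,
    # otherwise the existing self-hosted client
    client = create_client(endpoint="https://api.nvidia.com/cuopt/v1")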

Issue

Closes #479

Tested with the cuOpt model on NVIDIA Build (build.nvidia.com)

Test script (excluded from the PR as it is an ad hoc script): https://gist.github.com/atomic/9c761bd4d64d62bc039ecf88fe28fca0#file-test_cuopt_nvidia_build_com-py

export CUOPT_API_KEY=...
python3 test_cuopt_nvidia_build_com.py

Checklist

  • I am familiar with the Contributing Guidelines.
  • Testing
    • New or existing tests cover these changes
    • Added tests
    • Created an issue to follow-up
    • NA
  • Documentation
    • The documentation is up to date with these changes
    • Added new documentation
    • NA

Summary by CodeRabbit

  • New Features

    • Added web-hosted CuOpt client supporting cloud deployment with API key authentication.
  • Bug Fixes

    • Improved memory error handling in barrier solver operations.
  • Documentation

    • Updated GPU requirements to include H100 and newer architectures.
    • Refined product terminology from "optimization library" to "optimization engine."
  • Tests

    • Added comprehensive web-hosted client test coverage.

@atomic atomic requested a review from a team as a code owner October 9, 2025 23:44
@atomic atomic requested a review from Iroy30 October 9, 2025 23:44
@copy-pr-bot

copy-pr-bot bot commented Oct 9, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@atomic atomic changed the title [DRAFT] Web hosted client Web hosted client (api key based access) Oct 13, 2025
@anandhkb anandhkb added this to the 25.12 milestone Oct 17, 2025
@rgsl888prabhu rgsl888prabhu added non-breaking Introduces a non-breaking change improvement Improves an existing functionality labels Oct 21, 2025
@rgsl888prabhu rgsl888prabhu changed the base branch from branch-25.10 to branch-25.12 October 21, 2025 19:36
@coderabbitai

coderabbitai bot commented Oct 21, 2025

📝 Walkthrough

Walkthrough

This PR introduces a web-hosted cuOpt client interface with API key authentication for cloud deployment, refactors C++ linear programming code to explicitly pass problem handles, adds memory exception handling and pointer mode configuration, updates documentation, and includes comprehensive tests for the new client interface.

Changes

Cohort / File(s) Summary
Python Web-Hosted Client (New Feature)
python/cuopt_self_hosted/cuopt_sh_client/cuopt_web_hosted_client.py, python/cuopt_self_hosted/cuopt_sh_client/__init__.py
Introduces CuOptServiceWebHostedClient class extending CuOptServiceSelfHostClient with endpoint parsing, API key authentication, Bearer token headers, and 401/403 error handling. Adds create_client factory function to instantiate appropriate client type based on endpoint parameter. Exports new classes from module __init__.
Python Client Infrastructure
python/cuopt_self_hosted/cuopt_sh_client/cuopt_self_host_client.py
Adds internal _make_http_request helper method to centralize HTTP calls (GET/POST/DELETE), replacing direct requests library calls throughout multiple existing methods like _get_logs, stop_threads, delete, repoll, and status (a sketch of this pattern follows the table).
C++ Handle Propagation
cpp/src/linear_programming/translate.hpp, cpp/src/linear_programming/solve.cu
Updates cuopt_problem_to_simplex_problem function signature to accept explicit raft::handle_t const* parameter for handle management. Propagates handle pointer through objective copying, CSR matrix operations, bound extraction, and host-copy streams. Updates all call sites in run_barrier, run_dual_simplex, and run_concurrent to pass problem.handle_ptr explicitly.
C++ Exception & Configuration
cpp/src/dual_simplex/barrier.cu, cpp/src/dual_simplex/cusparse_view.cu
Adds rmm::out_of_memory exception handler in barrier solver returning NUMERICAL_ISSUES. Initializes cuBLAS and cuSPARSE pointer modes to device in cusparse_view_t constructor with RAFT_TRY error handling wrappers.
Documentation Updates
docs/cuopt/source/introduction.rst, docs/cuopt/source/system-requirements.rst
Changes descriptor from "optimization library" to "optimization engine" in introduction. Broadens GPU requirement from "H100 SXM (compute capability >= 9.0)" to "H100 SXM (compute capability >= 9.0) and above" to include GPUs with higher compute capabilities.
Benchmark Configuration
benchmarks/linear_programming/utils/get_datasets.py
Disables netlib dataset emps download/compile flow by adding and False condition to conditional branch, preventing runtime execution of compilation and execution steps while leaving other branches unchanged.
Test Additions
python/cuopt_self_hosted/tests/test_web_hosted_client.py
Introduces comprehensive test suite covering endpoint parsing, authentication header configuration, URL construction, API error handling (401/403), base path handling, factory function behavior, and client type instantiation with parameter propagation.
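
For orientation, a minimal, self-contained sketch of the pattern described in the two Python rows above (class names, the timeout value, and the endpoint/api_key attributes are placeholders; only the _make_http_request hook, the Bearer header, and the 401/403 handling are taken from this PR):

    import os

    import requests


    class SelfHostClientSketch:
        """Parent centralizes HTTP calls so subclasses can customize them."""

        def _make_http_request(self, method, url, **kwargs):
            kwargs.setdefault("timeout", 30)  # defensive default; the real value is an assumption
            return requests.request(method, url, **kwargs)


    class WebHostedClientSketch(SelfHostClientSketch):
        """Child injects Bearer auth and maps 401/403 to clear errors."""

        def __init__(self, endpoint, api_key=None):
            self.endpoint = endpoint
            self.api_key = api_key or os.getenv("CUOPT_API_KEY")

        def _make_http_request(self, method, url, **kwargs):
            headers = dict(kwargs.get("headers") or {})
            if self.api_key:
                headers["Authorization"] = f"Bearer {self.api_key}"
            kwargs["headers"] = headers
            response = super()._make_http_request(method, url, **kwargs)
            if response.status_code in (401, 403):
                raise ValueError("Authentication or permission error; check your API key.")
            return response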

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

  • Out of Scope Changes check ⚠️ Warning: Most changes align with issue #479 scope, but several modifications appear beyond the core feature: barrier.cu exception handling, cusparse_view.cu pointer mode configuration, and documentation updates (introduction.rst, system-requirements.rst) are unrelated to the web-hosted client. Resolution: Remove or clarify the purpose of unrelated changes in barrier.cu, cusparse_view.cu, and documentation files, or defer them to separate PRs focused on their specific concerns.
  • Docstring Coverage ⚠️ Warning: Docstring coverage is 69.05%, which is insufficient; the required threshold is 80.00%. Resolution: Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (3 passed)

  • Description Check ✅ Passed: Check skipped - CodeRabbit's high-level summary is enabled.
  • Title check ✅ Passed: The PR title 'Web hosted client (api key based access)' clearly and specifically describes the main change: adding a web-hosted client with API key authentication support.
  • Linked Issues check ✅ Passed: The PR fulfills issue #479 requirements: adds CuOptServiceWebHostedClient class extending SelfHostClient, implements API key authentication via Bearer tokens, supports endpoint URL parsing, and reuses existing request-based interaction patterns.

@rgsl888prabhu
Collaborator

/ok to test e595eab

@rgsl888prabhu
Collaborator

@CodeRabbit review and summarize status

@coderabbitai

coderabbitai bot commented Oct 21, 2025

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Get PR status and details
gh pr view 480 --json state,title,isDraft,reviews,statusCheckRollup,mergeable,mergeStateStatus,additions,deletions,changedFiles

# Get list of files changed in the PR
gh pr view 480 --json files --jq '.files[] | "\(.path) +\(.additions) -\(.deletions)"'

Length of output: 5220


🏁 Script executed:

#!/bin/bash
# Get the new web hosted client implementation
cat python/cuopt_self_hosted/cuopt_sh_client/cuopt_web_hosted_client.py

echo -e "\n\n=== INIT FILE ==="
cat python/cuopt_self_hosted/cuopt_sh_client/__init__.py

echo -e "\n\n=== MODIFIED SELF HOST CLIENT ==="
cat python/cuopt_self_hosted/cuopt_sh_client/cuopt_self_host_client.py

echo -e "\n\n=== TEST FILE ==="
cat python/cuopt_self_hosted/tests/test_web_hosted_client.py

Length of output: 61283


@rgsl888prabhu I'll review the PR and provide a status summary.

📊 PR Status Summary

Current State: OPEN (Merge Blocked - Waiting for CI checks)

CI Checks Status:

  • Passing: changed-files, other-checks, Branch Checker, CodeRabbit, Label Checker
  • 🔄 In Progress: check-style, multiple wheel builds (Python 3.10-3.13, amd64/arm64)
  • Reviews: None submitted yet

Changes: 4 files, +527/-10 lines

  • New: cuopt_web_hosted_client.py (251 lines)
  • New: test_web_hosted_client.py (232 lines)
  • Modified: cuopt_self_host_client.py (+40/-10)
  • Modified: __init__.py (+4)

🔍 Code Review

✅ Strengths

  1. Excellent Architecture

    • Clean inheritance from CuOptServiceSelfHostClient
    • Factory pattern with create_client() for easy client selection
    • Proper separation of concerns between web-hosted and self-hosted clients
  2. Comprehensive Testing

    • 232 lines of well-structured unit tests
    • Tests cover URL parsing, authentication, error handling, and factory function
    • Good use of mocking and parametrization
  3. Authentication & Security

    • Proper Bearer token implementation for API keys
    • Environment variable support (CUOPT_API_KEY)
    • Self-signed certificate handling
    • Specific error handling for 401/403 responses
  4. Documentation

    • Detailed docstrings with parameter descriptions
    • Clear usage examples in docstrings
    • Good inline comments explaining design decisions
  5. Backward Compatibility

    • Parent class refactored to support extensibility via _make_http_request()
    • Existing functionality preserved

🔧 Suggestions for Improvement

  1. Error Handling Enhancement (Line 186-192, cuopt_web_hosted_client.py)

    # Current: Uses ValueError for auth errors
    if response.status_code == 401:
        raise ValueError("Authentication failed. Please check your API key.")
    
    # Consider: More specific exception types
    if response.status_code == 401:
        raise AuthenticationError("Authentication failed. Please check your API key.")
    elif response.status_code == 403:
        raise PermissionError("Access forbidden. Please check your permissions.")
  2. Header Merging Pattern (Line 174-178)
    The current implementation replaces headers entirely. Consider a more defensive merge:

    def _make_http_request(self, method: str, url: str, **kwargs):
        headers = kwargs.get("headers", {}).copy()  # Copy to avoid mutations
        headers.update(self._get_auth_headers())
        kwargs["headers"] = headers
        
        response = requests.request(method, url, **kwargs)
        # ... rest of the method
  3. URL Construction Clarity (Lines 140-149)
    The comment mentions a "curl example" - consider adding a reference to documentation or examples for clarity.

  4. Test Coverage Enhancement
    While unit tests are comprehensive, consider adding integration tests that verify:

    • get_optimized_routes() works with the web-hosted client
    • get_LP_solve() works with authentication headers
    • URL construction is correct for all inherited methods
  5. Type Hints
    Consider adding return type hints to methods like _get_auth_headers():

    def _get_auth_headers(self) -> Dict[str, str]:

📝 Minor Observations

  1. Line 49: The log warning for missing protocol is good, but consider making the default protocol configurable if http-only endpoints exist.

  2. Line 28: mime_type import is correct, but verify it's exported properly in the parent module's __init__.py (it is ✅).

  3. Factory Function: The create_client() error message when API key is provided without endpoint is excellent - very clear guidance for users.


🎯 Overall Assessment

Code Quality: ⭐⭐⭐⭐½ (4.5/5)

  • Well-designed, tested, and documented
  • Follows Python best practices
  • Minor improvements suggested above

Readiness:

  • Waiting for: CI checks to complete (check-style and wheel builds)
  • Code review: Looks good with minor suggestions
  • Tests: Comprehensive coverage
  • Documentation: Well documented

Recommendation: Once CI checks pass (check-style and wheel builds), this PR should be ready to merge pending maintainer review of the architectural approach.


Would you like me to elaborate on any of the suggestions or run additional analysis on specific aspects of the code?

@anandhkb
Contributor

Gentle ping @atomic

rgsl888prabhu and others added 2 commits October 27, 2025 17:02

## Description
Update introduction to use engine instead of library

## Checklist

- [x] I am familiar with the [Contributing
Guidelines](https://github.com/NVIDIA/cuopt/blob/HEAD/CONTRIBUTING.md).
- Testing
   - [ ] New or existing tests cover these changes
   - [ ] Added tests
   - [ ] Created an issue to follow-up
   - [x] NA
- Documentation
   - [ ] The documentation is up to date with these changes
   - [x] Added new documentation
   - [ ] NA
Hot fix with the following commits:
- NVIDIA#541
- NVIDIA#544
- NVIDIA#550
@github-actions

github-actions bot commented Nov 1, 2025

🔔 Hi @anandhkb @atomic, this pull request has had no activity for 7 days. Please update or let us know if it can be closed. Thank you!

If this is an "epic" issue, then please add the "epic" label to this issue.
If it is a PR and not ready for review, then please convert this to draft.
If you just want to switch off this notification, then use the "skip inactivity reminder" label.

2 similar comments

@rgsl888prabhu rgsl888prabhu changed the base branch from main to release/25.12 November 17, 2025 21:37
@github-actions

🔔 Hi @anandhkb @atomic, this pull request has had no activity for 7 days. Please update or let us know if it can be closed. Thank you!

If this is an "epic" issue, then please add the "epic" label to this issue.
If it is a PR and not ready for review, then please convert this to draft.
If you just want to switch off this notification, then use the "skip inactivity reminder" label.

@chris-maes
Contributor

Moving this out of the 25.12 release. If this is incorrect, please bring it back in.

@chris-maes chris-maes modified the milestones: 25.12, 26.02 Nov 25, 2025
@github-actions

github-actions bot commented Dec 3, 2025

🔔 Hi @anandhkb @atomic, this pull request has had no activity for 7 days. Please update or let us know if it can be closed. Thank you!

If this is an "epic" issue, then please add the "epic" label to this issue.
If it is a PR and not ready for review, then please convert this to draft.
If you just want to switch off this notification, then use the "skip inactivity reminder" label.

5 similar comments

@github-actions

🔔 Hi @anandhkb @atomic, this pull request has had no activity for 7 days. Please update or let us know if it can be closed. Thank you!

If this is an "epic" issue, then please add the "epic" label to this issue.
If it is a PR and not ready for review, then please convert this to draft.
If you just want to switch off this notification, then use the "skip inactivity reminder" label.

@anandhkb anandhkb removed this from the 26.02 milestone Jan 21, 2026
@rg20
Contributor

rg20 commented Jan 22, 2026

@tmckayus for viz!

@rgsl888prabhu rgsl888prabhu changed the base branch from release/25.12 to release/26.02 January 22, 2026 16:49
@github-actions

🔔 Hi @anandhkb @atomic, this pull request has had no activity for 7 days. Please update or let us know if it can be closed. Thank you!

If this is an "epic" issue, then please add the "epic" label to this issue.
If it is a PR and not ready for review, then please convert this to draft.
If you just want to switch off this notification, then use the "skip inactivity reminder" label.

1 similar comment

@atomic
Author

atomic commented Feb 10, 2026

Acknowledged, will rebase and rework the MR

@atomic atomic requested a review from a team as a code owner February 17, 2026 19:01
@atomic atomic requested review from kaatish and rg20 February 17, 2026 19:01
atomic and others added 6 commits February 17, 2026 11:17
Remove the bearer token parameter as it can be inferred
Provide the example as a gist instead
Remove suspicious hostname pattern validation that was overly
heuristic-based. Simplify invalid URL detection to rely on urlparse
hostname check. Fix Black/isort/flake8 formatting issues.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Tony Salim <tsalim@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Tony Salim <tsalim@nvidia.com>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

🧹 Nitpick comments (4)
benchmarks/linear_programming/utils/get_datasets.py (1)

695-707: and False is a poor mechanism to disable code — prefer removing or gating it properly.

This creates permanently dead code that will confuse future readers and may silently break the netlib dataset pipeline. Netlib-type archives are decompressed but never converted to .mps via emps, yet download_dataset (line 715) checks for {name}.mps existence — so netlib datasets may fail to be recognized as already downloaded on re-runs, or consumers expecting .mps files won't find them.

Consider one of:

  1. Remove the dead block entirely if emps is no longer needed.
  2. Gate it behind an explicit CLI flag or environment variable if it's meant to be re-enabled later (sketched after the diff below).
  3. At minimum, add a return or raise for netlib type so the broken state is explicit.
Option 1: Remove dead code
-    # download emps and compile
-    # Disable emps for now
-    if type == "netlib" and False:
-        url = MittelmannInstances["emps"]
-        file = os.path.join(dir, "emps.c")
-        download(url, file)
-        subprocess.run(f"cd {dir} && gcc -Wno-implicit-int emps.c -o emps",
-                        shell=True)
-        # determine output file and run emps
-        subprocess.run(f"cd {dir} && ./emps {unzippedfile} > {outfile}",
-                        shell=True)
-        # cleanup emps and emps.c
-        subprocess.run(f"rm -rf {dir}/emps*",
-                        shell=True)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@benchmarks/linear_programming/utils/get_datasets.py` around lines 695 - 707,
Remove the dead "if type == 'netlib' and False:" block and any references to the
disabled emps conversion (emps.c, MittelmannInstances['emps'], running ./emps)
from get_datasets.py; if netlib entries must produce .mps files, implement a
clear path in the download_dataset logic (e.g., in download_dataset or the
calling code) to either generate .mps externally or mark netlib datasets as
already-unconvertible so the "{name}.mps" existence check is
consistent—alternatively, replace the removed block with an explicit, documented
gate (ENV flag or CLI option) if you intend to optionally re-enable emps in
future.
python/cuopt_self_hosted/cuopt_sh_client/cuopt_web_hosted_client.py (2)

119-126: Add stacklevel=2 to warnings.warn.

Without stacklevel=2, the warning points to this internal method rather than the caller's code, making it harder for users to locate the source.

🔧 Fix
             warnings.warn(
                 "No protocol specified in endpoint"
                 f" '{endpoint}', assuming https://",
                 UserWarning,
+                stacklevel=2,
             )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/cuopt_self_hosted/cuopt_sh_client/cuopt_web_hosted_client.py` around
lines 119 - 126, The warnings.warn call that normalizes the endpoint protocol
should include stacklevel=2 so the warning points to the caller instead of this
helper; in cuopt_web_hosted_client.py update the warnings.warn invocation that
warns about a missing protocol for variable endpoint to pass stacklevel=2 (e.g.,
warnings.warn(..., UserWarning, stacklevel=2)) so users see the correct call
site.

59-66: endpoint typed as Optional[str] but always required.

The constructor immediately raises if endpoint is falsy. Making the type non-optional (just str) would better communicate intent and let type checkers catch misuse.
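
A one-line illustration of the suggested signature (other constructor parameters are omitted, and the api_key name is an assumption):

    class WebHostedClientSignatureSketch:
        def __init__(self, endpoint: str, api_key: str | None = None) -> None:
            # endpoint is required and never None; only the key may be omitted
            ...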

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/cuopt_self_hosted/cuopt_sh_client/cuopt_web_hosted_client.py` around
lines 59 - 66, The __init__ signature incorrectly types endpoint as
Optional[str] while the constructor requires it; change the parameter annotation
and default to a required str (e.g., endpoint: str) and remove the default None
so callers must pass it, updating the signature in
cuopt_web_hosted_client.__init__ (and any related docstrings or type hints in
that file) to reflect a non-optional endpoint.
python/cuopt_self_hosted/cuopt_sh_client/cuopt_self_host_client.py (1)

404-422: Missing default timeout in _make_http_request.

Ruff S113 flags requests.request without a timeout. While all current callers in this file pass timeout= via kwargs, there's no safety net if a future caller omits it. Consider adding a default timeout as a fallback.

🛡️ Suggested defensive fix
     def _make_http_request(self, method: str, url: str, **kwargs):
         ...
+        kwargs.setdefault("timeout", self.http_general_timeout)
         return requests.request(method, url, **kwargs)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/cuopt_self_hosted/cuopt_sh_client/cuopt_self_host_client.py` around
lines 404 - 422, The _make_http_request method calls requests.request without
ensuring a timeout, so add a defensive default timeout fallback (e.g.,
DEFAULT_TIMEOUT = 10) and ensure kwargs includes it before calling
requests.request; update _make_http_request to set kwargs.setdefault('timeout',
DEFAULT_TIMEOUT) (or similar) so callers that forget to pass timeout still get a
safe default while preserving any caller-supplied timeout.

Comment on lines 671 to 681
   // Initialize the dual simplex structures before we run PDLP.
   // Otherwise, CUDA API calls to the problem stream may occur in both threads and throw graph
   // capture off
-  auto barrier_handle = raft::handle_t(*op_problem.get_handle_ptr());
-  detail::problem_t<i_t, f_t> d_barrier_problem(problem);
   rmm::cuda_stream_view barrier_stream = rmm::cuda_stream_per_thread;
-  d_barrier_problem.handle_ptr = &barrier_handle;
-  raft::resource::set_cuda_stream(barrier_handle, barrier_stream);
+  auto barrier_handle = raft::handle_t(barrier_stream);
   // Make sure allocations are done on the original stream
   problem.handle_ptr->sync_stream();

   dual_simplex::user_problem_t<i_t, f_t> dual_simplex_problem =
-    cuopt_problem_to_simplex_problem<i_t, f_t>(d_barrier_problem);
+    cuopt_problem_to_simplex_problem<i_t, f_t>(&barrier_handle, problem);
   // Create a thread for dual simplex

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the actual code context at lines 671-681
cat -n cpp/src/linear_programming/solve.cu | sed -n '665,690p'

Repository: NVIDIA/cuopt

Length of output: 1512


🏁 Script executed:

# Locate user_problem_t definition and check handle_ptr usage
rg -n "struct\s+user_problem_t|class\s+user_problem_t" cpp/src/dual_simplex -A5 -B2

Repository: NVIDIA/cuopt

Length of output: 619


🏁 Script executed:

# Check where handle_ptr is used in dual_simplex components
rg -n "handle_ptr" cpp/src/dual_simplex --type cpp --type hpp --type cu -B2 -A2

Repository: NVIDIA/cuopt

Length of output: 83


🏁 Script executed:

# Check for cuBLAS/cuSPARSE handle usage
rg -n "get_cublas_handle|get_cusparse_handle|cublasHandle|cusparseHandle" cpp/src/dual_simplex --type cpp --type hpp --type cu -B2 -A2

Repository: NVIDIA/cuopt

Length of output: 83


🏁 Script executed:

# Search for barrier_problem creation and initialization
rg -n "barrier_problem" cpp/src/linear_programming/solve.cu -B3 -A3

Repository: NVIDIA/cuopt

Length of output: 749


🏁 Script executed:

# Check how handle_ptr is used in solver execution
rg -n "handle_ptr" cpp/src/dual_simplex -B2 -A2

Repository: NVIDIA/cuopt

Length of output: 39273


🏁 Script executed:

# Look at run_dual_simplex_thread implementation
rg -n "run_dual_simplex_thread" cpp/src/linear_programming -B2 -A10

Repository: NVIDIA/cuopt

Length of output: 2306


🏁 Script executed:

# Check the run_concurrent function to understand thread creation
rg -n "run_concurrent" cpp/src/linear_programming/solve.cu -B5 -A20

Repository: NVIDIA/cuopt

Length of output: 2532


Create dedicated raft::handle_t per solver thread to avoid shared cuBLAS/cuSPARSE handles.

Both dual_simplex_problem and barrier_problem reference the same barrier_handle (line 691 copies dual_simplex_problem), so the barrier and dual‑simplex threads share the same cuBLAS/cuSPARSE handles obtained via handle_ptr->get_cublas_handle() and handle_ptr->get_cusparse_handle(). These handles are not thread‑safe, and concurrent use will race and corrupt solver state.

Create a separate stream and handle for the dual‑simplex thread:

  rmm::cuda_stream_view barrier_stream = rmm::cuda_stream_per_thread;
  auto barrier_handle                  = raft::handle_t(barrier_stream);
+ rmm::cuda_stream_view dual_simplex_stream = rmm::cuda_stream_per_thread;
+ auto dual_simplex_handle                  = raft::handle_t(dual_simplex_stream);
  // Make sure allocations are done on the original stream
  problem.handle_ptr->sync_stream();

  dual_simplex::user_problem_t<i_t, f_t> dual_simplex_problem =
-   cuopt_problem_to_simplex_problem<i_t, f_t>(&barrier_handle, problem);
+   cuopt_problem_to_simplex_problem<i_t, f_t>(&dual_simplex_handle, problem);
  // Create a thread for dual simplex
  
  dual_simplex::user_problem_t<i_t, f_t> barrier_problem = dual_simplex_problem;
+ barrier_problem.handle_ptr = &barrier_handle;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/src/linear_programming/solve.cu` around lines 671 - 681,
dual_simplex_problem is being constructed with the same barrier_handle used by
the barrier thread, causing shared cuBLAS/cuSPARSE handles; create a dedicated
stream and raft::handle_t for the dual‑simplex thread and pass that handle into
cuopt_problem_to_simplex_problem when building dual_simplex_problem (e.g.,
create a new rmm::cuda_stream_view/raft::handle_t named dual_stream/dual_handle
and call cuopt_problem_to_simplex_problem(&dual_handle, problem)); ensure the
original barrier_handle continues to be used for barrier_problem/any allocations
meant for the original stream so each thread has its own handle and no
cuBLAS/cuSPARSE handles are shared.

==========================

-**NVIDIA® cuOpt™** is a GPU-accelerated optimization library that solves `Mixed Integer Linear Programming (MILP) <https://en.wikipedia.org/wiki/Linear_programming#Integer_unknowns>`_, `Linear Programming (LP) <https://en.wikipedia.org/wiki/Linear_programming>`_, and `Vehicle Routing Problems (VRP) <https://en.wikipedia.org/wiki/Vehicle_routing_problem>`_. It enables solutions for large-scale problems with millions of variables and constraints, offering seamless deployment across hybrid and multi-cloud environments.
+**NVIDIA® cuOpt™** is a GPU-accelerated optimization engine that solves `Mixed Integer Linear Programming (MILP) <https://en.wikipedia.org/wiki/Linear_programming#Integer_unknowns>`_, `Linear Programming (LP) <https://en.wikipedia.org/wiki/Linear_programming>`_, and `Vehicle Routing Problems (VRP) <https://en.wikipedia.org/wiki/Vehicle_routing_problem>`_. It enables solutions for large-scale problems with millions of variables and constraints, offering seamless deployment across hybrid and multi-cloud environments.

⚠️ Potential issue | 🟠 Major

Document the new web-hosted client and API key auth.

This PR adds a web‑hosted cuOpt client and API‑key authentication, but this doc change only tweaks terminology. Please add a short section describing web‑hosted usage (endpoint URL, API key/CUOPT_API_KEY, and 401/403 behavior) in the user‑facing docs (e.g., docs/cuopt/source/cuopt-python/quick-start.rst and the server quick‑start), or the appropriate pages.
As per coding guidelines: "Update documentation in docs/ for API changes (new parameters, return values, error codes), new public functions/classes, and changed algorithm behavior".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/cuopt/source/introduction.rst` at line 5, Add a new short "Web-hosted
client & API key authentication" section to the user-facing docs (e.g.,
cuopt-python/quick-start.rst and the server quick-start) that describes the
hosted endpoint URL pattern, how to provide the API key via the CUOPT_API_KEY
environment variable or an Authorization: Bearer <key> header, and the expected
401/403 behaviors on invalid/missing keys; ensure the text references the
CUOPT_API_KEY env var and the client/server usage patterns so users know where
to configure the key and what errors to expect.


* GPU:
-      - NVIDIA H100 SXM (compute capability >= 9.0)
+      - NVIDIA H100 SXM (compute capability >= 9.0) and above

⚠️ Potential issue | 🟡 Minor

Clarify ambiguous "and above" phrasing.

The phrase "compute capability >= 9.0" already implies "9.0 and above," so adding "and above" after the parenthetical is redundant and creates ambiguity. It's unclear whether "and above" refers to GPU models (H100 and newer) or compute capability (already stated as >= 9.0).

📝 Suggested clearer alternatives

Choose one of these clearer phrasings:

Option 1: Emphasize GPU models

-      - NVIDIA H100 SXM (compute capability >= 9.0) and above
+      - NVIDIA H100 SXM or newer (compute capability >= 9.0)

Option 2: Lead with compute capability

-      - NVIDIA H100 SXM (compute capability >= 9.0) and above
+      - GPUs with compute capability >= 9.0 (e.g., NVIDIA H100 SXM or newer)

Option 3: Keep it simple (original was clear)

-      - NVIDIA H100 SXM (compute capability >= 9.0) and above
+      - NVIDIA H100 SXM (compute capability >= 9.0)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-      - NVIDIA H100 SXM (compute capability >= 9.0) and above
+      - NVIDIA H100 SXM or newer (compute capability >= 9.0)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/cuopt/source/system-requirements.rst` at line 59, The line "NVIDIA H100
SXM (compute capability >= 9.0) and above" is ambiguous and redundant; remove
the trailing "and above" or rephrase to make clear whether you mean newer GPU
models or a compute capability threshold—for example change it to either "NVIDIA
H100 SXM (compute capability >= 9.0)" or "NVIDIA H100 SXM and newer (compute
capability >= 9.0)"; update the sentence containing the exact phrase to use one
of these clearer options.

Comment on lines +196 to +214
        # Add authentication headers
        headers = kwargs.get("headers", {}).copy()
        headers.update(self._get_auth_headers())
        kwargs["headers"] = headers

        # Make request
        response = requests.request(method, url, **kwargs)

        # Handle authentication errors
        if response.status_code == 401:
            raise ValueError(
                "Authentication failed. Please check your API key."
            )
        elif response.status_code == 403:
            raise ValueError(
                "Access forbidden. Please check your permissions."
            )

        return response

⚠️ Potential issue | 🟡 Minor

Auth errors bypass parent's _handle_request_exception context.

When _make_http_request raises ValueError on 401/403, it short-circuits the parent's error handling in methods like _send_request and _poll_request which would normally include the reqId in the error message. Users debugging failed requests won't see which request ID failed.

Consider including request context in the auth error messages, or letting callers handle 401/403 more gracefully.
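
One way to do this is a small custom exception that carries the request id; a sketch (the exception name and the places reqId is looked up are assumptions):

    class CuOptAuthError(RuntimeError):
        """Authentication/authorization failure that preserves the request id for debugging."""

        def __init__(self, message: str, req_id: str | None = None):
            super().__init__(f"{message} (reqId={req_id})" if req_id else message)
            self.req_id = req_id


    # Inside _make_http_request, the 401/403 branch could then become something like:
    #     req_id = (kwargs.get("json") or {}).get("reqId") or headers.get("X-Request-Id")
    #     raise CuOptAuthError("Authentication failed. Please check your API key.", req_id)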

🧰 Tools
🪛 Ruff (0.15.1)

[error] 202-202: Probable use of requests call without timeout

(S113)


[warning] 206-208: Avoid specifying long messages outside the exception class

(TRY003)


[warning] 210-212: Avoid specifying long messages outside the exception class

(TRY003)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/cuopt_self_hosted/cuopt_sh_client/cuopt_web_hosted_client.py` around
lines 196 - 214, The auth error handling in _make_http_request currently raises
ValueError for 401/403 and bypasses parent handlers like
_send_request/_poll_request that attach reqId; modify _make_http_request to
extract a request id (e.g. req_id = kwargs.get("json", {}).get("reqId") or
kwargs.get("params", {}).get("reqId") or headers.get("X-Request-Id")) and
include it in the raised error message (or raise a small custom exception that
stores req_id) so that callers (_send_request, _poll_request) and logs will
contain the reqId context when authentication fails.

Comment on lines +89 to +94
        # Test no authentication
        client = CuOptServiceWebHostedClient(
            endpoint="https://api.nvidia.com/cuopt/v1"
        )
        headers = client._get_auth_headers()
        assert len(headers) == 0

⚠️ Potential issue | 🟠 Major

Test is not isolated from the CUOPT_API_KEY environment variable.

If CUOPT_API_KEY is set in the environment (e.g., CI), this "no authentication" test will fail because the constructor falls back to os.getenv("CUOPT_API_KEY"). Patch the env to ensure the variable is unset.

🔧 Fix
+    @patch.dict(os.environ, {}, clear=True)
     def test_authentication_from_parameters(self):
         """Test authentication setup from parameters."""
         # Test API key (sent as Bearer token)

Or more targeted:

         # Test no authentication
-        client = CuOptServiceWebHostedClient(
-            endpoint="https://api.nvidia.com/cuopt/v1"
-        )
-        headers = client._get_auth_headers()
-        assert len(headers) == 0
+        with patch.dict(os.environ, {"CUOPT_API_KEY": ""}, clear=False):
+            os.environ.pop("CUOPT_API_KEY", None)
+            client = CuOptServiceWebHostedClient(
+                endpoint="https://api.nvidia.com/cuopt/v1"
+            )
+            headers = client._get_auth_headers()
+            assert len(headers) == 0

As per coding guidelines, "Ensure test isolation: prevent GPU state, cached memory, and global variables from leaking between test cases; verify each test independently initializes its environment."
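
Equivalently, using pytest's monkeypatch fixture (the test name is illustrative):

    from cuopt_sh_client import CuOptServiceWebHostedClient


    def test_no_authentication(monkeypatch):
        # Guarantee CUOPT_API_KEY from CI or a developer shell cannot leak into this test
        monkeypatch.delenv("CUOPT_API_KEY", raising=False)
        client = CuOptServiceWebHostedClient(
            endpoint="https://api.nvidia.com/cuopt/v1"
        )
        assert client._get_auth_headers() == {}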

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@python/cuopt_self_hosted/tests/test_web_hosted_client.py` around lines 89 -
94, The test relies on CuOptServiceWebHostedClient picking up CUOPT_API_KEY from
the environment, so make the test explicitly unset that env var before
constructing the client; use pytest's monkeypatch (e.g.,
monkeypatch.delenv("CUOPT_API_KEY", raising=False)) or temporarily pop
os.environ in the test before calling CuOptServiceWebHostedClient(...) and then
call client._get_auth_headers() to assert headers are empty, ensuring the env
var is restored/isolated by using the fixture so no global state leaks between
tests.

Labels

  • backlog
  • improvement: Improves an existing functionality
  • non-breaking: Introduces a non-breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants