Web hosted client (API key based access) #480
atomic wants to merge 8 commits into NVIDIA:release/26.02 from
Conversation
📝 Walkthrough

This PR introduces a web-hosted cuOpt client interface with API key authentication for cloud deployment, refactors C++ linear programming code to explicitly pass problem handles, adds memory exception handling and pointer mode configuration, updates documentation, and includes comprehensive tests for the new client interface.

Changes

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes

🚥 Pre-merge checks: ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
🧪 Generate unit tests (beta)
Tip: Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord.
/ok to test e595eab

@CodeRabbit review and summarize status
Tip: For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed: (length of output: 5220)
🏁 Script executed: (length of output: 61283)

@rgsl888prabhu I'll review the PR and provide a status summary.

📊 PR Status Summary

Current State: OPEN (Merge Blocked - Waiting for CI checks)

CI Checks Status:
Changes: 4 files, +527/-10 lines
🔍 Code Review

✅ Strengths
🔧 Suggestions for Improvement
📝 Minor Observations
🎯 Overall Assessment

Code Quality: ⭐⭐⭐⭐½ (4.5/5)
Readiness:
Recommendation: Once CI checks pass (check-style and wheel builds), this PR should be ready to merge pending maintainer review of the architectural approach. Would you like me to elaborate on any of the suggestions or run additional analysis on specific aspects of the code?
Gentle ping @atomic |
<!-- Thank you for contributing to cuOpt :) Here are some guidelines to help the review process go smoothly. Many thanks in advance for your cooperation! Note: The pull request title will be included in the CHANGELOG. -->

## Description

Update introduction to use engine instead of library

## Checklist

- [x] I am familiar with the [Contributing Guidelines](https://github.com/NVIDIA/cuopt/blob/HEAD/CONTRIBUTING.md).
- Testing
  - [ ] New or existing tests cover these changes
  - [ ] Added tests
  - [ ] Created an issue to follow-up
  - [x] NA
- Documentation
  - [ ] The documentation is up to date with these changes
  - [x] Added new documentation
  - [ ] NA
Hot fix with the following commits:

- NVIDIA#541
- NVIDIA#544
- NVIDIA#550
🔔 Hi @anandhkb @atomic, this pull request has had no activity for 7 days. Please update or let us know if it can be closed. Thank you! If this is an "epic" issue, then please add the "epic" label to this issue.
2 similar comments
Moving this out of the 25.12 release. If this is incorrect, please bring it back in.
🔔 Hi @anandhkb @atomic, this pull request has had no activity for 7 days. Please update or let us know if it can be closed. Thank you! If this is an "epic" issue, then please add the "epic" label to this issue.
5 similar comments
@tmckayus for viz!
🔔 Hi @anandhkb @atomic, this pull request has had no activity for 7 days. Please update or let us know if it can be closed. Thank you! If this is an "epic" issue, then please add the "epic" label to this issue.
1 similar comment
acknowledge, will rebase and re-work on the MR |
e595eab to fc8dea3
remove the parameter bearer token as it can be inferred
provide the example as gist instead
Remove suspicious hostname pattern validation that was overly heuristic-based. Simplify invalid URL detection to rely on urlparse hostname check. Fix Black/isort/flake8 formatting issues. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Tony Salim <tsalim@nvidia.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Tony Salim <tsalim@nvidia.com>
fc8dea3 to 6212169
Actionable comments posted: 5
🧹 Nitpick comments (4)
benchmarks/linear_programming/utils/get_datasets.py (1)
695-707: `and False` is a poor mechanism to disable code — prefer removing or gating it properly.

This creates permanently dead code that will confuse future readers and may silently break the netlib dataset pipeline. Netlib-type archives are decompressed but never converted to `.mps` via `emps`, yet `download_dataset` (line 715) checks for `{name}.mps` existence — so netlib datasets may fail to be recognized as already downloaded on re-runs, or consumers expecting `.mps` files won't find them.

Consider one of:

- Remove the dead block entirely if `emps` is no longer needed.
- Gate it behind an explicit CLI flag or environment variable if it's meant to be re-enabled later.
- At minimum, add a `return` or `raise` for netlib type so the broken state is explicit.

Option 1: Remove dead code
```diff
-    # download emps and compile
-    # Disable emps for now
-    if type == "netlib" and False:
-        url = MittelmannInstances["emps"]
-        file = os.path.join(dir, "emps.c")
-        download(url, file)
-        subprocess.run(f"cd {dir} && gcc -Wno-implicit-int emps.c -o emps",
-                       shell=True)
-        # determine output file and run emps
-        subprocess.run(f"cd {dir} && ./emps {unzippedfile} > {outfile}",
-                       shell=True)
-        # cleanup emps and emps.c
-        subprocess.run(f"rm -rf {dir}/emps*",
-                       shell=True)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@benchmarks/linear_programming/utils/get_datasets.py` around lines 695 - 707, Remove the dead "if type == 'netlib' and False:" block and any references to the disabled emps conversion (emps.c, MittelmannInstances['emps'], running ./emps) from get_datasets.py; if netlib entries must produce .mps files, implement a clear path in the download_dataset logic (e.g., in download_dataset or the calling code) to either generate .mps externally or mark netlib datasets as already-unconvertible so the "{name}.mps" existence check is consistent—alternatively, replace the removed block with an explicit, documented gate (ENV flag or CLI option) if you intend to optionally re-enable emps in future.

python/cuopt_self_hosted/cuopt_sh_client/cuopt_web_hosted_client.py (2)
119-126: Add `stacklevel=2` to `warnings.warn`.

Without `stacklevel=2`, the warning points to this internal method rather than the caller's code, making it harder for users to locate the source.

🔧 Fix

```diff
 warnings.warn(
     "No protocol specified in endpoint"
     f" '{endpoint}', assuming https://",
     UserWarning,
+    stacklevel=2,
 )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@python/cuopt_self_hosted/cuopt_sh_client/cuopt_web_hosted_client.py` around lines 119 - 126, The warnings.warn call that normalizes the endpoint protocol should include stacklevel=2 so the warning points to the caller instead of this helper; in cuopt_web_hosted_client.py update the warnings.warn invocation that warns about a missing protocol for variable endpoint to pass stacklevel=2 (e.g., warnings.warn(..., UserWarning, stacklevel=2)) so users see the correct call site.
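For reference, the effect of `stacklevel` is easy to demonstrate in isolation. The sketch below is not code from this PR; the `normalize` helper is a hypothetical stand-in for the client's endpoint normalization, used only to show that `stacklevel=2` attributes the warning to the caller's line rather than to the helper itself:

```python
import warnings


def normalize(endpoint: str) -> str:
    # With stacklevel=2 the reported warning location is the line that
    # called normalize(), not this warnings.warn() line.
    if "://" not in endpoint:
        warnings.warn(
            f"No protocol specified in endpoint '{endpoint}', assuming https://",
            UserWarning,
            stacklevel=2,
        )
        return "https://" + endpoint
    return endpoint


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = normalize("api.example.com")  # warning is attributed to this call site

print(result)       # https://api.example.com
print(len(caught))  # 1
```

Running this under `python -W always` shows the warning pointing at the `normalize("api.example.com")` call, which is exactly the behavior the review asks for.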
59-66: `endpoint` typed as `Optional[str]` but always required.

The constructor immediately raises if `endpoint` is falsy. Making the type non-optional (just `str`) would better communicate intent and let type checkers catch misuse.
Verify each finding against the current code and only fix it if needed. In `@python/cuopt_self_hosted/cuopt_sh_client/cuopt_web_hosted_client.py` around lines 59 - 66, The __init__ signature incorrectly types endpoint as Optional[str] while the constructor requires it; change the parameter annotation and default to a required str (e.g., endpoint: str) and remove the default None so callers must pass it, updating the signature in cuopt_web_hosted_client.__init__ (and any related docstrings or type hints in that file) to reflect a non-optional endpoint.

python/cuopt_self_hosted/cuopt_sh_client/cuopt_self_host_client.py (1)
404-422: Missing default timeout in `_make_http_request`.

Ruff S113 flags `requests.request` without a timeout. While all current callers in this file pass `timeout=` via kwargs, there's no safety net if a future caller omits it. Consider adding a default timeout as a fallback.

🛡️ Suggested defensive fix

```diff
 def _make_http_request(self, method: str, url: str, **kwargs):
     ...
+    kwargs.setdefault("timeout", self.http_general_timeout)
     return requests.request(method, url, **kwargs)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@python/cuopt_self_hosted/cuopt_sh_client/cuopt_self_host_client.py` around lines 404 - 422, The _make_http_request method calls requests.request without ensuring a timeout, so add a defensive default timeout fallback (e.g., DEFAULT_TIMEOUT = 10) and ensure kwargs includes it before calling requests.request; update _make_http_request to set kwargs.setdefault('timeout', DEFAULT_TIMEOUT) (or similar) so callers that forget to pass timeout still get a safe default while preserving any caller-supplied timeout.
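The `setdefault` pattern suggested above can be verified without a network. This standalone sketch fakes the transport; `DEFAULT_TIMEOUT` and `fake_send` are illustrative and are not the client's real names or values:

```python
DEFAULT_TIMEOUT = 30  # illustrative fallback, not the client's actual default


def make_http_request(send, method: str, url: str, **kwargs):
    # Callers that forget timeout= still get a safe default,
    # while an explicitly supplied timeout is preserved untouched.
    kwargs.setdefault("timeout", DEFAULT_TIMEOUT)
    return send(method, url, **kwargs)


def fake_send(method, url, **kwargs):
    # Stand-in for requests.request that just reports the timeout it received.
    return kwargs["timeout"]


print(make_http_request(fake_send, "GET", "https://example.com"))             # 30
print(make_http_request(fake_send, "GET", "https://example.com", timeout=5))  # 5
```

Because `setdefault` only writes the key when it is absent, this is strictly a safety net: no existing caller's behavior changes.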
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/src/linear_programming/solve.cu`:
- Around line 671-681: dual_simplex_problem is being constructed with the same
barrier_handle used by the barrier thread, causing shared cuBLAS/cuSPARSE
handles; create a dedicated stream and raft::handle_t for the dual‑simplex
thread and pass that handle into cuopt_problem_to_simplex_problem when building
dual_simplex_problem (e.g., create a new rmm::cuda_stream_view/raft::handle_t
named dual_stream/dual_handle and call
cuopt_problem_to_simplex_problem(&dual_handle, problem)); ensure the original
barrier_handle continues to be used for barrier_problem/any allocations meant
for the original stream so each thread has its own handle and no cuBLAS/cuSPARSE
handles are shared.
In `@docs/cuopt/source/introduction.rst`:
- Line 5: Add a new short "Web-hosted client & API key authentication" section
to the user-facing docs (e.g., cuopt-python/quick-start.rst and the server
quick-start) that describes the hosted endpoint URL pattern, how to provide the
API key via the CUOPT_API_KEY environment variable or an Authorization: Bearer
<key> header, and the expected 401/403 behaviors on invalid/missing keys; ensure
the text references the CUOPT_API_KEY env var and the client/server usage
patterns so users know where to configure the key and what errors to expect.
In `@docs/cuopt/source/system-requirements.rst`:
- Line 59: The line "NVIDIA H100 SXM (compute capability >= 9.0) and above" is
ambiguous and redundant; remove the trailing "and above" or rephrase to make
clear whether you mean newer GPU models or a compute capability threshold—for
example change it to either "NVIDIA H100 SXM (compute capability >= 9.0)" or
"NVIDIA H100 SXM and newer (compute capability >= 9.0)"; update the sentence
containing the exact phrase to use one of these clearer options.
In `@python/cuopt_self_hosted/cuopt_sh_client/cuopt_web_hosted_client.py`:
- Around line 196-214: The auth error handling in _make_http_request currently
raises ValueError for 401/403 and bypasses parent handlers like
_send_request/_poll_request that attach reqId; modify _make_http_request to
extract a request id (e.g. req_id = kwargs.get("json", {}).get("reqId") or
kwargs.get("params", {}).get("reqId") or headers.get("X-Request-Id")) and
include it in the raised error message (or raise a small custom exception that
stores req_id) so that callers (_send_request, _poll_request) and logs will
contain the reqId context when authentication fails.
In `@python/cuopt_self_hosted/tests/test_web_hosted_client.py`:
- Around line 89-94: The test relies on CuOptServiceWebHostedClient picking up
CUOPT_API_KEY from the environment, so make the test explicitly unset that env
var before constructing the client; use pytest's monkeypatch (e.g.,
monkeypatch.delenv("CUOPT_API_KEY", raising=False)) or temporarily pop
os.environ in the test before calling CuOptServiceWebHostedClient(...) and then
call client._get_auth_headers() to assert headers are empty, ensuring the env
var is restored/isolated by using the fixture so no global state leaks between
tests.
---
Nitpick comments:
In `@benchmarks/linear_programming/utils/get_datasets.py`:
- Around line 695-707: Remove the dead "if type == 'netlib' and False:" block
and any references to the disabled emps conversion (emps.c,
MittelmannInstances['emps'], running ./emps) from get_datasets.py; if netlib
entries must produce .mps files, implement a clear path in the download_dataset
logic (e.g., in download_dataset or the calling code) to either generate .mps
externally or mark netlib datasets as already-unconvertible so the "{name}.mps"
existence check is consistent—alternatively, replace the removed block with an
explicit, documented gate (ENV flag or CLI option) if you intend to optionally
re-enable emps in future.
In `@python/cuopt_self_hosted/cuopt_sh_client/cuopt_self_host_client.py`:
- Around line 404-422: The _make_http_request method calls requests.request
without ensuring a timeout, so add a defensive default timeout fallback (e.g.,
DEFAULT_TIMEOUT = 10) and ensure kwargs includes it before calling
requests.request; update _make_http_request to set kwargs.setdefault('timeout',
DEFAULT_TIMEOUT) (or similar) so callers that forget to pass timeout still get a
safe default while preserving any caller-supplied timeout.
In `@python/cuopt_self_hosted/cuopt_sh_client/cuopt_web_hosted_client.py`:
- Around line 119-126: The warnings.warn call that normalizes the endpoint
protocol should include stacklevel=2 so the warning points to the caller instead
of this helper; in cuopt_web_hosted_client.py update the warnings.warn
invocation that warns about a missing protocol for variable endpoint to pass
stacklevel=2 (e.g., warnings.warn(..., UserWarning, stacklevel=2)) so users see
the correct call site.
- Around line 59-66: The __init__ signature incorrectly types endpoint as
Optional[str] while the constructor requires it; change the parameter annotation
and default to a required str (e.g., endpoint: str) and remove the default None
so callers must pass it, updating the signature in
cuopt_web_hosted_client.__init__ (and any related docstrings or type hints in
that file) to reflect a non-optional endpoint.
```diff
   // Initialize the dual simplex structures before we run PDLP.
   // Otherwise, CUDA API calls to the problem stream may occur in both threads and throw graph
   // capture off
-  auto barrier_handle = raft::handle_t(*op_problem.get_handle_ptr());
-  detail::problem_t<i_t, f_t> d_barrier_problem(problem);
   rmm::cuda_stream_view barrier_stream = rmm::cuda_stream_per_thread;
-  d_barrier_problem.handle_ptr = &barrier_handle;
-  raft::resource::set_cuda_stream(barrier_handle, barrier_stream);
+  auto barrier_handle = raft::handle_t(barrier_stream);
   // Make sure allocations are done on the original stream
   problem.handle_ptr->sync_stream();

   dual_simplex::user_problem_t<i_t, f_t> dual_simplex_problem =
-    cuopt_problem_to_simplex_problem<i_t, f_t>(d_barrier_problem);
+    cuopt_problem_to_simplex_problem<i_t, f_t>(&barrier_handle, problem);
   // Create a thread for dual simplex
```
🧩 Analysis chain
🏁 Script executed:
# First, let's look at the actual code context at lines 671-681
cat -n cpp/src/linear_programming/solve.cu | sed -n '665,690p'

Repository: NVIDIA/cuopt
Length of output: 1512
🏁 Script executed:
# Locate user_problem_t definition and check handle_ptr usage
rg -n "struct\s+user_problem_t|class\s+user_problem_t" cpp/src/dual_simplex -A5 -B2

Repository: NVIDIA/cuopt
Length of output: 619
🏁 Script executed:
# Check where handle_ptr is used in dual_simplex components
rg -n "handle_ptr" cpp/src/dual_simplex --type cpp --type hpp --type cu -B2 -A2

Repository: NVIDIA/cuopt
Length of output: 83
🏁 Script executed:
# Check for cuBLAS/cuSPARSE handle usage
rg -n "get_cublas_handle|get_cusparse_handle|cublasHandle|cusparseHandle" cpp/src/dual_simplex --type cpp --type hpp --type cu -B2 -A2

Repository: NVIDIA/cuopt
Length of output: 83
🏁 Script executed:
# Search for barrier_problem creation and initialization
rg -n "barrier_problem" cpp/src/linear_programming/solve.cu -B3 -A3

Repository: NVIDIA/cuopt
Length of output: 749
🏁 Script executed:
# Check how handle_ptr is used in solver execution
rg -n "handle_ptr" cpp/src/dual_simplex -B2 -A2

Repository: NVIDIA/cuopt
Length of output: 39273
🏁 Script executed:
# Look at run_dual_simplex_thread implementation
rg -n "run_dual_simplex_thread" cpp/src/linear_programming -B2 -A10

Repository: NVIDIA/cuopt
Length of output: 2306
🏁 Script executed:
# Check the run_concurrent function to understand thread creation
rg -n "run_concurrent" cpp/src/linear_programming/solve.cu -B5 -A20

Repository: NVIDIA/cuopt
Length of output: 2532
Create dedicated raft::handle_t per solver thread to avoid shared cuBLAS/cuSPARSE handles.
Both dual_simplex_problem and barrier_problem reference the same barrier_handle (line 691 copies dual_simplex_problem), so the barrier and dual‑simplex threads share the same cuBLAS/cuSPARSE handles obtained via handle_ptr->get_cublas_handle() and handle_ptr->get_cusparse_handle(). These handles are not thread‑safe, and concurrent use will race and corrupt solver state.
Create a separate stream and handle for the dual‑simplex thread:
rmm::cuda_stream_view barrier_stream = rmm::cuda_stream_per_thread;
auto barrier_handle = raft::handle_t(barrier_stream);
+ rmm::cuda_stream_view dual_simplex_stream = rmm::cuda_stream_per_thread;
+ auto dual_simplex_handle = raft::handle_t(dual_simplex_stream);
// Make sure allocations are done on the original stream
problem.handle_ptr->sync_stream();
dual_simplex::user_problem_t<i_t, f_t> dual_simplex_problem =
- cuopt_problem_to_simplex_problem<i_t, f_t>(&barrier_handle, problem);
+ cuopt_problem_to_simplex_problem<i_t, f_t>(&dual_simplex_handle, problem);
// Create a thread for dual simplex
dual_simplex::user_problem_t<i_t, f_t> barrier_problem = dual_simplex_problem;
+ barrier_problem.handle_ptr = &barrier_handle;

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/linear_programming/solve.cu` around lines 671 - 681,
dual_simplex_problem is being constructed with the same barrier_handle used by
the barrier thread, causing shared cuBLAS/cuSPARSE handles; create a dedicated
stream and raft::handle_t for the dual‑simplex thread and pass that handle into
cuopt_problem_to_simplex_problem when building dual_simplex_problem (e.g.,
create a new rmm::cuda_stream_view/raft::handle_t named dual_stream/dual_handle
and call cuopt_problem_to_simplex_problem(&dual_handle, problem)); ensure the
original barrier_handle continues to be used for barrier_problem/any allocations
meant for the original stream so each thread has its own handle and no
cuBLAS/cuSPARSE handles are shared.
```diff
 ==========================

-**NVIDIA® cuOpt™** is a GPU-accelerated optimization library that solves `Mixed Integer Linear Programming (MILP) <https://en.wikipedia.org/wiki/Linear_programming#Integer_unknowns>`_, `Linear Programming (LP) <https://en.wikipedia.org/wiki/Linear_programming>`_, and `Vehicle Routing Problems (VRP) <https://en.wikipedia.org/wiki/Vehicle_routing_problem>`_. It enables solutions for large-scale problems with millions of variables and constraints, offering seamless deployment across hybrid and multi-cloud environments.
+**NVIDIA® cuOpt™** is a GPU-accelerated optimization engine that solves `Mixed Integer Linear Programming (MILP) <https://en.wikipedia.org/wiki/Linear_programming#Integer_unknowns>`_, `Linear Programming (LP) <https://en.wikipedia.org/wiki/Linear_programming>`_, and `Vehicle Routing Problems (VRP) <https://en.wikipedia.org/wiki/Vehicle_routing_problem>`_. It enables solutions for large-scale problems with millions of variables and constraints, offering seamless deployment across hybrid and multi-cloud environments.
```
Document the new web-hosted client and API key auth.
This PR adds a web‑hosted cuOpt client and API‑key authentication, but this doc change only tweaks terminology. Please add a short section describing web‑hosted usage (endpoint URL, API key/CUOPT_API_KEY, and 401/403 behavior) in the user‑facing docs (e.g., docs/cuopt/source/cuopt-python/quick-start.rst and the server quick‑start), or the appropriate pages.
As per coding guidelines: "Update documentation in docs/ for API changes (new parameters, return values, error codes), new public functions/classes, and changed algorithm behavior".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/cuopt/source/introduction.rst` at line 5, Add a new short "Web-hosted
client & API key authentication" section to the user-facing docs (e.g.,
cuopt-python/quick-start.rst and the server quick-start) that describes the
hosted endpoint URL pattern, how to provide the API key via the CUOPT_API_KEY
environment variable or an Authorization: Bearer <key> header, and the expected
401/403 behaviors on invalid/missing keys; ensure the text references the
CUOPT_API_KEY env var and the client/server usage patterns so users know where
to configure the key and what errors to expect.
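The key-resolution behavior the docs should describe can be sketched in a few lines. This is an illustrative pattern only, not the client's actual implementation: `get_auth_headers` is a hypothetical stand-in, while `CUOPT_API_KEY` and the `Authorization: Bearer <key>` header are the mechanisms this PR documents:

```python
import os


def get_auth_headers(api_key=None):
    # An explicitly passed key wins; otherwise fall back to the
    # CUOPT_API_KEY environment variable. No key means no auth header.
    key = api_key or os.getenv("CUOPT_API_KEY")
    return {"Authorization": f"Bearer {key}"} if key else {}


os.environ.pop("CUOPT_API_KEY", None)
print(get_auth_headers())               # {}

os.environ["CUOPT_API_KEY"] = "sk-test"
print(get_auth_headers())               # {'Authorization': 'Bearer sk-test'}
print(get_auth_headers(api_key="abc"))  # {'Authorization': 'Bearer abc'}
```

A docs section built around this pattern would then only need to add the hosted endpoint URL format and the 401/403 responses returned for missing or invalid keys.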
```diff
 * GPU:
-  - NVIDIA H100 SXM (compute capability >= 9.0)
+  - NVIDIA H100 SXM (compute capability >= 9.0) and above
```
Clarify ambiguous "and above" phrasing.
The phrase "compute capability >= 9.0" already implies "9.0 and above," so adding "and above" after the parenthetical is redundant and creates ambiguity. It's unclear whether "and above" refers to GPU models (H100 and newer) or compute capability (already stated as >= 9.0).
📝 Suggested clearer alternatives
Choose one of these clearer phrasings:
Option 1: Emphasize GPU models
- - NVIDIA H100 SXM (compute capability >= 9.0) and above
+ - NVIDIA H100 SXM or newer (compute capability >= 9.0)

Option 2: Lead with compute capability

- - NVIDIA H100 SXM (compute capability >= 9.0) and above
+ - GPUs with compute capability >= 9.0 (e.g., NVIDIA H100 SXM or newer)

Option 3: Keep it simple (original was clear)

- - NVIDIA H100 SXM (compute capability >= 9.0) and above
+ - NVIDIA H100 SXM (compute capability >= 9.0)

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```diff
- - NVIDIA H100 SXM (compute capability >= 9.0) and above
+ - NVIDIA H100 SXM or newer (compute capability >= 9.0)
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@docs/cuopt/source/system-requirements.rst` at line 59, The line "NVIDIA H100
SXM (compute capability >= 9.0) and above" is ambiguous and redundant; remove
the trailing "and above" or rephrase to make clear whether you mean newer GPU
models or a compute capability threshold—for example change it to either "NVIDIA
H100 SXM (compute capability >= 9.0)" or "NVIDIA H100 SXM and newer (compute
capability >= 9.0)"; update the sentence containing the exact phrase to use one
of these clearer options.
```python
# Add authentication headers
headers = kwargs.get("headers", {}).copy()
headers.update(self._get_auth_headers())
kwargs["headers"] = headers

# Make request
response = requests.request(method, url, **kwargs)

# Handle authentication errors
if response.status_code == 401:
    raise ValueError(
        "Authentication failed. Please check your API key."
    )
elif response.status_code == 403:
    raise ValueError(
        "Access forbidden. Please check your permissions."
    )

return response
```
Auth errors bypass parent's _handle_request_exception context.
When _make_http_request raises ValueError on 401/403, it short-circuits the parent's error handling in methods like _send_request and _poll_request which would normally include the reqId in the error message. Users debugging failed requests won't see which request ID failed.
Consider including request context in the auth error messages, or letting callers handle 401/403 more gracefully.
🧰 Tools
🪛 Ruff (0.15.1)
[error] 202-202: Probable use of requests call without timeout
(S113)
[warning] 206-208: Avoid specifying long messages outside the exception class
(TRY003)
[warning] 210-212: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@python/cuopt_self_hosted/cuopt_sh_client/cuopt_web_hosted_client.py` around
lines 196 - 214, The auth error handling in _make_http_request currently raises
ValueError for 401/403 and bypasses parent handlers like
_send_request/_poll_request that attach reqId; modify _make_http_request to
extract a request id (e.g. req_id = kwargs.get("json", {}).get("reqId") or
kwargs.get("params", {}).get("reqId") or headers.get("X-Request-Id")) and
include it in the raised error message (or raise a small custom exception that
stores req_id) so that callers (_send_request, _poll_request) and logs will
contain the reqId context when authentication fails.
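One way to carry the request id through the failure, as the comment suggests, is a small custom exception. The sketch below is a standalone illustration of the idea; `CuOptAuthError` and `check_auth` are hypothetical names, not part of the PR:

```python
class CuOptAuthError(ValueError):
    """Auth failure that carries the request id for log correlation."""

    def __init__(self, message, req_id=None):
        self.req_id = req_id
        suffix = f" (reqId: {req_id})" if req_id else ""
        super().__init__(message + suffix)


def check_auth(status_code, **kwargs):
    # Pull a request id from wherever the caller may have supplied it.
    req_id = (
        (kwargs.get("json") or {}).get("reqId")
        or (kwargs.get("params") or {}).get("reqId")
        or (kwargs.get("headers") or {}).get("X-Request-Id")
    )
    if status_code == 401:
        raise CuOptAuthError(
            "Authentication failed. Please check your API key.", req_id
        )
    if status_code == 403:
        raise CuOptAuthError(
            "Access forbidden. Please check your permissions.", req_id
        )


try:
    check_auth(401, params={"reqId": "abc-123"})
except CuOptAuthError as e:
    print(e)         # Authentication failed. Please check your API key. (reqId: abc-123)
    print(e.req_id)  # abc-123
```

Because the subclass still derives from `ValueError`, existing callers that catch `ValueError` keep working, while callers that want the id can read `e.req_id` directly.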
```python
# Test no authentication
client = CuOptServiceWebHostedClient(
    endpoint="https://api.nvidia.com/cuopt/v1"
)
headers = client._get_auth_headers()
assert len(headers) == 0
```
Test is not isolated from the CUOPT_API_KEY environment variable.
If CUOPT_API_KEY is set in the environment (e.g., CI), this "no authentication" test will fail because the constructor falls back to os.getenv("CUOPT_API_KEY"). Patch the env to ensure the variable is unset.
🔧 Fix
```diff
+@patch.dict(os.environ, {}, clear=True)
 def test_authentication_from_parameters(self):
     """Test authentication setup from parameters."""
     # Test API key (sent as Bearer token)
```

Or more targeted:
# Test no authentication
- client = CuOptServiceWebHostedClient(
- endpoint="https://api.nvidia.com/cuopt/v1"
- )
- headers = client._get_auth_headers()
- assert len(headers) == 0
+ with patch.dict(os.environ, {"CUOPT_API_KEY": ""}, clear=False):
+ os.environ.pop("CUOPT_API_KEY", None)
+ client = CuOptServiceWebHostedClient(
+ endpoint="https://api.nvidia.com/cuopt/v1"
+ )
+ headers = client._get_auth_headers()
+ assert len(headers) == 0

As per coding guidelines, "Ensure test isolation: prevent GPU state, cached memory, and global variables from leaking between test cases; verify each test independently initializes its environment."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@python/cuopt_self_hosted/tests/test_web_hosted_client.py` around lines 89 -
94, The test relies on CuOptServiceWebHostedClient picking up CUOPT_API_KEY from
the environment, so make the test explicitly unset that env var before
constructing the client; use pytest's monkeypatch (e.g.,
monkeypatch.delenv("CUOPT_API_KEY", raising=False)) or temporarily pop
os.environ in the test before calling CuOptServiceWebHostedClient(...) and then
call client._get_auth_headers() to assert headers are empty, ensuring the env
var is restored/isolated by using the fixture so no global state leaks between
tests.
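The isolation fix above can be demonstrated with stdlib tooling alone. In this sketch, `FakeClient` is an illustrative stand-in for `CuOptServiceWebHostedClient`'s env-var fallback (not the real class), and `unittest.mock.patch.dict` plays the role the review also suggests for pytest's `monkeypatch`:

```python
import os
from unittest.mock import patch


class FakeClient:
    # Illustrative stand-in: mirrors the described fallback to CUOPT_API_KEY.
    def __init__(self, endpoint, api_key=None):
        self.endpoint = endpoint
        self.api_key = api_key or os.getenv("CUOPT_API_KEY")

    def _get_auth_headers(self):
        return {"Authorization": f"Bearer {self.api_key}"} if self.api_key else {}


# Isolate from any CUOPT_API_KEY set in the surrounding environment (e.g. CI);
# patch.dict restores the original environment when the block exits.
with patch.dict(os.environ, {}, clear=True):
    no_auth = FakeClient("https://api.nvidia.com/cuopt/v1")._get_auth_headers()

with patch.dict(os.environ, {"CUOPT_API_KEY": "sk-test"}, clear=True):
    with_auth = FakeClient("https://api.nvidia.com/cuopt/v1")._get_auth_headers()

print(no_auth)    # {}
print(with_auth)  # {'Authorization': 'Bearer sk-test'}
```

Either `patch.dict` or `monkeypatch.delenv("CUOPT_API_KEY", raising=False)` gives the same guarantee: the "no authentication" test passes regardless of what the host environment exports.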
Description

- `CuOptServiceWebHostedClient` for connecting to web-hosted cuOpt services (e.g. NVIDIA API endpoints) with endpoint URL and API key authentication
- `create_client()` factory function that returns the appropriate client based on parameters
- `CuOptServiceSelfHostClient` extended with endpoint URL parsing, auth headers, and URL construction for web-hosted APIs

Issue

Closes #479
Tested with NVIDIA's Build.com CuOpt model
Test script (excluded from the PR as it is an adhoc script): https://gist.github.com/atomic/9c761bd4d64d62bc039ecf88fe28fca0#file-test_cuopt_nvidia_build_com-py
export CUOPT_API_KEY=...
python3 test_cuopt_nvidia_build_com.py

Checklist
Summary by CodeRabbit
New Features
Bug Fixes
Documentation
Tests