Add: support trt-rtx-abi ep #1783
| Original file line number | Diff line number | Diff line change | ||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -314,6 +314,24 @@ def _check_for_nv_tensorrt_rtx_libs(): | |||||||||||||
| return found | ||||||||||||||
|
|
||||||||||||||
|
|
||||||||||||||
| def _check_for_nv_tensorrt_rtx_abi_libs(ep_path: str): | ||||||||||||||
| logger.info("Checking for NvTensorRtRtx ABI EP library") | ||||||||||||||
| if not ep_path: | ||||||||||||||
| raise FileNotFoundError("Need to provide abi_ep_path to use NvTensorRtRtx-abi") | ||||||||||||||
| if not os.path.isfile(ep_path): | ||||||||||||||
| raise FileNotFoundError(f"NvTensorRtRtx ABI EP library not found: {ep_path}") | ||||||||||||||
|
|
||||||||||||||
| ort.register_execution_provider_library("NvTensorRTRTXExecutionProvider", ep_path) | ||||||||||||||
| return True | ||||||||||||||
|
Comment on lines +317 to +325
Contributor
🧩 Analysis chain
🏁 Script executed:
# First, let's examine the file and the exact code at the specified lines
cd modelopt && cat -n onnx/quantization/ort_utils.py | sed -n '310,330p'
Repository: NVIDIA/Model-Optimizer
Length of output: 1205
🏁 Script executed:
# Check how ep_path is passed and where _check_for_nv_tensorrt_rtx_abi_libs is called
rg "_check_for_nv_tensorrt_rtx_abi_libs" -B 3 -A 3 modelopt/
Repository: NVIDIA/Model-Optimizer
Length of output: 1109
🏁 Script executed:
# Check the function signatures that call this function to understand the trust boundary
rg "abi_ep_path" -B 5 -A 5 modelopt/ | head -100
Repository: NVIDIA/Model-Optimizer
Length of output: 6303
🏁 Script executed:
# Check if there's any documentation or config validation for abi_ep_path
rg "register_execution_provider_library" -B 5 -A 5 modelopt/
Repository: NVIDIA/Model-Optimizer
Length of output: 965
Add explicit trust boundary for ABI EP library loading.
Line 324 loads a user-supplied shared library path into the process via `ort.register_execution_provider_library`. Currently the code only validates file existence, with no trust boundary or safety justification. This creates an RCE vector if a malicious DLL is placed at the user-supplied path.
Source: Coding guidelines |
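One possible shape for the trust boundary this comment asks for is sketched below. It is only an illustration: `validate_ep_library_path` and its `trusted_dirs` allowlist are hypothetical names that do not appear in the PR, and the sketch assumes that restricting the library to known directories is an acceptable policy for this project.

```python
from pathlib import Path


def validate_ep_library_path(ep_path, trusted_dirs=None):
    """Hypothetical helper: resolve ep_path and optionally enforce an allowlist.

    Returns the resolved absolute path so the caller registers exactly the
    file that was checked, not a relative or symlinked alias of it.
    """
    if not ep_path:
        raise FileNotFoundError("No ABI EP library path was provided")

    # resolve() normalizes '..' segments and follows symlinks.
    resolved = Path(ep_path).resolve()
    if not resolved.is_file():
        raise FileNotFoundError(f"ABI EP library not found: {resolved}")

    if trusted_dirs is not None:
        roots = [Path(d).resolve() for d in trusted_dirs]
        # The library must sit somewhere below one of the trusted roots.
        if not any(root in resolved.parents for root in roots):
            raise PermissionError(f"{resolved} is outside the trusted directories")

    return resolved
```

A caller could pass the returned path to the registration call instead of the raw user string. Whether an allowlist, a signature check, or only a documented warning is the right policy is a decision for the maintainers.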
||||||||||||||
|
|
||||||||||||||
|
|
||||||||||||||
| def register_abi_ep(abi_ep_path: str | None): | ||||||||||||||
| """Register an external NvTensorRtRtx ABI execution-provider library.""" | ||||||||||||||
|
|
||||||||||||||
| _check_for_nv_tensorrt_rtx_abi_libs(abi_ep_path or "") | ||||||||||||||
| logger.debug("Registered NvTensorRtRtx ABI EP") | ||||||||||||||
|
|
||||||||||||||
|
|
||||||||||||||
| def _prepare_ep_list(calibration_eps: list[str]): | ||||||||||||||
| """Prepares the EP list for ORT from the given user input.""" | ||||||||||||||
| logger.debug(f"Preparing execution providers list from: {calibration_eps}") | ||||||||||||||
|
|
@@ -334,6 +352,9 @@ def _prepare_ep_list(calibration_eps: list[str]): | |||||||||||||
| elif "cpu" in ep: | ||||||||||||||
| providers.append("CPUExecutionProvider") | ||||||||||||||
| logger.debug("Added CPU EP") | ||||||||||||||
| elif "NvTensorRtRtx-abi" in ep: | ||||||||||||||
| providers.append("NvTensorRTRTXExecutionProvider") | ||||||||||||||
| logger.debug("Added NvTensorRtRtx ABI EP") | ||||||||||||||
|
Comment on lines +355 to +357
Contributor
Use exact matching for EP names.
Using substring matching here accepts malformed values (for example, …).
Suggested fix:
- elif "NvTensorRtRtx-abi" in ep:
+ elif ep == "NvTensorRtRtx-abi":
      providers.append("NvTensorRTRTXExecutionProvider")
      logger.debug("Added NvTensorRtRtx ABI EP")
- elif "NvTensorRtRtx" in ep:
+ elif ep == "NvTensorRtRtx":
|
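The difference between the two matching strategies can be shown with a small standalone sketch. The lookup table and both function names are hypothetical and are not part of `_prepare_ep_list`; only the two name mappings are taken from the diff above.

```python
# Hypothetical lookup table; only mappings visible in the diff are included.
EP_NAME_MAP = {
    "NvTensorRtRtx-abi": "NvTensorRTRTXExecutionProvider",
    "cpu": "CPUExecutionProvider",
}


def matches_by_substring(ep):
    """Mirror of the PR's check: true for any string containing the token."""
    return "NvTensorRtRtx-abi" in ep


def resolve_ep_exact(ep):
    """Exact lookup: unknown or malformed names are rejected explicitly."""
    try:
        return EP_NAME_MAP[ep]
    except KeyError:
        raise ValueError(f"Unknown execution provider: {ep!r}") from None


malformed = "xxNvTensorRtRtx-abi-typo"
print(matches_by_substring(malformed))        # substring check lets it through
print(resolve_ep_exact("NvTensorRtRtx-abi"))  # exact lookup for the valid name
```

With substring matching the branch order also matters, because `"NvTensorRtRtx"` is itself a substring of `"NvTensorRtRtx-abi"`. Exact matching removes that ordering dependency.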
||||||||||||||
| elif "NvTensorRtRtx" in ep: | ||||||||||||||
| try: | ||||||||||||||
| _check_for_nv_tensorrt_rtx_libs() | ||||||||||||||
|
|
||||||||||||||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -63,7 +63,7 @@ | |
| ) | ||
| from modelopt.onnx.quantization.int4 import quantize as quantize_int4 | ||
| from modelopt.onnx.quantization.int8 import quantize as quantize_int8 | ||
| from modelopt.onnx.quantization.ort_utils import update_trt_ep_support | ||
| from modelopt.onnx.quantization.ort_utils import register_abi_ep, update_trt_ep_support | ||
| from modelopt.onnx.quantization.qdq_utils import ( | ||
| qdq_to_dq, | ||
| remove_graph_input_q, | ||
|
|
@@ -358,6 +358,7 @@ def quantize( | |
| simplify: bool = False, | ||
| calibrate_per_node: bool = False, | ||
| input_shapes_profile: Sequence[dict[str, str]] | None = None, | ||
| abi_ep_path: str | None = None, | ||
| direct_io_types: bool = False, | ||
| opset: int | None = None, | ||
| autotune: bool = False, | ||
|
|
@@ -491,6 +492,9 @@ def quantize( | |
| If None of the calibration_eps require any such shapes profile for model inputs, then nothing needs to be | ||
| set for this "input_shapes_profile" parameter. | ||
| Default value is None. | ||
| abi_ep_path: | ||
| Path to an external NvTensorRtRtx ABI execution-provider library. Required when | ||
| ``NvTensorRtRtx-abi`` is present in ``calibration_eps``. | ||
|
Comment on lines +495 to +497
Contributor
Keep … The new … As per coding guidelines, public APIs should be clearly documented and kept accurate.
Source: Coding guidelines |
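The docstring states that `abi_ep_path` is required when `NvTensorRtRtx-abi` is present in `calibration_eps`. A minimal sketch of how that documented coupling could be checked up front is shown below; `check_abi_ep_args` is a hypothetical helper, not a function in the PR, and raising `ValueError` is an assumed design choice.

```python
ABI_EP_NAME = "NvTensorRtRtx-abi"


def check_abi_ep_args(calibration_eps, abi_ep_path):
    """Hypothetical pre-check for the documented argument coupling.

    Returns True when the ABI EP was requested (so registration is needed)
    and raises if it was requested without a library path.
    """
    wants_abi = ABI_EP_NAME in calibration_eps
    if wants_abi and not abi_ep_path:
        raise ValueError(
            f"abi_ep_path is required when {ABI_EP_NAME!r} is in calibration_eps"
        )
    return wants_abi
```

Failing on the argument combination before any file-system access would give the caller an error that names the parameter from the docstring rather than a missing file.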
||
| direct_io_types: | ||
| If True, modify the I/O types in the quantized ONNX model to be lower precision whenever possible. | ||
| If False, keep the I/O types in the quantized ONNX model the same as in the given ONNX model. | ||
|
|
@@ -547,6 +551,9 @@ def quantize( | |
| "Per node calibration is only supported for int8 and fp8 quantization modes" | ||
| ) | ||
|
|
||
| if "NvTensorRtRtx-abi" in calibration_eps: | ||
| register_abi_ep(abi_ep_path) | ||
|
|
||
| # quantize_static creates a shape-inferred copy at the input model's directory | ||
| # Needs to check if we have write permission to this directory | ||
| assert onnx_path.endswith((".onnx", ".pb")) | ||
|
|
||
Fix malformed `--calibration_eps` help text token.
Line 119 shows `dml:x` without quotes while all other choices are quoted. This looks like a typo in the user-facing guidance.