feat(test): add AISBench prefix test tool #1030

Open
Potterluo wants to merge 2 commits into ModelEngine-Group:develop from Potterluo:feat_test_aisbenchPrefix

Potterluo commented Jun 16, 2026

Contributor

Pull Request Description
via: https://github.com/rayn-zzz/aisbench_auto_tools_prefix

Summary

This PR introduces a new test tool module aisbench_utils to enable prefix caching performance testing for our inference service. It provides a flexible framework for generating test datasets, configuring API requests, and parsing results with hit rate metrics.

Key Changes

  • Module structure: Created aisbench_utils containing:

    • data_selector.py – supports the GSM8K (grade-school math) dataset and a random token generation mode.
    • dataset_generator.py – generates multi-prefix datasets with configurable variable-length patterns.
    • api_config.py – builds API request configurations for both streaming and non‑streaming text tests.
    • config_dataclasses.py – dataclasses for unified parameter management (e.g., batch size, prefix lengths, concurrency).
    • result_parser.py – parses test outputs and computes prefix hit rates.
  • Configuration: Added new YAML/JSON config options under the aisbench section, covering:

    • Dataset source (GSM8K / random)
    • Prefix length ranges
    • Number of test samples
    • API endpoint and authentication
    • Streaming on/off
  • Testing: Included unit tests for data selection, dataset generation, and hit rate calculation (see /tests/test_aisbench_utils.py).
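
To make the multi-prefix idea concrete, here is a minimal sketch of how a random-token dataset with shared prefixes could be generated from a small parameter dataclass. It is not the code in this PR: `PrefixTestConfig`, `generate_multi_prefix_dataset`, and every field name are hypothetical, and the real `config_dataclasses.py` and `dataset_generator.py` may be organised quite differently.

```python
import random
from dataclasses import dataclass


@dataclass
class PrefixTestConfig:
    """Hypothetical parameter bundle; field names are illustrative only."""
    num_prefixes: int = 4                          # distinct shared prefixes
    samples_per_prefix: int = 8                    # requests that reuse each prefix
    prefix_len_range: tuple[int, int] = (64, 128)  # tokens, inclusive bounds
    suffix_len_range: tuple[int, int] = (8, 32)    # tokens, inclusive bounds
    vocab_size: int = 32000                        # assumed tokenizer vocabulary size
    seed: int = 0                                  # fixed seed for reproducible runs


def generate_multi_prefix_dataset(cfg: PrefixTestConfig) -> list[dict]:
    """Build samples in which every group shares one variable-length prefix."""
    rng = random.Random(cfg.seed)

    def random_tokens(length_range: tuple[int, int]) -> list[int]:
        # Draw a length from the inclusive range, then that many random token ids.
        length = rng.randint(*length_range)
        return [rng.randrange(cfg.vocab_size) for _ in range(length)]

    dataset = []
    for prefix_id in range(cfg.num_prefixes):
        prefix = random_tokens(cfg.prefix_len_range)
        for _ in range(cfg.samples_per_prefix):
            suffix = random_tokens(cfg.suffix_len_range)
            dataset.append(
                {
                    "prefix_id": prefix_id,
                    "prefix_len": len(prefix),
                    "token_ids": prefix + suffix,
                }
            )
    return dataset
```

Requests in the same group differ only in their suffix, so a prefix cache should be able to reuse the first `prefix_len` tokens of every request after the first one in that group.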

Motivation

Prefix caching is critical for reducing latency in LLM serving. This tool allows us to systematically benchmark cache hit rates under varying prefix distributions, helping optimize our caching policies and deployment configurations.

How to Test

  1. Run the example script: python examples/run_aisbench.py --config configs/aisbench_example.yaml
  2. Verify generated datasets and API calls succeed.
  3. Check output reports for hit rate statistics.
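
For readers checking the reported numbers by hand, the sketch below shows one plausible way a hit rate could be derived from two counter snapshots taken before and after a run. The PR does not state how `result_parser.py` computes it, so the `hits` and `queries` keys and the function name are assumptions made purely for illustration.

```python
def prefix_hit_rate(before: dict, after: dict) -> float:
    """Hit rate over one test run, from cumulative counters.

    Both arguments are assumed to hold cumulative "hits" and "queries"
    counters (hypothetical key names), sampled before and after the run.
    """
    hits = after["hits"] - before["hits"]
    queries = after["queries"] - before["queries"]
    if queries <= 0:
        # No cache lookups happened during the run; report 0.0 instead of dividing by zero.
        return 0.0
    return hits / queries


rate = prefix_hit_rate(
    before={"hits": 100, "queries": 400},
    after={"hits": 400, "queries": 800},
)
print(f"prefix hit rate: {rate:.2%}")
```

Taking the difference between snapshots keeps traffic from earlier runs out of the result, which matters when the service is not restarted between tests.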

Additional Notes

  • The module is designed to be extensible – new dataset types or test modes can be added by subclassing the base selectors/generators.
  • Documentation is included in the module docstrings; a separate user guide will follow in a later PR.
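
As an illustration of the extension point described above, a new dataset type might be added roughly as follows. The base class and method names here (`BaseDataSelector`, `select`) are hypothetical stand-ins, since the actual base selectors in `aisbench_utils` are not shown in this description.

```python
import random
from abc import ABC, abstractmethod


class BaseDataSelector(ABC):
    """Hypothetical stand-in for the module's base selector."""

    @abstractmethod
    def select(self, num_samples: int) -> list[str]:
        """Return num_samples prompt strings."""


class RandomWordSelector(BaseDataSelector):
    """Example of a new dataset type: prompts built from a fixed word list."""

    def __init__(self, vocabulary: list[str], words_per_sample: int = 16, seed: int = 0):
        self._vocabulary = list(vocabulary)
        self._words_per_sample = words_per_sample
        self._rng = random.Random(seed)  # seeded so runs are reproducible

    def select(self, num_samples: int) -> list[str]:
        return [
            " ".join(
                self._rng.choice(self._vocabulary)
                for _ in range(self._words_per_sample)
            )
            for _ in range(num_samples)
        ]
```

Because the base class is abstract, a subclass that forgets to implement `select` fails at instantiation rather than midway through a benchmark run.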


- Add aisbench_utils module with data selection, dataset generation, and API configuration capabilities
- Implement API config generation supporting streaming and text test types
- Add data selector supporting GSM8K dataset and random token mode
- Add multi-prefix dataset generation with variable-length pattern support
- Introduce configuration dataclasses to unify AISBench test parameters
- Add result parsing and prefix hit rate calculation functions
- Add AISBench prefix cache test configuration options in the configuration file

if result.returncode != 0 or not result.stdout.strip():
    return {}, {}

lines = result.stdout.strip().split("\n")

⚠️ Warning: Using shell=True with interpolated URLs in subprocess is a potential security risk. While ip_address and port should be validated, consider using subprocess.run(["curl", "-s", url], ...) without shell=True for safer execution.
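
A minimal sketch of what this suggestion could look like in practice, assuming the helper receives an IP address and a port and fetches a plain-HTTP endpoint. The function names and the `/metrics` path are hypothetical and are not taken from the PR.

```python
import ipaddress
import subprocess


def build_curl_command(ip_address: str, port: int, path: str = "/metrics") -> list[str]:
    """Validate the inputs and return an argv list suitable for shell=False."""
    ip = ipaddress.ip_address(ip_address)  # raises ValueError on anything but an IP
    if not isinstance(port, int) or not 0 < port < 65536:
        raise ValueError(f"invalid port: {port!r}")
    # IPv6 literals must be wrapped in brackets inside a URL.
    host = f"[{ip}]" if ip.version == 6 else str(ip)
    return ["curl", "-s", f"http://{host}:{port}{path}"]


def fetch(ip_address: str, port: int, path: str = "/metrics", timeout: float = 10.0) -> str:
    """Run curl without a shell; return stdout, or "" on failure or empty output."""
    result = subprocess.run(
        build_curl_command(ip_address, port, path),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,
    )
    if result.returncode != 0 or not result.stdout.strip():
        return ""
    return result.stdout
```

Passing an argument list means the URL reaches `curl` as a single argument, so shell metacharacters in it are never interpreted. The `ipaddress` check rejects malformed input before any process is started.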

except OSError as e:
    if e.errno == errno.EEXIST:
        os.remove(link_name)
        os.symlink(target, link_name)

💡 Suggestion: The sys.platform == "win32" check handles Windows, but other Unix-like systems (macOS) might behave differently. Consider using os.name == "nt" or a more robust platform detection.
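
Separately from the platform check, the remove-then-recreate sequence in the quoted snippet leaves a short window in which the link does not exist. One possible alternative, sketched here with a hypothetical helper name, is to create the link under a temporary name and rename it into place. This assumes symlink creation is permitted on the host, which on Windows may require extra privileges.

```python
import os
import uuid


def force_symlink(target: str, link_name: str) -> None:
    """Point link_name at target, replacing an existing link if there is one."""
    link_dir = os.path.dirname(os.path.abspath(link_name))
    # Temporary name in the same directory, so the rename stays on one filesystem.
    tmp_name = os.path.join(
        link_dir, f".{os.path.basename(link_name)}.{uuid.uuid4().hex}.tmp"
    )
    os.symlink(target, tmp_name)
    try:
        # os.replace overwrites an existing destination in a single rename step.
        os.replace(tmp_name, link_name)
    except OSError:
        os.remove(tmp_name)  # do not leave the temporary link behind
        raise
```

With this approach there is no `EEXIST` branch to handle, because the rename overwrites an existing link instead of failing on it.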
