Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #15 +/- ##
==========================================
+ Coverage 93.01% 93.15% +0.13%
==========================================
Files 6 6
Lines 501 555 +54
==========================================
+ Hits 466 517 +51
- Misses 35 38 +3 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'Benchmark Results'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.30.
| Benchmark suite | Current: d7dde3a | Previous: 2a155ab | Ratio |
|---|---|---|---|
Correlations/two-time/single photon cavity |
330607520 ns |
245918285 ns |
1.34 |
Interaction Picture/coefficient matrix M/numerical (ODE) |
187290 ns |
121594.5 ns |
1.54 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
Benchmark Results
Details
| Benchmark suite | Current: 5248561 | Previous: 2a155ab | Ratio |
|---|---|---|---|
Correlations/two-time/single photon cavity |
247400836 ns |
245918285 ns |
1.01 |
Interaction Picture/coefficient matrix M/analytical (2 equal modes) |
12945 ns |
17328 ns |
0.75 |
Interaction Picture/coefficient matrix M/numerical (ODE) |
72286 ns |
121594.5 ns |
0.59 |
Interaction Picture/coupling matrix evaluation/2 modes |
97.21074815595364 ns |
311.7795275590551 ns |
0.31 |
Interaction Picture/coupling matrix evaluation/4 modes |
201.8 ns |
2145.5555555555557 ns |
0.09405489383738995 |
Interaction Picture/operator substitution/TLS cascade |
84419 ns |
531590721 ns |
0.00015880450253382057 |
Pulse Couplings/multi-pulse/input 2 modes |
413347 ns |
437219 ns |
0.95 |
Pulse Couplings/multi-pulse/output 2 modes |
464122 ns |
510509 ns |
0.91 |
Pulse Couplings/single-pulse/input |
366529 ns |
380661 ns |
0.96 |
Pulse Couplings/single-pulse/input Gaussian |
92.60314136125655 ns |
127.10699373695198 ns |
0.73 |
Pulse Couplings/single-pulse/output |
400443 ns |
404046.5 ns |
0.99 |
Pulse Couplings/single-pulse/output Gaussian |
88.134375 ns |
130.90388655462186 ns |
0.67 |
SLH Algebra/closure evaluation/2-QD waveguide H(t) |
34455 ns |
65417.5 ns |
0.53 |
SLH Algebra/closure evaluation/2-QD waveguide L(t) |
3297.5 ns |
7646.375 ns |
0.43 |
SLH Algebra/numeric/2-QD waveguide composition |
21260 ns |
52371.5 ns |
0.41 |
SLH Algebra/symbolic/3-cavity cascade |
7858 ns |
98119.5 ns |
0.08008601756021994 |
SLH Algebra/symbolic/concatenate + cascade |
2725.1111111111113 ns |
31346 ns |
0.0869364866685099 |
SLH Algebra/symbolic/feedback OPO loop |
6540.4 ns |
32864 ns |
0.20 |
Translation/closure evaluation/3-cavity H(t) |
18946 ns |
23670 ns |
0.80 |
Translation/closure evaluation/3-cavity L(t) |
14527 ns |
18372 ns |
0.79 |
Translation/static/atom-cavity |
38021 ns |
498433.5 ns |
0.07628098833645812 |
Translation/time-dependent/3-cavity H+L |
568912 ns |
3576277 ns |
0.16 |
Translation/time-dependent/atom-cavity |
111710 ns |
893640 ns |
0.13 |
This comment was automatically generated by workflow using github-action-benchmark.
Reduce the run-to-run and cross-runner variance that has been making the benchmark CI flag phantom regressions. Save minimum(results) instead of median(results). The minimum is BenchmarkTools' recommended tracking estimator: measurement noise (GC pauses, scheduler preemption, frequency scaling) only ever adds time, so the minimum is the most reproducible estimate of the true cost and the least sensitive to hardware differences between runners. Set gcsample=true on the two allocation-heavy ODE-solve benchmarks (Correlations two-time, Interaction Picture numerical mode evolution). They rebuild operators every ODE step and spend ~50% of their runtime in GC, which gives a ~4x run-to-run spread; running a GC before each sample brings that down to ~1.4x. The fast ns/us benchmarks are left untouched: their minimum is already stable and gcsample would only cost samples.
bench: report minimum + gcsample heavy benchmarks to cut CI variance
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
https://qojulia.github.io/QuantumInputOutput.jl/previews/PR15/
Move to SecondQuantizedAlgebra v0.8
This PR bumps SecondQuantizedAlgebra from 0.4.5 to 0.8 and adapts the package accordingly.
On the benchmark CI "regression"
The benchmark action flagged two benchmarks as regressions (1.54x and 1.34x). I am not sure why. To settle it, I ran a proper A/B on a single machine: same Julia 1.12.6, the same QuantumOptics 1.2.6 in both, single-threaded throughout (Julia task threads, GC threads, and BLAS all pinned to 1), reporting the minimum. The only difference between the two runs is SecondQuantizedAlgebra (0.4.5 vs 0.8.2).
The two benchmarks CI flagged are flat or faster on the same machine, with identical allocations:
So the numeric ODE paths this PR does not touch are unchanged (allocations are byte-for-byte identical), and the CI flag is pure cross-runner noise.
The symbolic paths this PR actually optimizes are dramatically faster:
Benchmark reproducibility changes
To reduce the cross-run noise that produced the false alarm, the benchmark job now pins all thread pools to 1:
.github/workflows/Benchmarks.yaml: run with--threads=1 --gcthreads=1(was--threads=2).benchmarks/runbenchmarks.jl:BLAS.set_num_threads(1)so the dense linear algebra in the numeric benchmarks does not pick up the runner's core count.