sched: simplify xex() and expand cluster_sched test coverage#51
Conversation
3ebb636 to
9e1c89a
Compare
Replace xex()'s branch-heavy switch with a branchless arithmetic equivalent: HVX(xe) and HLX(xe3) share one coproc slot, so their contribution to coproc_count is (new_xe|new_xe3) - (cur_xe|cur_xe3); HMX(xe2) is its own slot using plain subtraction. Verified equivalent by truth table. Comparing compiled result- 5 insns / 0 branches vs 15 / 3. Expand the readylist CLUSTER_SCHED test to cover all code paths in H2K_ready_head. Add slot-accounting correctness checks verifying xe, xe|xe3, and xe|xe3|xe2 each charge the expected count delta. Signed-off-by: Zeev Belinsky <zbelinsk@qti.qualcomm.com>
Signed-off-by: zbelinsk <zbelinsk@qti.qualcomm.com>
Signed-off-by: Zeev Belinsky <zbelinsk@qti.qualcomm.com>
ed69362 to
818db5d
Compare
bryanb-h2
left a comment
There was a problem hiding this comment.
This is great work; thanks!
Would be good to see before/after coverage data.
| count += 1; | ||
| } | ||
| /* HVX (xe) and HLX (xe3) share ONE coproc slot: the slot is occupied | ||
| * iff either is set, so its occupancy is (xe | xe3). The slot's |
| #ifdef CLUSTER_SCHED | ||
| if ((!H2K_gp->cluster_sched) || H2K_gp->coproc_max == -1) { | ||
| return ret; | ||
| return head; |
There was a problem hiding this comment.
I think maybe the compiler would already do this optimization :)
There was a problem hiding this comment.
I think maybe the compiler would already do this optimization :)
I agree with you, this is the big win compile time ifdef, just made the change for readability.
| /* This hthread is going to sleep, so no longer using xe/xe2/xe3 */ | ||
| ssr &= ~SSR_XE_BIT_MASK; | ||
| ssr &= ~SSR_XE2_BIT_MASK; | ||
| ssr &= ~(SSR_XE_BIT_MASK | SSR_XE2_BIT_MASK); |
There was a problem hiding this comment.
The human optimizer strikes again.
There was a problem hiding this comment.
The human optimizer strikes again.
Just thought It is better for readability. Was easier for me when reading the code first time to think about it like that.
There was a problem hiding this comment.
Sure, it's a good change. Go ahead and merge.
I did not run any coverage on old version. But, I glanced at it, it intended to test around half of the scheduling cases. |
Replace xex()'s branch-heavy switch with a branchless arithmetic equivalent: HVX(xe) and HLX(xe3) share one coproc slot, so their contribution to coproc_count is (new_xe|new_xe3) - (cur_xe|cur_xe3); HMX(xe2) is its own slot using plain subtraction. Verified equivalent by truth table and by comparing compiled result- 5 insns / 0 branches vs 15 / 3.
Expand the readylist CLUSTER_SCHED test to cover all five code paths in H2K_ready_head: early-out, fits
locally, full+no-waiters oversubscribe, full+handoff, and ring-walk-exhausted/ring-walk-finds-fitter. Add slot- accounting correctness checks verifying xe, xe|xe3, and xe|xe3|xe2 each charge the expected count delta.