Add a new DAE pass with a fixed point analysis #8217

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Open

tlively wants to merge 2 commits into main from dae2

+5,891 −0

Member

tlively commented Jan 17, 2026

DAE can be slow because it performs several rounds of interleaved
analysis and optimization. On top of this, the analysis it performs is
not as precise as it could be because it never removes parameters from
referenced functions and it cannot optimize unused parameters or results
that are forwarded through recursive cycles.

Start improving both the performance and the power of DAE by creating a
new pass, called DAE2 for now. DAE2 performs a single parallel walk of
the module to collect information with which it performs a fixed point
analysis to find unused parameters, then does a single parallel walk of
the module to optimize based on this analysis.

Member Author

tlively commented Jan 17, 2026

This replaces #8085 because so much has changed in the meantime.

tlively requested review from aheejin, kripken and stevenfontanella

January 17, 2026 20:05


          Add a new DAE pass with a fixed point analysis

cd49b44

DAE can be slow because it performs several rounds of interleaved
analysis and optimization. On top of this, the analysis it performs is
not as precise as it could be because it never removes parameters from
referenced functions and it cannot optimize unused parameters or results
that are forwarded through recursive cycles.

Start improving both the performance and the power of DAE by creating a
new pass, called DAE2 for now. DAE2 performs a single parallel walk of
the module to collect information with which it performs a fixed point
analysis to find unused parameters, then does a single parallel walk of
the module to optimize based on this analysis.

tlively force-pushed the dae2 branch from 7e02b28 to cd49b44 Compare

January 18, 2026 16:40

Member

kripken commented Jan 20, 2026

There is a lot of test code here - is it copied from the existing tests? If so, can we use another lit run line for it? If that isn't practical yet because of differences, I still wonder if we can reuse the existing tests somehow, even if the two passes have differing output in some cases (can use a different check prefix for those tests, maybe move them out to another file, etc.).

On another topic, what is our overall plan here? Do you want to land this even before we know if we will be replacing the old pass?


          fix trampolines

bce9d07

Member Author

tlively commented Jan 20, 2026

The tests here are all new. We could additionally run the existing DAE tests, but we wouldn't be able to remove many of the new tests that way.

Member Author

tlively commented Jan 20, 2026

As far as the overall plan goes, I'd like to have this landed so we can consider making it more powerful iteratively. My hope is that if it does more work DAE currently does, then we'll start seeing better speedups. We could also use it immediately to replace the unprofitable bailout from DAE, but that's probably not worth it.

kripken reviewed

View reviewed changes

src/passes/DeadArgumentElimination2.cpp

+              // forwarded on to other function calls. These forwarded parameters can still be
+              // optimized out as long as they are unused in the callees they are forwarded
+              // to. Since we perform a fixed point analysis, cycles of forwarded parameters
+              // can still be removed.

Member

kripken Jan 20, 2026

Maybe worth noting that "forwarded" is not just "literally the value is forwarded" but also transformations of the value.

src/passes/DeadArgumentElimination2.cpp

+              using Params = std::vector<Used::Element>;
+              // Function index and parameter index.
+              using ParamLoc = std::pair<Index, Index>;

Member

kripken Jan 20, 2026

How about FuncParamLoc, to match TypeParamLoc below?

src/passes/DeadArgumentElimination2.cpp

+              // A set of (source, destination) index pairs for parameters of a caller
+              // function being forwarded as arguments to a callee function.
+              using ForwardedParamSet = std::unordered_set<std::pair<Index, Index>>;

Member

kripken Jan 20, 2026

Suggested change

      
            using ForwardedParamSet = std::unordered_set<std::pair<Index, Index>>;
          
            using ForwardedParamSet = std::unordered_set<ParamLoc>;

Member Author

tlively Jan 20, 2026

This would be slightly deceptive because both Indexes here are parameter indices. In particular the first one is not a function index.

src/passes/DeadArgumentElimination2.cpp

+              #define TIME(...)
+              #endif // TIME_DAE
+              // Find the root of the subtyping hierarchy for a given HeapType.

Member

kripken Jan 20, 2026

Suggested change

      
            // Find the root of the subtyping hierarchy for a given HeapType.
          
            // Find the non-basic root of the subtyping hierarchy for a given HeapType.

src/passes/DeadArgumentElimination2.cpp

+                // Unreferenced functions can be optimized separately from referenced
+                // functions with the same type. For unreferenced functions in that situation,
+                // this is the new type that should be applied before global type rewriting to
+                // prevent the function from getting the wrong optimizations.

Member

kripken Jan 20, 2026

I don't follow this. If a function is unreferenced, we track that in referenced, and in that case, we can just apply all the stuff we find, without caring about syncing it up with functions with a similar type. Why do we need to set a particular "replacement type" here?

Member Author

tlively Jan 21, 2026

We use a GlobalTypeRewriter to rewrite optimized function types for indirect calls globally. GlobalTypeRewriter does not provide any way to skip updating the types of non-referenced functions, so we have to arrange for the GlobalTypeRewriter to rewrite the types of non-referenced functions to the desired signatures as well. When different functions with the same original type should be optimized to have different types, we need to ensure those functions have different types before global type rewriting, otherwise it would not be possible for them to have different types after global type rewriting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet