Krishn1412 commented on Jan 7, 2026
- Fixes issue #2413.
- Uses a cost heuristic based on MultiplicativeDepthVisitorImpl to estimate polynomial evaluation complexity.
- Computes and compares the multiplicative depth of Chebyshev (PS) and monomial (Horner) evaluation DAGs.
- Selects the evaluation strategy with lower multiplicative depth.
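To make the depth comparison concrete, here is a minimal, self-contained sketch. It is not HEIR's `MultiplicativeDepthVisitorImpl` or its DAG type: `Node`, `makeNode`, `multDepth`, `horner`, and `balanced` are hypothetical names introduced only for illustration. The `balanced` builder is a simple balanced power tree standing in for a low-depth scheme, not an implementation of Paterson-Stockmeyer, and the sketch assumes that multiplying by a plaintext constant does not consume depth.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for an arithmetic DAG node (not HEIR's DAG type).
struct Node {
  enum Kind { Input, Constant, Add, Mul } kind;
  std::shared_ptr<Node> lhs, rhs;
};
using NodePtr = std::shared_ptr<Node>;

NodePtr makeNode(Node::Kind kind, NodePtr lhs = nullptr, NodePtr rhs = nullptr) {
  return std::make_shared<Node>(Node{kind, std::move(lhs), std::move(rhs)});
}

// Returns -1 for a subtree that does not depend on the input (plaintext only),
// otherwise the number of input-by-input multiplications on the longest path.
// The memo table keeps shared subexpressions from being revisited.
int multDepth(const NodePtr& n, std::unordered_map<const Node*, int>& memo) {
  if (n->kind == Node::Constant) return -1;
  if (n->kind == Node::Input) return 0;
  auto it = memo.find(n.get());
  if (it != memo.end()) return it->second;
  int l = multDepth(n->lhs, memo);
  int r = multDepth(n->rhs, memo);
  int d = std::max(l, r);
  // Assumption: plaintext-by-ciphertext products do not consume depth.
  if (n->kind == Node::Mul && l >= 0 && r >= 0) d += 1;
  memo[n.get()] = d;
  return d;
}

int multDepth(const NodePtr& n) {
  std::unordered_map<const Node*, int> memo;
  return multDepth(n, memo);
}

// Horner form: (((a_n x + a_{n-1}) x + ...) x + a_0).
NodePtr horner(int degree, const NodePtr& x) {
  assert(degree >= 0);
  NodePtr acc = makeNode(Node::Constant);
  for (int i = 0; i < degree; ++i)
    acc = makeNode(Node::Add, makeNode(Node::Mul, acc, x),
                   makeNode(Node::Constant));
  return acc;
}

// Balanced form: x^i = x^ceil(i/2) * x^floor(i/2), then a plaintext-weighted sum.
NodePtr balanced(int degree, const NodePtr& x) {
  assert(degree >= 0);
  std::vector<NodePtr> powers(degree + 1);
  if (degree >= 1) powers[1] = x;
  for (int i = 2; i <= degree; ++i)
    powers[i] = makeNode(Node::Mul, powers[(i + 1) / 2], powers[i / 2]);
  NodePtr acc = makeNode(Node::Constant);
  for (int i = 1; i <= degree; ++i)
    acc = makeNode(Node::Add, acc,
                   makeNode(Node::Mul, makeNode(Node::Constant), powers[i]));
  return acc;
}
```

Under these assumptions, a degree-8 Horner DAG has depth 7 while the balanced DAG has depth 3; at degree 3 both have depth 2, so a depth-only heuristic cannot separate them for small polynomials.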
| #include "lib/Utils/Polynomial/Horner.h" | ||
| #include "lib/Utils/Polynomial/PatersonStockmeyer.h" | ||
| #include "lib/Utils/Polynomial/Polynomial.h" | ||
| #include "lib/Utils/Polynomial/PolynomialTestVisitors.h" |
nit: since this file was intended for unit testing only, and now it's used outside of a unit test, the MultiplicativeDepthVisitor should be extracted to a standalone library.
| double chebDepth = depthVisitor.process(chebDag); | ||
| double monoDepth = depthVisitor.process(monoDag); | ||
|
|
||
| bool useMonomial = chebDepth > monoDepth; |
This code lives inside a function whose name suggests that the Paterson-Stockmeyer Chebyshev method will be used unconditionally.
In addition, if the pass option specifies the "pscheb" method, I don't think the pass should second-guess it and use a different method, even if that method is more efficient.
So what we should do here instead is:
- Extract the construction of the DAG and the analysis into functions that are outside of the RewritePattern implementation.
- In LowerPolynomialEval, where the method is "auto", construct the DAGs, analyze their depth, and use the result to determine which rewrite pattern to apply.
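The selection logic suggested above could be factored into a small pure function along these lines. This is only a sketch: `Lowering`, `DepthEstimate`, and `chooseLowering` are hypothetical names, the only method strings handled are the two mentioned in this thread ("auto" and "pscheb"), and any other value is left to the caller. It uses no MLIR types so that the decision can be unit tested separately from the rewrite patterns.

```cpp
#include <optional>
#include <string>

// Hypothetical names for illustration; none of these exist in the PR.
enum class Lowering { Monomial, Chebyshev };

// Result of a depth analysis computed outside the rewrite patterns.
struct DepthEstimate {
  int chebDepth;
  int monoDepth;
};

// Decide which lowering to apply. An explicit "pscheb" request is always
// honoured; the depth comparison is consulted only for "auto".
// Returns std::nullopt for method strings this sketch does not cover.
std::optional<Lowering> chooseLowering(const std::string& method,
                                       const DepthEstimate& est) {
  if (method == "pscheb") return Lowering::Chebyshev;
  if (method == "auto") {
    // Same comparison as in the diff: ties go to the Chebyshev lowering.
    return est.chebDepth > est.monoDepth ? Lowering::Monomial
                                         : Lowering::Chebyshev;
  }
  return std::nullopt;
}
```

With this shape, the pass would compute the estimate once per polynomial.eval op and then apply only the pattern that matches the returned value.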
As I understand from the code, LowerPolynomialEval only registers the rewrite patterns; the actual rewriting happens later, when MLIR applies them. At registration time we don't have access to the polynomials, so presumably no analysis can be done there. Could we instead do the analysis inside matchAndRewrite, where the polynomial.eval op is available? In automatic mode, a pattern can return failure() if it decides the lowering is too expensive or unstable, allowing MLIR to try the other patterns. Apologies for the delay!
Anything inside runOnOperation has access to the entire IR, but yes, you can move the code that actually mutates the IR into helper functions and then do the analysis inside a smaller number of patterns.
|
|
||
| module { | ||
| func.func @chebyshev(%ct: f32) -> f32 { | ||
| %ct_0 = polynomial.eval #poly, %ct {coefficients = [0.0, 0.75, 0.0, 0.25], domain_lower = -1.000000e+00 : f64, domain_upper = 1.000000e+00 : f64} : f32 |
This file needs at least one // CHECK: ... statement to assert the output is correct.
| double chebDepth = depthVisitor.process(chebDag); | ||
| double monoDepth = depthVisitor.process(monoDag); | ||
|
|
||
| bool useMonomial = chebDepth > monoDepth; |
The second problem listed in #2413, which this PR does not cover, is numerical stability. The reason we use the Chebyshev basis instead of the monomial basis is that, for larger-degree polynomials, the monomial-basis coefficients of high-degree terms necessarily become very small (e.g., 1e-15) but cannot be ignored because of their large influence on the evaluated result, whereas Chebyshev-basis coefficients remain relatively well normalized, and small-magnitude coefficients can be dropped without influencing the output.
So the other check needed before allowing the monomial lowering is: will the monomial representation be unstable? While the answer may depend on which FHE scheme is being used and what precision that scheme supports, a good place to start would be to compute the condition number of polynomial evaluation in the monomial basis, as described in https://epubs.siam.org/doi/10.1137/1.9780898718027.ch5 (Higham's Accuracy and Stability of Numerical Algorithms). However, that requires knowing the value of x in advance, which may not be possible in this case. We should have access to lower and upper bounds on the domain, so maybe sampling a few values in the domain would suffice.
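A rough sketch of the sampling idea follows, assuming the usual relative condition number of evaluation with respect to coefficient perturbations, cond(p, x) = (sum_i |a_i| |x|^i) / |p(x)|. The function names, the uniform sampling grid, and the handling of p(x) = 0 are choices made for this sketch, not anything from the PR. Note that the ratio blows up near roots of p, so a real check would probably need either to avoid samples near roots or to use an absolute error bound instead.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Condition number of evaluating p(x) = sum_i coeffs[i] * x^i at a point x.
// Numerator and denominator are both accumulated with Horner's rule.
double conditionNumber(const std::vector<double>& coeffs, double x) {
  double p = 0.0;     // p(x)
  double pAbs = 0.0;  // sum_i |a_i| |x|^i
  const double ax = std::fabs(x);
  for (auto it = coeffs.rbegin(); it != coeffs.rend(); ++it) {
    p = p * x + *it;
    pAbs = pAbs * ax + std::fabs(*it);
  }
  if (p == 0.0) {
    // Convention for this sketch: an identically zero sum is treated as
    // perfectly conditioned; a genuine root gives an infinite ratio.
    return pAbs == 0.0 ? 1.0 : std::numeric_limits<double>::infinity();
  }
  return pAbs / std::fabs(p);
}

// Largest condition number over numSamples evenly spaced points in
// [lower, upper], endpoints included.
double worstConditionNumber(const std::vector<double>& coeffs, double lower,
                            double upper, int numSamples) {
  assert(numSamples >= 2 && lower < upper);
  double worst = 0.0;
  for (int i = 0; i < numSamples; ++i) {
    double x = lower + (upper - lower) * i / (numSamples - 1);
    worst = std::max(worst, conditionNumber(coeffs, x));
  }
  return worst;
}
```

The pass could then compare the sampled worst case against a threshold derived from the precision the target scheme supports; what that threshold should be is left open here.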