-
Notifications
You must be signed in to change notification settings - Fork 350
Math: Optimizations to CORDIC trigonometric functions #10495
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR optimizes the CORDIC-based trigonometric functions by replacing runtime floating-point conversions with compile-time constants, improving algorithm efficiency, and enhancing code documentation.
Changes:
- Replaced runtime
Q_CONVERT_FLOAT()calls with precomputed#defineconstants for CORDIC lookup table values - Optimized the
cordic_approx()function by reducing pointer dereferencing, simplifying conditional logic, and deferring result writes - Improved function documentation with kernel-doc style comments including parameter descriptions and return value formats
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/math/trig.c | Replaces floating-point constant conversions with precomputed integer macros; refactors cordic_approx() for better performance; updates function documentation; removes redundant variables in inverse trig functions |
| src/include/sof/math/trig.h | Adds kernel-doc comments for all public functions; updates constant definitions with comments; adds helper macros for iteration counts; updates function signatures to use int instead of int16_t |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
kv2019i
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One leftover in the first patch, but otherwise looks good.
src/math/trig.c
Outdated
| /* is repeated twice and changes only through one iteration */ | ||
| i = numiters - 1; | ||
| for (b_i = 0; b_i < i; b_i++) { | ||
| // i = numiters - 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This probably should have been removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, yes, I'll remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed now.
f3890fc to
73435c3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes save in MTL platform 46 MCPS in stft_process
component with 1024 size FFT and 256 hop in converting FFTs
to (magnitude,phase) phase format and back complex FFT.
btw. What percentage savings is that? 46 MCPS looks huge!
src/math/trig.c
Outdated
| * Return Type : int32_t | ||
| */ | ||
| int32_t is_scalar_cordic_acos(int32_t cosvalue, int16_t numiters) | ||
| int32_t is_scalar_cordic_acos(int32_t cosvalue, int numiters_minus_one) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The argument name "numiters_minus_one" is confusing to me.
Maybe a better name would be table_size_minus_one (to clearly communicate that the number of iterations shall be lower than the table size) or just keep it as is (numiters).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I agree. Just "numiters" and macro with name numiters to separate it from table size looks better to me.
src/math/trig.c
Outdated
| /** | ||
| * CORDIC-based approximation for inverse cosine | ||
| * Arguments : int32_t cosvalue | ||
| * int16_t numiters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please also correct the description if you want to change the name
This patch optimizes the cycles count performance for the math functions. The computational accuracy is not impacted. In function cordic_approx() in trig.c, the most performance critical function, the if statements are simplified and macro values used instead of global constants. The first if statement is simplified with use of absolute value of the parameter th_rad_fxp and using new direction variable from sign of the parameter. The b_idx iteration loop is simplified with direction sign variable and new variable for common shift amount. The local variables in the b_idx iteration speed up the algorithm. In is_scalar_cordic_acos() and is_scalar_cordic_acos() functions the numiters parameter is replaced by numiters_minus_one to avoid the subtract by one. The cosvalue shift is avoided by changing the constant to compare to as 2x. The variable k is eliminated from the b_i iteration since it is duplicate of variable b_i. The constant variables to macros names have been adjusted for the actual Q-format computed the code. E.g. PI/2 Q29 fraction is same integer value as PI Q28 fraction. In header file trig.h the PI related constants are adjusted to Q-format used in the code. The equation to calculate the value in Octave is shown in comment. The new macros for table size -1 iteration counts are added and the inline functions are changed to use it. The int16_t parameters for functions are replaced with int for better code speed. These changes save in MTL platform 46 MCPS in stft_process component with 1024 size FFT and 256 hop in converting FFTs to (magnitude,phase) phase format and back complex FFT. Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
The SOF code documentation syntax comments are added to header trig.h. Some comments in trig.c are trimmed due to placing more detailed documentation into the header. Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
73435c3 to
ba2e795
Compare
|
In addition to fixes for @wjablon1 's comments I fixed doxygen issues in trig.h. |
The main saving is from function cordic_approx() that is exercised a lot in the test run. In profiler run the cycles per call got reduced from 146427 to 81294. It would be about 44% saved. |
|
Perfect! QB looks good, trigger rerun because of one unrelated (probably) fail |
lgirdwood
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice MCPS savings !
No description provided.