author Bruce Evans <bde@FreeBSD.org> 2005-11-30 11:51:17 +0000
committer Bruce Evans <bde@FreeBSD.org> 2005-11-30 11:51:17 +0000
commit f4b01a9edf59b9d3091f4fa787a899090a4414b4
tree 48dcd46ed879e591abfa46bfaf080ccbf0697479 /sys
parent 1f4ae9be579b9636091f4e536656bc52fa6f4a3a
Rearranged the polynomial evaluation to reduce dependencies, as in k_tanf.c but with different details. The polynomial is odd with degree 13 for tanf() and odd with degree 9 for sinf(), so the details are not very different for sinf() -- the term with the x**11 and x**13 coefficients goes away, and (mysteriously) it helps to do the evaluation of w = z*z early although moving it later was a key optimization for tanf(). The details are different but simpler for cosf() because the polynomial is even and of lower degree.

On Athlons, for uniformly distributed args in [-2pi, 2pi], this gives an optimization of about 4 cycles (10%) in most cases (13% for sinf() on AXP, but 0% for cosf() with gcc-3.3 -O1 on AXP). The best case (sinf() with gcc-3.4 -O1 -fcaller-saves on A64) now takes 33-39 cycles (was 37-45 cycles). Hardware sinf takes 74-129 cycles.

Despite being fine tuned for Athlons, the optimization is even larger on some other arches (about 15% on ia64 (pluto2) and 20% on alpha (beast) with gcc -O2 -fomit-frame-pointer).
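To illustrate what "rearranged to reduce dependencies" can mean for the degree-9 odd sinf() polynomial, here is a minimal sketch. It is not the committed code: the coefficients S1..S4 are plain Taylor coefficients chosen for illustration (the real kernel uses tuned coefficients), and the function names sin_poly_horner(), sin_poly_split() and check_forms_agree() are hypothetical. The Horner form is one long dependency chain, while the split form computes w = z*z early and evaluates the high and low halves independently so that a pipelined FPU can overlap them.

```c
#include <assert.h>
#include <math.h>

/*
 * Taylor coefficients for sin(x).  ASSUMPTION: used only for illustration;
 * the kernel in the commit uses its own tuned coefficients.
 */
static const float S1 = -1.0f / 6.0f;		/* x**3 */
static const float S2 =  1.0f / 120.0f;		/* x**5 */
static const float S3 = -1.0f / 5040.0f;	/* x**7 */
static const float S4 =  1.0f / 362880.0f;	/* x**9 */

/* Horner form: every multiply-add waits for the previous one. */
static float
sin_poly_horner(float x)
{
	float z = x * x;

	return (x + x * z * (S1 + z * (S2 + z * (S3 + z * S4))));
}

/*
 * Split form: w = z*z is computed early, and the high half (r) does not
 * depend on the low half (S1 + z*S2), so the two can proceed in parallel.
 */
static float
sin_poly_split(float x)
{
	float z = x * x;
	float w = z * z;
	float r = S3 + z * S4;
	float s = z * x;

	return ((x + s * (S1 + z * S2)) + s * w * r);
}

/* Hypothetical self-check: both forms agree to within rounding noise. */
static void
check_forms_agree(void)
{
	int i;

	for (i = -78; i <= 78; i++) {
		float x = (float)i * 0.01f;

		assert(fabsf(sin_poly_split(x) - sin_poly_horner(x)) < 1e-6f);
	}
}
```

Both functions evaluate the same polynomial x + S1*x**3 + S2*x**5 + S3*x**7 + S4*x**9; only the order of operations differs, so results may differ in the last bit or two. Whether the split form is actually faster depends on the CPU and compiler, as the cycle counts in the message suggest.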
Notes: svn path=/head/; revision=152951
Diffstat (limited to 'sys')
0 files changed, 0 insertions, 0 deletions