A Review of Polynomial Representations and Arithmetic in

Download Report

Transcript A Review of Polynomial Representations and Arithmetic in

A Review of Polynomial
Representations and Arithmetic
in Computer Algebra Systems
Richard Fateman
University of California
Berkeley
cs h196 -- Polynomials
1
Polynomials force basic decisions
in the design of a CAS
• Computationally, nearly every “applied math”
object is either a polynomial or a ratio of
polynomials
• Polynomial representation(s) can make or break a
system: compactness, speed, generality
– Examples: Maple’s object too big
– SMP’s only-double-float
– Wasteful representation can make a computation
change from RAM to Disk speed (Poisson series)
cs h196 -- Polynomials
2
Polynomial Representation
• Flexibility vs. Efficiency
– Most rigid: fixed array of floating-point
numbers
– Among the most flexible: hash table of
exponents, arbitrary coefficients.
• (cos(z)+sqrt(2)* x^3*y^4*z^5:
• key is (3,4,5), entry is (+
(cos z) (expt 2 ½))
cs h196 -- Polynomials
3
A divergence in character of the
data
• Dense polynomials (1 or more vars)
– 3+4*x+5*x^2+0*x^3+9*x^4
– 1+2*x+3*y +4*x*y+ 5*x^2 +0*y^2 + 7*x^2*y
+ 8*x*y^2 + 9*x^2*y^2 + …
• Sparse (1 or more vars)
– X^100 +3*x^10 +43
– 34*x^100*y^30+1
– a+b+c+d+e
cs h196 -- Polynomials
4
Dense Representation
– Set number of variables v, say v=3 for x, y, z
– Set maximum degree d of monomials in each
variable (or d = max of all degrees)
– All coefficients represented, even if 0.
– Size of such a polynomial is (d+1)^v times the
maximum size of a coefficient
– Multidimensional arrays look good.
cs h196 -- Polynomials
5
Dense Representation almost
same as bignums..
– Set number of variables to 1, call it “10”
– All coefficients are in the set {0,…,9}
– Carry is an additional operation.
cs h196 -- Polynomials
6
Sparse Representation
– Usually new variables easy to introduce
– Many variables may not occur more than once
– Only some set S of non-zero coefficients
represented
– Linked lists, hash tables, explicit storage of
exponents.
cs h196 -- Polynomials
7
This is not a clean-cut distinction
• When does a sparse polynomial “fill in”?
• Consider powering:
is sparse
– Raise to 5th power: x^55
– (1+x^5+x^11)
+ 5 * x^49 + 5 * x^44 +
10 * x^43 + 20 * x^38 + 10 * x^37 + 10 * x^33 + 30
* x^32 + 5 * x^31 + 30 * x^27 + 20 * x^26 + x^25 +
10 * x^22 + 30* x^21 + 5 * x^20 + 20 * x^16 + 10 *
x^15 + 5 * x^11 + 10 * x^10 + 5 * x^5 + 1
– This is looking rather dense.
cs h196 -- Polynomials
8
Similarly for multivariate
• When does a sparse polynomial “fill in”?
• Consider powering:
is completely sparse
– Cube it: c^3 + 3 * b * c^2 + 3
– (a+b+c)
* a * c^2 + 3 * b^2
* c + 6 * a * b * c + 3 * a^2 * c + b^3 + 3 * a *
b^2 + 3 * a^2 * b + a^3
– This is looking completely dense.
cs h196 -- Polynomials
9
How many terms in a power?
• Dense:
– (d+1)^v raised to the power n is (n*d+1)^v.
• Sparse:
– assume complete sparse t term polynomial p=
x1+x2+…+xt.
– Size(p^n) is binomial(t+n-1,n).
• Proof: if t=2 we have binomial theorem;
– If t>2 then rewrite p= xt + (poly with 1 fewer term)
– Use induction.
cs h196 -- Polynomials
10
A digression on asymptotic
analysis of algorithms
• Real data is not asymptotically large
– Many operations on modest-sized inputs
• Counting the operations may not be right
– Are the coefficient operations constant cost?
– Is arithmetic dominated by storage allocation in small
cases?
• Benchmarking helps, as does careful counting
• Most asymptotically fastest algorithms in CAS are
actually not used
cs h196 -- Polynomials
11
Adding polynomials
• Uninteresting problem, generally
– Merge the two inputs
– Combine terms that need to be combined
• Part of multiplication: partial sums
– What if the terms are not produced in sorted
form? O(n) becomes O(n log n) or if done
naively (e.g. insert each term as generated)
perhaps O(n^2). UGH.
cs h196 -- Polynomials
12
Multiplying polynomials - I
• The way you learned in high school
– M=Size(p)*Size(q) coefficient multiplies.
– How many adds? How about this: terms
combine when they don’t appear in the answer.
The numbers of adds is Size(p*q)Size(p)*Size(q).
• Comment on the use of “*” above..
• We have formulas for the size of p, size of p^2, …
cs h196 -- Polynomials
13
Multiplying polynomials - II
• Karatsuba’s algorithm
– Polynomials f and g: if each has 2 or more terms
– Split each: if f and g are of degree d, consider
•
f = f1*x^(d/2)+f0
•
g= g1*x^(d/2)+g0
• Note that f1, f0, g1, g0 are polys of degree d/2.
• Note there are a bunch of nagging details, but it still works.
–
–
–
–
Compute A=f1*g1, C=f0*g0 (recursively!)
Compute D=(f1+f0)*(g1+g0) (recursively!)
Compute B=D-A-C
Return A*x^d+B*x^(d/2)+C (no mults).
cs h196 -- Polynomials
14
Multiplying polynomials - II
• Karatsuba’s algorithm
– Cost: in multiplications, the cost of multiplying
polys of size 2r: cost(2r) is 3*cost(r) 
cost(s)=s^log[2](3) = s^1.585… ; looking
good.
– Cost: in adds, about 5.6*s^1.585
– Costs (adds+mults) 6.6*s^1.585
– Classical adds+mults) is 2*s^2
cs h196 -- Polynomials
15
Karatsuba wins for degree > 18
1750
1500
1250
1000
750
500
250
5
10
15
cs h196 -- Polynomials
20
25
30
16
Analysis potentially irrelevant
• Consider that BOTH inputs must be of the
same degree, or time is wasted
• If the inputs are sparse, adding f1+f2 etc
probably makes the inputs denser
• A cost factor of 3 improvement is reached at
size 256.
• Coefficient operations are not really
constant time anymore.
cs h196 -- Polynomials
17
Multiplication III: evaluation
• Consider H(x)=F(x)*G(x), all polys in x
–
–
–
–
H(0)=F(0)*G(0)
H(1)=F(1)*G(1)
…
If degree(F) +degree(G) = 12, then degree(H) is
12. We can completely determine, by
interpolation, a polynomial of degree 12 by 13
values, H(0), …, H(12). Thus we compute the
product H=F*G
cs h196 -- Polynomials
18
What does evaluation cost?
• Horner’s rule evaluates a degree d polynomial in d
adds and multiplies. Assume for simplicity that
degree(F)=degree(G)=d, and H is of degree 2d.
• We need (2d+1) evals of F, (2d+1) evals of G,
each of cost d: (2d)(2d+1).
• We need (2d+1) multiplies.
• We need to compute the interpolating polynomial,
which takes O(d^2).
• Cost seems to be (2d+2)*(2d+1) +O(d^2), so it is
up above classical high-school…
cs h196 -- Polynomials
19
Can we evaluation/interpolate
faster?
• We can choose any n points to evaluate at,
so choose instead of 0,1, …. we can choose
w, w^2, w^3, … where w = nth root of unity
in some arithmetic domain.
• We can use the fast Fourier transform to do
these evaluations: not in time n^2 but
n*log(n).
cs h196 -- Polynomials
20
About the FFT
• Major digression, see notes for the way it can be
explained in a FINITE FIELD.
– Evaluation = You multiply a vector of coefficients by a
special matrix F. The form of the matrix makes it
possible to do the multiplication very fast.
– Interpolation = You multiply by the inverse of F. Also
fast.
• Even so, there is a start-up overhead, translating
the problem as given into the right domain,
translating back; rounding up the size…
cs h196 -- Polynomials
21
How to compute powers of a
polynomial: RMUL
• RMUL. (repeated multiplication)
– P^n = P * P ^(n-1)
– Algorithm: set ans:=1
• For I:=1 to n do ans:=p*ans
• Return ans.
cs h196 -- Polynomials
22
How to compute powers of a
polynomial: RSQ
• RSQ. (repeated squaring)
– P^(2*n) = (P^n) * (P^n) [compute P^n once..]
– P^(2*n+1) = P* P^(2*n) [reduce to prev. case]
cs h196 -- Polynomials
23
How to compute powers of a
polynomial: BINOM
• BINOM(binomial expansion)
– P= monomial. Fiddle with the exponents
– P= (p1+{p2+ …pd}) = (a+b)^n. Compute
sum(binomial(n,i)*a^i*b^(n-i), i=0,n).
– Computing b^2 by BINOM, b^3, … by RMUL
etc.
– Alternative: split P into two nearly-equal
pieces.
cs h196 -- Polynomials
24
Comparing RMUL and RSQ
• Using conventional n^2 multiplication, let
h=size(p). RMUL computes p, p*p, p*p^2,
…
• Cost is size(p)*size(p)+ size(p)*size(p^2)…
• For dense degree d with v variables:
– (d+1)^v * sum ((i*d+1)^v,i=1,n-1) <
(d+1)^(2v)*n^(v+1)/(v+1).
cs h196 -- Polynomials
25
Comparing RMUL and RSQ
• Using conventional n^2 multiplication, let’s
assume n=2^j.
• Cost is size(p)^2+ size(p^2)^2+ … size(p^(2^(j1)))^2
• For dense degree d with v variables:
sum ((2^i *d+1)^v,i=1,j) < (n*(d+1))^(2*v)/(3^v-1)
This is O((n*d)^(2*v)) not O(n^v*d^(2*v)) [rmul] so
RSQ loses. If we used Karatsuba multiplication, the
cost is O((n*d)^(1.585*v)) which is potentially better.
cs h196 -- Polynomials
26
There’s more stuff to analyze
• RSQ vs RMUL vs. BINOM on sparse
polynomials
• Given P^n, is it faster to compute P^(2*n)
by squaring or by multiplying by P, P, …P n
times more?
cs h196 -- Polynomials
27
FFT in a finite field
• Start with evaluating a polynomial A(x) at a
point x
cs h196 -- Polynomials
28
Rewrite the evaluation
• Horner’s rule
cs h196 -- Polynomials
29
Rewrite the evaluation
• Matrix multiplication
cs h196 -- Polynomials
30
Rewrite the evaluation
• Preprocessing (requires real arith)
cs h196 -- Polynomials
31
Generalize the task
• Evaluate at many points
cs h196 -- Polynomials
32
Primitive Roots of Unity
cs h196 -- Polynomials
33
The Fourier Transform
cs h196 -- Polynomials
34
This is what we need
• The Fourier transform
• Do it Fast and it is an FFT
cs h196 -- Polynomials
35
Usual notation
cs h196 -- Polynomials
36
Define two auxiliary polynomials
cs h196 -- Polynomials
37
So that
cs h196 -- Polynomials
38
Putting the pieces together:
•
cs h196 -- Polynomials
39
Simplifications noted:
•
cs h196 -- Polynomials
40
Re-stated…
cs h196 -- Polynomials
41
Even more succinctly
• Here is the Fourier Transform again..
cs h196 -- Polynomials
42
How much time does this take?
cs h196 -- Polynomials
43
What about the inverse?
•
cs h196 -- Polynomials
44
Why is this the inverse?
cs h196 -- Polynomials
45
An example: computing a power
cs h196 -- Polynomials
46
An example: computing a power
cs h196 -- Polynomials
47
cs h196 -- Polynomials
48