Optimal coordinates 24 May 2020 tags: math.LA, math.OC, math.DS
There was a bit of radio silence, but as indicated here, some interesting stuff is on its way.
In this post we highlight this 1976 paper by Mullis and Roberts on ’Synthesis of Minimum Roundoff Noise Fixed
Point Digital Filters’.
Let us be given some singleinput singleoutput (SISO) dynamical system
It is known that the inputoutput behaviour of any , that is, the map from to is invariant under a similarity transform.
To be explicit, the tuples and , which correspond to the change of coordinates for some , give rise to the same inputoutput behaviour.
Hence, one can define the equivalence relation by imposing that the inputoutput maps of and are equivalent.
By doing so we can consider the quotient . However, in practice, given a , is any such that equally useful?
For example, the following and are similar, but clearly, is preferred from a numerical point of view:
In what follows, we highlight the approach from Mullis and Roberts and conclude by how to optimally select . Norms will be interpreted in the sense, that is, for , . Also, in what follows we assume that plus that is stable, which will mean and that corresponds to a minimal realization.
The first step is to quantify the error. Assume we have allocated bits at our disposal to present the component of . These values for can differ, but we constrain the average by for some . Let be a 'stepsize’, such that our dynamic range of is bounded by (these are our possible representations). Next we use the modelling choices from Mullis and Roberts, of course, they are debatable, but still, we will infer some nice insights.
First, to bound the dynamic range, consider solely the effect of an input, that is, define by , . Then we will impose the bound on . In combination with the step size (quantization), this results in . Let be a sequence of i.i.d. sampled random variables from . Then we see that . Hence, one can think of as scaling parameter related to the probability with which this dynamic range bound is true.
Next, assume that all the roundoff errors are independent and have a variance of .
Hence, the worstcase variance of computating is . To evaluate the effect of this error on the output, assume for the moment that is identically .
Then, . Similar to , define as the component of . As before we can compute the variance, this time of the full output signal, which yields
. Note, these expressions hinge on the indepedence assumption.
Next, define the (infamous) matrices , by
If we assume that the realization is not just minimal, but that is a controllable pair and that is an observable pair, then, and .
Now the key observation is that and similarly .
Hence, we can write as and indeed .
Using these Lyapunov equations we can say goodbye to the infinitedimensional objects.
To combine these error terms, let say we have apply a coordinate transformation , for some .
Specifically, let be diagonal and defined by .
Then one can find that , . Where the latter expressions allows to express the output error (variance) by
Now we want to minimize over all plus some optimal selection of . At this point it looks rather painful. To make it work we first reformulate the problem using the wellknown arithmeticgeometric mean inequality for nonnegative sequences :
This inequality yields
See that the right term is independent of , hence this is a lowerbound with respect to minimization over . To achieve this (to make the inequality from an equality), we can select
Indeed, as remarked in the paper, is not per se an integer. Nevertheless, by this selection we find the clean expression from to minimize over systems equivalent to , that is, over some transformation matrix . Define a map by
It turns out that if and only if is diagonal. This follows from Hadamard's inequality.
We can use this map to write
Since the term is invariant under a transformation , we can only optimize over a structural change in realization tuple , that is, we need to make and simultaneous diagonal! It turns out that this numerically 'optimal’ realization, denoted , is what is called a principal axis realization.
To compute it, diagonalize , and define . Next, construct the diagonalization . Then our desired transformation is . First, recall that under any the pair becomes . Plugging in our map yields the transformed matrices , which are indeed diagonal.
: To show this AMGM inequality, we can use Jensen's inequality for concave functions, that is, . We can use the logarithm as our prototypical concave function on and find . Then, the result follows.
: The inequality attributed to Hadamard is slightly more difficult to show. In its general form the statement is that , for the column of . The inequality becomes an equality when the columns are mutually orthogonal. The intuition is clear if one interprets the determinant as the signed volume spanned by the columns of , which. In case , we know that there is such that , hence, by this simple observation it follows that
Equality holds when the columns are orthogonal, so must be diagonal, but must also hold, hence, must be diagonal, and thus must be diagonal, which is the result we use.
