Posts (9) containing the 'math.LA’ (Linear Algebra) tag:On the complexity of matrix multiplications |10 October 2022|
|
At last, we do a numerical test in Julia. Consider a linear system defined by the tuple : To simulate numerical problems, we round the maps and to the closest integer where is a noisy sinusoidal input. We compare a perfect (no numerical errors) realization to the rounded, denoted by , naive realization and to the optimized one . We do 100 experiments and show the mean plus the full confidence interval (of the input-output behaviour), the optimized representation is remarkably better. Numerically we observe that for we have , which is precisely where the naive struggles. |
: To show this AM-GM inequality, we can use Jensen's inequality for concave functions, that is, . We can use the logarithm as our prototypical concave function on and find . Then, the result follows.
: The inequality attributed to Hadamard is slightly more difficult to show. In its general form the statement is that , for the column of . The inequality becomes an equality when the columns are mutually orthogonal. The intuition is clear if one interprets the determinant as the signed volume spanned by the columns of , which. In case , we know that there is such that , hence, by this simple observation it follows that
Equality holds when the columns are orthogonal, so must be diagonal, but must also hold, hence, must be diagonal, and thus must be diagonal, which is the result we use.
As a follow up to the previous post, we discuss a lower bound, given as exercise P7.1.14 in (GvL13). I have never seen this bound before, especially not with the corresponding proof, so here we go. Given any , we have the following lower bound:
which can be thought of as a sharpening of the bound . In the case of a singular , the bound is trivial , but in the general setting it is less obvious. To prove that this holds we need a few ingredients. First, we need that for any we have . Then, we construct two decompositions of , (1) the Jordan Normal form , and (2) the Singular Value Decomposition . Using the multiplicative property of the determinant, we find that
Hence, since it follows that and thus we have the desired result.
(Update 8 March) It is of course interesting to get insight in how sharp this bound is, in general. To that end we consider random matrices with . When we compute the relative error for samples we obtain the histogram as shown on the left. Clearly, the relative error is not concentrated near . Nevertheless, it is an interesting bound, from a theoretical point of view. |
(GvL13) G.H. Golub and C.F. Van Loan: ‘‘Matrix Computations’’, 2013 John Hopkins University Press.
Lately, linear discrete-time control theory has start to appear in areas far beyond the ordinary. As a byproduct, papers start to surface which claim that stability of linear discrete-time systems
is characterized by . The confunsion is complete by calling this object the spectral norm of a matrix . Indeed, stability of is not characterized by , but by . If , then all absolute eigenvalues of are strictly smaller than one, and hence for , . This follows from the Jordan form of in combination with the simple observation that for , .
To continue, define two sets; and . Since we have . Now, the main question is, how much do we lose by approximating with ? Motivation to do so is given by the fact that is a norm and hence a convex function, i.e., when given a convex polytope with vertices , if , then . Note that is not a norm, can be without (consider an upper-triangular matrix with zeros on the main diagonal), the triangle inequality can easily fail as well. For example, let
Then , but , hence fails to hold. A setting where the aforementioned property of might help is Robust Control, say we want to assess if some algorithm rendered a compact convex set , filled with 's, stable. As highlighted before, we could just check if all extreme points of are members of , which might be a small and finite set. Thus, computationally, it appears to be attractive to consider over the generic .
As a form of critique, one might say that is a lot larger than . Towards such an argument, one might recall that . Indeed, if , but . Therefore, it seems like the set of for which considering over is reasonable, is negligibly small. To say a bit more, since we see that we can always find a ball with non-zero volume fully contained in . Hence, is at least locally dense in . The same holds for (which is more obvious). So in principle we could try to investigate . For , the sets agree, which degrades asymptotically. However, is this the way to go? Lets say we consider the set . Clearly, the sets and are different, even in volume, but for sufficiently large , should we care? The behaviour they parametrize is effectively the same.
We will stress that by approximating with , regardless of their volumetric difference, we are ignoring a full class of systems and miss out on a non-neglible set of behaviours. To see this, any system described by is contractive in the sense that , while systems in are merely asymptotically stable. They might wander of before they return, i.e., there is no reason why for all we must have . We can do a quick example, consider the matrices
Then , and both and have . We observe that indeed is contractive, for any initial condition on , we move strictly inside the sphere, whereas for , when starting from the same initial condition, we take a detour outside of the sphere before converging to . So although and have the same spectrum, they parametrize different systems.
In our deterministic setting this would mean that we would confine our statespace to a (solid) sphere with radius , instead of . Moreover, in linear optimal control, the resulting closed-loop system is usually not contractive. Think of the infamous pendulum on a cart. Being energy efficient has usually nothing to do with taking the shortest, in the Euclidean sense, path.
: Recall, . Then, let be some eigenvector of . Now we have . Since this eigenvalue is arbitrary it follows that .
: Let then . This follows from the being a norm.
: Clearly, if , we have . Now, when , does this imply that ? The answer is no, consider
Then, , yet, . For the full set of conditions on such that see this paper by Goldberg and Zwas.
: Recall that . This expression is clearly larger or equal to .
This short note is inspired by the beautiful visualization techniques from Duistermaat and Kolk (DK1999). Let's say we have a -dimensional linear discrete-time dynamical system , which preserves the volume of the cube under the dynamics, i.e. for any .
Formally put, this means that is part of a certain group, specifically, consider some field and define the Special Linear group by
Now, assume that we are only interested in matrices such that the cube remains bounded under the dynamics, i.e., is bounded. In this scenario we restrict our attention to . To see why this is needed, consider and both with determinant :
If we look at the image of under both these maps (for several iterations), we see that volume is preserved, but also clearly that the set is extending indefinitely in the horizontal direction.
To have a somewhat interesting problem, assume we are given with while it is our task to find a such that , hence, find feedback which not only preserves the volume, but keeps any set of initial conditions (say ) bounded over time.
Towards a solution, we might ask, given any what is the nearest (in some norm) rotation matrix? This turns out to be a classic question, starting from orthogonal matrices, the solution to is given by , where follows from the SVD of , . Differently put, when we use a polar decomposition of , , then the solution is given by . See this note for a quick multiplier proof. An interesting sidenote, problem is well-defined since is compact. To see this, recall that for any we have , hence the set is bounded, plus is the pre-image of under , , hence the set is closed as well.
This is great, however, we would like to optimize over instead. To do so, one usually resorts to simply checking the sign - and if necessary - flipping it via finding the closest matrix with a change in determinant sign. We will see that by selecting appropriate coordinates we can find the solution without this checking of signs.
For today we look at and largely follow (DK1999), for such a -dimensional matrix, the determinant requirement translates to . Under the following (invertible) linear change of coordinates
becomes , i.e., for any pair the point runs over a circle with radius . Hence, we obtain the diffeomorphism . We can use however a more pretty diffeomorphism of the sphere and an open set in . To that end, use the diffeomorphism:
for (formal way of declaring that should not be any real number) and the pair being part of (the open unit-disk). To see this, recall that the determinant is not a linear operator. Since we have a -dimensional example we can explicitly compute the eigenvalues of any (plug into the characteristic equation) and obtain:
At this point, the only demand on the eigenvalues is that . Therefore, when we would consider within the context of a discrete linear dynamical system , is either a saddle-point, or we speak of a marginally stable system (). We can visualize the set of all marginally stable , called , defined by all satisfying
(DK1999) J.J. Duistermaat and J.A.C. Kolk : ‘‘Lie Groups’’, 1999 Springer.
Using the ideas from the previous post, we immediately obtain a dynamical system to compute for any (symmetric positive definite matrices). Here we use the vector field , defined by and denote a solution by , where is the initial condition. Moreover, define a map by . Then , where is the eigenvector corresponding to the largest eigenvalue.
We can do a simple example, let with . Indeed, when given 100 , we observe convergence. To apply this in practice we obviously need to discretize the flow. To add to that, if we consider the discrete-time version of , we can get something of the form , which is indeed the Power Iteration algorithm, used to compute the largest eigenvalue of . Recall however that this algorithm needs stronger conditions (on ) to work than our flow. |
Consider an unknown dynamical system parametrized by . In this short note we look at a relatively simple, yet neat method to assess stability of via finite calls to a particular oracle. Here, we will follow the notes by Roman Vershynin (RV2011).
Assume we have no knowledge of , but we are able to obtain for any desired . It turns out that we can use this oracle to bound from above.
First, recall that since is symmetric, . Now, the common definition of is . The constraint set is not per se convenient from a computational point of view. Instead, we can look at a compact constraint set: , where is the sphere in . The latter formulation has an advantage, we can discretize these compact constraints sets and tame functions over the individual chunks.
To do this formally, we can use -nets, which are convenient in quantitative analysis on compact metric spaces . Let be a -net for if . Hence, we measure the amount of balls to cover .
For example, take as our metric space where we want to bound (the cardinality, i.e., the amount of balls). Then, to construct a minimal set of balls, consider an -net of balls , with . Now the insight is simply that . Luckily, we understand the volume of -dimensional Euclidean balls well such that we can establish
Which is indeed sharper than the bound from Lemma 5.2 in (RV2011).
Next, we recite Lemma 5.4 from (RV2011). Given an -net for with we have
This can be shown using the Cauchy-Schwartz inequality. Now, it would be interesting to see how sharp this bound is. To that end, we do a -dimensional example (). Note, here we can easily find an optimal grid using polar coordinates. Let the unknown system be parametrized by
such that . In the figures below we compute the upper-bound to for several nets . Indeed, for finer nets, we see the upper-bound approaching . However, we see that the convergence is slow while being inversely related to the exponential growth in .
Overall, we see that after about a 100 calls to our oracle we can be sure about being exponentially stable. Here we just highlighted an example from (RV2011), at a later moment we will discuss its use in probability theory.
In this first post we will see that the ‘‘almost surely’’ in the title is not only hinting at stochasticity.
For a long time this picture - as taken from Gabriel Peyré his outstanding twitter feed - has been on a wall in our local wildlife (inc. hedgehogs) shelter. This has lead to a few fun discussions and here we try to explain the picture a bit more.
Let . Now, we want to find the largest ball within . We can find that is given by with . This follows the most easily from geometric intuition.
To prove it, see that has ‘‘similar’’ faces, without loss of generality we take the one in the positive orthant of . Now we fit a -dimensional plane (hyperplane) to this face and find the point the closest (in -norm) to . This hyperplane can be parametrized by for and . Thus the normal (direction) is given by . To find , observe that for we have . Hence we have . Now, let and define the other unit vectors correspondingly. Then, since for any we can think of a (skinny) hedgehog indeed for . The inscribed ball shrinks to , while the spikes remain. |
Besides being a fun visualization of concentration, this picture means a bit more to me. For years I have been working at the Wildlife shelter in Delft and the moment has come where the lack of financial support from surrounding municipalities is becoming critical. As usual, this problem can be largely attributed to bureaucracy and the exorbitant amount of managers this world has. Nevertheless, the team in Delft and their colleagues around the country do a fantastic job in spreading the word (see Volkskrant, RTLnieuws). Lets hope Delft, and it neighbours, finally see value in protecting the local wildlife we still have.