Discussion

$Professor Varga$
Prof. Richard S. Varga

It seems like ages ago that Professor Richard S. Varga and I wrote a Java applet (Java version 1.04!) together with Alan Krautstengl for the journal ETNA[9]. I've updated that old work for the modern era here. Consider this a small dedication to Professor Varga as thanks for his wonderful enthusiasm for—and significant contributions to—mathematics.

The plot above illustrates two eigenvalue inclusion methods. The methods produce sets in the complex plane that are guaranteed to contain the eigenvalues of a matrix. The methods are cheaper to compute than precise eigenvalue estimates for large matrices, but the inclusion sets might be big.

Gerschgorin's theorem[4] defines perhaps the most famous eigenvalue inclusion method. Let $A = [a_{ij}] \in \mathbb{C}^{n\times n}$ be an $n\times n$ matrix and let $R_i$ be its $i$th off-diagonal row sum defined by $R_i = \sum_{j\ne i} |a_{ij}|$. The Gerschgorin disks are the set \[ \bigcup_{i=1}^n \{z\in\mathbb{C}: |z-a_{ii}| \le R_i\}, \] and the Gerschgorin theorem states that the above set contains the eigenvalues of $A$. (The theorem is elegant but elementary, try to prove it before looking it up!)

Gerschgorin's theorem says that if we replace a matrix with only two numbers per row, the diagonal entry and off-diagonal row sum, then we can still bound where its eigenvalues will be!

It's easy to see that Gerschgorin's theorem also applies to off-diagonal column sums of $A$ (by considering $A^H$, for example). Indeed, the intersection of the Gerschgorin sets of $A$ and those of $A^H$ may provide tighter bounds than illustrated here. One can further consider how tight one can make Gerschgorin-based estimates of eigenvalues, and Professor Varga has had a lot to say about this[8].

Each Gerschgorin disk uses data from one row of the matrix $A$. What if we consider two rows at a time? Can we find sets that contain $A$'s eigenvalues similarly to the Gerschgorin theorem? This question was answered (affirmatively) by Alfred Brauer in 1947 when he came up with sets that he called the ovals of Cassini[1].

Let the $ij$-th oval of Cassini of the matrix $A$ be defined by: \[ K_{ij}(A) = \{z\in\mathbb{C}: |z-a_{ii}|\cdot|z-a_{jj}| \le R_iR_j\}, \] where $i,j=1,2,\ldots,n$, $i\ne j$, and $R_i$ is the $i$th off-diagonal row sum of $A$ defined above. There are $n(n-1)/2$ total ovals of Cassini for an $n\times n$ matrix. That is generally more than the total number of $n$ Gerschgorin disks—except, notably, in the interactive example above. Brauer showed that the eigenvalues of the matrix $A$ are included in the union of all its ovals of Cassini.

Are the ovals of Cassini better at bounding eigenvalues than Gerschgorin diks? The answer is yes! The ovals of Cassini are always at least as good as the Gerschgorin disks at estimating the eigenvalues of any matrix, a result first mentioned by Brauer in 1958[2] but only sparsely elsewhere (see [7]). (As you play with the small interactive example above, you will see that the ovals of Cassini always lie within the Gerschgorin disks.) This result is not so well-known. Varga([7] and [6]) and Varga and Krautstengl[9] show even more—that under very general conditions the ovals of Cassini define optimal (tightest) eigenvalue inclusion sets among methods that use the same amount information.

What about considering three or more rows of a Matrix at a time? Can the inclusion bounds be improved even more? Remarkably, in general the answer is no! Generalizations to three or more rows at a time might not even contain all the eigenvalues. This is shown with a counter-example in Horn and Johnson's book[5, p. 382]. However, see Brualdi's really interesting paper [3] for an extension of Gerschgorin and Brauer's ideas to more rows that does work for special matrices related to directed graphs.

Technology

The plotting program is written in Javascript and relies on the superb d3.js library. LaTeX expressions are rendered using the MathJax library. All source code files are available on Github at https://github.com/bwlewis/cassini. You can view a live version of this document there at: http://bwlewis.github.io/cassini/.

Complex numbers

I considered many available numeric libraries for Javascript, including the very admirable NumericJavascript project. After many experiments, I was never fully satisfied, mostly because few (none that I could find) projects support eigenvalue solvers for complex-valued matrices.

In the end, I broke my usual rule of not re-inventing the wheel and implemented a really simple and tiny complex-value type and supporting library for Javascript myself available here: https://github.com/bwlewis/cassini/blob/master/complex.js. Feel free to fork it, use it, whatever. It's pretty basic but it does support complex-valued matrix multiplication, complex-valued matrix QR decomposition, and a few other things.

Plotting Brauer's ovals of Cassini

This section outlines the approach the interactive example takes to numerically estimate and display the ovals of Cassini.

Let's consider just one oval using rows 1 and 2 of the matrix $A$. We parameterize the corresponding oval of Cassini by the following process:

Let $w=A_{11}$, and set $A^{(1)} = A - wI$ (that is shift $A$ so that $A_{11}$ is zero).
Let $A^{(2)} = \gamma I\cdot A^{(1)}$, where $\gamma = -\arg A^{(1)}_{22}$ (this rotates the diagonal entry $A^{(1)}_{22}$ to the real line).
Let $A^{(3)} = A^{(2)} - \alpha I$, where $\alpha = A^{(2)}_{22}/2$. The transformed matrix $A^{(3)}$ now looks like: \[ \left( \begin{matrix} -\alpha & * & *\\ * & \alpha & *\\ * & * & *\\ \end{matrix} \right). \] Note that the off-diagonal row-sums remain unchanged from the original matrix.
Brauer's formula for defining the corresponding $K_{12}(A^{(3)})$ oval now simplifies a bit: \[ \{z\in\mathbb{C}: |z-\alpha|\cdot|z+\alpha| \le R_1R_2\} = \{z\in\mathbb{C}: |z^2 -\alpha^2| \le R_1R_2\}, \] and if we only consider points on the boundary of the oval, we're interested in: \[ \{z\in\mathbb{C}: |z^2 -\alpha^2| = R_1R_2\}. \]
Write the complex number $z$ in polar form, $z=r e^{i\theta}$, and let $\theta$ vary between $0 \le \theta \lt 2\pi$ to parameterize the problem. Our task is, for each value of $\theta$, solve for $r$ to get a point on the boundary of the shifted and rotated oval. The law of cosines helps us out here. Consider the picture to the right. The law of cosines informs us that \[ r^4 - 2r^2\alpha^2\cos2\theta + \alpha^4 - (R_1R_2)^2 = 0. \] For each value of $\theta$, we need to solve a monic fourth degree polynomial with real-valued coefficients to find possible values of $r$ to determine points on the boundary of the oval of Cassini. The Javascript program https://github.com/bwlewis/cassini/blob/master/cassini.js proceeds for many discrete values of $\theta$ between zero and $2\pi$, and for each of those directly solves for the real-valued roots of the corresponding polynomial. This gives us points on the boundary of the corresponding shifted and rotated oval of Cassini.
Apply the inverse shifts and rotations from steps 3—1 to the solution points to obtain points on the boundary of the original oval.

Notes and some additional difficulties

Eigenvalue solvers applied to an appropriate companion matrix are the normal way to find roots of higher-order polynomials. But in our case, the 4th degree polynomial was simple enough (monic, real-valued coefficients) to admit a fast direct algebraic solution. If you want to extend this to larger matrices, you'll need to use an iterative eigenvalue solver instead.

Finding points on the boundary of the ovals of the Cassini is one problem, and properly displaying them an entirely different one. Depending on how far apart the diagonal elements are and the relative sizes of the off-diagonal row sums, the ovals can form many interesting shapes including figure eights and two distinct sub-ovals. In the latter case the roots of the corresponding polynomial consist of four real values instead of two. When this happens, we have to be careful to order the points carefully to draw ordered contiguous boundary point values in d3.js or else the figure will zig/zag between boundary values and not get filled in properly. The code goes to some effort to get this right, but I think that there are still some edge cases where it's not quite working perfectly yet!

Gerschgorin Disks and Brauer's ovals of Cassini

Discussion

Technology

Complex numbers

Plotting Brauer's ovals of Cassini

Notes and some additional difficulties

Author

References