shh.csparse
Class MPDQ

Object
  extended by MPDQ
All Implemented Interfaces:
HopcroftTarjan.Graph

public class MPDQ
extends Object
implements HopcroftTarjan.Graph

Finds the Smith normal form of its argument, working in arbitrary precision over Z while aiming to avoid fill-in and entry explosion in the sparse matrix.

Notation.

If the argument is M, the Smith normal form is
M = P D Q.
Here P and Q are square matrices over Z of determinant ±1. They are represented as LinkedLists of ElemMatrixs; P is the product of the matrices in getPList(), and Q is the product of the matrices in getQList(), both in the usual left-to-right order. The matrix D, returned by getD(), is a diagonal matrix
D = diag(d0, d1, d2, ...),
over Z where d0 | d1 and d1 | d2, etc., and all di ≥ 0 (possibly with dj = dj+1 = ... = 0 for some j).
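
The conditions on D can be checked mechanically. The following is a minimal sketch, assuming the diagonal has already been extracted into a BigInteger array; the class and method names are hypothetical and are not part of MPDQ.

```java
import java.math.BigInteger;

final class SmithDiagonalCheck {
    /** True if every d[i] is >= 0 and d[i] divides d[i+1] (0 divides only 0). */
    static boolean isSmithDiagonal(BigInteger[] d) {
        for (int i = 0; i < d.length; i++) {
            if (d[i].signum() < 0) return false;
            if (i + 1 < d.length) {
                boolean divides = d[i].signum() == 0
                        ? d[i + 1].signum() == 0
                        : d[i + 1].mod(d[i]).signum() == 0;
                if (!divides) return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        BigInteger[] d = {BigInteger.ONE, BigInteger.valueOf(3), BigInteger.valueOf(6), BigInteger.ZERO};
        System.out.println(isSmithDiagonal(d));
    }
}
```

Once some dj is 0, the divisibility condition forces every later entry to be 0 as well, which is why the zeros collect at the end of the diagonal.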

The constructor fires the Smith normal form computation automatically.

The basic Smith normal form algorithm is in jac(). Let corner = 0. Choose a pivot, make it positive, and move it to the (corner, corner) position. Add multiples of the pivot row to the rows beneath it until the entries below the pivot are as small in absolute value as possible. Do the same to the columns to the right of the pivot. If any entry to the lower right of the pivot is not divisible by the pivot, then make the gcd of these two numbers the new pivot, and repeat. If all the entries to the lower right of the pivot are divisible by the pivot, increment corner and repeat.
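
The basic loop can be sketched on a small dense matrix. This is an illustration only, not the code in jac(): it assumes dense long[][] storage (so it can overflow on large entries), always picks the entry of least absolute value as the pivot, and brings the gcd forward by adding the offending row to the pivot row. The class and method names are hypothetical.

```java
import java.util.Arrays;

final class SmithSketch {
    /** Diagonalizes a in place with row and column operations; returns d0, d1, ... */
    static long[] smithDiagonal(long[][] a) {
        int rows = a.length, cols = rows == 0 ? 0 : a[0].length;
        long[] d = new long[Math.min(rows, cols)];
        for (int corner = 0; corner < d.length; corner++) {
            while (true) {
                // Choose a pivot: the nonzero entry of least absolute value.
                int pr = -1, pc = -1;
                for (int i = corner; i < rows; i++)
                    for (int j = corner; j < cols; j++)
                        if (a[i][j] != 0 && (pr < 0 || Math.abs(a[i][j]) < Math.abs(a[pr][pc]))) {
                            pr = i;
                            pc = j;
                        }
                if (pr < 0) return d; // only zeros remain, so the rest of D is 0
                // Move the pivot to (corner, corner) and make it positive.
                long[] t = a[corner]; a[corner] = a[pr]; a[pr] = t;
                for (long[] row : a) { long x = row[corner]; row[corner] = row[pc]; row[pc] = x; }
                if (a[corner][corner] < 0)
                    for (int j = 0; j < cols; j++) a[corner][j] = -a[corner][j];
                long p = a[corner][corner];
                boolean clean = true;
                // Row step: subtract multiples of the pivot row from the rows beneath it.
                for (int i = corner + 1; i < rows; i++) {
                    long q = nearest(a[i][corner], p);
                    for (int j = corner; j < cols; j++) a[i][j] -= q * a[corner][j];
                    clean &= a[i][corner] == 0;
                }
                // Column step: the same for the columns to the right of the pivot.
                for (int j = corner + 1; j < cols; j++) {
                    long q = nearest(a[corner][j], p);
                    for (int i = corner; i < rows; i++) a[i][j] -= q * a[i][corner];
                    clean &= a[corner][j] == 0;
                }
                if (!clean) continue; // a smaller nonzero remainder becomes the next pivot
                // Divisibility check on the block to the lower right of the pivot.
                int bad = -1;
                for (int i = corner + 1; i < rows && bad < 0; i++)
                    for (int j = corner + 1; j < cols; j++)
                        if (a[i][j] % p != 0) { bad = i; break; }
                if (bad < 0) { d[corner] = p; break; }
                // Add the offending row to the pivot row; the next pass shrinks the pivot.
                for (int j = corner; j < cols; j++) a[corner][j] += a[bad][j];
            }
        }
        return d;
    }

    /** Quotient rounded to nearest, so the remainder x - q*p lies in [-p/2, p/2). */
    static long nearest(long x, long p) {
        return Math.floorDiv(2 * x + p, 2 * p);
    }

    public static void main(String[] args) {
        long[][] m = {{2, 4, 4}, {-6, 6, 12}, {10, -4, -16}};
        System.out.println(Arrays.toString(smithDiagonal(m)));
    }
}
```

The sketch records neither P nor Q and ignores sparsity entirely; it is meant only to make the corner/pivot/repeat structure concrete.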

Our algorithm.

The main point of innovation in Sheafhom is how the pivot is chosen. The method is tuned to our applications like EquCoh.main(java.lang.String[]), where n is much bigger than m, sparsity starts at under 0.1%, and almost all non-zero entries have initial value ±1. Crawling is better than crashing: we are willing to use a slow algorithm if it takes relatively little space. Assume we are not remembering Q, so we can perform column operations without spending the memory to record them. At each "repeat" stage we do the following.
  1. Sort the columns by COL_MARKOWITZ. This puts the sparsest columns to the left, and upper-triangularizes as much as possible with that constraint, producing shapes like
    ooooXoooX
    oooXoooXX
    ooXoooXXo
    oXoooXXoo
    XooooXooo
        
  2. Do a greedy search, in column-major order, for a pivot of smallest absolute value. A priori, this means we do no Markowitz stuff to reduce fill-in. However, the columns are sorted by COL_MARKOWITZ, and in our applications there are lots of entries with value ±1. That means the first pivot we find by a greedy search will probably have value ±1 and be fairly good in the Markowitz sense. (In the literature of sparse matrices over R, "Markowitz" means to choose a pivot minimizing (length of pivot row - 1) × (length of pivot column - 1).)
  3. Do a single row- and column-reduction step using the pivot.
  4. When the matrix becomes too dense, and especially too dense with arbitrary-precision entries (SparseEltZBig as opposed to SparseEltZInt), it is usually hopeless to preserve sparsity. Hence we change strategies.
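
Step 2 and the Markowitz cost quoted there can be sketched as follows. This is a hedged illustration, not the code in getPivotGreedy(): Col is a hypothetical stand-in for a sparse column, the columns are assumed to be sorted already, and stopping at the first entry of absolute value 1 is an assumption.

```java
import java.util.Arrays;
import java.util.List;

final class GreedyPivotSketch {
    /** Hypothetical sparse column: row indices with their matching nonzero values. */
    static final class Col {
        final int[] rows;
        final long[] vals;
        Col(int[] rows, long[] vals) { this.rows = rows; this.vals = vals; }
    }

    /** Column-major scan; returns {column position, row index} of the first smallest entry, or null. */
    static int[] greedyPivot(List<Col> cols) {
        int[] best = null;
        long bestAbs = Long.MAX_VALUE;
        for (int c = 0; c < cols.size(); c++) {
            Col col = cols.get(c);
            for (int k = 0; k < col.rows.length; k++) {
                long abs = Math.abs(col.vals[k]);
                if (abs < bestAbs) {
                    bestAbs = abs;
                    best = new int[] {c, col.rows[k]};
                    if (abs == 1) return best; // no nonzero entry can be smaller than a unit
                }
            }
        }
        return best;
    }

    /** Markowitz cost: (length of pivot row - 1) * (length of pivot column - 1). */
    static long markowitzCost(int rowLength, int colLength) {
        return (long) (rowLength - 1) * (colLength - 1);
    }

    public static void main(String[] args) {
        List<Col> cols = Arrays.asList(
                new Col(new int[] {4}, new long[] {3}),
                new Col(new int[] {3, 4}, new long[] {-1, 2}));
        System.out.println(Arrays.toString(greedyPivot(cols)));
    }
}
```

Because the sparsest columns come first, an early ±1 hit tends to sit in a short column, which keeps the column factor of the Markowitz cost small even though the cost is never computed during the search.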

The user is invited to change the algorithms in this class to suit his/her purposes. More comments than usual have been left in the code so you can see alternative ideas the author has tried.

Graphics.

The class offers graphs and windows to help you understand the progress of the computation. See SHOW_METERS.

The implementation of HopcroftTarjan.Graph is used in upperTriangularize().

Many methods will throw a ClassCastException if not all the entries of M are over Z (i.e., of class SparseEltZ).

Author:
Mark McConnell

Field Summary
static boolean ALLOW_LLL
          To disallow any use of LLL, set this to false.
 Comparator COL_MARKOWITZ
          Compares two SparseVs, first by number of entries (smallest goes to the left), then by initial entry's index (largest goes to the left).
 Comparator COL_NORMSQ
          Compares two SparseVs, putting first the one that is shorter under SparseV.getNormSq().
static int LLL_WIDTH
          Do LLL to at most this many columns at a time.
 Comparator MIN_LAST_INDEX
          Compares two SparseVs, putting to the left the one whose SparseV.getLastIndex() is smaller.
static String PIVOT_METHOD
          A short human-readable description of the algorithm in getPivot().
static int SHOW_CSW
          Display a CSparseWin, a window showing the sparsity pattern of the matrix as it changes in real time.
static int SHOW_METERS
          To see meters and graphs showing the computation's history, set this variable (before constructing an MPDQ) to the bitwise OR of the appropriate SHOW_xxx constants in this class.
static int SHOW_PROG
          Show a ProgressMonitor, a dialog box with a few statistics and with a progress bar indicating how far along the computation is.
static int SHOW_STATS
          Show StatWins with line graphs showing the sparsity, number of P's and Q's, etc., as they change throughout the computation.
 
Constructor Summary
MPDQ(CSparse mat)
          Constructs an MPDQ that remembers both P and Q and works non-destructively.
MPDQ(CSparse mat, boolean useP, boolean useQ, boolean destructive)
          Basic constructor.
 
Method Summary
static double bitLength(CSparse A)
          Sums SparseElt.bitLength() for all the entries, and returns the sum in kilobytes (kB).
 SparseEltZ det()
          Returns a SparseEltZ whose value is the determinant of M and whose index is unspecified.
 String dToString()
          Prints D in human-readable form.
 BigInteger[] elemDivisors()
          Returns an array of the elementary divisors other than 1 or 0.
 Iterator enumEdgesOn(Object v)
          Returns an Iterator of all the edges on v.
 CSparse getD()
          See the comment on the class MPDQ.
 CSparse getMat()
          Returns the original matrix this MPDQ was asked to simplify, or null if the computation was destructive.
 Object getOtherVertexOn(Object e, Object v)
          If v is one of the vertices of edge e, returns the other vertex.
protected  MPDQ.Pivot getPivot()
          Returns the next Pivot to use, or null if there are no more entries.
protected  MPDQ.Pivot getPivotGreedy()
          Finds the smallest pivot in absolute value.
 LinkedList getPList()
          See the comment on the class MPDQ.
 LinkedList getQList()
          See the comment on the class MPDQ.
protected  void jac()
          The main algorithm for Smith normal form computations.
 int numOnes()
          The number of 1's down the diagonal of D.
 int rank()
          Returns the rank of M (the dimension over Q of its image).
 String torsion()
          Returns a pretty-print version of the elementary divisors other than 1 or 0.
 void upperTriangularize()
          A greedy upper-triangularization algorithm.
 
Methods inherited from class Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

SHOW_METERS

public static int SHOW_METERS
To see meters and graphs showing the computation's history, set this variable (before constructing an MPDQ) to the bitwise OR of the appropriate SHOW_xxx constants in this class.
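
Combining the flags might look like the sketch below. The numeric values are hypothetical placeholders chosen as distinct single bits (the real values are listed under Constant Field Values), and the fields are redeclared locally so the example runs on its own.

```java
final class MeterFlagsSketch {
    // Hypothetical values; assumed to be distinct single-bit flags.
    static final int SHOW_PROG = 1;
    static final int SHOW_STATS = 2;
    static final int SHOW_CSW = 4;

    static int SHOW_METERS = 0;

    /** True if the given flag is switched on in flags. */
    static boolean isSet(int flags, int flag) {
        return (flags & flag) != 0;
    }

    public static void main(String[] args) {
        // Ask for the progress monitor and the statistics windows, but not the CSparseWin.
        SHOW_METERS = SHOW_PROG | SHOW_STATS;
        System.out.println(isSet(SHOW_METERS, SHOW_STATS) + " " + isSet(SHOW_METERS, SHOW_CSW));
    }
}
```

With the real class, the corresponding assignment would presumably be MPDQ.SHOW_METERS = MPDQ.SHOW_PROG | MPDQ.SHOW_STATS, made before the constructor is called.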


SHOW_PROG

public static final int SHOW_PROG
Show a ProgressMonitor, a dialog box with a few statistics and with a progress bar indicating how far along the computation is.

See Also:
Constant Field Values

SHOW_STATS

public static final int SHOW_STATS
Show StatWins with line graphs showing the sparsity, number of P's and Q's, etc., as they change throughout the computation.

See Also:
Constant Field Values

SHOW_CSW

public static final int SHOW_CSW
Display a CSparseWin, a window showing the sparsity pattern of the matrix as it changes in real time. This slows down the computation, but minimizing the window eliminates the slowdown.

See Also:
Constant Field Values

ALLOW_LLL

public static boolean ALLOW_LLL
To disallow any use of LLL, set this to false. We use LLL in general situations to reduce coefficient explosion when the matrix has become dense and full of redundant columns. But sometimes it's better to avoid LLL, e.g., when the matrix is known to have near-full rank.


PIVOT_METHOD

public static final String PIVOT_METHOD
A short human-readable description of the algorithm in getPivot().

See Also:
Constant Field Values

LLL_WIDTH

public static int LLL_WIDTH
Do LLL to at most this many columns at a time. We use LLL in general situations to reduce coefficient explosion when the matrix has become dense and full of redundant columns. You can't do it to too many columns at a time, though, because of speed and memory.


COL_MARKOWITZ

public final Comparator COL_MARKOWITZ
Compares two SparseVs, first by number of entries (smallest goes to the left), then by initial entry's index (largest goes to the left).
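
The ordering can be illustrated with a stand-in for SparseV. In this sketch each column is just the increasing array of row indices of its nonzero entries; treating an empty column as having initial index -1 is an assumption, and the class name is hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;

final class ColMarkowitzSketch {
    /** Fewest entries first; among equals, the larger initial index goes to the left. */
    static final Comparator<int[]> COL_MARKOWITZ = (a, b) -> {
        if (a.length != b.length) return Integer.compare(a.length, b.length);
        return Integer.compare(firstIndex(b), firstIndex(a));
    };

    /** Index of the initial entry, or -1 for an empty column (an assumption of this sketch). */
    static int firstIndex(int[] col) {
        return col.length == 0 ? -1 : col[0];
    }

    public static void main(String[] args) {
        int[][] cols = {{0, 3}, {4}, {2}, {1, 2, 3}};
        Arrays.sort(cols, COL_MARKOWITZ);
        System.out.println(Arrays.deepToString(cols));
    }
}
```

Putting the larger initial index to the left among equally sparse columns is what produces the rising staircase of X's in the class comment.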


MIN_LAST_INDEX

public final Comparator MIN_LAST_INDEX
Compares two SparseVs, putting to the left the one whose SparseV.getLastIndex() is smaller.


COL_NORMSQ

public final Comparator COL_NORMSQ
Compares two SparseVs, putting first the one that is shorter under SparseV.getNormSq(). Assumes SparseV.updateNormSq() has been called on any vectors it will be given.

Constructor Detail

MPDQ

public MPDQ(CSparse mat,
            boolean useP,
            boolean useQ,
            boolean destructive)
Basic constructor.

Parameters:
mat - The matrix whose Smith normal form we want.
useP - If true, construct the LinkedList of ElemMatrixs whose product is P. If false, throw away the P data.
useQ - If true, construct the LinkedList of ElemMatrixs whose product is Q. If false, throw away the Q data. Since CSparses are stored by columns, setting useQ to false also enables the most important optimizations.
destructive - If false, don't overwrite mat, but rather do the work on a copy. If true (recommended), mat is overwritten, and will hold D at the end of the computation.

MPDQ

public MPDQ(CSparse mat)
Constructs an MPDQ that remembers both P and Q and works non-destructively.

Method Detail

getMat

public CSparse getMat()
Returns the original matrix this MPDQ was asked to simplify, or null if the computation was destructive.


getD

public CSparse getD()
See the comment on the class MPDQ.


getPList

public LinkedList getPList()
See the comment on the class MPDQ. Returns null if we didn't ask the constructor to use P.


getQList

public LinkedList getQList()
See the comment on the class MPDQ. Returns null if we didn't ask the constructor to use Q.


rank

public int rank()
Returns the rank of M (the dimension over Q of its image).


det

public SparseEltZ det()
Returns a SparseEltZ whose value is the determinant of M and whose index is unspecified.

If you call this method when the matrix is not square, the return value is null.

Some of the reduction algorithms for large matrices may return the determinant with the wrong sign.
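
Since P and Q have determinant ±1, the determinant of a square M agrees with the product of the diagonal of D up to sign. The sketch below computes only that absolute value; it is not the implementation of det(), and its names and its BigInteger representation of the diagonal are assumptions.

```java
import java.math.BigInteger;

final class DetSketch {
    /** Returns |det M| as the product of the diagonal of D, or null if M is not square. */
    static BigInteger absDet(int rows, int cols, BigInteger[] diag) {
        if (rows != cols) return null;
        BigInteger product = BigInteger.ONE;
        for (BigInteger d : diag) product = product.multiply(d);
        return product;
    }

    public static void main(String[] args) {
        BigInteger[] diag = {BigInteger.valueOf(2), BigInteger.valueOf(6), BigInteger.valueOf(12)};
        System.out.println(absDet(3, 3, diag));
    }
}
```

Recovering the sign requires tracking the determinant of every elementary operation applied along the way, so any reduction step that does not track it can leave the sign undetermined.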


bitLength

public static double bitLength(CSparse A)
Sums SparseElt.bitLength() for all the entries, and returns the sum in kilobytes (kB).

Throws:
ClassCastException - Unless all the entries are SparseEltZs.
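
As a rough sketch of the bookkeeping, the following sums bit lengths over a collection of entries. It assumes that SparseElt.bitLength() behaves like BigInteger.bitLength() and that a kilobyte means 1024 bytes of 8 bits; neither is confirmed here, and the names are hypothetical.

```java
import java.math.BigInteger;
import java.util.Arrays;

final class BitLengthSketch {
    /** Sums the bit lengths of the entries and converts to kB (assumed: 8 bits/byte, 1024 bytes/kB). */
    static double kiloBytes(Iterable<BigInteger> entries) {
        long bits = 0;
        for (BigInteger e : entries) {
            bits += e.bitLength(); // excludes the sign bit; 0 contributes 0 bits
        }
        return bits / 8.0 / 1024.0;
    }

    public static void main(String[] args) {
        System.out.println(kiloBytes(Arrays.asList(BigInteger.valueOf(255), BigInteger.valueOf(-7))));
    }
}
```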

jac

protected void jac()
The main algorithm for Smith normal form computations.

Since 1980, the author has called this method jac because he learned it as Theorem 3.8 of Basic Algebra I by Nathan Jacobson. During his college and grad school days, the author collected nine stuffed panda bears, one for each year of school; the panda for 1979-80 is named Jacobson.


dToString

public String dToString()
Prints D in human-readable form. An example is
[5037 1's] 3 3 6 [205 0's]
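
A summary in this style can be produced by counting the 1's and 0's and listing everything else. The sketch below is illustrative only: it takes the diagonal as a BigInteger array, prints the counts as digits, and uses hypothetical names.

```java
import java.math.BigInteger;

final class DToStringSketch {
    /** Collapses the runs of 1's and 0's on the diagonal and lists the other entries between them. */
    static String describe(BigInteger[] diag) {
        int ones = 0, zeros = 0;
        StringBuilder middle = new StringBuilder();
        for (BigInteger d : diag) {
            if (d.equals(BigInteger.ONE)) ones++;
            else if (d.signum() == 0) zeros++;
            else middle.append(d).append(' ');
        }
        return "[" + ones + " 1's] " + middle + "[" + zeros + " 0's]";
    }

    public static void main(String[] args) {
        BigInteger[] diag = {BigInteger.ONE, BigInteger.ONE, BigInteger.valueOf(3),
                BigInteger.valueOf(3), BigInteger.valueOf(6), BigInteger.ZERO};
        System.out.println(describe(diag));
    }
}
```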


torsion

public String torsion()
Returns a pretty-print version of the elementary divisors other than 1 or 0. Examples: [3 3 6], or [] for no torsion.


elemDivisors

public BigInteger[] elemDivisors()
Returns an array of the elementary divisors other than 1 or 0.
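
Given the diagonal of D, both this array and the string returned by torsion() amount to filtering out the 1's and 0's. A minimal sketch, assuming all entries are ≥ 0 and using hypothetical names:

```java
import java.math.BigInteger;
import java.util.Arrays;
import java.util.stream.Collectors;

final class TorsionSketch {
    /** Keeps the diagonal entries greater than 1, in order. */
    static BigInteger[] elemDivisors(BigInteger[] diag) {
        return Arrays.stream(diag)
                .filter(d -> d.compareTo(BigInteger.ONE) > 0)
                .toArray(BigInteger[]::new);
    }

    /** Formats the elementary divisors as "[3 3 6]", or "[]" when there is no torsion. */
    static String torsion(BigInteger[] diag) {
        return Arrays.stream(elemDivisors(diag))
                .map(BigInteger::toString)
                .collect(Collectors.joining(" ", "[", "]"));
    }

    public static void main(String[] args) {
        BigInteger[] diag = {BigInteger.ONE, BigInteger.valueOf(3), BigInteger.valueOf(3),
                BigInteger.valueOf(6), BigInteger.ZERO};
        System.out.println(torsion(diag));
    }
}
```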


numOnes

public int numOnes()
The number of 1's down the diagonal of D.


getPivot

protected MPDQ.Pivot getPivot()
Returns the next Pivot to use, or null if there are no more entries.


getPivotGreedy

protected MPDQ.Pivot getPivotGreedy()
Finds the smallest pivot in absolute value.

This method does no Markowitz stuff to reduce fill-in. Nevertheless, if the columns are sorted by COL_MARKOWITZ and there are lots of entries with value ±1, then the pivot it finds will probably have value ±1 and be fairly good in the Markowitz sense.


upperTriangularize

public void upperTriangularize()
A greedy upper-triangularization algorithm. When it has a choice, it uses a random-number generator to make it not completely stupid. Only rows and columns with index ≥ corner are affected.


enumEdgesOn

public Iterator enumEdgesOn(Object v)
Description copied from interface: HopcroftTarjan.Graph
Returns an Iterator of all the edges on v.

Specified by:
enumEdgesOn in interface HopcroftTarjan.Graph

getOtherVertexOn

public Object getOtherVertexOn(Object e,
                               Object v)
Description copied from interface: HopcroftTarjan.Graph
If v is one of the vertices of edge e, returns the other vertex. Returns null if v is not on e, so the method also tests whether v is on e at all.
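
The contract is easy to satisfy when an edge simply stores its two endpoints. The Edge class below is hypothetical; the edge and vertex types that MPDQ actually passes to HopcroftTarjan are not described on this page.

```java
final class EdgeSketch {
    /** Hypothetical undirected edge holding its two endpoints. */
    static final class Edge {
        final Object a, b;
        Edge(Object a, Object b) { this.a = a; this.b = b; }
    }

    /** Returns the endpoint of e other than v, or null if v is not on e. */
    static Object getOtherVertexOn(Object e, Object v) {
        Edge edge = (Edge) e;
        if (edge.a.equals(v)) return edge.b;
        if (edge.b.equals(v)) return edge.a;
        return null;
    }

    public static void main(String[] args) {
        Edge e = new Edge("row 3", "column 7");
        System.out.println(getOtherVertexOn(e, "row 3"));
    }
}
```

A caller can therefore use a null result as the membership test, without a separate method for asking whether v lies on e.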

Specified by:
getOtherVertexOn in interface HopcroftTarjan.Graph