Opened 13 years ago

Closed 13 years ago

#309 closed task (fixed)

Investigate faster sparse matrix multiplication packages for Simulated Annealing

Reported by: Kevin Milner
Owned by: Kevin Milner
Priority: major
Milestone:
Component: UCERF3
Version:
Keywords:
Cc: pagem@…


I'm creating a ticket for this to document my findings. I'll post benchmarks here for various Sparse Matrix implementations.

Also, only the A matrix is sparse, so it could potentially be multiplied by a dense implementation of the other operand instead of pairing two sparse implementations.

Change History (3)

comment:1 Changed 13 years ago by Kevin Milner

I found a package called ParallelColt, which takes advantage of multicore/multiprocessor machines. All of the classes below except OpenMapRealMatrix and Array2DRowRealMatrix are from this new package.

Here are the results of my benchmarks for 2000 iterations. Times (in seconds) cover only the multiplication portion, which is the biggest time hog:

2000 iterations, SparseDoubleMatrix2D x SparseDoubleMatrix2D: 109.036
2000 iterations, SparseDoubleMatrix2D x DenseDoubleMatrix2D: 68.639
2000 iterations, SparseRCDoubleMatrix2D x SparseRCDoubleMatrix2D: 8.955
2000 iterations, SparseRCDoubleMatrix2D x DenseDoubleMatrix2D: 122.698
2000 iterations, SparseCCDoubleMatrix2D x SparseCCDoubleMatrix2D: 8.57
2000 iterations, SparseCCDoubleMatrix2D x DenseDoubleMatrix2D: 4.248
2000 iterations, OpenMapRealMatrix x OpenMapRealMatrix: 51.84
2000 iterations, OpenMapRealMatrix x Array2DRowRealMatrix: 41.801
2000 iterations, SparseFloatMatrix2D x SparseFloatMatrix2D: 108.346
2000 iterations, SparseFloatMatrix2D x DenseFloatMatrix2D: 66.598
2000 iterations, SparseRCFloatMatrix2D x SparseRCFloatMatrix2D: 8.838
2000 iterations, SparseRCFloatMatrix2D x DenseFloatMatrix2D: 127.349
2000 iterations, SparseCCFloatMatrix2D x SparseCCFloatMatrix2D: 5.26
2000 iterations, SparseCCFloatMatrix2D x DenseFloatMatrix2D: 3.89

From this, it looks like the fastest combination is to use SparseCCFloatMatrix2D for A, and DenseFloatMatrix2D for xnew. If double precision is needed, then we should use SparseCCDoubleMatrix2D & DenseDoubleMatrix2D respectively.
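For anyone wondering why a column-compressed A paired with a dense operand can be cheap, here is a minimal self-contained sketch of compressed-column storage multiplied by a dense vector. It is not ParallelColt code and its internals may well differ; the class name CscSketch and the methods fromDense and multiply are hypothetical names made up for this illustration. The point is only that the inner loop visits stored nonzeros and reads the dense operand by direct index.

```java
/** Illustrative compressed-column (CSC) matrix. Hypothetical sketch, not ParallelColt's implementation. */
final class CscSketch {
    final int rows;
    final int cols;
    final int[] colPtr;   // nonzeros of column j live at indices colPtr[j] .. colPtr[j + 1] - 1
    final int[] rowIdx;   // row index of each stored nonzero
    final double[] vals;  // value of each stored nonzero

    CscSketch(int rows, int cols, int[] colPtr, int[] rowIdx, double[] vals) {
        this.rows = rows;
        this.cols = cols;
        this.colPtr = colPtr;
        this.rowIdx = rowIdx;
        this.vals = vals;
    }

    /** Builds CSC storage from a rectangular dense array, keeping only the nonzero entries. */
    static CscSketch fromDense(double[][] a) {
        int rows = a.length;
        int cols = a[0].length;
        int nnz = 0;
        for (double[] row : a) {
            for (double v : row) {
                if (v != 0.0) nnz++;
            }
        }
        int[] colPtr = new int[cols + 1];
        int[] rowIdx = new int[nnz];
        double[] vals = new double[nnz];
        int k = 0;
        for (int j = 0; j < cols; j++) {
            colPtr[j] = k;
            for (int i = 0; i < rows; i++) {
                if (a[i][j] != 0.0) {
                    rowIdx[k] = i;
                    vals[k] = a[i][j];
                    k++;
                }
            }
        }
        colPtr[cols] = k;
        return new CscSketch(rows, cols, colPtr, rowIdx, vals);
    }

    /** Computes y = A * x for a dense vector x, touching only the stored nonzeros of A. */
    double[] multiply(double[] x) {
        if (x.length != cols) {
            throw new IllegalArgumentException("x has length " + x.length + ", expected " + cols);
        }
        double[] y = new double[rows];
        for (int j = 0; j < cols; j++) {
            double xj = x[j];
            for (int k = colPtr[j]; k < colPtr[j + 1]; k++) {
                y[rowIdx[k]] += vals[k] * xj;
            }
        }
        return y;
    }
}
```

The work is proportional to the number of nonzeros in A rather than to rows times columns, which is at least consistent with the sparse x dense timings above, though the benchmark numbers come from the real library and not from this sketch.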

comment:2 Changed 13 years ago by Kevin Milner

I committed my updates, using SparseCCDoubleMatrix2D and DenseDoubleMatrix1D. Performance is ~47.73 minutes for 1 million iterations on my laptop. Committed in [7951].
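To put that figure in per-iteration terms, 47.73 minutes over 1 million iterations works out to roughly 2.86 ms per iteration. Below is a small sketch of the arithmetic, plus a generic timing loop of the kind one might use for such measurements. The class IterationTiming and its methods are hypothetical helpers written for this note, not the code committed in [7951].

```java
/** Hypothetical helpers for reasoning about per-iteration cost; not part of the committed code. */
final class IterationTiming {

    /** Converts a total run time in minutes into milliseconds per iteration. */
    static double millisPerIteration(double totalMinutes, long iterations) {
        return totalMinutes * 60.0 * 1000.0 / iterations;
    }

    /** Runs the given step the requested number of times and returns the elapsed wall time in seconds. */
    static double timeSeconds(Runnable step, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            step.run();
        }
        return (System.nanoTime() - start) / 1e9;
    }
}
```

For example, IterationTiming.millisPerIteration(47.73, 1_000_000) gives the per-iteration figure quoted above. A wall-clock loop like timeSeconds is sensitive to JIT warm-up and other load on the machine, so short runs should be treated as rough.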

comment:3 Changed 13 years ago by Kevin Milner

Resolution: fixed
Status: new → closed