Surface Distance Cache Tests

Overview and Benchmark Procedure

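I have externalized the caching functionality in our various fault surface classes. The new SurfaceDistanceCache interface has a few implementations; the ones tested on this page are the single location cache, the multi cache (backed by a Guava cache), and the hybrid cache.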

Caching style and parameters are controlled by the org.opensha.sha.faultSurface.cache.SurfaceCachingPolicy class, which can be configured via Java properties. It defaults to some form of multiple/hybrid cache for all surfaces except compound surfaces, which default to the single cache.

I then ran many tests at HPCC with variations on these caching schemes. All tests were run on nodes with the exact same configuration:

Two quad-core 2.5 GHz processors, 12 GB RAM.

I ran the following benchmarks for a UCERF3 ERF with the branch average solution:

  • Distance test: how long does it take to calculate each distance metric from 80 sites to every rupture in the ERF?
  • Hazard test: how long does it take to calculate a hazard curve at 80 sites?

Sites are randomly distributed in a tight cluster around 35, -118 (latitude, longitude). Tests were run with 1, 2, 4, and 8 threads; each configuration was run multiple times and the times were averaged.
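The timing procedure above (same workload, varying thread counts, averaged over repeated runs) can be sketched roughly as follows. This is only an illustration, not the actual benchmark code: the class name ThreadScalingBenchmark and its methods are hypothetical, and each Runnable stands in for one unit of work such as the calculation for one site.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical timing harness; names are illustrative only.
class ThreadScalingBenchmark {

    // Runs all tasks on a fixed pool of numThreads threads and
    // returns the elapsed wall-clock time in seconds.
    static double timeOnce(List<Runnable> tasks, int numThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        long start = System.nanoTime();
        try {
            List<Future<?>> futures = new ArrayList<Future<?>>();
            for (Runnable task : tasks) {
                futures.add(pool.submit(task));
            }
            for (Future<?> future : futures) {
                future.get(); // wait for completion; surfaces task failures
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
        return (System.nanoTime() - start) / 1e9;
    }

    // Repeats the timed run numRuns times and returns the mean time in seconds.
    static double averageSeconds(List<Runnable> tasks, int numThreads, int numRuns) {
        double total = 0;
        for (int run = 0; run < numRuns; run++) {
            total += timeOnce(tasks, numThreads);
        }
        return total / numRuns;
    }
}
```

Calling averageSeconds with numThreads set to 1, 2, 4, and 8 on the same task list gives one averaged time per thread count, which is the kind of value each point on the plots represents.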

Plot Legend

Each plot has a number of lines. Solid lines (force=false) mean that the surfaces used the given cache for all rupture surfaces except for compound surfaces (CompoundSurface), which used the single location cache. Dashed lines (force=true) use the given cache for compound surfaces as well. If size == 1 and force=false, the single location cache is used for all surfaces (this is the previous surface caching behavior).

The color is related to the cache size:

  • black = 1 (the solid black line represents the previous caching behavior, and is the line to beat)
  • blue = 2
  • green = 4
  • orange = 8
  • magenta = 12
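The size/force rules in the legend can be summarized as a small decision function. This is a sketch of the rules as stated on this page, not the SurfaceCachingPolicy code: the class, enum, and method names are hypothetical, and GIVEN stands for whichever cache the plot is testing (multi or hybrid).

```java
// Hypothetical summary of the plot legend rules; not the OpenSHA implementation.
class CacheChoiceSketch {

    enum CacheKind { SINGLE, GIVEN }

    // Returns which cache a surface uses for a given plot line.
    static CacheKind choose(boolean isCompoundSurface, int size, boolean force) {
        if (force) {
            // Dashed lines: the given cache is used for every surface.
            return CacheKind.GIVEN;
        }
        if (size == 1) {
            // Solid black line: previous behavior, single cache everywhere.
            return CacheKind.SINGLE;
        }
        // Solid lines: compound surfaces keep the single location cache.
        return isCompoundSurface ? CacheKind.SINGLE : CacheKind.GIVEN;
    }
}
```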

First Results and Sensitivity to Java Version

These are the initial results comparing the single cache with the multi cache at the given size and a 1 hour access expiration. As you'll see, the single location cache performs best, and for the simple distance calculation test it is actually faster with a single thread than with multiple threads. When you switch to hazard calculation tests, multiple threads start to help, but things are still fastest with the single location cache.

[Plots: distance test and hazard test, Java 6, multi cache with expiration]

I was surprised that the Guava multi cache performance was so poor, so I decided to try a more recent version of Java. The following plots show the exact same calculation, but with Java 7. It is immediately apparent that Guava cache performance is very sensitive to Java version, and is much better with Java 7. Additionally, even the single cache case sped up. I recommend we use Java 7 for all calculations.

[Plots: distance test and hazard test, Java 7, multi cache with expiration]

Performance of Expiration Time

I then ran the same Java 7 test as before, but with the multi cache expiration time disabled. Results are below. The multi cache now beats the single cache for threads > 1, but single cache is still best for the single threaded case.

[Plots: distance test and hazard test, Java 7, multi cache without expiration]

Hybrid Cache

I then implemented a hybrid cache which uses both a multi cache and a single cache. If the distance is in the single cache (i.e., it was the last value accessed), that value is returned. Otherwise the lookup falls back to the multi cache, and the resulting value is then loaded into the single cache. Results are below. The hybrid cache either matches or beats the single cache in all cases, especially in the hazard calculation case.
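The lookup order described above can be sketched as follows. This is a minimal illustration under assumptions, not the OpenSHA implementation: the class name HybridCacheSketch is hypothetical, the multi cache here is a plain LRU LinkedHashMap rather than a Guava cache, and the whole lookup is synchronized for simplicity rather than for speed.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical hybrid cache: a one-entry cache in front of a small LRU cache.
class HybridCacheSketch<K, V> {

    private final Map<K, V> multi; // stand-in for the multi cache
    private K lastKey;             // single cache: last key accessed
    private V lastValue;           // single cache: value for lastKey

    HybridCacheSketch(final int maxSize) {
        // Access-ordered map that evicts the least recently used entry
        // once it holds more than maxSize entries.
        this.multi = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize;
            }
        };
    }

    // Returns the cached value for key, computing it with loader on a miss.
    synchronized V get(K key, Function<K, V> loader) {
        // 1. Single cache: hit only if this key was the last one accessed.
        if (lastKey != null && lastKey.equals(key)) {
            return lastValue;
        }
        // 2. Fall back to the multi cache, loading the value if absent.
        V value = multi.get(key);
        if (value == null) {
            value = loader.apply(key);
            multi.put(key, value);
        }
        // 3. Promote the result into the single cache.
        lastKey = key;
        lastValue = value;
        return value;
    }
}
```

In the setting of this page the key would correspond to a site location and the value to the distances computed for it; repeated requests for the same site are answered from the single cache without touching the multi cache at all.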

[Plots: distance test and hazard test, Java 7, hybrid cache without expiration]

Recommendations

I recommend that we use the hybrid cache for all fault surfaces except for compound surfaces, which should use the single location cache. This configuration performs best in all cases, and memory analysis shows negligible impact when the hybrid cache is used in this way (although the impact is more pronounced if it is used on compound surfaces). The default behavior of setting the cache size to the number of available processors plus a buffer seems to work best. Further work is needed to examine memory/speed implications for gridded seismicity sources.
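As a rough illustration of a "processors plus a buffer" default that a Java property can override, consider the sketch below. The class and method names, the property name passed in, and the buffer value are all hypothetical; the actual property names and buffer used by SurfaceCachingPolicy are not given on this page.

```java
// Hypothetical sketch; not the SurfaceCachingPolicy implementation.
class CacheSizeDefaultSketch {

    // Returns the integer value of the named system property if it is set
    // and parseable, otherwise (available processors + buffer).
    static int resolveCacheSize(String propertyName, int buffer) {
        int fallback = Runtime.getRuntime().availableProcessors() + buffer;
        return Integer.getInteger(propertyName, fallback);
    }
}
```

With a made-up property name such as example.cache.size, running with -Dexample.cache.size=12 would force a size of 12, while leaving it unset on an 8-core node with a buffer of 2 would give 10.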

Last modified on Jun 10, 2014, 11:58:36 AM