Opened 12 years ago

Closed 12 years ago

#392 closed defect (fixed)

Deal with thread safety of background seismicity/ERFs that reuse ProbEqkRupture objects

Reported by: Kevin Milner
Owned by:
Priority: major
Milestone: OpenSHA 1.3
Component: sha
Version:
Keywords:
Cc:

Description

Even after fixing the problems described in #390 and #391, I was still seeing odd behavior on background seismicity sources. It turns out UCERF2 isn't thread safe after all: the background seismicity sources reuse a single ProbEqkRupture object, just overriding the magnitude, surface, etc. That means that if two threads are working on the same source at the same time, the surface or magnitude could change mid-computation. We need to address this. I can think of a couple of different ways, but each has tradeoffs.
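To make the hazard concrete, here is a minimal sketch of the two patterns. All class names below are hypothetical, simplified stand-ins (they are not the OpenSHA classes), and the example shows the aliasing deterministically in a single thread rather than trying to provoke a real race:

```java
// Hypothetical, simplified stand-ins; these are NOT the OpenSHA classes.
class SketchRupture {
    double mag;

    SketchRupture(double mag) {
        this.mag = mag;
    }
}

// "Reuse" pattern: one shared mutable rupture, overwritten on every call.
class ReusingSource {
    private final double[] mags;
    private final SketchRupture shared = new SketchRupture(0.0);

    ReusingSource(double[] mags) {
        this.mags = mags.clone();
    }

    SketchRupture getRupture(int index) {
        shared.mag = mags[index]; // mutates state visible to every earlier caller
        return shared;
    }
}

// "Regen" pattern: a fresh rupture per call, so callers never share state.
class RegeneratingSource {
    private final double[] mags;

    RegeneratingSource(double[] mags) {
        this.mags = mags.clone();
    }

    SketchRupture getRupture(int index) {
        return new SketchRupture(mags[index]);
    }
}
```

With the reuse pattern, a reference obtained from getRupture(0) silently takes on the magnitude of rupture 1 as soon as anyone calls getRupture(1). With two threads on the same source, that overwrite can land in the middle of another thread's calculation.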

Attachments (2)

heap_regen_rups.png (91.2 KB) - added by Kevin Milner 12 years ago.
Heap vs Time when regenerating ProbEqkRupture background seismicity objects
heap_reuse_rups.png (93.7 KB) - added by Kevin Milner 12 years ago.
Heap vs Time when reusing ProbEqkRupture background seismicity objects


Change History (7)

comment:1 Changed 12 years ago by Peter Powers

In the gridded seis implementations of the NSHMP, I killed the shared source approach. They're low overhead, so it's easier/safer to spit them out as needed. I'll migrate some of this over soon.

comment:2 Changed 12 years ago by Kevin Milner

I wrote some tests to find ERFs that reused sources and/or ruptures (or their surfaces) on subsequent calls to getSource/getRupture. I made sure that background seismicity was enabled if applicable.
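The actual tests aren't attached to this ticket, so the following is only a sketch of one way such a check could work: collect the references returned for each rupture index and flag any instance that comes back more than once. The class and method names are hypothetical, and the rupture accessor is passed in as a plain function so the sketch doesn't depend on any OpenSHA interface:

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;
import java.util.function.IntFunction;

// Hypothetical helper; not part of OpenSHA.
final class RuptureReuseCheck {
    private RuptureReuseCheck() {
    }

    /**
     * Returns true if any two rupture indices yield the same object instance.
     * Comparison is by reference identity (==), not equals().
     */
    static boolean reusesInstances(int numRuptures, IntFunction<?> getRupture) {
        Set<Object> seen =
                Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
        for (int i = 0; i < numRuptures; i++) {
            Object rup = getRupture.apply(i);
            if (!seen.add(rup)) {
                return true; // this exact instance was already returned earlier
            }
        }
        return false;
    }
}
```

For a real source this would be called with something like the source's rupture count and a lambda wrapping its getRupture call. The same idea extends to rupture surfaces by checking the surface reference of each rupture instead of the rupture itself.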

First of all, I didn't find any instances of sources being reused. Ned, if you think I missed something, can you point me in the right direction?

I did however find a whole bunch of ERFs that reuse ruptures within a source:

  • USGS/CGS 1996 Adj. Cal. ERF (Frankel96_AdjustableEqkRupForecast)
    • Source class: org.opensha.sha.earthquake.rupForecastImpl.Frankel96.Frankel96_GR_EqkSource
  • USGS/CGS 1996 Cal. ERF (Frankel96_EqkRupForecast)
    • Source class: org.opensha.sha.earthquake.rupForecastImpl.Frankel96.Frankel96_GR_EqkSource
  • USGS/CGS 2002 Adj. Cal. ERF (Frankel02_AdjustableEqkRupForecast)
  • WG02 Eqk Rup Forecast (WG02_EqkRupForecast)
    • Source class: org.opensha.sha.earthquake.rupForecastImpl.WG02.WG02_CharEqkSource
  • WGCEP UCERF 1.0 (2005) (WGCEP_UCERF1_EqkRupForecast)
  • PEER Non Planar Fault Forecast (PEER_NonPlanarFaultForecast)
  • PEER Logic Tree ERF List (PEER_LogicTreeERF_List)
  • Point 2 Mult Vertical SS Fault ERF (Point2MultVertSS_FaultERF)
    • Source class: org.opensha.sha.earthquake.rupForecastImpl.Point2MultVertSS_FaultSource
  • WGCEP Eqk Rate Model 2 ERF (UCERF2)
  • UCERF2 ERF Epistemic List (UCERF2_TimeIndependentEpistemicList)
    • Source class: org.opensha.sha.earthquake.rupForecastImpl.WGCEP_UCERF_2_Final.UnsegmentedSource
  • WGCEP (2007) UCERF2 - Single Branch (MeanUCERF2)
  • WGCEP (2007) UCERF2 - Single Branch, Modified, Fault Model 2.1 only (ModMeanUCERF2_FM2pt1)
  • Yucca mountain Adj. ERF (YuccaMountainERF)
  • Yucca Mountain ERF Epistemic List (YuccaMountainERF_List)

These are all of the offending sources:

  • org.opensha.sha.earthquake.rupForecastImpl.Frankel96.Frankel96_GR_EqkSource
  • org.opensha.sha.earthquake.rupForecastImpl.FaultRuptureSource
  • org.opensha.sha.earthquake.rupForecastImpl.WG02.WG02_CharEqkSource
  • org.opensha.sha.earthquake.rupForecastImpl.FloatingPoissonFaultSource
  • org.opensha.sha.earthquake.rupForecastImpl.Point2MultVertSS_FaultSource
  • org.opensha.sha.earthquake.rupForecastImpl.WGCEP_UCERF_2_Final.UnsegmentedSource

I'm going to modify one of them to not do it this way and compare the performance. Stand by.

Changed 12 years ago by Kevin Milner

Attachment: heap_regen_rups.png added

Heap vs Time when regenerating ProbEqkRupture background seismicity objects

Changed 12 years ago by Kevin Milner

Attachment: heap_reuse_rups.png added

Heap vs Time when reusing ProbEqkRupture background seismicity objects

comment:3 Changed 12 years ago by Kevin Milner

I've done some benchmarks, both of heap space and calculation time. These are all for a simple main program that calculates 1000 PGA hazard curves using Frankel 96 set to only background seismicity (and CB 2008).

Timing tests are somewhat noisy, as they varied every time I ran them. Overall, it does appear that there might be a time advantage with the "reuse" method, but it's not significant:

"Reuse" method (4 tries):

  • 219.663 secs
  • 228.389 secs
  • 216.421 secs
  • 228.025 secs

"Regen" method (4 tries):

  • 218.763 secs
  • 234.845 secs
  • 230.645 secs
  • 226.366 secs
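As a quick check on the eight timings above, the sketch below computes the mean and the run-to-run range (max minus min) for each method. The class name is just for illustration. The means come out at roughly 223.1 s for reuse and 227.7 s for regen, a gap of about 4.5 s (around 2%), which is smaller than the 12 to 16 s spread between runs of the same method:

```java
import java.util.Arrays;

// Illustrative helper summarizing the timings listed in this comment.
final class TimingSummary {
    static final double[] REUSE_SECS = {219.663, 228.389, 216.421, 228.025};
    static final double[] REGEN_SECS = {218.763, 234.845, 230.645, 226.366};

    private TimingSummary() {
    }

    /** Arithmetic mean of the samples. */
    static double mean(double[] secs) {
        return Arrays.stream(secs).average().orElse(Double.NaN);
    }

    /** Spread between the slowest and fastest run. */
    static double range(double[] secs) {
        double max = Arrays.stream(secs).max().orElse(Double.NaN);
        double min = Arrays.stream(secs).min().orElse(Double.NaN);
        return max - min;
    }

    public static void main(String[] args) {
        System.out.printf("reuse: mean = %.3f s, range = %.3f s%n",
                mean(REUSE_SECS), range(REUSE_SECS));
        System.out.printf("regen: mean = %.3f s, range = %.3f s%n",
                mean(REGEN_SECS), range(REGEN_SECS));
        System.out.printf("difference of means = %.3f s%n",
                mean(REGEN_SECS) - mean(REUSE_SECS));
    }
}
```

With only four runs per method this is a rough sanity check, not a statistical test.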

I also analyzed the heap (memory usage) over time for both methods. Results look identical. In the plots below, the orange line is total heap (this is what you see in your task manager) and the blue line is the amount of that heap that is actually in use at a given time.

Reuse method

Heap vs Time when reusing ProbEqkRupture background seismicity objects

Regen method

Heap vs Time when regenerating ProbEqkRupture background seismicity objects

comment:4 Changed 12 years ago by Kevin Milner

I spoke too soon: there are serious performance implications for crosshair/random strike faults. For example, I calculated 10 MeanUCERF2 background only (crosshair) both ways and the reuse method took ~45% less time (156.744 secs vs 283.966 secs). There may be a more efficient implementation, however. I'll work more on this tomorrow. Until then, I've created a branch for this work.

comment:5 Changed 12 years ago by Kevin Milner

Resolution: fixed
Status: new → closed

I sped up the thread safe version of crosshair/random strike sources by cloning the Frankel gridded surfaces instead of recreating them each time. This involved adding 2 new methods (with documentation) to FrankelGriddedSurface:

deepCopy()
deepCopyOverrideDepth(double depth)

With these changes, performance is now identical to the old non-thread safe version. As background seismicity is a significant portion of the calculation time, it will be important to think carefully about how it's implemented for UCERF3. Peter, you also may want to implement my changes in your NSHMP FixedStrikeSource class as it is probably significantly slowed down by recreating the FrankelGriddedSurface instances on each getRupture() invocation.
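For anyone picking this up elsewhere, here is a rough sketch of the clone-instead-of-rebuild idea. Only the two method names come from this ticket. The class below is a hypothetical stand-in (not FrankelGriddedSurface), it stores depths only, and the assumption that the depth override moves every grid node to the given depth is mine, not something the real implementation is known to do:

```java
import java.util.Arrays;

// Hypothetical stand-in for a gridded surface; NOT the OpenSHA FrankelGriddedSurface.
final class SketchGriddedSurface {
    private final double[][] depths; // depth (km) at each grid node

    SketchGriddedSurface(double[][] depths) {
        this.depths = copyOf(depths); // defensive copy: no arrays are shared
    }

    private static double[][] copyOf(double[][] src) {
        double[][] out = new double[src.length][];
        for (int r = 0; r < src.length; r++) {
            out[r] = src[r].clone();
        }
        return out;
    }

    /** Independent copy; mutating it never affects this surface. */
    SketchGriddedSurface deepCopy() {
        return new SketchGriddedSurface(depths);
    }

    /** Independent copy with every node set to the given depth (assumed semantics). */
    SketchGriddedSurface deepCopyOverrideDepth(double depth) {
        SketchGriddedSurface copy = deepCopy();
        for (double[] row : copy.depths) {
            Arrays.fill(row, depth);
        }
        return copy;
    }

    double getDepth(int row, int col) {
        return depths[row][col];
    }

    void setDepth(int row, int col, double depth) {
        depths[row][col] = depth;
    }
}
```

The point of the pattern is that the expensive surface construction happens once per source, and each getRupture() call only pays for an array copy while still handing every caller its own object.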

This has all been merged back to trunk with revision [8877].
