
Fault System Solution Zip File Format Documentation

UCERF3 (and potentially other forecast) data is stored in a zip file format; this page describes how to parse these zip files. If you are using Java, a parser is already available in OpenSHA via the scratch.UCERF3.utils.FaultSystemIO class.

Zip File Contents

The following files constitute a Fault System Solution. See File Formats below for descriptions and implementation details for each format.

File Name | File Format | Optional? | Description
fault_sections.xml | XML | no | This XML file describes each sub section in the Fault System. The sub section indexes defined here are referenced by the rup_sections.bin file when defining ruptures.
grid_sources.xml | XML | yes | This XML file, if present, gives gridded seismicity MFDs at each node in the region that this solution covers.
grid_sources_reg.xml | XML | yes | This XML file, if present, gives the region associated with gridded seismicity MFDs. It is used in conjunction with grid_sources.bin as a more space-efficient alternative to grid_sources.xml.
grid_sources.bin | Double array list binary | yes | This binary file, if present, gives gridded seismicity MFDs at each node in the region described in grid_sources_reg.xml.
info.txt | ASCII | yes | This text file, if present, contains metadata describing the solution.
mags.bin | Double array binary | no | This file gives magnitudes for each rupture. It contains one double value for each rupture index, in order.
rakes.bin | Double array binary | no | This file gives average rakes for each rupture. It contains one double value for each rupture index, in order.
rates.bin | Double array binary | no | This file gives annualized rates for each rupture. It contains one double value for each rupture index, in order.
rup_areas.bin | Double array binary | no | This file gives areas for each rupture in SI units (square meters). It contains one double value for each rupture index, in order.
rup_lengths.bin | Double array binary | yes | This file, if present, gives lengths for each rupture in SI units (meters). It contains one double value for each rupture index, in order.
rup_mfds.bin | MFD Binary | yes | This file, if present, gives magnitude frequency distributions for each rupture. It contains one function (consisting of an x value and y value data array) for each rupture index, in order.
rup_sections.bin | Integer array list binary | no | This lists the sub sections involved in each rupture. It consists of numRuptures arrays, each of which lists the sub section indexes (as defined in fault_sections.xml) for that rupture.
sect_areas.bin | Double array binary | yes | This file, if present, gives areas for each fault sub section in SI units (square meters). It contains one double value for each sub section index, in order.
sect_slips.bin | Double array binary | yes | This file, if present, gives slip rates after any aseismic and subseismogenic reductions for each fault sub section in SI units (meters per year). It contains one double value for each sub section index, in order.
sect_slips_std_dev.bin | Double array binary | yes | This file, if present, gives standard deviations of slip rates after any aseismic and subseismogenic reductions for each fault sub section in SI units (meters per year). It contains one double value for each sub section index, in order.
sub_seismo_on_fault_mfds.bin | MFD Binary | yes | This file, if present, gives subseismogenic magnitude frequency distributions for each sub section. It contains one function (consisting of an x value and y value data array) for each sub section index, in order.
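Before parsing individual entries, it can help to confirm that the required files are present. The following Python sketch does this with the standard zipfile module. The set of required names is taken from the rows marked "no" in the Optional? column above; the helper name missing_required_files is hypothetical, and the sketch assumes the entries sit at the top level of the zip with no directory prefix.

```python
import zipfile

# Entries marked as non-optional in the table above.
REQUIRED_FILES = {
    "fault_sections.xml",
    "mags.bin",
    "rakes.bin",
    "rates.bin",
    "rup_areas.bin",
    "rup_sections.bin",
}


def missing_required_files(zip_path):
    """Return a sorted list of required entries absent from the zip.

    Assumption: entries are stored at the top level of the archive.
    """
    with zipfile.ZipFile(zip_path) as zf:
        present = set(zf.namelist())
    return sorted(REQUIRED_FILES - present)
```

An empty list means every required file was found; optional files can be checked in the same way with `zf.namelist()`.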

The following files are not required but may be present in zip files generated by the UCERF3 inversion. They give the associated logic tree branch and other inversion metadata.

File Name | File Format | Optional? | Description
close_sections.bin | Integer array list binary | yes | This file lists, for each sub section (in order), all of the other sub sections that the given sub section connects with in the Fault System.
cluster_rups.bin | Integer array list binary | yes | Some fault systems are separated into clusters of interconnected faults. This file lists, for each cluster, all of the rupture indexes which are part of the given cluster.
cluster_sects.bin | Integer array list binary | yes | Some fault systems are separated into clusters of interconnected faults. This file lists, for each cluster, all of the sub section indexes which are part of the given cluster.
inv_rup_set_metadata.xml | XML | yes | This file gives metadata for the logic tree branch and rupture filtration criterion (laugh test filter) used to generate this solution.
inv_sol_metadata.xml | XML | yes | This file gives metadata for the UCERF3 inversion including equation set weights and final simulated annealing energies.
rup_avg_slips.bin | Double array binary | yes | This file, if present, gives the average slip for each rupture in SI units (meters). It contains one double value for each rupture index, in order.

File Formats Used

You must write a parser for each of the following file formats in order to load in a fault system solution.

Double array binary file

These files contain an array of double values in a binary format: simply a series of big endian 8-byte (64-bit) double precision floating point numbers. The size of the file in bytes is equal to the number of values x 8.
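As a minimal sketch of a reader for this format, the Python function below decodes the raw bytes of such a file with the standard struct module. It assumes each value is a standard 8-byte IEEE 754 double in big endian order; the function name read_double_array is hypothetical.

```python
import struct


def read_double_array(data):
    """Decode bytes holding consecutive big endian 8-byte doubles."""
    if len(data) % 8 != 0:
        raise ValueError("file length is not a multiple of 8 bytes")
    count = len(data) // 8
    # ">" selects big endian byte order, "d" is an 8-byte double.
    return list(struct.unpack(">%dd" % count, data))
```

The input would typically come from `zipfile.ZipFile.read("mags.bin")` or a similar call, and the number of values is implied by the file length.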

Integer array list binary file

These files contain a list of integer arrays in a binary format. All file entries are 4-byte (32-bit) big endian integer values. The first value in the file is the number of integer arrays stored in the file. Then each array is written to the file by first writing the number of elements in the array, then each value in the array. For example, consider the following 3 arrays:

[ 0 6 2 4 ]
[ 3 6 2 ]
[ 3 7 9 1 4 7 ]

This would be written as (all stored as big endian 4-byte integers):

3 4 0 6 2 4 3 3 6 2 6 3 7 9 1 4 7

In this example, the leading 3 is the number of arrays. Each array then follows as its size (4, 3, and 6 respectively) and then its data.
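The layout above can be read back with a short loop. The following Python sketch assumes signed 4-byte big endian integers (the format character "i" with ">" in the struct module); the function name read_int_array_list is hypothetical.

```python
import struct


def read_int_array_list(data):
    """Decode a count-prefixed list of size-prefixed integer arrays."""
    (num_arrays,) = struct.unpack_from(">i", data, 0)
    offset = 4
    arrays = []
    for _ in range(num_arrays):
        # Each array starts with its element count.
        (size,) = struct.unpack_from(">i", data, offset)
        offset += 4
        arrays.append(list(struct.unpack_from(">%di" % size, data, offset)))
        offset += 4 * size
    return arrays
```

Applied to the 17 integers of the example, this returns the three original arrays in order.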

Fault section data XML file

Each fault subsection is stored in an XML file, an example of which is shown below.

<?xml version="1.0" encoding="UTF-8"?>

<OpenSHA>
  <FaultSectionPrefDataList>
    <!-- Each fault subsection is listed in its own element to ensure correct ordering.
Each subsection element name will be i0, i1, i2, ... i[N-1] for N subsections. Note that
some fields may be NaN for certain solutions. -->
    <i0 sectionId="0" sectionName="Los Alamos extension, Subsection 0" aveLongTermSlipRate="NaN"
slipRateStdDev="0.0" aveDip="30.0" aveRake="NaN" aveUpperDepth="0.0" aveLowerDepth="12.0"
aseismicSlipFactor="0.1" couplingCoeff="1.0" dipDirection="205.15857"
parentSectionName="Los Alamos extension" parentSectionId="780" connector="false">
      <!-- This defines a polygon representing the geologic fault zone. Note that UCERF3 uses
a combination of this polygon and a buffered fault trace as described in UCERF3 Appendix O. -->
      <ZonePolygon name="Unnamed Region">
        <LocationList>
          <Location Latitude="34.562639" Longitude="-119.867803" Depth="0.0"/>
          <Location Latitude="34.54856" Longitude="-119.878996" Depth="0.0"/>
          <Location Latitude="34.586297" Longitude="-119.926461" Depth="0.0"/>
          <Location Latitude="34.630493" Longitude="-120.091308" Depth="0.0"/>
          <Location Latitude="34.647866" Longitude="-120.086651" Depth="0.0"/>
          <Location Latitude="34.60288" Longitude="-119.919064" Depth="0.0"/>
        </LocationList>
      </ZonePolygon>
      <!-- This is the actual fault trace. Note that order is important and should follow the
Aki-Richards definition -->
      <FaultTrace name="Los Alamos extension 1">
        <Location Latitude="34.63918" Longitude="-120.08897999999999" Depth="0.0"/>
        <Location Latitude="34.608190666873426" Longitude="-119.97328407378366" Depth="0.0"/>
      </FaultTrace>
    </i0>
    <i1 sectionId="1" ...>
        ...
    </i1>
    ...
    <i[N-1] sectionId="[N-1]" ...>
        ...
    </i[N-1]>
  </FaultSectionPrefDataList>
</OpenSHA>
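One possible way to read this file is sketched below in Python using the standard xml.etree.ElementTree module. Only a few of the attributes shown in the example are extracted, and sections are ordered by the numeric suffix of their element name (i0, i1, ...). The function name read_fault_sections and the keys of the returned dictionaries are hypothetical.

```python
import xml.etree.ElementTree as ET


def read_fault_sections(xml_bytes):
    """Parse fault sub sections, ordered by their i0, i1, ... element names."""
    root = ET.fromstring(xml_bytes)
    section_list = root.find("FaultSectionPrefDataList")
    sections = []
    for elem in section_list:
        # Fault trace locations as (latitude, longitude, depth) tuples.
        trace = [
            (float(loc.get("Latitude")),
             float(loc.get("Longitude")),
             float(loc.get("Depth")))
            for loc in elem.findall("FaultTrace/Location")
        ]
        sections.append({
            "tag_index": int(elem.tag[1:]),  # "i12" -> 12
            "section_id": int(elem.get("sectionId")),
            "name": elem.get("sectionName"),
            # float() accepts "NaN", which some solutions contain.
            "slip_rate": float(elem.get("aveLongTermSlipRate")),
            "trace": trace,
        })
    sections.sort(key=lambda s: s["tag_index"])
    return sections
```

The ZonePolygon locations can be collected in the same way with `elem.findall("ZonePolygon/LocationList/Location")`.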

Grid Sources XML file

Some solutions will contain gridded seismicity magnitude frequency distributions. Here is an example XML file:

NOTE 1: UCERF3 uses the RELM region evenly discretized at 0.1 degrees for gridded seismicity. Due to the complexities involved in reproducing our gridding exactly, a file is posted here with grid node indexes and locations for this region: http://opensha.usc.edu/ftp/kmilner/ucerf3/relm_gridded_region.csv

NOTE 2: This file is now deprecated as it is very large and does not compress well. The newer version of the file, grid_sources_reg.xml, contains only the evenlyGriddedGeographicRegion element shown below; the MFDs themselves are stored in the binary grid_sources.bin file.

<?xml version="1.0" encoding="UTF-8"?>

<OpenSHA>
  <!-- this describes the evenly gridded region on which gridded seismicity is distributed. See
class javadocs here for more information: http://opensha.usc.edu/JavaDocs/org/opensha/commons/geo/GriddedRegion.html -->
  <evenlyGriddedGeographicRegion spacing="0.1" numPoints="7636">
    <anchor>
      <Location Latitude="31.5" Longitude="-125.4" Depth="0.0"/>
    </anchor>
    <Region name="RELM_TESTING Region">
      <LocationList>
        <Location Latitude="43.0" Longitude="-125.2" Depth="0.0"/>
        <Location Latitude="43.0" Longitude="-119.0" Depth="0.0"/>
        <Location Latitude="39.4" Longitude="-119.0" Depth="0.0"/>
        <Location Latitude="35.7" Longitude="-114.00000000000001" Depth="0.0"/>
        <Location Latitude="34.3" Longitude="-113.1" Depth="0.0"/>
        <Location Latitude="32.9" Longitude="-113.5" Depth="0.0"/>
        <Location Latitude="32.2" Longitude="-113.6" Depth="0.0"/>
        <Location Latitude="31.7" Longitude="-114.5" Depth="0.0"/>
        <Location Latitude="31.5" Longitude="-117.1" Depth="0.0"/>
        <Location Latitude="31.900000000000002" Longitude="-117.90000000000002" Depth="0.0"/>
        <Location Latitude="32.8" Longitude="-118.40000000000002" Depth="0.0"/>
        <Location Latitude="33.7" Longitude="-121.0" Depth="0.0"/>
        <Location Latitude="34.2" Longitude="-121.6" Depth="0.0"/>
        <Location Latitude="37.7" Longitude="-123.80000000000001" Depth="0.0"/>
        <Location Latitude="40.2" Longitude="-125.4" Depth="0.0"/>
        <Location Latitude="40.5" Longitude="-125.4" Depth="0.0"/>
      </LocationList>
    </Region>
  </evenlyGriddedGeographicRegion>
  <!-- this gives magnitude frequency distributions for each grid node. Each node can have an Unassociated MFD (seismicity at
that node that is not associated with a mapped fault), and a sub seismogenic MFD if a fault crosses the node. Num is the total
number of grid nodes.-->
  <MFDNodeList num="7636">
    <MFDNode index="0">
      <!-- this describes the discretization of the MFD function -->
      <UnassociatedFD name="Incremental Mag Freq Dist" tolerance="1.0000000000000001E-7" num="90" minX="0.05" maxX="8.950000000000001" delta="0.1">
        <Points>
          <!-- points on the MFD function -->
          <Point x="0.05" y="9.218866335939037"/>
          <Point x="0.15000000000000002" y="7.322805822785568"/>
          <Point x="0.25" y="5.816711422441951"/>
          <Point x="0.35000000000000003" y="4.620378116088878"/>
          <Point x="0.45" y="3.6700967927115817"/>
          ...
        </Points>
      </UnassociatedFD>
    </MFDNode>
    <MFDNode index="1">
      <!-- only some nodes have sub seismogenic MFDs -->
      <SubSeisMFD name="Incremental Mag Freq Dist" tolerance="1.0000000000000001E-7" num="90" minX="0.05" maxX="8.950000000000001" delta="0.1">
        <Points>
          <Point x="0.05" y="7.173882325337919"/>
          <Point x="0.15000000000000002" y="5.69841728360539"/>
          ...
        </Points>
      </SubSeisMFD>
      <UnassociatedFD name="Incremental Mag Freq Dist" tolerance="1.0000000000000001E-7" num="90" minX="0.05" maxX="8.950000000000001" delta="0.1">
        <Points>
          <Point x="0.05" y="9.51974524882418"/>
          <Point x="0.15000000000000002" y="7.561802438523384"/>
          <Point x="0.25" y="6.006553182326047"/>
          ...
        </Points>
      </UnassociatedFD>
    </MFDNode>
    ...
  </MFDNodeList>
</OpenSHA>
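A minimal Python sketch for extracting the per-node MFDs from this XML layout follows. It reads only the MFDNodeList portion and returns each MFD as a list of (x, y) points, with None where a node has no element of that type. The function names and dictionary keys are hypothetical.

```python
import xml.etree.ElementTree as ET


def _read_points(mfd_elem):
    """Return the (x, y) points of an MFD element, or None if it is absent."""
    if mfd_elem is None:
        return None
    return [(float(p.get("x")), float(p.get("y")))
            for p in mfd_elem.findall("Points/Point")]


def read_grid_source_mfds(xml_bytes):
    """Map each grid node index to its unassociated and sub seismogenic MFDs."""
    root = ET.fromstring(xml_bytes)
    nodes = {}
    for node in root.findall("MFDNodeList/MFDNode"):
        nodes[int(node.get("index"))] = {
            "unassociated": _read_points(node.find("UnassociatedFD")),
            "sub_seis": _read_points(node.find("SubSeisMFD")),
        }
    return nodes
```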

Grid Sources Binary file

This is a binary representation of grid source MFDs. All values are stored in a binary format (8-byte big endian floating point values) as a list of double arrays.

First, the total number of arrays is written as a 4-byte integer. This is two times the number of grid nodes, plus one for the x values (which are only written once). The factor of two is because each node has both an unassociated MFD (not associated with any fault) and an associated MFD (associated with a fault).

For example, the 7636 grid nodes used for UCERF3 would write (7636 * 2) + 1 = 15273 arrays.

Then each array is written, first with a 4-byte integer for the size of the array, followed by each 8-byte big endian value in the array. Empty arrays (size zero) mean that there is no MFD of that type at that node (for example, many nodes do not contain any faults and so do not have an associated MFD).

Let's consider a simple example with 2 grid nodes where one associated MFD is null (note that actual grid source MFDs are discretized more finely):

Node 1:

Unassociated:

x    y
5.0  0.5
5.5  0.1
6.0  1e-2
6.5  3e-5
7.0  1e-8
7.5  1e-11

Associated sub seismogenic:

null

Node 2:

Unassociated:

x    y
5.0  0.4
5.5  0.2
6.0  2e-2
6.5  3e-5
7.0  2e-8
7.5  1e-10

Associated sub seismogenic:

x    y
5.0  0.2
5.5  0.1
6.0  3e-2
6.5  7e-5
7.0  4e-8
7.5  6e-11

These would be written to the file as:

5 6 5.0 5.5 6.0 6.5 7.0 7.5 6 0.5 0.1 1e-2 3e-5 1e-8 1e-11 0 6 0.4 0.2 2e-2 3e-5 2e-8 1e-10 6 0.2 0.1 3e-2 7e-5 4e-8 6e-11

In this example, the leading 5 is the number of arrays ((2 * the number of grid nodes) + 1), stored as a 4-byte integer. Each array begins with its size (a 4-byte integer: 6, 6, 0, 6, 6), followed by its double values. The first array holds the shared x values; the remaining arrays hold, for each node in turn, the y values of the unassociated MFD and then the y values of the associated sub seismogenic MFD (empty for node 1).
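The following Python sketch reads this layout with the standard struct module. It assumes that, after the shared x value array, each node contributes its unassociated y array followed by its associated y array, as in the example above. Empty arrays are returned as None. The function names are hypothetical.

```python
import struct


def _read_sized_doubles(data, offset):
    """Read a 4-byte size, then that many 8-byte doubles; return (values, new offset)."""
    (size,) = struct.unpack_from(">i", data, offset)
    offset += 4
    values = list(struct.unpack_from(">%dd" % size, data, offset))
    return values, offset + 8 * size


def read_grid_sources(data):
    """Return (x_values, [(unassociated_y, associated_y), ...]) for each node."""
    (num_arrays,) = struct.unpack_from(">i", data, 0)
    if num_arrays % 2 != 1:
        raise ValueError("expected (2 * number of nodes) + 1 arrays")
    x_values, offset = _read_sized_doubles(data, 4)
    nodes = []
    for _ in range((num_arrays - 1) // 2):
        # Assumed order per node: unassociated first, then associated.
        unassociated, offset = _read_sized_doubles(data, offset)
        associated, offset = _read_sized_doubles(data, offset)
        # A size-zero array means there is no MFD of that type at this node.
        nodes.append((unassociated or None, associated or None))
    return x_values, nodes
```

Each non-null y array can be paired with the shared x values, for example with `list(zip(x_values, y_values))`.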

MFD Binary File

Some mean (across multiple logic tree branches) solutions may contain Magnitude Frequency Distributions for each rupture. In this case, the rates.bin file will contain total rates and mags.bin will contain weighted average magnitudes. These MFDs can be used to more accurately represent the mean of multiple solutions instead of using the mean magnitude.

Additionally, solutions can optionally include subseismogenic magnitude frequency distributions for each fault subsection. These are not needed for most applications, but can be used instead of the "associated" MFDs provided in the gridded seismicity data files.

They are written as a series of double arrays, with x values and y values separated into individual arrays. For example, consider these two functions:

function 1:

x     y
5.5   0.1
5.75  0.3
5.9   0.2

function 2:

x     y
5.5   0.05
5.75  0.33
5.9   0.24
6.21  0.1

These would be written to the file as:

4 3 5.5 5.75 5.9 3 0.1 0.3 0.2 4 5.5 5.75 5.9 6.21 4 0.05 0.33 0.24 0.1

In this example, the leading 4 is the number of arrays (2 * the number of functions), stored as an integer. Each array begins with its size (an integer: 3, 3, 4, 4), followed by its double values. For each function, the x value array is written first and the y value array second.
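A reader for this layout can be sketched in Python as follows. By analogy with the other binary formats on this page, it assumes that the integers are 4-byte big endian values and the doubles are 8-byte big endian values; this page does not state the integer width for this format explicitly. The function names are hypothetical.

```python
import struct


def _read_sized_doubles(data, offset):
    """Read a 4-byte size, then that many 8-byte doubles; return (values, new offset)."""
    (size,) = struct.unpack_from(">i", data, offset)
    offset += 4
    values = list(struct.unpack_from(">%dd" % size, data, offset))
    return values, offset + 8 * size


def read_mfds(data):
    """Return one list of (x, y) points per function stored in the file."""
    (num_arrays,) = struct.unpack_from(">i", data, 0)
    if num_arrays % 2 != 0:
        raise ValueError("expected an even number of arrays (x and y per function)")
    offset = 4
    functions = []
    for _ in range(num_arrays // 2):
        xs, offset = _read_sized_doubles(data, offset)
        ys, offset = _read_sized_doubles(data, offset)
        if len(xs) != len(ys):
            raise ValueError("x and y arrays differ in length")
        functions.append(list(zip(xs, ys)))
    return functions
```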

Compound Fault System Solution Files

Compound Fault System Solution files are single zip files which contain all data for the solutions of multiple UCERF3 Logic Tree Branches. This format takes advantage of the fact that many branches share identical files, so each shared file is written only once. For example, rakes depend only on the Fault Model and Deformation Model (they are not, for example, dependent on the Spatial Seismicity Kernel), so one 'rakes.bin' file is stored for each combination of FM/DM, for example 'FM3_1_ZENGBB_rakes.bin'. The 'rates.bin' files, however, are unique to each logic tree branch, and one is present for each branch. See the table below for a mapping of each file type to the logic tree branch choices that it depends on.

File Name | Logic Tree Branch Levels
close_sections.bin | FM
cluster_rups.bin | FM
cluster_sects.bin | FM
fault_sections.xml | FM, DM
info.txt | ALL
mags.bin | FM, DM, Scale
rakes.bin | FM, DM
rates.bin | ALL
rup_areas.bin | FM, DM
rup_lengths.bin | FM
rup_avg_slips.bin | FM, DM, Scale
rup_sec_slip_type.txt | N/A
rup_sections.bin | FM
sect_areas.bin | FM, DM
sect_slips.bin | ALL BUT Dsr
sect_slips_std_dev.bin | ALL BUT Dsr
inv_rup_set_metadata.xml | ALL
inv_sol_metadata.xml | ALL
grid_sources.xml | ALL (old XML format)
grid_sources_reg.xml | NONE (new binary format)
grid_sources.bin | ALL (new binary format)
rup_mfds.bin | ALL
Last modified on Feb 4, 2015, 11:21:16 AM