Opened 13 years ago

Last modified 10 years ago

#341 new defect

Tolerance is inconsitant for ArbitrarilyDiscretizedFunctions

Reported by: Kevin Milner Owned by: Kevin Milner
Priority: major Milestone: OpenSHA 1.4
Component: commons Version:
Keywords: Cc:

Description

Currently, ArbitrarilyDiscretizedFunctions? let you set a tolerance, but the behavior is strange. Consider these cases.

ArbitrarilyDiscretizedFunction func = new ArbitrarilyDiscretizedFunction();
func.setTolerance(1.0);
func.set(1.0, 0.0);
func.set(3.0, 0.0);
// func now contains 2 points: (1, 0) and (3, 0)
func.set(1.9, 0.0)
// func now contains 2 points: (1.9, 0) and (3, 0)
func.set(2.2, 0.0)
// now it's undefined. The new point, (2.2, 0.0) could replace either point, as it is within the tolerance of both

Until we decide how to do this, tolerance has been fixed to 0.0 (else an IllegalStateException? will be thrown). No instances were found that actually changed the tolerance, so this shouldn't have any effect.

One possible solution would be that the tolerance is used to bin values, similar to an evenly gridded function but with holes. So in the above function, if you were to call set(11.1, 0.0) it would put the value at (11.0, 0.0) - same for (10.8, 0.0).

Change History (6)

comment:1 Changed 13 years ago by Peter Powers

As I see it there are two use cases for tolerance

1) to interact with an existing function where precision errors may create very small mismatches in x-values (hence the original small default value)

2) histograms and binning

In either case, assuming some set(double, double) call matches an x-value within tolerance, that x-value should keep to its original as it would for an EvenlyDiscrFunc?. That is, if a match exists, the call should pass through to set(int, double) rather than treeSet/ArrayList add().

As for the ambiguity you point out, I would take a "buyer beware" approach. I assume that with the ArrayList? implementation the points are sorted ascending on X and so we should say in the docs, that:

"For large tolerances where more than one point in the function may match, the first matching point ascending in X will be used"

If is not the case (I doubt it) and there is some ambiguity as to which of two (or more) possible points will be encountered first, then the docs should state:

"For large tolerances where more than one point in the function may match, matching behavior is undefined."

comment:2 Changed 13 years ago by Kevin Milner

Actually, the insertion point would be undefined, as I use a binary search for all "get" methods so it could match the point before OR after, depending on the size of and placement within the list.

I agree that it should keep to its original (that's how I assumed it worked, and was surprised to discover otherwise).

It can still get nasty though, say you have a point at 1 with a tolerance of 0.9. If you create a point at 2, then the tolerances still overlap causing undefined behavior.

So I agree with your two use cases, both of which require slightly different functionality:

1) Avoiding small precision errors. In this case, tolerance should be relative to a given point, and not change the x value of the original point. So if you have a tolerance of 0.1, and a point at 0.12, and you call set(0.2, y) it would set the value at 0.12.

2) For binning, like histograms. This is how I would expect it to work: Lets say you have a tolerance of 0.05, and you call set with a point at 0.12, it would actually set at 0.1. Then if you called set at 0.2, it would be set at 0.2. If you called set at 0.16, it would also set at 0.2. This is hardly "arbitrarily discretized," however. It's more evenly discretized with gaps.

comment:3 Changed 13 years ago by Ned Field

Unless someone can articulate a realistic use of a non-zero tolerance, lets just force it to be zero (esp since it's not being used any other way right now).

comment:4 Changed 13 years ago by Peter Powers

One scenario is doing curve comparisons with static curve data generated elsewhere (e.g. pos fortran code) where x-values are off by some small amount. However, I have not yet encountered such a situation as mfd's are EvenlyDiscrFuncs? so I'm okay forcing it to be 0.

comment:5 Changed 11 years ago by Kevin Milner

Milestone: OpenSHA 1.3

Leaving ticket open as this contains interesting discussion should we decide to re-enable tolerance for ArbitrarilyDiscretizedFunc?. Removing milestone until we decide if we should re-enable this capability.

comment:6 Changed 10 years ago by Kevin Milner

Milestone: OpenSHA 1.4
Note: See TracTickets for help on using tickets.