Package Overview
Workflow
NoisySignalIntegration.jl
estimates the uncertainty in numeric integrals based on the noise level and uncertainty in placing integration bounds. This is achieved by performing the integration many times while varying the noise and integration bounds.
The package uses a custom datatype Curve
to represent the xy-data that shall be integrated. Curve
wraps two vectors of identical element type and length. It was introduced mainly for convenience and simpler plotting.
From a Curve
object, a NoiseSample
can be derived. A NoiseSample
is required to determine the noise amplitude and autocorrelation length (if the noise is strongly correlated, as it is often the case in FTIR spectra). With the noise parameters, a noise model can be constructed. This is either a GaussianNoiseModel
(uncorrelated Gaussian noise) or an MvGaussianNoiseModel
(correlated Gaussian noise, "Mv" = multivariate).
From the Curve
object and noise model, an UncertainCurve
object can be constructed. It contains random samples of the original data with varying noise. The UncertainCurve
object is the first input required for the actual integration function mc_integrate
.
While your input dataset should contain a somewhat lengthy portion of noise for the noise analysis step, you should not include this portion of the data in the actual integration, as this will only decrease performance while not offering any benefits. You should always crop
your data to only include the relevant signals you want to integrate (see also Usage Guide).
The second and last input for the integration function is one or several integration bounds. They are represented by UncertainBound
objects which are defined from one or two probability distributions that encode the uncertainty in the integration bounds.
In the final step, the mc_integrate
function takes the UncertainCurve
and UncertainBound
(s) and integrates each random sample using the trapezoidal rule. Each UncertainBound
yields one uncertain area that is returned as a Particles object from the MonteCarloMeasurements.jl package. Uncertainty propagation with the resulting areas thus works as described in the documentation of MonteCarloMeasurements.jl
, merely by performing calculations with Particles
objects.
You can swap the core integration function that mc_integrate
uses to something more accurate, if needed. To do so, pass your integration function as keyword argument intfun
. Note, that your integration function needs to have the same call signature as trapz
.
See also: documentation of mc_integrate
.
Data requirements
Due to the internal workings of the package, the input data needs to fulfill some basic requirements:
In general, the data that you analyze must be ordered from low to high x-values. If your data is not ordered, you should run the Base.sort()
function on your input data (Curve
object) once in the beginning of your analysis:
julia> using NoisySignalIntegration
julia> c = Curve([2, 6, 1], [4, 12, 2]);
julia> c = sort(c)
Curve{Int64}, 3 datapoints
(1, 2)
(2, 4)
(6, 12)
Correlated noise can only be analyzed, if the x-data is evenly spaced. If this is not the case for your input data, you can use Interpolations.jl to interpolate your data on an even x-grid (see in particular here).
Example: Interpolating data on evenly spaced grid
using NoisySignalIntegration, Interpolations, Plots
c = NoisySignalIntegration.testdata_3() # test dataset with uneven grid spacing
diff(c.x) # shows the uneven spacing of x-values ↓↓↓
1000-element Vector{Float64}:
0.10001
0.10003000000000002
0.10004999999999994
0.10007000000000005
0.10008999999999996
0.10011000000000003
0.10012999999999994
0.10015000000000007
0.10016999999999998
0.10018999999999989
⋮
0.11983000000002164
0.11984999999998536
0.11987000000000592
0.11988999999999805
0.11990999999999019
0.11993000000001075
0.11995000000000289
0.11997000000000924
0.11998999999998716
δx = minimum(diff(c.x))
x_even = collect(minimum(c.x):δx:maximum(c.x))
interp = LinearInterpolation(c.x, c.y)
c_even = Curve(x_even, interp(x_even))
plot(c; label="original data")
plot!(c_even; label="interpolated data")