Package Overview
Workflow
NoisySignalIntegration.jl estimates the uncertainty in numeric integrals based on the noise level and uncertainty in placing integration bounds. This is achieved by performing the integration many times while varying the noise and integration bounds.
The package uses a custom datatype Curve to represent the xy-data that shall be integrated. Curve wraps two vectors of identical element type and length. It was introduced mainly for convenience and simpler plotting.
From a Curve object, a NoiseSample can be derived. A NoiseSample is required to determine the noise amplitude and autocorrelation length (if the noise is strongly correlated, as it is often the case in FTIR spectra). With the noise parameters, a noise model can be constructed. This is either a GaussianNoiseModel (uncorrelated Gaussian noise) or an MvGaussianNoiseModel (correlated Gaussian noise, "Mv" = multivariate).
From the Curve object and noise model, an UncertainCurve object can be constructed. It contains random samples of the original data with varying noise. The UncertainCurve object is the first input required for the actual integration function mc_integrate.
While your input dataset should contain a somewhat lengthy portion of noise for the noise analysis step, you should not include this portion of the data in the actual integration, as this will only decrease performance while not offering any benefits. You should always crop your data to only include the relevant signals you want to integrate (see also Usage Guide).
The second and last input for the integration function is one or several integration bounds. They are represented by UncertainBound objects which are defined from one or two probability distributions that encode the uncertainty in the integration bounds.
In the final step, the mc_integrate function takes the UncertainCurve and UncertainBound(s) and integrates each random sample using the trapezoidal rule. Each UncertainBound yields one uncertain area that is returned as a Particles object from the MonteCarloMeasurements.jl package. Uncertainty propagation with the resulting areas thus works as described in the documentation of MonteCarloMeasurements.jl, merely by performing calculations with Particles objects.
You can swap the core integration function that mc_integrate uses to something more accurate, if needed. To do so, pass your integration function as keyword argument intfun. Note, that your integration function needs to have the same call signature as trapz.
See also: documentation of mc_integrate.
Data requirements
Due to the internal workings of the package, the input data needs to fulfill some basic requirements:
In general, the data that you analyze must be ordered from low to high x-values. If your data is not ordered, you should run the Base.sort() function on your input data (Curve object) once in the beginning of your analysis:
julia> using NoisySignalIntegration
julia> c = Curve([2, 6, 1], [4, 12, 2]);
julia> c = sort(c)
Curve{Int64}, 3 datapoints
(1, 2)
(2, 4)
(6, 12)Correlated noise can only be analyzed, if the x-data is evenly spaced. If this is not the case for your input data, you can use Interpolations.jl to interpolate your data on an even x-grid (see in particular here).
Example: Interpolating data on evenly spaced grid
using NoisySignalIntegration, Interpolations, Plots
c = NoisySignalIntegration.testdata_3() # test dataset with uneven grid spacing
diff(c.x) # shows the uneven spacing of x-values ↓↓↓1000-element Vector{Float64}:
0.10001
0.10003000000000002
0.10004999999999994
0.10007000000000005
0.10008999999999996
0.10011000000000003
0.10012999999999994
0.10015000000000007
0.10016999999999998
0.10018999999999989
⋮
0.11983000000002164
0.11984999999998536
0.11987000000000592
0.11988999999999805
0.11990999999999019
0.11993000000001075
0.11995000000000289
0.11997000000000924
0.11998999999998716δx = minimum(diff(c.x))
x_even = collect(minimum(c.x):δx:maximum(c.x))
interp = LinearInterpolation(c.x, c.y)
c_even = Curve(x_even, interp(x_even))
plot(c; label="original data")
plot!(c_even; label="interpolated data")