Table of Contents
tabnllsqfit - general purpose non-linear least squares fitting program
tabnllsqfit in=table-file [parameter=value]
takes a non-linear least squares fit (minimized ChiSquared) of a set of
datapoints (x(i),y(i)) where a functional form y=f(x;a0,a1,...,an) is assumed.
The function is not required to be non-linear in it parameters. Although
a number of common pre-defined functional forms can be selected (see fit=),
you can also write a small C routine, and dynamically load the fitted function
during runtime (see load=).
The inputfile is an ascii file in a tabular
format, where on every line there must be an assigned column for the X-
and Y-coordinate(s). An optional column can be assigned to the errors in
Y (the DY-coordinates, the inverse square of these being used as a weight).
If no DY-column is assigned, it is assumed that the errors in Y are all
the same and equal to 1. Lines starting with the '#' symbol are comment lines.
Certain models (see fit= below) allow multiple X or Y columns.
can also produce an output file with an additional column computing the
difference between the data and the model (in that order). In the special
case where the user has supplied no free parameters, the residuals are
still computed correctly.
The following parameters are recognized
in any order if the keyword is also given:
are referred to as p0,p1,p2,p3,.....
- (ascii) input file,
a table of values from which data is taken. No default.
- The column(s)
from which the (independant) X-values are taken. The first column is numbered
1 [default: 1].
- The column(s) from which the (dependant) Y-values
are taken [default: 2].
- Optional column from which the weights
will be derived. By default this column is meant to refer to an error bar,
and normally the inverse square of the errors are uses as weights for
those fits where these are used (see dypow= next). [default: none assigned].
- By changing this value, you can adjust the weights column (dycol).
Instead of the normal weight = 1/sigma^2, you will be able to use them as
weight^dypow. The most common use is when your dycol does not refer to an
error, but to the weight directly (e.g. an intensity), in which case dypow=-0.5
is appropriate. [default: 1].
- Range in X-values
for which the fit is done. A set of min:max, separated by comma's, can be
given. This keyword cannot be used if more than one X-column is used. [default:
no entry, i.e. all].
- Model name what to fit. Currently implemented
are line, plane, poly, gauss, loren, and exp. [default: line].
- Highest order of the polynomial fit (fit=poly) or number of dimension
of the hyper-plane fit (fit=plane). 0 would fit a constant. [Default: 0].
- Optional output filename where the data and residuals are
stored. The first few columns contain the X and Y columns, the last column
contains the residual Y-F(X). See an example below how to plot the data with
a model fit overlayed. [default: no output file created].
- A positive number will delete points more than sigma_factor from the fit,
and fit again. Multiple values can be given here, in which case multiple
iterations are done. [Default: -1].
- Initial estimates for the parameters.
For non-linear fits, or linear fits where certain parameters are fixed (see
free= below) they need to be given here. Some fits (e.g. gauss1d) do not require
an initial estimate and attempt to come up with a reasonable one. Otherwise
all initial parameters are 0.
- A list of 1's and 0's to set which parameters
are to be kept free (1)
and which are fixed (0). By default all parameters
are free, i.e. fitted. If any parameter is fixed, initial estimates for all
need to be given (see par=). Default: all parameters free.
- If given,
this should contain the full path to a compiled and shared object (.so)
file) that contains the function to be fitted in terms of the nllsqfit
parameters f and df. They must be named func_loadobj and derv_loadobj (maybe
modified with fit=). See LOAD FUNCTIONS below.
- If given, these are the
X-values for which the load function can be tested. It is useful for testing
and debugging your function. It will report the X, Y and the dY/dPAR vector
values. The program will exit after this, and no fitting is done. Default:
- Maximum number of lines allocated if the input file
was being read from a pipe. If not, the routine file_lines(3NEMO)
to allocate space for the table. [Default: 10000].
- Tolerance for convergence
of nllsqfit [Default: 0.0].
- Mixing parameter for nllsqfit [Default:
0.0 for linear, 0.1 for non-linear fits].
- Maximum number of allowed
nllsqfit iterations [Default: 50]
- Output format. Default %g
- Number of bootstrap samples to take to estimate the error in the parameters.
Output off all parameters and their errors are on one line containing the
word bootstrap. Parameters and their errors are output in pairs. Default:0
- Initial seed. See xrandom(3)
for more details. Default: 0
line y=p0+p1*x (same as poly,order=1)
plane z=p0+p1*x+p2*y (example order=2)
polynomial y=p0+p1*x+p2*x^2+p3*x^3... (example order=3)
loren y=p1/PI( (x-p0)^2 + p1^2 )
Here is an example of a linear fit: a straight line, with some
added noise and random weights between 1 and 2: (ieck, can't reproduce this
% nemoinp 1:100 |\
tabmath - - "%1+rang(0,10),ranu(1,2)" seed=123 |\
tabnllsqfit - 1 2 3
a= 1.87458 2.29818
b= 0.961672 0.0398614
Here is an example of a 2D plane in 3D: (1+2x+3y)
% ccdmath "" - '1+2*%x+3*%y+rang(0,0.1)' 5,5 seed=123 |\
ccdprint - x= y= label=x,y newline=t |\
tabnllsqfit - 1,2 3 fit=plane order=2
Fitting p0+p1*x1+p2*x2+.....pN*xN: (N=2)
p0= 1.0688 0.0523819
p1= 2.01497 0.0165646
p2= 2.97436 0.0165646
And a fit to a gaussian:
% nemoinp 1:100 |\
tabmath - - '4+exp(-(%1-50)**2/(200))+ranu(0,1)' seed=123 |\
tabnllsqfit - fit=gauss par=4,1,50,10
a= 4.46714 0.0416026
b= 1.13036 0.0994723
c= 50.2263 0.845469
d= 8.70728 0.959347
rms/chi = 1
Here is a contrived example of plotting the function to be plotted, by
fixing all parameters and computing a residual table from 0s:
% nemoinp 0:10:0.1 | tabmath - tab0 0
% tabnllsqfit tab0 1 2 fit=gauss par=1,2,5,1 free=0,0,0,0 out=tab0.d
% tabmath tab0.d - %1,-%3 | tabplot -
Here is an example of removing outlier points and fitting again:
% nemoinp 1:10 |\
tabmath - - '2*%1+1+rang(0,0.1)' seed=123 |\
tabnllsqfit - fit=line nsigma=1.5::3
a= 1.09548 0.0775617
b= 1.99937 0.0125002
2/10 points outside 1.5*sigma (0.152328)
a= 1.02651 0.0452119
b= 2.01358 0.00753531
0/8 points outside 1.5*sigma (0.080422)
Although 3 iterations were requested, after the first iteration no more
points were removed, and the iterations were stopped.
Here is an example
of estimating the errors via a bootstrap (resampling of errors) method.
Fitting a polynomial of order 2 and taking 100 bootstrap samples:
% tabnllsqfit tab11 fit=poly order=2 bootstrap=100
Fitting p0+p1*x+p2*x^2+.....pN*x^N: (N=2)
p0= 3.11325 0.192787
p1= 1.97474 0.0896959
p2= 0.00157311 0.008639
bootstrap= 3.11603 0.164922 1.96464 0.086429 0.00293632 0.00820007
^^^^ ^^^^^ ^^^^^ ^^^^^ ^^^^^^ ^^^^^^
P0 dP0 P1 dP2 P3 dP3
With the load= keyword dynamic object files can be loaded
using the loadobj(3NEMO)
mechanism. The convention is that two functions
must be externally visible, and named func_method and derv_method (where
method is the same as the fit= keyword.
Here is an example of the file myline.c
that can be used with fit=line load=myline.so and compiled with
/* File: myline.c */
real func_line(real *x, real *p, int np)
return p + p*x;
void derv_line(real *x, real *p, real *e, int np)
e = 1.0;
e = x;
One word of caution: if you find the program having a hard time finding
a solution in complex cases, it is quite possible that this is not due
to the fact that the function is complex, but due to noise or bad initial
It will not recognize linear fits if the non-linear parameters
are kept fixed, e.g. the offset p0 in fit=gauss.
Numerical Recipies in C, Ch.14
NIST linear: http://www.itl.nist.gov/div898/strd/nls/nls_main.shtml
A new scheme for calculating weights and describing
correlations in nonlinear least squares fits. (Hessler, Curent & Ogren, C.I.P.
10, 186, 1996): http://dx.doi.org/10.1063/1.168569
~/src/kernel/tab/fit example fitting functions
12-jul-02 V1.0 cloned off tablsqfit PJT
17-jul-02 V1.1 added load=, x=, numrec= PJT
11-sep-02 V1.1e changes error/warning to accomodate residual writen PJT
21-nov-02 V1.4 nsigma= can be an array of iterations PJT
14-feb-03 V1.6 arm,arm3 for Rahul PJT
21-mar-03 V1.7 added bootstrap=, seed= PJT
4-apr-03 V1.8 fixed error in using dycol=, and introduced dypow= PJT
15-mar-04 V1.8b added fit=loren and corrected lab= setting for functions PJT/RS
21-nov-05 V2.0 added fit=gauss1d,gauss2d PJT
24-apr-08 V2.1 added psf PJT
7-may-10 V2.2 added grow PJT
24-dec-11 V2.3b estimate gauss1d if no initial par given PJT
9-dec-12 V3.0 new style xrange= with multiple segments PJT
Table of Contents