Title: | Generate PDFs and CDFs from Binned Data |
---|---|
Description: | Provides several methods for generating density functions based on binned data. Methods include step function, recursive subdivision, and optimized spline. Data are assumed to be nonnegative, the top bin is assumed to have no upper bound, but the bin widths need not be equal. All PDF smoothing methods maintain the areas specified by the binned data. (Equivalently, all CDF smoothing methods interpolate the points specified by the binned data.) In practice, an estimate for the mean of the distribution should be supplied as an optional argument. Doing so greatly improves the reliability of statistics computed from the smoothed density functions. Includes methods for estimating the Gini coefficient, the Theil index, percentiles, and random deviates from a smoothed distribution. Among the three methods, the optimized spline (splinebins) is recommended for most purposes. The percentile and random-draw methods should be regarded as experimental, and these methods only support splinebins. |
Authors: | David J. Hunter and McKalie Drown |
Maintainer: | Dave Hunter <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.2.2 |
Built: | 2025-02-19 02:57:41 UTC |
Source: | https://github.com/cran/binsmooth |
Binned income data from 3,221 counties in the U.S. and Puerto Rico.
data("county_bins")
A data frame with 51536 observations on the following 6 variables.
fips
Number identifying the county
households
Bin counts
bin_min
Left endpoints of bins (US Dollars)
bin_max
Right endpoints of bins
county
County name
state
State name
U.S. Census Bureau, American Community Survey: https://www.census.gov/programs-surveys/acs/
data(county_bins)
data(county_true)
binedges <- county_bins$bin_max[county_bins$fips=="6083"]+0.5 # continuity correction
bincounts <- county_bins$households[county_bins$fips=="6083"]
smean <- county_true$mean_true[county_true$fips=="6083"]
plot(splinebins(binedges, bincounts, smean)$splinePDF, 0, 300000, n=500, main="Santa Barbara County")
plot(stepbins(binedges, bincounts, smean)$stepPDF, do.points=FALSE, col="red", add=TRUE)
Statistics computed from raw data on 3,221 counties in the U.S. and Puerto Rico.
data("county_true")
A data frame with 3221 observations on the following 4 variables.
fips
Number identifying the county
mean_true
Sample mean
median_true
Sample median
gini_true
Gini coefficient
U.S. Census Bureau, American Community Survey: https://www.census.gov/programs-surveys/acs/
data(county_bins)
data(county_true)
binedges <- county_bins$bin_max[county_bins$fips=="6083"]+0.5 # continuity correction
bincounts <- county_bins$households[county_bins$fips=="6083"]
smean <- county_true$mean_true[county_true$fips=="6083"]
plot(stepbins(binedges, bincounts, smean)$stepPDF, do.points=FALSE, main="Santa Barbara County")
Estimates the Gini coefficient from a smoothed distribution.
gini(binFit)
binFit |
A list as returned by |
For distributions of non-negative support, the Gini coefficient can be computed from a cumulative distribution function $F(x)$ by the integral
$$G = 1 - \frac{1}{\mu}\int_0^\infty \left(1 - F(x)\right)^2 \, dx$$
where $\mu$ is the mean of the distribution.
Returns the Gini coefficient $G$.
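As a rough illustration of the CDF-based calculation, the sketch below evaluates the Gini coefficient as one minus the integral of the squared survival function divided by the mean, using only base R. The helper `gini_from_cdf` is hypothetical and is not part of this package; it simply checks the identity on two distributions whose Gini coefficients are known in closed form (1/3 for the uniform distribution on [0, 1] and 1/2 for the exponential distribution).

```r
# Hypothetical helper (not a binsmooth function): Gini coefficient from a CDF
# supported on [0, upper], using G = 1 - (1/mu) * integral of (1 - F(x))^2.
gini_from_cdf <- function(cdf, upper) {
  survival <- function(x) 1 - cdf(x)
  # For non-negative support, the mean equals the integral of the survival function.
  mu <- integrate(survival, 0, upper)$value
  1 - integrate(function(x) survival(x)^2, 0, upper)$value / mu
}

g_unif <- gini_from_cdf(punif, 1)   # uniform on [0, 1]
g_exp  <- gini_from_cdf(pexp, Inf)  # exponential with rate 1
```

The same function could in principle be applied to any fitted CDF together with its right-hand endpoint, although the package's own `gini` should be preferred for fitted objects.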
David J. Hunter and McKalie Drown
Paul T. von Hippel, David J. Hunter, McKalie Drown. Better Estimates from Binned Income Data: Interpolated CDFs and Mean-Matching, Sociological Science, November 15, 2017. https://www.sociologicalscience.com/articles-v4-26-641/
# 2005 ACS data from Cook County, Illinois
binedges <- c(10000,15000,20000,25000,30000,35000,40000,45000,
              50000,60000,75000,100000,125000,150000,200000,NA)
bincounts <- c(157532,97369,102673,100888,90835,94191,87688,90481,
               79816,153581,195430,240948,155139,94527,92166,103217)
stepfit <- stepbins(binedges, bincounts, 76091)
splinefit <- splinebins(binedges, bincounts, 76091)
gini(stepfit)
gini(splinefit) # More accurate
Creates a PDF and CDF based on a set of binned data, using recursive subdivision on a step function.
rsubbins(bEdges, bCounts, m=NULL, eps1 = 0.25, eps2 = 0.75, depth = 3,
         tailShape = c("onebin", "pareto", "exponential"), nTail=16,
         numIterations=20, pIndex=1.160964, tbRatio=0.8)
bEdges |
A vector |
bCounts |
A vector |
m |
An estimate for the mean of the distribution. If no value is supplied, the mean will be estimated by (temporarily) setting |
eps1 |
Parameter controlling how far the edges of the subdivided bins are shifted. Must be between 0 and 0.5. |
eps2 |
Parameter controlling how wide the middle subdivision of each bin should be. Must be between 0 and 1. |
depth |
Number of times to subdivide the bins. |
tailShape |
Must be one of |
nTail |
The number of bins to use to form the initial tail, before recursive subdivision. Ignored if |
numIterations |
The number of iterations to optimize the tail to fit the mean. Ignored if |
pIndex |
The Pareto index for the shape of the tail. Defaults to |
tbRatio |
The decay ratio for the tail bins. Ignored unless |
First, a step function PDF is created, as described in stepbins. The bins of the resulting PDF are then recursively subdivided and shifted in a manner that preserves the area of the original bins, resulting in a step function with finer bins.
The methods stepbins and rsubbins are included in this package mainly for the purpose of comparison. For most use cases, splinebins will produce more accurate smoothing results.
Returns a list with the following components.
rsubPDF |
A |
rsubCDF |
A piecewise-linear |
E |
The right-hand endpoint of the support of the PDF. |
shrinkFactor |
If the supplied estimate for the mean is too small to be fitted with a step function, the bin edges will be scaled by |
David J. Hunter and McKalie Drown
Paul T. von Hippel, David J. Hunter, McKalie Drown. Better Estimates from Binned Income Data: Interpolated CDFs and Mean-Matching, Sociological Science, November 15, 2017. https://www.sociologicalscience.com/articles-v4-26-641/
# 2005 ACS data from Cook County, Illinois
binedges <- c(10000,15000,20000,25000,30000,35000,40000,45000,
              50000,60000,75000,100000,125000,150000,200000,NA)
bincounts <- c(157532,97369,102673,100888,90835,94191,87688,90481,
               79816,153581,195430,240948,155139,94527,92166,103217)
rsb <- rsubbins(binedges, bincounts, 76091, tailShape="pareto")
plot(rsb$rsubPDF, do.points=FALSE)
plot(rsb$rsubCDF, 0, rsb$E)
library(pracma)
integral(rsb$rsubPDF, 0, rsb$E)
integral(function(x){1-rsb$rsubCDF(x)}, 0, rsb$E) # mean is approximated
Estimates percentiles of a smoothed distribution obtained using splinebins.
sb_percentiles(splinebinFit, p = seq(0,100,25))
splinebinFit |
A list as returned by |
p |
A vector of percentages in the range |
The approximate inverse of the CDF calculated by splinebins is used to approximate percentiles of the smoothed distribution.
A vector of percentiles. Returns NA if an inaccurate fit is detected, as indicated by fitWarn.
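To make the inverse-CDF idea concrete, here is a minimal base R sketch that reads percentiles off an inverted CDF. It is not the package's implementation: for simplicity it uses a piecewise-linear CDF rather than a spline, the bin edges and counts are made-up illustrative values, and the top bin is closed at an assumed endpoint of 100000.

```r
# Illustrative (made-up) bins; the last edge is an assumed closed upper endpoint.
edges   <- c(0, 10000, 25000, 50000, 100000)
counts  <- c(20, 30, 40, 10)
cdf_pts <- c(0, cumsum(counts) / sum(counts))  # CDF values at the bin edges

# Swapping the roles of x and y in the interpolation gives the inverse CDF.
inv_cdf <- approxfun(cdf_pts, edges)

p <- c(25, 50, 75)               # percentages, as in the p argument
percentiles <- inv_cdf(p / 100)  # 12500, 25000, 40625
```

Because the CDF here is piecewise linear, the inversion is exact; with a spline CDF the inverse is only approximate, which is presumably why the fit flag matters.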
David J. Hunter and McKalie Drown
Paul T. von Hippel, David J. Hunter, McKalie Drown. Better Estimates from Binned Income Data: Interpolated CDFs and Mean-Matching, Sociological Science, November 15, 2017. https://www.sociologicalscience.com/articles-v4-26-641/
# 2005 ACS data from Cook County, Illinois
binedges <- c(10000,15000,20000,25000,30000,35000,40000,45000,
              50000,60000,75000,100000,125000,150000,200000,NA)
bincounts <- c(157532,97369,102673,100888,90835,94191,87688,90481,
               79816,153581,195430,240948,155139,94527,92166,103217)
splinefit <- splinebins(binedges, bincounts, 76091)
sb_percentiles(splinefit)
sb_percentiles(splinefit, c(27, 32, 93))
Draws a random sample of points from a smoothed distribution obtained using splinebins.
sb_sample(splinebinFit, n = 1)
splinebinFit |
A list as returned by |
n |
A positive integer giving the sample size. |
The approximate inverse of the CDF calculated by splinebins is used to generate random values of the smoothed distribution.
A vector of random deviates. Returns NA if an inaccurate fit is detected, as indicated by fitWarn.
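Generating deviates from an inverse CDF is usually done by inverse transform sampling: draw uniform values on (0, 1) and push them through the inverse. The sketch below shows that technique in base R under simplifying assumptions (a piecewise-linear CDF on made-up bins with an assumed closed top bin); it is meant as an illustration, not as a description of the package internals.

```r
set.seed(42)  # for reproducibility of the illustration

# Illustrative (made-up) bins; the last edge is an assumed closed upper endpoint.
edges   <- c(0, 10000, 25000, 50000, 100000)
counts  <- c(20, 30, 40, 10)
cdf_pts <- c(0, cumsum(counts) / sum(counts))

# Inverse CDF by interpolating edges as a function of cumulative probability.
inv_cdf <- approxfun(cdf_pts, edges)

# Inverse transform sampling: uniform draws mapped through the inverse CDF.
draws <- inv_cdf(runif(10000))
```

About half of the draws should fall at or below 25000, since the first two bins hold 50 of the 100 observations.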
David J. Hunter and McKalie Drown
Paul T. von Hippel, David J. Hunter, McKalie Drown. Better Estimates from Binned Income Data: Interpolated CDFs and Mean-Matching, Sociological Science, November 15, 2017. https://www.sociologicalscience.com/articles-v4-26-641/
# 2005 ACS data from Cook County, Illinois
binedges <- c(10000,15000,20000,25000,30000,35000,40000,45000,
              50000,60000,75000,100000,125000,150000,200000,NA)
bincounts <- c(157532,97369,102673,100888,90835,94191,87688,90481,
               79816,153581,195430,240948,155139,94527,92166,103217)
splinefit <- splinebins(binedges, bincounts, 76091)
sb_sample(splinefit, 5)
hist(sb_sample(splinefit, 3000))
county_bins and county_true
Samples from a selection of distributions (Gamma, Lognormal, Weibull, Triangle) to simulate income data in the format used in the American Community Survey data (county_bins and county_true).
simcounty(numCounties, minPop = 1000, maxPop = 100000,
          bin_minimums = c(0, 10000, 15000, 20000, 25000, 30000, 35000, 40000,
                           45000, 50000, 60000, 75000, 100000, 125000, 150000, 200000))
numCounties |
The number of counties to simulate data for |
minPop |
Minimum population to sample (default = 1000) |
maxPop |
Maximum population to sample (default = 100000) |
bin_minimums |
Bin edges. Defaults to the edges used in the Census data. |
The county names indicate which distributions were sampled to simulate each county.
Returns a list of two data frames:
county_bins |
Simulated binned income data |
county_true |
Statistics computed from the raw data |
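The general shape of such a simulation can be sketched in base R: draw raw incomes from one distribution, count them into bins, and keep the raw-data statistics alongside. Everything below is an assumption made for illustration (the lognormal parameters, the sample size, and the convention that each bin's right endpoint is one dollar below the next bin's minimum); it is not the code used by simcounty.

```r
set.seed(1)

# Default bin minimums from the usage above.
bin_minimums <- c(0, 10000, 15000, 20000, 25000, 30000, 35000, 40000, 45000,
                  50000, 60000, 75000, 100000, 125000, 150000, 200000)

# Assumed income model: lognormal with median 50000 (illustrative only).
incomes <- rlnorm(5000, meanlog = log(50000), sdlog = 0.8)

# Assign each income to a bin and count households per bin.
bin_index  <- findInterval(incomes, bin_minimums)
households <- tabulate(bin_index, nbins = length(bin_minimums))

# Assumed convention: right endpoint is next minimum minus 1; top bin is unbounded.
sim_bins <- data.frame(households = households,
                       bin_min = bin_minimums,
                       bin_max = c(bin_minimums[-1] - 1, NA))
sim_true <- data.frame(mean_true = mean(incomes),
                       median_true = median(incomes))
```

The two data frames mirror the column names documented for county_bins and county_true, minus the identifying columns.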
David J. Hunter and McKalie Drown
Paul T. von Hippel, David J. Hunter, McKalie Drown. Better Estimates from Binned Income Data: Interpolated CDFs and Mean-Matching, Sociological Science, November 15, 2017. https://www.sociologicalscience.com/articles-v4-26-641/
l1 <- simcounty(5)
cb <- l1$county_bins
ct <- l1$county_true
sbl <- splinebins(cb$bin_max[cb$fips==103], cb$households[cb$fips==103],
                  ct$mean_true[ct$fips==103])
stl <- stepbins(cb$bin_max[cb$fips==105], cb$households[cb$fips==105],
                ct$mean_true[ct$fips==105])
plot(sbl$splinePDF, 0, 300000, n=500)
plot(stl$stepPDF, do.points=FALSE, main=cb$county[cb$fips==105][1])
## Simulate one county and estimate gini and theil from binned data
l2 <- simcounty(1)
binedges <- l2$county_bins$bin_max + 0.5 # continuity correction
bincounts <- l2$county_bins$households
splinefit <- splinebins(binedges, bincounts, l2$county_true$mean_true)
gini(splinefit)
theil(splinefit)
l2$county_true
Creates a smooth cubic spline CDF and piecewise-quadratic PDF based on a set of binned data (edges and counts).
splinebins(bEdges, bCounts, m = NULL, numIterations = 16,
           monoMethod = c("hyman", "monoH.FC"))
bEdges |
A vector |
bCounts |
A vector |
m |
An estimate for the mean of the distribution. If no value is supplied, the mean will be estimated by (temporarily) setting |
numIterations |
The number of iterations performed by a binary search that optimizes the CDF to fit the mean. |
monoMethod |
The method for constructing a monotone spline. Must be one of |
Fits a monotone cubic spline to the points specified by the binned data to produce a smooth cumulative distribution function. The PDF is then obtained by differentiating, so it will be piecewise quadratic and preserve the area of each bin.
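The core construction can be sketched with base R's `splinefun`, whose `"hyman"` method produces a monotone cubic interpolant. This is only an approximation of the idea under stated assumptions: the bins are made-up values, the top bin is closed at an assumed endpoint of 100000, and the mean-matching search that the package performs is omitted entirely.

```r
# Illustrative (made-up) bins; the last edge is an assumed closed upper endpoint.
edges   <- c(0, 10000, 25000, 50000, 100000)
counts  <- c(20, 30, 40, 10)
cdf_pts <- c(0, cumsum(counts) / sum(counts))  # points the CDF must interpolate

# Monotone cubic spline through the CDF points.
spline_cdf <- splinefun(edges, cdf_pts, method = "hyman")

# The PDF is the first derivative of the spline, hence piecewise quadratic.
spline_pdf <- function(x) spline_cdf(x, deriv = 1)

# Area under the PDF over each bin; should match the bin proportions.
bin_areas <- sapply(seq_along(counts), function(i) {
  integrate(spline_pdf, edges[i], edges[i + 1])$value
})
```

Since the CDF interpolates the cumulative proportions at the bin edges, the area of the PDF over each bin equals that bin's share of the counts, which is the area-preservation property described above.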
Returns a list with the following components.
splinePDF |
A piecewise-quadratic function giving the fitted PDF. |
splineCDF |
A piecewise-cubic function giving the CDF. |
E |
The right-hand endpoint of the support of the PDF. |
shrinkFactor |
If the supplied estimate for the mean is too small to be fitted with our method, the bin edges will be scaled by |
splineInvCDF |
An approximate inverse of |
fitWarn |
Flag set to |
David J. Hunter and McKalie Drown
Paul T. von Hippel, David J. Hunter, McKalie Drown. Better Estimates from Binned Income Data: Interpolated CDFs and Mean-Matching, Sociological Science, November 15, 2017. https://www.sociologicalscience.com/articles-v4-26-641/
# 2005 ACS data from Cook County, Illinois
binedges <- c(10000,15000,20000,25000,30000,35000,40000,45000,
              50000,60000,75000,100000,125000,150000,200000,NA)
bincounts <- c(157532,97369,102673,100888,90835,94191,87688,90481,
               79816,153581,195430,240948,155139,94527,92166,103217)
sb <- stepbins(binedges, bincounts, 76091)
splb <- splinebins(binedges, bincounts, 76091)
plot(splb$splinePDF, 0, 300000, n=500)
plot(sb$stepPDF, do.points=FALSE, col="gray", add=TRUE)
# notice that the curve preserves bin area
library(pracma)
integral(splb$splinePDF, 0, splb$E)
integral(function(x){1-splb$splineCDF(x)}, 0, splb$E) # should be the mean
splb <- splinebins(binedges, bincounts, 76091, numIterations=20)
integral(function(x){1-splb$splineCDF(x)}, 0, splb$E) # closer to given mean
Estimates the mean, variance, standard deviation, Gini coefficient, and Theil index from a smoothed distribution.
stats_from_distribution(binFit)
binFit |
A list as returned by |
The mean and variance are calculated from the CDF. For details on the other statistics, see gini and theil.
A vector of five statistics.
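The source does not spell out which integrals are used for the mean and variance. One standard route for a non-negative distribution, shown here as an assumption rather than as the package's method, is to integrate the survival function: the mean is the integral of 1 - F(x), and the second moment is the integral of 2x(1 - F(x)). The helper name `moments_from_cdf` is hypothetical.

```r
# Hypothetical helper (not a binsmooth function): mean, variance, and standard
# deviation of a non-negative distribution from its CDF on [0, upper].
moments_from_cdf <- function(cdf, upper) {
  survival <- function(x) 1 - cdf(x)
  m1 <- integrate(survival, 0, upper)$value                         # E[X]
  m2 <- integrate(function(x) 2 * x * survival(x), 0, upper)$value  # E[X^2]
  v  <- m2 - m1^2
  c(mean = m1, variance = v, sd = sqrt(v))
}

# Check on the uniform distribution on [0, 1]: mean 1/2, variance 1/12.
mom <- moments_from_cdf(punif, 1)
```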
David J. Hunter and McKalie Drown
Paul T. von Hippel, David J. Hunter, McKalie Drown. Better Estimates from Binned Income Data: Interpolated CDFs and Mean-Matching, Sociological Science, November 15, 2017. https://www.sociologicalscience.com/articles-v4-26-641/
# 2005 ACS data from Cook County, Illinois
binedges <- c(10000,15000,20000,25000,30000,35000,40000,45000,
              50000,60000,75000,100000,125000,150000,200000,NA)
bincounts <- c(157532,97369,102673,100888,90835,94191,87688,90481,
               79816,153581,195430,240948,155139,94527,92166,103217)
stepfit <- stepbins(binedges, bincounts, 76091)
splinefit <- splinebins(binedges, bincounts, 76091)
stats_from_distribution(stepfit)
stats_from_distribution(splinefit) # More accurate
Creates a step function PDF and CDF based on a set of binned data (edges and counts).
stepbins(bEdges, bCounts, m = NULL,
         tailShape = c("onebin", "pareto", "exponential"), nTail = 16,
         numIterations = 20, pIndex = 1.160964, tbRatio = 0.8)
bEdges |
A vector |
bCounts |
A vector |
m |
An estimate for the mean of the distribution. If no value is supplied, the mean will be estimated by (temporarily) setting |
tailShape |
Must be one of |
nTail |
The number of bins to use to form the tail. Ignored if |
numIterations |
The number of iterations to optimize the tail to fit the mean. Ignored if |
pIndex |
The Pareto index for the shape of the tail. Defaults to |
tbRatio |
The decay ratio for the tail bins. Ignored unless |
We assume that the left endpoint of the first bin is 0 and that the top bin is unbounded. Options exist to replace the top bin with a single bin or a sequence of bins in the shape of a Pareto or exponential tail. The density functions will fit an estimate for the population mean, if one is supplied.
The methods stepbins and rsubbins are included in this package mainly for the purpose of comparison. For most use cases, splinebins will produce more accurate smoothing results.
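For readers who want to see what an area-preserving step function looks like, the following base R sketch builds one by hand. It makes simplifying assumptions: the bins are made-up values and the unbounded top bin is replaced by a single bin ending at a fixed, assumed endpoint (the package instead chooses the tail to match the mean, which is not reproduced here).

```r
# Illustrative (made-up) bins; the last edge is an assumed closed upper endpoint.
edges  <- c(0, 10000, 25000, 50000, 100000)
counts <- c(20, 30, 40, 10)

# Height of each step = bin proportion / bin width, so each bin keeps its area.
heights <- counts / (sum(counts) * diff(edges))

# Step-function PDF: zero outside [0, E], constant within each bin.
step_pdf <- stepfun(edges, c(0, heights, 0))

# The matching CDF is piecewise linear through the cumulative proportions.
step_cdf <- approxfun(edges, c(0, cumsum(counts) / sum(counts)),
                      yleft = 0, yright = 1)

areas <- heights * diff(edges)  # area of each step
```

Dividing by the bin width is what lets unequal bin widths coexist with a density that still integrates to 1.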
Returns a list with the following components.
stepPDF |
A |
stepCDF |
A piecewise-linear |
E |
The right-hand endpoint of the support of the PDF. |
shrinkFactor |
If the supplied estimate for the mean is too small to be fitted with a step function, the bin edges will be scaled by |
David J. Hunter and McKalie Drown
Paul T. von Hippel, David J. Hunter, McKalie Drown. Better Estimates from Binned Income Data: Interpolated CDFs and Mean-Matching, Sociological Science, November 15, 2017. https://www.sociologicalscience.com/articles-v4-26-641/
# 2005 ACS data from Cook County, Illinois
binedges <- c(10000,15000,20000,25000,30000,35000,40000,45000,
              50000,60000,75000,100000,125000,150000,200000,NA)
bincounts <- c(157532,97369,102673,100888,90835,94191,87688,90481,
               79816,153581,195430,240948,155139,94527,92166,103217)
sb <- stepbins(binedges, bincounts, 76091)
sbpt <- stepbins(binedges, bincounts, 76091, tailShape="pareto")
plot(sb$stepPDF)
plot(sbpt$stepPDF, do.points=FALSE)
plot(sb$stepCDF, 0, sb$E+100000)
library(pracma)
integral(sb$stepPDF, 0, sb$E) # should be approximately 1
integral(function(x){1-sb$stepCDF(x)}, 0, sb$E) # should be the mean
Estimates the Theil index from a smoothed distribution.
theil(binFit)
binFit |
A list as returned by |
For distributions of non-negative support, the Theil index can be computed from a probability density function $f(x)$ by the integral
$$T = \int_0^\infty \frac{x}{\mu} \ln\left(\frac{x}{\mu}\right) f(x) \, dx$$
where $\mu$ is the mean of the distribution.
Returns the Theil index $T$.
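A small base R sketch of the PDF-based calculation follows. It integrates (x/mu) * log(x/mu) * f(x) over the support, with the mean mu computed from the same density. The helper `theil_from_pdf` is hypothetical and not part of this package; it is checked against the exponential distribution, whose Theil index is one minus the Euler-Mascheroni constant (available in R as `digamma(2)`).

```r
# Hypothetical helper (not a binsmooth function): Theil index from a PDF
# supported on [0, upper].
theil_from_pdf <- function(pdf, upper) {
  mu <- integrate(function(x) x * pdf(x), 0, upper)$value
  integrand <- function(x) {
    r <- x / mu
    # r * log(r) tends to 0 as r tends to 0, so the value at r = 0 is set to 0.
    ifelse(r > 0, r * log(r) * pdf(x), 0)
  }
  integrate(integrand, 0, upper)$value
}

t_exp <- theil_from_pdf(dexp, Inf)  # exponential with rate 1
```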
David J. Hunter and McKalie Drown
Paul T. von Hippel, David J. Hunter, McKalie Drown. Better Estimates from Binned Income Data: Interpolated CDFs and Mean-Matching, Sociological Science, November 15, 2017. https://www.sociologicalscience.com/articles-v4-26-641/
# 2005 ACS data from Cook County, Illinois
binedges <- c(10000,15000,20000,25000,30000,35000,40000,45000,
              50000,60000,75000,100000,125000,150000,200000,NA)
bincounts <- c(157532,97369,102673,100888,90835,94191,87688,90481,
               79816,153581,195430,240948,155139,94527,92166,103217)
stepfit <- stepbins(binedges, bincounts, 76091)
splinefit <- splinebins(binedges, bincounts, 76091)
theil(stepfit)
theil(splinefit) # More accurate