This is a fork of lendle's KernSmooth.jl. I'm updating it for the newest version of Julia.
KernSmooth.jl is a partial port of the R package KernSmooth (v2.23-10) in pure Julia. The R package carries an unlimited license.
Currently, the locpoly and dpill functions are ported.
locpoly uses local polynomials to estimate the pdf of a single variable or a regression function for two variables, or their derivatives.
dpill provides a method to select a bandwidth for local linear regression.
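To make the idea behind locpoly concrete, here is a minimal sketch of local linear regression with a Gaussian kernel. It is only an illustration: `local_linear` is a hypothetical helper, not part of KernSmooth.jl, and it evaluates the weighted least-squares fit directly rather than following the package's implementation.

```julia
# Hypothetical helper, not part of KernSmooth.jl: at each grid point, fit a
# weighted straight line and keep its intercept as the regression estimate.
function local_linear(x::Vector{Float64}, y::Vector{Float64}, h::Float64,
                      grid::Vector{Float64})
    yhat = Vector{Float64}(undef, length(grid))
    for (j, g) in enumerate(grid)
        d = x .- g                          # signed distances to the grid point
        w = exp.(-0.5 .* (d ./ h) .^ 2)     # Gaussian kernel weights, bandwidth h
        s0, s1, s2 = sum(w), sum(w .* d), sum(w .* d .^ 2)
        t0, t1 = sum(w .* y), sum(w .* d .* y)
        # Intercept of the weighted least-squares line (closed form)
        yhat[j] = (s2 * t0 - s1 * t1) / (s0 * s2 - s1^2)
    end
    return yhat
end

# Noise-free data on a straight line (synthetic, for illustration only)
x = collect(range(0.0, 10.0; length=200))
y = 2.0 .+ 3.0 .* x
grid = collect(range(1.0, 9.0; length=5))
yhat = local_linear(x, y, 0.5, grid)
```

Because the local model is linear, the fit reproduces a straight line up to rounding error, which makes for a quick sanity check of the formula.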
Other functionality provided by the R package but not ported to KernSmooth.jl pertains to univariate and bivariate kernel density estimation.
Univariate and bivariate kernel density estimation is provided by the kde function in KernelDensity.jl.
You can install through the package manager using
] add https://github.com/azeredo-e/KernSmooth.jl/tree/master
This installs the latest stable release. For other releases, you can use tags, e.g.
] add https://github.com/azeredo-e/KernSmooth.jl/tree/v0.1.0
The method signature for an x and y pair is:
locpoly(x::Vector{Float64}, y::Vector{Float64}, bandwidth::Union{Float64, Vector{Float64}};
drv::Int = 0,
degree::Int=drv+1,
kernel::Symbol = :normal,
gridsize::Int = 401,
bwdisc::Int = 25,
range_x::Vector{Float64}=Float64[],
binned::Bool = false,
truncate::Bool = true
)

and the signature for single-variable density estimation is:

locpoly(x::Vector{Float64}, bandwidth::Union{Float64, Vector{Float64}}; args...)

- x - vector of x data
- y - vector of y data. For density estimation (of x), y should be omitted or be an empty Vector{T}
- bandwidth - should be a scalar or a vector of length gridsize
- Other arguments are optional. For their descriptions, see the R documentation
A tuple (Vector{Float64}, Vector{Float64}) is returned. The first vector is the sorted set of points at which an estimate was computed. The second vector holds the corresponding estimates.
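As a sketch of the single-variable form, the following estimates a density from a synthetic sample. The data and the bandwidth value of 0.25 are arbitrary assumptions made for illustration. The call itself follows the signature listed above.

```julia
using KernSmooth
using Random

Random.seed!(42)                 # make the synthetic sample reproducible
x = randn(1_000)                 # assumed example data: standard normal draws

# y is omitted, so locpoly estimates the density of x.
# 0.25 is an arbitrary bandwidth chosen for this sketch.
grid, dens = locpoly(x, 0.25)
```

Both returned vectors have the same length, so they can be passed directly to a plotting function as x and y coordinates.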
The method signature is:
function dpill(x::Vector{Float64}, y::Vector{Float64};
blockmax::Int = 5,
divisor::Int = 20,
trim::Float64 = 0.01,
proptrun::Float64 = 0.05,
gridsize::Int = 401,
range_x::Vector{Float64} = Float64[],
truncate = true
)

- x - vector of x data
- y - vector of y data
- Other arguments are optional. For their descriptions, see the R documentation.
Estimate regression using different bandwidths, including the bandwidth selected by dpill.
xgrid2, yhat0_5 = locpoly(x, y, 0.5)
yhat1_0 = locpoly(x, y, 1.0)[2]
yhat2_0 = locpoly(x, y, 2.0)[2]

A plot of the estimates and true regression:
The full code for the example is here.
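Since x and y are not defined in the snippet above, here is a self-contained sketch of the same workflow, including the dpill step. The synthetic data are an assumption for illustration and are not the data used in the original example.

```julia
using KernSmooth
using Random

Random.seed!(1)                        # reproducible synthetic data (assumed)
x = 10 .* rand(500)                    # predictor values on [0, 10]
y = sin.(x) .+ 0.3 .* randn(500)       # noisy observations of sin(x)

h = dpill(x, y)                        # bandwidth selected for local linear regression
xgrid, yhat = locpoly(x, y, h)         # regression estimate at the selected bandwidth
```

The estimate in `yhat` can then be compared with fits at fixed bandwidths such as 0.5, 1.0, and 2.0, as in the snippet above.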
- Implementation of the locpoly function
- Implementation of the dpill function
- Implementation of bandwidth selector functions (not in the original R package, but I'm including them here since I think they fit the theme of the package)
