Are those two functions more or less equivalent? For example, if I have an R call like:
loess(formula = myformula, data = mydata, span = myspan, degree = 2, normalize = TRUE, family = "gaussian")
How can I obtain the same or similar result with PyQt-Fit? Should I simply call the smooth.NonParamRegression function (http://pythonhosted.org/PyQt-Fit/NonParam_tut.html) with method=npr_methods.LocalPolynomialKernel(q=2)? What about other parameters, such as span and family?
UPDATE
I do realize the two implementations are likely not equivalent (https://www.statsdirect.com/help/nonparametric_methods/loess.htm). But any comments regarding "approximating" their outcomes are appreciated.
Statsmodels has a LOWESS implementation (http://www.statsmodels.org/devel/generated/statsmodels.nonparametric.smoothers_lowess.lowess.html).
Check out this post on the difference between LOESS and LOWESS: https://stats.stackexchange.com/questions/161069/difference-between-loess-and-lowess
Quick example of how to use statsmodels' lowess function in Python:
import numpy as np
import statsmodels.api as sm
lowess = sm.nonparametric.lowess
Generate two random arrays, x and y:
x = np.random.rand(100)
y = np.random.rand(100)
Run the lowess function (frac is the bandwidth, i.e. the fraction of the data used for each local fit; the values of frac and it here are chosen arbitrarily, and the remaining parameters are left at their defaults; see the official documentation for more):
results = lowess(y, x, frac=0.05, it=3)
The results are stored in a two-dimensional array. The first column contains the sorted x (exog) values and the second column the associated estimated y (endog) values.
If, for instance, you'd like to compute the residuals, remember that the output is sorted by x, so align y with the sorted order first:
res = y[np.argsort(x)] - results[:, 1]
Related
I want to obtain a multivariate spline basis using R. I do not know how to do it properly or what the best approach is. According to my limited research on the internet, I think the package that can help me is mgcv, with the functions ti and smooth.construct.tensor.smooth.spec, but I am not sure.
The structure of my data is simple. I have two vectors, xdata and alphadata (called tau in the code below), generated as
n = 200
T = 2
xdata = matrix(rnorm(T*n), T*n, 1)
tau = seq(-2, 2, by = 0.1)
tau = matrix(tau, length(tau), 1)
So basically I have two vectors, xdata of dimension n*T = 400 and alphadata (tau) of dimension 41. My goal is then to obtain a spline basis (for example, a cubic spline) that is a function of both, b(alphadata, xdata).
What I have tried so far is something like this:
xdata_data <- data.frame("xdata" = xdata[,1])
tau_data <- data.frame("tau" = tau[,1])
basisobj1 <- ti(tau_data, xdata_data, bs = 'cr', k = c(6, 6), fx = TRUE) #cr:cubic regression splines
xdata_data <- data.frame("xdata_data" = xdata[,1])
tau_data <- data.frame("tau_data" = tau[,1])
basisobj2 <- smooth.construct.tensor.smooth.spec(basisobj1, data = c(tau_data,xdata_data), knots = NULL)
basis <- basisobj2[["X"]]
Note that I manipulated my data this way because otherwise I get errors from smooth.construct.tensor.smooth.spec.
My questions are:
(1) With the previous approach, am I doing what I want?
(2) Is this a sensible approach for what I want?
(3) When I run the above, basis has 41 rows, but shouldn't the number of rows of basis be equal to the product of the dimensions of xdata and alphadata, since the basis is a function of two vectors?
I'm running a nonlinear least squares using the minpack.lm package.
However, for each group in the data I would like to optimize (minimize) the fitting parameters, similar to Python's minimize function:
The minimize() function is a wrapper around Minimizer for running an
optimization problem. It takes an objective function (the function
that calculates the array to be minimized), a Parameters object, and
several optional arguments.
The reason I need this is that I want to optimize the fitting function based on the obtained fitting parameters, in order to find global fitting parameters that can fit both groups in the data.
Here is my current approach for fitting by group:
df <- data.frame(y=c(replicate(2,c(rnorm(10,0.18,0.01), rnorm(10,0.17,0.01))),
c(replicate(2,c(rnorm(10,0.27,0.01), rnorm(10,0.26,0.01))))),
DVD=c(replicate(4,c(rnorm(10,60,2),rnorm(10,80,2)))),
gr = rep(seq(1,2),each=40),logic=rep(c(1,0),each=40))
The fitting equation for these groups is:
fitt <- function(data) {
  fit <- nlsLM(y ~ pi*label2*(DVD/2 + U1)^2,
               data = data, start = c(label2 = 1, U1 = 4),
               trace = TRUE, control = nls.lm.control(maxiter = 130))
}
library(minpack.lm)
library(plyr) # will help to fit in groups
fit <- dlply(df, c('gr'), .fun = fitt)  # fit each group (gr) separately
> fit
$`1`
Nonlinear regression model
model: y ~ pi * label2 * (DVD/2 + U1)^2
data: data
label2 U1
2.005e-05 1.630e+03
$`2`
label2 U1
2.654 -35.104
I need to know whether there is any function that optimizes the sum of squares to get the best fit for both groups together.
One may say that I already have the best-fitting parameters from the per-group residual sum of squares, but I know that a minimizer can do this; I just haven't found a similar example of how to do it in R.
P.S. I made up the numbers and fitting lines.
Not sure about R, but least squares with shared parameters is usually simple to implement.
A simple Python example looks like this:
import matplotlib
matplotlib.use('Qt4Agg')
from matplotlib import pyplot as plt
from random import random
from scipy import optimize
import numpy as np
# just for normally distributed errors
def boxmuller(x0, sigma):
    u1 = random()
    u2 = random()
    ll = np.sqrt(-2 * np.log(u1))
    z0 = ll * np.cos(2 * np.pi * u2)
    z1 = ll * np.sin(2 * np.pi * u2)
    return sigma * z0 + x0, sigma * z1 + x0

# some non-linear function
def f0(x, a, b, c, s=0.05):
    return a * np.sqrt(x**2 + b**2) - np.log(c**2 + x) + boxmuller(0, s)[0]
# residual function for least squares takes two data sets,
# not necessarily same length;
# two of three parameters are common
def residuals(parameters, l1, l2, dataPoints):
    a, b, c1, c2 = parameters
    set1 = dataPoints[:l1]
    set2 = dataPoints[-l2:]
    distance1 = [(a * np.sqrt(x**2 + b**2) - np.log(c1**2 + x)) - y for x, y in set1]
    distance2 = [(a * np.sqrt(x**2 + b**2) - np.log(c2**2 + x)) - y for x, y in set2]
    res = distance1 + distance2
    return res
xList0=np.linspace(0,8,50)
#some xy data
xList1=np.linspace(0,7,25)
data1=np.array([f0(x,1.2,2.3,.33) for x in xList1])
#more xy data using different third parameter
xList2=np.linspace(0.1,7.5,28)
data2=np.array([f0(x,1.2,2.3,.77) for x in xList2])
alldata = np.array(list(zip(xList1, data1)) + list(zip(xList2, data2)))
# rough estimates
estimate = [1, 1, 1, .1]
#fitting; providing second length is actually redundant
bestFitValues, ier= optimize.leastsq(residuals, estimate,args=(len(data1),len(data2),alldata))
print(bestFitValues)
fig = plt.figure()
ax = fig.add_subplot(111)
ax.scatter(xList1, data1)
ax.scatter(xList2, data2)
ax.plot(xList0,[f0(x,bestFitValues[0],bestFitValues[1],bestFitValues[2] ,s=0) for x in xList0])
ax.plot(xList0,[f0(x,bestFitValues[0],bestFitValues[1],bestFitValues[3] ,s=0) for x in xList0])
plt.show()
#output
>> [ 1.19841984 2.31591587 0.34936418 0.7998094 ]
If required, you can even write the minimization yourself. If your parameter space is reasonably well behaved, i.e. has an approximately parabolic minimum, a simple Nelder-Mead method is quite OK.
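Since the question is about R, here is a minimal sketch (my own, not part of the original answer) of the same shared-parameter idea using optim() on the df from the question; the choice of sharing U1 across groups while keeping label2 group-specific is only for illustration.
# joint sum of squares with a shared U1 and group-specific label2
sse <- function(p, data) {
  label2 <- c(p["label2_1"], p["label2_2"])[data$gr]  # group-specific amplitude
  U1     <- p["U1"]                                   # shared across both groups
  pred   <- pi * label2 * (data$DVD / 2 + U1)^2
  sum((data$y - pred)^2)                              # joint sum of squares
}
fit_all <- optim(c(label2_1 = 1, label2_2 = 1, U1 = 4), sse, data = df,
                 method = "Nelder-Mead")
fit_all$par
Minimizing this joint objective plays the role of lmfit's minimize(): a single optimization over all groups, with whichever parameters you choose to treat as shared.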
I am using the Conv2D layer from Keras 2.0. However, I cannot fully understand what the function is doing mathematically. I tried to understand the math using randomly generated data and a very simple network:
import numpy as np
import keras
from keras.layers import Input, Conv2D
from keras.models import Model
from keras import backend as K
# create the model
inputs = Input(shape=(10,10,1)) # 1 channel, 10x10 image
outputs = Conv2D(32, (3, 3), activation='relu', name='block1_conv1')(inputs)
model = Model(outputs=outputs, inputs=inputs)
# input
x = np.random.random(100).reshape((10,10))
# predicted output for x
y_pred = model.predict(x.reshape((1,10,10,1))) # y_pred.shape = (1,8,8,32)
I tried to calculate, for example, the value at the first row and first column of the first feature map, following the demo here.
w = model.layers[1].get_weights()[0] # w.shape = (3,3,1,32)
w0 = w[:,:,0,0]
b = model.layers[1].get_weights()[1] # b.shape = (32,)
b0 = b[0] # b0 = 0
y_pred_000 = np.sum(x[0:3,0:3] * w0) + b0
But relu(y_pred_000) is not equal to y_pred[0][0][0][0].
Could anyone point out what's wrong with my understanding? Thank you.
It's easy, and it comes from Theano dim ordering. The result of applying a filter is stored in the so-called channel dimension. In the case of TensorFlow this is the last dimension, and that's why the results come out as expected. In the case of Theano it's the second dimension (the convolution result has shape (cases, channels, width, height)), so in order to solve your problem you need to change the prediction line to:
y_pred = model.predict(x.reshape((1,1,10,10)))
You also need to change the way you get the weights, since in Theano the weights have shape (output_channels, input_channels, width, height); change the weight getter to:
w = model.layers[1].get_weights()[0] # w.shape = (32,1,3,3)
w0 = w[0,0,:,:]
I am teaching myself how to run some Markov models in R, by working through the textbook "Hidden Markov Models for Time Series: An Introduction using R". I am a bit stuck on how to go about implementing something that is mentioned in the text.
So, I have the following function:
f <- function(samples,lambda,delta) -sum(log(outer(samples,lambda,dpois)%*%delta))
Which I can optimize with respect to, say, lambda using:
optim(par, fn=f, samples=x, delta=d)
where "par" is the initial guess for lambda, for some x and d.
Which works perfectly fine. However, in the part of the text corresponding to the example I am trying to reproduce, they note: "The parameters delta and lambda are constrained by sum(delta_i)=1 for i=1,...,m, delta_i>0, and lambda_i>0. It is therefore necessary to reparametrize if one wishes to use an unconstrained optimizer such as nlm. One possibility is to maximize the likelihood with respect to the 2m-1 unconstrained parameters."
The unconstrained parameters are given by
eta<-log(lambda)
tau <- log(delta/(1 - sum(delta)))
I don't entirely understand how to go about implementing this. How would I write a function to optimize over this transformed parameter space?
When using optim() without parameter transformations, like so:
simpleFun <- function(x)
  (x - 3)^2

out = optim(par = 5,
            fn = simpleFun)
the set of parameter estimates would be obtained via out$par, which is 3 in this case, as you might expect. Alternatively, you can wrap your function f in a transformation of the parameters, like so:
out = optim(par = 5,
            fn = function(x)
              # apply the transformation x -> x^3
              simpleFun(x^3))
and now, to get the correct set of optimal parameters for your function, you need to apply the same transformation to the parameter estimates, as in:
(out$par)^3
#> 2.99741
(and yes, the parameter estimate is slightly different. For this contrived
example, you could set method="BFGS" for a slightly better estimate. Anyhow, this goes to show that the choice of transformation does matter in
some cases, but that's for another discussion...)
To complete the answer, it sounds like you want to use a wrapper like so:
# the function to be optimized
f <- function(samples, lambda, delta)
  -sum(log(outer(samples, lambda, dpois) %*% delta))
out <- optim(# par is now a 2m vector
             par = c(eta1 = 1,
                     eta2 = 1,
                     eta3 = 1,
                     tau1 = 1,
                     tau2 = 1,
                     tau3 = 1),
             # a wrapper that applies the constraints
             fn = function(x, samples){
               # exp() guarantees that the values of lambda are > 0
               lambda = exp(x[1:3])
               # delta is also > 0
               delta = exp(x[4:6])
               # and now it sums to 1
               delta = delta / sum(delta)
               f(samples, lambda, delta)
             },
             samples = samples)
The above guarantees that the parameters passed to f() satisfy the constraints, and as long as you apply the same transformation to out$par, optim() will estimate an optimal set of parameters for f().
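For completeness, recovering the constrained estimates from out$par might look like this (a minimal sketch of my own, assuming the same 2m = 6 parameter layout as in the wrapper above; lambda_hat and delta_hat are just illustrative names):
lambda_hat <- exp(out$par[1:3])           # lambda_i > 0
delta_hat  <- exp(out$par[4:6])
delta_hat  <- delta_hat / sum(delta_hat)  # delta_i > 0 and sum(delta_i) = 1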
I am trying to find the best PDF for continuous data with an unknown distribution, using the "density" function in R. Now, given a new data point, I want to find the probability density of this data point based on the kernel density estimate obtained from the "density" function's result.
How can I do that?
If your new point will be within the range of values produced by density, it's fairly easy to do -- I'd suggest using approx (or approxfun if you need it as a function) to handle the interpolation between the grid-values.
Here's an example:
set.seed(2937107)
x <- rnorm(10,30,3)
dx <- density(x)
xnew <- 32.137
approx(dx$x,dx$y,xout=xnew)
If we plot the density and the new point we can see it's doing what you need:
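For example, a quick way to produce such a plot (my own sketch, reusing dx and xnew from above):
plot(dx, main = "Kernel density estimate")
points(xnew, approx(dx$x, dx$y, xout = xnew)$y, col = "red", pch = 19)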
Note that approx will return NA if the new value would need to be extrapolated. If you want to handle extrapolation, I'd suggest direct computation of the KDE for that point (using the bandwidth from the KDE you have).
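That direct computation could look like this (a minimal sketch of my own, assuming the default Gaussian kernel and reusing the data x and bandwidth dx$bw from above):
# KDE evaluated directly at a single point t, with data x and bandwidth h
kde_at <- function(t, x, h) mean(dnorm((t - x) / h)) / h
kde_at(40, x, dx$bw)  # works even outside range(dx$x)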
This is one year old, but nevertheless, here is a complete solution. Let's call
d <- density(xs)
and define h = d$bw. Your KDE estimate is completely determined by:
the elements of xs,
the bandwidth h,
the type of kernel function.
Given a new value t, you can compute the corresponding y(t), using the following function, which assumes you have used Gaussian kernels for estimation.
myKDE <- function(t){
  kernelValues <- rep(0, length(xs))
  for(i in 1:length(xs)){
    transformed <- (t - xs[i]) / h
    kernelValues[i] <- dnorm(transformed, mean = 0, sd = 1) / h
  }
  return(sum(kernelValues) / length(xs))
}
What myKDE does is compute y(t) directly from the definition of the kernel density estimator.
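As a quick sanity check (my own addition, not part of the original answer), myKDE should agree closely with interpolating the output of density() on the same data:
xs <- rnorm(200); d <- density(xs); h <- d$bw
myKDE(0.5)
approx(d$x, d$y, xout = 0.5)$y  # should be very close to myKDE(0.5)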
See the documentation for dnorm:
dnorm(data_point, its_mean, its_stdev)