Xgboost - how to make a custom loss function which depends the value of another column, as well the error - r

I am having issue implementing recency-weighting for xgboost training in R (i.e. passing a weight vector to xgb.dmatrix) - although the weighting affects the learning curve readout for the training set, it does not appear to have any impact at all on the actual model produced - performance in the test set is identical.
I can't seem to get to the bottom of this issue or generate a reproducible example. So instead I would like to pass the Date column of the features to a custom loss function, something like:
custom_loss <- function(preds,dat) {
labels <- getinfo(dat,"label")
dates <- [a vector corresponding to the dates associated with each prediction]
grad = f(dates)*-2*(labels - preds)
hess = f(dates)*2
[where f is an increasing function of the value in dates, so later samples matter more when training]
return(list(grad=grad,hess=hess))
}
But I can't seem to figure out how to do this, any suggestions?

Related

R Cross Validation lm predict function [duplicate]

I am trying to convert Absorbance (Abs) values to Concentration (ng/mL), based on an established linear model & standard curve. I planned to do this by using the predict() function. I am having trouble getting predict() to return the desired results. Here is a sample of my code:
Standards<-data.frame(ng_mL=c(0,0.4,1,4),
Abs550nm=c(1.7535,1.5896,1.4285,0.9362))
LM.2<-lm(log(Standards[['Abs550nm']])~Standards[['ng_mL']])
Abs<-c(1.7812,1.7309,1.3537,1.6757,1.7409,1.7875,1.7533,1.8169,1.753,1.6721,1.7036,1.6707,
0.3903,0.3362,0.2886,0.281,0.3596,0.4122,0.218,0.2331,1.3292,1.2734)
predict(object=LM.2,
newdata=data.frame(Concentration=Abs[1]))#using Abs[1] as an example, but I eventually want predictions for all values in Abs
Running that last lines gives this output:
> predict(object=LM.2,
+ newdata=data.frame(Concentration=Abs[1]))
1 2 3 4
0.5338437 0.4731341 0.3820697 -0.0732525
Warning message:
'newdata' had 1 row but variables found have 4 rows
This does not seem to be the output I want. I am trying to get a single predicted Concentration value for each Absorbance (Abs) entry. It would be nice to be able to predict all of the entries at once and add them to an existing data frame, but I can't even get it to give me a single value correctly. I've read many threads on here, webpages found on Google, and all of the help files, and for the life of me I cannot understand what is going on with this function. Any help would be appreciated, thanks.
You must have a variable in newdata that has the same name as that used in the model formula used to fit the model initially.
You have two errors:
You don't use a variable in newdata with the same name as the covariate used to fit the model, and
You make the problem much more difficult to resolve because you abuse the formula interface.
Don't fit your model like this:
mod <- lm(log(Standards[['Abs550nm']])~Standards[['ng_mL']])
fit your model like this
mod <- lm(log(Abs550nm) ~ ng_mL, data = standards)
Isn't that some much more readable?
To predict you would need a data frame with a variable ng_mL:
predict(mod, newdata = data.frame(ng_mL = c(0.5, 1.2)))
Now you may have a third error. You appear to be trying to predict with new values of Absorbance, but the way you fitted the model, Absorbance is the response variable. You would need to supply new values for ng_mL.
The behaviour you are seeing is what happens when R can't find a correctly-named variable in newdata; it returns the fitted values from the model or the predictions at the observed data.
This makes me think you have the formula back to front. Did you mean:
mod2 <- lm(ng_mL ~ log(Abs550nm), data = standards)
?? In which case, you'd need
predict(mod2, newdata = data.frame(Abs550nm = c(1.7812,1.7309)))
say. Note you don't need to include the log() bit in the name. R recognises that as a function and applies to the variable Abs550nm for you.
If the model really is log(Abs550nm) ~ ng_mL and you want to find values of ng_mL for new values of Abs550nm you'll need to invert the fitted model in some way.

Low-pass fltering of a matrix

I'm trying to write a low-pass filter in R, to clean a "dirty" data matrix.
I did a google search, came up with a dazzling range of packages. Some apply to 1D signals (time series mostly, e.g. How do I run a high pass or low pass filter on data points in R? ); some apply to images. However I'm trying to filter a plain R data matrix. The image filters are the closest equivalent, but I'm a bit reluctant to go this way as they typically involve (i) installation of more or less complex/heavy solutions (imageMagick...), and/or (ii) conversion from matrix to image.
Here is sample data:
r<-seq(0:360)/360*(2*pi)
x<-cos(r)
y<-sin(r)
z<-outer(x,y,"*")
noise<-0.3*matrix(runif(length(x)*length(y)),nrow=length(x))
zz<-z+noise
image(zz)
What I'm looking for is a filter that will return a "cleaned" matrix (i.e. something close to z, in this case).
I'm aware this is a rather open-ended question, and I'm also happy with pointers ("have you looked at package so-and-so"), although of course I'd value sample code from users with experience on signal processing !
Thanks.
One option may be using a non-linear prediction method and getting the fitted values from the model.
For example by using a polynomial regression, we can predict the original data as the purple one,
By following the same logic, you can do the same thing to all columns of the zz matrix as,
predictions <- matrix(, nrow = 361, ncol = 0)
for(i in 1:ncol(zz)) {
pred <- as.matrix(fitted(lm(zz[,i]~poly(1:nrow(zz),2,raw=TRUE))))
predictions <- cbind(predictions,pred)
}
Then you can plot the predictions,
par(mfrow=c(1,3))
image(z,main="Original")
image(zz,main="Noisy")
image(predictions,main="Predicted")
Note that, I used a polynomial regression with degree 2, you can change the degree for a better fitting across the columns. Or maybe, you can use some other powerful non-linear prediction methods (maybe SVM, ANN etc.) to get a more accurate model.

Getting incorrect values of theta while trying to implement stochastic gradient descent

I am trying to implement Stochastic Gradient Descent algorithm for logistic regression. I have written a small train function whose job is to get the theta values / coefficients. But the values of theta come out to be incorrect and are same as the one initialised. I could not understand the reason for this. Is it not the correct way to implement stochastic gradient descent?
Here is the code I wrote for it:
train <- function(data, labels, alpha = 0.0009) {
theta <- seq(from = 0, to = 1, length.out = nrow(data))
label <- label[,shuffle]
data <- data[,shuffle]
for(i in seq(1:ncol(data))) {
h = hypothesis(x, theta)
theta <- theta - (alpha * ((h - y) * data[,i]))
}
return(theta)
}
Please note that, each column in the data frame is one input. There are 20K columns and 456 rows. So, 20K input values for training. The corresponding data frame named labels has the correct value for the input training data. So for example column 45 in data has its corresponding y value in column 45 of labels.
In the regression above, I am trying to train to predict between the label 1 and label 0. So labels is a data frame that comprises of 0 and 1.
I can't debug this for you without a minimal, complete, and verifiable example, but I can offer you a tool to help you debug it:
add browser() in the body of your function like this:
train <- function(data, labels, alpha = 0.001) {
browser()
# ... the rest of your function
Call train with your data. This will open up a browser session. You can enter help (not the function, just help) to get the commands to navigate in the browser, but in general, use n and s to step through the statements (s will step into a nested function call, n will step over). If you do this in RStudio, you can keep an eye on your environment tab to see what the values for, e.g., theta are, and see a current traceback. You can also evaluate any R expression, e.g., tail(theta) in the executing environment. Q exits the browser.
I'd recommend exploring what hypothesis returns in particular (I'd be surprised if it's not almost always 1). But I think you have other issues causing the undesired behavior you described (the return value for theta isn't changing from its initial assignment).
EDIT:
Fix the typo: label should be labels each time.
Compare the sum of your return with the sum of theta as it is initialized, and you'll see that the return value is not the same as your initialized theta. Hope that helped!

wrapnls: Error: singular gradient matrix at initial parameter estimates

I have created a loop to fit a non-linear model to six data points by participants (each participant has 6 data points). The first model is a one parameter model. Here is the code for that model that works great. The time variable is defined. The participant variable is the id variable. The data is in long form (one row for each datapoint of each participant).
Here is the loop code with 1 parameter that works:
1_p_model <- dlply(discounting_long, .(Participant), function(discounting_long) {wrapnls(indiff ~ 1/(1+k*time), data = discounting_long, start = c(k=0))})
However, when I try to fit a two parameter model, I get this error "Error: singular gradient matrix at initial parameter estimates" while still using the wrapnls function. I realize that the model is likely over parameterized, that is why I am trying to use wrapnls instead of just nls (or nlsList). Some in my field insist on seeing both model fits. I thought that the wrapnls model avoids the problem of 0 or near-0 residuals. Here is my code that does not work. The start values and limits are standard in the field for this model.
2_p_model <- dlply(discounting_long, .(Participant), function(discounting_long) {nlxb(indiff ~ 1/(1+k*time^s), data = discounting_long, lower = c (s = 0), start = c(k=0, s=.99), upper = c(s=1))})
I realize that I could use nlxb (which does give me the correct parameter values for each participant) but that function does not give predictive values or residuals of each data point (at least I don't think it does) which I would like to compute AIC values.
I am also open to other solutions for running a loop through the data by participants.
You mention at the end that 'nlxb doesn't give you residuals', but it does. If your result from your call to nlxbis called fit then the residuals are in fit$resid. So you can get the fitted values using just by adding them to the original data. Honestly I don't know why nlxb hasn't been made to work with the predict() function, but at least there's a way to get the predicted values.

Denormalize values after prediction in R

I am not sure whether I am denormalizing data correctly. I have one output variable and several input variables. I am normalizing them using the RSNNS package. Suppose x is an input matrix (NxM) where each of the N rows is an object with M features. And y is a vector (N) with corresponding answers.
nx <- normalizeData(x, type='0_1')
After that, some of the data are used to make the model and some for prediction. Suppose pred.ny are predicted values. These values are normalized.
pred.y <- denormalizeData(pred.ny, getNormParameters(nx))
Is this correct? How does it work? It's clear for one input it can use the min and max values previously used for normalization. But how does it work if each input was normalized separately, using its own min and max value?
update
Here is a toy-example, where '0_1' looks better than 'norm'. 'norm' make huge training error and almost constant prediction.
x <- runif(1020, 1, 5000)
y <- sqrt(x)
nx <- normalizeData(x, type='0_1')
ny <- normalizeData(y, type='0_1')
model <- mlp(nx[1:1000], ny[1:1000], size = 1)
plotIterativeError(model)
npy <- predict(model, matrix(nx[1001:1020], ncol=1))
py <- denormalizeData(npy, getNormParameters(ny))
print(cbind(y[1001:1020], py))
There are two things going on here:
Training your model, i.e. setting the internal coefficients in the neural network. You use both the inputs and output for this.
Using the model, i.e. getting predictions with fixed internal coefficients.
For part 1, you've decided to normalize your data. So the neural net works on normalized data. So you have trained the neural net
on the inputs fX(X) rather than X, where fX is the transform you used to the matrix of original inputs to produce the normalized inputs.
on the outputs fy(y) rather than y, where fy is the transform you applied to the vector of outputs to get the normalized outputs.
In terms of your original inputs and outputs your trained machine now looks like this:
Apply the normalization function fX to inputs to get normalized inputs fX(X).
Run neural net with normalized inputs to produce normalized outputs fy(y).
Apply the denormalization function fy-1 to the normalized outputs fy(y) to get y.
Note that fX(X) and fy, and hence fy-1, are defined on the training set.
So in R you might write something like this to get training data and normalize it, the first 100 rows
tx <- x[1:100,]
ntx <- normalizeData(tx, type='0_1')
ty <- y[1:100]
nty <- normalizeData(ty, type='0_1')
and something like this to denormalize the predicted results
pred.y <- denormalizeData(pred.ny, getNormParameters(nty))
# nty (or ny) not nx here
The slight concern I have is that I'd prefer to normalize the features used in prediction using the same transform fX that I used for training, but looking at the RSNNS documentation this facility doesn't appear to be present (it would be easy to write it yourself, however). It would probably be OK to normalize the prediction features using the whole X matrix, i.e. including the training data. (I can also see it might be preferable to use the default, normalization to z-score that RSNNS provides rather than the "0_1" choice you have used.)

Resources