Kriging interpolation using the GeoStats package in Julia

I am trying to build a model for kriging interpolation using the GeoStats package in Julia.
I have tried an example of 2D interpolation, but the results are not accurate, as shown below.
Code for the 2D interpolation:
using KrigingEstimators, DataFrames, Variography, Plots
OK = OrdinaryKriging(GaussianVariogram()) # interpolator object
f(x) = sin(x)
# fit it to the data:
x_train = range(0, 10.0, length = 9) |> collect
y_train = f.(x_train)
scatter(x_train, y_train, label="train points")
x_train = reshape(x_train, 1, length(x_train))
krig = KrigingEstimators.fit(OK, x_train, y_train) # fit function
result = []
variance = []
test = range(0, 10, length = 101) |> collect
y_test = f.(test)
test = reshape(test, 1, length(test))
for i in test
    μ, σ² = KrigingEstimators.predict(krig, [i])
    push!(result, μ)
    push!(variance, σ²)
end
df_krig_vario = DataFrame(:predict=>result, :real=>y_test, :variance=>variance)
println(first(df_krig_vario, 5))
mean_var = sum(variance)/length(variance)
println("")
println("mean variance is $mean_var")
test = reshape(test, length(test), 1)
plot!(test, y_test, label="actual")
plot!(test, result, label="predict", legend=:bottomright, title="Gaussian Variogram")
As the resulting plot shows, the interpolation prediction is not accurate. How can I improve this accuracy?
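For context, ordinary kriging predicts each location as a weighted average of the training values, and the weights come entirely from the variogram, so a default GaussianVariogram() whose range and sill do not match the spacing and scale of the data will directly degrade the prediction. In standard textbook notation (not specific to GeoStats.jl):

$$\hat{Z}(x_0) = \sum_{i=1}^{n} \lambda_i Z(x_i), \qquad \sum_{i=1}^{n} \lambda_i = 1,$$

where the weights $\lambda_i$ solve the ordinary kriging system

$$\sum_{j=1}^{n} \lambda_j \, \gamma(x_i - x_j) + \mu = \gamma(x_i - x_0), \qquad i = 1, \dots, n,$$

with $\gamma$ the variogram and $\mu$ a Lagrange multiplier. Fitting $\gamma$ to an empirical variogram of the training data, rather than keeping the default parameters, is the usual first step toward better accuracy.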

Related

Output of keras in R can not be used to predict

I use keras and tensorflow to run an LSTM in R to predict some stock market prices.
Here I am providing the code where, instead of stock market prices, I just use one randomly generated vector VECTOR of length 100. Then I consider a training period of the first 80 values and try to predict the 20 test values...
What am I doing wrong? I am getting an error:
Error in UseMethod("predict") :
no applicable method for 'predict' applied to an object of class "keras_training_history"
Thank you
library(tensorflow)
library(keras)
set.seed(12345)
VECTOR=rnorm(100,2,5)
VECTOR_training=VECTOR[1:80]
VECTOR_test=VECTOR[81:100]
training_rescaled=scale(VECTOR_training)
#I also calculate the scale factors because I will need them when
#coming back to the original data
scale_factors=matrix(NA,nrow=1,ncol=2)
scale_factors=c(mean(VECTOR_training), sd(VECTOR_training))
#We want to predict 20 days, so we need to base each prediction on 20 data points.
prediction_stocks=20
lag_stocks=prediction_stocks
test_rescaled =training_rescaled[(length(VECTOR_training)- prediction_stocks + 1):length(VECTOR_training)]
#We lag the data 20 times, so that each prediction is based on 20 values, and arrange the lagged values into columns. Then we transform it into the desired 3D form.
x_train_data_stocks = t(sapply(1:(length(VECTOR_training) - lag_stocks - prediction_stocks + 1),
                               function(x) training_rescaled[x:(x + lag_stocks - 1), 1]))
# now we transform it into 3D form
x_train_arr_stocks = array(
  data = as.numeric(unlist(x_train_data_stocks)),
  dim = c(
    nrow(x_train_data_stocks),
    lag_stocks,
    1
  )
)
# Now we apply a similar transformation to the Y values.
y_train_data_stocks = t(sapply(
  (1 + lag_stocks):(length(training_rescaled) - prediction_stocks + 1),
  function(x) training_rescaled[x:(x + prediction_stocks - 1)]
))
y_train_arr_stocks = array(
  data = as.numeric(unlist(y_train_data_stocks)),
  dim = c(
    nrow(y_train_data_stocks),
    prediction_stocks,
    1
  )
)
#In the same manner we need to prepare input data for the prediction
#list_test_rescaled
# this time our array just has one sample, as we intend to perform one 20-days prediction
x_pred_arr_stocks = array(
  data = test_rescaled,
  dim = c(
    1,
    lag_stocks,
    1
  )
)
###lstm forecast prova
set.seed(12345)
lstm_model <- keras_model_sequential()
lstm_model_prova =
  layer_lstm(lstm_model, units = 70,          # size of the layer
             batch_input_shape = c(1, 20, 1), # batch size, timesteps, features
             return_sequences = TRUE,
             stateful = TRUE) %>%
  # fraction of the units to drop for the linear transformation of the inputs
  layer_dropout(rate = 0.5) %>%
  layer_lstm(units = 50,
             return_sequences = TRUE,
             stateful = TRUE) %>%
  layer_dropout(rate = 0.5) %>%
  time_distributed(keras::layer_dense(units = 1))

lstm_model_compile = compile(lstm_model_prova, loss = 'mae', optimizer = 'adam', metrics = 'accuracy')

lstm_fit_prova = fit(lstm_model_compile,
                     x = x_train_arr_stocks[[1]],
                     y = y_train_arr_stocks[[1]],
                     batch_size = 1,
                     epochs = 20,
                     verbose = 0,
                     shuffle = FALSE)

lstm_forecast_prova = predict(lstm_fit_prova, x_pred_arr_stocks, batch_size = 1)
It works if I use
lstm_forecast_prova = predict(lstm_model_compile, x_pred_arr_stocks, batch_size = 1)
But shouldn't I use the fitted model in order to make the predictions?
Also, if I plot the fitted model, the accuracy is 0. And actually on my real data the predictions do not make any sense. So what does it mean that the accuracy is 0? Maybe something is wrong with the lstm parameters?
Thank you in advance!!
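In case it helps: in the R interface to Keras, fit() trains the model in place and returns a history object of class "keras_training_history", which has no predict() method, so predictions must come from the model object itself. A minimal sketch reusing the objects above (note that x_train_arr_stocks[[1]] in the fit call extracts a single element from the array, which is probably not intended; the full arrays are passed here):

history <- fit(lstm_model_compile,
               x = x_train_arr_stocks,
               y = y_train_arr_stocks,
               batch_size = 1, epochs = 20, verbose = 0, shuffle = FALSE)
plot(history) # training curves; 'accuracy' is not a meaningful metric for a continuous target
# lstm_model_compile was updated in place by fit(), so predict with it:
lstm_forecast_prova <- predict(lstm_model_compile, x_pred_arr_stocks, batch_size = 1)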

LightGBM with Tweedie loss; I'm confused on the Gradient and Hessians used

I'm trying to figure out custom objective functions in LightGBM, and I figured a good place to start would be replicating the built-in functions. The equation LightGBM uses to calculate the Tweedie metric (https://github.com/microsoft/LightGBM/blob/1c27a15e42f0076492fcc966b9dbcf9da6042823/src/metric/regression_metric.hpp#L300-L318) seems to match definitions of the Tweedie loss I've found online (https://towardsdatascience.com/tweedie-loss-function-for-right-skewed-data-2c5ca470678f), though they do a weird exp(ln(score)) process, I'm guessing for numerical stability. However, their equations for the gradient and Hessian seem to be done on the log of score directly (https://github.com/microsoft/LightGBM/blob/1c27a15e42f0076492fcc966b9dbcf9da6042823/src/objective/regression_objective.hpp#L702-L732).
It seems like they are using the equation:
gradients[i] = -label_[i] * e^((1 - rho_) * score[i]) + e^((2 - rho_) * score[i]);
where I would expect the gradient to be:
gradients[i] = -label_[i] * score[i]^(- rho_) + score[i]^(1 - rho_);
My guess is somewhere LightGBM is processing score as ln(score), like using parameter reg_sqrt, but I can't find where in the documentation this is described.
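The two formulas are consistent if the raw score being boosted is the log of the prediction. A short derivation (standard Tweedie-deviance algebra; an interpretation, not taken verbatim from the LightGBM source): with variance power $\rho$, the Tweedie loss for label $y$ and mean $\mu$ is

$$L(y, \mu) = -y \, \frac{\mu^{1-\rho}}{1-\rho} + \frac{\mu^{2-\rho}}{2-\rho}.$$

Substituting the log link $\mu = e^{F}$, where $F$ is the raw score, and differentiating with respect to $F$:

$$\frac{\partial L}{\partial F} = -y \, e^{(1-\rho)F} + e^{(2-\rho)F}, \qquad \frac{\partial^2 L}{\partial F^2} = -y \, (1-\rho) \, e^{(1-\rho)F} + (2-\rho) \, e^{(2-\rho)F},$$

which matches the gradient and Hessian in the LightGBM source above, so the trees are fit on $F = \ln(\text{score})$ and raw predictions from such an objective must be exponentiated back.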
Anyway I've tried recreating both their formula and my own calculations as custom objective functions, and neither seem to work:
library(lightgbm)
library(data.table)
# Tweedie gradient with variance = 1.5, according to my own math
CustomObj_t1 <- function(preds, dtrain) {
  labels <- dtrain$getinfo('label')
  grad <- -labels * preds^(-3/2) + preds^(-1/2)
  hess <- 1/2 * (3 * labels * preds^(-5/2) - preds^(-3/2))
  return(list(grad = grad, hess = hess))
}

# Tweedie gradient with variance = 1.5, recreating code from LightGBM github
CustomObj_t2 <- function(preds, dtrain) {
  labels <- dtrain$getinfo('label')
  grad <- -labels * exp(-1/2 * preds) + exp(1/2 * preds)
  hess <- -labels * (-1/2) * exp(-1/2 * preds) + 1/2 * exp(1/2 * preds)
  return(list(grad = grad, hess = hess))
}
params = list(objective = "tweedie",
              seed = 1,
              metric = "rmse")
params2 = list(objective = CustomObj_t1,
               seed = 1,
               metric = "rmse")
params3 = list(objective = CustomObj_t2,
               seed = 1,
               metric = "rmse")
# Create data
set.seed(321)
db_Custom = data.table(a=runif(2000), b=runif(2000))
db_Custom[,X := (a*4+exp(b))]
# break into test and training sets
db_Test = db_Custom[1:10]
db_Custom=db_Custom[11:nrow(db_Custom),]
FeatureCols = c("a","b")
# Create dataset
ds_Custom <- lgb.Dataset(data.matrix(db_Custom[, FeatureCols, with = FALSE]), label = db_Custom[["X"]])
# Train
fit = lgb.train(params, ds_Custom, verb=-1)
#print(" ")
fit2 = lgb.train(params2, ds_Custom, verb=-1)
#print(" ")
fit3 = lgb.train(params3, ds_Custom, verb=-1)
# Predict
pred = predict(fit, data.matrix(db_Test[, FeatureCols, with = FALSE]))
db_Test[, prediction := pmax(0, pred)]
pred2 = predict(fit2, data.matrix(db_Test[, FeatureCols, with = FALSE]))
db_Test[, prediction2 := pmax(0, pred2)]
pred3 = predict(fit3, data.matrix(db_Test[, FeatureCols, with = FALSE]))
db_Test[, prediction3 := pmax(0, pred3)]
print(db_Test[,.(X,prediction,prediction2,prediction3)])
I get the following results (I would expect prediction2 or prediction3 to be very similar to prediction):
"X" "prediction" "prediction2" "prediction3"
4.8931646234958 4.89996556839721 0 1.59154656425556
6.07328897031702 6.12313647937047 0 1.81022588429474
2.05728566704078 2.06824004875244 0 0.740577102751491
2.54732526765174 2.50329903656292 0 0.932517774958986
4.07044099941395 4.07047912554207 0 1.39922723582939
2.74639568121359 2.74408567443232 0 1.01628212910587
3.47720295158928 3.49241414141969 0 1.23049599462599
2.92043718858535 2.90464303454649 0 1.0680618051659
4.44415913080697 4.43091665909845 0 1.48607456777287
4.96566318066753 4.97898586895233 0 1.60163901781479
Is there something I'm missing? Am I just doing the math or coding wrong?
It appears, per the linked GitHub page and your prediction3 column, that if you exponentiate this column, it becomes very close to the X and prediction columns.
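For example, a quick check along those lines, reusing the objects from the question (prediction3_exp is a hypothetical column name added here for illustration):

db_Test[, prediction3_exp := exp(pred3)] # undo the log link implied by the gradient formula
print(db_Test[, .(X, prediction, prediction3_exp)])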

MXNET softmax output: label shape confusion

I do not have a clear idea of how labels for the softmax classifier should be shaped.
What I could understand from my experiments is that a scalar label indicating the index of the class-probability output is one option, while another is a 2D label where the rows are class probabilities, or a one-hot encoded variable, like c(1, 0, 0).
What puzzles me, though, is that:
I can use scalar label values that go beyond indexing, like 4 in my example below, without warning or error. Why is that?
When my label is a negative scalar or an array with a negative value, the model converges to a uniform probability distribution over classes. For example, is it expected that actor_train.y = matrix(c(0, -1, 0), ncol = 1) results in equal probabilities in the softmax output?
I am trying to use the softmax MXNET classifier for policy-gradient reinforcement learning, and my negative rewards lead to the issue above: uniform probabilities. Is that expected?
require(mxnet)

actor_initializer <- mx.init.Xavier(rnd_type = "gaussian",
                                    factor_type = "avg",
                                    magnitude = 0.0001)

actor_nn_data <- mx.symbol.Variable('data')
actor_nn_label <- mx.symbol.Variable('label')
device.cpu <- mx.cpu()

# NN architecture
actor_fc3 <- mx.symbol.FullyConnected(
  data = actor_nn_data
  , num_hidden = 3
)
actor_output <- mx.symbol.SoftmaxOutput(
  data = actor_fc3
  , label = actor_nn_label
  , name = 'actor'
)

crossentfunc <- function(label, pred)
{
  - sum(label * log(pred))
}
actor_loss <- mx.metric.custom(
  feval = crossentfunc
  , name = "log-loss"
)

# initialize NN
actor_train.x <- matrix(rnorm(11), nrow = 1)
actor_train.y = 0 #1 #2 #3 #-3 # matrix(c(0, 0, -1), ncol = 1)

rm(actor_model)

actor_model <- mx.model.FeedForward.create(
  symbol = actor_output,
  X = actor_train.x,
  y = actor_train.y,
  ctx = device.cpu,
  num.round = 100,
  array.batch.size = 1,
  optimizer = 'adam',
  eval.metric = actor_loss,
  clip_gradient = 1,
  wd = 0.01,
  initializer = actor_initializer,
  array.layout = "rowmajor"
)

predict(actor_model, actor_train.x, array.layout = "rowmajor")
It is quite strange to me, but I found a solution.
I changed the optimizer from optimizer = 'adam' to optimizer = 'rmsprop', and the NN started to converge as expected in the case of negative targets. I ran simulations in R using a simple NN and the optim function, and got the same result.
It looks like adam or SGD may be buggy, or something, in the case of multinomial classification... I also used to get stuck on the fact that those optimizers did not converge to a perfect solution on just one example, while rmsprop does! Be aware!
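For reference, the swap is a one-argument change to the call from the question (a sketch; everything else stays as above):

actor_model <- mx.model.FeedForward.create(
  symbol = actor_output,
  X = actor_train.x,
  y = actor_train.y,
  ctx = device.cpu,
  num.round = 100,
  array.batch.size = 1,
  optimizer = 'rmsprop', # was 'adam'; rmsprop converged with negative targets
  eval.metric = actor_loss,
  clip_gradient = 1,
  wd = 0.01,
  initializer = actor_initializer,
  array.layout = "rowmajor"
)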

Keras LSTM and multiple input feature: how to define parameters

I am discovering Keras in R and the LSTM. Following this blog post, I want to predict time series, and I would like to use several past time points (t-1, t-2) to predict the point at t.
Here is what I tried so far:
library(data.table)
library(tensorflow)
library(keras)
Serie <- c(5.66333333333333, 5.51916666666667, 5.43416666666667, 5.33833333333333,
5.44916666666667, 6.2025, 6.57916666666667, 6.70666666666667,
6.95083333333333, 8.1775, 8.55083333333333, 8.42166666666667,
8.01333333333333, 8.99833333333333, 11.0025, 10.3116666666667,
10.51, 10.9916666666667, 10.6116666666667, 10.8475, 13.7841666666667,
16.2916666666667, 15.9975, 14.3683333333333, 13.4041666666667,
11.8666666666667, 9.11916666666667, 9.47862416666667, 9.08404666666667,
8.79606166666667, 9.93211091666667, 9.03834041666667, 8.58787275,
6.77499383333333, 7.21377583333333, 7.53497175, 6.31212966666667,
5.5825105, 4.64021041666667, 4.608787, 5.39446983333333, 4.93945983333333,
4.8612215, 4.13088808333333, 4.09916575, 3.40943183333333, 3.79573258333333,
4.30319966666667, 4.23431266666667, 3.64880758333333, 3.11700716666667,
3.321058, 2.53599408333333, 2.20433991666667, 1.66643905833333,
0.84187275, 0.467880658333333, 0.810507858333333, 0.795)
Npoints <- 2 # number of previous points to take into account
I then create a data frame with the lagged time series, and create a test and train set:
supervised <- data.table(x = diff(Serie, differences = 1))
supervised[,c(paste0("x-",1:Npoints)) := lapply(1:Npoints,function(i){c(rep(NA,i),x[1:(.N-i)])})] # create shifted versions
# take the non NA
supervised <- supervised[!is.na(get(paste0("x-",Npoints)))]
head(supervised)
# Split dataset into training and testing sets
N = nrow(supervised)
n = round(N *0.7, digits = 0)
train = supervised[1:n, ]
test = supervised[(n+1):N, ]
I rescale the data
scale_data = function(train, test, feature_range = c(0, 1)) {
  x = train
  fr_min = feature_range[1]
  fr_max = feature_range[2]
  std_train = ((x - min(x, na.rm = T)) / (max(x, na.rm = T) - min(x, na.rm = T)))
  std_test = ((test - min(x, na.rm = T)) / (max(x, na.rm = T) - min(x, na.rm = T)))
  scaled_train = std_train * (fr_max - fr_min) + fr_min
  scaled_test = std_test * (fr_max - fr_min) + fr_min
  return(list(scaled_train = as.vector(scaled_train),
              scaled_test = as.vector(scaled_test),
              scaler = c(min = min(x, na.rm = T), max = max(x, na.rm = T))))
}
Scaled = scale_data(train, test, c(-1, 1))
# define x and y train
y_train = as.vector(Scaled$scaled_train[, 1])
x_train = Scaled$scaled_train[, -1]
And following this post, I reshape the data into 3D:
x_train_reshaped <- array(NA,dim= c(1,dim(x_train)))
x_train_reshaped[1,,] <- as.matrix(x_train)
I do the following model and try to start the learning :
model <- keras_model_sequential()
model %>%
  layer_lstm(units = 1, batch_size = 1, input_shape = dim(x_train), stateful = TRUE) %>%
  layer_dense(units = 1)

# compile model ####
model %>% compile(
  loss = 'mean_squared_error',
  optimizer = optimizer_adam(lr = 0.02, decay = 1e-6),
  metrics = c('accuracy')
)

# make a test
model %>% fit(x_train_reshaped, y_train, epochs = 1, batch_size = 1, verbose = 1, shuffle = FALSE)
but I get the following error:
Error in py_call_impl(callable, dots$args, dots$keywords) :
ValueError: No data provided for "dense_11". Need data for each key in: ['dense_11']
Trying to reshape the data differently didn't help.
What am I doing wrong?
Keras and tensorflow in R cannot recognise the size of your input/target data when they are data frames.
y_train is both a data.table and a data.frame:
class(y_train)
[1] "data.table" "data.frame"
The keras fit documentation states: "y: Vector, matrix, or array of target (label) data (or list if the model has multiple outputs)." Similarly, for x.
Unfortunately, there still appears to be an input and/or target dimensionality mismatch when y_train is cast to a matrix:
model %>%
fit(x_train_reshaped, as.matrix(y_train), epochs=1, batch_size=1, verbose=1, shuffle=FALSE)
Error in py_call_impl(callable, dots$args, dots$keywords) :
ValueError: Input arrays should have the same number of samples as target arrays.
Found 1 input samples and 39 target samples.
Hope this answer helps you, or someone else, make further progress.
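If it helps, here is one way to make the sample counts agree (a sketch under the assumption that each lagged row of x_train is meant to be one sample with Npoints timesteps of a single feature, rather than one sample containing the whole series):

# dim = (samples, timesteps, features): one sample per lagged row
x_train_arr <- array(as.matrix(x_train), dim = c(nrow(x_train), ncol(x_train), 1))
y_train_arr <- as.matrix(y_train)

model <- keras_model_sequential()
model %>%
  layer_lstm(units = 1, batch_size = 1, input_shape = c(ncol(x_train), 1), stateful = TRUE) %>%
  layer_dense(units = 1)
model %>% compile(loss = 'mean_squared_error', optimizer = optimizer_adam(lr = 0.02))
model %>% fit(x_train_arr, y_train_arr, epochs = 1, batch_size = 1, verbose = 1, shuffle = FALSE)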

Winbugs to Rjags beta binomial model translation

I am working through the textbook "Bayesian Ideas and Data Analysis" by Christensen et al.
There is a simple exercise in the book that involves cutting and pasting the following code to run in Winbugs:
model{
  y ~ dbin(theta, n)        # Model the data
  ytilde ~ dbin(theta, m)   # Prediction of future binomial
  theta ~ dbeta(a, b)       # The prior
  prob <- step(ytilde - 20) # Pred prob that ytilde >= 20
}
list(n=100, m=100, y=10, a=1, b=1) # The data
list(theta=0.5, ytilde=10)         # Starting/initial values
I am trying to translate this into R2jags code and am running into some trouble. I thought I could write the R2jags code fairly directly, in this fashion:
model {
  # Likelihoods
  y ~ dbin(theta, n)
  yt ~ dbin(theta, m)
  # Priors
  theta ~ dbeta(a, b)
  prob <- step(yt - 20)
}
with the R code:
library(R2jags)

n <- 100
m <- 100
y <- 10
a <- 1
b <- 1

jags.data <- list(n = n,
                  m = m,
                  y = y,
                  a = a,
                  b = b)

jags.init <- list(
  list(theta = 0.5, yt = 10), # Chain 1 init
  list(theta = 0.5, yt = 10), # Chain 2 init
  list(theta = 0.5, yt = 10)  # Chain 3 init
)

jags.param <- c("theta", "yt")

jags.fit <- jags.model(data = jags.data,
                       inits = jags.inits,
                       parameters.to.save = jags.param,
                       model.file = "hw21.bug",
                       n.chains = 3,
                       n.iter = 5000,
                       n.burnin = 100)
print(jags.fit)
However, calling the R code brings about the following error:
Error in jags.model(data = jags.data, inits = jags.inits, parameters.to.save = jags.param, :
unused arguments (parameters.to.save = jags.param, model.file = "hw21.bug", n.iter = 5000, n.burnin = 100)
Is it because I am missing a necessary for-loop in my R2jags model code?
The error is coming from the R function jags.model (not from JAGS) - you are trying to use arguments such as parameters.to.save with the wrong function: they belong to R2jags::jags(), while jags.model() comes from the rjags package.
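If you want to stay with R2jags, a sketch of the intended call (note that the question defines jags.init but passes jags.inits; the names must match):

library(R2jags)
jags.fit <- jags(data = jags.data,
                 inits = jags.init,
                 parameters.to.save = jags.param,
                 model.file = "hw21.bug",
                 n.chains = 3,
                 n.iter = 5000,
                 n.burnin = 100)
print(jags.fit)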
If you want to keep the model as similar to WinBUGS as possible, there is an easier way than specifying the data and initial values in R. Put the following into a text file called 'model.txt' in your working directory:
model{
  y ~ dbin(theta, n)        # Model the data
  ytilde ~ dbin(theta, m)   # Prediction of future binomial
  theta ~ dbeta(a, b)       # The prior
  prob <- step(ytilde - 20) # Pred prob that ytilde >= 20
}

data{
  list(n=100, m=100, y=10, a=1, b=1) # The data
}

inits{
  list(theta=0.5, ytilde=10) # Starting/initial values
}
And then run this in R:
library('runjags')
results <- run.jags('model.txt', monitor='theta')
results
plot(results)
For more information on this method of translating WinBUGS models to JAGS see:
http://runjags.sourceforge.net/quickjags.html
Matt
This old blog post has an extensive example of converting BUGS to JAGS accessed via the package rjags, not R2jags. (I like the package runjags even better.) I know we're supposed to present self-contained answers here, not just links, but the post is rather long. It goes through each logical step of a script, including the following (a brief skeleton is sketched after the list):
loading the package
specifying the model
assembling the data
initializing the chains
running the chains
examining the results
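A minimal rjags skeleton of those steps (a sketch, assuming a plain model file "hw21.bug" containing only the model{} block, with the data and initial values supplied from R as in the question):

library(rjags) # load the package
# specify the model, assemble the data, initialize the chains
m <- jags.model("hw21.bug", data = jags.data, inits = jags.init, n.chains = 3)
update(m, 100) # burn-in
# run the chains, monitoring the parameters of interest
samp <- coda.samples(m, variable.names = c("theta", "yt"), n.iter = 5000)
# examine the results
summary(samp)
plot(samp)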
