How do I average 20 saved ANN models? - jupyter-notebook

I have created 20 neural network models with the exact same architecture (but different seed numbers) and saved them as SM1, SM2, …, SM20. I am basically trying to import these models with joblib and use the averaged model (called NN here) for various analyses. When I run this in Jupyter Notebook, I get the following error. I appreciate any help.
NN1 = joblib.load('SM1.sav')
NN2 = joblib.load('SM2.sav')
NN3 = joblib.load('SM3.sav')
NN4 = joblib.load('SM4.sav')
NN5 = joblib.load('SM5.sav')
NN6 = joblib.load('SM6.sav')
NN7 = joblib.load('SM7.sav')
NN8 = joblib.load('SM8.sav')
NN9 = joblib.load('SM9.sav')
NN10 = joblib.load('SM10.sav')
NN11 = joblib.load('SM11.sav')
NN12 = joblib.load('SM12.sav')
NN13 = joblib.load('SM13.sav')
NN14 = joblib.load('SM14.sav')
NN15 = joblib.load('SM15.sav')
NN16 = joblib.load('SM16.sav')
NN17 = joblib.load('SM17.sav')
NN18 = joblib.load('SM18.sav')
NN19 = joblib.load('SM19.sav')
NN20 = joblib.load('SM20.sav')
NN = (NN1+NN2+NN3+NN4+NN5+NN6+NN7+NN8+NN9+NN10+NN11+NN12+NN13+NN14+NN15+NN16+NN17+NN18+NN19+NN20)/(20)
TypeError: unsupported operand type(s) for +: 'MLPRegressor' and 'MLPRegressor'
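Since scikit-learn estimators do not define +, one option is to average the models' predictions rather than the model objects themselves. Here is a minimal sketch; the helper ensemble_predict and the loop over the file names are illustrative, not part of the original code:

import joblib
import numpy as np

# Load the 20 saved models into a list instead of 20 separate variables.
models = [joblib.load(f"SM{i}.sav") for i in range(1, 21)]

def ensemble_predict(X):
    # Average the 20 models' predictions (an ensemble mean).
    return np.mean([m.predict(X) for m in models], axis=0)

Averaging the learned weights themselves (coefs_ and intercepts_) is technically possible since the architectures match, but averaging independently trained networks weight-by-weight usually degrades accuracy, so the prediction average above is the safer choice.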

Related

RuntimeError: quantile() q tensor must be same dtype as the input tensor in pytorch-forecasting

PyTorch-Forecasting version: 0.10.2
PyTorch version: 1.12.1
Python version: 3.10.4
Operating System: Windows
Expected behavior
No Error
Actual behavior
The Error is
File c:\Users\josepeeterson.er\Miniconda3\envs\pytorch\lib\site-packages\pytorch_forecasting\metrics\base_metrics.py:979, in DistributionLoss.to_quantiles(self, y_pred, quantiles, n_samples)
977 except NotImplementedError: # resort to derive quantiles empirically
978 samples = torch.sort(self.sample(y_pred, n_samples), -1).values
--> 979 quantiles = torch.quantile(samples, torch.tensor(quantiles, device=samples.device), dim=2).permute(1, 2, 0)
980 return quantiles
RuntimeError: quantile() q tensor must be same dtype as the input tensor
How do I set them to be of the same datatype? This happens internally; I do not have control over it. I am not using any GPUs.
The link to the .csv file with input data is https://github.com/JosePeeterson/Demand_forecasting
The data is just sampled from a negative binomial distribution with parameters (9, 0.5) every 4 hours; the time in between is all zeros.
I just want to see if DeepAR can learn this pattern.
Code to reproduce the problem
from pytorch_forecasting.data.examples import generate_ar_data
import matplotlib.pyplot as plt
import pandas as pd
from pytorch_forecasting.data import TimeSeriesDataSet
from pytorch_forecasting.data import NaNLabelEncoder
from pytorch_lightning.callbacks import EarlyStopping, LearningRateMonitor
import pytorch_lightning as pl
from pytorch_forecasting import NegativeBinomialDistributionLoss, DeepAR
import torch
from pytorch_forecasting.data.encoders import TorchNormalizer
data = pd.read_csv('1_f_nbinom_train.csv')
data["date"] = pd.Timestamp("2021-08-24") + pd.to_timedelta(data.time_idx, "H")
data['_hour_of_day'] = data["date"].dt.hour.astype(str)
data['_day_of_week'] = data["date"].dt.dayofweek.astype(str)
data['_day_of_month'] = data["date"].dt.day.astype(str)
data['_day_of_year'] = data["date"].dt.dayofyear.astype(str)
data['_week_of_year'] = data["date"].dt.weekofyear.astype(str)
data['_month_of_year'] = data["date"].dt.month.astype(str)
data['_year'] = data["date"].dt.year.astype(str)
max_encoder_length = 60
max_prediction_length = 20
training_cutoff = data["time_idx"].max() - max_prediction_length
training = TimeSeriesDataSet(
    data.iloc[0:-620],
    time_idx="time_idx",
    target="value",
    categorical_encoders={
        "series": NaNLabelEncoder(add_nan=True).fit(data.series),
        "_hour_of_day": NaNLabelEncoder(add_nan=True).fit(data._hour_of_day),
        "_day_of_week": NaNLabelEncoder(add_nan=True).fit(data._day_of_week),
        "_day_of_month": NaNLabelEncoder(add_nan=True).fit(data._day_of_month),
        "_day_of_year": NaNLabelEncoder(add_nan=True).fit(data._day_of_year),
        "_week_of_year": NaNLabelEncoder(add_nan=True).fit(data._week_of_year),
        "_year": NaNLabelEncoder(add_nan=True).fit(data._year),
    },
    group_ids=["series"],
    min_encoder_length=max_encoder_length,
    max_encoder_length=max_encoder_length,
    min_prediction_length=max_prediction_length,
    max_prediction_length=max_prediction_length,
    time_varying_unknown_reals=["value"],
    time_varying_known_categoricals=["_hour_of_day", "_day_of_week", "_day_of_month", "_day_of_year", "_week_of_year", "_year"],
    time_varying_known_reals=["time_idx"],
    add_relative_time_idx=False,
    randomize_length=None,
    scalers={},
    target_normalizer=TorchNormalizer(method="identity", center=False, transformation=None),
)
validation = TimeSeriesDataSet.from_dataset(
    training,
    data.iloc[-620:-420],
    # predict=True,
    stop_randomization=True,
)
batch_size = 64
train_dataloader = training.to_dataloader(train=True, batch_size=batch_size, num_workers=8)
val_dataloader = validation.to_dataloader(train=False, batch_size=batch_size, num_workers=8)
# save datasets
training.save("training.pkl")
validation.save("validation.pkl")
early_stop_callback = EarlyStopping(monitor="val_loss", min_delta=1e-4, patience=5, verbose=False, mode="min")
lr_logger = LearningRateMonitor()
trainer = pl.Trainer(
    max_epochs=10,
    gpus=0,
    gradient_clip_val=0.1,
    limit_train_batches=30,
    limit_val_batches=3,
    # fast_dev_run=True,
    # logger=logger,
    # profiler=True,
    callbacks=[lr_logger, early_stop_callback],
)
deepar = DeepAR.from_dataset(
    training,
    learning_rate=0.1,
    hidden_size=32,
    dropout=0.1,
    loss=NegativeBinomialDistributionLoss(),
    log_interval=10,
    log_val_interval=3,
    # reduce_on_plateau_patience=3,
)
print(f"Number of parameters in network: {deepar.size()/1e3:.1f}k")
torch.set_num_threads(10)
trainer.fit(
    deepar,
    train_dataloaders=train_dataloader,
    val_dataloaders=val_dataloader,
)
The fix is to cast samples with torch.tensor as shown below; then save the modified base_metrics.py and rerun the code above.
except NotImplementedError:  # resort to deriving quantiles empirically
    samples = torch.sort(self.sample(y_pred, n_samples), -1).values
    quantiles = torch.quantile(torch.tensor(samples), torch.tensor(quantiles, device=samples.device), dim=2).permute(1, 2, 0)
return quantiles
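An alternative patch, if you are editing the library anyway, is to build the quantile tensor with the dtype of samples explicitly, since (as the error says) torch.quantile requires q and the input tensor to share a dtype. A standalone sketch, with made-up shapes for illustration:

import torch

samples = torch.sort(torch.rand(8, 16, 100, dtype=torch.float64), -1).values  # hypothetical samples
quantiles = [0.02, 0.1, 0.25, 0.5, 0.75, 0.9, 0.98]

# Build q with the same dtype and device as the input tensor:
q = torch.tensor(quantiles, dtype=samples.dtype, device=samples.device)
out = torch.quantile(samples, q, dim=2).permute(1, 2, 0)  # -> shape (8, 16, 7)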

Issue with TFX Trainer component not outputting model to filesystem

First of all, I am using TFX version 0.21.2 and TensorFlow version 2.1.
I have constructed a pipeline largely following the Chicago taxi example. When the Trainer component is executed I can see the following in the logs:
INFO - Training complete. Model written to /root/airflow/tfx/pipelines/fish/Trainer/model/9/serving_model_dir
But when I check that directory, it is empty. What am I missing?
This is my DAG definition file (import statements omitted):
_pipeline_name = 'fish'
_airflow_config = AirflowPipelineConfig(airflow_dag_config={
    'schedule_interval': None,
    'start_date': datetime.datetime(2019, 1, 1),
})
_project_root = os.path.join(os.environ['HOME'], 'airflow')
_data_root = os.path.join(_project_root, 'data', 'fish_data')
_module_file = os.path.join(_project_root, 'dags', 'fishUtils.py')
_serving_model_dir = os.path.join(_project_root, 'serving_model', _pipeline_name)
_tfx_root = os.path.join(_project_root, 'tfx')
_pipeline_root = os.path.join(_tfx_root, 'pipelines', _pipeline_name)
_metadata_path = os.path.join(_tfx_root, 'metadata', _pipeline_name, 'metadata.db')
def _create_pipeline(pipeline_name: Text, pipeline_root: Text, data_root: Text,
                     module_file: Text, serving_model_dir: Text,
                     metadata_path: Text,
                     direct_num_workers: int) -> pipeline.Pipeline:
    examples = external_input(data_root)
    example_gen = CsvExampleGen(input=examples)
    statistics_gen = StatisticsGen(examples=example_gen.outputs['examples'])
    infer_schema = SchemaGen(
        statistics=statistics_gen.outputs['statistics'],
        infer_feature_shape=False)
    validate_stats = ExampleValidator(
        statistics=statistics_gen.outputs['statistics'],
        schema=infer_schema.outputs['schema'])
    trainer = Trainer(
        examples=example_gen.outputs['examples'],
        schema=infer_schema.outputs['schema'],
        module_file=_module_file,
        train_args=trainer_pb2.TrainArgs(num_steps=10000),
        eval_args=trainer_pb2.EvalArgs(num_steps=5000))
    model_validator = ModelValidator(
        examples=example_gen.outputs['examples'],
        model=trainer.outputs['model'])
    pusher = Pusher(
        model=trainer.outputs['model'],
        model_blessing=model_validator.outputs['blessing'],
        push_destination=pusher_pb2.PushDestination(
            filesystem=pusher_pb2.PushDestination.Filesystem(
                base_directory=_serving_model_dir)))
    return pipeline.Pipeline(
        pipeline_name=_pipeline_name,
        pipeline_root=_pipeline_root,
        components=[
            example_gen,
            statistics_gen,
            infer_schema,
            validate_stats,
            trainer,
            model_validator,
            pusher,
        ],
        enable_cache=True,
        metadata_connection_config=metadata.sqlite_metadata_connection_config(metadata_path),
        beam_pipeline_args=['--direct_num_workers=%d' % direct_num_workers],
    )

runner = AirflowDagRunner(config=_airflow_config)
DAG = runner.run(
    _create_pipeline(
        pipeline_name=_pipeline_name,
        pipeline_root=_pipeline_root,
        data_root=_data_root,
        module_file=_module_file,
        serving_model_dir=_serving_model_dir,
        metadata_path=_metadata_path,
        # 0 means auto-detect based on the number of CPUs available at execution time.
        direct_num_workers=0))
And this is my module file:
_DENSE_FLOAT_FEATURE_KEYS = ['length']
real_valued_columns = [tf.feature_column.numeric_column('length')]

def _eval_input_receiver_fn():
    serialized_tf_example = tf.compat.v1.placeholder(
        dtype=tf.string, shape=[None], name='input_example_tensor')
    features = tf.io.parse_example(
        serialized=serialized_tf_example,
        features={
            'length': tf.io.FixedLenFeature([], tf.float32),
            'label': tf.io.FixedLenFeature([], tf.int64),
        })
    receiver_tensors = {'examples': serialized_tf_example}
    return tfma.export.EvalInputReceiver(
        features={'length': features['length']},
        receiver_tensors=receiver_tensors,
        labels=features['label'],
    )

def parser(serialized_example):
    features = tf.io.parse_single_example(
        serialized_example,
        features={
            'length': tf.io.FixedLenFeature([], tf.float32),
            'label': tf.io.FixedLenFeature([], tf.int64),
        })
    return ({'length': features['length']}, features['label'])

def _input_fn(filenames):
    # TFRecordDataset doesn't directly accept paths with wildcards
    filenames = tf.data.Dataset.list_files(filenames)
    dataset = tf.data.TFRecordDataset(filenames, 'GZIP')
    dataset = dataset.map(parser)
    dataset = dataset.shuffle(2000)
    dataset = dataset.batch(40)
    dataset = dataset.repeat(10)
    return dataset

def trainer_fn(trainer_fn_args, schema):
    estimator = tf.estimator.LinearClassifier(feature_columns=real_valued_columns)
    train_input_fn = lambda: _input_fn(trainer_fn_args.train_files)
    train_spec = tf.estimator.TrainSpec(
        train_input_fn,
        max_steps=trainer_fn_args.train_steps)
    eval_input_fn = lambda: _input_fn(trainer_fn_args.eval_files)
    eval_spec = tf.estimator.EvalSpec(
        eval_input_fn,
        steps=trainer_fn_args.eval_steps,
        name='fish-eval')
    receiver_fn = lambda: _eval_input_receiver_fn()
    return {
        'estimator': estimator,
        'train_spec': train_spec,
        'eval_spec': eval_spec,
        'eval_input_receiver_fn': receiver_fn,
    }
Thank you in advance for your help!
Posting the solution for anyone facing the same problem.
The reason the model was not written to the filesystem is that the estimator needs a config argument to know where to write the model.
The following modification to the trainer_fn function should solve the problem.
run_config = tf.estimator.RunConfig(save_checkpoints_steps=999, keep_checkpoint_max=1)
run_config = run_config.replace(model_dir=trainer_fn_args.serving_model_dir)
estimator = tf.estimator.LinearClassifier(feature_columns=real_valued_columns, config=run_config)
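For context, this is roughly how those lines slot into the trainer_fn from the question (a sketch; everything else in the function stays as before):

def trainer_fn(trainer_fn_args, schema):
    # Without an explicit model_dir the estimator writes to a temporary
    # directory, which is why the serving_model_dir stayed empty.
    run_config = tf.estimator.RunConfig(save_checkpoints_steps=999, keep_checkpoint_max=1)
    run_config = run_config.replace(model_dir=trainer_fn_args.serving_model_dir)
    estimator = tf.estimator.LinearClassifier(
        feature_columns=real_valued_columns, config=run_config)
    # ... build train_spec, eval_spec, and receiver_fn exactly as before ...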

How can I access the trained parameters of a Neural ODE in Julia?

I'm trying to fit a Neural ODE to a time series using Julia's DiffEqFlux. Here is my code:
u0 = Float32[2.; 0]
train_size = 15
tspan_train = (0.0f0, 0.75f0)

function trueODEfunc(du, u, p, t)
    true_A = [-0.1 2.0; -2.0 -0.1]
    du .= ((u.^3)'true_A)'
end

t_train = range(tspan_train[1], tspan_train[2], length = train_size)
prob = ODEProblem(trueODEfunc, u0, tspan_train)
ode_data_train = Array(solve(prob, Tsit5(), saveat = t_train))

dudt = Chain(
    Dense(2, 50, tanh),
    Dense(50, 2))
ps = Flux.params(dudt)
n_ode = NeuralODE(dudt, tspan_train, Tsit5(), saveat = t_train, reltol = 1e-7, abstol = 1e-9)
n_ode.p

function predict_n_ode(p)
    n_ode(u0, p)
end

function loss_n_ode(p)
    pred = predict_n_ode(p)
    loss = sum(abs2, ode_data_train .- pred)
    loss, pred
end

final_p = []
losses = []

cb = function(p, l, pred)
    display(l)
    display(p)
    push!(final_p, p)
    push!(losses, l)
    pl = scatter(t_train, ode_data_train[1,:], label = "data")
    scatter!(pl, t_train, pred[1,:], label = "prediction")
    display(plot(pl))
end

DiffEqFlux.sciml_train!(loss_n_ode, n_ode.p, ADAM(0.05), cb = cb, maxiters = 100)
n_ode.p
The problem is that calling n_ode.p (or Flux.params(dudt)) before and after the train function gives me back the same values. I would have expected to get the latest updated values from the training. That's why I created an array to gather all parameter values during training and then accessed it to get the updated parameters.
Am I doing something wrong in the code? Does the train function automatically update the parameters? If not, how do I enforce it?
Thanks in advance!
The result is an object that holds the best parameters. Here's a complete example:
using DiffEqFlux, OrdinaryDiffEq, Flux, Optim, Plots

u0 = Float32[2.; 0.]
datasize = 30
tspan = (0.0f0, 1.5f0)

function trueODEfunc(du, u, p, t)
    true_A = [-0.1 2.0; -2.0 -0.1]
    du .= ((u.^3)'true_A)'
end

t = range(tspan[1], tspan[2], length = datasize)
prob = ODEProblem(trueODEfunc, u0, tspan)
ode_data = Array(solve(prob, Tsit5(), saveat = t))

dudt2 = FastChain((x, p) -> x.^3,
                  FastDense(2, 50, tanh),
                  FastDense(50, 2))
n_ode = NeuralODE(dudt2, tspan, Tsit5(), saveat = t)

function predict_n_ode(p)
    n_ode(u0, p)
end

function loss_n_ode(p)
    pred = predict_n_ode(p)
    loss = sum(abs2, ode_data .- pred)
    loss, pred
end

loss_n_ode(n_ode.p) # n_ode.p stores the initial parameters of the neural ODE

cb = function (p, l, pred; doplot = false) # callback function to observe training
    display(l)
    # plot current prediction against data
    if doplot
        pl = scatter(t, ode_data[1,:], label = "data")
        scatter!(pl, t, pred[1,:], label = "prediction")
        display(plot(pl))
    end
    return false
end

# Display the ODE with the initial parameter values.
cb(n_ode.p, loss_n_ode(n_ode.p)...)

res1 = DiffEqFlux.sciml_train(loss_n_ode, n_ode.p, ADAM(0.05), cb = cb, maxiters = 300)
cb(res1.minimizer, loss_n_ode(res1.minimizer)...; doplot = true)

res2 = DiffEqFlux.sciml_train(loss_n_ode, res1.minimizer, LBFGS(), cb = cb)
cb(res2.minimizer, loss_n_ode(res2.minimizer)...; doplot = true)

# result is res2 as an Optim.jl object
# res2.minimizer are the best parameters
# res2.minimum is the best loss
At the end, the sciml_train function returns a result object that holds information about the optimization, including the final parameters as .minimizer.
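So, to read the trained parameters, go through the returned result object rather than n_ode.p; for example, building on the code above:

best_p = res2.minimizer            # trained parameters after both optimizers
trained_pred = n_ode(u0, best_p)   # evaluate the neural ODE with them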

Parallel computing on two R servers using batchtools/BatchJobs

I'm trying to use batchtools/BatchJobs for parallel computing on two Unix-based R servers. I'm completely new to this, so I followed a few articles and the package documentation. I have added some links below:
batchtools,
BatchJobs
So far I have not really understood how to use batchtools across multiple machines. With BatchJobs, on the other hand, I have made better progress.
I first made an SSH connection from the terminal and executed the following lines:
reg = makeRegistry("TestExp")
reg$cluster.functions = makeClusterFunctionsSSH(worker = makeSSHWorker(nodename="sla19438")) #By BatchJobs
#Test Function
piApprox = function(n) {
    nums = matrix(runif(2 * n), ncol = 2)
    d = sqrt(nums[, 1]^2 + nums[, 2]^2)
    4 * mean(d <= 1)
}
set.seed(42)
piApprox(1000)
BatchJobs::batchMap(reg = reg, fun = piApprox, n = rep(1e7, 10))
getJobTable()
BatchJobs::submitJobs(reg = reg, resources = list(walltime = 3600, memory = 1024))
getStatus(reg = reg)
loadResult(reg = reg, id = 5)
mean(sapply(1:10, loadResult, reg = reg))
It works and gives me the results, but I can't see any indication of the jobs running on the other machine (sla19438) when I run top in its terminal.
Please help me understand what I'm doing wrong. Maybe some configuration is needed, but I haven't found any material online that breaks the steps down for a newbie like me.
Thanks
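For what it's worth, here is a minimal batchtools sketch of the same setup. The ncpus value is an assumption, and note that batchtools expects the registry's file directory to be accessible from the workers (e.g. on a shared filesystem), which may be why nothing shows up on sla19438:

library(batchtools)

# Registry on a directory both servers can read (assumption: shared filesystem).
reg = makeRegistry(file.dir = "TestExp_registry")
reg$cluster.functions = makeClusterFunctionsSSH(
    workers = list(Worker$new("sla19438", ncpus = 4))  # ncpus = 4 is an assumption
)

piApprox = function(n) {
    nums = matrix(runif(2 * n), ncol = 2)
    d = sqrt(nums[, 1]^2 + nums[, 2]^2)
    4 * mean(d <= 1)
}

batchMap(piApprox, n = rep(1e7, 10), reg = reg)
submitJobs(reg = reg)
getStatus(reg = reg)
mean(unlist(reduceResultsList(reg = reg)))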

Catching the printed output of a function

I am using the package fda, in particular the function fRegress. This function calls an internal function named eigchk that checks whether the coefficient matrix is singular.
Here is the function as the package authors (J. O. Ramsay, Giles Hooker, and Spencer Graves) wrote it:
eigchk <- function(Cmat) {
    # check Cmat for singularity
    eigval <- eigen(Cmat)$values
    ncoef <- length(eigval)
    if (eigval[ncoef] < 0) {
        neig <- min(length(eigval), 10)
        cat("\nSmallest eigenvalues:\n")
        print(eigval[(ncoef - neig + 1):ncoef])
        cat("\nLargest eigenvalues:\n")
        print(eigval[1:neig])
        stop("Negative eigenvalue of coefficient matrix.")
    }
    if (eigval[ncoef] == 0) stop("Zero eigenvalue of coefficient matrix.")
    logcondition <- log10(eigval[1]) - log10(eigval[ncoef])
    if (logcondition > 12) {
        warning("Near singularity in coefficient matrix.")
        cat(paste("\nLog10 Eigenvalues range from\n",
                  log10(eigval[ncoef]), " to ", log10(eigval[1]), "\n"))
    }
}
As you can see, the last if condition checks whether logcondition is greater than 12 and then prints the range of the eigenvalues.
The following code applies regularization with a roughness penalty. The code is taken from the book "Functional Data Analysis with R and MATLAB".
annualprec = log10(apply(daily$precav, 2, sum))
tempbasis = create.fourier.basis(c(0, 365), 65)
tempSmooth = smooth.basis(day.5, daily$tempav, tempbasis)
tempfd = tempSmooth$fd
templist = vector("list", 2)
templist[[1]] = rep(1, 35)
templist[[2]] = tempfd
conbasis = create.constant.basis(c(0, 365))
betalist = vector("list", 2)
betalist[[1]] = conbasis
SSE = sum((annualprec - mean(annualprec))^2)
Lcoef = c(0, (2*pi/365)^2, 0)
harmaccelLfd = vec2Lfd(Lcoef, c(0, 365))
betabasis = create.fourier.basis(c(0, 365), 35)
lambda = 10^12.5
betafdPar = fdPar(betabasis, harmaccelLfd, lambda)
betalist[[2]] = betafdPar
annPrecTemp = fRegress(annualprec, templist, betalist)
betaestlist2 = annPrecTemp$betaestlist
annualprechat2 = annPrecTemp$yhatfdobj
SSE1.2 = sum((annualprec - annualprechat2)^2)
RSQ2 = (SSE - SSE1.2)/SSE
Fratio2 = ((SSE - SSE1.2)/3.7)/(SSE1.2/30.3)
resid = annualprec - annualprechat2
SigmaE. = sum(resid^2)/(35 - annPrecTemp$df)
SigmaE = SigmaE.*diag(rep(1, 35))
y2cMap = tempSmooth$y2cMap
stderrList = fRegress.stderr(annPrecTemp, y2cMap, SigmaE)
betafdPar = betaestlist2[[2]]
betafd = betafdPar$fd
betastderrList = stderrList$betastderrlist
betastderrfd = betastderrList[[2]]
As the penalty factor the authors use a particular lambda.
The following code implements the search for an appropriate lambda.
loglam = seq(5, 15, 0.5)
nlam = length(loglam)
SSE.CV = matrix(0, nlam, 1)
for (ilam in 1:nlam) {
    lambda = 10^loglam[ilam]
    betalisti = betalist
    betafdPar2 = betalisti[[2]]
    betafdPar2$lambda = lambda
    betalisti[[2]] = betafdPar2
    fRegi = fRegress.CV(annualprec, templist, betalisti)
    SSE.CV[ilam] = fRegi$SSE.CV
}
By varying loglam and cross-validating I am supposed to acquire the best lambda, yet if the range of loglam is too big, or its values lead the coefficient matrix to singularity, I receive the following message:
Log10 Eigenvalues range from
-5.44495317739048 to 6.78194912518214
It is created by the function eigchk, as I have already mentioned above.
Now my question is: is there any way to catch this so-called warning? By catch I mean some function or method that alerts me when this has happened, so that I can adjust the values of loglam. Since there is no actual warning definition in the function besides this printed message, I have run out of ideas.
Thank you all a lot for your suggestions.
By "catch the warning", if you mean, will alert you that there is a potential problem with loglam, then you might want to look at try and tryCatch functions. Then you can define the behavior you want implemented if any warning condition is satisfied.
If you just want to store the output of the warning (which might be assumed from the question title, but may not be what you want), then try looking into capture.output.
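For instance, a minimal sketch wrapping the cross-validation call from the question (the NA sentinel and the message text are illustrative choices, not part of fda):

for (ilam in 1:nlam) {
    lambda = 10^loglam[ilam]
    betalisti = betalist
    betafdPar2 = betalisti[[2]]
    betafdPar2$lambda = lambda
    betalisti[[2]] = betafdPar2
    # tryCatch turns the "Near singularity" warning into a flagged result,
    # so problematic loglam values can be skipped or adjusted afterwards.
    SSE.CV[ilam] = tryCatch(
        fRegress.CV(annualprec, templist, betalisti)$SSE.CV,
        warning = function(w) {
            message("loglam = ", loglam[ilam], ": ", conditionMessage(w))
            NA  # sentinel marking this lambda as problematic
        }
    )
}

# To also keep the printed eigenvalue range as text, wrap the call in capture.output():
# printed = capture.output(fRegi <- fRegress.CV(annualprec, templist, betalisti))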
