I am currently starting to work with Keras/Tensorflow in R and am therefore working through the tensorflow tutorial.
However, when I try to normalize the feature space in the same way as described in the tutorial, I receive an error message/exception.
I found a Kaggle notebook online that tried to reproduce the tensorflow tutorial as well, and it also got stuck at the exact same error message. See https://www.kaggle.com/code/kewagbln/boston-housing-regression-with-tensorflow/notebook.
Does anyone understand why I am getting the error message? Ultimately, I am not even coding on my own but just copying out of the tutorial and it still does not work.
To provide some more information: I am running the following code:
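rm(list = ls())
# packages used below (tibble and dplyr supply as_tibble(), mutate() and select())
library(keras)
library(tensorflow)
library(tfdatasets)
library(tibble)
library(dplyr)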
boston_housing <- dataset_boston_housing()
c(train_data, train_labels) %<-% boston_housing$train
c(test_data, test_labels) %<-% boston_housing$test
paste0("Training entries: ", length(train_data), ", labels: ", length(train_labels))
column_names <- c('CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE',
'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT')
train_df <- train_data %>%
as_tibble(.name_repair = "minimal") %>%
setNames(column_names) %>%
mutate(label = train_labels)
test_df <- test_data %>%
as_tibble(.name_repair = "minimal") %>%
setNames(column_names) %>%
mutate(label = test_labels)
spec <- feature_spec(train_df, label ~ . ) %>%
step_numeric_column(all_numeric(), normalizer_fn = scaler_standard()) %>%
fit()
layer <- layer_dense_features(
feature_columns = dense_features(spec),
dtype = tf$float32
)
# this is the call that raises the exception
layer(train_df)
input <- layer_input_from_dataset(train_df %>% select(-label))
output <- input %>%
layer_dense_features(dense_features(spec)) %>%
layer_dense(units = 64, activation = "relu") %>%
layer_dense(units = 64, activation = "relu") %>%
layer_dense(units = 1)
model <- keras_model(input, output)
Overall, the code runs just fine and I can train a simple neural network. The exception is being raised when calling layer(train_df). This, however, seems to have no impact on the overall model construction.
I have a recipe with a step_mutate() step in the middle that performs text transformations on the Titanic dataset, using the stringr package.
extract_title <- function(x) stringr::str_remove(str_extract(x, "Mr\\.? |Mrs\\.?|Miss\\.?|Master\\.?"), "\\.")
rf_recipe <-
recipe(Survived ~ ., data = titanic_train) %>%
step_impute_mode(Embarked) %>%
step_mutate(Cabin = if_else(is.na(Cabin), "Yes", "No"),
Title = if_else(is.na(extract_title(Name)), "Other", extract_title(Name))) %>%
step_impute_knn(Age, impute_with = c("Title", "Sex", "SibSp", "Parch")) %>%
update_role(PassengerId, Name, new_role = "id")
This set of transformations works perfectly well with rf_recipe %>% prep() %>% bake(new_data = NULL).
When I try to fit a random forest model with hyperparameter tuning and 10-fold cross-validation within a workflow, all models fail. The .notes column explicitly says that there was a problem with the mutate() column Title: the function str_remove() could not be found.
rf_res <-
resamples = titanic_folds,
grid = rf_grid,
control = control_resamples(save_pred = TRUE)
As this post suggests, I've explicitly told R that str_remove should be found in the stringr package. Why isn't this working, and what could be causing it?
I don't think this will fix the error, but just in case: the str_extract call is not written as stringr::str_extract. Did you load the package?
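To make that suggestion concrete, here is a minimal sketch of the helper with both stringr calls namespaced, so it does not rely on stringr being attached in whichever R session evaluates the recipe. The sample names are made up for illustration, and this only addresses how the stringr functions are looked up; it is not confirmed to be the root cause of the failure.

```r
# Helper from the question, with both stringr calls fully qualified.
extract_title <- function(x) {
  stringr::str_remove(
    stringr::str_extract(x, "Mr\\.? |Mrs\\.?|Miss\\.?|Master\\.?"),
    "\\."
  )
}

# Hypothetical sample names, only for illustration.
sample_names <- c("Cumings, Mrs. John Bradley", "Smith, Dr. John")
titles <- extract_title(sample_names)
print(titles)
```

Names without one of the four titles come back as NA, which is what the if_else(is.na(...), "Other", ...) in the recipe relies on.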
The error shows up because step_impute_knn(), and subsequently the gower::gower_topn function, transforms all characters to factors. To overcome this issue I had to apply the prep() and bake() functions myself, without including the recipe in the workflow.
prep_recipe <- prep(rf_recipe)
train_processed <- bake(prep_recipe, new_data = NULL)
test_processed <- bake(prep_recipe, new_data = titanic_test %>%
mutate(across(where(is.character), as.factor)))
Now the models converge.
I am using the R programming language. I am trying to follow the R tutorial over here on neural networks (lstm) and time series: https://blogs.rstudio.com/ai/posts/2018-06-25-sunspots-lstm/
I decided to create my own time series data ("y.mon") for this tutorial (the same format and the same variable names) :
index = seq(as.Date("1749/1/1"), as.Date("2016/1/1"),by="day")
index <- format(as.Date(index), "%Y/%m/%d")
value <- rnorm(97520,27,2.1)
final_data <- data.frame(index, value)
y.mon <- aggregate(value ~ format(as.Date(index),
format="%Y/%m"),data=final_data, FUN=sum)
y.mon$index = y.mon$`format(as.Date(index), format = "%Y/%m")`
y.mon$`format(as.Date(index), format = "%Y/%m")` = NULL
y.mon %>%
mutate(index = paste0(index, '/01')) %>%
tk_tbl() %>%
mutate(index = as_date(index)) %>%
as_tbl_time(index = index) -> y.mon
From here on, I follow the instructions in the tutorial (replacing the "sun_spots" data with "y.mon"). Everything works fine until this point (I posted a question yesterday that got closed for being too detailed: https://stackoverflow.com/questions/65527230/r-error-in-is-symbolx-object-not-found-keras ; the code can be followed from the RStudio tutorial):
coln <- colnames(compare_train)[4:ncol(compare_train)]
cols <- map(coln, quo(sym(.)))
rsme_train <-
map_dbl(cols, function(col)
rmse(
compare_train,
truth = value,
estimate = !!col,
na.rm = TRUE
)) %>% mean()
Error in is_symbol(x) : object '.' not found
I found another Stack Overflow post which deals with a similar problem: "Getting error message while calculating rmse in a time series analysis".
According to this stackoverflow post, this first error can be resolved like this:
coln <- colnames(compare_train)[4:ncol(compare_train)]
rsme_train <-
map_df(coln, function(col)
rmse(
compare_train,
truth = value,
estimate = !!col,
na.rm = TRUE
)) %>%
pull(.estimate) %>%
mean()
However, the next part of the tutorial contains similar code, and the same error persists there even after applying the corrections:
compare_test %>% write_csv(str_replace(model_path, ".hdf5", ".test.csv"))
compare_test[FLAGS$n_timesteps:(FLAGS$n_timesteps + 10), c(2, 4:8)] %>% print()
cols <- map(coln, quo(sym(.)))
rsme_test <-
map_dbl(cols, function(col)
rmse(
compare_test,
truth = value,
estimate = !!col,
na.rm = TRUE
)) %>% mean()
Error in stri_replace_first_regex(string, pattern, fix_replacement(replacement), :
object 'model_path' not found
Error in is_symbol(x) : object '.' not found
These errors are preventing me from finishing the rest of the tutorial.
Can someone please show me how to fix these?
Try using coln in map_dbl :
rsme_test <- map_dbl(coln, function(col)
rmse(
compare_test,
truth = value,
estimate = !!col,
na.rm = TRUE
)) %>% mean()
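For anyone who wants to keep the symbol-based version, here is a minimal sketch of one way to build the symbols explicitly with rlang::syms() instead of map(coln, quo(sym(.))). The toy data frame and the column names pred_1 and pred_2 are made up for illustration, and the RMSE is computed by hand with dplyr rather than with the tutorial's rmse() call, so treat it as an illustration of the unquoting pattern only.

```r
library(dplyr)
library(purrr)
library(rlang)

# Toy stand-in for compare_train (hypothetical values and column names).
compare_toy <- tibble(
  value  = c(1, 2, 3),
  pred_1 = c(1, 2, 4),
  pred_2 = c(2, 2, 3)
)

coln <- c("pred_1", "pred_2")
cols <- syms(coln)  # list of symbols, one per prediction column

# Unquote each symbol with !! inside a data-masking verb.
rmse_each <- map_dbl(cols, function(col) {
  compare_toy %>%
    summarise(r = sqrt(mean((value - !!col)^2, na.rm = TRUE))) %>%
    pull(r)
})

rmse_mean <- mean(rmse_each)
print(rmse_mean)
```

Here syms() turns each column name into a symbol up front, so nothing depends on `.` being defined at the point where quo() is evaluated.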
I understand the behavior of map as far as the following code goes.
iris %>%
group_nest(Species) %>%
mutate(lm_mod = map(data,function(x){
The above code works in my head as follows.
for(i in unique(iris$species))
data <- slice iris$species[i]
function(data){lm(width ~ length,data)}
lm_mod[i] <- function(data)
But I'm confused when I encounter new code.
folds %>%
mutate(recipes = map(splits, prepper, recipe = recipe_code))
in my head
for(i in len(folds))
folds$recipes <- prepper(splits) ??recipe = recipe_code??
Where does the recipe_code go in?
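A minimal sketch of how purrr::map() treats extra named arguments may help here: anything after the function is passed along unchanged to every call, so map(splits, prepper, recipe = recipe_code) would call prepper(splits[[i]], recipe = recipe_code) for each i. The function scale_by below is a made-up stand-in for prepper, used only so the example runs without rsample or recipes.

```r
library(purrr)

# Hypothetical stand-in for prepper(): first argument varies, `factor` is fixed.
scale_by <- function(x, factor) x * factor

inputs <- 1:3

# map() passes `factor = 10` to every call of scale_by().
res_map <- map(inputs, scale_by, factor = 10)

# The same thing written as an explicit loop.
res_loop <- vector("list", length(inputs))
for (i in seq_along(inputs)) {
  res_loop[[i]] <- scale_by(inputs[[i]], factor = 10)
}

same_result <- identical(res_map, res_loop)
print(same_result)
```

So in the loop version of the folds example, recipe_code would sit inside the call as a fixed second argument, while splits[[i]] changes on each iteration.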
I'm using R simmer to do a simulation. However, I receive this error message every time when I run it:
Error: 'truck0' at 48.73 in [Seize]->Timeout->[Release]: Expecting a
single value: [extent=11].
What is wrong with this?
This is my R script:
#load packages
library(simmer)
#create a simulation environment
env <- simmer("Terminal")
#create a truck trajectory
truck <- trajectory("Truck path", verbose = TRUE)
#draw model
truck %>%
seize("frontdesk",1) %>%
timeout(function() rnorm(11.27671,3.233562)) %>%
release("frontdesk",1) %>%
seize("gate-in",1) %>%
timeout(function() rnorm(17.54509,9.915719)) %>%
release("gate-in",1) %>%
seize("station",1) %>%
timeout(function() rnorm(12.68418,12.55247)) %>%
release("station",1) %>%
seize("lashing",1) %>%
timeout(function() rnorm(28.87726,21.0809)) %>%
release("lashing",1) %>%
seize("control",1) %>%
timeout(function() rnorm(12.70417,3.711475)) %>%
release("control",1) %>%
seize("frontdesk end",1) %>%
timeout(function()rnorm(11.27671,3.233562)) %>%
release("frontdesk end",1)
env <- lapply(1:100, function(i) {
simmer("Terminal") %>%
add_resource("frontdesk", 2) %>%
add_resource("gate-in", 2) %>%
add_resource("station", 1) %>%
add_resource("lashing", 15) %>%
add_resource("control", 1) %>%
add_resource("frontdesk end", 2) %>%
add_generator(name = "truck" ,
trajectory = truck,
distribution = function() rnorm(1,24.992,36.015)) %>%
run(660) %>%
wrap()
})
As the error indicates, timeout activities expect a single value, and you are providing 11 in this case. Because of this:
timeout(function() rnorm(11.27671,3.233562))
rnorm's first parameter is the number of samples (which ends up as 11 in this case). What are you trying to do here? If that's supposed to be mean=11.27, sd=3.23, then you need to add the number of samples, 1, as the first argument:
timeout(function() rnorm(1, 11.27671,3.233562))
so that you get a single sample per call, as required. And the same applies for all the other timeouts.
EDIT: Also, I do not recommend using a normal distribution for service times, because a normal distribution may return negative values (that are by default coerced to positive), and thus you may get unexpected results.
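A minimal base R sketch of both points, for reference. The helper service_time() is hypothetical and not part of simmer; clamping at zero is just one simple way to avoid negative durations and is an assumption on my side, not a recommendation for any particular distribution.

```r
set.seed(42)  # arbitrary seed so the sketch is reproducible

# The first argument of rnorm() is n, so this returns 11 draws, not one.
n_draws <- length(rnorm(11.27671, 3.233562))

# One draw per call, with mean = 11.27671 and sd = 3.233562.
one_draw <- rnorm(1, mean = 11.27671, sd = 3.233562)

# Hypothetical helper: a single normal draw, clamped at zero so it is never negative.
service_time <- function(mean, sd) max(0, rnorm(1, mean, sd))

# With a large sd relative to the mean, unclamped draws would often be negative.
draws <- replicate(1000, service_time(12.68418, 12.55247))
print(n_draws)
print(min(draws))
```

In the trajectory this would be used as timeout(function() service_time(12.68418, 12.55247)).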
I am trying to load the kapferer min dataset into r using the igraph function "read_graph"
The code is very simple, however it throws an error.
test_g <-read_graph("http://vlado.fmf.uni-lj.si/pub/networks/data/ucinet/kapmine.dat", format = "dl")
Error in read.graph.dl(file, ...) : At foreign.c:3050 : syntax
error, unexpected $end, expecting DL in line 1, Parse error
As can be seen by following the link, the file does begin with DL. The only clue I can find is a message from 2015, which basically says to file a bug report.
Can dl files not be loaded by igraph at the moment, or is there some trick to it?
As there doesn't seem to be a clear way to load dl files, I have made a loader that seems to work well for the dl graphs on the Pajek website. The function is a bit scrappy and has not been extensively tested, but it may be useful to those who want to use certain graphs that are not available in a more common format. If there is more up-to-date information on these datasets, then this code can be ignored.
# packages used by the loader
library(dplyr)
library(tidyr)
library(stringr)
library(purrr)
library(igraph)

load_dl_graph <- function(file_path, directed){
  # read the file into a one-column tibble (column `value`, one row per line)
  raw_mat <- tibble(value = readLines(file_path))
  # locate the section headers of the dl file
  row_labels_row <- grep("ROW LABELS:", raw_mat$value)
  column_labels_row <- grep("COLUMN LABELS:", raw_mat$value)
  level_labels_row <- grep("LEVEL LABELS:", raw_mat$value)
  data_table_row <- grep("DATA:", raw_mat$value)
  row_labels <- raw_mat %>%
    slice((row_labels_row + 1):(column_labels_row - 1)) %>%
    select(from = value)
  column_labels <- raw_mat %>%
    slice((column_labels_row + 1):(level_labels_row - 1)) %>% pull(value)
  table_levels <- raw_mat %>%
    slice((level_labels_row + 1):(data_table_row - 1)) %>% pull(value)
  # the data section holds one adjacency matrix per level, stacked vertically
  data_df <- raw_mat %>%
    slice((data_table_row + 1):nrow(.)) %>%
    select(value) %>%
    mutate(value = str_squish(value)) %>%
    separate(col = value, into = column_labels, sep = " ") %>%
    mutate(table_id = rep(1:length(table_levels), each = nrow(.)/length(table_levels)))
  # build one graph per level from the edges marked with 1
  tables_list <- 1:length(table_levels) %>%
    map(~{
      data_df %>%
        filter(table_id == .x) %>%
        select(-table_id) %>%
        bind_cols(row_labels, .) %>%
        pivot_longer(cols = 2:ncol(.), names_to = "to", values_to = "values") %>%
        filter(values == 1) %>%
        select(-values) %>%
        graph_from_data_frame(., directed = directed)
    })
  names(tables_list) <- table_levels
  return(tables_list)
}