Recommenderlab running into memory issues in R

I am trying to compare some recommender algorithms against each other but am running into memory issues. The dataset I am using is https://drive.google.com/open?id=0By5yrncwiz_VZUpiak5Hc2l3dkE
Following is my code:
library(recommenderlab)
library(Matrix)

Amazon <- read.csv("path/to/Reviews.csv", header = TRUE,
                   col.names = c("ID", "ProductId", "UserId", "HelpfulnessNumerator",
                                 "HelpfulnessDenominator", "Score", "Time", "Summary", "Text"),
                   colClasses = c("NULL", "character", "character", "NULL", "NULL",
                                  "integer", "NULL", "NULL", "NULL"))
Amazon <- Amazon[, c("UserId", "ProductId", "Score")]
Amazon <- Amazon[!duplicated(Amazon[1:2]), ]  ## keep one rating per user/product pair
r <- as(Amazon, "realRatingMatrix")           ## coerce to recommenderlab's sparse rating matrix

scheme <- evaluationScheme(r, method = "split", train = 0.7,
                           k = 1, given = 1, goodRating = 4)

algorithms <- list(
  "user-based CF" = list(name = "UBCF", param = list(normalize = "Z-score",
                                                     method = "Cosine",
                                                     nn = 50, minRating = 3)),
  "item-based CF" = list(name = "IBCF", param = list(normalize = "Z-score"))
)

results <- evaluate(scheme, algorithms, n = c(1, 3, 5))
I get the following errors:
UBCF run fold/sample [model time/prediction time]
1 Timing stopped at: 1.88 0 1.87
Error in asMethod(object) :
Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 105
IBCF run fold/sample [model time/prediction time]
1 Timing stopped at: 4.93 0.02 4.95
Error in asMethod(object) :
Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 105
Warning message:
In .local(x, method, ...) :
Recommender 'user-based CF' has failed and has been removed from the results!
Recommender 'item-based CF' has failed and has been removed from the results!
I tried to use the recommenderlabrats package, which I thought would solve this problem, but I could not install it: https://github.com/sanealytics/recommenderlabrats
It gave me some errors which I am not able to make sense of:
c:/rbuildtools/3.3/gcc-4.6.3/bin/../lib/gcc/i686-w64- mingw32/4.6.3/../../../../i686-w64-mingw32/bin/ld.exe: cannot find -llapack
collect2: ld returned 1 exit status
Then I came to this link for solving the recommenderlabrats problem, but it did not work for me:
Error while installing package from github in R. Error in dyn.load
Any help on how to get around the memory issue is appreciated.

I am the author of recommenderlabrats. Try installing it now; it should be fixed. Then use RSVD/ALS to solve the problem: your matrix is too big for your computer, even though it is sparse.
Also, it might be a good idea to experiment with a smaller sample before spending on an AWS memory instance.
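Putting both suggestions together, here is a minimal sketch, assuming recommenderlabrats registers "RSVD" and "ALS" as recommender method names with recommenderlab (the sample size of 5000 users is an arbitrary placeholder):

library(recommenderlab)
library(recommenderlabrats)  # assumed to register the "RSVD"/"ALS" methods

## Evaluate on a random subset of users first to keep memory in check
set.seed(42)
r_small <- r[sample(nrow(r), 5000), ]

scheme_small <- evaluationScheme(r_small, method = "split", train = 0.7,
                                 k = 1, given = 1, goodRating = 4)

algorithms <- list(
  "RSVD" = list(name = "RSVD", param = NULL),
  "ALS"  = list(name = "ALS",  param = NULL)
)

results <- evaluate(scheme_small, algorithms, n = c(1, 3, 5))

If the smaller sample runs cleanly, you can scale the sample up until you find the size your machine can handle.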

Related

How to solve Cholmod error 'problem too large' at file

I am running a program called STdeconvolve on my spatial single-cell data. I have ~100,000 columns (cells) and ~26,000 rows (genes), and I am getting "Cholmod error 'problem too large' at file". I am unable to debug it; how can I resolve this error?
> dim(pdach1.mat)
[1] 26273 100974
My code is as follows:
library(Seurat)
library(STdeconvolve)

pdac.int <- readRDS("pd_integ.rds")
## extract the counts matrix
pdac.mat <- Matrix(pdac.int@assays$Spatial@counts, sparse = TRUE)
## remove poor genes and pixels
pdac.mat <- cleanCounts(pdac.mat, min.lib.size = 100)
## filter for features in less than 100% of pixels but more than 5% of pixels
pdac.mat <- restrictCorpus(pdac.mat, removeAbove = 1.0, removeBelow = 0.05)

pdf("pdac.int.perplexity.pdf")
pdac.ldas <- fitLDA(pdac.mat, Ks = 6:20, plot = TRUE, verbose = TRUE)
dev.off()

pdac.optLDA <- optimalModel(models = pdac.ldas, opt = "min")
pdac.results <- getBetaTheta(pdac.optLDA, perc.filt = 0.05, betaScale = 1000)
## you can obtain the spatial coordinates of the pixels (if available) by doing something like:
Now the scary part, the error:
Error in asMethod(object) :
Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 102
Calls: restrictCorpus ... is.data.frame -> as.matrix -> as.matrix.Matrix -> as -> asMethod
Execution halted
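For context, a back-of-the-envelope calculation (assuming, as the as.matrix call in the traceback suggests, that the sparse matrix is being coerced to a dense one) shows why CHOLMOD gives up on a 26273 x 100974 matrix:

n.entries <- 26273 * 100974        # ~2.65e9 cells in the dense matrix
n.entries > .Machine$integer.max   # TRUE: more cells than a 32-bit index allows
n.entries * 8 / 1024^3             # ~19.8 GiB of RAM as dense doubles

Filtering genes and pixels down before any step that densifies the matrix is the usual way around this.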

Error in h(simpleError(msg, call)) with 'as.matrix': cannot open the connection

This is my first question here. For now I'm learning how to use R in RStudio, and when I tried to read the data in matrix form, the program showed an error. I tried this code:
ModelName = 'new_file'  # I'm writing the file name; the file is in the same directory as the .r file
FileName = paste(ModelName, '.txt', sep = '')  # as far as I understand, this tells the program that the file is in txt form

### Read time series
d = as.matrix(read.table(FileName, header = T))
And then the program writes this:
Error in h(simpleError(msg, call)) :
error in evaluating the argument 'x' in selecting a method for function 'as.matrix':
cannot open the connection
And I don't understand why it's not working.
The file for analysis is in txt form; an example of the data is below:
decy Temp CTD_S OxFix Pro Syn Piceu Naneu
2011.74221 27.60333 36.20700 27.26667 58638.33333 13107.00000 799.66667 117.66667
2011.74401 26.97950 36.13400 27.05000 71392.50000 13228.50000 1149.00000 116.50000
2011.74617 24.99750 35.34450 24.80000 264292.00000 27514.00000 2434.50000 132.50000
2011.74692 24.78400 35.25800 25.82500 208996.50000 39284.00000 3761.75000 220.75000
My RStudio version is 4.2.0.
I would be very grateful for an explanation.
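A minimal check for the usual cause of this message, under the assumption that the file is simply not in R's current working directory (which is not necessarily the directory of the .r script):

getwd()                      # the directory R is actually reading from
file.exists("new_file.txt")  # FALSE means R cannot see the file there
# If FALSE, either setwd() to the file's folder or pass a full path:
# d <- as.matrix(read.table("C:/full/path/to/new_file.txt", header = TRUE))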

R "mi" package: mi(...) command throws error

I am attempting to do multiple imputation with the "mi" package (v1.0). Due to computing/processing time constraints, I split my code into two files. The first does all of the mi-style preprocessing, and the second actually runs the imputation.
The first file runs without error, but I am including it here for completeness (below is an edited, shorter version of the file):
require(mi)
# Load data for multiple imputation
data = as.data.frame(read.delim("for_mi.csv"))
...
# Declare data as missing data frame for MI functions
mdf = missing_data.frame(data)
mdf <- change(mdf, y = "x", what = "type", to = "nonnegative-continuous")
... (many type corrections later) ...
mdf <- change(mdf, y = "y", what = "type", to = "positive-continuous")
# Save pre-processed missing-format data for analysis in r_mi_2.R.
saveRDS(mdf,"preprocessed.rds")
The second file is the one that throws the error:
require(mi)
# Load output from first file
mdf <- readRDS("preprocessed.rds")
# Note: at this point, mdf loads as a missing_data.frame.
# MI commands such as show(mdf) function as expected.
# Impute data
imputations <- mi(mdf, n.iter = 30, n.chains = 4, max.minutes = Inf, parallel = TRUE)
I get the following output:
Chain 1
Chain 1 Iteration 0
Chain 2
Chain 2 Iteration 0
Chain 1 Iteration 1
Chain 3
Chain 3 Iteration 0
Chain 2 Iteration 1
Chain 4
Chain 4 Iteration 0
Chain 3 Iteration 1
Chain 4 Iteration 1
Error in checkForRemoteErrors(val) :
4 nodes produced errors; first error: cannot open the connection
Calls: mi ... clusterApply -> staticClusterApply -> checkForRemoteErrors
Warning message:
In file(file, ifelse(append, "a", "w")) :
cannot open file '/var/tmp/Rtmp0TqkWn/mi1502972500/pars_1.csv': No such file or directory
Warning message:
In file(file, ifelse(append, "a", "w")) :
cannot open file '/var/tmp/Rtmp0TqkWn/mi1502972500/pars_2.csv': No such file or directory
Execution halted
Warning message:
In file(file, ifelse(append, "a", "w")) :
cannot open file '/var/tmp/Rtmp0TqkWn/mi1502972500/pars_3.csv': No such file or directory
Warning message:
In file(file, ifelse(append, "a", "w")) :
cannot open file '/var/tmp/Rtmp0TqkWn/mi1502972500/pars_4.csv': No such file or directory
Other background info:
I am running the code on a cluster, using 8 processors on a single node. I have also tried running it locally on my computer, with the same result.
I have tried varying the number of chains, lowering the number of iterations, and setting parallel = FALSE, all to no avail.
I have tried running the code with and without the options(mc.cores = 4) line that appears in the mi vignette (see here, page 4).
The mi vignette linked above states that many errors from running mi stem from running out of RAM. I'm not sure how to test this, but some info: the starting object mdf is 128 MB, the per-user cap on the cluster is 32 GB, and the error is thrown even if I set parallel = FALSE or n.iter = 1.
Any help would be greatly appreciated. Thank you!
Edit: The error output with parallel = FALSE is:
Chain 1
Chain 1 Iteration 0
Chain 1 Iteration 1
Error in file(file, ifelse(append, "a", "w")) :
cannot open the connection
Calls: mi -> mi -> .local -> .mi -> write.table -> file
In addition: Warning message:
In file(file, ifelse(append, "a", "w")) :
cannot open file '/var/tmp/Rtmp0TqkWn/mi1502972500/pars_1.csv': No such file or directory
Execution halted
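One thing worth checking, sketched below: the path in the warnings (/var/tmp/Rtmp0TqkWn/...) is a per-session temporary directory, so if it was created in the first (preprocessing) session it will not exist when the second file runs. This is an assumption, not a confirmed diagnosis:

## Compare the current session's temp dir against the one in the warnings
tempdir()                                       # this session's Rtmp... directory
dir.exists("/var/tmp/Rtmp0TqkWn/mi1502972500")  # path from the warning; likely FALSE
## If the stored path is stale, try rebuilding the missing_data.frame in the
## same session that calls mi(), instead of loading it from preprocessed.rds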

R2WinBUGS error in R

I'm trying to duplicate some code and am running into trouble with WinBUGS. The code was written in 2010, and I think that back then the package was installed with additional files which R is now looking for and can't find (hence the error), but I'm not sure.
R stops at the bugs.directory argument (see code) and the error is:
Error in file(con, "rb") : cannot open the connection
In addition: Warning message:
In file(con, "rb") :
cannot open file 'C:/Users/Hiwi/Documents/R/Win-library/3.0/R2winBUGS/System/Rsrc/Registry.odc': No such file or directory
Error in bugs.run(n.burnin, bugs.directory, WINE = WINE, useWINE = useWINE, :
WinBUGS executable does not exist in C:/Users/Hiwi/Documents/R/Win-library/3.0/R2winBUGS
I have the results of the analysis, so if there is another way of conducting a Bayesian analysis for the "rawdata" file (in the 14-day model with a [-3,0] event window), or if someone would please shed some light on what's wrong with the code, I would be forever grateful.
The code is:
rm(list=ls(all=TRUE))
setwd("C:/Users/Hiwi/Dropbox/Oracle/Oracle CD files/analysis/chapter6_a")
library(foreign)
rawdata <- read.dta("nyt.dta",convert.factors = F)
library(MASS)
summary(glm.nb(rawdata$num_events_14 ~ rawdata$nyt_num))
# WinBUGS code
library("R2WinBUGS")
nb.model <- function(){
  for (i in 1:n){  # loop over all observations
    # stochastic component
    dv[i] ~ dnegbin(p[i], r)
    # link and linear predictor
    p[i] <- r / (r + lambda[i])
    log(lambda[i]) <- b[1] + b[2] * iv[i]
  }
  # prior distributions
  r <- exp(logr)
  logr ~ dnorm(0.0, 0.01)
  b[1] ~ dnorm(0, 0.001)  # prior (please note: second element is 1/variance)
  b[2] ~ dnorm(0, 0.001)  # prior
}
write.model(nb.model, "negativebinomial.bug")
n <- dim(rawdata)[1] # number of observations
winbug.data <- list(dv = rawdata$num_events_14,
                    iv = rawdata$nyt_num,
                    n = n)
winbug.inits <- function(){
  list(logr = 0, b = c(2.46, -.37))
}  # starting values from the uniform distribution between -1 and 1
bug.erg <- bugs(data = winbug.data,
                inits = winbug.inits,
                #inits = NULL,
                parameters.to.save = c("b", "r"),
                model.file = "negativebinomial.bug",
                n.chains = 3, n.iter = 10000, n.burnin = 5000,
                n.thin = 1,
                codaPkg = TRUE,
                debug = FALSE,
                #bugs.directory = "C:/Users/Hiwi/Documents/R/Win-library/3.0/R2winBUGS/"
                bugs.directory = "C:/Users/Hiwi/Documents/R/Win-library/3.0/R2winBUGS")
tempdir()
setwd(tempdir())
file.rename("codaIndex.txt","simIndex.txt")
file.rename("coda1.txt","sim1.txt")
file.rename("coda2.txt","sim2.txt")
file.rename("coda3.txt","sim3.txt")
posterior <- rbind(read.coda("sim1.txt","simIndex.txt"),read.coda("sim2.txt","simIndex.txt"),read.coda("sim3.txt","simIndex.txt"))
post.df <- as.data.frame(posterior)
summary(post.df)
quantile(post.df[,2],probs=c(.025,.975))
quantile(post.df[,2],probs=c(.05,.95))
quantile(post.df[,2],probs=c(.10,.90))
tempdir()
Difficult to say for sure without sitting at your PC... Maybe it is something to do with R2WinBUGS looking in the wrong directory for WinBUGS.exe? You can point R2WinBUGS to the right place using the bugs.directory argument in the bugs() function.
If not, try installing OpenBUGS and giving R2OpenBUGS a go.
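To illustrate the suggestion, a sketch assuming WinBUGS 1.4 is installed in its default location; the point is that bugs.directory must name the folder containing WinBUGS14.exe, not the R2WinBUGS package folder:

bug.erg <- bugs(data = winbug.data,
                inits = winbug.inits,
                parameters.to.save = c("b", "r"),
                model.file = "negativebinomial.bug",
                n.chains = 3, n.iter = 10000, n.burnin = 5000,
                codaPkg = TRUE,
                bugs.directory = "C:/Program Files/WinBUGS14")  # folder holding WinBUGS14.exe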

Error while using transformation function in R

I was working with the baby names data set and encountered the error below while using the transform function. Any guidance/suggestions would be highly appreciated. I reinstalled the packages, but to no avail.
Mac OS X (Mountain Lion)
R version 3.1.2 (2014-10-31) -- "Pumpkin Helmet"
library(stringr)
require(stringr)
bnames1 <- transform(bnames1,
                     first = tolower(str_sub(name, 1, 1)),
                     last = tolower(str_sub(name, -1, -1)),  # last letter; (-1, 1) would return ""
                     vowels = vowels(name),  # vowels() is a helper defined earlier in the tutorial
                     length = nchar(name),
                     per1000 = 10000 * prop,
                     one_par = 1 / prop
)
Error in tolower(str_sub(name, 1, 1)) :
lazy-load database '/Library/Frameworks/R.framework/Versions/3.1/Resources/library/stringr/R/stringr.rdb' is corrupt
In addition: Warning messages:
1: In tolower(str_sub(name, 1, 1)) :
restarting interrupted promise evaluation
2: In tolower(str_sub(name, 1, 1)) : internal error -3 in R_decompress1
Internal error -3 is often a consequence of installing a package on top of one that is already loaded. Restart R and restart your application. There may be other issues, but until you do this you won't be getting much further.
Try
remove.packages("stringr")
install.packages("stringr")
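For completeness, the reinstall only helps from a clean session; a short sketch of the full sequence (the str_sub call is just a smoke test):

## In a fresh R session, before any package is loaded:
remove.packages("stringr")
install.packages("stringr")
library(stringr)
str_sub("Mary", 1, 1)  # "m"/"M" checks that the lazy-load database was rebuilt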
