R - how to capture an error from RDCOMClient

I have some code which compacts and repairs a number of MS Access databases:
library(RDCOMClient)
library(stringr)

accfolders <- list.dirs('C:\\users\\username\\accessdb\\', recursive = FALSE, full.names = F)[-1] # need -1 to exclude current dir
accfolders <- paste0("C:\\users\\username\\accessdb\\", accfolders)

# launch Access
oApp <- COMCreate("Access.Application")

for (folder in accfolders) {
  accfiles <- list.files(path = folder, pattern = "\\.mdb", full.names = TRUE)
  print(paste("working in dir", folder))

  for (file in accfiles) {
    print(paste("working in db", file))
    bkfile <- sub(".mdb", "_bk.mdb", file)
    oApp$CompactRepair(file, bkfile, FALSE)
    file.copy(bkfile, file, overwrite = TRUE)
    file.remove(bkfile)
  }
  # print(paste("completed", folder))
}

oApp$quit()
gc()
However, the code sometimes returns the following error:
<checkErrorInfo> 80020009
Error: Exception occurred.
The error happens somewhat randomly, on the oApp$CompactRepair call in the second (inner) for loop.
I can't figure out why this happens, and it hits random .mdb files rather than a specific one. Sometimes I run the code and there is no issue at all; other times it produces the error.
Seeing as I can't figure it out, I'm wondering if I could capture this error somehow and just skip that element in the for loop, so the whole run doesn't break down.
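One way to do that (a minimal sketch, not tested against Access, reusing the oApp, accfiles, and bkfile logic from the code above) is to wrap the CompactRepair call in tryCatch, log the failure, and carry on with the next file:

# Sketch: wrap the COM call so a failing database is logged and skipped
failed <- character(0)

for (file in accfiles) {
  bkfile <- sub(".mdb", "_bk.mdb", file)
  ok <- tryCatch({
    oApp$CompactRepair(file, bkfile, FALSE)
    TRUE
  }, error = function(e) {
    message("CompactRepair failed for ", file, ": ", conditionMessage(e))
    FALSE
  })
  if (ok) {
    file.copy(bkfile, file, overwrite = TRUE)
    file.remove(bkfile)
  } else {
    failed <- c(failed, file)  # record the skipped database for later inspection
  }
}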

Related

NetCDF: HDF error only inside a loop in R

I have a script that loops through a selection of NetCDF files. The files are opened, the data extracted, then the files are closed again. I have used this many times before with no issue. I was recently sent a new selection of files to run through the same code. I can check the files individually with the ncdf4 package and the nc_open() function; the files look fine and are not corrupt. However, when I run through the loop the function will not open the files and I get this error:
Error in R_nc4_open: NetCDF: HDF error
When I step through the loop to check, all is fine and the file opens. It just cannot open in the loop; there is no issue with the code itself.
Has anyone come across this before, with non-corrupt NetCDF files throwing this error only occasionally? Even outside the loop I can run the code and get the error the first time, then run it again without changing anything and the connection works.
I'm not sure how to troubleshoot this one, so I'm just looking for advice as to why it might be happening.
Code snippet:
library(ncdf4)  # for nc_open()/nc_close(); folderdir and exportdir are defined earlier in the script

targetYear <- '2005-2019'
variables  <- c('CHL','SSH')
ncNam      <- list.files(folderdir, '.nc', recursive = TRUE)

for(v in 1:length(variables)) {
  varNam <- unlist(unique(variables))[v]

  # Get names corresponding to variable and target years
  varLs <- ncNam[grep(varNam, basename(ncNam))]
  varLs <- varLs[grep(targetYear, varLs)]
  varLs <- varLs[1]

  export <- paste0(exportdir, varNam, '/')
  dir.create(export, recursive = TRUE)

  if(varNam == 'Proximity1km' | varNam == 'Proximity200m' |
     varNam == 'ProximityCoast' | varNam == 'Bathymetry') {
    fileNam    <- varLs
    ncfilename <- paste0(folderdir, fileNam)
    print(ncfilename)
    # Read ncfile
    ncfile <- nc_open(ncfilename)
    nc_close(ncfile)
    gc()
  } else {
    fileNam    <- varLs
    ncfilename <- paste0(folderdir, fileNam)
    print(ncfilename)
    # Read ncfile
    ncfile <- nc_open(ncfilename)
    nc_close(ncfile)
    gc()
  }
}
I figured out the issue: it was to do with the error-detection filter in the .nc files.
I removed the filter and the files work fine inside the loop. Still a bit strange.
Perhaps the ncdf4 package is not up to date with this filtering.
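Since the failure was intermittent (a second attempt on the same file often succeeded), another workaround is a small retry wrapper around nc_open. This is a sketch only, assuming the ncdf4 package; it papers over the symptom rather than addressing the filter issue:

library(ncdf4)

# Sketch: retry nc_open() a few times before giving up
nc_open_retry <- function(path, tries = 3, wait = 1) {
  for (attempt in seq_len(tries)) {
    nc <- tryCatch(nc_open(path), error = function(e) NULL)
    if (!is.null(nc)) return(nc)
    Sys.sleep(wait)  # brief pause before the next attempt
  }
  stop("Could not open ", path, " after ", tries, " attempts")
}

# Drop-in replacement inside the loop:
# ncfile <- nc_open_retry(ncfilename)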

Error in file(file, "rt"): cannot open the connection

I made a for loop that loops through a folder. The folder is called Ultrasonic data - Plots and it contains subfolders. The names of the subfolders are consistent, and so is the data inside them. When I run the code it works well for the majority of the loop, but for some subfolders it gives an error.
The error is: file(file, "rt"): cannot open the connection.
The answers I have read usually put the problem down to a wrongly set working directory, but I don't think that is the case here. Could anything else cause this error message?
I have already checked the consistency of the names of the subfolders and the .txt data files within them.
parent.folder <- "//home.org.aalto.fi/meijsl1/data/Documents/GAGS/Ultrasonic data/Ultrasonic data - Plots"
sub.folders   <- list.dirs(parent.folder, recursive = FALSE)
filt.folders  <- sub.folders[grepl("SV-30-[^_]*_S[12]", sub.folders)]

for(i in filt.folders) {
  setwd(i)
  AIC("SV-30", 20, 40) # This is a function that picks the S-wave onset of an ultrasonic signal
} # End for loop over all specimens
Part of the AIC function where the error occurs (the read.table call):

for (n in 1:length(filelist)){
  # Read the file into R
  file.path('./out/Processed', basename(filelist[n])) -> procpath
  read.table(file = procpath, sep = "\t", stringsAsFactors = FALSE, check.names = FALSE) -> temp
  assign(paste(substr(basename(filelist[n]), 1, nchar(basename(filelist[n])) - 4)), temp)
}
The code should be running smoothly, as it does for most of the subfolders, but apparently something is wrong. I am out of ideas as to what it could be; consistency of the names was the only thing I could see causing this kind of trouble. I hope someone can help.
Cheers
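One way to narrow this down (a sketch only, built on the same procpath construction used inside the AIC function; filelist is assumed to be whatever that function already loops over) is to check whether the expected processed file actually exists before calling read.table, and to log the offending paths instead of stopping:

# Sketch: record subfolders whose expected files are missing instead of erroring out
missing <- character(0)

for (n in seq_along(filelist)) {
  procpath <- file.path('./out/Processed', basename(filelist[n]))
  if (!file.exists(procpath)) {
    missing <- c(missing, normalizePath(procpath, mustWork = FALSE))  # remember the full path
    next
  }
  temp <- read.table(file = procpath, sep = "\t",
                     stringsAsFactors = FALSE, check.names = FALSE)
  # ... carry on as in the AIC function
}

print(missing)  # shows exactly which subfolders lack the expected ./out/Processed files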

R download.file freeze

I'm trying to download some images from a website. I have a series of image URLs that I have to download, so I run them through this code:
dlphoto <- function(x){
  print(x)
  setTimeLimit(5)
  Sys.sleep(0.3)
  download.file(x, destfile = basename(x))
}
This function has, however, one major problem:
When I run my vector of 15000 URLs through it, it freezes the entire R session and stops reacting to anything. If I run the URLs separately it works fine, and running, say, URLs 1:50 works too. But when I put 1:100, for example, it freezes as well... So can you please help me figure this out?
At first I was calling it with this line:
dlphoto(allimage[,2])
then I changed to this:
dlphoto(allimage[c(1:50),2])
dlphoto(allimage[c(51:100),2])
dlphoto(allimage[c(101:150),2])
dlphoto(allimage[c(151:200),2])
and so on until 15000. But it still freezes a lot, and each time it dies I have to close R, work out how far the process got, and start again from there. I also regularly get this message:
Error in download.file(x, destfile = basename(x)) :
reached CPU time limit
Also, can you help me make the downloaded photos save to
/Users/name/Desktop/M2/Mémoire M2/Scrapingtest/photos
Thanks a lot!
There are a couple of improvements possible. I have assumed that the OP is using download.file from base R, which handles only a single file per call unless method is set to "libcurl" and quiet = TRUE.
Hence the fix is to use method = "libcurl" and quiet = TRUE in the download.file call. The changed function:
dlphoto <- function(x){
  print(x)
  download.file(x, destfile = basename(x), method = "libcurl", quiet = TRUE)
}
OR
download.file(x, destfile = basename(x), method = "libcurl", quiet = TRUE)
Note: in both of the above cases, the progress bar will not be displayed.
I think the value of timeout from options() is good enough to ensure that download.file returns in case of delays.
The return value of download.file should be checked: any non-zero value indicates failure.
If you want to see progress bars (which is probably not needed for 15000 files in one go), the function should be modified to handle one file at a time. The modified function will be:
# This function will display a progress bar for each file
dlphoto <- function(x){
  for(file in x){
    print(file)
    download.file(file, destfile = basename(file))
  }
}
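To save the photos into the folder from the question and to act on failed downloads, a sketch along these lines could be used (it assumes the target directory already exists; create it once with dir.create() otherwise). It builds the destination path with file.path() and checks the return value of download.file:

# Sketch: download into a chosen folder and record failures instead of stopping
photodir <- "/Users/name/Desktop/M2/Mémoire M2/Scrapingtest/photos"

dlphoto <- function(x, destdir = photodir) {
  failed <- character(0)
  for (url in x) {
    dest   <- file.path(destdir, basename(url))
    status <- tryCatch(
      download.file(url, destfile = dest, method = "libcurl", quiet = TRUE),
      error = function(e) 1L  # treat an error like a non-zero return code
    )
    if (status != 0) failed <- c(failed, url)
  }
  failed  # returns the URLs that did not download
}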

How to skip missing files when downloading multiple files from the web?

I have a question about downloading files. I know how to download a file with the download.file function. I need to download multiple files from a particular site, each file corresponding to a different date, and I have a series of dates from which I can build the URLs. I know for a fact that for some dates the files are missing on the website, so my code stops at that point; I then have to manually reset the date index (increment it by 1) and re-run the code. Since I have to download more than 1500 files, I was wondering if I could somehow capture the 'absence of the file' so that, instead of stopping, the code continues with the next date in the array.
Below is the dput of a part of the date array:
dput(head(fnames,10))
c("20060102.trd", "20060103.trd", "20060104.trd", "20060105.trd",
"20060106.trd", "20060109.trd", "20060110.trd", "20060112.trd",
"20060113.trd", "20060116.trd")
This file has 1723 dates. Below is the code that I am using:
for (i in 1:length(fnames)){
  file <- paste(substr(fnames[i],7,8), substr(fnames[i],5,6), substr(fnames[i],1,4), sep = "")
  URL  <- paste("http://xxxxx_", file, ".zip", sep = "")
  download.file(URL, paste(file, "zip", sep = "."))
  unzip(paste(file, "zip", sep = "."))
}
The program works fine until it encounters a date for which the file is missing, and then it stops. Is there a way to capture this, print the missing file name (the variable file), and move on to the next date in the array?
Please help.
I apologize that I have not shared the exact URL. If that makes it difficult to simulate the issue, please let me know.
Edit: trying to incorporate @Paul's suggestion.
I worked on a smaller dataset.
dput(testnames) is
c("20120214.trd", "20120215.trd", "20120216.trd", "20120217.trd",
"20120221.trd")
I know that the file corresponding to the date '20120216' is missing from the website. I altered my code to incorporate the tryCatch function. Here it is:
tryCatch({
  for (i in 1:length(testnames)){
    file <- paste(substr(testnames[i],7,8), substr(testnames[i],5,6), substr(testnames[i],1,4), sep = "")
    URL  <- paste("http://xxxx_", file, ".zip", sep = "")
    download.file(URL, paste(file, "zip", sep = "."))
    unzip(paste(file, "zip", sep = "."))
  }
},
error = function(e) {
  cat(file, '\n')
  i = i + 1
},
warning = function(w) {
  message('cannot unzip')
  i = i + 1
})
It runs fine for the first two dates and, as expected, throws an error for the third one. I am facing two issues:
When I exclude the warning block, it gives me the missing file name file as coded in the error block. But when I include the warning block, it only issues the warning and somehow doesn't execute the error block. Why is that?
In either case, the code stops after "20120216.trd" instead of proceeding to the next file, which is what I want it to do. Is incrementing the variable i not sufficient for that purpose?
Please advise.
You can do this using tryCatch. This function will try the operation you feed it and provide you with a way of dealing with errors. In your case an error could simply lead to skipping the file and ignoring the error. For example:
skip_with_message = simpleError('Did not work out')
tryCatch(print(bla), error = function(e) skip_with_message)
# <simpleError: Did not work out>
Notice that the error here is that the bla object does not exist.
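To address both of the OP's issues, the tryCatch needs to sit inside the loop, around the single download, so that one missing date cannot abort the remaining iterations; there is no need to increment i by hand, the for loop advances on its own. A minimal sketch reusing the testnames vector and the redacted URL pattern from the question:

# Sketch: tryCatch around each download, so the loop itself always advances
missing <- character(0)

for (i in seq_along(testnames)) {
  file <- paste(substr(testnames[i],7,8), substr(testnames[i],5,6), substr(testnames[i],1,4), sep = "")
  URL  <- paste("http://xxxx_", file, ".zip", sep = "")
  tryCatch({
    download.file(URL, paste(file, "zip", sep = "."))
    unzip(paste(file, "zip", sep = "."))
  }, error = function(e) {
    message("Skipping missing file: ", file)
    missing <<- c(missing, file)  # <<- so the vector outside the handler is updated
  })
}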

Fast assessment of corrupted Affymetrix CEL files

I'm trying to normalize a large number of Affymetrix CEL files using R. However, some of them appear to be truncated, so when reading them I get the error
Cel file xxx does not seem to have the correct dimensions
and the normalization stops. Manually removing the corrupted files and restarting every time would take very long. Do you know of a fast way (in R or with a tool) to detect corrupted files?
PS I'm 99.99% sure I'm normalizing CELs from the same platform together; it really is just truncated files :-)
One simple suggestion:
Can you just use a tryCatch block around your read.table (or whichever read command you're using)? Then simply skip a file if you get that error message. You can also compile a list of corrupted files within the catch block (I recommend doing that so you are tracking corrupted files for future reference when running a big batch process like this). Here's the pseudo-code:
corrupted.files <- data.frame()
for(i in 1:nrow(files)) {
  x <- tryCatch(read.table(file = files[i]),
                # note <<- so corrupted.files outside the handler is updated
                error = function(e) if(e == "something") { corrupted.files <<- rbind(corrupted.files, files[i]) }
                                    else { stop(e) },
                finally = print(paste("finished with", files[i], "at", Sys.time())))
  if(nrow(x)) {
    # do something with the uncorrupted data
  }
}
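A variant of the same idea is to pre-screen the files once and only pass the readable ones to normalization. This is a sketch only: cel_dir is a stand-in for your CEL directory, and readCEL() is a placeholder for whatever function you actually use to read a single CEL file:

# Sketch: pre-screen CEL files, recording the ones that fail to read
cel.files <- list.files("cel_dir", pattern = "\\.CEL$", full.names = TRUE)

readable <- vapply(cel.files, function(f) {
  tryCatch({ readCEL(f); TRUE },   # readCEL is a placeholder single-file reader
           error = function(e) FALSE)
}, logical(1))

corrupted <- cel.files[!readable]  # inspect or delete these
good      <- cel.files[readable]   # normalize only these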
