How to deal with errors for loading a rdata file? - r

I need the data of the .rdata file for text mining. These are my dataset. I don't know exactly what's in that file. The problem is i can't load it.
I tried to open the file with different windows computers but with the same errors. I used RStudio in the updated version. I google the error-Information but nothing worked. Because I can open other rdata files there should be no registry problem. I wanted to check in an other basic windows Editor to look what is in the file but there were only signs like: ‹ ìùuPo³6
’#p ÜÝ‚»»»[pw—…»»»»»†àînÁuA°E`!‡ßûîýí}æÌLMÕÌùæŸÝõTñÈ}÷ÝrõÕ½
¢hXˆ
I tried different possibilities to open the file in RStudio with different error informations as followed:
with load()
require("readr")
setwd("C:/Users/..")
options(stringsAsFactors = F)
load("file")
# Error in load("file") :
# bad restore file magic number (file may be corrupted) -- no data loaded
# In addition: Warning message:
# file ‘.rdata’ has magic number ''
# Use of save versions prior to 2 is deprecated
with source()
require("readr")
setwd("C:/Users/..")
options(stringsAsFactors = F)
source("file")
# Error in source("file") :
# file.rdata:1:1: unexpected input
# 1:
# ^
readRDS
setwd("C:/Users/..")
options(stringsAsFactors = F)
readRDS("file")
# Error in readRDS("file") : unknown input format

Related

Finding and setting the working directory within a super computer

Obviously one can find and set working directories with getwd() and setwd(). This question is a bit more complicated. I'm running two files on a super computer (one is a regular R file which calls the other file (a .stan file). The way I submit work to the super computer is that I have to zip a folder containing the data, the .R file, and the .stan file. I upload this folder, and I pull this folder by setting it as one of the parameters in the super computer. I call the data using the standard read.csv() command and everything is hunky dory.
However, when I call the .stan file from the .R file, it can't access it because it needs to know the working directory.
This is the error I get:
> fit <- stan(file="ace_thresholds.stan", data=stanData, cores = 4)
Error in file(fname, "rt") : cannot open the connection
In addition: Warning messages:
1: In normalizePath(file) :
path[1]="ace_thresholds.stan": No such file or directory
2: In file(fname, "rt") :
cannot open file 'ace_thresholds.stan': No such file or directory
Error in get_model_strcode(file, model_code) :
cannot open model file "ace_thresholds.stan"
Calls: stan -> stan_model -> stanc -> get_model_strcode
Execution halted
When I tried setting the working directory to the unzipped NSG_stan folder (which is what I assumed the wd to be, I received this error:
fit <- stan(file="NSG_stan/ace_thresholds.stan", data=stanData, cores = 4)
Error in file(fname, "rt") : cannot open the connection
In addition: Warning messages:
1: In normalizePath(file) :
path[1]="NSG_stan/ace_thresholds.stan": No such file or directory
2: In file(fname, "rt") :
cannot open file 'NSG_stan/ace_thresholds.stan': No such file or directory
Error in get_model_strcode(file, model_code) :
cannot open model file "NSG_stan/ace_thresholds.stan"
Calls: stan -> stan_model -> stanc -> get_model_strcode
Execution halted
So I tried running print(getwd()) within the script and in the printout I see that the wd is
"/projects/ps-nsg/home/nsguser/ngbw/workspace/NGBW-JOB-RTOOL_TG-EBE9CDBF28BF42AF8CB6EC9355006B3E/NSG_stan"
which means that the working directory will shift with every job. So to accurately set the working directory, I'd need to set it to the current folder within the script. I looked for various posts on how to do this like the following
# install.packages("rstudioapi") # run this if it's your first time using it to install
library(rstudioapi) # load it
# the following line is for getting the path of your current open file
current_path <- getActiveDocumentContext()$path
# The next line set the working directory to the relevant one:
setwd(dirname(current_path ))
# you can make sure you are in the right directory
print( getwd() )
The issue with this is that it's a super computer, so it's difficult to install packages, because every time I want to install something, I have to email the folks associated with the super comp and that all takes time.
I've looked over this thread as well: R command for setting working directory to source file location in Rstudio. Appears to be a lot of dissent over what works. I tried a couple of them, and they didn't work.
setwd(getSrcDirectory()[1])
this.dir <- dirname(parent.frame(2)$ofile)
setwd(this.dir)
I've included the .R file below, in the event that it helps, but I think this is probably a pretty easy answer for someone with a decent amount of coding experience.
.R file (so it's the "ace_thresholds.stan" file that I either need to link to the current wd or to include the code that would set the wd, such that this "ace_thresholds.stan" call would work. Does that make sense?
Thanks much!
dat <- ace.threshold.t2.samp
dat <- subset(dat, !is.na(rw))
dat$condition <- factor(dat$condition)
dat$pid <- factor(dat$pid)
nTotal <- dim(dat)[1]
nCond <- length(unique(dat$condition))
nSubj <- length(unique(dat$pid))
intensity <- dat$rw
condition <- as.numeric(dat$condition)
pid <- as.numeric(dat$pid)
correct <- dat$correct_button == "correct"
chancePerformance <- 1/2
stanData <- list(nTotal=nTotal, nLevels=nCond, nSubj = nSubj, subject = pid, intensity=intensity, level=condition, correct=correct, chancePerformance=chancePerformance)
fit.rw <- stan(file="ace_thresholds.stan", data=stanData, cores = 4, control=list(max_treedepth=15, adapt_delta=0.90))

How to eliminate security concern while accessing to the file through program in R in Windows?

While accessing to CSV file from disk with the help of the R program, where a path to the CSV file is provided in the configuration file ( A path is like "testData/Amazon S3/Inventory/Accounts.csv" which is provided in the Configuration file and cfig[2]$save.location is variable who is having value of this path accessed from Configuration file). Few lines of code are below
path <- cfig[2]$save.location
test_data <- fread(path,stringsAsFactors = FALSE,drop=col_ignor,blank.lines.skip = TRUE)
but it gave the message below:
Taking input= as a system command ('testData/Amazon
S3/Inventory/Accounts.csv') and a variable has been used in the
expression passed to input=. Please use fread(cmd=...). There is a
security concern if you are creating an app and the app could have a
malicious user and the app is not running in a secure environment;
e.g. the app is running as root. Please read item 5 in the NEWS file
for v1.11.6 for more information and for the option to suppress this
message.
'testData' is not recognized as an internal or external
command, operable program or batch file. Warning messages:
1: In (if
(.Platform$OS.type == "unix") system else shell)(paste0("(", :
'(testData/Amazon S3/Inventory/Accounts.csv) >
C:\Users\sharmb5\AppData\Local\Temp\RtmpOa25kH\filea78b5351f1'
execution failed with error code 1.
2: In fread(cfig[2]$save.location,
stringsAsFactors = FALSE, drop = col_ignor, : File
'C:\Users\sharmb5\AppData\Local\Temp\RtmpOa25kH\filea78b5351f1' has
size 0. Returning a NULL data.table.
when the following line of code executes,
config[4]$save.location <- stri_replace_all(config[4]$save.location, cp_val, fixed = cp_key)
It gives an error like as, Error in [<-.data.table(*tmp*, j, value = list(TestCaseID = "C419760", : Supplied 14 columns to be assigned 15 items. Please see NEWS for v1.12.2.
The above error was a warning but after manually updating of packages. This warning turns into an error. What will be the reason behind this issue and how to solve it? Thanks for Advance!!!

how to import .rec files in R

I have a .rec file that I want to import into R. I have saved the .rec file to my working directory. This is what I have tried.
library(foreign)
library(RODBC)
data.test <- read.epiinfo("data_in.rec")
I get this error:
Error in if (headerlength <= 0L)
stop("file has zero or fewer variables: probably not an EpiInfo file") :
missing value where TRUE/FALSE needed
In addition: Warning messages:
1:
In readLines(file, 1L, ok = TRUE) :
line 1 appears to contain an embedded nul
2:
In strsplit(line, " ") : input string 1 is invalid in this locale
I have looked online and in the read.epiinfo help package in R. The help package says
Some later versions of Epi Info use the Microsoft Access file format
to store data. That may be readable with the RODBC package.
I have two questions.
1. Is the error I am getting because the .rec file I have is from an Epi Info version later than 6?
2. How do I use the RODBC library to open the .rec file?
The .rec (or .REC) file turned out to be a .EDF (European Data Format) file type. It was easily opened in R using the library edfReader. The edfReader library help file is very useful for opening the file and extracting the time series data. See code below for what I used. Code was adapted from the help file.
install.packages('edfReader')
library(edfReader)
?edfReader
lib.dir <- system.file("data_in.rec",package="edfReader")
Cfile <- paste(lib.dir,'/edfPlusC.edf',sep='')
CHdr <- readEdfHeader("data_in.rec")
CSignals <- readEdfSignals(CHdr)
summary(CSignals)

trouble unzipping file under Windows

I have the following code:
download.file(
"http://www.wikipathways.org//wpi/batchDownload.php?species=Homo%20sapiens&fileType=txt",
destfile="human.zip")
files <- unzip( "human.zip", list=T)
It works on Linux, but throws the following error on Windows:
Error in unzip("human.zip", list = T) :
error -103 with zipfile in unzGetCurrentFileInfo
Do you happen to know what's the problem?
In ?download.file, we read that:
If mode is not supplied and url ends in one of .gz, .bz2, .xz, .tgz,
.zip, .rda or .RData a binary transfer is done. Since Windows (unlike
Unix-alikes) does distinguish between text and binary files, care is
needed that other binary file types are transferred with mode = "wb".
Note that this list does not include .zip, although it is a binary file type. So you need to pass mode="wb".
I cannot reproduce your example, but it solved my identical problem. Here is an example:
url <- "https://www.bls.gov/cex/pumd/ce_pumd_interview_diary_dictionary.xlsx"
download.file(url, 'file1.xlsx')
download.file(url, 'file2.xlsx', mode="wb") # Try this instead
library(readxl)
read_xlsx('file1.xlsx', sheet='Variables') # Fails
# Error in sheets_fun(path) :
# Evaluation error: error -103 with zipfile in unzGetCurrentFileInfo
read_xlsx('file2.xlsx', sheet='Variables') # Works
# A tibble: 3,580 x 13

Cannot read data from an xlsx file in RStudio

I have installed the required packages - gdata and ggplot2 and I have installed perl.
library(gdata)
library(ggplot2)
# Read the data from the excel spreadsheet
df = data.frame(read.xls ("AssignmentData.xlsx", sheet = "Data", header = TRUE, perl = "C:\\Strawberry\\perl\\bin\\perl.exe"))
However when I run this I get the following error:
Error in xls2sep(xls, sheet, verbose = verbose, ..., method = method, :
Intermediate file 'C:\Users\CLAIRE~1\AppData\Local\Temp\RtmpE3UYWA\file8983d8e1efc.csv' missing!
In addition: Warning message:
running command '"C:\STRAWB~1\perl\bin\perl.exe" "C:/Users/Claire1992/Documents/R/win-library/3.1/gdata/perl/xls2csv.pl" "AssignmentData.xlsx" "C:\Users\CLAIRE~1\AppData\Local\Temp\RtmpE3UYWA\file8983d8e1efc.csv" "Data"' had status 2
Error in file.exists(tfn) : invalid 'file' argument
Thanks to #Stibu I realised I had to set my work directory. This is the command you use to run in Rstudio; setwd("C/Documents..."). The file path is where the excel file is located.
I had the issue but I solved it differently.
My problem was because my file was saved as Excel (extension .xls) but it was a txt file.
I corrected the file and I did not meet any other error with the R function.

Resources