How to import .rec files in R

I have a .rec file that I want to import into R. I have saved the .rec file to my working directory. This is what I have tried.
library(foreign)
library(RODBC)
data.test <- read.epiinfo("data_in.rec")
I get this error:
Error in if (headerlength <= 0L) stop("file has zero or fewer variables: probably not an EpiInfo file") :
  missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In readLines(file, 1L, ok = TRUE) :
  line 1 appears to contain an embedded nul
2: In strsplit(line, " ") : input string 1 is invalid in this locale
I have looked online and in the read.epiinfo help page in R. The help page says:
Some later versions of Epi Info use the Microsoft Access file format
to store data. That may be readable with the RODBC package.
I have two questions.
1. Is the error I am getting because the .rec file I have is from an Epi Info version later than 6?
2. How do I use the RODBC library to open the .rec file?

The .rec (or .REC) file turned out to be an EDF (European Data Format) file. It was easily opened in R using the edfReader package. The package's help file is very useful for opening the file and extracting the time series data. The code I used, adapted from the help file, is below.
install.packages('edfReader')
library(edfReader)
?edfReader
# The help-file example builds a path to a sample file bundled with the package;
# that step is not needed here because data_in.rec is in the working directory.
CHdr <- readEdfHeader("data_in.rec")   # read the file header
CSignals <- readEdfSignals(CHdr)       # read the signals described by the header
summary(CSignals)
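Before reaching for a reader package, it can help to check what a mislabelled file actually is. The sketch below is only an illustration, not part of edfReader: `looks_like_edf` is a hypothetical helper that tests whether a file begins with the 8-byte EDF version field (the character "0" followed by seven spaces). The demonstration file is synthetic, so the check can be run without a real recording.

```r
# Hypothetical helper (not part of edfReader): TRUE if the file starts with
# the EDF version field, i.e. "0" followed by seven spaces.
looks_like_edf <- function(path) {
  first8 <- readBin(path, what = "raw", n = 8L)
  edf_version <- c(charToRaw("0"), rep(as.raw(0x20), 7))
  identical(first8, edf_version)
}

# Synthetic stand-in for a recording: the version field plus space padding
# up to 256 bytes. A real EDF header carries more fields than this.
demo_file <- tempfile(fileext = ".rec")
writeBin(c(charToRaw("0"), rep(as.raw(0x20), 255)), demo_file)

is_edf <- looks_like_edf(demo_file)
print(is_edf)
```

If the check fails on a real .rec file, the file may be in some other format (for example a genuine Epi Info file), and readEdfHeader() is unlikely to be the right tool.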

Related

unable to open .dat files on R even with haven installed

I use SGA Tools to process my images, and it returns its results in .dat files. To work with this data in R, I tried to import the .dat file using the haven package. I installed haven and loaded it, but I still cannot import the data and I get this error message:
Error: Failed to parse C:/Users/QuRana/Desktop/SGA Tools/Plate_Image_Example (1).dat: This version of the file format is not supported.
When I run install.packages("haven"), haven is installed, but when I then load it with library(haven), nothing appears on my console except this:
> library(haven)
Then when I use this code:
datatrial1 <- read_dta("C:/Users/QuRana/Desktop/SGA Tools/Plate_Image_Example (1).dat")
It gives me the error mentioned above. When I convert my .dat file to a .csv file and load it, the imported data has an extra "\t" before the value in every column except the first, like this:
Flags: S - Colony spill or edge interference C - Low colony circularity
# row\tcol\tsize\tcircularity\tflags
1\t1\t4355\t0.9053\t
1\t2\t4456\t0.8401\t
1\t3\t3439\t0.8219\t
1\t4\t3215\t0.8707\t
All the t's before the numeric values are not what I want. Another issue that I am facing is I cannot install the gitter package on my R version which is R 4.2.2.
You can read your tab-separated file like so: `read.delim("file_path", header = TRUE, sep = "\t")`
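As a minimal sketch of that suggestion, the code below recreates the excerpt from the question in a temporary file and reads it with read.delim(). It assumes the file really has the two leading lines shown in the question (the "Flags:" legend and the "# row ..." header), so it skips both and supplies the column names by hand. The `skip` value and the column types are assumptions to adjust against the real file.

```r
# Recreate the excerpt from the question; "\t" inside an R string is a tab.
demo_lines <- c(
  "Flags: S - Colony spill or edge interference C - Low colony circularity",
  "# row\tcol\tsize\tcircularity\tflags",
  "1\t1\t4355\t0.9053\t",
  "1\t2\t4456\t0.8401\t",
  "1\t3\t3439\t0.8219\t",
  "1\t4\t3215\t0.8707\t"
)
demo_file <- tempfile(fileext = ".dat")
writeLines(demo_lines, demo_file)

# Skip the legend and the commented header, then name the columns explicitly.
plate <- read.delim(
  demo_file,
  header = FALSE,
  skip = 2,
  col.names = c("row", "col", "size", "circularity", "flags"),
  colClasses = c("integer", "integer", "integer", "numeric", "character")
)
str(plate)
```

Reading the file this way treats each tab as a column separator, so no "\t" text ends up in the values.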

Error in df_parse_dta_file(): Failed to parse C:/Users/folder/data.dta: This version of the file format is not supported

I wanted to read in a .dta file in R in order to convert it to a .csv file. First, I tried to do so by using the foreign package, but it reported:
Error in read.dta(file): not a Stata version 5-12 .dta file
So I tried to do it by using the haven package, but that also failed and reported:
Error in df_parse_dta_file(spec, encoding, cols_skip, n_max, skip, name_repair = .name_repair) : Failed to parse C:/Users/folder/data.dta: This version of the file format is not supported
I also tried to convert it with the rio package:
install.packages("rio")
library(rio)
install_formats()
convert("file.dta","file.csv")
but it reported:
Error in arg_reconcile(haven::read_dta, file = file, ..., .docall = TRUE, :
Failed to parse C:/Users/folder/data.dta: This version of the file format is not supported.
This error was generated by: haven::read_dta
With the following arguments:
"._costs.dta"
Does anyone know how to import such .dta files in R so that they can be converted to .csv files?
PS: The preamble of the .dta-file looks like this:
<stata_dta>118LSFM 23 Apr 2019 16:22
Try adding encoding = "UTF-8" or encoding = "Latin1" inside the read_dta() call so that the text in the file is read with an explicit encoding. It might take a little while to clean the data afterwards, though.
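When a reader reports an unsupported format version, one way to see what the file itself declares is to look at its first bytes. The sketch below is a hedged illustration using base R only. `dta_release` is a hypothetical helper, and it assumes the header is laid out as `<stata_dta><header><release>NNN</release>...`, which is how the preamble quoted in the question would look before its inner tags were stripped for display. The demonstration file is synthetic.

```r
# Hypothetical helper: return the release number declared at the start of a
# .dta file, or NA if the assumed <release>...</release> marker is absent.
dta_release <- function(path) {
  bytes <- as.integer(readBin(path, what = "raw", n = 64L))
  bytes[bytes < 32L | bytes > 126L] <- 32L  # blank out non-printable bytes
  header <- rawToChar(as.raw(bytes))
  m <- regmatches(header, regexec("<release>([0-9]+)</release>", header))[[1]]
  if (length(m) == 2L) as.integer(m[2]) else NA_integer_
}

# Synthetic stand-in for the start of a release-118 file (assumed layout).
demo_file <- tempfile(fileext = ".dta")
writeBin(
  charToRaw("<stata_dta><header><release>118</release><byteorder>LSF</byteorder>"),
  demo_file
)

release <- dta_release(demo_file)
print(release)
```

If the helper returns NA for the file named in the error message, that file probably does not start with the header assumed here, which would be worth ruling out before experimenting with encodings.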

How to deal with errors for loading a rdata file?

I need the data in an .rdata file for text mining. It is my dataset, but I don't know exactly what the file contains. The problem is that I can't load it.
I tried opening the file on several Windows computers and got the same errors each time. I am using an up-to-date version of RStudio. I googled the error messages, but nothing worked. Since I can open other .rdata files, it should not be a registry problem. I also opened the file in a basic Windows text editor to see what is in it, but it showed only characters like: ‹ ìùuPo³6
’#p ÜÝ‚»»»[pw—…»»»»»†àînÁuA°E`!‡ßûîýí}æÌLMÕÌùæŸÝõTñÈ}÷ÝrõÕ½
¢hXˆ
I tried several ways of opening the file in RStudio, each of which gave a different error, as follows:
with load()
require("readr")
setwd("C:/Users/..")
options(stringsAsFactors = F)
load("file")
# Error in load("file") :
# bad restore file magic number (file may be corrupted) -- no data loaded
# In addition: Warning message:
# file ‘.rdata’ has magic number ''
# Use of save versions prior to 2 is deprecated
with source()
require("readr")
setwd("C:/Users/..")
options(stringsAsFactors = F)
source("file")
# Error in source("file") :
# file.rdata:1:1: unexpected input
# 1:
# ^
with readRDS()
setwd("C:/Users/..")
options(stringsAsFactors = F)
readRDS("file")
# Error in readRDS("file") : unknown input format
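One way to narrow this down is to look at the file's first bytes rather than guessing at reader functions. The sketch below is only a diagnostic illustration: `peek_rdata` is a hypothetical helper that reports whether a file starts with the gzip signature (bytes 1f 8b) and whether its decompressed content begins with a five-byte header of the kind that save() writes, such as "RDX3" followed by a newline. The "‹" at the start of the editor output in the question would be consistent with a gzip-compressed file, but that is an inference, not something the question confirms.

```r
# Hypothetical diagnostic helper, base R only.
peek_rdata <- function(path) {
  first2 <- readBin(path, what = "raw", n = 2L)
  con <- gzfile(path, open = "rb")  # gzfile() also reads uncompressed files
  on.exit(close(con))
  magic <- readBin(con, what = "raw", n = 5L)
  list(
    gzip = identical(first2, as.raw(c(0x1f, 0x8b))),
    # save() headers look like "RDX2\n" or "RDX3\n": "RD", two more bytes, newline
    r_save = length(magic) == 5L &&
      identical(magic[1:2], charToRaw("RD")) &&
      magic[5] == as.raw(0x0a)
  )
}

# A genuine workspace file written by save() ...
x <- 1:3
save_file <- tempfile(fileext = ".RData")
save(x, file = save_file)
res_save <- peek_rdata(save_file)

# ... and a gzip file that merely carries an .rdata extension.
other_file <- tempfile(fileext = ".rdata")
con <- gzfile(other_file, open = "wb")
writeBin(charToRaw("not an R workspace"), con)
close(con)
res_other <- peek_rdata(other_file)

str(res_save)
str(res_other)
```

A file for which `gzip` is TRUE but `r_save` is FALSE is compressed data of some other kind, which would match the "bad restore file magic number" symptom from load().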

Cannot read data from an xlsx file in RStudio

I have installed the required packages - gdata and ggplot2 and I have installed perl.
library(gdata)
library(ggplot2)
# Read the data from the excel spreadsheet
df = data.frame(read.xls ("AssignmentData.xlsx", sheet = "Data", header = TRUE, perl = "C:\\Strawberry\\perl\\bin\\perl.exe"))
However when I run this I get the following error:
Error in xls2sep(xls, sheet, verbose = verbose, ..., method = method, :
Intermediate file 'C:\Users\CLAIRE~1\AppData\Local\Temp\RtmpE3UYWA\file8983d8e1efc.csv' missing!
In addition: Warning message:
running command '"C:\STRAWB~1\perl\bin\perl.exe" "C:/Users/Claire1992/Documents/R/win-library/3.1/gdata/perl/xls2csv.pl" "AssignmentData.xlsx" "C:\Users\CLAIRE~1\AppData\Local\Temp\RtmpE3UYWA\file8983d8e1efc.csv" "Data"' had status 2
Error in file.exists(tfn) : invalid 'file' argument
Thanks to @Stibu I realised I had to set my working directory. The command to run in RStudio is setwd("C/Documents..."), where the path is the folder that contains the Excel file.
I had the same issue but solved it differently.
My problem was that the file had been saved with an Excel extension (.xls) but was actually a text file.
Once I corrected the file, the R function ran without any further errors.
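Both fixes above can be checked before calling read.xls(). The sketch below is a rough illustration in base R: `check_xlsx` is a hypothetical helper that first confirms the file is visible from the current working directory and then tests for the "PK" signature that every zip container, and therefore every .xlsx file, starts with. It does not cover legacy .xls files, which use a different container format.

```r
# Hypothetical helper: classify a path as "missing", "zip" or "other".
check_xlsx <- function(path) {
  if (!file.exists(path)) {
    message("Not found; working directory is ", getwd())
    return("missing")
  }
  sig <- readBin(path, what = "raw", n = 2L)
  if (identical(sig, charToRaw("PK"))) "zip" else "other"
}

# Synthetic demonstrations: a text file with an .xlsx extension, a file that
# only carries the zip signature bytes, and a path that does not exist.
text_file <- tempfile(fileext = ".xlsx")
writeLines("col1\tcol2", text_file)

zip_like <- tempfile(fileext = ".xlsx")
writeBin(as.raw(c(0x50, 0x4b, 0x03, 0x04)), zip_like)

status_text <- check_xlsx(text_file)
status_zip <- check_xlsx(zip_like)
status_missing <- check_xlsx(tempfile(fileext = ".xlsx"))
print(c(status_text, status_zip, status_missing))
```

A result of "missing" points to the working-directory problem from the first answer, and "other" points to a mislabelled file as in the second.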

R read.spss error importing SPSS .por file - "Bad character in time"

I'm trying to import the NYPD stop-and-frisk data into R. The data is in SPSS .por files at http://www.nyc.gov/html/nypd/downloads/zip/analysis_and_planning/YYYY.zip
where YYYY is a year from 2003 to 2012
Most of the files load fine, but the 2004, 2007, and 2008 files all give me this error:
> library(foreign)
> mydata= read.spss("2004.por", to.data.frame=TRUE)
Error in read.spss("2004.por", to.data.frame = TRUE) :
error reading portable-file dictionary
In addition: Warning message:
In read.spss("2004.por", to.data.frame = TRUE) : Bad character in time
Execution halted
Any suggestions on how to debug this? I realize that read.spss does not support the latest SPSS versions, but given that most of the files (7 out of 10) import properly I wonder whether it's something more subtle.
psppire loads all the files without complaint, but the data looks corrupted, with some fields seemingly combined with others, and binary data in some of the fields.
I had some success using memisc as recommended in Read SPSS file into R. Namely, after installing memisc:
> install.packages('memisc')
You can read the data rather easily:
> library(memisc)
> data <- as.data.set(spss.portable.file('2004.por'))
While I haven't thoroughly inspected the data, it appears at first glance to be right.

Resources