read.dta not working on Mac OS X

A project that normally works on my Windows 7 office machine now fails on my Mac OS X laptop when I run it in RStudio. The part that fails is:
library(foreign)
basis <- read.dta("myfile.dta")
Error in factor(rval[[v]], levels = tt[[ll[v]]], labels = names(tt[[ll[v]]])) :
invalid 'labels'; length 4 should be 1 or 3
R and RStudio are both on the newest version, and I have already run update.packages(). As I'm a beginner in R, I have no idea what to try next.
Could this be related to OS X encoding? The Stata file contains German umlauts (that is, non-ASCII characters).

Use the memisc package instead, which is supposed to be more flexible. From its documentation:
The importer mechanism is more flexible and extensible than read.spss
and read.dta of package "foreign", as most of the parsing of the file
headers is done in R.
So back to the problem. First, load the following:
library(lattice)
library(MASS)
library(memisc)
and then use the call:
as.data.frame(as.data.set(Stata.file("filename.dta")))

Related

R Program Version issue

Results differ between R 3.6 and R 4.1.
My code runs fine under R 3.6 on an Ubuntu 18 server, but the same code misbehaves under R 4.1 on Ubuntu 20.
See this screenshot:
[image: Issue with R Version]
The purpose of the code is to normalize a column by dividing it by its sum.
Thank you all in advance.
Please don't post code as an image. It is also advised to post a reproducible example.
In any case, in your example on R 3.6, all_bins is a factor. However, in your R 4.1 example, all_bins is a character vector.
This is because of a change in R 4.0.0:
R now uses a 'stringsAsFactors = FALSE' default, and hence by default no longer converts strings to factors in calls to data.frame() and read.table().
To reproduce the server behaviour on your local machine, add the argument stringsAsFactors = TRUE when you read in bins in your local version of R, e.g.:
bins <- read.csv("path/to/file", stringsAsFactors = TRUE)
This should solve this particular issue. However, you may run into other differences between R 3.6 and R 4.1 on different machines. I would recommend running the same version of R and packages on both machines, perhaps using renv, if you want to ensure the output is the same.
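The difference can be reproduced without the original data. The following is a minimal sketch, assuming a small made-up CSV: only the names bins and all_bins come from the question, while the count and share columns and the values are hypothetical.

```r
# Made-up data standing in for the file from the question;
# `count` is a hypothetical column name.
csv_text <- "all_bins,count\na,10\nb,30\nc,60\n"

# Default since R 4.0.0: strings stay character vectors.
bins_chr <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Default in R 3.6: strings are converted to factors.
bins_fct <- read.csv(text = csv_text, stringsAsFactors = TRUE)

class(bins_chr$all_bins)  # "character"
class(bins_fct$all_bins)  # "factor"

# Normalizing a numeric column by its sum gives the same result either way.
bins_fct$share <- bins_fct$count / sum(bins_fct$count)
bins_fct$share
```

Passing stringsAsFactors explicitly in every call, rather than relying on the default, should make the script behave the same on both R versions.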

Convert Stata 13 .dta file to CSV without using stata [duplicate]

Is there a way to read a Stata version 13 dataset file in R?
I have tried to do the following:
> library(foreign)
> data = read.dta("TEAdataSTATA.dta")
However, I got an error:
Error in read.dta("TEAdataSTATA.dta") :
not a Stata version 5-12 .dta file
Could someone point out if there is a way to fix this?
There is a new package to import Stata 13 files into a data.frame in R.
Install the package and read a Stata 13 dataset with read.dta13():
install.packages("readstata13")
library(readstata13)
dat <- read.dta13("TEAdataSTATA.dta")
Update: as of version 0.8, readstata13 also imports files from Stata 6 to 14.
More about the package: https://github.com/sjewo/readstata13
There's a new package called haven, by Hadley Wickham, which can load Stata 13 dta files (as well as SAS and SPSS files):
library(haven) # haven package now available on CRAN
df <- read_dta('c:/somefile.dta')
See: https://github.com/hadley/haven
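The readers mentioned so far can also be tried one after another. The following is only a sketch: the helper names read_stata_any and default_stata_readers are hypothetical, and it assumes that whichever of foreign, readstata13 and haven are installed will signal an ordinary R error on a file they cannot read.

```r
# Try each reader in turn and return the first result that works.
# `readers` is a named list of functions that take a file path.
read_stata_any <- function(path, readers = default_stata_readers()) {
  if (!file.exists(path)) stop("file not found: ", path)
  for (name in names(readers)) {
    # A failing reader (or a missing package) yields NULL, so we move on.
    result <- tryCatch(readers[[name]](path), error = function(e) NULL)
    if (!is.null(result)) {
      message("read with ", name)
      return(as.data.frame(result))
    }
  }
  stop("none of the readers could open ", path)
}

# Hypothetical default order: foreign first, then the newer packages.
default_stata_readers <- function() {
  list(
    foreign     = function(p) foreign::read.dta(p),
    readstata13 = function(p) readstata13::read.dta13(p),
    haven       = function(p) haven::read_dta(p)
  )
}
```

With the file from the question this would be called as dat <- read_stata_any("TEAdataSTATA.dta"). Note that the readers do not return identical structures (haven, for instance, keeps value labels as labelled vectors), so the result may still need some cleaning.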
If you have Stata 13, you can load the file there and save it in Stata 12 format using the command saveold (see help saveold). Afterwards, take it to R.
If you have Stata 10 to 12, you can use the user-written command use13 (by Sergiy Radyakin) to load the file and save it there, then take it to R. You can install use13 by running ssc install use13.
Details can be found at http://radyakin.org/transfer/use13/use13.htm
Other alternatives, still with Stata, involve exporting the Stata format to something else that R will read, e.g. text-based files. See help export within Stata.
Update
Starting with Stata 14, saveold has a version() option, allowing one to save in Stata .dta formats as old as Stata 11.
In the meantime, the savespss command has become part of the SSC archive and can be installed in Stata with: findit savespss
The homepage http://www.radyakin.org/transfer/savespss/savespss.htm continues to work, but the program should now be installed from the SSC, not from the beta location.
I am not familiar with the current state of R programs regarding their ability to read other file formats, but if someone doesn't have Stata installed on their computer and R cannot read a specific version of Stata's dta files, Pandas in Python can now do the vast majority of such conversions.
Basically, the data from the dta file are first loaded using the pandas.read_stata function. As of version 0.23.0, the supported encoding and formats can be found in a related answer of mine.
Then one can either save the data as a csv file and import them using standard R functions, or instead use the pandas.DataFrame.to_feather function, which exports the data using a serialization format built on Apache Arrow. The latter has extensive support in R as it was conceived to promote interoperability with Pandas.
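On the R side, picking up the converted file might look like the sketch below. The helper name read_converted is hypothetical, and the feather branch assumes the arrow package is installed; the CSV branch needs only base R.

```r
# Read a file that was converted outside R (for example with pandas).
# Dispatches on the file extension; anything other than .csv or
# .feather is rejected.
read_converted <- function(path) {
  ext <- tolower(tools::file_ext(path))
  if (ext == "csv") {
    read.csv(path, stringsAsFactors = FALSE)
  } else if (ext == "feather") {
    # Requires the arrow package to be installed.
    as.data.frame(arrow::read_feather(path))
  } else {
    stop("unsupported extension: ", ext)
  }
}
```

CSV needs nothing beyond base R but loses column types such as dates and categoricals, whereas feather is designed to preserve them, which is the main reason to prefer it when arrow is available.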
I had the same problem. Tried read.dta13, read.dta but nothing worked. Then tried the easiest and least expected: MS Excel! It opened marvelously. I saved it as a .csv and used in R!!! Hope this helps!!!!

Error in reading SAS dataset in R 3.0.1

I am trying to read a SAS dataset in R 3.0.1.
I have installed the Hmisc package, which is required for the sas.get function. When I load it, I get the following note:
Hmisc library by Frank E Harrell Jr
Type library(help='Hmisc'), ?Overview, or ?Hmisc.Overview')
to see overall documentation.
NOTE:Hmisc no longer redefines [.factor to drop unused levels when
subsetting. To get the old behavior of Hmisc type dropUnusedLevels().
Attaching package: ‘Hmisc’
Then I run the following command:
sas.get(library = "C:\\SAS_dataset", member = "test", formats = FALSE, sasprog = sasprog)
R then appears to hang and gives no output. When I finally press "Esc", it terminates with this warning message:
Warning message:
running command '"C:/program files/SAS/SAS 9.1/sas.exe" "C:\Users\TEJASW~1.ABH\AppData\Local\Temp\RtmpML87zC\SaS13c41642d38.3.sas" -log "_temp_.log"' had status 10708
I tried to find the reason for this, but in vain.
Is it caused by the note from the Hmisc package, or by something else?
I also noticed that the problem occurs only with the latest version, 3.0.1. With version 2.15.1 I was able to read the SAS dataset using the same commands.
Can anyone help me solve this problem?
Thanks in advance.
Regards,
Tejasweeni
If you have SAS, you can always export your data to a CSV file and read it into R using read.table() or read.csv(). I think this is often the best solution.

R.matlab/readMat : Error in readTag(this)

I am trying to read a matlab file into R using R.matlab but am encountering this error:
require(R.matlab)
r <- readMat("file.mat", verbose=T)
Trying to read MAT v5 file stream...
Error in readTag(this) : Unknown data type. Not in range [1,19]: 18569
In addition: Warning message:
In readMat5Header(this, firstFourBytes = firstFourBytes) :
Unknown MAT version tag: 512. Will assume version 5.
How can this issue be solved or is there an alternative way to load matlab files? I can use hdf5load but have heard this can mess with the data. Thanks!
This is a bit late on the response, but I've recently been running into the same issues. For me, the issue was that I was saving matlab files by default using the '-v7.3' option. After extensive searching, the R.matlab source documentation (http://cran.r-project.org/web/packages/R.matlab/R.matlab.pdf) indicates the following:
Reading compressed MAT files
From MATLAB v7, compressed MAT version 5 files are used by default [3,4]. This function supports reading such files, if running R v2.10.0 or newer. For older versions of R, the Rcompression package is used. To install that package, please see instructions at http://www.omegahat.org/cranRepository.html. As a last resort, use save -V6 in MATLAB to write MAT files that are compatible with MATLAB v6, that is, to write non-compressed MAT version 5 files.
About MAT files saved in MATLAB using '-v7.3'
This function does not support MAT files saved in MATLAB as save('foo.mat', '-v7.3'). Such MAT files are of a completely different file format [5,6] compared to those saved with, say, '-v7'.
Adding the '-v7' option at the end of my save command fixed the issue, i.e.:
save('filename', 'variable', '-v7')
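Before re-saving in MATLAB, it can help to check which format a given file is in. The sketch below relies on my understanding that MAT files begin with a 116-byte text description, and that this text starts with "MATLAB 7.3 MAT-file" for files saved with '-v7.3' and "MATLAB 5.0 MAT-file" otherwise. Treat that as an assumption and verify it against your own files. The helper names are hypothetical and not part of R.matlab.

```r
# Return the descriptive text stored at the start of a MAT file.
# Assumption: the first 116 bytes hold a human-readable description.
mat_header_text <- function(path) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  bytes <- readBin(con, what = "raw", n = 116)
  bytes <- bytes[bytes != as.raw(0)]  # drop NUL padding, if any
  rawToChar(bytes)
}

# TRUE if the header text says the file was saved with '-v7.3'.
is_mat_v73 <- function(path) {
  startsWith(mat_header_text(path), "MATLAB 7.3 MAT-file")
}
```

If is_mat_v73("file.mat") returns TRUE, readMat() is not expected to read the file, and re-saving with '-v7' as described above is the simplest way out.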
I had a very similar problem until I pointed the function to an actual .mat file that existed. Before that, I had two files with the same name, one .mat and the other .txt, so it may have been trying to open the wrong one.
I realize this may not directly solve your issue (the only differences I saw in my error message were the absence of the first line "Trying ...", different specific numbers, and a couple of additional similar warnings with odd numbers), but it might point to a simple filename problem as the cause.
I use the latest MATLAB on 64-bit Vista and the latest R on 32-bit XP.
