Results differ between R 3.6 and R 4.1.
My code runs well under R 3.6 on an Ubuntu 18 server, but the same code produces different results under R 4.1 on Ubuntu 20.
See this screenshot:
The purpose of this code is to normalize the column by dividing each value by the column sum.
Thank you all in advance.
Please don't post code as an image. It is also advised to post a reproducible example.
In any case, in your example on R 3.6, all_bins is a factor. However, in your R 4.1 example, all_bins is a character vector.
This is because of a change in R 4.0.0:
R now uses a ‘stringsAsFactors = FALSE’ default, and hence by default no longer converts strings to factors in calls to data.frame() and read.table().
In order to reproduce the server behaviour on your local machine, when you read in bins in your local version of R, you need to add the argument stringsAsFactors = TRUE, e.g.:
bins <- read.csv("path/to/file", stringsAsFactors = TRUE)
This should solve this particular issue. However, you may run into other differences between R 3.6 and R 4.1 on different machines. I would recommend running the same version of R and packages on both machines, perhaps using renv, if you want to ensure the output is the same.
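To see the difference without the original file, here is a minimal sketch using a small inline CSV. The column name all_bins comes from the question; the count column and its values are made up for illustration.

```r
# Hypothetical data standing in for the real file
csv_text <- "all_bins,count\na,2\nb,3\na,5\n"

# Default behaviour before R 4.0.0: strings become factors
bins_factor <- read.csv(text = csv_text, stringsAsFactors = TRUE)
# Default behaviour since R 4.0.0: strings stay character
bins_char <- read.csv(text = csv_text, stringsAsFactors = FALSE)

class(bins_factor$all_bins)
class(bins_char$all_bins)

# Normalising a numeric column by its sum does not depend on the string setting
bins_char$share <- bins_char$count / sum(bins_char$count)
```

Passing stringsAsFactors explicitly, whichever value you choose, makes the script behave the same on both R versions.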
I have been writing code in RStudio and tried to move it over to Jupyter Books to share it with people.
The code all works in RStudio, but when I run it in Jupyter Books, as.Date() does not convert the date column, which starts out as a factor, into a date. As a result I have no data when I subset by date later on.
Has anyone had this happen and know a solution? Or will I just need to use lubridate or similar to convert the date?
Thanks,
Dave
My guess is that you are running different R versions in the two places. Run R.version.string in both to check which version of R each one uses. Since R 4.0.0, the default behaviour when importing string data has changed: previously strings were imported as factors, and now they are imported as characters.
The solution is to import your dataset with an explicit stringsAsFactors = FALSE in both places, so that both give the same output.
data <- read.csv('filename.csv', stringsAsFactors = FALSE)
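As a rough sketch of the check and of a conversion that does not depend on the column type: the date values and the helper name to_date below are hypothetical, and the format string assumes ISO-style dates.

```r
# Print the running R version; compare the output in both environments
R.version.string

# Hypothetical date column, once as character and once as factor
dates_chr <- c("2020-01-15", "2020-02-20")
dates_fct <- factor(dates_chr)

# Going through as.character() first makes the conversion behave the same
# whether the column arrived as a factor or as a character vector
to_date <- function(x, format = "%Y-%m-%d") {
  as.Date(as.character(x), format = format)
}

identical(to_date(dates_chr), to_date(dates_fct))
```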
My question is whether an object saved to binary format with the save function can differ when saved from different (but recent) versions of R.
I ask because I have a script that makes some calculations and saves its results to a file. When reproducing the same calculations later, I decided to compare the two files using
diff --binary -s mv3p.Rdata mv3p.Rdata.backup
To my surprise the two files are different. However when analysing the contents in R, they are identical.
The new version is 3.3.1. I believe the older file was created by R 3.3.0, but it could also have been 3.2.x; I am not 100% sure. I called save with only the object I wanted to save and the filename as arguments.
So my question is: is it normal that the same object is written differently by different versions of R? Is it documented somewhere? How can I be sure to reproduce exactly the same file? What can it depend on (R version, OS, processor architecture, etc.)?
Please note, I am NOT asking whether files can be read by another version of R, and I am NOT asking about very old R versions.
R data files also include the version of R used to write them. That's one reason the files may be different. See the documentation here: http://biostat.mc.vanderbilt.edu/wiki/Main/RBinaryFormat
Also, you can use save(..., ascii = TRUE) to see the difference in plain text.
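Since the bytes on disk can differ while the contents do not, comparing the loaded objects is a more reliable test than diff. A minimal sketch, with a made-up object x and temporary files standing in for the real .Rdata files:

```r
# Hypothetical object standing in for the real results
x <- list(a = 1:3, b = "text")

# Save once in the default binary format and once as ASCII
f_bin <- tempfile(fileext = ".RData")
f_txt <- tempfile(fileext = ".RData")
save(x, file = f_bin)
save(x, file = f_txt, ascii = TRUE)

# Load each file into its own environment so the copies do not clash
e_bin <- new.env()
e_txt <- new.env()
load(f_bin, envir = e_bin)
load(f_txt, envir = e_txt)

# Compare contents rather than bytes on disk
identical(e_bin$x, e_txt$x)
```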
A recurring question on SO is that package xx is not available for R version 2.xx.xx. For example the gplots package requires the user to have R 3.0 installed in order for it to install. You can get older versions in the Archive of CRAN, but:
It is not easy to see which version of the package you need to get for a specific R version.
You need to build the package from source, which is primarily a (mild) challenge under Windows.
My question is the following: is there a more effective workflow for getting older package versions which match your older version of R? Something in the spirit of having different package repositories for different versions of Ubuntu.
I know one option would be to just get the latest version of R, but there might be some pressing reason to stick to a certain version of R. For example, one could be interested in repeating an old experiment which relies on an old version of R and support packages. Or one is limited by the system administration.
This is entirely untested (I'm running the latest version of R and have no time at the moment to install an old version of R to test it out), but perhaps one idea is to grab the dates from the "Archive" page for the package, compare that to the date for your R version, and progressively try installing the earlier versions, starting with the most recent version.
Something like this might be a starting point:
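install_archive <- function(PackageName) {
  # Install and load the helper packages if they are missing
  if (!require("XML")) {
    install.packages("XML")
    library(XML)
  }
  if (!require("devtools")) {
    install.packages("devtools")
    library(devtools)
  }
  # Release date of the running R version
  rVersionDate <- as.Date(paste(R.Version()[c("year", "month", "day")],
                                collapse = "-"))
  BaseURL <- "http://cran.r-project.org/src/contrib/Archive/"
  # Scrape the archive listing: file name and last-modified date
  u <- htmlParse(paste(BaseURL, PackageName, sep = ""))
  doc <- readHTMLTable(u, skip.rows = 1:2)[[1]][2:3]
  releaseDate <- as.Date(strptime(doc$`Last modified`,
                                  format = "%d-%b-%Y"))
  # Keep only releases from on or before the R release date, so that the
  # index from which.min() refers to the same rows as doc$Name
  keep <- !is.na(releaseDate) & releaseDate <= rVersionDate
  doc <- doc[keep, ]
  Closest <- which.min(as.numeric(rVersionDate - releaseDate[keep]))
  # Archived tarballs live in a per-package subdirectory
  install_url(paste(BaseURL, PackageName, "/", doc$Name[Closest], sep = ""))
}
install_archive("reshape")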
From here, I would add at least the following things to the function:
I would first try to install the most current version (not from the "Archive"), and if that fails, then move ahead.
In moving ahead, I would change the which.min() line to rank(), and try rank == 1, rank == 2, and so on, perhaps setting a maximum rank at which to try.
Even so, this is a lot of "guess and check", only the software is doing the guessing and checking for you automatically. And, of course, the same advice holds that there is probably a good reason it's not on CRAN!
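The rank-based idea can be sketched separately from the scraping. The helper name rank_candidates, the max_tries parameter, and the dates below are all hypothetical; the function only returns the order in which archived releases would be tried, and the install attempts themselves are left out.

```r
# Return the indices of archived releases to try, closest to the R release
# date first, skipping anything published after that date
rank_candidates <- function(release_dates, r_date, max_tries = 3) {
  eligible <- which(!is.na(release_dates) & release_dates <= r_date)
  gap <- as.numeric(r_date - release_dates[eligible])
  head(eligible[order(gap)], max_tries)
}

# Made-up release dates for illustration
dates <- as.Date(c("2010-01-15", "2012-06-01", "2011-03-20", "2013-09-09"))
rank_candidates(dates, as.Date("2012-12-31"))
```

Each returned index could then be passed to an install attempt wrapped in try(), stopping at the first one that succeeds.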
A project that normally works on my Windows 7 office machine now gives errors on my Mac OS X laptop when I run it in RStudio. The part that fails is
library(foreign)
basis <- read.dta("myfile.dta")
Error in factor(rval[[v]], levels = tt[[ll[v]]], labels = names(tt[[ll[v]]])) :
invalid 'labels'; length 4 should be 1 or 3
R and RStudio are both on the newest version, and I have already run update.packages(). As I'm a beginner with R itself, I'm completely clueless about what to try next.
Could this somehow be related to OS X encoding? The Stata file has German umlauts (that is, non-ASCII characters) in it.
Use the memisc package instead, which is supposed to be more flexible. From the docs:
The importer mechanism is more flexible and extensible than read.spss
and read.dta of package "foreign", as most of the parsing of the file
headers is done in R.
So back to the problem. First, load the following:
library(lattice)
library(MASS)
library(memisc)
and then use the call:
as.data.frame(as.data.set(Stata.file("filename.dta")))
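Because the error is raised while value labels are being turned into factors, it may also be worth retrying read.dta() with convert.factors = FALSE before switching packages. This is only a sketch: read_stata_safely is a hypothetical helper name, the foreign package is assumed to be installed, and whether the fallback avoids the error for a particular file is an assumption rather than something tested against the file in the question. The small round trip at the end uses a made-up data frame.

```r
library(foreign)

# Try the normal import first; on error, retry without factor conversion
read_stata_safely <- function(path) {
  tryCatch(
    read.dta(path),
    error = function(e) {
      message("read.dta() failed, retrying without factor conversion: ",
              conditionMessage(e))
      read.dta(path, convert.factors = FALSE)
    }
  )
}

# Round trip with a made-up data frame to show the call
tmp <- tempfile(fileext = ".dta")
write.dta(data.frame(x = 1:3), tmp)
out <- read_stata_safely(tmp)
```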
I am trying to read a matlab file into R using R.matlab but am encountering this error:
require(R.matlab)
r <- readMat("file.mat", verbose=T)
Trying to read MAT v5 file stream...
Error in readTag(this) : Unknown data type. Not in range [1,19]: 18569
In addition: Warning message:
In readMat5Header(this, firstFourBytes = firstFourBytes) :
Unknown MAT version tag: 512. Will assume version 5.
How can this issue be solved, or is there an alternative way to load MATLAB files? I can use hdf5load but have heard this can mess with the data. Thanks!
This is a late response, but I've recently been running into the same issue. For me, the problem was that I was saving MATLAB files with the '-v7.3' option by default. After extensive searching, I found that the R.matlab documentation (http://cran.r-project.org/web/packages/R.matlab/R.matlab.pdf) says the following:
Reading compressed MAT files
From MATLAB v7, compressed MAT version 5 files are used by default
[3,4]. This function supports reading such
files, if running R v2.10.0 or newer. For older versions of R, the
Rcompression package is used. To install that package, please see
instructions at http://www.omegahat.org/cranRepository.html. As a
last resort, use save -V6 in MATLAB to write MAT files that are
compatible with MATLAB v6, that is, to write non-compressed MAT
version 5 files.
About MAT files saved in MATLAB using ’-v7.3’
This function does not
support MAT files saved in MATLAB as save('foo.mat',
'-v7.3'). Such MAT files are of a completely different file format
[5,6] compared to those saved with, say, '-v7'.
Adding the '-v7' option at the end of my save command fixed this issue,
i.e.: save('filename', 'variable', '-v7')
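If you are unsure which format an existing file uses, the text header at the start of the file can be inspected from R. The sketch below assumes that v7.3 files begin with the text "MATLAB 7.3 MAT-file" and older ones with "MATLAB 5.0 MAT-file"; the helper names mat_header and is_v73 are hypothetical, and the temporary file is a fake that contains only a header line.

```r
# Read the first bytes of a file and return them as text
mat_header <- function(path, n = 19L) {
  con <- file(path, open = "rb")
  on.exit(close(con))
  bytes <- readBin(con, what = "raw", n = n)
  # Drop nul bytes so rawToChar() does not fail on non-text content
  rawToChar(bytes[bytes != as.raw(0)])
}

# TRUE when the header suggests a '-v7.3' (HDF5-based) MAT file
is_v73 <- function(path) startsWith(mat_header(path), "MATLAB 7.3")

# Fake file containing only a header line, for illustration
tmp <- tempfile(fileext = ".mat")
writeBin(charToRaw("MATLAB 7.3 MAT-file, Platform: test"), tmp)
is_v73(tmp)
```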
I had a very similar problem until I pointed the function to an actual .mat file that existed. Before that I had two files of the same name, one .mat and the other .txt, so it may have been trying to open the wrong one.
I realize this may not directly solve your issue (the only differences I saw in my error message were the absence of that first line "Trying ...", different numbers thereafter, and a couple of additional similar warnings with odd numbers), but it might point to a simple filename problem as the cause.
I use the latest MATLAB on 64-bit Vista and the latest R on 32-bit XP.