setwd in R - mapped drive?

Sorry, I am very new to this and confused. I am working on a project that requires me to analyze data stored on a shared drive, and I cannot make a copy of the dataset. How do I load this dataset in R? It is also a SAS file that I need to read into R.
The file is smb://department.university.edu/lab101/me/dataset/file.xpt. setwd(smb://department.university.edu/lab101/me/dataset) doesn't work, and I am not sure what would.

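First, setwd() takes the path as a character string, so it must be quoted. An smb:// URL is not a file path that R understands: the share has to be mounted (mapped) by the operating system first, and you then pass the resulting path. On Windows, for example, the UNC form with forward slashes is:
setwd("//department.university.edu/lab101/me/dataset")
On macOS or Linux, use the directory where the share is mounted instead (on macOS this is typically somewhere under /Volumes).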
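Then you can read the file as described in the reference below. The SAS step is only needed if the dataset is not already in .xpt format:
/* In SAS: save the dataset in transport (XPORT) format */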
libname out xport 'c:/mydata.xpt';
data out.mydata;
set sasuser.mydata;
run;
# in R
library(Hmisc)
mydata <- sasxport.get("file.xpt")
# character variables are converted to R factors
Reference: http://www.statmethods.net/input/importingdata.html
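As a variation that avoids setwd() altogether, the full path can be built once and passed straight to the reader. This is only a sketch under some assumptions: share_dir assumes the share is reachable as a Windows UNC path (on macOS or Linux it would be the mount point instead), read_shared_xpt is a hypothetical helper name, and foreign::read.xport is swapped in for Hmisc::sasxport.get because the foreign package ships with most R installations.

```r
# Assumed location of the mounted share (Windows UNC form); adjust for your system.
share_dir <- "//department.university.edu/lab101/me/dataset"

# file.path() joins the pieces with "/" so no backslash escaping is needed.
xpt_path <- file.path(share_dir, "file.xpt")

# Hypothetical helper: fail early with a readable message if the share is not visible.
read_shared_xpt <- function(path) {
  if (!file.exists(path)) {
    stop("Cannot see ", path, "; is the share mounted?")
  }
  foreign::read.xport(path)
}

# Usage, once the share is mounted:
# mydata <- read_shared_xpt(xpt_path)
```

Reading by full path leaves the working directory untouched, which also means nothing gets written to the shared drive by accident.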

Related

Convert Raw Vector back to Excel R

I am working with an Excel file saved in S3. I am trying to access it using R. To get the file I am using fl <- get_object(paste(file_path,file_name),bucket = bucket). This works fine and returns the file in raw vector format. The problem I am having is that any function I have found to read an Excel file requires an actual file (ie path), not a raw vector.
Is there a way to read a raw vector (of an Excel file) into a data frame? Or, convert the raw vector back to an Excel file so I can reference that file in read_excel() or the like?
The python code below does what I need, but for reasons far beyond my control, I must do this in R.
fl = s3.get_object(Bucket=bucket,Key= file_path + file_name)
df = pd.read_excel(fl['Body'])
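One possible approach in R, sketched here rather than taken from the question, is to write the raw vector to a temporary file with writeBin() and hand that path to the Excel reader. The helper name raw_to_tempfile is hypothetical, and the commented usage assumes the readxl package is installed and that fl is the raw vector returned by get_object().

```r
# Hypothetical helper: dump a raw vector to a temporary file and return its path.
raw_to_tempfile <- function(raw_vec, ext = ".xlsx") {
  stopifnot(is.raw(raw_vec))
  path <- tempfile(fileext = ext)
  writeBin(raw_vec, path)  # writes the bytes unchanged
  path
}

# Usage (assumes readxl is installed and fl is the raw vector from S3):
# xlsx_path <- raw_to_tempfile(fl)
# df <- readxl::read_excel(xlsx_path)
# unlink(xlsx_path)  # remove the temporary copy when done
```

The file extension matters because some readers pick the format from it, so pass ext = ".xls" for the older binary format.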

Append new lines to a .Rda file in R

Writing a fresh .Rda file to save a data.frame is easy:
df <- data.frame(a=c(1,2,3,4), b=c(5,6,7,8))
save(df,file="data.Rda")
But is it possible to write more data to the file afterwards? There is no append=TRUE option for save().
Similarly, writing new lines to a text file is easy using:
write.table(df, file = 'data.txt', append=T)
However for large data.frames, the resulting file is much larger.
If you use Microsoft R, you might want to check the RevoScaleR package, the rxImport function in particular. It lets you store a compressed data frame in a file, and it also lets you append new rows to an existing file without loading it into your environment.
Hope this helps. Link on function documentation below.
https://learn.microsoft.com/en-us/machine-learning-server/r-reference/revoscaler/rximport
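With base R only, the usual workaround is to load the saved object, bind the new rows, and save again. The sketch below does exactly that; append_to_rda is a hypothetical helper name, and it assumes the file holds a single data frame stored under the name given in name. Note that, unlike a true append, this does read the whole file into memory.

```r
# Hypothetical helper: emulate "append" for an .Rda file via load / rbind / save.
append_to_rda <- function(new_rows, file, name = "df") {
  env <- new.env()
  if (file.exists(file)) {
    load(file, envir = env)                        # restores the saved object into env
    combined <- rbind(get(name, envir = env), new_rows)
  } else {
    combined <- new_rows                           # nothing saved yet: start a new file
  }
  assign(name, combined, envir = env)
  save(list = name, file = file, envir = env)      # overwrite with the combined data
  invisible(combined)
}

# Usage:
# append_to_rda(data.frame(a = 9, b = 10), "data.Rda")
```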

Reading and Setting Up CSV files on R Programming Language

I would like to clarify my understanding here on both converting a file into CSV and also reading it. Let's use a dataset from R for instance, titled longley.
To set up a data frame, I can just use the write.table command as follows, right?
d1<-longley
write.table(d1, file="", sep="1,16", row.names=TRUE, col.names=TRUE)
Has this already become a data frame or am I missing something here?
Now let's say if I want to read this CSV file. Then would my code be something like:
read.table(<dframe1>, header=FALSE, sep="", quote="\"")
It seems like before that I have to use a function called setwd(). I'm not really sure what it does or how it helps. Can someone help me here?
longley and, therefore, d1 are already data frames (type class(d1) in the console). A data frame is a fundamental data structure in R. Writing a data frame to a file saves the data in the data frame. In this case, you're trying to save the data in the data frame in CSV format, which you would do like this:
write.csv(d1, "myFileName.csv")
write.csv is a wrapper for write.table that takes care of the settings needed for saving in CSV format. You could also do:
write.table(d1, "myFileName.csv", sep=",")
sep="," tells R to write the file with values separated by a comma.
Then, to read the file into an R session you can do this:
df = read.csv("myFileName.csv", header=TRUE, stringsAsFactors=FALSE)
This creates a new object called df, which is the data frame created from the data in myFileName.csv. Once again, read.csv is a wrapper for read.table that takes care of the settings for reading a CSV file.
setwd is how you change the working directory--that is, the default directory where R writes to and reads from. But you can also keep the current working directory unchanged and just give write.csv or read.csv (or any other function that writes or reads R objects) the full path to wherever you want to read from or write to. For example:
write.csv(d1, "/path/for/saving/file/myFileName.csv")
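To see the whole round trip in one place, here is a small sketch that writes longley to a CSV file and reads it back. The file name longley_demo.csv and the use of tempdir() are choices made for this illustration only, so nothing is written to your working directory.

```r
# Write the built-in longley data frame to a CSV file in a temporary folder.
csv_path <- file.path(tempdir(), "longley_demo.csv")
write.csv(longley, csv_path)   # row names (the years) go into the first column

# Read it back; row.names = 1 turns that first column into row names again.
longley_back <- read.csv(csv_path, row.names = 1, stringsAsFactors = FALSE)

# Both objects are data frames with the same shape and column names.
class(longley_back)
dim(longley_back)
```

Without row.names = 1, read.csv keeps the years as an ordinary first column instead of using them as row names.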

How to export a dataset to SPSS?

I want to export a dataset in the MASS package to SPSS for further investigation. I'm looking for the EuStockMarkets data set in the package.
As described in http://www.statmethods.net/input/exportingdata.html, I did:
library(foreign)
write.foreign(EuStockMarkets, "c:/mydata.txt", "c:/mydata.sps", package="SPSS")
I got a text file, but the .sps file is not a valid SPSS file. I'm really looking for a way to export the dataset to something that SPSS can open.
As Thomas has mentioned in the comments, write.foreign doesn't generate native SPSS datafiles (.sav). What it does generate is the data in a comma delimited format (the .txt file) and a basic syntax file for reading that data into SPSS (the .sps file). The EuStockMarkets data object class is multivariate time series (mts) so when it's exported the metadata is lost and the resulting .sps file, lacking variable names, throws an error when you try to run it in SPSS. To get around this you can export it as a data frame instead:
write.foreign(as.data.frame(EuStockMarkets), "c:/mydata.txt", "c:/mydata.sps", package="SPSS")
Now you just need to open mydata.sps as a syntax file (NOT as a datafile) in SPSS and run it to read in the datafile.
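The class difference behind this fix can be checked directly in base R. The sketch below converts the series to a data frame and, as an optional extra that is not part of the answer above, keeps the time index in a column named time (a name chosen for this illustration), since as.data.frame() on its own drops it.

```r
# EuStockMarkets ships with R as a multivariate time series, not a data frame.
class(EuStockMarkets)

# Convert to a data frame; the four index columns keep their names.
stocks <- as.data.frame(EuStockMarkets)

# Optional: keep the (decimal year) time index, which the conversion drops.
stocks$time <- as.numeric(time(EuStockMarkets))

names(stocks)
```

The resulting stocks object can then be passed to write.foreign in place of as.data.frame(EuStockMarkets).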
Rather than exporting it, use the STATS GET R extension command. It will take a specified data frame from an R workspace/dataset and convert it into a Statistics dataset. You need the R Essentials for Statistics and the extension command, which are available via the SPSS Community site (www.ibm.com/developerworks/spssdevcentral)
I'm not trying to re-answer a question that has already been answered. I just want to add something for other users looking for this.
In the SPSS syntax window, find the first line of code and edit it. It should be something like this:
"file-name.txt"
You need to find the folder path where you're keeping your file:
"C:\Users\DELL\Google Drive\Folder-With-Your-File"
Then you just need to add this path to your file's name:
"C:\Users\DELL\Google Drive\Folder-With-Your-File\file-name.txt"
Otherwise SPSS will not recognize the .txt file.
Sorry if I'm repeating some information here, I just wanted to make it easier to understand.
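EuStockMarkets is a multivariate time series rather than a data frame, so convert it with as.data.frame() first.
Writing a native .sav file should then work, and for labelled data it keeps the variable and value labels:
library(sjlabelled)
write_spss(as.data.frame(EuStockMarkets), "mydata.sav")
Or try rio:
rio::export(as.data.frame(EuStockMarkets), "mydata.sav")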

Read and write a Netcdf file using R

How can I read and write the following file using R ?
https://www.dropbox.com/s/vlnrlxjs7f977zz/3B42_daily.2012.11.23.7.nc
In other words, I would like to read the "3B42_daily.2012.11.23.7.nc" file and write with the same structure that it is written.
Best regards
The ncdf package has functions to do this. You should also read the other Q&As on this site tagged with netcdf and r.
Basically to read a netcdf file:
library(ncdf)
a <- open.ncdf('your/path/to/your/file.nc') #that opens a connection to the file
Then function get.var.ncdf helps you extract the data, variable by variable.
The process to write one is described in this Q&A.
The idea is to create dimensions first using dim.def.ncdf then the variables with var.def.ncdf and finally the file itself using create.ncdf.
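The same define-dimensions, define-variables, create-file sequence can be sketched with ncdf4, the successor to ncdf, whose functions are named nc_open, ncvar_get, ncdim_def, ncvar_def and nc_create. This is an illustration under assumptions, not the structure of the file in the question: the dimension names, the variable name precip, the units and the values are all made up, and the helper names write_demo_nc and read_nc_var are hypothetical.

```r
# Sketch with ncdf4 (assumed installed); all names and values are illustrative.
write_demo_nc <- function(path) {
  # 1. Dimensions first
  lon <- ncdf4::ncdim_def("lon", "degrees_east", c(0, 1, 2))
  lat <- ncdf4::ncdim_def("lat", "degrees_north", c(10, 20))
  # 2. Then the variable defined on those dimensions
  precip <- ncdf4::ncvar_def("precip", "mm", list(lon, lat), missval = -9999)
  # 3. Finally the file itself, followed by the data
  nc <- ncdf4::nc_create(path, list(precip))
  ncdf4::ncvar_put(nc, precip, matrix(1:6, nrow = 3))
  ncdf4::nc_close(nc)
  invisible(path)
}

# Read one variable back, closing the file even if an error occurs.
read_nc_var <- function(path, varname) {
  nc <- ncdf4::nc_open(path)
  on.exit(ncdf4::nc_close(nc))
  ncdf4::ncvar_get(nc, varname)
}

# Usage:
# f <- write_demo_nc(tempfile(fileext = ".nc"))
# read_nc_var(f, "precip")
```

To copy an existing file with the same structure, the dimensions and variables would be taken from the object returned by nc_open rather than typed in by hand.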
