I am working with an Excel file saved in S3 and trying to access it from R. To get the file I am using fl <- get_object(paste(file_path, file_name), bucket = bucket). This works fine and returns the file as a raw vector. The problem is that every function I have found for reading an Excel file requires an actual file (i.e., a path), not a raw vector.
Is there a way to read a raw vector (of an Excel file) into a data frame? Or to convert the raw vector back into an Excel file so I can reference it in read_excel() or the like?
The Python code below does what I need, but for reasons far beyond my control, I must do this in R.
fl = s3.get_object(Bucket=bucket,Key= file_path + file_name)
df = pd.read_excel(fl['Body'])
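One common workaround in R (a hedged sketch, not the only approach): write the raw vector to a temporary file with writeBin(), then point read_excel() at that file. It assumes file_path, file_name, and bucket are defined as in the question; paste0() is used here so no space is inserted between the path and the file name.
library(aws.s3)
library(readxl)
fl <- get_object(paste0(file_path, file_name), bucket = bucket)  # raw vector
tmp <- tempfile(fileext = ".xlsx")  # read_excel() needs a real path
writeBin(fl, tmp)                   # materialize the bytes on disk
df <- read_excel(tmp)
unlink(tmp)                         # remove the temporary file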
Related
I am pulling in NetCDF data from a remote server using data <- httr::GET(my_url) in an R session. I can writeBin(content(data, "raw"), "my_file.nc") and then nc_open("my_file.nc"), but that is rather cumbersome (I am processing hundreds of NetCDF files).
Is there a way to convert the raw data straight into an ncdf4 object without going through the file system? For instance, would it be possible to pipe the raw data into nc_open()? I looked at the source code and the function prototype expects a named file, so I suppose a named pipe might work, but how do I make a named pipe from a raw blob of bytes in R?
Any other suggestions welcome.
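One suggestion, sketched under the question's own setup: keep the tempfile round trip but hide it in a small helper so looping over hundreds of files stays tidy. process_fn is a hypothetical callback applied to each open ncdf4 object.
library(httr)
library(ncdf4)
read_nc_from_url <- function(url, process_fn) {
  tmp <- tempfile(fileext = ".nc")
  writeBin(content(GET(url), "raw"), tmp)  # dump the raw bytes to disk
  nc <- nc_open(tmp)
  res <- process_fn(nc)                    # do the real work
  nc_close(nc)
  unlink(tmp)                              # clean up the temp file
  res
}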
I have an Excel file containing a column of 10000 numbers that I wish to import into R.
However, no matter which method I use, the resulting object is either a list of 1 or a data frame of 10000 obs. of 1 variable (I have used read.csv on the .csv version of the file and read_xlsx on the .xlsx version). If this is expected, how can I turn these objects into ordinary arrays?
I have tried importing the same files into MATLAB and everything works normally there (the result is immediately an ordinary array).
If it's an Excel file you might want to try the readxl package.
library("readxl")
dt <- read_excel("your_file_path")
Found an easy method:
convert the data to a data frame, and then convert that to a matrix:
my_data <- data.frame(my_data)
my_data <- data.matrix(my_data)
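A hedged alternative, assuming the sheet really holds a single numeric column: read_excel() returns a one-column tibble, so you can extract that column directly as an ordinary vector:
dt <- read_excel("your_file_path")
x <- dt[[1]]  # plain numeric vector of length 10000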
I am creating a process to convert API data into a DataFrame.
My problem is:
The data only appears correct after exporting it to a CSV file with df.to_csv("df.csv", sep=','). If I don't do that, the first column appears as one big list.
Is there a way to convert to CSV format without creating an external file?
From the documentation of DataFrame.to_csv:
path_or_buf : string or file handle, default None
File path or object, if None is provided the result is returned as a string.
So simply doing:
csv_string = df.to_csv(None, sep=",")
gives you a string containing a CSV representation of your DataFrame, without creating an external file.
Sorry, I am very new to this, so I am confused. I am working on a project that requires me to analyze data that is on a shared drive, and I cannot make a copy of the dataset. How do I load this dataset in R? It's also a SAS file I'll need to read into R.
The file is smb://department.university.edu/lab101/me/dataset/file.xpt. setwd(smb://department.university.edu/lab101/me/dataset) doesn't work, but I am not sure what would here.
First, setwd() takes the file path as a character string (note the quotes):
setwd("smb://department.university.edu/lab101/me/dataset")
Then you can read the file like mentioned here:
/* save SAS dataset in transport format */
libname out xport 'c:/mydata.xpt';
data out.mydata;
set sasuser.mydata;
run;
# in R
library(Hmisc)
mydata <- sasxport.get("file.xpt")
# character variables are converted to R factors
Reference: http://www.statmethods.net/input/importingdata.html
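As a hedged alternative to Hmisc, the haven package can also read SAS transport (.xpt) files and returns a data frame directly:
library(haven)
mydata <- read_xpt("file.xpt")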
I would like to clarify my understanding of both converting a file to CSV and reading it back. Let's use a built-in R dataset, longley, for instance.
To set up a data frame, I can just use the write.table command as follows, right?
d1<-longley
write.table(d1, file="", sep="1,16", row.names=TRUE, col.names=TRUE)
Has this already become a data frame or am I missing something here?
Now let's say I want to read this CSV file back. Would my code be something like:
read.table(<dframe1>, header=FALSE, sep="", quote="\"")
It seems that before this I have to use a function called setwd(). I'm not really sure what it does or how it helps. Can someone help me here?
longley and, therefore, d1 are already data frames (type class(d1) in the console). A data frame is a fundamental data structure in R. Writing a data frame to a file saves the data in the data frame. In this case, you're trying to save the data in the data frame in CSV format, which you would do like this:
write.csv(d1, "myFileName.csv")
write.csv is a wrapper for write.table that takes care of the settings needed for saving in CSV format. You could also do:
write.table(d1, "myFileName.csv", sep=",")
sep="," tells R to write the file with values separated by a comma.
Then, to read the file into an R session you can do this:
df = read.csv("myFileName.csv", header=TRUE, stringsAsFactors=FALSE)
This creates a new object called df, which is the data frame created from the data in myFileName.csv. Once again, read.csv is a wrapper for read.table that takes care of the settings for reading a CSV file.
setwd is how you change the working directory--that is, the default directory where R writes to and reads from. But you can also keep the current working directory unchanged and just give write.csv or read.csv (or any other function that writes or reads R objects) the full path to wherever you want to read from or write to. For example:
write.csv(d1, "/path/for/saving/file/myFileName.csv")