"filename.rdata" file Exploring and Converting to CSV - r

I'm not an R programmer (I only started learning it because of this problem); I mainly use Python. For a forecasting task I was given a dataset, signalList.rdata, of a phenomenon called partial discharge.
I tried some commands to load, open, and view it, but hardly got a glimpse:
my_data <- get(load('C:/Users/Zack-PC/Desktop/Study/Data Sets/pdCluster/signalList.Rdata'))
But since I lack deep knowledge of R, I want to convert it into a csv file, or any format that I can deal with in Python,
or explore it and copy-paste the contents manually.
So I'm asking for any solution, whether in R, Python, or any other tool, to get at what's in the .rdata file.

Have you managed to load the data successfully into your working environment?
If so, write.csv is the function you are looking for.
If not,
setwd("C:/Users/Zack-PC/Desktop/Study/Data Sets/pdCluster/")
signalList <- get(load("signalList.Rdata")) # load() returns the loaded object's name, not the object itself, so wrap it in get()
write.csv(signalList, "signalList.csv")
should do the trick.
If you would like to remove signalList from your workspace afterwards,
rm(signalList)
will accomplish this.
Note: changing your working directory isn't necessary; I just feel it makes the answer easier to read. You may also supply a full path for the output file as the second argument of write.csv.
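One caveat: write.csv only works if the loaded object is a data frame or matrix. Since signal data often arrives as a list, here is a minimal hedged sketch for inspecting the object first and, if it does turn out to be a list, writing one csv per element (the output file names are made up):
my_data <- get(load("signalList.Rdata"))
class(my_data) # data.frame? matrix? list?
str(my_data, max.level = 1) # peek at the structure without flooding the console
# if it is a list rather than a single data frame, write each element separately:
if (is.list(my_data) && !is.data.frame(my_data)) {
  for (i in seq_along(my_data)) {
    write.csv(my_data[[i]], paste0("signal_", i, ".csv"), row.names = FALSE)
  }
}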

Related

How to use a file modified by a R chunk in a Python one

I am working in R Markdown, primarily with R chunks, which I have used to modify data frames. Now that they are ready, a colleague gave me Python code to process some of the data. But when transitioning from an R chunk to a Python one, the environment changes, and I do not know how to access the previous objects directly.
reticulate::repl_python()
biodata_file = women_personal_data
NameError: name 'women_personal_data' is not defined
Ideally, I would like not to have to save the files to disk between R and Python and back again, to avoid accumulating files that are not completely clean yet (I had figured that could be a fallback solution).
I tried this solution, but it seems not to work with data frames.
Thanks!
biodata_file = r.women_personal_data
The r. prefix makes Python fetch it from R, where the variable was called
women_personal_data
TIP: to come back to R, the same variable is available there as py$women_personal_data
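A minimal R Markdown sketch of the round trip, assuming reticulate and pandas are available (the chunk boundaries are shown as comments; the column names are made up):
# in an R chunk:
women_personal_data <- data.frame(name = c("Ana", "Bea"), age = c(34, 29))
# in a Python chunk:
biodata_file = r.women_personal_data # arrives as a pandas DataFrame
biodata_file["age"] = biodata_file["age"] + 1
# back in an R chunk:
py$biodata_file # the modified frame, with no files written to disk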

In R and Sparklyr, writing a table to .CSV (spark_write_csv) yields many files, not one single file. Why? And can I change that?

Background
I'm doing some data manipulation (joins, etc.) on a very large dataset in R, so I decided to use a local installation of Apache Spark and sparklyr to be able to use my dplyr code to manipulate it all. (I'm running Windows 10 Pro; R is 64-bit.) I've done the work needed, and now want to output the sparklyr table to a .csv file.
The Problem
Here's the code I'm using to output a .csv file to a folder on my hard drive:
spark_write_csv(d1, "C:/d1.csv")
When I navigate to the directory in question, though, I don't see a single csv file d1.csv. Instead I see a newly created folder called d1, and inside it ~10 .csv files, all with names beginning with "part".
The folder also contains the same number of .csv.crc files, which I see from Googling are "used to store CRC code for a split file archive".
What's going on here? Is there a way to put these files back together, or to get spark_write_csv to output a single file like write.csv?
Edit
A user below suggested that this post may answer the question, and it nearly does, but it seems like the asker is looking for Scala code that does what I want, while I'm looking for R code that does what I want.
I had the exact same issue.
In simple terms, the partitioning is done for computational efficiency: with multiple partitions, multiple workers/executors can write the table in parallel, one partition each. With only one partition, a single worker/executor has to write the whole csv, which makes the task much slower. The same principle applies not only to writing tables but to parallel computation in general.
For more details on partitioning, you can check this link.
Suppose I want to save the table as a single file at the path path/to/table.csv. I would do it as follows (note that the repartitioned table must be the one passed to spark_write_csv):
table %>%
  sdf_repartition(partitions = 1) %>%
  spark_write_csv("path/to/table.csv")
You can check the full details of sdf_repartition in the official documentation.
Data will be divided into multiple partitions, and when you save the dataframe to CSV you get one file per partition. Before calling spark_write_csv, you need to bring all the data into a single partition to get a single file.
You can use a method called coalesce to achieve this; in sparklyr it is exposed as sdf_coalesce:
df %>% sdf_coalesce(1) %>% spark_write_csv("path/to/table.csv")
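If you would rather keep the fast parallel write and merge afterwards, here is a hedged sketch in plain R that reads the part files back and binds them into a single csv (the folder name matches the question; this assumes the parts share a header and fit in memory):
parts <- list.files("C:/d1", pattern = "\\.csv$", full.names = TRUE)
combined <- do.call(rbind, lapply(parts, read.csv))
write.csv(combined, "C:/d1_single.csv", row.names = FALSE)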

Error in excel charts when overwrite data from R

I am trying to automate some of my tests in R to produce a static report in Excel. I have created a template in Excel which has a few charts and tables (sheet 1).
Now I run my R code to generate the data that fills in sheet 2 of the same Excel template file.
I am using the openxlsx package to loadWorkbook() the Excel template; then I overwrite the data in sheet 2 by deleting the sheet and recreating it with the new data, so that the template has data from the new test runs.
This runs without any error. But when I open the Excel file again, the charts disappear with a #REF! error, whereas the tables are overwritten properly in the template (sheet 1).
Has anyone come across such a scenario? The method I am using is a bit odd, but I can't think of any alternative.
Thanks in advance!!
This definitely sounds weird; something seems off, but I'm sorry I can't tell you exactly what the issue is. Anyway, I would say: just use R to generate the data and dump everything into Excel, then run some VBA in Excel to create the charts. I have no idea what your VBA skills are like, but I'm guessing it would be much easier to create charts in Excel using VBA, rather than trying to do all of this from R.
Here are a few resources that you may find useful.
https://www.thespreadsheetguru.com/blog/2015/3/1/the-vba-coding-guide-for-excel-charts-graph
https://analysistabs.com/excel-vba/chart-examples-tutorials/
http://www.sthda.com/english/wiki/r-xlsx-package-a-quick-start-guide-to-manipulate-excel-files-in-r
Finally, you can learn a lot by recording Macros and hitting F8 to step-through the code to see how everything works.
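Before going the VBA route, one hedged thing worth trying: deleting a sheet is typically what turns chart series references into #REF!, so overwriting the cells in place with openxlsx's writeData may keep the charts intact. A minimal sketch (the file name, sheet index, and data are placeholders):
library(openxlsx)
new_data <- data.frame(run = 1:3, result = c(0.1, 0.5, 0.9)) # stand-in for the test output
wb <- loadWorkbook("template.xlsx")
writeData(wb, sheet = 2, new_data) # overwrite cells instead of removing the sheet
saveWorkbook(wb, "template.xlsx", overwrite = TRUE)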

Retrieve path of supplementary data file of developed package

While developing a package I encountered the problem of importing supplementary data; this has been 'kind of' solved here.
Nevertheless, I need to use a function from another package, and that function needs a path to the file. Sadly, using global environment variables is not an option here.
[By the way: the file needs to be .txt, while supplementary data should be .RData. The function is quite picky.]
So I need to know how to get the path to a package's supplementary data file. Is this even possible?
My fallback idea was to read the .RData into the global environment and then save it to a tempfile for further processing, but I would really like to know a clean way; the supplementary data is ~100 MB.
Thank you very much!
Use system.file() to reliably find the path to the installed package and its sub-directories. Typically such files are created at your-pkg-source/inst/extdata/your-file.txt and then referenced as
system.file("extdata", "your-file.txt", package = "your-pkg")
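A short hedged usage sketch (the package name, file name, and the picky function are placeholders):
path <- system.file("extdata", "your-file.txt", package = "your-pkg")
stopifnot(nzchar(path)) # system.file() returns "" when the file is not found
picky_function(path) # hypothetical: any function that insists on a real file path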

R: How to read in a function

I'm currently implementing a tool in R and I got stuck on a problem. I already looked in the forums and didn't find anything.
I have many .csv files which are somehow correlated with each other; the problem is that I don't know yet how (this depends on the input from the user of the tool). Now I would like to read in a csv file that contains an arbitrary function f, e.g. f: a = b + max(c, d), and then the inputs, e.g. a = "Final Sheet", b = "Sheet1", c = "Sheet2", d = "Sheet3". (Maybe I didn't explain it very well; if so, I will upload a picture.)
Now my question is: can I somehow read that csv file in, such that I can later use the function f in the program? (Of course the given function has to be valid R.)
I hope you understand my problem, and I would appreciate any help or ideas!
I would not combine data files with R source; it is much easier to keep them separate. Put your functions in separate script files and source() them as needed, and load your data with read.csv() etc.
"Keep It Simple" :-)
I am sure there's a contorted way of reading in the source code of a function from a text file and then eval()-ing it somehow -- but I am not sure it would be worth the effort.
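For what it's worth, the contorted way is short. A hedged sketch, assuming the csv stores the formula as text in a column called formula (the file name and column name are made up):
spec <- read.csv("formula.csv", stringsAsFactors = FALSE) # e.g. spec$formula[1] is "b + max(c, d)"
f <- function(b, c, d) {
  eval(parse(text = spec$formula[1]), envir = list(b = b, c = c, d = d))
}
f(1, 2, 3) # returns 1 + max(2, 3) = 4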
