I have an Excel workbook with several sheets, two of which contain pivot tables based on data from other sheets ("data sheets"). Using the openxlsx package, I load the workbook into R, remove the data sheets, and then create them again with new data. This works well and the pivots update accordingly.
However, if I apply conditional formatting to the pivots and then run the process above, I get an error message when opening the updated file ("We found a problem with some content in [file]. Do you want us to try to recover as much as we can?" [...]). After the repair, I get the message:
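For reference, the remove-and-recreate process described above can be sketched roughly as follows. This is a minimal sketch, not the asker's actual script: the sheet name "data", the column names, and the temporary file are hypothetical, and the workbook is built in code here (without pivot tables) only so the example runs on its own.

```r
library(openxlsx)

# Hypothetical stand-in for the real workbook: one data sheet, no pivots
path <- tempfile(fileext = ".xlsx")
wb <- createWorkbook()
addWorksheet(wb, "data")
writeData(wb, "data", data.frame(id = 1:3, value = c(10, 20, 30)))
saveWorkbook(wb, path, overwrite = TRUE)

# The process from the question: load, remove the data sheet, recreate it
wb <- loadWorkbook(path)
removeWorksheet(wb, "data")
addWorksheet(wb, "data")
new_data <- data.frame(id = 1:2, value = c(5, 7))
writeData(wb, "data", new_data)
saveWorkbook(wb, path, overwrite = TRUE)

# Read the sheet back to confirm the new data was written
result <- read.xlsx(path, sheet = "data")
```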
Repaired Records: Conditional formatting from /xl/pivotTables/pivotTable1.xml part (PivotTable view)
The conditional formatting has been removed after the repair. I found the following page, which might be relevant, but I haven't been able to get its suggested solutions to work:
https://github.com/awalker89/openxlsx/issues/387
I have also tried to construct a minimal Excel file reproducing this behavior, but while the minimal file yields the "We found a problem..." error, it does keep the conditional formatting once repaired.
Any ideas? Thanks in advance!
Related
I recently helped write some code for my lab which takes our processed data and builds a merged data frame from it. To keep the lab up to date, we maintain our data tables on a secure wiki, so I need an HTML file that lets me upload the data frame onto the wiki easily. It has worked before: all I did was copy what was already written and working, and edit it for a different time point in our data collection. I get no errors and the data looks how I want it to look. As far as I know the script is written logically and works, except for one issue: R creates a file for the HTML, but there is no HTML written in the text document.
I have HTML files from the other data time points that were produced by scripts written exactly like this one, so I don't think it is a script construction issue.
Any ideas as to why this could be happening? I just need to know where to triage.
The package used for HTML is R2HTML, included in my packages list up at the top of the script. For HTML(, file=paste()), you will need to use your own directory to see if the HTML is written as a text file.
If I am not wrong, you are trying to get the data frame in HTML format.
In this case you can use the xtable package in R.
Just add the code below at the bottom of the script:
## install the xtable package before importing it
library("xtable")
print(xtable(ChildSRPtotsFU_wiki), type="html", file="check_stack_overflow.html")
I am trying to automate some of my tests in R to produce a static report in Excel. I have created a template in Excel which has a few charts and tables (Sheet 1).
Now I run my R code to generate the data that fills Sheet 2 of the same Excel template file.
I am using the openxlsx package to load the template with loadWorkbook(). Next I overwrite the data in Sheet 2 by deleting the sheet and recreating it with the new data, so that the Excel template has data for the new test runs.
This runs without any error. But when I open the Excel file again, the charts have disappeared with a #REF! error, whereas the tables in the template (Sheet 1) are overwritten properly.
Has anyone come across such a scenario? The method I am using is a bit weird but can't think of any other alternative.
Thanks in advance!!
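One possible alternative to deleting and recreating the sheet is to clear the old cells and write the new data into the existing sheet, so the sheet that the charts refer to is never removed. The thread does not confirm that this avoids the #REF! error, so treat it as a sketch to try. The sheet names, column names, and cell ranges below are hypothetical, and the template is built in code (without charts) only to keep the example self-contained.

```r
library(openxlsx)

# Hypothetical stand-in for the template: Sheet1 (report), Sheet2 (data)
template <- tempfile(fileext = ".xlsx")
wb <- createWorkbook()
addWorksheet(wb, "Sheet1")
addWorksheet(wb, "Sheet2")
writeData(wb, "Sheet2", data.frame(run = 1:5, score = c(3, 4, 5, 6, 7)))
saveWorkbook(wb, template, overwrite = TRUE)

# Update Sheet2 in place instead of removing and re-adding it
wb <- loadWorkbook(template)
new_data <- data.frame(run = 1:3, score = c(9, 8, 7))

# Clear the old block (header row plus 5 data rows, 2 columns), then write
deleteData(wb, sheet = "Sheet2", cols = 1:2, rows = 1:6, gridExpand = TRUE)
writeData(wb, sheet = "Sheet2", x = new_data)
saveWorkbook(wb, template, overwrite = TRUE)

result <- read.xlsx(template, sheet = "Sheet2")
```

The cleared range has to cover the old data completely. Otherwise leftover rows from a longer previous run would remain below the new data.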
This definitely sounds weird. Something seems off, but I'm sorry I can't tell you what the issue may be. Anyway, I would say, just use R to generate the data and dump everything into Excel. Then, run some VBA in Excel to create the charts. I have no idea what your VBA skills are like, but I'm guessing it would be much easier to create charts in Excel using VBA, rather than trying to do all of this with R.
Here are a few resources that you may find useful.
https://www.thespreadsheetguru.com/blog/2015/3/1/the-vba-coding-guide-for-excel-charts-graph
https://analysistabs.com/excel-vba/chart-examples-tutorials/
http://www.sthda.com/english/wiki/r-xlsx-package-a-quick-start-guide-to-manipulate-excel-files-in-r
Finally, you can learn a lot by recording macros and hitting F8 to step through the code to see how everything works.
I am using a Shiny interface under R to read in a CSV file and load it into one sheet of an Excel xlsm file. The file then allows user input and performs calculations based on VBA macros.
The R xlsx package is working well for preserving the VBA and formatting in the original Excel sheet. However, some of the data is being converted to a different data type than intended. For example, a cell containing the string "F" causes the column containing it to be converted to type boolean, and a mis-entered number in one cell causes the entire column to be converted to string.
Can this behavior be controlled so that, for example, cells with valid numbers are not converted to string type? Is there a work-around? Or can someone just help me to understand what is happening in the guts of the package to cause this effect so I can try to find a way around it?
Here are the calls in question:
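# Requires the xlsx package (loadWorkbook, getSheets, addDataFrame, saveWorkbook)
library(xlsx)
# excelType() points to an Excel xlsm template
data = read.csv("results.csv")
excelForm = loadWorkbook(excelType())
sheets = getSheets(excelForm)
# Write the data into the first sheet starting at row 2, without row or column names
addDataFrame(data, sheets[[1]], col.names = FALSE, row.names = FALSE, startRow=2, colStyle = NULL)
saveWorkbook(excelForm, "results.xlsm")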
Thanks!
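The conversion described in this question can be reproduced with base R alone, which suggests it happens when read.csv infers column types rather than inside the xlsx package. The sketch below uses made-up inline data (the column names id, flag, and amount are hypothetical) and shows the colClasses argument as one way to switch the inference off. It only demonstrates the read.csv side, and the follow-up below reports that forcing column types did not resolve the original case.

```r
# Made-up CSV content: "flag" holds only "F", "amount" has a typo ("2O")
csv_text <- "id,flag,amount\n1,F,10\n2,F,2O\n"

# Default behaviour: types are guessed per column
guessed <- read.csv(text = csv_text)
class(guessed$flag)          # "F" is read as the logical value FALSE
is.numeric(guessed$amount)   # one bad cell makes the whole column non-numeric

# Read every column as text so nothing is reinterpreted
as_text <- read.csv(text = csv_text, colClasses = "character")
class(as_text$flag)          # stays character, values remain "F"

# Convert back only the columns known to be numeric
as_text$id <- as.numeric(as_text$id)
```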
I hope this is the correct protocol for explaining the outcome which worked for me. I hope it will be of help to others if they end up doing something similar, though the solution is not very elegant!
I tried r2evans's suggestion of forcing column types, but I could not get that to work in this case. Using readxls gave the same problem, and also broke my VBA. Given lebelionz's comment suggesting that this is an R thing and not a package thing, I followed his advice to deal with it after the fact. (I do not see how to credit a comment rather than an answer, but for the record this was very helpful, as were the others.)
I therefore altered the program producing the CSV that was being loaded through R. I appended "::" to each cell produced, so that R saw all cells as strings, regardless of the original content. Thus "F" was stored as "::F", and therefore was not altered by R.
I added an autorun macro to the excel sheet thus created, so that when opened it automatically performed a global search and replace to remove the prefix "::" from the whole of the data. This forces Excel to choose a data type for each cell after it was restored, resulting in the types being detected cell by cell and in the correct format for my purposes.
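The prefix trick can be illustrated in R as follows. This is only a sketch of the idea: in the workaround described above, the prefix was added by the program that produces the CSV and removed by a VBA macro in Excel, not by R. The helper names add_prefix and strip_prefix are hypothetical.

```r
# Hypothetical helpers illustrating the "::" prefix round trip
add_prefix <- function(x, prefix = "::") paste0(prefix, x)
strip_prefix <- function(x, prefix = "::") sub(paste0("^", prefix), "", x)

original <- c("F", "10", "T", "abc")
prefixed <- add_prefix(original)

# With the prefix in place, read.csv sees plain strings in every cell
csv_text <- paste(c("value", prefixed), collapse = "\n")
loaded <- read.csv(text = csv_text, colClasses = "character")

# Removing only a leading prefix recovers the original cell contents
restored <- strip_prefix(loaded$value)
```

Anchoring the pattern at the start of the string, as strip_prefix does, leaves any "::" that occurs inside user data untouched. A global search and replace would remove those too.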
It feels kludgy, but it works and is relatively transparent to the user. One hazard is that if the user data intentionally contained the string "::" it would be lost (I am confident this cannot arise in my particular application, but if someone would like to suggest a better prefix I would be interested). I still hope for an eventual solution rather than a work-around.
And here I thought it was only the movie industry that had to "fix it in post"!
I have 2 tables, Table_1 and Table_2, on one sheet (sheet_5) in my excel workbook myWorkbook.xlsx. I know there are plenty of packages that allow you to specify which sheet to load in, but is there a way to load in only the table(s) you want? In this case, I want to load Table_2 only.
Thanks
The most recent version of readxl has the ability to set the range. IMHO this is BY FAR the best excel read in package for R.
See the part of this post entitled "Specifying the data rectangle": https://blog.rstudio.org/2017/04/19/readxl-1-0-0/
The syntax should be very familiar to an Excel user.
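As a minimal sketch of the range argument, the example below reads a small rectangle from the sample workbook that ships with readxl, so it runs without the asker's file. For the workbook in the question, the call would instead use something like sheet = "sheet_5" and the cell range that Table_2 occupies. That range is not given in the question, so it is left out here.

```r
library(readxl)

# Sample workbook bundled with readxl (its first sheet holds the iris data)
path <- readxl_example("datasets.xlsx")

# Read only the rectangle C1:E4: one header row plus three data rows
rect <- read_excel(path, range = "C1:E4")
dim(rect)
names(rect)
```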
Also please see this post for asking questions on SO:
How to make a great R reproducible example?
My question:
Can I change the parameters in R to use the source editor to also view >5MB data sets in R?
If not, what is your advice?
Background:
I recently stopped looking at data in Excel and switched to R entirely. As I did in Excel and still prefer to do in R, I like to look at the entire frame and then decide on filters.
Problem: Working with the World Development Indicators (WDI) data set, which is over 100MB, opening it in the source editor does not work. View(df) opens an empty tab in RStudio.
R threw another error when I selected the data set from the Files tab in the pane on the right of RStudio, which read:
The selected file 'wdi.csv' is too large to open in the source editor (the file is 104.5 MB and the maximum file size is 5MB).
Solutions?
My alter ego would tell me to increase the threshold of datasets' file size for the source editor, so I could investigate it there. In brief: change 5 to 200 MB. My alter ego would also tell me that I would probably encounter performance issues (since I am using a MacBook Air).
How I resolved the issue:
I used head() and dplyr's glimpse() to get a better idea, but ended up looking at the WDI matrix in Excel and then filtered it in R. Newly created data frames could be opened in the source editor without any problems.
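A base R sketch of that inspect-then-filter approach is shown below. It uses a small generated CSV in place of wdi.csv so it runs on its own, and the column names country, year, and value are hypothetical.

```r
# Generated stand-in for a large CSV such as wdi.csv
path <- tempfile(fileext = ".csv")
big <- data.frame(
  country = rep(c("AAA", "BBB"), each = 500),
  year    = rep(1:500, times = 2),
  value   = seq_len(1000)
)
write.csv(big, path, row.names = FALSE)

# Peek at the first rows only, without loading the whole file
preview <- read.csv(path, nrows = 20)
str(preview)

# Load the data, then filter down to the part worth viewing
full <- read.csv(path)
subset_df <- full[full$country == "BBB" & full$year <= 10, ]

# In an interactive RStudio session the smaller frame can then be opened:
# View(subset_df)
```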
Thanks in advance!