IMPORTFROMWEB(), splitResult=TRUE not working for this table? - web-scraping

=IMPORTFROMWEB is a custom function for Google Sheets developed by
https://nodatanobusiness.com/importfromweb/documentation/
It helps me load data from dynamic HTML into a spreadsheet. (Any other solution for Excel or Google Sheets would also work.)
Here I'm loading a table.
https://docs.google.com/spreadsheets/d/1Dh7KZ91FeqzvTh2BwzB6WOmfTUTN4FoGIf_c4ZrfdHk/edit?usp=sharing
The formulas are in A6 and A15.
But the function returns everything in one cell instead of separating the values. splitResult doesn't seem to be working.

With IMPORTFROMWEB, you will need to indicate the path for each column of the table
Just follow this example:

Have you tried using a different source than Morningstar? Yahoo is pretty scrapable with the normal IMPORTHTML(), that is, without a custom function.
I put this formula on a new tab in your sheet and it seems to work OK:
=IMPORTHTML("https://finance.yahoo.com/quote/MCSMX/holdings?p=MCSMX","table",1)
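
For readers who want to see what a call like IMPORTHTML(url, "table", 1) does conceptually, here is a minimal Python sketch using only the standard library. It parses an HTML string and returns the rows of the Nth table (1-based, like the index in the formula). The names TableExtractor and nth_table and the sample HTML are made up for illustration; this is not how Google Sheets implements IMPORTHTML, it does not handle nested tables, and like IMPORTHTML it only sees tables present in the static HTML, not ones rendered later by JavaScript.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by table and row."""

    def __init__(self):
        super().__init__()
        self.tables = []   # list of tables; each table is a list of rows
        self._row = None   # row currently being filled, or None
        self._cell = None  # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def nth_table(html, index):
    """Return the rows of the table at 1-based position `index`."""
    parser = TableExtractor()
    parser.feed(html)
    return parser.tables[index - 1]


# Made-up sample page, standing in for the fetched HTML.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Weight</th></tr>
  <tr><td>Holding A</td><td>4.2%</td></tr>
  <tr><td>Holding B</td><td>3.9%</td></tr>
</table>
"""

for row in nth_table(SAMPLE_HTML, 1):
    print(row)
```

Each cell ends up as its own list entry, which is the "one value per cell" behaviour the question is after.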

Related

How to color some numbers in an Excel cell using Python or R

I have this example of Excel file where the data contain some random values. I generated this using RAND() function.
What I want to do is read this Excel file using R so that I can make the number 9 red and bold wherever it appears in a cell. Is this possible?
I've been searching on Google for a while but haven't found any way to do it other than VBA, which is not an option.
Does anybody have an example of how to achieve this?
What I wanted to do is not possible with any of the Python packages I tried: xlsxwriter can write rich text the way I wanted, but only to new cells, and it cannot modify an existing file; openpyxl can do a lot of things, but not rich text. I wasn't sure whether it could be done in R, but it seems that isn't possible either. I saw a Google Group discussion here that showed a potential method for what I wanted, but it didn't work for me: it reported that .jnew is not recognized.
So, instead, I created a function that adds a colored dot (an image) to the cell to mark that the cell contains the value I'm searching for, 9 in this case. I can't use conditional formatting because another conditional format is already applied for different logic.
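def __add_color_dots__(self, ws=None, excel_filename=None):
    # Imports kept local, as in the original snippet; Image needs Pillow installed.
    from openpyxl.drawing.image import Image
    import os

    # Resolve the dot image relative to the current working directory.
    path = os.path.abspath('blue-dot.png')
    image = Image(path)
    # Anchor the image's top-left corner at cell C4 (hard-coded here).
    image.anchor = 'C4'
    ws.add_image(image)
    return ws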
Hope this helps someone later and that this method may be a useful workaround.
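
A slightly more general sketch of the same workaround, in case it is useful: instead of hard-coding the anchor 'C4', scan the values for the target digit and anchor one dot per matching cell. The helper names (column_letter, find_anchors, add_color_dots) and the sample values are hypothetical; only Image, image.anchor and ws.add_image are taken from the function above. openpyxl is imported lazily so that the lookup part runs without it.

```python
def column_letter(index):
    """Convert a 1-based column index to Excel letters (1 -> A, 27 -> AA)."""
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def find_anchors(rows, target="9"):
    """Return A1-style references of cells whose text contains `target`."""
    anchors = []
    for r, row in enumerate(rows, start=1):
        for c, value in enumerate(row, start=1):
            if value is not None and target in str(value):
                anchors.append(f"{column_letter(c)}{r}")
    return anchors


def add_color_dots(ws, anchors, image_path="blue-dot.png"):
    """Anchor one copy of the dot image on each listed cell of worksheet `ws`."""
    from openpyxl.drawing.image import Image  # requires openpyxl and Pillow

    for anchor in anchors:
        image = Image(image_path)  # a fresh Image object per cell
        image.anchor = anchor
        ws.add_image(image)
    return ws


# Made-up values standing in for the RAND() output in the workbook.
sample_rows = [
    [0.12, 0.95, 0.33],
    [0.48, 0.07, 0.29],
]
print(find_anchors(sample_rows))  # ['B1', 'C2']
```

With a real workbook, the rows could come from ws.iter_rows(values_only=True); the references are only correct if the data starts at cell A1.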

Is there a way to make some formatting in Google Sheet within R

I'm currently working on a project that aims to generate formatted reports from R in Google Sheets.
I'm using the googlesheets4 package and can write data into a Google Sheet from R. But is there a way to apply formatting as well, such as bold, italic, a $ sign, or conditional formatting?
Here is the example spreadsheet I made.
https://docs.google.com/spreadsheets/d/1vp-w5muArvMxHKx4NL-39NMkAHRJCtldXLkUmJqKK2E/edit?usp=sharing
The output I want looks like sheet2: a $ sign in the spend column, integers in the kpi columns, and conditional formatting on the upsell columns.
I ran into the same issue. I don't believe googlesheets4 has this type of functionality (yet at least.)
The way I (quite inefficiently) worked around this was by creating two sheets, taking advantage of the fact that it is possible to format cells that have native/google sheets functions.
The first spreadsheet contained the raw data imported from the R script, which could not be formatted. The second sheet was formatted to my liking, then collected the first sheet's data through functions.
Hope this helps! I would have added this as a comment but I do not yet have the reputation to do so.

Why won't the HTML function in R actually write the HTML?

I recently helped write some code for my lab that takes our processed data and builds a merged data frame from it. To keep the lab up to date, we maintain our data tables on a secure wiki, so I need an HTML file that lets me upload the data frame to the wiki easily. This has worked before: all I did was copy what was already written and working and edit it for a different time point in our data collection. I get no errors, and the data looks the way I want it to. As far as I can tell the script is logically sound and works, except for one issue: R creates the file for the HTML, but the file contains no HTML.
I have HTML files from the other data time points that were written in exactly the same way as this one, so I don't think it is a script construction problem.
Any ideas as to why this could be happening? I just need to know where to triage.
The package used for HTML is R2HTML, included in my packages list at the top of the script. For HTML(, file=paste()), you will need to use your own directory to see whether the HTML is written to the file.
If I am not wrong, you are trying to get the data frame in HTML format.
In that case you can use the xtable package in R.
Just add the code below at the bottom of the script:
## install the xtable package before importing it
library("xtable")
print(xtable(ChildSRPtotsFU_wiki), type="html", file="check_stack_overflow.html")

RStudio: Save data from Viewer

Due to a stupid mistake and a defective USB stick I lost a bunch of data and I am now trying to recover it.
Some of the data is still displayed in the Viewer tabs when I open RStudio. However, I can only save R scripts and R Markdown files out of the Viewer. The displayed data frames are nice and complete, and I can sort and filter them in the Viewer, but I cannot find a "save" option. Is there a way to save this displayed data as Rdata, csv, or something similar?
I would suggest three different approaches, but none of them will necessarily work. I sort them according to my prior expectations of success.
1) You can copy your whole data frame from the Viewer and paste it into external spreadsheet software to obtain a .csv file, e.g. through the "convert text to columns" button in MS Excel.
2) You can copy and paste the character string into an object that is passed to the text option of read.table or to dput(). Check out the "Copy your data" section of this famous SO question
3) Finally, you can use Google Chrome's "Inspect Element" function to inspect the HTML code of the object in the Viewer. Once you find the table, you can copy and paste it and scrape it with an HTML parser, e.g. using the rvest package. Good luck!
Thanks everybody, there is a way to access the data as Rdata files, which was kindly explained to me here
I used the second method and located the files in %localappdata%\RStudio-Desktop\viewer-cache.

When using Excel's "Print Titles", how do I change the titles midway down the sheet?

I have a classic ASP web app that outputs reports to Excel, but it's really just html.
Some reports output multiple groups, and each group can span multiple pages (vertically). I'm aware of Excel's "Print Titles" ability to print a specified row (or rows) on every page; however, I need the title of each group to also display in the title. Otherwise the title of the first group gets displayed as the title of every group.
I saw on Google Groups that someone suggested putting each group on a separate worksheet; however, I don't think I can output multiple worksheets easily, or at all, using HTML alone.
I'm looking for a quick and dirty solution as I don't have much time to devote to maintaining this crufty old app.
This is a bit late as answers go but I think I have found a solution. What you can do is open Excel, manually mock up what you want, then save it as a webpage. Open the generated file(s) up in a simple text editor and examine the generated HTML/XML. I did this for a workbook with multiple worksheets and it appears to work.
You can do the same with the multiple groups since that seems like the solution you really want, the process is the same. But the multiple worksheets option will work as well. Here are the interesting bits of what Excel generated for me (from Book.htm, not the Sheet files) when I saved a simple 2 sheet workbook with 'abc' on the first page and 'def' on the second:
<script language="JavaScript">
var c_lTabs=2;
var c_rgszSh=new Array(c_lTabs);
c_rgszSh[0] = "Sheet1";
c_rgszSh[1] = "Sheet2";
------
<xml>
<x:ExcelWorkbook>
<x:ExcelWorksheets>
<x:ExcelWorksheet>
<x:Name>Sheet1</x:Name>
<x:WorksheetSource HRef="Book1_files/sheet001.htm"/>
</x:ExcelWorksheet>
<x:ExcelWorksheet>
<x:Name>Sheet2</x:Name>
<x:WorksheetSource HRef="Book1_files/sheet002.htm"/>
</x:ExcelWorksheet>
</x:ExcelWorksheets>
<x:Stylesheet HRef="Book1_files/stylesheet.css"/>
<x:WindowHeight>13065</x:WindowHeight>
<x:WindowWidth>15315</x:WindowWidth>
<x:WindowTopX>360</x:WindowTopX>
<x:WindowTopY>75</x:WindowTopY>
<x:ProtectStructure>False</x:ProtectStructure>
<x:ProtectWindows>False</x:ProtectWindows>
</x:ExcelWorkbook>
</xml><![endif]-->
</head>
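
If each report group is written to its own worksheet, the per-sheet entries in the excerpt above follow a regular pattern and could be generated rather than typed by hand. The sketch below is in Python purely for illustration (the app in question is classic ASP); worksheet_entries and the group names are hypothetical, and the Book1_files/sheetNNN.htm naming simply mirrors what Excel produced in the excerpt. How Excel treats unusual characters in sheet names has not been checked here, so names are only XML-escaped.

```python
from xml.sax.saxutils import escape


def worksheet_entries(sheet_names, files_dir="Book1_files"):
    """Build the <x:ExcelWorksheets> fragment for the given sheet names."""
    lines = ["<x:ExcelWorksheets>"]
    for number, name in enumerate(sheet_names, start=1):
        lines.append("<x:ExcelWorksheet>")
        lines.append(f"<x:Name>{escape(name)}</x:Name>")
        # File naming (sheet001.htm, sheet002.htm, ...) copied from the excerpt.
        lines.append(
            f'<x:WorksheetSource HRef="{files_dir}/sheet{number:03d}.htm"/>'
        )
        lines.append("</x:ExcelWorksheet>")
    lines.append("</x:ExcelWorksheets>")
    return "\n".join(lines)


# Hypothetical group names; one worksheet per report group.
print(worksheet_entries(["Group A", "Group B"]))
```

The matching sheetNNN.htm files and the c_rgszSh array in the script block would still need one entry per group as well.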
