Can anybody help me with URL manipulation in R?
I need to fetch multiple organization-related CSV files from a URL.
Only one parameter in the URL changes to select a particular organization's CSV table.
So I am writing a function to do that.
Example:
data <- function(string){ table <- read.csv(url("*://.....=string.....")) }
Can you guide me on this?
Is this what you are looking for?
Data <- function(string){
  table <- read.csv(url(paste("https://.....=", string, "...", sep = "")))
  # ...
}
# You can then change the parameter with each function call
# (also dynamically, using apply or a loop):
Data(string = "bank")
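As a sketch of the dynamic case mentioned in the comment above (the base URL and query parameter here are hypothetical placeholders):
# Hypothetical URL pattern; substitute the real query string
orgs <- c("bank", "insurance", "retail")
tables <- lapply(orgs, function(org) {
  read.csv(paste0("https://example.com/data?org=", org))
})
names(tables) <- orgs  # one data frame per organization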
I am interested in writing a function that takes a data frame as input and returns an HTML page, via knitr, as output based on the information in the data frame.
Here is roughly the pseudocode of the function that I want to write:
htmlOutput <- function(Df) {
  newDf <- someManipulation(Df)
  meltedDf <- melt(newDf)
  g <- ggplot(meltedDf)
  return(html(g)) # This is the part that I am not sure about
}
Is there a way to output an HTML page as a function output via knitr?
After some research, I found that rendering an R Markdown file from within the function is the best option.
htmlOutput <- function(Df, meta = NULL, cacheable = NA) {
  rmarkdown::render("./report.rmd", output_file = "report.html")
}
where report.rmd contains the manipulation of the data frame:
newDf <- someManipulation(Df)
meltedDf <- melt(newDf)
g <- ggplot(meltedDf)
g
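To make the data frame available inside the report, one option is a parameterized report (a sketch; the parameter name Df is an assumption, and melt/ggplot are assumed to come from reshape2/ggplot2):
# In the calling function:
rmarkdown::render("./report.rmd",
                  params = list(Df = Df),
                  output_file = "report.html")
# In report.rmd, declare the parameter in the YAML header:
# ---
# output: html_document
# params:
#   Df: NULL
# ---
# and refer to the data frame as params$Df inside the code chunks.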
I guess you took the hard way.
An easier approach would be to use htmlTable.
I have used it to export to HTML, and it is easy to use.
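A minimal sketch of that approach, assuming the goal is an HTML table of the manipulated data frame rather than a rendered plot:
library(htmlTable)
# htmlTable() turns a data frame or matrix into an HTML table
htmlTable(head(mtcars))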
I want to scrape the reviews of a room from an Airbnb web page, for example: https://www.airbnb.com/rooms/8400275
This is my code for the task; I used the rvest package and SelectorGadget:
library(rvest)
x <- read_html("https://www.airbnb.com/rooms/8400275")
x_1 <- x %>% html_node("#reviews p") %>% html_text() %>% as.character()
Can you help me fix this? Is it possible to do with the rvest package? (I am not familiar with xpathSApply.)
I assume that you want to extract the comment text itself. Looking at the HTML file, that is not an easy task, since you have to extract it from within a script node. So what I tried was this:
1. Read the HTML. Here I use a connection and readLines() to read it as a character vector.
2. Select the line that contains the review information.
3. Use str_extract_all() to extract the comments.
For the first two steps, we could also use the rvest or XML package to select the appropriate node.
library(stringr)

url <- "https://www.airbnb.com/rooms/8400275"

# 1. Read the raw HTML, one character string per line
con <- file(url)
raw <- readLines(con)
close(con)

# 2. Keep the line(s) containing the embedded review data
comment.regex <- "\"comments\":\".*?\""
comment.line <- raw[grepl(comment.regex, raw)]

# 3. Extract every "comments":"..." match
comment <- str_extract_all(comment.line, comment.regex)
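The matches still carry the surrounding JSON syntax; a small follow-up (a sketch, not part of the original answer) strips the "comments":" prefix and the trailing quote:
# Remove the JSON key and the enclosing quotes from each match
comments <- gsub("^\"comments\":\"|\"$", "", comment[[1]])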
I'm still a rookie in the R world, in a very accelerated class with limited to no guidance. My assignment is to build a custom function that reads in a specific .csv file and takes some specific columns out to be analyzed. Could anyone please offer some advice? The sample code I was given looks like this:
AnnualLekSurvey = function(data.in, stat.year) {
  d1 = subset(data.in, year == stat.year)
  d2 = d1[c("year", "complex", "tot_male")]
  attach(d2)
}
So when it's complete and I run it, I should be able to say:
AnnualLekSurvey(gsg_lek,2006)
where "gsg_lek" is the name of the file I want to import, and 2006 is the values from the "year" column that I want to subset. "complex" and "tot_male" will be the variable to be analyzed by "year", but I'm not worried about that code right now.
What I'm confused about is; how do I tell R that gsg_lek is a .csv file, and tell it to look in the proper directory for it when I run the custom function?
I saw one other vaguely similar example on here, and they had to use the if() and paste() commands to build the string of the file name - that seems like too much arbitrary work, unless I'm just being lazy...
Any help would be appreciated.
You can make a function like this:
AnnualLekSurvey <- function(csvFile, stat.year) {
  d1 <- read.csv(paste("C:/", csvFile, ".csv", sep = ""), header = TRUE, sep = ",")
  d2 <- subset(d1, year == stat.year)
  d2 <- d2[, c("year", "complex", "tot_male")]
  return(d2)
}
The 'csvFile' argument is the base name of your csv file. In this particular example, the file has to be in your C:/ folder; if it is in some other folder, change the "C:/" in the function to the folder where your csv file is located.
Running the function:
data <- AnnualLekSurvey("gsg_lek", 2006)
Note that the file name has to be within quotes; the year can be passed as a number. 'data' will now contain the year, complex and tot_male columns of gsg_lek.csv for the year 2006.
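A slightly more flexible variant (a sketch; the dir default is an assumption) passes the folder as an argument and builds the path with file.path(), so nothing is hard-coded:
AnnualLekSurvey <- function(csvFile, stat.year, dir = "C:/") {
  d1 <- read.csv(file.path(dir, paste0(csvFile, ".csv")))
  d2 <- subset(d1, year == stat.year)
  d2[, c("year", "complex", "tot_male")]
}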
I need to simulate some data, and I would like to have a function with everything built in, so I just need to run
simulate(scenario="xxx")
This function stores all the simulated datasets for the designated scenario in a list called simdat. Within the function, I want to rename that list "simdat.xxx" and save it out as "simdat_xxx.RData", so later on I can just load this file and have access to the list simdat.xxx. I need the list to have a name that refers specifically to which batch it is, because I am dealing with a lot of batches and I may want to load several at the same time.
Is there a way to, within a function, make a name and use it to name an object? I searched over and over again and could not find a way to do this. In desperation, I am resorting to doing this: within the function,
(a) write a temporary script using paste, which looks like this
temp.fn <- function(simdat){
  simdat.xxx <- simdat
  save(simdat.xxx, file = "simdat_xxx.RData")
}
(b) use writeLines to write it out to a .R file
(c) source the file
(d) run it
This seriously seems like overkill to me. Is there a better way to do this?
Thanks much for your help!
Trang
Try this:
simulate <- function(scenario = "xxx"){
  # Example simulation: four datasets of ten random normals
  simdat <- replicate(4, rnorm(10), simplify = FALSE)
  # Build the scenario-specific name, e.g. "simdat.xxx"
  data_name <- paste("simdat", scenario, sep = ".")
  # Bind the list to that name, then save the object under that name
  assign(data_name, simdat)
  save(list = data_name, file = paste0("simdat_", scenario, ".RData"))
}
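Usage (the scenario name here is hypothetical): after running the function for a batch, loading the file restores the scenario-specific list under its own name:
simulate(scenario = "batchA")  # writes simdat_batchA.RData
load("simdat_batchA.RData")    # restores the list simdat.batchA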
Using R, I just want to read the contents of a file into a variable, like:
query <- read_file_contents('biglongquery.sql')
so as to avoid putting big long queries in the R script itself. I do not want to parse the file as data like CSV (e.g. read.table); I just want the raw text.
scan() does the job, but the function for this purpose is actually readLines():
query <- readLines("biglongquery.sql")
This gives you a vector of lines. To combine them into a single variable, you can use paste(), e.g.
one.variable <- paste(query, collapse = "\n")
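Wrapped up as the helper the question sketched (read_file_contents is the asker's hypothetical name, not a built-in):
# Hypothetical helper; readLines() + paste() do the actual work
read_file_contents <- function(path) {
  paste(readLines(path), collapse = "\n")
}
query <- read_file_contents("biglongquery.sql")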
Alternatively, as a one-liner with scan():
x <- paste(scan("foo.sql", what = "", sep = "\n", blank.lines.skip = FALSE), collapse = "\n")
Another way is to create a .R script with query definition
# content of biglongquery.R
query <- "
SELECT
very_long_list_of_fields
FROM ...
"
and then load it in the main script with
source("biglongquery.R")