googleway: "The radar argument is now deprecated" - R

I'm attempting to use the googleway package in R to list retirement villages within a specified radius of a location. I get "The radar argument is now deprecated" as a message and NULL results as a consequence.
```
library(googleway)
a <- google_places(location = c(-36.796578, 174.768836),
                   search_string = "Retirement Village",
                   radius = 10000,
                   key = "key")
a$results$name
```

I would expect this to give me retirement villages within a 10 km radius; instead I get the error message and NULL:

```
> library(googleway)
> a <- google_places(location = c(-36.796578, 174.768836), search_string = "Retirement Village", radius = 10000, key = "key")
The radar argument is now deprecated
> a$results$name
NULL
```

There's nothing wrong with the code you've written. That "message" you get is not an error, it's just a message, but it probably should be removed; I've made an issue to remove it here.
a <- google_places(
location = c(-36.796578,174.768836)
, search_string = "Retirement Village"
, radius = 10000
, key = "key"
)
place_name( a )
# [1] "Fairview Lifestyle Village" "Eastcliffe Retirement Village"
# [3] "Meadowbank Retirement Village" "The Poynton - Metlifecare Retirement Village"
# [5] "The Orchards - Metlifecare Retirement Village" "Bupa Hugh Green Care Home & Retirement Village"
# [7] "Bert Sutcliffe Retirement Village" "Grace Joel Retirement Village"
# [9] "Bupa Remuera Retirement Village and Care Home" "7 Saint Vincent - Metlifecare Retirement Village"
# [11] "Remuera Gardens Retirement Village" "William Sanders Retirement Village"
# [13] "Puriri Park Retirement Village" "Selwyn Village"
# [15] "Aria Bay Retirement Village" "Highgrove Village & Patrick Ferry House"
# [17] "Settlers Albany Retirement Village" "Knightsbridge Village"
# [19] "Remuera Rise" "Northbridge Residential Village"
Are you sure the API key you're using is enabled on the Places API?
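As an aside, since the console output shows it as a plain message rather than a warning or an error, you should be able to silence it in the meantime with base R's suppressMessages(); a minimal sketch, assuming the notice is emitted via message():

```
## assuming the deprecation notice comes from message(),
## suppressMessages() hides it without affecting the result
a <- suppressMessages(
  google_places(
    location = c(-36.796578, 174.768836)
    , search_string = "Retirement Village"
    , radius = 10000
    , key = "key"
  )
)
```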


Scraping keywords on PHP page

I would like to scrape the keywords inside the drop-down table of this webpage: https://www.aeaweb.org/jel/guide/jel.php
The problem is that the drop-down menu of each item prevents me from scraping the table directly, because it only picks up the headings and not the inner content of each item.
```
rvest::read_html("https://www.aeaweb.org/jel/guide/jel.php") %>%
  rvest::html_table()
```
I thought of scraping each line that starts with Keywords:, but I don't see how to do that. It seems the HTML does not expose the items inside the table.
An RSelenium solution:
```
# start the server
library(RSelenium)
driver <- rsDriver(browser = "firefox")
remDr <- driver[["client"]]

# navigate to the url
remDr$navigate("https://www.aeaweb.org/jel/guide/jel.php")

# xpath of the table
out <- remDr$findElement(using = "xpath", "/html/body/main/div/section/div[4]")

# get the text from the table
out <- out$getElementText()
out <- out[[1]]
```
Split the text on newlines using the stringr package:
```
library(stringr)
str_split(out, "\n", n = Inf, simplify = FALSE)
[[1]]
 [1] "A General Economics and Teaching"
 [2] "B History of Economic Thought, Methodology, and Heterodox Approaches"
 [3] "C Mathematical and Quantitative Methods"
 [4] "D Microeconomics"
 [5] "E Macroeconomics and Monetary Economics"
 [6] "F International Economics"
 [7] "G Financial Economics"
 [8] "H Public Economics"
 [9] "I Health, Education, and Welfare"
[10] "J Labor and Demographic Economics"
[11] "K Law and Economics"
[12] "L Industrial Organization"
[13] "M Business Administration and Business Economics; Marketing; Accounting; Personnel Economics"
[14] "N Economic History"
[15] "O Economic Development, Innovation, Technological Change, and Growth"
[16] "P Economic Systems"
[17] "Q Agricultural and Natural Resource Economics; Environmental and Ecological Economics"
[18] "R Urban, Rural, Regional, Real Estate, and Transportation Economics"
[19] "Y Miscellaneous Categories"
[20] "Z Other Special Topics"
```
To get the Keywords for History of Economic Thought, Methodology, and Heterodox Approaches, click the item open and read its text:

```
out1 <- remDr$findElement(using = "xpath", value = '//*[@id="cl_B"]')
out1$clickElement()
out1 <- remDr$findElement(using = "xpath", value = "/html/body/main/div/section/div[4]/div[2]/div[2]/div/div/div/div[2]")
out1$getElementText()
[[1]]
[1] "Keywords: History of Economic Thought"
```

Rvest html_nodes span once and Xpath

I would like to collect parish.name from masstimes.org.
I used SelectorGadget; converting the CSS selector span to an XPath gives //span.
The data I would like to collect is here: <span once="parish.name" class="">Saint John Chrysostom [Ruthenian]</span>
I'm not sure what the html_nodes command should look like.
JavaScript is required to display the results, so you won't scrape anything with rvest; you should use RSelenium. Alternatively, you can download the JSON that is loaded in the background to fetch the data.
First, you need to obtain the lat and long of the city you're looking for. The website uses the ArcGIS API to get them. For example, for New York the GET URL is:

```
https://geocode.arcgis.com/arcgis/rest/services/World/GeocodeServer/findAddressCandidates?f=json&SingleLine=New%20York,%20NY,%20USA
```

Output:

```
{"spatialReference":{"wkid":4326,"latestWkid":4326},"candidates":[{"address":"New York","location":{"x":-74.007139999999936,"y":40.714550000000031},"score":100,"attributes":{},"extent":{"xmin":-74.257139999999936,"ymin":40.464550000000031,"xmax":-73.757139999999936,"ymax":40.964550000000031}}]}
```
From this output, lat is 40.715 (rounded to 3 digits) and long is -74.007. Use GET() and content() (as text) from the httr package to load the response into R, and str_extract_all() from the stringr package to pull out these two values.
Example: I'm looking for Paris, France. We modify the URL (adding Paris and FRA to it) to get the JSON, then store its content. We extract lat and long, then construct the request URL with paste0():
```
library(httr)
library(stringr)

data <- GET("https://geocode.arcgis.com/arcgis/rest/services/World/GeocodeServer/findAddressCandidates?f=json&SingleLine=Paris,%20FRA")
parse <- content(data, as = "text")
lat <- round(as.numeric(str_extract_all(parse, "\\d+\\.\\d+")[[1]][2]), digits = 3)
long <- round(as.numeric(str_extract_all(parse, "\\d+\\.\\d+")[[1]][1]), digits = 3)
paste0("https://apiv4.updateparishdata.org/Churchs/?lat=", lat, "&long=", long, "&pg=1")
```

Output:

```
https://apiv4.updateparishdata.org/Churchs/?lat=48.857&long=2.341&pg=1
```
You can also look these values up manually with your favorite search engine.
Once you have them, you can construct your request URL, like the following one:

```
https://apiv4.updateparishdata.org/Churchs/?lat=40.091&long=-82.95&pg=1
```

where lat and long are the values you've found and pg is the page number (30 results per page).
Load the JSON into R with jsonlite (a data frame is created) and extract the column of interest:
```
library(jsonlite)
mydata <- fromJSON("https://apiv4.updateparishdata.org/Churchs/?lat=40.091&long=-82.95&pg=1")
mydata$name
```

Output:

```
 [1] "Saint John Chrysostom [Ruthenian]" "Saint Elizabeth"
 [3] "Saint Anthony" "St. Matthias"
 [5] "Saint Paul" "Holy Resurrection [Melkite]"
 [7] "St. Michael" "St. James the Less"
 [9] "Our Lady of Peace" "Immaculate Conception"
[11] "Saints Augustine and Gabriel" "Saint Peter"
[13] "Holy Name" "Saint Timothy"
[15] "Ohio Dominican University" "St. Thomas More Newman Center"
[17] "Church of the Resurrection" "Saint Andrew Roman Catholic Church "
[19] "Saint Matthew" "Saint Joan of Arc Catholic Church"
[21] "St. Thomas the Apostle" "St. Agatha"
[23] "Sacred Heart Church" "Saint Dominic"
[25] "Saint John the Baptist " "Saint Francis of Assisi"
[27] "Saint Patrick" "Holy Spirit"
[29] "Saint Brendan" "Saint Christopher"
```

How to read a .txt file into a dataframe with readr?

I have the following data that I obtained from a .txt file using the read_lines function from readr:

```
txtread <- read_lines("expenses_copy1.txt")
txtread
 [1] "Amount:Category:Date:Description"
 [2] "5.25:supply:20170222:box of staples"
 [3] "79.81:meal:20170222:lunch with ABC Corp. clients Al, Bob, and Cy"
 [4] "43.00:travel:20170222:cab back to office"
 [5] "383.75:travel:20170223:flight to Boston, to visit ABC Corp."
 [6] "55.00:travel:20170223:cab to ABC Corp. in Cambridge, MA"
 [7] "23.25:meal:20170223:dinner at Logan Airport"
 [8] "318.47:supply:20170224:paper, toner, pens, paperclips, tape"
 [9] "142.12:meal:20170226:host dinner with ABC clients, Al, Bob, Cy, Dave, Ellie"
[10] "303.94:util:20170227:Peoples Gas"
[11] "121.07:util:20170227:Verizon Wireless"
[12] "7.59:supply:20170227:Python book (used)"
[13] "79.99:supply:20170227:spare 20\" monitor"
[14] "49.86:supply:20170228:Stoch Cal for Finance II"
[15] "6.53:meal:20170302:Dunkin Donuts, drive to Big Inc. near DC"
[16] "127.23:meal:20170302:dinner, Tavern64"
[17] "33.07:meal:20170303:dinner, Uncle Julio's"
[18] "86.00:travel:20170304:mileage, drive to/from Big Inc., Reston, VA"
[19] "22.00:travel:20170304:tolls"
[20] "378.81:travel:20170304:Hyatt Hotel, Reston VA, for Big Inc. meeting"
```
I want to read each of these into vectors named "Amount", "Category", "Date", and "Description", and create a data frame out of them so that I have a dataset I can work with.
I tried the following:
```
for (i in length(txtread)) {
  data <- read.table(textConnection(txtread[[i]]))
  print(data)
}
```
However, this doesn't seem to work.
How can I read this data into a data frame in R?
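A minimal sketch of one approach, assuming every field is colon-separated and quoting is disabled (the descriptions contain unmatched quote characters such as spare 20" monitor): readr's read_delim() accepts an arbitrary single-character delimiter and treats the first line as the header.

```
library(readr)

## read the colon-delimited file directly into a data frame;
## quote = "" stops the stray " in the descriptions from breaking parsing
expenses <- read_delim("expenses_copy1.txt", delim = ":", quote = "")
expenses
```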

Collapse elements separated by ""

I have raw bibliographic data as follows:
```
bib <- c("Bernal, Martin, \\\"Liu Shi-p\\'ei and National Essence,\\\" in Charlotte",
         "Furth, ed., *The Limit of Change, Essays on Conservative Alternatives in",
         "Republican China*, Cambridge: Harvard University Press, 1976.",
         "", "Chen,Hsi-yuan, \"*Last Chapter Unfinished*: The Making of the *Draft Qing",
         "History* and the Crisis of Traditional Chinese Historiography,\"",
         "*Historiography East & West*2.2 (Sept. 2004): 173-204", "",
         "Dennerline, Jerry, *Qian Mu and the World of Seven Mansions*, New Haven:",
         "Yale University Press, 1988.", "")
bib
 [1] "Bernal, Martin, \\\"Liu Shi-p\\'ei and National Essence,\\\" in Charlotte"
 [2] "Furth, ed., *The Limit of Change, Essays on Conservative Alternatives in"
 [3] "Republican China*, Cambridge: Harvard University Press, 1976."
 [4] ""
 [5] "Chen,Hsi-yuan, \"*Last Chapter Unfinished*: The Making of the *Draft Qing"
 [6] "History* and the Crisis of Traditional Chinese Historiography,\""
 [7] "*Historiography East & West*2.2 (Sept. 2004): 173-204"
 [8] ""
 [9] "Dennerline, Jerry, *Qian Mu and the World of Seven Mansions*, New Haven:"
[10] "Yale University Press, 1988."
[11] ""
```
I would like to collapse the elements between the "" separators onto one line so that:
```
clean_bib[1] = paste(bib[1], bib[2], bib[3])
clean_bib[2] = paste(bib[5], bib[6], bib[7])
clean_bib[3] = paste(bib[9], bib[10])
[1] "Bernal, Martin, \\\"Liu Shi-p\\'ei and National Essence,\\\" in Charlotte Furth, ed., *The Limit of Change, Essays on Conservative Alternatives in Republican China*, Cambridge: Harvard University Press, 1976."
[2] "Chen,Hsi-yuan, \"*Last Chapter Unfinished*: The Making of the *Draft Qing History* and the Crisis of Traditional Chinese Historiography,\" *Historiography East & West*2.2 (Sept. 2004): 173-204"
[3] "Dennerline, Jerry, *Qian Mu and the World of Seven Mansions*, New Haven: Yale University Press, 1988."
```
Is there a one-liner that does this automatically?
You can use tapply, grouping on the cumulative count of "" elements, then paste the groups together:

```
unname(tapply(bib, cumsum(bib == ""), paste, collapse = " "))
[1] "Bernal, Martin, \\\"Liu Shi-p\\'ei and National Essence,\\\" in Charlotte Furth, ed., *The Limit of Change, Essays on Conservative Alternatives in Republican China*, Cambridge: Harvard University Press, 1976."
[2] " Chen,Hsi-yuan, \"*Last Chapter Unfinished*: The Making of the *Draft Qing History* and the Crisis of Traditional Chinese Historiography,\" *Historiography East & West*2.2 (Sept. 2004): 173-204"
[3] " Dennerline, Jerry, *Qian Mu and the World of Seven Mansions*, New Haven: Yale University Press, 1988."
[4] ""
```
You can also do:

```
unname(c(by(bib, cumsum(bib == ""), paste, collapse = " ")))
```

or:

```
unname(tapply(bib, cumsum(grepl("^$", bib)), paste, collapse = " "))
```

etc.
Similar to the other answer, this uses split and sapply; the second line just removes any elements that contain only "".

```
vec <- unname(sapply(split(bib, f = cumsum(bib %in% "")), paste0, collapse = " "))
vec[!vec %in% ""]
```
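One small follow-up: because every group after the first begins with the empty-string separator, the pasted results keep a leading space (visible in the tapply output above). Base R's trimws() removes it:

```
## strip the leading space left by the "" separator in each group
trimws(vec[!vec %in% ""])
```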

Find out POI (within 2km) using latitude and longitude

I have a dataset of zip codes along with lat and long values. I want to find a list of hospitals/banks within 2 km of each latitude and longitude.
How can I do this?
The long/lat data looks like:
```
store_zip lon        lat
410710    73.8248981 18.5154681
410209    73.0907    19.0218215
400034    72.8148177 18.9724162
400001    72.836334  18.9385352
400102    72.834424  19.1418961
400066    72.8635299 19.2313448
400078    72.9327444 19.1570343
400078    72.9327444 19.1570343
400007    72.8133825 18.9618411
400050    72.8299518 19.0551695
400062    72.8426858 19.1593396
400083    72.9374227 19.1166191
400603    72.9781047 19.1834148
401107    72.8929    19.2762702
401105    72.8663173 19.3053477
400703    72.9992013 19.0793547
401209    NA         NA
401203    72.7983705 19.4166761
400612    73.0287209 19.1799265
400612    73.0287209 19.1799265
400612    73.0287209 19.1799265
```
If your points of interest are unknown and you need to find them, you can use Google's API through my googleway package (as you've suggested in the comments). You will need a valid API key for this to work.
As the API can only accept one request at a time, you'll need to iterate over your data one row at a time. For that you can use whatever looping method you're most comfortable with:
```
library(googleway)  ## using v2.4.0 on CRAN

set_key("your_api_key")

lst <- lapply(1:nrow(df), function(x){
  google_places(search_string = "Hospital",
                location = c(df[x, 'lat'], df[x, 'lon']),
                radius = 2000)
})
```
lst is now a list containing the results of the queries. For example, the names of the hospitals returned for the first row of your data are:
```
place_name(lst[[1]])
# [1] "Jadhav Hospital"
# [2] "Poona Hospital Medical Store"
# [3] "Sanjeevan Hospital"
# [4] "Suyash Hospital"
# [5] "Mehta Hospital"
# [6] "Deenanath Mangeshkar Hospital"
# [7] "Sushrut Hospital"
# [8] "Deenanath Mangeshkar Hospital and Research Centre"
# [9] "MMF Ratna Memorial Hospital"
# [10] "Maharashtra Medical Foundation's Joshi Multispeciality Hospital"
# [11] "Sahyadri Hospitals"
# [12] "Deendayal Memorial Hospital"
# [13] "Jehangir Specialty Hospital"
# [14] "Global Hospital And Research Institute"
# [15] "Prayag Hospital"
# [16] "Apex Superspeciality Hospital"
# [17] "Deoyani Multi Speciality Hospital"
# [18] "Shashwat Hospital"
# [19] "Deccan Multispeciality Hardikar Hospital"
# [20] "City Hospital"
```
You can also view them on a map:

```
set_key("map_api_key", api = "map")

## the lat/lon of the returned results are found through `place_location()`
df_hospitals <- place_location(lst[[1]])
df_hospitals$name <- place_name(lst[[1]])

google_map() %>%
  add_circles(data = df[1, ], radius = 2000) %>%
  add_markers(data = df_hospitals, info_window = "name")
```
Note: Google's API is limited to 2,500 queries per key per day, unless you pay for a premium account.
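Two practical caveats worth sketching (hypothetical adjustments, not part of the original answer): your data has rows with NA coordinates (e.g. zip 401209), which should be skipped, and a short pause between requests keeps the loop polite:

```
lst <- lapply(seq_len(nrow(df)), function(x){
  ## skip rows with missing coordinates
  if (is.na(df[x, 'lat']) || is.na(df[x, 'lon'])) return(NULL)
  Sys.sleep(0.1)  ## brief pause between consecutive requests
  google_places(search_string = "Hospital",
                location = c(df[x, 'lat'], df[x, 'lon']),
                radius = 2000)
})
```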
