Error with rvest - NAs introduced by coercion (xpath & css) - css

I'm attempting to scrape a website and collect the daily prices for various articles of clothing over an extended period. I've followed the tutorial on RStudio's blog, but I am unable to replicate the idea on the test set despite using SelectorGadget. I've tried the following code and still receive NAs:
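library(rvest)
url <- ""
jeans <- url %>%
read_html() %>%
html_nodes(".description , .product-price span") %>%
html_text() %>%
as.numeric() # coercion step implied by the "NAs introduced by coercion" warning in the title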
I've also attempted to use the XPath format, still with no luck:
jeans <- url %>%
read_html() %>%
html_nodes(xpath = '//*[contains(concat( " ", @class, " " ), concat( " ", "product-price", " " ))]') %>%
html_text() %>%
as.numeric()
I'd greatly appreciate any insight you might share, and would really appreciate it if you passed along any resources that detail how to build a database over time from pulled data, or how to batch rvest web scraping requests!
Thank you!
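A note on the warning itself: as.numeric() returns NA, with the "NAs introduced by coercion" warning, for any string that is not purely numeric. Scraped price text usually carries currency symbols and whitespace, and the .description part of the selector returns text that can never be a number. A minimal sketch of cleaning before converting, using made-up strings (price_text and its values are assumptions, not output from the real page):

```r
# Hypothetical strings standing in for html_text() output
price_text <- c("$49.99", "\n  $120.00 ", "Sold out")

# Direct coercion fails on every element because of the extra characters
raw_attempt <- suppressWarnings(as.numeric(price_text))

# Strip everything except digits and the decimal point, then convert
prices <- as.numeric(gsub("[^0-9.]", "", price_text))
```

Selecting only .product-price span, without .description, would keep the description text out of the numeric conversion altogether.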


How to get a specific text in web scraping in r?

I am trying to scrape a website and map the artists to the url.
The element I am trying to pull from is here:
<title data-ng-bind="'Chartmetric | ' + $" class="ng-binding">Chartmetric | Fleetwood Mac</title>
I would like to get the "Fleetwood Mac" out of the code.
The following code only gives me the top part, data-ng-bind="'Chartmetric | ' + $":
Edit: will accept any answer that gives me the artist title
library(httr)
library(rvest)
url <- ""
parsed_page <- url %>% GET(., timeout(10)) %>% read_html()
parsed_page %>%
html_nodes(":contains('Chartmetric')")
Once you have supplied the cookies or authentication the site requires, you should be able to extract the text with html_text2() from the rvest package. After that, you will probably need some string manipulation to isolate the artist name.
url %>% read_html %>%
html_nodes(":contains('Chartmetric')") %>%
.[2] %>% # Accessing the second node
html_text2() # Extract the text
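For the string manipulation step, a minimal sketch that works on an inline copy of the title element from the question (the Angular attributes are left out for brevity, and the variable names are only illustrative). It assumes the title always starts with the fixed prefix "Chartmetric | ":

```r
library(rvest)

# Inline copy of the <title> element from the question (ng attributes omitted)
page <- read_html(
  "<html><head><title>Chartmetric | Fleetwood Mac</title></head><body></body></html>"
)

# Pull the title text, then drop the fixed "Chartmetric | " prefix
title_text <- page %>% html_element("title") %>% html_text2()
artist <- sub("^Chartmetric \\| ", "", title_text)
```

The pipe character has to be escaped in the pattern because it means "or" in a regular expression.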

How to deal with HTTP error 504 when scraping data from hundreds of webpages?

I am trying to scrape voting data from the website of the Russian parliament. I am working with nearly 600 webpages, and I am trying to scrape data from within those pages as well. Here is the code I have written thus far:
# load packages
library(tidyverse)
library(rvest)
# base url
base_url <- sprintf("", 1:789)
# loop over pages
map_df(base_url, function(i) {
pg <- read_html(i)
# one row per vote: title and absolute link
tibble(
title = html_nodes(pg, ".item-left a") %>% html_text() %>% str_trim(),
link = html_elements(pg, '.item-left a') %>%
html_attr('href') %>%
paste0('', .)
)
}) -> duma_votes_data
The above code executed successfully. This results in a df containing the titles and links. I am now trying to extract the date information. Here is the code I have written for that:
# extract date of vote
duma_votes_data$date <- map(duma_votes_data$link, ~ {
.x %>%
read_html() %>%
html_nodes(".date-p span") %>%
html_text() %>%
paste(collapse = " ")
})
After running this code, I receive the following error:
Error in open.connection(x, "rb") : HTTP error 504.
What is the best way to get around this issue? I have read about the possibility of incorporating Sys.sleep() into my code, but I am not sure where it should go. Note that this code is for all 789 pages, as indicated in base_url. The code does work with around 40 pages, so in the worst case I could process everything in small chunks and combine the resulting dfs into a single df.
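One possible way to place Sys.sleep(), sketched under assumptions: wrap each request in a small retry helper that pauses after a failure and returns NA when every attempt fails, so a single 504 does not abort the whole map(). The helper name retry, its arguments, and the flaky_fetch stand-in are all hypothetical. flaky_fetch only simulates read_html() so the example runs without a network connection.

```r
# Retry a function call a few times, pausing between attempts.
# `fetch`, `max_tries`, and `wait` are hypothetical names for this sketch.
retry <- function(fetch, max_tries = 3, wait = 1) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(fetch(), error = function(e) e)
    if (!inherits(result, "error")) return(result)
    # Pause a little longer after each failed attempt
    if (attempt < max_tries) Sys.sleep(wait * attempt)
  }
  NA_character_  # give up: return NA instead of stopping the whole loop
}

# Stand-in for read_html(): fails twice with a 504-like error, then succeeds
calls <- 0
flaky_fetch <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("HTTP error 504.")
  "12 May 2021"
}

date_text <- retry(flaky_fetch, max_tries = 5, wait = 0.1)
```

In the date loop this would mean calling retry(function() read_html(.x)) in place of read_html(.x), and checking for the NA case before html_nodes(). A short fixed Sys.sleep() between successful requests may also reduce the load on the server.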

web scraping directors sections IMDB in r

I'm trying to scrape data from the IMDB website. I'm trying to get the directors' names with html_nodes("p.text-mutated + a") and have also tried html_nodes(".text-mutated + p a"), but neither works.
Note that this is my first time doing web scraping.
Your help will be much appreciated
Thank you !
Your css selector is not matching anything. This code gets you the directors:
library(tidyverse)
library(rvest)
url <- ""
webpage <- read_html(url)
directors_data_html <- html_nodes(webpage, ".text-small:nth-child(6)")
directors_data <- html_text(directors_data_html)
# keep the part before the first "|" of each entry
directors <- directors_data %>%
str_split("\\|") %>%
map_chr(1)
tibble(directors = directors)

No data when scraping with rvest

I am trying to scrape a website but it does not give me any data.
#Get the Data
library(rvest)
#specify the url
url <- ''
#get data
url %>%
read_html() %>%
html_nodes(".green div:nth-child(1)") %>%
html_text()
I have also tried to use the xpath = '//*[contains(concat( " ", @class, " " ), concat( " ", "green", " " ))]//div[(((count(preceding-sibling::*) + 1) = 1) and parent::*)]//a' but this gives me the same result with 0 data.
I am expecting horse names. Shouldn't I at least get some JavaScript code even if the data on the page is rendered by JavaScript?
I can't see what other CSS selector I should use here.
You can use the RSelenium package to scrape dynamic pages:
library(RSelenium)
library(rvest)
#specify the url
url <- ''
#Create the remote driver / navigator
rsd <- rsDriver(browser = "chrome")
remDr <- rsd$client
#Go to your url
remDr$navigate(url)
#Read the rendered page source
page <- read_html(remDr$getPageSource()[[1]])
#get your horses data by parsing the Selenium page source with rvest as usual
page %>% html_nodes(".green div:nth-child(1)") %>% html_text()
Hope that helps.
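On the question of why nothing comes back without a browser: read_html() only sees the HTML the server sends and does not run JavaScript. A small self-contained illustration with a made-up page (the markup and the renderHorses() call are assumptions, not the real site):

```r
library(rvest)

# Made-up static source: the container is empty until a script fills it
static_source <- "<html><body><div class='green'></div><script>renderHorses();</script></body></html>"
page <- read_html(static_source)

# The selector from the question finds nothing in the unrendered source
horses <- page %>% html_nodes(".green div:nth-child(1)") %>% html_text()

# The script itself is present and can be selected explicitly
scripts <- page %>% html_nodes("script") %>% html_text()
```

So the JavaScript code is in the parsed document, but only under a script selector. The nodes it would create do not exist yet, which is why the .green selector matches nothing.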

Extracting web table using Rvest (in R)

I am looking to pull in a table in order to process active and inactive players. I am very familiar with rvest and have tried using the code:
library(rvest)
url <- paste0("")
Table <- url %>%
read_html() %>%
html_nodes(xpath= '//*[contains(concat( " ", @class, " " ), concat( " ", "yui3-datatable-cell", " " ))]')
TableNew <- Table[[1]]
Nothing is coming up correctly though. Ideally, I would like to be able to put all the players and their team name into one single table. I appreciate your insights.
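If the table is present in the HTML that read_html() receives, html_table() converts it to a data frame directly, which is usually simpler than selecting individual cells. A sketch on made-up markup (the player and team names are placeholders, and the structure of the real page is an assumption). If the real table is built by JavaScript, this will not find it and a browser-driven approach would be needed instead.

```r
library(rvest)

# Made-up table using the cell class from the question
html <- "<html><body><table>
  <tr><th>Player</th><th>Team</th></tr>
  <tr><td class='yui3-datatable-cell'>Player A</td><td class='yui3-datatable-cell'>Team X</td></tr>
  <tr><td class='yui3-datatable-cell'>Player B</td><td class='yui3-datatable-cell'>Team Y</td></tr>
</table></body></html>"

# Select the table element and convert it to a data frame in one step
players <- read_html(html) %>%
  html_element("table") %>%
  html_table()
```

The result has one row per player, with the header cells used as column names.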
