R - using RSelenium to log into website (Captcha, and staying logged in)

I want to use RSelenium to access and scrape a website each day. When I open the website in a regular Chrome browser, I am already logged in from my last visit. However, when I use RSelenium to open a remote driver and visit the page with it, I am not logged in. Logging in through the driver is usually easy enough for most sites, but this website has a CAPTCHA that makes logging in more difficult.
Is there any way the remote driver can access the website with me already logged in?
Example of my code below:
this_URL = "my_url_goes_here"
startServer()
remDr = remoteDriver$new(browserName = 'chrome')
Sys.sleep(2); remDr$open();
Sys.sleep(4); remDr$navigate(this_URL);
login_element = remDr$findElement(using = "id", "login-link")
login_element$
After clicking the login_element link, it brings me to the page where I input my username, password, and click the captcha / do what it asks.
Thanks,

It should work using Firefox and Firefox profiles as follows:
Set up Firefox access:
Open Firefox and log in as usual. Make sure that when you close Firefox and open it again, you are still logged in.
Figure out the location of your default Firefox profile:
This should be something like: (source + more details)
Windows: %AppData%\Mozilla\Firefox\Profiles\xxxxxxxx.default
Mac: ~/Library/Application Support/Firefox/Profiles/xxxxxxxx.default/
Linux: ~/.mozilla/firefox/xxxxxxxx.default/
Start a new RSelenium driver and set the profile as follows:
require(RSelenium)
eCap <- list("webdriver.firefox.profile" = "MySeleniumProfile")
remDr <- remoteDriver(browserName = "firefox", extraCapabilities = eCap)
remDr$open()
The Firefox window that opens should use your chosen profile.
I did this a while ago. If I remember correctly, it works like this.
P.S.: You could also create a separate, new Firefox profile for this. To do that, follow the steps in the link above.
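
To make the profile lookup above a little more concrete, here is a minimal sketch that builds the usual per-platform Firefox profile root. The helper name `firefox_profile_root` is hypothetical (it is not part of RSelenium), and the directory layout is the standard Firefox default, which may differ on a customized install. The commented usage at the end assumes `RSelenium::getFirefoxProfile()`, which packages a profile directory into capabilities and may need a zip utility on the system.

```r
# Hypothetical helper: return the folder that normally holds Firefox profiles.
# The arguments exist so the function can be exercised without touching the
# real environment; the defaults read the current platform.
firefox_profile_root <- function(sysname = Sys.info()[["sysname"]],
                                 home = path.expand("~"),
                                 appdata = Sys.getenv("APPDATA")) {
  switch(sysname,
    Windows = file.path(appdata, "Mozilla", "Firefox", "Profiles"),
    Darwin  = file.path(home, "Library", "Application Support",
                        "Firefox", "Profiles"),
    Linux   = file.path(home, ".mozilla", "firefox"),
    stop("Unsupported platform: ", sysname)
  )
}

# Possible usage (assumption: not run here, requires a Selenium server):
# profiles <- list.dirs(firefox_profile_root(), recursive = FALSE)
# fprof <- RSelenium::getFirefoxProfile(profiles[1], useBase = TRUE)
# remDr <- RSelenium::remoteDriver(browserName = "firefox",
#                                  extraCapabilities = fprof)
# remDr$open()
```

Passing the profile as a packaged directory is a different route from the profile-name capability shown in the answer, so treat it as an alternative to try rather than a confirmed fix.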

Related

RSelenium with Chrome: how to keep informations?

I have been trying to do some web scraping with RSelenium using the Chrome browser.
To access the page with the information I need, I first have to scan a QR code.
In a regular browser, I do this just once and can then close and reopen the browser as many times as I want. With RSelenium, I need to scan the QR code every time I open the browser.
Is there a way to keep this information? Cache or cookies?
I have been trying:
rD <- RSelenium::rsDriver(browser = "chrome",
chromever = "108.0.5359.71",
port = netstat::free_port(),
verbose = FALSE)
remDr <- rD[["client"]]
remDr$navigate("https://google.com/")
remDr$maxWindowSize()
remDr$getAllCookies()
But I'm getting only an empty list.

RSelenium Chrome webdriver doesn't work with user profile

I have some existing RSelenium code I am trying to get working with a Chrome Profile. I am using the code below to open a browser:
cprof <- getChromeProfile("C:/Users/Paul/AppData/Local/Google/Chrome/User Data", "Profile 1")
driver <- rsDriver(browser=c("chrome"), chromever="80.0.3987.106", port=4451L, extraCapabilities=cprof)
But when I run this, three (3!) new Chrome browser windows open before the following error is displayed in RStudio:
Could not open chrome browser.
Client error message:
Summary: SessionNotCreatedException
Detail: A new session could not be created.
Further Details: run errorDetails method
Check server log for further details.
The puzzling part is that it does look like it is getting the correct profile, because when I switch between "Profile 1", "Profile 2" and even "Default" in the getChromeProfile call, I see the correct user icon in the browser windows that open. And if I leave off the extraCapabilities the browser opens with no problem (using the default "empty" profile).
Any idea what I am doing wrong?

RGoogleAnalytics Auth function wont open the browser for "Request for Permission"

I'm using the RGoogleAnalytics package on a remote desktop (in a different country). I'm not using my local machine because Google blocks it due to restrictions.
On my local machine, when I run the 'Auth' command:
token <- Auth(client.id = "XXXXXXXXXXXXXXXXXXXXXXX",
client.secret = "YYYYYYYYYYY")
the browser automatically opens a new tab ("Request for permission") for me to accept, which is a normal part of authentication. This is what should happen. However, when I do the same on my remote machine (where I'm logged into the GA account, as I should be), my R console just hangs on the following command without opening a new "Request for permission" tab in the browser:
token <- Auth(client.id = "XXXXXXXXXXXXXXXXXXXXXXX",
client.secret = "YYYYYYYYYYY")
Waiting for authentication in browser...
Press Esc/Ctrl + C to abort
Has anyone run into this issue in the past? I've used this package quite a lot and never ran into this weird issue.
Thanks in advance for any help on this one :)
I solved this by launching a browser remotely:
ssh -Y user@server firefox
...and launching RStudio as if it were on localhost.
Just make sure the port used for the Google API authentication response (likely 1410) is open for TCP.

Rfacebook: Connection Error with an R crash

I ran
fb_oauth <- fbOAuth(app_id="XXXXX", app_secret="XXXXXX",extended_permissions = TRUE)
from Rfacebook. I had added http://localhost:1410/ as the Site URL and saved the changes.
I waited a few minutes and hit Enter. A Chrome page opened and asked whether my app was allowed to access my information. I clicked OK, but then both the page and the R console closed. The message on the web page was "Google Chrome's connection attempt to localhost was rejected".

How to implement Remember me automation in Firefox with web driver?

How can I implement "Remember me" automation in Firefox with WebDriver? I am using WebDriver 2.20, Eclipse IDE, and Firefox 9.0.
The reason you are experiencing this is that every time you start Firefox, WebDriver creates a new anonymous profile with no cookies. You can make it use a particular profile, which should retain cookies.
File profileDir = new File("path/to/profile");
FirefoxProfile profile = new FirefoxProfile(profileDir);
WebDriver driver = new FirefoxDriver(profile);
FirefoxProfile has many other options, such as adding extensions.
I understand you need a solution for Firefox, but I have the working version below for Chrome. You can refer to this link for a Firefox solution: How to start Selenium RemoteWebDriver or WebDriver without clearing cookies or cache?
For Chrome (config): you have to set the path to a user data directory (user-data-dir), which will store all the login info after you log in for the first time. The next time you start the browser, the login info is read from that directory.
System.setProperty("webdriver.chrome.driver", "res/chromedriver.exe");
DesiredCapabilities capabilities = DesiredCapabilities.chrome();
ChromeOptions options = new ChromeOptions();
options.addArguments("test-type");
options.addArguments("start-maximized");
options.addArguments("user-data-dir=D:/temp/");
capabilities.setCapability("chrome.binary","res/chromedriver.exe");
capabilities.setCapability(ChromeOptions.CAPABILITY,options);
WebDriver driver = new ChromeDriver(capabilities);
Login for the first time:
driver.get("https://gmail.com");
//Your login script typing username password, check 'keep me signed in' and so on
Close the driver (do NOT quit):
driver.close();
Re-initialize the driver and navigate to the site. You should not be asked for username and password again:
driver = new ChromeDriver(capabilities);
driver.get("http://gmail.com");
The same approach can be implemented for Firefox using a Firefox profile.
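
For RSelenium users, the same user-data-dir idea can be sketched as a plain capabilities list. This is a minimal sketch under assumptions: the helper name `chrome_profile_caps` is hypothetical, the `chromeOptions` key follows the shape that `RSelenium::getChromeProfile()` is generally understood to produce, and newer Selenium or chromedriver versions may expect the key `goog:chromeOptions` instead. The path `D:/temp` is only the example directory from the Java answer above.

```r
# Hypothetical helper: build Chrome capabilities that point the browser at a
# persistent user data directory, so cookies and logins survive restarts.
chrome_profile_caps <- function(user_data_dir, profile_dir = "Default") {
  list(
    chromeOptions = list(
      args = c(
        paste0("--user-data-dir=", user_data_dir),
        paste0("--profile-directory=", profile_dir)
      )
    )
  )
}

caps <- chrome_profile_caps("D:/temp")
str(caps)

# Possible usage (assumption: not run here, requires a Selenium server):
# remDr <- RSelenium::remoteDriver(browserName = "chrome",
#                                  extraCapabilities = caps)
# remDr$open()
```

Chrome generally refuses to share one user data directory between two running instances, so it is safer to point the driver at a dedicated directory than at the profile of a browser that is already open.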
