Selenium stopping mid-loop without any error or exception - web-scraping

I was working on a project to scrape data for multiple cities from a website. There are about 1000 cities to scrape, and partway through the for-loop the webdriver window simply stops navigating to the requested site. The program keeps running and no error or exception is thrown; the webdriver window just stays on the same page.
The pages it stops at share one feature: they are always the ones that require a button click. The button-click part of the code should be correct, since it works as intended for hundreds of cities before the hang.
However, each run stops at a different city, and I can find no correlation between the cities, which leaves me very confused and unable to identify the root of the problem.
data_lookup = 'https://www.numbeo.com/{}/in/{}'
tabs = ['cost-of-living', 'property-investment', 'quality-of-life']
cities_stat = []
browser = webdriver.Chrome(executable_path=driver_path, chrome_options=ChromeOptions)
for x in range(len(cities_split)):
    city_stat = []
    for tab in tabs:
        browser.get(data_lookup.format(tab, str(cities_split[x][0]) + '-' + str(cities_split[x][1])))
        try:
            city_stat.append(read_data(tab))
        except NoSuchElementException:
            try:
                city_button = browser.find_element(By.LINK_TEXT, cities[x])
                city_button.click()
                city_stat.append(read_data(tab))
            except NoSuchElementException:
                try:
                    if len(cities_split[x]) == 2:
                        browser.get(data_lookup.format(tab, str(cities_split[x][0])))
                        try:
                            city_stat.append(read_data(tab))
                        except NoSuchElementException:
                            city_button = browser.find_element(By.LINK_TEXT, cities[x])
                            city_button.click()
                            city_stat.append(read_data(tab))
                    elif len(cities_split[x]) == 3:
                        city_stat.append(initialize_data(tab))
                except:
                    city_stat.append(initialize_data(tab))
    cities_stat.append(city_stat)
I have tried using WebDriverWait, to no avail. All it does is turn the NoSuchElementException into a TimeoutException.
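One common mitigation when a page load hangs silently is to bound each navigation with Selenium's `driver.set_page_load_timeout(seconds)` so the hang surfaces as a `TimeoutException`, and then retry the navigation a bounded number of times. A minimal, generic retry helper is sketched below (pure Python; the names and the usage comment are illustrative, not the asker's code):

```python
import time

def with_retries(action, attempts=3, delay=2.0, exceptions=(Exception,)):
    """Call `action` up to `attempts` times, sleeping `delay` seconds
    after each failure; re-raise the last exception if all attempts fail."""
    last_exc = None
    for _ in range(attempts):
        try:
            return action()
        except exceptions as exc:
            last_exc = exc
            time.sleep(delay)
    raise last_exc

# Hypothetical usage with the loop above, after setting a page-load timeout:
#   browser.set_page_load_timeout(30)
#   with_retries(lambda: browser.get(url), exceptions=(TimeoutException,))
```

If the retries themselves keep timing out, quitting and re-instantiating the driver inside the except branch is a heavier but often effective fallback.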

Related

How to make a button refuse to give you an item the second time you click it

So, I was making a Roblox game and I implemented a shop, but I forgot something: how to make it give you an item only once.
My code is this:
local BtoolGiver = script.Parent
local player = game:GetService("Players")
local valid = game.Workspace.SoundGroup.SoundSFX
local invalid = game.Workspace.SoundGroup.Sounding

script.Parent.MouseButton1Click:Connect(function()
    valid.Playing = true
    local tool = BtoolGiver.F3X:Clone()
    tool.Parent = game.Players.LocalPlayer.Backpack
end)
Can you find a way to fit that into the code?
Thanks!

requests.get(url) hangs after a few iterations

I am attempting to run a web-scraping algorithm on Indeed using BeautifulSoup and loop through the different pages. However, after 2-6 iterations, requests.get(url) hangs and stops finding the next page. I have read that this might have something to do with the server blocking me, but that would have blocked the original requests, and it also says online that Indeed allows web scraping. I have also heard that I should set a header, but I am unsure how to do that. I am running the latest version of Safari on macOS 12.4.
A workaround I came up with, though it does not answer the question specifically, is to use a try/except statement and set a timeout value on the request. Once the timeout is reached, the except branch sets a boolean flag and the loop continues to try again. The code is below.
while i < 10:
    url = get_url('software intern', '', i)
    print("Parsing Page Number:" + str(i + 1))
    error = False
    try:
        response = requests.get(url, timeout=10)
    except requests.exceptions.Timeout as err:
        error = True
    if error:
        print("Trying to connect to webpage again")
        continue
    i += 1
I am leaving the question unanswered for now, however, as I still don't know the root cause of this issue and this solution is just a workaround.
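On the header question: sites often serve requests without a browser-like User-Agent differently, so sending one is a cheap thing to try. A minimal sketch using the stdlib urllib is below (the User-Agent string is illustrative, and the same headers dict can be passed straight to `requests.get(url, headers=HEADERS, timeout=10)`):

```python
import urllib.request

# Illustrative desktop-browser User-Agent; any current browser string works.
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
        "AppleWebKit/537.36 (KHTML, like Gecko) "
        "Chrome/120.0.0.0 Safari/537.36"
    )
}

def build_request(url):
    """Attach browser-like headers to a request object (nothing is sent here)."""
    return urllib.request.Request(url, headers=HEADERS)
```

Whether this fixes the hang is an assumption about the site's behavior; it is, however, the standard way to "set a header".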

Different webpage results when using Scrapy

I was trying to scrape a supermarket website using Scrapy:
https://www.pnp.co.za/pnpstorefront/pnp/en/All-Products/Fresh-Food/Milk-%26-Cream/c/milk-and-cream703655157
I noticed that when using Chrome, I get a page showing 106 results over 5 pages. However, when using a spider with Scrapy (and other scraping software), the number of results is reduced to 30 products over 2 pages. It seems the site limits the results shown to Scrapy. How would one get around this and have a Scrapy spider be seen as my laptop running Chrome?
I use the following command to run the spider:
scrapy crawl tstPnPCategories -o out.csv
And here is the spider script:
import scrapy

class testSpydi(scrapy.Spider):
    name = 'tstPnPCategories'
    start_urls = [
        'https://www.pnp.co.za/pnpstorefront/pnp/en/All-Products/Fresh-Food/Milk-%26-Cream/c/milk-and-cream703655157'
    ]

    def parse(self, response):
        names = response.css(".item-name::text").extract()
        print("************")
        print("")
        print("NAMES")
        print("")
        print("************")
        for name in names:
            print("")
            print(name)
            print("")
            yield {
                'item': name
            }
        next_page = response.css("li.pagination-next a::attr(href)").get()
        if next_page is not None:
            yield response.follow(next_page, self.parse)
You have to select a different region to be able to scrape data about more items.
The script has to issue a click on one of the dropdown menu items.
The first item in the dropdown can be clicked by issuing the following:
document.getElementsByClassName('js-base-store')[0].click()
The element was identified using Developer Tools in the Chrome browser.
DevTools is activated by pressing F12 or Ctrl + Shift + I, or by choosing Developer Tools in the browser menu (three vertical dots).
Here is what to look for.
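As for being "seen as my laptop on Chrome": Scrapy identifies itself as `Scrapy/x.y` by default, so one common first step (an assumption about this site, not a guaranteed fix) is to send a desktop Chrome User-Agent. A minimal fragment for the project's settings.py, with an illustrative UA string:

```python
# settings.py -- illustrative desktop Chrome User-Agent
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/120.0.0.0 Safari/537.36"
)
```

Note that the region-dropdown click still needs something that executes JavaScript (e.g. a real browser driven by Selenium, or scrapy-playwright); Scrapy on its own only fetches and parses HTML.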

How to fix List Box returning null value using UIA wrapper

I want to access text values from a list box (pywinauto UIA wrapper) that is nested inside a list view in the application under test.
Code snippet:
# upper window
up_window.ListView.wait('visible').Select('Enforcement').click_input(double=True)
time.sleep(5)

# after this the Enforcement window opens; I need to select the third tab,
# which is done below and works fine
enf_win = guilib.get_window('Enforcement', backend='uia')
# guilib is a user-defined library that returns the window handle
if enf_win.TabControl.get_selected_tab() != 2:
    log.debug("Clicking on 'Targets' tab in Enforcement window")
    enf_win.TabControl.wait('enabled', timeout=60).select(2)
time.sleep(30)
list_rows = enf_win.ListBox.wait('ready', timeout=60).texts()

dcs_win.ListView.wait('visible').Select('Enforcement').click_input(double=True)
time.sleep(5)
enf_win = guilib.get_window('Enforcement', backend='uia')
if enf_win.TabControl.get_selected_tab() != 2:
    log.debug("Clicking on 'Targets' tab in Enforcement window")
    enf_win.TabControl.wait('enabled', timeout=60).select(2)
time.sleep(30)
list_rows = enf_win.ListBox.wait('ready', timeout=60).texts()
The problem is that when I call this function twice from the script, the first run fetches list_rows, whereas the second run returns blank. It seems it needs some time in between, but increasing the sleep is not helping.
Please suggest any modification I need to make to fetch the list box values each time.
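Fixed `time.sleep(30)` calls are fragile when a UIA tree repopulates at varying speed; polling until the data actually appears is usually more robust. A generic sketch is below (pure Python; the usage comment shows a hypothetical call against the snippet above, where the getter would be `lambda: enf_win.ListBox.texts()`):

```python
import time

def poll_until_nonempty(getter, timeout=60.0, interval=0.5):
    """Call `getter` repeatedly until it returns a non-empty result
    or `timeout` seconds elapse; return the last result either way."""
    deadline = time.monotonic() + timeout
    result = getter()
    while not result and time.monotonic() < deadline:
        time.sleep(interval)
        result = getter()
    return result

# Hypothetical usage with the snippet above:
#   list_rows = poll_until_nonempty(lambda: enf_win.ListBox.texts(), timeout=60)
```

If the second run still returns blank after a generous timeout, the likelier cause is a stale window handle, and re-resolving the ListBox wrapper before each poll would be the next thing to try.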

UFT/QTP General Run time error with .GetRoProperty

Hello, I am getting a general run-time error. I am working with preloaded dropdowns, and I have another function before this one that works just fine, but when it tries to run this one I get the error.
I have tried different properties like innertext, html id, etc., but I get the same error.
Sub WebList(DropDown)
    Set myPage = Browser("title:=.*").Page("title:=.*")
    Set myWebList = Description.Create()
    myWebList("micClass").value = "WebList"
    Set AllWebList = myPage.ChildObjects(myWebList)
    totalWebList = AllWebList.count()
    For i = 0 To totalWebList
        If AllWebList(i).GetRoProperty("name") = DropDown Then
            AllWebList(i).select ("GO")
            wait(3)
            Exit For
        End If
    Next
    Set myPage = Nothing
    Set myWebList2 = Nothing
    Set AllWebList2 = Nothing
End Sub
I want the dropdown to be selected. Thanks for any help/suggestions. Also, if any lines can be improved to make this more dynamic, as an experienced coder would write it, please do suggest that.
You have an error in your For loop: if there are no lists with the specified name, you will access one more element than actually exists. This is because To in VBScript is inclusive and the index starts at 0. If the list is found, the code works for me.
The For loop should be:
For i = 0 To totalWebList - 1