Trying to iterate through a web class using Selenium Python 3.10 - css

I'm trying to iterate through a class element on our website to download each report using Selenium and Python. The problem is that the number of reports varies: some days there are 8, other days 9, other days 12, so I can't hardcode the logic for each one; it must be dynamic. I have tried almost everything I found on Google and nothing works. I found something like the following:
a = driver.find_elements(By.CLASS_NAME, "k-link")
I need something like that, but the result must be something Selenium understands, such as an XPath, a class name, or a CSS selector, so that I can avoid hardcoding.
Here is an image of the website, specifically the tabs that have this class.
I would really appreciate your help. Thanks in advance.
This is my Python code:
def elements():
    global maxelement
    #a = driver.find_elements(By.CLASS_NAME, "k-link")
    classname = (By.CLASS_NAME, "k-link")
    a = WebDriverWait(driver, 120).until(EC.visibility_of_all_elements_located(classname))
    print(a)
    for x in range(maxelement):
        selector = "li.k-item:nth-child(" + str(x) + ") > span:nth-child(2)"
        print(selector)
        xpathvar = "/html/body/div[4]/section[2]/div/div[1]/div/div/ul/li[" + str(x) + "]/span[2]"
        print(xpathvar)
        print('***************')
        # First table Logic
        time.sleep(60)
        ## I set it up like this to test the logic, and it works perfectly; it runs quickly and well.
        try:
            print('Entered the try block for the tab')
            first_tab = driver.find_element(By.CSS_SELECTOR, selector)
            first_tab.click()
            click = True
        except NoSuchElementException:
            click = False
            pass
            #print(str(x) + 'Tab not Found')
        time.sleep(60)
        if click == True:
            try:
                download_first = driver.find_element(By.XPATH, xpathvar)
                download_first.click()
            except NoSuchElementException:
                click = False
                pass
                #print('Downloaded file Succesfully' + str(x))
        #time.sleep(60)
        print('Cycle ends correctly')
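One way to avoid a hardcoded count is to ask the page how many tabs exist at run time and build the selectors from that number. The sketch below assumes the `li.k-item` selector from the code above matches exactly the tab items and that they are the only children of their list; `click_every_tab` and `tab_selectors` are illustrative names, not part of Selenium. It relies only on `driver.find_elements(by, value)`, which returns an empty list when nothing matches, and on the fact that CSS `:nth-child()` counts from 1, so a loop starting at 0 never matches its first selector.

```python
# Sketch only: "li.k-item" and the span position are taken from the question's
# selector; whether they match the real page is an assumption.

CSS = "css selector"  # same string value as selenium's By.CSS_SELECTOR


def tab_selectors(tab_count):
    """Build one selector per tab; :nth-child() counts from 1, not 0."""
    return [
        f"li.k-item:nth-child({i}) > span:nth-child(2)"
        for i in range(1, tab_count + 1)
    ]


def click_every_tab(driver):
    """Count the tabs present right now, then click each one in turn."""
    tab_count = len(driver.find_elements(CSS, "li.k-item"))
    clicked = 0
    for selector in tab_selectors(tab_count):
        matches = driver.find_elements(CSS, selector)  # empty list if absent
        if matches:
            matches[0].click()
            clicked += 1
    return clicked
```

With a real WebDriver instance, `click_every_tab(driver)` would be called after the wait for the tabs has succeeded; the download click and any waiting between tabs are left out here.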

Related

R: hide cells in DT::datatable based on condition

I am trying to create a datatable with child rows: the user will be able to click on a name and see a list of links related to that name. However, the number of items to show is different for each name.
> data1 <- data.frame(name = c("John", "Maria", "Afonso"),
                      a = c("abc", "def", "rty"),
                      b = c("ghj", "lop", NA),
                      c = c("zxc", "cvb", NA),
                      d = c(NA, "mko", NA))
> data1
    name   a    b    c    d
1   John abc  ghj  zxc <NA>
2  Maria def  lop  cvb  mko
3 Afonso rty <NA> <NA> <NA>
I am using varsExplore::datatable2 to hide specific columns:
varsExplore::datatable2(x=data1, vars=c("a","b","c","d"))
and it produces the below result
Is it possible to modify DT::datatable in order to only render cells that are not "null"? So, for example, if someone clicked on "Afonso", the table would only render "rty", thus hiding "null" values for the other columns (for this row), while still showing those columns if the user clicked "Maria" (that doesn't have any "null").
(Should I try a different approach in order to achieve this behavior?)
A look into the inner working of varsExplore::datatable2
Following your request, I took a look at the varsExplore::datatable2 source code. I found that varsExplore::datatable2 calls varsExplore:::.callback2 (the triple colon ::: means it is not an exported function) to create the JavaScript code. That function in turn calls varsExplore:::.child_row_table2, which returns a JavaScript function format(row_data) that formats the row data into the table you see.
A proposed solution
I used my JavaScript knowledge to change the output of varsExplore:::.child_row_table2 and came up with the following:
.child_row_table2 <- function(x, pos = NULL) {
  names_x <- paste0(names(x), ":")
  text <- "
  var format = function(d) {
    text = '<div><table >' +
  "
  for (i in seq_along(pos)) {
    text <- paste(text, glue::glue(
      " ( d[{pos[i]}]!==null ? ( '<tr>' +
          '<td>' + '{names_x[pos[i]]}' + '</td>' +
          '<td>' + d[{pos[i]}] + '</td>' +
        '</tr>' ) : '' ) + " ))
  }
  paste0(text,
    "'</table></div>'
    return text;};"
  )
}
The only change I made was adding d[{pos[i]}]!==null ? ....... : '', which shows the column pos[i] only when its value d[pos[i]] is not null.
Since loading the package and adding the function to the global environment won't do the trick, I forked it on GitHub and committed the changes (the GitHub repo is a read-only CRAN mirror, so I can't submit a pull request). You can now install it by running:
devtools::install_github("moutikabdessabour/varsExplore")
EDIT
If you don't want to re-download the package, I found another solution: override the datatable2 function.
First, copy its source code into your R file located at path/to/your/Rfile:
# the data.table way
data.table::fwrite(list(capture.output(varsExplore::datatable2)), quote=F, sep='\n', file="path/to/your/Rfile", append=T)
# the baseR way
fileConn<-file("path/to/your/Rfile", open='a')
writeLines(capture.output(varsExplore::datatable2), fileConn)
close(fileConn)
Then you'll have to replace the last statement
DT::datatable(
  x,
  ...,
  escape = -2,
  options = opts,
  callback = DT::JS(.callback2(x = x, pos = c(0, pos)))
)
with:
DT::datatable(
  x,
  ...,
  escape = -2,
  options = opts,
  callback = DT::JS(gsub("('<tr>.+?(d\\[\\d+\\]).+?</tr>')" , "(\\2==null ? '' : \\1)", varsExplore:::.callback2(x = x, pos = c(0, pos))))
)
What this code does is add the JavaScript null check to each generated row using a regular expression.
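To see what that substitution does to a single row, here is the same pattern applied with Python's re module. The input string is a made-up, one-line stand-in for a fragment of the generated JavaScript (the real output of .callback2 is longer and may span several lines), so treat this as an illustration of the regular expression only, not of the package's exact output.

```python
import re

# Hypothetical one-row fragment of the generated JavaScript (an assumption).
js = "'<tr>' + '<td>' + 'a:' + '</td>' + '<td>' + d[2] + '</td>' + '</tr>'"

# Group 1 is the whole '<tr>...</tr>' expression, group 2 the d[<index>] it uses.
pattern = r"('<tr>.+?(d\[\d+\]).+?</tr>')"

# Wrap each row in a ternary that yields '' when the cell value is null.
wrapped = re.sub(pattern, r"(\2==null ? '' : \1)", js)
print(wrapped)
```

The row expression ends up wrapped as `(d[2]==null ? '' : <original row>)`, which is the condition the edited callback relies on.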
Result

Kivy regarding binding multiple buttons to each individual function

Hi, I am new to Kivy and have just started programming. I have a problem: I want to bind each of the buttons I create in the for loop to its own on_release handler, so that clicking each button goes to a different screen. Below is a small part of my code (I EDITED it with more information):
# these are the pictures of the buttons
a = '_icons_/mcdonald2.png'
b = '_icons_/boostjuice.png'
c = '_icons_/duckrice.png'
d = '_icons_/subway_logo.png'
e = '_icons_/bakery.png'
f = '_icons_/mrbean.png'
# these are the names of the different screens
n1 = 'mcdonald_screen'
n2 = 'boost_screen'
n3 = 'duck_screen'
n4 = 'subway_screen'
n5 = 'bakery_screen'
n6 = 'mrbean_screen'
arraylist = [[a,n1],[b,n2],[c,n3],[d,n4],[e,n5],[f,n6]]
self.layout2 = GridLayout(rows=2, spacing = 50,size_hint = (0.95,0.5),
                          pos_hint = {"top":.65,"x":0},padding=(90,0,50,0))
for image in arraylist:
    self.image_outlet = ImageButton(
        size_hint=(1, 0.1),
        source= image[0])
    self.screen_name = image[1]
    self.image_outlet[0].bind(on_release= ??)  ## This is the part I want to change according to the different screen
    self.layout2.add_widget(self.image_outlet)
self.add_widget(self.layout2)
GUI = Builder.load("_kivy_/trying.kv")
class TRYINGApp(App):
    def build(self):
        return GUI
    def change_screen(self,screen_name):
        screen_manager = self.root.ids['screen_manager']
        screen_manager.current = screen_name
#kv file#
# all the various kv file screens
#: include _kivy_/variestime_screen.kv
#: include _kivy_/homescreen.kv
#: include _kivy_/mcdonaldscreen.kv
#: include _kivy_/firstpage.kv
#: include _kivy_/mrbeanscreen.kv
#: include _kivy_/boostscreen.kv
#: include _kivy_/duckscreen.kv
#: include _kivy_/subwayscreen.kv
#: include _kivy_/bakeryscreen.kv
GridLayout:
    cols: 1
    ScreenManager:
        id: screen_manager
        FirstPage:
            name: "first_page"
            id: first_page
        VariesTimeScreen:
            name: "variestime_screen"
            id: variestime_screen
        HomeScreen:
            name: "home_screen"
            id: home_screen
        McDonaldScreen:
            name: "mcdonald_screen"
            id: mcdonald_screen
        BoostScreen:
            name: "boost_screen"
            id: boost_screen
        DuckScreen:
            name: "duck_screen"
            id: duck_screen
        SubwayScreen:
            name: "subway_screen"
            id: subway_screen
        BakeryScreen:
            name: "bakery_screen"
            id: bakery_screen
        MrBeanScreen:
            name: "mrbean_screen"
            id: mrbean_screen
Your on_release can be something like the following (partial comes from functools, so you need from functools import partial):
self.image_outlet.bind(on_release=partial(self.change_screen, image[1]))
where change_screen is a method that you must define:
def change_screen(self, new_screen_name, button_instance):
    # some code to change to the screen with name new_screen_name
Note that I have removed the [0] from self.image_outlet (I suspect that was a typo). I can't determine what code should go in the new method, because you haven't provided enough information.
If you have a change_screen method in your App class, you can use that directly by referencing it in your on_release as:
self.image_outlet.bind(on_release=partial(App.get_running_app().change_screen, image[1]))
You will need to make a minor change to your change_screen method to handle additional args:
def change_screen(self, screen_name, *args):
    screen_manager = self.root.ids['screen_manager']
    screen_manager.current = screen_name
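The binding pattern above can be tried without Kivy. In this sketch, change_screen is a plain stand-in function that records the requested screen name, and the loop at the end imitates Kivy invoking each on_release handler with the button instance as an extra argument. The point is that partial stores each screen name when the callback is created, so every button keeps its own target.

```python
from functools import partial

visited = []


def change_screen(screen_name, *args):
    # *args absorbs the button instance that Kivy passes to on_release handlers.
    visited.append(screen_name)


screen_names = ["mcdonald_screen", "boost_screen", "duck_screen"]

# One callback per screen; partial freezes the name at creation time.
callbacks = [partial(change_screen, name) for name in screen_names]

# Simulate Kivy calling each handler with the released button as argument.
for callback in callbacks:
    callback(object())

print(visited)  # ['mcdonald_screen', 'boost_screen', 'duck_screen']
```

A lambda that reads the loop variable directly would behave differently: all callbacks would see the last value of the variable, which is why partial (or a default argument) is the usual choice inside a loop.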

How to iterate over multiple links, scrape each of them one by one, and save the output in a CSV using Python, BeautifulSoup and requests

I have this code but don't know how to read the links from a CSV or a list. I want to read the links, scrape the details from every single link, and then save the data in columns corresponding to each link in an output CSV.
Here is the code I built to get specific data.
from bs4 import BeautifulSoup
import requests
url = "http://www.ebay.com/itm/282231178856"
r = requests.get(url)
x = BeautifulSoup(r.content, "html.parser")
# print(x.prettify().encode('utf-8'))
# time to find some tags!!
# y = x.find_all("tag")
z = x.find_all("h1", {"itemprop": "name"})
# print z
# for loop done to extracting the title.
for item in z:
    try:
        print item.text.replace('Details about ', '')
    except:
        pass
# category extraction done
m = x.find_all("span", {"itemprop": "name"})
# print m
for item in m:
    try:
        print item.text
    except:
        pass
# item condition extraction done
n = x.find_all("div", {"itemprop": "itemCondition"})
# print n
for item in n:
    try:
        print item.text
    except:
        pass
# sold number extraction done
k = x.find_all("span", {"class": "vi-qtyS vi-bboxrev-dsplblk vi-qty-vert-algn vi-qty-pur-lnk"})
# print k
for item in k:
    try:
        print item.text
    except:
        pass
# Watchers extraction done
u = x.find_all("span", {"class": "vi-buybox-watchcount"})
# print u
for item in u:
    try:
        print item.text
    except:
        pass
# returns details extraction done
t = x.find_all("span", {"id": "vi-ret-accrd-txt"})
# print t
for item in t:
    try:
        print item.text
    except:
        pass
#per hour day view done
a = x.find_all("div", {"class": "vi-notify-new-bg-dBtm"})
# print a
for item in a:
    try:
        print item.text
    except:
        pass
#trending at price
b = x.find_all("span", {"class": "mp-prc-red"})
#print b
for item in b:
    try:
        print item.text
    except:
        pass
Your question is kind of vague!
Which links are you talking about? There are a hundred on a single eBay page. What information would you like to scrape? There is a ton of that too.
But anyway, here is how I would proceed:
# First, create a list of urls you want to iterate on
urls = []
soup = BeautifulSoup(r.text, "html.parser")  # r is the response from requests.get(url)
# Assuming your links of interest are values of "href" attributes within <a> tags
a_tags = soup.find_all("a", href=True)
for tag in a_tags:
    urls.append(tag["href"])
# Second, start to iterate while storing the info
info_1, info_2 = [], []
for link in urls:
    # Do stuff here, maybe it's time to define your existing loops as functions?
    page = BeautifulSoup(requests.get(link).text, "html.parser")
    info_a, info_b = YourFunctionReturningValues(page)
    info_1.append(info_a)
    info_2.append(info_b)
Then if you want a nice csv output:
# Don't forget to import the csv module
# "wb" is the Python 2 mode; on Python 3 use open(path, "w", newline="")
with open(r"path_to_file.csv", "wb") as my_file:
    csv_writer = csv.writer(my_file, delimiter = ",")
    csv_writer.writerows(zip(urls, info_1, info_2))
Hope this helps.
Of course, don't hesitate to give additional info so that I can add more detail.
On attributes with BeautifulSoup: https://www.crummy.com/software/BeautifulSoup/bs4/doc/#attributes
About the csv module: https://docs.python.org/2/library/csv.html
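As a rough Python 3 sketch of the overall flow (read links from CSV text, process each one, write one row per link), the example below uses only the standard library. The scrape function is a placeholder that returns made-up fields instead of fetching anything; in real use it would hold the requests and BeautifulSoup extraction. The names read_links, scrape and write_rows are illustrative, and the column names url and title are assumptions.

```python
import csv
import io


def read_links(csv_text):
    """Take the first column of every non-empty row as a link."""
    return [row[0] for row in csv.reader(io.StringIO(csv_text)) if row]


def scrape(url):
    """Placeholder for the per-page extraction; returns made-up fields."""
    return {"url": url, "title": "title for " + url.rsplit("/", 1)[-1]}


def write_rows(rows, out):
    """Write one CSV row per scraped link, with a header line first."""
    writer = csv.DictWriter(out, fieldnames=["url", "title"])
    writer.writeheader()
    writer.writerows(rows)


links = read_links("http://www.ebay.com/itm/282231178856\n")
buffer = io.StringIO()  # stands in for open("output.csv", "w", newline="")
write_rows([scrape(link) for link in links], buffer)
print(buffer.getvalue())
```

Returning a dictionary per link keeps each value tied to its column name, which avoids juggling several parallel lists.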

Format datetime day with st, nd, rd, th

I'm creating a report in SSRS and across the top I have a header with a placeholder for "Last Refreshed" which will show when the report last ran.
My function in the placeholder is simply this:
=Format(Now, "dddd dd MMMM yyyy hh:mm tt")
Which looks like this:
Monday 22 September 2015 09:46 AM
I want to format the day value with the English suffix of st, nd, rd and th appropriately.
I can't find a built in function for this and the guides I've looked at so far seem to describe doing it on the SQL side with stored procedures which I don't want. I'm looking for a report side solution.
I thought I could get away with an ugly nested IIF that did it but it errors out despite not giving me any syntax errors (whitespace is just for readability).
=Format(Now, "dddd " +
    IIF(DAY(Now) = "1", "1st",
    IIF(DAY(Now) = "21","21st",
    IIF(DAY(Now) = "31","31st",
    IIF(DAY(Now) = "2","2nd",
    IIF(DAY(Now) = "22","22nd",
    IIF(DAY(Now) = "3","3rd",
    IIF(DAY(Now) = "23","23rd",
    DAY(Now) + "th")))))))
    + " MMMM yyyy hh:mm tt")
In any other language I would have nailed this ages ago, but SSRS is new to me and so I'm not sure about how to do even simple string manipulation. Frustrating!
Thanks for any help or pointers you can give me.
Edit: I've read about inserting VB code into the report which would solve my problem, but I must be going nuts because I can't see where to add it. The guides say to go into the Properties > Code section but I can't see that.
Go to layout view. Select Report Properties. Click on the "Code" tab and enter this code:
Public Function ConvertDate(ByVal mydate As DateTime) as string
    Dim myday as integer
    Dim strsuff As String
    Dim mynewdate As String
    'Default to th
    strsuff = "th"
    myday = DatePart("d", mydate)
    If myday = 1 Or myday = 21 Or myday = 31 Then strsuff = "st"
    If myday = 2 Or myday = 22 Then strsuff = "nd"
    If myday = 3 Or myday = 23 Then strsuff = "rd"
    mynewdate = CStr(DatePart("d", mydate)) + strsuff + " " + CStr(MonthName(DatePart("m", mydate))) + " " + CStr(DatePart("yyyy", mydate))
    return mynewdate
End function
Add the following expression in the required field. I've used a parameter, but you might be referencing a data field?
=code.ConvertDate(Parameters!Date.Value)
Right-click the textbox, go to Textbox Properties, click the Number tab, choose the custom format option, then click the black fx button.
A single line of code does the job in a simpler way:
A form will open. Copy the expression below and paste it there, replacing the following text with your own date field:
Fields!FieldName.Value, "Dataset"
Replace FieldName with your date field.
Replace Dataset with your dataset name.
="d" + switch(
    int(Day((Fields!FieldName.Value, "Dataset"))) mod 10 = 1, "'st'",
    int(Day((Fields!FieldName.Value, "Dataset"))) mod 10 = 2, "'nd'",
    int(Day((Fields!FieldName.Value, "Dataset"))) mod 10 = 3, "'rd'",
    true, "'th'") + " MMMM, yyyy"
I found an easy way to do it. Please see the example below:
= DAY(Globals!ExecutionTime) &
SWITCH(
    DAY(Globals!ExecutionTime) = 1 OR DAY(Globals!ExecutionTime) = 21 OR DAY(Globals!ExecutionTime) = 31, "st",
    DAY(Globals!ExecutionTime) = 2 OR DAY(Globals!ExecutionTime) = 22, "nd",
    DAY(Globals!ExecutionTime) = 3 OR DAY(Globals!ExecutionTime) = 23, "rd",
    true, "th"
)
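The suffix rule itself is independent of SSRS, so it can be checked quickly in another language. The Python sketch below is only a restatement of the rule for testing, not report code; ordinal_suffix and format_day are illustrative names. It treats 11, 12 and 13 as "th" before looking at the last digit, which is the case a plain "day mod 10" test gets wrong.

```python
def ordinal_suffix(day):
    """Return the English ordinal suffix for a day of the month."""
    if 11 <= day % 100 <= 13:  # 11th, 12th, 13th are exceptions
        return "th"
    return {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")


def format_day(day):
    """Join the day number and its suffix, e.g. 22 -> '22nd'."""
    return f"{day}{ordinal_suffix(day)}"


print([format_day(d) for d in (1, 2, 3, 4, 11, 12, 13, 21, 22, 23, 31)])
```

For days of the month the explicit lists used in the answers above (1, 21, 31 and so on) give the same results.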

Compiling individual webpage tables into a single Excel readable table

I would like to create a master list of contact information for all chiropractors in Arizona. The board website lists all the chiropractors here; however, I have to click through to see each individual address and phone number.
How can I get all of the information about each Chiropractor in to a single spreadsheet row format?
This is easy. In your first sheet, go to Data > External data > From website. Paste the URL, select the main table, click Next and put it in A1.
In the VBA Editor, paste the following macro and run it. It will retrieve all the data from the website and paste it into Sheet2 (the macro uses the French sheet names Feuil1 and Feuil2; rename them to match your workbook). The rest is just reorganizing the data, which is not the topic of your question, so I leave it up to you.
Sub ExtractAllData()
    Dim dest As Range, license As Range
    Dim license_no As String
    Worksheets("Feuil2").Select
    Set dest = Worksheets("Feuil2").Range("A1")
    Set license = Worksheets("Feuil1").Range("C3")
    Do Until license.Value = ""
        license_no = Mid(license.Value, 1, InStr(1, license.Value, " "))
        With Worksheets("Feuil2").QueryTables.Add(Connection:= _
            "URL;http://www.azchiroboard.us/ProDetail.asp?LicenseNo=" & license_no, Destination:= _
            dest)
            .Name = "ProDetail.asp?LicenseNo=" & license_no
            .FieldNames = True
            .RowNumbers = False
            .FillAdjacentFormulas = False
            .PreserveFormatting = True
            .RefreshOnFileOpen = False
            .BackgroundQuery = True
            .RefreshStyle = xlInsertDeleteCells
            .SavePassword = False
            .SaveData = True
            .AdjustColumnWidth = True
            .RefreshPeriod = 0
            .WebSelectionType = xlSpecifiedTables
            .WebFormatting = xlWebFormattingNone
            .WebTables = "1"
            .WebPreFormattedTextToColumns = True
            .WebConsecutiveDelimitersAsOne = True
            .WebSingleBlockTextImport = False
            .WebDisableDateRecognition = False
            .WebDisableRedirections = False
            .Refresh BackgroundQuery:=False
        End With
        Set dest = Range("A65535").End(xlUp)
        Set dest = dest.Offset(1, 0)
        Set license = license.Offset(1, 0)
    Loop
End Sub
For the record, it took me 1 minute to figure out how to retrieve the data from the main table, 1 minute to figure out that the link just calls an ASP page (ProDetail.asp) with the license number, 1 minute to record the macro, and then 5 minutes to adapt it and fix the errors I had made.
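The key step in the macro is turning each cell of the license column into a detail-page URL by keeping the text before the first space. A small Python sketch of that step follows; the cell text "1234 Smith" is a made-up example of what the imported column might contain, and detail_url is an illustrative name.

```python
BASE_URL = "http://www.azchiroboard.us/ProDetail.asp?LicenseNo="


def detail_url(cell_text):
    """Keep the text before the first space and append it to the base URL."""
    license_no = cell_text.split(" ", 1)[0]
    return BASE_URL + license_no


print(detail_url("1234 Smith"))  # hypothetical cell content
```

Unlike the Mid/InStr call in the macro, split drops the space itself, so nothing trailing ends up in the query string.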
