I have searched for how to find elements whose class name contains a given word, but I haven't found an answer. I want to extract information from the elements whose class name contains the word footer.
<div class="footerinfo">
<span class="footerinfo__header">
</span>
</div>
<div class="footer">
<div class="w-container container-footer">
</div>
</div>
I have tried these, but neither works:
soup.find_all('div', class_='^footer^')
and
soup.find_all('div', class_='footer*')
Does anyone have any idea on doing this?
You can use CSS selectors, which let you select elements based on the content of particular attributes. This includes the *= operator, which means "contains".
for ele in soup.select('div[class*="footer"]'):
    print(ele)
Or use a regular expression:
import re
regex = re.compile('.*footer.*')
soup.find_all("div", {"class" : regex})
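To see why the two attempts in the question fail while the compiled pattern works, here is a small sketch that uses only Python's re module. It relies on two behaviors described in the Beautiful Soup documentation: a plain string passed as class_ is compared literally against each class, and a compiled pattern is applied with its search() method. The class list is taken from the question's markup; the variable names are illustrative.

```python
import re

# Class names that appear on the div elements in the question's markup.
classes = ["footerinfo", "footer", "w-container", "container-footer"]

# A plain string is compared literally, so '^footer^' and 'footer*'
# only match a class spelled exactly that way.
literal_hits = [c for c in classes if c in ("^footer^", "footer*")]

# A compiled pattern is searched for anywhere in the class name,
# so 'footer' behaves the same as '.*footer.*'.
contains = re.compile("footer")
contains_hits = [c for c in classes if contains.search(c)]

# Anchor the pattern if only classes that start with "footer" are wanted.
starts_with = re.compile(r"^footer")
starts_hits = [c for c in classes if starts_with.search(c)]

print(literal_hits)
print(contains_hits)
print(starts_hits)
```

Note that the contains match also picks up container-footer, so the nested div would be returned as well. Passing the anchored pattern to find_all should avoid that.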
For a scraper, I want to get a list of all elements on a page that do not contain a certain child element. The DOM looks something like this:
<scrape>
<div id='123'>
<span>test</span>
</div>
</scrape>
<scrape>
<div id='1234'>
<span>test</span>
</div>
</scrape>
<scrape>
<div id='12345'>
<span>test</span>
<span>don't include</span>
</div>
</scrape>
I need a list of all scrape elements that do not contain a span with the text don't include.
Any ideas?
Thanks!
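This should work:
//scrape[not(.//span[text()="don't include"])]
Literally:
All elements with the tag name scrape that have no descendant span element whose text is exactly don't include. The text is wrapped in double quotes because it contains an apostrophe, which would otherwise end a single-quoted XPath string early.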
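The same filter can be sketched in plain Python, which may help when checking what the XPath should return. This assumes the scrape elements are wrapped in a single root element (called page here, a hypothetical name) so that the sample parses as XML. The standard library's ElementTree does not support the XPath not() function, so the exclusion is done in Python instead of inside the expression.

```python
import xml.etree.ElementTree as ET

# Sample from the question, wrapped in a hypothetical <page> root.
doc = """
<page>
  <scrape><div id='123'><span>test</span></div></scrape>
  <scrape><div id='1234'><span>test</span></div></scrape>
  <scrape><div id='12345'><span>test</span><span>don't include</span></div></scrape>
</page>
"""

root = ET.fromstring(doc)


def has_excluded_span(element, text="don't include"):
    """Return True if any descendant span has exactly the given text."""
    return any(span.text == text for span in element.iter("span"))


# Keep every scrape element with no matching descendant span.
kept = [s for s in root.iter("scrape") if not has_excluded_span(s)]
kept_ids = [s.find("div").get("id") for s in kept]
print(kept_ids)
```

With the sample data, only the first two scrape elements are kept, which is what the XPath expression is meant to select.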
How can I get the text "Quotes to Scrape" from the following element by class name, using Scrapy in Python?
<div class="col-md-8">
<h1>
Quotes to Scrape
</h1>
</div>
Thanks for your time.
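Here is a reasonable list of selectors for both CSS and XPath.
The h1 element has no class of its own, but you can get the text like this:
response.css('h1 a::text').get()
This selector assumes the heading text is wrapped in an a element inside the h1. For the markup exactly as shown in the question, where the text sits directly in the h1, the selector would be h1::text.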
I'm scraping a forum and extracting the post nodes, getting something like this:
nodes = page %>% html_nodes('.mypost')
nodes[[1]]
<div class="mypost" itemprop="text">
<div class="bbcode_container">
<div class="bbcode_quote">
<div class="quote_container">
<div class="bbcode_quote_container b-icon b-icon__ldquo-l--gray"></div>
<div class="bbcode_postedby">
Originally posted by <strong>Mike</strong>
</div>
<div class="message">
This is great news. Can you elaborate on what it means?
</div>
</div>
</div>
</div>
I copied this from another web site. So I'm not sure...
</div>
I want to get all the text within the posts (in this case for node 1 the "I copied this...") but remove everything that is within the div class="bbcode_container".
Is there a way to remove children based on the class name? It's possible my node might have other div children with other names, and the position of bbcode_container is not fixed (could be anywhere, not at all, or appear multiple times so an xpath approach seems tricky at best).
I've seen there's a way to negate within rvest but I'm certain I'm doing it wrong:
nodes %>% html_nodes(':not(.bbcode_container)') %>% html_text()
I have a loop displaying some markup that has dynamic class names. Is it possible to hide all elements with a duplicate class name except the first instance? For example, below I would want only the first .group-SomethingDynamic1 and the first .group-SomethingDynamic2 to be visible.
I think I might be able to use the div[class^="group"] "starts with" attribute selector to achieve this but am I able to match dynamic text after that and filter out the duplicates? I would prefer a CSS only solution if possible.
<div class="group-SomethingDynamic1">
<div class="group-SomethingDynamic1">
<div class="group-SomethingDynamic1">
<div class="group-SomethingDynamic1">
<div class="group-SomethingDynamic2">
<div class="group-SomethingDynamic2">
<div class="group-SomethingDynamic2">
<div class="group-SomethingDynamic2">
Update (credit #Temani Afif)
If you want a CSS only solution, you will need to know the classes to filter beforehand.
Given that, you can use the general sibling combinator (~), like the following:
.group-SomethingDynamic1 ~ .group-SomethingDynamic1 {
  display: none;
}
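If the class names are known to the code that renders the loop, one rule per class could be generated instead of written by hand. The sketch below is an assumption about how that might look, not part of the original answer: hide_duplicates_css is a hypothetical helper, and it only builds the stylesheet text.

```python
def hide_duplicates_css(class_names):
    """Build one 'hide later siblings' rule per distinct class name."""
    # dict.fromkeys drops repeats while keeping first-seen order.
    distinct = list(dict.fromkeys(class_names))
    rules = [f".{name} ~ .{name} {{\n  display: none;\n}}" for name in distinct]
    return "\n".join(rules)


css = hide_duplicates_css([
    "group-SomethingDynamic1",
    "group-SomethingDynamic1",
    "group-SomethingDynamic2",
])
print(css)
```

The ~ combinator only matches later siblings under the same parent, so this relies on the repeated elements sharing a parent.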
Here is a stackblitz example
I want to enter text into a field using Selenium with the Firefox browser.
How do I find a selector for the text input? The markup is very complex for me, so please explain. I want to achieve this using only a CSS selector.
<div class="Gb WK">
<div class="Rd" guidedhelpid="sharebox_editor">
<div class="eg">
<div class="yw oo">
<div class="yw vk">
</div>
<div class="URaP8 Kf Pf b-K b-K-Xb">
<div id="195" class="pq">
Share what's new...
</div>
<div id=":37.f" class="df b-K b-K-Xb URaP8 editable" contenteditable="true"
g_editable="true" role="textbox" aria-labelledby="195"></div>
</div>
</div>
</div>
</div>
You have already written most of the CSS selector in your markup, but I will explain it. A CSS selector can match on a single attribute or on several attributes at once. If no single attribute is unique, you can keep adding attributes to the selector until it is.
Single attribute
[role='textbox']
Multiple attributes
[role='textbox'][contenteditable='true']
If you want to add the div tag name to narrow the search, that's possible too
div[role='textbox'][contenteditable='true']
Note that without div, the selector matches elements of any tag.