Remove a child node by class name with rvest - r

I'm scraping a forum and extracting the post nodes, getting something like this:
nodes = page %>% html_nodes('.mypost')
nodes[[1]]
<div class="mypost" itemprop="text">
<div class="bbcode_container">
<div class="bbcode_quote">
<div class="quote_container">
<div class="bbcode_quote_container b-icon b-icon__ldquo-l--gray"></div>
<div class="bbcode_postedby">
Originally posted by <strong>Mike</strong>
</div>
<div class="message">
This is great news. Can you elaborate on what it means? \
</div>
</div>
</div>
</div>
I copied this from another web site. So I'm not sure...
</div>
I want to get all the text within the posts (in this case for node 1 the "I copied this...") but remove everything that is within the div class="bbcode_container".
Is there a way to remove children based on the class name? It's possible my node might have other div children with other names, and the position of bbcode_container is not fixed (could be anywhere, not at all, or appear multiple times so an xpath approach seems tricky at best).
I've seen there's a way to negate within rvest but I'm certain I'm doing it wrong:
nodes %>% html_nodes(':not(.bbcode_container)') %>% html_text()

Related

BeautifulSoup find class contains some specific words

I have searched for a way to find elements whose class name contains a given word, but I have not found one. I want to extract the information from every class whose name contains the word footer.
<div class="footerinfo">
<span class="footerinfo__header">
</span>
</div>
<div class="footer">
<div class="w-container container-footer">
</div>
</div>
I have tried this, but it still doesn't work:
soup.find_all('div',class_='^footer^'):
and
soup.find_all('div',class_='footer*'):
Does anyone have any idea on doing this?
You can use CSS selectors, which let you select elements based on the content of particular attributes. This includes the *= operator, which matches when the attribute value contains the given substring.
for ele in soup.select('div[class*="footer"]'):
    print(ele)
Or use a regular expression:
import re
regex = re.compile('.*footer.*')
soup.find_all("div", {"class" : regex})
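For readers who want to see what the "contains" match does without installing BeautifulSoup, here is a minimal sketch using only Python's standard html.parser module. It is not how BeautifulSoup works internally; the class name ClassSubstringFinder, the helper find_div_classes, and the extra header div in SAMPLE are made up for this illustration.

```python
from html.parser import HTMLParser


class ClassSubstringFinder(HTMLParser):
    """Collect the class attribute of every <div> whose class contains a substring."""

    def __init__(self, needle):
        super().__init__()
        self.needle = needle
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # Only <div> elements are considered, mirroring div[class*="footer"].
        if tag != "div":
            return
        # attrs is a list of (name, value) pairs; value may be None.
        class_value = dict(attrs).get("class") or ""
        # Plain substring test against the whole attribute value.
        if self.needle in class_value:
            self.matches.append(class_value)


def find_div_classes(html, needle):
    """Hypothetical helper: return class values of matching <div> elements."""
    finder = ClassSubstringFinder(needle)
    finder.feed(html)
    finder.close()
    return finder.matches


# Markup from the question, plus one non-matching div added for contrast.
SAMPLE = """
<div class="footerinfo"><span class="footerinfo__header"></span></div>
<div class="footer"><div class="w-container container-footer"></div></div>
<div class="header"></div>
"""

if __name__ == "__main__":
    for value in find_div_classes(SAMPLE, "footer"):
        print(value)
```

The span is skipped because only div elements are inspected, and container-footer is found because the test is a substring match on the full attribute value rather than an exact class match.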

grid__item sizing simply not working?

I'm trying to do something fairly simple that I've done many times and I have no idea why it isn't working.
The following is simplified code for what I'm trying to do:
<div class="grid grid--uniform new-header">
<div class="grid__item small-up--one-third">
some stuff
</div>
<div class="grid__item small-up--one-third">
some stuff
</div>
<div class="grid__item small-up--one-third">
some stuff
</div>
</div>
No matter what I put as the second class after grid__item, I cannot get it to become any fraction of the size of the page. Currently, all divs are full-width regardless of me directing them to be one third.
Any ideas?
This is more of a CSS question than a Liquid one. Correct me if I am wrong, but I think your code relates mostly to Timber.
Timber is mobile first, so you don't need small-up. What you want is more like small--one-third. If the grid item is always going to be one third wide, you can simply use: grid__item one-third

how to find complex css selector

I want to enter text into a field using Selenium in the Firefox browser.
How do I find a selector for the text input? The markup is very complex for me. Please explain; I want to achieve this using only a CSS selector.
<div class="Gb WK">
<div class="Rd" guidedhelpid="sharebox_editor">
<div class="eg">
<div class="yw oo">
<div class="yw vk">
</div>
<div class="URaP8 Kf Pf b-K b-K-Xb">
<div id="195" class="pq">
Share what's new...
</div>
<div id=":37.f" class="df b-K b-K-Xb URaP8 editable" contenteditable="true"
g_editable="true" role="textbox" aria-labelledby="195"></div>
</div>
</div>
</div>
</div>
Your markup already contains everything needed for the CSS selector; here is how it works. A CSS selector can match on a single attribute or on several. If no single attribute is unique, keep adding attributes to the selector until it is.
Single attribute:
[role='textbox']
Multiple attributes:
[role='textbox'][contenteditable='true']
You can also prefix the tag name to restrict the search to div elements:
div[role='textbox'][contenteditable='true']
Note that without the div prefix, the selector matches elements of any tag.
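To make the narrowing effect of stacked attribute selectors concrete, here is a small sketch using only Python's standard html.parser module. It imitates the matching rule (every listed attribute must match, and the tag is optional) and is not Selenium's API. The names AttributeMatcher and match_ids are hypothetical, and the span element in SAMPLE is an invented addition that is not part of the original page.

```python
from html.parser import HTMLParser


class AttributeMatcher(HTMLParser):
    """Record the id of each element whose attributes include every required pair."""

    def __init__(self, required, tag=None):
        super().__init__()
        self.required = required  # e.g. {"role": "textbox"}
        self.tag = tag            # None means any tag, like a selector without "div"
        self.matched_ids = []

    def handle_starttag(self, tag, attrs):
        # Skip elements of the wrong tag when a tag was requested.
        if self.tag is not None and tag != self.tag:
            return
        attr_map = dict(attrs)
        # All required attribute/value pairs must be present (logical AND).
        if all(attr_map.get(k) == v for k, v in self.required.items()):
            self.matched_ids.append(attr_map.get("id"))


def match_ids(html, required, tag=None):
    """Hypothetical helper: return ids of elements matching all attributes."""
    matcher = AttributeMatcher(required, tag)
    matcher.feed(html)
    matcher.close()
    return matcher.matched_ids


# Two elements from the question plus an invented <span> for contrast.
SAMPLE = """
<div id="195" class="pq">Share what's new...</div>
<div id=":37.f" class="editable" contenteditable="true" role="textbox"></div>
<span id="other" role="textbox"></span>
"""

if __name__ == "__main__":
    print(match_ids(SAMPLE, {"role": "textbox"}))
    print(match_ids(SAMPLE, {"role": "textbox", "contenteditable": "true"}))
    print(match_ids(SAMPLE, {"role": "textbox"}, tag="div"))
```

With only role required, both the div and the invented span match; adding contenteditable or the div tag leaves just the editable div, which is the element the selectors above are meant to reach.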

Creating Multiple Columns in Hyde Theme for Jekyll

I am creating an About Us page in Jekyll. I am using the Hyde theme, which is based on Poole.
I am trying to add headshots of the people involved, each with a basic profile. The issue is that I cannot arrange them in two columns. I also do not know whether doing so will affect the responsiveness of the website, since I have not yet succeeded in getting two columns side by side.
I looked at the source code of Poole's website, which uses a CSS class called themes for this. I tried that approach but did not get a two-column layout.
How can I keep my site responsive to mobile layout and still create a two column layout for content?
EDIT: Code for the creation of multiple columns.
<div class = "themes"> // class to create columns
<div class="circle" style="background-image: url('/public/nitesh.jpg')">
</div>
</div>
<p> Mr. Nitesh Pandey<br> <nitesh.osf#gmail.com> </p>
<div class = "themes"> // class to create another column beside the one above.
<div class="circle" style="background-image: url('/public/parth.jpg')">
</div>
</div>
<p> Mr. Parth Bolke </br> <parth.osf#gmail.com>

How do I customize the CSS of just my FIRST post on archive/home pages?

I've done a lot of research before asking this question. I already know how to use if/else and conditional tags to apply code only to certain pages, but I couldn't find a single guide or answered question about styling only the first (most recent) post on my Blogger blog.
The closest I got to finding the solution (other than codes that I didn't have the skill to implement), was this one: http://helplogger.blogspot.ro/2014/01/create-magazine-style-layout-for-blogger-posts.html
Sample site from that tutorial: http://helploggertestblog.blogspot.com/
The problem with the above script is that it was made to be too automated, and I don't need a post summary or thumbnail for my other posts. I'm only trying to change the look of the first post. I love that the first post's width was increased and that it was given a border, a different color, and so on.
Does anyone have any ideas on how I might isolate what I'm looking for, point me towards the right direction, or even hand me a general container so that I can get on with my life?
Without seeing any of your code it's kinda guessing, but I'll give it a try anyway.
I'm guessing that you have the posts in a div or other parent element. Something like this:
<div class="container">
<div>
<h3>Title</h3>
<p>Content of the post ... </p>
</div>
<div>
<h3>Title</h3>
<p>Content of the post ... </p>
</div>
<div>
<h3>Title</h3>
<p>Content of the post ... </p>
</div>
</div>
To style only the first div inside the container, you can use:
.container > div:first-child {
/* your specific style here */
}
With your code, it would be easier to help...
EDIT
Use:
.blog-posts > .post-outer:first-child {
background: green;
}
There is also a conditional tag available for the first post:
<b:if cond='data:post.isFirstPost'>
Your custom css only for the 1st post here
</b:if>
