How can I target text when there is no next sibling? - web-scraping

I am trying to scrape dynamically generated pages with BeautifulSoup. Sometimes I get loose text and sometimes I don't.
How can I extract the loose text below? I tried to use next sibling, but the text is not contained in any tag.
<div class="div1">
<table class="table1"></table>
<ul></ul>
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt
</div>

You could use a CSS selector with select(), such as div.div1 ul, and read the next_sibling of each match:
html_doc = """
<div class="div1">
<table class="table1"></table>
<ul></ul>
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt
</div>
"""
from bs4 import BeautifulSoup
result_page = BeautifulSoup(html_doc, 'html.parser')
# next_sibling of the <ul> is the loose text node that follows it
for text in result_page.select("div.div1 ul"):
    print(text.next_sibling.strip())
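Since the question mentions that the loose text is only sometimes present, a slightly more defensive variant may help. This is a minimal sketch, not part of the original answer: the helper name loose_text_after is hypothetical, and the two shortened documents are made up to show both cases.

```python
from bs4 import BeautifulSoup, NavigableString


def loose_text_after(soup, selector):
    """Return the stripped loose text that directly follows each match.

    Matches with no trailing text (or only whitespace) are skipped.
    """
    results = []
    for tag in soup.select(selector):
        sibling = tag.next_sibling
        # next_sibling is None when the tag is the last node in its parent,
        # and a Tag (not text) when another element follows immediately.
        if isinstance(sibling, NavigableString):
            text = sibling.strip()
            if text:
                results.append(text)
    return results


# Hypothetical sample pages: one with loose text, one without.
with_text = BeautifulSoup(
    '<div class="div1"><table class="table1"></table><ul></ul>\nLorem ipsum\n</div>',
    "html.parser",
)
without_text = BeautifulSoup(
    '<div class="div1"><table class="table1"></table><ul></ul></div>',
    "html.parser",
)

print(loose_text_after(with_text, "div.div1 ul"))
print(loose_text_after(without_text, "div.div1 ul"))
```

Checking the sibling's type first avoids an AttributeError on pages where the <ul> is the last node in the div.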

Related

Rows are not equal sizes in bootstrap 4

I'm using Bootstrap 4 and trying to have columns which are the same height (which I thought Bootstrap did by default).
This is my markup:
<div class="row">
<div class="col-md-4">
<div class="table-heading">
Lorem ipsum dolor sit amet
</div>
<div class="table-text">
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore
</div>
</div>
<div class="col-md-4">
<div class="table-heading">
Lorem ipsum dolor sit amet, consectetur adipiscing
</div>
<div class="table-text">
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore Lorem ipsum dolor sit amet, consectetur
</div>
</div>
<div class="col-md-4">
<div class="table-heading">
Lorem ipsum dolor
</div>
<div class="table-text">
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt
</div>
</div>
</div>
And the result is this:
I'd like the purple headings to align, as well as the columns themselves.
Here is the CSS I'm using:
.table-heading {
text-align: center;
font-weight: bold;
vertical-align: middle;
color:white;
background-color: #7b2265;
padding:10px;
}
.table-text {
padding:10px;
background-color: white !important;
}
Here's a Pen: https://codepen.io/anon/pen/ZxeLpO
Edit:
Since you want both the titles and the body text to be the same height while remaining responsive, you can combine the re-ordering classes with the h-100 class (height: 100%) to achieve the desired effect:
<link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/4.0.0/css/bootstrap.min.css" integrity="sha384-Gn5384xqQ1aoWXA+058RXPxPg6fy4IWvTNh0E263XmFcJlSAwiGgFAW/dAiS6JXm" crossorigin="anonymous">
<style>
.table-heading {
text-align: center;
font-weight: bold;
vertical-align: middle;
color:white;
background-color: #7b2265;
padding:10px;
}
</style>
<div class="container bg-light p-3">
<div class="row">
<div class="col-md-4">
<div class="table-heading h-100">
Title1 Lorem ipsum dolor sit amet
</div>
</div>
<div class="col-md-4 order-md-1">
<div class="bg-white h-100 px-3">
Body1 Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore
</div>
</div>
<div class="col-md-4">
<div class="table-heading h-100">
Title2 Long, long, veeery long! Lorem ipsum dolor sit amet
</div>
</div>
<div class="col-md-4 order-md-2">
<div class="bg-white h-100 px-3">
Body2 Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore. Body2 Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore.
</div>
</div>
<div class="col-md-4">
<div class="table-heading h-100">
Title3 Short Lorem
</div>
</div>
<div class="col-md-4 order-md-3">
<div class="bg-white h-100 px-3">
Body3 Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore
</div>
</div>
</div>
</div>
Classes like order-md-1 only take effect on screens that are medium (md) or larger. Here they push the body columns into what looks like the next row. On smaller screens no re-ordering happens, so the columns appear in the same order as in the HTML.
If you add h-100 as a class to the divs that have the col-md-4 class, it will behave as you expect.
h-100 is the Bootstrap 4 utility class that sets the height to 100%, which makes every column as tall as the row it sits in.
<div class="col-md-4 h-100">
...
</div>
Update: you can also add height: 100% to the .table-text CSS class to fill the empty space.

RMarkdown word_document bullet indent

How can you indent a normal bullet list, as Word would, when rendering from Markdown? By default it is left-aligned.
e.g. instead of
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do ei
Item 1
it would become:
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do ei
tab tab tab 1. Item 1

Bootstrap grid system not working the way expected

I am new to Bootstrap and the grid system, but as far as I can tell the code I have built should do the following:
This is a brief example of what it should look like on desktop:
and mobile:
I intend to add padding etc.; that is in a JSFiddle.
I think it may have something to do with my CSS.
My CSS:
#history-img img
{
width: 300px;
height: 400px;
float: left;
margin-left: 100px;
margin-top: 35px;
border: 2px solid #000;
}
#history
{
background-color: #0088CE;
}
#history-text
{
width: 800px;
height: 400px;
border: 2px solid #000;
background-color: #FFF;
float: right;
margin-top: 35px;
margin-right: 100px;
color: #000;
text-align: left;
padding: 30px;
}
#history-text p
{
font-size: 16px;
}
As the comments have said, don't do the layout in your own CSS. At most, your CSS should be:
#history-img img {
border: 2px solid #000;
}
#history-text{
border: 2px solid #000;
background-color: #FFF;
color: #000;
}
#history-text p{
font-size: 16px;
}
Here's a more Bootstrap-specific layout:
<section class="success" id="history">
<div class="container">
<div class="row">
<div class="col-lg-12 text-center">
<h2>A brief History of Doosan Babcock</h2>
</div>
</div>
<div class="row">
<div id="history-img" class="col-xs-4 col-lg-4">
<img src="img/1920/babcock_team_1920_sm.jpg" alt="babcock team 1920"/>
</div>
<div class="col-xs-6 col-sm-8 col-lg-8" id="history-text">
<p> Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et. Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et.</p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et. Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et.</p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et. Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et.</p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et.</p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et. Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et.</p>
</div>
</div>
</div>
</section>
Bootply http://www.bootply.com/HORwI8RjP3
The col- classes assign a width for you. I made a new fiddle which should help you get started:
https://jsfiddle.net/5z6ojmm1/2/
What you want is one .row inside a .container with two col-md-6 or col-lg-6 columns. This means that the content will be split 50/50:
<div class="row">
<div class="col-md-6">
<!-- image -->
</div>
<div class="col-md-6">
<!-- text -->
</div>
</div>
Collapse on mobile
If you want the columns to stack underneath each other on mobile, you can specify that as follows:
<div class="row">
<div class="col-xs-12 col-md-6">
<!-- image -->
</div>
<div class="col-xs-12 col-md-6">
<!-- text -->
</div>
</div>
Note the col-xs-12 and col-md-6 classes: on extra-small (xs) devices each column spans all 12 grid columns (100% width), so the two stack, while on medium (md) screens and up the row splits into two columns of 6 (50% each).
The grid system is explained here:
http://getbootstrap.com/css/#grid-options
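As a rough illustration of the arithmetic behind those classes, the sketch below computes a column's width as a share of the 12-column row. It assumes the Bootstrap 3 breakpoints (sm at 768px, md at 992px, lg at 1200px); the function name column_width_percent is hypothetical and not part of Bootstrap.

```python
# Bootstrap 3 minimum viewport widths (px) for each grid tier.
BREAKPOINTS = [("xs", 0), ("sm", 768), ("md", 992), ("lg", 1200)]
GRID_COLUMNS = 12


def column_width_percent(classes, viewport_px):
    """Return the width (% of the row) of a column with the given col-* classes.

    The span from the largest tier whose breakpoint fits the viewport wins;
    with no matching class the column is a full-width block (100%).
    """
    spans = {}
    for cls in classes.split():
        parts = cls.split("-")
        if len(parts) == 3 and parts[0] == "col" and parts[2].isdigit():
            spans[parts[1]] = int(parts[2])
    span = GRID_COLUMNS  # default: stacked, full width
    for tier, min_width in BREAKPOINTS:
        if viewport_px >= min_width and tier in spans:
            span = spans[tier]
    return 100.0 * span / GRID_COLUMNS


print(column_width_percent("col-xs-12 col-md-6", 375))   # phone-sized viewport
print(column_width_percent("col-xs-12 col-md-6", 1024))  # desktop-sized viewport
```

The same reasoning explains why col-md-6 on its own already stacks on small screens: below the md breakpoint no class applies, so the column falls back to full width.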

How can I target a specific group of siblings in a flat hierarchy?

Assume you have this HTML:
<h1>Header 1</h1>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</p>
<h1>Header 1</h1>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</p>
<h1>Header 1</h1>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</p>
<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit.</p>
Note that the hierarchy is flat.
Now try to select the "middle pair" of <p> elements. Is this possible? I really can't figure out how.
This selector only grabs the first <p> following the <h1>:
h1:nth-of-type(2) + p
But this selector grabs the correct pair of <p> elements plus all the following <p> elements that appear after the pair we want:
h1:nth-of-type(2) ~ p
Is it possible?
No JavaScript. No markup changing. Generic solution. Any number of <h1>s or <p>s are allowed, and the number two, in this case, is arbitrary.
I'm thinking maybe this is possible using the :not() selector, but I can't seem to figure it out. Kind of like selecting the "general siblings" and then excluding as necessary, or something similar.
Due to the way the general sibling combinator works, it is not possible to limit a selection of siblings to a specific range or group, even of consecutive siblings. Even the :not() selector won't be of any help here.
You will have to use JavaScript to target the right elements. jQuery's .nextUntil() method immediately springs to mind.
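If the elements are being selected in a scraper rather than styled, the same "next until" idea can be sketched in Python with BeautifulSoup. The next_until helper below is hypothetical (it is not a BeautifulSoup API), and the shortened markup is made up for the example.

```python
from bs4 import BeautifulSoup, Tag


def next_until(start, stop_name):
    """Collect the element siblings after `start` up to the next `stop_name` tag."""
    group = []
    for sibling in start.next_siblings:
        if not isinstance(sibling, Tag):
            continue  # skip whitespace text nodes between elements
        if sibling.name == stop_name:
            break
        group.append(sibling)
    return group


# Hypothetical flat markup, mirroring the structure in the question.
html = """
<h1>Header 1</h1><p>a1</p><p>a2</p>
<h1>Header 2</h1><p>b1</p><p>b2</p>
<h1>Header 3</h1><p>c1</p><p>c2</p>
"""
soup = BeautifulSoup(html, "html.parser")
second_h1 = soup.find_all("h1")[1]
middle_pair = [p.get_text() for p in next_until(second_h1, "h1")]
print(middle_pair)
```

Because the loop stops at the next heading, it works for any number of <h1> or <p> elements, and the index 1 can be swapped for any other heading.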

css issue with float content

I need to move up the fourth column right beneath the first column. Is there any possibility to do this in css?
<body>
<div class="fleft">1 Lorem ipsum dolor sit amet</div>
<div class="fleft">2 Lorem ipsum dolor sit amet, consectetur adipisicing elit</div>
<div class="fleft">3 Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.</div>
<div class="fleft">4 Lorem ipsum dolor sit amet, consectetur.</div>
<div class="fleft">5 Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.</div>
</body>
CSS
body { width:600px;}
.fleft{ float:left; width:200px;}
Please refer to the fiddle:
http://jsfiddle.net/9xbGC/
Given your original structure, no, there isn't (not with CSS alone and without dirty hacks). The fourth column has to obey the height of the third one.
Perhaps this jQuery Plugin can help you:
http://masonry.desandro.com/
The only way you can do it with pure CSS is to have 3 columns and split your content up between them.
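If the markup is generated server-side, splitting the content between three columns could look like the sketch below. It is only an illustration in Python: the helper name split_into_columns is hypothetical, and a round-robin order is assumed so that item 4 lands directly under item 1.

```python
def split_into_columns(items, column_count=3):
    """Distribute items round-robin into `column_count` lists, one per column."""
    columns = [[] for _ in range(column_count)]
    for index, item in enumerate(items):
        columns[index % column_count].append(item)
    return columns


# The five blocks from the question, identified by their leading number.
blocks = [1, 2, 3, 4, 5]
print(split_into_columns(blocks))
```

Each inner list would then be rendered inside its own floated wrapper div, so the height of block 3 no longer affects where block 4 sits.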
You can do this using position: absolute and setting top and left for each div, but you should do it with JavaScript if you don't know the height of each one.
Or you can use a jQuery plugin: http://www.wookmark.com/jquery-plugin
No, only if you could rely on a specific height of the first div.
#div4{
position: absolute;
margin-top: 20px;
}
Not with floating, since float only adjusts the flow of elements, not their order.
You could use absolute/relative positioning or JavaScript.
Specify the widths of the divs so that the 4th div fits between the first and third divs; then, when you float the 4th column left, it should sit in the right place.
Edit:
You can also use the grid system that comes with Twitter Bootstrap. It's very useful and can sort out a lot of layout issues.
http://twitter.github.com/bootstrap/
