Scraping - find the names of all sub-classes - web-scraping

I'm trying to find a way to get the number of sub-classes contained in a root class, along with their names.
For example, for the class 'o-container__left u-mt-lg' I would like to get back:
class= "c-site__container"
class= "c-site__container"
class= "c-site__container c-site__container__last"
I'm working with BeautifulSoup. I found this, but it doesn't really do what I expected:
soup.div["class"]
Thank you for your help!

from bs4 import BeautifulSoup
data = """
<main class="o-page-content" role="main">
<section class="o-container">
<div class="o-container__left u-mt-lg">
<div class="c-site__container "></div>
<div class="c-site__container "></div>
<div class="c-site__container c-site__container__last"></div>
</div>
</section>
</main>
"""
soup = BeautifulSoup(data, 'html.parser')
# Locate the root div, then visit only its direct child tags
# (this skips the whitespace text nodes between them).
root = soup.find('div', attrs={'class': 'o-container__left u-mt-lg'})
for item in root.find_all(recursive=False):
    print(item)
Next time, please post the HTML as text rather than as an image.
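If what you need is the count and the class names themselves rather than the whole tags, a minimal sketch along the same lines could look like this. It assumes BeautifulSoup (bs4) is installed; `child_class_names` is a hypothetical helper name, not part of the library.

```python
from bs4 import BeautifulSoup

# Same structure as the sample HTML in the question.
html = """
<div class="o-container__left u-mt-lg">
<div class="c-site__container "></div>
<div class="c-site__container "></div>
<div class="c-site__container c-site__container__last"></div>
</div>
"""

def child_class_names(markup, root_class):
    """Return one class string per direct child tag of the root element.

    `child_class_names` is a hypothetical helper, not a BeautifulSoup API.
    """
    soup = BeautifulSoup(markup, "html.parser")
    root = soup.find("div", attrs={"class": root_class})
    if root is None:
        return []
    # recursive=False keeps the search to direct children and skips the
    # whitespace text nodes between them.
    children = root.find_all(recursive=False)
    # Tag.get("class") returns a list of class tokens (or None if absent).
    return [" ".join(child.get("class") or []) for child in children]

names = child_class_names(html, "o-container__left u-mt-lg")
print(len(names))
for name in names:
    print(name)
```

Because the class attribute is returned as a list of tokens, the trailing space in `class="c-site__container "` disappears when the tokens are joined back together.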

Related

Is it possible to get an empty string in a list when there is no element, using CSS selector?

I want to scrape some items, which are on the same page, using Scrapy.
HTML looks like this:
<div class="container" id="1">
<span class="title">
product-title1
</span>
<div class="description">
product-desc
</div>
<div class="price">
1.0
</div>
</div>
I need to extract the name, description and price.
Unfortunately, some products don't have a description, and the HTML then looks like this:
<div class="container" id="2">
<span class="title">
product-title2
</span>
<div class="price">
2.0
</div>
</div>
Currently I am using CSS selectors, which return a list of all matching elements on the page:
title = response.css('span[class="title"]').extract()
['product-title1', 'product-title2', 'product-title3']
description = response.css('div[class="description"]').extract()
['desc1','desc3']
price = response.css('div[class="price"]').extract()
['1.0','2.0','3.0']
Is it possible, using CSS selectors, to get for example an empty string in place of the missing 'desc2' when the description element isn't there?
I recommend rewriting your code so that each container is queried on its own:
for section in response.xpath('//div[@class="container"]'):
    title = section.xpath('./span[@class="title"]/text()').get(default='not-found')  # any default value works here, including an empty string
    description = section.xpath('./div[@class="description"]/text()').get(default='')
    price = section.xpath('./div[@class="price"]/text()').get()
Check this out:
for section in response.xpath('//div[@class="container"]'):
    title = section.xpath('./span[@class="title"]/text()').get()
    # Use a relative path ('./'); '//' would search the whole page, not just this container.
    description_tag = section.xpath("./div[contains(@class,'description')]")
    if description_tag:
        description = section.xpath('./div[@class="description"]').get()
    else:
        description = "String"
    price = section.xpath('./div[@class="price"]/text()').get()
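To see why looping over each container keeps the three fields aligned, here is a small runnable sketch. Note the swap: it uses the standard library's `xml.etree.ElementTree` instead of Scrapy's selectors, purely so it runs without a spider. The `<root>` wrapper, the `PAGE` constant and the `extract_products` helper are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Two products modelled on the question's HTML; the second has no description.
# The wrapping <root> element is an assumption added so the snippet parses as XML.
PAGE = """
<root>
<div class="container" id="1">
<span class="title">product-title1</span>
<div class="description">product-desc</div>
<div class="price">1.0</div>
</div>
<div class="container" id="2">
<span class="title">product-title2</span>
<div class="price">2.0</div>
</div>
</root>
"""

def extract_products(page):
    """Query each container on its own so missing fields stay aligned."""
    products = []
    for section in ET.fromstring(page).findall(".//div[@class='container']"):
        products.append({
            # findtext returns the default when the element is absent.
            "title": section.findtext("./span[@class='title']", default="").strip(),
            "description": section.findtext("./div[@class='description']", default="").strip(),
            "price": section.findtext("./div[@class='price']", default="").strip(),
        })
    return products

products = extract_products(PAGE)
for product in products:
    print(product)
```

The second product ends up with an empty description instead of shifting the later values up, which is what the page-wide lists in the question cannot do.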

How do I pass data up from the child page to the page calling @Body?

So I have a MainLayout with a title bar, and I want a parameter that allows the page to set the title bar to whatever it wants. MainLayout renders the page through @Body, and I'm confused about how I would pass data up through @Body to MainLayout to update the title bar.
Any help would be greatly appreciated!
<div class="sidebar">
    <NavMenu />
</div>
<div class="main">
    <div class="top-row px-4">
        <h3 bind="@TitleValue">@(TitleValue)</h3>
        <a class="ml-md-auto">@ADService.LoggedUser().DisplayName (@*@(ADService.LoggedUser().IsMemberOf())*@Admin)</a>
    </div>
    <CascadingValue Value="@TitleValue" Name="TitleValue">
        <div class="content px-4">
            @Body
        </div>
    </CascadingValue>
</div>
<Footer />
@functions {
    string TitleValue = "Inventory";
}
So what I want to do is pass the TitleValue down, have the page update it depending on what is happening, and have the title bar update with the new value.
If this isn't the way to do it, or I'm missing something, any help would be great :)
I guess the following, which is not mine, can help you:
Create a class which holds your data. Register it as singleton
service. Inject it into layout and into page. You should probably add
a notification mechanism to inform all your components that something
was changed in your data class.
You can look here...
Source and more...
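The pattern described in that quote (a shared data class plus a notification mechanism) can be sketched in a few lines. This is only an illustration of the idea, written in Python to keep it short and runnable; in a Blazor app the same shape would be a C# class registered as a singleton and injected into both the layout and the page. The names `TitleState`, `subscribe` and `set_title` are hypothetical.

```python
class TitleState:
    """Holds the shared title and notifies subscribers when it changes.

    Hypothetical stand-in for the data class the answer describes.
    """

    def __init__(self, title="Inventory"):
        self._title = title
        self._listeners = []

    @property
    def title(self):
        return self._title

    def subscribe(self, listener):
        """Register a callback that receives the new title on every change."""
        self._listeners.append(listener)

    def set_title(self, title):
        if title == self._title:
            return  # nothing changed, so nobody is notified
        self._title = title
        for listener in self._listeners:
            listener(title)


# One shared instance plays the role of the singleton registration.
state = TitleState()

# The "layout" subscribes so it can re-render its title bar on change.
rendered = []
state.subscribe(lambda title: rendered.append(f"<h3>{title}</h3>"))

# A "page" holding the same instance updates the title.
state.set_title("Orders")
print(rendered)
```

The page never talks to the layout directly; both only know about the shared state object, which is what lets data flow "up" past @Body.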

Unable to load the data from other controller

I have a controller whose view is composed of many partial views. I want to use one particular section of that view in another controller's view. I can see the layout, but the items don't load in it.
Let's say I have a controller named "Products", and the Product view folder contains _items.cshtml. I want to use this _items.cshtml in another controller called "placeTheOrder".
In a particular div in the placeTheOrder view I referenced @Html.Partial("~/Views/Products/_items.cshtml"). Even so, the content from _items.cshtml is not loaded into placeTheOrder.
What am I doing wrong?
_items.cshtml view
<div id="accordionProduct" class="span-6 last prod-acc">
@Html.Partial("~/Views/product/_ezCpSearchBar.cshtml")
<div id="filterPanel" class="span-6 last filter-panel">
<div class="span-6 last">
<div class="filter-panel-head">
<h1>Filter</h1>
</div>
</div>
<div class="span-6 last">
<div class="filter-panel-body">
<div class="filter-panel-prop"></div>
</div>
</div>
</div>
<div id="accordionProductInner" class="span-6 last prod-acc-body">
</div>
</div>
// this is the script area where the content gets loaded into the view
<div id="TemplateFilterItem" class="hidden"></div>
<div id="TemplateLastViewItem" class="hidden"></div>
This is the place where I have referred to it in another controller's view:
<div class="span-6" style="background-color:#d4d4d4;padding:20px;">
@Html.Partial("~/Views/product/_items.cshtml")
</div>
You can use Html.RenderPartial in your view instead:
@{ Html.RenderPartial("~/Views/ControllerName/ViewName.cshtml", ModelData); }
If the two views use different models, you need to create a ViewModel that includes the required properties.
Hope this helps.

How to make your views DRY with MVC5 and multiple page breakpoints?

I have a predicament that I am not quite sure how to overcome. I do not know what is the right way. I am building a website and I was given a template to integrate with my server code. The problem lies in how the template is outlined. Let me show you an example.
<body>
<div class="breakpoint active" id="bp_infinity" data-min-width="588">
<div id="header">full page header content</div>
<div id="body">some stuff</div>
<div id="footer">some stuff</div>
</div>
<div class="breakpoint" id="bp_587" data-min-width="493" data-max-width="587">
<div id="header">mobile header content</div>
<div id="body">some stuff</div>
<div id="footer">some stuff</div>
</div>
<div class="breakpoint" id="bp_492" data-max-width="492">
<div id="header">mobile header content</div>
<div id="body">some stuff</div>
<div id="footer">some stuff</div>
</div>
</body>
I am trying to set up my MVC5 views in a way that does not repeat common code. The problem I am facing is that the header and footer divs are common code from page to page, while the body changes. The second problem is that each page has a different number of breakpoints. Here is a second page to show what I mean:
<body>
<div class="breakpoint active" id="bp_infinity" data-min-width="588">
<div id="header">full page header content</div>
<div id="body">some stuff</div>
<div id="footer">some stuff</div>
</div>
<div class="breakpoint" id="bp_587" data-max-width="587">
<div id="header">mobile header content</div>
<div id="body">some stuff</div>
<div id="footer">some stuff</div>
</div>
</body>
So the Layout page is now tricky to set up because I can't just say:
<body>
@RenderBody()
</body>
One of the solutions I thought of was to use Sections, something like this:
<body>
@RenderBody()
@RenderSection("Breakpoint-1", false)
@RenderSection("Breakpoint-2", false)
@RenderSection("Breakpoint-3", false)
</body>
Now each page would be along the lines of:
@section Breakpoint-1
{
    <div class="breakpoint active" id="bp_infinity" data-min-width="588">
        @{ Html.RenderPartial("full-page-header"); }
        @{ Html.RenderPartial("full-page-body"); }
        @{ Html.RenderPartial("full-page-footer"); }
    </div>
}
@section Breakpoint-2
{
    <div class="breakpoint" id="bp_587" data-max-width="587">
        @{ Html.RenderPartial("mobile-page-header"); }
        @{ Html.RenderPartial("mobile-page-body"); }
        @{ Html.RenderPartial("mobile-page-footer"); }
    </div>
}
A problem that I see with the above code is that if the header now needs to have 5 breakpoints instead of 2, I need to go and modify it everywhere.
Is there a better way to do this? Is what I thought of the best solution for my scenario?
EDIT: To clarify: there are multiple breakpoints in the HTML because only one of them is active at a time. When the page hits a certain width, the currently active breakpoint gets hidden and the new one becomes visible.
Assumptions
... are the mother of all....
"some stuff" that goes in the body tag is HTML being fed from some data source, or is hard-coded
"...the header and footer div are common code from page to page..." means that literally, you don't need to change the header/footer at all. (You still could, but I'm ignoring that for now)
The div id's "header", "body", "footer" should be handled as dom classes rather than dom ids. That is another discussion, but ids should always be unique.
Solution
This is a basic example; there are plenty of other approaches to try and plenty of other tweaks you can make.
Controller
Let's call this BreakpointController
public ActionResult Index()
{
    var model = new List<BreakpointViewModel>();
    // populate model
    return View(model);
}
ViewModel
public class BreakpointViewModel
{
    public string BreakPointId { get; set; }
    public int? MinWidth { get; set; }
    public int? MaxWidth { get; set; }
    public string Body { get; set; }
    public bool IsActive { get; set; }
}
View
This should be your index.cshtml (or whatever you want to call it)
@model IEnumerable<WebApplication1.Models.BreakpointViewModel>
<div>
    <h1>A header!</h1>
</div>
@Html.DisplayForModel()
<div>
    <h4>A footer!</h4>
</div>
DisplayTemplate
* Thou shalt live in the folder containing views for the controller (or Shared)
* Thou shalt live in a subfolder named 'DisplayTemplates'
* Thou shalt be named {ModelName}.cshtml
In the end, the folder structure should look something like this:
Views
|-- Breakpoint
| |-- DisplayTemplates
| | +-- BreakpointViewModel.cshtml
| +-- Index.cshtml
And BreakpointViewModel.cshtml should look like this:
@model WebApplication1.Models.BreakpointViewModel
<div class="breakpoint @(Model.IsActive ? "active" : null)"
     id="@Model.BreakPointId"
     @(Model.MinWidth.HasValue ? "data-min-width='" + Model.MinWidth + "'" : null)
     @(Model.MaxWidth.HasValue ? "data-max-width='" + Model.MaxWidth + "'" : null)>
    @Html.Raw(Model.Body)
</div>
Note the minwidth/maxwidth lines in the div. Not required, just how I would personally deal with the widths.
Resulting HTML
<div>
<h1>A header!</h1>
</div>
<div class="breakpoint active"
id="bp_1"
data-max-width='720'>
<div>Hello World!</div>
</div>
<div class="breakpoint"
id="bp_2"
data-max-width='720'>
<div>Another Breakpoint</div>
</div>
<div class="breakpoint"
id="bp_3"
data-max-width='720'>
<div>Third Breakpoint</div>
</div>
<div class="breakpoint"
id="bp_4"
data-max-width='720'>
<div>Fourth Breakpoint</div>
</div>
<div>
<h4>A footer!</h4>
</div>
Original Answer
DisplayTemplates are your friend. If your sections are going to be the same, you can put the relevant information into a ViewModel, then pass the List<ViewModel> to the DisplayTemplate. The MVC engine will then use the DisplayTemplate for your ViewModel to fill out the needed code for each section.
You only need to code your DisplayTemplate for your ViewModel once.
I don't have any sample code up at the moment, but if you need further help, comment on this and I'll break some out over the weekend.

Rename multiple css with identical name

I'm working inside a templated system where I can add code, but I can't modify the core of the file. My layers are stacked like this:
<div class="layer1">
<div class="layer2">
<div class=" layer3">
<div class="layer4">
</div>
</div>
</div>
</div>
<div class="layer1">
<div class="layer2">
<div class=" layer3">
<div class="layer4">
</div>
</div>
</div>
</div>
<div class="layer1">
<div class="layer2">
<div class=" layer3">
<div class="layer4">
</div>
</div>
</div>
</div>
As you can see, my classes all have the same names (layer1, layer2, etc.). I want to know if there's a way, using JavaScript, jQuery or any other client-side library, to modify the CSS class names so that, for example, the first layer1 becomes level1 and the following layer1 becomes level2?
Thanks for your answers!
As other people already said, jQuery actually does what you want.
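As long as you don't know the number of “layers” you have, you'd better find all elements by class-name substring (note that *= is used rather than ^=, because a class attribute such as " layer3" starts with a space and would not match a prefix selector):
$('[class*="layer"]')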
Then you can get the list of the element classes and change old names to new ones.
Many different ways to do this:
Solution 1:
Use addClass() and removeClass()
$(".layer1").removeClass('old_class').addClass('new_class');
Replace old_class with your older class and new_class with your new class
Solution 2:
If you are able to get the element by ID
You can set the class by using .attr()
$("#id").attr('class', 'new_class');
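An all-around solution working with className:
var elems = document.querySelectorAll('[class*="layer"]');
// Use forEach rather than for...in: for...in on a NodeList also visits
// properties such as "length", which have no className.
elems.forEach(function (elem) {
    // Swap "layer" for "level"; any other classes on the element are kept.
    elem.className = elem.className.replace("layer", "level");
});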
