How can I select from only one table with Web::Scraper? - css

I want to extract only the text under the heading Node Object Methods from a web page. The relevant HTML is as follows:
<h2>Node Object Properties</h2>
<p>The "DOM" column indicates in which DOM Level the property was introduced.</p>
<table class="reference">
<tr>
<th width="23%" align="left">Property</th>
<th width="71%" align="left">Description</th>
<th style="text-align:center;">DOM</th>
</tr>
<tr>
<td>attributes</td>
<td>Returns a collection of a node's attributes</td>
<td style="text-align:center;">1</td>
</tr>
<tr>
<td>baseURI</td>
<td>Returns the absolute base URI of a node</td>
<td style="text-align:center;">3</td>
</tr>
<tr>
<td>childNodes</td>
<td>Returns a NodeList of child nodes for a node</td>
<td style="text-align:center;">1</td>
</tr>
<tr>
<td>firstChild</td>
<td>Returns the first child of a node</td>
<td style="text-align:center;">1</td>
</tr>
<tr>
<td>lastChild</td>
<td>Returns the last child of a node</td>
<td style="text-align:center;">1</td>
</tr>
<tr>
<td>localName</td>
<td>Returns the local part of the name of a node</td>
<td style="text-align:center;">2</td>
</tr>
<tr>
<td>namespaceURI</td>
<td>Returns the namespace URI of a node</td>
<td style="text-align:center;">2</td>
</tr>
<tr>
<td>nextSibling</td>
<td>Returns the next node at the same node tree level</td>
<td style="text-align:center;">1</td>
</tr>
<tr>
<td>nodeName</td>
<td>Returns the name of a node, depending on its type</td>
<td style="text-align:center;">1</td>
</tr>
<tr>
<td>nodeType</td>
<td>Returns the type of a node</td>
<td style="text-align:center;">1</td>
</tr>
<tr>
<td>nodeValue</td>
<td>Sets or returns the value of a node, depending on its
type</td>
<td style="text-align:center;">1</td>
</tr>
<tr>
<td>ownerDocument</td>
<td>Returns the root element (document object) for a node</td>
<td style="text-align:center;">2</td>
</tr>
<tr>
<td>parentNode</td>
<td>Returns the parent node of a node</td>
<td style="text-align:center;">1</td>
</tr>
<tr>
<td>prefix</td>
<td>Sets or returns the namespace prefix of a node</td>
<td style="text-align:center;">2</td>
</tr>
<tr>
<td>previousSibling</td>
<td>Returns the previous node at the same node tree level</td>
<td style="text-align:center;">1</td>
</tr>
<tr>
<td>textContent</td>
<td>Sets or returns the textual content of a node and its
descendants</td>
<td style="text-align:center;">3</td>
</tr>
</table>
<h2>Node Object Methods</h2>
<p>The "DOM" column indicates in which DOM Level the method was introduced.</p>
<table class="reference">
<tr>
<th width="33%" align="left">Method</th>
<th width="61%" align="left">Description</th>
<th style="text-align:center;">DOM</th>
</tr>
<tr>
<td>appendChild()</td>
<td>Adds a new child node, to the specified node, as the last child node</td>
<td style="text-align:center;">1 </td>
</tr>
<tr>
<td>cloneNode()</td>
<td>Clones a node</td>
<td style="text-align:center;">1 </td>
</tr>
<tr>
<td>compareDocumentPosition()</td>
<td>Compares the document position of two nodes</td>
<td style="text-align:center;">1 </td>
</tr>
<tr>
<td>getFeature(<span class="parameter">feature</span>,<span class="parameter">version</span>)</td>
<td>Returns a DOM object which implements the specialized APIs
of the specified feature and version</td>
<td style="text-align:center;">3 </td>
</tr>
<tr>
<td>getUserData(<span class="parameter">key</span>)</td>
<td>Returns the object associated to a key on a this node. The
object must first have been set to this node by calling setUserData with the
same key</td>
<td style="text-align:center;">3 </td>
</tr>
<tr>
<td>hasAttributes()</td>
<td>Returns true if a node has any attributes, otherwise it
returns false</td>
<td style="text-align:center;">2 </td>
</tr>
<tr>
<td>hasChildNodes()</td>
<td>Returns true if a node has any child nodes, otherwise it
returns false</td>
<td style="text-align:center;">1 </td>
</tr>
<tr>
<td>insertBefore()</td>
<td>Inserts a new child node before a specified, existing, child node</td>
<td style="text-align:center;">1 </td>
</tr>
</table>
In Perl if I write the following:
my $data = scraper {
process "table.reference > tr > td > a", 'renners[]' => 'TEXT';
};
my $res2 = $data->scrape( $html ); # assumes $html holds the page source
for my $i (0 .. $#{$res2->{renners}}) {
print $res2->{renners}[$i];
print "\n";
}
I get the text for all the tags i.e.:
attributes
baseURI
.
.
.
.
insertBefore()
whereas I need the text of the <a> tags only for Node Object Methods, i.e.:
appendChild()
.
.
.
insertBefore()
In short I want to print the NODE object methods only. What should I modify in the code?

Web::Scraper can use the :nth-of-type pseudo-class to choose the right table. The two tables share the same class and are siblings, so table.reference:nth-of-type(2) picks the second table element among them:
use v5.22;
use feature qw(postderef);
no warnings qw(experimental::postderef);
use Web::Scraper;
my $html = do { local $/; <DATA> };
my $methods = scraper {
process "table.reference:nth-of-type(2) > tr > td > a", 'renners[]' => 'TEXT';
};
my $res = $methods->scrape( $html );
say join "\n", $res->{renners}->@*;
And here's a Mojo::DOM version:
use v5.10;
use Mojo::DOM;
my $html = do { local $/; <DATA> };
my $dom = Mojo::DOM->new( $html );
say $dom
->find( 'table.reference:nth-of-type(2) > tr > td > a' )
->map( 'text' )
->join( "\n" );
I tried looking for a selector solution that could recognize the text in the h2, but my kung fu is weak here.

Web::Query provides an almost identical solution to the Mojo::DOM solution proposed by brian d foy.
use v5.10;
use Web::Query;
my $html = do { local $/; <DATA> };
wq($html)
->find('table.reference:nth-of-type(2) > tr > td > a')
->each(sub {
my ($i, $e) = @_;
say $e->text();
});
However it looks like Mojo::DOM is the more robust library. For Web::Query to correctly match with its selector I had to edit the input provided in the question to add a root node surrounding all the other content.
__DATA__
<html>
...
</html>

You can use XPath to extract data from the very next table after the heading Node Object Methods, like so
use v5.10;
use Web::Scraper;
my $html = do { local $/; <DATA> };
my $methods = scraper {
process '//h2[.="Node Object Methods"]/following-sibling::table[1]//tr/td[1]',
'renners[]' => 'TEXT';
};
my $res = $methods->scrape( $html );
say join "\n", @{ $res->{renners} };
The output will be
appendChild()
cloneNode()
compareDocumentPosition()
getFeature(feature,version)
getUserData(key)
hasAttributes()
hasChildNodes()
insertBefore()

Related

How to retrieve data from Custom Post Types wordpress using php

I'm trying to show the data stored in a custom post type in a table, and I'm having trouble displaying the values on the front end underneath each column (Duration, Incall, Outcall).
<table class="rates-table">
<?php $get_rates_list = get_field('rates_optional');
if($get_rates_list){
foreach($get_rates_list as $rate){?>
<thead>
<tr>
<td><h3 class="inside-model-single">Duration</h3></td>
<td><h3 class="inside-model-single">Incall</h3></td>
<td><h3 class="inside-model-single">Outcall</h3></td>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td><span class="rate_value"><?php echo $rate['incall_1_hour'];?></span></td>
<td><span class="rate_value"><?php echo $rate['outcall_1_hour'];?></span></td>
</tr>
<?php }
}?>
</tbody>
</table>
One way is to display each field individually by its name, using get_field() or the_field().
<table class="rates-table">
<?php $incall_1_hour = get_field('incall_1_hour'); ?>
<thead>
<tr>
<td><h3 class="inside-model-single">Duration</h3></td>
<td><h3 class="inside-model-single">Incall</h3></td>
<td><h3 class="inside-model-single">Outcall</h3></td>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td><span class="rate_value"><?php echo $incall_1_hour; ?></span></td>
<td><span class="rate_value"><?php the_field('outcall_1_hour'); ?></span></td>
</tr>
</tbody>
</table>
A second way is to use get_fields(), which returns an array of all field values for the current post or a specific post. The following is just a simple example, as I don't know the full structure of your post; adjust it accordingly.
<?php $get_rates_list = get_fields();
if ($get_rates_list) {
foreach($get_rates_list as $rate) {
?><td><span class="rate_value"><?php echo $rate;?></span></td><?php
}
} ?>

Unable to load data for nested table dynamically

I have created a table using the nz-table component from `ng-zorro`. The table contains a nested table whose data I want to load dynamically (via a REST service call) when the expand icon is clicked.
I am able to load the data into the nested table, but when I expand the next row, the new result overwrites the data in the first row.
Because the nested table is created inside a loop (ngFor), I am unable to bind the data to a specific row.
<nz-table #nestedTable [nzData]="displayData" [nzPageSize]="10">
<thead colspan="5">
<tr>
<th nzWidth="4%" nzShowExpand></th>
<th nzWidth="12%">Id</th>
<th nzWidth="10%">Start Time</th>
<th nzWidth="10%">End Time</th>
<th nzWidth="10%">Status</th>
</tr>
</thead>
<tbody>
<ng-template ngFor let-data [ngForOf]="nestedTable.data">
<tr>
<td nzShowExpand [(nzExpand)]="data.expand" (click)="getDetails(data)"></td>
<td >{{data.Id}}</td>
<td>{{data.startTime}}</td>
<td>{{data.endTime}}</td>
<td>{{data.status}}</td>
</tr>
<tr [nzExpand]="data.expand">
<td><nz-spin *ngIf="isEventLoading"></nz-spin></td>
<td colspan="9">
<nz-table #innerTable [nzData]="innerTableData" nzSize="middle" [nzShowPagination]="false">
<thead>
<tr>
<th>E ID</th>
<th>S ID</th>
<th>E Type</th>
</tr>
</thead>
<tbody>
<tr *ngFor="let data of innerTable.data">
<td>{{data.eID}}</td>
<td>{{data.sID}}</td>
<td>{{data.eType}}</td>
</tr>
</tbody>
</nz-table>
</td>
</tr>
</ng-template>
</tbody>
</nz-table>
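Try replacing [nzData]="innerTableData" with a collection that belongs to the row, like [nzData]="data.childRows". Because innerTableData is a single property shared by every row, each new result overwrites the previous one.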
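To make the idea concrete, here is a minimal TypeScript sketch of keeping the loaded details on the row itself. The ParentRow and EventRow shapes, the childRows and loading fields, and the DetailsLoader callback are all assumptions made for illustration; in a real component the loader would wrap the REST call, and its result would arrive in something like a subscribe() handler.

```typescript
// Hypothetical row shapes; the field names mirror the template in the question.
interface EventRow {
  eID: string;
  sID: string;
  eType: string;
}

interface ParentRow {
  Id: string;
  expand: boolean;
  loading: boolean;        // assumed per-row spinner flag
  childRows?: EventRow[];  // assumed per-row data for the inner table
}

// Stand-in for the REST service: it hands the result to a callback,
// much like a subscribe() handler would receive it.
type DetailsLoader = (id: string, done: (events: EventRow[]) => void) => void;

// Load details for one row and keep them on that row only.
function getDetails(row: ParentRow, load: DetailsLoader): void {
  if (row.childRows !== undefined) {
    return; // already loaded, so expanding again triggers no second request
  }
  row.loading = true;
  load(row.Id, (events) => {
    row.childRows = events; // no shared component-level array is touched
    row.loading = false;
  });
}

// Usage with a fake loader that answers synchronously.
const fakeLoader: DetailsLoader = (id, done) =>
  done([{ eID: `${id}-e1`, sID: `${id}-s1`, eType: "demo" }]);

const rowA: ParentRow = { Id: "A", expand: false, loading: false };
const rowB: ParentRow = { Id: "B", expand: false, loading: false };

getDetails(rowA, fakeLoader);
getDetails(rowB, fakeLoader);

console.log(rowA.childRows?.[0]?.eID); // A-e1
console.log(rowB.childRows?.[0]?.eID); // B-e1
```

With this shape, the inner table would bind to [nzData]="data.childRows" and the spinner to *ngIf="data.loading", so expanding a second row leaves the first row's data alone.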

Format string in graphviz

How can I format a string in a Graphviz node?
Right now I have common styles for all nodes:
node [ href=\"#\",
shape=box,
style=filled,
fillcolor=azure,
color = lightblue3,
fontname=Helvetica,
center=true,
fontsize=9
]
I want to change date format
Use an HTML-like label:
node[label=<>]
An example:
i0[label=<
<TABLE border="0">
<TR>
<TD valign="top" rowspan="2">
I<sub>0</sub>:
</TD>
<TD align="left">
S'→.S<BR ALIGN="LEFT"/>
</TD>
</TR>
<TR>
<TD align="left" bgcolor="#aaaaaa">
S→.SS<BR ALIGN="LEFT"/>
S→.(S)<BR ALIGN="LEFT"/>
S→.a<BR ALIGN="LEFT"/>
</TD>
</TR>
</TABLE>
>];
You should be able to do this with html(-like) labels. Give the table one style and that cell a different style. Keep in mind that this HTML is a very constrained subset of real HTML.

How to loop through Map in Thymeleaf

I am trying to understand how to loop through all entries in a Map in Thymeleaf. I have a domain object being processed by Thymeleaf that contains a Map.
How do I loop through the keys and fetch the values ?
Thanks.
Nevermind... I found it...
<tr th:each="instance : ${analysis.instanceMap}">
<td th:text="${instance.key}">keyvalue</td>
<td th:text="${instance.value.numOfData}">num</td>
</tr>
Thanks.
If the value is a List, for example when the map's key is a category and its value is a list of items in that category, you can use this:
<table>
  <tr th:each="element : ${catsAndItems}">
    <td th:text="${element.key}">keyvalue</td>
    <td>
      <table>
        <tr th:each="anews : ${element.value}">
          <td th:text="${anews.title}">Some name</td>
          <td th:text="${anews.description}">Some name</td>
          <td th:text="${anews.url}">Some name</td>
          <td th:text="${anews.logo}">Some name</td>
          <td th:text="${anews.collectionDate}">Some name</td>
        </tr>
      </table>
    </td>
  </tr>
</table>

Graphviz (xdot): How to make recursive nodes?

I'm currently writing a graph library in Java, and I would like a tool to visualize some graphs. I discovered Graphviz, which happens to be a great, although buggy, way of doing this.
In my model, Graphs are composed of Nodes and Edges. Every Node has a certain number of Ports (I/O/IO), and Edges link those Ports together. Some special nodes are called GraphNodes and embed a Graph. The Ports of these GraphNodes are mapped to some Ports of the internal Nodes.
I'd like to provide several representations. The first of them, with which I am satisfied, is as follows: http://i.stack.imgur.com/ujU71.png
The input Ports are represented in green, the output ones in red, and the input-output ones in blue.
In this representation, the GraphNodes are not expanded and are displayed just as simple Nodes. In a second version, I would like to create something that looks like the following picture: http://i.stack.imgur.com/Cx624.png
The problem is that I can't manage to create a sub-graph (cluster) with fixed areas (it seems not to be possible). Another solution I tried was to embed a graph into a node. However, inserting some code into the <td> </td> part of a HTML label does not evaluate the code:
digraph graph0
{
node1
[
label =
<
<table border="0" cellspacing="0">
<tr>
<td cellpadding="0">
<table border="0" cellspacing="0">
<tr>
<td bgcolor="palegreen" border="1" port="port2">port2</td>
<td bgcolor="palegreen" border="1" port="port3">port3</td>
</tr>
</table>
</td>
</tr>
<tr>
<td cellpadding="0">
<table border="0" cellspacing="0">
<tr>
<td cellpadding="0">
<table border="0" cellspacing="0">
<tr>
<td bgcolor="skyblue" border="1" port="port5">port5</td>
</tr>
</table>
</td>
<td bgcolor="peachpuff" border="1">
subgraph clusterTest
{
nodeTest
}
</td>
</tr>
</table>
</td>
</tr>
<tr>
<td cellpadding="0">
<table border="0" cellspacing="0">
<tr>
<td bgcolor="lightpink" border="1" port="port4">port4</td>
</tr>
</table>
</td>
</tr>
</table>
>
style = "invisible"
]
}
The previous code creates the following graph: http://i.stack.imgur.com/E9jQ1.png
Finally, the best solution I can come up with is the following: http://i.stack.imgur.com/VzS5g.png
However I am not satisfied with it, because the GraphNodes' Ports are placed in strange locations sometimes.
Do you know how I can reach the target graph layout? Please ask for any other information if needed.
EDIT: I still haven't found a solution. One way to handle this would be to fix the position of given nodes inside the containing cluster, but that does not seem to be possible with the "dot" layout. Any ideas?
In a digraph, you can constrain the positions of nodes relative to one another. Edges force certain elements to appear above others, while rank = same forces nodes onto the same level (port101 and port102 in this example).
Fake nesting: This graph does not use nested plaintext/semi-html nodes because I don't think that is possible (not a feature). I'm not sure if any graphviz libraries support them either, but it may be worth looking into other libraries. I've never even used dot from Java or Python, otherwise I would make a suggestion.
digraph {
nodesep = 0.2
ranksep = 0.8
pad = 0.1
node [ shape=square ]
node [ style=filled ]
edge [ arrowhead=none ]
//rankdir = LR
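component_starter [ label = <
<table border="0" cellspacing="0"><tr>
<td border="1" bgcolor="white" > </td>
<td border="1" bgcolor="palegreen" port="port2">port02</td>
<td border="1" bgcolor="palegreen" port="port3">port03</td>
</tr><tr>
<td border="1" bgcolor="skyblue" port="port6">port06</td>
<td border="1" bgcolor="peachpuff" rowspan="3" colspan="2">S</td>
</tr><tr>
<td border="1" bgcolor="skyblue" port="port7">port07</td>
</tr><tr>
<td border="1" bgcolor="skyblue" port="port8">port08</td>
</tr><tr>
<td border="1" bgcolor="lightpink" colspan="1" port="port4">port04</td>
<td border="1" bgcolor="lightpink" colspan="2" port="port5">port05</td>
</tr></table> > style = "invisible" ]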
subgraph cluster_container {
label="I/O device with components "
color=orange
//margin = 0
edge [ style="invis"]
//edge [ len="0.5" minlen="1" ]
node [ height="0.5" width="2" fixedsize=true ];
node [ shape=rectangle style=filled ]
{
node [ color=palegreen ];
{ rank = same port101 -> port102 }
}
{
node [ color=skyblue];
port103 port104 }
{
node [ height="1.5" width="2" fixedsize=true ];
node [ color=peachpuff];
//notaport
}
{
node [ height="0.5" width="4" fixedsize=true ];
node [ color=lightpink];
output
}
//--
//subgraph cluster_inner {
//label="abstractions"
//color="black"
//style="invis"
component_a [ label = <
<table border="0" cellspacing="0"><tr>
<td border="1" bgcolor="white" > </td>
<td border="1" bgcolor="palegreen" port="port2">port2</td>
<td border="1" bgcolor="palegreen" port="port3">port3</td>
</tr><tr>
<td border="1" bgcolor="skyblue" port="port6">port6</td>
<td border="1" bgcolor="peachpuff" rowspan="3" colspan="2">A</td>
</tr><tr>
<td border="1" bgcolor="skyblue" port="port7">port7</td>
</tr><tr>
<td border="1" bgcolor="skyblue" port="port8">port8</td>
</tr><tr>
<td border="1" bgcolor="lightpink" colspan="1" port="port4">port4</td>
<td border="1" bgcolor="lightpink" colspan="2" port="port5">port5</td>
</tr></table> > style = "invisible" ]
component_b [ label = <
<table border="0" cellspacing="0"><tr>
<td border="1" bgcolor="white" > </td>
<td border="1" bgcolor="palegreen" port="port22">port22</td>
<td border="1" bgcolor="palegreen" port="port23">port23</td>
</tr><tr>
<td border="1" bgcolor="skyblue" port="port25">port25</td>
<td border="1" bgcolor="peachpuff" colspan="2"> B </td>
</tr><tr>
<td border="1" bgcolor="lightpink" colspan="3" port="port24">port24</td>
</tr></table> > style = "invisible" ]
//-
component_c [ label = <
<table border="0" cellspacing="0"><tr>
<td border="1" bgcolor="white" > </td>
<td border="1" bgcolor="palegreen" port="port32">port32</td>
<td border="1" bgcolor="palegreen" port="port33">port33</td>
</tr><tr>
<td border="1" bgcolor="skyblue" port="port35">port35</td>
<td border="1" bgcolor="peachpuff" colspan="2"> C </td>
</tr><tr>
<td border="1" bgcolor="lightpink" colspan="3" port="port34">port34</td>
</tr></table> > style = "invisible" ]
//}
port101 -> port103
port102 -> component_a
port102 -> component_b
port103 -> port104
component_a -> output;
component_b -> output;
component_c -> output;
edge [ style="" arrowhead="normal" color="#444444"]
component_a:port4 -> output;
component_c:port34 -> component_a:port3;
component_a:port5 -> component_b:port22;
port101 -> component_c:port33
//-
{ rank = same
edge [ dir=back ]
port104 -> component_a:port8
}
}
component_starter;
component_starter:port5 -> port101;
}
Here is the above dot file, compressed. To view it, save the text below as nesting.bz2.base64 and run base64 -d nesting.bz2.base64 | bzcat.
QlpoOTFBWSZTWd/epEIABCzfgHAwWAP/3zgkmAq/7//6UASZm8a7VNrQBQQlSDUaYjTINGjIyZAG
ho0aNMgkUZJpEIzUw0TTEzQBoAIwCTUiFT1NppDymGk0Pap6gMmCaAAcwAAAAAAAAAAFSRJoGp6B
TyNTyhABoD1DQ0aephLyAcTAhMSQiKogMBLsVaZBYIwUHIGFISYVKCMVkRgLypG2mhHPb5z0hBap
yN3HCL2iJVDYvXI6SykmzPN9LCaex+63c7jyTnk18c2KgvDZq6Kkz+WWf4DU4KoQsCQJ1gKpAcwC
mp5nGnmlI8wBNtgDi+Hmf0/g/v4PoNaZVrhy5cdWCavJkutPC0t50kljBJLHXrbQUjJMPPDCUKwN
NHO8aaiqKTus3tLEpprCW8Gzr68DtvyteHrqa7JJ9J46R4muUMuU39kJYPEgwJWwCSqsgMteezTo
ta1rr3va1sccdW/32OJUROFkmUzqeyHn+g96EccgEY5SfJixh2aJgQC0JVmWAtrdagoOkDZAViKA
qUpGZ1dXNJikmmsRZmAO16Kq9osW7KzzPZPS9IeLIqXo0cOoNwatry3Mi792YMRvA3oiKxe84ac7
EMGmdrabTaG0qqDpAMJJn5IeAbvMNiSJW8og7y+Ik+CExJBhLDZlKFSFKMBgIIMai57J5pLmgl5R
Qm261e797RF6qhy82NQypLEa8ktUVAL2R1hxThWC3pVB0jBThRxVLfHHJeZHv+pMEkxZ3P6KP+ho
QWyC9gtIM2cxJK2pIiiIooOPlxE0kkspJPijcgrHRHw6XvMpwy5ldiqlUpWzvymgxr78zAXh4vSW
L3jya8Hqc6ekwhhDROybThDBnIdmdlN6ClO8bo7ucxNqSVjHNGd8F8ocW2qmT0bJRujojA9MJqnC
TPg09tZlJ5d/am7W8E6GeU3TkyVC0N5nTgeXXn7Sj1UWijm1Q07OKeXiyPbENRSNcNzrnkyxkNeu
RgS5GWEXKkpTdIy8NenUhVYpqkbUCushe+cQ15oMcRIrb4GZscDDVmLk3LF7Txk6yFvmky8aoiK5
T+3pVRjVVVUtLRVFFI0q1lrpHlGqNnHKOVIKkktUYAuq2L1bekwQeG68LX3tK64kVNRw2tDb7rLn
acBl0J61Ld57HXXHKpXLjoqAo5yyRvQ/YxLdufjNCSaEk4mmYLcZp1ybI1BqDTum20PSm2cId4pU
Zx00pZzOJZwnSHjAqXNkWCM4s/+LuSKcKEhv71IhAA==
