How to find XPath of a url of an image? - wordpress

I'm trying to map an XML file for import into WordPress using the WP All Import plugin, but I got stuck after trying several ways to get the "src" of this image:
<content type="html">
<div align="center" class="post-cover">
<img src="IMG.jpg"/>
<ul>
<li></li>
<li></li>
</ul>
</div>
</content>
I tried {content[1]/div[1]/img/@src} and {content[1]/div/img/@src}, but without success.
The only paths that worked at all were {content[@type = "html"]} and {content[1]}, which both returned all the HTML inside <content>.
If necessary, I can mass-edit the file in Notepad, for example removing type="html" to force the inner divs to be recognized as children, but I already tried that as well. Unfortunately the content is still treated as plain text.

Are you sure that the stuff that looks like HTML inside <content> isn't actually just text? The tree view image you linked to suggests that it is: the < in <div is an escaped, literal &lt;, not the beginning of a <div> tag.
If you view the XML in a plain text editor, you will probably see <content> &lt;div align=...
In that case, <content> has no element children, just plain text. You can't select nodes like img/@src from it using XPath because it doesn't have any such nodes. You would have to parse that text into XML or HTML first if you want to apply XPath to it.
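To illustrate that last point, here is a minimal Python sketch using only the standard library. The `<entry>` wrapper and the escaped sample are assumptions made for the example, not your actual file. It shows that an escaped `<content>` has no element children, and that parsing its text a second time makes the `img` node reachable.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed fragment: the markup inside <content> is escaped,
# so the XML parser sees it as one text node, not as child elements.
FEED = """<entry>
<content type="html">&lt;div align="center" class="post-cover"&gt;
&lt;img src="IMG.jpg"/&gt;
&lt;ul&gt;&lt;li&gt;&lt;/li&gt;&lt;li&gt;&lt;/li&gt;&lt;/ul&gt;
&lt;/div&gt;</content>
</entry>"""

root = ET.fromstring(FEED)
content = root.find("content")

# Number of element children of <content>; a path such as
# content/div/img/@src has nothing to select if this is zero.
child_count = len(content)

# Second pass: parse the text itself. This only works here because
# the sample markup happens to be well-formed XML.
inner = ET.fromstring(content.text)
src = inner.find(".//img").get("src")

print(child_count, src)
```

Real-world HTML is often not well-formed XML, in which case the second pass would need a tolerant HTML parser instead of `ET.fromstring`.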

The following does work using R and the XML library. I just used '//img/@src':
library(XML)
html = '<content type="html">
<div align="center" class="post-cover">
<img src="IMG.jpg"/>
<ul>
<li></li>
<li></li>
</ul>
</div>
</content>'
doc = htmlParse(html, asText=TRUE)
src = xpathSApply(doc, '//img/@src')
The output:
src
"IMG.jpg"

Related

Data-sly-list is adding white space, causing bugginess

I'm having an issue where an unordered list created by data-sly-list is adding whitespace that isn't represented in the DOM or by any class. If I manually code the list rather than letting data-sly-list handle it, the whitespace isn't added.
 <div class="bullets">
    <ul class="columns unordered-list" id="stateList">
      <div data-sly-unwrap data-sly-list.slidesNode="${resource.listChildren}">
        <div data-sly-unwrap data-sly-list.states="${slidesNode.listChildren}">
          <li data-sly-test="${states.valueMap.flag}">
<sly data-sly-use.htmlpaths="${'htmlpaths.js' @ thePath=states.valueMap.path}" data-sly-unwrap>
${states.valueMap.name}
</sly>                    
</li>
        </div>
    </div>
    </ul>
</div>
If I hardcode the list like the following, there's no whitespace
  <div class="bullets">
    <ul class="columns unordered-list" id="stateList">
<li>Accessibility   
</li>
<li>Accessibility    
</li>
<li>Accessibility     
</li>
<li>Accessibility     
</li>
    </ul>
</div>
There's also a htmlpaths.js involved:
"use strict";
use(function () {
    var path = this.thePath;
    var httpRegex = /http/;
    var hashRegex = /#/;
    // Append ".html" unless the path is undefined, contains "http", or contains "#"
    if (path !== undefined && (httpRegex.test(path) === false && hashRegex.test(path) === false)) {
        path = path + '.html';
    }
    return {
        href: path
    };
});
The only difference I see is that it's run through Sightly iteration. Is there any fix for this? In addition to listing the items, I'm trying to break them into columns with the following CSS:
li {
width:25%;
float:left;
display:inline;
}
This works perfectly fine on the hardcoded list, but on the Sightly-iterated one it creates all kinds of weird spacing issues that change with the screen width.
This whitespace isn't accounted for at all in the DOM. I'm not sure what to do.
More weirdness:
If the margin-top is set to -9 or higher, it looks like the above screenshot. But if it's set to -10 or lower, it looks like this.
It's like a breakpoint: it goes from one extreme to the other on that one-pixel change, with no change otherwise. It's bizarre.
This is a slightly odd behavior in Sightly: when you have extra spaces in your HTML code, they show up as extra spaces in the generated HTML.
Try removing all the spaces in the HTML, as shown below.
 <div class="bullets"><ul class="columns unordered-list" id="stateList"><sly data-sly-list.slidesNode="${resource.listChildren}"><sly data-sly-list.states="${slidesNode.listChildren}"><li>${states.valueMap.name}</li></sly></sly></ul></div>
You can use the HTML formatter in your IDE, or an online tool such as the one below, to put the HTML back into a readable format:
https://www.freeformatter.com/html-formatter.html
<div class="bullets">
<ul class="columns unordered-list" id="stateList">
<sly data-sly-list.slidesNode="${resource.listChildren}">
<sly data-sly-list.states="${slidesNode.listChildren}">
<li>${states.valueMap.name}</li>
</sly>
</sly>
</ul>
</div>
This should get rid of the extra spaces in your HTML.
Also, it is best to use sly tags where you need to check conditions, or to put the HTL attributes directly on the actual div or other HTML tags, instead of using data-sly-unwrap.
You can also use Sling Models to fetch the required data and evaluate all the conditions (including appending html) in the backend, so that Sightly only has to display the data.
Using data-sly-unwrap or a sly tag still adds an empty line in the generated HTML. Even though most browsers ignore those spaces, they might cause issues in some cases. If you want the HTL output to look similar to your hardcoded HTML, try placing the use statement and anchor tag in a single line as shown below.
<div class="bullets">
    <ul class="columns unordered-list" id="stateList" data-sly-list.slidesNode="${resource.listChildren}">
       <li data-sly-repeat.states="${slidesNode.listChildren}" data-sly-test="${states.valueMap.flag}"><sly data-sly-use.htmlpaths="${'htmlpaths.js' @ thePath=states.valueMap.path}">${states.valueMap.name} </sly></li>
    </ul>
</div>
Also, a few tips:
The sly tag doesn't need data-sly-unwrap. It is automatically removed from the generated HTML.
data-sly-list can be added to the parent ul tag itself instead of introducing an extra div tag and then unwrapping it.
Use data-sly-repeat instead of data-sly-list wherever possible. I was able to bring down the generated HTML of one of our complex pages from 20k lines to 12k lines, as data-sly-repeat doesn't introduce additional white spaces.
Solution
The issue is on line 7 of your HTL template:
${states.valueMap.name}
You have a space at the end of the inner HTML of your tag ;)
Unrelated
Regarding your htmlpaths.js script, are you aware of Transformers in AEM? You can use them to implement a global Link Rewriter which will fix links when a page is rendered, much like your script does. You can see an example here: https://helpx.adobe.com/experience-manager/using/aem63_link_rewriter.html
If you decide to keep htmlpaths.js, you may want to review it because I'm afraid there might be some problems with it. Of course, I don't know your requirement so it's just a suggestion :)

The internal hyperlink defined in jupyter-notebook is not working

While creating a table of contents in a Jupyter notebook using HTML, I created hyperlinks to internal notebook cells, but clicking them does not take me to the desired cells.
Example:
The markup in the table of content is like:
<ol>
<li>Understanding the Data</li>
<li>Reading the file</li>
<li>Adding Columns</li>
<li>General Analysis</li>
</ol>
The code in the cells that these hyperlinks point to is as follows:
<h2> Understanding the Data </h2>
<h2> Reading the file </h2>
... and so on
I'd like to share the solution to my problem:
a. The hyperlink's href attribute should start with a hash '#' and exactly match the name of the target cell (case-insensitive), with dashes (not underscores) replacing the spaces.
e.g.
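<ol>
<li><a href="#Understanding-the-Data">Understanding the Data</a></li>
<li><a href="#Reading-the-file">Reading the file</a></li>
<li><a href="#Adding-Columns">Adding Columns</a></li>
<li><a href="#General-Analysis">General Analysis</a></li>
</ol>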
b. In the cells I am linking to, there should not be any space between the opening and closing tags and the name they enclose.
e.g.
<h2>Understanding the Data</h2>
<h2>Reading the file</h2>
<h2>Adding Columns</h2>
<h2>General Analysis</h2>
Note that now there is no space between html tags and the name defined within.
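The rule in (a) can be sketched in Python so the links are generated rather than typed by hand. The function names `anchor_for` and `toc_html` are made up for this example, and the sketch only applies the rule as described above (spaces become dashes); titles containing other punctuation are not handled.

```python
from html import escape

def anchor_for(title: str) -> str:
    """Build the href value: '#' plus the title with spaces replaced by dashes."""
    return "#" + title.strip().replace(" ", "-")

def toc_html(titles):
    """Return an <ol> whose items link to the heading cells by name."""
    items = [
        f'<li><a href="{anchor_for(t)}">{escape(t)}</a></li>'
        for t in titles
    ]
    return "<ol>\n" + "\n".join(items) + "\n</ol>"

titles = ["Understanding the Data", "Reading the file",
          "Adding Columns", "General Analysis"]
print(toc_html(titles))
```

The printed markup can be pasted into a Markdown cell as the table of contents.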

XHTML OrderedList and UnorderedList Invalid validation

I am trying to use the ul and ol tags to create two separate lists on a page, but every time I try to validate the page I get "document type does not allow element "ul" here". I have tried moving the tags around, and I've checked every tag that I have opened to ensure they are all closed. I also tried moving just that section of code to a new page, and it throws the same error in validation. I'm out of ideas; any help you can offer is greatly appreciated. It displays correctly, but I need it to pass validation.
<ol>
<li>USA</li>
<li>Canada</li>
<li>Sweden</li>
</ol>
</h2>
<hr/>
<h3>List Example (Order NOT important)</h3>
<h2> Things to Pick Up</h2>
<h3>
<ul>
<li>Milk</li>
<li>Eggs</li>
<li>Bread</li>
<li>Cheese</li>
</ul>
You can't have a list inside a heading. You are trying to put one inside a sub-sub-heading (<h3>). (You probably have one inside a sub-heading (<h2>) too, but the start tag is missing from that.)
Put the lists after the headings.
See the spec under "Contexts in which this element can be used" for other places that you are allowed to place lists.
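If you want a quick way to spot this kind of nesting before running the validator, here is a rough Python sketch using the standard library's `html.parser`. The class and function names are made up for this example, and it is only a heuristic check, not a substitute for validation: it reports every `ul` or `ol` that starts while a heading element is still open.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}
LISTS = {"ul", "ol"}

class ListInHeadingFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.open_headings = []  # stack of heading tags not yet closed
        self.problems = []       # (list tag, enclosing heading, line number)

    def handle_starttag(self, tag, attrs):
        if tag in HEADINGS:
            self.open_headings.append(tag)
        elif tag in LISTS and self.open_headings:
            line, _ = self.getpos()
            self.problems.append((tag, self.open_headings[-1], line))

    def handle_endtag(self, tag):
        # Ignore stray closing tags (such as a </h2> that was never opened).
        if tag in HEADINGS and tag in self.open_headings:
            # Pop up to and including the matching heading.
            while self.open_headings.pop() != tag:
                pass

def find_lists_in_headings(markup):
    finder = ListInHeadingFinder()
    finder.feed(markup)
    finder.close()
    return finder.problems

sample = "<h2> Things to Pick Up</h2>\n<h3>\n<ul>\n<li>Milk</li>\n</ul>"
for tag, heading, line in find_lists_in_headings(sample):
    print(f"<{tag}> inside <{heading}> near line {line}")
```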
I did some tests with your code and tried validating it with the W3C validator.
Here are my observations about your piece of code, with the proper corrections applied.
<!doctype html>
<html>
<head>
<meta charset="utf-8" />
<title>Ryan's solution</title>
</head>
<body>
<ol>
<li>USA</li>
<li>Canada</li>
<li>Sweden</li>
</ol>
<h2> <!--This opening tag was added by me--> Hello World!</h2> <!--Why are you closing what was not opened?-->
<hr/>
<h3>List Example (Order NOT important)</h3>
<h2> Things to Pick Up</h2>
<h3> <!--Where is the closing for this tag?--> Hello World!</h3><!--The closing tag was added by me-->
<ul>
<li>Milk</li>
<li>Eggs</li>
<li>Bread</li>
<li>Cheese</li>
</ul>
</body>
</html>
These corrections have been approved by the W3C validator. The explanations for the errors I found are in the comments.
I'm sorry if this answer is not fully clear and elegant. This is my first answer here, and I'm learning how to help people :)
Good luck with your coding !

codeception cannot click on xpath as it cannot locate CSS or Xpath

<div id="page" class="container-fluid">
<div id="pageContent" class="">
<h1>Angular Test: projectUI</h1>
<!-- ngView: -->
<div class="ng-scope" data-ng-view="">
<ul class="package-menu container white ng-scope">
<!-- ngRepeat: package in packages | orderBy:'name' -->
<div class="ng-scope" data-ng-repeat="package in packages | orderBy:'name'">
<li>
<a class="ng-binding" href="#package/2">Craig's farm</a>
</li>
</div>
Task: I want to click on the "Craig's farm" label, but when I try to click via CSS it gives the error below:
error: CSS or XPath '#pageContent.package-menu.container.white.ng-scope>li:nth-child(1)
Can anyone give me the exact XPath for the above code?
Thank you
I believe the issue is that your HTML snippet is not valid XML. There are a lot of unclosed tags and no root element. XPath works on XML, and XML is stricter than HTML.
//a[@class="ng-binding" and text()="Craig's farm"]
Fix the HTML and run the above XPath.
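As a quick sanity check outside Codeception, here is a small Python sketch using only the standard library. The closing tags added to the snippet are my assumption about how the real page is closed. `ElementTree` supports only a subset of XPath, so the class is matched in the path and the link text is compared in Python; the effect is the same as the expression above.

```python
import xml.etree.ElementTree as ET

# The snippet from the question with the missing closing tags added
# (an assumption) so that an XML parser accepts it.
PAGE = """<div id="page" class="container-fluid">
  <div id="pageContent" class="">
    <h1>Angular Test: projectUI</h1>
    <div class="ng-scope" data-ng-view="">
      <ul class="package-menu container white ng-scope">
        <div class="ng-scope" data-ng-repeat="package in packages | orderBy:'name'">
          <li>
            <a class="ng-binding" href="#package/2">Craig's farm</a>
          </li>
        </div>
      </ul>
    </div>
  </div>
</div>"""

root = ET.fromstring(PAGE)

# Match the class in the path, then compare the link text in Python.
matches = [
    a for a in root.findall(".//a[@class='ng-binding']")
    if (a.text or "").strip() == "Craig's farm"
]
hrefs = [a.get("href") for a in matches]
print(hrefs)
```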

ASP.Net server-side navigation menu based on html contents

I need to do some styling to a bunch of webforms, containing articles formatted in a rather uniform way. I can change any source code I want.
What I need is a quick way to dynamically create a navigation menu (on the server side) for an ASP.NET webform, based on contents of a specified div.
For example, given the following HTML:
<div id="article">
<h2 id="first">Chapter 1</h2>
<p>Some text...</p>
<h2 id="second">Chapter 2</h2>
<p>Some other text</p>
</div>
I would like to insert something like this at the end (and render it at the server side, not in a script):
<div id="navigation">
<ul>
<li>Chapter 1</li>
<li>Chapter 2</li>
</ul>
</div>
NOTE: I know I could iterate through the parent div's child controls in code-behind (although I would need to make them all "run at server", or even parse the InnerHtml property of the parent div), but it feels pretty weird.
Also, I am aware that if the article were being created from a data source, I would have the content already organized, but I would like to make as few changes as possible to the existing pages.
You could search for the headings with a RegEx and render the navigation from the results. Something like "<h2 id=\"([^\"]+)\">([^<]+)</h2>" would get you the id in the first and the caption in the second group.
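Here is a minimal sketch of that idea, written in Python rather than ASP.NET just to show the pattern at work; in code-behind the same pattern could be passed to .NET's Regex class. The function name `build_navigation` is made up, and rendering each item as a link to the heading's id is my assumption about the desired output. The pattern only matches `<h2>` tags written exactly in the form shown in the question.

```python
import re

# Group 1 captures the id, group 2 the caption.
HEADING_RE = re.compile(r'<h2 id="([^"]+)">([^<]+)</h2>')

def build_navigation(article_html: str) -> str:
    """Render a navigation block from the <h2 id="..."> headings found."""
    items = [
        f'<li><a href="#{heading_id}">{caption.strip()}</a></li>'
        for heading_id, caption in HEADING_RE.findall(article_html)
    ]
    return '<div id="navigation">\n<ul>\n' + "\n".join(items) + "\n</ul>\n</div>"

article = """<div id="article">
<h2 id="first">Chapter 1</h2>
<p>Some text...</p>
<h2 id="second">Chapter 2</h2>
<p>Some other text</p>
</div>"""

print(build_navigation(article))
```

Headings with extra attributes or nested markup would not match, so this only suits pages that really are formatted uniformly.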
If you have access to the data source that is creating the article, definitely use that.
However, if all you have is the HTML, I would use XSLT.
