Testing Watson responses containing HTML markup in Botium Box - automated-tests

I'm testing out the Botium Box service and trying to run a test of chatbot conversations within IBM Watson. When the service tests an utterance whose response contains HTML markup for a URL, the test fails. Is this a bug, or is there a way to use Botium Box to test and verify responses containing HTML markup for hyperlinks? If I can't automatically test responses that contain HTML markup for URLs, I might as well do all the testing by hand.
Scenario:
A user asks a question and the chatbot (IBM Watson) returns a response that contains a hyperlink. This hyperlink is embedded into the response using HTML markup. I have tried various configurations of the HTML markup code, such as moving the elements around and using apostrophes vs quotes.
The HTML markup in this case is:
<a target="_blank" href="https://facilities.gwu.edu/heating-cooling-residential">go.gwu.edu/heatcool</a>
When tested within IBM Watson, the response renders correctly, with a hyperlinked word.
When tested within Botium Box live chat, it does not render as a hyperlink and instead shows the HTML markup.
When running a test in Botium Box, this utterance fails with the error:
Error: Temperature question 2/Line 6: assertion error - Error: Line 6: FAILURE: https://facilities.gwu.edu/heating-cooling-residential">go.gwu.edu/heatcool</a> Not Found Actual: 404 Expected: 200 at Promise.all.then.results (/home/ec2-user/botium-box-dist/premium/agent/node_modules/botium-asserter-hyperlink/src/HyperLinkAsserter.js:105:31) at <anonymous> at process._tickCallback (internal/process/next_tick.js:189:7)

To render HTML in the live chat, enable the corresponding option for the chatbot in Botium Box.
The error message is already pretty detailed: in the Botium Box trial, there is a hyperlink checker to assert that all hyperlinks returned from the bot are actually valid links with valid responses. In this case, the HTTP error code 404 is returned.
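Note that the URL reported in the error ends in `">go.gwu.edu/heatcool</a>`. That suggests the link was taken from the raw response text rather than from the parsed `href` attribute, which would explain the 404. This is an assumption based on the error message alone, not on the asserter's source. A minimal Python sketch of the difference, where `naive_urls` and `href_urls` are hypothetical helper names and not Botium functions:

```python
import re
from html.parser import HTMLParser

# The response markup quoted in the question.
RESPONSE = ('<a target="_blank" '
            'href="https://facilities.gwu.edu/heating-cooling-residential">'
            'go.gwu.edu/heatcool</a>')


def naive_urls(text):
    """Treat everything from 'http(s)://' up to the next whitespace as a URL."""
    return re.findall(r"https?://\S+", text)


class _HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> start tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(value for name, value in attrs
                              if name == "href" and value)


def href_urls(text):
    """Parse the markup and return only the real link targets."""
    collector = _HrefCollector()
    collector.feed(text)
    collector.close()
    return collector.hrefs
```

With the markup above, `naive_urls` yields the same malformed URL that appears in the error message, while `href_urls` yields only the actual link target.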

Related

Accessibility test being failed because text not included in an ARIA landmark

I have some text which is spat out if JavaScript isn't turned on and this is currently failing an accessibility test.
It is within a <noscript> tag but the accessibility test is saying that the text is not included within a landmark.
None of the 8 standard roles seem to cover this, and I can see there is a generic role.
Is it therefore okay to use:
<noscript role="generic">
Or is that going to be a poor user experience for someone with a screenreader?
Thanks
Just fleshing this out a bit.
When JavaScript is disabled via developer tools, the message just gets spat out on the page with no tags at all. It looks like:
<body>
"Please ensure Javascript is enabled for purposes of.....
<meta charset="UTF-8">............
The first landmark I can see is on the navigation so the message itself is not wrapped within another landmark.
It is being flagged as an error because of:
https://www.w3.org/WAI/WCAG22/Techniques/aria/ARIA11
https://www.w3.org/WAI/WCAG21/Techniques/aria/ARIA20
https://alfa.siteimprove.com/rules/sia-r57
Based on the above, am I right in thinking I can either:
Wrap the message in <dialog>message to go here</dialog> OR
Bring the current message inside the <header> tag

Uncaught SyntaxError: Failed to set the 'innerHTML' property on 'Element' while testing opening encrypted epubs

While attempting to open encrypted epubs using TestCafe I consistently get this error:
Uncaught SyntaxError: Failed to set the 'innerHTML' property on 'Element': The provided markup is invalid XML, and therefore cannot be inserted into an XML document.
In browser mode, the script shows the browser throwing this error: error in line 10 at column 8: Opening and ending tag mismatch: meta line 0 and head
I found this possible reason:
XHTML does not support document.write or .innerHTML. Due to the fact, that jQuery inserts the new code using one of these methods, all XHTML compatible browsers will error out
Does this mean that I cannot use TestCafe at all to do this kind of operation?
The code I am using is a simple .click(bookselector)
TestCafe can test HTML pages only. Your browser can treat EPUB files as pages when you click on a link because the EPUB format is very similar to XML and HTML. Instead of clicking on a link to an EPUB file, consider reading the URL from the link's href attribute and downloading the file with Node's http.request or the got package.

Why the same URL gives different results?

On the following page, the page numbers 2, 3, ... at the bottom all point to the same URL, yet different tables are shown. Does anybody know what specific technique is used here? How can I extract the information in these tables using raw HTTP requests (I prefer not to use a headless browser)? Thanks.
https://services27.ieee.org/fellowsdirectory/home.html#results_table
It is using JavaScript (AJAX) to make HTTP calls to the server.
If you inspect the Network activity in the Developer tools you will see calls to the following URL: https://services27.ieee.org/fellowsdirectory/getpageresultsdesk.html.
They send this data from JavaScript:
selectedJSON: {"alpha":"ALL","menu":"ALPHABETICAL","gender":"All","currPageNum":1,"breadCrumbs":[{"breadCrumb":"Alphabetical Listing "}],"helpText":"Click on any of the alphabet letters to view a list of Fellows."}
inputFilterJSON: {"sortOnList":[{"sortByField":"fellow.lastName","sortType":"ASC"}],"typeAhead":false}
pageNum: 2
You can see the pageNum property. This is how they request a specific page of results.
When you click the number buttons, some JavaScript code makes an AJAX POST request to https://services27.ieee.org/fellowsdirectory/getpageresultsdesk.html;jsessionid=yoursessionid with form data including pageNum: 3 and some other formatting parameters. The server responds with an HTML block of table rows that gets loaded into the page. You can look at the requests on that webpage in your browser's network inspector (in the developer tools) to see exactly what HTTP requests are happening.
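The same POST can be built without a browser. A rough Python sketch using only the standard library, with field names and values copied from the captured request above; `build_page_request` is a hypothetical helper name, and whether the server also requires a session cookie or the `jsessionid` path parameter is an assumption that would need checking against the live site:

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "https://services27.ieee.org/fellowsdirectory/getpageresultsdesk.html"


def build_page_request(page_num):
    """Build (but do not send) the form POST for one page of results."""
    # Values copied from the request seen in the browser's network inspector.
    selected = {
        "alpha": "ALL",
        "menu": "ALPHABETICAL",
        "gender": "All",
        "currPageNum": 1,
        "breadCrumbs": [{"breadCrumb": "Alphabetical Listing "}],
        "helpText": "Click on any of the alphabet letters to view a list of Fellows.",
    }
    input_filter = {
        "sortOnList": [{"sortByField": "fellow.lastName", "sortType": "ASC"}],
        "typeAhead": False,
    }
    form = {
        "selectedJSON": json.dumps(selected),
        "inputFilterJSON": json.dumps(input_filter),
        "pageNum": str(page_num),  # the only field that changes per page
    }
    body = urlencode(form).encode("ascii")
    return Request(ENDPOINT, data=body, method="POST")
```

Passing the result to `urllib.request.urlopen(build_page_request(2))` would send it; the response should be the HTML fragment of table rows described above, which can then be parsed like any other HTML.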
Each link has an onclick handler that changes the href when the link is clicked. Go to
https://services27.ieee.org/fellowsdirectory/home.html#results_table
In the console, enter:
window.location=getDetailProfileUrl('lOH1bDxMyI1CCIxo5ODlGg==');
This redirects to Aarons, Jules.
Now go back and enter window.location=getDetailProfileUrl('JJuL3J00kHdIUozoVAgKdg==');
This opens Aarts, Ronald.
Basically, when the link is clicked, the JavaScript changes the url of the link.
To extract them using PHP, use the file_get_contents() function.
echo file_get_contents('https://services27.ieee.org/fellowsdirectory/home.html#results_table');
That will print out the page. Now scrape it with JavaScript.
echo "<script>console.log(document.querySelectorAll('.name'));</script>";
Hope this helps.

Change HTTP status code for page in Adobe CQ5 (AEM)

I'm trying to support a CQ5 (5.5) installation developed by an outside firm for my company.
It appears that my company wanted a pretty 404 page that looked like the rest of the site, and using the custom Sling 404.jsp error handler to redirect to a regular page that merely says "Page Not Found" was the easiest way to do it. The problem is that the 404 page actually returns a 200 status code since it really is just a regular content page that bears a "Not Found" message on it.
This is causing us problems with Google and the GoogleBot, since Google believes all the old search links to now non-existent pages are still valid (200 status code).
Is there any way to configure CQ to return the appropriate 404 status code for the "not found" HTML page that we display? When I am in the CQ Author mode editing the page, I find nothing in page properties or in components that could be added to the page.
Any help would be appreciated, as CQ is not exactly my area of expertise.
You'll have to overlay the /libs/sling/servlet/errorhandler/404.jsp file to do so: copy it to /apps/sling/servlet/errorhandler/404.jsp and change it to suit your needs.
If you specifically want to set the appropriate response status code, you can do so on the response object:
response.setStatus(404);
UPDATE: instead of redirecting to page_not_found.html, you might want to include it in 404.jsp after setting the response status:
<sling:include path="path/page_not_found.html" />
You can set the response code fairly easily with this sort of code: response.setStatus(SlingHttpServletResponse.SC_NOT_FOUND);
So for example, a quick-and-dirty implementation on your page_not_found.jsp would be as follows:
<%
response.setStatus(SlingHttpServletResponse.SC_NOT_FOUND);
%>
(or a longer-term/better implementation would be to set it via a tag and a tag library to avoid scriptlets)
If your page_not_found.html page is a static HTML page and not rendered via a jsp, you may need to change your 404.jsp so it redirects to a page that is rendered via a jsp for this approach to work. The status code is set by the server rendering the response. It is not something intrinsic in the HTML itself, so you won't be able to set this in a regular, static HTML page. Something must be done on the server to set this status code. Also see How to Return Specific HTTP Status Code in a Plain HTML Page
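To illustrate that the status code belongs to the server's response and not to the HTML, here is a minimal sketch in Python (not CQ/Sling code): a WSGI app that serves a friendly "Page Not Found" page while still reporting 404. The name `not_found_app` and the page body are made up for the example.

```python
# Illustrative page body; any HTML could be served here.
NOT_FOUND_HTML = b"<html><body><h1>Page Not Found</h1></body></html>"


def not_found_app(environ, start_response):
    """WSGI app: return the friendly page, but with a 404 status line."""
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(NOT_FOUND_HTML))),
    ]
    # The status is chosen here, on the server, independently of the body.
    start_response("404 Not Found", headers)
    return [NOT_FOUND_HTML]
```

To try it locally, `wsgiref.simple_server.make_server("", 8000, not_found_app).serve_forever()` serves the page on port 8000. A crawler would see the same HTML as before, but with a 404 status.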

The HtmlHighlighter of Boilerpipe in .NET does not always return the text

I am using Boilerpipe in my application. When I extract the content using ArticleExtractor I get plain text only; all the HTML formatting has been removed, so I am trying HtmlHighlighter instead. But the process method of HtmlHighlighter fails for certain URLs.
Is there any option to pass an HTML string to this method? Can anybody explain?
You can use IKVM to convert the Boilerpipe jar into a DLL to use in your .NET applications. I am using this approach and it works fine when sending HTML through the different Boilerpipe methods.
If the page content that you are trying to access is loaded by JavaScript, a simple HTTP request can't retrieve it.
First you need to get the resulting HTML after the JavaScript changes, and then give it to Boilerpipe.
