Open source object recognition library for visual browsing interfaces - web-scraping

Are there any image analysis or object recognition libraries that would be suitable for use in browser emulation, e.g. automating user logins, filling in forms, or extracting data from tables?
I understand that HTML parsers are more commonly used than image analysis algorithms for browser emulation. However, I would like to automate tasks on websites that frequently change their HTML and appearance. I would like to build an "intelligent" scraper that can recognize these visual browser elements, used in conjunction with conventional browser emulation techniques.
Are there any existing libraries for this purpose, ones that would be suitable for screenshots of GUIs (non-photographic images) containing elements such as buttons, tables, checkboxes and text?

"Selenium automates browsers. That's it."

Related

Accessibility with Google Slides Embed on a webpage

I've got a site with PowerPoint presentations that the client wants embedded; the embedding is being done via the Google Docs embed. I have been doing some accessibility testing, albeit not particularly in-depth, and even the OS X screen reader is not having any luck reading the slides. (I am aware slides are probably terrible for accessibility anyway.) I can get the text content of the slides stripped out via the Google API, but I don't know if that's the best thing: to include it on the page below/above the iframe embed with one of the CSS tricks for hiding it from normal view?
I am aware of iframe title and aria-label but those seem to imply they are only to describe the contents of the iframe, which I am doing, but I need somewhere that can contain more text.
Has anyone got any good tips for the best way to deal with such things? Thanks!!
Embedding rich 3rd-party content in web pages poses many challenges.
When we put something like this in a web page, we typically think we're adding a bit of "content", but it often amounts to embedding a complex application; and the user interface, semantics, and presentation are outside of our control. In your case it's a presentation slide deck, but it could also be a Flash/Silverlight/Java applet, a slippy map, interactive SVG infographic, a 3D-panorama, virtual tour, zoomable image, chemical molecule viewer, or who-knows-what. (Note: I'm not familiar with the Google docs embed/API specifically, so most parts of my answer will address these rich content cases in general.)
Even if the embedded rich 3rd-party content/application is accessible today, there's no guarantee it will remain so after the 3rd-party system gets an update.
So what can you do? The safest thing might be to assume it's inaccessible, and consider the best way to provide an accessible alternative. The Web Content Accessibility Guidelines (WCAG) call this a "conforming alternate version", and it sounds like you're already thinking along these lines.
An important caveat to all of this is that the use of "conforming alternate versions" isn't considered ideal by many accessibility specialists. It's greatly preferred to make your main content as accessible as you can.
Some relevant parts of WCAG:
Understanding conformance, especially the section about "Understanding Conforming Alternate Versions".
Technique G190: Providing a link adjacent to or associated with a non-conforming object that links to a conforming alternate version.
F19: Failure of Conformance Requirement 1 due to not providing a method for the user to find the alternative conforming version of a non-conforming Web page.
It's worth mentioning the 3rd-party content in your website's accessibility statement. Statement of Partial Conformance - Third Party Content offers guidance about that.
The crucial thing about a conforming alternate version is that it's no use at all if the user isn't made aware of it, or isn't able to reach it.
Implementation-wise, there are a variety of approaches you might take. In many ways, providing an alternative for embedded rich content is similar to providing a long description of a complex image, or a transcript of a video. Have a look at these WAI tutorials for some ideas.
Web Accessibility Tutorials: Complex Images
Making Audio and Video Media Accessible: Where to Put Transcripts
I can get the text content of the slides stripped out via the Google API
It sounds like you're trying to automate the process. That's fine, but it might not give satisfactory results. Some things you should consider:
Is the text content alone going to be enough? Presentations often have images too. Did the author provide text alternatives for these images, and are they present in the text extracted via the API? If the author hasn't provided text alternatives for images in the slide deck, the text you get from the API won't be telling the whole story.
Not all text in slides carries the same weight. Some slides serve to introduce sections, some slides have headings. Does the text obtained from the API convey these relationships?
Lists are very commonly found in presentations. Does the text obtained from the API preserve this structure?
Slides sometimes contain links. Are these included in the text obtained from the API, so the links are available to everybody using the alternative version?
Slides sometimes contain tables and charts. How will the information in these be conveyed in your alternative version? Is the information included in the text obtained from the API?
Sometimes, presentation decks also contain rich 3rd-party embedded content themselves! A slide containing a video or an animated GIF is an example of this. If so, you can find yourself with a much more complex challenge.
... and many other kinds of meaningful information and relationships: quotations, code samples, etc.
If any of the above points give cause for concern, you will need to consider managing your alternative version manually.
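The checks above can be sketched in code. This is only an illustration under stated assumptions: the `Slide` structure below is hypothetical (it is not the shape of any Google API response), and the `<h3>` heading level is a guess at what would fit the host page. The idea is that headings, lists and links are carried over as real markup rather than flattened text, and that images without a text alternative are flagged for manual attention.

```python
from dataclasses import dataclass, field
from html import escape
from typing import List, Optional, Tuple


@dataclass
class Slide:
    # Hypothetical structure; whatever the API returns must be mapped onto it.
    title: str
    paragraphs: List[str] = field(default_factory=list)
    bullets: List[str] = field(default_factory=list)
    links: List[Tuple[str, str]] = field(default_factory=list)  # (text, url)
    images: List[Optional[str]] = field(default_factory=list)   # alt text or None


def slide_to_html(slide: Slide, number: int) -> Tuple[str, List[str]]:
    """Return (html, warnings) for one slide, preserving its structure."""
    warnings: List[str] = []
    parts = ["<section>", f"<h3>Slide {number}: {escape(slide.title)}</h3>"]
    for text in slide.paragraphs:
        parts.append(f"<p>{escape(text)}</p>")
    if slide.bullets:
        # Keep list structure as a real list, not as run-together text.
        parts.append("<ul>")
        parts.extend(f"<li>{escape(item)}</li>" for item in slide.bullets)
        parts.append("</ul>")
    for text, url in slide.links:
        parts.append(f'<p><a href="{escape(url)}">{escape(text)}</a></p>')
    for alt in slide.images:
        if alt:
            parts.append(f"<p>Image: {escape(alt)}</p>")
        else:
            # No text alternative: a human has to write one.
            warnings.append(f"Slide {number}: image has no text alternative")
    parts.append("</section>")
    return "\n".join(parts), warnings
```

Any warnings returned are a sign that the automated output alone will not tell the whole story for that slide.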
The "conforming alternative version" has to conform to WCAG; if you just offer a choice of two non-conforming versions, that doesn't satisfy WCAG.
include it on the page below/above the iframe embed with one of the CSS tricks for hiding it from normal view?
No, I wouldn't recommend that. I assume you're referring to visually-hidden text, using CSS utility classes such as .visually-hidden or .sr-only. It sounds like you're only thinking about screen reader users.
You need to make the alternative version available to everybody, not just one group of users who you think will need it.
Many groups of users may experience difficulty using the rich 3rd-party embedded content. This includes people using the keyboard only, screen readers, magnifiers, speech control, switch access, or other tools. The conforming alternative version can be navigated like a normal web page though.
The 3rd-party content likely has a different visual style, and it may not be adaptable in the same way as the page it is embedded in. That can frustrate people who make use of browser zooming, text resizing, font preferences, reader mode, Windows high-contrast themes, viewport resizing, and other user-applied presentation changes. The conforming alternate version should be as adaptable as the rest of your site.
Rather than hiding the alternative version in a visually-hidden container, here are some other ways to present it. The first two are the simplest and most reliable.
Put it on the same page, just after the original content, visible to everybody.
Put it in a collapsible disclosure element just after the embedded content. A <details> element is an easy way to achieve this. This is useful if the alternative version is quite long.
Put it elsewhere on the same page, and tell users where to find it. An internal link can help here.
Put it on a separate page, with a link next to the embedded content. I'm less keen on this approach. Only use it if you can commit to maintaining both pages.
Provide a way for users to switch between the two versions. For example some buttons, or a tabbed UI. However, you must also ensure that the switching mechanism is accessible. That might mean a full-blown ARIA tabs implementation.
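As a rough sketch of the disclosure-element option from the list above, here is a small helper that emits the iframe followed by a visible, collapsible text version. The function name, parameters and the wording of the title and summary are all assumptions made for illustration; the alternative HTML is assumed to be trusted markup you generated yourself.

```python
from html import escape


def embed_with_alternative(src: str, deck_title: str, alternative_html: str) -> str:
    """Return an iframe followed by a collapsible text version for everybody."""
    # The title tells screen reader users that a text version follows.
    title = f"Slide presentation: {deck_title}. Text version follows this frame."
    return "\n".join([
        f'<iframe src="{escape(src)}" title="{escape(title)}"></iframe>',
        "<details>",
        f"<summary>Text version of {escape(deck_title)}</summary>",
        alternative_html,  # assumed to be trusted, already-valid HTML
        "</details>",
    ])


print(embed_with_alternative(
    "https://example.com/embed",  # placeholder URL
    "FOO BAR BAZ",
    "<h3>Slide 1: Introduction</h3><p>Welcome.</p>",
))
```

Because the `<summary>` is visible to all users, the alternative is discoverable without relying on the iframe title alone.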
I am aware of iframe title and aria-label but those seem to imply they are only to describe the contents of the iframe
Giving the iframe a useful name is important. It's also a useful mechanism to inform screen reader users that an alternative version is available. The WAI Complex Images tutorial linked above has some similar approaches. Example: <iframe title="Google Slides Presentation of FOO BAR BAZ. Link to text version follows this frame.">. This only helps screen reader users though; you still need to make the availability of the alternative version clear to everyone.
How committed are you to using Google Docs for displaying these slides?
Any accessibility enhancements that you'll be able to implement on Google Slides won't be very good.
One way around this whole thing is to offer PDF versions of the slides that have been fully-remediated for accessibility instead of using Google Slides. That would potentially be a single solution that could be accessible to all visitors. Going this route means that you wouldn't have multiple copies of the same slides to update, which could lead to a split in content if one gets updated and the other is forgotten.
If you're really set on having the slides embedded in the page, then you could offer both formats by applying aria-hidden to the embedded iframe and then hiding the PDF links from sighted users using CSS clip, or by positioning content off-screen.

How are user interfaces for websites designed?

I am more of a server-side programmer, so bear with me on this. How exactly are user interfaces for websites designed? I mean, which tools are usually used? Let's take stackoverflow.com as an example, which has a lot of dynamic content. How are the various areas designed? I am pretty sure it's not done in Visual Studio. The server-side code is probably in ASP.NET, but what about the actual UI (layout, images, tables, buttons, etc.)?
What is the usual workflow for an activity like this? Say, I have a design on paper. Where do I go from there? How do you wire in the code after the interface design is complete?
How do you handle the fact that in a page, some of the stuff is static and some areas are dynamic? (like the ask question page I am on now)
As you said, it boils down to the requirements of the webpage.
For a professional (fairly big) website, many teams are involved: for example, a creative team to do the paper work and design the UI elements and controls, a graphics team to actually design the images, UI developers to place the content appropriately and write the CSS, and architects to decide on performance for various items (and to make the call on the static/dynamic nature of controls).
Generally, designers use external tools to design HTML pages and provide templates, and the same templates can be used later in Visual Studio to make the actual pages. There are many such tools available on the market, such as Dreamweaver. There are also many freeware tools for designing client-rich and CSS-rich websites; you can search Google for these.
If your website's requirements are not very client-rich, you can still design it using Visual Studio, or use the new Microsoft product WebMatrix, which gives you user-friendly tools to make a website look fabulous.
The paper design is the right first step.
How to continue:
You can get the 960 grid system from http://960.gs/ and start from there. It's a nice trick: it provides ready-to-use CSS templates that you can build your design on.
The image effects:
The shadows, borders and other things that you ask about are usually made in Photoshop, but newer browsers now support many of them using CSS. For example: http://css-tricks.com/snippets/css/css-box-shadow/
Software that can help on design:
- MS Expression
- Dreamweaver

Choosing a suitable multi-media builder software

Hi folks,
I need some software, but I am not a multimedia builder; I am a .NET developer. I want to choose a tool to build my first multimedia application. It has to meet these requirements:
1- It must be portable between different Windows operating systems.
2- It must be self-contained; I don't want to have to install other software before it.
3- It must run from the CD's autorun.
4- I need the ability to search for values in the information.
5- I don't want anyone to be able to copy my information easily.
6- The information consists of videos and rich text.
7- It must have the ability to change on specific screen resolution.
Please advise which software would be suitable for building this application. I need something easy to build with, not very complex, but the result must have a beautiful user interface.
You should try Mediachance's Multimedia Builder. It does exactly the things you described.
http://www.mediachance.com/mmb/
It sounds like you want to develop a multimedia distribution which has videos and text for the end user. That's very similar to a tutorial or training CD or DVD. There are many ways to develop this sort of content, but perhaps the easiest (although not particularly .NET-related) would be a web-based site stored on disc.
Design the product using HTML, CSS, and your preferred video format for web. (Silverlight, Flash, Quicktime...)
To address your points:
Web-based data is extremely portable, not just between Windows installations but across platforms and browsers.
It would be free from dependencies for the most part, assuming the user has a web browser with applicable add-ons to view the video content (such as a Flash or Silverlight plug-in).
You can use a text editor to create an autorun.inf file which will automatically load the main file (usually index.html).
The user can use the browser search functionality to easily find keywords in the pages. If you need the ability to search the entire contents of the multimedia package, that will add a small amount of complication.
The downside to a web-based product is that the files are plain text and anyone can easily copy the data. The question I have is whether you want to try and prevent copying of the entire product (say, as a CD) or just the information it displays?
There shouldn't be any problems displaying videos and rich text in a web environment, provided you've converted them to a format that is compatible with the intended distribution. (For example, if your audience uses Windows and you know they will have a Flash plugin, then a Flash-based video format would be ideal.)
Assuming that you mean reflow by "change on specific screen resolution," this was one of the main reasons I thought of web-based media. The browser will of course be capable of displaying content with appropriate resizing, just as it does for most web sites that are crafted with multiple screen sizes in mind. This is simply a matter of using appropriate CSS to ensure that elements appear just as logical on a widescreen, high-resolution monitor as they do on the lowest expected resolution screen.
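For the autorun point above, the file can be generated as part of the build. This is a minimal sketch, assuming the disc's start page is `index.html` in the disc root; the function name and the label text are made up for illustration. It uses the `shellexecute` entry, which asks Windows to open a document with its associated program (here, the default browser). Whether Windows acts on it automatically depends on the Windows version and the user's AutoPlay settings, so treat it as a convenience rather than a guarantee.

```python
from pathlib import Path


def write_autorun(disc_root: Path, start_page: str = "index.html",
                  label: str = "My Disc") -> Path:
    """Write an autorun.inf that opens start_page with its default handler."""
    lines = [
        "[autorun]",
        f"shellexecute={start_page}",  # open the page in the default browser
        f"label={label}",              # name shown for the disc in Explorer
    ]
    target = disc_root / "autorun.inf"
    # Write bytes directly so the file always gets Windows (CRLF) line endings.
    target.write_bytes(("\r\n".join(lines) + "\r\n").encode("ascii"))
    return target
```

Calling `write_autorun(Path("disc_image"))` would place the file next to `index.html` in a (hypothetical) `disc_image` folder that is then burned to the CD.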
To build a multimedia site, consider Adobe's products such as Dreamweaver, Photoshop, Flash, Fireworks, etc. (http://www.adobe.com/products/creativesuite/web/whatsnew/)
If you would prefer to develop an application using the .NET Framework instead, you may want to consider Windows Presentation Foundation (http://windowsclient.net/wpf/white-papers/when-to-adopt-wpf.aspx); however, there may be prerequisites depending on how you build the application.

Interactive Graph Visualisations in ASP.NET Website (Drag/Select/Link/Unlink)

We have a requirement to create a website (ASP.NET v4.0) which displays a graph. It should be able to:
Display nodes (with names and colours)
Display links between the nodes, with text on the link (e.g. '85%')
Interact with nodes/links to drag/move/select
Lay out the nodes clearly and automatically
Can add/remove nodes (asynchronously) and link/unlink easily
Javascript interaction with events (onClick, onDrag)
Events must provide identification of selected nodes/links (Javascript).
Ability to zoom in/out (ideally)
Updates Asynchronously (rather than full postbacks)
Responsive when displaying >100 nodes
Flash is not supported
MUST support IE6 (just don't ask...!)
Development is Visual Studio 2010 on .Net Framework v4.0
We were using the Syncfusion Diagram tool (v. 6.1.0.34) running on v2.0, but we recently upgraded to v4.0, and a breaking change in System.Web ViewState management means we need to find an alternative. It's possible the latest version (v8.3) is much better, but we're reluctant to fork out a few thousand pounds for the licenses if it's just as bad.
We found the Syncfusion tool OK, but it was very difficult to code against (without manual hacks), and it performed quite badly with large graphs, when it loads 200 images from the server.
Really looking for some inspiration from you guys. Any suggestions or shared experiences would be most helpful.
Thanks in advance.
mxGraph is designed for this type of functionality (disclaimer: I work for them). It supports IE 6 and is written entirely in JavaScript. It comes with .NET backend server classes to handle communication with the JavaScript client. To get responsive behaviour with over 100 nodes on IE 6, you need to switch to a server-side image once the graph exceeds about 50-60 nodes, since IE 6 performs very badly. We include an example that demonstrates how to do this. Give it a try; if you require evaluation support, there is a forum for that.

Minimizing the pain in implementing printable reports

How do you minimize the pain in your development process when it comes to reporting?
For web frameworks, there is a pretty straightforward way to both produce content as well as graphically design it; content is represented semantically through HTML, and the design is separately specified through CSS. And browsers are fairly consistent with how they render the output (and the inconsistencies are well-known and can be planned for). There are even WYSIWYG editors to help out less-CSS-savvy graphical designers.
But what do we do about print content?
At one company, I created a process that worked like this: A script generated a semantic representation through XML. The XML was passed through XSLT to generate an XML-FO document. Then, this was passed to another tool (Apache FOP, I believe) to generate a PDF. This worked well for that company.
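A minimal sketch of that pipeline, for readers who haven't seen one: the element names (`report`, `row`, and so on), the stylesheet and the file names are all invented for illustration, not the ones used at that company. The script builds the semantic XML, an XSLT stylesheet maps it to XSL-FO, and the command line asks Apache FOP to apply the stylesheet and render the PDF in one step.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple


def build_report_xml(title: str, rows: List[Tuple[str, str]]) -> str:
    """Step 1: a purely semantic XML representation, with no layout in it."""
    report = ET.Element("report")
    ET.SubElement(report, "title").text = title
    for name, value in rows:
        row = ET.SubElement(report, "row")
        ET.SubElement(row, "name").text = name
        ET.SubElement(row, "value").text = value
    return ET.tostring(report, encoding="unicode")


# Step 2: an XSLT stylesheet that maps <report> onto XSL-FO page layout.
REPORT_XSLT = """<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">
  <xsl:template match="/report">
    <fo:root>
      <fo:layout-master-set>
        <fo:simple-page-master master-name="page"
            page-height="297mm" page-width="210mm" margin="20mm">
          <fo:region-body/>
        </fo:simple-page-master>
      </fo:layout-master-set>
      <fo:page-sequence master-reference="page">
        <fo:flow flow-name="xsl-region-body">
          <fo:block font-size="18pt"><xsl:value-of select="title"/></fo:block>
          <xsl:for-each select="row">
            <fo:block>
              <xsl:value-of select="name"/>: <xsl:value-of select="value"/>
            </fo:block>
          </xsl:for-each>
        </fo:flow>
      </fo:page-sequence>
    </fo:root>
  </xsl:template>
</xsl:stylesheet>
"""


def fop_command(xml_path: str, xslt_path: str, pdf_path: str) -> List[str]:
    """Step 3: the FOP invocation that transforms the XML and renders the PDF."""
    return ["fop", "-xml", xml_path, "-xsl", xslt_path, "-pdf", pdf_path]
```

In practice you would write both strings to disk and pass the command to `subprocess.run`; that part is left out so the sketch runs without FOP installed.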
At this company, however, output appearance matters to management, and we have a graphical designer. Currently, we are using a reporting tool (XtraReports from Developer Express, version 8.1). It isn't bad; it outputs to a variety of formats, has a WYSIWYG designer, reports are implemented through C# classes, and it supports data binding to data sets (unfortunately, not POCOs). However, we have some major pain points with this setup:
The reporting framework has major limitations on how you can lay out and group your reporting bands
Presentable elements, especially charts, lack the capabilities we need to fine-tune and achieve the look of our mock-ups.
There is no good way to share styles and layout among reports akin to what we can get through CSS.
Good composability of reusable parts is very hard to implement. So we end up with a lot of copy & paste inheritance of functionality; this is bad news whenever we need to make sweeping changes across all reports.
Now, maybe there's some kick-ass framework out there that can eliminate the pains of reporting frameworks, but I assume that they all have their weaknesses. Do you have a framework or process that works well for you and reduces the pain points inherent in reporting?
Prince XML is a really cool tool which allows you to use HTML or XML styled with CSS (including CSS paged media for printing) and generate PDFs from it.
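To give a feel for that approach, here is a small sketch: the report is ordinary HTML, and the print layout (page size, margins, page numbers) lives in a CSS `@page` rule, so styles can be shared between reports like any other stylesheet. The function names, the styles and the file names are assumptions for illustration; the command assumes the `prince` executable is on the PATH.

```python
from html import escape

# Print layout expressed in CSS paged media; shareable across reports.
PRINT_CSS = """
@page {
  size: A4;
  margin: 20mm;
  @bottom-center { content: "Page " counter(page) " of " counter(pages); }
}
h1 { page-break-after: avoid; }
table { width: 100%; border-collapse: collapse; }
"""


def build_report_html(title, rows):
    """Build a semantic HTML report from (name, value) pairs."""
    body = "".join(
        f"<tr><td>{escape(name)}</td><td>{escape(value)}</td></tr>"
        for name, value in rows
    )
    return (
        "<!DOCTYPE html>\n"
        f'<html><head><meta charset="utf-8"><title>{escape(title)}</title>'
        f"<style>{PRINT_CSS}</style></head>"
        f"<body><h1>{escape(title)}</h1><table>{body}</table></body></html>"
    )


def prince_command(html_path, pdf_path):
    """Command line that asks Prince to render the HTML file to a PDF."""
    return ["prince", html_path, "-o", pdf_path]
```

A designer who already knows CSS can then adjust the look of every report by editing the stylesheet, without touching the code that produces the content.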
Option #1: Adobe Acrobat is really nice. You can design form-enabled PDFs and then use something like PDFSharp to manipulate the PDF document. You can create template PDFs that you dump your generated content into. I've done this before and it was pretty successful. It also worked nicely with POCO objects.
Option #2: You could start creating XPS documents, which are XML-based anyway. They can be easily converted to PDF if necessary.
Option #3: Run for your life. (Might not be an option.)
i-net Clear Reports is a nice product. It's based on Java, but you can also work with it from ASP.NET; there is a bridge. A .NET version is in the works if you want to work with POCOs. Because the Java version can work with POJOs, the coming .NET version will also work with POCOs.

Resources