Is there any way to control programs by finding them in Task Manager and managing their contents?

Hello, I guess my title is bad enough to explain the question, but I am trying to understand: is there any way to control and automate things just by finding tasks from Task Manager? I have seen "Spy++" in Visual Studio. At first I didn't understand its aim or how far we can go with it; I only gathered that it can log a wide range of events.
I would like to give an example:
I want to log in to Facebook/Twitter and do casual things with software I developed myself (I don't want to use Selenium or anything of that kind), or I want to get information from a game, such as a character's current health, attack power, ability power... or send that game commands from my software, like pressing a, b or 1.
Can someone tell me the exact subject name of what I am talking about?

Terminology: Selenium / AutoIt: "UI automation". Reading and modifying in-game values: "memory editor" or "trainer".
There is no universal way to control programs if you want your tool to be transparent. A browser may listen to OS input events (Windows messages telling it which keys were pressed or where the mouse was clicked), games may use DirectInput and yet other apps may subscribe to low-level system events or hooks.
Take browser automation, for example:
Using plugins/extensions gives you a JavaScript API that allows you to inspect pages, forms on those pages, modify browser behavior and whatnot.
Browsers can also have their own external API. This can be done by linking to their DLLs, or passing command line arguments, or passing messages in other ways. For Firefox, this API is named "Marionette".
Then there's Selenium, which provides a common API for various browsers. It controls them using "drivers".
Selenium "knows" how to drive a browser, as it's coded against the browser's APIs. Spy++ "knows" that it's inspecting a Win32 window and looks for known controls, their classes and their names so you could write another program to send specific messages to those specific controls of those specific applications.
As for "log in to Facebook", no, you cannot do that in a reasonable amount of time for the currently popular browsers if you want to code it from the ground on up.
You'll have to, in one way or the other, interface with the browser and ask for a handle to the username/password textboxes, enter data into them and then submit the form. Then you'll practically be rebuilding Selenium, so why not use that tool in the first place?
Or you'll have to scrape the pixels on the screen, recognize those textboxes, click the mouse there and send some keys. And then Facebook redesigns their login form and you'll have to start over.
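That pixel-scraping route might look like the sketch below, using the pyautogui library; the screenshot file is a hypothetical asset you would capture yourself, and would have to recapture after every redesign.
import pyautogui

# Locate the username textbox by matching a previously captured screenshot
# of it against the current screen (slow, and brittle by design).
# Depending on the pyautogui version, a miss may raise instead of returning None.
box = pyautogui.locateCenterOnScreen("username_box.png")
if box is not None:
    pyautogui.click(box)            # focus the textbox
    pyautogui.write("my_username")  # type like a user would
    pyautogui.press("tab")          # move on to the next field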
tl;dr: use the right tool for the job. If you want to automate a site's UI, then use Selenium.
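For comparison, the Selenium version is only a few lines. This is a sketch: the locators are placeholders, real pages need their actual element names, and automating logins may violate a site's terms of service.
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()  # geckodriver speaks Marionette to Firefox for you
driver.get("https://example.com/login")
driver.find_element(By.NAME, "username").send_keys("me")
driver.find_element(By.NAME, "password").send_keys("secret")
driver.find_element(By.CSS_SELECTOR, "button[type=submit]").click()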

Related

make non-native application accessible to screen readers for the visually impaired

I create applications that are divorced from any native framework. All rendering happens in OpenGL, with a context provided by GLFW, all in C, with no framework to supply compatibility. As such, standard screen readers like NVDA have no chance of picking up information (excluding OCR), and my applications are an accessibility black hole.
How can I provide an interface for screen readers to latch onto? I presume this is a per-OS thing... How would that be possible on Windows, Linux, BSD or even Android? In the *NIX world, I presume this would be desktop-environment dependent...
I'm finding a lot of information on this, with a framework as a starting point, but have a hard time finding resources on how to do it from scratch.
I'm fully aware this is far beyond the capability of a sole developer, and I know that writing programs while ignoring native interfaces is a common accessibility hole which you are advised to avoid.
However, I have a tough time finding resources and jump-in points to explore this topic. Can someone point me in the right direction?
TL;DR: How to provide screen-reader compatibility from scratch. Not in detail - but conceptually.
As you have already well identified, your app is an accessibility black hole because you are using a rendering engine.
It's basically the same for OpenGL, SDL, or <canvas> on the web, or any library rendering something without specific accessibility support.
We can talk about several possibilities:
Become an accessibility server. Under Windows, this means doing what's necessary so that your app provides accessible components on demand through the UIA / IAccessible2 interfaces.
Use a well-known GUI toolkit with accessibility support, and its provided accessibility API, to build your app.
Talk directly to screen readers via their respective APIs in order to make them say something and/or show something on a connected braille display.
Do screen-reader-specific scripting.
However, it doesn't stop there. Supporting screen readers isn't sufficient to make your app really accessible. You must also think about many other things.
1. Accessibility server, UIA, IAccessible2
This option is of course the best, because users of assistive technologies in general (not only screen readers) will feel right at home with a perfectly accessible app if you do your job correctly.
However, it's also by far the hardest, since you have to reinvent everything. You must decompose your interface into components, say which category each of them belongs to (more commonly called roles), provide callbacks to fetch values and descriptions, etc.
If you do web development, compare this with having to use ARIA everywhere because there are no defaults: no titles, no paragraphs, no input fields, no buttons, etc.
That's a huge job! But if you do it really well, your app will be well accessible.
You may get code and ideas on how to do it by looking at open source GUI toolkits or browsers which all do it.
Of course, the APIs to use are different for each OS. UIA and IAccessible2 are for Windows, but macOS and several Linux desktops also have OS-specific accessibility APIs based on the same root principles.
Note about terminology: the accessibility server or provider is your app or the GUI toolkit you are using, while the accessibility client or consumer is the screen reader (or other assistive tools).
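As a purely conceptual sketch of that decomposition (the classes below are made up for illustration; a real Windows provider implements COM interfaces such as UIA's IRawElementProviderSimple), the job is to re-express your OpenGL-drawn UI as a tree of roles, names and value callbacks:
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class AccessibleNode:
    role: str                                  # e.g. "window", "slider", "button"
    name: str                                  # what the screen reader announces
    get_value: Callable[[], str] = lambda: ""  # callback so values stay live
    children: List["AccessibleNode"] = field(default_factory=list)

# The widgets you draw yourself, re-expressed as a tree the OS API could walk:
root = AccessibleNode("window", "Settings", children=[
    AccessibleNode("slider", "Master volume", get_value=lambda: "80%"),
    AccessibleNode("button", "Apply"),
])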
2. Use a GUI toolkit with good accessibility support
Fortunately, you aren't obliged to reinvent the wheel, of course!
Many people did the job of point 1 above and it resulted in libraries commonly called GUI toolkits.
Some of them are known to generally produce well accessible apps, while others are known to produce totally inaccessible apps.
Qt, wxWidgets and Java SWT are three of them with quite good accessibility support.
So you can simplify the job considerably by simply using one of them and its associated accessibility API. You will be saved from talking more or less directly to the OS with UIA/IAccessible2 and similar APIs on other platforms.
Be careful though, it isn't as easy as it seems: all components provided by GUI toolkits aren't necessarily all accessible under all platforms.
Some components may be accessible out of the box, others need configuration and/or a little specific code on your side, and some are inaccessible no matter what.
Some are accessible under Windows but not under macOS, or vice versa.
For example, GTK is the first choice for making accessible apps on Linux under GNOME, but GTK under Windows gives quite poor results. Another example: wxWidgets' DataView control is known to be good under macOS, but it is emulated under Windows and therefore much less accessible.
In case of doubt, the best approach is to test for yourself under all combinations of OS and screen reader you intend to support.
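As a small illustration of that per-component work (a PyQt5 sketch of mine; wxWidgets and SWT have equivalents), the toolkit does the UIA plumbing for you, and you mostly fill in accessible names and descriptions where the defaults fall short:
import sys
from PyQt5.QtWidgets import QApplication, QPushButton

app = QApplication(sys.argv)
button = QPushButton("OK")
# What assistive tools announce, independent of the visible label:
button.setAccessibleName("Confirm order")
button.setAccessibleDescription("Submits the current order for processing")
button.show()
sys.exit(app.exec_())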
Sadly, for a game, using a GUI toolkit is perhaps not a viable option, even if some toolkits offer OpenGL components capable of displaying a 3D scene.
Here comes the third possibility.
3. Talk directly to screen readers
Several screen readers provide an API to make them speak, adjust some settings and/or show something on a braille display. If you can't, or don't want to, use a GUI toolkit, this might be a solution.
JAWS comes with an API called FSAPI, NVDA with the NVDA controller client. Apple also allows controlling several aspects of VoiceOver programmatically.
There are still several disadvantages, though:
You are specifically targeting some screen readers. People using another one, or another assistive tool than a screen reader (a screen magnifier, for example), are all out of luck. Alternatively, you may multiply support across a big forest of different APIs for different products on different platforms.
Each of these screen-reader-specific APIs supports different things that may not be supported by the others. There are no standards at all here.
Thinking about WCAG and how it would be transposed to desktop apps: you are in fact bypassing most best practices, which all recommend, first and above anything else, to use well-known standard components, and to customize only when really necessary.
So this third possibility should ideally be used if, and only if, using a good GUI toolkit isn't possible, or if the accessibility of the used GUI toolkit isn't sufficient.
I'm the author of UniversalSpeech, a small library trying to unify direct communication with several screen readers.
You may have a look at it if you are interested.
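For example, here is a minimal sketch of talking to NVDA through its controller client from Python. The DLL ships with NVDA; the exact filename, path and the 0-means-running convention are to the best of my knowledge, so treat them as assumptions to verify.
import ctypes

# Load the controller client DLL shipped with NVDA (adjust path/bitness).
nvda = ctypes.windll.LoadLibrary("nvdaControllerClient64.dll")

if nvda.nvdaController_testIfRunning() == 0:  # 0 means NVDA is reachable
    nvda.nvdaController_speakText("Level up!")
    nvda.nvdaController_brailleMessage("Level up!")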
4. Screen reader scripting
If your app isn't accessible alone, you may distribute screen reader specific scripts to users.
These scripts can be instructed to fetch information to give to the user, add additional keyboard shortcuts and several other things.
JAWS has its own scripting language, while NVDA scripts are developed in Python. As far as I know, there are also scripting capabilities for VoiceOver under macOS.
I gave you this fourth point for your information, but since you are starting from a completely inaccessible app, I wouldn't advise you to go that way.
In order for scripts to be able to do useful things, you must have a working accessible base. A script can help fix small accessibility issues, but it's nearly impossible to turn a completely inaccessible app into an accessible one just with a script.
Additionally, you must distribute these scripts separately from your app, and users have to install them. It may be a difficulty for some people, depending on your target audience.
Beyond screen reader support
Screen reader support isn't everything.
This is beyond your question, so I won't go into details, but you shouldn't forget about the following points if you really want to make an app that isn't only accessible but also comfortable to use for a screen reader user.
This isn't at all an exhaustive list of additional things to watch out for.
Keyboard navigation: most blind users and many visually impaired users aren't comfortable with the mouse and/or a touch screen. You must provide a full and consistent way of using your app with only a keyboard or, on mobile, only with the standard touch gestures supported by the screen reader. Navigation should be as simple as possible and should conform as much as you can to user preferences and general OS conventions (i.e. the functions of Tab, Space, Enter, etc.). This in turn implies having a good structure of components.
Gamepad, motion sensors and other inputs: unless it's absolutely mandatory because it's your core concept, don't force their use, and always allow a keyboard fallback.
Visual appearance: as much as you can, you should use the settings/preferences defined at OS level for layout, colors, contrast, fonts, text size, dark mode, high-contrast mode, etc., rather than your own.
Audio: don't output anything if the user can't reasonably expect it, make sure the volume can be changed very easily at any time, and, if it isn't against your core concept, always allow audio to be paused, resumed, stopped and muted. The same reflection applies to other outputs, like vibration, which you should always be able to disable.

Automate existing web browser session

How can I programmatically interact with an existing web page in a web browser launched in a standard way? For example I navigate to a specific page and want to be able to run a Python script that fills some edits or clicks some elements.
This should be possible at least through IAccessible2 for the main browsers, but I did not find any pointers. To put it another way, how do screen readers do it? And a bonus question: is there a Python library for it?
EDIT: I am looking for something more than user input simulation. I would like to programmatically read the DOM at least, write if possible. So far I have looked at code in NVDA which is very low-level and complex. Is there anything easier?
How can I programmatically interact with an existing web page in a web browser launched in a standard way? For example I navigate to a specific page and want to be able to run a Python script that fills some edits or clicks some elements.
The answer is keyboard/mouse macros if you have to visually see the browser as it happens. You can google macro programs for your OS.
But most likely you are looking for a headless browser such as PhantomJS, HtmlUnit, TrifleJS, Splash or SimpleBrowser.
Check out - https://saucelabs.com/blog/headless-browser-testing-101
When you mention 'interact with an existing webpage in a web browser launched in the standard way' you are talking about the DOM (Document Object Model).
Many QA environments are running testing scripts on code that has not been rendered by the browser into a DOM (you see the DOM when you inspect a page using your browser tools). When you use a headless browser it creates the DOM and then runs all the tests as if a human were clicking without having to visually look at it happen.
see - https://css-tricks.com/dom/
To put it another way, how do screen readers do it? And a bonus question: is there a Python library for it?
Screen readers are interacting with the DOM at a low level. I do not know if there is a Python library. Most likely this would be overkill though unless you are building a desktop app that interacts with browsers like a screen reader does.
edit...
I did some more digging and found this article, which is a much more verbose explanation of how screen readers interact with the browser/DOM.
Also, there is a Python API for manipulating the DOM, and this library seemed popular.
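One concrete option along these lines (my addition, not from the answer above): Selenium can attach to an already-running Chrome session through the DevTools protocol, which gives you programmatic DOM read/write on a browser launched in the usual way, provided Chrome was started with a debugging port.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By

# Attach to a Chrome instance started manually with:
#   chrome --remote-debugging-port=9222
options = Options()
options.add_experimental_option("debuggerAddress", "127.0.0.1:9222")
driver = webdriver.Chrome(options=options)

# Read and write the live DOM of whatever page is currently open.
print(driver.title)
driver.find_element(By.NAME, "q").send_keys("hello")  # placeholder locator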

Tool for Overlaying User Data on Government Form

I'm working on a project where users submit data, which is then placed onto a state form that they can print and submit. To give you an idea of what I'm talking about, the form looks similar to an IRS 1040 form (https://www.healthykids.org/_img/document_1040.gif).
We've recently discovered that the form generated by our software isn't close enough to be accepted by some state's OCR process.
We're looking for some way to quickly create stylesheets or something similar so that the data can be overlaid on a scan of the original form and then printed. We've tested to ensure this works; however, the time lost trying to get the positioning right for every version of the form for each state has become a huge problem.
I'm looking for a tool or technique that would help me roll out each form faster.
The web application is based on CodeIgniter. Our company prefers open source solutions, but if a proven proprietary product exists we would certainly use it, due to the critical nature of the issue.
Thank you very much for any help.
First, the obvious: most web IDEs do this with ease (I know both Microsoft Visual Studio and Adobe Dreamweaver let you visually position elements above an image without problems).
I take it you are looking for a tool that lets you design the forms as part of the web application itself. One of the related links points to Suggestions for a JavaScript form builder?.
Other than that, if you know your JavaScript and jQuery/Ext JS etc., it should be pretty quick to write a simple "put the textfields above the image" (absolute positioning + drag and drop) type of web interface.
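For the server side of the same idea, here is a hedged sketch (mine; the field coordinates and filenames are placeholders, and it's Python rather than the asker's CodeIgniter stack) of stamping values onto a scanned form at fixed coordinates with reportlab, so each state's form becomes just a coordinate table:
from reportlab.lib.pagesizes import letter
from reportlab.pdfgen import canvas

# One coordinate table per form version/state: field name -> (x, y) in points.
FIELDS = {
    "name": (72, 700),
    "ssn": (400, 700),
}

def fill_form(scan_path, out_path, data):
    c = canvas.Canvas(out_path, pagesize=letter)
    # Draw the scanned form as a full-page background image.
    c.drawImage(scan_path, 0, 0, width=letter[0], height=letter[1])
    for name, (x, y) in FIELDS.items():
        c.drawString(x, y, data.get(name, ""))
    c.save()

fill_form("form_scan.png", "filled.pdf", {"name": "Jane Doe", "ssn": "000-00-0000"})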

Where Does JQuery/Client-Side Programming Fit Into MVP and DDD

I'm working on a pretty big project right now and am trying to implement an MVP architecture. I'm starting to run across instances where I think jQuery or JavaScript might be better suited than server-side code. I'm looking for feedback on how others are implementing client-side programming into their enterprise applications. How are you structuring the client-side code, and how do you determine when to use it?
Things that can make the user say "wow". For example: populating search results when the user has typed just 3-4 characters of the search term. Think back to Yahoo or Hotmail, which used to post back to the server when you clicked "Create Message"; when Google came along, they did it on the client side without going to the server. I bet you said "wow" to that. At least I did.
Things that can reduce server load. For example: adding an extra data-entry row to an HTML table instead of doing it through a round trip, increasing/decreasing a quantity, etc.
These are just some examples to cite. Even to do these things properly you need to go to the server, but that happens behind the scenes using AJAX. Beyond this, you need to select a few more jQuery plugins to use in your project; to name some, jQuery UI, jQuery Validation, jQuery AnythingSlider, etc. There are too many of them to list.
http://ClearTrip.com is one site whose UX I envy. Visit their site from a mobile device and you will get further clues about their UX work. Besides just coding, you need a person on your team who can work on these UX aspects.
Regarding how this fits into DDD: I've just recently started my journey into DDD but one hears a lot about command/query separation in that circle. Certainly if you are doing something that hits your domain (like fetching for auto-completion or certainly if you allow partial page submission to accomplish a domain command) you have to decide how it gets there and how the domain is structured to handle it.
I think two decisions are most relevant.
First, bits entirely in the browser, and even those specifically in your application layer, are outside your domain and thus, though covered in the layered-architecture part of the DDD discussion, do not land in the entity/value/event/service, etc. discussion. If, however, you are using AJAX to interact with your application layer and in turn need to access your domain, you again need to consider two things, in my mind.
(a) Are you separating commands and queries simply by using different methods on your domain? Fine if you have a relatively small demand for either queries or commands, so that it will not seem like "noise" in your domain API. Otherwise, you have a separate bounded context: another domain modeled just for the queries your UI needs, to avoid clutter on your main domain. Regardless, you are doing something like JS -> AJAX handler in the application layer -> domain (including a domain service); see the sketch after point (b).
(b) Is this a command or a query? Once you have (a) figured out, this lets you know where the access will land...then use the presentation layer's use case to elaborate the domain concept and put it into your ubiquitous language.
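To make (a) concrete, here is a tiny hypothetical sketch (the names are mine, not a prescribed structure): queries served by a lean read model your AJAX handlers can hit directly, commands routed through the domain aggregate.
class ProductQueries:
    """Read side: shaped for the UI, returns plain data, no domain objects."""
    def __init__(self, db):
        self.db = db

    def autocomplete(self, prefix):
        return self.db.search_names(prefix)

class OrderCommands:
    """Write side: loads the aggregate so domain behavior enforces invariants."""
    def __init__(self, repo):
        self.repo = repo

    def add_line(self, order_id, product_id, qty):
        order = self.repo.get(order_id)
        order.add_line(product_id, qty)
        self.repo.save(order)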
Second, you have the DTO vs. direct-to-domain decision. This can be a religious-war-gathering topic, but usually the answer is "it depends." I think there are cases for using DTOs and cases for not (within the same architecture); just search for the discussions around the topic and apply the pattern only where it adds value. I won't try to cover the details here.
Hope this provides some insight, or at least a conversation magnet to which others will add.
I guess this question is a little too subjective. Looks like I'm just going to grab a few books on advanced JavaScript and study up on the jQuery library.

Showing a form from a webpage

I have a problem I am trying to solve in an elegant manner. I have a .NET application that I have created, and I am trying to get one of its forms to be shown from a webpage. This sounds strange, I'll admit, so here is the backstory.
We have some large monitors at work, that show information on them. I have no control over how the information is displayed. Currently they are just using a browser and tabbing in the browser to show each different piece of information on the screen. Most of the info they show is just standard html stuff, text and images.
Now along comes my WinForms application. The part of the application I need to show is a graphical display. Everything on this display is drawn using GDI+, if that matters. I need to get this form into a format that I can show. Below is my own solution; I am pretty sure this is not the best method, but it may be the only one I can use.
Create a console application. The application would do the following:
1. Run as a service on a server
2. Create the display in memory, and save it to a bitmap every so often
3. Save the bitmap to a location on the network
4. Have an HTML file that links to the image, which can be shown in the browser
I thought about doing something with the clients; however, the clients are not always up, so I could have periods where the image wouldn't be updated.
I was also thinking about an ASP.NET solution, but that would require me to learn ASP.NET, and I am not quite ready to take on that challenge.
In IE you can host a WinForms app/control as an ActiveX control, like so:
<object id="DateTimePicker" height="31" width="177"
classid="bin/Web.Controls.DateTime.dll#Web.Controls.DateTime.DateTimePicker" VIEWASTEXT>
</object>
See this article for more information: http://www.codeproject.com/KB/miscctrl/htmlwincontrol.aspx
Now, I'm not claiming that this is any more elegant than your solution, but it is an alternative.
I think using ASP.NET to serve a dynamic image via an HttpHandler would be the best approach, but depending on your skills and time this may not be an option. Here is a nice tutorial: http://www.codeguru.com/columns/dotnet/article.php/c11013
IMHO, the best way to build this would be as a browser plug-in, like how Flash works. Microsoft has created a plug-in framework called SpicIE that allows you to develop managed plug-ins for IE. This is probably your best bet.
The old unmanaged way is to build your WinForms DLL app, package it in a signed CAB file, and then reference that CAB file with an HTML object tag (the codebase argument is the one you need).
i.e.,
document.write("<object CLASSID='clsid:DC187740-46A9-11D5-A815-00B0D0428C0C' CODEBASE='/MyFormsApp/MyFormsApp.cab#Version=1,00,0000' />");
The first time the user hits the page they will be asked to allow the installer to load its payload (DLLs). Once they do, they will have a fully-fledged WinForms desktop app running through a browser window.
I took the easy route on this one. I created a small WinForms app that converts the GDI objects to a bitmap, and then I save the bitmap to a network share. This file is referenced in a simple HTML file that is displayed on the monitor.
I chose the WinForms app because it makes it really easy for me to set it up in Task Scheduler and run it every 10 minutes to update.
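For illustration, the same "render a bitmap, drop it on a share, point a static HTML page at it" loop looks roughly like this in Python with Pillow (the original is a WinForms/GDI+ app; the path and the drawing here are placeholders):
import time
from PIL import Image, ImageDraw

SHARE_PATH = r"\\server\share\status.png"  # hypothetical network location

def render_status():
    # Stand-in for the real GDI+ drawing: render the display into a bitmap.
    img = Image.new("RGB", (800, 600), "white")
    ImageDraw.Draw(img).text((20, 20), time.strftime("Updated %H:%M:%S"), fill="black")
    return img

while True:
    render_status().save(SHARE_PATH)  # the HTML page just references this file
    time.sleep(600)                   # refresh every 10 minutes, as above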
