Buffering a new URL that opens in another tab at run time using Tosca - tosca

There is a website called X. When you click a particular button on website X, it opens a new URL in another tab, and I want to buffer that new URL at run time. How can I do this in Tosca?

I was able to successfully buffer a URL from IE. Here's how I did it.
First, I found this article on Tricentis: https://support.tricentis.com/community/article.do?number=KB0015575
Following the instructions in that article, I scanned a new module for IE itself, selecting UIA during the scan (as the article describes). I captured the editbox of the URL bar as a module element.
Then, in a test case, I just used action-mode Buffer to read and store the URL into a buffer.

Related

How do I take a screenshot and close the browser after each test instance pass or fail?

I have a simple Tosca Test Case Template which takes data from a TestCaseDesign Sheet. The scenario is simple - It just launches the browser, navigates to a login page, enters username and password and clicks the login button. I have just two values for username and two values for password in the TestCaseDesign Sheet.
I linked the TestCaseDesign Sheet with the Test Case Template and generated two test instances.
Now, when I select and run these two test instances using the Scratchbook, I would like the browser to close automatically after each test instance passes or fails, so that the next test instance can open a new browser as its first step.
Also, I would like to take a screenshot before the browser closes.
How can I achieve that using Tosca?

How to figure out where is the raw data in a table?

https://www.nyse.com/quote/XNYS:A
After I access the above URL, I open Developer Tools in Firefox. Then change the date in HISTORIC PRICES, then click 'GO'. The table is updated. But I don't see relevant HTTP requests sent in devtools.
So this means that the data has already been downloaded in the first request. But I can not figure out how to extract the raw data of the table. Could anybody take a look at how to extract the raw data from the table? (Note that I don't want to use methods like selenium, I want to stay with raw HTTP requests to get the raw data.)
EDIT: websocket is mentioned in the comments, but I can't see it in Developer Tools. I have added the websocket tag anyway in case somebody who knows more about websockets can chime in.
I am afraid you cannot extract JavaScript-rendered content without Selenium. You can always use a headless browser (no window appears on your screen; the only pitfall is that you have to wait until the page fully loads), and it won't bother you anymore.
In other words, all the other scraping libs are based on URLs and forms. Scrapy can post forms but not run JavaScript.
Selenium will save the day; all you lose is a couple of seconds for each attempt (will be milliseconds if it is run in frontend). You can get the page source from driver.page_source and parse it directly (as HTML text) with BeautifulSoup or whatever.
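To illustrate the parsing half of that idea, here is a minimal sketch that assumes the rendered HTML is already in a string (for example, taken from driver.page_source). It uses the standard library's html.parser instead of BeautifulSoup so that it runs without extra installs. The sample markup and the "flex_td" class are made up for illustration; only "flex_tr" appears elsewhere in this thread.

```python
from html.parser import HTMLParser


class RowTextParser(HTMLParser):
    """Collect the text of every <div> whose class list contains row_class."""

    def __init__(self, row_class):
        super().__init__()
        self.row_class = row_class
        self.rows = []    # finished rows, each a list of cell strings
        self._depth = 0   # > 0 while we are inside a matching row
        self._cells = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            # nested <div> inside a row we are already collecting
            self._depth += 1
        elif self.row_class in classes:
            self._depth = 1
            self._cells = []

    def handle_endtag(self, tag):
        if tag == "div" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                self.rows.append(self._cells)

    def handle_data(self, data):
        text = data.strip()
        if self._depth and text:
            self._cells.append(text)


# Hypothetical stand-in for driver.page_source
sample_html = (
    '<div class="flex_tr">'
    '<div class="flex_td">06/18/2021</div>'
    '<div class="flex_td">146.31</div>'
    '</div>'
)

parser = RowTextParser("flex_tr")
parser.feed(sample_html)
print(parser.rows)
```

The real page's markup may differ, so the class names would need to be checked against the actual page source.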
You can do it with requests-html, for example let's grab the first row of the table:
from requests_html import HTMLSession
session = HTMLSession()
url = 'https://www.nyse.com/quote/XNYS:A'
r = session.get(url)
r.html.render(sleep=7)
first_row = r.html.find('.flex_tr', first=True)
print(first_row.text)
Output:
06/18/2021
146.31
146.83
144.94
145.01
3,220,680
As @Nikita said, you will have to wait for the page to load (here 7 seconds, but maybe less), but if you want to make multiple requests you can do it asynchronously!

"Iteration ID" for a CustomVision Project (for use in MSFlow action)?

I'm building an MSFlow which sends a SharePoint pic lib pic to a just-trained CustomVision Classifier, which then sends back a label (eg "Green", "Red", etc);
Challenges:
My MSFlow "CustomVision" action is failing, stating "there's no default iteration for this project. please provide an Iteration ID"
Nowhere on the CustomVision project's settings page is this IterationID displayed!
How or where can I find this iteration ID (it appears to be a GUID)?
Turns out the IterationID can be found as follows:
Browse to your custom vision projects page URL
(eg https://www.customvision.ai/projects)
=> browser will display a set of "tiles" - one for each of your existing projects;
Navigate (click) on your particular project for which you seek the IterationID;
=> browser will redirect to the "manage" page (note: defaults to Training Images page) for your project;
It will look something like this:
https://www.customvision.ai/projects/<project GUID here>#/manage
Navigate (click) on the Performance tab of this project
=> browser will direct to the "performance" page, something like this:
https://www.customvision.ai/projects/<project GUID here>#/performance
Note: all of the "iterations" (ie training iterations) will be tabbed along the left side
Select the (training) iteration you wish to use as the "web service" for actually classifying incoming images;
=> browser will display details/metrics for that (training) iteration
Click on the "PredictionURL" tab in the upper left region of the page
=> a pop-up window will display all the settings-related data you'll need to consume the underlying web service ("API") wrapped around this classifier!
In particular, you'll see 2 different URLs:
For ImageURL-as-input:
https://southcentralus.api.cognitive.microsoft.com/customvision/v2.0/Prediction/<projectGUIDhere>/url?iterationId=g9fc4e82-3f95-4ec1-acf2-9b12bba2b409
For ImageFILE-as-input:
https://southcentralus.api.cognitive.microsoft.com/customvision/v2.0/Prediction/<projectGUIDhere>/image?iterationId=g9fc4e82-3f95-4ec1-acf2-9b12bba2b409
No matter which URL you inspect, you'll see the same value for IterationID - and there you have it!
Copy & paste this IterationID GUID into your MSFlow CustomVision Action, and it should work!
In the Custom Vision portal home, select the project you are using, then select the Performance tab. On the left side of the page you will see Iterations. Select the iteration that you want and select Prediction URL. This opens a new dialog which gives the URLs for image URL and image file. In these URLs the iteration ID is passed as a parameter; copy the ID and use it in your application.
If you mark an iteration as the default, the iteration ID is not required in the image URL.
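If you would rather not copy the GUID by hand, a small sketch like the following can pull it out of a pasted Prediction URL. It uses only the standard library; the function name is hypothetical, and the example URL is the placeholder one from the answer above, not a working endpoint.

```python
from urllib.parse import urlparse, parse_qs


def extract_iteration_id(prediction_url):
    """Return the iterationId query parameter of a prediction URL, or None."""
    query = parse_qs(urlparse(prediction_url).query)
    values = query.get("iterationId")
    return values[0] if values else None


# Placeholder URL copied from the answer above
example_url = (
    "https://southcentralus.api.cognitive.microsoft.com/customvision/v2.0/"
    "Prediction/<projectGUIDhere>/image"
    "?iterationId=g9fc4e82-3f95-4ec1-acf2-9b12bba2b409"
)

print(extract_iteration_id(example_url))
```

The function returns None when the URL carries no iterationId, which is what you would expect for a project that has a default iteration set.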

Can't figure how phone number reveal works

I am pretty new to web scraping, and recently I have been trying to automatically scrape phone numbers from pages like this. I am not supposed to use Selenium or headless browser libraries, so I am trying to find a way to request the phone number directly, say through a web service or any other possible solution, without having to go through the actual button press with Selenium.
I totally understand that it may not even be possible to reveal the phone number automatically in one shot, as it is not meant to be accessible to nosy newbie web scrapers like me; but I would still like to raise the question for my own information and get a detailed answer from an expert point of view.
If I inspect the "Reveal" button's DOM element, it shows some attributes which I have never seen before. I have two main questions which I believe could be helpful for newbies like me.
1) Given a set of unknown tags/attributes (i.e. data-q and data-reveal in the button below), how can one find out which scripts in the page are actually using them?
2) I googled the button element's attributes, data-q and data-reveal; the only relevant result I could find was this, which for some reason I cannot access even through a proxy.
Any clue, particularly on the first question, is much appreciated.
Regards,
Below is the href-button code
Reveal
Ok, according to your demand there are several steps before you finally get a solution.
1st step : open your own browser and enter your target page(https://www.gumtree.com/p/vans/2015-ford-transit-custom-2.2tdci-290-l1-h1/1190345514)
2nd step : (Assuming you are using Chrome as your browser) Press Ctrl+Shift+I to open the developer tools, and then select the 'Network' tab.
3rd step : Press the 'Reveal' button on that page, watch the Network tab carefully, and catch the HTTP request which is sent immediately when you press the 'Reveal' button. You can see the request contains a long string of digits in Query String Parameters; it is actually a timestamp.
4th step : Also you can see there is a part named 'Request Headers' in that http request, and you should copy the values of referer , user-agent , x-gumtree-token.
5th step : Try to construct your request (I am a fan of Python, So I am going to show you my example code in Python)
import time
import requests

# Header values copied from the 'Reveal' request in the browser's Network tab
headers = {
    'referer': 'please enter the value you just copied from that specific request',
    'user-agent': 'please enter the value you just copied from that specific request',
    'x-gumtree-token': 'please enter the value you just copied from that specific request'
}
url = 'https://www.gumtree.com/ajax/account/seller/reveal/number/1190345514?_='
# The '_' parameter is a timestamp; it is assumed here to be in milliseconds,
# which gives a fixed-width value instead of concatenating the float's digits
url += str(int(time.time() * 1000))
response = requests.get(url=url, headers=headers)
response.raise_for_status()
response_result = response.json()
phone_number = response_result['data']

Google Analytics Realtime Sandbox Environment

I am looking for a way to set up a Google Analytics sandbox environment that will allow me to test out my custom JS code in near real time.
My app will be using custom variables for advanced segmentation, and I would like to test out multiple scenarios quickly, as opposed to setting up a dummy GA account and wait for a whole day to confirm the test.
Thanks
Great question.
For GA, server updates occur every four hours, and after every sixth such update, the entire set is recalculated, which means a 24-hour lag from code change to reliable feedback. This delay also applies to most customizations to the GA Browser (e.g., "custom filters").
So if you are going to use GA as your web metrics system, and you expect to actually rely on those data then a test rig is essential.
For me, it's useful to group test systems for client-side analytics under two rubrics: (i) complete, self-contained (closed-loop) systems; or (ii) simpler automated data pulls from the production system (by "production system" here I mean GA's system, not the Site whose pages the GA code is tracking).
For the latter, just add this line to each page of your Site that contains the GA tracking code, just below '_trackPageview()':
pageTracker._setLocalRemoteServerMode();
That line will cause a copy of each transaction line to be logged to your server's activity log--so in essence, you get the data captured by GA in real time. That's all you need to do to capture the data; to parse it, you can use, for instance, any of the excellent open source web log analyzers like AWStats, or roll your own.
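If you go the roll-your-own route, a minimal sketch might look like the following. It assumes a combined-log-style line in which the request target is a whitespace-delimited token; the sample line is made up for illustration, and your server's log format may differ.

```python
from urllib.parse import urlsplit, parse_qsl


def utm_params_from_log_line(line):
    """Return the utm* parameters of a __utm.gif request in a log line, or None."""
    if "__utm.gif?" not in line:
        return None
    # In the assumed log format the request target is a single token
    target = next(tok for tok in line.split() if "__utm.gif?" in tok)
    pairs = parse_qsl(urlsplit(target).query, keep_blank_values=True)
    return {key: value for key, value in pairs if key.startswith("utm")}


# Made-up sample line, not taken from a real server log
sample = (
    '127.0.0.1 - - [19/May/2010:08:00:51 +0000] '
    '"GET /__utm.gif?utmwv=1&utmn=1669045322&utmcs=UTF-8&utmsr=1280x800 HTTP/1.1" '
    '200 35'
)

print(utm_params_from_log_line(sample))
```

Note that parse_qsl percent-decodes the values, so a title logged as Position%20Listings comes back with real spaces.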
This is simple and reliable--but all it can do is tell you (in real-time) "does the analytics code i just implemented on pages served by my production server actually work?"
Usually, that's not good enough--you would rather know if your code will work before it's on your production server. To do that, you need to simulate the production environment and find a way to access in real-time the data GA collects.
This kind of test rig is a little more involved, but still not difficult.
In sum, it requires these steps:
host/serve the ga.js and the tracking pixel locally;
log the __utm.gif requests (in the GA data flow, each request corresponds to one logged transaction); and
parse the headers into some convenient human-readable form.
If you want more detail than that (ie, a step-by-step implementation), here it is:
I. Hosting/Serving the GA Script (& automating updates)
To do that, you can create a small shell script like this one to wget the latest ga.js version into your local directory (replacing the extant version it finds there).
#!/bin/sh
rm /My_Sites/sitename.com/analytics/ga.js
cd /My_Sites/sitename.com/analytics/
wget http://www.google-analytics.com/ga.js
chmod 644 /My_Sites/sitename.com/analytics/ga.js
cd ${OLDPWD}
exit 0;
(Thanks to AskApache.com, which provided the original motivation and config details to do this in a production context.)
II. Create __utm.gif file
This is just a transparent 1x1 pixel gif image, which you will place in your Site directory (it doesn't matter where; it just needs to match the location recited in your pages).
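If you don't have such an image handy, here is one way to produce it from Python. This is a sketch only: the base64 string is a commonly used 42-byte transparent 1x1 GIF, and the helper names and output path are hypothetical.

```python
import base64
import struct

# A commonly used 42-byte transparent 1x1 GIF, base64-encoded
TRANSPARENT_GIF_B64 = "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"


def transparent_pixel():
    """Return the raw bytes of the 1x1 transparent GIF."""
    return base64.b64decode(TRANSPARENT_GIF_B64)


def write_pixel(path):
    """Write the pixel to path (e.g. '__utm.gif' inside your Site directory)."""
    with open(path, "wb") as fh:
        fh.write(transparent_pixel())


pixel = transparent_pixel()
# Bytes 6-9 of a GIF header hold the width and height as little-endian uint16
width, height = struct.unpack("<HH", pixel[6:10])
print(pixel[:6], width, height)
```

Calling write_pixel("__utm.gif") from the directory your pages point at creates the file.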
III. Log the __utm.gif Requests
For a testing protocol in which you are the source of the client-side activity (e.g., you want to verify the cross-browser fidelity of some event-tracking code you've added to a page on your Site, so you automate 5000 clicks on the button you just wired up, serving the page from a dev server set up for this purpose), it's probably simplest to just log the Request Headers. It's in those headers that the GA script directs the client to gather various data from the DOM, from the location bar (url), and from prior http headers, and append them to a request for a resource on the GA server (__utm.gif, which is just a 1x1 transparent pixel).
For this type of protocol, I use the Firefox addon LiveHTTPHeaders. You install it like any other Firefox addon; a few mouse clicks is all. Next, open it, and click the "Generator" tab. From this window, you can see the actual requests in real time. At the bottom of the window is a 'save' button to store the log. I find it easier to configure LiveHTTPHeaders to log only the __utm.gif requests; to do that, just click the 'Edit' tab and create a simple filter to exclude everything except these particular gif images (using the check boxes on the right, and the large text box to the right).
Other kinds of test protocols require you to work from your Server Activity Logs; in that case just add this line to each page of your Site, just below _trackPageview():
pageTracker._setLocalRemoteServerMode();
IV. Parse those logged requests so you can actually read them
So now your log will contain individual transaction lines, each one of which is a string appended to an HTTP Request for the GA tracking pixel. This string is just a concatenation of key-value pairs; each key begins with the letters "utm" (probably for "urchin tracker"). Each of these parameters corresponds to a variable that you see in the GA Dashboard (here's a complete list and description of them). This is all you need to know to build a parser. In more detail:
First, here's a sanitized __utm.gif request (the entries in your LiveHTTPHeaders log):
http://www.google-analytics.com/__utm.gif?utmwv=1&utmn=1669045322&utmcs=UTF-8&utmsr=1280x800&utmsc=24-bit&utmul=en-us&utmje=1&utmfl=10.0%20r45&utmcn=1&utmdt=Position%20Listings%20%7C%20Linden%20Lab&utmhn=lindenlab.hrmdirect.com&utmr=http://lindenlab.com/employment&utmp=/employment/openings.php?sort=da&&utmac=UA-XXXXXX-X&utmcc=__utma%3D87045125.1669045322.1274256051.1274256051.1274256051.1%3B%2B__utmb%3D87045125%3B%2B__utmc%3D87045125%3B%2B__utmz%3D87045125.1274256051.1.1.utmccn%3D(referral)%7Cutmcsr%3Dlindenlab.com%7Cutmcct%3D%2Femployment%7Cutmcmd%3Dreferral%3B%2B
This is my parser (in Python):
# regular expression module imported
import re
pattern = r'\&{1,2}'
pat_obj = re.compile(pattern)
# splitting the gif request on the '&' character
# (which GA originally used to concatenate each piece to build the request)
# (here, 'gfx' must already hold the full __utm.gif request string shown above)
gfx1 = pat_obj.split(gfx)
# create a look-up table to map a descriptive name to each gif request parameter
# (note, this isn't the entire list, which i've linked to above)
keys = "utmje utmsc utmsr utmac utmcc utmcn utmcr utmcs utmdt utme utmfl utmhn utmn utmp utmr utmul utmwv"
values = "java_enabled screen_color_depth screen_resolution account_string cookies campaign_session_new repeat_campaign_visit language_encoding page_title event_tracking_data flash_version host_name GIF_req_unique_id page_request referral_url browser_language gatc_version"
keys = keys.strip().split()
# split the names too; zipping the unsplit string would pair each key with a single character
values = values.strip().split()
# create the look-up table
GIF_REQUEST_PARAMS = dict(zip(keys, values))
# parse each request parameter and map the parameter name to a descriptive name:
pattern = r'(utm\w{1,2})=(.*?)$'
pat_obj = re.compile(pattern)
fmt = '{0:25} {1:10}'
for itm in gfx1:
    m = pat_obj.search(itm)
    if m:
        # fall back to the raw parameter name if it is not in the look-up table
        print(fmt.format(GIF_REQUEST_PARAMS.get(m.group(1), m.group(1)), m.group(2)))
The result looks like this:
gatc_version              1         
GIF_req_unique_id         1669045322
language_encoding         UTF-8     
screen_resolution         1280x800  
screen_color_depth        24-bit    
browser_language          en-us     
java_enabled              1         
flash_version             10.0%20r45
campaign_session_new      1         
page_title                Position%20Listings%20%7C%20Linden%20Lab
host_name                 lindenlab.hrmdirect.com
referral_url              http://lindenlab.com/employment
page_request              /employment/openings.php?sort=da
account_string            UA-XXXXXX-X
cookies
To avoid making this longer still, i left out the cookies' value. They obviously require a separate parsing step, though it's virtually identical to the step i just showed. Again, each request represents a single transaction, so you can store them as you need to.
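For completeness, here is a minimal sketch of what that separate cookie step could look like. It assumes the utmcc value uses the layout seen in the sanitized request above (percent-encoded, with cookies separated by ';+'); the function name is hypothetical.

```python
from urllib.parse import unquote


def parse_utmcc(raw):
    """Split a percent-encoded utmcc value into a {cookie_name: value} dict."""
    decoded = unquote(raw)  # %3D -> '=', %3B -> ';', %2B -> '+', %7C -> '|'
    cookies = {}
    for part in decoded.split(";+"):
        if not part:
            continue  # the value ends with a trailing ';+'
        # split on the first '=' only; __utmz contains further '=' characters
        name, _, value = part.partition("=")
        cookies[name] = value
    return cookies


# utmcc value taken from the sanitized __utm.gif request above
raw = (
    "__utma%3D87045125.1669045322.1274256051.1274256051.1274256051.1%3B%2B"
    "__utmb%3D87045125%3B%2B__utmc%3D87045125%3B%2B"
    "__utmz%3D87045125.1274256051.1.1.utmccn%3D(referral)%7C"
    "utmcsr%3Dlindenlab.com%7Cutmcct%3D%2Femployment%7Cutmcmd%3Dreferral%3B%2B"
)

for name, value in parse_utmcc(raw).items():
    print('{0:10} {1}'.format(name, value))
```

The __utmz value is itself a '|'-separated list of campaign fields, so it can be split once more if you need those individually.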
