Getting Cross Origin error when trying to scrape data

I'm trying to scrape some data. The page is protected by a Google CAPTCHA, which I solve manually, but when I then try to scrape the data on the next page I get a cross-origin error. I have tried several Python libraries as well as C#, but nothing worked. Please provide a solution in C# or Python, or possibly PHP.

Related

I'm trying to use Google Analytics on my GeneXus based application, but I run into some problems regarding the domain name property, probably

I'm trying to use Google Analytics on my GeneXus based application, but I run into some problems regarding the domain name property, or the lack of it sometimes.
To give some context, I have an application which runs in Apache Tomcat, and I can't figure out a way to make Google Analytics work with it when setting it up from the GeneXus properties. I've tried to add the Google Analytics control to my master page and set up the properties for it to work, but since the domain name changes depending on which server is running my web app, I can't find a way around this.
If possible, I want each installation of the web app to send data to a different Property in the Google Analytics control panel. The only way I've managed to make Google Analytics work with GeneXus is by using an external .js file containing the code Google Analytics gave me, and with that approach I can't make each application send data to a different Property.
My idea was to set the required info in the Start Event of the master page, like this:
GoogleAnalytics1.Code = &var1
GoogleAnalytics1.DomainName = &var2
Where &var1 and &var2 were loaded earlier in the event.
I've also tried setting it up through the properties, to see if I was doing something wrong in the assignment, but that didn't work either. The only way I found to make it work is to write the .js file, add it to the Knowledge Base as an external file, and call it in the master page Start Event.
I saw that this was improved in GX17 U10, but I haven't found any other comments on this. Also, I could not find the .js file generated by GeneXus when I set the Google Analytics control in my master page; my hope was to see if there is any difference between the file generated by GeneXus and the file suggested by Google. Is there any way for me to find this file, or is there something else I need to set up in my Knowledge Base to make it work? I've looked on the GeneXus wiki but didn't find anything there.
At first I thought the lack of the DomainName was the problem, but I've managed to make it work without a domain name using the static .js file, so right now I'm not so sure what the problem is.

Download data from a website using python requests

I'm trying to download the data from this website: https://cdr.ffiec.gov/public/PWS/DownloadBulkData.aspx.
My questions are (1) how can I set the appropriate "payload" and post it to the URL for the three inputs (available products, report period end date, and available file formats), and (2) how can I get the link to the files, since the website only offers a download button (I can't get the link by right-clicking on the button)? Sorry that my questions are basic, but I hope someone can provide step-by-step guidance. Thanks.

You can’t manipulate the web page (selecting from drop downs etc) with just requests.
You need to use dev tools to capture the URL you’re redirected to when you submit the form, then use requests to call that URL with the parameters it expects.
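As a rough sketch of that approach in Python: ASP.NET WebForms pages such as this one usually carry hidden inputs (for example `__VIEWSTATE` and `__EVENTVALIDATION`) that have to be sent back with the POST. The helper below collects those hidden inputs from the page HTML using only the standard library and merges them with your own selections. The names `HiddenFieldParser` and `build_payload` are made up for this illustration, and the field names in the usage comment are placeholders; the real ones must be read from the request captured in dev tools.

```python
from html.parser import HTMLParser


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        is_hidden = (attr.get("type") or "").lower() == "hidden"
        if is_hidden and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_payload(form_html, selections):
    """Merge the page's hidden fields with the values chosen for the form."""
    parser = HiddenFieldParser()
    parser.feed(form_html)
    payload = dict(parser.fields)
    payload.update(selections)  # explicit selections win over hidden defaults
    return payload


# Hypothetical usage with requests (field names and values are placeholders,
# copy the real ones from the POST request shown in the browser's dev tools):
#
#   import requests
#   url = "https://cdr.ffiec.gov/public/PWS/DownloadBulkData.aspx"
#   session = requests.Session()
#   page = session.get(url)
#   payload = build_payload(page.text, {"<dropdown field name>": "<value>"})
#   response = session.post(url, data=payload)
```

Using a single `requests.Session` for both the GET and the POST keeps any cookies the server sets between the two calls. Whether the response to the POST is the file itself or another page depends on the site, so inspect the captured request and response before relying on this.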

Error scraping aspx site with ruby Mechanize. Mechanize::ResponseCodeError: 404 => Net::HTTPNotFound

I'm trying to scrape a ratings website with Ruby's mechanize, and am having a world of trouble. My code is pretty simple:
require "mechanize"
@client = Mechanize.new
@client.get("http://cape.ucsd.edu/responses/Results.aspx")
At that point, you'll see the 404 errors.
I've tried a few things, including HTTParty searching for redirects; disabling SSL checking; even saving the html file locally (to get the proper query form), and then trying to issue it directly from an agent connected to the main site. All of these lead to the same error.
I'm fairly new to scraping, and I'm hoping I'm doing something silly. Any help would be appreciated.

Yes, it's the user agent. To set the user agent, do:
@client = Mechanize.new
@client.user_agent = 'Mozilla'

ASP.Net Webforms - how to get the friendly URL for the currently executing page

I have enabled friendly URLs in my application by having the following line:
routes.EnableFriendlyUrls();
in my App_Start/RouteConfig.cs file.
I would like to get the Friendly URL of the currently executing page. I know that I can always get the currently executing page's file path from the request and strip the ".aspx" extension from it:
this.Request.CurrentExecutionFilePath;
However, I have a feeling that the Friendly URL framework component should be able to directly provide me the information I am looking for rather than me having to do string manipulation.
Any pointers on the same will be appreciated. Thanks for looking up my question.

What are you trying to achieve exactly?
If you're trying to extract a value from the url like you would normally do with Request.QueryString["someValue"], you should read up on how Routing works. Here's a good write-up
http://www.codeproject.com/Tips/698666/USE-OF-MapPageRoute-Method-IN-ASP-NET-WEBFORM-ROUT
If you are only interested in getting the url itself, you can use
Page.Request.RawUrl
Cheers

Post Verb not allowed iis7

I'm trying to implement an upload with progress bar code I found here. But when I run my example code, I get a "POST verb not allowed" error in IIS7 on Windows 7.
I tried messing with my handlers but only messed it up more, as I don't know what I'm doing. Can someone help me get this working?

It appears that you are trying to upload the file to (or trying to get progress updates from) an HTML file (fileupload.html). IIS treats HTML files as static files, so you can only issue GET requests to them (there is no point in submitting a POST to a static file, because its content is not going to change based on the POST data), hence the error.
Perhaps you have done the integration incorrectly, or you may be using the wrong plugin (the author talks about using it in conjunction with an Apache module). You may want to look at the alternatives in the links below:
http://mattberseth.com/blog/2008/07/aspnet_file_upload_with_realti.html
File Upload with progress bar in Asp.Net Mvc/ jQuery?
