Google Analytics Data Feed 400 Bad Request on First Request Only

I have an inherited VB6 program that runs each morning to download Google Analytics feed data for several clients. The process does the following:
1. Checks that my current OAuth 2.0 Access Token (saved in my database from yesterday) has not expired, and if it has, obtains a new one. Naturally, when the program runs for the first time each day it gets a new Access Token.
2. For each client, posts a request for feed data.
3. Processes the XML data received from the Google server.
My problem is with step 2 above. The first post using the new Access Token always fails with a 400 Bad Request error. Making a second post using the exact same data always succeeds and my program can move to step 3.
Here is an example of my POST (with Client Id and Access Token in [ ] brackets):
https://www.google.com/analytics/feeds/data?
ids=ga%3A[Client Id]&
start-date=2016-01-10&
end-date=2016-01-10&
metrics=ga%3Asessions%2Cga%3Atransactions%2Cga%3AtransactionRevenue&
dimensions=ga%3Amedium%2Cga%3Asource%2Cga%3Akeyword&
filters=ga%3Asource%3D%3Dshopping&
access_token=[Access Token]
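For reference, here is roughly the same request as a minimal Python sketch using the requests library (the profile ID and token values are placeholders; requests produces the same percent-escaping shown in the URL above automatically):
import requests

params = {
    'ids': 'ga:12345678',            # placeholder Client Id
    'start-date': '2016-01-10',
    'end-date': '2016-01-10',
    'metrics': 'ga:sessions,ga:transactions,ga:transactionRevenue',
    'dimensions': 'ga:medium,ga:source,ga:keyword',
    'filters': 'ga:source==shopping',
    'access_token': 'ya29.EXAMPLE',  # placeholder Access Token
}
# requests URL-encodes the parameters (ga%3A, %2C, %3D%3D) automatically.
r = requests.get('https://www.google.com/analytics/feeds/data', params=params)
print(r.status_code)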
This has been occurring for several weeks.
The error description (Err.Description) from my code is "400 Bad Request". The entire response from the Google server (less HTML) is "400. That's an error. Your client has issued a malformed or illegal request. That's all we know."
Can anyone offer any suggestions as to why the first request fails, but subsequent requests don't? I have even built in a five minute delay between getting the new Access Token and making the first data request, but still get the 400 Bad Request Error.
Any help you can offer would be greatly appreciated.
Thanks.

Using Fiddler (thank you M Schenkel) I was able to finally track the source of the problem.
My VB6 program is using the IP*WORKS! SSL component from nsoftware. The form had a single HTTPS component that was being used for (1) getting a new Access Token, and then (2) for getting the feed data.
Fiddler showed that the second use of the component was reusing some of the parameters from the first use.
I added a second HTTPS component to the form so my feed data request would start off with a blank slate and it worked.
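For anyone hitting the same class of bug in another stack, the equivalent fix is to give each logical operation its own client object so nothing leaks between them. A hedged Python analogy using requests (not the IP*WORKS! API; the token endpoint and payload below are placeholders):
import requests

def get_access_token():
    # Hypothetical token endpoint and payload, for illustration only.
    with requests.Session() as s:
        resp = s.post('https://accounts.example.com/token',
                      data={'grant_type': 'refresh_token'})
        return resp.json().get('access_token')

def get_feed(token):
    # A fresh session: a blank slate, like adding the second HTTPS component.
    with requests.Session() as s:
        return s.get('https://www.google.com/analytics/feeds/data',
                     params={'access_token': token})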
Thank you very much for your help!

Related

AdCreativesV2 Batch GET API returns 400 BAD REQUEST - "Cannot process request involving multiple routing entities"

I am making API requests to a url like this to access AdCreatives API:
https://api.linkedin.com/v2/adCreativesV2?ids=List(123,456,789)
(not the exact ids, but you get the idea)
Depending on the IDs used in the call, sometimes this works as expected, and sometimes I get a 400 error code response with the message "Cannot process request involving multiple routing entities"
What does this mean, and how can I fix it?
I assume I can't make a request that includes all of these ids at once, but is there a way to tell which ids are causing the problem? This could help me group similar IDs successfully to make the call.
Have you checked you are using X-Restli-Protocol-Version: 2.0.0 in the header?
For example, in Python this looks like:
import requests
url = 'https://api.linkedin.com/v2/adCreativesV2?ids=List(123,456,789)'
headers = {'X-Restli-Protocol-Version': '2.0.0'}
r = requests.get(url, headers=headers)
This is mentioned in the docs here.
Make sure that your Creatives belong to the same account.
Based on LinkedIn's new error-messages documentation, the message that replaces "Cannot process request involving multiple routing entities" indicates that the creative IDs do not all belong to the same ad account:
Entities should have the same ad account in batch update and batch partial update request.
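If your IDs span more than one ad account, a practical workaround is to group them by account and issue one batch GET per group. A rough Python sketch, assuming you already have (or can fetch) a mapping from creative ID to ad account (account_of below is a hypothetical lookup table):
from collections import defaultdict
import requests

def fetch_creatives_by_account(creative_ids, account_of, headers):
    # account_of: hypothetical dict mapping creative ID -> ad account URN.
    groups = defaultdict(list)
    for cid in creative_ids:
        groups[account_of[cid]].append(cid)
    results = {}
    for account, ids in groups.items():
        # One batch call per ad account avoids mixing routing entities.
        url = ('https://api.linkedin.com/v2/adCreativesV2?ids=List('
               + ','.join(str(i) for i in ids) + ')')
        results[account] = requests.get(url, headers=headers).json()
    return results
The headers argument here would carry the X-Restli-Protocol-Version: 2.0.0 header mentioned above.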

POST requests sometimes getting cut off on Amazon EC2 server, causing Invalid postback or callback argument

I have an ASP.NET 4.7.2 WebForms project that uses both standard postbacks with viewstate and AJAX requests using JSON data. It is hosted on Amazon Web Services EC2 using a load balancer to split requests between 2 IIS servers. (The sessions are set to sticky so it shouldn't be switching servers partway through the session.) Most of the time these requests work fine. I'm servicing many thousands of requests per day that have no problems.
Unfortunately not all the requests work correctly. I'm getting about 200 requests per day that throw an error: either "Invalid postback or callback argument" for the requests with a viewstate, or "Unterminated string passed in" for the JSON ones. When this happens, my error logger records the POST parameters for the request, and I can see that the request was cut off and the entire viewstate or JSON payload didn't come through. (The JSON requests include a base64-encoded file, and only part of it comes through.)
If the user retries the request it seems to work, but they shouldn't have to. These are not spam bots because the entire site is behind a login screen and the values I'm seeing in the incomplete requests are real valid data... just not all of it.
I'm not sure how to track down this problem as it seems intermittent, and as far as I can tell ASP.NET is doing the right thing saying the data is invalid... it is invalid because it's incomplete. On the other hand I'm using pretty standard services with AWS and IIS running an ASP.NET site and if there was a bug in any of those I'd expect it would have been fixed a long time ago or people wouldn't be using them.

HTTP Request on POST and GET

I have a server log and it shows POST and GET
So, if a page is showing POST /ping and GET /xyz
Does this mean the user agent requests a page with GET, and POST is the response from the server?
In my server logs there are millions of POST /ping entries, while the GET requests for the other pages are far fewer.
Which should I focus on? Should the POST pages get indexed if the server shows them to search engines?
I would suggest you learn the difference between HTTP GET and POST.
This answer is quite good.
In summary, the GET requests are pages/data being requested by clients. POSTs are clients sending data to the server, usually expecting data as a response.
In their comment, Sylwit pretty much explains what this has to do with search engines. I'm going to just describe the differences between GET and POST
GET and POST are two different types of requests to the server. A GET request is normally used to retrieve information from the server and usually has a series of GET parameters. When you search something on Google you're making a GET request.
https://google.com/?q=how+do+i+get
In this case, the GET parameter is the q after the ?, with a value of "how do i get". It should be noted that a GET request doesn't need these additional parameters; http://google.com is still a GET request.
POST requests, on the other hand, are normally used to send data to the server. You'll see this any time you send a message, submit a form, etc. When I click submit on this answer, I'll be making a POST request to Stack Overflow's servers. The parameters for these aren't immediately visible in the browser. POST requests can also return an HTTP response with a message.
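To make the difference concrete, a small Python sketch of both request types against a hypothetical endpoint:
import requests

# GET: the parameters ride in the URL's query string.
r = requests.get('https://example.com/search', params={'q': 'how do i get'})
# -> https://example.com/search?q=how+do+i+get

# POST: the data travels in the request body, not the URL.
r = requests.post('https://example.com/messages', data={'text': 'hello'})
print(r.status_code)  # the server still returns a normal HTTP response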
Hope that shows the differences between the two.

I'm accessing a secure site with httr but I get a server error whenever it isn't the first request

As the title says the site in question is secure and I can't share my credentials but here's the outline of events.
The way the site security works is you send a POST to one URL with user/pass and then it sends back a token. All requests then need to carry that token in their headers to work. I can get that to work once: on the first request after the login step I get the results I want. All subsequent requests result in an HTTP 500 "Internal Server Error". Of course, in a perfect world, I could go to the server and get logs to see more verbosely what is going on. However, they aren't so accommodating on my planet, so I'm left scratching my head.
Just to clarify: I can send the exact same request a second time and I get the aforementioned error. So far my workaround is to detach httr and then library(httr) again to start over. This doesn't seem like the best approach for this problem.
I'm guessing that the problem has to do with how httr reuses the same handle but I don't know what info is changing between the two requests.
In pseudocode, let's say I do:
resp <- POST('https://my.site.com/login', add_headers(.headers = c('user' = 'me', 'pass' = 'blah')))
mytoken <- content(resp)$token
qry <- POST('https://my.site.com/soap/qry', add_headers(.headers = c('token' = mytoken)), body = myxmlstring)
# qry will have status 200 and the content I expect.
# If I run the same POST command again:
qry2 <- POST('https://my.site.com/soap/qry', add_headers(.headers = c('token' = mytoken)), body = myxmlstring)
# qry2 will be status code 500.
# But if I do:
detach("package:httr", unload = TRUE)
library(httr)
# and then run the commands again from the top, it will work again.
Ideally, there'd be a parameter I can add to POST which will make each POST completely independent of the last. Short of that I'd be happy with something that makes more sense than detaching and reattaching the package itself.
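One thing that may be worth trying before detaching the package: httr caches a handle per host, and handle_reset("https://my.site.com") clears it, which should give the next POST a clean slate. For comparison, here is the same isolate-each-request idea sketched in Python's requests (an analogy, not httr code):
import requests

def isolated_post(url, token, body):
    # A brand-new Session per call: no cookies, headers, or connection
    # state carry over from any previous request.
    with requests.Session() as s:
        return s.post(url, headers={'token': token}, data=body)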

ASP.NET form scraping not working

I'm trying to scrape some pages on a website that uses ASPX forms. The forms involve adding details of people by updating the server (one person at a time) and then proceeding to a results page that shows information regarding the specified people. There are 5 steps to the process:
1. Hit the login page (the site is HTTPS) by sending a POST request with my credentials. The response will contain cookies that will be used to validate all subsequent requests.
2. Hit the search criteria page by sending a GET request (no parameters). The only purpose of this is to discover the __VIEWSTATE and __EVENTVALIDATION tokens in the HTML response to be used in the next step.
3. Update the server with a person. This involves hitting the same webpage as in step 2 but using a POST request with form parameters that correspond to the form controls on the page for adding person details and their values. The form parameters will include the __VIEWSTATE and __EVENTVALIDATION tokens gained from the previous step. The server response will include a new __VIEWSTATE and __EVENTVALIDATION. This step can be repeated using the new __VIEWSTATE and __EVENTVALIDATION, or can proceed to the next step.
4. Signal to the server that all people have been added. This involves hitting the same page as the previous two steps by sending a POST request with form parameters that correspond to the form controls on the page for signalling that all people have been added. The server response will simply be 25|pageRedirect||/path/to/results.aspx|.
5. Hit the search results page specified in the redirect response from the previous step by sending a GET request (no parameters - cookies are enough). The server response will be the HTML that I need to scrape.
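A hedged Python sketch of this flow (placeholder host, paths, and form-control names; the real ones come from the site):
import re
import requests

BASE = 'https://example.com'  # placeholder host; the real site differs

def hidden_fields(html):
    # Pull the ASP.NET hidden-field tokens out of a full page response.
    return {name: re.search(r'id="%s" value="([^"]*)"' % name, html).group(1)
            for name in ('__VIEWSTATE', '__EVENTVALIDATION')}

s = requests.Session()  # holds the login cookies across all five steps

# Step 1: log in (field names are hypothetical).
s.post(BASE + '/login.aspx', data={'user': 'me', 'pass': 'secret'})

# Step 2: fetch the search page just to harvest the initial tokens.
tokens = hidden_fields(s.get(BASE + '/search.aspx').text)

# Step 3: add a person; the response carries fresh tokens for the next POST.
# (The real requests also set the X-MicrosoftAjax: Delta=true header, which
# changes the response format; a plain postback is shown for simplicity.)
form = dict(tokens, FirstName='Jane', LastName='Doe')  # hypothetical controls
tokens = hidden_fields(s.post(BASE + '/search.aspx', data=form).text)

# Step 4: signal that everyone has been added; expect 25|pageRedirect||...|
s.post(BASE + '/search.aspx', data=dict(tokens, Continue='Continue'))

# Step 5: the cookies alone are enough to fetch the results page.
results_html = s.get(BASE + '/path/to/results.aspx').text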
If I follow the process manually with any browser, filling in the form controls and clicking the buttons etc. (testing with just one person), I get to the results page and the results are fine. If I do this programmatically from an application running on my machine, the search results HTML is ultimately wrong (the page returns valid HTML, but there are no results compared with the browser version, and some null values appear where there should not be).
I've run this using a Java application with Apache HttpClient handling the requests. I've also tried it using a Ruby script with Mechanize handling the requests. I've set up a proxy server using Charles to intercept and examine all 5 HTTPS requests. Using Charles, I've scrutinized the raw requests (headers and body) and made comparisons between requests made using a browser and requests made using the application(s). They are all identical (except for the VIEWSTATE / EVENTVALIDATION values and session cookie values, which I would expect to differ).
A few additional points about the programmatic attempts:
- The login step returns successful data, and the cookies are valid (otherwise the subsequent requests would all fail).
- Updating the server with a person (step 3) returns successful responses, in that they are the same as would be returned from interaction using a browser. I can only assume this means the server is updating successfully with the person added.
- A custom header, X-MicrosoftAjax: Delta=true, is added to the requests in step 3 (just like the browser requests).
- I don't own or have access to the server I'm scraping.
Given that my application requests are identical to the browser requests that succeed, it baffles me that the server is treating them differently somehow. I can't help but feel that this is an ASP.NET issue with forms that I'm overlooking. I'd appreciate any help.
Update:
I went over the raw requests again a bit more methodically, and it turns out I was missing something in the form parameters of the requests. Unfortunately, I don't think it will be of much use to anyone else, because it seems to be specific to this particular ASP server's logic.
The POST request that notifies the server that all people have been added (step 4) requires two form parameters specifying the county and address of the last person that was added to the search. I was including these form parameters in my request, but the values were empty strings. I figured the browser request was just snagging these values because when the user hits the Continue button on the form, those controls would have the values of the last person added. I figured they wouldn't matter and forgot about them, but I was wrong.
It's a peculiar issue that I should have caught the first time. I can't complain though, I am scraping a site after all.
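In terms of the sketch above, the fix amounted to carrying the last person's values into the final step-4 POST instead of sending empty strings (field names again hypothetical):
# last_person is a hypothetical dict holding the most recently added
# person's details; empty strings here were what broke the results page.
form = dict(tokens, Continue='Continue',
            County=last_person['county'], Address=last_person['address'])
s.post(BASE + '/search.aspx', data=form)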
Review the Charles logs again. It is possible that the search results and other content are coming over via Ajax, and that your Java/Ruby apps are not actually performing all of the requests/responses that happen with the browser. Look for any POST or GET requests in between the requests you are already duplicating. If the search results are populated via JavaScript, your client app may not be able to handle this.
