Form Recognizer not processing requests - azure-cognitive-services

For the past few hours form recognizer analyze
https://my_ResourceName.cognitiveservices.azure.com/formrecognizer/v2.0-preview/custom/models/my_modelId/analyzeresults/my_referenceId
{
"status": "notStarted",
"createdDateTime": "2021-06-14T21:00:38Z",
"lastUpdatedDateTime": "2021-06-14T21:00:39Z"
}
Anyone has similar experience? I know usually takes some time to process form but now all tries to process form are failing with no error of any sort. Using postman to check both post to analyze and get analyze results. This used to work for months without issues until today!

Form recognizer has new version!
https://my_ResourceName.cognitiveservices.azure.com/formrecognizer/v2.1-preview.3/custom/models/my_modelId/analyzeresults/my_referenceId
It would be nice if old webservice was returning more descriptive message instead of "notStarted", since it will never start again in the future, and point users to new url.

Related

Are there ways to check the structure of JSON response in Postman tests?

I have written several tests in Postman, based on the example snippet codes given in the Postman GUI on Windows desktop.
Mainly, I want to check for existence of the parameters in the response (exact in those cases where I need to check for particular values of the parameters) and I want to know if there's a better way to do it than the way I've been doing now.
The following test shows one such example and this is just a small part of it. The actual response schema is a lot larger so I envisioned writing 50-60 lines of such checks per API endpoint.
pm.test("Det details of a POI", function () {
pm.expect(jsonData.code).to.eql(0);
pm.expect(jsonData.data[0].provider).to.eql("google");
pm.expect(jsonData.data[0]).to.have.property("id");
pm.expect(jsonData.data[0].location).to.have.property("position");
pm.expect(jsonData.data[0].location.address).to.have.property("text");
pm.expect(jsonData.data[0].location.address).to.have.property("house");
pm.expect(jsonData.data[0].location.address).to.have.property("street");
pm.expect(jsonData.data[0].location.address).to.have.property("postalCode");
pm.expect(jsonData.data[0].location.address).to.have.property("city");
pm.expect(jsonData.data[0].location.address).to.have.property("county");
pm.expect(jsonData.data[0].location.address).to.have.property("state");
pm.expect(jsonData.data[0].location.address.country).to.eql("United Kingdom");
pm.expect(jsonData.data[0].location.address).to.have.property("countryCode");
pm.expect(jsonData.data[0].contacts).to.have.property("phone");
pm.expect(jsonData.data[0].contacts.website.value).to.include("www.google.com");
pm.expect(jsonData.data[0].contacts.website).to.have.property("label");
pm.expect(jsonData.data[0].categories[0]).to.have.property("id");
pm.expect(jsonData.data[0].categories[0]).to.have.property("title");
pm.expect(jsonData.data[0].categories[0]).to.have.property("type");
pm.expect(jsonData.data[0].categories[0]).to.have.property("system");
)};
Any tips and improvements would be greatly appreciated.
You're basically asking the same as these two Stack Overflow posts:
Schema validation using Postman
How to validate response in Postman?
Answer: There is a json format validation build into Postman it uses the Tiny Validator project to allow schema validation in post-request test scripts. Research Postman's documentation (1, 2) for examples on how to use it.

Mobile data reported in GA Measurement Protocol appear in realtime but not in daily summary

I've been attempting to log activity on a mobile-like device using the Google Analytics Measurement Protocol. All of these attempts have validated using the validation URL, and I can see activity when I look at the real-time reports on the Analytics website. But when I look at the Home or Overview reports for the day - no activity is shown.
The view is set for "All Mobile App Data".
The POST body looks something like this:
v=1&tid=UA-000000000-1&ds=app&qt=1601&uid=uid-zzzzz&t=screenview&cd=Foo&an=Foo%20App%20Name&aid=com.example.foo&aiid=com.example.foo&av=0.0.1&ua=Mozilla%2F5.0%20(Linux%3B%20Android%207.0%3B%20SM-G930V%20Build%2FNRD90M)%20AppleWebKit%2F537.36%20(KHTML%2C%20like%20Gecko)%20Chrome%2F59.0.3071.125%20Mobile%20Safari%2F537.36
The ua field is just a pre-defined string. I found that if I omitted it, the Real Time monitoring listed the hits as desktop hits, although I was in a Mobile report and the ds field was "app".
Am I missing a field that is required? Is there some reason why it is showing up in the real-time report, but not in a daily report? Is there some other way to diagnose why the data is vanishing, or confirm the data is actually being captured?
When i check the debug endpoint the hit is valid
Request:
https://www.google-analytics.com/debug/collect?v=1&tid=UA-XXX-1&ds=app&qt=1601&uid=uid-zzzzz&t=screenview&cd=Foo&an=Foo%20App%20Name&aid=com.example.foo&aiid=com.example.foo&av=0.0.1&ua=Mozilla%2F5.0%20(Linux%3B%20Android%207.0%3B%20SM-G930V%20Build%2FNRD90M)%20AppleWebKit%2F537.36%20(KHTML%2C%20like%20Gecko)%20Chrome%2F59.0.3071.125%20Mobile%20Safari%2F537.36
Response
{
"hitParsingResult": [ {
"valid": true,
"parserMessage": [ ],
"hit": "/debug/collect?v=1\u0026tid=UA-53766825-1\u0026ds=app\u0026qt=1601\u0026uid=uid-zzzzz\u0026t=screenview\u0026cd=Foo\u0026an=Foo%20App%20Name\u0026aid=com.example.foo\u0026aiid=com.example.foo\u0026av=0.0.1\u0026ua=Mozilla%2F5.0%20(Linux%3B%20Android%207.0%3B%20SM-G930V%20Build%2FNRD90M)%20AppleWebKit%2F537.36%20(KHTML%2C%20like%20Gecko)%20Chrome%2F59.0.3071.125%20Mobile%20Safari%2F537.36"
} ],
"parserMessage": [ {
"messageType": "INFO",
"description": "Found 1 hit in the request."
} ]
}
I cannot use one of the mobile libraries from Firebase - this is not one of the platforms they support. I do not wish to pretend this is a web page - there is no associated hostname or path. I do not wish to use Events since I can't do event Behavior Flow, which is one of the things I'm interested in seeing.
I'm aware that it can sometimes take "a day or so" for results to first appear. The site was setup over five days ago at this point, and has received data during that time.
Good thought about the anti-spam setting, however the setting appears to be correct:
I've also tried using GET instead of POST - no change, it still shows the hit in real-time, but then it vanishes.
However, I know that it can record hits permanently. There were two hits from a spammer in Russia that have shown up in the daily report (I wasn't there to see it show up in real-time). I don't know what they did, but would love to find out since it might help figure out how I can add a record.
In the real-time reports, it correctly points out the data center all the hits are coming from. Perhaps that is filtering it out somewhere out of my control?
Try adding Cid I know it says this is an optional parameter but for mobile accounts I belive it may be required.
Client ID
Optional.
This field is required if User ID (uid) is not specified in the request. This anonymously identifies a particular user, device, or browser instance. For the web, this is generally stored as a first-party cookie with a two-year expiration. For mobile apps, this is randomly generated for each particular instance of an application install. The value of this field should be a random UUID (version 4) as described in http://www.ietf.org/rfc/rfc4122.txt.
Example value: 35009a79-1a05-49d7-b876-2b884d0f825b
Although this says it needs to be a UUIDv4, it does work with other UUIDs (I've tested it with a v5, which is a hash against the value used for the uid parameter).

Google reCAPTCHA response success: false, no error codes

UPDATE: Google has recently updated their error message with an additional error code possibility: "timeout-or-duplicate".
This new error code seems to cover 99% of our previously mentioned mysterious
cases.
We are still left wondering why we get that many validation requests that are either timeouts or duplicates. Determinining this with certainty is likely to be impossible, but now I am just hoping that someone else has experienced something like it.
Disclaimer: I cross posted this to Google Groups, so apologies for spamming the ether for the ones of you who frequent both sites.
I am currently working on a page as part of a ASP.Net MVC application with a form that uses reCAPTCHA validation. The page currently has many daily users.
In my server side validation** of a reCAPTCHA response, for a while now, I have seen the case of the reCAPTCHA response having its success property set to false, but with an accompanying empty error code array.
Most of the requests pass validation, but some keep exhibiting this pattern.
So after doing some research online, I explored the two possible scenarios I could think of:
The validation has timed out and is no longer valid.
The user has already been validated using the response value, so they are rejected the second time.
After collecting data for a while, I have found that all cases of "Success: false, error codes: []" have either had the validation be rather old (ranging from 5 minutes to 10 days(!)), or it has been a case of a re-used response value, or sometimes a combination of the two.
Even after implementing client side prevention of double-clicking my submit-form button, a lot of double submits still seem to get through to the server side Google reCAPTCHA validation logic.
My data tells me that 1.6% (28) of all requests (1760) have failed with at least one of the above scenarios being true ("timeout" or "double submission").
Meanwhile, not a single request of the 1760 has failed where the error code array was not empty.
I just have a hard time imagining a practical use case where a ChallengeTimeStamp gets issued, and then after 10 days validation is attempted, server side.
My question is:
What could be the reason for a non-negligible percentage of all Google reCAPTCHA server side validation attempts to be either very old or a case of double submission?
**By "server side validation" I mean logic that looks like this:
public bool IsVerifiedUser(string captchaResponse, string endUserIp)
{
string apiUrl = ConfigurationManager.AppSettings["Google_Captcha_API"];
string secret = ConfigurationManager.AppSettings["Google_Captcha_SecretKey"];
using (var client = new HttpClient())
{
var parameters = new Dictionary<string, string>
{
{ "secret", secret },
{ "response", captchaResponse },
{ "remoteip", endUserIp },
};
var content = new FormUrlEncodedContent(parameters);
var response = client.PostAsync(apiUrl, content).Result;
var responseContent = response.Content.ReadAsStringAsync().Result;
GoogleCaptchaResponse googleCaptchaResponse = JsonConvert.DeserializeObject<GoogleCaptchaResponse>(responseContent);
if (googleCaptchaResponse.Success)
{
_dal.LogGoogleRecaptchaResponse(endUserIp, captchaResponse);
return true;
}
else
{
//Actual code ommitted
//Try to determine the cause of failure
//Look at googleCaptchaResponse.ErrorCodes array (this has been empty in all of the 28 cases of "success: false")
//Measure time between googleCaptchaResponse.ChallengeTimeStamp (which is UTC) and DateTime.UtcNow
//Check reCAPTCHAresponse against local database of previously used reCAPTCHAresponses to detect cases of double submission
return false;
}
}
}
Thank you in advance to anyone who has a clue and can perhaps shed some light on the subject.
You will get timeout-or-duplicate problem if your captcha is validated twice.
Save logs in a file in append mode and check if you are validating a Captcha twice.
Here is an example
$verifyResponse = file_get_contents('https://www.google.com/recaptcha/api/siteverify?secret='.$secret.'&response='.$_POST['g-recaptcha-response'])
file_put_contents( "logfile", $verifyResponse, FILE_APPEND );
Now read the content of logfile created above and check if captcha is verified twice
This is an interesting question, but it's going to be impossible to answer with any sort of certainly. I can give an educated guess about what's occurring.
As far as the old submissions go, that could simply be users leaving the page open in the browser and coming back later to finally submit. You can handle this scenario in a few different ways:
Set a meta refresh for the page, such that it will update itself after a defined period of time, and hopefully either get a new ReCAPTCHA validation code or at least prompt the user to verify the CAPTCHA again. However, this is less than ideal as it increases requests to your server and will blow out any work the user has done on the form. It's also very brute-force: it will simply refresh after a certain amount of time, regardless of whether the user is currently actively using the page or not.
Use a JavaScript timer to notify the user about the page timing out and then refresh. This is like #1, but with much more finesse. You can pop a warning dialog telling the user that they've left the page sitting too long and it will soon need to be refreshed, giving them time to finish up if they're actively using it. You can also check for user activity via events like onmousemove. If the user's not moving the mouse, it's very likely they aren't on the page.
Handle it server-side, by catching this scenario. I actually prefer this method the most as it's the most fluid, and honestly the easiest to achieve. When you get back success: false with no error codes, simply send the user back to the page, as if they had made a validation error in the form. Provide a message telling them that their CAPTCHA validation expired and they need to verify again. Then, all they have to do is verify and resubmit.
The double-submit issue is a perennial one that plagues all web developers. User behavior studies have shown that the vast majority occur because users have been trained to double-click icons, and as a result, think they need to double-click submit buttons as well. Some of it is impatience if something doesn't happen immediately on click. Regardless, the best thing you can do is implement JavaScript that disables the button on click, preventing a second click.

Paypal Processing - Need to grab TransactionId, CorrelationId and TimeStamp

Current Project:
ASP.NET 4.5.2
MVC 5
PayPal API
I am using this example to build myself a PayPal transaction (and yes, my code is virtually identical), as I do not know of any other method that will return the three values in the title.
My main problem is that, the example I am utilizing is much more concise and compact than the one I used for a much older Web Forms application, and as such, I am unsure as to where or even how to grab the three values I need.
My initial thought was to do so right after the ACK, and indeed I was able to obtain the CorrelationId as well as the TimeStamp, but because this was prior to the user being carted off to PayPal’s site (sandbox in this case -- see the return new PayPalRedirect contained within the if), the TransactionId was blank. And in this example, PayPal explicitly redirects the user to a Success page without returning to the Action that sent the user to PayPal in the first place, and I am not seeing any GET values in the URL at all aside from the Token and the PayerId, much less ones that could provide me with the TransactionId.
Suggestions?
I have also looked at the following examples:
For ASP.NET Core, was unsure how to adapt to my current project particularly due to appsettings.json, but it looked quite well done. I really liked how the values were rolled up in lists.
For MVC 4, but I couldn’t find where ACK was being used to determine success or successwithwarning so I couldn’t hook into that.
I have also found the PayPal content to be like trying to drink from a fire hose at full blast -- not only was the content was hopelessly outdated (Web Forms code, FTW!) but there was also so many different examples it would have taken me days to determine which one was most appropriate to use.
Any assistance would be greatly appreciated.
Edit: my initial attempt at modifying the linked code has this portion:
values = Submit(values);
var ack = values["ACK"].ToLower();
if(ack == "success" || ack == "successwithwarning") {
using(_db = new ApplicationDbContext()) {
var updateOrder = await _db.Orders.FirstOrDefaultAsync(x => x.OrderId == order.OrderId);
if(updateOrder != null) {
updateOrder.OrderProcessed = false;
updateOrder.PayPalCorrelationId = values["CORRELATIONID"];
updateOrder.PayPalTransactionId = values["TRANSACTIONID"];
updateOrder.PayPalTimeStamp = values["TIMESTAMP"];
updateOrder.IPAddress = HttpContext.Current.Request.UserHostAddress;
_db.Entry(updateOrder).State = EntityState.Modified;
await _db.SaveChangesAsync();
}
}
return new PayPalRedirect {
Token = values["TOKEN"],
Url = $"https://{PayPalSettings.CgiDomain}/cgi-bin/webscr?cmd=_express-checkout&token={values["TOKEN"]}"
};
}
Everything within and including the using() is my added content. As I mentioned, the CorrelationId and the TimeStamp come through just fine, but I have yet to successfully obtain the TransactionId.
Edit 2:
More problems -- the transactions that are “successful” through the sandbox site (the ReturnUrl is getting called) aren’t reflecting properly on my Facilitator and Buyer accounts, even when I do payments straight from the buyer’s PayPal account (not using the Credit Card). I know I am supposed to see transactions in the Buyer’s account, either through the overall Dev account (Accounts -> Profile -> balance or Accounts -> Notifications) or through the Buyer’s account in the sandbox front end. And yet -- multiple transactions returning me to the ReturnUrl path, and yet no transactions in either.
Edit 3:
Okay, this is really, really weird. I have gone over all settings with a fine-toothed comb, and intentionally introduced errors to see where things should crap out. It turns out that the entire process goes swimmingly - except nothing shows up in my notifications and no amounts get moved between my different accounts (Facilitator and Buyer). It’s like all my transactions are going into /dev/null, yet the process is successful.
Edit 4: A hint!
In the sandbox, where Buyer accepts the transaction, there is a small note, “You will be able to review the transaction before completing it” or something like that -- suggesting that an additional page is not coming up and that the user is being uncerimoniously dumped back to the success page. Why the success page? No clue. But it’s happening.
It sounds like you are only doing the first part of the process.
Express Checkout consists of 3 API calls:
SetExpressCheckout
GetExpressCheckoutDetails
DoExpressCheckoutPayment
SEC generates a token, and then you redirect to PayPal where the user signs in and reviews the transactions before agreeing to pay.
They are then sent to the ReturnURL included in your SEC request, and this is where you'll call GECD in order to obtain all the buyer details that are now available since they signed in.
Using that data you can complete the final DECP request, which is what finalizes the procedure. No money is actually processed until this final call is completed successfully.

Log in to website using Jsoup

I'm trying to scrap a webpage for data but came across the problem of needing to log in.
Connection.Response loginForm = Jsoup.connect("http://www.rapidnyc.net/users/google_login")
.method(Connection.Method.GET)
.execute();
Document document = Jsoup.connect("http://www.rapidnyc.net/users/google_login")
.data("Email", "testEmail")
.data("Passwd", "testPass")
.... //other form data
.cookies(loginForm.cookies())
.post();
This gives me the org.jsoup.HttpStatusException: HTTP error fetching URL. Status=400
I used chrome developer tool to look at the Form data being posted but nothing I post works.
1. Have you submitted ALL input fields? Including HIDDEN ones.
2. I see the website requires "captcha-box" authentication, which is to prevent web crawlers from logging in. I highly doubt you will be able to log in with your program.
I say the 400 status is coming from your program not being able to provide the value for "captcha" authentication.

Resources