How to authorize an MS Flow connector in the asynchronous pattern

I have an Azure Function that runs a long-running process, but the MS Flow connector has a 2-minute timeout limit.
I followed the article "What you should know about building Microsoft Flow connectors" (particularly the Asynchronous Connectors section), which partially solves the issue. The approach described there works for API Management endpoints that do not require a subscription.
The issue emerges when a subscription is required. The connector does not seem to add the Ocp-Apim-Subscription-Key header to the request when it follows the 'location' URL, which produces a 401 Unauthorized error.
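One possible workaround, sketched below (this is not from the original question; the function, route, and host names are hypothetical): by default APIM also accepts the subscription key via the subscription-key query parameter, so the function can embed the caller's key in the polling URL it returns, and the connector's follow-up GET then authenticates even though the header is dropped.

using System;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.Http;

public static class StartLongRunningJob
{
    [FunctionName("StartLongRunningJob")]
    public static IActionResult Run(
        [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequest req)
    {
        string jobId = Guid.NewGuid().ToString();

        // The connector only follows the Location header and does not re-send
        // Ocp-Apim-Subscription-Key, so forward the key as a query parameter.
        string key = req.Headers["Ocp-Apim-Subscription-Key"];
        string location = $"https://myapim.azure-api.net/jobs/{jobId}?subscription-key={key}";

        // 202 Accepted tells Flow to keep polling Location until the job completes.
        req.HttpContext.Response.Headers["Retry-After"] = "20";
        return new AcceptedResult(location, null);
    }
}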

Related

Application Insights | Sometimes end-to-end transaction details do not show all telemetry

I have a .NET Core app deployed on Azure with Application Insights enabled.
Sometimes the Application Insights end-to-end transaction details do not display all telemetry.
Here it only shows the error and not the request (or perhaps the request was logged, but the two do not display together; this is difficult to verify because many people use the application).
It should look like this:
Sometimes the request is logged but with no error log.
What could be the reason for this? Do I need to look into a specific Application Insights setup or feature?
Edit:
As suggested here, I tried disabling the sampling feature, but it still does not work. There is an open question about this as well.
This usually happens due to sampling. By default, adaptive sampling is enabled in ApplicationInsights.config, which means that only a certain percentage of each telemetry item type (Event, Request, Dependency, Exception, etc.) is sent to Application Insights. In your example, one part of the end-to-end transaction was probably sent to the server while another part was sampled out. If you want, you can turn off sampling for specific types, or remove the AdaptiveSamplingTelemetryProcessor from the config entirely, which disables sampling altogether. Bear in mind that this leads to higher ingestion traffic and higher costs.
You can also configure sampling in the code itself, if you prefer.
Please find here a good overview of how sampling works and can be configured.
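For a .NET Core app, a minimal sketch of turning adaptive sampling off in code (assuming the Microsoft.ApplicationInsights.AspNetCore package) could look like this:

using Microsoft.ApplicationInsights.AspNetCore.Extensions;
using Microsoft.Extensions.DependencyInjection;

public void ConfigureServices(IServiceCollection services)
{
    // Send every telemetry item; note the higher ingestion volume and cost.
    services.AddApplicationInsightsTelemetry(new ApplicationInsightsServiceOptions
    {
        EnableAdaptiveSampling = false
    });
}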
This may be related to the following:
When using SDK 2.x, you have to track all events yourself and send the telemetry to Application Insights.
When using auto-instrumentation with the 3.x agent, the agent automatically collects traffic, logs, and so on, and you have to pay attention to the sampling configuration in the applicationinsights.json file, where you can filter events.
If you are using Java, these are the supported logging libraries:
- java.util.logging
- Log4j, which includes MDC properties
- SLF4J/Logback, which includes MDC properties
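For the 3.x Java agent, sampling is controlled in applicationinsights.json; a minimal sketch (the percentage value here is only an example) might be:

{
  "sampling": {
    "percentage": 100
  }
}

Setting the percentage to 100 effectively disables sampling for the agent.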

What is the difference between tracing calls in distributed system using ELK stack and using Dynatrace distributed tracing?

In my project this week I was given a backlog item asking me to analyze how we can use distributed tracing in our microservices. We already have an ELK stack, where we check logs in a Kibana dashboard. I tried filtering calls using a correlation ID, and this works fine. However, our application also uses Dynatrace for tracing. My question is: if we already have the ELK stack to check logs and find issues in production, what extra value does Dynatrace distributed tracing provide? What is the main difference that I am failing to understand? Can anyone explain it in simple language? I am new to this topic, so some concepts are not clear to me yet.
Distributed tracing is an industry method to allow developers to monitor the performance of the APIs that they use without actually being able to analyze the backing microservice's code.
From the official docs:
ELK distributed tracing enables you to analyze performance throughout your microservice architecture by tracing the entirety of a request, from the initial web request on your front-end service all the way to database queries made on your back-end services. It works by injecting a custom traceparent HTTP header into outgoing requests. This header includes information, like trace-id, which is used to identify the current trace, and parent-id, which is used to identify the parent of the current span on incoming requests or the current span on an outgoing request.
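For illustration, a traceparent header carries four dash-separated fields (version, trace-id, parent-id, trace-flags); the values below are the example from the W3C Trace Context specification:

traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01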
Dynatrace distributed tracing is almost the same, plus code-execution analysis. In Dynatrace's words: PurePaths plus code-level visibility and AI-based application performance monitoring. It is an easy way to see how your microservices work together, with full service-flow and problem analysis.
More explanation is available on the Dynatrace blog.

Microservice Design with SignalR

We have built a microservice architecture but have run into an issue where the messages going onto the bus are too large. (We discovered this after moving to Azure Service Bus, which only allows 256 KB messages, compared to RabbitMQ's 4 MB.)
Our design is shown in the diagram below. Where we're struggling is with the data being returned.
An example is when performing a search and returning multiple results.
To step through our current process:
The web client sends an HTTP request to the Web API.
The Web API puts an appropriate message onto the bus. (The Web API responds to the client with an Accepted response.)
A microservice picks up this message.
The microservice queries its database for the records matching the search criteria.
The results are returned from the database.
A SearchResult message is added to the bus. (This contains the results.)
Our response microservice is listening for this SearchResult message.
The response microservice then posts to our SignalR API.
The SignalR API sends the results back to the web client.
My question is how do we deal with large results sets when designed in this way? If it's not possible how should the design be changed to handle large results sets?
I understand we could page the results, but even so, a single result could exceed the 256 KB allowance, for example a document or a particularly large object.
There are two ways:
Use a system like Kafka, which supports large messages.
If you can't go with the first approach (which appears to be the case from your question), then the microservice can place two types of messages for the response service:
(1) if the result is small, place the complete message on the bus, and
(2) if the result exceeds the supported size, place a message containing a link to an Azure Storage blob that holds the result.
Based on the message, the response service can retrieve the proper result and return it to the client; a sketch of this claim-check approach follows.
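A minimal sketch of the claim-check approach in C#, assuming the Azure.Messaging.ServiceBus and Azure.Storage.Blobs packages (the queue name, blob names, and size threshold are hypothetical):

using System;
using System.Text.Json;
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;
using Azure.Storage.Blobs;

public class SearchResultPublisher
{
    private const int MaxInlineBytes = 192 * 1024; // stay safely under the 256 KB limit

    private readonly ServiceBusSender _sender;
    private readonly BlobContainerClient _container;

    public SearchResultPublisher(ServiceBusClient busClient, BlobContainerClient container)
    {
        _sender = busClient.CreateSender("search-results"); // hypothetical queue name
        _container = container;
    }

    public async Task PublishAsync(object searchResult)
    {
        byte[] payload = JsonSerializer.SerializeToUtf8Bytes(searchResult);

        if (payload.Length <= MaxInlineBytes)
        {
            // Small result: send it inline as before.
            await _sender.SendMessageAsync(new ServiceBusMessage(payload));
        }
        else
        {
            // Large result: upload to blob storage and send only a claim check.
            string blobName = $"results/{Guid.NewGuid()}.json";
            BlobClient blob = _container.GetBlobClient(blobName);
            await blob.UploadAsync(BinaryData.FromBytes(payload));

            var claimCheck = new ServiceBusMessage(blob.Uri.ToString());
            claimCheck.ApplicationProperties["IsClaimCheck"] = true;
            await _sender.SendMessageAsync(claimCheck);
        }
    }
}

The response service inspects the IsClaimCheck property, downloads the blob when it is set, and then pushes the result to the SignalR API as before.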

Amazon DynamoDB Async client http metrics

I am using Amazon SDK (Java) DynamoDB Async client v2.10.14 with custom configuration:
DynamoDbAsyncClient client = DynamoDbAsyncClient
    .builder()
    .region(region)
    .credentialsProvider(credentialsProvider)
    .httpClientBuilder(
        NettyNioAsyncHttpClient.builder()
            .readTimeout(props.readTimeout)
            .writeTimeout(props.writeTimeout)
            .connectionTimeout(props.connectionTimeout))
    .build();
I often run into a connect timeout:
io.netty.channel.ConnectTimeoutException: connection timed out: dynamodb.region.amazonaws.com/1.2.3.4:443
I expect this is due to my settings, but I need aggressive timeouts. I used to run into the same issue with the defaults anyway (it just took longer). I would like to know why I am getting into this situation. My gut feeling is that it's related to connection pool exhaustion or other issues with the pool.
Are there any metrics I can turn on to monitor this?
It seems like your application is a "latency-aware DynamoDB client application". The underlying HTTP client behavior needs to be tuned along with its retry strategy. Luckily, the AWS Java SDK provides full control over the HTTP client behavior and retry strategies. This documentation explains how to tune AWS Java SDK HTTP request settings and parameters for the underlying HTTP client:
https://aws.amazon.com/blogs/database/tuning-aws-java-sdk-http-request-settings-for-latency-aware-amazon-dynamodb-applications/
The document provides an example of how to tune the five HTTP client configuration parameters that are set when the ClientConfiguration object is created, and discusses each of them comprehensively:
ConnectionTimeout
ClientExecutionTimeout
RequestTimeout
SocketTimeout
The DynamoDB default retry policy for HTTP API calls with a custom maximum error retry count
Tuning this latency-aware DynamoDB application "requires an understanding of the average latency requirements and record characteristics (such as the number of items and their average size) for your application, interdependencies between different application modules or microservices, and the deployment platform. Careful application API design, proper timeout values, and a retry strategy can prepare your application for unavoidable network and server-side issues"
Quoted from: https://aws.amazon.com/blogs/database/tuning-aws-java-sdk-http-request-settings-for-latency-aware-amazon-dynamodb-applications/
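Note that the blog post describes the v1 SDK's ClientConfiguration; for the v2 async client shown in the question, roughly equivalent knobs live on the Netty HTTP client builder and the client override configuration. A hedged sketch (the values are illustrative, not recommendations):

import java.time.Duration;
import software.amazon.awssdk.core.client.config.ClientOverrideConfiguration;
import software.amazon.awssdk.core.retry.RetryPolicy;
import software.amazon.awssdk.http.nio.netty.NettyNioAsyncHttpClient;
import software.amazon.awssdk.services.dynamodb.DynamoDbAsyncClient;

DynamoDbAsyncClient client = DynamoDbAsyncClient.builder()
    .region(region)
    .credentialsProvider(credentialsProvider)
    .httpClientBuilder(
        NettyNioAsyncHttpClient.builder()
            .maxConcurrency(200)                                 // connection pool size
            .connectionAcquisitionTimeout(Duration.ofSeconds(2)) // fail fast when the pool is exhausted
            .connectionTimeout(Duration.ofMillis(500))
            .readTimeout(Duration.ofSeconds(1))
            .writeTimeout(Duration.ofSeconds(1)))
    .overrideConfiguration(
        ClientOverrideConfiguration.builder()
            .retryPolicy(RetryPolicy.builder().numRetries(3).build())
            .build())
    .build();

If the pool is being exhausted, a short connectionAcquisitionTimeout will surface that as explicit acquisition failures rather than generic connect timeouts. The SDK's client metrics module (with a CloudWatch publisher) appears to have arrived in v2 releases later than 2.10.14, so upgrading would be needed to use it.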

ASP.Net API App - continual HTTP 502.3 errors

My team and I have been at this for four full days now, analyzing every log available to us, Azure Application Insights, you name it, we've analyzed it. And we still cannot get to the root cause of this issue.
We have a customer who is integrated with our API to make search calls and they are complaining of intermittent but continual 502.3 Bad Gateway errors.
Here is the flow of our architecture:
All resources are in Azure. The endpoint our customers call is a .NET Framework 4.7 Web App Service in Azure that acts as the stateless handler for all the API calls and responses.
This API app sends the calls to an Azure Service Fabric cluster. That cluster load balances incoming traffic and distributes the API calls to our Search Service application. The Search Service application then generates an ElasticSearch query from the API call and sends that query to our ElasticSearch cluster.
ElasticSearch then sends the results back to Service Fabric, and the process reverses from there until the results are sent back to the customer from the API endpoint.
What may separate our process from a typical API is that our response payload can be relatively large, depending on the search. Over the last several days, the payload of a single response has averaged anywhere from 6 MB to 12 MB; our searches simply return a lot of data from ElasticSearch. In any case, a normal search typically executes and returns in 15 seconds or less. As of right now, we have already increased our timeout window to 5 minutes to reduce timeout errors while these searches take so long. We increased the timeout via the following code in Startup.cs:
services.AddSingleton<HttpClient>(s =>
{
    // 300 seconds instead of HttpClient's 100-second default
    return new HttpClient() { Timeout = TimeSpan.FromSeconds(300) };
});
I've read in some places that you actually have to set this in web.config instead of here, or at least in addition to it. Is this true?
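If the app is ASP.NET Core hosted behind IIS (which the Startup.cs snippet suggests), that is correct: the 502.3 is produced by the ASP.NET Core Module when its own requestTimeout (default two minutes) elapses, independently of HttpClient.Timeout. A sketch of the web.config setting (processPath and arguments are placeholders for your app):

<system.webServer>
  <handlers>
    <add name="aspNetCore" path="*" verb="*" modules="AspNetCoreModule" resourceType="Unspecified" />
  </handlers>
  <!-- requestTimeout defaults to 00:02:00; the module returns 502.3 when it elapses -->
  <aspNetCore processPath="dotnet" arguments=".\MyApi.dll" requestTimeout="00:05:00" />
</system.webServer>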
The customer who is getting the 502.3 errors has significantly increased the volume they send us over the last week, but we believe we are fully scaled to handle it. They are still trying to put the issue on us, but after many days of research, I'm starting to wonder if the problem is actually on their side. Could it be that they are not equipped to receive the increased payload on their side? Could their integration architecture be insufficiently scaled to take the return payload from the increased volumes? When we observe resource usage (CPU/RAM/IO) on all of the above applications, everything is normal, all below 50%. This also makes me wonder if the issue is on their side.
I know it's a bit of a subjective question, but I'm hoping for insight from someone who may have experienced this before, and even more importantly, from someone who has experience with a .NET API app in Azure that returns large datasets in its responses.
Any code blocks of our API app, or screenshots from Application Insights are available to post upon request - just not sure what exactly anyone would want to see yet as I type this.
