Audit logging CosmosDB - azure-cosmosdb

Wanting to validate my ARM template was deployed ok and to get an understanding of the telemetry options...
Under what circumstances do the following get logged to Log Analytics?
DataPlaneRequests
MongoRequests
QueryRuntimeStatistics
Metrics
From what I can tell arduously in the last few days connecting in different ways.
DataPlaneRequests are logged for:
SQL API calls
Table API calls even when the account was setup for SQL API
Graph API calls against an account setup for Graph API
Table API calls against an account setup for Table API
MongoRequests are logged for:
Mongo requests even when the account was setup for SQL API
However I haven't been able to see anything for QueryRuntimeStastics (even when turning on PopulateQueryMetrics) nor have I seen any AzureMetrics appear?

Thanks Alex for spending time and trying out different options of logging for Azure Cosmos DB.
There are primarily two types of monitoring paths for Azure Cosmos DB.
Metrics: These are low latency (<5 min) and aggregated metrics which are exposed on Azure Monitor API for consumption. THese metrics are primarily used for diagnosis of the app for any live site issues.
Logs: These are raw request logs coming at 2hours+ latency and are used for customer for primarily audit scenarios to understand who accessed the data.
Depending on your need you can choose either of the approaches.
DataPlaneRequests by default shows all the requests across all the API's and Mongo Requests only show Mongo specific calls. Please note Mongo requests would also be seen in Data Plane requests.
Metrics would not be see in Log Analytics due to a knowwn which our partner team is fixing.
Let me know if you have any further questions here.

Related

Which API should be used for querying Application Insights trace logs?

Our ASP.NET Core app logs trace messages to App Insights. We need to be able to query them and filter by some customDimentions. However, I have found 3 APIs and am not sure which one to use:
App Insights REST API
Azure Log Analytics REST API
Azure Data Explorer .NET SDK (Preview)
Firstly, I don't understand the relationships between these options. I thought that App Insights persisted its data to Log Analytics; but if that's the case I would expect to only be able to query through Log Analytics.
Regardless, I just need to know which is the best to use and I wish that documentation were clearer. My instinct says to use the App Insights API, since we only need data from App Insights and not from other sources.
The difference between #1 and #2 is mostly historical and converging.
Application Insights existed as a product before log analytics, and were based on different underlying database technologies
Both Application Insights and Log Analytics converged to use the same underlying database, based on ADX (Azure Data Explorer), and the same exact REST API service to query either. So while your #1 and #2 links are different, they point to effectively the same service backend by the same team, but the pathing/semantics are subtly different where the service looks depending on the inbound request.
both AI and LA introduce the concept of multi-tenancy and a specific set of tables/schema on top of their azure resources. They effectively hide the entire database from you, and make it look like one giant database.
there is now the possibility (suggested) to even have your Application Insights data placed in a Log Analytics Workspace:
https://learn.microsoft.com/en-us/azure/azure-monitor/app/create-workspace-resource
this lets you put the data for multiple AI applications/components into the SAME log analytics workspace, to simplify query across different apps, etc
Think of ADX as any other kind of database offering. If you create an ADX cluster instance, you have to create database, manage schema, manage users, etc. AI and LA do all that for you. So in your question above, the third link to ADX SDK would be used to talk to an ADX cluster/database directly. I don't believe you can use it to directly talk to any AI/LA resources, but there are ways to enable an ADX cluster to query AI/LA data:
https://learn.microsoft.com/en-us/azure/data-explorer/query-monitor-data
And ways to have a LA/AI query also join with an ADX cluster using the adx keyword in your query:
https://learn.microsoft.com/en-us/azure/azure-monitor/logs/azure-monitor-data-explorer-proxy

Is there a Google API answering about Firestore database either Metrics or Health Checks or Current Active Connectios or Exceptions or Performance

Context: I am total Google Cloud begginer and I have just convinced my company headers to use Firestore Realtime Database for pushing transaction status to our mobile application. We have around 4 millions users that will use significantly our application for small money transfers. Now-a-days we use the concept of polling from Android/IOS to our Microservice endpoints and it will replaced by Firebase SDK imported to our Mobile app which will listen/observe to our Firestore Collection following few Firestore Rules. Since all money transfer will be confirmed/denied in short time (from few seconds to 1 or 2 minutes) the idea of replacing polling by a real reactive approach straigh from Firestore sounded and is already ongoing coding.
The issue: Firstly I don't what to compare solutions. It is just my reality: the prodution support operators must look after our internal Dashboard. Isn't allowed to them look at Google Dashboard Console (please accept this for this question). I need get on demand metrics of our FIrestore. It is nothing to do with Google pricing. It is just our demand: they want to see metrics like:
how many users listening at the same time now
how many users took some exception during connection
is there any user holding connection for more than X minute
when was the connection pick this morning
any exception of any type surrounding our Firestore database
I read Code Samples carefully follow the sample step-by-step trying to figure out some idea if there is some API providing the answers I am looking for.
So, my straight question is: is there such type of Google API providing metrics about my Firestore Database? Maybe following the same idea we found in Performance Monitor which works on Mobile side also some similar aproach on Firestore side.
*** Edited
Future readers may find worth read also about a way to get Firestore metrics info striagh from curl/postman
A couple of things: You mentioned both Firestore and Realtime Database; just wanted to make sure that you are aware that those are two different databases offered under the Firebase umbrella.
how many users listening at the same time now
is there any user holding connection for more than X minute
Yes, there's a dashboard: https://support.google.com/firebase/answer/6317517?hl=en. Including lots of options, like users active in the last 30 mins.
how many users took some exception during connection
any exception of any type surrounding our Firestore database
Yes, you can track errors and other logging via Stack Driver logging. These can give you reports on your cloud functions.
https://cloud.google.com/functions/docs/monitoring
Where can I find Stackdriver in Firebase console?
when was the connection pick this morning
For this one, I'm not sure if you mean A. when did somebody log on in the morning, or B. what was the time that there was the peak \ most usage. If B see 1. If A,
Real-time database has the concept of presence, which lets you know if a user is currently logged in or not. See examples here from the official documentation:
https://firebase.google.com/docs/firestore/solutions/presence
and this post
How to make user presence mechanism using Firebase?
Also applies to your
is there any user holding connection for more than X minute
..............
Edit in response to comments: I believe you are experiencing the XY problem https://meta.stackexchange.com/questions/66377/what-is-the-xy-problem where you are focused on a particular solution, even though your problem has other solutions. User metrics, database events, and errors are all accessible through both dashboards and cloud functions. You can cURL cloud functions if you wish, or set up cron functions to auto report, or set up database trigger functions to log errors. So, while the exact way you want this to work may not exist, you just need to connect existing tools to get the result you want.

Firebase browser key API restrictions

When creating a new project Firebase generates browser API keys automatically in the GCP API credentials. This is the same API key that is set in the Firebase Web client SDKs and is publicly available.
By default the key has no restrictions, so it's prone to quota stealing for every API enabled for that project. Surprisingly I have not found information about securing this key in the Firebase documentation.
So I took two extra steps to secure the key:
Added HTTP referrer restriction to allow requests from my domain only.
Added Identity Toolkit API to the list of allowed APIs. Experimentally I've figured out that it's enough for Firebase Auth and Firestore to work.
Added Token Service API. This is needed for refresh tokens to work and keep the authentication.
My question is mostly related to points #2-3. What are the APIs that needs to be enabled for various components of Firebase to work on the web?
I also enabled those same two APIs, but I used the Metrics Explorer to see what the various Firebase-created keys had been using based on actual traffic.
In GCP,
Go to Monitoring -> Metrics Explorer
Click 6W in the time range above the graph
Resource Type, start typing consumed_api and select it
Metric, choose Request Count
Group By, type credential_id, select it, then type service, and select it
Aggregator, select sum
By now, the legend for the graph should list all the credential ids and which services they used in the last 6 weeks. You should be able to figure out the APIs from the service.
You can use Filter to filter by credential_id if the results are too noisy.
By default the key has no restrictions, so it's prone to quota
stealing for every API enabled for that project.
This is indeed possible and I am able to make e. g. Google Maps API call with the auto generated Firebase API key.
Such preconfigured behaviour was certainly unexpected and I am now experimenting with the restrictions as per the extra steps described in the original question.

StackDriver User Quotas

I'm using the Java client library's MetricServiceClient for getting StackDriver timeseries. I am authenticating using a user oauth token (this user has access to multiple projects), but there seems to be some kind of global quota across multiple projects because when I fetch only one or two projects at a time I have no throttling, but when I fetch four or five different projects at a time, I start getting throttled with errors like the following:
io.grpc.StatusRuntimeException: RESOURCE_EXHAUSTED: Insufficient tokens for quota 'DefaultGroup' and limit 'USER-100s' of service 'monitoring.googleapis.com' for consumer 'project_number:764086051850'.
I have confirmed this by alternating which projects are being fetched so that I can say it is not any single project -- they all start to get rate limited. Another strange thing is, that project_number in the error message doesn't correspond to any project I am fetching, or even have access to -- it is meaningless to me.
This appears to be the quota for # of requests per 100 seconds, but I have that set to 10,000 on all projects and I'm not doing nearly that many requests, as the quota historical chart in the web console confirms.
Is there really some global quota that applies across multiple projects and if so, is there some way to work around it? It is much simpler to me to have a single user with access to multiple projects instead of having to make service account tokens for them all.
The token quota you're hitting is for users using Application Default Credentials (which uses a shared gcloud project for billing), while it exists to get users up and run quickly but it's not recommended for actual production use. Therefore using a proper service account bound to the user's project is highly recommended and the solution to the issue.

Linkedin API throttle limit

Recently I was developing an application using Linkedin people-search API. Documentation says that a developer registration has 1 lac API calls per day, but when I have registered this API, and ran a python script, after some 300 calls it says throttle limit exceeds.
Did anyone face such kind of issue using Linkedin API, comments are appreciated.
Thanks in advance.
It's been a while but the stats suggest people still look at this and I'm experimenting with the LinkedIn API and can provide some more detail.
The typical throttles are stated as both a max (e.g. 100K) and a per-user-token number (e.g. 500). Those numbers together mean you can get up to a maximum of 100,000 calls per day to the API but even as a developer a single user token means a maximum of 500 per day.
I ran into this, and after setting up a barebones app and getting some users I can confirm a daily throttle of several thousands of API calls. [Deleted discussion of what was probably, upon further consideration, an accidental back door in the LinkedIn API.]
As per the Throttle Limits published by LinkedIn:
LinkedIn API keys are throttled by default. The throttles are designed
to ensure maximum performance for all developers and to protect the
user experience of all users on LinkedIn.
There are three types of throttles applied to all API keys:
Application throttles: These throttles limit the number of each API call your application can make using its API key.
User throttles: These throttles limit the number of calls for any individual user of your application. User-level throttles serve
several purposes, but in general are implemented where there is a
significant potential impact to the user experience for LinkedIn
users.
Developer throttles: For people listed as developers on their API keys, they will see user throttles that are approximately four times
higher than the user throttles for most calls. This gives you extra
capacity to build and test your application. Be aware that the
developer throttles give you higher throttle limits as a developer of
your application. But your users will experience the User throttle
limits, which are lower. Take care to make sure that your application
functions correctly with the User throttle limits, not just for the
throttle limits for your usage as a developer.
Note: To view current API usage of your application and to ensure you haven't hit any throttle limits, visit
https://www.linkedin.com/developer/apps and click on "Usage & Limits".
The throttle limit for individual users of People Search is 100, with 400 being the limit for the person that is associated with the Application as the developer:
https://developer.linkedin.com/documents/throttle-limits
When you run into a limit, view the api usage for the application on the application page to see which throttle you are hitting.

Resources