How can we find the first occurrence of an error in Log Analytics or Application Insights? There are errors that get repeatedly written to the log files, but I want to find errors that do not match these known patterns and send an alert when one appears.
Alternatively, I'd like to search for these 'different'/irregular errors (not necessarily first-time ones) during a particular hour or a custom time range.
I am thinking of the following options as solutions (the list of existing or common errors can run to 100-500 lines):
Running a saved Kusto query with a list of hardcoded error messages (the resulting mismatches can be listed as new errors).
Creating a datatable with all the existing/common errors (in Kusto, but will it stay stored?) and doing the same as above with a join.
Or using Logic Apps: store the known errors in Table Storage, retrieve the table first, and run the join query against Log Analytics.
Has anyone done this before, and what would you suggest as the best option?
Possible options:
If the known errors list is small, you can adjust your Kusto query in Log Analytics/Application Insights to exclude those common errors and create an alert based on your alert logic (custom query, number of results, threshold, period (grain), and frequency).
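For instance, a minimal sketch of such an exclusion query (the table name, severity filter, and error strings below are placeholders, not actual values from your workspace):

// known/common error phrases to suppress
let KnownErrors = dynamic(["Timeout expired", "Deadlock victim", "Connection was reset"]);
traces
| where severityLevel >= 3
| where not(message has_any (KnownErrors))
| summarize NewErrorCount = count() by message
| order by NewErrorCount desc

Anything this query returns is an error text that does not contain one of the known phrases, and the alert rule can then fire on the number of results.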
If you want to maintain the errors list within Log Analytics/Application Insights as custom events, you can call the respective APIs and ingest the data; the documentation covers the details.
Once you have your list of errors as custom logs/events, you can write your Kusto query accordingly.
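For example, assuming the known errors were ingested into a custom log table named KnownErrors_CL with a field ErrorMessage_s (illustrative names following the custom-log naming convention), a left-anti join returns only the log entries that do not match any known error:

traces
| where severityLevel >= 3
| join kind=leftanti (
    KnownErrors_CL
    | project message = ErrorMessage_s
) on message
| project timestamp, message

Note that this matches on the full message text; if the messages contain variable parts, the exclusion approach above with has_any may work better.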
Additional documentation references for dynamic thresholds and smart detection; see if these fit your requirement:
Metric Alerts with Dynamic Thresholds in Azure Monitor
Smart Detection in Application Insights
Hope the above information helps!
I am exploring using ADX as a time-series data store for sensor metrics. Our current solution stores the data in MSSQL, and I'm testing ADX as an alternative. I was able to set up data ingestion and I can perform basic queries, and with the added aggregation functions, computing insights and statistics seems to be much faster.
As part of the solution, we have an API data access layer used by clients and our web portal to query data for display and analysis. I am currently transforming the MSSQL queries into their KQL versions and I'm hitting a stumbling block on data pagination.
We have a function to query historical data using a combination of:
a start/end date,
a device identifier,
and some paging options
records per page,
current page,
column sorting / additional filtering
Currently this is handled in a SQL stored procedure on the back-end: it gets the total number of records and pages (which is set as output on the API so that the front-end can use this data in the table view), then gets the records based on the input parameters and pagination details and returns a record set - quite straightforward.
Any suggestions on how to achieve effective pagination using ADX/KQL?
I found a section in the docs on pagination with stored query results, but since the queries are dynamic based on user input, this does not sound like a viable option.
When you paginate (for example, viewing results 21-30), you need to consider whether you are taking a snapshot of the result and paging through it, or viewing live data. If you expect new rows coming in not to affect your pagination, then a stored query result is that snapshot. Once you generate it, you can select specific rows from it based on your page calculation.
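For example, a rough sketch of that flow (the table, column, and parameter names are just illustrative): first materialize the filtered result with a row number, then page through it.

// 1. Create the snapshot once per search request
.set stored_query_result DeviceReadings <|
    SensorReadings
    | where DeviceId == "device-42"
      and Timestamp between (datetime(2024-05-01) .. datetime(2024-05-31))
    | order by Timestamp asc
    | extend RowNum = row_number()

// 2. Fetch an individual page, e.g. rows 21-30 for page 3 with 10 records per page
stored_query_result("DeviceReadings")
| where RowNum between (21 .. 30)

The total record count for the API response can be taken from the same snapshot (for example, summarize max(RowNum)), and the stored result expires automatically, so you are not accumulating state indefinitely.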
I have 6 DynamoDB tables set up for use as data sources for my iOS app. They all get called from the client nearly the same amount when someone uses the app, but for some reason one table is getting crazy spikes in reads, going way over the read capacity and causing a "provisioned throughput for table exceeded" error.
The app only has a very small number of users in general (below 100).
The offending table metrics look like this:
But all the other tables look like this:
Any thoughts on what might be causing this? The code for making requests client-side is all the same, and the tables are seemingly all set up the same in the API using Lambda and API Gateway. What am I missing?
I am having trouble interpreting the data coming back from the render API for some metrics I have set up in Graphite. I have a Splunk query populating the results, and Graphite is configured to poll this query, which covers a 1-minute range, every minute. The results I get back from a render API call with format=json are not making much sense to me. If I add from=-1min to the API call, I get 2 data point results, 4 for -2min, etc. On top of that, the data points are often just given a null value, which should never happen given how populated this Splunk query is when I run it manually for the same time ranges.
I am wondering if there is any additional documentation on the render API other than what is listed here: https://graphite.readthedocs.io/en/0.9.15/render_api.html because it really isn't telling me much about why the data is appearing the way it is. Does anyone know of additional docs, or have any insight into what is going on here?
We have a bit of a problem.
We've built a GWT application on top of our two Alfresco instances. The application should work like this:
The user searches for a document.
Our web app fires the same query against both repositories, waits for both results, and exposes a merged result set.
This works when the search is for a specific document (by numeric ID, for example) or for 10, 20, 50 documents (we don't know exactly when this begins to act strange).
If the query is a large one (like all documents from last month; there should be about 30-60k per month), obviously the CMIS query limit (500) cuts it off first.
BUT if the user hits "search" the first time, after a while the result set contains only 2 documents. And if the user hits "search" again right after that, with the same query, the result set comes back almost immediately and lists 500 documents.
What the heck is wrong? Does CMIS cache results in some way? How do big CMIS queries work?
Thanks
A.
As you mentioned, you're using Apache Chemistry. Chemistry has a client-side caching mechanism:
http://chemistry.apache.org/java/how-to/how-to-tune-perfomance.html
I suspect this is not CMIS related at all but is instead due to the Alfresco Lucene "max permission check" problem. At a high level, there is a config setting for the maximum number of permission checks that Alfresco will do against a search result set. There is also a limit on the total amount of time it will spend performing such checks. These limits are configured in the repository properties file as:
# The maximum time spent pruning results
system.acl.maxPermissionCheckTimeMillis=10000
# The maximum number of results to perform permission checks against
system.acl.maxPermissionChecks=1000
The first time you run a search the server begins performing these checks and hits the limit. It then returns the search results it was able to filter. Now the permission cache is populated so the next time you run the search the results come back much faster and the result set is larger.
Searches in Alfresco are non-deterministic--you cannot guarantee that, for large result sets, you will get back the exact same result set every time, regardless of how big you make those settings.
If you are able to upgrade at some point you may find that configuring Alfresco to use Solr rather than Lucene could help alleviate this, but I'm not 100% sure it will.
To disable the security checks, replace the public SearchService with searchService. Public services have security enforced, so with searchService you can avoid the permission checking.
Can anyone explain Query Bands in Teradata?
I've searched a lot regarding this, but wasn't able to find information that I can understand.
Please be a bit detailed.
Thanks!!!
Query banding in Teradata:
Query banding provides contextual workflow information about a query.
Concept:
Scientists will often band the legs of birds with devices to track their flight paths. Monitoring and analyzing the data retrieved via the bands provides critical information about the species.
The same process is followed by DBAs who need some more information about a query than what is available.
Metadata such as the name of the requesting user, the work unit, and the application name is important for workload management, for tracking the overall use of the data warehouse, and for query troubleshooting.
The query banding feature is used in such a way that these metadata details are attached to the query in the database.
A query band can contain any number of name/value pairs, such as the initiating user's corporate ID, department, and location, as well as the time the execution was initiated.
Prashanth provided a good analogy with birds and bands. Adam is asking for specific situations. I can come up with several examples, when query banding may be very useful:
Your system is used by hundreds of users via an application server running a custom application or a reporting tool like Business Objects, Tableau, or QlikView. The application server connects to Teradata with a single user ID, but the administrator would still like to know which users, departments, and groups of users generate each query, to be able to analyze it later in DBQL or simply to allocate proper system resources using TASM. For this, the application can be configured so that each query is "banded" with information like "AppUser:User1;Appgroup:DataScientists;QueryType:strategic02". Even though the application server uses one Teradata user and a limited number of connections to route all the queries from hundreds of users, each individual query is marked with exactly which user initiated it. You can then perform all kinds of analysis based on this information.
Suppose you have a complex ETL application and you want to track and analyze the execution of your loads: what went wrong, and when. Usually you would need to log all the steps of your ETL process, and in the logs you must specify a unique Load ID, Process ID, Step ID, etc. You do this because you want to be able to understand which specific process caused a halt or a performance degradation, and without such logging it would not be possible to distinguish runs of the same steps across different executions of your ETL application. A good alternative is to switch on DBQL and embellish your queries with query band information such as Load ID, Process ID, Step ID, etc. That way you have all the necessary information in DBQL without having to create additional, elaborate log tables.
SET QUERY BAND = 'name=value; name2=value;' FOR SESSION|TRANSACTION;
This will tag your query with some name/value pairs. It can be used for workload management: for example, in TDWM you have throttles and priority management hooks that will prioritize all name2 types with the value "value". It means you can attach very rich detail to the session or transaction.
Yes, what you described can easily be done with query banding; think of it as a "wagon of key/value attributes in transit". You can access them via SQL or programmatically with session attributes, in BTEQ or JDBC for example.
Necromancing... Existing answers do a good job of explaining how query bands work, but since I could not find a complete working example, I thought I'd add one here.
Setting query bands in Teradata is already covered, so I will provide an example of how to set them from a .NET client:
// CustomQueryBands is assumed to be a Dictionary<string, string> of additional key/value pairs.
private void SetQueryBands()
{
    // Start from the connection's current query band, add or override entries,
    // then apply the updated band to the open connection.
    TdQueryBand qb = Connection.QueryBand;
    qb["CustomApplicationName"] = "MyAppName";
    foreach (string key in CustomQueryBands.Keys)
    {
        qb[key] = CustomQueryBands[key];
    }
    Connection.ChangeQueryBand(qb);
}
Connection = new TdConnection(GetConnectionString());
Connection.Open();
SetQueryBands();
More details can be found here.
To retrieve the stored query band data, the GetQueryBandValue function can be used:
SELECT CollectTimestamp, QueryBand,
GetQuerybandValue(queryband, 0, 'Key1') AS Value1,
GetQuerybandValue(queryband, 0, 'Key2') AS Value2,
GetQuerybandValue(queryband, 0, 'Key3') AS Value3
FROM dbql_data.dbqlogtbl
WHERE dateofday = DATE - 1
AND queryband LIKE '%somekeyorvalue%'