How to combine 2 queries to form a single timechart in Humio? - humio

I have 2 queries in Humio that produce 2 timecharts:
regex("Calling service API", field=log.value) | timeChart(function=count(as=Total_Count))
regex("Response from service API", field=log.value) | timeChart(function=count(as=Success_Count))
I want to combine these 2 into a single timechart. Is it possible?

Related

Azure Application Insights KQL - How to accumulate values?

I would like to accumulate values by timestamp in an ever-growing manner.
The following query
ContainerLog
| extend logsize = string_size(LogEntry)
| summarize sum(logsize) by bin(TimeGenerated, 1m)
| render timechart
generates a graph that goes up and down:
I would like to add each value to the running total of the previous ones, producing an ever-growing graph that represents the total amount of requests up to that moment.
// Sample data generation. Not part of the solution.
let ContainerLog = materialize(range i from 1 to 15 step 1 | extend TimeGenerated = ago(7d*rand()), logsize = rand(1000));
// Solution starts here.
ContainerLog
| summarize sum(logsize) by bin(TimeGenerated, 1m)
| order by TimeGenerated asc
| extend accumulated_sum_logsize = row_cumsum(sum_logsize)
| render timechart
P.S.
I kept sum_logsize for learning purposes.
In your scenario it can be removed.
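For readers who want to check the logic outside Kusto, here is a minimal Python sketch of the same three steps (bin by minute, sort, running sum). The function name and the sample records are made up for illustration. Sorting matters here for the same reason as in the query: a running sum depends on row order.

```python
from collections import defaultdict
from datetime import datetime
from itertools import accumulate


def cumulative_by_minute(records):
    """records: iterable of (timestamp, logsize) pairs.

    Returns (minute, sum_logsize, accumulated_sum_logsize) tuples sorted by minute.
    """
    # Roughly: summarize sum(logsize) by bin(TimeGenerated, 1m)
    per_minute = defaultdict(int)
    for ts, logsize in records:
        per_minute[ts.replace(second=0, microsecond=0)] += logsize

    # Roughly: order by TimeGenerated asc
    minutes = sorted(per_minute)
    sums = [per_minute[m] for m in minutes]

    # Roughly: row_cumsum(sum_logsize)
    return list(zip(minutes, sums, accumulate(sums)))


# Hypothetical sample data, not taken from the question.
sample = [
    (datetime(2023, 1, 1, 10, 0, 5), 100),
    (datetime(2023, 1, 1, 10, 0, 40), 50),
    (datetime(2023, 1, 1, 10, 1, 10), 30),
    (datetime(2023, 1, 1, 10, 3, 0), 20),
]
for minute, total, running in cumulative_by_minute(sample):
    print(minute.isoformat(), total, running)
```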

Efficiently split one SQLite database into multiple files

Is there an efficient way to split one SQLite database into multiple files by the value of one of its columns? So for example, if I start with one database which has:
full.db:
event_id|type
--------+----
8896631 | a
8896632 | b
8896633 | c
8896634 | b
8896635 | a
8896636 | a
I want to end up with 3 new DB files:
a.db:
event_id|type
--------+----
8896631 | a
8896635 | a
8896636 | a
b.db:
event_id|type
--------+----
8896632 | b
8896634 | b
c.db:
event_id|type
--------+----
8896633 | c
A little more backstory: I have a 1TB sqlite3 file stored on a cluster's network filesystem, and each of some 40,000 different tasks across about 1000 nodes is trying to access it. This is not scaling well, so I'm wondering if I can do a pre-processing step to break this giant sqlite file into smaller pieces so that each task only needs access to one of these pieces.
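One possible pre-processing step, sketched in Python with the standard sqlite3 module: attach a new file per distinct value and copy the matching rows into it. This is only a sketch under assumptions. The table name events is hypothetical (the question does not name the table), the column values are assumed to be safe to use as file names, and the copy does not carry over indexes, primary keys, or constraints. It also scans the source once per distinct value, so on a 1TB file an index on the split column would probably be needed.

```python
import os
import sqlite3
import tempfile


def split_by_column(src_path, out_dir, table="events", column="type"):
    """Write one SQLite file per distinct value of `column`; return {value: path}.

    `table` and `column` are interpolated into SQL, so they must be trusted
    identifiers, never user input.
    """
    os.makedirs(out_dir, exist_ok=True)
    conn = sqlite3.connect(src_path)
    # Autocommit mode, so no open transaction blocks DETACH.
    conn.isolation_level = None
    try:
        query = f'SELECT DISTINCT "{column}" FROM "{table}"'
        values = [row[0] for row in conn.execute(query)]
        paths = {}
        for value in values:
            path = os.path.join(out_dir, f"{value}.db")
            conn.execute("ATTACH DATABASE ? AS part", (path,))
            # Create an empty table with the same columns, then copy the rows.
            conn.execute(
                f'CREATE TABLE part."{table}" AS SELECT * FROM main."{table}" WHERE 0'
            )
            conn.execute(
                f'INSERT INTO part."{table}" '
                f'SELECT * FROM main."{table}" WHERE "{column}" = ?',
                (value,),
            )
            conn.execute("DETACH DATABASE part")
            paths[value] = path
        return paths
    finally:
        conn.close()


# Demo with the rows from the question, in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "full.db")
    db = sqlite3.connect(src)
    db.execute("CREATE TABLE events (event_id INTEGER, type TEXT)")
    db.executemany(
        "INSERT INTO events VALUES (?, ?)",
        [(8896631, "a"), (8896632, "b"), (8896633, "c"),
         (8896634, "b"), (8896635, "a"), (8896636, "a")],
    )
    db.commit()
    db.close()

    parts = split_by_column(src, os.path.join(tmp, "parts"))
    for value, path in sorted(parts.items()):
        part = sqlite3.connect(path)
        rows = part.execute("SELECT event_id FROM events ORDER BY event_id").fetchall()
        part.close()
        print(value, rows)
```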

Google Sheets FILTER() and QUERY() not working with SUM()

I'm trying to pull and sum data from one sheet on another. This is GA data being built into a report, so I have sessions split up by landing page and device type, and would like to group them in different ways.
I usually use FILTER() for this sort of thing, but it keeps returning a 0 sum. Thinking this may be an odd edge case with FILTER(), I switched to using QUERY() instead. That gave me an error, but a Google search doesn't offer much documentation about what the error actually means. Taking a guess that it could be indicating an issue with the data type (i.e. not numeric), I changed the format of the source from "Automatic" to "Number", but to no avail.
Maybe it's a lack of coffee, but I'm at a loss as to why neither function is working to do a simple lookup and sum by criteria.
FILTER() function
SUM(FILTER(AllData!C:C,AllData!A:A="/chestnut/",AllData!B:B="desktop"))
No error, but returns 0 regardless of filter parameters.
QUERY() function
QUERY(AllData!A:G, "SELECT SUM(C) WHERE A='/chestnut/' AND B='desktop'",1)
Error returned:
Unable to parse query string for Function QUERY parameter 2: AVG_SUM_ONLY_NUMERIC
Sample data:
landingPage | deviceCategory | sessions
-------------|----------------|----------
/chestnut/ | desktop | 4
/chestnut/ | desktop | 2
/chestnut/ | tablet | 5
/chestnut/ | tablet | 1
/maple/ | desktop | 1
/maple/ | desktop | 2
/maple/ | mobile | 3
/maple/ | mobile | 1
I think the summing doesn't work because your numbers are text formatted.
See if any of these work (change the ranges to suit):
using FILTER()
=SUM(FILTER(VALUE(AllData!C:C),AllData!A:A="/chestnut/",AllData!B:B="desktop"))
using QUERY()
=ArrayFormula(QUERY({AllData!A:B, VALUE(AllData!C:C)}, "SELECT SUM(Col3) WHERE Col1='/chestnut/' AND Col2='desktop' label SUM(Col3)''",1))
using SUMPRODUCT()
=SUMPRODUCT(VALUE(AllData!C2:C),AllData!A2:A="/chestnut/",AllData!B2:B="desktop")
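To illustrate the diagnosis outside Sheets: SUM() skips text cells, so a column of numbers stored as text adds up to 0 until each cell is coerced to a number, which is the role VALUE() plays in the formulas above. A small Python sketch of the same filter-then-sum, with the sessions column deliberately stored as strings to mimic that situation (the function name is made up for illustration):

```python
# Sample rows from the question; sessions are strings on purpose.
rows = [
    ("/chestnut/", "desktop", "4"),
    ("/chestnut/", "desktop", "2"),
    ("/chestnut/", "tablet", "5"),
    ("/chestnut/", "tablet", "1"),
    ("/maple/", "desktop", "1"),
    ("/maple/", "desktop", "2"),
    ("/maple/", "mobile", "3"),
    ("/maple/", "mobile", "1"),
]


def sum_sessions(rows, landing_page, device):
    """Sum sessions for matching rows, converting text to numbers first."""
    return sum(
        float(sessions)
        for page, dev, sessions in rows
        if page == landing_page and dev == device
    )


print(sum_sessions(rows, "/chestnut/", "desktop"))  # 6.0
```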

Retrieving data with dynamic attributes in Cassandra

I'm working on a solution for Cassandra that's proving impossible.
We have a table that returns a set of candidates given some search criteria. The row with the highest score is returned to the user. We can do this quite easily with SQL, but there's a need to migrate to Cassandra. Here are the tables involved:
Value
ID | VALUE | COUNTRY | STATE | CITY | COUNTY
--------+---------+----------+----------+-----------+-----------
1 | 50 | US | | |
--------+---------+----------+----------+-----------+-----------
2 | 25 | | TX | |
--------+---------+----------+----------+-----------+-----------
3 | 15 | | | MEMPHIS |
--------+---------+----------+----------+-----------+-----------
4 | 5 | | | | BROWARD
--------+---------+----------+----------+-----------+-----------
5 | 30 | | NY | NYC |
--------+---------+----------+----------+-----------+-----------
6 | 20 | US | | NASHVILLE |
--------+---------+----------+----------+-----------+-----------
Scoring
ATTRIBUTE | SCORE
-------------+-------------
COUNTRY | 1
STATE | 2
CITY | 4
COUNTY | 8
A query is sent that can have any of those four attributes populated or not. We search through our values table, calculate the scores, and return the highest one. If a column in the values table is null, it means it's applicable for all.
ID 1 is applicable for all states, cities, and counties within the US.
ID 2 is applicable for all countries, cities, and counties where the state is TX.
Example:
Query: {Country: US, State: TX}
Matches Value IDs: [1, 2, 3, 4, 6]
Scores: [1, 2, 4, 8, 5(1+4)]
Result: {id: 4} (8 was the highest score so Broward returns)
How would you model something like this in Cassandra 2.1?
I found that the best way to achieve this was to use Solr with Cassandra.
Some things to note about using Solr, though, since the resources I needed were scattered across the internet.
First, start Cassandra with Solr enabled. The dse tool has a command for this:
$CASSANDRA_HOME/bin/dse cassandra -s
You must create your keyspace with the network topology strategy and Solr enabled.
CREATE KEYSPACE ... WITH REPLICATION = {'class': 'NetworkTopologyStrategy', 'Solr': 1}
Once you create your table within your Solr-enabled keyspace, create a core using dsetool.
$CASSANDRA_HOME/bin/dsetool create_core keyspace.table_name generateResources=true reindex=true
This will allow Solr to index your data and generate a number of secondary indexes against your Cassandra table.
Querying columns where values may or may not exist requires a somewhat complex query:
SELECT * FROM keyspace.table_name WHERE solr_query = '{"q": "(-column:[* TO *] AND *:*) OR column:value"}';
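When several optional columns are involved, the "unset or equal" clause can be generated per column and joined with AND. A minimal Python sketch follows. The helper names and the column names country and state are hypothetical, and the code does no escaping of Solr special characters or of single quotes for CQL, so it assumes simple alphanumeric values.

```python
import json


def optional_match(column, value):
    """Clause that accepts rows where `column` is unset or equals `value`."""
    return f"((-{column}:[* TO *] AND *:*) OR {column}:{value})"


def build_solr_query(criteria):
    """Build the JSON string for solr_query from a {column: value} dict."""
    clauses = [optional_match(col, val) for col, val in criteria.items()]
    return json.dumps({"q": " AND ".join(clauses)})


# Hypothetical columns; the result goes between the single quotes after solr_query =.
print(build_solr_query({"country": "US", "state": "TX"}))
```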
Finally, you may notice that when searching for text, a Solr query such as column:"Hello" picks up unwanted values like HelloWorld or HelloThere. This is due to the field type used in your schema.xml for Solr. Here's how to change this behavior:
Head to your Solr Admin UI. (Normally http://hostname:8983/solr/)
Choose your core in the drop-down list in the left pane; it should be named keyspace.table_name.
Look for Config or Schema; both should take you to the schema.xml.
Copy and paste that file into a text editor. Alternatively, you could download the file with wget or curl, but you need the real link, which is shown in the text field at the top right.
There's a tag <fieldtype> with the name TextField. Replace org.apache.solr.schema.TextField with org.apache.solr.schema.StrField. You must also remove the analyzers; StrField does not support them.
That's it, hopefully I've saved people from all the headaches I encountered.

How to copy a value from a certain pattern in R?

I have a data file in which each row may have a different format, but every row contains the pattern "\\- .*\\| PR". The data set looks like the following:
|- 7 | PR - Easy to post position and search resumes | Improvement - searching of resumes
[ 1387028] | Recommend - 9 | PR - As a recruiter I find a lot qualified resumes for jobs that I am working on.
|- 10 | PR - its easy to use and good candidiates
I want to record the number in this pattern; for the data above, I need 7, 9, and 10. I have no idea how to do it. Can someone help?
as.integer(sub('.*- ([0-9]+) \\| PR.*', '\\1', yourvector))
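For comparison, the same extraction can be sketched in Python with the standard re module, using the three sample rows from the question. One difference from the R one-liner: for a row that does not contain the pattern, sub() returns the row unchanged and as.integer() then yields NA, whereas this sketch returns None. The function name is made up for illustration.

```python
import re

lines = [
    "|- 7 | PR - Easy to post position and search resumes | Improvement - searching of resumes",
    "[ 1387028] | Recommend - 9 | PR - As a recruiter I find a lot qualified resumes for jobs that I am working on.",
    "|- 10 | PR - its easy to use and good candidiates",
]

# "- ", then digits, then " | PR": the same anchor the R pattern uses.
PATTERN = re.compile(r"- (\d+) \| PR")


def extract_scores(rows):
    """Return the integer before '| PR' for each row, or None if absent."""
    result = []
    for row in rows:
        match = PATTERN.search(row)
        result.append(int(match.group(1)) if match else None)
    return result


print(extract_scores(lines))  # [7, 9, 10]
```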
