groupby an element with jq


I have the following json:
{"us":{"$event":"5bbf4a4f43d8950b5b0cc6d2"},"org":"TΙ UIH","rc":{"$event":"13"}}
{"us":{"$event":"5bbf4a4f43d8950b5b0cc6d3"},"org":"TΙ UIH","rc":{"$event":"13"}}
{"us":{"$event":"5bbf4a4f43d8950b5b0cc6d4"},"org":"AB KIO","rc":{"$event":"13"}}
{"us":{"$event":"5bbf4a4f43d8950b5b0cc6d5"},"org":"GH SVS","rc":{"$event":"17"}}
How could I achieve the following output (TSV)?
13 TΙ UIH 2
13 AB KIO 1
17 GH SVS 1
So far, from what I have found:
jq -sr 'group_by(.org)|.[]|[.[0].org, length]|@tsv'
How could I add one more key to the group_by to achieve the desired result?

I was able to obtain the expected result from your sample JSON using the following filter (with the same -s and -r options as in your command):
group_by(.org, .rc."$event")[] | [.[0].rc."$event", .[0].org, length] | @tsv
You can try it on jqplay.org.
Grouping on both keys gives one group per .org/.rc.$event pair. Grouping on .org alone would give one group per .org, which could merge records that have different .rc.$event values.
Then we add .rc.$event to the array you create, just as you did with .org, taking the value from the first item of each group since all items in a group share it.
To sort the result, you can put the rows in an array and use sort_by(.[0]), which sorts by the first element of each row:
[group_by(.org, .rc."$event")[] | [.[0].rc."$event", .[0].org, length]] | sort_by(.[0])[] | @tsv
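For readers who want the grouping logic spelled out, here is a minimal Python sketch of the same idea: count the records per (rc.$event, org) pair and print tab-separated rows. It assumes one JSON object per line, as in the question. The function name count_pairs is made up for this illustration and has nothing to do with jq itself.

```python
import json
from collections import Counter

# The four sample records from the question, one JSON object per line.
SAMPLE = """\
{"us":{"$event":"5bbf4a4f43d8950b5b0cc6d2"},"org":"TΙ UIH","rc":{"$event":"13"}}
{"us":{"$event":"5bbf4a4f43d8950b5b0cc6d3"},"org":"TΙ UIH","rc":{"$event":"13"}}
{"us":{"$event":"5bbf4a4f43d8950b5b0cc6d4"},"org":"AB KIO","rc":{"$event":"13"}}
{"us":{"$event":"5bbf4a4f43d8950b5b0cc6d5"},"org":"GH SVS","rc":{"$event":"17"}}
"""


def count_pairs(lines):
    """Count records per (rc.$event, org) pair (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        counts[(record["rc"]["$event"], record["org"])] += 1
    # Sort by the pair so the output order is deterministic.
    return sorted((rc, org, n) for (rc, org), n in counts.items())


rows = count_pairs(SAMPLE.splitlines())
for rc, org, n in rows:
    print(f"{rc}\t{org}\t{n}")
```

The Counter is keyed on a tuple of both values, which plays the same role as passing two expressions to group_by.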

Related

reducing stream data to single result without putting them all into memory

I can reduce the produced lines like this:
seq 5 | jq --slurp ' reduce .[] as $i (0;.+($i|tonumber))'
to get
15
but this puts the whole input into memory, which I don't want. The following:
seq 5 | jq ' reduce . as $i (0;.+($i|tonumber))'
produces incorrect output
1
2
3
4
5
Something similar happens when foreach is used.
What is the correct syntax?

Use inputs instead of the context ., along with the --null-input (or -n) option so the context wouldn't eat up the first item:
seq 5 | jq -n 'reduce inputs as $i (0;.+($i|tonumber))'
15
Explanation: If you just use . as context, your filter will be executed once for each input (five times in this case, hence five outputs, each summing up just one value). Providing the -n option sets the input to null, so the filter is executed only once, while input and inputs sequentially read in another/all new items from the input stream.
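The difference between slurping and streaming can be sketched in Python as well. This is only an illustration of the idea, not of how jq works internally: the total is updated one line at a time, so only the current line and the running total are held in memory. The function name streaming_sum is hypothetical, and io.StringIO stands in for the output of seq 5.

```python
import io


def streaming_sum(stream):
    """Fold the numbers in `stream` into one total, one line at a time."""
    total = 0
    for line in stream:  # text streams yield their lines lazily
        line = line.strip()
        if line:
            total += int(line)
    return total


# Stand-in for `seq 5`; a real script would pass sys.stdin instead.
print(streaming_sum(io.StringIO("1\n2\n3\n4\n5\n")))  # prints 15
```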

How to split a column into two in Bash shell

I have a huge file with many columns. I want to count the number of occurrences of each value in one column, so I use
cut -f 2 "file" | sort | uniq -c
This gives the result I want. However, when I read the output into R, it shows only one column, with data like the example below.
Example:
123 Chelsea
65 Liverpool
77 Manchester city
2 Brentford
What I want is two columns, one for the counts and one for the names, but I only get one. Can anyone help me split the column into two, or suggest a better method to extract this from the big file?
Thanks in advance !!!!

Not a beautiful solution, but try this.
Pipe the output of the previous command into this while loop:
"your program" | while read -r count city
do
# count gets the first word; city gets the rest of the line, spaces included
printf '%s\t%s\n' "$count" "$city"
done

If you simply want to count the unique values in each column, you can use the cut command with a custom delimiter, here a single space.
Keep in mind that the names can themselves contain further spaces after the first one, e.g. Manchester city.
So, in order to count the unique values of the first column:
cut -d ' ' -f1 <your_file> | sort | uniq | wc -l
where -d sets the delimiter to a space ' ' and -f1 selects the first column; sort puts equal values next to each other, uniq then drops the adjacent duplicates, and wc -l counts the remaining lines.
Similarly, to count the unique values of the second column:
cut -d ' ' -f2- <your_file> | sort | uniq | wc -l
where all parameters and commands are the same except for -f2-, which selects everything from the second column to the last (see the cut man page, -f<from>-<to>).
EDIT
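Based on the update to your question, here is a suggestion for getting what you want in R.
You can read the output of the shell pipeline directly with pipe(), using awk to put a tab between the count and the name:
df = read.delim(pipe("cut -f 2 <your_file> | sort | uniq -c | awk '{n=$1; sub(/^ *[0-9]+ /, \"\"); print n \"\\t\" $0}'"), header = FALSE, col.names = c("count", "name"))
This should return a data frame with the counts and the names in separate columns.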
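If a scripting language is an option, the splitting step can also be sketched in a few lines of Python. This is a minimal sketch that assumes the input looks like the uniq -c output shown in the question (a count, whitespace, then a name that may itself contain spaces). The function name split_count_lines is made up for this example.

```python
def split_count_lines(lines):
    """Turn `uniq -c` style lines into (count, name) pairs."""
    rows = []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # ignore blank lines
        # Split on the first run of whitespace only, so spaces
        # inside the name ("Manchester city") are preserved.
        parts = stripped.split(None, 1)
        name = parts[1] if len(parts) > 1 else ""
        rows.append((int(parts[0]), name))
    return rows


sample = [
    "    123 Chelsea",
    "     65 Liverpool",
    "     77 Manchester city",
    "      2 Brentford",
]
for count, name in split_count_lines(sample):
    print(f"{count}\t{name}")
```

Writing a tab between the two fields gives a file that R can read as two columns with a tab separator.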

KQL extend to new column with summarize inside

I'm trying to make a table with these columns
type | count
I tried this with no luck
exceptions
| where timestamp > ago(144h)
| extend
type = type, count = summarize count() by type
| limit 100
Any idea on what I'm doing wrong?

You should do this instead:
exceptions
| where timestamp > ago(144h)
| summarize count = count() by type
| limit 100
Explanation:
Use extend when you want to add columns to, or replace columns in, the result, for example extend day_of_month = dayofmonth(timestamp). The record count stays exactly the same in this case. See the extend documentation for details.
Use summarize when you want to aggregate multiple records, as in your case. The record count after summarize is usually smaller than the original record count. See the summarize documentation for details.
By the way, instead of 144h you can use 6d, which is exactly the same, but is more natural to the human eye :)
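As a rough analogy outside KQL, the difference between the two operators can be sketched in Python. The records below are invented purely for illustration (they are not real exception data), and type_length is a hypothetical added column.

```python
from collections import Counter

# Invented sample records; only the "type" field matters here.
records = [
    {"type": "NullReferenceException"},
    {"type": "TimeoutException"},
    {"type": "NullReferenceException"},
]

# extend-like: one output row per input row, with an extra column.
extended = [{**r, "type_length": len(r["type"])} for r in records]

# summarize-like (count() by type): one output row per distinct type.
summarized = Counter(r["type"] for r in records)

print(len(extended))     # same number of rows as the input
print(dict(summarized))  # one entry per distinct type
```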

Constructing an object using the genoset package in R

The genoset R package has a function for building a GenoSet by putting together several matrices and a RangedData object that specifies co-ordinates.
I have the following objects - three matrices, all with the same name, and a RangedData object of the following format (called locData).
              space                 ranges |
           <factor>              <IRanges> |
cg00000957        1 [  5937253,   5937253] |
cg00001349        1 [166958439, 166958439] |
cg00001583        1 [200011786, 200011786] |
cg00002028        1 [ 20960010,  20960010] |
cg00002719        1 [169396706, 169396706] |
cg00002837        1 [ 44513358,  44513358] |
When I try to create a GenoSet, though, I get the following error.
DMRSet=GenoSet(locData,Exprs,meth,unmeth,universe=NULL)
Error in .Call2("IRanges_from_integer", from, PACKAGE = "IRanges") :
cannot create an IRanges object from an integer vector with missing values.
What am I doing wrong? all the objects I'm putting together have the same rownames, except for the IRanges object itself, which I don't think has rownames since it isn't a matrix.
Additionally, the "column" of locData has non-integer characters.
Thank you!

It sounds like your "locData" may not be a RangedData. It can alternatively be a GRanges. Either way, you will want to name all of your arguments, as below.
Otherwise the underlying eSet class will be upset about the unnamed arguments once you get past the locData trouble.
DMRSet=GenoSet(locData=locData,exprs=Exprs,meth=meth,unmeth=unmeth,universe=NULL)
Pete

How to create a matrix with dynamic rows and columns in ASP.NET?

I have to make a control in ASP.NET that allows me to create a matrix. I have a list of strings (obtained from one method) that will be the rows (each string is one row), and another list of strings (obtained from another method) that will be the columns (each string is one column). Then, depending on the row-column intersection, I have to put an image in that position, something like this:
  | x   | y   | z   |
a | OK  | OK  | BAD |
---------------------
b | OK  | BAD | OK  |
---------------------
c | BAD | BAD | BAD |
How can I achieve this? Thanks a lot in advance!

You can use nested Repeaters.
The outer repeater for rows, the inner one for columns/cells.
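The nesting itself is language-independent, so here is a small Python sketch of the idea rather than actual ASP.NET markup: an outer loop emits one table row per row label, and an inner loop emits one cell per column label. The statuses are copied from the grid in the question. The function name render_matrix and the image file names (OK.png, BAD.png) are placeholders invented for this example.

```python
from html import escape

# Statuses copied from the example grid in the question.
STATUS = {
    ("a", "x"): "OK", ("a", "y"): "OK", ("a", "z"): "BAD",
    ("b", "x"): "OK", ("b", "y"): "BAD", ("b", "z"): "OK",
    ("c", "x"): "BAD", ("c", "y"): "BAD", ("c", "z"): "BAD",
}


def render_matrix(rows, cols, status):
    """Build an HTML table: one <tr> per row label, one <td> per column label."""
    header = "<tr><th></th>" + "".join(f"<th>{escape(c)}</th>" for c in cols) + "</tr>"
    body = []
    for r in rows:  # outer loop: the role of the outer Repeater
        cells = []
        for c in cols:  # inner loop: the role of the inner Repeater
            s = escape(status[(r, c)])
            # The image file name is a placeholder, not a real asset.
            cells.append(f'<td><img src="{s}.png" alt="{s}"></td>')
        body.append(f"<tr><th>{escape(r)}</th>" + "".join(cells) + "</tr>")
    return "<table>" + header + "".join(body) + "</table>"


table_html = render_matrix(["a", "b", "c"], ["x", "y", "z"], STATUS)
print(table_html)
```

In ASP.NET the two loops would correspond to the outer and inner Repeater controls, each bound to its own list.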
