Print distinct values in an array - azure-application-insights

I have a column which is an array and would like to print the distinct count of its values.
Event
| project colors
Present output
["Red", "Green", "Green", "Red"]
["Yellow", "Yellow", "Yellow", "Yellow"]
Expected
["Red", "Green", "Green", "Red"] , 2
["Yellow", "Yellow", "Yellow", "Yellow"], 1

Here's what I think you want, using todynamic, mvexpand, and summarize (and datatable to create the input data):
// create your sample data using datatable to make a 'fake' table
datatable (colors: string) [
'["Red", "Green", "Green", "Red"]',
'["Yellow", "Yellow", "Yellow", "Yellow"]'
]
// this part is answering your question
| extend c = todynamic(colors) // turns colors into arrays
| mvexpand c // expands all the values out of colors into their own rows (but each column value is still "dynamic" type)
| extend c = tostring(c) // turn the dynamic c column to strings to summarize the values
| summarize ["Count"] = dcount(c) by colors // count up distinct c values by each row
That will output what you have in your question:
colors                                      Count
-------------------------------------------------
["Red", "Green", "Green", "Red"]            2
["Yellow", "Yellow", "Yellow", "Yellow"]    1

Related

R - Finding duplicates in list entries

I am trying to figure out how to get duplicates out of list objects in R.
So my example list:
examplelist <- list(a = c("blue", "red", "yellow"),
b = c("red", "black", "green"),
c = c("black", "green", "brown"))
What I would like to get as a result:
duplicates: c("red", "black", "green")
vector of all entries, without double entries: c("blue", "red", "yellow", "black", "green", "brown")
I was not able to find a function for that other than duplicated(), which just checks my list objects as a whole but not the entries themselves.
Thank you for your help :)
You can unlist first:
unlisted <- unlist(examplelist)
unlisted[duplicated(unlisted)]
# b1 c1 c2
# "red" "black" "green"
unlisted[!duplicated(unlisted)]
# a1 a2 a3 b2 b3 c3
# "blue" "red" "yellow" "black" "green" "brown"
If you only want the vector (without the names), use unname:
unlisted <- unname(unlist(examplelist))
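As a small variation on the answer above, the sketch below drops the names up front with use.names = FALSE and uses unique() for the de-duplicated vector. The variable names values, dups and all_entries are illustrative and not from the original post. Wrapping the duplicates in unique() is an added assumption: it only matters if an entry appears three or more times.

```r
examplelist <- list(a = c("blue", "red", "yellow"),
                    b = c("red", "black", "green"),
                    c = c("black", "green", "brown"))

# Flatten the list into one character vector without element names
values <- unlist(examplelist, use.names = FALSE)

# Entries that occur more than once (each reported a single time)
dups <- unique(values[duplicated(values)])

# All entries without double entries, in order of first appearance
all_entries <- unique(values)

print(dups)
print(all_entries)
```

For this input, dups holds "red", "black", "green" and all_entries holds the six colors in the order requested in the question.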

Delete duplicate word, comma and whitespace

How can I delete all the duplicate words alongside the following comma and whitespace using Regex in R?
So far I have come up with the following regular expression, which matches the duplicate but not the comma and whitespace:
(\b\w+\b)(?=[\S\s]*\b\1\b)
An example list would be:
blue, red, blue, yellow, green, blue
The output should look like:
blue, red, yellow, green
So it would have to match two of the "blue" in this case, as well as the following comma and whitespace (if there is any).
It depends on whether your list is truly a list/vector or a string with commas.
# your data is actually already a list/vector
v <- c("blue", "red", "blue", "yellow", "green", "blue")
unique(v)
[1] "blue" "red" "yellow" "green"
# if your data is actually a comma separated string
s <- "blue, red, blue, yellow, green, blue"
# if output needs to be a vector
unique(strsplit(s, ", ")[[1]])
[1] "blue" "red" "yellow" "green"
# if output needs to be a string again
paste(unique(strsplit(s, ", ")[[1]]), collapse = ", ")
[1] "blue, red, yellow, green"
Example based on a list column in a data.table or data.frame
library(data.table) # must be loaded before data.table() is called
dt <- data.table(
id = 1:5,
colors = list(
c("blue", "red", "blue", "yellow", "green", "blue"),
c("blue", "blue", "yellow", "green", "blue"),
c("blue", "red", "blue", "yellow"),
c("red", "red", "yellow", "yellow", "green", "blue"),
c("black")
)
)
## using data.table
# use colors instead of clean_list to just fix the existing column
dt[, clean_list := lapply(colors, unique)]
## using dplyr
library(dplyr)
# use colors instead of clean_list to just fix the existing column
# (mutate returns a new table; assign the result to keep it)
dt %>% mutate(clean_list = lapply(colors, unique))
dt
# id colors clean_list
# 1: 1 blue,red,blue,yellow,green,blue blue,red,yellow,green
# 2: 2 blue,blue,yellow,green,blue blue,yellow,green
# 3: 3 blue,red,blue,yellow blue,red,yellow
# 4: 4 red,red,yellow,yellow,green,blue red,yellow,green,blue
# 5: 5 black black
# or just simply in base
dt$colors <- lapply(dt$colors, function(x) unique(x))
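If a per-row distinct count is also wanted, the base R approach extends naturally. This is a minimal sketch on a plain data.frame; the columns clean_list and n_distinct and the three sample rows are made up for illustration.

```r
# Small data.frame with a list column (sample rows are illustrative)
df <- data.frame(id = 1:3)
df$colors <- list(
  c("blue", "red", "blue"),
  c("yellow", "yellow"),
  "black"
)

# De-duplicate each row's vector, keeping order of first appearance
df$clean_list <- lapply(df$colors, unique)

# Number of distinct values per row
df$n_distinct <- lengths(df$clean_list)

print(df$n_distinct)
```

lengths() returns one integer per list element, so it avoids an explicit sapply(..., length) call.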
We could use paste with unique and collapse:
paste(unique(string), collapse = ", ")
[1] "blue, red, yellow, green"
data:
string <- c("blue", "red", "blue", "yellow", "green", "blue")

create a vector in R using variable names

I have a variable called school_name
I am creating a vector to define colors that I will use later in ggplot2.
colors <- c("School1" = "yellow", "School2" = "red", ______ = "Orange")
In my code I am using the variable school_name for some logic and want to use its value as the name of the third element of my vector. The value changes in my for loop and cannot be hard-coded.
I have tried the following but it does not work.
colors <- c("School1" = "yellow", "School2" = "red", get("school_name") = "Orange")
Can someone please help me with this?
You can use structure:
school_name = "coolSchool"
colors <- structure(c("yellow", "red", "orange"), .Names = c("School1","School2", school_name))
You can just set the names of the colors using names():
colors <- c("yellow", "red", "orange")
names(colors) <- c("School1", "School2", school_name)
This also works:
school_name <- "school3"
colors <- c("School1" = "yellow", "School2" = "red")
colors[school_name] <- "Orange"
# School1 School2 school3
# "yellow" "red" "Orange"
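Since the question mentions that school_name changes inside a for loop, here is a rough sketch of how that might look with setNames(), which builds a named vector from a variable. The loop values in school_names and the base_colors vector are hypothetical and only stand in for whatever the real loop iterates over.

```r
# Hypothetical loop values; in the question the name comes from a for loop
school_names <- c("coolSchool", "otherSchool")

# Fixed part of the color mapping
base_colors <- c("School1" = "yellow", "School2" = "red")

for (school_name in school_names) {
  # setNames() attaches the value of school_name as the element name
  colors <- c(base_colors, setNames("orange", school_name))
  print(colors)
}
```

Rebuilding colors from base_colors on each iteration keeps names from earlier iterations from accumulating in the vector.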

changing the position of headings

I am trying to draw pie charts using R with the following code. The headings are far from the pie charts. I would like to get the pie charts just below the headings. How can I do that?
x <- c(632,20,491,991,20)
y <- c(37376,41770,5210,5005,3947)
names <- c("alpha","beta","gamma","delta","omega")
par(mfrow=c(1,2))
pie(x, names, col = c("red", "yellow", "blue", "green", "cyan"), main="PIE CHART 1")
pie(y, names,col = c("red", "yellow", "blue", "green", "cyan"), main="PIE CHART 2")
x <- c(632,20,491,991,20)
y <- c(37376,41770,5210,5005,3947)
names <- c("alpha","beta","gamma","delta","omega")
par(fig=c(0,0.5,0,1))
pie(x, names, col = c("red", "yellow", "blue", "green", "cyan"))
title("CHART 1", line=-3)
par(fig=c(0.5,1,0,1),new=TRUE)
pie(y, names,col = c("red", "yellow", "blue", "green", "cyan"))
title("CHART 2", line=-3)
Alterations:
par: use fig=c(x1, x2, y1, y2) to specify the portion of the plot window that each plot should occupy. Here each pie chart takes up half of the plot window.
par(new=TRUE) states that the second plot should be added to the existing window rather than replacing the first.
title: called separately from the plot. line=x sets where the title sits; play around with various negative values until you get what you want.
As an alternative, you can also keep using mfrow:
par(mfrow=c(1,2))
pie(x, names, col = c("red", "yellow", "blue", "green", "cyan"))
title("PIE CHART 1", line=-1)
pie(y, names, col = c("red", "yellow", "blue", "green", "cyan"))
title("PIE CHART 2", line=-1)

change background and text of strips associated to multiple panels in R / lattice

The following is the example I work on.
require(lattice)
data(barley)
xyplot(yield ~ year | site, data = barley)
I want to use a different strip color for each strip, with a font color chosen to suit the background color. For example:
strip background colors = c("black", "green4", "blue", "red", "purple", "yellow")
font color = c("white", "yellow", "white", "white", "green", "red")
Rough sketch of the first one is provided:
How can I achieve this?
Here's a clean and easily customizable solution.
myStripStyle(), the function that is passed in to the strip= argument of xyplot() uses the counter variable which.panel to select colors and also the value of factor.levels for the panel that's currently being plotted.
If you want to play around with the settings, just put a browser() somewhere inside the definition of myStripStyle() and have at it!
bgColors <- c("black", "green4", "blue", "red", "purple", "yellow")
txtColors <- c("white", "yellow", "white", "white", "green", "red")
# Create a function to be passed to "strip=" argument of xyplot
myStripStyle <- function(which.panel, factor.levels, ...) {
panel.rect(0, 0, 1, 1,
col = bgColors[which.panel],
border = 1)
panel.text(x = 0.5, y = 0.5,
font=2,
lab = factor.levels[which.panel],
col = txtColors[which.panel])
}
xyplot(yield ~ year | site, data = barley, strip=myStripStyle)
It might not be wise to refer to variables outside of the scope of the function.
You could use par.strip.text to pass additional arguments to the strip function. par.strip.text can be defined at the plot level and is generally used for setting text display properties, but since it is a list you can also use it to bring your variables to the strip function.
bgColors <- c("black", "green4", "blue", "red", "purple", "yellow")
txtColors <- c("white", "yellow", "white", "white", "green", "red")
# Create a function to be passed to "strip=" argument of xyplot
myStripStyle <- function(which.panel, factor.levels, par.strip.text,
custBgCol=par.strip.text$custBgCol,
custTxtCol=par.strip.text$custTxtCol,...) {
panel.rect(0, 0, 1, 1,
col = custBgCol[which.panel],
border = 1)
panel.text(x = 0.5, y = 0.5,
font=2,
lab = factor.levels[which.panel],
col = custTxtCol[which.panel])
}
xyplot(yield ~ year | site, data = barley,
par.strip.text=list(custBgCol=bgColors,
custTxtCol=txtColors),
strip=myStripStyle)
