Customized date format in kusto? - azure-data-explorer

Is it possible to customize how a specific datetime value is formatted in KQL?
For example, I have the following code:
let value = datetime(2022-06-08);
print Date = value
As a result I get: 2022-06-08 00:00:00
Instead of 2022-06-08 00:00:00, is it possible to format the value so that Date = 8 June 2022?

With the formats that format_datetime() supports, you can do things like this:
let dt = datetime(2017-01-29 09:00:05);
print
v1=format_datetime(dt,'yy-MM-dd [HH:mm:ss]'),
v2=format_datetime(dt, 'yyyy-M-dd [H:mm:ss]'),
v3=format_datetime(dt, 'yy-MM-dd [hh:mm:ss tt]')
However, format_datetime() supports the month only as a numeric value.
You could use case() to map the month number to its name (June, July, etc.):
let GetMonth = view(Month:int){
    case(
        Month==1, "January",
        Month==2, "February",
        Month==3, "March",
        Month==4, "April",
        Month==5, "May",
        Month==6, "June",
        Month==7, "July",
        Month==8, "August",
        Month==9, "September",
        Month==10, "October",
        Month==11, "November",
        Month==12, "December"
    )};
let x = datetime(2017-01-29 09:00:05);
print strcat(
dayofmonth(x), " ",
GetMonth(monthofyear(x)), " ",
datetime_part("Year", x))

Related

Automating importation and naming of a csv file

I am trying to import many CSV files from an EPA website. The file naming is sensible and consistent. Any suggestions on how I can use a loop to automate importing the CSV files and naming them as data frames in R?
Right now I'm doing it manually by swapping out the month name in each line of code as illustrated below:
library(tidyverse)
#Download 2013 data
jan_13<-read.csv("https://www.epa.gov/sites/default/files/2017-10/rindata_jan2013.csv")%>%
add_column("month"="jan","year"=2013)
feb_13<-read.csv("https://www.epa.gov/sites/default/files/2017-10/rindata_feb2013.csv")%>%
add_column("month"="feb","year"=2013)
mar_13<-read.csv("https://www.epa.gov/sites/default/files/2017-10/rindata_mar2013.csv")%>%
add_column("month"="mar","year"=2013)
apr_13<-read.csv("https://www.epa.gov/sites/default/files/2017-10/rindata_apr2013.csv")%>%
add_column("month"="apr","year"=2013)
may_13<-read.csv("https://www.epa.gov/sites/default/files/2017-10/rindata_may2013.csv")%>%
add_column("month"="may","year"=2013)
jun_13<-read.csv("https://www.epa.gov/sites/default/files/2017-10/rindata_june2013.csv")%>%
add_column("month"="jun","year"=2013)
jul_13<-read.csv("https://www.epa.gov/sites/default/files/2017-10/rindata_july2013.csv")%>%
add_column("month"="jul","year"=2013)
aug_13<-read.csv("https://www.epa.gov/sites/default/files/2017-10/rindata_aug2013.csv")%>%
add_column("month"="aug","year"=2013)
sep_13<-read.csv("https://www.epa.gov/sites/default/files/2017-10/rindata_sept2013.csv")%>%
add_column("month"="sep","year"=2013)
oct_13<-read.csv("https://www.epa.gov/sites/default/files/2017-10/rindata_oct2013.csv")%>%
add_column("month"="oct","year"=2013)
nov_13<-read.csv("https://www.epa.gov/sites/default/files/2017-10/rindata_nov2013.csv")%>%
add_column("month"="nov","year"=2013)
dec_13<-read.csv("https://www.epa.gov/sites/default/files/2017-10/rindata_dec2013.csv")%>%
add_column("month"="dec","year"=2013)
I'd like to set something up where all 12 months are imported, the added column is modified appropriately and the resulting df is named appropriately, by month.
Thanks for the help!
Read all of the csvs using a vector of months and string concatenation, then set their names, enframe, add a year column, and unnest:
months <- c("jan", "feb", "mar", "apr", "may", "june", "july", "aug", "sept", "oct", "nov", "dec")
dfs <- lapply(months, function(x) read.csv(paste0("https://www.epa.gov/sites/default/files/2017-10/rindata_", x, "2013.csv"))) %>%
setNames(months) %>%
enframe(name = "month") %>%
add_column(year = 2013) %>%
unnest(value)
Let me know if this works!

Drop top row of dataframe and make the second row variable names

I have a list, clean_data_2009, containing 12 monthly data frames, each named wireless_mmm_YY, where mmm abbreviates the calendar month and YY is the year 2009 abbreviated as 09.
I want to drop the first row in each of the 12 data frames and then use the new first row as the variable names. The command below works, but I want to write a loop instead.
clean_data_2009$wireless_jan_09 <- clean_data_2009$wireless_jan_09[-1,] %>% row_to_names(row_number = 1)
I wrote a loop that uses paste to build the text of the command R should run for each data frame, but R does not run that text as a command and gives me an error. I tried to fix it with get, but I still get the error shown below:
month <- c("jan", "feb", "mar", "april", "may", "june", "july", "aug", "sep", "oct", "nov", "dec")
year <- c("09") # "2010", "2011"
list_dt <- c("clean_data_2009$wireless")
rows2del <- c("[-1, ]")
for (y in year) {
  for (m in month) {
    print(paste(y,m,sep = "_") )
    print(paste(list_dt,m,y,sep = "_"))
    print(paste(paste(list_dt,m,y,sep = "_"),rows2del, sep=""))
    get(paste(list_dt,m,y,sep = "_")) <- get(paste(paste(list_dt,m,y,sep = "_"),rows2del, sep="")) %>% row_to_names(row_number = 1)
  }
}
Error:
[1] "09_jan"
[1] "clean_data_2009$wireless_jan_09"
[1] "clean_data_2009$wireless_jan_09[-1, ]"
Error in get(paste(paste(list_dt, m, y, sep = "_"), rows2del, sep = "")) :
object 'clean_data_2009$wireless_jan_09[-1, ]' not found
This alternative approach might help. If your frames are already in a list, you can loop over them with lapply(), use indexing to drop the first row, and set the names with a single setNames() call for each frame:
lapply(clean_data_2009, \(d) setNames(d[-1,],d[1,]))
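Note that the one-liner above takes the names from the first row, while the question's own command drops the first row and takes the names from the row after it. A minimal base R sketch of that second variant, assuming every column is character; the toy frames and the helper name clean_one are made up for illustration:

```r
# Toy stand-in for clean_data_2009: row 1 is junk, row 2 holds the real headers.
raw <- data.frame(
  V1 = c("junk", "id", "1", "2"),
  V2 = c("junk", "value", "10", "20"),
  stringsAsFactors = FALSE
)
frames <- list(wireless_jan_09 = raw, wireless_feb_09 = raw)

# Hypothetical helper: use row 2 as the column names, then drop rows 1 and 2.
clean_one <- function(d) {
  header <- as.character(unlist(d[2, ]))
  out <- d[-(1:2), , drop = FALSE]
  names(out) <- header
  rownames(out) <- NULL  # renumber the remaining rows from 1
  out
}

# lapply() keeps the list names, so no paste()/get() is needed.
cleaned <- lapply(frames, clean_one)
print(cleaned$wireless_jan_09)
```

Assigning the result back (clean_data_2009 <- lapply(clean_data_2009, clean_one)) would update all 12 frames at once.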

Do if then statement based on values inside data frame or vector

As an example, suppose I have this data:
key <- data.frame(num=c(1,2,3,4,5), month=c("January", "Feb", "March", "Apr", "May"))
data <- c(4,2,5,3)
I want to create a new vector, data2 using the mapping of num to month contained in key. I can do this manually using case_when by doing lots of if statements at once:
library(dplyr)
data2<-case_when(
data==1 ~ "January",
data==2 ~ "Feb",
data==3 ~ "March",
data==4 ~ "Apr",
data==5 ~ "May"
)
However, say that I want to automate this process (maybe I actually have thousands of if statements) and utilize the mapping contained in key. Is this, or something like it, possible?
Here is a failed attempt at code:
data2 <- case_when(data=key$num ~ key$month)
What I am going for is a vector called data2 with these elements: c("Apr","Feb","May","March"). How can I do this?
You can use match and base R indexing (also, set stringsAsFactors=FALSE when you initialize the data.frame, as I did below):
key <- data.frame(num=c(1,2,3,4,5), month=c("January", "Feb", "March", "Apr", "May"), stringsAsFactors = FALSE)
data2 <- key$month[match(data, key$num)]
data2
#[1] "Apr" "Feb" "May" "March"

Why are these object sizes different - R

Why do I get "warning longer object length is not a multiple of shorter object length"?
Forgive me for asking this again, but I am unable to figure out why I am getting this message, even after combing through Stack Overflow. The answer at the above link says:
"memb only has a length of 10. I'm guessing the length of dih_y2$MemberID isn't a multiple of 10. When using == it will spit out a warning if it isn't a multiple to let you know that it's probably not doing what you're expecting it is doing."
I am getting the same message from the following code, but I am not sure which "objects" are of different lengths in my example or how to fix it. Essentially, I am trying to separate my dates into months for analysis. Please help if you can. Thank you.
library(ggplot2)
library(dplyr)
library(statsr)
piccolos2 <- piccolos2 %>%
mutate(SERPDate = as.Date(piccolosRankings$SERPDate, format='%m/%d/%Y'))
piccolos2 <- piccolos2 %>%
mutate(Month = ifelse(as.numeric(SERPDate) %in% 0017-04-01:0017-04-30, "April",
ifelse(as.numeric(SERPDate) %in% 0017-05-01:0017-05-31, "May",
ifelse(as.numeric(SERPDate) %in% 0017-06-01:0017-06-30, "June",
ifelse(as.numeric(SERPDate) %in% 0017-07-01:0017-07-31, "July", "August")))))
piccolos2 <- piccolos2 %>%
mutate(Month = ifelse(as.numeric(SERPDate) %in% as.Date("0017-04-01"):as.Date("0017-04-30"), "April",
ifelse(as.numeric(SERPDate) %in% as.Date("0017-05-01"):as.Date("0017-05-31"), "May",
ifelse(as.numeric(SERPDate) %in% as.Date("0017-06-01"):as.Date("0017-06-30"), "June",
ifelse(as.numeric(SERPDate) %in% as.Date("0017-07-01"):as.Date("0017-07-31"), "July", "August")))))
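A likely source of the warning, sketched with made-up dates (serp_date below is a hypothetical stand-in for piccolos2$SERPDate): 0017-04-01:0017-04-30 is not a date range. R reads it as arithmetic, and because : and %in% bind more tightly than binary -, the first mutate() ends up subtracting the length-17 vector 1:17 from a logical vector with one element per row. When the row count is not a multiple of 17, R warns about the object lengths. Assuming SERPDate is already a Date, the month name can be derived directly instead:

```r
# Unquoted, this is arithmetic: 17 - 4 - (1:17) - 4 - 30, a numeric vector.
not_a_date_range <- 0017-04-01:0017-04-30
print(length(not_a_date_range))

# Made-up dates standing in for the real SERPDate column.
serp_date <- as.Date(c("04/15/2017", "05/02/2017", "08/30/2017"),
                     format = "%m/%d/%Y")

# Look up the month name from the month number; no nested ifelse() needed.
month_name <- month.name[as.integer(format(serp_date, "%m"))]
print(month_name)
```

month.name is built into R and always English; format(serp_date, "%B") would also work, but its output depends on the locale.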

loop with list of months as input in R

I am trying to run a loop with months as the input. I want to count how many times a given topic appears for a specific date. This is how I am trying to do it:
for (i in c("January", "February", "March", "April",
"May", "June", "July", "August", "September", "October", "November", "December")){
print(length(which(data$Date == "i 2005" & data$Maxtopic == 3)))
}
Nevertheless, I get 0 as output for all the dates. Any ideas why?
Cheers,
Try data$Date == sprintf("%s 2005", i). Your attempt searches for the literal string "i 2005".
However, the table function was designed for this. Use gsub to remove the year:
table(gsub(" 2005", "", data[data$Maxtopic == 3, "Date"], fixed = TRUE))
PS: Next time please provide a reproducible example to enable development and testing of solutions.
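For illustration, both suggestions can be tried on a small made-up data frame; the column names follow the question, but the values are invented:

```r
# Invented data with the same column names as in the question.
data <- data.frame(
  Date = c("January 2005", "January 2005", "March 2005", "March 2005"),
  Maxtopic = c(3, 1, 3, 3),
  stringsAsFactors = FALSE
)

# Loop version: sprintf() substitutes the month name into the string.
months_to_check <- c("January", "February", "March")
counts <- sapply(months_to_check, function(m) {
  sum(data$Date == sprintf("%s 2005", m) & data$Maxtopic == 3)
})
print(counts)

# table() version: counts per month in one call; months with no rows are absent.
by_table <- table(gsub(" 2005", "", data[data$Maxtopic == 3, "Date"], fixed = TRUE))
print(by_table)
```

The loop version reports a zero for months without matches, while table() simply leaves them out.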
