I want to group by time bucket and one other column and then only select the top N aggregated rows.
It's best explained with this example:
let T = datatable(d:datetime , continent:string, country:string, val:int)
[
datetime(2022-10-05T01:40:00.00), "Asia", "China", 10,
datetime(2022-10-05T02:50:00.00), "Asia", "India", 25,
datetime(2022-10-05T03:55:00.00), "Asia", "Japan", 15,
datetime(2022-10-05T01:40:00.00), "Europe", "Czech Republic", 1,
datetime(2022-10-05T02:50:00.00), "Europe", "France", 8,
datetime(2022-10-05T07:55:00.00), "Europe", "Germany", 9,
datetime(2022-10-05T04:55:00.00), "North America", "USA", 25,
datetime(2022-10-05T05:55:00.00), "North America", "Haiti", 5,
datetime(2022-10-05T09:55:00.00), "North America", "Jamaica", 3,
datetime(2022-10-06T01:40:00.00), "Asia", "China", 7,
datetime(2022-10-06T02:50:00.00), "Asia", "India", 8,
datetime(2022-10-06T03:55:00.00), "Asia", "Japan", 15,
datetime(2022-10-06T01:40:00.00), "Europe", "Czech Republic", 29,
datetime(2022-10-06T02:50:00.00), "Europe", "France", 14,
datetime(2022-10-06T07:55:00.00), "Europe", "Germany", 13,
datetime(2022-10-06T04:55:00.00), "North America", "USA", 12,
datetime(2022-10-06T05:55:00.00), "North America", "Haiti", 7,
datetime(2022-10-06T09:55:00.00), "North America", "Jamaica", 4,
];
T
| summarize sumval = sum(val) by bin(d,1d), continent
| sort by d asc, sumval desc
This is the result, but I only want the top 2 results per day (highlighted).
In SQL I would use either row_number or cross apply, but I've been struggling in KQL. I want to understand the solution, because it doesn't click yet.
top-nested operator
Please note that in your case you don't really need the first sum(val); it is there only because the syntax requires an aggregation in that position.
We could have used count(), 0, int(null) or other options for that matter.
let T = datatable(d:datetime , continent:string, country:string, val:int)
[
datetime(2022-10-05T01:40:00.00), "Asia", "China", 10,
datetime(2022-10-05T02:50:00.00), "Asia", "India", 25,
datetime(2022-10-05T03:55:00.00), "Asia", "Japan", 15,
datetime(2022-10-05T01:40:00.00), "Europe", "Czech Republic", 1,
datetime(2022-10-05T02:50:00.00), "Europe", "France", 8,
datetime(2022-10-05T07:55:00.00), "Europe", "Germany", 9,
datetime(2022-10-05T04:55:00.00), "North America", "USA", 25,
datetime(2022-10-05T05:55:00.00), "North America", "Haiti", 5,
datetime(2022-10-05T09:55:00.00), "North America", "Jamaica", 3,
datetime(2022-10-06T01:40:00.00), "Asia", "China", 7,
datetime(2022-10-06T02:50:00.00), "Asia", "India", 8,
datetime(2022-10-06T03:55:00.00), "Asia", "Japan", 15,
datetime(2022-10-06T01:40:00.00), "Europe", "Czech Republic", 29,
datetime(2022-10-06T02:50:00.00), "Europe", "France", 14,
datetime(2022-10-06T07:55:00.00), "Europe", "Germany", 13,
datetime(2022-10-06T04:55:00.00), "North America", "USA", 12,
datetime(2022-10-06T05:55:00.00), "North America", "Haiti", 7,
datetime(2022-10-06T09:55:00.00), "North America", "Jamaica", 4,
];
T
| top-nested of bin(d, 1d) by sum(val), top-nested 2 of continent by sum(val)
| d | aggregated_d | continent | aggregated_continent |
|---|---|---|---|
| 2022-10-05T00:00:00Z | 101 | Asia | 50 |
| 2022-10-05T00:00:00Z | 101 | North America | 33 |
| 2022-10-06T00:00:00Z | 109 | Europe | 56 |
| 2022-10-06T00:00:00Z | 109 | Asia | 30 |
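If the KQL syntax is what doesn't click, it may help to see the same two steps written out in another language. The following is only a sketch in base R, not KQL, and the name tbl is made up for the illustration: first sum val per day and continent, then keep the two largest sums within each day. That is roughly what the second top-nested clause does for each value produced by the first.

```r
# Same values as the question's table, reduced to the columns that matter here.
# `tbl` is a hypothetical name used only for this illustration.
tbl <- data.frame(
  d = as.Date(rep(c("2022-10-05", "2022-10-06"), each = 9)),
  continent = rep(rep(c("Asia", "Europe", "North America"), each = 3), 2),
  val = c(10, 25, 15, 1, 8, 9, 25, 5, 3,
          7, 8, 15, 29, 14, 13, 12, 7, 4)
)

# Step 1: sum(val) per day and continent (the summarize step in the question)
sums <- aggregate(val ~ d + continent, data = tbl, FUN = sum)

# Step 2: within each day, sort by the sum (descending) and keep the first 2 rows
top2 <- do.call(rbind, lapply(split(sums, sums$d), function(g) {
  head(g[order(-g$val), ], 2)
}))
rownames(top2) <- NULL
top2
```

The rows that survive are the same four continent/day pairs as in the result table above.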
I have a database.
I want to build a chord diagram similar to this one:
https://i.stack.imgur.com/59JcJ.png
My code:
vertices <- data.frame(name = unique(c(as.character(imports$Partner), as.character(imports$Reporter))) )
mygraph <- graph_from_data_frame( imports, vertices=vertices )
from <- match( imports$Reporter, vertices$name)
to <- match( imports$Partner, vertices$name)
ggraph(mygraph, layout = 'dendrogram', circular = TRUE) +
geom_conn_bundle(data = get_con(from = from, to = to), alpha=0.2, colour="skyblue", tension = 0) +
geom_node_point(aes(filter = leaf, x = x*1.05, y=y*1.05)) +
theme_void()
Result:
https://i.stack.imgur.com/uY2Yq.png
I searched through all kinds of settings for the chord diagram, but I couldn't find how to make all the links the same size and use the number of lines as an indicator of the value. Does anyone know how to create such a diagram?
data=structure(list(Reporter = c("USA", "USA", "USA", "USA", "India",
"Japan", "Japan", "USA", "Rep. of Korea", "USA", "Japan", "Japan",
"Japan", "Rep. of Korea", "USA", "USA", "USA", "China", "USA",
"USA", "Rep. of Korea", "USA", "Japan", "Japan", "Rep. of Korea",
"China", "China", "Rep. of Korea", "India", "China", "Rep. of Korea",
"USA", "Rep. of Korea", "Japan", "China", "Rep. of Korea", "India",
"China", "China", "India", "China", "China"), Partner = c("Saudi Arabia", "Canada", "Venezuela", "Mexico",
"Areas, nes", "Saudi Arabia", "United Arab Emirates", "Nigeria",
"Saudi Arabia", "Iraq", "Iran", "Qatar", "Kuwait", "United Arab Emirates",
"Angola", "Norway", "Colombia", "Oman", "United Kingdom", "Kuwait",
"Iran", "Gabon", "Indonesia", "Oman", "Kuwait", "Angola", "Iran",
"Oman", "Saudi Arabia", "Saudi Arabia", "Qatar", "Ecuador", "Indonesia",
"China", "Indonesia", "Australia", "Nigeria", "Yemen", "Fmr Sudan",
"Kuwait", "Iraq", "Viet Nam", "Iraq", "Australia", "Angola",
"United Arab Emirates", "Argentina", "Iran", "Trinidad and Tobago",
"Congo", "Yemen", "Iraq", "Viet Nam", "Australia", "Malaysia",
"Mexico", "Indonesia", "China", "Congo", "Ecuador", "Malaysia",
"Qatar", "Brunei Darussalam", "Norway"), Qty = c(69785202126, 68349221243, 68326932683,
64923669168, 57159000064, 53691639675, 52396394737, 46817696134,
38307387772, 31471382247, 25554794183, 19184268129, 18481591406,
16695296617, 16497467586, 16029110463, 15953011573, 15660839936,
14459452736, 13796910873, 11134838478, 10393629031, 10258716565,
9751327665, 9417368771, 8636634112, 7000465408, 6586187350, 5769723904,
5730211328, 5702528697, 5553458497, 5290777764, 5113191253, 4575188480,
4361612670, 3888963072, 3612423424, 3313590784, 3223781888, 3183182080,
3158472192, 3151280715, 3081015515, 3067260000, 2921931008, 2850134892,
2607684096, 2587749446, 2547349198, 2485083122, 2443798762, 2365431992,
2342513214, 2308853961, 2130664704, 1942125162, 1828376381, 1814260579,
1785874000, 1609282280, 1598901888, 1534923974, 1477843712, 1476737920,
1454356736, 1355873401, 1293729024, 1285355978, 1278701346, 1259876360,
1252518912, 1248772992, 1223383808, 1163368000, 1144188000, 1108399232,
1062041363, 1041526592, 977722731, 897418483, 877541040, 845556546,
801940467, 744316800, 739848000, 724177472, 694896000, 685405539,
672387008, 554965585, 540327751, 508204324, 4.87e+08, 457252032,
433428000, 430473920, 426744352, 408635880, 392727578, 390598528,
390189912, 389451923, 384376548, 350920922, 327039700, 285413702,
285143680, 275486240, 274015471, 264478000, 260122000, 238997756,
227806048, 204376795, 192144011, 150791409, 140634221, 135842986,
130777039, 129973032, 125115000, 124681401, 123443000, 120061792,
110795499, 106762492, 105548008, 84693986, 70275359, 57248174,
47944463, 40236018, 30783728, 18364000, 13419253, 12551365, 9631763,
5994199, 374000, 350000, 339115, 86420, 24000, 180)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -145L))
I want to create a scattergeo plot with markers for capitals. These markers are sized and colored according to database values.
If I use the standard colors, everything goes well: the map is shown properly with good sizes, different colors, and no legend (I add the information country:value in a hover text on the markers).
However, if I use a custom palette through the colors argument of my scattergeo plot, the colorbar is always displayed. showlegend=F and showscale=F don't help. As soon as I remove colors, the colorbar disappears.
Moreover, if I try to customize it (e.g. change the title or the format of the tick labels), it doesn't work.
In other words, no option I have tried on this colorbar works!
This is my data :
structure(list(ISO3 = c("ARG", "AUS", "AUT", "BEL", "BRA", "CAN",
"CHE", "CHL", "CHN", "COK", "DEU", "DNK", "ESP", "FIN", "FJI",
"FRA", "GBR", "HKG", "IDN", "IND", "ITA", "JPN", "KOR", "LUX",
"MEX", "MYS", "NCL", "NLD", "NOR", "NZL", "PHL", "PRT", "RUS",
"SGP", "SWE", "THA", "TON", "USA", "WSM"), Total = c(1073L, 8204L,
818L, 1502L, 1871L, 7958L, 3524L, 2456L, 3345L, 456L, 5010L,
569L, 2775L, 184L, 75L, 60382L, 4424L, 415L, 146L, 405L, 8369L,
8176L, 1034L, 235L, 961L, 137L, 6522L, 667L, 309L, 7960L, 238L,
316L, 486L, 404L, 480L, 200L, 41L, 85225L, 46L), Size = c(16,
30, 14, 18, 19, 30, 24, 21, 23, 12, 26, 13, 22, 8, 5, 50, 25,
11, 7, 11, 30, 30, 16, 9, 15, 7, 28, 13, 10, 30, 9, 10, 12, 11,
12, 8, 3, 54, 4), Color = c(3, 4, 3, 3, 3, 4, 4, 3, 4, 3, 4,
3, 3, 2, 2, 5, 4, 3, 2, 3, 4, 4, 3, 2, 3, 2, 4, 3, 2, 4, 2, 2,
3, 3, 3, 2, 2, 5, 2), ISO2 = c("AR", "AU", "AT", "BE", "BR",
"CA", "CH", "CL", "CN", "CK", "DE", "DK", "ES", "FI", "FJ", "FR",
"GB", "HK", "ID", "IN", "IT", "JP", "KR", "LU", "MX", "MY", "NC",
"NL", "NO", "NZ", "PH", "PT", "RU", "SG", "SE", "TH", "TO", "US",
"WS"), LABELFR = c("Argentine", "Australie", "Autriche", "Belgique",
"Brésil", "Canada", "Suisse", "Chili", "Chine", "Iles Cook",
"Allemagne", "Danemark", "Espagne", "Finlande", "Fidji", "France",
"Royaume-Uni", "Hong-kong, Chine", "Indonésie", "Inde", "Italie",
"Japon", "Corée, République de", "Luxembourg", "Mexique", "Malaisie",
"Nouvelle-Calédonie", "Pays-Bas", "Norvège", "Nouvelle-Zélande",
"Philippines", "Portugal", "Russie, Fédération de", "Singapour",
"Suède", "Thaïlande", "Tonga", "Etats-Unis", "Samoa"), LABELEN = c("Argentina",
"Australia", "Austria", "Belgium", "Brazil", "Canada", "Switzerland",
"Chile", "China", "Cook Islands", "Germany", "Denmark", "Spain",
"Finland", "Fiji", "France", "United Kingdom", "Hong Kong", "Indonesia",
"India", "Italy", "Japan", "South Korea", "Luxembourg", "Mexico",
"Malaysia", "New Caledonia", "Netherlands", "Norway", "New Zealand",
"Philippines", "Portugal", "Russia", "Singapore", "Sweden", "Thailand",
"Tonga", "United States", "Samoa"), CAPITAL = c("Buenos Aires",
"Canberra", "Vienna", "Brussels", "Brasilia", "Ottawa", "Bern",
"Santiago", "Beijing", "Avarua", "Berlin", "Copenhagen", "Madrid",
"Helsinki", "Suva", "Paris", "London", "N/A", "Jakarta", "New Delhi",
"Rome", "Tokyo", "Seoul", "Luxembourg", "Mexico City", "Kuala Lumpur",
"Noumea", "Amsterdam", "Oslo", "Wellington", "Manila", "Lisbon",
"Moscow", "Singapore", "Stockholm", "Bangkok", "Nuku'alofa",
"Washington", "Apia"), LATITUDE = c("-34.583333333333336", "-35.266666666666666",
"48.2", "50.833333333333336", "-15.783333333333333", "45.416666666666664",
"46.916666666666664", "-33.45", "39.916666666666664", "-21.2",
"52.516666666666666", "55.666666666666664", "40.4", "60.166666666666664",
"-18.133333333333333", "48.86666666666667", "51.5", "0", "-6.166666666666667",
"28.6", "41.9", "35.68333333333333", "37.55", "49.6", "19.433333333333334",
"3.1666666666666665", "-22.266666666666666", "52.35", "59.916666666666664",
"-41.3", "14.6", "38.71666666666667", "55.75", "1.2833333333333332",
"59.333333333333336", "13.75", "-21.133333333333333", "38.883333",
"-13.816666666666666"), LONGITUDE = c("-58.666667", "149.133333",
"16.366667", "4.333333", "-47.916667", "-75.700000", "7.466667",
"-70.666667", "116.383333", "-159.766667", "13.400000", "12.583333",
"-3.683333", "24.933333", "178.416667", "2.333333", "-0.083333",
"0.000000", "106.816667", "77.200000", "12.483333", "139.750000",
"126.983333", "6.116667", "-99.133333", "101.700000", "166.450000",
"4.916667", "10.750000", "174.783333", "120.966667", "-9.133333",
"37.600000", "103.850000", "18.050000", "100.516667", "-175.200000",
"-77.000000", "-171.766667"), CONTINENT = c("South America",
"Australia", "Europe", "Europe", "South America", "Central America",
"Europe", "South America", "Asia", "Australia", "Europe", "Europe",
"Europe", "Europe", "Australia", "Europe", "Europe", "Asia",
"Asia", "Asia", "Europe", "Asia", "Asia", "Europe", "Central America",
"Asia", "Australia", "Europe", "Europe", "Australia", "Asia",
"Europe", "Europe", "Asia", "Europe", "Asia", "Australia", "Central America",
"Australia")), class = c("data.table", "data.frame"), row.names = c(NA,
-39L), sorted = "ISO3")
This is my code :
fig <- plot_ly(
type = 'scattergeo',
showlegend=F,
mode='markers',
data=TOUR,
y=~LATITUDE,
x=~LONGITUDE,
text=sprintf("%s : %s",TOUR$LABELFR,TOUR$Total),
hovertemplate = "%{text}<extra></extra>",
colors=c(ispfPalette[c(9,4,2)]),
color=~Color,
marker=list(
showscale=F,
size=~Size,
reversescale=F
)
)
And this is the output I have :
The best solution would be to hide the colorbar completely, but I would also be curious how to customize it by changing the title and formatting the values (for example in the case of percentages, or if I want to change the decimal separator).
Thanks for your help !
Update on request:
You could modify the colorbar in the colors argument, for example like this:
fig <- plot_ly(
type = 'scattergeo',
showlegend=F,
mode='markers',
data=TOUR,
y=~LATITUDE,
x=~LONGITUDE,
text=sprintf("%s : %s",TOUR$LABELFR,TOUR$Total),
hovertemplate = "%{text}<extra></extra>",
colors="YlOrRd",
#colors = c("#1B98E0","black"),
color=~Color,
marker=list(
showscale=F,
size=~Size,
reversescale=F
)
)
fig
[Figure: result with colors="YlOrRd"]
[Figure: result with colors = c("#1B98E0","black")]
ORIGINAL ANSWER:
Just add %>% hide_colorbar() at the end of your code:
fig <- plot_ly(
type = 'scattergeo',
showlegend=F,
mode='markers',
data=TOUR,
y=~LATITUDE,
x=~LONGITUDE,
text=sprintf("%s : %s",TOUR$LABELFR,TOUR$Total),
hovertemplate = "%{text}<extra></extra>",
colors=c(ispfPalette[c(9,4,2)]),
color=~Color,
marker=list(
showscale=F,
size=~Size,
reversescale=F
)
) %>% hide_colorbar()
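For the second half of the question (changing the title and the tick format), here is a minimal sketch. It assumes the plotly package is installed and uses a three-row stand-in data frame called tour (a made-up name, with values rounded from the question's data) instead of the full TOUR table. colorbar() and hide_colorbar() are plotly functions; the particular title, tickformat, and separators values are only examples.

```r
library(plotly)

# Stand-in for the TOUR data: three rows, coordinates rounded (illustration only)
tour <- data.frame(
  LABELEN   = c("France", "Japan", "United States"),
  LATITUDE  = c(48.87, 35.68, 38.88),
  LONGITUDE = c(2.33, 139.75, -77.00),
  Size      = c(50, 30, 54),
  Color     = c(5, 4, 5)
)

fig <- plot_ly(
  data = tour, type = "scattergeo", mode = "markers",
  lat = ~LATITUDE, lon = ~LONGITUDE,
  color = ~Color, colors = c("#1B98E0", "black"),
  marker = list(size = ~Size)
)

# Option 1: drop the colorbar entirely
fig_hidden <- hide_colorbar(fig)

# Option 2: keep it and pass colorbar attributes (title, d3-style tick format)
fig_custom <- colorbar(fig, title = "Category", tickformat = ".1f")

# Decimal and thousands separators are a layout-level setting:
# first character = decimal mark, second = thousands mark
fig_custom <- layout(fig_custom, separators = ", ")
```

Printing fig_hidden or fig_custom renders the map; neither call changes the marker colors themselves.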
Can you use text as X axis labels on a plot? I've searched and cannot find any examples. Am I trying to do something that is not possible in R? It fails even when I try to plot just one variable. Countries is text/character, but I do not know how to set it as such.
plot(Finally$Countries,Finally$RobberyPerCent, pch = 16, col = 2)
I get the error
Error in plot.window(...) : need finite 'xlim' values
In addition: There were 24 warnings (use warnings() to see them)
Thank you, my goal is to combine two variables and see if there is a basic pattern. I've been able to figure out simple linear regression (no correlation), but I'm failing at basic plotting
#Subset for Percentages
Q5DataFinal <- subset(Q5Data, select = c(RobberyPerCent, UnlawfulPerCent))
View(Q5DataFinal)
library(data.table)
Nearlythere <- setDT(Q5DataFinal, keep.rownames = TRUE)[] # turn rownames into column data
names(Nearlythere)[names(Nearlythere) == 'rn'] <- 'Countries' #renaming rn to countries
Nearlythere$Countries[] <- lapply(Nearlythere$Countries, as.character) #Changing Countries to Character
Finally <- Nearlythere
summary(Finally) #Countries saved as characters
# Attempt to create two Y axis Graph with Countries as X ticks
par(mar = c(5, 4, 4, 4) + 0.3) # Additional space for second y-axis
plot(Finally$Countries,Finally$RobberyPerCent, pch = 16, col = 2) # Create first plot
par(new = TRUE) # Add new plot
plot(Finally$Countries, Finally$UnlawfulPerCent, pch = 17, col = 3, # Create second plot without axes
axes = FALSE, xlab = "", ylab = "")
axis(side = 4, at = pretty(range(Finally$UnlawfulPerCent))) # Add second axis
mtext("UnlawfulPerCent", side = 4, line = 3) # Add second axis label
Dput is
structure(list(Countries = list("Albania", "Austria", "Bulgaria",
"Croatia", "Cyprus", "Czechia", "Finland", "Germany (until 1990 former territory of the FRG)",
"Greece", "Ireland", "Italy", "Kosovo (under United Nations Security Council Resolution 1244/99)",
"Latvia", "Lithuania", "Luxembourg", "Malta", "Montenegro",
"Romania", "Serbia", "Slovenia", "Spain", "Switzerland"),
RobberyPerCent = c(5, 6, 18, 7, 5, 23, 5, 9, 24, 9, 40, 12,
17, 18, 10, 52, 24, 33, 10, 17, 80, 2), UnlawfulPerCent = c(95,
94, 82, 93, 95, 77, 95, 91, 76, 91, 60, 88, 83, 82, 90, 48,
76, 67, 90, 83, 20, 98)), row.names = c(NA, -22L), class = c("data.table",
"data.frame"))
Do you want something like this?
par(mar = c(5, 5, 4, 2))
x <- seq(0, 5, length.out = 500)
plot(x, sin(x^2), xaxt = "n", xlab = expression("Here is X"), ylab = expression(sin(x^2)),
main = expression("My coolest plot" - sin(x^2)))
axis(1, at=0:5, labels=c("Albania", "Kosovo", "Kongo", "Germany", "Bulgaria", "Spain"))
An addition
#your dataset
countries <- list("Albania", "Austria", "Bulgaria",
"Croatia", "Cyprus", "Czechia", "Finland", "Germany (until 1990 former territory of the FRG)",
"Greece", "Ireland", "Italy", "Kosovo (under United Nations Security Council Resolution 1244/99)",
"Latvia", "Lithuania", "Luxembourg", "Malta", "Montenegro",
"Romania", "Serbia", "Slovenia", "Spain", "Switzerland")
#modify to
axis(1, at=0:21, labels=countries, cex.axis=0.5) #select cex.axis for better displaying
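Applied to the data from the question, the same idea could look like the sketch below. It uses only the first six rows of the dput to stay short, and the helper names countries and pos are mine. The assumption here is that the error comes from handing plot() a non-numeric x: a list or character vector is coerced to NA, so there is no finite xlim. Plotting against numeric positions and writing the names with axis() avoids that.

```r
# First six rows of the question's data; Countries is a list column, as in the dput
Finally <- data.frame(
  RobberyPerCent  = c(5, 6, 18, 7, 5, 23),
  UnlawfulPerCent = c(95, 94, 82, 93, 95, 77)
)
Finally$Countries <- list("Albania", "Austria", "Bulgaria",
                          "Croatia", "Cyprus", "Czechia")

# Flatten the list column to a plain character vector
countries <- unlist(Finally$Countries)
pos <- seq_along(countries)          # numeric x positions 1..n

par(mar = c(7, 4, 4, 4) + 0.3)       # room for rotated labels and the second axis
plot(pos, Finally$RobberyPerCent, pch = 16, col = 2,
     xaxt = "n", xlab = "", ylab = "RobberyPerCent")
axis(1, at = pos, labels = countries, las = 2, cex.axis = 0.7)  # text tick labels

par(new = TRUE)                      # overlay the second series
plot(pos, Finally$UnlawfulPerCent, pch = 17, col = 3,
     axes = FALSE, xlab = "", ylab = "")
axis(side = 4, at = pretty(range(Finally$UnlawfulPerCent)))
mtext("UnlawfulPerCent", side = 4, line = 3)
```

With all 22 countries the long names will need a smaller cex.axis or a larger bottom margin.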
I have a factor variable with countries. I have to use the ! and %in% operators so that I keep "United States", "Switzerland", and "United Kingdom" and transform the rest to "Others". But the code I am using is not working:
country <- c(rep(x = "United States", 466), rep(x = "United Kingdom", 250), rep(x = "Switzerland", 520),
rep(x = "France", 97), rep(x = "Italy", 85), rep(x = "Germant", 39), rep(x = "Canada", 25),
rep(x = "Singapore", 2), rep(x = "South Africa", 9))
country
bulk <- c("United States", "Switzerland", "United Kingdom")
if(! bulk %in% country) country <- "Others"
I am expecting it to make four categories. United States, Switzerland, United Kingdom, Others. But I don't want the solution out of context of "!" and "%in%" operators.
Solution for a vector:
country[!(country %in% bulk)] <- "Others"
Solution for a data frame:
df<-data.frame(country=country, emptycolumn=NA)
df$country<-as.character(df$country)
df$country[!(df$country %in% bulk)]<-"Others"
View(df)
Try
country[ ! country %in% bulk ] <- "Other"
table(country)
#-------------------------
country
         Other    Switzerland United Kingdom  United States
           257            520            250            466
R accepts logical indices for conditional assignments.
country <- as.data.frame(c(rep(x = "United States", 466), rep(x = "United Kingdom", 250), rep(x = "Switzerland", 520),
rep(x = "France", 97), rep(x = "Italy", 85), rep(x = "Germant", 39), rep(x = "Canada", 25),
rep(x = "Singapore", 2), rep(x = "South Africa", 9)), stringsAsFactors = F)
colnames(country) <- "country"
bulk <- c("United States", "Switzerland", "United Kingdom")
country$country[!country$country %in% bulk] <- "Other"
unique(country)
            country
1     United States
467  United Kingdom
717     Switzerland
1237          Other
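To see why the original if(! bulk %in% country) attempt cannot work, it helps to look at what each direction of %in% returns. The sketch below uses a shortened country vector of my own (seven values instead of the full data). The last part covers the case where the variable really is a factor, as the question states, by recoding its levels instead of its values.

```r
# Shortened example data (illustration only)
country <- c(rep("United States", 3), rep("Switzerland", 2), "France", "Italy")
bulk <- c("United States", "Switzerland", "United Kingdom")

# One value per element of bulk: "does this bulk entry occur anywhere in country?"
# That is 3 values, which is not what an element-wise recode needs.
in_country <- bulk %in% country

# One value per element of country: "is this row one of the bulk entries?"
keep <- country %in% bulk

# Logical indexing: overwrite only the rows that are NOT in bulk
recoded <- country
recoded[!keep] <- "Others"
table(recoded)

# Factor version: rename every level outside bulk to "Others";
# levels that end up with the same name are merged
f <- factor(country)
levels(f)[!(levels(f) %in% bulk)] <- "Others"
table(f)
```

Assigning "Others" directly into a factor that has no such level would produce NA with a warning, which is why the factor version works on levels(f).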