I am trying to plot (for the first time) a chord diagram with the circlize package in RStudio. I am working through the manual (Circular Visualization in R). The first step is to allocate the sectors on a circle with circos.initialize(). However, at this step I get the error "missing value where TRUE/FALSE needed".
A reproducible example
library(circlize)
Types <- data.frame(Types = c("OOP", "UVA", "MAT", "OIC", "FIN", "WSE"))
stack.df <- data.frame(Year = c(rep(2019, 1), rep(2020, 4), rep(2021, 7), rep(2022, 11), rep(2023, 11)), Invoice = c(paste0("2019.", "10", ".INV"),
paste0("2020.", seq(from = 20, to = 23, by = 1), ".INV"),
paste0("2021.", seq(from = 30, to = 36, by = 1), ".INV"),
paste0("2022.", seq(from = 40, to = 50, by = 1), ".INV"),
paste0("2023.", seq(from = 50, to = 60, by = 1), ".INV")))
stack.df <- cbind(stack.df, Org_1 = Types[sample(nrow(Types), nrow(stack.df), replace = TRUE), ], Org_2 = Types[sample(nrow(Types), nrow(stack.df), replace = TRUE), ])
Making Chord Diagram
My overall objective: Make a chord diagram where the sectors are the stack.df$Year and track 1 is the stack.df$Invoice, with the circos.links from stack.df$Org_1 to stack.df$Org_2.
Initialize
circos.initialize(sectors = stack.df$Year, x = stack.df$Invoice)
Error in if (sector.range[i] == 0) { :
missing value where TRUE/FALSE needed
In addition: Warning message:
In circos.initialize(sectors = stack.df$Year, x = stack.df$Invoice) :
NAs introduced by coercion
What am I missing? My sector.range is not 0, since stack.df$Year runs from 2019 to 2023. Any help in overcoming this error is greatly appreciated.
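A minimal sketch of what the warning seems to point at, assuming circos.initialize() needs numeric x values: the Invoice labels are character strings, so coercing them to numeric gives NA, and the range check then fails. The small vectors below are made up for illustration, and the within-sector position is only one possible workaround, not necessarily what circlize recommends.

```r
# Hypothetical miniature of stack.df: one invoice in 2019, two in 2020
year    <- c(2019, 2020, 2020)
invoice <- c("2019.10.INV", "2020.20.INV", "2020.21.INV")

# Character labels cannot be coerced to numbers, so every value becomes NA
# (this is the "NAs introduced by coercion" warning)
x_bad <- suppressWarnings(as.numeric(invoice))

# One possible workaround: give each invoice a numeric position within its year
x_pos <- ave(seq_along(year), year, FUN = seq_along)

# Per-sector limits from 0 to the number of invoices, so that a sector with a
# single invoice (2019) does not end up with a zero-width range
counts <- table(year)
xlim   <- cbind(0, as.vector(counts))

# Assumed usage (not run here): pass the limits instead of the character column
# circos.initialize(sectors = names(counts), xlim = xlim)
```

The invoice labels themselves could then be drawn as text inside the track rather than used as x coordinates.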
I want to fit my points with a logarithmic curve. Here is my data. I want to plot the points and then add a logarithmic fitted curve.
x<-structure(list(X2.y = c(39.99724745, 29.55541525, 23.39578201,
15.46797044, 10.52063652, 7.296161198, 6.232038434, 4.811851132,
4.641281547, 4.198523289, 3.325515839, 2.596563723, 1.894902523,
1.556380314), X5.y = c(62.76037622, 48.54726084, 37.71302646,
24.93942365, 17.71060023, 13.31130267, 10.36341862, 7.706914722,
7.170517624, 6.294292013, 4.917428837, 3.767836298, 2.891519878,
2.280974128), X10.y = c(77.83154815, 61.12151516, 47.19228808,
31.21034981, 22.47098182, 17.29384973, 13.09875178, 9.623698726,
8.845091983, 7.681873268, 5.971413758, 4.543320659, 3.551367285,
2.760718282), X25.y = c(96.87401383, 77.00911883, 59.16936025,
39.13368164, 28.48573658, 22.32580849, 16.55485248, 12.0455604,
10.96092113, 9.435085861, 7.303126501, 5.523147205, 4.385086234,
3.366876291), X50.y = c(111.0008027, 88.79545082, 68.05463659,
45.01166182, 32.94782526, 26.05880295, 19.11878542, 13.84223574,
12.53056405, 10.73571912, 8.291067088, 6.25003851, 5.003586577,
3.81655893), X100.y = c(125.0232816, 100.4947544, 76.87430545,
50.84623991, 37.37696657, 29.76423356, 21.66378667, 15.6256447,
14.08861698, 12.0267487, 9.271712877, 6.971562563, 5.61752001,
4.262921183)), class = "data.frame", row.names = c(NA, -14L))
I tried this:
single_idf<-function(x) {
  idf<-x
  durations = c(5/60, 10/60, 15/60, 30/60, 1, 2, 3, 4, 5, 6, 8, 12, 18, 24)
  nd = length(durations)
  Tp = c(2, 5, 10, 25, 50, 100)
  nTp = length(Tp)
  psym = seq(1, nTp)
  # open new window for this graph, set plotting parameters for a single graph panel
  windows()
  par(mfrow = c(1,1), mar = c(5, 5, 5, 5), cex = 1)
  # set up custom axis labels and grid line locations
  ytick = c(1,2,3,4,5,6,7,8,9,10,20,30,40,50,60,70,80,90,100,
            200,300,400,500,600,700,800,900,1000,1100,1200,1300,1400)
  yticklab = as.character(ytick)
  xgrid = c(5,6,7,8,9,10,15,20,30,40,50,60,120,180,240,300,360,
            420,480,540,600,660,720,840,960,1080,1200,1320,1440)
  xtick = c(5,10,15,20,30,60,120,180,240,300,360,480,720,1080,1440)
  xticklab = c("5","10","15","20","30","60","2","3","4","5","6","8","12","18","24")
  ymax1 = max(idf)
  durations = durations*60
  plot(durations, col=c("#FF00FF") ,lwd=c(1), idf[, 1],
       xaxt="n",yaxt="n",
       pch = psym[1], log = "xy",
       xlim = c(4, 24*60), ylim = range(c(1,idf+150)),
       xlab = "(min) Duration (hr)",
       ylab = "Intensity (mm/hr)"
  )
  for (iT in 2:nTp) {
    points(durations, idf[, iT], pch = psym[iT], col="#FF00FF",lwd=1)
  }
  for (iT in 1:nTp) {
    mod.lm = lm(log10(idf[, iT]) ~ log10(durations))
    b0 = mod.lm$coef[1]
    b1 = mod.lm$coef[2]
    yfit = log(10^(b0 + b1*log10(durations)))
    lines(durations,col=c("#FF00FF"),yfit, lty = psym[iT],lwd=1)
  }
}
But when I run this, the curves lie far away from the points. I want the curves to pass through the points. How can I do this?
single_idf(x)
Consider this as an option using ggplot2, dplyr and tidyr. I also added method = 'lm' to match the OP's expected output (many thanks and credit to @AllanCameron for his advice):
library(ggplot2)
library(dplyr)
library(tidyr)  # provides pivot_longer()
#Data
# y is assumed to be the durations (in minutes) from the question
y <- c(5/60, 10/60, 15/60, 30/60, 1, 2, 3, 4, 5, 6, 8, 12, 18, 24) * 60
df <- data.frame(x,y)
#Plot
df %>%
  pivot_longer(-y) %>%
  ggplot(aes(x=log(y),y=log(value),color=name,group=name))+
  geom_point()+
  stat_smooth(geom = 'line',method = 'lm')
Output:
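The main problem is that you were plotting the natural log of the fit rather than the fit itself. Because the plot already uses log = "xy", the fitted values must be supplied on the original scale.
If you change the line
yfit = log(10^(b0 + b1*log10(durations)))
to
yfit = 10^(b0 + b1*log10(durations))
and rerun your code, the fitted lines pass through the points.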
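To see the effect in isolation, here is a small sketch with made-up data that follows an exact power law (the durations and the 100 * durations^(-0.7) relation are assumptions for illustration, not the OP's data). The back-transformed fit reproduces the values, while wrapping it in log() shrinks every value.

```r
# Synthetic data: an exact power law, so the log-log fit is perfect
durations <- c(5, 10, 15, 30, 60, 120)
intensity <- 100 * durations^(-0.7)

# Fit a straight line on the log10-log10 scale
fit <- lm(log10(intensity) ~ log10(durations))
b0 <- unname(coef(fit)[1])
b1 <- unname(coef(fit)[2])

# Back-transform to the original scale: this is what lines() should receive
yfit <- 10^(b0 + b1 * log10(durations))

# Taking log() of the fit again gives much smaller values, which is why the
# curves were drawn far below the points
yfit_wrong <- log(yfit)
```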
The tutorial for catboost with R says this:
library(catboost)
countries = c('RUS','USA','SUI')
years = c(1900,1896,1896)
phone_codes = c(7,1,41)
domains = c('ru','us','ch')
dataset = data.frame(countries, years, phone_codes, domains)
label_values = c(0,1,1)
fit_params <- list(iterations = 100,
loss_function = 'Logloss',
ignored_features = c(4,9),
border_count = 32,
depth = 5,
learning_rate = 0.03,
l2_leaf_reg = 3.5)
pool = catboost.load_pool(dataset, label = label_values, cat_features = c(0,3))
model <- catboost.train(pool, params = fit_params)
However, this results in:
Error in catboost.from_data_frame(data, label, pairs, weight, group_id, :
Unsupported column type: character
Many thanks,
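A possible cause, offered as a sketch rather than a confirmed fix: since R 4.0, data.frame() keeps strings as character columns instead of converting them to factors, and the error message suggests the loader does not accept character columns. Converting them to factors before building the pool is one thing to try. Whether catboost.load_pool() then treats the factor columns as categorical is an assumption here, so the call is left as a comment.

```r
countries   <- c("RUS", "USA", "SUI")
years       <- c(1900, 1896, 1896)
phone_codes <- c(7, 1, 41)
domains     <- c("ru", "us", "ch")
dataset <- data.frame(countries, years, phone_codes, domains)

# Convert every character column to a factor and leave the other columns as they are
dataset[] <- lapply(dataset, function(col) {
  if (is.character(col)) factor(col) else col
})

# Assumed usage (not run here): build the pool from the converted data frame
# pool <- catboost.load_pool(dataset, label = c(0, 1, 1))
```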
I found this code for a map of road traffic casualties across the UK made with mapdeck.
I would like to create something similar for Italy, but I don't understand how to modify the code to show Italy instead.
library(mapdeck)
set_token(Sys.getenv("MAPBOX"))
crash_data = read.csv("https://git.io/geocompr-mapdeck")
crash_data = na.omit(crash_data)
ms = mapdeck_style("dark")
mapdeck(style = ms, pitch = 45, location = c(0, 52), zoom = 4) %>%
add_grid(data = crash_data, lat = "lat", lon = "lng", cell_size = 1000,
elevation_scale = 50, layer_id = "grid_layer",
colour_range = viridisLite::plasma(6))
Thank you!
The data in your example covers the UK, so to show casualties in Italy you will need a data set with Italian coordinates. To centre the map on Italy, change the location argument, which takes longitude first and then latitude (roughly 12, 43 for Italy).
library(mapdeck)
set_token(Sys.getenv("MAPBOX"))
crash_data = read.csv("https://git.io/geocompr-mapdeck")
crash_data = na.omit(crash_data)
ms = mapdeck_style("dark")
mapdeck(style = ms, pitch = 45, location = c(12, 43), zoom = 4) %>%
add_grid(data = crash_data, lat = "lat", lon = "lng", cell_size = 1000,
elevation_scale = 50, layer_id = "grid_layer",
colour_range = viridisLite::plasma(6))
You can then adjust zoom, pitch, and the other arguments as needed.
I have been using the sankeyD3 package to create Sankey networks, and the 'NodePosX' feature isn't working for me yet. 'NodePosX' is not available in the 'networkD3' package, but it is in 'sankeyD3'.
To illustrate the problem, I have edited the example from akraemer007 that was posted here to include the x positions of the nodes (see below), but it still does not work the way he originally wanted, with manual control over the x-position of the 'Opted-Out' node.
We're aiming for something like this, but without the small line from 'Opted-Out' to 'Activated':
library(devtools)
devtools::install_github("fbreitwieser/sankeyD3")
library(sankeyD3)
name <- c('Enrolled', 'Opted-Out', 'Invited', 'Activated')
xpos <- c(0, 1, 1, 2)
nodes <- data.frame(name, xpos)
source <- c(0, 0, 2, 1)
target <- c(1, 2, 3, 3)
value <- c(20, 80, 60, 0)
links <- data.frame(source, target, value)
sankeyNetwork(Links = links, Nodes = nodes, Source = "source",
Target = "target", Value = "value", NodeID = "name",NodePosX = "xpos",
units = "TWh", fontSize = 12, nodeWidth = 30)
Assuming the last row in your links data frame is only there to force the plot to look the way you want, and is not part of the actual data you want to plot, you can achieve this with networkD3 using the sinksRight = FALSE parameter.
library(networkD3)
name <- c('Enrolled', 'Opted-Out', 'Invited', 'Activated')
xpos <- c(0, 1, 1, 2)
nodes <- data.frame(name, xpos)
source <- c(0, 0, 2)
target <- c(1, 2, 3)
value <- c(20, 80, 60)
links <- data.frame(source, target, value)
sankeyNetwork(Links = links, Nodes = nodes, Source = "source",
Target = "target", Value = "value", NodeID = "name",
units = "TWh", fontSize = 12, nodeWidth = 30, sinksRight = FALSE)
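Note that source and target in the links are zero-based row positions in the nodes data frame. If your links are stored by node name instead, a sketch like the following can derive the indices. The links_named data frame and its from/to columns are hypothetical names used only for illustration.

```r
nodes <- data.frame(name = c("Enrolled", "Opted-Out", "Invited", "Activated"))

# Hypothetical links keyed by node name rather than by index
links_named <- data.frame(
  from  = c("Enrolled", "Enrolled", "Invited"),
  to    = c("Opted-Out", "Invited", "Activated"),
  value = c(20, 80, 60)
)

# match() returns 1-based positions, so subtract 1 to get zero-based indices
links <- data.frame(
  source = match(links_named$from, nodes$name) - 1,
  target = match(links_named$to, nodes$name) - 1,
  value  = links_named$value
)
```

The resulting links data frame has the same source, target and value columns as the one built by hand above.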