I'm working on a bubble map where I generated two columns, one for a color id (column Color) and one for the text referring to that id (column Class). This is a classification of my individuals (each Color always belongs to one Class).
Class is a factor with a specific level order, which I set with:
COME1039$Class <- factor(COME1039$Class, levels = c('moins de 100 000 F.CFP',
'entre 100 000 et 5 millions F.CFP',
'entre 5 millions et 1 milliard F.CFP',
'entre 1 milliard et 20 milliards F.CFP',
'plus de 20 milliards F.CFP'))
This is my code
g <- list(
scope = 'world',
visible = F,
showland = TRUE,
landcolor = toRGB("#EAECEE"),
showcountries = T,
countrycolor = toRGB("#D6DBDF"),
showocean = T,
oceancolor = toRGB("#808B96")
)
COM.g1 <- plot_geo(data = COME1039,
sizes = c(1, 700))
COM.g1 <- COM.g1 %>% add_markers(
x = ~LONGITUDE,
y = ~LATITUDE,
name = ~Class,
size = ~`Poids Imports`,
color = ~Color,
colors=c(ispfPalette[c(1,2,3,7,6)]),
text=sprintf("<b>%s</b> <br>Poids imports: %s tonnes<br>Valeur imports: %s millions de F.CFP",
COME1039$NomISO,
formatC(COME1039$`Poids Imports`/1000,
small.interval = ",",
digits = 1,
big.mark = " ",
decimal.mark = ",",
format = "f"),
formatC(COME1039$`Valeur Imports`/1000000,
small.interval = ",",
digits = 1,
big.mark = " ",
decimal.mark = ",",
format = "f")),
hovertemplate = "%{text}<extra></extra>"
)
COM.g1 <- COM.g1%>% layout(geo=g)
COM.g1 <- COM.g1%>% layout(dragmode=F)
COM.g1 <- COM.g1 %>% layout(showlegend=T)
COM.g1 <- COM.g1 %>% layout(legend = list(title=list(text='Valeurs des importations<br>'),
orientation = "h",
itemsizing='constant',
x=0,
y=0)) %>% hide_colorbar()
COM.g1
Unfortunately my data are too big to be added here, but this is the output I get:
As you can see, the order of the legend is not the order of the factor levels. How can I make the legend follow the factor levels? If data are needed to give me a hint, I will try to limit their size.
Many thanks !
Plotly is going to alphabetize your legend and you have to 'make' it listen. The order of the traces in your plot is the order in which the items appear in your legend. So if you rearrange the traces in the object, you'll rearrange the legend.
I don't have your data, so I used some data from rnaturalearth.
First I created a plot, using plot_geo. Then I used plotly_build() to make sure I had the trace order in the Plotly object. I used lapply to investigate the current order of the traces. Then I created a new order, rearranged the traces, and plotted it again.
The initial plot and build.
library(tidyverse)
library(plotly)
library(rnaturalearth)
canada <- ne_states(country = "Canada", returnclass = "sf")
x = plot_geo(canada, sizes = c(1, 700)) %>%
add_markers(x = ~longitude, y = ~latitude,
name = ~name, color = ~name)
x <- plotly_build(x) # capture all elements of the object
Now for the investigation; this is more so you can see how this all comes together.
# what order are they in?
invisible(
lapply(seq_along(x$x$data),
function(i) {
z <- x$x$data[[i]]$name
message(i, " ", z)
})
)
# 1 Alberta
# 2 British Columbia
# 3 Manitoba
# 4 New Brunswick
# 5 Newfoundland and Labrador
# 6 Northwest Territories
# 7 Nova Scotia
# 8 Nunavut
# 9 Ontario
# 10 Prince Edward Island
# 11 Québec
# 12 Saskatchewan
# 13 Yukon
In your question, you show that you made the legend element a factor. That's what I've done as well with this data.
can2 = canada %>%
mutate(name = ordered(name,
levels = c("Manitoba", "New Brunswick",
"Newfoundland and Labrador",
"Northwest Territories",
"Alberta", "British Columbia",
"Nova Scotia", "Nunavut",
"Ontario", "Prince Edward Island",
"Québec", "Saskatchewan", "Yukon")))
I used the data to reorder the traces in my Plotly object. This creates a vector. It starts with the levels and their row number or order (1:13). Then I alphabetized the data by the levels (so it matches the current order in the Plotly object).
The output of this set of function calls is a vector of numbers (i.e., 5, 6, 1, etc.). Since I have 13 names, I have 1:13. You could also make it dynamic with 1:length(levels(can2$name)).
# capture order
df1 = data.frame(who = levels(can2$name), ord = 1:13) %>%
arrange(who) %>% select(ord) %>% unlist()
Now all that's left is to rearrange the object traces and visualize it.
x$x$data = x$x$data[order(c(df1))] # reorder the traces
x # visualize
Originally:
With reordered traces:
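As a more compact variant of the steps above, the same reordering can be sketched with match(). The helper name reorder_traces is hypothetical, and the toy traces list only stands in for x$x$data. The sketch assumes every trace has a name element and that each name appears among the factor levels; traces whose name is not in the levels would be dropped.

```r
# Hypothetical helper: put a list of traces in the order given by `lvls`.
# Assumes every trace has a `name` element; traces not named in `lvls` are dropped.
reorder_traces <- function(traces, lvls) {
  trace_names <- vapply(traces, function(tr) tr$name, character(1))
  idx <- match(lvls, trace_names)   # position of each level among the traces
  traces[idx[!is.na(idx)]]          # skip levels that have no trace
}

# Toy stand-in for x$x$data (alphabetical, as in the built Plotly object)
traces <- list(list(name = "Alberta"), list(name = "Manitoba"), list(name = "Yukon"))
lvls <- c("Manitoba", "Yukon", "Alberta")

reordered <- reorder_traces(traces, lvls)
vapply(reordered, function(tr) tr$name, character(1))
```

With the objects from this answer, the call would be x$x$data <- reorder_traces(x$x$data, levels(can2$name)).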
I have 3 dataframes:
> head(ps_data)
mass value
1 1197.106 0.0003046761
2 1197.312 0.0002792939
3 1197.518 0.0002545125
4 1197.724 0.0002304614
5 1197.930 0.0002072700
6 1198.136 0.0001850678
> head(enf_data)
mass value
1 1252.358 0.0001400532
2 1252.560 0.0001380179
3 1252.761 0.0001360147
4 1252.963 0.0001336038
5 1253.165 0.0001310146
6 1253.367 0.0001278587
> head(uti_data)
mass value
1 1209.999 9.404051e-05
2 1210.204 9.176861e-05
3 1210.409 8.892953e-05
4 1210.614 8.613961e-05
5 1210.819 8.299913e-05
6 1211.024 8.038693e-05
I need to plot something close to this:
Where z axis will be the "value" column, y axis will be the "mass" column and the x axis will be each dataframe.
I tried to plot this using plotly package, but I'm not getting it right.
How can I do it?
EDIT: dput as requested.
structure(list(mass = c(1197.10568602095, 1197.31161534199, 1197.51756246145,
1197.72352737934, 1197.92951009569, 1198.1355106105), value = c(0.000304676093184434,
0.000279293920415841, 0.000254512541389108, 0.000230461422005283,
0.000207270028165387, 0.000185067825770437), group = c("PS",
"PS", "PS", "PS", "PS", "PS")), row.names = c(NA, 6L), class = "data.frame")
structure(list(mass = c(1252.3578527531, 1252.55956147119, 1252.76128739414,
1252.96303052216, 1253.16479085545, 1253.3665683942), value = c(0.000140053215421452,
0.000138017894050617, 0.00013601474884925, 0.000133603848925069,
0.000131014621271734, 0.000127858739055662), group = c("ENF",
"ENF", "ENF", "ENF", "ENF", "ENF")), row.names = c(NA, 6L), class = "data.frame")
structure(list(mass = c(1209.99938731277, 1210.20436650703, 1210.40936335465,
1210.61437785568, 1210.81941001019, 1211.02445981824), value = c(9.40405108642129e-05,
9.17686135352109e-05, 8.89295335433793e-05, 8.61396097238083e-05,
8.29991287322805e-05, 8.03869281229029e-05), group = c("UTI",
"UTI", "UTI", "UTI", "UTI", "UTI")), row.names = c(NA, 6L), class = "data.frame")
EDIT 2:
Got some progress using plotly:
ps_data["group"] <- "PS"
enf_data["group"] <- "ENF"
uti_data["group"] <- "UTI"
all_data <- rbind(ps_data,enf_data,uti_data)
all_long <- melt(all_data, id.vars=c("mass","group","value"))
fig <- plot_ly(all_long, x = ~group, y = ~mass, z = ~value, type = 'scatter3d', mode = 'lines',
opacity = 1, line = list(width = 6, color = ~group, reverscale = FALSE))
fig
But some strange lines appeared in x axis and the colors are not right.
EDIT 3:
I managed to plot something quite good.
My data looks like this:
> head(all_data)
mass value group
1 1197.106 0.0003046761 PS
2 1197.312 0.0002792939 PS
3 1197.518 0.0002545125 PS
4 1197.724 0.0002304614 PS
5 1197.930 0.0002072700 PS
6 1198.136 0.0001850678 PS
The dataframe is huge, with three groups (PS, ENF, UTI).
I can't fit all of it here, but I decided to place the head just for you to see the structure.
With this data I used this:
p3 <- plot_ly(all_data, x = ~group, y = ~mass, z = ~value, split = ~group, type = 'scatter3d', mode = 'lines',
line = list(width = 4))
Now I'm just trying to find some reliable way to save it in TIFF and change the axis titles.
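For the axis titles, a minimal sketch follows. It assumes the titles go under layout(scene = ...), which is where plotly keeps the axes of 3D traces. The title strings are placeholders, and the small all_data frame is only a stand-in built from the head() values above. Saving to TIFF is not covered here.

```r
library(plotly)

# Small stand-in for all_data (values copied from the head() output above)
all_data <- data.frame(
  mass  = c(1197.106, 1197.312, 1252.358, 1252.560, 1209.999, 1210.204),
  value = c(3.046761e-04, 2.792939e-04, 1.400532e-04, 1.380179e-04,
            9.404051e-05, 9.176861e-05),
  group = rep(c("PS", "ENF", "UTI"), each = 2)
)

p3 <- plot_ly(all_data, x = ~group, y = ~mass, z = ~value, split = ~group,
              type = 'scatter3d', mode = 'lines', line = list(width = 4))

# For 3D traces the axes are set under `scene`, not directly under layout
p3 <- p3 %>% layout(scene = list(
  xaxis = list(title = "Group"),   # placeholder titles
  yaxis = list(title = "Mass"),
  zaxis = list(title = "Value")
))
p3
```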
I really appreciate the 'plotly' R package. Currently I am running into an issue where I want to visualize a data frame as points and map the point size (and potentially the shape) to a dimension of the data frame.
The problem I run into with my own dataset is that the sizes are somehow "mixed up", in the sense that the bigger points don't correspond to the bigger values.
I haven't fully understood the options I have with plotly (sizeref and other marker options; the fundamental difference between mapping the dimension directly or in the marker arguments; etc.), so this is my best shot at a minimal example.
(The second plot is closer to what I currently do. If this one could be fixed, it would be preferable to me)
Your thoughts are greatly appreciated. :)
library(plotly)
set.seed(1)
df <- data.frame(x = 1:10,
y = rep(c("id1", "id2"), 5),
col = factor(sample(3, 10, replace = TRUE)))
df$size <- c(40, 40, 40, 30, 30, 30, 20, 20, 20, 10)
df
#> x y col size
#> 1 1 id1 1 40
#> 2 2 id2 2 40
#> 3 3 id1 2 40
#> 4 4 id2 3 30
#> 5 5 id1 1 30
#> 6 6 id2 3 30
#> 7 7 id1 3 20
#> 8 8 id2 2 20
#> 9 9 id1 2 20
#> 10 10 id2 1 10
# Mapping looks right, but the size may not be correct
plot_ly(df,
x = ~x,
y = ~y,
color = ~col,
size = ~size,
type = 'scatter',
mode = 'markers',
hoverinfo = "text",
text = ~paste('</br> x: ', x,
'</br> y: ', y,
'</br> col: ', col,
'</br> size: ', size)
# , marker = list(size = ~size)
)
# Size looks right, but mapping to points is wrong
plot_ly(df,
x = ~x,
y = ~y,
color = ~col,
# size = ~size,
type = 'scatter',
mode = 'markers',
hoverinfo = "text",
text = ~paste('</br> x: ', x,
'</br> y: ', y,
'</br> col: ', col,
'</br> size: ', size)
, marker = list(size = ~size)
)
devtools::session_info() # excerpt
#> plotly * 4.8.0
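One possible explanation, offered as an assumption rather than confirmed behaviour: color = ~col splits the data into one trace per level, and a size vector passed inside marker = list(...) may not be subset per trace the way a top-level size = ~size mapping is. The sketch below keeps the top-level mapping and sets the pixel range with the sizes argument and marker sizemode. The range c(10, 40) is chosen only to mirror the values in df$size.

```r
library(plotly)
set.seed(1)
df <- data.frame(x = 1:10,
                 y = rep(c("id1", "id2"), 5),
                 col = factor(sample(3, 10, replace = TRUE)))
df$size <- c(40, 40, 40, 30, 30, 30, 20, 20, 20, 10)

# Map size at the top level so it is split together with color,
# and ask plotly to treat the scaled values as marker diameters
p <- plot_ly(df,
             x = ~x,
             y = ~y,
             color = ~col,
             size = ~size,
             sizes = c(10, 40),
             type = 'scatter',
             mode = 'markers',
             marker = list(sizemode = 'diameter'),
             hoverinfo = "text",
             text = ~paste('size:', size))
p
```

Hovering over the points is a quick way to check that the largest markers carry the largest size values.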
I have a df which can have 2 or more columns; the first one, month, is always fixed. I am trying to plot them using plotly in R. As of now it has three columns: month, apple, orange. Depending on the analysis it can have another column, banana. Below is the code I am using right now, but it also plots the month column on the y-axis. How do I fix this?
> sample_test
month apple orange
2 Aug-17 2 1
3 Dec-17 2 1
4 Feb-18 2 1
5 Jan-18 2 1
6 Jul-17 2 1
7 Jun-17 2 1
8 May-17 2 1
9 Nov-17 2 1
10 Oct-17 2 1
11 Sep-17 2 1
p<- plot_ly(sample_test, x = sample_test$month, name = 'alpha', type = 'scatter', mode = 'lines',
line = list(color = 'rgb(24, 205, 12)', width = 4)) %>%
layout(#title = "abbb",
xaxis = list(title = "Time"),
yaxis = list (title = "Percentage"))
for(trace in colnames(sample_test)){
p <- p %>% plotly::add_trace(y = as.formula(paste0("~`", trace, "`")), name = trace)
}
p
The output looks like this :
Does this help?
sample_test <- read.table(
text = ' month apple orange
2 Aug-17 2 1
3 Dec-17 2 1
4 Feb-18 2 1
5 Jan-18 2 1
6 Jul-17 2 1
7 Jun-17 2 1
8 May-17 2 1
9 Nov-17 2 1
10 Oct-17 2 1
11 Sep-17 2 1'
)
sample_test$month <- as.Date(paste('01', sample_test$month, sep = '-'), format = '%d-%b-%y')
library(plotly)
p <- plot_ly(sample_test, type = 'scatter', mode = 'lines',
line = list(color = 'rgb(24, 205, 12)', width = 4)) %>%
layout(#title = "abbb",
xaxis = list(title = "Time"),
yaxis = list (title = "Percentage", tickformat = '%'))
for(trace in colnames(sample_test)[2:ncol(sample_test)]){
p <- p %>% plotly::add_trace(x = sample_test[['month']], y = sample_test[[trace]], name = trace)
}
p
There are a couple of things to note here:
When dealing with dates, it's best to store them as dates. This can save a lot of headaches later on. It is also useful because most, if not all, functions that deal with dates have methods built to handle them.
When adding traces in a for loop, always reference the vector to be plotted explicitly, like data$vector or data[['vector']], and not like y = ~vector. Otherwise plotly can end up plotting just one trace over and over again, most likely because formulas are evaluated lazily, after the loop variable has already changed.
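The "same trace over and over" effect can be reproduced in base R without plotly. This is only an illustration of lazy formula evaluation, which is assumed to be the mechanism at work here. The names formulas and late are made up for the example.

```r
sample_test <- data.frame(apple = c(2, 2, 2), orange = c(1, 1, 1))

formulas <- list()
for (trace in c("apple", "orange")) {
  # The formula stores the expression sample_test[[trace]], not its current value
  formulas[[trace]] <- ~ sample_test[[trace]]
}

# Evaluate after the loop has finished: `trace` is now "orange" for both formulas
late <- vapply(formulas,
               function(f) sum(eval(f[[2]], environment(f))),
               numeric(1))
late
```

Both entries come out as the sum of the orange column, which is why passing the vector itself (sample_test[[trace]]) inside the loop avoids the problem.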
You can specify a trace for the first y element, which will give you your raw counts. Next you can add a format for your y-axis using tickformat, which will convert to percentages.
sample_test <- data.frame(month = c("Aug-17", "Dec-17", "Feb-18"), apple = c(2,2,2), orange = c(1,1,1))
p <- plot_ly(sample_test, x = sample_test$month, y = ~apple, name = 'alpha', type = 'scatter', mode = 'lines',
line = list(color = 'rgb(24, 205, 12)', width = 4)) %>%
layout(xaxis = list(title = "Time")) %>%
layout(yaxis = list(tickformat = "%", title = "Percentage"))
However, this appears to just multiply by 100 and add a % label, rather than actually calculate a percentage. From this SO answer, it looks like that's all it does. I don't really use plotly, but in ggplot you can do this if you reshape your data to long and map your categorical variable (in this case fruit) as a percent.
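Since tickformat = "%" only rescales the labels, one option is to convert the counts to proportions before plotting. The sketch below assumes "percentage" means each fruit's share of the row total, which the question does not confirm. The names fruit_cols and shares are made up for the example.

```r
sample_test <- data.frame(month = c("Aug-17", "Dec-17", "Feb-18"),
                          apple = c(2, 2, 2), orange = c(1, 1, 1))

# Every column except month is treated as a series to plot
fruit_cols <- setdiff(names(sample_test), "month")

# Divide each fruit column by the row total so every row sums to 1
shares <- sample_test
shares[fruit_cols] <- sample_test[fruit_cols] / rowSums(sample_test[fruit_cols])
shares
```

Plotting shares instead of sample_test with tickformat = "%" would then label 2/3 as roughly 67% rather than turning the raw count 2 into 200%.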
Edit: per OP's comment, removed month from being traced.
p <- plot_ly(type = 'scatter', mode = 'lines') %>%
layout(yaxis = list(tickformat = "%", title = "Percentage"))
# setdiff() also behaves correctly if there is no 'month' column
colNames <- setdiff(names(sample_test), 'month')
for(trace in colNames){
p <- p %>% plotly::add_trace(data = sample_test, x = ~ month, y = as.formula(paste0("~`", trace, "`")), name = trace)
print(paste0("~`", trace, "`"))
}
p
I have some authors with their city or country of affiliation. I would like to know if it is possible to plot the coauthors' networks (figure 1), on the map, having the coordinates of the countries. Please consider multiple authors from the same country. [EDIT: Several networks could be generated as in the example and should not show avoidable overlaps]. This is intended for dozens of authors. A zooming option is desirable. Bounty promise +100 for future better answer.
library(RefManageR) # as.BibEntry()
library(network)    # network(), %v%
library(GGally)     # ggnet2()
refs5 <- read.table(text="
row bibtype year volume number pages title journal author
Bennett_1995 article 1995 76 <NA> 113--176 angiosperms. \"Annals of Botany\" \"Bennett Md, Leitch Ij\"
Bennett_1997 article 1997 80 2 169--196 estimates. \"Annals of Botany\" \"Bennett MD, Leitch IJ\"
Bennett_1998 article 1998 82 SUPPL.A 121--134 weeds. \"Annals of Botany\" \"Bennett MD, Leitch IJ, Hanson L\"
Bennett_2000 article 2000 82 SUPPL.A 121--134 weeds. \"Annals of Botany\" \"Bennett MD, Someone IJ\"
Leitch_2001 article 2001 83 SUPPL.A 121--134 weeds. \"Annals of Botany\" \"Leitch IJ, Someone IJ\"
New_2002 article 2002 84 SUPPL.A 121--134 weeds. \"Annals of Botany\" \"New IJ, Else IJ\"" , header=TRUE,stringsAsFactors=FALSE)
rownames(refs5) <- refs5[,1]
refs5<-refs5[,2:9]
citations <- as.BibEntry(refs5)
authorsl <- lapply(citations, function(x) as.character(toupper(x$author)))
unique.authorsl<-unique(unlist(authorsl))
coauth.table <- matrix(nrow=length(unique.authorsl),
ncol = length(unique.authorsl),
dimnames = list(unique.authorsl, unique.authorsl), 0)
for(i in 1:length(citations)){
paper.auth <- unlist(authorsl[[i]])
coauth.table[paper.auth,paper.auth] <- coauth.table[paper.auth,paper.auth] + 1
}
coauth.table <- coauth.table[rowSums(coauth.table)>0, colSums(coauth.table)>0]
diag(coauth.table) <- 0
coauthors<-coauth.table
bip = network(coauthors,
matrix.type = "adjacency",
ignore.eval = FALSE,
names.eval = "weights")
authorcountry <- read.table(text="
author country
1 \"LEITCH IJ\" Argentina
2 \"HANSON L\" USA
3 \"BENNETT MD\" Brazil
4 \"SOMEONE IJ\" Brazil
5 \"NEW IJ\" Brazil
6 \"ELSE IJ\" Brazil",header=TRUE,fill=TRUE,stringsAsFactors=FALSE)
matched<- authorcountry$country[match(unique.authorsl, authorcountry$author)]
bip %v% "Country" = matched
colorsmanual<-c("red","darkgray","gainsboro")
names(colorsmanual) <- unique(matched)
gdata<- ggnet2(bip, color = "Country", palette = colorsmanual, legend.position = "right",label = TRUE,
alpha = 0.9, label.size = 3, edge.size="weights",
size="degree", size.legend="Degree Centrality") + theme(legend.box = "horizontal")
gdata
In other words, adding the names of authors, lines and bubbles to the map. Note that several authors may be from the same city or country and should not overlap.
Figure 1 Network
EDIT: The current JanLauGe answer overlaps two non-related networks. authors "ELSE" and "NEW" need to be apart from others as in figure 1.
Are you looking for a solution using exactly the packages you used, or would you be happy to use a suite of other packages? Below is my approach, in which I extract the graph properties from the network object and plot them on a map using the ggplot2 and maps packages.
First I recreate the example data you gave.
library(tidyverse)
library(sna)    # also attaches network, which provides network()
library(GGally) # provides ggnet2()
library(maps)
library(ggrepel)
set.seed(1)
coauthors <- matrix(
c(0,3,1,1,3,0,1,0,1,1,0,0,1,0,0,0),
nrow = 4, ncol = 4,
dimnames = list(c('BENNETT MD', 'LEITCH IJ', 'HANSON L', 'SOMEONE ELSE'),
c('BENNETT MD', 'LEITCH IJ', 'HANSON L', 'SOMEONE ELSE')))
coords <- tibble(
country = c('Argentina', 'Brazil', 'USA'),
coord_lon = c(-63.61667, -51.92528, -95.71289),
coord_lat = c(-38.41610, -14.23500, 37.09024))
authorcountry <- tibble(
author = c('LEITCH IJ', 'HANSON L', 'BENNETT MD', 'SOMEONE ELSE'),
country = c('Argentina', 'USA', 'Brazil', 'Brazil'))
Now I generate the graph object using the network function (from the network package, which is attached along with sna):
# Generate network
bip <- network(coauthors,
matrix.type = "adjacency",
ignore.eval = FALSE,
names.eval = "weights")
# Graph with ggnet2 for centrality
gdata <- ggnet2(bip, color = "Country", legend.position = "right",label = TRUE,
alpha = 0.9, label.size = 3, edge.size="weights",
size="degree", size.legend="Degree Centrality") + theme(legend.box = "horizontal")
From the network object we can extract the values of each edge, and from the ggnet2 object we can get degree of centrality for nodes as below:
# Combine data
authors <-
# Get author numbers
tibble(
id = seq(1, nrow(coauthors)),
author = sapply(bip$val, function(x) x$vertex.names)) %>%
left_join(
authorcountry,
by = 'author') %>%
left_join(
coords,
by = 'country') %>%
# Jittering points to avoid overlap between two authors
mutate(
coord_lon = jitter(coord_lon, factor = 1),
coord_lat = jitter(coord_lat, factor = 1))
# Get edges from network
networkdata <- sapply(bip$mel, function(x)
c('id_inl' = x$inl, 'id_outl' = x$outl, 'weight' = x$atl$weights)) %>%
t %>% as_tibble
dt <- networkdata %>%
left_join(authors, by = c('id_inl' = 'id')) %>%
left_join(authors, by = c('id_outl' = 'id'), suffix = c('.from', '.to')) %>%
left_join(gdata$data %>% select(label, size), by = c('author.from' = 'label')) %>%
mutate(edge_id = seq(1, nrow(.)),
from_author = author.from,
from_coord_lon = coord_lon.from,
from_coord_lat = coord_lat.from,
from_country = country.from,
from_size = size,
to_author = author.to,
to_coord_lon = coord_lon.to,
to_coord_lat = coord_lat.to,
to_country = country.to) %>%
select(edge_id, starts_with('from'), starts_with('to'), weight)
Should look like this now:
dt
# A tibble: 8 × 11
edge_id from_author from_coord_lon from_coord_lat from_country from_size to_author to_coord_lon
<int> <chr> <dbl> <dbl> <chr> <dbl> <chr> <dbl>
1 1 BENNETT MD -51.12756 -16.992729 Brazil 6 LEITCH IJ -65.02949
2 2 BENNETT MD -51.12756 -16.992729 Brazil 6 HANSON L -96.37907
3 3 BENNETT MD -51.12756 -16.992729 Brazil 6 SOMEONE ELSE -52.54160
4 4 LEITCH IJ -65.02949 -35.214117 Argentina 4 BENNETT MD -51.12756
5 5 LEITCH IJ -65.02949 -35.214117 Argentina 4 HANSON L -96.37907
6 6 HANSON L -96.37907 36.252312 USA 4 BENNETT MD -51.12756
7 7 HANSON L -96.37907 36.252312 USA 4 LEITCH IJ -65.02949
8 8 SOMEONE ELSE -52.54160 -9.551913 Brazil 2 BENNETT MD -51.12756
# ... with 3 more variables: to_coord_lat <dbl>, to_country <chr>, weight <dbl>
Now moving on to plotting this data on a map:
world_map <- map_data('world')
myMap <- ggplot() +
# Plot map
geom_map(data = world_map, map = world_map, aes(map_id = region),
color = 'gray85',
fill = 'gray93') +
xlim(c(-120, -20)) + ylim(c(-50, 50)) +
# Plot edges
geom_segment(data = dt,
alpha = 0.5,
color = "dodgerblue1",
aes(x = from_coord_lon, y = from_coord_lat,
xend = to_coord_lon, yend = to_coord_lat,
size = weight)) +
scale_size(range = c(1,3)) +
# Plot nodes
geom_point(data = dt,
aes(x = from_coord_lon,
y = from_coord_lat,
size = from_size,
colour = from_country)) +
# Plot names
geom_text_repel(data = dt %>%
select(from_author,
from_coord_lon,
from_coord_lat) %>%
unique,
colour = 'dodgerblue1',
aes(x = from_coord_lon, y = from_coord_lat, label = from_author)) +
coord_equal() +
theme_bw()
Obviously you can change the colour and design in the usual way with ggplot2 grammar. Notice that you could also use geom_curve and the arrow aesthetic to get a plot similar to the one in the uber post linked in the comments above.
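A minimal sketch of the geom_curve variant mentioned above. The edges data frame is a hypothetical stand-in for dt, using the country centroid coordinates from this answer. In geom_curve, arrow is a parameter (built with arrow()) rather than an aesthetic.

```r
library(ggplot2)

# Hypothetical edge list; coordinates are the country centroids used above
edges <- data.frame(
  from_lon = c(-51.92528, -51.92528),
  from_lat = c(-14.23500, -14.23500),
  to_lon   = c(-63.61667, -95.71289),
  to_lat   = c(-38.41610,  37.09024)
)

p <- ggplot(edges) +
  # Curved edges with a small arrow head at the `xend`/`yend` end
  geom_curve(aes(x = from_lon, y = from_lat, xend = to_lon, yend = to_lat),
             curvature = 0.3,
             arrow = arrow(length = unit(0.02, "npc")),
             colour = "dodgerblue1", alpha = 0.5) +
  theme_bw()
p
```

In the full map, this layer would take the place of the geom_segment call, with dt and its from_/to_ columns as the data.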
In an effort to avoid overlapping the 2 networks, I came up with this modification of the x and y coordinates of the ggplot layout, which by default does not overlap the networks (see figure 1 in the question).
# get centroid positions for countries
# add coordinates to authorcountry table
# download and unzip
# https://worldmap.harvard.edu/data/geonode:country_centroids_az8
setwd("~/country_centroids_az8")
library(rgdal)
cent <- readOGR('.', "country_centroids_az8", stringsAsFactors = F)
countrycentdf<-cent@data[,c("name","Longitude","Latitude")]
countrycentdf$name[which(countrycentdf$name=="United States")]<-"USA"
colnames(countrycentdf)[names(countrycentdf)=="name"]<-"country"
authorcountry$Longitude<-countrycentdf$Longitude[match(authorcountry$country,countrycentdf$country)]
authorcountry$Latitude <-countrycentdf$Latitude [match(authorcountry$country,countrycentdf$country)]
# original coordinates of the plot and their transformation
ggnetbuild<-ggplot_build(gdata)
allcoord<-ggnetbuild$data[[3]][,c("x","y","label")]
allcoord$Latitude<-authorcountry$Latitude [match(allcoord$label,authorcountry$author)]
allcoord$Longitude<-authorcountry$Longitude [match(allcoord$label,authorcountry$author)]
allcoord$country<-authorcountry$country [match(allcoord$label,authorcountry$author)]
# scale the layout coordinates by a factor to increase the distance among dots
factor<-7
allcoord$coord_lat<-allcoord$y*factor+allcoord$Latitude
allcoord$coord_lon<-allcoord$x*factor+allcoord$Longitude
allcoord$author<-allcoord$label
# plot as in answer of JanLauGe, without jitter
library(tidyverse)
library(ggrepel)
authors <-
# Get author numbers
tibble(
id = seq(1, nrow(coauthors)),
author = sapply(bip$val, function(x) x$vertex.names)) %>%
left_join(
allcoord,
by = 'author')
# Continue as in answer of JanLauGe
networkdata <- ##
dt <- ##
world_map <- map_data('world')
myMap <- ##
myMap