Wrap Axis Labels in Correlation Matrix - r

I'm attempting to use the ggcorr() function within library(GGally) to create a correlation matrix. The package is working as it is supposed to, but I'm running into an issue where I would like to edit how the axis labels appear on the plot.
Currently, they will automatically add a _ or . to separate names with spaces or other characters between them. Ideally, I would like to create a line break (\n) between spaces in names so that long names and short names can be easily read and don't extend much further beyond the appropriate column and row.
I have found solutions that others have used on SO, including using str_wrap(), but it was within a ggplot() call, not this specific package. I have inspected the R code for the package, but couldn't find where to edit these labels specifically. Whenever I attempt to edit X or Y axis text, it adds an entirely new axis and set of labels.
I currently dcast() a data frame into the resulting data frame and even when I gsub() "\n" into the player names column, they get lost in the dcast() transition.
Here is an example of what I am working with. I would like to be able to automatically create line breaks between first and last name of the labels.
library(GGally)
library(ggplot2)
test <- structure(list(Date = structure(c(17100, 17102, 17103, 17106,
17107), class = "Date"), `Alexis Ajinca` = c(1.2, NA, 9.2, 6.4,
NA), `Anthony Davis` = c(95.7, 76.9, 29, 67, 24.9), `Buddy Hield` = c(9.7,
4.7, 17, 8, 28.3), `Cheick Diallo` = c(NA, NA, 3.2, NA, NA),
`Dante Cunningham` = c(0.5, 27.6, 14, 13.5, -1), `E'Twaun Moore` = c(19.2,
16.1, 22, 20.5, 10.1), `Lance Stephenson` = c(16.1, 31.6,
8, 8.1, 34.8), `Langston Galloway` = c(10.9, 2, 13.8, 2.2,
29.4), `Omer Asik` = c(4.7, 6.6, 9.9, 15.9, 14.2), `Solomon Hill` = c(4.7,
13.2, 12.8, 35.2, 4.4), `Terrence Jones` = c(17.1, 12.4,
9.8, NA, 20.8), `Tim Frazier` = c(40.5, 40.2, 18.3, 44.1,
7.2)), .Names = c("Date", "Alexis Ajinca", "Anthony Davis",
"Buddy Hield", "Cheick Diallo", "Dante Cunningham", "E'Twaun Moore",
"Lance Stephenson", "Langston Galloway", "Omer Asik", "Solomon Hill",
"Terrence Jones", "Tim Frazier"), row.names = c(NA, -5L), class = "data.frame")
ggc <- ggcorr(test[,-1], method = c("pairwise","pearson"),
hjust = .85, size = 3,
layout.exp=2)
ggc
Thank you for any and all help and please, let me know if you have any questions or need any clarification!

A couple of approaches
You can edit the object returned by ggcorr
g = ggplot_build(ggc)
g$data[[2]]$label = gsub("_", "\n", g$data[[2]]$label )
grid::grid.draw(ggplot_gtable(g))
Or you can create a new data frame and add the labels manually using geom_text. This probably gives a bit more control over the text justification and placement.
# I don't see how to suppress the labels, so just set the size to zero
ggc <- ggcorr(test[,-1], method = c("pairwise","pearson"),
hjust = .85,
size = 0, # set this to zero
layout.exp=2)
# Create labels and plot
dat <- data.frame(x = seq(test[-1]), y = seq(test[-1]),
lbs = gsub(" ", "\n", names(test[-1]) ))
ggc + geom_text(data=dat, aes(x, y, label=lbs), nudge_x = 2, hjust=1)
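The label-building step of the second approach can be tried on its own, without GGally. The sketch below is only an illustration: player_names is a made-up subset of the column names, the width of 10 is an assumed value, and strwrap() is a base R alternative that breaks only names too long to fit on one line.

```r
# Hypothetical subset of the column names used above
player_names <- c("Alexis Ajinca", "E'Twaun Moore", "Lance Stephenson")

# Replace every space with a line break, as in the geom_text() approach
labels_split <- gsub(" ", "\n", player_names, fixed = TRUE)

# Alternative: wrap at an assumed target width of 10 characters
wrap_label <- function(x, width = 10) {
  paste(strwrap(x, width = width), collapse = "\n")
}
labels_wrapped <- vapply(player_names, wrap_label, character(1), USE.NAMES = FALSE)

# Show the first label with its line break
cat(labels_split[1], "\n")
```

Either vector could be passed as the lbs column of the data frame given to geom_text().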

Related

Is there a way I can move my first column in this excel dataset to be the column that specifies the numbers 1 to 8 [duplicate]

I'd like to replace the first data column, named "Especies", and the species names in it (i.e. "Strix_varia", "Strix_rufipes", ...) with the numbers 1 to 8, as circled in red in the linked image.
I'm working with Moran's I, and having the column "Especies" as data gives me incorrect results.
Any help will be great!
Thanks!
Here's my dput():
structure(list(Especies = c("Strix_varia", "Strix_rufipes", "Strix_occidentalis",
"Strix_aluco", "Strix_uralensis", "Strix_woodfordii", "Strix_leptogrammica",
"Strix_nebulosa"), Notas.segundo = c(2.9, 4.3, 2.9, 1.3, 1, 3,
3.1, 1.1), Notas.llamado = c(6.3, 13.5, 12.2, 5, 3, 6, 4, 9.3
), Duracion.llamado = c(2.9, 2.9, 5.3, 4, 4.5, 1.6, 1.5, 7.3),
Frecuencia.minima = c(149.4, 157.4, 167, 314.7, 75.3, 149.3,
212.2, 147.5), Frecuencia.maxima = c(518.6, 564.8, 594.3,
846.2, 394.9, 438.4, 396.8, 263.8), Ancho.banda = c(369.1,
407.3, 427.2, 531.5, 319.6, 289, 184.6, 116.3), Frecuencia.central = c(522.1,
551.8, 589.9, 844, 385.9, 429, 374.9, 255.2)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -8L))
Assuming your table has one row per species and no species repeats, a simple data$Especies = seq_along(data$Especies) will do the job. I would suggest keeping the original column so you remember which code belongs to which species, for example by storing the codes in a new column with data$id = seq_along(data$Especies).
data = structure(list(Especies = c("Strix_varia", "Strix_rufipes", "Strix_occidentalis",
"Strix_aluco", "Strix_uralensis", "Strix_woodfordii", "Strix_leptogrammica",
"Strix_nebulosa"), Notas.segundo = c(2.9, 4.3, 2.9, 1.3, 1, 3,
3.1, 1.1), Notas.llamado = c(6.3, 13.5, 12.2, 5, 3, 6, 4, 9.3
), Duracion.llamado = c(2.9, 2.9, 5.3, 4, 4.5, 1.6, 1.5, 7.3),
Frecuencia.minima = c(149.4, 157.4, 167, 314.7, 75.3, 149.3,
212.2, 147.5), Frecuencia.maxima = c(518.6, 564.8, 594.3,
846.2, 394.9, 438.4, 396.8, 263.8), Ancho.banda = c(369.1,
407.3, 427.2, 531.5, 319.6, 289, 184.6, 116.3), Frecuencia.central = c(522.1,
551.8, 589.9, 844, 385.9, 429, 374.9, 255.2)), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -8L))
# If you don't want to overwrite your data:
data$id = seq_along(data$Especies)
# If you are OK with overwriting your data
data$Especies = seq_along(data$Especies)
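If the species names are not unique and can repeat, seq_along() gives every row its own id, so two rows with the same species will have different ids. If that is not what you want, you can use factor(), which assigns ids following the sorted order of the names: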
data$id = as.numeric(factor(data$Especies))
Alternatively, you can create an encoding of your own by creating a named vector and use it to translate species names to id:
# Avoid calling the vector `names`, which would shadow the base function
species = unique(data$Especies)
coding = seq_along(species)
names(coding) = species
data$id = coding[data$Especies]
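The difference between the three encodings only shows up when a species repeats. Here is a minimal sketch on a made-up vector in which one species appears twice; the variable names are illustrative, and match() against unique() is used as a compact equivalent of the named-vector lookup.

```r
# Made-up example in which one species appears twice
species <- c("Strix_varia", "Strix_aluco", "Strix_varia", "Strix_nebulosa")

id_by_row    <- seq_along(species)               # one id per row
id_by_factor <- as.numeric(factor(species))      # ids follow the sorted names
id_by_first  <- match(species, unique(species))  # ids follow first appearance

print(data.frame(species, id_by_row, id_by_factor, id_by_first))
```

With seq_along() the two "Strix_varia" rows get different ids, while the other two approaches give them the same id and differ only in which number each species receives.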

parameters of histogram with R

First, I want the x-axis (abscissa) to display decimal numbers (for example 1.5, 2.6, ...), but when I draw the histogram with my code, the x-axis automatically displays whole numbers, as you can see in the following picture (I have circled in red what I would like to change):
How can I change the parameters so that these whole numbers become decimals?
Secondly, I would like the numbers that appear on the x-axis to correspond exactly to my breaks vector.
Could someone please help me?
Here is my code:
my_data <- transform(my_data, new = as.numeric(new/1000000))
sal_hist_default = hist(my_data$new, breaks = c(1,6.3,11.6,16.9,22.2,27.5), col = "blue", border = "black", las = 1, include.lowest=TRUE,right=FALSE, main="Salary Of best category", xlab = "salaries", ylab = "num of players",xlim = c(1,27.5), ylim = c(0,600))
You should really provide sample data, but try this:
set.seed(42)
new <- rnorm(1000, 14, 3.5)
my_data <- data.frame(new)
sal_hist_default = hist(my_data$new, breaks = c(1, 6.3, 11.6, 16.9, 22.2, 27.5), col = "blue",
border = "black", las = 1, include.lowest=TRUE,right=FALSE, main="Salary Of best category",
xlab = "salaries", ylab = "num of players",xlim = c(1,27.5), ylim = c(0,600), xaxt="n")
axis(1, c(1, 6.3, 11.6, 16.9, 22.2, 27.5), c(1, 6.3, 11.6, 16.9, 22.2, 27.5))
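The call above works because xaxt = "n" suppresses the default axis and axis() then draws ticks at the chosen positions. A slightly tidier sketch keeps the breaks in one vector so the bins and the tick labels cannot drift apart. The simulated salaries and the names brks and tick_labels are assumptions made for illustration.

```r
set.seed(1)
salaries <- runif(200, min = 1, max = 27.5)  # simulated data, not the real salaries
brks <- c(1, 6.3, 11.6, 16.9, 22.2, 27.5)    # single source of truth for the bins

h <- hist(salaries, breaks = brks, right = FALSE, include.lowest = TRUE,
          col = "blue", las = 1, xaxt = "n", # xaxt = "n" drops the default x-axis
          main = "Salary of best category",
          xlab = "salaries", ylab = "num of players")

# One tick per break, labelled with one decimal place
tick_labels <- sprintf("%.1f", brks)
axis(1, at = brks, labels = tick_labels)
```

Using sprintf() forces a decimal on every label, so the first tick reads 1.0 rather than 1.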

Reposition colorbar in plotly and change color scale (R)

I'm creating a scattermapbox plot using plotly and would like to:
Make the colorbar horizontal and position it on top of my plot
Change the colorscale to shades of red and rename it to 'Elevation'
I've tried to follow the examples online but it doesn't work:
I added legend = list(orientation = 'h') to the layout but it made no difference. (2nd example here: https://plotly.com/r/scattermapbox/)
Added coloraxis = list(colorscale = "Reds") to the layout following the (first) example on https://plotly.com/r/scattermapbox/ without success
Other examples show it is possible to rename the colorbar by adding a named list to marker by marker = list(..., colorbar = list(title = 'My title')). There's no parameter named colorbar in the reference documentation. (example here: https://plotly.com/r/colorscales/)
On a side note, I have a mapbox token, yet, the plot appears to work with a few styles only.
Data
Sharing 50 uniformly spaced points from the entire dataset (6000 points):
structure(list(lat = c(45.547955, 45.549801, 45.551437, 45.554653,
45.559059, 45.560158, 45.563854, 45.567379, 45.5715069, 45.575817,
45.579056, 45.582672, 45.586857, 45.591194, 45.59362, 45.597103,
45.601231, 45.605034, 45.608997, 45.611233, 45.615376, 45.61932,
45.622749, 45.625629, 45.628456, 45.631489, 45.631611, 45.632305,
45.630135, 45.626793, 45.623497, 45.620045, 45.615589, 45.610992,
45.606541, 45.602821, 45.599106, 45.595169, 45.591198, 45.5872079,
45.582672, 45.578587, 45.57476, 45.570515, 45.565872, 45.56226,
45.55862, 45.555603, 45.552097, 45.548283), lon = c(-73.666939,
-73.668861, -73.674332, -73.673698, -73.672272, -73.66761, -73.664688,
-73.661179, -73.660103, -73.658028, -73.657333, -73.654381, -73.652786,
-73.651154, -73.648354, -73.644836, -73.64328, -73.641556, -73.63961,
-73.637321, -73.635498, -73.632965, -73.629128, -73.624321, -73.620491,
-73.615967, -73.613396, -73.61422, -73.618103, -73.622635, -73.627571,
-73.632332, -73.635414, -73.637466, -73.638702, -73.640244, -73.643547,
-73.646996, -73.649826, -73.651886, -73.652626, -73.6558, -73.6588359,
-73.660416, -73.662086, -73.665947, -73.668335, -73.671501, -73.666359,
-73.667671), ele = c(30.2, 27, 26.6, 25.7999999, 23.2, 26.7999999,
20, 24, 22.7999999, 20.6, 19, 22.2, 19.2, 17.3999999, 25.2, 25.2,
17, 16.6, 15.3999999, 17, 15.2, 15.6, 16.3999999, 16.7999999,
17.7999999, 17.6, 24.3999999, 18.2, 18.6, 18.6, 19.2, 17, 17.2,
18.7999999, 23.7999999, 27.7999999, 27.7999999, 26, 30.7999999,
27.2, 29.2, 24, 23.7999999, 26.6, 24.7999999, 26.2, 31, 31, 31.2,
32.6)), row.names = c(NA, -50L), class = c("data.table", "data.frame"
))
Code
plt <- plot_mapbox(data = track, mode = 'scattermapbox',
lat = ~lat, lon = ~lon, color = ~ele) %>%
layout(mapbox = list(style = 'carto-positron', zoom = 10, # open-street-map works, but light, dark, basic don't work
center = list(lon = track[, mean(lon)], lat = track[, mean(lat)])),
margin = list(l = 0, r = 0, t = 0, b = 0),
legend = list(orientation = 'h'))
Pipe the plot into colorbar() (plt %>% colorbar(...)) and pass it the parameters listed at https://plotly.com/r/reference/#scatter-marker-colorbar, for example title to rename the colorbar.
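As a rough sketch of that suggestion, the example below titles, shortens and repositions the colorbar with colorbar(), and switches to shades of red through the colors argument. It is drawn as an ordinary scatter trace with plot_ly() so that it runs without a Mapbox token, which is an assumption made for illustration; the same colorbar() call should be usable after plot_mapbox(). Whether the colorbar can be drawn horizontally depends on the plotly.js version bundled with the installed package, so that part is left out here.

```r
library(plotly)

# A few points taken from the data above
track <- data.frame(
  lat = c(45.547955, 45.559059, 45.5715069, 45.586857),
  lon = c(-73.666939, -73.672272, -73.660103, -73.652786),
  ele = c(30.2, 23.2, 22.7999999, 19.2)
)

plt <- plot_ly(track, x = ~lon, y = ~lat, color = ~ele,
               colors = "Reds",             # ColorBrewer palette name
               type = "scatter", mode = "markers") %>%
  colorbar(title = "Elevation",             # rename the colorbar
           len = 0.5,                       # half the plot height
           y = 1, yanchor = "top")          # anchor its top edge at the top of the plot
```

The x, y, xanchor and yanchor values are in paper coordinates, so they would need adjusting to the margins of the real map.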

Vertical gradient color with geom_area [duplicate]

I am having a hard time finding a solution for creating a gradient color.
This is how it should look (don't mind the blue bars):
It is something similar to How to make gradient color filled timeseries plot in R, but that example is a bit too advanced for me to reuse. I don't have any negative values and the max is 80. I tried the answer offered by nograpes; my PC froze for some 6-7 minutes and then I got this message:
Error in rowSums(na) :
'Calloc' could not allocate memory (172440001 of 16 bytes)
This is only a subset of data with 841 rows (some containing NAs), and solution in previous answer could hardly work for me.
df <- structure(list(date = structure(c(1497178800, 1497182400, 1497186000,
1497189600, 1497193200, 1497196800, 1497200400, 1497204000, 1497207600,
1497211200, 1497214800, 1497218400, 1497222000, 1497225600, 1497229200,
1497232800, 1497236400, 1497240000, 1497243600, 1497247200, 1497250800,
1497254400, 1497258000, 1497261600, 1497265200, 1497268800, 1497272400,
1497276000, 1497279600, 1497283200, 1497286800, 1497290400, 1497294000,
1497297600, 1497301200, 1497304800, 1497308400, 1497312000, 1497315600,
1497319200, 1497322800, 1497326400, 1497330000, 1497333600, 1497337200,
1497340800, 1497344400, 1497348000, 1497351600, 1497355200), class = c("POSIXct",
"POSIXt"), tzone = "UTC"), dk_infpressure = c(22, 21.6, 21.2,
20.9, 20.5, 20.1, 19.8, 19.4, 19, 18.6, 18.2, 17.9, 17.5, 17.1,
16.8, 16.4, 16, 15.6, 15.2, 14.9, 14.5, 14.1, 13.8, 13.4, 13,
12.5, 11.9, 11.4, 10.8, 10.3, 9.8, 9.2, 8.7, 8.1, 7.6, 7, 6.5,
6, 5.4, 4.9, 4.3, 3.8, 3.2, 2.7, 2.2, 1.6, 1.1, 0.5, 0, 0)), .Names = c("date",
"dk_infpressure"), row.names = c(NA, -50L), class = c("tbl_df",
"tbl", "data.frame"))
Code to get basic plot:
ggplot()+
geom_area(data=df, aes(x = date, y= dk_infpressure ) )+
scale_y_continuous(limits = c(0, 80))
Because geom_area can't take a gradient fill, it's a somewhat hard problem.
Here's a decidedly hacky but possibly sufficient option that makes a raster (using geom_tile, since the x and y sizes differ) and covers the ragged edges with cropping and ggforce::geom_link (a version of geom_segment that can plot a gradient):
library(tidyverse)
df %>%
  mutate(dk_infpressure = map(dk_infpressure, ~seq(0, .x, .05))) %>% # make grid of points
  unnest(dk_infpressure) %>% # recent tidyr versions expect the column to be named
  ggplot(aes(date, dk_infpressure, fill = dk_infpressure)) +
  geom_tile(width = 3600, height = 0.05) +
  # hide square tops
  ggforce::geom_link(aes(color = dk_infpressure, xend = lag(date), yend = lag(dk_infpressure)),
                     data = df, size = 2.5, show.legend = FALSE) +
  scale_x_datetime(expand = c(0, 0)) + # hide overplotting of line
  scale_y_continuous(expand = c(0, 0))
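The "make grid of points" step is what turns one row per time point into a stack of small tiles. A base R sketch of the same expansion on a tiny made-up series may make it easier to follow; toy, step and grid are illustrative names, and the step of 0.5 is chosen only to keep the output small.

```r
# Tiny made-up series: two time points with heights 1 and 2
toy <- data.frame(x = c(1, 2), y = c(1, 2))
step <- 0.5  # assumed tile height; the answer above uses 0.05

# For every row, generate the heights 0, step, 2 * step, ... up to y
grid_list <- lapply(seq_len(nrow(toy)), function(i) {
  data.frame(x = toy$x[i], y = seq(0, toy$y[i], by = step))
})
grid <- do.call(rbind, grid_list)

nrow(grid)  # 3 tiles for the first point plus 5 for the second
```

The number of tiles grows with height divided by step, so with values up to 80 and a step of 0.05 a single time point can expand to roughly 1,600 rows; a coarser step keeps the plot lighter.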

Title in R corrplot not centred and too high

I am using corrplot to visualise correlations, however the title is quite high above the plot, and I would like to bring it closer. How do I do this?
Sample dataframe:
"VADeaths" <-
structure(c(11.7, 18.1, 26.9, 41, 66, 8.7, 11.7, 20.3, 30.9, 54.3, 15.4,
24.3, 37, 54.6, 71.1, 8.4, 13.6, 19.3, 35.1, 50), .Dim = c(5, 4),
.Dimnames = list(c("50-54", "55-59", "60-64", "65-69", "70-74"),
c("Rural Male", "Rural Female", "Urban Male", "Urban Female")))
Calculate the correlation and visualise
library(corrplot)
cors = cor(VADeaths)
corrplot(cors,tl.col="black",title="Example Plot",mar=c(0,0,5,0),tl.offset = 1)
By extending the margin to 5 above the plot I can at least get the title to appear in the plot, but cannot figure out how to bring the title closer to the plot and centred over the plot rather than also the space taken up by the labels.
The above looks like this:
I am wanting something more like this (ignore the fonts)
My actual plots have much smaller labels, so there is a gap of around 3-4cm between the labels and the title. I did not find that increasing the value in mar solved the issue.
You could use mtext to add the title instead
corrplot(cors,tl.col="black", mar=c(0,0,5,0), tl.offset = 1)
mtext("Example Plot", at=2.5, line=-0.5, cex=2)
at controls the horizontal position, line controls the height, and cex sets the size. See ?mtext for more options.
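The value at = 2.5 is the midpoint of the four columns. A small sketch for computing it from the matrix, assuming corrplot places the cell centres at the integer x positions 1 to n (which is consistent with 2.5 for this 4 x 4 matrix); title_at is an illustrative name.

```r
cors <- cor(VADeaths)             # VADeaths ships with R's datasets package
title_at <- (ncol(cors) + 1) / 2  # midpoint of columns 1..n
title_at
```

The result could then be passed as mtext("Example Plot", at = title_at, ...) so the title stays centred when the number of variables changes.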
You can draw a correlation plot using ggplot2.
First convert the correlation data to a data frame.
library(reshape2)
cors <- cor(VADeaths)
cor_data <- reshape2::melt(
cors,
varnames = paste0("demographic", 1:2),
value.name = "correlation"
)
Then draw the plot.
library(ggplot2)
ggplot(cor_data, aes(demographic1, demographic2, fill = correlation)) +
geom_tile() +
ggtitle("Correlation across demographics for VA deaths")

Resources