Control axis labeling in R heatmap - r

I am trying to create a heatmap in R, but the axis labels (which use the row.names of the data frame that is passed to the heatmap function) are crowding the x-axis, and I can't figure out how to control the labeling.
Here is an example:
library(gtools)  # provides rdirichlet()
vDates = seq.Date(from = as.Date('29-11-2012',
format = '%d-%m-%Y'),
length.out = 203, by = 'day')
dfHeatMap = rdirichlet(length(vDates), runif(15))
row.names(dfHeatMap) = as.character(vDates)
heatmap(t(dfHeatMap), Rowv = NA, Colv = NA,
col = cm.colors(256))
Any suggestions/packages that take care of this issue?

I was able to figure this out by RTFM (more carefully). Initially I was not able to get the labCol and the labRow working. Here is a working example:
library(gtools)
library(ClassDiscovery)
# generate sequence of dates
vDates = seq.Date(from = as.Date('29-11-2012',
format = '%d-%m-%Y'),
length.out = 203, by = 'day')
# generate the random samples
dfHeatMap = as.matrix(rdirichlet(length(vDates), runif(15)))
row.names(dfHeatMap) = as.character(vDates)
# column labels
vDatesNew = rep(as.Date(NA), length(vDates))
vDatesNew[seq(from = 1, to = 203, by = 10)] =
vDates[seq(from = 1, to = 203, by = 10)]
# row labels
labRow = c(NA, NA, 3, NA, NA, 6, NA, NA, 9,
NA, NA, 12, NA, NA, 15)
# draw the heatmap with aspect control
aspectHeatmap(t(dfHeatMap), Rowv = NA, Colv = NA,
col = cm.colors(256), labCol = vDatesNew, labRow = labRow,
margins = c(5, 5), hExp = 1.5, wExp = 4)
I have used the ClassDiscovery package to control the aspect ratio of the heatmap. This is what it looks like:
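The label-thinning idea from the answer above can also be tried with the base heatmap() function alone. The following is only a minimal sketch under some assumptions: runif() stands in for rdirichlet() so that no extra package is needed, and unwanted labels are blanked with empty strings rather than NA. It does not control the aspect ratio the way aspectHeatmap() does.

```r
# Sketch: show only every 10th date label on the x-axis of a base heatmap.
set.seed(1)
vDates <- seq.Date(from = as.Date("2012-11-29"), length.out = 203, by = "day")

# Stand-in data (assumption): uniform random values instead of Dirichlet samples
m <- matrix(runif(length(vDates) * 15), nrow = length(vDates))
rownames(m) <- as.character(vDates)

# Keep every 10th label, blank out the rest
keep <- seq(from = 1, to = length(vDates), by = 10)
labCol <- rep("", length(vDates))
labCol[keep] <- as.character(vDates[keep])

heatmap(t(m), Rowv = NA, Colv = NA, col = cm.colors(256), labCol = labCol)
```

Because labCol has one entry per column, the step size in seq() is the only thing to change when a denser or sparser axis is wanted.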

Related

which function can I use to add individual name under each star plot

I want to give names to the individual star plots in R:
stars(norm_datas[, 1:12], full = TRUE, radius = TRUE, len = 1.0,
key.loc = c(14, 1), labels = abbreviate(case.names(norm_datas)),
main = "Provision of Ecosystem services", draw.segments = TRUE,
lwd = 0.25, lty = par("lty"), xpd = TRUE)
This is what I tried, but it just labeled each star plot as 1, 2, 3.
Kindly help resolve.
structure(list(Type_Garden = c("AG", "AG", "AG"), Pollinators = c(10,
6, 5.5), Flower_abundance = c(384, 435, 499), Climate_regulation = c(1,
7, 2), Crop_area = c(34, 25, 10), Plant_diversity = c(22, 53,
41), Nitrogen_balance = c(0.95, 0.26, NA), Phosphorus_balance = c(0.24,
0.04, NA), Habitat_provision = c(1, 2, 0), Recreation_covid = c(1,
NA, NA), Aesthetic_appreciation = c(3, NA, NA), Reconnection_nature = c(4,
NA, NA), Mental_health = c(1, NA, NA), Physical_health = c(1,
NA, NA)), class = "data.frame", row.names = c(NA, -3L))
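One possible reading of the symptom: the posted data frame has default row names, so case.names() returns "1", "2", "3", and those are what stars() prints. A minimal sketch, assuming the desired names can be passed as a character vector to the labels argument. The label scheme (garden type plus row index) is purely hypothetical, and only a few numeric columns of the posted data are used.

```r
# Sketch: label each star with a custom name instead of the default row names.
norm_datas <- data.frame(
  Type_Garden = c("AG", "AG", "AG"),
  Pollinators = c(10, 6, 5.5),
  Flower_abundance = c(384, 435, 499),
  Climate_regulation = c(1, 7, 2),
  Crop_area = c(34, 25, 10),
  Plant_diversity = c(22, 53, 41)
)

# Default row names are "1", "2", "3", which explains the labels seen in the plot
default_names <- case.names(norm_datas)

# Hypothetical labels: garden type plus row index
garden_labels <- paste0(norm_datas$Type_Garden, "_", seq_len(nrow(norm_datas)))

# Plot only the numeric columns and pass the labels explicitly
stars(norm_datas[, -1], full = TRUE, draw.segments = TRUE,
      labels = garden_labels, main = "Provision of Ecosystem services")
```

Alternatively, assigning row.names(norm_datas) <- garden_labels before plotting should have the same effect, since stars() takes its default labels from the row names.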

change second y axis color in base R

The question "Change secondary line axis color" shows how to change the second axis color for ggplot, but I chose to go with base R and would like to be able to select the color of the second y axis.
I have the following data:
df = structure(list(A = c("Q4-17", "Q1-18", "Q2-18", "Q3-18", "Q4-18",
"Q1-19", "Q2-19", "Q3-19", "Q4-19", "Q1-20", "Q2-20", "Q3-20",
"Q4-20", "Q1-21", "Q2-21", "Q3-21", "Q4-21", "Q1-22", "Q2-22",
"Q3-22"), B = c(69.45, 71.1, 74.94, 73.87, 93.61, 91.83,
95.38, 109.8, 133.75, 125.26, 118.22, 145.65, 144.9757185, 155.3464032,
184.367033, 179.8121721, 187.235487, 189.1684376, 184.3864519,
161.5300056), C = c(70.73, 71.73, 74.33, 73.27,
95.94, 94.38, 95.38, 109.8, 115.32, 116.92, 115.9, 113.87, 106.108147,
96.84273563, 111.5150869, 110.1228567, 110.7448835, 194.9684376,
187.7241152, 167.7665553), D = c(260.3, 216.02, 203.72,
203.52, 300.96, 320.77, 330.5, 413.52, 436.7, 474.96, 463.6,
501.87, 493.8865461, 497.1760767, 514.9903459, 503.7601267, 510.8362938,
614.9915546, 603.5761107, 593.660831), E = c(NA,
NA, NA, NA, NA, NA, NA, NA, 39.237, 35.621, 32.964, NA, 152.137,
140.743023, 167.809, 170.877, 117.517, 102.691723, 88.8, 76.2445528
)), class = "data.frame", row.names = c(NA, -20L))
library(dplyr)  # for %>%, rowwise() and mutate()
df = df %>%
rowwise() %>%
mutate(sums = sum(D,E, na.rm = TRUE))
df = df[8:nrow(df),]
and this code to generate my plot:
x <- seq(1,nrow(df),1)
y1 <- df$B
y2 <- df$D
par(mar = c(5, 4, 4, 4) + 0.3)
plot(x, y1, col = "#000000",
type = "l",
main = "title",
ylim = c(0, max(df[,2:3])),
ylab = "Y1",
xlab = "",
xaxt = "n")
axis(1,
at = seq(from = 13, by = -4, length.out = 4),
labels = df$A[seq(from = 13, by = -4, length.out = 4)])
lines(x, df$C, lty = "dashed", col = "#adadad", lwd = 2)
par(new = TRUE)
plot(x, df$sums, col = "#ffa500",
axes = FALSE, xlab = "", ylab = "", type = "l")
axis(side = 4, at = pretty(range(y2)),
ylim = c(0,max(df[,3:5], na.rm = TRUE)),
col = "#00aa00") # Add colour selection of 2nd axis
par(new = TRUE)
plot(x, df$D , col = "#0000ff",
axes = FALSE, xlab = "", ylab = "", type = "l", lwd = 1)
mtext("y2", side = 4, line = 3)
but this does not colour the complete second y axis, nor its labels, nor its title.
Does anyone have any suggestions for setting the entire y2 axis to #00AA00: ticks, labels, and title?
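A minimal sketch of one way to approach this in base R, on made-up data rather than the df above: axis() accepts col for the axis line, col.ticks for the tick marks, and col.axis for the tick labels, while mtext() accepts col for the axis title. The variable name axis_col is only illustrative.

```r
# Sketch: colour line, ticks, labels and title of a right-hand axis.
x <- 1:10
y1 <- x^2
y2 <- 100 * sqrt(x)      # made-up second series
axis_col <- "#00aa00"    # illustrative name for the axis colour

par(mar = c(5, 4, 4, 4) + 0.3)
plot(x, y1, type = "l", ylab = "Y1", xlab = "")

# Overlay the second series without drawing its axes
par(new = TRUE)
plot(x, y2, type = "l", col = "#0000ff", axes = FALSE, xlab = "", ylab = "")

# col: axis line, col.ticks: tick marks, col.axis: tick labels
ticks <- axis(side = 4, at = pretty(range(y2)),
              col = axis_col, col.ticks = axis_col, col.axis = axis_col)

# The axis title is drawn separately, so it needs its own colour
mtext("y2", side = 4, line = 3, col = axis_col)
```

Note that ylim is not an argument of axis(); the range of the second axis is determined by the plot() call that precedes it.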

Remove NA and only fill cells containing numbers in tableGrob

I have a table (top.table) I would like to display in a ggplot, but am having issues reformatting the table. I need to format it such that all NA elements are blank, and only fill with specified colors if there is a number contained within the element. Basically, fill the colors like in the code below except the NA elements should be filled default (white), and the NA text should be removed. If the removing of the NA is not possible in the way I described, changing the text color/fill would also work for me (i.e. change text color/fill of numbers, but not NA).
top.table <- structure(c(7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 57.5, 45.5,
NA, NA, NA, 128.5, 78.5, 71.5, 49, NA, NA, NA, 1043, NA, NA,
710, 838, 1481, 737, NA, NA, 1096, 5923, 3697, NA, 1726, NA,
NA, 3545, NA, NA, 1733, 2333, NA, 3807, 1795, NA, 2761, NA, 2887,
NA, NA, 2211, 2544), .Dim = c(11L, 5L), .Dimnames = list(NULL,
c("Sample Number", "Static", "D10 FB", "D12 FB", "D14 FB"
)))
colors <- structure(list(newcolor = c("dodgerblue2", "#E31A1C", "#FDBF6F",
"palegreen2", "skyblue2", "green4", "#6A3D9A", "#FF7F00", "gold1",
"#CAB2D6", "#FB9A99")), row.names = c(NA, -11L), class = c("tbl_df",
"tbl", "data.frame"))
tt1 <- ttheme_minimal(
core = list(bg_params = list(fill = colors, col = NA))
)
g <- tableGrob(top.table, theme = tt1)
grid.draw(g)
This may seem like a very obvious solution, but why not just replace the NA with empty strings when you plot the table?
g <- tableGrob(replace(top.table, is.na(top.table), ""), theme = tt1)
grid.newpage()
grid.draw(g)
With help from @AllanCameron, the solution I came up with was to repeat the colors across the columns of top.table and use replace() to set the fill of all NA cells to "white" before calling tableGrob():
# make a matrix of colors: one color per table row, repeated across the columns
# (colors is a one-column tibble, so extract the character vector first)
table.colors <- matrix(rep(colors$newcolor, each = ncol(top.table)),
ncol = ncol(top.table), byrow = TRUE)
# index matrix to find NAs
table.ind <- is.na(top.table)
#make replacements
table.colors <- replace(table.colors, table.ind, "white")
tt1 <- ttheme_minimal(
core = list(bg_params = list(fill = table.colors))
)
g <- tableGrob(replace(top.table, is.na(top.table), ""), theme = tt1)
grid.draw(g)
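The fill-matrix construction can be checked on a toy table without any plotting package. This is only a sketch with a hypothetical 3 x 3 table and three of the colors from the question; it builds the same kind of per-cell fill matrix using row() indexing instead of rep().

```r
# Sketch: per-cell fill colors, with NA cells set to white and their text blanked.
top <- matrix(c(7, 8, 10, 57.5, NA, 45.5, NA, 1043, NA), nrow = 3,
              dimnames = list(NULL, c("Sample Number", "Static", "D10 FB")))
row_cols <- c("dodgerblue2", "#E31A1C", "#FDBF6F")

# row(top) gives the row index of every cell, so each row gets one color
fill <- matrix(row_cols[row(top)], nrow = nrow(top))

# NA cells get the default white background
fill[is.na(top)] <- "white"

# NA text is replaced by empty strings (this coerces the matrix to character)
text <- replace(top, is.na(top), "")
```

The resulting fill and text matrices have the same dimensions as the table, which is the shape the bg_params fill in the answer above relies on.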

How do we create the data frame using column name,number of missing values and their percentage

Missing_Values = data.frame(colSums(is.na(train)))
Missing_Values_per = data.frame(colMeans(is.na(train))) * 100
data.frame(Column_Name = names(train))
I need to create a data frame from these three variables. Could someone help with this?
Try this:
library(tidyverse)
train <- tibble(a = c(NA, 1, 4, NA, NA),
b = c(6, NA, NA, NA, NA))
train %>%
gather(column_name, v) %>%
group_by(column_name) %>%
summarize(missing_values = sum(is.na(v)),
missing_values_per = mean(is.na(v)) * 100)
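For comparison, the three pieces from the question can also be combined directly in base R. This is a minimal sketch that reuses the example data from the answer above; the name missing_summary is only illustrative.

```r
# Sketch: one row per column, with the count and percentage of missing values.
train <- data.frame(a = c(NA, 1, 4, NA, NA),
                    b = c(6, NA, NA, NA, NA))

missing_summary <- data.frame(
  Column_Name = names(train),
  Missing_Values = colSums(is.na(train)),
  Missing_Values_per = colMeans(is.na(train)) * 100,
  row.names = NULL  # drop the row names inherited from colSums()
)

print(missing_summary)
```

Here colSums() and colMeans() operate on the logical matrix returned by is.na(), so no reshaping step is needed.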

custom rmeta - forest plot generation does not work: " 'x' and 'units' must have length > 0"

I tried to generate a "forest plot" without summary estimates using the rmeta package. However, reading ?forestplot and starting from its description or example does not help; I always get the same error. I assume it is a simple one, caused by the matrix/vector lengths somehow not lining up, but I kept changing and adjusting them and still cannot find the error.
Here is the example code:
tabletext<-cbind(c(NA, NA, NA, NA, NA, NA),
c(NA, NA, NA, NA, NA, NA),
c("variable1","subgroup","2nd", "3rd", "4th", "5th"),
c(NA,"mean","1.8683639", "2.5717301", "4.4966049, 9.0008054")
)
tabletext
png("forestplot.png")
forestplot(tabletext, mean = c(NA, NA, 1.8683639, 2.5717301, 4.4966049, 9.0008054), lower = c(NA, NA, 1.4604643, 2.0163468, 3.5197956, 6.9469213), upper = c(NA, NA, 2.3955105, 3.2897459, 5.7672966, 11.7288609),
is.summary = c(rep(FALSE, 6)), zero = 1, xlog=FALSE, boxsize=0.75, xticks = NULL, clip = c(0.9, 12))
dev.off()
Error message:
clip = c(0.9, 12))
Error in unit(rep(1, sum(widthcolumn)), "grobwidth", labels[[1]][widthcolumn]) :
'x' and 'units' must have length > 0
dev.off()
Any help is very much appreciated!
This works with the forestplot-package although you need to remove the xticks=NULL:
library(forestplot)
tabletext<-cbind(c(NA, NA, NA, NA, NA, NA),
c(NA, NA, NA, NA, NA, NA),
c("variable1","subgroup","2nd", "3rd", "4th", "5th"),
# the last two values must be separate strings so the column has 6 entries
c(NA,"mean","1.8683639", "2.5717301", "4.4966049", "9.0008054")
)
png("forestplot.png")
forestplot(tabletext,
mean = c(NA, NA, 1.8683639, 2.5717301, 4.4966049, 9.0008054),
lower = c(NA, NA, 1.4604643, 2.0163468, 3.5197956, 6.9469213),
upper = c(NA, NA, 2.3955105, 3.2897459, 5.7672966, 11.7288609),
is.summary = c(rep(FALSE, 6)), zero = 1,
xlog=FALSE, boxsize=0.75, clip = c(0.9, 12))
dev.off()
Gives (I recommend some polishing before submitting for publishing):
