Is it possible to transform gbm.step result into a dataframe? - r

I want to insert a scale break in my plot, but I have been trying for days and cannot work out how. None of the options I found online work, apparently because my result is a gbm object rather than a data frame. I assume that if I could convert the result into a data frame, I could draw the plot with the scale break I need.
ina <- gbm.step(data=bonaci, gbm.x = 2:5, gbm.y = 1,
family = "gaussian", tree.complexity = 5,
learning.rate = 0.0001, bag.fraction = 0.5)
m<-gbm.plot(ina,
variable.no = 1,
smooth= TRUE,
rug= TRUE,
n.plots= 4,
common.scale= FALSE,
write.title= FALSE,
y.label= NULL,
x.label= NULL,
show.contrib= FALSE,
cex.axis= 0.7,
cex.lab= 0.8,
las = 2, # direction of the x-axis labels: 1 = horizontal, 2 = vertical
plot.layout=c(1,1))
When I try this code with the plotrix package:
from <- -1000
to <- -4000
gap.plot(m, gap=c(from,to), xlab="index", ylab="value")
This error appears:
Error in ylim[2] - gapsize[1] :
non-numeric argument to binary operator
I would like to add a break in the y-axis scale so that values closer to zero can appear.
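One possible route to a data frame, sketched here under assumptions: the gbm package's plot method accepts return.grid = TRUE and then returns the partial-dependence values as a data frame instead of drawing them. The simulated data (sim) and the names fit and pd are hypothetical stand-ins for bonaci and ina. Whether the gbm.step result can be passed the same way is an assumption worth checking with class(ina).

```r
# Sketch only: simulated data stands in for `bonaci`, and `fit` for `ina`
if (requireNamespace("gbm", quietly = TRUE)) {
  set.seed(1)
  sim <- data.frame(x1 = runif(200), x2 = runif(200))
  sim$y <- 3 * sim$x1 - 2 * sim$x2 + rnorm(200, sd = 0.1)

  fit <- gbm::gbm(y ~ x1 + x2, data = sim, distribution = "gaussian",
                  n.trees = 200, interaction.depth = 2,
                  shrinkage = 0.05, bag.fraction = 0.5)

  # return.grid = TRUE returns the plotted values instead of drawing them
  pd <- plot(fit, i.var = 1, n.trees = 200, return.grid = TRUE)
  str(pd)  # a data frame: one column for the predictor, one named "y"
}
```

With the values in a data frame, the two columns can be passed as numeric x and y vectors to a plotting function that supports axis breaks. It may also be worth checking the order of the gap limits: c(from, to) in the question is c(-1000, -4000), while a (lower, upper) order would be c(-4000, -1000).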

Related

Heatmap of Gene intensity values in R

I have data that look like this:
Gene    HBEC-KT-01   HBEC-KT-02   HBEC-KT-03   HBEC-KT-04   HBEC-KT-05   Primarycells-02  Primarycells-03  Primarycells-04  Primarycells-05
BPIFB1  15726000000  15294000000  15294000000  14741000000  22427000000  87308000000      2.00E+11         1.04E+11         1.51E+11
LCN2    18040000000  26444000000  28869000000  30337000000  10966000000  62388000000      54007000000      56797000000      38414000000
C3      2.52E+11     2.26E+11     1.80E+11     1.80E+11     1.78E+11     46480000000      1.16E+11         69398000000      78766000000
MUC5AC  15647000     8353200      12617000     12221000     29908000     40893000000      79830000000      28130000000      69147000000
MUC5B   965190000    693910000    779970000    716110000    1479700000   38979000000      90175000000      41764000000      50535000000
ANXA2   14705000000  18721000000  21592000000  18904000000  22657000000  28163000000      24282000000      21708000000      16528000000
I want to make a heatmap like the following using R. I am following a paper, which states: "Heat maps were generated with the ‘pheatmap’ package76, where correlation clustering distance row was applied". Here is their heatmap.
I want to produce the same kind of figure. I have been following tutorials, but I am new to R and have very little experience with it.
Here is my code:
library(pheatmap)
library(RColorBrewer) # provides brewer.pal()
df <- read.delim("R.txt", header = TRUE, row.names = "Gene")
df_matrix <- data.matrix(df)
pheatmap(df_matrix,
main = "Heatmap of Extracellular Genes",
color = colorRampPalette(rev(brewer.pal(n = 10, name = "RdYlBu")))(10),
cluster_cols = FALSE,
show_rownames = FALSE,
fontsize_col = 10,
cellwidth = 40
)
This is what I get.
When I try clustering, I get this error:
pheatmap(
mat = df_matrix,
scale = "row",
cluster_column = F,
show_rownames = TRUE,
drop_levels = TRUE,
fontsize = 5,
clustering_method = "complete",
main = "Hierachical Cluster Analysis"
)
Error in hclust(d, method = method) :
NA/NaN/Inf in foreign function call (arg 10)
Can someone help me with the code?
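One common cause of this hclust error, though not necessarily the cause here, is a row that turns into NaN or NA once it is scaled, for example a row whose values are all identical or a row that already contains NA. The sketch below uses a small hypothetical matrix (mat, not the data from the question) and base-R row scaling as a stand-in for scale = "row" to show how such rows can be found before clustering.

```r
# Hypothetical stand-in for df_matrix: one constant row and one row with an NA
mat <- rbind(
  geneA = c(1, 2, 3, 4),
  geneB = c(5, 5, 5, 5),   # zero variance: (x - mean) / sd gives NaN
  geneC = c(2, NA, 6, 8)   # already contains a missing value
)

# Row-wise z-scores (scale() works on columns, hence the two transposes)
z <- t(scale(t(mat)))

# Rows with any non-finite value after scaling would break hclust()
bad_rows <- rownames(z)[!apply(is.finite(z), 1, all)]
print(bad_rows)

# Keep only the rows that scale cleanly
clean <- mat[setdiff(rownames(mat), bad_rows), , drop = FALSE]
```

Running the same check on df_matrix would show whether any gene needs to be dropped or fixed before pheatmap is called with clustering enabled.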
You can normalize the data with scale to achieve a more uniform coloring. Here, the mean expression is set to 0 for each sample. Genes expressed below the average have a negative z-score:
library(tidyverse)
library(pheatmap)
data <- tribble(
~Gene, ~`HBEC-KT-01`, ~`HBEC-KT-02`, ~`HBEC-KT-03`, ~`HBEC-KT-04`, ~`HBEC-KT-05`, ~`Primarycells-03`, ~`Primarycells-04`, ~`Primarycells-05`,
"BPIFB1", 1.5726e+10, 1.5294e+10, 1.5294e+10, 1.4741e+10, 2.2427e+10, 2e+11, 1.04e+11, 1.51e+11,
"LCN2", 1.804e+10, 2.6444e+10, 2.8869e+10, 3.0337e+10, 1.0966e+10, 5.4007e+10, 5.6797e+10, 3.8414e+10,
"C3", 2.52e+11, 2.26e+11, 1.8e+11, 1.8e+11, 1.78e+11, 1.16e+11, 6.9398e+10, 7.8766e+10,
"MUC5AC", 15647000, 8353200, 12617000, 12221000, 29908000, 7.983e+10, 2.813e+10, 6.9147e+10,
"MUC5B", 965190000, 693910000, 779970000, 716110000, 1479700000, 9.0175e+10, 4.1764e+10, 5.0535e+10,
"ANXA2", 1.4705e+10, 1.8721e+10, 2.1592e+10, 1.8904e+10, 2.2657e+10, 2.4282e+10, 2.1708e+10, 1.6528e+10
)
data %>%
mutate(across(where(is.numeric), scale)) %>%
column_to_rownames("Gene") %>%
pheatmap(
scale = "row",
cluster_cols = FALSE, # pheatmap's argument is cluster_cols, not cluster_column
show_rownames = FALSE,
show_colnames = TRUE,
treeheight_col = 0,
drop_levels = TRUE,
fontsize = 5,
clustering_method = "complete",
main = "Hierarchical Cluster Analysis (z-score)"
)
Created on 2021-09-26 by the reprex package (v2.0.1)

Error while running WTC (Wavelet Coherence) Codes in R

I am doing Wavelet Analysis in R using Biwavelet. However, I receive the error message:
Error in check.datum(y) :
The step size must be constant (see approx function to interpolate)
When I run the following code:
wtc.AB = wtc(t1, t2, nrands = nrands)
Any help would be appreciated. The complete code is:
library(biwavelet)
# Import your data
Data <- read.csv("https://dl.dropboxusercontent.com/u/18255955/Tutorials/Commodities.csv")
# Attach your data so that you can access variables directly using their
# names
attach(Data)
# Define two sets of variables with time stamps
t1 = cbind(DATE, ISLX)
t2 = cbind(DATE, GOLD)
# Specify the number of iterations. The more, the better (>1000). For the
# purpose of this tutorial, we just set it = 10
nrands = 10
wtc.AB = wtc(t1, t2, nrands = nrands)
# Plotting a graph
par(oma = c(0, 0, 0, 1), mar = c(5, 4, 5, 5) + 0.1)
plot(wtc.AB, plot.phase = TRUE, lty.coi = 1, col.coi = "grey", lwd.coi = 2,
lwd.sig = 2, arrow.lwd = 0.03, arrow.len = 0.12, ylab = "Scale", xlab = "Period",
plot.cb = TRUE, main = "Wavelet Coherence: A vs B")
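The error message points at the time column: the first column of each input is expected to advance by a constant step. A minimal base-R sketch of how that could be checked, and how approx (the function the message mentions) can interpolate onto an even grid, is shown below. The vectors tm and val are hypothetical and do not come from Commodities.csv.

```r
# Hypothetical irregular series: `tm` is a numeric time index with gaps
tm  <- c(1, 2, 3, 5, 6, 8, 9, 10)
val <- c(2.0, 2.5, 3.1, 4.0, 4.2, 5.5, 5.9, 6.3)

# A constant step means all successive differences are equal
steps <- diff(tm)
is_regular <- length(unique(steps)) == 1
print(is_regular)

# Interpolate linearly onto an evenly spaced grid with approx()
grid <- seq(min(tm), max(tm), by = 1)
reg  <- approx(x = tm, y = val, xout = grid)

# Two-column matrix (time, value) with a constant time step
t1_regular <- cbind(reg$x, reg$y)
```

If the observations are conceptually equally spaced already (for example one row per trading day), replacing the date column with a simple index such as seq_along(val) may be an alternative to interpolation. Which option is appropriate depends on the data.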

How do I contour a smoothScatter plot with missing values in r studio?

This is the code I am using:
smoothScatter(longesttimeon$nhmc,longesttimeon$pt2,nrpoints=0)
smoothScatter(longesttimeon$nhmc,longesttimeon$pt2,nrpoints=0,colramp=colorRampPalette(c("white","dodgerblue2","gold","firebrick")))
library(readxl)
library(MASS) # provides kde2d()
kern <- kde2d(longesttimeon$nhmc, longesttimeon$pt2)
contour(kern, drawlabels = FALSE, nlevels = 6,
col = rev(heat.colors(6)), add = TRUE, lwd = 3)
I get this error:
Error in kde2d(longesttimeon$nhmc, longesttimeon$pt2) :
missing or infinite values in the data are not allowed
I am trying to make it look like this
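Since kde2d refuses missing or infinite values, one option is to drop the incomplete pairs before estimating the density. The sketch below uses hypothetical vectors x and y in place of longesttimeon$nhmc and longesttimeon$pt2 and assumes that discarding those points is acceptable for the analysis.

```r
library(MASS)  # kde2d() lives in MASS

# Hypothetical data standing in for longesttimeon$nhmc and longesttimeon$pt2
set.seed(42)
x <- c(rnorm(50), NA, Inf)
y <- c(rnorm(50), 1, NA)

# Keep only pairs where both coordinates are finite (drops NA, NaN, Inf)
ok <- is.finite(x) & is.finite(y)

smoothScatter(x[ok], y[ok], nrpoints = 0)
kern <- kde2d(x[ok], y[ok])
contour(kern, drawlabels = FALSE, nlevels = 6,
        col = rev(heat.colors(6)), add = TRUE, lwd = 3)
```

Filtering both vectors with the same logical index keeps the x and y values paired, which subsetting each vector separately with na.omit would not guarantee.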

turn off grid lines for R xyplot timeseries

I am plotting a time series with the timePlot function of the openair package in R. The graph has grey grid lines in the background that I would like to turn off, but I cannot find a way to do it. I expected something simple such as grid = FALSE, but that does not work. It appears to require extra arguments that are passed on to xyplot from the lattice package. I believe the answer lies somewhere in par.settings, but all my attempts have failed. Does anyone have any suggestions?
Here is my script:
timeozone <- import(i, date="date", date.format = "%m/%d/%Y", header=TRUE, na.strings="")
ROMO = timePlot(timeozone, pollutant = c("C7", "C9", "C10"), group = TRUE, stack = FALSE,
y.relation = "same", date.breaks = 9, lty = c(1,2,3), lwd = c(2, 3, 3),
fontsize = 15, cols = c("black", "black"), ylab = "Ozone (ppbv)")
panel = function(x, y) {
panel.grid(h = 0, v = 0)
panel.xyplot(x,y)
}

heatmap.2: Error in axis(1, at = xv, labels = lv) : no locations are finite

I am trying to plot a non-symmetric, non-square matrix using heatmap.2,
but I get this error message:
Error in axis(1, at = xv, labels = lv) : no locations are finite
Calls: heatmap.2 -> axis
Execution halted
The values in my matrix are 0, 1 or ? (missing values).
For example, with this matrix:
dat_mat
C1 C2
P17612|KAPCA_HUMAN ? 0
P22612|KAPCG_HUMAN 0 1
P22694|KAPCB_HUMAN 1 0
P31751|AKT2_HUMAN 0 0
The expected result should be a heatmap with red color for "0", green for "1" and nothing (blank) for "?". Here is the R script I am using :
heatmap.2(dat_mat,
Rowv = FALSE,
Colv = FALSE,
dendrogram = "none",
scale = "none",
margins = c(12,24),
key = TRUE,
keysize = 1.0,
col = rainbow(512, start = 1, end = 0.4),
density.info="density",
denscol = "black",
xlab="Compounds",
ylab="Targets",
main="X Activity Profile",
tracecol = "black")
I'm assuming your data is in the following format:
dat_mat<-matrix(sample(c("?","0","1"), 10*2, replace=T), ncol=2)
colnames(dat_mat)<-c("C1","C2")
rownames(dat_mat)<-letters[1:nrow(dat_mat)]
So you have a character matrix (it's hard to tell from your description). The heatmap expects numeric values, and it really doesn't like non-standard missing values. So let's replace the "?" entries with NA and then convert the matrix to numeric:
dat_mat[dat_mat=="?"]<-NA
class(dat_mat)<-"numeric"
That's it. You should now be able to plot this as you expect
heatmap.2(dat_mat,
Rowv=F, Colv=F, dendrogram="none", scale="none",
margins = c(12,24),
key = TRUE,
keysize = 1.0,
col = rainbow(512, start = 1, end = 0.4),
density.info="density",
denscol = "black",
xlab="Compounds",
ylab="Targets",
main="X Activity Profile",
tracecol = "black"
)
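For the exact colours asked for in the question (red for 0, green for 1, blank for missing), a two-colour palette with explicit breaks may be closer than a rainbow ramp. The sketch below rebuilds the example matrix from the question and relies on heatmap.2's breaks and na.color arguments. The break points and the reduced margins are choices made for this illustration, not part of the original answer.

```r
# The example matrix from the question, as a character matrix
dat_mat <- matrix(
  c("?", "0",
    "0", "1",
    "1", "0",
    "0", "0"),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    c("P17612|KAPCA_HUMAN", "P22612|KAPCG_HUMAN",
      "P22694|KAPCB_HUMAN", "P31751|AKT2_HUMAN"),
    c("C1", "C2")
  )
)

# Same conversion as above: "?" becomes NA, then the matrix becomes numeric
dat_mat[dat_mat == "?"] <- NA
class(dat_mat) <- "numeric"

if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(dat_mat,
    Rowv = FALSE, Colv = FALSE, dendrogram = "none", scale = "none",
    col = c("red", "green"),       # one colour per interval between breaks
    breaks = c(-0.5, 0.5, 1.5),    # 0 falls in the first bin, 1 in the second
    na.color = "white",            # missing cells are left blank
    trace = "none", key = FALSE,
    margins = c(5, 12))
}
```

The number of colours has to be one less than the number of breaks, which is why two colours are paired with three break points.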
