plotting with ggplot2. Error - r

I am trying to plot data with the ggplot2 package, but I am running into an error.
The data are a set of columns, each holding one day's values (the values change with altitude):
V1 V2.... V500
2E-15.....3E-14
3e-14.....3E-21
1.3E-15....NA
I want to plot all the data on two axes, with the fill showing the values.
Code:
a <- read.csv("/../vertical_value.csv", sep = ",", header = FALSE)
am<-melt(t(a))
dataset<-expand.grid(X = 1:500, H = seq(1,25,by=1))
dataset$axp<-am$value
g<-ggplot(dataset, aes(x = X, y = H, fill = axp)) + geom_tile()
error:
Error: Casting formula contains variables not found in molten data: XHaxp

Looking at this again, I think that you should be able to bypass this just by dropping NA rows after you melt.
a <- read.csv("/../vertical_value.csv", sep = ",", header = FALSE) # read the file; data.frame() would only store the path string
am<-melt(t(a))
am <- na.omit(am) ## ADD THIS LINE
dataset<-expand.grid(X = 1:500, H = seq(1,25,by=1))
dataset$axp<-am$value
g<-ggplot(dataset, aes(x = X, y = H, fill = axp)) + geom_tile()
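One caveat with dropping NA rows: am$value may then have fewer entries than the grid built by expand.grid(), so the assignment can fail or misalign. Here is a minimal sketch of an alternative, assuming the file loads as a numeric table with one row per altitude level and one column per day. It builds the coordinates from the matrix itself, so every value keeps its own X and H. The toy matrix m and its 4 x 6 size are made up for illustration.

```r
# Toy stand-in for the real table: 4 altitude levels (rows) x 6 days (columns).
# With the real data this would be as.matrix() of the table read from the CSV.
m <- matrix(seq(1e-15, 24e-15, length.out = 24), nrow = 4, ncol = 6)
m[4, 6] <- NA  # one missing value, as in the sample data

# as.vector() walks the matrix column by column; row() and col() give the
# matching indices, so each value stays paired with its own coordinates.
dataset <- data.frame(
  X   = as.vector(col(m)),  # day (column index)
  H   = as.vector(row(m)),  # altitude level (row index)
  axp = as.vector(m)
)
dataset <- na.omit(dataset)  # dropping NA rows now keeps X and H consistent

# Build the plot only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  g <- ggplot(dataset, aes(x = X, y = H, fill = axp)) + geom_tile()
}
```

Call print(g) to draw the plot. Leaving the NA rows in should also work, since geom_tile() then draws those cells in the NA colour.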

Related

Getting 'Error in UseMethod("mutate") : no applicable method for 'mutate' applied to an object of class "NULL"' when applying mutate to data frame

I have been working with R recently and I have encountered this issue when trying to apply a moving average function onto a data set.
library(dplyr)
library(ggplot2)
library(tidyverse)
library(zoo)
library(bit)
#grabs tab delimited file
Mouse_mm9_rDNA_file <- read.delim("mm9_rDNA_mapping_HOXA9-ER-CEBPA-degron_POLR1A_06122021-Dm3-Q25-norm_to_Input.txt")
#Averages two specific columns from the original file with 8 different columns
Column_position <- Mouse_mm9_rDNA_file[,c("position")]
Columns_5_and_6_mean_0_hrs <- rowMeans(Mouse_mm9_rDNA_file[,c("X5", "X6")])
Columns_7_and_8_mean_4_hrs <- rowMeans(Mouse_mm9_rDNA_file[,c("X7", "X8")])
Columns_9_and_10_mean_8_hrs <- rowMeans(Mouse_mm9_rDNA_file[,c("X9", "X10")])
Columns_11_and_12_means_10_hrs <- rowMeans(Mouse_mm9_rDNA_file[,c("X11", "X12")])
#Puts those averaged columns into rows and then flips the columns and rows
all_Columns_averaged <- rbind(Column_position,
Columns_5_and_6_mean_0_hrs,
Columns_7_and_8_mean_4_hrs,
Columns_9_and_10_mean_8_hrs,
Columns_11_and_12_means_10_hrs)
all_Columns <- t(all_Columns_averaged)
#Turns my dataset into a data frame
all_Columns_dataframe <- setattr(all_Columns, "class", c("tbl", "tbl_df", "data.frame"))
#runs the moving average function on my data set
all_Columns_dataframe <- all_Columns_dataframe %>%
mutate(Averages_03 = rollmean(Columns_5_and_6_mean_0_hrs, k = 5, ))
#Creates a line plot with multiple y-values
p <- all_Columns_dataframe %>%
ggplot(aes(x = Column_position)) +
labs(x = "position", y = "hours", color = "legend") + xlab("position") + ylab("hours")
p +
geom_line(data = all_Columns, aes(x = Column_position, y = Columns_5_and_6_mean_0_hrs), color = "black") +
geom_line(data = all_Columns, aes(x = Column_position, y = Columns_7_and_8_mean_4_hrs), color = "green") +
geom_line(data = all_Columns, aes(x = Column_position, y = Columns_9_and_10_mean_8_hrs), color = "darkslategray1") +
geom_line(data = all_Columns, aes(x = Column_position, y = Columns_11_and_12_means_10_hrs), color = "maroon1")
I am trying to visualize the all_Columns_dataframe data after it has been smoothed out and averaged by the roll means function, but when I try to run this code I get the error:
Error in UseMethod("mutate") :
no applicable method for 'mutate' applied to an object of class "NULL"
At first I thought it may have been because I had NULL values in my data so I added 1 to all values in all_Columns, but the same error persisted. If I take away the
all_Columns_dataframe <- all_Columns_dataframe %>%
mutate(Averages_03 = rollmean(Columns_5_and_6_mean_0_hrs, k = 5, ))
section of my code then everything runs smoothly and I get a nice looking graph with the correct values and everything. I guess my question would be how can I get rollmean to work or what would be the most effective way to run a moving average on my data so I can smooth it out?
It's hard to know without seeing a sample of your data (you can share one with dput(head(df))). But I would first try specifying the package for mutate:
dplyr::mutate()
This kind of error sometimes appears when mutate from another package is being used instead.
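Another possible cause, offered as a guess from the code shown: with library(bit) attached, setattr() may be the bit version, which changes its argument in place and does not return the object. In that case all_Columns_dataframe would be NULL before mutate() ever runs. The minimal sketch below sidesteps this by building the data frame with data.frame() and computing a centered 5-point moving average with base R's stats::filter(), swapped in here for rollmean so the example needs no packages. The toy values and the shortened column names are made up.

```r
# Toy stand-in for the averaged columns (hypothetical values).
df <- data.frame(
  position   = 1:10,
  mean_0_hrs = c(2, 4, 6, 8, 10, 12, 14, 16, 18, 20)
)

k <- 5  # window width

# Centered moving average with equal weights 1/k.
# The first and last two positions have no full window and come back as NA,
# so the result has the same length as the input column.
df$Averages_03 <- as.numeric(
  stats::filter(df$mean_0_hrs, rep(1 / k, k), sides = 2)
)
```

With zoo, rollmean(x, k = 5, fill = NA) should give the same padded result. Without fill, rollmean returns a vector shorter than the input, which mutate() would reject because the new column must match the number of rows.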

How to make scatter plot points into numbers?

I am creating a scatter plot using ggplot/geom_point. Here is my code for building the function in ggplot.
AddPoints <- function(x) {
list(geom_point(data = dat , mapping = aes(x = x, y = y) , shape = 1 , size = 1.5 ,
color = "blue"))
}
I am wondering if it would be possible to replace the standard points on the plot with numbers. That is, instead of seeing a dot on the plot, you would see a number on the plot to represent each observation. I would like that number to correspond to a column for that given observation (column name 'RP'). Thanks in advance.
Sample data.
Data <- data.frame(
X = sample(1:10),
Y = sample(3:12),
RP = sample(c(4,8,9,12,3,1,1,2,7,7)))
Use geom_text() and map the RP column to the label aesthetic.
ggplot(Data, aes(x = X, y = Y, label = RP)) +
geom_text()

Add multiple ggplot2 geom_segment() based on mean() and sd() data

I have a data frame mydataAll with columns DESWC, journal, and highlight. To calculate the average and standard deviation of DESWC for each journal, I do
avg <- aggregate(DESWC ~ journal, data = mydataAll, mean)
stddev <- aggregate(DESWC ~ journal, data = mydataAll, sd)
Now I plot a horizontal stripchart with the values of DESWC along the x-axis and each journal along the y-axis. But for each journal, I want to indicate the standard deviation and average with a simple line. Here is my current code and the results.
stripchart2 <-
ggplot(data=mydataAll, aes(x=mydataAll$DESWC, y=mydataAll$journal, color=highlight)) +
geom_segment(aes(x=avg[1,2] - stddev[1,2],
y = avg[1,1],
xend=avg[1,2] + stddev[1,2],
yend = avg[1,1]), color="gray78") +
geom_segment(aes(x=avg[2,2] - stddev[2,2],
y = avg[2,1],
xend=avg[2,2] + stddev[2,2],
yend = avg[2,1]), color="gray78") +
geom_segment(aes(x=avg[3,2] - stddev[3,2],
y = avg[3,1],
xend=avg[3,2] + stddev[3,2],
yend = avg[3,1]), color="gray78") +
geom_point(size=3, aes(alpha=highlight)) +
scale_x_continuous(limit=x_axis_range) +
scale_y_discrete(limits=mydataAll$journal) +
scale_alpha_discrete(range = c(1.0, 0.5), guide='none')
show(stripchart2)
See the three horizontal geom_segments at the bottom of the image indicating the spread? I want to do that for all journals, but without handcrafting each one. I tried using the solution from this question, but when I put everything in a loop and removed the aes(), it gave me an error that says:
Error in x - from[1] : non-numeric argument to binary operator
Can anyone help me condense the geom_segment() statements?
I generated some dummy data to demonstrate. First, we use aggregate like you have done, then we combine those results to create a data.frame in which we create upper and lower columns. Then, we pass these to the geom_segment specifying our new dataset. Also, I specify x as the character variable and y as the numeric variable, and then use coord_flip():
library(ggplot2)
set.seed(123)
df <- data.frame(lets = sample(letters[1:8], 100, replace = T),
vals = rnorm(100),
stringsAsFactors = F)
means <- aggregate(vals~lets, data = df, FUN = mean)
sds <- aggregate(vals~lets, data = df, FUN = sd)
df2 <- data.frame(means, sds)
df2$upper = df2$vals + df2$vals.1
df2$lower = df2$vals - df2$vals.1
ggplot(df, aes(x = lets, y = vals))+geom_point()+
geom_segment(data = df2, aes(x = lets, xend = lets, y = lower, yend = upper))+
coord_flip()+theme_bw()
Here, the lets column would resemble your character variable.
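The same idea can be sketched with the question's own column names. This is a minimal example under assumptions: the journal names and DESWC values are toy data, and merge() is used instead of data.frame(means, sds) so that each mean is paired with the standard deviation of the same journal by key rather than by row order.

```r
# Toy data using the question's column names (values are made up).
mydataAll <- data.frame(
  journal = rep(c("J1", "J2", "J3"), each = 4),
  DESWC   = c(10, 12, 14, 16, 20, 20, 24, 24, 5, 7, 9, 11)
)

avg    <- aggregate(DESWC ~ journal, data = mydataAll, FUN = mean)
stddev <- aggregate(DESWC ~ journal, data = mydataAll, FUN = sd)

# Join on journal; the suffixes distinguish the two DESWC columns.
spread <- merge(avg, stddev, by = "journal", suffixes = c("_mean", "_sd"))
spread$lower <- spread$DESWC_mean - spread$DESWC_sd
spread$upper <- spread$DESWC_mean + spread$DESWC_sd

# One geom_segment() layer draws a segment for every row of spread.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  stripchart <- ggplot(mydataAll, aes(x = DESWC, y = journal)) +
    geom_segment(data = spread,
                 aes(x = lower, xend = upper, y = journal, yend = journal),
                 color = "gray78") +
    geom_point(size = 3)
}
```

Because the segment endpoints live in a data frame, adding a journal to the data adds a segment without touching the plotting code.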

Use ggplot2 to generate histogram from each row of data properly

I have a set of data like below:
pos A C G T
0 0.291398 0.190061 0.315722 0.202818
1 0.315597 0.227511 0.175448 0.281445
2 0.252149 0.194597 0.222815 0.330438
Then I imported the table:
library(ggplot2)
d = read.table(tablename, sep = '\t', header = T)
d = d[2:5]
data.frame(t(d))
And I got a reformatted table as below:
X1 X2 X3
A 0.291398 0.315597 0.252149
C 0.190061 0.227511 0.194597
G 0.315722 0.175448 0.222815
T 0.202818 0.281445 0.330438
However, when I tried to plot it:
qplot(X1, data = d, geom = 'histogram')
It gives the image below:
And what I want should be like:(I used libreoffice, so the color and the width and other parameters do not matter)
May I know how to correct my code to make this shape?
Any help is appreciated. Sorry but I am really new to R and ggplot2.
You aren't telling the plot what to use as the y value. A histogram of X1 bins the four values and counts them; each value occurs once, so every bar has height 1.
You want X1 on the y axis and the base on the x axis.
To fix your plot, from d:
d$base<-rownames(d)
ggplot(d,aes(x=base,y=X1))+geom_bar(stat="identity")
or using qplot nomenclature:
d$base<-rownames(d)
qplot(data = d, x = base, y = X1, geom = 'histogram', stat = "identity")
Edit: Here's how I would plot it for all rows, starting from the table as originally read in (with the pos column still present):
library(reshape2)
d1 <- melt(d, id = "pos")
ggplot(d1, aes(x = variable, y = value, fill = factor(pos))) +
geom_bar(stat = "identity", position = "dodge")
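To make the long format concrete, here is a small sketch that rebuilds the three sample rows from the question and reshapes them by hand in base R. It is only a stand-in for what melt(d, id = "pos") produces (one row per position and base), not a replacement for it; the helper name bases is made up.

```r
# The three sample rows from the question.
d <- data.frame(
  pos = 0:2,
  A = c(0.291398, 0.315597, 0.252149),
  C = c(0.190061, 0.227511, 0.194597),
  G = c(0.315722, 0.175448, 0.222815),
  T = c(0.202818, 0.281445, 0.330438)
)

bases <- c("A", "C", "G", "T")

# Stack the four base columns on top of each other:
# unlist() concatenates column A, then C, then G, then T,
# so pos repeats as a block and each base name repeats nrow(d) times.
d1 <- data.frame(
  pos      = rep(d$pos, times = length(bases)),
  variable = rep(bases, each = nrow(d)),
  value    = unlist(d[bases], use.names = FALSE)
)
```

Each row of d1 now describes one bar: variable goes on the x axis, value on the y axis, and pos decides the fill group.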

Only one of two densities is shown in ggplot2

So I have two sets of data (of different length) that I am trying to group up and display the density plots for:
dat <- data.frame(dens = c(nEXP,nCNT),lines = rep(c("Exp","Cont")))
ggplot(dat, aes(x = dens, group=lines, fill = lines)) + geom_density(alpha = .5)
when I run the code it spits an error about the different lengths, i.e.
"arguments imply different num of rows: x, y"
I then augment the code to:
dat <- data.frame(dens = c(nEXP,nCNT),lines = rep(c("Exp","Cont"),X))
Where X is the length of the longer argument so the lengths of "lines" will match that of dens.
Now the issue is that when I go to plot the data I only get ONE density plot. I know there should be two, since plotting the densities with plot/lines clearly shows two non-equal overlapping distributions, so I am assuming the error is with the grouping...
hope that makes sense.
So I am not sure why but basically I simply had to do the rep() function manually:
A<-data.frame(ExpN, key = "exp")
B<-data.frame(ConN,key = "con")
colnames(A) <- c("a","key")
colnames(B) <- c("a","key")
dat <- rbind(A,B)
ggplot(dat, aes(x = a, fill = key)) + geom_density(alpha = .5) # the value column was renamed to "a" above
You need to tell rep how many times to repeat each element to get the labels to line up with the values:
dat <- data.frame(dens = c(nEXP,nCNT),
lines = rep(c("Exp","Cont"), c(length(nEXP),length(nCNT))))
That should give you a dat you can use with your ggplot call.
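A small sketch of the difference, with made-up vectors standing in for nEXP and nCNT. With a single count, rep() repeats the whole label vector, so the labels alternate and each group ends up holding values from both samples. That may be why the two curves looked like one. With a vector of counts, each label is repeated in a block that matches its sample.

```r
nEXP <- c(1.2, 0.8, 1.5)            # hypothetical sample, length 3
nCNT <- c(2.1, 2.4, 1.9, 2.2, 2.0)  # hypothetical sample, length 5

# A single count repeats the whole vector: Exp, Cont, Exp, Cont, ...
alternating <- rep(c("Exp", "Cont"), 4)

# A vector of counts repeats each element that many times, in order:
# three "Exp" followed by five "Cont".
blocked <- rep(c("Exp", "Cont"), times = c(length(nEXP), length(nCNT)))

dat <- data.frame(dens = c(nEXP, nCNT), lines = blocked)
```

Here every row labelled "Exp" holds a value from nEXP and every row labelled "Cont" a value from nCNT, which is what the fill grouping in geom_density() needs.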

Resources