R ggplot only shows one bar - r

I am just starting to work with R, so apologies if my question is too basic.
I have an Excel sheet (link: https://file.io/LfsAOdDCVnFq) from which I am trying to draw a simple bar plot as follows:
X = my sample names, the column called OTU ID in the file
Y = the sum of my variables for each sample, the column called Sum ZOTUs in the file
So far, I have installed and loaded ggplot2 and tried to plot my data frame, but the plot only shows one bar and I don't know what is wrong:
install.packages("readxl")
install.packages("ggplot2")
library(readxl)
library(ggplot2)
ZOTU <- read_excel(file.choose())
ggplot(data=ZOTU, aes(x="OTU ID")) + geom_bar ()
The resulting plot shows only a single bar.
Can anyone help me fix this?
Thanks

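I can't see your uploaded image with the Excel sheet screenshot.
My guess is that the problem is the quotation marks: aes(x = "OTU ID") maps x to the constant string "OTU ID" rather than to the column of that name, so every row ends up in the same bar. Use backticks to refer to a column whose name contains a space: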
ggplot(data = ZOTU, aes(x = `OTU ID`)) + geom_bar()
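The difference between quotes and backticks can be checked on a small stand-in data frame. This is only a sketch: zotu_demo and its values are invented for illustration, and the column names are taken from the question. It also shows geom_col(), which uses the values in Sum ZOTUs as bar heights instead of counting rows.

```r
library(ggplot2)

# Hypothetical stand-in for the Excel sheet; check.names = FALSE keeps the spaces
zotu_demo <- data.frame(
  `OTU ID` = c("S1", "S2", "S3"),
  `Sum ZOTUs` = c(120, 45, 300),
  check.names = FALSE
)

# Quoted string: a constant, so all rows collapse into a single bar
p_quoted <- ggplot(zotu_demo, aes(x = "OTU ID")) + geom_bar()

# Backticks: refers to the column, so there is one bar per sample
p_backtick <- ggplot(zotu_demo, aes(x = `OTU ID`)) + geom_bar()

# geom_col() takes the bar heights from the `Sum ZOTUs` column
p_sum <- ggplot(zotu_demo, aes(x = `OTU ID`, y = `Sum ZOTUs`)) + geom_col()

# layer_data() returns the data ggplot2 computed for a layer: one row per bar
n_bars_quoted <- nrow(layer_data(p_quoted))
n_bars_backtick <- nrow(layer_data(p_backtick))
```

Printing p_sum should give the plot described in the question, with sample names on the x-axis and the sums on the y-axis.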

First
Your question could be better formulated. Please read how to ask a good question and how to create a minimal example to understand the basics of a workable question.
In R, you have a very good tool for creating reproducible examples: the reprex package.
Also, I would not download anything from a link in a random question on StackOverflow, and neither should you.
Try
Run this code on your computer and see if it helps you understand how ggplot works:
library(ggplot2) # load ggplot2
mpg # look at the 'mpg' data set included in the ggplot2 package
# Now, a simple bar plot
ggplot(mpg, aes(x = fl)) +
  geom_bar()
We use the mpg data as the data for our figure, and we map the x-axis to the fl column of that data. Finally, we "add" a bar plot to the figure.
By default, geom_bar() plots the number of rows for each distinct value in the column mapped to the x-axis.
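To see that the bar heights really are row counts, they can be compared with table(). This is a small sketch; the variable names bar_heights and fl_counts are just for illustration.

```r
library(ggplot2)

p <- ggplot(mpg, aes(x = fl)) +
  geom_bar()

# Heights that geom_bar() computed, one per distinct value of fl
bar_heights <- layer_data(p)$count

# The same counts computed directly from the column
fl_counts <- table(mpg$fl)
```

Both objects should hold the same numbers, in the same (alphabetical) order of the fl values.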
After comments
Following our discussion in the comment section, maybe this is what you want.
If you have the names (discrete variable) for the x-axis in a column, and another column with the variable you want to sum and plot in y for each name, try:
ggplot(data = mpg) +
  geom_col(aes(x = manufacturer, y = hwy))
You can compute the summed values themselves with:
library(tidyverse)
mpg %>% group_by(manufacturer) %>% summarize(total = sum(hwy))
So for your case, if you have a column with the names you want on the x-axis, and another with the values you want summed for each name, use:
ggplot(data = your_data_frame) +
  geom_col(aes(x = your_names, y = values_to_be_summed_for_each_name))
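A note on why this works: geom_col() does not add the values up itself. It draws one segment per row and stacks the segments, so the top of each stack equals the sum. The sketch below (variable names are illustrative) checks this against the summarised totals and also plots the totals directly, which avoids the internal segment borders.

```r
library(dplyr)
library(ggplot2)

# Sum of hwy per manufacturer, computed explicitly
totals <- mpg %>%
  group_by(manufacturer) %>%
  summarize(total = sum(hwy))

# Raw rows: geom_col() stacks one segment per row
p_raw <- ggplot(mpg, aes(x = manufacturer, y = hwy)) +
  geom_col()

# Pre-summed data: one solid bar per manufacturer
p_totals <- ggplot(totals, aes(x = manufacturer, y = total)) +
  geom_col()

# Top of each stack in the raw plot, per x position
raw_layer <- layer_data(p_raw)
stacked_tops <- tapply(raw_layer$ymax, as.integer(raw_layer$x), max)
```

If you only need the picture, the raw version is enough; summarising first is handy when you also want the numbers or want to sort the bars.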

Related

I want to create a bar graph that counts the number of times a variable occurs in the dataset

I have just started using R and am currently trying to create a bar graph that shows the number of times each "category" is used. The categories include things like Travel & Events and Sports.
I've tried a few things, but they come up with errors:
barplot(freq, main = category) +geom_bar(stat=category)
Error in as.graphicsAnnot(main) : object 'category' not found
ggplot(data=dat, aes(category))
Error in ggplot(data = dat, aes(category)) : object 'dat' not found
The one time I got a graph to appear it has no data in it just a bunch of lines.
If I understand your question correctly, I think this is an example of what you are after. I used the starwars data set, which is accessible through the tidyverse package:
# load the tidyverse
library(tidyverse)
# take a look at the starwars data set
starwars
# this will show the numbers in each category of hair_color, which will be plotted below
starwars %>% count(hair_color)
# plot hair colour using ggplot
ggplot(data = starwars, aes(x = hair_color)) +
  geom_bar() +
  coord_flip()
From the documentation for geom_bar(): "geom_bar() makes the height of the bar proportional to the number of cases in each group", which is - I think - what you wanted?
This should work if you substitute starwars with your own data, and hair_color with your variable of interest.
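As a side note, the error object 'dat' not found means that no data frame called dat existed in the session, so the data has to be loaded or created first. The sketch below assumes a data frame with a category column; videos and its contents are invented for illustration. It counts the categories explicitly and plots the precomputed counts with geom_col(), which is an alternative to letting geom_bar() do the counting.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for the asker's data
videos <- data.frame(
  category = c("Sports", "Travel & Events", "Sports", "Music", "Sports", "Music")
)

# One row per category, with the number of occurrences in column n
category_counts <- videos %>% count(category)

# geom_col() draws the precomputed counts; reorder() sorts the bars by size
p <- ggplot(category_counts, aes(x = reorder(category, n), y = n)) +
  geom_col() +
  coord_flip() +
  labs(x = "category", y = "count")
```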

Labels right next to points in ggplot

I've tried to google my way to an answer, but nothing I found seems to cover what I'm trying to do.
My goal is to add labels right beside the observations within the plot to show the name of each observation. The names of the observations are located in the first column of my data frame.
ggplot(data = coef.vec)+aes(x = coef.x, y = variable.mean) +
geom_point()
You can add labels with geom_text() in the following way. I have used simulated data:
library(tidyverse)
# Simulated data
data <- data.frame(group = paste0('Obs', 1:10),
                   coef.x = rnorm(10, 0, 1),
                   variable.mean = runif(10, 0.015, 0.05),
                   stringsAsFactors = FALSE)
# Plot
ggplot(data, aes(x = coef.x, y = variable.mean)) +
  geom_point() +
  geom_text(aes(label = group), hjust = -0.15)

Getting an error that ggplot2_3.2.0 can't draw more than one boxplot per group

I have xy data that I'd like to plot using R's ggplot:
library(dplyr)
library(ggplot2)
set.seed(1)
df <- data.frame(group = unlist(lapply(LETTERS[1:5], function(l) rep(l, 5))),
                 x = rep(1:5, 5),
                 y = rnorm(25, 2, 1),
                 y.se = runif(25, 0, 0.1)) %>%
  dplyr::mutate(y.min = y - 3 * y.se,
                y.low = y - y.se,
                y.high = y + y.se,
                y.max = y + 3 * y.se)
As you can see, while df$x is a point (integer), df$y has an associated error, which I would like to include using a box plot.
So my purpose is to plot each row in df by its x coordinate, using y.min, y.low, y, y.high, and y.max to construct a boxplot, and to color and fill it by group. In other words, I'd like each row in df plotted as a box at a separate x-axis location and faceted by df$group, such that group A's five replicates appear first, then group B's replicates to their right, and so on.
This code used to work for my purpose until I just installed the latest ggplot2 package (ggplot2_3.2.0):
ggplot(df, aes(x = x, ymin = y.min, lower = y.low, middle = y, upper = y.high,
               ymax = y.max, col = group, fill = group)) +
  geom_boxplot(position = position_dodge(width = 0), alpha = 0.5, stat = "identity") +
  facet_grid(~group, scales = "free_x") +
  scale_x_continuous(breaks = integerBreaks())
Now I'm getting this error:
Error: Can't draw more than one boxplot per group. Did you forget aes(group = ...)?
Any idea?
You need a separate boxplot for each combination of x and group, so you can set the group aesthetic to interaction(x, group):
ggplot(df, aes(x = x, ymin = y.min, lower = y.low, middle = y, upper = y.high,
               ymax = y.max, col = group, fill = group)) +
  geom_boxplot(aes(group = interaction(x, group)),
               position = position_dodge(width = 0),
               alpha = 0.5, stat = "identity")
"This code used to work for my purpose until I just installed the latest ggplot2 package (ggplot2_3.2.0)"
You are right: I just ran into a similar error with code I recently wrote using ggplot2 boxplots, and found out that this new error is related to the latest ggplot2 update.
As Marius already pointed out, specifying the group in aes() solved the problem for me as well.
However, I don't understand the rest of his answer, as it does not provide the faceting.
Here is a working solution with facet_grid(), you were close:
ggplot(df, aes(x = x, ymin = y.min, lower = y.low, middle = y, upper = y.high,
               ymax = y.max, col = group, fill = group, group = x)) +
  geom_boxplot(position = position_dodge(width = 0), alpha = 0.5, stat = "identity") +
  facet_grid(~group, scales = "free_x")

geom_smooth does not plot line of best fit

I hope this question isn't a duplicate. I tried to find answers per the site's requirements before posting, but since I am so new, the help forums are too foreign to me.
Following Wickham's R for data visualization, I easily used geom_point with a built-in data set, mpg.
Simple reference code:
ggplot(data = mpg) +
  geom_smooth(mapping = aes(x = displ, y = hwy)) +
  geom_point(mapping = aes(x = displ, y = hwy))
Excited by this cool plot, I tried to do the same for some personal research data, which describes interferon-beta production over five time points (labelled A, B, C, D, E instead of numerical values).
I used essentially the same code:
ggplot(data = ifnonly) +
  geom_smooth(mapping = aes(x = HOURS, y = IFNB)) +
  geom_point(mapping = aes(x = HOURS, y = IFNB))
Unfortunately, the line does not display. In fact, nothing displays until I add the geom_point function. What am I missing here? Is there more complex code required or is there some subtlety that I can apply to future uses of this function and ggplot?
I think you should get your desired output with the following one-line code, where mtcars is the data, disp is the x-axis and mpg is the y-axis:
library(ggplot2)
ggplot(mtcars, aes(disp, mpg)) + geom_smooth()
That gives the output without geom_point. To get the output with geom_point, add the point layer:
library(ggplot2)
ggplot(mtcars, aes(disp, mpg)) + geom_smooth() + geom_point()
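A likely cause in the original data, assuming HOURS is stored as text or as a factor: ggplot2 then treats each time point as its own group, and a smoother cannot fit a line through a single x position. One possible fix is to convert the labels to numbers before smoothing. The sketch below uses an invented stand-in for ifnonly (ifn_demo, with made-up values) and a straight-line fit, since the default smoother can struggle with only five distinct x values.

```r
library(ggplot2)

# Hypothetical stand-in for the asker's data: five labelled time points,
# three replicates each (all values invented for illustration)
set.seed(1)
ifn_demo <- data.frame(
  HOURS = rep(c("A", "B", "C", "D", "E"), each = 3),
  IFNB = rep(c(1, 3, 6, 8, 9), each = 3) + rnorm(15, sd = 0.5)
)

# Convert the labels to their position in the sequence so x is numeric
ifn_demo$hour_index <- as.numeric(factor(ifn_demo$HOURS))

p <- ggplot(ifn_demo, aes(x = hour_index, y = IFNB)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x) +
  scale_x_continuous(breaks = 1:5, labels = c("A", "B", "C", "D", "E"))
```

If the real sampling times are known (for example in hours), using those values instead of 1 to 5 would give a more meaningful x-axis.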

Multiple line plot using ggplot2

I am trying to emulate a ggplot of multiple lines which works as follows:
set.seed(45)
df <- data.frame(x = c(1,2,3,4,5,1,2,3,4,5,3,4,5), val = sample(1:100, 13),
                 variable = rep(paste0("category", 1:3), times = c(5,5,3)))
ggplot(data = df, aes(x=x, y=val)) + geom_line(aes(colour=variable))
I can get this simple example to work. However, when I follow the same steps on a much larger data set, it does not work.
ncurrencies = 6
dates = c(BTC$Date, BCH$Date, LTC$Date, ETH$Date, XRP$Date, XVG$Date)
opens = c(BTC$Open, BCH$Open, LTC$Open, ETH$Open, XRP$Open, XVG$Open)
categories = rep(paste0("categories", 1:ncurrencies),
                 times = c(nrow(BTC), nrow(BCH), nrow(LTC), nrow(ETH), nrow(XRP), nrow(XVG)))
df = data.frame(dates, opens, categories)
# Plot - Not correct.
ggplot(data = df, aes(x = dates, y = opens)) +
  geom_line(aes(colour = categories))
As you can see, the different points are discretised and the y-axis is strange. I am guessing this is a rookie error but I have been going round in circles for a while. Can anyone see it?
P.S. I don't think I can upload the data here as it would be too much code. However, the dataframe is in the same format as the practice example and the categories match up correctly to the x and y data. Therefore I believe it is the way I am defining ggplot - I am relatively new to R.
Thank you Markus and Jan, yes you are correct. df$opens was a factor and changing it to a numeric solved the problem.
opens = as.numeric(c(BTC$Open, BCH$Open, LTC$Open, ETH$Open, XRP$Open, XVG$Open))
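One caveat worth checking with this kind of fix: if the values being converted are already a factor (rather than character), as.numeric() returns the internal level codes, not the numbers shown. Converting through as.character() first avoids that. The prices below are invented for illustration.

```r
# Hypothetical prices stored as a factor
open_factor <- factor(c("101.5", "99.2", "250.0"))

# Direct conversion returns the internal level codes, not the prices
codes <- as.numeric(open_factor)
# codes is 1 3 2

# Going through character first recovers the actual values
prices <- as.numeric(as.character(open_factor))
# prices is 101.5 99.2 250.0
```

Running str(df) before plotting shows the type of each column, which makes this kind of problem easy to spot.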

Resources