Plot a Bar Chart based on Row Names - r

I am trying to plot a dataframe as follows:
A 1
C 5
B 4
Z 10
M 7
and would like it to show the data in the given order (i.e. the first bar in the chart is A, the second is C, the third is B).
I have:
ggplot(pc,aes(x=Let,y=Count))+geom_bar(stat="identity")
But it plots the bars in the alphabetical order of the Let column rather than in the order of the rows.
df<-data.frame(c('A','C','B','Z','M'),c(1,5,4,10,7))

One way is to convert the Let column to a factor whose levels are in the order you want to see them, and then call ggplot.
library(tidyverse)
df$Let <- factor(df$Let, levels = df$Let)
ggplot(df,aes(x=Let,y=Count))+geom_bar(stat="identity")
data
df<-data.frame(Let = c('A','C','B','Z','M'),Count = c(1,5,4,10,7))
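The reason this works is that ggplot places discrete x values in the order of the factor levels, and factor() sorts the levels alphabetically unless told otherwise. A minimal base R sketch of the difference, using the same example data. Wrapping the column in unique() is an addition that is not in the answer above; it is assumed here as a guard against repeated labels, since factor() rejects duplicated levels:

```r
df <- data.frame(Let = c("A", "C", "B", "Z", "M"),
                 Count = c(1, 5, 4, 10, 7))

# Default: levels are sorted alphabetically
default_levels <- levels(factor(df$Let))

# Explicit: levels follow the order of first appearance in the data
df$Let <- factor(df$Let, levels = unique(df$Let))
ordered_levels <- levels(df$Let)

print(default_levels)
print(ordered_levels)
```

After the conversion, the ggplot call from the answer can be used unchanged.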

Related

R boxplot with several variables - changing variable names on x-axis

I am new to R and having issues figuring out how to plot multiple variables in the same boxplot and have the x-axis display the variable names instead of 1 2 3 4.
In other words I want 1 to be Hi_24h, 2 = Hi_mo, etc.
boxplot(project$Hi_24h, project$Hi_mo, project$Lo_24h, project$Lo_mo)
Try:
boxplot(project, names=names(project))
If you do not want all of your columns and would like to select them manually, then create a vector of names:
mynames <- c("Hi_24h", "Hi_mo", "Lo_24h", "Lo_mo")
boxplot(project$Hi_24h, project$Hi_mo, project$Lo_24h, project$Lo_mo, names = mynames)
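A shorter variant is to subset the data frame by those names, since boxplot() labels the boxes of a data frame with its column names. The sketch below uses a made-up project data frame (the values and the extra Other column are hypothetical), and the plot = FALSE call is there only so the labels can be inspected without drawing:

```r
set.seed(1)
# Hypothetical stand-in for the real `project` data
project <- data.frame(Hi_24h = rnorm(10), Hi_mo = rnorm(10),
                      Lo_24h = rnorm(10), Lo_mo = rnorm(10),
                      Other = rnorm(10))
mynames <- c("Hi_24h", "Hi_mo", "Lo_24h", "Lo_mo")

# Subsetting keeps the column names, which become the x-axis labels
stats <- boxplot(project[mynames], plot = FALSE)
print(stats$names)

# Draw the plot itself
boxplot(project[mynames])
```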

P value from Jaccard's index

My table has 40 rows and 4 columns. The first column belongs to one group and the remaining columns constitute the other group.
I am using the following commands to calculate Jaccard's index:
x <- read.csv(file name,header=T, sep= )
jac <- vegdist(x,method="jaccard")
From this output (jac), how can I find the p value for the two groups?
And how can I plot a notched box plot of these two groups?
When I use boxes(as.matrix(jac)~x$first column,notch=TRUE)
it shows 40 box plots. Why is that?

Creating stacked barplots in R using different variables

I am a novice R user, hence the question. I refer to the solution on creating stacked barplots from R programming: creating a stacked bar graph, with variable colors for each stacked bar.
My issue is slightly different. I have four-column data. The last column is the sum of the first three columns. I want to plot a bar chart that shows 1) the summed total (i.e. the 4th column) as the height of each bar, and 2) each bar split by the relative contributions of the three columns.
I was hoping someone could help.
Regards,
Bernard
If I understood correctly, this may do the trick.
The following code works for this example df data frame:
df <- read.table(header = TRUE, text = "
a b c sum
1 9 8 18
3 6 2 11
1 5 4 10
23 4 5 32
5 12 3 20
2 24 1 27
1 2 4 7
")
Because you want to plot the actual values in your data frame rather than a count of observations, you need to use geom_bar(stat="identity") in ggplot2. Some data manipulation is necessary too. You don't need the sum column: ggplot stacks the three values, so the height of each bar is their sum.
library(ggplot2)
library(reshape2) # melt() is assumed to come from reshape2
df <- df[,-ncol(df)] # drop the last column (assumed to be the sum)
df$event <- seq.int(nrow(df)) # row index, so values from the same row end up in the same bar
df <- melt(df, id='event') # reshape to long format so that ggplot can read it
px = ggplot(df, aes(x = event, y = value, fill = variable)) + geom_bar(stat = "identity")
print (px)
This code generates a stacked bar chart with one bar per row of the original data.
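For comparison, a rough base-graphics sketch of the same chart, which is not the approach used in the answer above: barplot() stacks the rows of a matrix within each bar, so transposing the three value columns gives one bar per original row. The data frame is rebuilt here so that the snippet runs on its own, and the colors are arbitrary choices:

```r
df <- data.frame(a = c(1, 3, 1, 23, 5, 2, 1),
                 b = c(9, 6, 5, 4, 12, 24, 2),
                 c = c(8, 2, 4, 5, 3, 1, 4))
df$sum <- df$a + df$b + df$c

# Rows of the matrix become the stacked segments, columns become the bars
parts <- t(as.matrix(df[, c("a", "b", "c")]))

# Each bar's total height equals the precomputed sum column
bar_heights <- colSums(parts)
print(bar_heights)

barplot(parts, names.arg = seq_len(ncol(parts)),
        col = c("tomato", "seagreen", "steelblue"),
        legend.text = rownames(parts),
        xlab = "event", ylab = "value")
```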

Color bar graph according to grouping variable

I have been searching Stack Overflow for some time, but no answer corresponds directly to what I am looking for.
I want to plot some results in a bar graph, and color the bars according to the grouping values in another column of my dataset. So, column 1 is the grouping variable ("v" and "d"), and column 2 is the plotted value. I am trying to make a list out of column 1 that I could use as a color argument, but I can't find a way to define that exactly.
status diff
d -2141,5
v 510
d -947
v 867
d -960,5
v 903
d -421
v 1285,5
d -1155
v 556,5
Thanks !
One approach is to define the colors in a named vector and index it by the grouping column. Converting the column with as.character() makes this work whether status is a factor or a character vector; indexing with a factor directly would use its integer codes rather than its labels:
df <- data.frame(status=c('d','v','d','v','d','v','d','v','d','v'), diff=c(-2141.5,510,-947,867,-960.5,903,-421,1285.5,-1155,556.5))
cols <- c(d='red', v='green')
barplot(df$diff, col=cols[as.character(df$status)])
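To see why the lookup should go through the labels rather than the factor itself, here is a small sketch with a hypothetical color vector whose names are deliberately not in alphabetical order. Indexing by the factor uses its integer codes, so the colors come out swapped; indexing by the character labels matches on the names:

```r
status <- factor(c("d", "v", "d", "v"))
cols <- c(v = "green", d = "red")  # note: v is listed before d

# Factor index: uses the codes (d = 1, v = 2) as positions in cols
by_code <- unname(cols[status])

# Character index: matches on the names of cols
by_name <- unname(cols[as.character(status)])

print(by_code)
print(by_name)
```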

R boxplot ggplot issues

I am new to R and am trying to do some graphics using ggplot and a bit of reverse engineering. I have a data frame as follows:
> data
experiments percentages
1 A 72.11538
2 A 90.62500
3 A 91.52542
4 B 94.81132
5 B 96.95122
6 B 98.95833
7 C 83.75000
8 C 84.84848
9 C 91.12903
Because A and B are similar experiments, I do the following:
data$experiments[data$experiments == "B"] = "A"
If I do now
ggplot(data, aes(x = experiments, y = percentages)) + geom_boxplot()
I get one box for A, one for C but still I get a label for B!
Is there any way of getting rid of B on the X axis?
Thanks a lot for your help
I'm guessing that experiments in data is a factor. If you run str(data), I imagine you will see that experiments is a factor with 3 levels: A, B, and C. In R versions before 4.0.0, strings were turned into factors by default when a data frame was created.
The idea of factors is that they represent a set of possible values, even if not all of those values appear in the actual data. There are two ways to fix this.
Convert the column to a string:
data$experiments <- as.character(data$experiments)
Or remove the unused level from the factor:
data$experiments <- droplevels(data$experiments)
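A small base R sketch of the behavior described above, using a factor with the same nine experiment labels as in the question: reassigning B to A leaves B behind as an empty level, and droplevels() removes it.

```r
experiments <- factor(c("A", "A", "A", "B", "B", "B", "C", "C", "C"))

# Merge B into A; the level B stays in the factor even though it is now unused
experiments[experiments == "B"] <- "A"
levels_before <- levels(experiments)
counts_before <- table(experiments)

# Drop levels that no longer occur in the data
experiments <- droplevels(experiments)
levels_after <- levels(experiments)

print(levels_before)
print(counts_before)
print(levels_after)
```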
