I'm trying to create stacked barplots using a large dataset (80,000 lines).
POS C1 C2 SELF
1 1 0.546982 0.256896 0.196122
2 2 0.628456 0.263229 0.108315
3 3 0.652629 0.256041 0.091330
4 4 0.562783 0.264318 0.172898
5 5 0.562783 0.264318 0.172898
6 6 0.180272 0.571032 0.248696
... (continues to row 80,000)
I imported my CSV file using read.csv() and converted it to a matrix using as.matrix(). I tried to plot it with barplot():
barplot(as.matrix(C3_GO2_06))
But I got a plot like this:
I need each row in POS to be a sample, i.e. the column POS should be the x-axis, with the values of C1, C2 and SELF stacked, but I don't know how to do that. Is it necessary to convert to a data frame?
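A minimal sketch of one way to get this with base R, assuming the data has the columns POS, C1, C2 and SELF shown above. The names dat and heights and the colours are illustrative choices, not from the original post. barplot() draws one bar per matrix column and stacks the rows inside it, so the value columns are transposed first and POS is used only for the axis labels.

```r
# Small stand-in for the 80,000-row data set (values copied from the question).
dat <- data.frame(
  POS  = 1:6,
  C1   = c(0.546982, 0.628456, 0.652629, 0.562783, 0.562783, 0.180272),
  C2   = c(0.256896, 0.263229, 0.256041, 0.264318, 0.264318, 0.571032),
  SELF = c(0.196122, 0.108315, 0.091330, 0.172898, 0.172898, 0.248696)
)

# Drop POS and transpose: rows become C1/C2/SELF, columns become positions,
# so each position is one stacked bar.
heights <- t(as.matrix(dat[, c("C1", "C2", "SELF")]))

barplot(heights,
        names.arg = dat$POS,                       # POS labels on the x-axis
        col = c("steelblue", "orange", "grey60"),  # illustrative colours
        legend.text = rownames(heights),
        xlab = "POS", ylab = "Proportion")
```

With tens of thousands of bars it may help to pass border = NA and space = 0 so the bar outlines do not hide the fill colours.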
I have a CSV file with 15 rows. Each row has an individual label in the first column, followed by 6000 values associated with that label. I'd like to plot them all together in one graph, formatted with the same y-axis limits. The traces should not overlap: they should be stacked on top of each other, with the respective label at the side of each trace. Is there a way to accomplish this in R (maybe using facets or plotly)?
data would look like this for example:
L1 4 5 7 6 ...... 5.5
L2 9 5 8 .6 ...... 3
L3 2 1 8 4.2 ...... 6.2
.
.
.
L15 2 3.2 4 2 ..... 4.2
L1 through L15 represent the label for each row. I would want to plot out the values that come after the label on the y-axis, and just have the x-axis be number of values (6000) since that represents time (in frames).
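One possible approach with ggplot2 facets, sketched under the assumption that the file has been read into a 15 x 6000 numeric matrix with the labels as row names. The matrix m below holds made-up placeholder values, and the names long, label, frame and value are illustrative. facet_grid() with its default fixed scales gives every panel the same y-axis limits and puts each label in a strip at the side.

```r
library(ggplot2)

# Placeholder data: 15 labelled rows of 6000 made-up values each.
set.seed(1)
m <- matrix(rnorm(15 * 6000), nrow = 15,
            dimnames = list(paste0("L", 1:15), NULL))

# Long format: one row per (label, frame) pair.
# as.vector(t(m)) walks through m row by row, matching the rep() calls below.
long <- data.frame(
  label = factor(rep(rownames(m), each = ncol(m)), levels = rownames(m)),
  frame = rep(seq_len(ncol(m)), times = nrow(m)),
  value = as.vector(t(m))
)

p <- ggplot(long, aes(x = frame, y = value)) +
  geom_line() +
  facet_grid(label ~ .) +   # one panel per label, stacked vertically
  labs(x = "Frame", y = "Value")
print(p)
```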
I am trying to use the R barplot function to plot the following array on the same graph:
ID 1 2 3 4 5 6 7 8
HeL 0 2 1 4 2 3 2 4
CaC 2 0 0 2 1 5 7 8
NIH 1 2 5 6 3 5 7 9
I would need to have the barplot of each row having its own y-axis, but the x-axis should be common for all rows. What I have achieved so far, is to read the matrix from the file "rna.tab" and then plot each row separately:
dat <- read.table ("rna.tab", row.names=1, header=TRUE)
barplot (as.matrix (dat[,1]))
barplot (as.matrix (dat[,2]))
barplot (as.matrix (dat[,3]))
but I didn't succeed in plotting them all together.
Thanks in advance-
Arturo
Is this what you are looking for? If it isn't, could you please make a manual example of what you want and post the image?
par(mfrow = c(ncol(dat),1), mar = c(2.5,4,1,1))
apply(dat, 2, barplot, beside = TRUE)
par(mfrow = c(1,1))
The first par() call says you want a grid of plots with as many rows as there are columns of dat and 1 column, and changes the margins of each plot to fit. The apply() call makes a barplot for each column of dat, and beside = TRUE puts the bars next to each other. The final par() call resets the plotting grid to a single graph, so the next time you plot something you aren't just making a bunch of tiny plots.
Thanks Barker for the fix and sorry for taking so long to get back to you, but I was sick for almost one week.
Your code works great, the only thing is that, since I need to plot the rows and not the columns, it should be:
apply(dat, 1, barplot, beside = TRUE)
Sorry for not being clear about this point.
I have just one last question, if you don't mind. Usually my real life matrix is 6000*30. This means that I have to plot 30 rows.
Usually I save the image to disk:
png ("plot.png")
par(mfrow = c(ncol(dat),1), mar = c(2.5,4,1,1))
apply(dat, 1, barplot, beside = TRUE)
dev.off ()
When I do this, the file "plot.png" contains only the plots of the last 4 rows instead of the plots of all rows. Also, since the x-axis is the same for all plots, would it be possible to draw it only at the bottom?
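A hedged sketch of one way to address both points, using the small example matrix from the question. One possible cause of the missing panels is that the grid set by mfrow has fewer cells than there are rows to plot, so later pages overwrite earlier ones in the single PNG file. Here the grid is sized with nrow(dat), the image height grows with the number of panels, and the bar labels are drawn only under the last panel. The output path, the pixel sizes and the margin values are assumptions to adjust.

```r
# Example matrix from the question (one row per sample, columns 1 to 8).
dat <- rbind(HeL = c(0, 2, 1, 4, 2, 3, 2, 4),
             CaC = c(2, 0, 0, 2, 1, 5, 7, 8),
             NIH = c(1, 2, 5, 6, 3, 5, 7, 9))
colnames(dat) <- 1:8

n <- nrow(dat)
out_file <- file.path(tempdir(), "plot.png")   # hypothetical output path

# Scale the image height with the number of panels so all rows fit on one page.
png(out_file, width = 800, height = 150 * n)
# One grid cell per row of dat; the outer margin (oma) leaves room
# for the shared x-axis labels under the last panel.
par(mfrow = c(n, 1), mar = c(0.5, 4, 1, 1), oma = c(3, 0, 0, 0))
for (i in seq_len(n)) {
  barplot(dat[i, ],
          # label the bars only on the bottom panel
          names.arg = if (i == n) colnames(dat) else rep("", ncol(dat)),
          ylab = rownames(dat)[i])
}
dev.off()
```

A for loop is used instead of apply() so that the panel index is available for deciding where to draw the labels.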
I am a novice R user, hence the question. I refer to the solution on creating stacked barplots from R programming: creating a stacked bar graph, with variable colors for each stacked bar.
My issue is slightly different. My data has 4 columns, and the last column is the sum of the first 3. I want to plot a bar chart in which 1) the height of each bar is the summed total (i.e. the 4th column), and 2) each bar is split according to the relative contributions of the three other columns.
I was hoping someone could help.
Regards,
Bernard
If I understood correctly, this may do the trick.
The following code works well for the example data frame df:
df <- read.table(header = TRUE, text = "
a b c sum
1 9 8 18
3 6 2 11
1 5 4 10
23 4 5 32
5 12 3 20
2 24 1 27
1 2 4 7
")
Because you want to plot the actual values in your data frame rather than a count of observations, you need geom_bar(stat = "identity") in ggplot2. Some data manipulation is necessary too. You also don't need the sum column: stacking the three components gives each bar a total height equal to their sum.
library(reshape2) # provides melt()
library(ggplot2)
df <- df[, -ncol(df)] # drop the last column (assumed to be the sum)
df$event <- seq_len(nrow(df)) # row index, so values from the same row end up in the same bar
df <- melt(df, id.vars = "event") # reshape to long format so ggplot can read it
px <- ggplot(df, aes(x = event, y = value, fill = variable)) + geom_bar(stat = "identity")
print(px)
This code generates the plot below.
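For comparison, a base R sketch of the same idea, assuming the three component columns are named a, b and c as in the example. It stacks the components, checks that the bar heights match the original sum column, and prints each total above its bar. The names parts, totals and mids are illustrative.

```r
# The three component columns from the example (sum column left out).
df <- data.frame(a = c(1, 3, 1, 23, 5, 2, 1),
                 b = c(9, 6, 5, 4, 12, 24, 2),
                 c = c(8, 2, 4, 5, 3, 1, 4))

parts  <- t(as.matrix(df))   # 3 x 7 matrix: one column per bar
totals <- colSums(parts)     # total bar heights, equal to the original sum column

# barplot() returns the x positions of the bar midpoints.
mids <- barplot(parts,
                names.arg = seq_len(ncol(parts)),
                legend.text = rownames(parts),
                ylim = c(0, max(totals) * 1.1))
text(mids, totals, labels = totals, pos = 3)   # write each total above its bar
```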
I'm in need of assistance... I'm using R to analyze some data... I have a frequency table called mytable... that I created like this:
mytable=table(cut(var1,12),cut(var2,12))
the table looks something like this:
1-2 2-3 3-4
1-3 2 1 2
3-6 0 1 4
6-9 7 1 8
except that it is a 12 by 12 table.
I used boxplot.matrix(mytable), and the boxplot looks OK, with the 12 boxes corresponding to my 12 strata, but it has the frequency on the y-axis and I want the y-axis to show the values from var1. How can I do this?
I wanted to post a picture, but my reputation wasn't high enough.
Use boxplot() before you summarize your data:
boxplot(var1)
If you want to see the distribution per split, use the formula format:
boxplot(var1 ~ cut(var2, 12))
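A small self-contained illustration of the formula form, using made-up values for var1 and var2 since the original data is not shown. The name strata is illustrative. Each box summarises the raw var1 values that fall into one var2 interval, so the y-axis is on the var1 scale rather than a frequency scale.

```r
# Made-up stand-ins for the asker's variables.
set.seed(42)
var1 <- rnorm(500, mean = 50, sd = 10)
var2 <- runif(500, min = 0, max = 100)

strata <- cut(var2, 12)   # 12 equal-width intervals of var2

# One box per var2 interval; the y-axis shows var1 values.
bp <- boxplot(var1 ~ strata, xlab = "var2 interval", ylab = "var1", las = 2)
```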
I can not seem to figure out how to get a nice barplot that contains the data from two tables that contain a different number of columns.
The tables in question are something like (snipped some data from the end):
> tab1
1 2 3 6 8 31
5872 1525 831 521 299 4
> tab2
1 2 3 4 22
7874 422 2 5 1
Note the column names and sizes are different. When I just do barplot() on one of these tables it comes out with the plot I'd like (showing the column names as the X-axis, frequencies on Y-axis). But, I would like these two side by side.
I've gotten as far as creating a data frame containing both variables as columns and the different row names in the first column (with data.frame() and merge()), but when I plot this the x-axis seems to be all wrong. Attempting to reorder the columns gives me an error about differing lengths.
Code:
combined <- merge(data.frame(tab1), data.frame(tab2), by = c('Var1'), all=T)
barplot(t(combined[,2:3]), names.arg = combined[,1], beside=T)
This shows a plot, but not all labels are present and the value for position 26 is plotted after 33.
Is there any simple way to get this plot working? A ggplot2 solution would be nice.
You can put all your data in one data frame, as in this example:
df<-data.frame(group=rep(c("A","B"),times=c(2,3)),
values=c(23,56,345,6,7),xval=c(1,2,1,2,8))
group values xval
1 A 23 1
2 A 56 2
3 B 345 1
4 B 6 2
5 B 7 8
Then ggplot() with geom_bar() can be used to plot the data.
ggplot(df,aes(xval,values,fill=group))+
geom_bar(stat="identity",position="dodge")
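Applied to the two tables from the question, the same idea might look like the sketch below. It uses only the values shown in the post (the asker noted that the tables were truncated), stored as named vectors, and the column names source, category and freq are illustrative. Converting the category to a factor with numerically sorted levels is one way to keep the x-axis in numeric order.

```r
library(ggplot2)

# Values as shown in the question, stored as named vectors.
tab1 <- c("1" = 5872, "2" = 1525, "3" = 831, "6" = 521, "8" = 299, "31" = 4)
tab2 <- c("1" = 7874, "2" = 422, "3" = 2, "4" = 5, "22" = 1)

# Stack both tables into one long data frame with a column naming the source.
long <- rbind(
  data.frame(source = "tab1", category = as.integer(names(tab1)),
             freq = as.vector(tab1)),
  data.frame(source = "tab2", category = as.integer(names(tab2)),
             freq = as.vector(tab2))
)

# Sort the levels numerically so that, for example, 22 comes before 31
# and after 8, instead of being ordered as text.
long$category <- factor(long$category, levels = sort(unique(long$category)))

p <- ggplot(long, aes(x = category, y = freq, fill = source)) +
  geom_bar(stat = "identity", position = "dodge")
print(p)
```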