I have a data frame (pLog) containing the number of reads per nucleotide for a ChIP-seq experiment on an E. coli genome (4.6 Mb). I want to plot the chromosomal position on the x-axis and the number of reads on the y-axis. To make it easier, I binned the data in windows of 100 bp, which gives a data frame of 46,259 rows and 2 columns. One column is named "position" and holds the chromosomal position of each bin (1, 101, 201, ...); the other is named "values" and contains the number of reads found in that bin (e.g. 210, 511, 315, ...). I have been using ggplot for all my analysis and would like to use it for this plot, if possible.
I would like the graph to look something like this:
but I haven't been able to plot it.
This is what my data looks like:
I tried:
ggplot(pLog,aes(position))+
  geom_histogram(binwidth=50)
ggsave("file.jpg")
And this is what it looks like :(
Many thanks!
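geom_histogram() is the wrong geom here: it counts how many rows fall into each bin of x, so the bars only show how many rows land in each bin, not the read counts. Your reads are already counted, so map them to y and try geom_line():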
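library(ggplot2)
# simulated example data: 1,000 bins of 100 bp (both columns must have the same length)
pLog=data.frame(position=seq(1,100000,by=100),
value=rnbinom(1000,mu=100,size=20))
ggplot(pLog,aes(x=position,y=value))+geom_line(alpha=0.7,col="steelblue")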
Most likely you will need to play around with the geom, transparency and colours to get the visualization you want.
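Since the reads are already counted per 100 bp bin, a filled area may get closer to a coverage-track look than a bare line. Here is a minimal sketch, assuming simulated counts in place of the real data and the column names from the question (position, values); the axis labels are my own wording:

```r
library(ggplot2)

# Simulated stand-in for the real data: one row per 100 bp bin
set.seed(1)
n_bins <- 46259
pLog <- data.frame(
  position = seq(1, by = 100, length.out = n_bins),
  values   = rnbinom(n_bins, mu = 100, size = 20)
)

# The counts are precomputed, so they are mapped to y;
# geom_area() fills the region between the counts and zero
p <- ggplot(pLog, aes(x = position, y = values)) +
  geom_area(fill = "steelblue", alpha = 0.7) +
  labs(x = "Chromosomal position (bp)", y = "Reads per 100 bp bin")
```

Printing p draws the plot, and something like ggsave("coverage.png", p, width = 10, height = 3) should write it to disk (the file name and size here are placeholders).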
I am trying to create a plot showing the number of items (e.g. pop_songs) released per year from a data frame I have (e.g. Music_Charts).
I have a year-released column in my data frame and can use that as the x variable, but I don't know what to use as the y variable for the boxplot, since the data frame holds the Top 500 ranked songs.
Based on your fairly general question: if your data frame has a column with the release year of each song, you can get the count for each year using table().
table(dataframe$year_released)
That should give you the number of entries for every year, which you can then plot (I'm guessing that's what you need).
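To make the counting step concrete, here is a small sketch in base R. The data frame below is a made-up stand-in for Music_Charts, and the column name n_songs is my own choice, not something from the question:

```r
# Hypothetical stand-in for the real data: one row per charting song
Music_Charts <- data.frame(
  year_released = c(1999, 2001, 2001, 2003, 2003, 2003, 2005)
)

# table() counts how many rows share each year
songs_per_year <- table(Music_Charts$year_released)

# Convert to a data frame in case a plotting function needs columns
counts <- as.data.frame(songs_per_year, stringsAsFactors = FALSE)
names(counts) <- c("year_released", "n_songs")

# One bar per year; bar height is the number of songs released that year
barplot(songs_per_year, xlab = "Year released", ylab = "Number of songs")
```

The counts are what go on the y-axis, so a bar chart is probably a better fit than a boxplot here.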
I have a table in R that has four columns of temperatures. I want the title of all 4 columns to be "Temperature".
I currently have them as "Temperature.1", "Temperature.2" and so on. This is my first time using R, and I believe the names will affect a chi-squared test that I want to run.
Thanks
I've cleaned up a larger dataframe to a simple table that looks something like this (note this is a small sample of a couple hundred rows):
Name<-c("Bob","Bob","Bob","Bob","Bob","Anne","Anne","Anne","Anne","Anne","Anne","Joe","Joe")
start_event <-c(0,266,352,354,553,0,36,192,206,458,997,1102,1198)
end_event <-c(27.5,296,354,402,561,27.5,71,203,217,515,1033,1109,1215)
duration <-c(27.5,30,2,48,8,27.5,35,11,11,57,36,7,17)
run<-c(1,2,3,4,5,1,2,3,4,5,6,1,2)
df<-data.frame(Name,run,start_event,end_event,duration)
My goal is to create a graph that has the names on the y-axis, the total event duration on the x-axis (the min. would be the start_event and the max would be the final end_event).
For each person, a bar would represent the duration of their activity, from start to end. There would be gaps with no bars for the times they were not active.
I've tried mashing together some code from another example (link below) using geom_rect, geom_bar and geom_line, but I am having issues with discrete/continuous values.
To help frame this visually, this answer produces a result similar to what I would like to achieve: https://stackoverflow.com/a/17130467
Dodging the bars/rectangles is not needed, stacked in a single horizontal line is preferred.
Thank you in advance for any guidance/help!
You could use geom_segment():
ggplot(df,aes(y=Name,yend=Name,x=start_event,xend=end_event,color=Name)) + geom_segment(size=6)
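For completeness, here is a self-contained version of the same idea using the data from the question. One assumption on my part: it uses linewidth instead of size, which is how recent ggplot2 releases (3.4.0 and later) spell line thickness; on an older install, size=6 as above is the one to use. The axis label and hidden legend are just my choices.

```r
library(ggplot2)

# Data from the question
Name <- c("Bob","Bob","Bob","Bob","Bob","Anne","Anne","Anne","Anne","Anne","Anne","Joe","Joe")
start_event <- c(0,266,352,354,553,0,36,192,206,458,997,1102,1198)
end_event <- c(27.5,296,354,402,561,27.5,71,203,217,515,1033,1109,1215)
duration <- c(27.5,30,2,48,8,27.5,35,11,11,57,36,7,17)
run <- c(1,2,3,4,5,1,2,3,4,5,6,1,2)
df <- data.frame(Name, run, start_event, end_event, duration)

# One thick horizontal segment per event, from start_event to end_event;
# y and yend are both Name, so each person's events share a single row
p <- ggplot(df, aes(x = start_event, xend = end_event,
                    y = Name, yend = Name, colour = Name)) +
  geom_segment(linewidth = 6) +   # assumes ggplot2 >= 3.4.0
  labs(x = "Event time", y = NULL) +
  theme(legend.position = "none") # names are already on the y-axis
```

Because Name stays discrete on y and the times stay continuous on x, this sidesteps the discrete/continuous clash, and gaps between events are simply left empty.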
I have a Google spreadsheet containing a column of datetime values. There can be as many as 10 values that occur on the same date (at different times). Is it possible to create a graph that shows the frequency of "events per day", such that the x-axis is a date and the y-axis is a numerical value from 0 to 10? There doesn't appear to be anything in the chart wizard that resembles this idea, and my knowledge of spreadsheets is just about nil...
It is possible, but I can't say I'm impressed by the results. If your column of datetime values starts in A2, please insert =int(A2) in B2 (this strips the time and leaves just the date) and in C2:
=if(B1=B2,1+C1,1)
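and copy both formulas down as far as needed. This relies on column A being sorted, since each row is only compared with the one above it. Column C then holds a running count within each date, so the last row for each date gives that day's total. You might want to add an earlier date in B1.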