R Plot multiple lines with dates - r

I am trying to create a line plot in R. For each 'RuleID' in my data frame I want to plot the 'ErrorCount' at each 'ProcessorTimeStamp':
DQ_Counts= data.frame(RuleID=c(1,2,1,2),
ProcessorTimeStamp=as.Date(c('2016-08-04','2016-08-04','2016-08-08','2016-08-08')),
ErrorCount=c(6,8,3,4))
# RuleID ProcessorTimeStamp ErrorCount
# 1 1 2016-08-04 6
# 2 2 2016-08-04 8
# 3 1 2016-08-08 3
# 4 2 2016-08-08 4
This is a plot I found online that I would like the end result to look like, although I am obviously not talking about trees. The code for this plot is here: Code for Tree Growth Plot, but I don't understand it well enough to make it work for me.
For my plot, 'ProcessorTimeStamp' would be my x and 'ErrorCount' would be my y. Each line would represent a different 'RuleID'.
One thing to note is that I have 'ErrorCounts' ranging from 0 to over 3 million (this is why I need to report on them to get them fixed!).
Thanks in advance.

This is probably the easiest way to get a basic plot like the one above with your data:
lattice::xyplot(ErrorCount~ProcessorTimeStamp, DQ_Counts,
groups=RuleID, auto.key=TRUE, type="l")
Alternatively, you could use ggplot2:
library(ggplot2)
ggplot(DQ_Counts, aes(ProcessorTimeStamp, ErrorCount, color=factor(RuleID))) + geom_line()
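For comparison, here is a minimal base R sketch of the same idea (one line per 'RuleID', dates on the x axis). It is only an illustration under the assumption that the data look like the sample in the question; the names `by_rule` and `cols` are made up for this example.

```r
# Sample data from the question
DQ_Counts <- data.frame(
  RuleID = c(1, 2, 1, 2),
  ProcessorTimeStamp = as.Date(c("2016-08-04", "2016-08-04",
                                 "2016-08-08", "2016-08-08")),
  ErrorCount = c(6, 8, 3, 4)
)

pdf(NULL)  # null device so the sketch runs anywhere; drop this line for interactive use

# One data frame per RuleID (by_rule and cols are hypothetical helper names)
by_rule <- split(DQ_Counts, DQ_Counts$RuleID)
cols <- seq_along(by_rule)

# Empty plot that spans all dates and counts, then one line per rule
plot(DQ_Counts$ProcessorTimeStamp, DQ_Counts$ErrorCount, type = "n",
     xlab = "ProcessorTimeStamp", ylab = "ErrorCount")
for (i in seq_along(by_rule)) {
  d <- by_rule[[i]]
  d <- d[order(d$ProcessorTimeStamp), ]  # draw points in date order
  lines(d$ProcessorTimeStamp, d$ErrorCount, col = cols[i])
}
legend("topright", legend = names(by_rule), col = cols, lty = 1, title = "RuleID")

invisible(dev.off())
```

Since the real counts range from 0 to over 3 million, plotting `log1p(ErrorCount)` instead of the raw counts might make the small values easier to see; that is a suggestion, not something the answers above tested.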

Related

Using the same column for multiple lines in gnuplot

I have a continuous stream of data in two columns that I am trying to plot. The data contains different trajectories, however, and I want gnuplot to plot these with lines but not connect the different trajectories. How would I signal gnuplot to recognize these different trajectories and not connect them?
Eg:
1 1
2 4
3 9
new traj
1 1
2 .5
3 .333
Sorry if this has been posted before, I searched for about an hour and gave up. Thanks in advance.

Plot aligned barplots in the same graph

I am trying to use the R barplot function to plot the following array on the same graph:
ID 1 2 3 4 5 6 7 8
HeL 0 2 1 4 2 3 2 4
CaC 2 0 0 2 1 5 7 8
NIH 1 2 5 6 3 5 7 9
I would need to have the barplot of each row having its own y-axis, but the x-axis should be common for all rows. What I have achieved so far, is to read the matrix from the file "rna.tab" and then plot each row separately:
dat <- read.table ("rna.tab", row.names=1, header=TRUE)
barplot (as.matrix (dat[,1]))
barplot (as.matrix (dat[,2]))
barplot (as.matrix (dat[,3]))
but I didn't succeed in plotting them all together.
Thanks in advance-
Arturo
Is this what you are looking for? If it isn't, could you please make a manual example of what you want and post the image?
par(mfrow = c(ncol(dat),1), mar = c(2.5,4,1,1))
apply(dat, 2, barplot, beside = TRUE)
par(mfrow = c(1,1))
The first par says you want a grid of plots with as many rows as there are columns of dat and 1 column, and changes the margins of the plot to be appropriate. The apply function makes a barplot for each column of dat, and beside = TRUE puts the columns next to each other. The next par resets the plotting grid to a single graph, so the next time you need to plot something you aren't just making a bunch of tiny plots.
Thanks Barker for the fix and sorry for taking so long to get back to you, but I was sick for almost one week.
Your code works great, the only thing is that, since I need to plot the rows and not the columns, it should be:
apply(dat, 1, barplot, beside = TRUE)
Sorry for not being clear about this point.
I have just one last question, if you don't mind. Usually my real life matrix is 6000*30. This means that I have to plot 30 rows.
Usually I save the image to disk:
png ("plot.png")
par(mfrow = c(ncol(dat),1), mar = c(2.5,4,1,1))
apply(dat, 1, barplot, beside = TRUE)
dev.off ()
When I do this, I get only the plot of the last 4 rows in the file "plot.png", instead of the plot of all rows. Also, since the x-axis is the same for all plots, would it be possible to draw it only at the end?
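A hedged sketch of one way to approach both follow-up points. One plausible reason for seeing only the last few rows is that the layout has fewer panels than there are plots, so later plots start a new page and overwrite the single PNG file; this is an assumption, not something confirmed in the thread. The sketch sizes the layout by nrow(dat), makes the image taller, and shows the x-axis labels only on the last panel. The sample matrix, the output path `out_file`, and the pixel sizes are illustrative choices.

```r
# Sample matrix from the question (rows = samples, columns = IDs 1..8)
dat <- matrix(c(0, 2, 1, 4, 2, 3, 2, 4,
                2, 0, 0, 2, 1, 5, 7, 8,
                1, 2, 5, 6, 3, 5, 7, 9),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("HeL", "CaC", "NIH"), 1:8))

out_file <- file.path(tempdir(), "plot.png")  # hypothetical output path

# Scale the image height with the number of rows so every panel gets room
png(out_file, width = 800, height = 200 * nrow(dat))

# One panel per ROW of dat, stacked in a single column
par(mfrow = c(nrow(dat), 1), mar = c(2.5, 4, 1, 1))

for (i in seq_len(nrow(dat))) {
  is_last <- i == nrow(dat)
  barplot(dat[i, ],
          ylab = rownames(dat)[i],
          axisnames = is_last)  # x-axis labels only under the last panel
}

invisible(dev.off())  # closing the device also discards its par() settings
```

With 30 rows the same code would produce a 6000 pixel tall image; splitting the rows across several files is another option if that is too large.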

Simple line plot using R ggplot2

I have data in .csv format as shown below. I am new to ggplot2 graphs and have not been able to plot it.
T L
141.5453333 1
148.7116667 1
154.7373333 1
228.2396667 1
148.4423333 1
131.3893333 1
139.2673333 1
140.5556667 2
143.719 2
214.3326667 2
134.4513333 3
169.309 8
161.1313333 4
I tried to plot a line graph using the following code:
data<-read.csv("sample.csv",head=TRUE,sep=",")
ggplot(data,aes(T,L))+geom_line()
but I got the following image, which is not what I want.
I want an image like the following:
Can anybody help me?
You want to use a variable for the x-axis that has lots of duplicated values and expect the software to guess that the order you want those points plotted is given by the order they appear in the data set. This also means the values of the variable for the x-axis no longer correspond to the actual coordinates in the coordinate system you're plotting in, i.e., you want to map a value of "L=1" to different locations on the x-axis depending on where it appears in your data.
This type of fairly nonsensical thing does not work in ggplot2 out of the box. You have to define a separate variable that has a proper mapping to values on the x-axis ("id" in the code below) and then overwrite the labels with the values for "L".
The code below shows you how to do this, but it seems like a different graphical display would probably be better suited for this kind of data.
data <- as.data.frame(matrix(scan(text="
141.5453333 1
148.7116667 1
154.7373333 1
228.2396667 1
148.4423333 1
131.3893333 1
139.2673333 1
140.5556667 2
143.719 2
214.3326667 2
134.4513333 3
169.309 8
161.1313333 4
"), ncol=2, byrow=TRUE))
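names(data) <- c("T", "L")
# Row index used as the real x position; the axis labels are then overwritten with L
data$id <- 1:nrow(data)
library(ggplot2)
ggplot(data,aes(x=id, y=T))+geom_line() + xlab("L") +
scale_x_continuous(breaks=data$id, labels=data$L)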
You have an error in your code, try this:
ggplot(data,aes(x=L, y=T))+geom_line()
Default arguments for aes are:
aes(x, y, ...)
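One of the answers above remarks that a different graphical display may suit this data better. As a rough illustration of what that could look like, the sketch below draws T against each distinct value of L as a strip chart and marks the mean of T per L. The choice of a strip chart and the name `means` are assumptions made for this example, not something proposed in the thread.

```r
# Data from the question
data <- data.frame(
  T = c(141.5453333, 148.7116667, 154.7373333, 228.2396667, 148.4423333,
        131.3893333, 139.2673333, 140.5556667, 143.719, 214.3326667,
        134.4513333, 169.309, 161.1313333),
  L = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 3, 8, 4)
)

pdf(NULL)  # null device so the sketch runs anywhere; drop this line for interactive use

# One vertical strip of points per distinct L value (groups sorted: 1, 2, 3, 4, 8)
stripchart(T ~ L, data = data, vertical = TRUE, pch = 16,
           xlab = "L", ylab = "T")

# Mean of T for each L value, drawn as a red cross over each strip
means <- tapply(data$T, data$L, mean)
points(seq_along(means), means, pch = 4, col = "red", cex = 1.5)

invisible(dev.off())
```

Unlike the line plot, this does not depend on the order of the rows in the file, because each point is placed by its L value alone.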

Histograms in R with a "more" category, similar to MS Excel

Consider the following frequency data:
> table(income)
income
3 5 6 7 8 5000
2 7 2 2 2 1
When I type hist(income) I get the following histogram:
So as you can see, the fact that most income values are concentrated around 5 and there is one value quite distant from the others makes the histogram not look very good. MS Excel can treat the 5000 value as belonging to another category, so the data would look like this instead:
> table(income)
income
3 5 6 7 8 more
2 7 2 2 2 1
So plotting this as a histogram would look much better, so you can see the frequency within a shorter range:
Is there any way to do this, either with the hist() function or with other functions from lattice or ggplot2? However, I don't want to overwrite the values that exceed a certain threshold, so that I don't lose any information.
Thanks a lot!
Data generation:
income <- c(rep(3,2), rep(5,7), rep(6,2), rep(7,2), rep(8,2), 5000)
Function for preparing data for plotting:
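nice.data <- function(x, threshold=10){
# Label values above the threshold as "More"; the original vector is left untouched
small <- sort(unique(x[x <= threshold]))
# Factor levels keep the numeric order, with "More" last
factor(ifelse(x > threshold, "More", x), levels=c(small, "More"))
}
Plotting (geom_bar is used because the labelled values are discrete, and recent ggplot2 versions do not accept a discrete x in geom_histogram):
library(ggplot2)
ggplot() + geom_bar(aes(x=nice.data(income))) + xlab("Income")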
Result:
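The same idea can also be sketched in base R without ggplot2: count the values with table() after relabelling everything above the threshold, then draw the counts with barplot(). The threshold of 10 is an assumed cut-off taken from the default used in nice.data; the names `labels`, `small` and `counts` are illustrative.

```r
income <- c(rep(3, 2), rep(5, 7), rep(6, 2), rep(7, 2), rep(8, 2), 5000)

threshold <- 10  # assumed cut-off; adjust to the data at hand

# Relabel only a copy, so the original income values are kept
labels <- ifelse(income > threshold, "More", income)

# Keep the numeric categories in order and put "More" last
small <- sort(unique(income[income <= threshold]))
counts <- table(factor(labels, levels = c(small, "More")))
print(counts)

pdf(NULL)  # null device so the sketch runs anywhere; drop this line for interactive use
barplot(counts, xlab = "Income", ylab = "Frequency")
invisible(dev.off())
```

Because `income` itself is never modified, the values above the threshold remain available for any later analysis.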

Plot multiple individual by one function?

Could you please help me to solve this problem:
I have a database like below:
Animal Milk Age
1 11.96703591 1
1 13.41236333 2
1 14.85769075 3
1 16.30301817 4
2 17.74834559 1
2 19.08465881 2
2 20.42097204 3
2 14.66094662 4
2 14.70197368 5
3 14.74300075 1
3 14.78402781 2
3 14.82505488 3
3 14.86608194 4
3 14.90710901 5
I want to make a plot between milk versus age, so I use function plot(Milk~Age, data=mydata)
My question is: how can I make the same plot (Milk~Age) for each individual by using only one function? I have about 200 animals, and I would rather not run the command 200 times to produce 200 curves.
Thanks
Phuong
One approach would be to use the ggplot2 library and make an individual facet for each animal. As you have many animals, you can change ncol= or nrow= in facet_wrap() to get a better view.
library(ggplot2)
ggplot(df,aes(x=Age,y=Milk))+geom_point()+facet_wrap(~Animal)
The following code should create as many plots as you have unique Animal values, and store them in separate pdf files in the working directory:
invisible(by(df, df$Animal, function(tmpdf) {
pdf(paste0("plot",tmpdf$Animal[1],".pdf"))
plot(Milk~Age, data=tmpdf, main=tmpdf$Animal[1])
dev.off()
}))
I would suggest using ggplot from the ggplot2 package:
ggplot(df,aes(x=Age,y=Milk, color=factor(Animal)))+geom_point()
Edit 1: actually, this would lose clarity with 200 animals. Did you want all these data points in one graph or spread out across 200 graphs? If the latter, then I agree with Didzis.
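A further variation, sketched here under the assumption that a single multi-page file is acceptable: open one pdf device, set a small grid with par(mfrow=), and loop over the animals. When a page is full, the pdf device starts a new one automatically. The file name `out_file` and the 2 x 2 grid are illustrative choices.

```r
# Sample data from the question
df <- data.frame(
  Animal = c(1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3),
  Milk = c(11.96703591, 13.41236333, 14.85769075, 16.30301817,
           17.74834559, 19.08465881, 20.42097204, 14.66094662, 14.70197368,
           14.74300075, 14.78402781, 14.82505488, 14.86608194, 14.90710901),
  Age = c(1:4, 1:5, 1:5)
)

out_file <- file.path(tempdir(), "milk_by_animal.pdf")  # hypothetical output path

pdf(out_file)
par(mfrow = c(2, 2))  # four animals per page

# One panel per animal; split() yields one data frame per Animal value
for (d in split(df, df$Animal)) {
  plot(Milk ~ Age, data = d, type = "b",
       main = paste("Animal", d$Animal[1]))
}

invisible(dev.off())
```

With about 200 animals and four panels per page this would give roughly 50 pages in one file, which may be easier to browse than 200 separate files.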
