Lattice Histogram multiplot with different layouts - r

I wanted to take a closer look at the distribution of reaction times (RT) on questions. To do so, I used lattice to make histograms and show them in one figure. I used the following settings:
histogram( ~ rt | pp,layout=c(6,4),data = data.frame,
main=list(
label="RT distribution per subject",
cex=1.5),
xlab=list(
label="RT (s)",
cex=0.75),
ylab=list(
label="Percentage occurrence",
cex=1.2),
xlim=c(0,40),
breaks = 10
)
In other words, I want each participant's data to be plotted on an x-axis from 0 to 40 seconds, divided into 10 bars. This works for some sub-plots, but many of them use different breaks. I added the figure. Why does the function not use the same breaks for every sub-plot?

I found a solution to the problem. Instead of specifying the number of breaks, you can specify a vector of break points as follows:
histogram( ~ rt | pp,layout=c(6,4),data = data.frame,
xlim=c(0,40),
breaks = c(0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55)
)
or, more simply,
histogram( ~ rt | pp,layout=c(6,4),data = data.frame,
xlim=c(0,40),
breaks = seq(from=0,to=55,by=1)
)
Note, however, that the range of the breaks must include every data point. For more, see the CRAN documentation for lattice's histogram function.
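To avoid hard-coding the upper limit, the break points can be derived from the data so that they always cover every observation. The following is a minimal sketch, not the original poster's code: rt_data is a hypothetical data frame standing in for the real RT data, and rt_breaks is an illustrative name.

```r
library(lattice)

# Hypothetical data standing in for the original RT data
set.seed(1)
rt_data <- data.frame(
  pp = factor(rep(1:4, each = 50)),
  rt = rexp(200, rate = 1 / 8)      # non-negative "reaction times"
)

# One shared vector of break points, 1 second wide,
# reaching from 0 up to the largest observed value
rt_breaks <- seq(from = 0, to = ceiling(max(rt_data$rt)), by = 1)

# Every panel now uses the same bins
p <- histogram(~ rt | pp, data = rt_data,
               xlim = c(0, 40),
               breaks = rt_breaks)
print(p)
```

Because the same vector is passed to every panel, the bars line up across subjects even when an individual subject's range is much narrower than the overall one.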

Related

Q: How to combine two types of lines using ggplot?

I am trying to plot the following graph:
This plot was made using a command in R; however, I need to change the x-axis. As you can see, the x-axis starts at 0 and finishes at 46. I want the x-axis to start at 1972 and finish at 2018, i.e. seq(1972, 2018). The data used for this graph is the following:
For regime 1:
structure(c(0.996336942021931, 0.982749831853788, 0.25257000136794,
0.707797489518183, 0.339372705184362, 0.999209103898399, 0.348786927897612,
0.821500770877589, 0.569473419352121, 0.544946043345147, 0.15347485404411,
0.987921203799956, 0.00247541125926418, 0.999925918450173, 0.996940249283586,
0.0141234625702467, 0.105466117156579, 0.999992944275275, 0.991723355647765,
0.0958472062267191, 0.0362729940372193, 0.999999790503447, 0.0750715811130157,
0.999975836828039, 0.998991768987905, 0.327943641159186, 5.05723080618291e-05,
0.999999999869691, 0.995538324405397, 0.123355227931813, 0.999776636825943,
0.00875781169836433, 0.696284480883101, 0.854839147672286, 0.113243492249383,
0.00984853715078062, 0.442061195271808, 0.999959859676686, 0.0249739384218217,
0.715262186931097, 0.269481397703521, 0.708458897302807, 0.0444979324520481,
0.000133950914911277, 0.997976154782607, 0.191386380576805, 0.99775339928206,
0.97921531595208, 0.27690132186733, 0.671995422154737, 0.458800347851363,
0.999155966774432, 0.417000082142666, 0.838969001100901, 0.576424593247709,
0.439169303472056, 0.227227711549776, 0.978527102362448, 0.00408165810824898,
0.999955057843957, 0.994643622809094, 0.00847570472458959, 0.163000467960203,
0.999995704786608, 0.987482614312069, 0.0569007267419926, 0.0585312256476362,
0.999999671060746, 0.118213072794827, 0.99998536150034, 0.998897081324845,
0.212968271334585, 8.35316288758489e-05, 0.999999999920876, 0.993537683112221,
0.188538497918178, 0.999604116439039, 0.00905848219612739, 0.769430430615986,
0.794457999021984, 0.0665707154963958, 0.00776458004359329, 0.5668500474175,
0.999931021995446, 0.0265573724408095, 0.661699294173752, 0.296009575623967,
0.587638579198176, 0.0251758869152202, 0.000220356219397782,
0.997352716237698, 0.191386380576805), .Dim = c(46L, 2L))
For regime 2:
structure(c(0.00366305797806813, 0.0172501681462116, 0.74742999863206,
0.292202510481817, 0.660627294815638, 0.000790896101601132, 0.651213072102388,
0.178499229122411, 0.430526580647879, 0.455053956654853, 0.846525145955889,
0.0120787962000438, 0.997524588740736, 7.40815498269273e-05,
0.00305975071641352, 0.985876537429753, 0.894533882843421, 7.05572472485335e-06,
0.00827664435223535, 0.904152793773281, 0.963727005962781, 2.09496553467159e-07,
0.924928418886985, 2.41631719608902e-05, 0.00100823101209502,
0.672056358840815, 0.999949427691938, 1.30308744399533e-10, 0.00446167559460289,
0.876644772068187, 0.00022336317405711, 0.991242188301636, 0.303715519116899,
0.145160852327714, 0.886756507750617, 0.990151462849219, 0.557938804728191,
4.01403233139628e-05, 0.975026061578178, 0.284737813068903, 0.730518602296479,
0.291541102697193, 0.955502067547952, 0.999866049085089, 0.00202384521739295,
0.808613619423195, 0.00224660071793958, 0.0207846840479196, 0.72309867813267,
0.328004577845263, 0.541199652148637, 0.000844033225568314, 0.582999917857334,
0.161030998899099, 0.423575406752291, 0.560830696527944, 0.772772288450224,
0.0214728976375518, 0.995918341891751, 4.49421560426429e-05,
0.00535637719090558, 0.99152429527541, 0.836999532039797, 4.29521339242403e-06,
0.0125173856879312, 0.943099273258007, 0.941468774352364, 3.28939253926857e-07,
0.881786927205173, 1.46384996596921e-05, 0.00110291867515508,
0.787031728665414, 0.999916468371124, 7.91243531099699e-11, 0.00646231688777926,
0.811461502081822, 0.00039588356096145, 0.990941517803873, 0.230569569384014,
0.205542000978016, 0.933429284503604, 0.992235419956407, 0.4331499525825,
6.89780045536876e-05, 0.973442627559191, 0.338300705826248, 0.703990424376033,
0.412361420801824, 0.97482411308478, 0.999779643780602, 0.00264728376230197,
0.808613619423195), .Dim = c(46L, 2L))
I know that the red line can be plotted using geom_line, but I do not know how to plot the black bars. Maybe using geom_bar? Also, how can I merge the two plots?
Thanks for your help
It's actually plotted using base R (good old times). Using your first data set, the one for regime 1:
plot(Regime1[,1],type="h",xaxt="n",ylab="",cex.axis=0.6,xlab="",xlim=c(0,46))
lines(Regime1[,2],col="red")
mtext("Smoothed Probabilities",2,padj=-5,col="red",cex=0.7)
mtext("Fitted Probabilities",4,padj=1,cex=0.7)
axis(side=1,at=c(0,20,46),labels=c(1972,1992,2018))
Your x-axis values are actually 0:46, so you turn off the x-axis ticks using xaxt="n". Then, with axis(), you put ticks at 0, 20 and 46 with the labels 1972, 1992 and 2018.
The result also depends on your plotting device, so you might have to change the padj parameter in the mtext() calls to adjust the position of the axis labels. You can also check out other posts on base R plotting functions.
In ggplot2, you can create a data.frame with an index column holding the years you need, and call geom_segment() to plot the vertical lines:
library(ggplot2)
Regime1 = data.frame(Regime1)
colnames(Regime1) = c("Fitted","Smoothed")
Regime1$index = 1:nrow(Regime1)+1972
ggplot(Regime1,aes(x=index))+
geom_segment(aes(xend=index,y=0,yend=Fitted,col="Fitted")) +
geom_line(aes(y=Smoothed,col="Smoothed")) + theme_minimal() +
scale_color_manual(values=c("black","red"))
For a ggplot2 solution, you are going to need a data.frame or tibble with 4 columns (Regime, Year, Smoothed, and Fitted). Based on the data you provided, this would have 92 rows.
Assuming you use those column names and store your data in the variable example.dat, a ggplot2 solution is
library(ggplot2)
library(magrittr)  # provides the %>% pipe
example.dat %>%
ggplot( aes(x=Year) ) +
geom_line( aes(y=Smoothed), color="red" ) +
geom_linerange( aes(ymax=Fitted), ymin=0 ) +
facet_wrap( ~ Regime, ncol=1 )
Then you might need to adjust some of the scales to get the best plot.
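The four-column data frame described above can be assembled from the two 46 x 2 matrices in the question. This is a sketch under stated assumptions: the matrices below are random stand-ins rather than the posted values, the helper to_long is hypothetical, column 1 is assumed to hold the fitted values and column 2 the smoothed ones (as in the other answer), and the years follow that answer's index of 1972 plus the row number.

```r
# Hypothetical stand-ins for the two 46 x 2 matrices in the question
set.seed(42)
Regime1 <- matrix(runif(92), ncol = 2)
Regime2 <- 1 - Regime1

# Assumption: column 1 = fitted values, column 2 = smoothed values
to_long <- function(m, regime) {
  data.frame(Regime   = regime,
             Year     = 1972 + seq_len(nrow(m)),
             Fitted   = m[, 1],
             Smoothed = m[, 2])
}

# Stack both regimes: 2 x 46 = 92 rows
example.dat <- rbind(to_long(Regime1, "Regime 1"),
                     to_long(Regime2, "Regime 2"))
str(example.dat)
```

The Regime column is what facet_wrap( ~ Regime, ncol=1 ) uses to split the plot into one panel per regime.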

Using multiple datasets for one graph

I have 2 csv data files. Each file has a "date_time" column and a "temp_c" column. I want to make the x-axis have the "date_time" from both files and then use 2 y-axes to display each "temp_c" with separate lines. I would like to use plot instead of ggplot2 if possible. I haven't been able to find any code help that works with my data and I'm not sure where to really begin. I know how to do 2 separate plots for these 2 datasets, just not combine them into one graph.
plot(grewl$temp_c ~ grewl$date_time)
and
plot(kbll$temp_c ~ kbll$date_time)
work separately but not together.
As others indicated, it is easy to add new data to a graph using points() or lines(). One thing to be careful about is how you format the axes as they will not be automatically adjusted to fit any new data you input using points() and the like.
I've included a small example below that you can copy, paste, run, and examine. Pay attention to why the first plot fails to produce what you want (axes are bad). Also note how I set this example up generally - by making fake data that showcase the same "problem" you are having. Doing this is often a better strategy than simply pasting in your data since it forces you to think about the core component of the problem you are facing.
#for same result each time
set.seed(1234)
#make data
set1<-data.frame("date1" = seq(1,10),
"temp1" = rnorm(10))
set2<-data.frame("date2" = seq(8,17),
"temp2" = rnorm(10, 1, 1))
#first attempt fails
#plot one
plot(set1$date1, set1$temp1, type = "b")
#add points - oops only three showed up bc the axes are all wrong
lines(set2$date2, set2$temp2, type = "b")
#second attempt
#adjust axes to fit everything (set to min and max of either dataset)
plot(set1$date1, set1$temp1,
xlim = c(min(set1$date1,set2$date2),max(set1$date1,set2$date2)),
ylim = c(min(set1$temp1,set2$temp2),max(set1$temp1,set2$temp2)),
type = "b")
#now add the other points
lines(set2$date2, set2$temp2, type = "b")
# we can even add regression lines
abline(reg = lm(set1$temp1 ~ set1$date1))
abline(reg = lm(set2$temp2 ~ set2$date2))
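The question also asks for two y-axes. One common base R approach, sketched here with the same kind of fake data, is to draw the second series on top of the first with par(new = TRUE) and give it its own axis on the right-hand side. The variable names and the choice of colour are illustrative assumptions.

```r
set.seed(1234)
# Fake data; the second series lives on a very different scale
set1 <- data.frame(date1 = seq(1, 10), temp1 = rnorm(10))
set2 <- data.frame(date2 = seq(8, 17), temp2 = rnorm(10, 20, 5))

# Shared x-range so both series line up horizontally
x_range <- range(set1$date1, set2$date2)

op <- par(mar = c(5, 4, 4, 4) + 0.1)   # extra room for the right-hand axis

# First series with the usual left-hand y-axis
plot(set1$date1, set1$temp1, type = "b", xlim = x_range,
     xlab = "date", ylab = "temp1")

# Second series drawn over the same plot region, without axes
par(new = TRUE)
plot(set2$date2, set2$temp2, type = "b", col = "red", xlim = x_range,
     axes = FALSE, xlab = "", ylab = "")

# Right-hand axis and label for the second series
axis(side = 4, col.axis = "red")
mtext("temp2", side = 4, line = 3, col = "red")

par(op)                                # restore the previous settings
```

Both plot() calls must use the same xlim, otherwise the two series would be drawn against different x scales without any visible warning.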

How to put 2 boxplots in one graph in R without additional libraries?

I have this kind of dataset
Defect.found Treatment Program
1 Testing Counter
1 Testing Correlation
0 Inspection Counter
3 Testing Correlation
2 Inspection Counter
I would like to create two boxplots, one of detected defects per program and one of detected defects per technique, but in one graph.
Meaning having:
boxplot(exp$Defect.found ~ exp$Treatment)
boxplot(exp$Defect.found ~ exp$Program)
in a joint graph.
Searching on Stack Overflow, I was able to create it, but only with the lattice library, by typing:
bwplot(exp$Treatment + exp$Program ~ exp$Defect.found)
but I would like to know if it's possible to create the graph without additional libraries like ggplot and lattice.
Prepare the plot window to receive two plots in one row and two columns (default is obviously one row and one column):
par(mfrow = c(1, 2))
My suggestion is to avoid using the word exp, because it is already used for the exponential function. Use for instance mydata.
Defects found against treatment (frame = F suppresses the external box):
with(mydata, plot(Defect.found ~ Treatment, frame = F))
Defects found against program (ylab = NA suppresses the y label because it is already shown in the previous plot):
with(mydata, plot(Defect.found ~ Program, frame = F, ylab = NA))
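Put together as a self-contained script, the steps above might look as follows. This sketch calls boxplot() directly instead of plot(), which gives a boxplot regardless of whether the grouping columns are stored as factors; the data frame simply retypes the five rows shown in the question.

```r
# The five example rows from the question
mydata <- data.frame(
  Defect.found = c(1, 1, 0, 3, 2),
  Treatment    = factor(c("Testing", "Testing", "Inspection",
                          "Testing", "Inspection")),
  Program      = factor(c("Counter", "Correlation", "Counter",
                          "Correlation", "Counter"))
)

# One row, two columns of plots; keep the old settings to restore later
op <- par(mfrow = c(1, 2))

# frame.plot = FALSE drops the surrounding box
b1 <- boxplot(Defect.found ~ Treatment, data = mydata, frame.plot = FALSE)

# Empty y label because it is already shown in the left-hand plot
b2 <- boxplot(Defect.found ~ Program, data = mydata,
              frame.plot = FALSE, ylab = "")

par(op)
```

Restoring the saved settings with par(op) means later plots go back to one plot per page.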

How to apply a chunk of code (not only a single function) to all columns in dataset

I would like to apply this chunk of code to each column in a dataset. I can run all columns individually, but it is tedious to make repeated code for 75 different columns and change all of the names in the code to match each column name. Is there a way that I can run all columns individually at once without making code for each column individually?
max.Width =lmer(mergeCowpeaTEST$max.Width ~ (1|Genotype) + (1|Year) + (1|Genotype:Year) + (1|Rep:Year), data=mergeCowpeaTEST,na.action = na.omit)
model.a_max.Width <-lmer(max.Width~ (1|Genotype) + (1|Year) + (1|Genotype:Year) + (1|Rep:Year), data=mergeCowpeaTEST)
alt.est.a_max.Width <- influence(model.a_max.Width, obs=TRUE)
cooks<-cooks.distance(alt.est.a_max.Width)
plot(alt.est.a_max.Width, which="cook", sort=FALSE,main="cook's distance plot of max.Width")
which(residuals(max.Width)>0.10)
which(residuals(max.Width)<(-0.10))
boxplot(residuals(max.Width))
myboxplot<-boxplot(residuals(max.Width))
myboxplot$out
hist(residuals(max.Width))
qqnorm(residuals(max.Width))
pdf("Widiv_max.Width_residual_graphs.pdf",height=8,width=10)
plot(fitted(max.Width),residuals(max.Width), xlab="Predicted values", ylab="Residuals", main="Residual Plot of widiv max.Width")
abline(h=0, col="red")
hist(resid(max.Width),main="histogram of max.Width residuals")
qqnorm(residuals(max.Width), main="Residuals Q-Q Plot");qqline(resid(max.Width))
qqnorm(ranef(max.Width)$Genotype$"(Intercept)", main="Genotypes Q-Q Plot"); qqline(ranef(max.Width)$Genotype$"(Intercept)")
qqnorm(ranef(max.Width)$"Genotype:Year"$"(Intercept)", main="Genotype by Year Q-Q Plot"); qqline(ranef(max.Width)$"Genotype:Year"$"(Intercept)")
plot(alt.est.a_max.Width, which="cook", sort=FALSE,main="cook's distance plot of max.Width")
dev.off()
The key to this is your describing it as "only" a single function. A single function can run an arbitrary amount of things. You can have it print something, then do something, then output something. Or do lots of things. Or play Global Geothermonuclear War. All in a single function.
apply( ChickWeight, 2, function(clmn) {
cat("Hi")
cat("Low")
cat("The only way to win is not to play at all")
} )
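Applied to the question, the idea is to wrap the whole chunk in one function that takes a column name, build the model formula from that name, and loop over the names. The sketch below is a simplified stand-in, not the original analysis: it uses lm() on the built-in iris data instead of lmer() so that it runs without extra packages, and diagnose_column and the residual threshold of 1 are hypothetical.

```r
# Hypothetical helper: fit a model for one response column and
# return a few residual diagnostics. lm() stands in for lmer() here.
diagnose_column <- function(col_name, data) {
  # Build e.g. Sepal.Length ~ Species from the column name
  model <- lm(reformulate("Species", response = col_name), data = data)
  res <- residuals(model)
  list(column          = col_name,
       large_residuals = which(abs(res) > 1),
       outliers        = boxplot.stats(res)$out)
}

# Every column except the grouping variable is treated as a response
response_cols <- setdiff(names(iris), "Species")

# Run the same chunk of code once per column
results <- lapply(response_cols, diagnose_column, data = iris)
names(results) <- response_cols

str(results, max.level = 1)
```

Plotting code such as the pdf() ... dev.off() block from the question can go inside the same function, with the file name and titles built from col_name using paste0().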

Make the main label variable and base it on my conditioning variable

I am generating an xyplot conditioned on a certain variable. I do not want the conditioning variable to be visible in the panel label, but I would like it to be visible as part of the main label. I would like to add a variable to main so that the main label on each page (my layout is c(1,1)) has a different text based on my conditioning variable. Here is an example of my code.
xyplot(counts ~ time|conditioningvariable,
data = test.df,
pch=".",cex=1.5,
ylab="Counts (n)",
xlab="Time (sec)",
main=paste("Count By Time for ",conditioningvariable,sep=" "),
layout=c(1,1),scales=list(relation="free"),
strip=FALSE)
I know I would have to use the panel function if I want this text to change per panel, but I am not sure how to go about it. I know this might be extremely simple. I would be really thankful for a solution.
Perhaps:
... ,
main = substitute(expression(Count~By~Time~'for'~X), list(X=conditioningvariable) ),
... ,
strip = strip.custom(strip.levels=c(FALSE,FALSE), strip.names=c(FALSE,FALSE)),
... ,
Tested solutions, rather than guesses, are offered when reproducible data is provided.
That would not work unless there were also an object named conditioningvariable in the global environment. It would not recover the value from the xyplot call. I played around with the iris dataset and the examples in the help pages and found that the above guess would not succeed. The conditioning levels are stored in the result of xyplot as $condlevel as documented here. (Turns out my memory was failing me since I was part of that exchange.) So a 2 step process succeeds:
X <- xyplot(Petal.Length ~ Petal.Width | Species, iris,
main = substitute(expression(Count~By~Time~'for'~X), list(X='Species') ) )
xyplot(Petal.Length ~ Petal.Width | Species, iris,
main = substitute(expression(Count~By~Time~'for'~X), list(X=names(X$condlevel)) )
)
It also turns out that you can suppress the levels or the conditioning names with single FALSE values.
This will print a "title" that doesn't look like a strip with varying condition levels per page:
X <- xyplot(Petal.Length ~ Petal.Width | Species, iris,
pch=".",cex=1.5,
ylab="Counts (n)",
xlab="Time (sec)", layout=c(1,1,3) ,
scales=list(relation="free"),
strip=function(...){ ltext(.5,.5,paste("Count By Time for ", levels(iris$Species)[panel.number() ]))})
print(X)
For more space at the top use main="\n" and ltext(.5, 1, ...).
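A different route to the same goal, sketched here as an alternative rather than as part of the answer above, is to skip conditioning altogether and build one plot per level, each with its own main label. The iris data again stands in for the real data, and the list name plots is an illustrative choice.

```r
library(lattice)

# One unconditioned plot per level, each with its own title
plots <- lapply(levels(iris$Species), function(lev) {
  xyplot(Petal.Length ~ Petal.Width,
         data = iris[iris$Species == lev, ],
         pch = ".", cex = 1.5,
         main = paste("Count By Time for", lev))
})

# Each print() call produces one page
for (p in plots) print(p)
```

Since each subset is plotted on its own, the axis limits are chosen per plot, which is similar in effect to scales=list(relation="free") with layout=c(1,1).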
