Rotating y axis labels with mosaic plots WITHOUT overlap - r

This question is very similar to this one, but it approaches the problem from a different angle that has not been answered.
Following the proposed code, I am able to generate mosaic plots and rotate the labels so that they are legible. The problem is that the mosaic() function from the vcd package does not seem to take the rotation into account, so it does not adapt the graph to fit the labels, yielding results like the following:
Is there any way to change the margins between the labels and the titles? I would be surprised if I am the first one that has encountered this issue. I am open to using other packages to get mosaic graphs if applicable as well.
Code
aux = structure(c(0L, 0L, 3L, 46L, 107L, 14L, 0L, 0L, 4L, 0L, 0L, 2L,
9L, 0L, 23L, 2L, 1L, 3L, 14L, 1L, 8L, 26L, 6L, 11L, 6L, 1L, 6L,
0L, 1L, 1L, 29L, 10L, 62L, 1L, 3L, 1L, 1L, 3L, 1L), .Dim = c(3L,
13L), .Dimnames = list(abcdefghi = c("Madrid", "Valencia", "Granada"
), jklmnopqr = c("roknbjftxcwl", "mfchldbxuyig", "gtyoxeduijpw",
"akbcefymvsiw", "ucbfxplietqk", "mzeykauprfdh", "piermgawyjht",
"chjvatqbylxo", "merhcogjflbd", "wiyrugvmhjlq", "glszdqmjhkov",
"giowaxrtsknm", "pxucytzvljqw")), class = "table")
library(vcd)
colours = c("brown","darkgreen","darkgrey","orange","darkred","gold","blue","red",
"white","pink","purple","navy","lightblue","green","peachpuff","violet","yellow","yellow4")
aux_names = names(attr(aux,"dimnames"))
mosaic(aux,main=paste(aux_names,collapse=" vs. "),
gp=gpar(fill=matrix(sample(colours,max(nrow(aux),ncol(aux))),1,max(nrow(aux),ncol(aux)))),
pop = FALSE,labeling = labeling_border(rot_labels=c(90,0,0,0),
just_labels=c("left","right")))

This code should do what I think you're after.
mosaic(aux,main=paste(aux_names,collapse=" vs. "),
gp=gpar(fill=matrix(sample(colours,max(nrow(aux),ncol(aux))),1,max(nrow(aux),ncol(aux)))),
pop = FALSE,labeling = labeling_border(rot_labels=c(90,0,0,0),
just_labels=c("left","right"),
offset_varnames = c(8,8,8,8)),
margins = c(10, 10, 10, 10))

Related

Error message: 'x' and 'y' must have the same length

I keep getting the following error message in R whilst trying to run a simple correlation. Can anyone help?
Error message is:
Error in cor.test.default(my_data$Year, my_data$Total, method =
"spearman") : 'x' and 'y' must have the same length
this is the code I am using:
library("dplyr")
library ("ggpubr")
library("devtools")
my_data<- read.csv(file.choose())
set.seed(1234)
dplyr::sample_n(my_data, 10)
ggdensity(my_data$Total,
main = "Density plot of barrier closures",
xlab = "Year ending")
ggqqplot(my_data$Total)
shapiro.test(my_data$Total)
cor.test(my_data$Year, my_data$Total, method = "spearman")
The data I am using has two columns in a CSV file: one is labelled "year" and the other is labelled "total". Both columns have 39 numeric entries, so the lengths of the columns are identical. Every other part of the code works fine. I am using the latest version of R and the latest versions of all the packages.
Edit: Someone asked for my data frame so here it is:
structure(list(ï..Year = 83:121, Total = c(1L, 0L, 0L, 1L, 1L,
0L, 1L, 4L, 2L, 0L, 4L, 7L, 4L, 4L, 1L, 1L, 2L, 6L, 24L, 4L,
20L, 1L, 4L, 3L, 8L, 6L, 5L, 5L, 0L, 0L, 5L, 50L, 1L, 1L, 2L,
3L, 2L, 9L, 6L)), class = "data.frame", row.names = c(NA, -39L
))
As user2554330 rightly stated: You'll get that error if you misspecify one of the column names. As can be seen from the output of dput(my_data), the first column's name is not Year, but ï..Year. The given error does not occur with
cor.test(my_data$ï..Year, my_data$Total, method = "spearman")
(You may be able to remove the merging of this byte order mark with the column name by adding the argument fileEncoding="UTF-8-BOM" in the read.csv() call.)
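A minimal sketch of the fix suggested above, using base R only. The temporary file and its five rows are made up for illustration; only the fileEncoding argument and the cor.test() call come from the thread.

```r
# Write a small CSV that starts with a UTF-8 byte order mark (EF BB BF).
# The file contents are hypothetical stand-ins for the real data.
csv_path <- tempfile(fileext = ".csv")
bom <- as.raw(c(0xEF, 0xBB, 0xBF))
writeBin(c(bom, charToRaw("Year,Total\n83,1\n84,0\n85,0\n86,1\n87,4\n")), csv_path)

# Declaring the encoding lets R drop the BOM before it parses the header,
# so the first column is named "Year" rather than a mangled variant.
my_data <- read.csv(csv_path, fileEncoding = "UTF-8-BOM")
print(names(my_data))

# A column name that does not exist returns NULL, and passing NULL to
# cor.test() is what produces "'x' and 'y' must have the same length".
print(is.null(my_data$year))

# With the correct name the test runs; exact = FALSE avoids the
# warning about ties in this toy data.
res <- cor.test(my_data$Year, my_data$Total, method = "spearman", exact = FALSE)
print(res$estimate)
```

Checking names(my_data) right after read.csv() is a quick way to spot this kind of problem before running any test.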

ddply dropping rows with zero sum

I am trying to sum my data per Meter, then average sumCover by Transect. My issue is that when I average over the transects, any transect where no native species were recorded at the sampled meter points is effectively dropped from the data frame after the ddply call. I have tried the .drop argument, but each site has a different number of transects because sampling was scaled to site size, so .drop = FALSE effectively adds transects to every site. What I need is a way to fill in the missing Transect values while taking into account that each site has between 3 and 16 transects. EDIT: the data preview seems to have been cut off and does not have enough rows, so here is a file:
Here is a downloadable link of the data csv
read.csv()
library(plyr)  # ddply() lives in the plyr package
NativeNonnativeCoverperMeter <- ddply(RestoredGrasslandSurveys, c("Site","Transect","Locality","Meter"), summarise,
sumCover = sum(Cover))
NativeNonnativeCoverperTransect <- ddply(NativeNonnativeCoverperMeter, c("Site","Transect","Locality"), summarise,
avgCover = mean(sumCover), .drop = F)
dput(RestoredGrasslandSurveys[1:10, ])
structure(list(Site = structure(c(10L, 10L, 10L, 10L, 10L, 10L,
10L, 10L, 10L, 10L), .Label = c("AzevedoNorth", "AzevedoSouth",
"Big.Banana", "BlohmRanch", "CypressGrove", "Diablo.Canyon",
"Dipsea.Moors", "Elkhorn.Nursery", "Elkhorn.Owl", "ElkhornHotwire",
"FacultyHousing", "Glass.Beach", "Hanson.ESHA", "Hanson.Uplands",
"Hawk.Hill", "LightHouse", "Modoc", "MooreCreek", "Morning.Sun",
"Noyo.Headlands", "Paradise.Ridge", "Prosper.Ridge", "RussianRidge",
"Stinson.Gulch", "Tennessee.Valley", "Watsonville.Uplands", "YoungerLagoon"
), class = "factor"), County = structure(c(4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 4L, 4L), .Label = c("Humboldt", "Marin", "Mendocino",
"Monterery", "Monterey", "San.Luis.Obispo", "SanMateo", "Santa.Barbara",
"SantaCruz", "Sonoma"), class = "factor"), Transect = c(3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), Meter = c(0L, 5L, 10L, 15L,
20L, 25L, 30L, 35L, 40L, 45L), Lifeform = structure(c(4L, 4L,
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L), .Label = c("AnnualForb", "AnnualGrass",
"Fern", "Groundcover", "Horsetail", "Nfixer", "PerennialForb",
"PerennialGrass", "PerrenialForb", "Rush", "Sedge", "Shrub",
"Tree"), class = "factor"), Locality = structure(c(1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("Groundcover", "Native",
"Nonnative"), class = "factor"), Species = structure(c(265L,
265L, 265L, 265L, 265L, 265L, 265L, 265L, 265L, 265L), .Label = c("Achillea.millefolium",
"Acmispon.glaber", "Acmispon.maritimus", "Acmispon.parviflorus",
"Acmispon.strigosus", "Agropyron.cristatum", "Aira.caryophyllea",
"Aira.elegans", "Aira.praecox", "Amsinckia.menziesii", "Anaphalis.margaritacea",
"Angelica.hendersonii", "Anthoxanthum.odoratum", "Anthriscus.caucalis",
"Artemisia.californica", "Asclepias.fascicularis", "Atriplex.semibucatta",
"Avena.barbata", "Avena.Barbata", "Avena.fatua", "Baccharis. pilularis",
"Baccharis.pilularis", "Bareground", "Bellis.perennis", "Berberis.pinnata",
"Brachypodium.distachyon", "Brassica.nigra", "Brassica.rapa",
"Brassica.tournefortii", "Briza.maxima", "Briza.minor", "Bromus.carinatus",
"Bromus.catharticus", "Bromus.diandrus", "Bromus.hordeaceous",
"Bromus.madritensis", "Bromus.maritimus", "Bromus.tectorum",
"Calamagrostis.nutkaensis", "Calandrinia.menziesii", "Calendula.arvensis",
"Calystegia.collina", "Calystegia.purpurata", "Cardamine.oligiosperma",
"Carduus.pycnocephalus", "carex.athrostachya", "Carex.gynodynama",
"Carex.lasiocarpa", "Carex.Praegracilis", "Carex.spp", "Carex.suberecta",
"Carex.tomentosa", "Carex.tumulicola", "Carpobrotus.edulis",
"Castilleja.affinis", "Castilleja.densiflora", "Cerastium.fontanum",
"Cerastium.glomeratum", "Chlorogalum.pomeridianum", "Cirsium.brevistylum",
"Cirsium.vulgare", "Clarkia.purpurea", "Clarkia.spp", "Claytonia.perfoliata",
"Clinopodium.douglasii", "Conium.maculatum", "Convolvulus.arvensis",
"Corethrogyne.filaginifolia", "Cortaderia.jubata", "Cotula.coronopifolia",
"Crassula.connata", "Crepis.vesicaria", "Croton.setigerus", "Cynodon.dactylon",
"Cynosurus.echinatus", "Cyperus.eragrostis", "Danthonia.californica",
"Daucus.pusillus", "Deschampsia.cespitosa", "Dichelostemma.capitatum",
"Dichondra.donelliana", "Dichondra.Donelliana", "Dichondra.micrantha",
"Distichlis.spicata", "Dudleya.cymosa", "Dudleya.farinosa", "Dysphania.ambrosioides",
"Ehrharta.erecta", "Elymus.condensatus", "Elymus.glaucus", "Elymus.triticoides",
"Elymus.vancouverensis", "Epilobium.brachycarpum", "Epilobium.cilatum",
"Equisetum.arvense", "Erigeron.canadensis", "Erigeron.glaucus",
"Erigeron.sumatrensis", "Eriogonum.latifolium", "Eriogonum.parvifolium",
"Eriophyllum.staechadifolium", "Erodium.botrys", "Erodium.cicutarium",
"Erodium.moscatum", "Eschscholzia.californica", "Eucalyptus.globulus",
"Festua.muyros", "Festuca.arundinacea", "Festuca.bromioides",
"Festuca.californica", "Festuca.idahoensis", "Festuca.microstachys",
"Festuca.muyros", "Festuca.perennis", "Festuca.pratensis", "Festuca.rubra",
"Foeniculum.vulgare", "Fragaria.vesca", "Frangula.californica",
"Fritillaria.affinis", "Galium.aparine", "Galium.divaricatum",
"Galium.porrigens", "Gamochaeta.ustulata", "Genista.monspessulana",
"Geranium.dissectum", "Geranium.molle", "Gilia.capitata", "Gnaphalium.palustre",
"Grindelia.latifolia", "Grindelia.stricta", "Helminthotheca.echioides",
"Hemiparasitic.ericaceae", "Heracleum.lanatum", "Heterotheca.grandiflora",
"Heterotheca.sessiliflora", "Hirschfieldia.incana", "Holcus.lanatus",
"Hordeum.brachyantherum", "Hordeum.marinum", "Hordeum.murinum",
"Horkelia.californica", "Hosackia.gracilis", "Hypochaeris.spp",
"Iris.douglasiana", "Iris.macrosiphon", "Juncus.bufonis", "Juncus.effusus",
"Juncus.mexicanus", "Juncus.occidentalis", "Juncus.patens", "Juncus.phaeocephalus",
"Koeleria.macrantha", "Lactuca.serriola", "Lasthenia.californica",
"Lathyrus.vestitus", "Leontodon.taraxacoides", "Lichen", "Linum.bienne",
"Logfia.gallica", "Lomatium.dasycarpum", "Lomatium.utriculatum",
"Lonicera.hispidula", "Lotus.corniculatus", "Lotus.micranthus",
"Lupinus.arboreus", "Lupinus.bicolor", "Lupinus.littoralis",
"Lupinus.nanus", "Lupinus.variicolor", "Luzula.comosa", "Luzula.subsessilis",
"Lysimachia.arvensis", "Lythrum.hyssopifolia", "Madia.exigua",
"Madia.gracilis", "Madia.madioides", "Madia.spp", "Malva.parviflora",
"Marah.fabaceus", "Matricaria.discoides", "Medicago.polymorpha",
"Melica.californica", "Melica.imperfecta", "Melica.torreyana",
"Melilotus.indicus", "Melilotus.officinalis", "Modiola.caroliniana",
"Moss", "Mulch", "Mushroom.cover", "Myosotis.discolor", "Oxalis.corniculata",
"Oxalis.pes-caprae", "Parentucellia.latifolia", "Parentucellia.viscosa",
"Paronychia.franciscana", "Pennisetum.clandestinum", "Perideridia.kelloggii",
"Phacelia.californica", "Phacelia.malvifolia", "Phalaris.aquatica",
"Pholistoma.auritum", "Plagiobothyrs.nothofulvus", "Plantago.coronopus",
"Plantago.erecta", "Plantago.lanceolata", "Poa.annua", "Poa.pratensis",
"Polygonum.arenastrum", "Polygonum.aviculare", "Polypodium.califomicum",
"Polypodium.californicum", "Polypogon.monspeliensis", "Polystichum.munitum",
"Prunella.vulgaris", "Pseudognaphalium.beneolens", "Pseudognaphalium.bioletti",
"Pseudognaphalium.californicum", "Pseudognaphalium.canescens",
"Pseudognaphalium.luteoalbum", "Pseudognaphalium.ramosissimum",
"Pseudotsuga.meziesii", "Pteridium.aquilinum", "Quercus.agrifolia",
"Ranunculus.californicus", "Ranunculus.occidentalis", "Raphanus.sativus",
"Raphanus.spp", "Rock", "Rubus.armeniacus", "Rubus.ursinus",
"Rumex.acetosella", "Rumex.crispus", "Rumex.Crispus", "Rumex.transitorius",
"Salix.lasiolepis", "Sanicula.arctopoides", "Sanicula.bipinnatifida",
"Sanicula.crassicaulis", "Scandix.peten-veneris", "Senecio.vulgare",
"Sherardia.arvensis", "Sidalcea.malviflora", "Silene.gallica",
"Sisyrinchium.bellum", "Solanum.americanum", "Solidago.velutina",
"Soliva.sessilis", "Sonchus.asper", "Sonchus.oleraceus", "Spergula.arvensis",
"Stachys.ajugoides", "Stachys.bullata", "Stellaria.media", "Stipa.cernua",
"Stipa.lepida", "Stipa.pulchra", "Stipa.purpurata", "Symphiotrichum.chilensis",
"Taraxia.ovata", "Tauschia.hartwegii", "Thatch.cover", "Thatch.Cover",
"Thatch.Depth", "Thysanocarpus.laciniatus", "Toxicodendron.diversilobum",
"Toxicoscordion.fremontii", "Tragopogon.porrifolius", "Tribulus.terrestris",
"Trifolium.angustifolium", "Trifolium.barbigerum", "Trifolium.bifidum",
"Trifolium.depauperatum", "Trifolium.dubium", "Trifolium.glomeratum",
"Trifolium.hirtum", "Trifolium.hybridum", "Trifolium.macraei",
"Trifolium.microcephalum", "Trifolium.repens", "Trifolium.subterraneum",
"Trifolium.variegatum", "Trifolium.willdenovii", "Triphysaria.pusilla",
"Triphysaria.versicolor", "Trisetum.canescens", "Vaccinium.ovatum",
"Veronica.persica", "Vicia.americana", "Vicia.benghalensis",
"Vicia.sativa", "Vicia.tetrasperma", "Vicia.villosa", "Viola.adunca",
"Viola.pedunculata", "Wyethia.angustifolia", "Wyethia.glabra"
), class = "factor"), Cover = c(1, 1, 0.5, 0.5, 0.5, 8, 2, 2,
5, 1)), row.names = c(NA, 10L), class = "data.frame")
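One possible approach, sketched here in base R on made-up toy data rather than the real RestoredGrasslandSurveys file: take only the Site/Transect/Meter points that were actually surveyed, cross them with every Locality, and fill the combinations with no records with zero cover before averaging. The data frame surveys and all of its values are hypothetical, and this is not a confirmed answer from the thread.

```r
# Toy stand-in for RestoredGrasslandSurveys (hypothetical values).
# Transect 2 has no Native records at all.
surveys <- data.frame(
  Site     = c("A", "A", "A", "A", "A"),
  Transect = c(1L, 1L, 1L, 2L, 2L),
  Meter    = c(0L, 0L, 5L, 0L, 5L),
  Locality = c("Native", "Nonnative", "Nonnative", "Nonnative", "Nonnative"),
  Cover    = c(2, 3, 4, 1, 5),
  stringsAsFactors = FALSE
)

# Step 1: sum cover per Site/Transect/Meter/Locality.
per_meter <- aggregate(Cover ~ Site + Transect + Meter + Locality,
                       data = surveys, FUN = sum)

# Step 2: every point that was really sampled, crossed with every Locality.
# by = NULL makes merge() return the Cartesian product.
points <- unique(surveys[, c("Site", "Transect", "Meter")])
localities <- data.frame(Locality = unique(surveys$Locality),
                         stringsAsFactors = FALSE)
full_grid <- merge(points, localities, by = NULL)

# Step 3: left join the sums and treat unrecorded combinations as zero cover.
filled <- merge(full_grid, per_meter, all.x = TRUE)
filled$Cover[is.na(filled$Cover)] <- 0

# Step 4: average per transect; transects with no records are now kept.
per_transect <- aggregate(Cover ~ Site + Transect + Locality,
                          data = filled, FUN = mean)
print(per_transect)
```

Because the grid is built from the observed sampling points, no transects are invented for sites that had fewer of them, which is the side effect described for .drop = FALSE.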

What is the best way to use agricolae to do ANOVAs on a split plot design?

I'm trying to run some ANOVAs on data from a split plot experiment, ideally using the agricolae package. It's been a while since I've taken a stats class and I wanted to be sure I'm analyzing this data correctly, so I did some searching online and couldn't really find consistency in the way people were analyzing their split plot experiments. What is the best way for me to do this?
Here's the head of my data:
dput(head(rawData))
structure(list(ï..Plot = 2111:2116, Variety = structure(c(5L,
4L, 3L, 6L, 1L, 2L), .Label = c("Burbank", "Hodag", "Lamoka",
"Norkotah", "Silverton", "Snowden"), class = "factor"), Rate = c(4L,
4L, 4L, 4L, 4L, 4L), Rep = c(1L, 1L, 1L, 1L, 1L, 1L), totalTubers = c(594L,
605L, 656L, 729L, 694L, 548L), totalOzNoCulls = c(2544.18, 2382.07,
2140.69, 2401.56, 2440.56, 2503.5), totalCWTacNoCulls = c(461.76867,
432.345705, 388.535235, 435.88314, 442.96164, 454.38525), avgLWratio = c(1.260615419,
1.287949374, 1.111981583, 1.08647584, 1.350686661, 1.107173509
), Hollow = c(14L, 15L, 22L, 25L, 14L, 13L), Double = c(10L,
13L, 15L, 22L, 11L, 9L), Knob = c(86L, 80L, 139L, 156L, 77L,
126L), Researcher = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "Wang", class = "factor"),
CullsPounds = c(1.75, 1.15, 4.7, 1.85, 0.8, 5.55), CullsOz = c(28,
18.4, 75.2, 29.6, 12.8, 88.8), totalOz = c(2572.18, 2400.47,
2215.89, 2431.16, 2453.36, 2592.3), totalCWTacCulls = c(466.85067,
435.685305, 402.184035, 441.25554, 445.28484, 470.50245)), row.names = c(NA,
6L), class = "data.frame")
For these data, the whole plot is Rate, the split plot is Variety, the block is Rep, and for discussion's sake here, we can look at totalCWTacNoCulls as the response.
Any help would be very much appreciated! I am still getting the hang of Stack Overflow, so if I have made any mistakes or shared my data wrong, please let me know and I'll change it. Thank you!
You can do this using the agricolae package as follows:
library(agricolae)
attach(rawData)
Rate = factor(Rate)
Variety = factor(Variety)
Rep = factor(Rep)
sp.plot(Rep, Rate, Variety, totalCWTacNoCulls)
The usage documented by the agricolae package is
sp.plot(block, pplot, splot, Y)
where block is the replication (block) factor, pplot is the main-plot factor, splot is the sub-plot factor, and Y is the response variable.
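As a cross-check, the same split-plot layout can be expressed in base R with aov() and an Error() term. The sketch below uses simulated data; the factor levels, the yield column, and the random values are assumptions for illustration, not the real rawData. The Error(Rep / Rate) term puts Rate in the whole-plot stratum and Variety in the within stratum, which should correspond to the design passed to sp.plot() above.

```r
# Simulated split-plot data (hypothetical levels and responses):
# 3 blocks (Rep) x 3 whole-plot levels (Rate) x 3 sub-plot levels (Variety).
set.seed(1)
sim <- expand.grid(
  Rep     = factor(1:3),
  Rate    = factor(c(2, 4, 6)),
  Variety = factor(c("Burbank", "Hodag", "Lamoka"))
)
sim$yield <- rnorm(nrow(sim), mean = 450, sd = 20)

# Rate is tested against the Rep:Rate (whole-plot) error;
# Variety and Rate:Variety are tested against the within (sub-plot) error.
fit <- aov(yield ~ Rate * Variety + Error(Rep / Rate), data = sim)
print(summary(fit))
```

Note that this version passes the data frame through the data argument, so attach() is not needed.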

Plotting multiple effect plots from logistic regression

I have a number of logistic regression models with different response variables but the same predictor variables. I want to use grid.arrange (or anything else) to make a single figure with all these effect plots that were made with the effects package. I followed the advice here to make such a graph: grid.arrange with John Fox's effects plots
library(effects)
library(gridExtra)
data <- structure(list(group = structure(c(1L, 1L, 1L, 1L, 1L,1L, 1L, 2L, 2L, 2L), .Label = c("group1", "group2"), class = "factor"),obs = c(1L, 1L, 4L, 4L, 6L, 12L, 26L, 1L, 10L, 6L),responseA = c(1L, 1L, 2L, 0L, 1L, 10L, 20L, 0L, 3L, 2L), responseB = c(0L, 0L, 2L, 4L, 6L, 4L, 8L, 1L, 8L, 5L)), .Names = c("group", "obs", "responseA","responseB"), row.names = c(53L, 54L, 55L, 56L, 57L, 58L,59L, 115L, 116L, 117L), class = "data.frame")
model1<-glm(cbind(responseA,(obs-responseA))~group,family=binomial, data=data)
model2<-glm(cbind(responseB,(obs-responseB))~group,family=binomial, data=data)
ef1 <-allEffects(model1)[[1]]
ef2 <- allEffects(model2)[[1]]
elist <- list( ef1,ef2)
class(elist) <- "efflist"
plot(elist, col=2)
The problem is that, in the models, the response variable has the form cbind(response A, no response A), but in the figure I would like a cleaner label (like Response A). I tried changing the y labels by passing both labels at once, but got a warning, and both labels became "Response A".
plot(elist, ylab=c("response A","response B"),col=2)
Then I tried the second suggested method, changing the class to trellis, but got an error, so grid.arrange didn't work either.
p1<-plot(allEffects(model1),ylab="Response A")
p2<-plot(allEffects(model2),ylab="Response B")
class(p1) <- class(p2) <- "trellis"
grid.arrange(p1, p2, ncol=2)
Can anyone provide a method to change each y-axis label separately?
With the ef1 and ef2 variables you created, you can try the following
plot1 <- plot(ef1, ylab = "Response A")
plot2 <- plot(ef2, ylab = "Response B")
grid.arrange(plot1, plot2, ncol=2)

Why does a PDF plot in ggplot2 not show title nor labels?

I'm creating a simple step plot using ggplot2. If I switch the file type from PNG to PDF, the plot shows no labels, ticks, title, or legend. What am I doing wrong?
Data:
plotData <- structure(list(iteration = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), time = c(0L, 10L,
20L, 30L, 40L, 50L, 60L, 70L, 80L, 90L, 100L, 0L, 10L, 20L, 30L,
40L, 50L, 60L, 70L), routes = c(6L, 6L, 5L, 3L, 3L, 3L, 3L, 3L,
2L, 1L, 0L, 5L, 5L, 5L, 5L, 1L, 1L, 1L, 0L)), .Names = c("iteration",
"time", "routes"), class = "data.frame", row.names = c(NA, -19L
))
Code:
library(ggplot2)
x_axis_breaks <- seq(10, 100, by = 10)
png(file="plot.png",width=1280, height=1280)
## pdf(file="plot.pdf",width=6,height=6)
plot <- ggplot(plotData) + geom_step(data=plotData, size = 5,
mapping=aes(x=time,
y=routes, group=iteration, colour=factor(iteration)), direction="vh")
plot <- plot + scale_x_discrete(breaks=x_axis_breaks, name="time") +
scale_y_discrete(name="#routes");
plot <- plot + opts(axis.text.x=theme_text(size=36,face="bold"),
axis.text.y=theme_text(size=36,face="bold")) +
scale_colour_hue(name="iteration")
plot <- plot + opts(legend.title=theme_text(size=36,face="bold"),
legend.text=theme_text(size=36,face="bold"))
plot <- plot + opts(axis.title.x=theme_text(size=36,face="bold"),
axis.title.y=theme_text(size=36,face="bold"))
plot <- plot + opts(title="network lifetime",
plot.title=theme_text(size=36, face="bold"))
print(plot)
dev.off()
The problem occurs if I'm switching from 'png...' to 'pdf'. The data itself is plotted fine. Maybe I'm just missing some information on generating PDF plots in ggplot2?
Most likely this is due to font embedding.
R does not embed fonts by default, and this causes the issues you have described in some PDF readers. Usually you will have no problems with such figures in Adobe Reader, which ships with a lot of fonts, while other readers might not come with many fonts (commercial ones in particular) and typically try to substitute the missing fonts with the closest ones. Sometimes this fails and you don't see any text. I often have this problem with Evince on Ubuntu, not only with R plots but with any other PDF where fonts are not embedded.
On Ubuntu you can check the status of the fonts in a PDF file with pdffonts file.pdf.
Some solutions:
- use cairo_pdf device when producing pdf in R, usually this does the trick
- use extrafont package to embed the desired font (font has to be available on your OS), see here for details
In combination with ggplot you should use ggsave() for saving images:
ggsave( "plot.png", plot )
ggsave( "plot.pdf", plot )
...
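A minimal sketch of the cairo_pdf suggestion above, using base graphics so that it runs without extra packages. The output path, the plotted values, and the fallback to pdf() when cairo support is not compiled in are assumptions for illustration.

```r
# Hypothetical output location; any writable path works.
out_file <- file.path(tempdir(), "plot.pdf")

# Use the cairo-based PDF device when this R build supports it,
# otherwise fall back to the standard pdf() device.
if (capabilities("cairo")) {
  cairo_pdf(out_file, width = 6, height = 6)
} else {
  pdf(out_file, width = 6, height = 6)
}

# A small step plot with axis labels and a title.
plot(c(0, 10, 20, 30), c(6, 6, 5, 3), type = "s",
     xlab = "time", ylab = "#routes", main = "network lifetime")

# Close the device so the file is actually written.
invisible(dev.off())
print(file.exists(out_file))
```

With ggplot2, the same device can usually be passed to ggsave(), for example ggsave("plot.pdf", plot, device = cairo_pdf). Running pdffonts on the result is a way to check whether the fonts were embedded.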
