I've been trying to make a graph that looks like this (but nicer)
based on what I found in this discussion using the transitionPlot() function from the Gmisc package.
However, I can't get my transition_matrix right, and I also can't seem to plot the different state classes in a separate third column.
My data is based on the symptomatic improvement of patients following surgery. The numbers in the boxes are the number of patients in each "state" pre vs. post surgery. Please note the (LVAD) is not a necessity.
The data for this plot is called df and is as follows:
dput(df)
structure(list(StudyID = structure(c(1L, 2L, 3L, 4L, 5L, 6L,
7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L), .Label = c("P1", "P2", "P3",
"P4", "P5", "P6", "P7"), class = "factor"), MeasureTime = structure(c(2L,
2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("Postoperative",
"Preoperative"), class = "factor"), NYHA = c(3L, 3L, 3L, 3L,
3L, 2L, 3L, 1L, 3L, 1L, 3L, 3L, 1L, 1L)), .Names = c("StudyID",
"MeasureTime", "NYHA"), row.names = c(NA, -14L), class = "data.frame")
I've made a plot in ggplot2 that looked like this
but my supervisor didn't like it, because I had to jitter the lines so that they didn't overlap and one could see what was happening with each patient, and thus the points/lines aren't exactly lined up with the y-axis.
So I was wondering if anyone had an idea, how I'd be able to do this using the Gmisc package making what seems to me to be a transitionPlot.
Your help and time is much appreciated.
Thanks.
Using your sample df data, here are some fairly low-level plotting functions that can re-create your sample image. It should be straightforward to customize however you like.
First, make sure pre comes before post
df$MeasureTime<-factor(df$MeasureTime, levels=c("Preoperative","Postoperative"))
then define some plot helper functions
textrect<-function(x,y,text,width=.2) {
rect(x-width, y-width, x+width, y+width)
text(x,y,text)
}
connect<-function(x1,y1,x2,y2, width=.2) {
segments(x1+width,y1,x2-width,y2)
}
now draw the plot
plot.new()
par(mar=c(0,0,0,0))
plot.window(c(0,4), c(0,4))
with(unique(reshape(df, idvar="StudyID", timevar="MeasureTime", v.names="NYHA", direction="wide")[,-1]),
connect(2,NYHA.Preoperative,3,NYHA.Postoperative)
)
with(as.data.frame(with(df, table(NYHA, MeasureTime))),
textrect(as.numeric(MeasureTime)+1,as.numeric(as.character(NYHA)), Freq)
)
text(1, 1:3, c("I","II","III"))
text(1:3, 3.75, c("NYHA","Pre-Op","Post-Op"))
text(3.75, 2, "(LVAD)")
which results in
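For the Gmisc route the question asks about, the transition matrix is a pre-by-post cross-tabulation of NYHA class with one row per patient. Below is a minimal base R sketch that rebuilds the question's df in compact form. The names wide, nyha and transition_matrix are illustrative only, and whether the resulting table can be handed to Gmisc's transitionPlot() as-is is an assumption to check against that package's documentation.

```r
# Same values as the question's df, rebuilt compactly
df <- data.frame(
  StudyID = factor(rep(paste0("P", 1:7), times = 2)),
  MeasureTime = factor(rep(c("Preoperative", "Postoperative"), each = 7),
                       levels = c("Preoperative", "Postoperative")),
  NYHA = c(3L, 3L, 3L, 3L, 3L, 2L, 3L, 1L, 3L, 1L, 3L, 3L, 1L, 1L)
)

# One row per patient, with pre and post NYHA class side by side
wide <- reshape(df, idvar = "StudyID", timevar = "MeasureTime",
                v.names = "NYHA", direction = "wide")

# Fix the levels so classes with no patients (e.g. class I pre-op) still appear
nyha <- function(x) factor(x, levels = 1:3, labels = c("I", "II", "III"))

# Rows are the preoperative class, columns the postoperative class
transition_matrix <- table(Pre = nyha(wide$NYHA.Preoperative),
                           Post = nyha(wide$NYHA.Postoperative))
print(transition_matrix)
```

The row and column totals of this table are the patient counts shown in the pre-op and post-op boxes.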
I want to create a simple barplot of my data frame:
> dput(corr)
structure(list(`sample length` = structure(c(2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L), .Label = c("3s", "10s"), class = "factor"),
feature = structure(c(1L, 1L, 5L, 5L, 2L, 5L, 6L, 5L, 5L,
4L, 1L, 1L, 1L, 1L, 1L, 2L, 5L, 5L, 3L, 4L, 1L, 1L, 1L, 1L
), .Label = c("f0", "f1", "f2", "f3", "f2 prime", "f2-f1"
), class = "factor"), measure = c("meanf0 longterm", "meanf0 longterm st",
"f2' Fant", "f2' Carlson", "F1meanERB", "F2meanERB", "f2-f1 ERB",
"f2' Fant", "f2' Carlson", "F3meanERB", "meanf0 3secs", "meanf0 3secs st",
"meanf0 10secs", "meanf0 longterm", "meanf0 longterm st",
"F1meanERB", "f2' Fant", "f2' Carlson", "F2meanERB", "F3meanERB",
"meanf0 longterm", "meanf0 longterm st", "meanf0 3secs",
"meanf0 3s st"), score = c(0.574361009949897, 0.592472685498182,
0.597453479834514, 0.529641256460457, 0.585994252821649,
0.618734735308094, 0.517715270144259, 0.523916918327387,
0.616237363007349, 0.732926257362305, 0.649505366093518,
0.626628120773466, 0.522527636952945, 0.53968850323167, 0.548664887822775,
0.648294358978928, 0.650806695307235, 0.696797693503567,
0.621298393945597, 0.57140950987443, 0.606634531002859, 0.597064217305556,
0.582534743353082, 0.572808145210493), dimension = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 4L, 4L, 4L, 4L), .Label = c("1", "2", "3",
"4"), class = "factor")), row.names = c(NA, -24L), class = c("tbl_df",
"tbl", "data.frame"))
I have tried the following code:
ggplot(data=corr, aes(x=factor(dimension), y=score)) +
geom_col(aes(fill=feature),position=position_dodge2(width=1,preserve='single')) +
facet_grid(~`sample length`, scales='free_x',space='free_x') +
labs(x="Dimension", y="Correlation Coefficient (Abs. value)") +
geom_text(aes(label=measure),position=position_dodge2(width=0.9, preserve='single'), angle=90,
size=4,hjust=2.5,color='white')
Giving the following barplot:
However, the labels for 'measure' are being incorrectly assigned to the columns. E.g. for 3s facet plot, under 'dimension 2', the two light blue bars should be labelled as 'f2' Carlson' and 'f2' Fant' but they have been swapped with the other two labels.
I think the levels must be wrong, but I don't understand how!
Any help much appreciated, ta
The problem of switching labels comes from geom_text() not knowing how the information should be split for the purposes of dodging. The solution is to supply a group= aesthetic to geom_text() that matches the fill= aesthetic specified for geom_col().
In the case of geom_col(), you specify aes(fill=feature). The height of the different columns is therefore grouped automatically by corr$feature. You can supply a group= aesthetic as well, but it's unnecessary and the dodging will happen as you expect.
In the case of geom_text(), there is no obvious way to group the data. When you do not specify a group= aesthetic, ggplot2 falls back to its default grouping, which is built from the discrete variables mapped in that layer (here x and label) and so does not follow feature. For dodging to work here, you need to specify how the label information is grouped. If you don't have a specific legend-associated aesthetic to choose here, you can use the group= aesthetic to specify group=feature. This lets ggplot2 know that the text labels should be sorted and dodged by grouping according to this column in the data:
ggplot(data=corr, aes(x=factor(dimension), y=score)) +
geom_col(aes(fill=feature),position=position_dodge2(width=1,preserve='single')) +
facet_grid(~`sample length`, scales='free_x',space='free_x') +
labs(x="Dimension", y="Correlation Coefficient (Abs. value)") +
geom_text(aes(label=measure, group=feature),position=position_dodge2(width=0.9, preserve='single'), angle=90,
size=4,hjust=2.5,color='white')
As a side note, you don't have to specify the group= aesthetic if you assign a color-based aesthetic (or one that would result in a legend). If we set color=feature with geom_text(), it works without group=. To see the labels, you need to set the alpha for the columns a bit lower, but this should illustrate the point well:
ggplot(data=corr, aes(x=factor(dimension), y=score)) +
geom_col(aes(fill=feature),position=position_dodge2(width=1,preserve='single'), alpha=0.2) +
facet_grid(~`sample length`, scales='free_x',space='free_x') +
labs(x="Dimension", y="Correlation Coefficient (Abs. value)") +
geom_text(aes(label=measure, color=feature),position=position_dodge2(width=0.9, preserve='single'), angle=90,
size=4,hjust=2.5)
I am constructing GLMMs (using glmer() of "lme4" R package) and sometimes I get an error when estimating R2 values (using r.squaredGLMM() from "MuMIn" package).
The model I am trying to fit is similar to this one:
library(lme4)
lmA <- glmer(x~y+(1|w)+(1|w/k), data = data1, family = binomial(link="logit"))
Then, to estimate R2, I use:
library(MuMIn)
r.squaredGLMM(lmA)
And I get this:
The result is correct only if all data used by the model has not changed since model was fitted.
Error in .rsqGLMM(fam = family(x), varFx = var(fxpred), varRe = varRe, :
'names' attribute [2] must be the same length as the vector [0]
Do you have any idea why this error appears? For instance, if I use only a single random factor (in this case, (1|w)), this error does not appear.
Here is my dataset:
data1 <-
structure(list(w = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 1L, 2L,
1L, 2L, 1L), .Label = c("CA", "CB"), class = "factor"), k = structure(c(4L,
4L, 3L, 3L, 3L, 4L, 1L, 3L, 2L, 3L, 2L), .Label = c("CAF01-CAM01",
"CAM01", "CBF01-CBM01", "CBM01"), class = "factor"), x = c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 0L, 0L), y = c(-0.034973549,
0.671720643, 4.557044729, 5.347170897, 2.634240583, -0.555740207,
4.118277809, 2.599825716, 0.95853864, 4.327804344, 0.057331718
)), .Names = c("w", "k", "x", "y"), class = "data.frame", row.names = c(NA,
-11L))
Any thoughts?
This was a bug that has been fixed in version >= 1.15.8 (soon on CRAN, currently on R-Forge).
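To see whether an installed copy already includes that fix, the installed version can be compared against the one quoted above. This is only a sketch: the variable names fixed_in and has_fix are illustrative, and it assumes the version number in the answer refers to the MuMIn package, which provides r.squaredGLMM().

```r
# Version quoted in the answer above
fixed_in <- package_version("1.15.8")

if (requireNamespace("MuMIn", quietly = TRUE)) {
  installed <- utils::packageVersion("MuMIn")
  has_fix <- installed >= fixed_in
  message("MuMIn ", format(installed),
          if (has_fix) " includes" else " predates", " the fix")
} else {
  has_fix <- NA  # MuMIn is not installed, so there is nothing to compare
  message("MuMIn is not installed")
}
```

If has_fix is FALSE, updating the package is the first thing to try before changing the model.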
I have a data frame (tcell_pdx_log_wgene_melt) of gene, sample, and an expression value for certain genes.
My data frame looks like:
gene sample log_fpkm
ITGB1 Sample_7630_T1_PDX_mousereads 4.4667698
ADIPOR1 Sample_7630_T1_PDX_mousereads 3.7562811
ADIPOR2 Sample_7630_T1_PDX_mousereads 2.4823200
RYK Sample_7630_T1_PDX_mousereads 2.4521252
JAG1 Sample_7630_T1_PDX_mousereads 1.7713867
ITGB1 Sample_NYA_MT.05_primary_mousereads 1.9555776
ADIPOR1 Sample_NYA_MT.05_primary_mousereads 1.7365991
ADIPOR2 Sample_NYA_MT.05_primary_mousereads 2.1131181
RYK Sample_NYA_MT.05_primary_mousereads 1.1464496
JAG1 Sample_NYA_MT.05_primary_mousereads 0.6931472
ITGB1 Sample_7630_T1_PDX_humanreads 4.5363987
ADIPOR1 Sample_7630_T1_PDX_humanreads 3.5718399
ADIPOR2 Sample_7630_T1_PDX_humanreads 2.4756977
RYK Sample_7630_T1_PDX_humanreads 1.8449842
JAG1 Sample_7630_T1_PDX_humanreads 1.7451918
The plot below orders these genes alphabetically, but I want the plot to be sorted by one of the samples, "Sample_7630_T1_PDX_humanreads".
tcell_pdx_log_wgene_melt$sample <- as.character(tcell_pdx_log_wgene_melt$sample)
tcell_pdx_log_wgene_melt$sample <- factor(tcell_pdx_log_wgene_melt$sample, levels=unique(tcell_pdx_log_wgene_melt$sample))
p <- ggplot(tcell_pdx_log_wgene_melt,aes(gene,log_fpkm,group=sample)) +
geom_point()
p + geom_line(aes(color=sample))
Not really sure what you are looking for, but I've dput() and plotted your data below. Maybe you can use that to explain what exactly you are looking to do? It helps if you include a minimal reproducible example with your question: something we can work from and use to show you how it might be possible to solve your problem. You can have a look at this SO post on how to make a great reproducible example in R.
tcell_pdx_log_wgene_melt <- structure(list(gene = structure(c(3L, 1L, 2L, 5L, 4L,
3L, 1L, 2L, 5L, 4L, 3L, 1L, 2L, 5L, 4L),
.Label = c("ADIPOR1", "ADIPOR2",
"ITGB1", "JAG1", "RYK"), class = "factor"),
sample = structure(c(2L,
2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L),
.Label = c("Sample_7630_T1_PDX_humanreads",
"Sample_7630_T1_PDX_mousereads", "Sample_NYA_MT.05_primary_mousereads"
), class = "factor"), log_fpkm = c(4.4667698, 3.7562811, 2.48232,
2.4521252, 1.7713867, 1.9555776, 1.7365991, 2.1131181, 1.1464496,
0.6931472, 4.5363987, 3.5718399, 2.4756977, 1.8449842, 1.7451918
)), .Names = c("gene", "sample", "log_fpkm"),
class = "data.frame", row.names = c(NA, -15L))
# install.packages("ggplot2", dependencies = TRUE)
library(ggplot2)
p <- ggplot(tcell_pdx_log_wgene_melt, aes(gene, log_fpkm, group=sample)) +
geom_point()
p + geom_line(aes(color=sample))
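If the goal is to order the genes on the x axis by their value in "Sample_7630_T1_PDX_humanreads", one possible reading of the question, a sketch is to reset the factor levels of gene from that sample's values. The helper names genes, ref and gene_order are illustrative, and the descending order is an assumption.

```r
# The question's data, rebuilt compactly
genes <- c("ITGB1", "ADIPOR1", "ADIPOR2", "RYK", "JAG1")
tcell_pdx_log_wgene_melt <- data.frame(
  gene = rep(genes, times = 3),
  sample = rep(c("Sample_7630_T1_PDX_mousereads",
                 "Sample_NYA_MT.05_primary_mousereads",
                 "Sample_7630_T1_PDX_humanreads"), each = 5),
  log_fpkm = c(4.4667698, 3.7562811, 2.4823200, 2.4521252, 1.7713867,
               1.9555776, 1.7365991, 2.1131181, 1.1464496, 0.6931472,
               4.5363987, 3.5718399, 2.4756977, 1.8449842, 1.7451918)
)

# Rank the genes by their value in the reference sample, highest first
ref <- subset(tcell_pdx_log_wgene_melt,
              sample == "Sample_7630_T1_PDX_humanreads")
gene_order <- as.character(ref$gene[order(ref$log_fpkm, decreasing = TRUE)])

# ggplot2 draws a discrete x axis in factor-level order, not alphabetically
tcell_pdx_log_wgene_melt$gene <- factor(tcell_pdx_log_wgene_melt$gene,
                                        levels = gene_order)
print(levels(tcell_pdx_log_wgene_melt$gene))
```

The plotting code can then be rerun unchanged; dropping decreasing = TRUE gives the ascending order instead.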
This should be simple, but I can't figure out how to remove the border from around my legend. I would also like to place the legend within the graph and remove the inner grid lines and the top and left side border. I am using the scatterplot function and this is the code I've written thus far:
scatterplot(Comp1~ln1wr|Season, moose,
xlab = "Risk", ylab = "Principal component 1",
labels= row.names(moose), by.groups=T, smooth=F, boxplots=F, legend.plot=F)
legend("bottomleft", moose, fill=0)
Here I was just experimenting to even see if I could get the legend to be placed somewhere else, but each time I run this code, I get an error
Error in as.graphicsAnnot(legend) :
argument "legend" is missing, with no default
I would like to place the legend within the graph, but where it will not overlap the plotted data. Here is sample data:
structure(list(ID = structure(c(1L, 1L, 1L, 1L, 1L, 32L, 33L,
33L, 34L, 34L, 34L), .Label = c("F07001", "F07002", "F07003",
"F07004", "F07005", "F07006", "F07008", "F07009", "F07010", "F07011",
"F07014", "F07015", "F07017", "F07018", "F07019", "F07020", "F07021",
"F07022", "F07023", "F07024", "F10001", "F10004", "F10008", "F10009",
"F10010", "F10012", "F10013", "F98015", "M07007", "M07012", "M07013",
"M07016", "M10007", "M10011", "M10015"), class = "factor"), Season = structure(c(1L,
1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L), .Label = c("SUM", "WIN"
), class = "factor"), Time = structure(c(1L, 2L, 1L, 2L, 1L,
2L, 1L, 2L, 1L, 2L, 1L), .Label = c("day", "night"), class = "factor"),
Repro = structure(c(2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L,
3L), .Label = c("f", "fc", "m"), class = "factor"), Comp1 = c(-0.524557195,
-0.794214153, -0.408247216, -0.621285004, -0.238828585, 0.976634392,
-0.202405922, -0.633821539, -0.306163898, -0.302261589, 1.218779672
), ln1wr = c(0.833126490613386, 0.824526258616325, 0.990730077688989,
0.981816265754353, 0.933462450382474, 1.446048015519, 1.13253050687157,
1.1349442179155, 1.14965388471562, 1.14879830358128, 1.14055365645628
)), .Names = c("ID", "Season", "Time", "Repro", "Comp1",
"ln1wr"), row.names = c(1L, 2L, 3L, 4L, 5L, 220L, 221L, 222L,
223L, 224L, 225L), class = "data.frame")
I would suggest
par(bty="l",las=1)
scatterplot(Comp1~ln1wr|Season, moose,
xlab = "Risk", ylab = "Principal component 1",
labels= row.names(moose),
by.groups=TRUE, smooth=FALSE, boxplots=FALSE,
grid=FALSE,
legend.plot=FALSE)
legend("bottomright", title="Season",
legend=levels(moose$Season), bty="n",
pch=1:2, col=1:2)
As indicated in ?legend, bty controls the legend box -- "n" means "none".
I put the legend in the bottom right rather than in the bottom left because it seems to avoid your data better that way.
I used bty="l" to eliminate the top and right box edges (this means "box type L")
I used las=1 to get the y-axis tick labels horizontal -- you didn't ask for that but I strongly prefer it
grid=FALSE removes the internal grid lines
You have to take the unique moose IDs, as you have more than one point for each moose.
legend("bottomleft", legend=unique(moose$ID))
Then you have to associate a color and a point type to your legend (corresponding to your moose ID in your plot). I would also have a look at plot() instead of scatterplot().
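On the error itself: in legend("bottomleft", moose, fill=0), moose is passed positionally and is matched to legend()'s second parameter, y, so the legend argument is left unset. Naming it avoids this. As a rough sketch of the plot() route, using the Season, Comp1 and ln1wr columns of the question's sample data rounded to three decimals (season_id and lg are illustrative names):

```r
# Three columns of the question's sample data, rounded to 3 decimals
moose <- data.frame(
  Season = factor(c("SUM", "SUM", "SUM", "WIN", "WIN", "WIN",
                    "SUM", "SUM", "SUM", "SUM", "WIN")),
  Comp1 = c(-0.525, -0.794, -0.408, -0.621, -0.239, 0.977,
            -0.202, -0.634, -0.306, -0.302, 1.219),
  ln1wr = c(0.833, 0.825, 0.991, 0.982, 0.933, 1.446,
            1.133, 1.135, 1.150, 1.149, 1.141)
)
season_id <- as.integer(moose$Season)  # 1 = SUM, 2 = WIN

# bty = "l" keeps only the left and bottom box edges; plot() draws no grid
plot(Comp1 ~ ln1wr, data = moose, col = season_id, pch = season_id,
     bty = "l", las = 1, xlab = "Risk", ylab = "Principal component 1")

# Pass the labels by name with legend=; bty = "n" drops the legend box
lg <- legend("bottomright", title = "Season",
             legend = levels(moose$Season),
             col = 1:2, pch = 1:2, bty = "n")
```

The col and pch values in legend() have to mirror the ones used in plot(), otherwise the key will not match the points.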
I have produced a faceted graph in ggplot2 and the x axis title (bottom) is touching the scale values slightly (it's worse when I plot to a .pdf device). How do I move the axis title down a smidge?
DF<-structure(list(race = structure(c(3L, 1L, 3L, 2L, 3L, 1L, 2L,
2L, 2L, 3L, 2L, 1L, 3L, 3L, 3L, 3L, 2L, 1L, 2L, 3L), .Label = c("asian",
"black", "white"), class = "factor"), gender = structure(c(1L,
1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 1L,
2L, 2L, 2L), .Label = c("female", "male"), class = "factor"),
score = c(0.0360497844302483, 0.149771418578119, 0.703017688328021,
1.32540102136392, 0.627084455719946, -0.320051801571444,
0.852281028633536, -0.440056896755573, 0.621765489966213,
0.58981396944136, 1.95257757882381, 0.127301498272644, -0.0906338578670778,
-0.637727808028146, -0.449607617033673, 1.03162398117388,
0.334259623567608, 0.0912327543652576, -0.0789977852804991,
0.511696466039959), time1 = c(75.9849658266583, 38.7148843859919,
54.3512613852158, 37.3210772390582, 83.8061071736856, 14.3853324033061,
79.2285735003004, 31.1324602891428, 22.2294730114138, 26.427263191766,
40.5529893144888, 19.2463281412667, 8.45085646487301, 97.6770352620696,
61.1874163107771, 31.3727683430548, 99.4155144857594, 79.0996849438957,
21.2504885323517, 94.1079332400361)), .Names = c("race",
"gender", "score", "time1"), class = "data.frame", row.names = c(NA,
-20L))
require(ggplot2)
p <- ggplot(DF, aes(score, time1, group=gender))
p + geom_point(aes(shape=19)) + facet_grid(race~gender) + scale_x_continuous('BLAH BLAH') +
scale_y_continuous('Some MOre Of theat Good Blahing')
In my data BLAH BLAH is touching the numbers. I need it to move down. How?
You can adjust the positioning of the x-axis title using:
+ opts(axis.title.x = theme_text(vjust=-0.5))
Play around with the -0.5 "vertical justification" parameter until it suits you/your display device.
This is an easy workaround, based on the answer provided here
Just add a line break, \n, at the start of your axis title: xlab("\nYour_x_Label") (or at the end if you need to move your y label).
It doesn't offer as much control as Eduardo's suggestion in the comments, theme(axis.title.x = element_text(vjust=-0.5)), or the use of margin, but it is much simpler!
I would like to note that this is not my answer but @JWilliman's; it is in the comments on @Prasad Chalasani's answer. I am writing this up because the currently upvoted answers did not work well for me, but @JWilliman's solution does:
#Answer
+ theme(axis.title.x = element_text(margin = margin(t = 20)))
This is because theme(axis.title.x = element_text(vjust = 0.5)) has been superseded and now moves the title/label a fixed distance regardless of the value you put in.
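As a self-contained sketch of the margin approach and the line-break workaround from this thread, the example below uses the built-in mtcars data as a stand-in for the question's DF (an assumption made only to keep the example short). The 20 is in points, the default unit of margin(), and is a value to tune for your output device.

```r
library(ggplot2)

# Built-in mtcars stands in for the question's DF
p <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  facet_grid(cyl ~ am) +
  xlab("BLAH BLAH") +
  # Add 20 pt of space above the x-axis title, away from the tick labels
  theme(axis.title.x = element_text(margin = margin(t = 20)))
print(p)

# The line-break workaround: no theme() call, but less control over distance
p_newline <- ggplot(mtcars, aes(wt, mpg)) +
  geom_point() +
  xlab("\nBLAH BLAH")
print(p_newline)
```

For the y-axis title the corresponding setting would be margin(r = ...) on axis.title.y.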