Can I arrange my study labels in a forest plot using study years after specifying the byvar to be something different? (Meta package-R)

Can I arrange all my study labels within subgroups in my forest plot with the year of publication after specifying that I want subgroups divided by a certain variable?
Here is the code I am currently using.
brugia.forest <- metaprop(event = no.positive, n = no.tested,
                          studlab = studylabel, data = brugia,
                          byvar = diagnostics,
                          bylab = c("direct detection", "direct and indirect detection", "indirect detection"),
                          print.byvar = FALSE, sm = "PLO",
                          method.tau = "REML", title = "", hakn = TRUE)
I would like the studies within the "diagnostics" groups to be arranged from the oldest to the most recent, not alphabetically as is currently the case. I am using the meta package of R because of its user-friendliness and would like to continue using it (so metafor suggestions may not be too helpful).
Thanks.

I would like to answer this question because the creator of the meta package, Dr. Guido Schwarzer, was kind enough to answer it via email. Here is the way forward:
By default, the forest function does not sort the studies at all;
instead, the order of the dataset is used. One can therefore sort the dataset before using it for the forest plot.
Alternatively, one can use the 'sortvar' argument of forest() to change the order of studies by specifying the variable to sort them by.
Hope this helps.
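Both routes might look roughly like the sketch below. The toy data frame and its `year` column are made up for illustration (the original dataset is not shown), and `method = "Inverse"` is only chosen here to keep the example light; I have not checked this against every version of meta.

```r
# Toy data; all column names and values are hypothetical.
brugia_toy <- data.frame(
  studylabel  = c("Smith 2015", "Adams 2010", "Brown 2008", "Clark 2003"),
  year        = c(2015, 2010, 2008, 2003),
  diagnostics = c("direct", "direct", "indirect", "indirect"),
  no.positive = c(12, 5, 30, 9),
  no.tested   = c(100, 80, 150, 60)
)

# Option 1: sort the data by subgroup, then by year, before the meta-analysis
brugia_sorted <- brugia_toy[order(brugia_toy$diagnostics, brugia_toy$year), ]

# Option 2: leave the data as it is and sort at plotting time via sortvar
if (requireNamespace("meta", quietly = TRUE)) {
  m <- meta::metaprop(event = no.positive, n = no.tested,
                      studlab = studylabel, data = brugia_toy,
                      sm = "PLO", method = "Inverse")
  meta::forest(m, sortvar = brugia_toy$year)
}
```

With the original object, the second option would presumably reduce to forest(brugia.forest, sortvar = year), assuming the dataset has a publication-year column of that name.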

Another option is to use ggplot2, if you are familiar with that package. It gives a lot of flexibility in arranging and modifying the plots.

Related

Understanding the output from a factor analysis using the FAMD function

I have some survey data where people were asked questions and given a yes or no option (1=yes, 0=no). I would like to be able to pick out some patterns in this data.
The questions are:
Do you enjoy XX work?
Do you do XX work alone?
Has your workload increased?
Do you have a backlog of work?
I would like to know whether people who work alone are more likely to have an increased workload, a backlog of work and not enjoy their job. To answer this, I think factor analysis is the way to go but I'm struggling to interpret the output.
Here is an example of my data:
enjoy <- c(1,1,0,1,0,1,0,0,0,1)
alone <- c(0,0,1,1,1,0,0,1,1,0)
workload <- c(0,0,1,1,0,1,0,0,0,1)
backlog <- c(0,0,1,1,0,1,0,0,0,0)
data <- data.frame(enjoy, alone, workload, backlog)
library(dplyr)
data <- data %>% mutate_if(is.numeric, as.character) ## convert from numeric to categorical
I'm using the FAMD function in FactoMineR as this can use categorical data.
library(FactoMineR)
data_famd <- FAMD(data, graph = FALSE)
Then using factoextra, I can see which variables contribute to each axis
library(factoextra)
# Contribution to the first dimension
fviz_contrib(data_famd, "var", axes = 1) ## backlog & workload
# Contribution to the second dimension
fviz_contrib(data_famd, "var", axes = 2) ## enjoy and alone
Then I can make this plot:
fviz_mfa_ind(data_famd,
             habillage = "alone", # color by groups
             palette = c("#00AFBB", "#E7B800", "#FC4E07"),
             addEllipses = TRUE, ellipse.type = "confidence",
             repel = TRUE) # Avoid text overlapping
It looks as though people who work alone and people who do not answer the questions differently. But I don't understand which answers people who work alone (yellow) are giving compared with people who don't work alone. The two groups are clearly distinct, so they must be answering differently.
My main question is: What do the axes mean? I've done PCAs using continuous data before, and using the loadings I can figure out what the axes mean and therefore interpret these graphs. How do you do this for a factor analysis? Is there a different package?
Thanks for any help.
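Separately from the factor analysis, a plain cross-tabulation of the example vectors may already show how the "alone" group answers. This is only a base R sketch on the ten toy rows above, not an interpretation of the FAMD axes.

```r
enjoy    <- c(1,1,0,1,0,1,0,0,0,1)
alone    <- c(0,0,1,1,1,0,0,1,1,0)
workload <- c(0,0,1,1,0,1,0,0,0,1)
backlog  <- c(0,0,1,1,0,1,0,0,0,0)

# Row-wise proportions: within each "alone" group, the share answering 0 or 1
workload_by_alone <- prop.table(table(alone, workload), margin = 1)
backlog_by_alone  <- prop.table(table(alone, backlog),  margin = 1)
enjoy_by_alone    <- prop.table(table(alone, enjoy),    margin = 1)

print(workload_by_alone)
print(backlog_by_alone)
print(enjoy_by_alone)
```

Each row sums to 1, so the "1" column can be read directly as the proportion of yes answers among people who do (row "1") or do not (row "0") work alone.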

R: How does stat_density_ridges modify and plot data?

<Disclaimer(s) - (1) This is my first post, so please be gentle, specifically regarding formatting and (2) I did try to dig as much as I could on this topic before posting the question here>
I have a simple data vector containing returns of 40 portfolios on the same day:
Year Return
Now -17.39862061
Now -12.98954582
Now -12.98954582
Now -12.86928749
Now -12.37044334
Now -11.07007504
Now -10.68971539
Now -10.07578182
Now -9.984867096
Now -8.764036179
Now -8.698093414
Now -8.594026566
Now -8.193638802
Now -7.818599701
Now -7.622627735
Now -7.535216808
Now -7.391239166
Now -7.331315517
Now -5.58059597
Now -5.579797268
Now -4.525201797
Now -3.735909224
Now -2.687532902
Now -2.65363884
Now -2.177522898
Now -1.977644682
Now -1.353205681
Now -0.042584345
Now 0.096564181
Now 0.275416046
Now 0.638839543
Now 1.959529042
Now 3.715519428
Now 4.842819691
Now 5.475946426
Now 6.380955219
Now 6.535937309
Now 8.421762466
Now 8.556800842
Now 10.39185524
I am trying to plot these returns so that I can compare them with other days (for example, the rest of my history). I tried to use stat_density_ridges as per the code block below:
library(ggplot2)
library(ggridges)
ggplot(data = data.plot, aes(x = Return, y = Year, fill = factor(..quantile..))) +
  stat_density_ridges(geom = "density_ridges_gradient", calc_ecdf = TRUE,
                      quantiles = c(0.025, 0.5, 0.975),
                      quantile_lines = TRUE)
As you can see, the "year" in this case is the same for every row, i.e. there is no height parameter, yet I get a nice ridge chart. While the chart is beautiful to behold, I am at a loss to determine how the plotting function is computing the density in this case, especially the height.
This is the output chart I get (I have omitted the formatting code here since it doesn't make a difference to my question):
[Figure: Portfolio Return Distribution Plots - US versus Europe]
I tried digging into the code of the function itself, but came up with a total blank. The documentation didn't help (except perhaps to give me a hint that the function plots continuous distributions).
Any help, or guidance, or even a nudge in the right direction would be extremely helpful.
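For orientation, my understanding is that the ridge height comes from a kernel density estimate of the x values within each y group, so no separate height variable is needed. The base R sketch below illustrates that idea on a few of the returns listed above; the bandwidth rule used here is an assumption for illustration, not a statement about the exact internals of stat_density_ridges.

```r
# A handful of the returns from the table above, enough to illustrate the idea
returns <- c(-17.39862061, -12.98954582, -8.764036179, -5.58059597,
             -2.177522898, 0.096564181, 3.715519428, 8.421762466, 10.39185524)

# Kernel density estimate: the curve height at each x is the estimated density
kde <- density(returns, bw = "nrd0")

# Sample quantiles at the same probabilities as in the plotting call
cuts <- quantile(returns, probs = c(0.025, 0.5, 0.975))

# The area under the estimated density curve should be close to 1
area <- sum(kde$y) * diff(kde$x[1:2])

plot(kde, main = "Kernel density of returns")
abline(v = cuts, lty = 2)
```

The plotted curve is the analogue of a single ridge, and the dashed lines play the role of the quantile lines.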

Mosaic plot and text values

I created a structable from the Titanic dataset and used the mosaic function on it. Everything worked great; however, I also wanted to label each box of the mosaic plot with the number of Titanic passengers for the given Class, Survival and Sex. As it turns out, I am not able to do that. I know I need to use labeling_cells to achieve this, but I am not able to use it (and I wasn't able to find any example) in combination with structable and the code below.
library("vcd")
struct <- structable(~ Class + Survived + Sex, data = Titanic)
mosaic(struct, data = Titanic, shade = TRUE, direction = "v")
If I understand your question correctly, then the last example in ?labeling_cells is pretty close to what you want to do. Using your example, the labeling_cells() can be added afterwards provided that the viewport tree is not popped. The only aspect that is somewhat awkward is that the struct object has to be a regular table again for the labeling. I have to ask David, the main author, whether this could be handled automatically.
mosaic(struct, shade = TRUE, direction = "v", pop = FALSE)
labeling_cells(text = as.table(struct), margin = 0)(as.table(struct))
This was fixed upstream in vcd 1.4-4, but note that you can simply use
mosaic(struct, labeling = labeling_values)
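To check the numbers that the cell labels should display, the counts for Class, Survived and Sex can be computed directly from the Titanic table in base R. This is just a cross-check sketch and does not use vcd.

```r
# Titanic has dimensions Class (1), Sex (2), Age (3), Survived (4);
# summing over Age leaves Class x Survived x Sex, as in the structable formula
counts <- margin.table(Titanic, c(1, 4, 2))

# Flat view of the three-way table of passenger counts
print(ftable(counts))
```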

How to plot a large ctree() to avoid overlapping nodes

When I plotted the decision tree result from ctree() from party package, the font was too big and the box was also too big. They are overlapping other nodes.
Is there a way to customize the output from plot() so that the box and the font would be smaller ?
The short answer seems to be, no, you cannot change the font size, but there are some good other options.
I know of three possible solutions. First, you can change other parameters in the plot to make it more compact. Second, you can write it to a graphic file and view that file. Third, you can use an alternative implementation of ctree() in the partykit package, which is a newer package by some of the same authors.
Default Plot Example
library(party)
airq <- subset(airquality, !is.na(Ozone))
airct <- ctree(Ozone ~ ., data = airq,
controls = ctree_control(maxsurrogate = 3))
plot(airct) #default plot, some crowding with N hidden on leafs
Simplified plot
# simpler version of plot
plot(airct, type = "simple",           # no terminal plots
     inner_panel = node_inner(airct,
                              abbreviate = TRUE,   # short variable names
                              pval = FALSE,        # no p-values
                              id = FALSE),         # no id of node
     terminal_panel = node_terminal(airct,
                                    abbreviate = TRUE,
                                    digits = 1,          # few digits on numbers
                                    fill = c("white"),   # make box white not grey
                                    id = FALSE)
)
This is somewhat better and one might be able to improve it further. To figure out these details, I originally did class(airct) which returned "BinaryTree". Armed with this info, I started reading ?plot.BinaryTree
Write to a file
A second simple solution is to write the plot to a file and then view the file. You may need to play with the settings to find the best fit.
png("airct.png", res=80, height=800, width=1600)
plot(airct)
dev.off()
Plot with partykit package instead
Finally, you can use a newer and not-yet-finished re-implementation of the party package by some of the same authors. At this point (Dec 2012), the only function they have re-done is ctree(). This version allows you to change font size.
library(partykit)
airct <- ctree(Ozone ~ ., data = airq)
class(airct) # different class from before
# "constparty" "party"
plot(airct, gp = gpar(fontsize = 6),   # font size changed to 6
     inner_panel = node_inner,
     ip_args = list(
       abbreviate = TRUE,
       id = FALSE)
)
Here I have left the leaves in their default setting because I have frankly never figured out how to get them to work the way I want. I suspect this has to do with the fact that the package is incomplete (as of Dec 2012). You can read about the plot method starting with ?plot.party
Another option (that doesn't change what you want but does potentially solve the underlying problem) is to change the size of the figure itself, as I learned in my class for my assignment.
Replace the r in the below:
{r}
with:
{r, fig.width=X, fig.height=Y}
where the X and Y need to be replaced by numbers chosen by you depending on what size you think works better.
This website talks about doing this in more detail, including how to set it once for the whole document.
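For the document-wide variant, one common approach is to set the knitr chunk defaults once in a setup chunk near the top of the file. A minimal sketch, with placeholder sizes in inches:

```r
# Typically placed in a setup chunk at the top of the R Markdown file.
# The width and height values are placeholders; pick what suits the tree.
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::opts_chunk$set(fig.width = 12, fig.height = 8)
}
```

Individual chunks can still override these defaults with their own fig.width and fig.height options.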

Exporting individual pdf's for each plot created from a single file in R (using lattice)

So I have a fairly large dataset of GPS locations corresponding to different individuals at different times. It looks like a more complicated version of this...
ID    Year    Julian.date    Distance
1     2003    15             200
1     2004    20             500
1     2005    24             462
1     2006    28             51
2     2002    12             248
2     2003    15             571
2     2004    16             685
3     2003    20             521
3     2004    25             1251
3     2005    29             225
3     2006    54             144
What I am trying to do is separate the data out by year and individual, so that each individual will have a plot of their Distances against the corresponding Julian date. I am able to create a massive pdf of all the plots (12 x 11) on one sheet using the lattice package (Separator is a column combining the ID and Year columns):
> barchart(Julian.date~Distance|factor(Separator),data=data)
This isn't particularly helpful as I can't do much with such a massive pdf. So I tried restricting the number of plots per sheet to 1 using...
> barchart(Julian.date~Distance|factor(Separator),data=data,layout=c(1,1))
Which results in all the plots flying past me and none of them exporting to pdf. I have tried searching for a way to accomplish this, but so far no luck. If anyone knows a way of getting these to export as they fly past I would be extremely thankful.
So thanks in advance if anyone out there can help out. And if you need any more information, let me know, I tend not to use the terminology properly.
Ayden
I'm not sure what you are doing, or doing wrong as you don't show code, but using an example modified from ?barchart I see a PDF with multiple pages using this code:
foo <- barchart(yield ~ variety | site, data = barley,
                groups = year, layout = c(1,1), stack = TRUE,
                auto.key = list(space = "right"),
                ylab = "Barley Yield (bushels/acre)",
                scales = list(x = list(rot = 45)))
pdf("foo.pdf", onefile = TRUE)
print(foo)
dev.off()
onefile = TRUE should be the default and allows multiple pages in a single PDF. The other thing I do is print the barchart object in the pdf() wrapper; again, I don't think this is required if you are running R interactively but it will be needed if this is a batch or script based job.
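If separate PDF files per plot are wanted rather than one multi-page file, a variation is to pass onefile = FALSE together with a file name containing a page-number format such as "%03d". The sketch below uses a few rows from the question's table; the Separator construction, the output directory and the file name pattern are assumptions for illustration.

```r
library(lattice)

# A few rows from the question's table; Separator combines ID and Year
gps <- data.frame(
  ID          = c(1, 1, 2, 2),
  Year        = c(2003, 2004, 2002, 2003),
  Julian.date = c(15, 20, 12, 15),
  Distance    = c(200, 500, 248, 571)
)
gps$Separator <- factor(paste(gps$ID, gps$Year, sep = "_"))

# One panel per page, as in the question
p <- barchart(Julian.date ~ Distance | Separator, data = gps, layout = c(1, 1))

# onefile = FALSE plus "%03d" writes each page to its own numbered file
out_dir <- tempdir()
pdf(file.path(out_dir, "panel%03d.pdf"), onefile = FALSE)
print(p)
dev.off()

pdf_files <- list.files(out_dir, pattern = "^panel[0-9]{3}\\.pdf$")
print(pdf_files)
```

Here each level of Separator ends up on its own page and therefore in its own file.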
