Plot LOESS (STL) decomposition using ggvis - r

I want to be able to plot the three components of the Seasonal Trend Decomposition using Loess (STL) with ggvis.
However, I receive this error:
Error: data_frames can only contain 1d atomic vectors and lists
I am using the nottem data set.
# The Seasonal Trend Decomposition using Loess (STL) with Ggvis
# Load nottem data set
library(datasets)
nottem <- nottem
# Decompose using stl()
nottem.stl = stl(nottem, s.window="periodic")
# Plot decomposition
plot(nottem.stl)
Now, this is the information I am interested in. In order to make this into a plot that I can play around with I transform this into a data frame using the xts-package. So far so good.
# Transform nottem.stl to a data.frame
library(xts)
df.nottem.stl <- as.data.frame(as.xts(nottem.stl$time.series))
# Add date to data.frame
df.nottem.stl$date <- data.frame(time = seq(as.Date("1920-01-01"), by = ("months"), length =240))
# Glimpse data (glimpse() comes from dplyr)
library(dplyr)
glimpse(df.nottem.stl)
# Plot simple line of trend
plot(df.nottem.stl$date, df.nottem.stl$trend, type = "o")
This is pretty much the plot I want. However, I want to be able to use it with Shiny, and therefore ggvis is preferable.
# Plot ggvis
df.nottem.stl%>%
ggvis(~date, ~trend)%>%
layer_lines()
This is where I get my error.
Any hints on what might go wrong?

First of all, the date column of your df.nottem.stl is itself a data.frame (you assigned data.frame(time = ...) to it), which is what triggers the error, so you should refer to the inner column as date$time. Then use the layer_paths function instead of layer_lines; in my experience layer_paths tends to work better than layer_lines.
So this will work:
library(ggvis)
df.nottem.stl%>%
ggvis(~date$time, ~trend)%>%
#for points
layer_points() %>%
#for lines
layer_paths()
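An alternative to referencing date$time is to avoid the nested data frame in the first place. The following is a minimal sketch, not the original poster's code: it assumes that converting the STL components with as.data.frame() directly (skipping xts) is acceptable, and it assigns the Date vector to the column without wrapping it in data.frame(). The ggvis call is guarded because the package may not be installed.

```r
library(datasets)

# Decompose the built-in nottem series (monthly, Jan 1920 to Dec 1939).
nottem.stl <- stl(nottem, s.window = "periodic")

# as.data.frame() on the multivariate ts keeps the component columns
# seasonal, trend and remainder.
df.nottem.stl <- as.data.frame(nottem.stl$time.series)

# Assign the Date vector itself, NOT data.frame(time = ...), so that the
# column is a plain 1d vector.
df.nottem.stl$date <- seq(as.Date("1920-01-01"), by = "month",
                          length.out = nrow(df.nottem.stl))

str(df.nottem.stl)

# With a plain Date column, ~date can be used directly.
if (requireNamespace("ggvis", quietly = TRUE)) {
  library(ggvis)
  df.nottem.stl %>%
    ggvis(~date, ~trend) %>%
    layer_paths()
}
```

With the column stored this way, str() should report date as a Date vector rather than a nested data.frame.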

Related

Represent a colored polygon in ggplot2

I am using the spatstat package because I am working on spatial patterns.
I would like to reproduce the following graph, produced with the plot.quadrattest function, in ggplot, using colors instead of numbers (the numbers are not very readable): Polygone
The numbers I want to drive the color intensity are those at the bottom of each box.
The test object contains the following data:
Test object
I have looked at the help of the function, as well as the code of the function but I still cannot manage it.
Ideally I would like my final figure to look like this (maybe not with the same colors haha):
Final object
Thanks in advance for your help.
Please provide a reproducible example in the future.
The package reprex may be very helpful.
To use ggplot2 for this my best bet would be to convert
spatstat objects to sf and do the plotting that way,
but it may take some time. If you are willing to use base
graphics and spatstat you could do something like:
library(spatstat)
# Data (using a built-in dataset):
X <- unmark(chorley)
plot(X, main = "")
# Test:
test <- quadrat.test(X, nx = 4)
# Default plot:
plot(test, main = "")
# Extract the `quadratcount` object (regions with observed counts):
counts <- attr(test, "quadratcount")
# Convert to `tess` (raw regions with no numbers)
regions <- as.tess(counts)
# Add residuals as marks to the tessellation:
marks(regions) <- test$residuals
# Plot regions with marks as colors:
plot(regions, do.col = TRUE, main = "")

Set common y axis limits from a list of ggplots

I am running a function that returns a custom ggplot from an input data (it is in fact a plot with several layers on it). I run the function over several different input data and obtain a list of ggplots.
I want to create a grid with these plots to compare them but they all have different y axes.
I guess what I have to do is extract the maximum and minimum y axis limits from the ggplot list and apply them to each plot in the list.
How can I do that? I guess it's through the use of ggplot_build. Something like this:
test = ggplot_build(plot_list[[1]])
> test$layout$panel_scales_x
[[1]]
<ScaleContinuousPosition>
Range:
Limits: 0 -- 1
I am not familiar with the structure of a ggplot_build object, and maybe this one in particular is not a standard one as it comes from a "custom" ggplot.
For reference, these plots are created with the gseaplot2 function from the enrichplot package.
I don't know how to "upload" an R object, but if that would help, let me know how to do it.
Thanks!
edit after comments (thanks for your suggestions!)
Here is an example of a gseaplot2 plot. GSEA stands for Gene Set Enrichment Analysis; it is a technique used in genomic studies. The gseaplot2 function calculates a running average and plots it, with another bar plot at the bottom.
and here is the grid I create to compare the plots generated from different data:
I would like to have a common scale for the "Running Enrichment Score" part.
I guess I could try to recreate the gseaplot2 function and input all of the datasets and then create the grid by facet_wrap, but I was wondering if there was an easy way of extracting parameters from a plot list.
As a reproducible example (from the enrichplot package):
library(clusterProfiler)
library(enrichplot) # provides gseaplot2()
library(magrittr)   # provides %>%
data(geneList, package="DOSE")
gene <- names(geneList)[abs(geneList) > 2]
wpgmtfile <- system.file("extdata/wikipathways-20180810-gmt-Homo_sapiens.gmt", package="clusterProfiler")
wp2gene <- read.gmt(wpgmtfile)
wp2gene <- wp2gene %>% tidyr::separate(term, c("name","version","wpid","org"), "%")
wpid2gene <- wp2gene %>% dplyr::select(wpid, gene) #TERM2GENE
wpid2name <- wp2gene %>% dplyr::select(wpid, name) #TERM2NAME
ewp2 <- GSEA(geneList, TERM2GENE = wpid2gene, TERM2NAME = wpid2name, verbose=FALSE)
gseaplot2(ewp2, geneSetID=1, subplots=1:2)
And this is how I generate the plot list (there is probably a much more elegant way):
library(ggpubr) # provides ggarrange()
plot_list = list()
for (i in 1:3) {
  fig_i = gseaplot2(ewp2,
                    geneSetID = i,
                    subplots = 1:2)
  plot_list[[i]] = fig_i
}
ggarrange(plotlist = plot_list)
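One possible way to get common limits, sketched here for plain ggplot objects only: read the trained y range of each plot with layer_scales(), take the overall minimum and maximum, and apply them to every plot with coord_cartesian(). The plot list below is a hypothetical stand-in built from toy data. The output of gseaplot2 with several subplots may be a composite object rather than a single ggplot, in which case this would presumably have to be applied to the running-score panel before the panels are combined.

```r
library(ggplot2)

# Hypothetical stand-in for the real plot list: three plain ggplots
# whose y values span different ranges.
plot_list <- lapply(c(1, 5, 10), function(k) {
  d <- data.frame(x = 1:10, y = k * (1:10))
  ggplot(d, aes(x, y)) + geom_line()
})

# Trained (data) y range of each plot.
y_ranges <- lapply(plot_list, function(p) layer_scales(p)$y$range$range)

# Overall minimum and maximum across all plots.
common_ylim <- range(unlist(y_ranges))

# Apply the same limits to every plot; coord_cartesian() zooms without
# dropping data points.
plot_list <- lapply(plot_list, function(p) {
  p + coord_cartesian(ylim = common_ylim)
})
```

The adjusted plot_list can then be passed to ggarrange(plotlist = plot_list) as before.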

Aggregate data and plot in a bar plot in R

I have a data set with parameter_variations and a score. This score has four scales: like, anth, comf and ueq.
The bargraph.CI function (from the sciplot package) accepts raw data, not aggregated data. So try the following:
bargraph.CI(parameter_variants, response=score, group=scale, data=dat,
main="likeability", legend=TRUE)
This should give you one "two-way" plot. If you don't like the look of it, there are many arguments that make superficial adjustments. Check the help page for details.
To obtain separate plots for each of the four scales, I think you can do something like this:
library(dplyr)
dat %>%
filter(scale=="like") %>% # change the value here.
bargraph.CI(parameter_variants, response=score, data=., main="likeability")
Base R solution:
with(subset(dat, subset=scale=="like"),
bargraph.CI(parameter_variants, response=score, main="likeability")
)
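To make the "raw, not aggregated" point concrete, here is a minimal base R sketch. The data frame dat below is entirely hypothetical (column names follow the calls above, the scores are random): it holds one row per observation, which is the shape bargraph.CI expects. The aggregation into group means is then done explicitly with tapply(), and barplot() draws the means without confidence intervals.

```r
set.seed(42)

# Hypothetical raw data: one row per participant, variant and scale.
dat <- expand.grid(
  parameter_variants = c("A", "B", "C"),
  scale = c("like", "anth", "comf", "ueq"),
  participant = 1:10
)
dat$score <- round(runif(nrow(dat), min = 1, max = 5), 1)

# Aggregate by hand: mean score per scale (rows) and variant (columns).
means <- tapply(dat$score, list(dat$scale, dat$parameter_variants), mean)

# Grouped bar plot of the means, one bar per scale within each variant.
barplot(means, beside = TRUE, legend.text = TRUE,
        xlab = "parameter_variants", ylab = "mean score")
```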

Plotting a matrix "by parts" in R?

I have a 50k by 50k square matrix saved to disk in a text file and I would like to produce a simple histogram to see the distribution of the values in the matrix.
Obviously, when I try to load the matrix in R by using read.table(), a memory error is encountered as the matrix is too big. Is there any way I could load smaller submatrices one at a time, but still produce a histogram that considers all the values of the original matrix? I can indeed load smaller submatrices, but I just overwrite the histogram that I had for the last submatrix with the distribution of the new one.
Here's an approach. I don't have all the details because you did not provide sample data or the expected output, but one way to do this is through the read_csv_chunked function in the readr package. First, you will need to write your summarisation function and then apply this to each chunk. See below for a full reprex.
# Call the Required Libraries
library(dplyr)
library(ggplot2)
library(readr)
# First Generate Some Fake Data
temp <- tempfile(fileext = ".csv")
fake_dat <- as.data.frame(matrix(rnorm(1000*100), ncol = 100))
write_csv(fake_dat, temp)
# Now write a summarisation function
# This will be applied to each chunk that is read into
# memory
summarise_for_hist <- function(x, pos){
  x %>%
    mutate(added_bin = cut(V1, breaks = -6:6)) %>%
    count(added_bin)
}
# Note that I manually set the cutpoints or "breaks"
# argument. You would need to refine this based on your
# data and subject matter expertise
# Read the file chunk by chunk and apply the summary to each chunk
small_read <- read_csv_chunked(temp, # data
  DataFrameCallback$new(summarise_for_hist),
  chunk_size = 200 # number of lines to read
)
Now that we have summarised our data, we can combine and plot it.
# Generate our histogram by combining all of the results
# and plotting
small_read %>%
group_by(added_bin) %>%
summarise(total = sum(n)) %>%
ggplot(aes(added_bin, total))+
geom_col()
This yields a bar chart of the total count in each bin.
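Note that the summary function above bins only the V1 column. If the histogram should cover every value of the matrix, a base R variant is to read a block of lines at a time with scan(), which returns all values of the block as one vector, and to accumulate the bin counts. This is only a sketch under assumptions: the file is whitespace-separated numbers with no header, the breaks are fixed in advance and span every value (hist() stops with an error otherwise), and the small temporary file stands in for the real 50k by 50k matrix.

```r
# Hypothetical small stand-in for the large matrix file.
set.seed(1)
tmp <- tempfile(fileext = ".txt")
m <- matrix(rnorm(200 * 50), ncol = 50)
write.table(m, tmp, row.names = FALSE, col.names = FALSE)

# Breaks must be fixed up front and cover the full range of the data.
breaks <- seq(-6, 6, by = 0.5)
counts <- numeric(length(breaks) - 1)

con <- file(tmp, open = "r")
repeat {
  # Read the next 40 lines; an empty result means end of file.
  vals <- scan(con, what = numeric(), nlines = 40, quiet = TRUE)
  if (length(vals) == 0) break
  # Add this block's bin counts to the running totals.
  counts <- counts + hist(vals, breaks = breaks, plot = FALSE)$counts
}
close(con)

# Plot the accumulated counts, labelled by bin midpoint.
mids <- head(breaks, -1) + diff(breaks) / 2
barplot(counts, names.arg = mids, xlab = "value", ylab = "count")
```

Only one block of lines is held in memory at a time, so the block size (nlines) can be tuned to the available RAM.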

rCharts Polychart: Adding horizontal or vertical lines to a plot

I'm having some trouble understanding how to customize graphs using the rPlot function in the rCharts Package. Say I have the following code
#Install rCharts if you do not already have it
#This will require devtools, which can be downloaded from CRAN
require(devtools)
install_github('ramnathv/rCharts')
#simulate some random normal data
x <- rnorm(100, 50, 5)
y <- rnorm(100, 30, 2)
#store in a data frame for easy retrieval
demoData <- data.frame(x,y)
#generate the rPlot Object
demoChart <- rPlot(y~x, data = demoData, type = 'point')
#return the object // view the plot
demoChart
This will generate a plot and that is nice, but how would I go about adding horizontal lines along the y-axis? For example, if I wanted to plot a green line which represented the average y-value, and then red lines which represented +/- 3 standard deviations from the average? If anybody knows of some documentation and could point me to it then that would be great. However, the only documentation I could find was on the polychart.js (https://github.com/Polychart/polychart2) and I'm not quite sure how to apply this to the rCharts rPlot function in R.
I have done some digging and I feel like the answer is going to have something to do with adding/modifying the layers parameter within the rPlot object.
#look at the slots in this object
demoChart$params$layers
#doing this will return the following output (which will be different for
#everybody because I didn't set a seed). Also, I removed rows 6:100 of the data.
demoChart$params$layers
[[1]]
[[1]]$x
[1] "x"
[[1]]$y
[1] "y"
[[1]]$data
x y
1 49.66518 32.75435
2 42.59585 30.54304
3 53.40338 31.71185
4 58.01907 28.98096
5 55.67123 29.15870
[[1]]$facet
NULL
[[1]]$type
[1] "point"
If I figure this out I will post a solution, but I would appreciate any help/advice in the meantime! I don't have much experience playing with objects in R. I feel like this is supposed to have some similarity to ggplot2 which I also don't have much experience with.
Thanks for any advice!
You can overlay additional graphs onto your rCharts plot using layers. Add the values for each additional layer as columns of your original data.frame; copy_layer lets the extra layers reuse the values from that data.frame.
# Horizontal reference lines using rCharts
require(rCharts)
mtcars$avg <- mean(mtcars$mpg)
# +/- 1 SD is shown here; use 3 * sd(mtcars$mpg) for +/- 3 SD lines
mtcars$sdplus <- mtcars$avg + sd(mtcars$mpg)
mtcars$sdneg <- mtcars$avg - sd(mtcars$mpg)
p1 <- rPlot(mpg~wt, data=mtcars, type='point')
p1$layer(y='avg', copy_layer=TRUE, type='line', color=list(const='red'))
p1$layer(y='sdplus', copy_layer=TRUE, type='line', color=list(const='green'))
p1$layer(y='sdneg', copy_layer=TRUE, type='line', color=list(const='green'))
p1
Here are a couple of examples: one from the main rCharts website and the other showing how to overlay a regression line.
