I am working with R. Suppose I have the following data frame:
my_data <- data.frame(
"col" = c("red","red","red","red","red","blue","blue","blue","blue","blue","green", "green", "green", "green","green"),
"x_cor" = c(1,2,5,6,7,4,9,1,0,1,4,4,7,8,2),
"y_cor" = c(2,3,4,5,9,5,8,1,3,9,11,5,7,9,1),
"frame_number" = c(1,2,3,4,5, 1,2,3,4,5, 1,2,3,4,5)
)
my_data$col = as.factor(my_data$col)
head(my_data)
    col x_cor y_cor frame_number
1   red     1     2            1
2   red     2     3            2
3   red     5     4            3
4   red     6     5            4
5   red     7     9            5
6  blue     4     5            1
In R, is it possible to create a (two-dimensional) graph that will "animate" each colored point to a new position based on the "frame number"?
For example:
I started following the instructions from this website here: https://www.datanovia.com/en/blog/gganimate-how-to-create-plots-with-beautiful-animation-in-r/
First, I made a static graph:
library(ggplot2)
library(gganimate)
p <- ggplot(
  my_data,
  aes(x = x_cor, y = y_cor, colour = col)
)
Then, I tried to animate it:
p + transition_time(frame_number) +
labs(title = "frame_number: {frame_number}")
Unfortunately, this produced an empty plot and the following warnings:
There were 50 or more warnings (use warnings() to see the first 50)
1: Cannot get dimensions of plot table. Plot region might not be fixed
2: values must be length 1,
but FUN(X[[1]]) result is length 15
Can someone please show me how to fix this problem?
Thanks
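One possible cause, offered as a sketch rather than a confirmed diagnosis: the ggplot() call above sets the data and aesthetics but adds no geom layer, so there is nothing for gganimate to draw. The version below adds geom_point() and uses {frame_time}, the label variable that gganimate documents for transition_time(). The object name anim is only illustrative.

```r
library(ggplot2)
library(gganimate)

# Same data as in the question, written more compactly
my_data <- data.frame(
  col = factor(rep(c("red", "blue", "green"), each = 5)),
  x_cor = c(1, 2, 5, 6, 7, 4, 9, 1, 0, 1, 4, 4, 7, 8, 2),
  y_cor = c(2, 3, 4, 5, 9, 5, 8, 1, 3, 9, 11, 5, 7, 9, 1),
  frame_number = rep(1:5, times = 3)
)

# Static plot: the geom_point() layer is what actually draws the points
p <- ggplot(my_data, aes(x = x_cor, y = y_cor, colour = col)) +
  geom_point(size = 3)

# Animated plot: one state per value of frame_number
anim <- p +
  transition_time(frame_number) +
  labs(title = "frame_number: {frame_time}")

# Rendering needs a renderer package such as gifski; uncomment to render:
# animate(anim)
```

Note that colour = col maps the factor levels to ggplot2's default palette, so the points are not necessarily drawn in literal red, blue and green.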
Related
I was trying to fill multiple polygons with color using the following code:
geom_polygon(data = Case1_polygon, aes(x, y, fill = factor(cluster_id),group = cluster_id, color = factor(cluster_id)), show.legend = FALSE)
But this returns the following error:
Error: Insufficient values in manual scale. 84 needed but only 3 provided.
I don't understand why this error occurs, since the 'group' and 'color' arguments work properly.
What is the problem?
Here are the first 20 rows of the data:
cluster_id x y
2 15.11161 12.44378
2 15.13238 12.51207
2 15.13430 12.55426
2 15.22371 12.59345
2 15.23244 12.65508
2 15.12325 12.62875
2 15.11938 12.63302
2 15.08193 12.64385
2 15.04474 12.64052
2 15.05771 12.59970
2 15.04334 12.53319
2 15.04363 12.52807
2 15.04408 12.52420
2 15.09167 12.48072
2 15.11161 12.44378
3 14.86679 12.76042
3 14.81625 12.77065
3 14.80090 12.83679
3 14.79523 12.85389
3 14.79443 12.86224
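The error message says that a manual scale received 3 colours while 84 are needed, which suggests that a scale_fill_manual() (or similar) elsewhere in the plot was written for 3 groups, whereas factor(cluster_id) has 84 levels. A minimal sketch of one way around this, assuming it is acceptable to generate one colour per cluster: the toy Case1_polygon below is a hypothetical stand-in with three triangles, and cluster_cols is an illustrative name.

```r
library(ggplot2)

# Hypothetical stand-in for Case1_polygon: three small triangles, one per cluster
Case1_polygon <- data.frame(
  cluster_id = rep(1:3, each = 3),
  x = c(0, 1, 0.5, 2, 3, 2.5, 4, 5, 4.5),
  y = c(0, 0, 1, 0, 0, 1, 0, 0, 1)
)

# Generate exactly as many colours as there are clusters
n_clusters <- length(unique(Case1_polygon$cluster_id))
cluster_cols <- grDevices::hcl.colors(n_clusters, palette = "Dark 3")

p <- ggplot() +
  geom_polygon(
    data = Case1_polygon,
    aes(x, y, fill = factor(cluster_id), group = cluster_id,
        colour = factor(cluster_id)),
    show.legend = FALSE
  ) +
  scale_fill_manual(values = cluster_cols) +
  scale_colour_manual(values = cluster_cols)
```

Because the colour vector is sized from the data, the same code should work unchanged for 84 clusters, although that many colours will be hard to tell apart.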
I found a cool Wes Anderson palette package, but I am failing to actually use it. The variable I am looking at (Q1) has options 1 and 2. There is also an NA in the set that is getting plotted, and I would like to remove it.
library(readxl)
library(tidyverse)
library(wesanderson)
RA_Survey <- read_excel("file extension")
ggplot(data = RA_Survey, mapping = aes(x = Q1)) +
geom_bar() + scale_fill_manual(values=wes_palette(n=2, name="GrandBudapest"))
The plot renders, but without the colors. Any ideas?
There are several issues which need to be addressed.
Using the Wes Anderson palette
As already mentioned by Mako, the fill aesthetic was missing from the call to aes().
Furthermore, the OP reports an error message saying Palette not found. The wesanderson package contains a list of available palettes:
names(wesanderson::wes_palettes)
[1] "BottleRocket1" "BottleRocket2" "Rushmore1" "Rushmore" "Royal1" "Royal2" "Zissou1"
[8] "Darjeeling1" "Darjeeling2" "Chevalier1" "FantasticFox1" "Moonrise1" "Moonrise2" "Moonrise3"
[15] "Cavalcanti1" "GrandBudapest1" "GrandBudapest2" "IsleofDogs1" "IsleofDogs2"
There is no palette called "GrandBudapest" as requested in OP's code. Instead, we have to choose between "GrandBudapest1" and "GrandBudapest2".
Also, the help file help("wes_palette") lists the available palettes.
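As a small sketch of how to check a palette name before plotting, the lines below test whether the requested name exists and list the names that contain it. The variable names requested and candidates are only illustrative.

```r
library(wesanderson)

requested <- "GrandBudapest"

# FALSE here means wes_palette() would not find the palette
requested %in% names(wes_palettes)

# List the available palettes whose name contains the requested text
candidates <- grep(requested, names(wes_palettes), value = TRUE, fixed = TRUE)
candidates
```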
Here is a working example which uses the dummy data created in the Data section below:
library(ggplot2)
library(wesanderson)
ggplot(RA_Survey, aes(x = Q1, fill = Q1)) +
geom_bar() +
scale_fill_manual(values=wes_palette(n=2, name="GrandBudapest1"))
Removing NA
The OP has asked to remove the NAs from the set. There are two options:
Tell ggplot() to remove the NAs.
Remove the NAs from the data by filtering.
We can tell ggplot() to remove NAs when plotting the x axis:
library(ggplot2)
library(wesanderson)
ggplot(RA_Survey, aes(x = Q1, fill = Q1)) +
geom_bar() +
scale_fill_manual(values=wes_palette(n=2, name="GrandBudapest1")) +
scale_x_discrete(na.translate = FALSE)
Note, this produces a warning message Removed 3 rows containing non-finite values (stat_count). To get rid of the message, we can use geom_bar(na.rm = TRUE).
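Putting the two pieces together, a minimal sketch of the call with na.rm = TRUE might look like this. The small RA_Survey below is a hypothetical stand-in so that the snippet runs on its own.

```r
library(ggplot2)
library(wesanderson)

# Small hypothetical stand-in for RA_Survey, with a few missing answers
RA_Survey <- data.frame(Q1 = c("1", "2", "2", NA, "1", "2", NA, "2"))

p <- ggplot(RA_Survey, aes(x = Q1, fill = Q1)) +
  geom_bar(na.rm = TRUE) +  # drop the NA rows without a warning
  scale_fill_manual(values = wes_palette(n = 2, name = "GrandBudapest1")) +
  scale_x_discrete(na.translate = FALSE)  # do not show NA on the x axis
```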
The other option removes the NAs from the data by filtering:
library(dplyr)
library(ggplot2)
library(wesanderson)
ggplot(RA_Survey %>% filter(!is.na(Q1)), aes(x = Q1, fill = Q1)) +
geom_bar() +
scale_fill_manual(values=wes_palette(n=2, name="GrandBudapest1"))
which creates exactly the same chart.
Data
As the OP has not provided a sample dataset, we need to create our own:
library(dplyr)
set.seed(123L)
RA_Survey <- tibble(Q1 = sample(c("1", "2", NA), 20, TRUE, c(3, 6, 1)))  # tibble() replaces the deprecated data_frame()
RA_Survey
# A tibble: 20 x 1
Q1
<chr>
1 2
2 1
3 2
4 1
5 NA
6 2
7 2
8 1
9 2
10 2
11 NA
12 2
13 1
14 2
15 2
16 1
17 2
18 2
19 2
20 NA
I've run a PCA on a moderately sized data set, but I only want to visualize some of the points from that analysis: they come from repeat observations, and I want to see how close the paired observations are to each other on the plot. I've arranged the data so that the first 18 individuals are the ones I want to plot, but I can't find a way to plot just those 18 points without restricting the analysis itself to them instead of using the whole data set (43 individuals).
# My data file
TrialsMR<-read.csv("NER_Trials_Matrix_Retrials.csv", row.names = 1)
# I ran the PCA of all of my values (without the categorical variable in col 8)
R.pca <- PCA(TrialsMR[,-8], graph = FALSE)
# When I try to plot only the first 18 individuals with this method, I get an error
fviz_pca_ind(R.pca[1:18,],
labelsize = 4,
pointsize = 1,
col.ind = TrialsMR$Bands,
palette = c("red", "blue", "black", "cyan", "magenta", "yellow", "gray", "green3", "pink" ))
# This is the error
Error in R.pca[1:18, ] : incorrect number of dimensions
The 18 individuals are each paired up, so only using 9 colours shouldn't cause an error (I hope).
Could anyone help me plot just the first 18 points from a PCA of my whole data set?
My data frame looks similar to this in structure
TrialsMR
Trees Bushes Shrubs Bands
JOHN1 1 4 18 BLUE
JOHN2 2 6 25 BLUE
CARL1 1 3 12 GREEN
CARL2 2 4 15 GREEN
GREG1 1 1 15 RED
GREG2 3 11 26 RED
MIKE1 1 7 19 PINK
MIKE2 1 1 25 PINK
where each band corresponds to a specific individual that has been tested twice.
You are using the wrong argument to specify individuals. Use select.ind to choose the individuals required, e.g.:
data(iris) # test data
If you want the rows to be readily identifiable in a plot, rename them according to a grouping criterion before running the PCA. For example, number setosa in the 100s (101-150), versicolor in the 200s (201-250) and virginica in the 300s (301-350).
new_series <- c(101:150, 201:250, 301:350) # there are 50 of each
rownames(iris) <- new_series
R.pca <- prcomp(iris[,1:4], scale. = TRUE) # pca
library(factoextra)
fviz_pca_ind(X= R.pca, labelsize = 4, pointsize = 1,
select.ind= list(name = new_series[1:120]), # 120 out of 150 selected
col.ind = iris$Species ,
palette = c("blue", "red", "green" ))
Always refer to R documentation first before using a new function.
R documentation: fviz_pca {factoextra}
X
an object of class PCA [FactoMineR]; prcomp and princomp [stats]; dudi and pca [ade4]; expOutput/epPCA [ExPosition].
select.ind, select.var
a selection of individuals/variables to be drawn. Allowed values are NULL or a list containing the arguments name, cos2 or contrib
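Besides name, the list can also select by cos2 or contrib. The sketch below reflects my reading of the factoextra documentation, in which a value greater than 1 means "the top n individuals"; the object names p_cos2 and p_contrib are only illustrative.

```r
library(factoextra)

data(iris)
R.pca <- prcomp(iris[, 1:4], scale. = TRUE)

# Draw only the 20 individuals with the highest cos2
p_cos2 <- fviz_pca_ind(R.pca, select.ind = list(cos2 = 20))

# Draw only the 20 individuals with the highest contribution
p_contrib <- fviz_pca_ind(R.pca, select.ind = list(contrib = 20))
```

Selecting by name is the better fit when the individuals of interest are known in advance, as with the first 18 rows in the question.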
For your particular dummy data, this should do:
R.pca <- prcomp(TrialsMR[,1:3], scale. = TRUE)
fviz_pca_ind(X= R.pca,
select.ind= list(name = row.names(TrialsMR)[1:4]), # 4 out of 8
pointsize = 1, labelsize = 4,
col.ind = TrialsMR$Bands,
palette = c("blue", "green" )) + ylim(-1,1)
Dummy Data:
TrialsMR <- read.table( text = "Trees Bushes Shrubs Bands
JOHN1 1 4 18 BLUE
JOHN2 2 6 25 BLUE
CARL1 1 3 12 GREEN
CARL2 2 4 15 GREEN
GREG1 1 1 15 RED
GREG2 3 11 26 RED
MIKE1 1 7 19 PINK
MIKE2 1 1 25 PINK", header = TRUE)
Here is an excerpt of the dataset I am working on.
Name Value ID Total
A 10 1 3
A 11 2 3
A 10 3 3
B 10 1 4
B 11 2 4
B 11 3 4
B 11 4 4
What I want to do is plot Name on the x-axis and ID on the y-axis for all rows where Value is 11, and overlay Total on top so that the reader can see the number of items in each Name group. This might be achieved using the length of each group in the Name variable or using Total. Here is what I did, along with a sample of the desired output.
mydf <- read.csv("./test1.csv", header = T)
x <- ggplot(mydf, aes(Name, ID)) +
  geom_point(data = subset(mydf, Value == 11), size = 3, colour = "tomato3") +
  scale_y_continuous(name = "Class ID", limits = c(1, 4), breaks = seq(1, 4, by = 1))
y <- x+ xlab("Class")+theme_bw()
z <- y+scale_x_discrete(limits = c("A","B", "C"))
The three orange asterisks at (A,3) and (B,4) are manual text annotations that I want to replace with either a short line or a circle to indicate the total number of items.
Thank you for your help.
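One way this could be approached, as a sketch only: build a small table with one row per Name and its Total, then add a second point layer that draws an open circle at that height. The data frame below retypes the excerpt so the snippet runs on its own, and totals is a hypothetical helper name.

```r
library(ggplot2)

# The excerpt from the question, typed in directly
mydf <- data.frame(
  Name  = c("A", "A", "A", "B", "B", "B", "B"),
  Value = c(10, 11, 10, 10, 11, 11, 11),
  ID    = c(1, 2, 3, 1, 2, 3, 4),
  Total = c(3, 3, 3, 4, 4, 4, 4)
)

# One row per Name holding its Total
totals <- unique(mydf[, c("Name", "Total")])

p <- ggplot(mydf, aes(Name, ID)) +
  # filled points: rows where Value is 11
  geom_point(data = subset(mydf, Value == 11), size = 3, colour = "tomato3") +
  # open circle at y = Total marks the size of each Name group
  geom_point(data = totals, aes(y = Total), shape = 1, size = 6) +
  scale_y_continuous(name = "Class ID", limits = c(1, 4), breaks = 1:4) +
  xlab("Class") +
  theme_bw()
```

If Total were not available, the same table could be derived from the group sizes, for example with table(mydf$Name).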
I have the following problem:
I need to create a mosaic plot but want to display the number of cases for each mosaic, as total numbers per country differ. The plot is based on the following data:
     1 - not agree   2   3   4   5 - fully agree
DE               6   2   0   0                 1
ES               5   3   1   1                 0
FR               6   3   1   2                 0
SE               4   3   0   0                 0
I used the following code:
> mosaicplot(Q1, col=c("red", "orange", "yellow", "green", "green4"),
+ las = 1,
+ main = "There is no need to do anything about it.",
+ ylab = "",
+ xlab = "Country")
Giving me this graph:
Now I would like to divide the first red bar into six bars of the same colour, as there were 6 votes in Germany, and so on for the other tiles. Any ideas on how to accomplish that?
I applied the procedure explained here:
https://learnr.wordpress.com/2009/03/29/ggplot2_marimekko_mosaic_chart/
Only I had to use two data frames, one for the percentages and one for the absolute values.
Both data frames went through the same calculations. Whilst dfm1 created the chart, dfm21 was used for the labels:
p2 <- p1 + geom_text(aes(x = xtext, y = ytext,
label = ifelse(dfm21$value == "0", paste(" "), paste(dfm21$value))), size = 3.5)