scatterplot3d: I cannot make a surface graph in R

I'm trying to make a surface graph but cannot get it to work. The plot does not look good, and I have searched several forums for how to do this without success.
library(scatterplot3d)
x1 <- rep(10, 6)
x2 <- rep(15, 6)
x3 <- rep(20, 6)
x4 <- rep(25, 6)
x5 <- rep(30, 6)
x <- c(x1, x2, x3, x4, x5)
y1 <- rep(7, 30)
y2 <- rep(21, 30)
y3 <- rep(35, 30)
y <- c(y1, y2, y3)
z = 1781.166805 + 52.445903*y + 203.454647*x -1.570445*x*y -4.119635*(x**2)
scatterplot3d(x, y, z)
I'd highly appreciate your help!

First off, for future postings, please use the code tags to properly format your code. Secondly, your formula for z uses **, which R's parser accepts as a synonym for ^, but ^ is the idiomatic operator. Note also that your x has only 30 elements while y has 90, so x and y do not line up point by point.
Lastly, I'd strongly recommend spending some time taking the tour and learning how to ask good questions.
scatterplot3d plots a three-dimensional point cloud, not a surface. With x, y, and z all of the same length (see the sample data at the end of this answer), this works just fine:
library(scatterplot3d);
z <- 1781.166805 + 52.445903 * y + 203.454647 * x -1.570445 * x * y -4.119635 * x^2;
scatterplot3d(x, y, z);
Update
If you want to have a 3d surface (mesh) plot, you can use e.g. plotly:
require(plotly);
plot_ly(x = ~x, y = ~y, z = ~z, type = "mesh3d");
This produces an interactive (rotatable, zoomable) plot.
Sample data
x <- c(10, 10, 10, 10, 10, 10, 15, 15, 15, 15, 15, 15, 20, 20, 20,
20, 20, 20, 25, 25, 25, 25, 25, 25, 30, 30, 30, 30, 30, 30, 10,
10, 10, 10, 10, 10, 15, 15, 15, 15, 15, 15, 20, 20, 20, 20, 20,
20, 25, 25, 25, 25, 25, 25, 30, 30, 30, 30, 30, 30, 10, 10, 10,
10, 10, 10, 15, 15, 15, 15, 15, 15, 20, 20, 20, 20, 20, 20, 25,
25, 25, 25, 25, 25, 30, 30, 30, 30, 30, 30);
y <- c(7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 7,
7, 7, 7, 7, 7, 7, 7, 7, 7, 7, 21, 21, 21, 21, 21, 21, 21, 21,
21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21, 21,
21, 21, 21, 21, 21, 21, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35,
35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35, 35,
35, 35, 35, 35);
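As a side note, the sample data above is a regular 5 x 3 grid, so it can be generated with rep() instead of being typed out, and base R's persp() can draw a surface from such a grid without extra packages. The following is a minimal sketch, not part of the original answer; the helper name surface_fun and the viewing angles are illustrative assumptions.

```r
# Hypothetical helper wrapping the formula from the question
surface_fun <- function(x, y) {
  1781.166805 + 52.445903 * y + 203.454647 * x -
    1.570445 * x * y - 4.119635 * x^2
}

xs <- seq(10, 30, by = 5)   # 10, 15, 20, 25, 30
ys <- c(7, 21, 35)

# Same layout as the sample data: 90 points, x and y of equal length
x <- rep(rep(xs, each = 6), times = 3)
y <- rep(ys, each = 30)
z <- surface_fun(x, y)

# persp() needs increasing x and y vectors and a
# length(xs) by length(ys) matrix of heights
z_grid <- outer(xs, ys, surface_fun)
persp(xs, ys, z_grid, theta = 40, phi = 25,
      xlab = "x", ylab = "y", zlab = "z", ticktype = "detailed")
```

The vectors x, y, z built here can be passed to scatterplot3d() or plot_ly() exactly as in the answer; persp() is only a static alternative.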


tidying igraph plot and routing or TSP question

I have little experience in R and need help tidying my plot, as it looks messy. My project is to find the shortest route from Seoul to every city and back to Seoul. It is almost the Traveling Salesman Problem (TSP), except that some cities must be visited more than once because they are the only way to reach certain other cities. I don't know how to do this or which packages to use.
This is my code for the igraph plot:
library(igraph)
g1 <- graph( c("Seoul","Incheon","Seoul","Goyang","Seoul","Seongnam","Seoul",
"Bucheon","Seoul","Uijeongbu","Seoul","Gimpo",
"Seoul","Gwangmyeong", "Seoul", "Hanam","Seoul", "Guri",
"Seoul","Gwacheon","Busan","Changwon","Busan","Gimhae",
"Busan","Jeju","Busan","Yangsan","Busan","Geoje",
"Incheon","Goyang","Incheon","Bucheon","Incheon","Siheung",
"Incheon","Jeju","Incheon","Gimpo","Daegu","Gumi",
"Daegu","Gyeongsan","Daegu","Yeongcheon","Daejeon",
"Cheongju","Daejeon","Nonsan","Daejeon","Gongju",
"Daejeon","Gyeryong","Gwangju","Naju","Suwon","Yongin",
"Suwon","Seongnam","Suwon","Hwaseong","Suwon","Ansan",
"Suwon","Gunpo","Suwon","Osan","Suwon","Uiwang",
"Ulsan","Yangsan","Ulsan","Gyeongju","Ulsan","Miryang",
"Yongin","Seongnam","Yongin","Hwaseong","Yongin","Pyeongtaek",
"Yongin","Gwangju-si","Yongin","Icheon","Yongin","Anseong",
"Yongin","Uiwang","Goyang","Gimpo","Goyang","Paju","Goyang",
"Yangju","Changwon","Gimhae","Changwon","Jinju","Changwon",
"Miryang","Seongnam","Gwangju-si","Seongnam","Hanam","Seongnam",
"Uiwang","Seongnam","Gwacheon","Hwaseong","Ansan","Hwaseong",
"Pyeongtaek","Hwaseong","Gunpo","Hwaseong","Osan","Cheongju",
"Cheonan","Cheongju","Sejong","Bucheon","Siheung","Bucheon",
"Gwangmyeong","Ansan","Anyang","Ansan","Siheung","Ansan",
"Gunpo","Namyangju","Uijeongbu","Namyangju","Chuncheon",
"Namyangju","Hanam","Namyangju","Guri","Cheonan","Pyeongtaek",
"Cheonan","Sejong","Cheonan","Asan","Cheonan","Anseong",
"Jeonju","Gimje","Gimhae","Yangsan","Gimhae","Miryang",
"Pyeongtaek","Asan","Pyeongtaek","Osan","Pyeongtaek","Anseong",
"Pyeongtaek","Dangjin","Anyang","Siheung","Anyang","Gwangmyeong",
"Anyang","Gunpo","Anyang","Gwacheon","Siheung","Gwangmyeong",
"Siheung","Gunpo","Pohang","Yeongcheon","Pohang","Gyeongju",
"Jeju","Gimpo","Jeju","Mokpo","Jeju","Seogwipo","Uijeongbu",
"Yangju","Uijeongbu","Pocheon","Paju","Yangju","Gumi","Gimcheon",
"Gumi","Sangju","Gwangju-si","Hanam","Gwangju-si","Icheon",
"Gwangju-si","Yeoju","Sejong","Gongju","Wonju","Chungju",
"Wonju","Jecheon","Wonju","Yeoju","Jinju","Sacheon", "Yangsan",
"Miryang","Asan","Gongju","Iksan","Gunsan","Iksan","Nonsan",
"Iksan","Gimje","Chuncheon","Pocheon","Gyeongsan","Yeongcheon",
"Gunpo","Uiwang","Suncheon","Yeosu","Suncheon","Gwangyang",
"Gunsan","Gimje","Gyeongju","Yeongcheon","Geoje","Tongyeong",
"Osan","Anseong","Yangju","Pocheon","Yangju","Dongducheon",
"Icheon","Anseong","Icheon","Yeoju","Mokpo","Naju","Chungju",
"Jecheon","Chungju","Yeoju","Chungju","Mungyeong","Gangneung",
"Donghae","Gangneung","Sokcho","Seosan","Dangjin","Andong",
"Yeongju","Pocheon","Dongducheon","Gimcheon","Sangju","Tongyeong",
"Sacheon","Nonsan","Gongju","Nonsan","Boryeong","Nonsan",
"Gyeryong","Gongju","Boryeong","Gongju","Gyeryong","Jeongeup",
"Gimje","Yeongju","Mungyeong","Yeongju","Taebaek","Sangju",
"Mungyeong","Sokcho","Samcheok","Samcheok","Taebaek",
"Suncheon","Gwangju"), directed=F)
E(g1)$distance <- c(27, 16, 20, 19, 20, 24, 14, 20, 15, 15, 36, 18, 299, 18, 53,
25, 8, 12, 440, 18, 36, 13, 33, 33, 31, 26, 15, 20, 13, 20,
19, 18, 13, 16, 10, 33, 36, 51, 24, 31, 28, 21, 23, 27, 22,
11, 12, 24, 18, 52, 27, 11, 13, 19, 13, 14, 34, 20, 23, 38,
18, 12, 9, 12, 7, 10, 19, 53, 11, 8, 20, 27, 11, 26, 24, 18,
33, 25, 18, 15, 44, 14, 12, 4, 5, 12, 12, 37, 21, 458, 146,
27, 10, 23, 24, 21, 36, 14, 23, 36, 21, 39, 33, 26, 20, 32,
40, 20, 29, 18, 47, 24, 4, 27, 19, 22, 29, 17, 24, 18, 13,
32, 18, 37, 28, 43, 51, 33, 56, 20, 28, 12, 30, 38, 29, 47,
17, 47, 22, 26, 46, 51, 20, 10, 36,63)
plot(g1, edge.label=E(g1)$distance,
vertex.label.cex=0.6, vertex.size=4)
Using the trick from https://or.stackexchange.com/questions/5555/tsp-with-repeated-city-visits:
library(data.table)
library(purrr)
library(TSP)
library(igraph)
We need to create a distance matrix based on the shortest paths between each pair of vertices:
vertex_names <- names(V(g1))
N <- length(vertex_names)
dt <- map(
head(seq_along(vertex_names), -1),
~data.table(
from = vertex_names[[.x]],
to = vertex_names[(.x+1):N],
path = map(
shortest_paths(g1, vertex_names[[.x]], vertex_names[(.x+1):N])[["vpath"]],
names
)
),
) %>%
rbindlist()
Then we calculate the lengths of those shortest paths:
m <- as_adjacency_matrix(g1, type = "both", attr = "distance", sparse = FALSE)
dt[, weight := map_dbl(path, ~sum(m[embed(.x, 2)[, 2:1, drop=FALSE]]))]
Now we assemble the new matrix:
dt <- rbind(
dt, dt[, .(from = to, to = from, path = map(path, rev), weight = weight)]
)
new_m <- matrix(0, N, N)
rownames(new_m) <- colnames(new_m) <- vertex_names
new_m[as.matrix(dt[, .(from,to)])] <- dt[["weight"]]
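igraph can also compute all-pairs shortest-path lengths in a single call with distances(). The sketch below uses a made-up three-city graph (the names A, B, C and their weights are illustrative, not from the question) to show that the result is a complete distance matrix. It returns only lengths, so the path column built above is still needed to expand the tour at the end.

```r
library(igraph)

# Toy edge list: the direct A-C road (5) is longer than going via B (1 + 2)
edges <- data.frame(
  from = c("A", "B", "A"),
  to = c("B", "C", "C"),
  distance = c(1, 2, 5)
)
toy <- graph_from_data_frame(edges, directed = FALSE)

# Weighted shortest-path length between every pair of vertices
toy_m <- distances(toy, weights = E(toy)$distance)
toy_m["A", "C"]   # 3: the route via B beats the direct edge
```

For the graph in the question, the analogous call would be distances(g1, weights = E(g1)$distance).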
On this new matrix we use a heuristic to solve the TSP (for an exact solution you should use method = "concorde", which needs the external Concorde solver installed):
res <- new_m %>%
TSP() %>%
solve_TSP(repetitions = 1000, two_opt = TRUE)
Now we replace each pair of consecutive cities in the tour with the shortest path between them:
start_city <- "Seoul"
path_dt <- c(start_city, labels(cut_tour(res, start_city)), start_city) %>%
embed(2) %>%
.[,2:1,drop = FALSE] %>%
"colnames<-"(c("from", "to")) %>%
as.data.table()
path_dt <- dt[path_dt, on = .(from ,to)]
my_path <- c(unlist(map(path_dt[["path"]], head, -1)), start_city)
my_path is a heuristic solution whose total distance is tour_length(res).

error in densityplot mice- missing data example

I have the following data:
dput(example)
structure(list(q1 = c(5, 22, 16, 24, 9, 20, 21, 16, 28, 28, 24,
25, 34, 22, 29, NA, 24, 13, 10, 17, 24, 21, 22, 35, 20, 25, 25,
23, 22, 20, 27, 22, 20, 23, 5, 21, 19, 17, 27, 20, 35, 35, 10,
16, 22, 34, 34, 23, 25, 23, 25, 30, 18, 21, 15, 23, 5, 35, 5,
30), q2 = c(5, 5, 24, 15, 5, 5, 26, 23, 24, 9, 24, 5, 15, 26,
30, 14, 14, 19, 11, 25, 20, 5, 14, 13, 11, 10, 13, 16, 16, 21,
10, 12, 20, 9, 15, 5, 13, 5, 30, 18, 12, 27, 10, 9, 20, 5, 9,
10, 11, 26, 22, 8, 6, 5, 15, 6, 5, 35, 10, 18), q3 = c(11, 22,
NA, 22, 6, 18, 30, 6, 26, NA, 17, 22, 33, 19, 22, 25, 23, 13,
13, 15, 16, 16, 23, 24, 6, 25, 27, 12, 25, 17, 28, 15, 20, 31,
5, 17, 17, 20, 24, 7, 35, 35, 10, 10, 20, 10, 31, 21, 16, 32,
25, 30, 10, 24, 15, 24, 5, 35, 9, 26), q4 = c(14, 15, 23, 21,
NA, 25, 30, 23, 28, 20, 25, 5, 35, 30, 19, 23, 30, 5, 23, 18,
30, 15, 30, 22, 8, 29, 35, 23, 23, 24, 25, 25, 20, 25, 5, 15,
34, 8, 32, 35, 35, 35, 10, 6, 21, 10, 24, 27, 10, 30, 35, 15,
6, 21, 15, 15, 5, 35, 19, 26), q5 = c(5, 18, 21, 19, 5, 6, 5,
29, 20, 23, 22, 5, 16, 22, 12, 13, 18, 5, 17, 15, 18, 16, 20,
8, 12, 19, 12, 23, 9, 16, 5, 29, 20, 5, 5, 5, 5, 5, 30, 22, 32,
35, 10, 13, 20, 13, 12, 16, 5, 24, 22, 17, 5, 20, 14, 5, 5, 35,
15, 16), q6 = c(15, 9, 25, 26, 6, 17, 28, 32, 26, 28, 24, 25,
11, 24, 31, 18, 19, 6, 20, 26, 29, 17, 21, 24, 7, 29, 17, 17,
14, 25, 24, 35, 24, 6, 16, 6, 9, 6, 38, 19, 30, 42, 12, 20, 27,
26, 25, 13, 9, 36, 27, 27, 7, 24, 22, 6, 16, 42, 14, 11)), class = "data.frame", row.names = c(NA,
-60L))
I then use mice:
*edit: forgot the complete line
library(mice)
imp <- mice(example,m=5,maxit=50,meth='pmm',seed=500)
example_i <- complete(imp,1)
But when trying to get a densityplot I get the following error:
densityplot(imp)
Error in str2lang(x) : <text>:2:0: unexpected end of input
1: ~
^
My questions are:
Is there something fundamentally wrong about my approach to impute missing data? (this is just a small example)
Am I using the mice arguments properly?
What am I doing wrong with the density plot, as I have gotten it for all of the other scales I am working with?
Answer
You need to supply a formula to densityplot; otherwise it will plot all variables with more than 2 missing values. Since none of your variables has more than 2 missing values, and densityplot does not anticipate that case, it produces this cryptic error.
Example that works
example$q4[1:10] <- NA
imp <- mice(example, m = 5, maxit = 50, meth = "pmm", seed = 500)
densityplot(imp)
# equivalent: densityplot(imp, ~ q4)
Rationale
imp is of class mids, so you are calling densityplot.mids. Normally, densityplot.mids requires you to provide a formula (data argument), so that it knows which variables to plot (see ?densityplot.mids). If you want to plot q4, then the code is densityplot(imp, ~ q4).
Inside densityplot.mids, we see:
if (missing(data)) {
vnames <- vnames[!allfactors & x$nmis > 2 & x$nmis <
nrow(x$data) - 1]
formula <- as.formula(paste("~", paste(vnames,
collapse = "+", sep = ""), sep = ""))
}
If we use traceback() right after getting your error, then you will see that the last line above is the line that throws the error.
In the first line, you can see the condition x$nmis > 2, which means that it selects all the columns with more than 2 missing values. When no column satisfies the conditions, vnames evaluates to character(0), so the subsequent line produces just ~, i.e. the code that you see in your error.
So, why does it give an error when there are too few missings? That's because densityplot plots a distribution, and plotting a distribution of 1 or 2 points is just not doable.
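The mechanism can be reproduced in base R without mice. This is a minimal sketch on a made-up data frame (toy and its values are illustrative); it applies the same missing-count filter as the excerpt above, leaving out the factor check, and shows that the resulting formula text is a bare ~ that as.formula() cannot parse.

```r
# Toy data: no column has more than 2 missing values
toy <- data.frame(a = c(1, NA, 3, 4, 5),
                  b = c(NA, 2, 3, NA, 5))
nmis <- colSums(is.na(toy))

# Same missing-count filter as in densityplot.mids
vnames <- names(nmis)[nmis > 2 & nmis < nrow(toy) - 1]
vnames         # character(0): nothing qualifies

# Building the formula text from an empty set of names leaves only "~"
formula_text <- paste("~", paste(vnames, collapse = "+", sep = ""), sep = "")
formula_text

# Parsing that text fails; capture the message instead of stopping
outcome <- tryCatch(as.formula(formula_text),
                    error = function(e) conditionMessage(e))
outcome
```

Running colSums(is.na(example)) on your own data before calling densityplot(imp) shows in advance whether any variable passes the filter.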
Suggestion
The package maintainers could improve the error by simply checking whether vnames has any content, and if not, they can throw an error that is informative. You may want to add this as an issue on Github if you think it is useful.

Remove community boxes in igraph

I have created a simple minimum spanning tree and now have a data frame with columns 'from', 'to' and 'distance'.
Based on this, I found communities using the Louvain method, which I plotted. As far as I understand it, for clustering and plotting I need only the columns from and to, and the distance is not used.
How can I keep the communities I found, ideally each in a different color, but remove the box around the communities?
library(igraph)
from <- c(14, 25, 18, 19, 29, 23, 24, 36, 5, 22, 21, 29, 18, 26, 2, 45, 8, 7, 36, 42, 3, 23, 13, 13, 20, 15, 13, 7, 28, 9, 6, 37, 8, 4, 15, 27, 10, 2, 39, 1, 43, 21, 14, 4, 14, 8, 9, 40, 31, 1)
to <- c(16, 26, 27, 20, 32, 34, 35, 39, 6, 32, 35, 30, 22, 28, 45, 46, 48, 12, 38, 43, 42, 24, 27, 25, 30, 20, 50, 29, 34, 49, 40, 39, 11, 41, 46, 47, 50, 16, 46, 40, 44, 31, 17, 40, 44, 23, 33, 42, 33, 1)
distance <- c(0.3177487, 0.3908324, 0.4804059, 0.4914682, 0.5610357, 0.6061082, 0.6357532, 0.6638961, 0.7269725, 0.8136463, 0.8605391, 0.8665838, 0.8755252, 0.8908454, 0.9411793, 0.9850834, 1.0641603, 1.0721154, 1.0790506, 1.1410964, 1.1925349, 1.2115428, 1.2165045, 1.2359032, 1.2580204, 1.2725243, 1.2843610, 1.2906908, 1.3070725, 1.3397053, 1.3598817, 1.3690732, 1.3744088, 1.3972220, 1.4472312, 1.4574936, 1.4654772, 1.4689660, 1.5999424, 1.6014316, 1.6305410, 1.6450413, 1.6929959, 1.7597620, 1.8113320, 2.0380866, 3.0789517, 4.0105981, 5.1212614, 0.0000000)
mst <- cbind.data.frame(from, to, distance)
g <- graph.data.frame(mst[, 1:2], directed = FALSE)
lou <- cluster_louvain(g)
set.seed(1)
plot(lou, g, vertex.label = NA, vertex.size=5)
The blobs around the groups can be turned off like this:
plot(lou, g, vertex.label = NA, vertex.size=5, mark.groups = NULL)
Do you want this?
plot(lou, g, vertex.label = NA, vertex.size = 5, mark.border = NA)
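Another option, sketched here under the assumption that only the vertex colours matter, is to call plot() on the graph itself and pass the community membership as vertex.color. Because this uses the ordinary graph plot rather than the plot method for communities, no group outlines are drawn. The Zachary karate club graph is used as a stand-in for the MST; g_demo and lou_demo are illustrative names.

```r
library(igraph)

g_demo <- make_graph("Zachary")

set.seed(1)
lou_demo <- cluster_louvain(g_demo)

# Integer community ids are used as indices into igraph's colour palette
vertex_groups <- as.integer(membership(lou_demo))

plot(g_demo,
     vertex.color = vertex_groups,
     vertex.label = NA,
     vertex.size = 5)
```

With the data from the question, the same idea would read plot(g, vertex.color = as.integer(membership(lou)), vertex.label = NA, vertex.size = 5).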

R Time series data: Plot multiple batches

I have a big time series dataset. T0, T1, T2, ... (up to T70) are the timestamps, and there are over 400 batches (A, B, C, ...). The data contain multiple features (the Description column in the sample data) that I'm interested in plotting. My first attempt was to split the dataset by description, so that each subset has one row per batch with columns T0 to T70.
My aim is to convert this data frame into a time series object and check for seasonality in good and bad batches (for each description). Can someone suggest an easy way to do this in R? Thanks!
Update:
My subset of the data for one Description looks like this:
In order to melt the data, I used:
mdf <- melt(df,id.vars = c('Batch',colnames(df[, c(2:70)])))
and it didn't work. I want to get just three variables out of it:
Batch - Time - Value.
Any help would be appreciated!
EDIT: dput(head(df, 20)) gave the following output. I have truncated the output at T20 instead of T70.
structure(list(Batch = c("A", "B", "C",
"D", "E", "F", "G", "H",
"I", "J", "K", "L", "M",
"N", "O", "P", "Q", "R",
"S", "T"),
T0 = c(5, 6,
4, 2, 6, 3, 4, 6, 4, 1, 6, 5, 4, 5, 6, 5, 6, 5,
5, 6), T1 = c(6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
6, 6, 6, 6, 6, 5, 6, 6), T2 = c(6, 6, 6, 6, 6, 6,
6, 6, 6, 6, 6, 6, 6, 6, 5, 6, 6, 6, 6, 6), T3 = c(20,
19, 19, 19, 19, 18, 20, 20, 20, 20, 20, 20, 20, 19,
18, 19, 20, 20, 20, 19), T4 = c(21, 21, 21, 21, 20,
20, 21, 21, 21, 21, 22, 21, 22, 21, 21, 21, 22, 21,
22, 20), T5 = c(22, 22, 22, 22, 22, 21, 21, 22, 21,
22, 23, 22, 23, 22, 22, 23, 23, 23, 23, 22), T6 = c(23,
23, 24, 23, 23, 23, 23, 23, 23, 24, 24, 23, 23, 24,
23, 24, 24, 24, 24, 23), T7 = c(25, 25, 25, 24, 24,
24, 24, 25, 25, 25, 24, 25, 24, 25, 25, 26, 25, 25,
25, 25), T8 = c(26, 26, 25, 26, 25, 26, 26, 26, 26,
26, 25, 26, 26, 26, 26, 26, 25, 26, 25, 26), T9 = c(20,
23, 19, 21, 22, 27, 24, 26, 24, 25, 21, 23, 21, 22,
28, 22, 20, 24, 19, 27), T10 = c(16, 18, 14, 15, 15,
23, 19, 20, 19, 20, 15, 16, 15, 17, 23, 16, 15, 18,
15, 23), T11 = c(15, 16, 15, 15, 16, 17, 15, 14, 15,
15, 15, 14, 15, 15, 17, 15, 15, 15, 15, 17), T12 = c(15,
16, 15, 15, 16, 14, 17, 15, 15, 15, 15, 15, 15, 16,
15, 15, 15, 16, 15, 15), T13 = c(15, 16, 15, 15, 16,
15, 15, 15, 15, 15, 15, 15, 15, 16, 15, 15, 15, 16,
14, 15), T14 = c(16, 16, 15, 16, 16, 15, 16, 15, 16,
15, 15, 15, 15, 16, 16, 15, 16, 16, 15, 16), T15 = c(16,
16, 16, 16, 17, 15, 16, 15, 16, 15, 16, 15, 16, 16,
16, 16, 16, 16, 15, 16), T16 = c(16, 17, 16, 16, 17,
15, 17, 15, 16, 16, 16, 16, 16, 16, 16, 16, 16, 16,
15, 16), T17 = c(17, 19, 17, 18, 20, 15, 18, 15, 16,
16, 18, 16, 18, 19, 19, 17, 19, 17, 17, 17), T18 = c(24,
26, 27, 26, 28, 22, 25, 20, 25, 20, 26, 25, 27, 26,
25, 25, 28, 25, 27, 24), T19 = c(36, 37, 36, 38, 36,
38, 37, 31, 36, 26, 36, 37, 36, 36, 37, 36, 37, 35,
35, 35), T20 = c(38, 39, 37, 38, 38, 43, 39, 41, 39,
40, 38, 39, 38, 39, 43, 38, 37, 39, 37, 42)), row.names = c(NA,
20L), class = "data.frame")
Since you did not provide data to reproduce the problem, I will use some dummy data. For future questions, dput() your data and paste the output into your question. Your issue can be solved by melting your data. With the melt() function from reshape2, you choose the variables that act as ids, and the remaining columns are stacked into rows, with a key column (variable) recording which column each value came from. Below I apply that method and build some plots related to what you want:
library(reshape2)
library(ggplot2)
#Data
df <- data.frame(Batch=rep(c('A','B','C'),2),
Type=c('Good','Bad','Good','Good','Bad','Good'),
Description=c(rep('In',3),rep(c('Out'),3)),
T0=c(1,2,1,4,3,2),
T1=c(2,3,4,1,3,4),
T2=c(3,5,3,5,5,6),stringsAsFactors = F)
#Melt
mdf <- melt(df,id.vars = c('Batch','Type','Description'))
#Plot for description
ggplot(mdf,aes(x=Description,y=value,fill=variable))+
geom_bar(stat='identity')
Using Description on the x-axis gives a stacked bar chart.
You can also facet by a variable with facet_wrap() to get a separate panel for each level:
#Wrap by description
ggplot(mdf,aes(x=Batch,y=value,fill=variable))+
geom_bar(stat='identity')+
facet_wrap(.~Description)
With the melted data mdf you can build whatever other plots you need.
Update: With the data provided, here is a possible solution to your issue:
library(tidyverse)
#Data
dff <- structure(list(Batch = c("A", "B", "C", "D", "E", "F", "G", "H",
"I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T"),
T0 = c(5, 6, 4, 2, 6, 3, 4, 6, 4, 1, 6, 5, 4, 5, 6, 5, 6,
5, 5, 6), T1 = c(6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
6, 6, 6, 5, 6, 6), T2 = c(6, 6, 6, 6, 6, 6, 6, 6, 6, 6, 6,
6, 6, 6, 5, 6, 6, 6, 6, 6), T3 = c(20, 19, 19, 19, 19, 18,
20, 20, 20, 20, 20, 20, 20, 19, 18, 19, 20, 20, 20, 19),
T4 = c(21, 21, 21, 21, 20, 20, 21, 21, 21, 21, 22, 21, 22,
21, 21, 21, 22, 21, 22, 20), T5 = c(22, 22, 22, 22, 22, 21,
21, 22, 21, 22, 23, 22, 23, 22, 22, 23, 23, 23, 23, 22),
T6 = c(23, 23, 24, 23, 23, 23, 23, 23, 23, 24, 24, 23, 23,
24, 23, 24, 24, 24, 24, 23), T7 = c(25, 25, 25, 24, 24, 24,
24, 25, 25, 25, 24, 25, 24, 25, 25, 26, 25, 25, 25, 25),
T8 = c(26, 26, 25, 26, 25, 26, 26, 26, 26, 26, 25, 26, 26,
26, 26, 26, 25, 26, 25, 26), T9 = c(20, 23, 19, 21, 22, 27,
24, 26, 24, 25, 21, 23, 21, 22, 28, 22, 20, 24, 19, 27),
T10 = c(16, 18, 14, 15, 15, 23, 19, 20, 19, 20, 15, 16, 15,
17, 23, 16, 15, 18, 15, 23), T11 = c(15, 16, 15, 15, 16,
17, 15, 14, 15, 15, 15, 14, 15, 15, 17, 15, 15, 15, 15, 17
), T12 = c(15, 16, 15, 15, 16, 14, 17, 15, 15, 15, 15, 15,
15, 16, 15, 15, 15, 16, 15, 15), T13 = c(15, 16, 15, 15,
16, 15, 15, 15, 15, 15, 15, 15, 15, 16, 15, 15, 15, 16, 14,
15), T14 = c(16, 16, 15, 16, 16, 15, 16, 15, 16, 15, 15,
15, 15, 16, 16, 15, 16, 16, 15, 16), T15 = c(16, 16, 16,
16, 17, 15, 16, 15, 16, 15, 16, 15, 16, 16, 16, 16, 16, 16,
15, 16), T16 = c(16, 17, 16, 16, 17, 15, 17, 15, 16, 16,
16, 16, 16, 16, 16, 16, 16, 16, 15, 16), T17 = c(17, 19,
17, 18, 20, 15, 18, 15, 16, 16, 18, 16, 18, 19, 19, 17, 19,
17, 17, 17), T18 = c(24, 26, 27, 26, 28, 22, 25, 20, 25,
20, 26, 25, 27, 26, 25, 25, 28, 25, 27, 24), T19 = c(36,
37, 36, 38, 36, 38, 37, 31, 36, 26, 36, 37, 36, 36, 37, 36,
37, 35, 35, 35), T20 = c(38, 39, 37, 38, 38, 43, 39, 41,
39, 40, 38, 39, 38, 39, 43, 38, 37, 39, 37, 42)), row.names = c(NA,
-20L), class = "data.frame")
Next the code:
#Code
Melted <- pivot_longer(dff,cols = -Batch)
Melted$name <- factor(Melted$name,levels = unique(Melted$name))
#Plot
ggplot(Melted,aes(x=Batch,y=value,color=name,group=name))+geom_line()
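To get exactly the three columns asked for (Batch, Time, Value) with a numeric time axis, plus a time series object to work with, something like the following may help. It is a base R sketch on a small made-up data frame (wide is a hypothetical stand-in for df), and it assumes the time columns are named T0, T1, ... and are equally spaced.

```r
# Hypothetical stand-in for the real data: one row per batch
wide <- data.frame(
  Batch = c("A", "B", "C"),
  T0 = c(5, 6, 4),
  T1 = c(6, 6, 6),
  T2 = c(20, 19, 19)
)
time_cols <- setdiff(names(wide), "Batch")

# Long format: the columns are stacked one after another, so Batch
# repeats as a block and each time index repeats once per batch
long <- data.frame(
  Batch = rep(wide$Batch, times = length(time_cols)),
  Time  = rep(as.integer(sub("^T", "", time_cols)), each = nrow(wide)),
  Value = unlist(wide[time_cols], use.names = FALSE)
)

# Multivariate ts: transpose so rows are time points, one column per batch
batch_ts <- ts(t(as.matrix(wide[time_cols])), start = 0)
colnames(batch_ts) <- wide$Batch

plot(batch_ts, plot.type = "single",
     col = seq_len(ncol(batch_ts)), ylab = "Value")
```

The long data frame can be passed to ggplot with aes(x = Time, y = Value, group = Batch) to draw one line per batch over time.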

excluding new layer from scale_size

I have plotted a scatter plot with the point size scaled by frequency:
g <- ggplot(d, aes(x = Treatment, y = Seam.Cell.Number, size = Frequency)) +
  geom_point(aes(colour = Strain)) +
  scale_size_continuous(range = c(3, 10), breaks = c(0:32, 34:50)) +
  guides(size = FALSE)
Now I am trying to plot means with standard error bars on top. I have the mean and standard error already calculated in columns in my csv file. So far I have attempted:
g+geom_point(aes(x=Treatment,y=Mean))+geom_errorbar(aes(ymin=Mean-Standard.Error, ymax=Mean+Standard.Error, width=.4))+theme(axis.text.x = element_blank())+theme(legend.key = element_rect(colour = "black"))
And:
g+layer(data=d, mapping=aes(x=Treatment,y=Mean), geom="point")+geom_errorbar(aes(ymin=Mean-Standard.Error, ymax=Mean+Standard.Error), width=.4)+ylab("Seam Cell Number")
But they both give me very fat error bars/data points. It seems they are being affected by my size scaling in object g. I have tried to modify the size and width of the error bars, and I have tried to modify the size of the data points, both in these last bits of code, but to no avail. Is there a way to 'cancel' the size command for this layer?
If you reverse the order of your ggplot, you may be able to avoid the size distortion on the error bars.
Not having reproducible data, I made some up.
df <- data.frame(Treatment = (1:100), Seam.Cell.Number = 3:102, Frequency = 5:104,
Strain = rep(c("A", "B", "C", "D"), 25))
std <- function(x) sd(x)/sqrt(length(x))
Mean <- mean(df$Treatment)
df$Standard.Error <- std(df$Treatment)
g <- ggplot(df, aes(x = Treatment, y = Seam.Cell.Number)) +
geom_point(aes(x=Treatment, y=Mean)) +
geom_errorbar(aes(ymin=Mean-df$Standard.Error, ymax=Mean+df$Standard.Error, width=.4))+
theme(axis.text.x = element_blank())+
theme(legend.key = element_rect(colour = "black"))
g + geom_point(aes(colour=Strain)) +
scale_size_continuous(range = c(3, 10), breaks=c(0,1, 2, 3, 4, 5,6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
29, 30, 31, 32, 34, 35, 36, 37, 38, 39, 40, 41, 42,
43, 44, 45, 46, 47, 48, 49, 50)) +
guides(size=FALSE)
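An alternative that does not rely on layer order, sketched with made-up data: map size only inside the layer that should be scaled, and leave it out of the global aes(). The mean and error-bar layers then keep their fixed default sizes. The data frames d_demo and means_demo and their values are illustrative assumptions, not the data from the question.

```r
library(ggplot2)

# Hypothetical raw observations with a frequency per point
d_demo <- data.frame(
  Treatment = rep(c("control", "treated"), each = 3),
  Seam.Cell.Number = c(15, 16, 17, 16, 17, 18),
  Frequency = c(2, 10, 3, 4, 8, 1),
  Strain = rep(c("A", "B", "A"), times = 2)
)

# Hypothetical per-treatment summary
means_demo <- data.frame(
  Treatment = c("control", "treated"),
  Mean = c(16, 17),
  Standard.Error = c(0.3, 0.4)
)

p <- ggplot() +
  # size is mapped here only, so only these points are scaled
  geom_point(data = d_demo,
             aes(x = Treatment, y = Seam.Cell.Number,
                 size = Frequency, colour = Strain)) +
  geom_point(data = means_demo, aes(x = Treatment, y = Mean)) +
  geom_errorbar(data = means_demo,
                aes(x = Treatment,
                    ymin = Mean - Standard.Error,
                    ymax = Mean + Standard.Error),
                width = 0.4) +
  scale_size_continuous(range = c(3, 10), guide = "none")
p
```

Keeping the summary in its own data frame also avoids drawing the same mean point once per raw observation.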
