Logarithmic scaling with ggplot2 in R - r

I am trying to create a diagram using ggplot2. There are several very small values to be displayed and a few larger ones. I'd like to display all of them in an appropriate way using logarithmic scaling. This is what I do:
plotPointsPre <- ggplot(data = solverEntries, aes(x = val, y = instance,
color = solver, group = solver))
...
finalPlot <- plotPointsPre + coord_trans(x = 'log10') + geom_point() +
xlab("costs") + ylab("instance")
This is the result:
It looks just the same as without coord_trans(x = 'log10').
However, if I use it on the y-axis instead:
How do I achieve logarithmic scaling on the x-axis? Note that the problem is not tied to the x-axis itself: if I swap the values of x and y, the scaling works on the x-axis and no longer on the y-axis. So the problem seems to lie in the values being displayed. Does anybody have an idea how to fix this?
Edit - Here's the used data contained in solverEntries:
solverEntries <- data.frame(instance = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6, 7, 7, 7, 7, 8, 8, 8, 8, 9, 9, 9, 9, 10, 10, 10, 10, 11, 11, 11, 11, 12, 12, 12, 12, 13, 13, 13, 13, 14, 14, 14, 14, 15, 15, 15, 15, 16, 16, 16, 16, 17, 17, 17, 17, 18, 18, 18, 18, 19, 19, 19, 19, 20, 20, 20, 20),
solver = c(4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1, 4, 3, 2, 1),
time = c(1, 24, 13, 6, 1, 41, 15, 5, 1, 26, 16, 5, 1, 39, 7, 4, 1, 28, 11, 3, 1, 31, 12, 3, 1, 38, 20, 3, 1, 37, 10, 4, 1, 25, 11, 3, 1, 32, 18, 4, 1, 27, 21, 3, 1, 23, 22, 3, 1, 30, 17, 2, 1, 36, 8, 3, 1, 37, 19, 4, 1, 40, 21, 3, 1, 29, 11, 4, 1, 33, 10, 3, 1, 34, 9, 3, 1, 35, 14, 3),
val = c(6553.48, 6565.6, 6565.6, 6577.72, 6568.04, 7117.14, 6578.98, 6609.28, 6559.54, 6561.98, 6561.98, 6592.28, 6547.42, 7537.64, 6549.86, 6555.92, 6546.24, 6557.18, 6557.18, 6589.92, 6586.22, 6588.66, 6588.66, 6631.08, 6547.42, 7172.86, 6569.3, 6582.6, 6547.42, 6583.78, 6547.42, 6575.28, 6555.92, 6565.68, 6565.68, 6575.36, 6551.04, 6551.04, 6551.04, 6563.16, 6549.86, 6549.86, 6549.86, 6555.92, 6544.98, 6549.86, 6549.86, 6561.98, 6558.36, 6563.24, 6563.24, 6578.98, 6566.86, 7080.78, 6570.48, 6572.92, 6565.6, 7073.46, 6580.16, 6612.9, 6557.18, 7351.04, 6562.06, 6593.54, 6547.42, 6552.3, 6552.3, 6558.36, 6553.48, 6576.54, 6576.54, 6612.9, 6555.92, 6560.8, 6560.8, 6570.48, 6566.86, 6617.78, 6572.92, 6578.98))

Your data in its current form is not log distributed: most val are around 6500 and a few are roughly 10 to 15% higher. If you want to stretch the data, you could write a custom transformation with scales::trans_new(), or use this simpler version, which just subtracts a baseline value so that a log transform becomes useful. After subtracting 6500, the small values map to around 50 and the large values to around 1000, which is a more suitable range for a log scale. We then apply the same shift to the breaks so that the labels appear in the right spots (i.e. the label 6550 sits at the position 6550 - 6500 = 50).
This method helps if you want to make the underlying values more distinguishable, but at the cost of distorting the proportions between values. You may be able to mitigate this by picking useful breaks and labeling them with relative statistics, e.g.
7000
+7% over min
my_breaks <- c(6550, 6600, 6750, 7000, 7500)
baseline = 6500
library(ggplot2)
ggplot(data = solverEntries,
aes(x = val - baseline, y = instance,
color = solver, group = solver)) +
geom_point() +
scale_x_log10(breaks = my_breaks - baseline,
labels = my_breaks, name = "val")
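To see why a plain log scale changes so little here, it can help to compare how many decades the data span before and after subtracting the baseline. This is a rough sketch in base R, using only the smallest and largest val from the question (6544.98 and 7537.64) and the baseline of 6500 chosen above; the variable names are illustrative.

```r
# Smallest and largest `val` in the question's data
val_range <- c(6544.98, 7537.64)
baseline  <- 6500  # same baseline as in the answer above

# Width of the data on a log10 axis, measured in decades
raw_span     <- diff(log10(val_range))
shifted_span <- diff(log10(val_range - baseline))

raw_span      # well under a tenth of a decade: the axis looks almost linear
shifted_span  # more than one decade: the log scale now spreads the points
```

Over such a narrow span the log-transformed axis is nearly indistinguishable from a linear one, which matches the observation that coord_trans(x = 'log10') seemed to have no effect.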

Is this what you're looking for?
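library(ggplot2)
set.seed(1) # make the random noise reproducible
x_data <- seq(from=1,to=50)
y_data <- 2*x_data+rnorm(n=50,mean=0,sd=5)
#non log y (note: any non-positive y cannot be shown on a log scale)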
ggplot()+
aes(x=x_data,y=y_data)+
geom_point()
#log y scale
ggplot()+
aes(x=x_data,y=y_data)+
geom_point()+
scale_y_log10()
#log x scale
ggplot()+
aes(x=x_data,y=y_data)+
geom_point()+
scale_x_log10()

Related

Density plot of a vector shows tails before and after its minimum and maximum

I have the following vector:
v<-c(1, 1, 8, 3, 1, 9, 4, 21, 13, 13, 1, 1, 3, 10, 1, 13, 22, 1,
1, 4, 2, 1, 13, 1, 5, 1, 2, 1, 1, 2, 12, 10, 26, 15, 2, 9, 6,
5, 1, 3, 18, 2, 10, 2, 8, 9, 4, 1, 11, 4, 2, 12, 3, 14, 2, 1,
27, 3, 6, 2, 1, 1, 3, 16, 3, 36, 13, 9, 11, 10, 24, 2, 27, 4,
4, 2, 9, 1, 3, 13, 3, 1, 8, 5, 5, 15, 1, 1, 3, 1, 4, 14, 8, 1,
1, 2, 20, 1, 9, 3, 1, 2, 5, 14, 5, 11, 1, 3, 2, 9, 10, 21, 9,
1, 20, 5, 11, 23, 2, 1, 1, 2, 1, 7, 2, 9, 1, 19, 9, 9, 2, 15,
17, 8, 11, 17, 2, 14, 2, 8, 13, 1, 2, 9, 15, 25, 3, 8, 32, 4,
11, 1, 1, 2)
I would like to estimate its density in R with the density command. With a few lines of code:
d<-density(v)
df<-data.frame(x=d$x,y=d$y,stringsAsFactors = FALSE)
plot(df)
I obtained the following picture:
But the resulting plot doesn't look right: min(v) is 1 and max(v) is 36, yet the graph shows tails extending below 0 and beyond 40.
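A likely explanation, offered as a sketch rather than a definitive answer: density() evaluates the kernel estimate on a grid that by default extends cut = 3 bandwidths beyond the extremes of the data, so the curve has tails outside the observed range. Passing from and to restricts the grid. The short vector below is just the first 18 values of v, used to keep the example self-contained.

```r
# First 18 values of `v` from the question, for illustration only
v_short <- c(1, 1, 8, 3, 1, 9, 4, 21, 13, 13, 1, 1, 3, 10, 1, 13, 22, 1)

# Default: the evaluation grid extends 3 bandwidths past the data
d_default <- density(v_short)

# Grid restricted to the observed range
d_clipped <- density(v_short, from = min(v_short), to = max(v_short))

range(d_default$x)  # wider than range(v_short)
range(d_clipped$x)  # equal to range(v_short)
```

Note that clipping only changes where the curve is evaluated; it does not move the probability mass that the kernel places outside the observed range.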

How to convert igraph file in row/colums?

I would like to convert the information I have into a plain list of edges and nodes, but I don't know how to do it. The raw data, as printed by dput, looks like this. If someone knows how to convert this object into something easier to use, I would appreciate it. I can visualise the graph with plot, but to edit it I need more precise information.
library(igraph)
dput (net2$graph_pajek)
structure(list(30, FALSE, c(1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2,
3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 6, 7, 13, 13,
14, 15, 16, 18, 20, 20, 21, 27, 27, 27, 27, 29, 2, 2, 2, 2, 2,
2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5,
6, 6, 7, 8, 8, 9, 9, 9, 10, 10, 10, 10, 10, 11, 11, 12, 12, 12,
13, 13, 13, 14, 14, 14, 15, 15, 15, 16, 18, 18, 18, 19, 20, 20,
21, 21, 23, 24, 25, 26, 27, 27, 27, 29, 3, 3, 3, 3, 3, 3, 4,
4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5,
5, 5, 5, 5, 6, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 10,
10, 10, 10, 11, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 12,
12, 12, 12, 12, 12, 12, 13, 13, 14, 14, 15, 15, 15, 15, 15), list(c(1, 0, 1), structure(list(), .Names = character(0)),
list(name = c("A", "B", "C",
"D", "E", "F", "G", "H",
"I", "J", "K",
"L", "M", "N",
"O", "P", "Q", "R",
"S", "T", "U",
"V", "W", "X", "Y", "Z",
"AB", "AC", "AD", "AE"
), deg = c(248, 532, 855, 574, 1761, 261, 229, 216, 554,
628, 774, 223, 502, 295, 266, 910, 227, 312, 364, 260, 294,
741, 227, 471, 392, 376, 292, 295, 212, 287), size = c(2.,
6, 9, 6, 20,
2, 2, 2, 6,
7, 8, 2, 7,
3, 3, 10, 2,
3, 4, 2, 3.,
8, 2, 5, 4,
4, 3, 3, 2,
3), label.cex = c(0.7, 0.7, 0.7, 0.7, 0.7, 0.7,
0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7,
0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7
), id = c("A", "B", "C",
"D", "E", "F", "G", "H",
"I", "J", "K",
"L", "M", "N",
"O", "P", "Q", "R",
"S", "T", "U",
"V", "W", "X", "Y", "Z",
"AB", "AC", "AD", "AE"
)), list(num = c(4, 4, 4, 4, 7, 7, 7, 7, 7, 7, 7, 3, 3, 3,
10, 10, 10, 10, 10, 10, 10, 10, 10, 10, 3, 3, 3, 1, 1, 2,
2, 1, 1, 1, 1, 2, 2, 1, 4, 4, 4, 4, 1, 7, 7, 7, 7, 7, 7,
7, 6, 6, 6, 6, 6, 6, 12, 12, 12, 12, 12, 12, 12, 12, 12,
12, 12, 12, 1, 2, 2, 1, 2, 2, 3, 3, 3, 5, 5, 5, 5, 5, 2,
2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 3, 3, 3, 1, 2,
2, 2, 2, 1, 1, 1, 1, 3, 3, 3, 1, 6, 6, 6, 6, 6, 6, 40, 40,
40, 40, 40, 40, 40, 40, 40), weight = c(4, 4, 4, 4,
7, 7, 7, 7, 7, 7, 7, 3, 3, 3, 10, 10, 10, 10, 10, 10, 10,
10, 10, 10, 3, 3, 3, 1, 1, 2, 2, 1, 1, 1, 1, 2, 2, 1, 4,
4, 4, 4, 1, 7, 7, 7, 7, 7, 7, 7, 6, 6, 6, 6, 6, 6, 12, 12,
12, 12, 12, 12, 12, 12, 12, 12, 12, 12, 1, 2, 2, 1, 2, 2,
3, 3, 3, 5, 5, 5, 5, 5, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 1, 3, 3, 3, 1, 2, 2, 2, 2, 1, 1, 1, 1, 3, 3, 3,
1, 6, 6, 6, 6, 6, 6, 40, 40, 40, 40, 40, 40, 40, 40, 40,
40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40,
40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40, 40,
40, 7, 7, 7, 7, 7, 7, 7, 1, 3, 3, 3, 7, 7, 7, 7, 7, 7, 7,
4, 4, 4, 4, 4, 4, 4, 4, 1, 18, 18))), <environment>), class = "igraph")
Are you looking for something like get.data.frame?
> get.data.frame(net)
from to weight
1 A B 0.63502922
2 B C 0.79410173
3 C D 0.90802625
4 D E 0.09408188
5 E F 0.16450634
6 F G 0.75931882
7 G H 0.30409658
8 H I 0.23990324
9 I J 0.84762277
10 A J 0.88657718
data
Since I cannot reproduce the example in your post, I created a dummy example net like below
net <- make_ring(10) %>%
set_vertex_attr(name = "name", value = LETTERS[1:vcount(.)]) %>%
set_edge_attr(name = "weight", value = runif(ecount(.)))
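If both the edge list and the vertex attributes are needed, a sketch along the following lines may help. It assumes a reasonably recent igraph, where as_data_frame() is the current name for get.data.frame(). The small ring graph is the same kind of dummy data as above, not the graph from the question.

```r
library(igraph)

# Dummy graph: a ring of 10 named vertices with random edge weights
net <- make_ring(10)
V(net)$name   <- LETTERS[1:vcount(net)]
E(net)$weight <- runif(ecount(net))

# what = "both" returns a list with one data frame per component
tables <- igraph::as_data_frame(net, what = "both")

head(tables$vertices)  # one row per node, with its attributes
head(tables$edges)     # one row per edge: from, to, weight
```

The igraph:: prefix is used because other packages also export a function called as_data_frame().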
To clarify a couple things:
The igraph file is not a plot per se, but a graph structure (as in, nodes and edges).
igraph has functions for plotting graphs, but there is no single and standard way of plotting a graph - instead, different algorithms can be used to determine visually-ideal ways of displaying them, and these algorithms oftentimes rely on random initializations.
The output of igraph's plotting functions is only meaningful in terms of base R's drawing logic; as far as I know, they don't use an intermediate format that exposes the coordinates in a user-comprehensible structure. You can nevertheless control many aspects of how graphs are drawn; see ?igraph::igraph.plotting.

How can I change the grid line spacing on a ggplot2 dotplot?

I'm analyzing data from the result of pulling 10 numbered balls from a jar with replacement, repeated 70 times. Here's my code (data included):
numbers <- c(8, 3, 9, 5, 1, 9, 10, 8, 8, 1, 9, 9, 8, 5, 1, 10, 5, 9, 6, 4, 10, 3,
10, 9, 8, 4, 8, 8, 9, 9, 1, 5, 9, 8, 4, 1, 8, 6, 7, 8, 2, 9, 5, 6,
10, 9, 1, 1, 5, 6, 2, 8, 6, 5, 2, 5, 4, 10, 10, 2, 2, 4, 9, 6, 9,
9, 6, 10, 9, 10)
num_frame <- data.frame(numbers)
ggplot(num_frame) +
geom_dotplot(aes(numbers), binwidth = 1, dotsize = 0.4) +
theme_bw() +
xlab("Numbers") +
ylab("Frequency")
The resulting plot is nice, except it labels gridlines at 0, 2.5, 5, 7.5, and 10, which is obviously not what I want. The scale is fine, but I would like the gridlines to be at integer values 1 through 10 (0 is fine too if necessary). How can I do this? I'd also like the y-axis to adjust likewise so that the grid is still square. Thanks!
Just add:
scale_x_continuous(breaks=1:10, minor_breaks=NULL)
minor_breaks=NULL suppresses the minor gridlines that would otherwise be drawn between the breaks.
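Put together with the code from the question, the plot might look like the sketch below. Only the first 22 values of numbers are used, to keep the example short. Hiding the y-axis breaks is an optional extra that is not part of the answer above; it is included because the ggplot2 documentation notes that the y-axis numbers of a dot plot binned along x are not meaningful.

```r
library(ggplot2)

# First 22 draws from the question's data, for illustration only
numbers <- c(8, 3, 9, 5, 1, 9, 10, 8, 8, 1, 9, 9, 8, 5, 1, 10, 5, 9, 6, 4, 10, 3)
num_frame <- data.frame(numbers)

p <- ggplot(num_frame) +
  geom_dotplot(aes(numbers), binwidth = 1, dotsize = 0.4) +
  # Major gridlines at every integer from 1 to 10, no minor gridlines
  scale_x_continuous(breaks = 1:10, minor_breaks = NULL) +
  # Optional: drop the y-axis title, ticks and gridlines
  scale_y_continuous(NULL, breaks = NULL) +
  theme_bw() +
  xlab("Numbers")

p
```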

How to fix invalid vertex id error in tidygraph?

Data
network_data <- list(nodes = structure(list(id = c(0, 1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 13, 14), label = c("2892056", "2894543", "2894544",
"2894545", "2894546", "2894547", "2894548", "2894549", "2894550",
"2894551", "2894552", "2894553", "2894554", "2894555", "2894556"
)), row.names = c(NA, -15L), class = "data.frame"), links = structure(list(
from = c(3, 5, 7, 13, 13, 7, 3, 5, 0, 0, 5, 2, 7, 6, 13,
11, 0, 3, 2, 7, 13, 3, 0, 0, 5, 3, 13, 4, 0, 14, 13, 7, 2,
3, 5, 0, 12), to = c(0, 0, 0, 0, 2, 2, 2, 2, 2, 3, 3, 3,
3, 3, 3, 4, 5, 5, 5, 5, 5, 6, 6, 7, 7, 7, 7, 11, 12, 12,
12, 13, 13, 13, 13, 13, 14), weight = c(1, 2, 2, 1, 2, 1,
1, 1, 2, 1, 1, 1, 1, 2, 1, 1, 2, 1, 2, 1, 2, 2, 2, 1, 1,
2, 1, 2, 1, 1, 2, 2, 2, 1, 2, 1, 1)), row.names = c(NA, -37L
), class = "data.frame"))
I have this list of nodes and links for building a network. Rather than plotting the network, I want to get the network characteristics such as isolates, reciprocity, etc.
Here's the rest of the code that I'm using to obtain these characteristics:
network_data$nodes <- network_data$nodes %>% select(id, label)
network_data$links <- network_data$links %>% rename(from = source, to = target)
print(network_data$nodes)
print(network_data$links)
SNA <- tidygraph::tbl_graph(
nodes = network_data$nodes,
edges = network_data$links,
directed = T
)
The last line is where it errors out.
Error in (function (edges, n = max(edges), directed = TRUE) :
At structure_generators.c:86 : Invalid (negative) vertex id, Invalid vertex id
I googled the issue and it seems to be fairly common, but none of the suggested fixes worked for me. What is different about my data that it still generates the error, and how can I resolve it?
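One probable cause, stated as an assumption rather than a confirmed diagnosis: when from and to are numeric, they appear to be treated as 1-based row positions in the nodes table, so the id 0 in this data is out of range. Below is a minimal base R sketch that remaps the ids to row positions before building the graph, using a trimmed, illustrative subset of the data.

```r
# Trimmed subset of the question's data, for illustration only
nodes <- data.frame(id    = c(0, 1, 2, 3),
                    label = c("2892056", "2894543", "2894544", "2894545"))
links <- data.frame(from   = c(3, 2, 0),
                    to     = c(0, 3, 2),
                    weight = c(1, 1, 2))

# Replace each 0-based id with the (1-based) row position of that node
links$from <- match(links$from, nodes$id)
links$to   <- match(links$to,   nodes$id)

# Every endpoint should now point at an existing row of `nodes`
stopifnot(!anyNA(links$from), !anyNA(links$to))
links

# The remapped tables can then be passed to tidygraph as in the question:
# SNA <- tidygraph::tbl_graph(nodes = nodes, edges = links, directed = TRUE)
```

Using match() rather than simply adding 1 also covers the case where the ids are not consecutive.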

Quade test in R

I would like to perform a Quade test with more than one covariate in R. I know the command quade.test and I have seen the example below:
## Conover (1999, p. 375f):
## Numbers of five brands of a new hand lotion sold in seven stores
## during one week.
y <- matrix(c( 5, 4, 7, 10, 12,
1, 3, 1, 0, 2,
16, 12, 22, 22, 35,
5, 4, 3, 5, 4,
10, 9, 7, 13, 10,
19, 18, 28, 37, 58,
10, 7, 6, 8, 7),
nrow = 7, byrow = TRUE,
dimnames =
list(Store = as.character(1:7),
Brand = LETTERS[1:5]))
y
quade.test(y)
My question is as follows: how could I introduce more than one covariate? In this example the covariate is the Store variable.
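For reference, quade.test() also has a formula interface of the form y ~ groups | blocks, which makes the roles of the variables explicit. The sketch below only restates the example above in long format (the column name Freq is the one produced by as.data.frame(as.table(y))). It does not by itself add a second covariate, since the formula accepts a single blocking variable.

```r
# Same data as in the example above
y <- matrix(c( 5,  4,  7, 10, 12,
               1,  3,  1,  0,  2,
              16, 12, 22, 22, 35,
               5,  4,  3,  5,  4,
              10,  9,  7, 13, 10,
              19, 18, 28, 37, 58,
              10,  7,  6,  8,  7),
            nrow = 7, byrow = TRUE,
            dimnames = list(Store = as.character(1:7),
                            Brand = LETTERS[1:5]))

# Long format: one row per Store/Brand combination, counts in `Freq`
long <- as.data.frame(as.table(y))

res_matrix  <- quade.test(y)                                  # matrix interface
res_formula <- quade.test(Freq ~ Brand | Store, data = long)  # groups | blocks

res_formula
```

Here Brand is the grouping variable being compared and Store is the block, so both calls describe the same test.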
