Visualize igraph degree distribution with ggplot2 - r

I want to visualise the degree distribution of an igraph object with ggplot2. Because ggplot2 doesn't take the simple numeric vector generated by degree(), I convert it to a frequency table and pass that to ggplot(). I still get: geom_path: Each group consist of only one observation. Do you need to adjust the group aesthetic? I can't keep the degree column of the table as a factor, because I also need to plot it on a log scale.
library(igraph)
library(ggplot2)
g <- ba.game(20)
degree <- degree(g, V(g), mode="in")
degree
# [1] 6 2 7 1 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0
degree <- data.frame(table(degree))
degree
# degree Freq
# 1 0 13
# 2 1 4
# 3 2 1
# 4 6 1
# 5 7 1
ggplot(degree, aes(x=degree, y=Freq)) +
geom_line()
# geom_path: Each group consist of only one observation. Do you need to adjust the group aesthetic?

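The problem is that you have turned degree$degree into a factor by using table. With a factor on the x axis, geom_line() treats every level as its own group, and each group then holds a single observation. Two things fix this:
First, build the factor with all possible values (from 0 up to the largest degree), so that degrees no node has still appear with a frequency of zero.
Second, convert the labels back to numbers before plotting.
Implementing those (I used degree.df instead of overwriting degree to keep the different steps distinct, so degree here is still the numeric vector returned by degree()):
# levels start at 0; seq_len(max(degree)) would start at 1 and silently drop the degree-0 nodes
degree.df <- data.frame(table(degree=factor(degree, levels=0:max(degree))))
degree.df$degree <- as.numeric(as.character(degree.df$degree))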
Then the plotting code is what you had:
ggplot(degree.df, aes(x=degree, y=Freq)) +
geom_line()
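Since the question also asks about a log scale, here is a minimal sketch of the whole pipeline under a few assumptions. It reuses the in-degree vector printed in the question instead of generating a new random graph, it uses 0:max(deg) as the factor levels so that nodes with degree 0 are counted, and it drops rows with a zero degree or a zero frequency before switching to log axes, because those values cannot be drawn on a log scale. The names deg, deg_df and deg_log are illustrative only.

```r
library(ggplot2)

# In-degree vector copied from the question's printed output.
deg <- c(6, 2, 7, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)

# Tabulate over every degree from 0 to the maximum, so unobserved
# degrees get Freq = 0 instead of being left out.
deg_df <- data.frame(table(degree = factor(deg, levels = 0:max(deg))))

# table() returns the degree column as a factor; turn the labels back into numbers.
deg_df$degree <- as.numeric(as.character(deg_df$degree))

# log10(0) is -Inf, so keep only rows that can be shown on log axes.
deg_log <- subset(deg_df, degree > 0 & Freq > 0)

ggplot(deg_log, aes(x = degree, y = Freq)) +
  geom_line() +
  geom_point() +
  scale_x_log10() +
  scale_y_log10()
```

On this small sample, deg_log keeps only degrees 1, 2, 6 and 7, so the log-log line has four points.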

Related

Grouping Set of Points to a Pre Defined Point

I'm looking to create a model that classifies a set of points that are near a pre-defined point.
For example, let's say I have points:
X    Y
1    1
1    2
1    3
2    1
2    3
3    1
3    2
3    3
6    6
8    7
8    5
9    3
10   7
My goal is to identify which points are closest to the predefined point (2,2) and, ideally, to output which points those are.
I tried using KNN, but I could not figure out how to get a KNN model to return the points near (2,2). Any guidance on how I might accomplish this would be awesome. :)
Plot of Points
df <- data.frame( x = c(1,1,1,2,2,2,3,3,3,6,8,8,9,10), y = c(1,2,3,1,2,3,1,2,3,6,7,5,3,7))
df
goal_point <- c(x=2,y=2)
goal_point
You might approach this by calculating distance from goal as a feature.
df$dist = sqrt((df$x - goal_point["x"])^2 +
               (df$y - goal_point["y"])^2)
set.seed(1)  # kmeans starts from random centers; fix the seed for reproducible labels
df$clust = kmeans(df, 2)$cluster
library(ggplot2)
# factor() makes the cluster a discrete colour instead of a continuous gradient
ggplot(df, aes(x, y, color = factor(clust))) +
  geom_point()
In this case kmeans is using x, y, and distance from goal. You could also cluster on distance from goal alone by using df$clust = kmeans(df$dist, 2)$cluster, which would lead here to the same clustering.
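If the aim is only to list the points nearest to the goal, clustering may not be needed at all. The sketch below is a simpler alternative to the kmeans approach, not a replacement for it: it computes the Euclidean distance to the goal and then either keeps everything within a radius or takes the k closest rows. The radius of 2 and k of 5 are arbitrary values picked for illustration, and the names near and nearest5 are made up.

```r
# Data and goal point copied from the question.
df <- data.frame(x = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 6, 8, 8, 9, 10),
                 y = c(1, 2, 3, 1, 2, 3, 1, 2, 3, 6, 7, 5, 3, 7))
goal_point <- c(x = 2, y = 2)

# Euclidean distance from every point to the goal.
df$dist <- sqrt((df$x - goal_point[["x"]])^2 +
                (df$y - goal_point[["y"]])^2)

# Option 1: every point within a chosen radius (assumed value).
radius <- 2
near <- df[df$dist <= radius, ]

# Option 2: the k closest points, sorted by distance (assumed k = 5).
nearest5 <- head(df[order(df$dist), ], 5)

print(near)
print(nearest5)
```

With this data, the radius filter returns the nine points of the 3 x 3 grid around (2,2), and the five nearest rows are the goal point itself followed by its four direct neighbours.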

Incorrect order of stack, identity geom_bar

I used dplyr to filter a dataset, which resulted in the tibble below. I want to create a stacked bar chart of the types of features and their capability levels. I would like the bar chart to be ordered from largest frequency to smallest.
Using the code below, the plot that is output has the first two values reversed. Is this because "Position" only has two capability levels, whereas the rest have 3? Even then the highest frequency overall is 96 and belongs to a "Distance" level.
I would ideally like to do the least amount of "brute forcing" to make the code work as the actual data I am working with have over 10 types of features, some with only one capability level.
# A tibble: 11 x 3
# Groups: Type.of.Feature [?]
Type.of.Feature Capability.Category Freq
<fct> <chr> <int>
1 Diameter <1 75
2 Diameter >1.33 5
3 Diameter 1-1.33 13
4 Distance <1 96
5 Distance >1.33 5
6 Distance 1-1.33 6
7 Position <1 90
8 Position >1.33 4
9 Radius <1 7
10 Radius >1.33 1
11 Radius 1-1.33 2
ggplot(freq, aes(x=reorder(Type.of.Feature, -Freq), y=Freq, fill=Capability.Category)) +
geom_bar(stat="identity", position="stack")
Follow the procedure below to order the bars by the total frequency of each feature type.
# Import the data
file1 <- readxl::read_excel(file.choose())
# Load the required libraries
library(ggplot2)
library(dplyr)
# Split the data frame into a list based on the Type.of.Feature factor
factor_list <- split.data.frame(file1, f = file1$Type.of.Feature)
# Add a column holding the frequency sum for each level of the factor above
for (lnam in names(factor_list)) {
  factor_list[[lnam]]["group_sum"] <- sum(factor_list[[lnam]]["Freq"])
}
# Combine the list back into one data frame
# (bind_rows() is used here because rbind_list() is deprecated in its favour)
file1 <- bind_rows(factor_list)
# Use the newly created group frequency to order the bars
ggplot(file1, aes(x=reorder(Type.of.Feature, -group_sum), y=Freq, fill=Capability.Category)) +
  geom_bar(stat="identity", position="stack")
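The reason the first two bars were swapped is that reorder() summarises the second argument within each level using the mean by default, and "Position" has only two rows, so its mean Freq is the largest. A shorter alternative to the split-and-sum procedure, sketched here on the data from the question, is to pass FUN = sum to reorder() so that bars are ordered by their total height. The by_mean and by_sum variables exist only to show the two orderings.

```r
library(ggplot2)

# Data copied from the tibble printed in the question.
freq <- data.frame(
  Type.of.Feature = factor(c("Diameter", "Diameter", "Diameter",
                             "Distance", "Distance", "Distance",
                             "Position", "Position",
                             "Radius", "Radius", "Radius")),
  Capability.Category = c("<1", ">1.33", "1-1.33",
                          "<1", ">1.33", "1-1.33",
                          "<1", ">1.33",
                          "<1", ">1.33", "1-1.33"),
  Freq = c(75L, 5L, 13L, 96L, 5L, 6L, 90L, 4L, 7L, 1L, 2L)
)

# Default: levels ordered by the mean of -Freq within each feature type.
by_mean <- levels(reorder(freq$Type.of.Feature, -freq$Freq))

# With FUN = sum: levels ordered by the total of -Freq, i.e. largest bar first.
by_sum <- levels(reorder(freq$Type.of.Feature, -freq$Freq, FUN = sum))

print(by_mean)
print(by_sum)

ggplot(freq, aes(x = reorder(Type.of.Feature, -Freq, FUN = sum),
                 y = Freq, fill = Capability.Category)) +
  geom_bar(stat = "identity", position = "stack") +
  xlab("Type.of.Feature")
```

This needs no extra column, so it should also cope with feature types that have a single capability level.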

Color the individuals of a R PCoA plot by groups

Should be a simple question, but I haven't found exactly how to do it so far.
I have a matrix as follows:
sample var1 var2 var3 etc.
1 5 7 3 1
2 0 1 6 8
3 7 6 8 9
4 5 3 2 4
I performed a PCoA using Vegan and plotted the results. Now my problem is that I want to color the samples according to a pre-defined group:
group sample
1 1
1 2
2 3
2 4
How can I import the groups and then plot the points colored according to the group they belong to? It looks simple but I have been scratching my head over this.
Thanks!
Seb
You said you used vegan for the PCoA, which I assume means the wcmdscale function. By default vegan::wcmdscale returns only a scores matrix, just like the standard stats::cmdscale, but if you add certain arguments (such as eig = TRUE) you get a full wcmdscale result object with dedicated plot and points methods, and you can do:
plot(<pcoa-result>, type="n") # no reproducible example: edit as needed
points(<pcoa-result>, col = group) # no reproducible example: group must be visible
If you have a modern vegan (2.5.x) the following also works:
library(magrittr)
plot(<full-pcoa-result>, type = "n") %>% points("sites", col = group)
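Because the question has no reproducible example, the following is only a sketch of the general idea using base R. It swaps in stats::cmdscale for the vegan call, assumes the fourth column ("etc." in the question) is called var4, and assumes the group table has been read into a data frame named groups (for instance with read.table). The key step is match(), which lines the group labels up with the row order of the ordination before they are used as colours.

```r
# Toy data shaped like the matrix in the question (column var4 is assumed).
vars <- matrix(c(5, 7, 3, 1,
                 0, 1, 6, 8,
                 7, 6, 8, 9,
                 5, 3, 2, 4),
               nrow = 4, byrow = TRUE,
               dimnames = list(paste0("sample", 1:4), paste0("var", 1:4)))

# Group table from the question; in practice this could come from read.table().
groups <- data.frame(sample = paste0("sample", 1:4),
                     group  = c(1, 1, 2, 2))

# PCoA on Euclidean distances (stand-in for the vegan ordination).
pcoa <- cmdscale(dist(vars), k = 2)

# Align group labels with the row order of the ordination scores.
grp <- groups$group[match(rownames(pcoa), groups$sample)]

# One colour per group, indexed by the group number.
pal <- c("steelblue", "tomato")
plot(pcoa, col = pal[grp], pch = 19, xlab = "PCoA 1", ylab = "PCoA 2")
legend("topright", legend = paste("group", 1:2), col = pal, pch = 19)
```

The same grp vector could be passed as col = pal[grp] to the points() call shown above.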

Create heatmap in R using stat_density2d

I have several (x,y) coordinates, and each one is associated with a binary value (either 1 or 0). I want to create a heatmap showing, for each location, the probability that a point at that location has a 1 associated with it.
Sample data:
data = read.table(header=TRUE,
text="x y value
7 3 0
4 5 0
3 7 1
3 6 0
4 5 1
5 6 0")
And so on. I can create a plot showing where the points are concentrated using the following:
ggplot(data, aes(x=x,y=y)) + stat_density2d(aes(fill=..level..), geom="polygon")
But when I try to set fill = value, I get the following error:
Error in unit(tic_pos.c, "mm") : 'x' and 'units' must have length > 0
How do I do this?
Edit: I should add that I can easily accomplish this using stat_summary2d or even geom_tile, but it looks much more boxy and less smooth, which I want it to be.

How can i make a line plot in R?

I am trying to use R to make an Excel-style line plot, where the x axis is text (A, B, C, etc.) and the y axis (which can be both negative and positive) shows the Up and Down columns. I want to color Up red and Down green.
I would really appreciate it if anyone could help me with this. I have plotted this in Excel, but I have thousands of rows in my data and Excel does not show all the text labels in my plot.
My data looks like the following:
Name UP Downs
A 10 -3
B 2 -4
C 1 -1
D 4 -1
E 5 0
F 0 -1
G 6 -5
H 0 -1
I 7 -1
J 0 -1
K 0 -11
L 3 -1
M 0 -13
N 2 -1
O 0 -1
P 1 -1
Q 0 0
R 1 -1
S 0 0
T 12 -1
This is probably not the most elegant way to do it, but you can work it all out using plot, points, and axis (axis is the main one, as it lets you change the labels on the axis): see ?axis, ?plot, ?points.
First I make a data frame similar to yours so I can demonstrate...
# make a data frame similar to yours
mydf <- data.frame( Name=LETTERS,
                    Up=sample.int(15,size=26,replace=TRUE),
                    Down=-sample.int(15,size=26,replace=TRUE) )
Now plot.
# set up a plot: x axis goes from 1 to 26,
# y limit goes from -15 to 15 (picked manually, you can work yours out
# programmatically)
# Disable plotting of axes (axes=FALSE)
# Put in some x and y labels and a plot title (see ?plot...)
plot(0,xlim=c(1,26),ylim=c(-15,15),type='n',
axes=FALSE, # don't draw axis -- we'll put it in later.
xlab='Name',ylab='Change', # x and y labels
main='Ups and Downs') #,frame.plot=T -- try if you like. ?plot.default
# Plot the 'Up' column in green (see ?points)
points(Up~Name,mydf,col='green')
# Plot the 'Down' column in red
points(Down~Name,mydf,col='red')
# ***Draw the x axis, with labels being A-Z
# (type in 'LETTERS' to the prompt to see what they are)
# see also ?axis
axis(1,at=1:26,labels=LETTERS)
# Draw the y axis
axis(2)
Tweak it as you wish: ?points and ?par and ?axis are particularly helpful in this respect.
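As one possible tweak, here is a sketch that draws connected lines instead of points, uses the data from the question, and works out the y limits programmatically. It follows the colours asked for in the question (Up in red, Down in green). The data frame name updown and the shortened column names Up and Down are assumptions made for this example.

```r
# Data copied from the question (columns renamed to Up and Down).
updown <- data.frame(
  Name = LETTERS[1:20],
  Up   = c(10, 2, 1, 4, 5, 0, 6, 0, 7, 0, 0, 3, 0, 2, 0, 1, 0, 1, 0, 12),
  Down = c(-3, -4, -1, -1, 0, -1, -5, -1, -1, -1,
           -11, -1, -13, -1, -1, -1, 0, -1, 0, -1)
)

# Numeric x positions; the text labels are added with axis() afterwards.
x <- seq_len(nrow(updown))

# y limits covering both columns, computed from the data.
ylim <- range(updown$Up, updown$Down)

# Draw the Up line without an x axis (xaxt = "n"), then add the Down line.
plot(x, updown$Up, type = "l", col = "red", ylim = ylim,
     xaxt = "n", xlab = "Name", ylab = "Change", main = "Ups and Downs")
lines(x, updown$Down, col = "green")

# Label the x axis with the names and mark the zero line.
axis(1, at = x, labels = updown$Name)
abline(h = 0, lty = 3)
```

With thousands of rows the labels will still overlap, so labelling only a subset of positions in axis() is probably necessary.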

Resources