Create graph (network analysis R)? - r

I'm quite new to R and having trouble with the following:
I'm researching politicians in Belgium on Twitter, and would like to see if any networks form within political parties on Twitter.
I have two data files
The matrix file that contains whether or not politicians are linked
(politicixpolitici.csv)
The file that contains all the politicians with their respective
first name, name, political party, Twitter handle and parliament
(data.csv)
I want to create a graph that shows the network, but with the nodes colored by their political party (this variable is called 'Fractie' in the data.csv file).
I've tried doing this as follows:
First, I've tried to combine the files as follows:
rownames(politicicsv) <- politicicsv[,'TwitterHandle']
test <- cbind(politicixpolitici,
politicicsv[, "Fractie"][match(rownames(politicixpolitici),
rownames(politicicsv))])
=> I've plotted this network, but it comes out very sloppy and the names are on there which makes it very hard to see + the nodes are obviously not coloured according to the party.
Then, I've tried it using statnet, but when I wanted to create the graph, I had trouble with the creation of the vertex attribute:
fractie <- get.vertex.attribute(politicicsv, "Fractie")
Error in get.vertex.attribute(politicicsv, "Fractie") :
get.vertex.attribute requires an argument of class network.
Can someone help me in plotting this network, with the nodes colored according to the political party ("Fractie") they belong to?
Files can be found here
Thank you, this would help me with my thesis.

Can someone help me in plotting this network, with the nodes colored
according to the political party ("Fractie") they belong to?
You could do it like this
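df <- read.csv("data.csv", stringsAsFactors = TRUE) # Fractie must be a factor for nlevels()/levels() below (R >= 4.0 reads strings as character by default)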
m <- as.matrix(read.csv2("politicixpolitici.csv", row.names = 1))
library(igraph)
g <- simplify(graph_from_adjacency_matrix(m))
# Color palette:
(pal <- setNames(
colorRampPalette(categorical_pal(8))(nlevels(df$Fractie)),
levels(df$Fractie)) )
# CD&V Ecolo-Groen Groen N-VA Onafhankelijke
# "#E69F00" "#81ADA3" "#33ABB9" "#18A56E" "#C0D64B"
# Open vld Open Vld sp.a VB Vlaams Belang
# "#77AB7A" "#2A6D8E" "#BF5F11" "#CF6E64" "#BC82A2"
# Vuye&Wouters
# "#999999"
V(g)$color <- pal[df$Fractie[match(V(g)$name, df$TwitterHandle)]]
set.seed(1); coords <- layout_with_fr(g)
plot(g,
layout=coords, vertex.label.cex=.2, vertex.size=2,
edge.arrow.size=0, edge.lty="blank", asp = 0)
or try an interactive plot:
library(visNetwork)
visIgraph(g) %>%
visIgraphLayout(layout="layout.norm", layoutMatrix = coords, type = "full")
All in all, I'd recommend exporting your graph to Gephi and experimenting with other layouts and visualizations there interactively.
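The colour lookup used above (match each vertex name to a row of the attribute table, then index a named palette) can be tried on toy data first. This is a minimal sketch in base R, assuming made-up handles and a hypothetical two-colour palette; only the column names TwitterHandle and Fractie come from the question.

```r
# Toy stand-ins for data.csv and the matrix row names (all values are made up).
df <- data.frame(
  TwitterHandle = c("alice", "bob", "carol"),
  Fractie       = c("N-VA", "Groen", "N-VA"),
  stringsAsFactors = TRUE
)
vertex_names <- c("carol", "alice", "bob")  # order as it would appear in the matrix

# Named palette: one colour per party level (hypothetical colours).
pal <- setNames(c("#33ABB9", "#E69F00"), levels(df$Fractie))

# match() gives, for each vertex, the row of df that holds its attributes.
idx <- match(vertex_names, df$TwitterHandle)

# Index the palette by party name rather than by factor code.
vertex_colors <- unname(pal[as.character(df$Fractie[idx])])

print(data.frame(vertex = vertex_names,
                 party  = df$Fractie[idx],
                 color  = vertex_colors))
```

Handles that are missing from the attribute table give NA in idx, so checking anyNA(idx) before plotting is a cheap way to catch spelling differences between the two files.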

Related

Represent a colored polygon in ggplot2

I am using the spatstat package because I am working on spatial patterns.
I would like to reproduce the following graph in ggplot2, with colors instead of numbers (because the numbers are not very readable).
The graph was produced with the plot.quadrattest function: Polygone
The numbers that interest me for the intensity of the colors are those at the bottom of each box.
The test object contains the following data:
Test object
I have looked at the help of the function, as well as the code of the function but I still cannot manage it.
Ideally I would like my final figure to look like this (maybe not with the same colors haha):
Final object
Thanks in advance for your help.
Please provide a reproducible example in the future.
The package reprex may be very helpful.
To use ggplot2 for this my best bet would be to convert
spatstat objects to sf and do the plotting that way,
but it may take some time. If you are willing to use base
graphics and spatstat you could do something like:
library(spatstat)
# Data (using a built-in dataset):
X <- unmark(chorley)
plot(X, main = "")
# Test:
test <- quadrat.test(X, nx = 4)
# Default plot:
plot(test, main = "")
# Extract the `quadratcount` object (regions with observed counts):
counts <- attr(test, "quadratcount")
# Convert to `tess` (raw regions with no numbers)
regions <- as.tess(counts)
# Add residuals as marks to the tessellation:
marks(regions) <- test$residuals
# Plot regions with marks as colors:
plot(regions, do.col = TRUE, main = "")

plot a subset of igraph data , just subset by their attribute

I have the fblog data set. PolParty is one attribute of my data, and I want to plot the network of blogs for just 2 political parties (say P1 and P2).
I wrote the code below, but I think it is wrong. Can someone help me?
library(statnet)
library(igraph)
library(sand)
data(fblog)
fblog = upgrade_graph(fblog)
class(fblog)
summary(fblog)
V(fblog)$PolParty
table(V(fblog)$PolParty)
p1<-V(fblog)[PolParty=="PS"] # by their labels/names
p2<-V(fblog)[PolParty=="UDF"]
class(p1)
The objects p1 and p2 that you created are of class igraph.vs (not igraph). Such an object is only a vertex sequence, not a full graph, so plotting it does not draw a network.
Based on the following post: Subset igraph graph by label
g <- induced_subgraph(fblog, vids = which(V(fblog)$PolParty == " PS"))
plot(g)
NOTE: every value in the output of V(fblog)$PolParty is preceded by a space, hence you need to use V(fblog)$PolParty == " PS". Also note that subgraph.edges() expects edge IDs in eids, so vertex indices returned by which() should go to induced_subgraph() instead.
UPDATE: to subset based on 2 conditions, modify the which() command:
g <- induced_subgraph(fblog, vids = which(V(fblog)$PolParty == " PS" | V(fblog)$PolParty == " UDF"))
plot(g)
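Subsetting by vertex attribute can be checked on a small toy graph before running it on fblog. This sketch uses igraph's induced_subgraph(), which selects by vertex ID and keeps the edges among the selected vertices. The graph and party labels are made up, and stripping the leading space with trimws() is an optional convenience rather than part of the original answer.

```r
library(igraph)

# Toy graph; vertices are created in order of first appearance: A, B, C, D.
g <- graph_from_literal(A - B, B - C, C - D, D - A)
V(g)$PolParty <- c(" PS", " PS", " UDF", " LR")  # leading spaces, as reported for fblog

# Strip whitespace once so later comparisons do not need the leading space.
V(g)$PolParty <- trimws(V(g)$PolParty)

# Keep only vertices from the two parties, plus the edges among them.
keep  <- which(V(g)$PolParty %in% c("PS", "UDF"))
g_sub <- induced_subgraph(g, vids = keep)

print(V(g_sub)$name)
print(ecount(g_sub))
```

Using %in% with a vector of party names scales to any number of parties without chaining | conditions.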

r convert igraph into visNetwork

I found a way to convert an igraph object into a visNetwork plot (see Interactive arules with arulesViz and visNetwork). The result should be the same before and after the conversion, but after converting to visNetwork my results are different.
I'll try to demonstrate the issue using the sample data data("Groceries") from library(arules).
#Pre-defined library
library(arules)
library(arulesViz)
library(visNetwork)
library(igraph)
#Get sample data & get association rules
data("Groceries")
rules <- apriori(Groceries, parameter=list(support=0.01, confidence=0.4))
rules <- head(sort(rules, by="lift"), 10)
#Convert rules to data.table
library(data.table)
rules_dt <- data.table( lhs = labels( lhs(rules) ),
rhs = labels( rhs(rules) ),
quality(rules) )[ order(-lift), ]
print all rules in table format (sort by lift)
Plot top 10 association rules via using igraph
ig <- plot(rules, method="graph", control=list(type="items"))
Note: Based on the association rules, I plot a network diagram using igraph, and everything is correct. Next I'll convert the existing igraph object to visNetwork and compare the results.
tf <- tempfile( )
saveAsGraph(rules, file = tf, format = "dot" )
# clean up temp file if desired
#unlink(tf)
# Convert igraph to dataframe
ig_df <- as_data_frame(ig, what = "both")
# Plot visNetwork
visNetwork(
nodes = data.frame(
id = ig_df$vertices$name
,value = ig_df$vertices$lift # could change to lift or confidence
,title = ifelse(ig_df$vertices$label == "",ig_df$vertices$name,
ig_df$vertices$label)
,ig_df$vertices
),
edges = ig_df$edges
) %>%
visEdges(arrows ="to") %>%
visOptions( highlightNearest = T )
Plot top 10 association rules via using visNetwork
Note: In the visNetwork diagram, the size of an intercept node indicates "lift": the higher the lift, the larger the node. In the igraph diagram, by contrast, the size of an intercept node indicates "support", while its colour indicates "lift".
Let's compare the igraph and visNetwork diagrams.
According to the association rules in table format, rule no. 10 (the rule with the smallest "lift") should have the smallest intercept node, but it does not.
Problems
I drilled further down into ig_df <- get.data.frame(ig, what = "both") and found something odd in the ig_df$vertices table, which is generated by the as_data_frame function from library(igraph).
I found that assoc10 (the intercept node for association rule no. 10) actually has NA for all variables (i.e. "support", "confidence", "lift" and "count"). More precisely, the values in the columns "support", "confidence", "lift" and "count" in ig_df$vertices appear to be shifted up by one row. Kindly correct me if I'm wrong.
Conclusion
The key to converting an igraph object into visNetwork is to use as_data_frame to extract all data from the igraph object into data frames, and then to plot with visNetwork from those data frames. Because of this extraction issue with as_data_frame, the resulting plot is different as well.
Question: is this a bug, or did I make a mistake in my code? Any suggestion is welcome. Thanks!
A year later... but I already typed everything so might as well.
If I understand it correctly, this is the source of your troubles -
value = ig_df$vertices$lift # could change to lift or confidence
The easiest solution is to assign your lift values to size. The visNetwork documentation describes size as: "Number. Default to 25. The size is used to determine the size of node shapes that do not have the label inside of them. These shapes are: image, circularImage, diamond, dot, star, triangle, triangleDown, square and icon." I'm not sure how value works here, but I think you could use value if you also provide appropriate scaling.
https://www.rdocumentation.org/packages/visNetwork/versions/2.0.9/topics/visNodes
So the solution is:
size = ig_df$vertices$lift # could change to lift or confidence
Note that nodes with an NA lift get the default size of 25, and that with a lift of 2 or 3 used as the size you can barely see the nodes. You could raise the lift to a power (cubed below), which increases the size of the nodes and exaggerates the differences in lift.
ig_df <- as_data_frame(ig, what = "both")
ig_df$vertices$lift <- ig_df$vertices$lift^3
And I couldn't reproduce your shifty table problem, mine looked fine, hopefully it solves itself!
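As an alternative to cubing, the lift values could be rescaled linearly into a fixed size range, which keeps every node visible whatever the raw lift is. This is a minimal base R sketch: scale_size() is a hypothetical helper, and the size limits and the lift values are made up for illustration.

```r
# Hypothetical helper: map a numeric measure (e.g. lift) to node sizes.
# NA entries (nodes without a lift value) get a fixed fallback size.
scale_size <- function(x, min_size = 10, max_size = 40, na_size = 25) {
  rng <- range(x, na.rm = TRUE)
  if (diff(rng) == 0) {
    # All non-missing values are equal: use the midpoint of the size range.
    out <- rep((min_size + max_size) / 2, length(x))
  } else {
    out <- min_size + (x - rng[1]) / diff(rng) * (max_size - min_size)
  }
  out[is.na(x)] <- na_size
  out
}

lift  <- c(NA, 2.0, 2.5, 3.0, NA)  # made-up values; NA stands for nodes without lift
sizes <- scale_size(lift)
print(sizes)
```

The resulting vector could then be supplied as the size column of the nodes data frame, in place of the raw lift. The sketch assumes at least one non-missing value in x.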

Colorize the map of Russia depending on the variable in R

I have a map of Russia with regional subdivision
library(raster)
data <- getData('GADM', country='RUS', level=1)
http://www.gks.ru/bgd/regl/B16_14p/IssWWW.exe/Stg/d01/08-01.doc
The link is to a Word.doc with data (table) on crime rates for Russian regions. I can extract this data and use it in R. I want to take 2015 year and colorize regions on the map depending on the crime rate (also add a legend). How can I do this? The problem is that names of regions are sometimes different in the shape file (NL_NAME_1) and in the data from www.gks.ru.
I also have this code for graph that I need, except that here we have meaningless colors:
library(sp)
library(RColorBrewer)
data$region <- as.factor(iconv(as.character(data$NAME_1)))
spplot(data, "region", xlim=c(15,190), ylim=c(40,83),
col.regions=colorRampPalette(brewer.pal(12, "Set3"))(85), col = "white")
If I understand your question properly, you just need to add your data to the spatial object for making colors meaningful.
Please note that data is the name of a built-in R function (utils::data), not a reserved word, but it's still better to pick a different variable name to avoid confusion:
geo_data <- getData('GADM', country = 'RUS', level = 1)
Let's emulate some data to demonstrate a visualization strategy:
set.seed(23)
geo_data@data["data_to_plot"] <- sample(1:100, length(geo_data@data$NAME_1))
Using the default GADM projection would cut off the easternmost part of the country. A simple transformation helps to fit the whole area into the plot:
# fit Russian area inside the plot
geo_data_trsf <- spTransform(geo_data, CRS("+proj=longlat +lon_wrap=180"))
Draw the map selecting data_to_plot instead of region:
max_data_val <- max(geo_data_trsf@data$data_to_plot)
spplot(geo_data_trsf, zcol = "data_to_plot",
col.regions = colorRampPalette(brewer.pal(12, "Set3"))(max_data_val),
col = "white")
The plot limits are adjusted automatically for the transformed spatial data geo_data_trsf, which makes it possible to omit xlim and ylim.
As for the problem with the names, I can't provide a ready-to-use solution. The region names in NL_NAME_1 clearly need some additional treatment before they can be used as labels. I think it would be better to use NAME_1 as the identifier in your code, to make sure there is no trouble with encoding. The NL_NAME_1 column is well suited to establishing the correspondence between your Word data and the data inside the spatial object geo_data.
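One possible way to line up two sets of region names is to normalize both sides, apply a small hand-written lookup for the names that still differ, and then use match(). The sketch below is base R only and is an illustration, not a ready-made solution: the region names, the rates, and the helper normalize() are all made up.

```r
# Toy stand-ins (all names and rates are made up for illustration).
map_names <- c("Adygey", "Altay", "Moskva")  # names as stored in the spatial object
crime <- data.frame(
  region = c("Moscow", "Adygey ", "altay"),  # names as they appear in the external table
  rate   = c(1500, 900, 1200),
  stringsAsFactors = FALSE
)

# Hypothetical manual lookup for names that differ beyond case and whitespace.
recode <- c("moscow" = "moskva")

normalize <- function(x) {
  key <- tolower(trimws(x))
  hit <- key %in% names(recode)
  key[hit] <- recode[key[hit]]
  unname(key)
}

# For each polygon, find the row of the external table with the same key.
idx             <- match(normalize(map_names), normalize(crime$region))
rate_by_polygon <- crime$rate[idx]
unmatched       <- map_names[is.na(idx)]  # names that still need a manual entry

print(data.frame(map_names, rate_by_polygon))
print(unmatched)
```

A vector like rate_by_polygon is already in polygon order, so it could be added as a new column of the spatial object's data slot and passed to spplot() via zcol. Anything listed in unmatched needs another entry in the lookup.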

R Highlighting some of the boundaries (or borders) of the region in a shape file with spplot()

I am working on a shape file and would like to highlight some of the boundaries (borders) of the regions (as in Figure 1):
Figure 1: some but not all of the regions (borders) of the shape file are highlighted
(Source: https://dl.dropboxusercontent.com/u/48721006/highlighted.png)
The highlighting is achieved with ArcMap. I can't figure out how to do the same with R (particularly with the spplot()). Any suggestions on this?
To get the shape file
library(sp)
library(maptools)
con <- url("http://gadm.org/data/rda/ZAF_adm2.RData")
print(load(con))
close(con)
plot(gadm)
Many thanks!
G
What I would do: (1) plot the complete set; (2) take a subset; (3) plot the subset with a different line type. For subsetting shape files, check this question.
plot(gadm)
# check class and structure of the data
class(gadm)
head(gadm@data)
# take a subset based on ID_2
some_polygons = subset(gadm,ID_2>=38840 & ID_2<38850)
plot(some_polygons, add=T, border='cyan', lwd=2)

Resources