I want to build a network diagram from a dataframe that I have, but I am having trouble.
This is what the dataframe looks like:
Shop | Manager
S1 | 34
S1 | 12
S2 | 11
S2 | 34
S3 | 34
S4 | 50
For example, S1 should be connected to S2 and S3 since they have the same manager and so on. Also, is it possible to set the size of the dot based on the number of managers a shop has?
I really appreciate the help. Thanks!
You can try graph_from_adjacency_matrix + tcrossprod + table
library(igraph)
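# as.dist() drops the diagonal (no self-loops); as.matrix() turns it back into a square matrix
g <- graph_from_adjacency_matrix(as.matrix(as.dist(tcrossprod(table(df)))), mode = "undirected")
and plot(g) draws the network.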
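Another way is bipartite_projection (called bipartite.projection in older igraph versions); pluck comes from purrr
library(purrr)
df %>%
graph_from_data_frame() %>%
set_vertex_attr(name = "type", value = names(V(.)) %in% df$Shop) %>%
bipartite_projection() %>%
pluck(2) %>% # second projection = vertices with type TRUE, i.e. the shops
plot()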
Data
> dput(df)
structure(list(Shop = c("S1", "S1", "S2", "S2", "S3", "S4"),
Manager = c(34L, 12L, 11L, 34L, 34L, 50L)), class = "data.frame", row.names = c(NA,
-6L))
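On the second part of the question (dot size by number of managers), neither snippet above sets it. Here is a minimal sketch, assuming the df from the dput output. The scaling factor 10 is an arbitrary choice of mine, and the diagonal is zeroed by hand here instead of going through as.dist:

```r
library(igraph)

df <- data.frame(
  Shop = c("S1", "S1", "S2", "S2", "S3", "S4"),
  Manager = c(34L, 12L, 11L, 34L, 34L, 50L)
)

# Shop-by-shop counts of shared managers; zero the diagonal to avoid self-loops
adj <- tcrossprod(table(df))
diag(adj) <- 0
g <- graph_from_adjacency_matrix(adj, mode = "undirected")

# Number of distinct managers per shop, looked up in vertex order
n_managers <- tapply(df$Manager, df$Shop, function(m) length(unique(m)))
V(g)$size <- as.numeric(10 * n_managers[V(g)$name])  # factor 10 is arbitrary

plot(g)
```

plot() picks up the size vertex attribute automatically, so S1 and S2 (two managers each) are drawn larger than S3 and S4.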
Related
Following is my dataset:
Result | course1 | course2 | course3
pass | 15 | 17 | 18
pass | 12 | 14 | 19
Fail | 9 | 13 | 3
Fail | 3 | 2 | 0
pass | 14 | 11 | 20
Fail | 5 | 0 | 7
I want to plot a grouped bar graph. I am able to plot the following graphs, but I want both results in the same graph.
library(dplyr)        # filter(), summarize(), %>%
library(RColorBrewer) # brewer.pal()
par(mfrow=c(1,1))
options(scipen=999)
coul <- brewer.pal(3, "Set2")
# Bar graph for passed courses (the sample data spells it 'pass', lower case)
result_pass <- data %>% filter(Result=='pass') %>% summarize(c1_tot=sum(course1),
c2_tot = sum(course2), c3_tot = sum(course3) )
col_sum <- colSums(result_pass[,1:3])
barplot(col_sum, xlab = "Courses", ylab = "Total Marks", col = coul, ylim=range(pretty(c(0, col_sum))), main = "Passed courses ")
# Bar graph for Failed courses
result_fail <- data %>% filter(Result=='Fail') %>% summarize(c1_tot=sum(course1),
c2_tot = sum(course2), c3_tot = sum(course3) )
col_sum <- colSums(result_fail[,1:3])
barplot(col_sum, xlab = "Courses", ylab = "Total Marks", col = coul, ylim=range(pretty(c(0, col_sum))), main = "Failed courses ")
Any suggestions on how I can merge the two plots above into one grouped bar graph for passed and failed courses?
It's probably easier than you think. Pass the data straight to aggregate with the formula . ~ Result, where . means all other columns. Dropping the first column with [-1] and coercing with as.matrix (because barplot eats matrices) yields exactly the format barplot needs.
This is the basic code:
barplot(as.matrix(aggregate(. ~ Result, data, sum)[-1]), beside=TRUE)
And here with some visual enhancements:
barplot(as.matrix(aggregate(. ~ Result, data, sum)[-1]), beside=TRUE, ylim=c(0, 70),
col=hcl.colors(2, palette='viridis'), legend.text=sort(unique(data$Result)),
names.arg=names(data)[-1], main='Here could be your title',
args.legend=list(x='topleft', cex=.9))
box()
Data:
data <- structure(list(Result = c("pass", "pass", "Fail", "Fail", "pass",
"Fail"), course1 = c(15L, 12L, 9L, 3L, 14L, 5L), course2 = c(17L,
14L, 13L, 2L, 11L, 0L), course3 = c(18L, 19L, 3L, 0L, 20L, 7L
)), class = "data.frame", row.names = c(NA, -6L))
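To see what barplot actually receives, here is the intermediate matrix on its own. This is a small sketch using the data above; totals and m are names I made up, and the row names are added only for readability:

```r
data <- data.frame(
  Result  = c("pass", "pass", "Fail", "Fail", "pass", "Fail"),
  course1 = c(15L, 12L, 9L, 3L, 14L, 5L),
  course2 = c(17L, 14L, 13L, 2L, 11L, 0L),
  course3 = c(18L, 19L, 3L, 0L, 20L, 7L)
)

# One row per Result group, one summed column per course
totals <- aggregate(. ~ Result, data, sum)

# Drop the grouping column and keep the group labels as row names
m <- as.matrix(totals[-1])
rownames(m) <- totals$Result
print(m)

# Each column of m becomes one cluster of bars, each row one bar within the cluster
barplot(m, beside = TRUE, legend.text = rownames(m))
```

Because the rows are the Result groups and the columns are the courses, beside=TRUE gives one pass bar and one Fail bar per course.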
I have a 3 column csv file like this
x,y1,y2
100,50,10
200,10,20
300,15,5
I want to make a barplot in R, with the first column's values on the x axis and the second and third columns' values as grouped bars for the corresponding x. I hope I made it clear. Can someone please help me with this? My data is huge, so I have to import the csv file and can't enter all the data by hand. I found relevant posts but none addressed exactly this.
Thank you
Use the following code
library(tidyverse)
df %>% pivot_longer(names_to = "y", values_to = "value", -x) %>%
ggplot(aes(x,value, fill=y))+geom_col(position = "dodge")
Data
df = structure(list(x = c(100L, 200L, 300L), y1 = c(50L, 10L, 15L),
y2 = c(10L, 20L, 5L)), class = "data.frame", row.names = c(NA,
-3L))
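Since the question is specifically about importing the csv, here is a base-R sketch of the whole path from file to grouped bars. The inline text stands in for the real file (with a file you would call read.csv("your_file.csv"), a hypothetical name), and m is a name I made up:

```r
# Stand-in for read.csv("your_file.csv"); the file name is hypothetical
df <- read.csv(text = "x,y1,y2
100,50,10
200,10,20
300,15,5")

# barplot() wants one row per series and one column per x value
m <- t(as.matrix(df[, c("y1", "y2")]))

barplot(m, beside = TRUE, names.arg = df$x,
        legend.text = rownames(m), xlab = "x")
```

This treats x as a label for each group of bars, not as a numeric position, so the groups are evenly spaced regardless of the x values.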
I am starting to work with R. This has to be a basic question, but it doesn't seem obvious how to do it easily. Suppose I have the following data set:
x y
0,1
0,2
1,2
1,4
and so on, so there are multiple y values for each x value. How can I easily make a plot showing the data, the means and the confidence intervals?
I can do it, as I have cobbled together a solution by hacking bits of code together, but there has to be a simpler way.
This is what I am doing with this very simple data file
cardboard,r1,r2,r3,r4,r5,r6
0,233,130,110,140,160
101,293,340,313,260,366,38
and this mess of code :
er <- read.csv(file="ianevans.csv",head=TRUE,sep=",")
er[,2:7] <- min(er[1,2:7],na.rm=TRUE)/er[,2:7]*100
er$sharpness[1]=mean(as.vector(er[1,2:7], "numeric"),na.rm=TRUE)
er$sharpness[2]=mean(as.vector(er[2,2:7], "numeric"),na.rm=TRUE)
er$se[1]=sd(as.vector(er[1,2:7], "numeric"),na.rm=TRUE)/sqrt(6-1)
er$se[2]=sd(as.vector(er[2,2:7], "numeric"),na.rm=TRUE)/sqrt(6-1)
p <- ggplot(er,aes(x=cardboard,y=sharpness))
p1 <- p + geom_point(aes(y=sharpness,color="red",size=5) )
p2 <- p1 +
scale_shape_discrete(solid=F) +
geom_point(aes(y=r1),color="blue",shape="o",size=3) +
geom_point(aes(y=r2),color="blue",shape="o",size=3) +
geom_point(aes(y=r3),color="blue",shape="o",size=3) +
geom_point(aes(y=r4),color="blue",shape="o",size=3) +
geom_point(aes(y=r5),color="blue",shape="o",size=3) +
geom_point(aes(y=r6),color="blue",shape="o",size=3)
p3 <- p2 +
geom_errorbar(aes(ymax=sharpness+se,ymin=sharpness-se),width=5)
Again, this works, more or less, but as you can see there are a bunch of hard-coded numbers. There has to be a better way both to set up the data file and to do the plots, one that easily deals with, for example, different numbers of y values for each x value. Some of the data manipulation is also awkward: it should be possible to do it without individual references for the means, the sd's, and so on.
Here is my attempt to simplify your code. It is based on dplyr, reshape2 (for melt) and ggplot2:
library(dplyr)
library(reshape2)
library(ggplot2)
er <- structure(list(cardboard = c("100", "101"), r1 = c(30L, 293L),
r2 = c(233L, 340L), r3 = c(130L, 313L), r4 = c(110L, 260L
), r5 = c(140L, 366L), r6 = c(160L, 38L)), .Names = c("cardboard",
"r1", "r2", "r3", "r4", "r5", "r6"), row.names = c(NA, -2L), class = "data.frame")
d2 <- er %>%
melt(id.vars='cardboard',na.rm=T) %>% # convert to tidy format
group_by(cardboard) %>% # group by cardboard
mutate(v2=min(value)/value) %>% # calculate v2
summarise(sharpness=mean(v2),se=sd(v2)) # mean and sd of v2 (the column named se holds the sd)
# cardboard sharpness se
#1 100 0.3390063 0.3273334
#2 101 0.2688070 0.3585103
ggplot(d2) +
aes(x=cardboard,y=sharpness,ymax=sharpness+se,ymin=sharpness-se) +
geom_pointrange()
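For the long x/y layout described at the top of the question, here is a base-R sketch of per-group means with confidence intervals. The 95% level and the t-based interval are my assumptions (the code above plots mean ± sd instead), and d, stats and xs are made-up names:

```r
# The four example rows from the question, in long format
d <- data.frame(x = c(0, 0, 1, 1), y = c(1, 2, 2, 4))

# Mean and t-based 95% confidence limits of y for each distinct x
stats <- do.call(rbind, lapply(split(d$y, d$x), function(y) {
  n <- length(y)
  half <- qt(0.975, df = n - 1) * sd(y) / sqrt(n)
  c(mean = mean(y), lower = mean(y) - half, upper = mean(y) + half)
}))

xs <- as.numeric(rownames(stats))

# Raw points, then the means in red, then the interval as error bars
plot(d$x, d$y, ylim = range(stats, d$y), xlab = "x", ylab = "y")
points(xs, stats[, "mean"], pch = 19, col = "red")
arrows(xs, stats[, "lower"], xs, stats[, "upper"],
       angle = 90, code = 3, length = 0.05)
```

Nothing here depends on how many y values each x has, which was the main complaint about the hard-coded version. With only two values per group, as in this toy data, the intervals are very wide.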
I'm a beginner to R.
I just imported some data from csv file into R and am trying to make a basic graph around it.
Name | Marks
A | 65
B | 78
C | 55
D | 66
I created a variable with data <- read.table("input.csv").
But after I imported the data successfully, I was unable to plot a graph that makes sense.
When I tried plot(data), the graph it gave didn't make any sense. I want a very basic graph that makes sense with the data I have.
A pie, a bar, anything will do. Please help!
This would probably not pass the 'Tufte' test, but might be a step in the right direction:
library(ggplot2)
# Build the columns directly: cbind() would coerce the marks to character
data <- data.frame(name = c('A', 'B', 'C', 'D'), marks = c(65, 78, 55, 66))
ggplot(data, aes(x=name, y=marks)) + geom_bar(stat="identity")
Try:
mydf<-structure(list(Name = structure(1:5, .Label = c("A", "B", "C", "D", "E"), class = "factor"), Marks = c(65L, 78L, 55L, 66L, 93L )), .Names = c("Name", "Marks"), class = "data.frame", row.names = c(NA, -5L))
barplot(mydf$Marks,names.arg=mydf$Name)
The most basic plot uses the plot(x, y) command. It is good for a first look at most data.
plot(factor(ddf$Name), ddf$Marks) # ddf is your data frame; plot() needs Name as a factor, not character
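None of the snippets above covers the import step, which is probably why plot(data) looked odd: read.table() defaults to header = FALSE and splits on whitespace. Here is a minimal sketch, assuming the file is comma-separated with a header row (that layout is my guess, and the inline text stands in for input.csv):

```r
# Stand-in for read.csv("input.csv"); read.csv() defaults to header = TRUE, sep = ","
data <- read.csv(text = "Name,Marks
A,65
B,78
C,55
D,66")

# Check that Marks came in as numbers and Name as text before plotting
str(data)

barplot(data$Marks, names.arg = data$Name, ylab = "Marks")
```

If the file really uses | as the separator, read.table("input.csv", header = TRUE, sep = "|", strip.white = TRUE) should be the equivalent call.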
Hi, I have a data set like this:
ALL Critical Error Warning Review
2016 1412 475 4 125
154 45 49 2 58
116 86 12 1 17
I want to plot a stacked bar graph using ggplot2 where a single bar shows 100% of "ALL", and "Critical", "Error", "Warning" and "Review" are stacked on top of one another according to their contribution to "ALL".
I have tried with no luck! I need a hand. Thanks.
I'm not quite sure your description of the desired plot is unambiguous.
My interpretation would be the following:
## Copied from user1317221_G - Thanks for that.
babydf <- structure(list(ALL = c(2016L, 154L, 116L), Critical = c(1412L,
45L, 86L), Error = c(475L, 49L, 12L), Warning = c(4L, 2L, 1L),
Review = c(125L, 58L, 17L)), .Names = c("ALL", "Critical",
"Error", "Warning", "Review"), class = "data.frame", row.names = c(NA,
-3L))
# Add IDs
babydf <- cbind(id=1:nrow(babydf), babydf)
library(reshape2)
library(ggplot2)
# reshape the dataframe:
df.reshaped <- melt(babydf, id.vars='id')
ggplot(subset(df.reshaped, variable != 'ALL'), aes(x=id, y=value, fill=variable)) + geom_bar(stat='identity')
If you want to have all bars of equal height, just do
babydf[, 3:6] <- babydf[, 3:6] / babydf$ALL * 100
before melt.
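The rescaling step can be checked without ggplot2. Here is a base-graphics sketch (a swapped-in alternative, not the ggplot2 route used above) that computes each category's share of ALL and confirms the shares add up to 100 per row; parts and pct are made-up names:

```r
babydf <- data.frame(
  ALL      = c(2016, 154, 116),
  Critical = c(1412, 45, 86),
  Error    = c(475, 49, 12),
  Warning  = c(4, 2, 1),
  Review   = c(125, 58, 17)
)

parts <- as.matrix(babydf[, c("Critical", "Error", "Warning", "Review")])

# Dividing a matrix by a vector recycles down the columns, so row i is divided by ALL[i]
pct <- t(parts / babydf$ALL * 100)

# Each column of pct is one stacked bar; every bar should reach 100
print(colSums(pct))
barplot(pct, names.arg = seq_len(ncol(pct)),
        legend.text = rownames(pct), ylab = "% of ALL")
```

In this data the four categories sum exactly to ALL in every row, so all bars come out at the same height.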