How can I built a basic graph out of this data? - r

I'm a beginner to R.
I just imported some data from csv file into R and am trying to make a basic graph around it.
Name | Marks
A | 65
B | 78
C | 55
D | 66
i created a variable data <- read.table("input.csv")
But after I imported the data successfully, I'm unable to plot a graph that makes sense.
When I tried plot(data) it just gave the following graph
It didn't make any sense... I want a very basic graph that makes sense.. with the data I have.. anything a pie or bar or anything will do.. Please HELP!!

This would probably not pass the 'Tufte' test, but might be a step in the right direction:
library(ggplot2)
data <- data.frame(cbind(c('A', 'B', 'C', 'D'), c(65, 78, 55, 66)))
names(data) <- c('name', 'marks')
ggplot(data, aes(x=name, y=marks)) + geom_bar(stat="identity")

Try:
mydf<-structure(list(Name = structure(1:5, .Label = c("A", "B", "C", "D", "E"), class = "factor"), Marks = c(65L, 78L, 55L, 66L, 93L )), .Names = c("Name", "Marks"), class = "data.frame", row.names = c(NA, -5L))
barplot(mydf$Marks,names.arg=mydf$Name)

Most basic plot can be with 'plot(x,y)' command. It is good for screening most data.
plot(ddf$Name,ddf$Marks)

Related

Network Graph in R

I want to build a network diagram from a dataframe that I have, but I am having troubles.
This is what the dataframe looks like.
Shop
Manager
S1
34
S1
12
S2
11
S2
34
S3
34
S4
50
For example, S1 should be connected to S2 and S3 since they have the same manager and so on. Also, is it possible to set the size of the dot based on the number of managers a shop has?
I really appreciate the help. Thanks!
You can try graph_from_adjacency_matrix + tcrossprod + table
library(igraph)
g <- graph_from_adjacency_matrix(as.dist(tcrossprod(table(df))))
and plot(g) shows the network like below
Another way is bipartite.projection
df %>%
graph_from_data_frame() %>%
set_vertex_attr(name = "type", value = names(V(.)) %in% df$Shop) %>%
bipartite.projection() %>%
pluck(2) %>%
plot()
Data
> dput(df)
structure(list(Shop = c("S1", "S1", "S2", "S2", "S3", "S4"),
Manager = c(34L, 12L, 11L, 34L, 34L, 50L)), class = "data.frame", row.names = c(NA,
-6L))

Grouped barplots in R using csv

I have a 3 column csv file like this
x,y1,y2
100,50,10
200,10,20
300,15,5
I want to have a barplot using R, with first column values on x axis and second and third columns values as grouped bars for the corresponding x. I hope I made it clear. Can someone please help me with this? My data is huge so I have to import the csv file and can't enter all the data.I found relevant posts but none was exactly addressing this.
Thank you
Use the following code
library(tidyverse)
df %>% pivot_longer(names_to = "y", values_to = "value", -x) %>%
ggplot(aes(x,value, fill=y))+geom_col(position = "dodge")
Data
df = structure(list(x = c(100L, 200L, 300L), y1 = c(50L, 10L, 15L),
y2 = c(10L, 20L, 5L)), class = "data.frame", row.names = c(NA,
-3L))

R, basic plot with multiple y values

I am starting to work with R, this has to be a basic question but it doesn't seem obvious how to do it easily. If I have the following data set :
x y
0,1
0,2
1,2
1,4
and so on, so there are multiple y values for each x value. This how can I easily do a plot showing the data and the means and the CI intervals.
I can do it, as I have hodged together a solution off by hacking bits of code together, but there has to be a simpler solution.
This is what I am doing with this very simple data file
cardboard,r1,r2,r3,r4,r5,r6
0,233,130,110,140,160
101,293,340,313,260,366,38
and this mess of code :
er <- read.csv(file="ianevans.csv",head=TRUE,sep=",")
er[,2:7] <- min(er[1,2:7],na.rm=TRUE)/er[,2:7]*100
er$sharpness[1]=mean(as.vector(er[1,2:7], "numeric"),na.rm=TRUE)
er$sharpness[2]=mean(as.vector(er[2,2:7], "numeric"),na.rm=TRUE)
er$se[1]=sd(as.vector(er[1,2:7], "numeric"),na.rm=TRUE)/sqrt(6-1)
er$se[2]=sd(as.vector(er[2,2:7], "numeric"),na.rm=TRUE)/sqrt(6-1)
p <- ggplot(er,aes(x=cardboard,y=sharpness))
p1 <- p + geom_point(aes(y=sharpness,color="red",size=5) )
p2 <- p1 +
scale_shape_discrete(solid=F) +
geom_point(aes(y=r1),color="blue",shape="o",size=3) +
geom_point(aes(y=r2),color="blue",shape="o",size=3) +
geom_point(aes(y=r3),color="blue",shape="o",size=3) +
geom_point(aes(y=r4),color="blue",shape="o",size=3) +
geom_point(aes(y=r5),color="blue",shape="o",size=3) +
geom_point(aes(y=r6),color="blue",shape="o",size=3)
p3 <- p2 +
geom_errorbar(aes(ymax=sharpness+se,ymin=sharpness-se),width=5)
Again this works, more or less, but as you can see there are a bunch of hard coded numbers and there has to be a better way to both have the data file set up and do the plots and easily be able to deal with different amounts of y data for each x data for example. The way some of the data manipulation is done is also awkward as it should be possible do to it without individual references for the means, sd's, .
Here is my attempt to simplify your code. It is based on dplyr and ggplot2:
library(dplyr)
library(ggplot2)
er <- structure(list(cardboard = c("100", "101"), r1 = c(30L, 293L),
r2 = c(233L, 340L), r3 = c(130L, 313L), r4 = c(110L, 260L
), r5 = c(140L, 366L), r6 = c(160L, 38L)), .Names = c("cardboard",
"r1", "r2", "r3", "r4", "r5", "r6"), row.names = c(NA, -2L), class = "data.frame")
d2 <- er %>%
melt(id.vars='cardboard',na.rm=T) %>% # convert to tidy format
group_by(cardboard) %>% # group by cardboard
mutate(v2=min(value)/value) %>% # calculate v2
summarise(sharpness=mean(v2),se=sd(v2)) # calculate mean & sd
# cardboard sharpness se
#1 100 0.3390063 0.3273334
#2 101 0.2688070 0.3585103
ggplot(d2) +
aes(x=cardboard,y=sharpness,ymax=sharpness+se,ymin=sharpness-se) +
geom_pointrange()

'height' must be a vector or a matrix. barplot error

I am trying to create a simple bar chart, but I keep receiving the error message
'height' must be a vector or a matrix
The barplot function I have been trying is
barplot(data, xlab="Percentage", ylab="Proportion")
I have inputted my csv, and the data looks as follows:
34.88372093 0.00029997
35.07751938 0.00019998
35.27131783 0.00029997
35.46511628 0.00029997
35.65891473 0.00069993
35.85271318 0.00069993
36.04651163 0.00049995
36.24031008 0.0009999
36.43410853 0.00189981
...
Where am I going wrong here?
Thanks in advance!
EDIT:
dput(head(data)) outputs:
structure(list(V1 = c(34.88372093, 35.07751938, 35.27131783,
35.46511628, 35.65891473, 35.85271318), V2 = c(0.00029997, 0.00019998,
0.00029997, 0.00029997, 0.00069993, 0.00069993)), .Names = c("V1",
"V2"), row.names = c(NA, 6L), class = "data.frame")
and barplot(as.matrix(data)) produced a chart with all the data one bar as opposed to each piece of data on a separate bar.
You can specify the two variables you want to plot rather than passing the whole data frame, like so:
data <- structure(list(V1 = c(34.88372093, 35.07751938, 35.27131783, 35.46511628, 35.65891473, 35.85271318),
V2 = c(0.00029997, 0.00019998, 0.00029997, 0.00029997, 0.00069993, 0.00069993)),
.Names = c("V1", "V2"), row.names = c(NA, 6L), class = "data.frame")
barplot(data$V2, data$V1, xlab="Percentage", ylab="Proportion")
Alternatively, you can use ggplot to do this:
library(ggplot2)
ggplot(data, aes(x=V1, y=V2)) + geom_bar(stat="identity") +
labs(x="Percentage", y="Proportion")
Probably the entire dataframe format is wrong, The same thing happened to me since I added the columns individually and made the dataframe together.
table.values = c(value1, value2,.......)
table = matrix(table.values,nrow=number of rows ,byrow = T)
colnames(table) = c("column1","column2",........)
row.names(table) = c("row1", "row2",............)
barplot(table, beside = T, xlab= "X-axis",ylab= "Y-axis")

Need to Draw a Bar Graph ( in percentile manner ) in ggplot2

hi i have a data set like this
ALL Critical Error Warning Review
2016 1412 475 4 125
154 45 49 2 58
116 86 12 1 17
I want to plot a stacked bar graph using ggplot2 where a single bar would show 100% of "ALL" and rest "Critical","Error","Warning","Review" should be on top of another according to their contribution in "ALL".
I am try it with no luck!!! Need a hand..Thanks
I'm not quite sure if your description of the desired plot is non-ambiguous.
My interpretation would be the following:
## Copied from user1317221_G - Thanks for that.
babydf <- structure(list(ALL = c(2016L, 154L, 116L), Critical = c(1412L,
45L, 86L), Error = c(475L, 49L, 12L), Warning = c(4L, 2L, 1L),
Review = c(125L, 58L, 17L)), .Names = c("ALL", "Critical",
"Error", "Warning", "Review"), class = "data.frame", row.names = c(NA,
-3L))
# Add IDs
babydf <- cbind(id=1:nrow(babydf), babydf))
library(reshape2)
library(ggplot2)
# reshape the dataframe:
df.reshaped <- melt(babydf, id.vars='id')
ggplot(subset(df.reshaped, variable != 'ALL'), aes(x=id, y=value, fill=variable)) + geom_bar(stat='identity')
If you want to have all bars of equal height, just do
babydf[, 3:6] <- babydf[, 3:6] / babydf$ALL * 100
before melt. The result:

Resources