Bubble chart with ggplot shows no circles - r

The data frame a that I use is:
x y size
589 127 16,4724409449
465 58 21,0517241379
408 58 15,9137931034
I use this to make a bubble chart:
library(ggplot2)
a <- read.csv("numbers.csv", header = TRUE)
ggplot(a,aes(x,y))+geom_point(size=a$size)
but I can't see any bubbles in the chart. How can I make them appear?
Here is the dput of the data frame:
structure(list(x = c(589L, 465L, 408L), y = c(127L, 58L, 58L),
size = structure(c(2L, 3L, 1L), .Label = c("15,9137931034",
"16,4724409449", "21,0517241379"), class = "factor")), .Names = c("x",
"y", "size"), class = "data.frame", row.names = c(NA, -3L))
Also, is it possible to add names and different colours to every bubble?
x y size name
589 127 16,4724409449 nameA
465 58 21,0517241379 nameB
408 58 15,9137931034 nameC

To generate a bubble chart with size mapped to a$size, and colour and labels to a$name, we can try:
ggplot(a, aes(x, y, label = name)) +
geom_point(aes(size = size, colour = name)) +
geom_text()
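One detail worth checking before plotting: in the dput shown in the question, size is a factor whose labels use a decimal comma, so it is not numeric yet. The following is a minimal sketch, assuming the goal is a continuous bubble size; the data frame is rebuilt by hand from the question, and the vjust value is only an illustrative choice to lift the labels off the points.

```r
library(ggplot2)

# Data rebuilt from the dput in the question, plus the name column
a <- data.frame(
  x = c(589L, 465L, 408L),
  y = c(127L, 58L, 58L),
  size = factor(c("16,4724409449", "21,0517241379", "15,9137931034")),
  name = c("nameA", "nameB", "nameC")
)

# Replace the decimal comma with a point, then convert to numeric
a$size <- as.numeric(sub(",", ".", as.character(a$size), fixed = TRUE))

# Size and colour are mapped inside aes(), labels come from name
p <- ggplot(a, aes(x, y, label = name)) +
  geom_point(aes(size = size, colour = name)) +
  geom_text(vjust = -1.5)  # illustrative offset, not from the original answer
print(p)
```

If the file itself is semicolon-separated, read.csv2() reads decimal commas directly, which would make the conversion step unnecessary.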

Related

ggplot: edit shape of points based on second column

My goal is to plot a map with dwelling locations as points, where the points are divided into two colours based on a categorical variable named category. A few of those dwellings need to have a different shape, e.g., a star. The column that describes this is called star in the example below. My dataframe looks like this:
x y category star
123 456 1 0
143 556 0 0
124 556 1 1
233 256 1 0
ggplot(data = df, aes(x = x, y = y, color=category)) +
geom_point()
The code above gives me what I need, except for the 'stars'. How can I distinguish this second column?
I have assumed you want the points where star has a value of 1 to be drawn as a star shape.
library(ggplot2)
ggplot(data = df1, aes(x = x, y = y, color=factor(category), shape = factor(star))) +
geom_point(size = 8) +
scale_shape_manual(breaks = c(0, 1),
values = c(1, 11))+
labs(color = "Category",
shape = "Star")
data
df1 <- structure(list(x = c(123L, 143L, 124L, 233L),
y = c(456L, 556L, 556L, 256L),
category = c(1L, 0L, 1L, 1L),
star = c(0L, 0L, 1L, 0L)),
class = "data.frame", row.names = c(NA, -4L))
Created on 2022-10-13 with reprex v2.0.2
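A small variation on the answer above, offered only as a sketch: passing a named vector to scale_shape_manual ties each shape to a factor level explicitly, so the result does not depend on the order of the values. The shape codes 16 (filled circle) and 8 (asterisk) are an assumption chosen for illustration; the original answer uses 1 and 11.

```r
library(ggplot2)

# Same data as in the answer, with both grouping columns stored as factors
df1 <- data.frame(
  x = c(123L, 143L, 124L, 233L),
  y = c(456L, 556L, 556L, 256L),
  category = factor(c(1L, 0L, 1L, 1L)),
  star = factor(c(0L, 0L, 1L, 0L))
)

# Named vector: level "0" -> filled circle (16), level "1" -> asterisk (8)
star_shapes <- c("0" = 16, "1" = 8)

p <- ggplot(df1, aes(x = x, y = y, colour = category, shape = star)) +
  geom_point(size = 8) +
  scale_shape_manual(values = star_shapes) +
  labs(colour = "Category", shape = "Star")
print(p)
```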

plotting means in a horizontal bar with a vertical line

ID  score1  score 2  score 3  score 4
1   200     300      400      -200
2   250     -310     -470     -200
3   210     400      480      -200
4   220     -10      -400     -200
5   150     -50      400      -200
I am new to R and want to make a graph that presents the mean of each score.
The scores should be listed along the y axis, with a vertical line representing 0.
For every score whose mean is above zero, a horizontal bar should extend from the centre line to the right.
For every score whose mean is below zero, a horizontal bar should extend from the centre line to the left.
Thanks for the help!
You could achieve your desired result by first converting your dataset to long format and by computing the means per score afterwards. After these data wrangling steps you could plot the means using ggplot2 via geom_col and add a vertical zero line using geom_vline:
df <- data.frame(
ID = c(1L, 2L, 3L, 4L, 5L),
score1 = c(200L, 250L, 210L, 220L, 150L),
score.2 = c(300L, -310L, 400L, -10L, -50L),
score.3 = c(400L, -470L, 480L, -400L, 400L),
score.4 = c(-200L, -200L, -200L, -200L, -200L)
)
library(dplyr)
library(tidyr)
library(ggplot2)
df1 <- df |>
tidyr::pivot_longer(-ID, names_to = "score") |>
group_by(score) |>
summarise(value = mean(value))
ggplot(df1, aes(value, score)) +
geom_vline(xintercept = 0) +
geom_col()
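As a quick sanity check on the values being plotted, the per-score means can also be computed in base R. This sketch is not part of the original answer; it only rebuilds the same df and uses colMeans.

```r
# Same data as in the answer above
df <- data.frame(
  ID = 1:5,
  score1 = c(200L, 250L, 210L, 220L, 150L),
  score.2 = c(300L, -310L, 400L, -10L, -50L),
  score.3 = c(400L, -470L, 480L, -400L, 400L),
  score.4 = c(-200L, -200L, -200L, -200L, -200L)
)

# Mean of every score column (the ID column is dropped first)
score_means <- colMeans(df[-1])
print(score_means)
# score1 = 206, score.2 = 66, score.3 = 82, score.4 = -200
```

Three bars therefore extend to the right of the zero line and one (score.4) to the left.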
EDIT: To label the bars you could use geom_text. The tricky part is aligning the labels. To this end I use an ifelse to right-align (hjust = 1) the labels for a positive mean and left-align (hjust = 0) them for a negative mean. Actually I used 1.1 and -.1 to add some padding between the label and the bar. The axis labels could be set via the labels argument of the scale, which in your case is scale_y_discrete. Personally I prefer to use a named vector, which assigns labels to categories in the data.
ggplot(df1, aes(value, score)) +
geom_vline(xintercept = 0) +
geom_col() +
geom_text(aes(label = value, hjust = ifelse(value > 0, 1.1, -.1)), color = "white") +
scale_y_discrete(labels = c("score1" = "Test1", "score.2" = "Test2", "score.3" = "Test3", "score.4" = "Test4"))
A similar approach to stefan's, but with a slightly different choice of functions:
The data:
dat <- structure(list(ID = 1:5, score1 = c(200L, 250L, 210L, 220L, 150L
), score2 = c(300L, -310L, 400L, -10L, -50L), score3 = c(400L,
-470L, 480L, -400L, 400L), score4 = c(-200L, -200L, -200L, -200L,
-200L)), class = "data.frame", row.names = c(NA, -5L))
The chain of functions:
library(dplyr)
library(purrr)   # provides map_df()
library(tidyr)
library(ggplot2)
dat %>%
select(-ID) %>%
map_df(mean) %>%
pivot_longer(everything(), names_to = "score", values_to = "means") %>%
ggplot() +
coord_flip() +
geom_col(aes(x = score, y = means))
In case you want to change the labels on the tick marks ("score1", "score2", etc.) to other labels, you can use scale_x_discrete.
In addition, in case you want to show the numeric value on top of each bar, you can use geom_text with hjust to adjust the label positions.
For example:
dat %>%
select(-ID) %>%
map_df(mean) %>%
pivot_longer(everything(), names_to = "score", values_to = "means") %>%
ggplot() +
coord_flip() +
geom_col(aes(x = score, y = means)) +
scale_x_discrete(labels = c("Test A", "Test B", "Test C", "Test D")) +
geom_text(aes(x = score, y = means, label = means),
hjust = c(-0.5, -0.5, -0.5, 1.1))

Stacked Bar Chart in ggplot

I would like to have a stacked bar chart. I successfully created my data frame using lubridate; however, as I can only specify x and y values, I do not know how to 'put in' my data values.
The data frame looks like this:
Date Feature1 Feature2 Feature3
2020-01-01 72 0 0
2020-02-01 90 21 5
2020-03-01 112 28 2
2020-04-01 140 36 0
...
The date should be on the x-axis and each row represents one bar in the bar chart (the height of the bar is the sum of Feature1 + Feature2 + Feature3).
The only thing I get is this:
ggplot(dataset_monthly, aes(x = dataset_monthly$Date, y =dataset_monthly$????)) +
+ geom_bar(stat = "stack")
We can reshape to 'long' format first
library(dplyr)
library(tidyr)
library(ggplot2)
dataset_monthly %>%
pivot_longer(cols = -Date, names_to = 'Feature') %>%
ggplot(aes(x = Date, y = value, fill = Feature)) +
geom_col()
data
dataset_monthly <- structure(list(Date =
structure(c(18262, 18293, 18322, 18353), class = "Date"),
Feature1 = c(72L, 90L, 112L, 140L), Feature2 = c(0L, 21L,
28L, 36L), Feature3 = c(0L, 5L, 2L, 0L)), row.names = c(NA,
-4L), class = "data.frame")
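To confirm that each stacked bar reaches Feature1 + Feature2 + Feature3, as the question requires, the expected bar heights can be computed directly from the wide data. This is a base R sketch added for illustration; the data frame is rebuilt with readable dates instead of the numeric Date values in the dput.

```r
# Same data as above, with the dates written out
dataset_monthly <- data.frame(
  Date = as.Date(c("2020-01-01", "2020-02-01", "2020-03-01", "2020-04-01")),
  Feature1 = c(72L, 90L, 112L, 140L),
  Feature2 = c(0L, 21L, 28L, 36L),
  Feature3 = c(0L, 5L, 2L, 0L)
)

# Total height of each stacked bar = row-wise sum of the feature columns
bar_heights <- rowSums(dataset_monthly[-1])
print(data.frame(Date = dataset_monthly$Date, total = bar_heights))
# totals: 72, 116, 142, 176
```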
Slightly modified, using geom_bar. Thanks to akrun!
library(tidyverse)
# Bring the data into long format (same code as akrun's)
df <- dataset_monthly %>%
pivot_longer(cols = -Date, names_to = 'Feature')
ggplot(df, aes(x=Date, y=value, fill=Feature, label = value)) +
geom_bar(stat="identity")+
geom_text(size = 3, position = position_stack(vjust = 0.8)) +
scale_fill_brewer(palette="Paired")+
theme_classic()

Interaction plot with multiple facets using ggplot

I am using RStudio, and I am working on a graph that allows comparison between an input vector and what the database has.
The data looks like this:
Type P1 P2 P3
H1 2000 60 4000
H2 1500 40 3000
H3 1000 20 2000
The input vector for comparison will look like this:
Type P1 P2 P3
C 1200 30 5000
and I want my final plot to look like this:
The most important thing is a visual comparison between the input vector and the different types, for each P component. The scale of the y axis should adapt to each type of P, because there are big differences between them.
library(dplyr)
library(tidyr)
library(ggplot2)
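# Reshape the reference data to long format: one row per Type and P component
d %>% gather(var1, val, -Type) %>%
    # Look up the input vector's value for the same P component
    mutate(input = as.numeric(d2[cbind(rep(1, max(row_number())),
                                       match(var1, names(d2)))]),
           # Sign of the difference (-1, 0, 1) drives the line colour
           slope = factor(sign(val - input), -1:1)) %>%
    # Stack val and input so each facet has two points to connect
    gather(var2, val, -Type, -var1, -slope) %>%
    ggplot(aes(x = var2, y = val, group = 1)) +
    geom_point(aes(fill = var2), shape = 21) +
    geom_line(aes(colour = slope)) +
    scale_colour_manual(values = c("red", "blue")) +
    facet_grid(Type ~ var1)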
DATA
d = structure(list(Type = c("H1", "H2", "H3"),
P1 = c(2000L, 1500L, 1000L),
P2 = c(60L, 40L, 20L),
P3 = c(4000L, 3000L, 2000L)),
class = "data.frame",
row.names = c(NA, -3L))
d2 = structure(list(Type = "C", P1 = 1200L, P2 = 30L, P3 = 5000L),
class = "data.frame",
row.names = c(NA, -1L))
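The question also asks for a y axis that adapts to each P component. One possible alternative layout, sketched here under assumptions and not taken from the answer above, is to facet by P component with free y scales and draw the input vector as a dashed horizontal line in each panel. The names p_cols, d_long and input_long are hypothetical helpers introduced only for this example.

```r
library(ggplot2)

# Reference data and input vector from the question
d <- data.frame(Type = c("H1", "H2", "H3"),
                P1 = c(2000, 1500, 1000),
                P2 = c(60, 40, 20),
                P3 = c(4000, 3000, 2000))
d2 <- data.frame(Type = "C", P1 = 1200, P2 = 30, P3 = 5000)

p_cols <- c("P1", "P2", "P3")

# Long format: one row per Type and P component (columns are stacked in order)
d_long <- data.frame(
  Type = rep(d$Type, times = length(p_cols)),
  P = rep(p_cols, each = nrow(d)),
  val = unlist(d[p_cols], use.names = FALSE)
)

# One input value per P component
input_long <- data.frame(P = p_cols,
                         val = unlist(d2[p_cols], use.names = FALSE))

# scales = "free_y" lets each P panel pick its own y range
p <- ggplot(d_long, aes(x = Type, y = val)) +
  geom_point() +
  geom_hline(data = input_long, aes(yintercept = val), linetype = "dashed") +
  facet_wrap(~ P, scales = "free_y")
print(p)
```

Points above the dashed line are types whose value exceeds the input for that component; points below it fall short.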

How to visualize two columns in a bar chart using R?

I don't know if my question is clear enough...
I have this table
Name Mark_Oral Mark_Written Total_M_Oral Total_M_Written
1 Hercule Poirot 50 49 858 781
2 Joe O'Neil 70 79 1056 1083
3 John McAuley 81 99 1219 1333
and I have to visualize the last two columns in a bar chart using R to compare the students' total marks
Data
table <- structure(list(Name = c("Hercule Poirot", "Joe O'Neil", "John McAuley"),
Mark_Oral = c(50L, 70L, 81L),
Mark_Written = c(49L, 79L, 99L),
Total_M_Oral = c(858L, 1056L, 1219L),
Total_M_Written = c(781L, 1083L, 1333L)),
.Names = c("Name", "Mark_Oral", "Mark_Written", "Total_M_Oral", "Total_M_Written"),
row.names = c("1", "2", "3"), class = "data.frame")
You can use + to combine other plots on the same ggplot object. For example:
ggplot(survey, aes(often_post,often_privacy)) +
geom_point() +
geom_smooth() +
geom_point(aes(frequent_read,often_privacy)) +
geom_smooth(aes(frequent_read,often_privacy))
With ggplot2 (as your tags suggest) the syntax is:
ggplot(data = table,aes(x= Total_M_Oral,y=Total_M_Written))+geom_bar(stat = "identity")
Where table is replaced by the name of your dataframe.
Edit
I was unsure that my first answer really answered your question (multiple uses of bars).
Create dummy data
df<-data.frame(x = rpois(n = 100,lambda = 800),y = rpois(n = 100,lambda = 800))
If you want to count and have a color for Oral and one for written
df2<-data.frame(x = c(df$x,df$y),y = rep(c("written","oral"),each = nrow(df)))
ggplot(data = df2, aes(x = x, fill = y)) + geom_bar(stat = "count", alpha = 0.5)
Comment: alpha parameter is not necessary, it just deals with the transparency so that you can see when there are overlapping bars.
With student names
df3<-data.frame(name = rep(table$Name,times = 2),
y = c(table$Total_M_Oral, table$Total_M_Written),
fill = rep(c("oral","written"),each = nrow(table)))
ggplot(data = df3, aes(x = name, y = y, fill = fill)) + geom_bar(stat = "identity", alpha = 0.5)
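For a direct comparison of the two totals per student, the bars can also be placed side by side instead of stacked. The following is a minimal sketch, assuming only the Name and the two total columns from the question are needed; marks and marks_long are hypothetical names used for this example.

```r
library(ggplot2)

# Only the columns needed for the comparison, taken from the question
marks <- data.frame(
  Name = c("Hercule Poirot", "Joe O'Neil", "John McAuley"),
  Total_M_Oral = c(858L, 1056L, 1219L),
  Total_M_Written = c(781L, 1083L, 1333L)
)

# Long format: one row per student and exam type
marks_long <- data.frame(
  Name = rep(marks$Name, times = 2),
  total = c(marks$Total_M_Oral, marks$Total_M_Written),
  exam = rep(c("oral", "written"), each = nrow(marks))
)

# position = "dodge" puts the oral and written bars next to each other
p <- ggplot(marks_long, aes(x = Name, y = total, fill = exam)) +
  geom_col(position = "dodge")
print(p)
```

Dodged bars make it easier to read each total against the y axis, whereas stacked bars emphasise the combined total per student.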
