I have a data.table that I want to plot as a stacked, grouped bar chart. I prepare the data with the code below:
causesDf <- causesDf[, c('Type', 'Gender', 'Total')]
causesSort <- causesDf[, lapply(.SD, sum),
by=list(causesDf$Type, causesDf$Gender)]
and the resulting data looks like this:
causesDf causesDf.1 Total
1: Illness (Aids/STD) Female 2892
2: Change in Economic Status Female 4235
3: Cancellation/Non-Settlement of Marriage Female 6126
4: Family Problems Female 133181
5: Illness (Aids/STD) Male 5831
6: Change in Economic Status Male 31175
7: Cancellation/Non-Settlement of Marriage Male 5170
and so on.
I am trying to make a barplot like this:
barpos <- barplot(sort(causesSort$Total, decreasing=TRUE),
col=c("red","green"), xlab="", ylab="",
horiz=FALSE, las=2)
legend("topright", c("Male","Female"), fill=c("red","green"))
end_point <- 0.2 + nrow(causesSort) + nrow(causesSort) - 0.1
text(seq(0.1, end_point, by=1), par("usr")[3] - 30,
srt=60, adj= 1, xpd=TRUE,
labels=paste(causesSort$causesDf), cex=0.65)
but the x-axis labels are not aligning properly. Did I miss anything?
Expected output like:
Edited:
causesSort
structure(list(causesDf = c("Illness (Aids/STD)", "Change in Economic Status",
"Cancellation/Non-Settlement of Marriage", "Physical Abuse (Rape/Incest Etc.)",
"Dowry Dispute", "Family Problems", "Ideological Causes/Hero Worshipping",
"Other Prolonged Illness", "Property Dispute", "Fall in Social Reputation",
"Illegitimate Pregnancy", "Failure in Examination", "Insanity/Mental Illness",
"Love Affairs", "Professional/Career Problem", "Divorce", "Drug Abuse/Addiction",
"Not having Children(Barrenness/Impotency", "Causes Not known",
"Unemployment", "Poverty", "Death of Dear Person", "Cancer",
"Suspected/Illicit Relation", "Paralysis", "Property Dispute",
"Unemployment", "Poverty", "Family Problems", "Illness (Aids/STD)",
"Drug Abuse/Addiction", "Other Prolonged Illness", "Death of Dear Person",
"Causes Not known", "Cancer", "Not having Children(Barrenness/Impotency",
"Cancellation/Non-Settlement of Marriage", "Paralysis", "Physical Abuse (Rape/Incest Etc.)",
"Professional/Career Problem", "Love Affairs", "Fall in Social Reputation",
"Dowry Dispute", "Ideological Causes/Hero Worshipping", "Illegitimate Pregnancy",
"Failure in Examination", "Change in Economic Status", "Insanity/Mental Illness",
"Divorce", "Suspected/Illicit Relation", "Not having Children (Barrenness/Impotency",
"Not having Children (Barrenness/Impotency"), causesDf.1 = c("Female",
"Female", "Female", "Female", "Female", "Female", "Female", "Female",
"Female", "Female", "Female", "Female", "Female", "Female", "Female",
"Female", "Female", "Female", "Female", "Female", "Female", "Female",
"Female", "Female", "Female", "Male", "Male", "Male", "Male",
"Male", "Male", "Male", "Male", "Male", "Male", "Male", "Male",
"Male", "Male", "Male", "Male", "Male", "Male", "Male", "Male",
"Male", "Male", "Male", "Male", "Male", "Female", "Male"), Total = c(2892,
4235, 6126, 2662, 31206, 133181, 776, 69072, 4601, 4697, 2391,
12054, 33352, 21339, 1596, 2535, 1205, 5523, 148134, 3748, 7905,
4707, 2878, 8093, 2284, 14051, 23617, 24779, 208771, 5831, 28841,
125493, 5614, 304985, 6180, 2299, 5170, 5002, 1330, 10958, 23700,
8767, 764, 1342, 103, 14951, 31175, 60877, 1598, 6818, 544, 222
)), row.names = c(NA, -52L), class = c("data.table", "data.frame"
)
# , .internal.selfref = <pointer: 0x00000000098d1ef0> # seems not to work
)
If you don't rely on 45° rotation (that one is a bit more tricky) you could use this solution.
First we need to reshape the data by sex.
library(reshape2)
df2 <- dcast(causesSort, ... ~ causesDf.1 , value.var="Total")
Then we generate rownames from the type column and delete this column.
rownames(df2) <- df2[, 1]
df2 <- df2[, -1]
Then we order the data by one column, e.g. by Female.
df2 <- df2[order(-df2$Female), ]
The labels are the rownames.
# labs <- rownames(df2)
However, since they are very long (and bad for the reader's eye!), we may have to think of shorter ones. A workaround is to shorten them a little.
labs <- substr(sapply(strsplit(rownames(df2), " "),
function(x) x[1]), 1, 8)
Now we are able to apply barplot().
pos <- barplot(t(df2), beside=TRUE, xaxt="n",
col=c("#3C6688", "#45A778"), border="white")
pos gives us a matrix of bar positions. Because we have a grouped plot, we need the column means, which are the centers of each group. We can use them to place the axis labels.
axis(1, colMeans(pos), labs, las=2)
Result
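The answer above notes that 45° rotation is a bit more tricky. A minimal sketch of one way to do it in base graphics: suppress the axis, then draw the labels with text() at the group centers. The three rows are taken from the causesSort data above; the margin size, the vertical offset below the axis, and the name centers are assumptions chosen for illustration.

```r
# Three causes from causesSort, already reshaped to one column per gender
df2 <- data.frame(
  Female = c(133181, 69072, 33352),
  Male   = c(208771, 125493, 60877),
  row.names = c("Family Problems", "Other Prolonged Illness",
                "Insanity/Mental Illness")
)

pdf(NULL)                       # null device, so the sketch runs without a window
op <- par(mar = c(9, 4, 1, 1))  # extra bottom margin for the slanted labels

pos <- barplot(t(as.matrix(df2)), beside = TRUE, xaxt = "n",
               col = c("#3C6688", "#45A778"), border = "white")
centers <- colMeans(pos)        # one x position per group of bars

# Labels just below the plot region, rotated 45 degrees and right-aligned
# so that each label ends under the center of its group
usr <- par("usr")
text(x = centers, y = usr[3] - 0.02 * (usr[4] - usr[3]),
     labels = rownames(df2), srt = 45, adj = 1, xpd = TRUE, cex = 0.8)
legend("topright", legend = colnames(df2), fill = c("#3C6688", "#45A778"))

par(op)
dev.off()
```

Drop the pdf(NULL) and dev.off() lines to see the plot on screen. With the full label set the bottom margin will probably need to be larger.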
Here is a ggplot2 solution. This may provide better control over the final output.
library(dplyr)
library(ggplot2)
#Rename columns names
names(causesDf) <- c('Type', 'Gender', 'Total')
#sort male before females
causesDf$Gender<-factor(causesDf$Gender, levels=c("Male", "Female"), ordered=TRUE)
#sort types by total sum and sort in decreasing order
sorted<-causesDf %>% group_by(Type) %>% summarize(gtotal=sum(Total)) %>% arrange(desc(gtotal))
causesDf$Type<-factor(causesDf$Type, levels=sorted$Type, ordered=TRUE)
#plot graph
g<-ggplot(causesDf, aes(x=Type, y=Total, group=Gender, fill=Gender)) +
geom_col(position = "dodge") +
theme(axis.text.x = element_text(angle = 45, hjust=1)) +
scale_fill_manual(values = alpha(c("blue", "green"), .5))
print(g)
Related
I want to draw the same exact graph in R. However, I want to consider two options:
(1) with one x axis for each of the genders, and
(2) with two different x axes for each of the genders. Here is the link to where I found the image: https://rpubs.com/WhataBurger/Anovatype3
Thanks for sharing the knowledge.
Here is a randomly generated one. Please feel free to share your random data in the responses (if you have any).
structure(list(gender = c("Male", "Male", "Male", "Male", "Male",
"Male", "Male", "Male", "Male", "Male", "Male", "Male", "Male",
"Male", "Male", "Male", "Male", "Male", "Male", "Male", "Male",
"Male", "Male", "Male", "Male", "Male", "Male", "Male", "Male",
"Male", "Male", "Male", "Male", "Male", "Male", "Male", "Male",
"Male", "Male", "Male", "Male", "Male", "Male", "Male", "Male",
"Male", "Male", "Male", "Male", "Male", "Female", "Female", "Female",
"Female", "Female", "Female", "Female", "Female", "Female", "Female",
"Female", "Female", "Female", "Female", "Female", "Female", "Female",
"Female", "Female", "Female", "Female", "Female", "Female", "Female",
"Female", "Female", "Female", "Female", "Female", "Female", "Female",
"Female", "Female", "Female", "Female", "Female", "Female", "Female",
"Female", "Female", "Female", "Female", "Female", "Female", "Female",
"Female", "Female", "Female", "Female", "Female"), education = c("Education",
"Education", "Education", "Education", "Education", "Education",
"Education", "Education", "Education", "Education", "Education",
"Education", "Education", "Education", "Education", "Education",
"Education", "Education", "Education", "Education", "Education",
"Education", "Education", "Education", "Education", "Education",
"Education", "Education", "Education", "Education", "Education",
"Education", "Education", "Education", "Education", "Education",
"Education", "Education", "Education", "Education", "Education",
"Education", "Education", "Education", "Education", "Education",
"Education", "Education", "Education", "Education", "No Education",
"No Education", "No Education", "No Education", "No Education",
"No Education", "No Education", "No Education", "No Education",
"No Education", "No Education", "No Education", "No Education",
"No Education", "No Education", "No Education", "No Education",
"No Education", "No Education", "No Education", "No Education",
"No Education", "No Education", "No Education", "No Education",
"No Education", "No Education", "No Education", "No Education",
"No Education", "No Education", "No Education", "No Education",
"No Education", "No Education", "No Education", "No Education",
"No Education", "No Education", "No Education", "No Education",
"No Education", "No Education", "No Education", "No Education",
"No Education", "No Education", "No Education", "No Education",
"No Education"), salary = c(54395.2435344779, 57698.2251051672,
75587.0831414912, 60705.0839142458, 61292.8773516095, 77150.6498688328,
64609.162059892, 47349.3876539347, 53131.4714810647, 55543.3802990004,
72240.8179743946, 63598.1382705736, 64007.7145059405, 61106.8271594512,
54441.5886524592, 77869.1313680308, 64978.5047822924, 40333.8284337036,
67013.5590156369, 55272.0859227207, 49321.7629401315, 57820.250853417,
49739.9555169276, 52711.0877070886, 53749.6073215074, 54395.2435344779,
57698.2251051672, 75587.0831414912, 60705.0839142458, 61292.8773516095,
77150.6498688328, 64609.162059892, 47349.3876539347, 53131.4714810647,
55543.3802990004, 72240.8179743946, 63598.1382705736, 64007.7145059405,
61106.8271594512, 54441.5886524592, 77869.1313680308, 64978.5047822924,
40333.8284337036, 67013.5590156369, 55272.0859227207, 49321.7629401315,
57820.250853417, 49739.9555169276, 52711.0877070886, 53749.6073215074,
23253.2267570303, 33351.1481779781, 30613.4924713461, 25447.4522519522,
35015.2596842797, 31705.8568859073, 28819.7140680309, 33580.5026441801,
33512.5339501322, 33286.3243265499, 32754.5610164004, 32215.6706141504,
29752.3531576931, 28776.1493450403, 28478.1159959505, 27221.172084318,
29168.3308879216, 24938.4145937269, 38675.8238613541, 34831.84799322,
25507.5656671866, 28388.4606588037, 28133.3785855071, 33119.8604733453,
29666.5237341127, 23253.2267570303, 33351.1481779781, 30613.4924713461,
25447.4522519522, 35015.2596842797, 31705.8568859073, 28819.7140680309,
33580.5026441801, 33512.5339501322, 33286.3243265499, 32754.5610164004,
32215.6706141504, 29752.3531576931, 28776.1493450403, 28478.1159959505,
27221.172084318, 29168.3308879216, 24938.4145937269, 38675.8238613541,
34831.84799322, 25507.5656671866, 28388.4606588037, 28133.3785855071,
33119.8604733453, 29666.5237341127)), class = "data.frame", row.names = c(NA,
-100L))
Look at this code; it may help you get started. Your data is not complete, as all Education rows are male and all No Education rows are female, so you can't get a facet_wrap() with all categories. Anyway, I think this may be of help.
Once your variables are loaded, make a data frame and analyse it with ggplot:
library(ggplot2)
df <- data.frame(education, gender, salary)
# plot 1
ggplot(df, aes(x = education, y = salary, fill=gender)) +
geom_boxplot() +
facet_wrap(.~gender) +
theme_bw()
# plot 2
ggplot(df, aes(x = education, y = salary, fill = gender)) +
geom_boxplot() +
theme_bw()
I have a data frame with a character column of names, and I would like to randomly generate pairs of names from this column. My code gives all combinations, but I want every name to be paired exactly once, in random order; a name cannot be paired with itself.
My code is:
# Creating a dataframe
df = data.frame(
"Name" = c("Amiya", "Raj", "Asish", "John", "ruban", "mary", "barath", "leema", "joshi", "indhu", "praveen", "joshua",
"alex", "martin", "stella", "veronica", "henry", "rajesh", "yusuf", "jenita", "johana", "jerald", "jegan", "lincy",
"jona", "rani", "julie", "ross", "chandler", "monica", "penny", "sheldon"),
"Sex" = c("Female", "Male", "Male", "male", "male", "Female", "male", "Female", "Female", "Female", "male",
"male", "male", "male", "Female", "Female", "male", "male", "male", "Female", "Female", "male",
"male", "Female", "Female", "Female", "Female", "male", "male", "Female", "Female", "male"),
"Number" = c(8937998889, 2598279874, 4589987483, 2876876877, 2876876876, 2487698798, 2879879877, 2887987897, 2878798733,
4309808098, 8748098990, 9883798798, 8734787987, 8973498787, 8734887877, 9798374877, 8786487687, 7275687263,
4379879847, 8943787876, 3874879874, 8978973987, 8978347878, 8839478768, 9378887774, 8467676764, 7246276874,
7478798743, 6576787877, 7328776876, 6648678833, 6378787878)
)
print(df)
# Accessing first and second column
cat("Accessing first and second column\n")
dat <- print(df[, 1])
t(combn(unique(dat), 2))
TIA
Get the unique elements of the 'Name' column, sample them, and convert the result to a matrix with 2 columns (assuming the number of unique elements is even):
matrix(sample(unique(df$Name)), ncol = 2)
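A small sketch of why this works: sample() without a size returns a random permutation, so each name appears exactly once in the matrix and can never sit in both columns of the same row. The six names are a subset of df$Name; the seed, the column names, and the explicit check for an even count are additions for illustration.

```r
set.seed(42)  # assumption: fixed seed only so the sketch is repeatable

# Subset of the names from the question
names_vec <- c("Amiya", "Raj", "Asish", "John", "ruban", "mary")
u <- unique(names_vec)

# An odd number of names would leave one person without a partner
stopifnot(length(u) %% 2 == 0)

# Shuffle once, then fold the shuffled vector into two columns
pairs <- matrix(sample(u), ncol = 2,
                dimnames = list(NULL, c("first", "second")))
print(pairs)

# Every name is used exactly once and nobody is paired with themselves
stopifnot(!anyDuplicated(as.vector(pairs)),
          all(pairs[, "first"] != pairs[, "second"]))
```

With an odd number of names, matrix() would recycle a value to fill the last cell, so it is worth handling that case before building the pairs.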
I am creating a circular heatmap as follows:
suppressPackageStartupMessages({
library(circlize)
})
# input data
dput(annot)
structure(list(Specimen_Type = c("Both", "Plasma", "Both", "Both",
"Plasma", "Plasma", "Plasma", "Both", "Both", "Both", "Plasma",
"Plasma", "Both", "Both", "Both", "Both", "Both", "Both", "Both",
"Both", "Both", "Plasma", "Both", "Both", "Plasma", "Both", "Plasma",
"Plasma", "Both", "Plasma", "Both", "CSF", "Both", "Plasma",
"Both", "Both", "Both", "Plasma", "Both", "Plasma", "Both", "Plasma",
"Plasma", "Both", "Both", "Plasma", "Both", "Both", "Plasma",
"Plasma", "Plasma", "Plasma", "Plasma", "Both", "Both"), Sex = c("Female",
"Female", "Female", "Male", "Female", "Female", "Female", "Female",
"Female", "Female", "Male", "Male", "Male", "Male", "Male", "Male",
"Female", "Female", "Male", "Male", "Female", "Female", "Male",
"Male", "Female", "Male", "Male", "Male", "Female", "Female",
"Male", "Male", "Female", "Female", "Male", "Male", "Female",
"Male", "Male", "Male", "Male", "Female", "Male", "Male", "Female",
"Female", "Male", "Male", "Female", "Male", "Male", "Male", "Female",
"Female", "Female")), row.names = c("15635-29", "15635-31", "15635-32",
"15635-37", "15635-38", "15635-182", "15635-42", "15635-43",
"15635-45", "15635-46", "15635-53", "15635-215", "15635-58",
"15635-60", "15635-63", "15635-68", "15635-70", "15635-75", "15635-80",
"15635-81", "15635-87", "15635-90", "15635-100", "15635-101",
"15635-108", "15635-120", "15635-127", "15635-129", "15635-132",
"15635-134", "15635-135", "15635-1", "15635-2", "15635-251",
"15635-7", "15635-11", "15635-145", "15635-148", "15635-150",
"15635-154", "15635-156", "15635-158", "15635-161", "15635-169",
"15635-170", "15635-187", "15635-197", "15635-214", "15635-228",
"15635-225", "15635-246", "15635-254", "15635-234", "15635-239",
"15635-279"), class = "data.frame")
split <- factor(annot$Specimen_Type)
col_fun1 <- list("Male" = "navy",
"Female" = "deeppink4",
'Plasma' = '#fcff5c',
'CSF' = '#8d14ff',
'Both' = '#14f9ff')
circos.par(start.degree = 30, gap.degree = 1, points.overflow.warning = FALSE)
circos.heatmap(annot,
split = split,
col = unlist(col_fun1),
track.height = 0.4,
bg.border = "gray50", bg.lty = 1.5,
show.sector.labels = TRUE)
circos.clear()
How do I add gaps between individual cells in the heatmap?
I needed to update the circlize package and add the cell.border = "white" parameter to the circos.heatmap() call.
I have a bar plot created with ggplot which I would like to animate; the basic plot is ...
I have researched this on the web and created this code:-
Here is some Reprex data ...
Data <- as.data.frame(rbind(c("11 Mar'", "Male", "20-30"),
c("11 Mar'", "Male", "20-30"),
c("11 Mar'", "Female", "20-30"),
c("12 Mar'", "Female", "50-60"),
c("12 Mar'", "Female", "10-20"),
c("12 Mar'", "Male", "60-70"),
c("13 Mar'", "Female", "20-30"),
c("13 Mar'", "Female", "60-70"),
c("13 Mar'", "Male", "60-70"),
c("13 Mar'", "Male", "60-70"),
c("13 Mar'", "Female", "20-30"),
c("14 Mar'", "Female", "70-80"),
c("14 Mar'", "Female", "70-80"),
c("14 Mar'", "Male", "40-50")))
colnames(Data) <- c("Date", "Sex", "AgeGroup")
And this is my call to ggplot ...
ggplot(Data, aes(x = AgeGroup, fill = Sex, frame = Date, Cumulative = TRUE)) +
geom_bar(position = position_dodge2(preserve = "single")) +
scale_fill_manual(values = c("lightblue", "darkblue")) +
xlab("\nAge Group") + ylab("\nIndividuals") +
ggtitle("\nMarch - April 2020") +
scale_y_discrete(limits= c(2,4,6,8)) + theme_pc() +
transition_states(Date, transition_length = 4, state_length = 1) +
labs(title = 'Date: {closest_state}',
subtitle = "Age and Gender distribution",
caption = "data as of 0945 10 Apr 2020")
Unfortunately I am not getting the desired output: the data is not shown cumulatively.
Instead, each frame of the plot refreshes to show only a single day's result. I expected that geom_bar
would accumulate the data across days, but it doesn't seem to. I even have "Cumulative = TRUE"
in the ggplot call, but the result still looks like this ...
Can anyone point me in the right direction?
I have managed to get it.
I rebuilt the data set by copying each previous date's rows into the current date, and so on, so that the frequencies become cumulative. I couldn't get this done in R, but the data ends up looking like this:
Data <- structure(list(Date = c("11 Mar'", "11 Mar'", "11 Mar'", "12 Mar'",
"12 Mar'", "12 Mar'", "12 Mar'", "12 Mar'", "12 Mar'", "13 Mar'",
"13 Mar'", "13 Mar'", "13 Mar'", "13 Mar'", "13 Mar'", "13 Mar'",
"13 Mar'", "13 Mar'", "13 Mar'", "13 Mar'", "14 Mar'", "14 Mar'",
"14 Mar'", "14 Mar'", "14 Mar'", "14 Mar'", "14 Mar'", "14 Mar'",
"14 Mar'", "14 Mar'", "14 Mar'", "14 Mar'", "14 Mar'", "14 Mar'"
), Sex = c("Female", "Male", "Male", "Female", "Female", "Female",
"Male", "Male", "Female", "Female", "Female", "Female", "Female",
"Female", "Female", "Male", "Male", "Male", "Male", "Male", "Female",
"Female", "Female", "Female", "Female", "Female", "Female", "Female",
"Male", "Male", "Female", "Male", "Male", "Male"), AgeGroup = c("20-30",
"20-30", "20-30", "10-20", "20-30", "50-60", "20-30", "20-30",
"60-70", "10-20", "20-30", "20-30", "20-30", "50-60", "60-70",
"20-30", "20-30", "60-70", "60-70", "60-70", "10-20", "20-30",
"20-30", "20-30", "50-60", "60-70", "70-80", "70-80", "20-30",
"20-30", "40-50", "60-70", "60-70", "60-70")), class = c("spec_tbl_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -34L), spec = structure(list(
cols = list(Date = structure(list(), class = c("collector_character",
"collector")), Sex = structure(list(), class = c("collector_character",
"collector")), AgeGroup = structure(list(), class = c("collector_character",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), skip = 1), class = "col_spec"))
Then enter_grow() and exit_fade() with an alpha value need to be added. Make sure the bars appear in the same category order in every state. Otherwise the bar for whichever value is available first gets plotted first, irrespective of its color, which makes the bar colors shift during the transition.
library(ggplot2)
library(gganimate)
ggplot(Data, aes(x = AgeGroup, fill = Sex, frame = Date)) +
geom_bar(position = position_dodge(preserve = "single")) +
scale_fill_manual(values = c("lightblue", "darkblue"), drop = TRUE) +
xlab("\nAge Group") + ylab("\nIndividuals") +
ggtitle("\nMarch - April 2020") +
scale_y_discrete(limits= c(2,4,6,8)) +
transition_states(Date, transition_length = 4, state_length = 1) + shadow_mark() +
enter_grow() +
exit_fade(alpha = 1)+
labs(title = 'Date: {closest_state}',
subtitle = "Age and Gender distribution",
caption = "data as of 0945 10 Apr 2020")
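The cumulative data set in this answer was put together outside R. A minimal base R sketch of the same duplication step, assuming the 14 original rows from the question and that they are already in chronological order; the names orig and cumulative are made up for illustration.

```r
# The 14 rows from the question, one row per individual
orig <- data.frame(
  Date = rep(c("11 Mar'", "12 Mar'", "13 Mar'", "14 Mar'"),
             times = c(3, 3, 5, 3)),
  Sex = c("Male", "Male", "Female", "Female", "Female", "Male", "Female",
          "Female", "Male", "Male", "Female", "Female", "Female", "Male"),
  AgeGroup = c("20-30", "20-30", "20-30", "50-60", "10-20", "60-70", "20-30",
               "60-70", "60-70", "60-70", "20-30", "70-80", "70-80", "40-50"),
  stringsAsFactors = FALSE
)

# unique() keeps first-appearance order, so this relies on the rows
# being sorted by date already
dates <- unique(orig$Date)

# For each date, take every row up to and including that date and
# relabel it with the current date
cumulative <- do.call(rbind, lapply(seq_along(dates), function(i) {
  upto <- orig[orig$Date %in% dates[seq_len(i)], ]
  upto$Date <- dates[i]
  upto
}))
rownames(cumulative) <- NULL

table(cumulative$Date)
```

The result has 34 rows, the same size as the hand-built data above, and can be passed to the ggplot call in place of Data.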
I am quite a newbie to RStudio and I am having problems adding a different vertical line to each of my two facets using facet_wrap.
Here is what I thought:
library(ggplot2)
g <- ggplot(dat, aes(x=index, na.rm= TRUE))
d <- g+ geom_density() + facet_wrap(~gender)
vline.data <- data.frame(z = c(2.36,2.48),gender = c("Female","Male"))
d1 <- d + geom_vline(aes(xintercept = z),vline.data)
But it adds the same two lines to each facet - what would you reckon is the problem? I have thought of somehow splitting the facets into two separate data frames, but I have no idea how to go on about it.
P.S The x-axis (the index) goes from 1 to 4.
Thank you in advance.
index <- c(NA, NA, 4, 4, 4, NA, NA, NA, NA, 2, 2, 2, 2, 2, 3, 3, 3, 3,
3, NA, 3, NA, NA, 3, 4, 4, 4, 4, 4, 3, 4, 3, 4, 3, NA, 4, 2,
4, 4, 2, 2, NA, 2, 3, 3, 2, 2, NA, NA, 2)
gender <- c("Female", "Female", "Male", "Male", "Male", "Female", "Female",
"Male", "Female", "Male", "Female", "Male", "Female", "Male",
"Male", "Female", "Female", "Female", "Male", "Female", "Female",
"Male", "Male", "Female", "Female", "Female", "Male", "Male",
"Female", "Female", "Male", "Female", "Female", "Female", "Male",
"Male", "Female", "Male", "Male", "Female", "Female", "Male",
"Male", "Female", "Female", "Male", "Male", "Male", "Male", "Male")
Using the code and data above I get this plot
This was asked a long time ago, but here's the answer:
You need to add geom_vline to your ggplot with data that carries the faceting variable; a layer whose data has no matching gender column is drawn in full in every facet. One way to guarantee the match is to keep everything in one data frame: put index, gender and whatever vline.data is calculated from into a single dat data frame, with each piece as a column, and summarise it per gender inside the geom_vline call. The example below assumes the values in vline.data are the per-gender means of the index column.
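library(ggplot2)
library(dplyr)  # for %>%, group_by() and summarise()
p <- ggplot(dat, aes(x = index)) +
  geom_density(na.rm = TRUE) +
  facet_wrap(. ~ gender) +
  # one line per facet; na.rm = TRUE because index contains NAs
  geom_vline(data = . %>% group_by(gender) %>% summarise(vl = mean(index, na.rm = TRUE)),
             aes(xintercept = vl))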
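If you would rather keep the line positions in a separate data frame, as in the question, the following sketch may help. It uses a small made-up data set in place of dat, and computes the per-gender means with aggregate() purely for illustration; in the question, z held fixed values instead. The key assumption is that the gender column in vline.data has the same name and the same values as the faceting variable.

```r
library(ggplot2)

# Small made-up stand-in for dat: index from 1 to 4 with some NAs
dat <- data.frame(
  index  = c(2, 3, 4, NA, 2, 3, 3, 4, NA, 4),
  gender = rep(c("Female", "Male"), each = 5)
)

# One row per facet; the formula interface of aggregate() drops NA rows
vline.data <- aggregate(index ~ gender, data = dat, FUN = mean)
names(vline.data)[names(vline.data) == "index"] <- "z"

p <- ggplot(dat[!is.na(dat$index), ], aes(x = index)) +
  geom_density() +
  facet_wrap(~ gender) +
  # vline.data has a gender column, so each row lands in its own facet
  geom_vline(data = vline.data, aes(xintercept = z))
```

Printing p should show one vertical line per panel. If both lines still appear in every panel, check that the gender values in the two data frames are spelled identically.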