Parse plot lines by more than one variable ggplot - r

I want to plot two levels of data (high, low) for two days (day 0, day 1) for both male and female subjects. I have been successful in faceting by day and by level. However, I have been unsuccessful in combining these and identifying the genders. I would like to show male and female together on day 0 and day 1. Below is the code I have been trying.
Thanks in advance
library(ggplot2)
# Simulate one concentration-time profile per subject
data <- function(ids, time_vec) {
  obs.data <-
    data.frame(expand.grid(ids, time_vec), DOSE = 0, Conc = rnorm(13, 10, 2), Day = 0)
  names(obs.data) <- c("ID", "TIME", "DOSE", "Conc", "Day")
  obs.data <- obs.data[order(obs.data$ID), ]
  return(obs.data)
}
test <- data(ids = 1:4, time_vec = seq(0, 120, 10))
test$Gender <- ifelse(test$ID == 1 | test$ID == 3, "Male", "Female")
test$Day <- ifelse(test$ID == 1 | test$ID == 2, "Day 1", "Day 2")
test$DoseLevel <- ifelse(test$ID == 1 | test$ID == 2, "Low", "High")
# First attempt: one line per dose level / day / gender combination, faceted by dose level
gf1 <- ggplot(test, aes(x = TIME, y = Conc,
                        group = interaction(DoseLevel, Day, Gender))) +
  geom_line(size = 1.25) +
  facet_grid(DoseLevel ~ ., as.table = FALSE)
gf1
# Second attempt: add points whose fill distinguishes the days
# (show.legend replaces the deprecated show_guide argument)
gf2 <- gf1 + geom_point(aes(shape = factor(Day), fill = factor(Day),
                            colour = factor(Day)), size = 4, show.legend = TRUE) +
  scale_shape_manual(values = c(21, 21)) +
  scale_fill_manual(values = c("black", "white")) +
  scale_colour_manual(values = c("black", "black"))
gf2

This link will help you:
Plotting continuous and discrete series in ggplot with facet
Just take care when using the melt function.
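One possible way to get both genders into the same panels, without reshaping the data, is to facet on dose level and day at the same time and map Gender to colour. This is only a sketch under assumptions: the data frame is rebuilt here with simulated values so the block runs on its own, and the choice of colour and shape for Gender is mine, not something from the question or the linked answer.

```r
library(ggplot2)

# Rebuild a small data set shaped like the one in the question
# (values are simulated, so the curves themselves carry no meaning).
set.seed(1)
test <- expand.grid(ID = 1:4, TIME = seq(0, 120, 10))
test$Conc <- rnorm(nrow(test), mean = 10, sd = 2)
test$Gender <- ifelse(test$ID %in% c(1, 3), "Male", "Female")
test$Day <- ifelse(test$ID %in% c(1, 2), "Day 1", "Day 2")
test$DoseLevel <- ifelse(test$ID %in% c(1, 2), "Low", "High")

# One line per subject. Gender is shown by colour and point shape,
# while dose level (rows) and day (columns) each get a facet dimension.
p <- ggplot(test, aes(x = TIME, y = Conc, group = ID, colour = Gender)) +
  geom_line() +
  geom_point(aes(shape = Gender), size = 3) +
  facet_grid(DoseLevel ~ Day)
```

Calling print(p) draws the plot. With the example data only two of the four panels contain lines, because every low-dose subject is on Day 1 and every high-dose subject is on Day 2.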

Related

Too many dates on X axis- ggplot2

I have two questions.
I am using a dataset detailing information on Chicago (temp, ozone, season, date).
I would like a plot showing the temperature across time, along with a smooth line to
show the trend, specifically with dates restricted to 1997-2000.
Thus:
library(ggplot2)
library(dplyr)
chicago <- read.csv(file = 'FilePath')
## Limit data to 1997-2000
chicago2 <- chicago %>%
  filter(date >= "1997-01-01" & date <= "2000-12-31")
ggplot(chicago2, aes(x=date, y=temp)) +
  geom_point() +
  geom_smooth() +
  labs(title="Temperature")
My issues are as follows:
There seems to be an issue with the x-axis: the dates are not cleanly represented. When I zoom into the picture, it seems like R is plotting every single date on the x-axis, but I am not sure if this is the actual issue.
While I am able to produce the scatterplot, there is no overlaid smooth line, despite the use of the geom_smooth() function.
Looking forward to your responses,
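A likely cause of both symptoms, assuming the date column came out of read.csv() as text rather than as a Date, is that ggplot2 treats each distinct date string as its own category: every date gets a tick, and geom_smooth() sees one point per group and draws nothing. The sketch below uses a made-up stand-in for the Chicago file (the real file and its exact column formats are not shown in the question) and converts the column with as.Date() before filtering and plotting.

```r
library(ggplot2)

# Hypothetical stand-in for the Chicago file: dates arrive as text.
set.seed(42)
chicago <- data.frame(
  date = as.character(seq(as.Date("1996-01-01"), as.Date("2001-12-31"), by = "day")),
  stringsAsFactors = FALSE
)
n <- nrow(chicago)
chicago$temp <- 50 + 25 * sin(2 * pi * (seq_len(n) - 100) / 365.25) + rnorm(n, sd = 5)

# Convert to Date so ggplot2 uses a continuous date scale.
chicago$date <- as.Date(chicago$date)

# Limit data to 1997-2000, comparing Date with Date.
chicago2 <- subset(chicago,
                   date >= as.Date("1997-01-01") & date <= as.Date("2000-12-31"))

p <- ggplot(chicago2, aes(x = date, y = temp)) +
  geom_point() +
  geom_smooth() +
  scale_x_date(date_breaks = "1 year", date_labels = "%Y") +
  labs(title = "Temperature")
```

print(p) draws the plot. The scale_x_date() call is optional and only controls how often ticks appear; the conversion with as.Date() is the part that matters.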

R ggplot2 Visualize categorical variable that levels appear more than once

I am trying to visualize some tennis data with ggplot2 in R.
Here are my data:
Year<-c(1999:2020)
Player <- rep("Federer",22)
Rank <- c("Q1","3R","3R","4R","4R","W","SF","W","W","SF","F","W","SF","SF","SF","SF","3R",
          "SF","W","W","4R","SF")
data <- data.frame(Year, Player, Rank)
data$Rank <- factor(data$Rank, levels = unique(data$Rank))
What I want is a diagram that looks like a bar plot but is not actually a bar plot. I would like the years from 1999 to 2020 on the x-axis, each matched to its Rank level.
My problem is that Rank, which I converted to a categorical variable, has some levels that appear more than once over time, and this makes things difficult for me.
I am looking to do something like the following picture from Wikipedia, with a specific color for every level of the Rank variable.
The Australian open result is what I want to visualize.
Maybe something like this, using geom_tile() to make something like a heatmap instead of a bar plot:
library(ggplot2)
library(ggthemes)
ggplot(data, aes(x=factor(Year), y=Player, fill=Rank)) +
  geom_tile() + scale_fill_economist()
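Since the question asks for a specific color per Rank level, a variation on the same geom_tile() idea is to pass a named vector to scale_fill_manual(). This is a sketch only: the colors and the ordering of the rounds below are my own assumptions, not taken from the Wikipedia table.

```r
library(ggplot2)

Year <- 1999:2020
Rank <- c("Q1","3R","3R","4R","4R","W","SF","W","W","SF","F","W","SF","SF","SF","SF","3R",
          "SF","W","W","4R","SF")
data <- data.frame(Year, Player = "Federer", Rank)

# Order the levels by round reached instead of by first appearance (assumed order).
round_order <- c("Q1", "3R", "4R", "SF", "F", "W")
data$Rank <- factor(data$Rank, levels = round_order)

# Hypothetical colours, one per level; the names must match the factor levels.
rank_colours <- c("Q1" = "grey80", "3R" = "#fee8c8", "4R" = "#fdbb84",
                  "SF" = "#ffeb99", "F" = "#d8bfd8", "W" = "#00cc66")

p <- ggplot(data, aes(x = factor(Year), y = Player, fill = Rank)) +
  geom_tile(colour = "white") +
  geom_text(aes(label = Rank)) +
  scale_fill_manual(values = rank_colours) +
  labs(x = "Year", y = NULL)
```

Levels that repeat across years are not a problem here: each year is its own tile, and the fill simply looks up the color for that year's level.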

Box-plot by month in r, remove RA

I'm very new to R and I am making a boxplot using data collected annually. The question is about depression, and answers fall into these categories: all, most, some, a little, or none of the time. I want to see if there is some correlation between frequency of depression and months of the year.
How do I remove RA?
If you mean how to avoid plotting the NA values, try this (filter() and %>% come from dplyr):
ggplot(data %>% filter(!is.na(menthlth)), aes(...)) +
  geom_boxplot()
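To see what the filter does on its own, here is a small self-contained sketch. Only the column name menthlth comes from the answer above; the survey data frame, the month column and the aesthetics are hypothetical, since the real data and the aes() mapping are not shown.

```r
library(ggplot2)
library(dplyr)

# Hypothetical survey data with some missing answers.
survey <- data.frame(
  menthlth = c("all", "some", NA, "none", "most", NA, "a little", "some"),
  month = c(1, 3, 5, 7, 9, 11, 2, 4)
)

# Keep only the rows where an answer was recorded.
clean <- survey %>% filter(!is.na(menthlth))

# Assumed mapping: answer category on x, month number on y.
p <- ggplot(clean, aes(x = menthlth, y = month)) +
  geom_boxplot()
```

Because the NA rows are dropped before the data reaches ggplot(), no NA category appears on the axis.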

I am using ggplot2 to make a bar chart and can't get the years correct along the x-axis

I am using ggplot2 to make a bar chart of the number of participants per year by gender. If I have 14 years included, I would like 2 bars for each year corresponding to the number of males and females for that year. I am not getting each year along the x-axis. I think data is being binned. I have tried changing the bin width, using scale_x_date and am still stuck.
Can you help me figure out how to have the data for EACH year in my graph?
As an example, here is my data for years 2004-2017:
year=c(2004,2005,2006,2007,2008,2009,2010,2011,2012,2013,2014,2015,2016,2017)
gender=c("male" , "female")
Participants is by gender, male then female respectively per year:
Participants=c(1307,443,1847,630,2109,765, 1824,691,2250,952,3123,1421,4097,1904,6415,3284,8788,4678,11581,6694,13141,8478,16389,10575,20990,13811,26951,19729)
data=data.frame(year,gender,Participants)
Here is how I am trying to generate my plot:
MyPlot <- ggplot(data, aes(fill=gender, y=Participants, x=year)) +
  geom_bar(position="dodge", stat="identity", width = .8)
print(MyPlot + ggtitle("Annual Number of Participants by Gender"))
On the x-axis, the years 2006, 2010, 2014 and 2018 are marked and the bars correspond to data from two years. I want data for each year, both in terms of the bars and in terms of the ticks on the x-axis.
Any help would be appreciated!
You have more participant counts than years, so you don't have a clear data frame design to serve as an input to ggplot.
Start by reading this: https://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html
The key points are:
Each variable forms a column.
Each observation forms a row.
Each type of observational unit forms a table.
Then once you have a tibble/data frame your ggplot2 code should work fine. I'd kill the width= option until you have it working.
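As a sketch of what a tidy version could look like: data.frame() recycles year and gender independently, so in the original each count is paired with the wrong year. Repeating each year twice gives one row per year and gender. Treating year as a factor so that every year gets its own tick, and using geom_col() instead of geom_bar(stat="identity"), are my own choices here, not something the answer above prescribes.

```r
library(ggplot2)

year <- 2004:2017
Participants <- c(1307, 443, 1847, 630, 2109, 765, 1824, 691, 2250, 952,
                  3123, 1421, 4097, 1904, 6415, 3284, 8788, 4678, 11581, 6694,
                  13141, 8478, 16389, 10575, 20990, 13811, 26951, 19729)

# One row per (year, gender): each year appears twice, male then female.
data <- data.frame(
  year = rep(year, each = 2),
  gender = rep(c("male", "female"), times = length(year)),
  Participants = Participants
)

# factor(year) makes the x-axis discrete, so every year is labelled.
MyPlot <- ggplot(data, aes(x = factor(year), y = Participants, fill = gender)) +
  geom_col(position = "dodge") +
  labs(x = "year", title = "Annual Number of Participants by Gender")
```

print(MyPlot) then shows two bars for each of the 14 years.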

How to create a time scatterplot with R?

The data are a series of dates and times.
date time
2010-01-01 09:04:43
2010-01-01 10:53:59
2010-01-01 10:57:18
2010-01-01 10:59:30
2010-01-01 11:00:44
…
My goal was to draw a scatterplot with the date on the horizontal axis (x) and the time on the vertical axis (y). I guess I could also add a color intensity if there is more than one time for the same date.
It was quite easy to create a histogram of dates.
mydata <- read.table("mydata.txt", header=TRUE, sep=" ")
mydatahist <- hist(as.Date(mydata$day), breaks = "weeks", freq=TRUE, plot=FALSE)
barplot(mydatahist$counts, border=NA, col="#ccaaaa")
I haven't figured out yet how to create a scatterplot where the axes are date and/or time.
I would also like to be able to have axes that are not necessarily linear dates (YYYY-MM-DD), but also based on months such as MM-DD (so different years accumulate), or even with a rotation on weeks.
Any help, RTFM URI slapping or hints are welcome.
The ggplot2 package handles dates and times quite easily.
Create some date and time data:
dates <- as.POSIXct(as.Date("2011/01/01") + sample(0:365, 100, replace=TRUE))
times <- as.POSIXct(runif(100, 0, 24*60*60), origin="2011/01/01")
df <- data.frame(
  dates = dates,
  times = times
)
Then get some ggplot2 magic. ggplot will automatically deal with dates, but to get the time axis formatted properly use scale_y_datetime():
library(ggplot2)
library(scales)
ggplot(df, aes(x=dates, y=times)) +
  geom_point() +
  scale_y_datetime(breaks=date_breaks("4 hour"), labels=date_format("%H:%M")) +
  theme(axis.text.x=element_text(angle=90))
Regarding the last part of your question, on grouping by week, etc.: to achieve this you may have to pre-summarize the data into the buckets that you want. You could possibly use plyr for this and then pass the resulting data to ggplot.
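A minimal sketch of that pre-summarizing step, using base R's cut() and table() rather than plyr (a swap on my part, to keep the example dependency-light). The dates are simulated, and the names week_start and weekly are hypothetical.

```r
library(ggplot2)

set.seed(1)
dates <- as.Date("2011-01-01") + sample(0:364, 100, replace = TRUE)

# Bucket each date into its week; cut() labels each bucket by the week's first day.
week_start <- as.Date(cut(dates, breaks = "week"))

# Count observations per week and turn the result into a data frame.
weekly <- as.data.frame(table(week_start), stringsAsFactors = FALSE)
names(weekly) <- c("week_start", "n")
weekly$week_start <- as.Date(weekly$week_start)

p <- ggplot(weekly, aes(x = week_start, y = n)) +
  geom_col()
```

The same pattern works for other buckets: breaks = "month" in cut() groups by calendar month instead.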
I'd start by reading about as.POSIXct, strptime, strftime, and difftime. These and related functions should allow you to extract the desired subsets of your data. The formatting is a little tricky, so play with the examples in the help files.
And, once your dates are converted to a POSIX class, as.numeric() will convert them all to numeric values, hence easy to sort, plot, etc.
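To illustrate those functions on a few timestamps in the style of the question's data (the values and the UTC time zone are assumptions made for the example): format() pulls out a month-day label so that different years fall on the same position, difftime() gives the time of day as a number, and as.numeric() makes the timestamps sortable.

```r
# Example timestamps; the time zone is fixed so the results are reproducible.
x <- as.POSIXct(c("2010-01-01 09:04:43",
                  "2011-01-01 10:53:59",
                  "2010-03-15 10:57:18"), tz = "UTC")

# Month-day label: both January 1st entries collapse to "01-01".
month_day <- format(x, "%m-%d")

# Seconds since midnight, as a plain number.
secs <- as.numeric(difftime(x, trunc(x, "days"), units = "secs"))

# Numeric timestamps (seconds since 1970-01-01) give the chronological order.
chrono <- order(as.numeric(x))
```

Here month_day could serve as the x variable and secs / 3600 as hours on the y-axis of a scatterplot.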
Edit: Andre's suggestion to play w/ ggplot to simplify your axis specifications is a good one.
