Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 1 year ago.
I am trying to create a plot in R that shows post-surgical outcomes over time. I want to plot a certain data point at pre-op, 1 month post-op, 6 months post-op, etc. Here is an example dataframe:
dat <- data.frame(Preop=c(-2,0.5,-0.25,1.5), PO_1M=c(-1.5,0.2,-0.1,1.0), PO_6M=c(-1.2,0.1,-0.05,0.5), PO_1Y=c(-1.0,0.05,0,0.25))
dat
Ideally, the x axis will have markings for the time (preop, 1 month post-op, etc.), and the y axis will have the value at that time. The data should converge around y=0 coming from either the positive or negative direction, and I imagine a plot looking something like this:
My actual dataframe also has many missing values, so these would need to be accounted for somehow. I would appreciate it if anyone could help me approach this problem using either ggplot or base R plotting functions. Thanks so much!
Your data should be restructured. Use the tidyr package to pivot your columns into rows, then use ifelse() to convert the column names into a number of months. I assigned pre-op to zero months.
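library(tidyverse)
# Reshape from wide (one column per time point) to long (one row per measurement)
dat2 <- dat %>% tidyr::pivot_longer(cols = Preop:PO_1Y)
# Map each former column name to the number of months since surgery (pre-op = 0)
dat2$nummonths <- ifelse(dat2$name == 'Preop', 0,
                  ifelse(dat2$name == 'PO_1M', 1,
                  ifelse(dat2$name == 'PO_6M', 6,
                  ifelse(dat2$name == 'PO_1Y', 12, NA))))
ggplot(dat2, aes(nummonths, value)) + geom_point() + theme_dark()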
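The question also asks for each patient's trajectory and for missing values to be handled. A minimal sketch of one way to do that follows. It assumes each row of dat is one patient; the patient column, the months lookup vector, and the single NA placed in PO_1M are illustrative additions, not part of the original data.

```r
library(tidyr)
library(ggplot2)

# Example data from the question, with one value set to NA for illustration
dat <- data.frame(Preop = c(-2, 0.5, -0.25, 1.5),
                  PO_1M = c(-1.5, 0.2, NA, 1.0),
                  PO_6M = c(-1.2, 0.1, -0.05, 0.5),
                  PO_1Y = c(-1.0, 0.05, 0, 0.25))
dat$patient <- factor(seq_len(nrow(dat)))  # assumed: one row per patient

# Lookup from column name to months since surgery (pre-op = 0)
months <- c(Preop = 0, PO_1M = 1, PO_6M = 6, PO_1Y = 12)

# Long format; rows with a missing value are dropped during the pivot
long <- pivot_longer(dat, cols = -patient, names_to = "visit",
                     values_to = "value", values_drop_na = TRUE)
long$nummonths <- unname(months[long$visit])

# One line per patient, with a dashed reference line at y = 0
p <- ggplot(long, aes(nummonths, value, group = patient, colour = patient)) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  geom_line() +
  geom_point() +
  scale_x_continuous(breaks = unname(months), labels = names(months))
print(p)
```

Because the missing rows are dropped before plotting, each patient's line simply connects the visits that were actually recorded.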
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 1 year ago.
I have a data frame where the two columns of interest are a list of measurements (as integers) and the date/time when the measurement was taken (as POSIXct). When I plot the entire set, it's difficult to see any detail since the data were taken every five minutes and the data span ~2 months. I was hoping that I could use lattice to generate a plot of each day on its own without having to specify the xmin/xmax 60 times. Is this possible?
You would do something like this:
# Generate some fake data
dat <- data.frame(x = rnorm(500),
                  y = rnorm(500),
                  date = sample(Sys.Date() + 1:20, 500, replace = TRUE))
library(lattice)
# One panel per date
xyplot(y ~ x | date, data = dat)
Created on 2021-06-26 by the reprex package (v2.0.0)
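For data shaped like the question's (POSIXct timestamps every five minutes), the same conditioning idea can be sketched as follows. The column names time, value and day are assumptions, and the data is simulated over three days rather than two months. Setting relation = "free" on the x scale gives each panel its own time axis, so no per-day xmin/xmax is needed.

```r
library(lattice)

set.seed(1)
# Simulated measurements every five minutes for three days (288 per day)
time <- seq(as.POSIXct("2021-06-01 00:00", tz = "UTC"),
            by = "5 min", length.out = 3 * 288)
dat <- data.frame(time = time,
                  value = as.integer(round(100 + cumsum(rnorm(length(time))))))

# Conditioning variable: the calendar day of each timestamp
dat$day <- factor(format(dat$time, "%Y-%m-%d"))

# One panel per day; each panel gets its own x-axis range
p <- xyplot(value ~ time | day, data = dat, type = "l",
            scales = list(x = list(relation = "free")),
            as.table = TRUE)
print(p)
```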
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 3 years ago.
My goal is to create a line graph showcasing:
1. Each line is one different person
2. The line plots the person's ranking over time
Since I'm trying to display ranks (e.g. a person is ranked 2nd in the first month, but moves down to 4th the next month, and so on), I would prefer rank #1 to be at the top of the Y axis rather than at the bottom. Is that possible?
I've already created the line graph of everyone and their ranks for each timeframe; I just need a way to flip the scale of the Y axis. Thank you for your help!
Like this? Using scale_y_reverse():
library(tidyverse)
# ChickWeight is a built-in dataset; one line per chick, y axis flipped
ggplot(ChickWeight, aes(Time, weight, group = Chick)) +
  geom_line() +
  scale_y_reverse()
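Applied to rank data, the same idea might look like the sketch below. The ranks data frame, its column names, and the three people in it are made up for illustration; only scale_y_reverse() comes from the answer above.

```r
library(ggplot2)

# Hypothetical monthly rankings for three people
ranks <- data.frame(
  person = rep(c("Ann", "Bob", "Cy"), each = 3),
  month  = rep(1:3, times = 3),
  rank   = c(1, 2, 2,   # Ann
             2, 1, 3,   # Bob
             3, 3, 1)   # Cy
)

# Reversed y axis puts rank 1 at the top; integer breaks suit ranks
p <- ggplot(ranks, aes(month, rank, group = person, colour = person)) +
  geom_line() +
  geom_point() +
  scale_x_continuous(breaks = 1:3) +
  scale_y_reverse(breaks = 1:3)
print(p)
```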
Closed. This question needs debugging details. It is not currently accepting answers.
Closed 4 years ago.
I am using boxplot to show the distribution among 5 different data sets.
I know it is possible to arrange them based on their median values.
What I am looking for is to arrange them based on the difference between the first quartile and the third.
Obviously I do not want to arrange them manually by reordering the levels.
I have fixed this with tidyverse: I used group_by and summarise to calculate the difference between the desired quartiles, then used that difference to arrange the boxes.
If anyone needs the code or has a better solution, please let me know.
Thank you.
Do you mean the interquartile range (IQR())? If so, you can do:
library(tidyverse)
# Order the boxes by the IQR of price within each cut
diamonds %>%
  as_tibble() %>%
  ggplot(aes(reorder(cut, price, IQR), price)) +
  geom_boxplot()
Here is how I ordered my boxplots based on the difference between the 1st and 3rd quartiles. "df" is your data.frame, "column1" is the column you want to group by, and "column2" contains the values whose distribution you want to show.
DisTable <- df %>%
  group_by(column1) %>%
  summarise(Min = quantile(column2, probs = 0.0),
            Q1 = quantile(column2, probs = 0.25),
            Median = quantile(column2, probs = 0.5),
            Q3 = quantile(column2, probs = 0.75),
            Max = quantile(column2, probs = 1),
            DiffQ3Q1 = Q3 - Q1) %>%
  arrange(desc(DiffQ3Q1))
# Group names ordered from the widest to the narrowest Q3 - Q1 spread
bporder <- as.character(DisTable$column1)
# Refer to columns by name inside aes(), not as df$column1
ggplot(df, aes(x = factor(column1, levels = bporder), y = column2, fill = column1)) +
  geom_boxplot()
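Both answers can be combined into a short, self-contained sketch: reorder() with a negated spread gives the descending order directly, without a separate summary table. The built-in InsectSprays dataset stands in for "df" here, and q_spread is a hypothetical helper name.

```r
library(ggplot2)

# Q3 - Q1 of a numeric vector (the same quantity IQR() returns by default)
q_spread <- function(v) unname(diff(quantile(v, c(0.25, 0.75))))

sprays <- InsectSprays
# Negate the spread so the widest box comes first
sprays$spray <- reorder(sprays$spray, sprays$count,
                        FUN = function(v) -q_spread(v))

p <- ggplot(sprays, aes(spray, count)) +
  geom_boxplot()
print(p)
```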
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
I am trying to make a histogram of grades. Here are my variables.
> grade <- factor(c("A","A","A","B","A","A","A","A","B","A","C","B","B","B"))
> numberBook <- c(53,42,40,40,39,34,34,30,28,24,22,21,20,16)
But when I plot it, I get an error message.
> hist(numberBook~grade)
Error in hist.default(numberBook ~ grade) : 'x' must be numeric
What can I do?
I'm not sure why each grade appears several times, so I've guessed that you want a total for each of A, B and C. This may not be quite right. I've recreated your data using rep(), summing the counts for each grade (could be wrong):
data <- c(rep("A", (53+42+40+39+34+34+30+24)), rep("B", (40+28+21+20+16)), rep("C", 22))
Then I can plot the data using barplot:
barplot(prop.table(table(data)))
Barplot is probably what you want here.
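The error arises because hist() has no formula interface and expects a numeric vector. As a sketch under the same guess as above (that numberBook should be totalled per grade), tapply() can compute the totals straight from the question's two vectors. The boxplot() call is an alternative reading, in case the goal is to compare the distribution of numberBook across grades.

```r
grade <- factor(c("A","A","A","B","A","A","A","A","B","A","C","B","B","B"))
numberBook <- c(53,42,40,40,39,34,34,30,28,24,22,21,20,16)

# Total number of books for each grade
totals <- tapply(numberBook, grade, sum)

# Bar chart of the totals per grade
barplot(totals, xlab = "Grade", ylab = "Total number of books")

# Alternative: distribution of numberBook within each grade
boxplot(numberBook ~ grade, xlab = "Grade", ylab = "Number of books")
```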
Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
I'm working with a dataset in R where the main area of interest is the date. (It has to do with army skirmishes, and the date of each skirmish is recorded.) I wanted to check whether these were more likely to happen in a given season, or near a holiday, etc., so I want to be able to see how many dates fall in the summer, winter, and so on, but I'm sort of at a loss for how to do that.
A general recommendation: use the lubridate package for converting strings to dates if you're having trouble with that. Use cut() to divide dates into ranges, like so:
someDates <- c('1-1-2013',
               '2-14-2013',
               '3-5-2013',
               '8-21-2013',
               '9-15-2013',
               '11-28-2013',
               '12-22-2013')
cutpoints <- c('1-1-2013',    # start of range 'winter'
               '3-20-2013',   # spring
               '6-21-2013',   # summer
               '9-23-2013',   # fall
               '12-21-2013',  # winter
               '1-1-2014')    # end of range
library(lubridate)
# labels = FALSE returns the interval number (1 to 5) for each date
temp <- cut(mdy(someDates),
            mdy(cutpoints),
            labels = FALSE)
someSeasons <- c('winter',
                 'spring',
                 'summer',
                 'fall',
                 'winter')[temp]
Now use 'someSeasons' to group your data into date ranges with your favorite statistical analysis. For a choice of statistical analysis, Poisson regression adjusting for exposure (i.e. the length of the season) comes to mind, but that is probably a better question for Cross Validated.
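As a rough illustration of counting per season and adjusting for exposure, the sketch below tabulates the example dates and divides by each season's length in days. It is plain base R rather than a Poisson regression; the someSeasons values are written out by hand to match the example dates above, and the variable names lv, days and rate are assumptions.

```r
# Season boundaries from the example above, as ISO dates
cutpoints <- as.Date(c("2013-01-01", "2013-03-20", "2013-06-21",
                       "2013-09-23", "2013-12-21", "2014-01-01"))
labels <- c("winter", "spring", "summer", "fall", "winter")

# Season of each example date, written out by hand
someSeasons <- c("winter", "winter", "winter", "summer",
                 "summer", "fall", "winter")

lv <- c("winter", "spring", "summer", "fall")

# Number of dates per season (seasons with no dates count as 0)
counts <- table(factor(someSeasons, levels = lv))

# Exposure: days per season; the two winter pieces are added together
days <- tapply(as.numeric(diff(cutpoints)), factor(labels, levels = lv), sum)

# Events per day in each season
rate <- as.numeric(counts) / as.numeric(days)
names(rate) <- lv
print(counts)
print(rate)
```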
You can make a vector of cut points with regular intervals like so:
cutpoints <- c('3-20-2013',   # spring
               '6-21-2013',   # summer
               '9-23-2013',   # fall
               '12-21-2013')  # winter
temp <- cut(mdy(someDates),
            outer(mdy(cutpoints), years(1:5), `+`),
            labels = FALSE)
someSeasons <- c('spring',
                 'summer',
                 'fall',
                 'winter')[(temp - 1) %% 4 + 1]  # the index is just a little tricky...
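For dates spanning many years, an alternative sketch in base R ignores the year entirely: it encodes month and day as a single integer and looks up the season with findInterval(). This is a different technique from the cut() approach above. It assumes the same approximate season start dates (March 20, June 21, September 23, December 21) apply every year, and the names mmdd, starts and idx are illustrative.

```r
someDates <- as.Date(c("1-1-2013", "2-14-2013", "3-5-2013", "8-21-2013",
                       "9-15-2013", "11-28-2013", "12-22-2013"),
                     format = "%m-%d-%Y")

# Encode month and day as one integer, e.g. August 21 -> 821
mmdd <- as.integer(format(someDates, "%m")) * 100L +
        as.integer(format(someDates, "%d"))

# Assumed season start dates in the same encoding
starts <- c(320, 621, 923, 1221)

# findInterval() returns 0 before March 20, then 1 to 4 for later seasons
idx <- findInterval(mmdd, starts)
someSeasons <- c("winter", "spring", "summer", "fall", "winter")[idx + 1]

# Number of dates in each season
counts <- table(factor(someSeasons,
                       levels = c("winter", "spring", "summer", "fall")))
print(counts)
```

Since the year never enters the lookup, the same code works unchanged for data covering any number of years.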