First time poster so I hope I've enough information here.
I'm trying to show my survival curves in 4 categories. The analysis is stratified according to my 4 categories in survival tables, but the survival plots do not depict these 4 categories and instead show many different survival curves. What am I doing wrong here?
Survival curve
# categorise ADAMTS13 levels
TMAdata$ADAMTS13level.f<-cut(TMAdata$ADAMTS13level,
breaks=c(0.0,10.0,40.0, 60.0,160.0),
labels=c('0-10.0',
'10.1-40.0',
'40.1-60.0',
'60.1-160.0'))
summary(TMAdata$ADAMTS13level.f)
# use 10-40% ADAMTS13 level as reference point
TMAdata$ADAMTS13level.f = relevel(TMAdata$ADAMTS13level.f, ref="10.1-40.0")
# platelet recovery according to ADAMTS13 level (reference point is 10.1-40.0)
pltrecovery_ADAMTS13_table <- survfit(Surv(TMAdata$Daysplateletrecovery, TMAdata$Recoveredplatelets)~TMAdata$ADAMTS13level.f)
summary(pltrecovery_ADAMTS13_table)
plot(pltrecovery_ADAMTS13_table, conf.int=0,
xlab = "Days",
ylab = "Probability of not achieving platelet count =>150")
legend("topright", inset=0.03,
c("0-10.0",
"10.1-40.0",
"40.1-60.0",
"60.1-160.0"),
lty=1:2,
lwd=2,
cex=1)
The extra lines are confidence boundaries. Specifying conf.int=0 does not suppress confidence interval plotting. That's arguably incorrect behaviour, but it's easy to demonstrate using the first example in ?survfit.formula. If you don't want the confidence boundaries, then leave out the conf.int parameter altogether.
The legend will only have two line types (lty=1:2 is recycled across the four labels), and they probably won't match the line types of the survival curves plotted.
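A minimal sketch of keeping the legend in step with the curves, assuming the lung dataset that ships with the survival package as a stand-in for TMAdata (the grouping variable ph.ecog and the names fit, n_curves and line_types are illustrative, not from the question). The idea is to take the legend labels from the fitted object itself and pass the same lty vector to both plot() and legend():

```r
library(survival)

# Stand-in data: 'lung' from the survival package, grouped by ECOG score.
# This is NOT the poster's TMAdata; it only illustrates the pattern.
fit <- survfit(Surv(time, status) ~ ph.ecog, data = lung)

# One curve per stratum; build one line type per curve.
n_curves   <- length(fit$strata)
line_types <- seq_len(n_curves)

# conf.int is left out, as suggested above, so only the curves are drawn.
plot(fit, lty = line_types, lwd = 2,
     xlab = "Days", ylab = "Survival probability")

# Labels come from the fit, so their order matches the plotted curves
# even after relevel() has changed the factor's level order.
legend("topright", inset = 0.03,
       legend = names(fit$strata),
       lty = line_types, lwd = 2, cex = 1)
```

Because the labels are read from the fit rather than typed by hand, reordering the factor with relevel() cannot put the legend out of sync with the curves.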
I am trying to make an exponential plot of a variable. The coefficient of the variable in the GLM results is very high (350 million). With other variables that had lower coefficients, I was able to plot them with no issues. I have been trying to set the sequence interval smaller and smaller, but it keeps crashing R when I try to plot it.
Any suggestions? I have tried breaking up the data already with no luck.
My vectors are very large numerics as well (18Mb).
chlautcnod<-seq.int(0, 2.45259, 0.000001)
chlautcnodline<- glmnodosaALL$coefficients[1] +
glmnodosaALL$coefficients[2]*mean(bornodosaAP$Chl_spring) +
glmnodosaALL$coefficients[3]*chlautcnod + glmnodosaALL$coefficients[4]*mean(bornodosaAP$Dist_coast) +
glmnodosaALL$coefficients[5]*mean(bornodosaAP$Chl_winter)+ glmnodosaALL$coefficients[6]*mean(bornodosaAP$Depth) +
glmnodosaALL$coefficients[7]*mean(bornodosaAP$Chl_yr_avg)+ glmnodosaALL$coefficients[8]*mean(bornodosaAP$Dist_complete_river) +
glmnodosaALL$coefficients[9]*mean(bornodosaAP$Temp_yr_min)+ glmnodosaALL$coefficients[10]*mean(bornodosaAP$Chl_summer)+
glmnodosaALL$coefficients[11]*mean(bornodosaAP$Chl_yr_max)+ glmnodosaALL$coefficients[12]*mean(bornodosaAP$SWH_summer)+
glmnodosaALL$coefficients[13]*mean(bornodosaAP$SWH_yr_min)+ glmnodosaALL$coefficients[14]*mean(bornodosaAP$SWH_spring)
plot(exp(chlautcnodline)~chlautcnod, xlab = (expression(paste("Chlorophyll-α Autumn (mg/m"^"3"~")"))), ylab= "Probability of C. nodosa occurrence",ylim=c(0,0.05),xlim=c(0.15,0.17), type="l", bty="l")
I'm trying to visualize the correlation between two columns in my dataset.
I tried plot() and scatterplot(), but the result is not a readable graph.
For example I used this function:
scatter.smooth(x=Lifestyles$SLEEP_HOURS, y=Lifestyles$SUFFICIENT_INCOME, main="sleep hours and Income", xlab = "Sleep hours", ylab = "income, 1,2")
About the dataset:
I have about 12000 observations and 20 columns.
Both columns are numeric (integer).
Here I'm trying to look at the number of sleep hours against how many tasks are completed daily.
Link to my dataset: https://www.kaggle.com/ydalat/lifestyle-and-wellbeing-data
Thank you all in advance!
I have population simulation data with 200 replications of 50 one-year iterations. I want to plot all 200 trajectories as lines (y=population size, x=year) on the same plot. The following code meets this need...
baseline<-read.csv("C:\\Users\\Chelsea Mitchell\\Desktop\\Poster materials\\chinook baseline raw.csv", header=T)
plot(baseline$time.step..year[1:50],
baseline$pop.size[1:50], type="l", main="baseline model"
, xlab= "Year", ylab= "Population size", ylim= c(0,2e+08))
for (i in 2:(length(baseline$time.step..year)/50))
{lines(baseline$time.step..year[(1+(i-1)*50):(i*50)],
baseline$pop.size[(1+(i-1)*50):(i*50)])}
image of appropriate plot without extinction
But in some cases, the population goes extinct, and trajectories stop before year 50. How can I tell the for loop to stop the trajectory line before the following simulation data starts again at year 1?
image of problem plot with extinction
Here is a constructed, simple version of the data and code with the same issue. The maximum number of years is 10, so the for loop plots trajectory lines for "year" sequences of 1:10. In cases where pop.size reaches 0, the replications stop, so the trajectory plotting should also stop.
rep <- c(1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3)
year <- c(1,2,3,4,1,2,3,4,5,6,1,2,3,4,5,6,7,8,9,10)
pop.size <- c(526,120,165,0,634,637,452,130,189,0,599,436,320,245,336,225,134,37,87,0)
extinct.pop <- data.frame(rep,year,pop.size)
plot(extinct.pop$year[1:10], extinct.pop$pop.size[1:10],
type="l", xlim= c(0,10))
for (i in 2:(length(extinct.pop$year)/10)){
lines(extinct.pop$year[(1+(i-1)*10):(i*10)],
extinct.pop$pop.size[(1+(i-1)*10):(i*10)])}
Thank you for your help!
Your code is plotting just one line.
One alternative is to use ggplot2, where you won't need the loop. Do you want something like this?
library(ggplot2)
ggplot(data=extinct.pop, aes(x=year, y=pop.size, group = rep)) + geom_line()
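If you would rather stay in base graphics, a rough equivalent is to split the data by replication instead of slicing fixed blocks of rows, so each trajectory ends wherever its own data ends. This is only a sketch using the constructed example from the question; by_rep and traj are names made up here:

```r
# Constructed example data from the question.
rep      <- c(1,1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3)
year     <- c(1,2,3,4,1,2,3,4,5,6,1,2,3,4,5,6,7,8,9,10)
pop.size <- c(526,120,165,0,634,637,452,130,189,0,
              599,436,320,245,336,225,134,37,87,0)
extinct.pop <- data.frame(rep, year, pop.size)

# One data frame per replication, however many years each one lasted.
by_rep <- split(extinct.pop, extinct.pop$rep)

# Empty plot with axes wide enough for every trajectory.
plot(NA, xlim = c(0, 10), ylim = range(extinct.pop$pop.size),
     xlab = "Year", ylab = "Population size")

# Draw each replication as its own line; a line stops at extinction
# because it only contains that replication's rows.
for (traj in by_rep) {
  lines(traj$year, traj$pop.size)
}
```

The same pattern should carry over to the full dataset by splitting on whichever column identifies the replication.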
I have used forecast() on a model fitted to the first 1526 data points of my data series VIX, forecasting the final 300 data points. I want to measure the goodness of fit using the variance of the difference between the actual historical data and the forecast. Is there an easy way of doing this in R?
The code currently is
r_vix_3b=diff(log(VIX[,"Close"]))
num_train=1526
h=300
plot_start=1300
plot_labels=126 # interval between x-axis major tick marks
data_fcst_pts=num_train:(num_train+h)
fit_1step=auto.arima(r_vix_3b[1:num_train])
forecast_1step = forecast(fit_1step, h=h)
plot(forecast_1step, xaxt="n", xlim=c(plot_start, num_train+h), ylim=c(-0.3, 0.3)) #ylim=range(r_vix)
points(data_fcst_pts, r_vix_3b[data_fcst_pts],col="blue", type="l", pch=16)
axis(1, at=seq(0,length(r_vix_3b)+h-1,plot_labels), labels=VIX$Date[seq(2, length(r_vix_3b)+h,plot_labels)] )
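# forecast errors: actual values for the h forecast steps minus the point forecasts
diff_1_step = r_vix_3b[(num_train+1):(num_train+h)] - as.numeric(forecast_1step$mean)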
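Please check the accuracy() function from the forecast package (?accuracy).
I guess in your case it would be something like this, passing the actual values for the forecast period as the second argument:
acc <- accuracy(forecast_1step, r_vix_3b[(num_train+1):(num_train+h)])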
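If you specifically want the variance of the forecast errors, it can also be computed by hand. The sketch below is self-contained, so it swaps in a simulated AR(1) series and base R's arima() and predict() in place of the VIX data and auto.arima()/forecast(); y, fc, actual and errors are illustrative names. With a forecast object, the point forecasts would be in forecast_1step$mean instead of predict(...)$pred.

```r
set.seed(1)

# Simulated stand-in series (NOT the VIX data from the question).
y <- as.numeric(arima.sim(model = list(ar = 0.5), n = 200))

num_train <- 150   # points used for fitting
h         <- 50    # points held out and forecast

# Fit on the training window, then forecast h steps ahead.
fit <- arima(y[1:num_train], order = c(1, 0, 0))
fc  <- as.numeric(predict(fit, n.ahead = h)$pred)

# Hold-out values aligned with the forecast horizon.
actual <- y[(num_train + 1):(num_train + h)]

# Forecast errors and two summaries of them.
errors  <- actual - fc
err_var <- var(errors)           # variance of the forecast errors
rmse    <- sqrt(mean(errors^2))  # root mean squared error, for comparison

print(c(variance = err_var, rmse = rmse))
```

Note that the variance ignores any constant bias in the forecast, whereas RMSE (one of the measures accuracy() reports) does not, so it may be worth looking at both.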