how to find 80% of distribution [closed] - r

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
I have a problem.
I have a data frame byDays, which consists of two columns: day and money.
day is a sequence from 0 to 100, and money is the amount of money our customers spent on that day.
I plotted the distribution, but I can't link the image because I don't have enough reputation.
I need to find the day to the left of which lies 80% of the area of my distribution.

If you want the point at which 80% of the total is reached this will give you the answer:
set.seed(1)
day <- 1:100
profit <- runif(100, 0, 15)
## Point at which 80% of the total is reached:
pct <- max(day[cumsum(profit) / sum(profit) <= 0.8])
plot(day, cumsum(profit)/sum(profit))
abline(v=pct, col="red")
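The answer above returns the last day on which the cumulative share is still at or below 80%. If the first day on which the share reaches 80% is wanted instead, a minimal sketch might look like the following. It assumes a data frame named byDays with columns day and money, as described in the question; the values are simulated because the real data was not posted.

```r
set.seed(1)
# Simulated stand-in for the asker's data (the real values were not posted)
byDays <- data.frame(day = 0:100, money = runif(101, 0, 15))
byDays <- byDays[order(byDays$day), ]

# Cumulative share of the total money up to and including each day
share <- cumsum(byDays$money) / sum(byDays$money)

# First day on which at least 80% of the total has been reached
day80 <- byDays$day[which(share >= 0.8)[1]]
day80
```

Whether the "last day below" or the "first day at or above" is the right answer depends on how the threshold should be read, so it may be worth checking both.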

Related

How to find the percentage of observations in each day in R? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 17 days ago.
I have a dataset with a time column, recorded at 6-hour intervals, and a heaterstatus column.
The dataset:
I would like to know the percentage of zeros in heaterstatus for each day. I'm new to R, so any suggestion would be helpful.
Not tested since you only provided data as an image, but this should do what you want:
library(dplyr)
dat %>%
  group_by(day = as.Date(Time)) %>%
  # mean() of a logical gives the proportion of zeros; multiply by 100 for a percentage
  summarize(pct_0 = mean(HeaterStatus == 0))
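Since the data was only posted as an image, here is a small base R sketch of the same idea on made-up readings. The column names Time and HeaterStatus follow the answer above; the values and dates are hypothetical.

```r
# Hypothetical data: one reading every 6 hours over two days
dat <- data.frame(
  Time = as.POSIXct("2021-01-01 00:00:00", tz = "UTC") + 6 * 3600 * 0:7,
  HeaterStatus = c(0, 1, 0, 0, 1, 1, 1, 0)
)

# Calendar day of each reading, as text so it can be used as a grouping key
dat$day <- format(dat$Time, "%Y-%m-%d")

# mean() of a logical vector is the proportion of TRUE values;
# multiplying by 100 turns it into a percentage
pct_0 <- tapply(dat$HeaterStatus == 0, dat$day, mean) * 100
pct_0
```

With these made-up readings the first day has three zeros out of four readings and the second day has one out of four.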

calculating timeframe accuracy in R [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 2 years ago.
I am trying to measure prediction accuracy for time windows related to hospital discharge.
For example, I predict that Mr. Smith will be discharged within 3-7 days, which means a discharge on any day from 11/9 to 11/13 would count as correct. If he is discharged in 2 days, I would say I was 1 day off, and if he is discharged in 10 days, I was 3 days off...
Is there a good method to do this using dplyr, base R, and lubridate? TIA. Sample data is at the link:
Sample data
A possible solution would be to express your need in a case_when.
library(dplyr)
df %>%
  dplyr::mutate(DIF = case_when(
    discharge_calender_date < discharge_prediction_lower_bound ~ discharge_calender_date - discharge_prediction_lower_bound,
    discharge_calender_date <= discharge_prediction_upper_bound ~ 0,
    TRUE ~ discharge_calender_date - discharge_prediction_upper_bound))
This way you get a negative value if the patient left before the lower bound, zero if he left within the prediction and a positive result if he left after the prediction.
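The same signed difference can be sketched in base R. The rows below are hypothetical (the linked sample data is not reproduced here, and the year is assumed); the column names follow the answer above. The date differences are converted to plain numbers of days so that every branch returns the same type.

```r
# Hypothetical rows: predicted window 11/9 to 11/13, three different outcomes
df <- data.frame(
  discharge_calender_date          = as.Date(c("2021-11-08", "2021-11-11", "2021-11-16")),
  discharge_prediction_lower_bound = as.Date(rep("2021-11-09", 3)),
  discharge_prediction_upper_bound = as.Date(rep("2021-11-13", 3))
)

actual <- df$discharge_calender_date
lower  <- df$discharge_prediction_lower_bound
upper  <- df$discharge_prediction_upper_bound

early <- as.numeric(actual - lower)  # negative when discharged before the window
late  <- as.numeric(actual - upper)  # positive when discharged after the window

# Negative = days too early, 0 = inside the window, positive = days too late
df$DIF <- ifelse(actual < lower, early, ifelse(actual > upper, late, 0))
df$DIF
```

Taking abs(df$DIF) gives the "days off" figure from the question regardless of direction.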

Changing Horizontal Label axis in curve fitting in R [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 5 years ago.
I have tried to plot sales orders against time. The graph is as follows:
I want to replace the number of days on the horizontal axis with dates,
which means replacing 0 with 22/09/2016
50 with 11/11/2016
100 with 31/12/2016
150 with 19/02/2017
200 with 10/04/2017
I don't know how to proceed.
You will want to make use of the xts package, which can handle daily values:
library(xts)
v <- 0:200 # your data here
d <- seq.Date(as.Date('2016-09-22'), as.Date('2017-04-10'), length.out = 201)
plot(xts(v, order.by = as.POSIXct(d)))
Here is a great cheat sheet:
https://www.datacamp.com/community/blog/r-xts-cheat-sheet#gs.1XjXRyI
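If adding a package is not desirable, the tick labels can also be replaced in base graphics. The dates in the question are exactly 0, 50, 100, 150 and 200 days after 22/09/2016, so the labels can be computed rather than typed in. This is a sketch only: the sales series below is a placeholder, not the asker's data.

```r
start <- as.Date("2016-09-22")   # date corresponding to day 0 in the question
days  <- 0:200
sales <- cumsum(rep(1, 201))     # placeholder series, not the asker's data

# Tick positions in days and the matching date labels
ticks  <- seq(0, 200, by = 50)
labels <- format(start + ticks, "%d/%m/%Y")

# Suppress the default x axis, then draw one with date labels
plot(days, sales, type = "l", xaxt = "n", xlab = "Date", ylab = "Sales orders")
axis(1, at = ticks, labels = labels)
```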

histogram of letter grades [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
I am trying to make a histogram of grades. Here are my variables.
> grade <- factor(c("A","A","A","B","A","A","A","A","B","A","C","B","B","B"))
> numberBook <- c(53,42,40,40,39,34,34,30,28,24,22,21,20,16)
But when I plot it, I get an error message.
> hist(numberBook~grade)
Error in hist.default(numberBook ~ grade) : 'x' must be numeric
What can I do?
I'm not sure why each grade appears several times, so I've guessed that you want the total of numberBook for all the As, Bs and Cs. This may not be quite right. I've recreated your data using rep, summing the counts for each grade (could be wrong):
data <- c(rep("A", 53+42+40+39+34+34+30+24), rep("B", 40+28+21+20+16), rep("C", 22))
Then I can plot the data using barplot:
barplot(prop.table(table(data)))
Barplot is probably what you want here.
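The error in the question arises because base R's hist() has no formula method, so it expects a numeric vector. A sketch that works directly from the question's two variables, without retyping the sums, could use tapply for the per-grade totals; boxplot, which does accept a formula, is shown as an alternative if the goal is to compare numberBook across grades.

```r
grade <- factor(c("A","A","A","B","A","A","A","A","B","A","C","B","B","B"))
numberBook <- c(53,42,40,40,39,34,34,30,28,24,22,21,20,16)

# Total of numberBook within each grade
totals <- tapply(numberBook, grade, sum)
totals

# Bar chart of the totals per grade
barplot(totals, xlab = "Grade", ylab = "Total numberBook")

# Alternatively, compare the spread of numberBook by grade
boxplot(numberBook ~ grade)
```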

Calculate churn in R given data variables [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I'm trying to calculate the number of retained students using R. The two variables I'm working with are 'registration_date' (mm/dd/yr) and 'date_of_last_login' (mm/dd/yr). A student is considered retained if they logged-in in the preceding 30 days.
ID                  1       2        3        4        5
registration_date   2/1/15  2/1/15   3/15/15  2/10/15  4/15/15
date_of_last_login  2/3/15  3/15/15  4/30/15  4/25/15  5/16/15
I imagine the idea is to create a new variable: 'retained students' but I am not sure how to set up the formula in R.
Assuming you mean the 30 days previous to today:
last_login <- c("2/3/15","3/15/15","4/30/15")
login <- as.Date(last_login, format = '%m/%d/%y')
retained_students <- (Sys.Date()-login < 30)
retained_students
retained_students is then a vector with either TRUE or FALSE for each login
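Because Sys.Date() changes every day, the result above is not reproducible. A sketch with a fixed reference date, using all five students from the question, might look like this. The reference date as_of is a hypothetical choice for illustration, and the data frame name students is likewise an assumption.

```r
students <- data.frame(
  ID = 1:5,
  date_of_last_login = c("2/3/15", "3/15/15", "4/30/15", "4/25/15", "5/16/15")
)

# Parse the mm/dd/yy strings into Date objects
login <- as.Date(students$date_of_last_login, format = "%m/%d/%y")

as_of <- as.Date("2015-05-20")  # hypothetical reference date

# Retained = last login fewer than 30 days before the reference date
students$retained <- as.numeric(as_of - login) < 30
students

# Number of retained students
sum(students$retained)
```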
