Calculate churn in R given data variables [closed] - r

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
I'm trying to calculate the number of retained students using R. The two variables I'm working with are 'registration_date' (mm/dd/yy) and 'date_of_last_login' (mm/dd/yy). A student is considered retained if they logged in during the preceding 30 days.
ID  registration_date  date_of_last_login
1   2/1/15             2/3/15
2   2/1/15             3/15/15
3   3/15/15            4/30/15
4   2/10/15            4/25/15
5   4/15/15            5/16/15
I imagine the idea is to create a new variable: 'retained students' but I am not sure how to set up the formula in R.

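Assuming you mean the 30 days before today:
# Parse the last-login strings (mm/dd/yy) into Date objects
last_login <- c("2/3/15", "3/15/15", "4/30/15", "4/25/15", "5/16/15")
login <- as.Date(last_login, format = "%m/%d/%y")
# TRUE where the last login falls within the past 30 days
retained_students <- (Sys.Date() - login) < 30
retained_students
retained_students is then a logical vector with TRUE or FALSE for each student, and sum(retained_students) gives the number of retained students.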
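If retention should be measured from a fixed reporting date rather than from today, a minimal sketch using the data from the question might look like this. The names students and as_of, and the reference date itself, are assumptions made for illustration only:

```r
# Data from the question, stored as a data frame
students <- data.frame(
  ID = 1:5,
  registration_date = c("2/1/15", "2/1/15", "3/15/15", "2/10/15", "4/15/15"),
  date_of_last_login = c("2/3/15", "3/15/15", "4/30/15", "4/25/15", "5/16/15")
)

# Hypothetical reporting date; replace with Sys.Date() to measure from today
as_of <- as.Date("2015-05-20")

# Convert the mm/dd/yy strings to Date objects
last_login <- as.Date(students$date_of_last_login, format = "%m/%d/%y")

# A student counts as retained if the last login was fewer than 30 days before as_of
students$retained <- as.numeric(as_of - last_login) < 30

# Number of retained students
n_retained <- sum(students$retained)
n_retained
```

Fixing the reference date makes the result reproducible, whereas Sys.Date() gives a different answer depending on the day the code is run.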

Related

How to find the percentage of observations in each day in R? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 17 days ago.
I have a dataset with a time column, where the observations are 6 hours apart, and a heaterstatus column.
The dataset:
I would like to know the percentage of zeros in heaterstatus for each day. I'm new to R, so any suggestion would be helpful.
Not tested since you only provided data as an image, but this should do what you want:
library(dplyr)
dat %>%
  group_by(day = as.Date(Time)) %>%
  # mean() of a logical vector is the fraction of zeros; multiply by 100 for a percentage
  summarize(pct_0 = 100 * mean(HeaterStatus == 0))
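Since the original data was only posted as an image, here is a small base R sketch of the same idea on made-up data. The data frame below is hypothetical, and the column names Time and HeaterStatus are assumed to match the real dataset:

```r
# Hypothetical sample: one reading every 6 hours over two days
dat <- data.frame(
  Time = seq(as.POSIXct("2021-01-01 00:00:00", tz = "UTC"),
             by = "6 hours", length.out = 8),
  HeaterStatus = c(0, 1, 0, 0, 1, 1, 1, 0)
)

# Calendar day of each reading
day <- as.Date(dat$Time, tz = "UTC")

# For each day, the share of readings equal to zero, as a percentage
pct_0 <- tapply(dat$HeaterStatus == 0, day, function(x) 100 * mean(x))
pct_0
```

Here the first day has three zeros out of four readings and the second day has one out of four.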

Question about common values across columns [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
I have a set of 20 columns, each containing numeric values. I would like a function in Excel, R, or elsewhere to extract the values shared by all the columns.
Several of the online Venn tools can visualize and list shared values for up to 6 columns.
Is there any tool that handles more?
Thanks
In R, we can use intersect with Reduce to get the values common to all the columns:
Reduce(intersect, dftest)
data
dftest <- data.frame(col1 = 1:5, col2 = 2:6, col3 = 3:7)
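As a quick check, here is the same call run end to end on the sample data. Reduce applies intersect pairwise from left to right, so the result contains only the values present in every column. The variable name common is just for illustration:

```r
# Sample data from the answer: three overlapping integer columns
dftest <- data.frame(col1 = 1:5, col2 = 2:6, col3 = 3:7)

# intersect(intersect(col1, col2), col3)
common <- Reduce(intersect, dftest)
common
# [1] 3 4 5
```

The same call should work for 20 columns, since a data frame is a list of columns and Reduce simply folds over them.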

Changing Horizontal Label axis in curve fitting in R [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 5 years ago.
I have plotted sales orders against time. The graph is as follows:
I want to replace the number of days on the horizontal axis with dates, that is:
0 with 22/09/2016
50 with 11/11/2016
100 with 31/12/2016
150 with 19/02/2017
200 with 10/04/2017
I don't know how to proceed.
You can use the xts package, which handles daily time series:
library(xts)
v <- 0:200 # your data here
d <- seq.Date(as.Date('2016-09-22'), as.Date('2017-04-10'), length.out = 201)
plot(xts(v, order.by = as.POSIXct(d)))
Here is a great cheat sheet:
https://www.datacamp.com/community/blog/r-xts-cheat-sheet#gs.1XjXRyI
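If you would rather stay in base R, one possible sketch is to suppress the default x axis and draw your own with date labels. The values in v are placeholder data, and the start date is taken from the question (day 0 = 22/09/2016):

```r
v <- 0:200                      # placeholder for the sales order values
days <- seq_along(v) - 1        # day offsets 0, 1, ..., 200

start <- as.Date("2016-09-22")  # date corresponding to day 0
ticks <- c(0, 50, 100, 150, 200)

# Convert the day offsets at the tick positions into dd/mm/yyyy labels
labels <- format(start + ticks, "%d/%m/%Y")

# Draw the plot without an x axis, then add the axis with date labels
plot(days, v, type = "l", xaxt = "n", xlab = "Date", ylab = "Sales orders")
axis(1, at = ticks, labels = labels)
```

Adding the offsets to the start date reproduces the five dates listed in the question, so the tick positions stay where they were and only the labels change.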

R: how to calculate if any of column in each row satisfies a condition without loop [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
dat = runif(10,1,10)
dat2 = runif(10,1,10)
dat3 = runif(10,1,10)
data = rbind(dat,dat2,dat3)
For the data above, I am wondering how I can filter out rows that contain at least one element exceeding 5.
I know that I can use a loop to achieve this, but I am wondering if there is a more succinct way to do it.
Try this:
data[apply(data > 5, 1, sum) > 0, ]
This keeps the rows in which at least one element exceeds 5.
To drop those rows instead, keeping only the rows in which every element is at most 5:
data[do.call(pmax, data.frame(data)) <= 5, ]
Cheers,
Bert
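For a deterministic illustration of both readings of "filter out", here is a small sketch that uses rowSums instead of apply. The matrix m and the variable names are made up for the example:

```r
# Small fixed matrix: row a has no element above 5, rows b and c do
m <- rbind(a = c(1, 2, 3),
           b = c(4, 6, 2),
           c = c(7, 8, 9))

# TRUE for rows containing at least one element greater than 5
has_large <- rowSums(m > 5) > 0

# Keep rows with a large element (rows b and c)
kept <- m[has_large, , drop = FALSE]

# Or drop them, keeping only rows where every element is at most 5 (row a)
dropped <- m[!has_large, , drop = FALSE]
```

drop = FALSE keeps the result a matrix even when only one row is selected, which avoids surprises in later code that expects two dimensions.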

ttest on many columns in Matlab/R [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
Does anybody know of examples showing how to run a paired t-test in Matlab/R/SAS or Python/Java on many columns (I have 1139 variables), either for all combinations or for selected pairs of columns in a loop?
Thank you
MATLAB Solution:
If I understand correctly, you're just looking for a way to feed ttest with a different pair of columns from your input matrix every time. You can get all possible combinations of column pairs using nchoosek:
pairs = nchoosek(1:size(X, 2), 2);
Now you can iterate over these indices, each time invoking ttest with a different pair:
% Each column of transpose(pairs) holds the two column indices of one pair
for idx = transpose(pairs)
    h = ttest(X(:, idx(1)), X(:, idx(2)));
    %// Do something with the result...
end
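Since the question also mentions R, a rough equivalent there could use combn to enumerate the column pairs and t.test with paired = TRUE. The matrix X below is random placeholder data, and the names p_values and results are only illustrative:

```r
set.seed(1)
X <- matrix(rnorm(40), nrow = 10, ncol = 4)  # placeholder data: 10 rows, 4 variables

# All pairs of column indices, one pair per column of the result
pairs <- combn(ncol(X), 2)

# Paired t-test for each pair, keeping only the p-value
p_values <- apply(pairs, 2, function(idx) {
  t.test(X[, idx[1]], X[, idx[2]], paired = TRUE)$p.value
})

# One row per tested pair
results <- data.frame(col_a = pairs[1, ], col_b = pairs[2, ], p_value = p_values)
results
```

With 1139 variables there are 648,091 pairs, so this will take a while, and you will probably want to correct for multiple comparisons, for example with p.adjust(results$p_value).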
