Aptitude (time and work based question (Wipro)) - aptitude

A and B can do a piece of work in 30 days, while B and C can do the same work in 24 days and C and A in 20 days. They all work together for 10 days, after which B and C leave. How many more days will A take to finish the work?

This is a math question, consider moving it to https://math.stackexchange.com/, but here you go:
Let 1 be the total amount of work, and let A, B and C be the amounts of work that A, B and C each do per day. What we know is:
1/30=A+B
1/24=B+C
1/20=A+C
Solving for A, B and C (add the three equations to get A+B+C = 1/16, then subtract each pair sum from that) you get:
A = 1/48, B = 1/80 and C = 7/240.
Then in 10 days the work done is:
10/48+10/80+70/240=0.625, so 0.375 remains, which is 18/48.
That means A will need 18 further days to complete the work.
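The arithmetic can also be checked numerically. This is only a sketch in R; the names pairs, combined, rates, remaining and extra_days are mine, not part of the question.

```r
# Each row is one pair of workers; columns are the daily rates of A, B and C
pairs <- matrix(c(1, 1, 0,    # A + B
                  0, 1, 1,    # B + C
                  1, 0, 1),   # A + C
                nrow = 3, byrow = TRUE)
combined <- c(1/30, 1/24, 1/20)

# Solve the 3x3 linear system for the individual daily rates
rates <- solve(pairs, combined)
names(rates) <- c("A", "B", "C")

# Work left after all three work together for 10 days
remaining <- 1 - 10 * sum(rates)

# Days A needs to finish the remainder alone
extra_days <- remaining / rates[["A"]]
print(rates)
print(extra_days)
```

Here extra_days should come out as 18, matching the result above.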

Related

Can I do on-the-fly calculation on a column using data.table in r?

Hi, I am wondering if anyone could show me how to calculate a value in column C based on the previous value(s) in column C and on another column (B in the example below), and save the result as the new current value in column C.
For example, suppose I first initialize the column C to 1s and the calculation I want to implement is C(1) = 1 + B(1)*0.1*1 and C(2) = C(1) + B(2)*0.1*C(1).
test=data.table(A=1:5,B=c(1,2,1,2,1),C=1)
test
A B C
1: 1 1 1
2: 2 2 1
3: 3 1 1
4: 4 2 1
5: 5 1 1
What I want is:
test
A B C
1: 1 1 1.1
2: 2 2 1.32
3: 3 1 1.452
4: 4 2 1.7424
5: 5 1 1.91664
I could achieve what I want with for loop or apply() but I really want to know if this is doable just using data.table and get some speed up.
Edit:
As pointed out by Frank in the comments below,
test[, C := cumprod(1 + .1*B)]
will do since multiplication is distributive. What if I want to supply a more complex custom function?
Many thanks in advance!
Using the formula as presented we have:
test[, C := Reduce(function(c, b) c + .1 * b * c, B, init = 1, accumulate = TRUE)[-1] ]
Of course, as pointed out already it simplifies in this particular case since we can write the body of the function as c * ( 1 + .1 * b) which implies a cumulative product of the parenthesized portion.
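For a rule that does not simplify to a cumulative product, the same Reduce pattern still applies. A minimal sketch in base R follows; the capped rule and the names step_plain and step_capped are made up for illustration. Inside data.table the Reduce call would go on the right-hand side of := exactly as above.

```r
B <- c(1, 2, 1, 2, 1)

# The rule from the question: C(i) = C(i-1) + B(i) * 0.1 * C(i-1)
step_plain <- function(c_prev, b) c_prev + 0.1 * b * c_prev

# Hypothetical "more complex" rule: same growth, but never above 1.5
step_capped <- function(c_prev, b) min(c_prev + 0.1 * b * c_prev, 1.5)

# accumulate = TRUE keeps every intermediate value; [-1] drops the initial 1
plain  <- Reduce(step_plain,  B, init = 1, accumulate = TRUE)[-1]
capped <- Reduce(step_capped, B, init = 1, accumulate = TRUE)[-1]

print(plain)   # same values as cumprod(1 + 0.1 * B)
print(capped)
```

Note that Reduce calls the function once per element in R code, so it will be slower than a vectorised cumprod when that shortcut is available.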
It seems you need to apply the function cumulatively
library(data.table)
library(zoo)
test=data.table(A=1:5,B=c(1,2,1,2,1),C=1)
z <- function(b){1+b*0.1}
test[,C:=cumprod(rollapply(B, width=1, FUN=z))]
But I agree that there's really no need to bring zoo here. Frank's solution is more elegant and concise.
test[,C:=cumprod(1 + .1*B)]
I don't believe there is a similar data.table function, but it seems like accumulate from purrr is what you want. Simple example below, but the input could be rows of a data.table also.
library(purrr)
accumulate(1:4, function(x, y){2*x + y})
# [1] 1 4 11 26

R enumerate duplicates in a dataframe with unique value

I have a dataframe containing a set of parts and test results. The parts are tested on 3 sites (North, Centre and South). Sometimes those parts are re-tested. I want to eventually create some charts that compare the results from the first time that a part was tested with the second (or third, etc.) time that it was tested, e.g. to look at tester repeatability.
As an example, I've come up with the below code. I've explicitly removed the "Experiment" column from the morley data set, as this is the column I'm effectively trying to recreate. The code works, however it seems that there must be a more elegant way to approach this problem. Any thoughts?
Edit - I realise that the example given was overly simplistic for my actual needs (I was trying to generate a reproducible example as easily as possible).
New example:
part<-as.factor(c("A","A","A","B","B","B","A","A","A","C","C","C"))
site<-as.factor(c("N","C","S","C","N","S","N","C","S","N","S","C"))
result<-c(17,20,25,51,50,49,43,45,47,52,51,56)
data<-data.frame(part,site,result)
data$index<-1
repeat {
if(!anyDuplicated(data[,c("part","site","index")]))
{ break }
data$index<-ifelse(duplicated(data[,c("part","site","index")]),data$index+1,data$index)
}
data
part site result index
1 A N 17 1
2 A C 20 1
3 A S 25 1
4 B C 51 1
5 B N 50 1
6 B S 49 1
7 A N 43 2
8 A C 45 2
9 A S 47 2
10 C N 52 1
11 C S 51 1
12 C C 56 1
Old example:
#Generate a trial data frame from the morley dataset
df<-morley[,c(2,3)]
#Set up an iterative variable
#Create the index column and initialise to 1
df$index<-1
# Loop through the dataframe looking for duplicate pairs of
# Runs and Indices and increment the index if it's a duplicate
repeat {
if(!anyDuplicated(df[,c(1,3)]))
{ break }
df$index<-ifelse(duplicated(df[,c(1,3)]),df$index+1,df$index)
}
# Check - The below vector should all be true
df$index==morley$Expt
We may use diff and cumsum on the 'Run' column to get the expected output. This method does not create a column of 1s ('index'), and it assumes that the sequence in 'Run' is ordered as shown in the OP's example.
indx <- cumsum(c(TRUE,diff(df$Run)<0))
identical(indx, morley$Expt)
#[1] TRUE
Or we can use ave
indx2 <- with(df, ave(Run, Run, FUN=seq_along))
identical(indx2, morley$Expt)
#[1] TRUE
Update
Using the new example
with(data, ave(seq_along(part), part, site, FUN=seq_along))
#[1] 1 1 1 1 1 1 2 2 2 1 1 1
Or we can use getanID from library(splitstackshape)
library(splitstackshape)
getanID(data, c('part', 'site'))[]
I think this is a job for make.unique, with some manipulation.
index <- 1L + as.integer(sub("\\d+(\\.)?","",make.unique(as.character(morley$Run))))
index <- ifelse(is.na(index),1L,index)
identical(index,morley$Expt)
[1] TRUE
Details of your actual data.frame may matter. However, a couple of options working with your example:
#this works if each group starts with 1:
df$index<-cumsum(df$Run==1)
#this is maybe more general, with data.table
require(data.table)
dt<-as.data.table(df)
dt[,index:=seq_along(Speed),by=Run]
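To see how the grouped seq_along idea behaves when a part is tested more than twice, here is a small sketch in base R. The retests data frame is invented for illustration and is not the asker's data.

```r
# Hypothetical data: part A is tested three times at site N, part B twice
retests <- data.frame(
  part   = c("A", "A", "B", "B", "A", "B", "A"),
  site   = c("N", "S", "N", "S", "N", "N", "N"),
  result = c(17, 25, 50, 49, 43, 51, 44)
)

# Number the rows within each part/site combination in order of appearance
retests$index <- with(retests,
                      ave(seq_along(part), part, site, FUN = seq_along))

print(retests)
```

Rows 1, 5 and 7 (part A, site N) get index 1, 2 and 3, so the third test is numbered without any loop.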

Counting repetition in r

I want to count the number of specific repetitions in my dataframe. Here is a reproducible example
df <- data.frame(Date= c('5/5', '5/5', '5/5', '5/6', '5/7'),
First = c('a','b','c','a','c'),
Second = c('A','B','C','D','A'),
Third = c('q','w','e','w','q'),
Fourth = c('z','x','c','v','z'))
This gives:
Date First Second Third Fourth
1 5/5 a A q z
2 5/5 b B w x
3 5/5 c C e c
4 5/6 a D w v
5 5/7 c A q z
I read a big file that holds 400,000 instances and I want to know different statistics about specific attributes. For example, here I'd like to know how many times a happens on 5/5. I tried using sum(df$Date == '5/5' & df$First == 'a', na.rm=TRUE), which gave me the right result here (1), but when I run it on the big data set, the numbers are not accurate.
Any idea why?
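The question does not say what is different about the big file, so the following is only a guess at common causes: stray whitespace, inconsistent case, or missing values can all make an exact == comparison miss rows. The messy data frame below is invented to show the effect.

```r
# Hypothetical messy input: a trailing space, an upper-case letter, and an NA
messy <- data.frame(
  Date  = c("5/5", "5/5 ", "5/5", "5/6", NA),
  First = c("a", "a", "A", "a", "a"),
  stringsAsFactors = FALSE
)

# Exact comparison only matches the first row
naive <- sum(messy$Date == "5/5" & messy$First == "a", na.rm = TRUE)

# Normalise before comparing: strip whitespace and lower-case the letters
date_clean  <- trimws(messy$Date)
first_clean <- tolower(trimws(messy$First))
cleaned <- sum(date_clean == "5/5" & first_clean == "a", na.rm = TRUE)

# A cross-tabulation shows every combination at once, including NAs
counts <- table(date_clean, first_clean, useNA = "ifany")

print(naive)
print(cleaned)
print(counts)
```

If the cross-tabulation of the raw columns shows near-duplicate labels such as "5/5" and "5/5 ", that would explain counts that look too low.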

Aggregate with trimmed means in R

I am trying to aggregate data like this in R:
df = data.frame(c("a","a","a","a","a","b","b","b","b","b","c","c","c"))
colnames(df) = "f"
set.seed(10)
df$e = rnorm(13,20,5)
f e
1 a 20.09373
2 a 19.07874
3 a 13.14335
4 a 17.00416
5 a 21.47273
6 b 21.94897
7 b 13.95962
8 b 18.18162
9 b 11.86664
10 b 18.71761
11 c 25.50890
12 c 23.77891
13 c 18.80883
Which I would like to aggregate by the column f and have a trimmed mean of e for each unique f type (i.e. produce 3 rows of data).
I tried:
df2=data.frame(0)
df2=aggregate(df$e, by = "f",mean(df$e, trim=0.1))
got the following error:
Error in match.fun(FUN) :
'mean(df$e, trim = 0.1)' is not a function, character or symbol
I tried a few searches online and came up empty. My actual data has around 30 values of e per f, so I am not concerned that trim=0.1 doesn't actually trim anything in this example (trim=0.1 drops 10% of the observations from each end, which rounds down to zero for groups this small); it will with the real data. This is just to get the aggregate function working as intended. Thanks!
Try this
df2=aggregate(e~f,data=df,mean,trim=0.1)
f e
1 a 18.15854
2 b 16.93489
3 c 22.69888
The function to apply can be given just by its name (here, mean), and any additional arguments it needs (here, trim=0.1) are passed after it, separated by commas.
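As a sketch of the two equivalent ways to hand the trimmed mean to aggregate, using the data from the question (the names by_name and by_wrapper are mine):

```r
set.seed(10)
df <- data.frame(f = rep(c("a", "b", "c"), times = c(5, 5, 3)),
                 e = rnorm(13, 20, 5))

# Extra arguments given after FUN are forwarded to it
by_name <- aggregate(e ~ f, data = df, FUN = mean, trim = 0.1)

# Equivalent: wrap the call in an anonymous function
by_wrapper <- aggregate(e ~ f, data = df,
                        FUN = function(x) mean(x, trim = 0.1))

print(by_name)
print(by_wrapper)
```

The wrapper form is handy when the summary needs more than one step, for example dropping NAs before trimming. The original attempt failed because mean(df$e, trim=0.1) is an already evaluated number, not a function.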

help with rle command

I'm having some trouble with an rle command that is designed to find the point at which participants reach 8 contiguous ones in a row.
For example, if:
x <- c(0,1,0,1,1,1,1,1,1,1,1,1)
I want to return a value of 11.
Thanks to DWin, I've been using this piece of code:
which( rle(x)$values==1 & rle(x)$lengths >= 8)
sum(rle(x)$lengths[ 1:(min(which(rle(x)$lengths >= 8))-1) ]) + 8
I've been using this code successfully to process my data. However, i noticed that it made a mistake when processing one of my data files.
For example, if
x <- c(1,1,1,1,0,0,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,1,1,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
the code returns 19, which is the point at which eight contiguous zeros in a row are reached. I'm not sure what is going wrong or how to fix it.
thanks in advance for your help.
Will
You need to paste the first line of code in its entirety into the second:
sum(rle(x)$lengths[ 1:(min(which( rle(x)$values==1 & rle(x)$lengths >= 8))-1) ]) + 8
[1] 39
However, here is another approach, using the function filter. This yields the same result in what I consider to be much more readable code:
which(filter(x, rep(1/8, 8), sides=1) == 1)[1]
[1] 39
The filter function when used in this way essentially computes a moving average over a block of 8 values in the vector. I then return the position of the first value where the moving average equals 1.
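To make the moving-window idea concrete, here is a sketch on the short vector from the question. It uses a window sum rather than an average so the comparison is against a whole number, and it calls stats::filter explicitly in case another package masks filter. The names window_sum and first_hit are mine.

```r
x <- c(0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1)

# Sum over a trailing window of 8 values; sides = 1 looks backwards only,
# so the first 7 positions are NA
window_sum <- stats::filter(x, rep(1, 8), sides = 1)

# First position where all 8 values in the window are ones
first_hit <- which(window_sum == 8)[1]

print(as.numeric(window_sum))
print(first_hit)
```

For this vector first_hit is 11, the value asked for in the question. If no window of eight ones exists, which() returns an empty vector and first_hit is NA.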
In the basic programming course I teach, I advise students to give proper names to subresults, and to inspect these subresults:
lengthOfrepeatsOfAnything<-rle(x)$lengths
#4 2 5 11 2 2 3 2 17
whichRepeatsAreOfOnes<-rle(x)$values==1
#TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE TRUE (i.e. runs 1 3 5 7 9 are runs of ones)
repeatsOfOnesLength<-lengthOfrepeatsOfAnything * whichRepeatsAreOfOnes #TRUE = 1, FALSE=0
#4 0 5 0 2 0 3 0 17
whichRepeatOfOneAreLongerThanEight<-which(repeatsOfOnesLength >= 8)
#9
result<-NA
if(length(whichRepeatOfOneAreLongerThanEight)>0){
  firstRepeatOfOneAreLongerThanEight<-whichRepeatOfOneAreLongerThanEight[1]
  #9
  if(firstRepeatOfOneAreLongerThanEight==1){
    result<-8
  } else {
    repeatsBeforeFirstEightOnes<-1:(firstRepeatOfOneAreLongerThanEight-1)
    #1 2 3 4 5 6 7 8
    lengthsOfRepeatsBeforeFirstEightOnes<-lengthOfrepeatsOfAnything[repeatsBeforeFirstEightOnes]
    #4 2 5 11 2 2 3 2
    result<-sum(lengthsOfRepeatsBeforeFirstEightOnes) + 8
  }
}
I know it doesn't look as dandy as a one-line solution, but it helps to make things clear and to pick up errors... Besides: what if you look back at this code in 4 months? Which one will be easier to understand again?
My advice would be to break the code up into simpler pieces. As suggested by @Nick, you want to write code which can be easily debugged, and modular coding allows you to do that.
# find runs of 0s and 1s
run_01 = rle(x)
# find runs of 1s with length >= 8 and keep the first one
run_1 = with(run_01, which(values == 1 & lengths >= 8))[1]
# total length of all runs before run_1 (seq_len also handles run_1 == 1)
start_pos = sum(run_01$lengths[seq_len(run_1 - 1)])
# add 8 to get the position at which the eighth 1 is reached
end_pos = start_pos + 8
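The same steps can be wrapped in a small function that also covers the edge cases (no qualifying run, or a qualifying run right at the start). This is a sketch only; first_run_end and its argument n are names I made up.

```r
# Hypothetical helper: position at which the first run of n ones is completed
first_run_end <- function(x, n = 8) {
  runs <- rle(x)
  hits <- which(runs$values == 1 & runs$lengths >= n)
  if (length(hits) == 0) {
    return(NA_integer_)   # no run of ones is long enough
  }
  # seq_len(0) is empty, so a qualifying first run contributes 0 here
  before <- sum(runs$lengths[seq_len(hits[1] - 1)])
  as.integer(before + n)
}

short_x <- c(0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1)
long_x  <- c(1,1,1,1,0,0,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,
             0,0,1,1,1,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)

print(first_run_end(short_x))     # short example from the question
print(first_run_end(long_x))      # the vector that returned 19 before
print(first_run_end(rep(1, 10)))  # qualifying run at the very start
print(first_run_end(c(0, 1, 0)))  # no qualifying run
```

On the two vectors from the question this gives 11 and 39, in line with the answers above.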
