R and apply info - r

I couldn't find any answers to this. I have the following code and am trying to rewrite it with apply so that it runs faster; my data set is 130k rows long. I need an apply call that calculates the missing finish times of the horses from Behind (in lengths) and the winning horse's time. The problem is that the column Behind gives the distance behind the horse immediately ahead, not behind the winner. So I need a variable that carries over from row to row and resets whenever a new race is identified, that is, when position == 1.
missingTimes <- function(x) {
  L <- 2.4384
  for(i in 1:nrow(x) - 10) {
    distanceL <- (x$distance[i] * 1000) / L
    LperS <- x$Winner.Race.time[i] / distanceL
    if(x$position[i] == 1 && !is.na(x$position[i])) {
      distanceL <- NULL
      LperS <- NULL
    }
    if(grepl("L",x$Behind[i])) {
      x$results[i] <- (distanceL + as.numeric(sub("L", "", x$Behind[i]))) * LperS
    }
  }
}
I need at least 10 reputation to post images, which is why I'm giving you links instead:
http://i.stack.imgur.com/xN23M.png
http://i.stack.imgur.com/Cspfr.png
The result should be a single column with the finish times of the other horses, in the same form as the column Winner Race Time.
To make this clearer, I'll work through a few results by hand:
Starting with first row, it sees position = 1, so it cleans the variables.
Then it takes the distance * 1000, and divides it by the constant L,
2.375 * 1000 / 2.4384 = 973.99
Then it needs the time in seconds it takes to cover 1 length (L),
290.9 / 973.99 = 0.298
Now, to get the finish time for the second horse, it adds the lengths BEHIND to the length of the track and multiplies the sum by the seconds per length,
(973.99 + 2.25) * 0.298 = 976.24 * 0.298 = 290.91952
Then for the next horse's time it would be:
(976.24 + 13) * 0.298 = 989.24 * 0.298 = 294.79352
and so on. Remember that when it hits position = 1, the distance needs to reset.
As an alternative, I have put distanceL in a separate column after calculating it, and the same with LperS.
If you could walk me through the steps required to get that done, it would be great. I'm a complete rookie with R, so please be descriptive. I hope you can follow what I mean!
Thank you!
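A minimal vectorised sketch of the calculation described above, not a definitive solution. The sample rows are made up for illustration; the column names follow the question. It assumes that every race starts with a row where position == 1 and that any Behind entry that is not of the form "2.25L" counts as zero lengths. Note that the hand calculation rounds the seconds-per-length factor to 0.298, so the unrounded results here differ slightly from those figures.

```r
# Hypothetical sample data; column names follow the question
races <- data.frame(
  position = c(1, 2, 3, 1, 2),
  distance = c(2.375, 2.375, 2.375, 1.6, 1.6),
  Winner.Race.time = c(290.9, 290.9, 290.9, 200, 200),
  Behind = c("", "2.25L", "13L", "", "1L"),
  stringsAsFactors = FALSE
)

length_m <- 2.4384  # one length in metres, as in the question

# A new race starts at every row where position == 1
race_id <- cumsum(!is.na(races$position) & races$position == 1)

# Lengths behind the horse ahead; unparseable entries count as 0 (assumption)
behind <- suppressWarnings(as.numeric(sub("L", "", races$Behind, fixed = TRUE)))
behind[is.na(behind)] <- 0

# Running total within each race gives the lengths behind the winner
cum_behind <- ave(behind, race_id, FUN = cumsum)

distance_l <- races$distance * 1000 / length_m          # track length in lengths
sec_per_length <- races$Winner.Race.time / distance_l   # seconds per length
races$results <- (distance_l + cum_behind) * sec_per_length
```

Because cumsum() and ave() work on whole columns, no explicit loop or apply call is needed, which should matter on 130k rows.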

Related

I want to calculate the time difference between two times

I want to calculate the difference of two columns of a data frame containing times. Since the value from the same column is not always the bigger/later one, I have to use a workaround with an if-clause:
counter = 1
while(counter <= nrow(data)){
  if(data$time_end[counter] - data$time_begin[counter] < 0){
    data$chargingDuration[counter] = 1-abs(data$time_end[counter]-data$time_begin[counter])
  }
  if(data$time_end[counter] - data$time_begin[counter] > 0){
    data$chargingDuration[counter] = data$time_end[counter]-data$time_begin[counter]
  }
  counter = counter + 1
}
The output I get is a decimal value smaller than 1 (e.g. 0,53322, meaning roughly half a day). However, if I use my console and calculate the time difference manually for a single line, I get my desired result, which looks like 02:12:03.
Thanks for the help guys :)
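A possible loop-free sketch, assuming the two columns hold times as fractions of a day (which the decimal output suggests). The sample values are made up. The modulo operator wraps negative differences past midnight in the same way as the 1 - abs(...) branch; unlike the loop, identical begin and end times give a duration of 0 rather than being skipped.

```r
# Hypothetical sample data: times stored as fractions of a day
data <- data.frame(time_begin = c(0.90, 0.25),
                   time_end   = c(0.10, 0.75))

# Wrap negative differences around midnight
data$chargingDuration <- (data$time_end - data$time_begin) %% 1

# Convert the day fraction to whole seconds, then format as HH:MM:SS
secs <- as.integer(round(data$chargingDuration * 86400))
formatted <- sprintf("%02d:%02d:%02d",
                     secs %/% 3600L,
                     (secs %% 3600L) %/% 60L,
                     secs %% 60L)
```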

Get out of infinite while loop

What is the best way to have a while loop recognize when it is stuck in an infinite loop in R?
Here's my situation:
diff_val = Inf
last_val = 0
while(diff_val > 0.1){
  ### calculate val from data subset that is greater than the previous iteration's val
  val = foo(subset(data, col1 > last_val))
  diff_val = abs(val - last_val) ### how much did this change val?
  last_val = val ### set last_val for the next iteration
}
The goal is to have val get progressively closer and closer to a stable value, and when val is within 0.1 of the val from the last iteration, then it is deemed sufficiently stable and is released from the while loop. My problem is that with some data sets, val gets stuck alternating back and forth between two values. For example, iterating back and forth between 27.0 and 27.7. Thus, it never stabilizes. How can I break the while loop if this occurs?
I know of break but do not know how to tell the loop when to use it. I imagine holding onto the value from two iterations before would work, but I do not know of a way to keep values two iterations ago...
while(diff_val > 0.1){
  val = foo(subset(data, col1 > last_val))
  diff_val = abs(val - last_val)
  last_val = val
  if(val == val_2_iterations_ago) break
}
How can I create val_2_iterations_ago?
Apologies for the non-reproducible code. The real foo() and data that are needed to replicate the situation are not mine to share... they aren't key to figuring out this issue with control flow, though.
I don't know if just keeping track of the previous two iterations will actually suffice, but it isn't too much trouble to add logic for this.
The logic is that at each iteration, the second to last value becomes the last value, the last value becomes the current value, and the current value is derived from foo(). Consider this code:
while (diff_val > 0.1) {
  val <- foo(subset(data, col1 > last_val))
  if (val == val_2_iterations_ago) break
  diff_val = abs(val - last_val)
  val_2_iterations_ago <- last_val
  last_val <- val
}
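If cycles longer than two values are a concern, one option is to keep every value seen so far and stop as soon as one repeats. The following is only a sketch: next_val() is a made-up stand-in for foo() that alternates between 27.0 and 27.7, and the tolerance of 1e-8 for "same value" is an assumption.

```r
# Hypothetical stand-in for foo(): alternates between 27.0 and 27.7
make_oscillator <- function() {
  n <- 0
  function() {
    n <<- n + 1
    if (n %% 2 == 1) 27.0 else 27.7
  }
}
next_val <- make_oscillator()

history  <- numeric(0)  # every value produced so far
diff_val <- Inf
last_val <- 0
cycled   <- FALSE

while (diff_val > 0.1) {
  val <- next_val()
  diff_val <- abs(val - last_val)
  # Not yet converged, but this value has been seen before: a cycle
  if (diff_val > 0.1 && any(abs(history - val) < 1e-8)) {
    cycled <- TRUE
    break
  }
  history  <- c(history, val)
  last_val <- val
}
```

Comparing with a tolerance instead of == avoids missing a repeat because of floating-point noise.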
Another approach, perhaps a little more general, would be to track your iterations and set a maximum.
Pairing this with Tim's nice answer:
iter = 0
max_iter = 1e6
while (diff_val > 0.1 && iter < max_iter) {
  val <- foo(subset(data, col1 > last_val))
  if (val == val_2_iterations_ago) break
  diff_val = abs(val - last_val)
  val_2_iterations_ago <- last_val
  last_val <- val
  iter = iter + 1
}
How this is generally done is that you have:
A convergence tolerance, so that when your objective function doesn't change appreciably, the algorithm is deemed to have converged
A limit on the number of iterations, so that the code is guaranteed to terminate eventually
A check that the objective function is actually decreasing, to catch the situation where it's diverging/cyclic (many optimisation algorithms are designed so this shouldn't happen, but in your case it does happen)
Pseudocode:
oldVal <- Inf
for(i in 1:NITERS)
{
  val <- objective(x)
  diffVal <- val - oldVal
  converged <- (diffVal <= 0 && abs(diffVal) < TOL)
  if(converged || diffVal > 0)
    break
  oldVal <- val
}
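A runnable variant of the pseudocode, as a sketch only. The objective() function, NITERS and TOL are made-up values chosen so that the loop terminates by convergence; a status string records which of the three criteria ended the loop.

```r
# Hypothetical objective: a sequence that decreases towards 1
objective <- function(i) 1 + 1 / i^2

NITERS <- 100    # iteration limit (assumed value)
TOL    <- 1e-3   # convergence tolerance (assumed value)

old_val <- Inf
status  <- "iteration limit reached"
for (i in seq_len(NITERS)) {
  val      <- objective(i)
  diff_val <- val - old_val
  if (diff_val > 0) {          # objective went up: diverging or cycling
    status <- "not decreasing"
    break
  }
  if (abs(diff_val) < TOL) {   # change is negligible: converged
    status <- "converged"
    break
  }
  old_val <- val
}
```

Checking status after the loop tells the caller whether the final val can be trusted.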

How to create an efficient for loop to resolve the rate limit issue with twitteR?

I am quite new to twitteR and to the concept of for loops. I came across this code to get followers and their profiles.
The code below works fine, although I am not entirely sure whether retryOnRateLimit should be set to such a large value.
#This extracts all or most followers.
followers<-getUser("twitter_handle_here")$getFollowerIDs(retryOnRateLimit=9999999)
This code below is the for loop to get the profiles.
However, I think there should be a way to use length(followers) and getCurRateLimitInfo() to construct the loop better.
My question is: if length(followers) = 40000 and the rate limit = 180, how do I construct the loop so that it sleeps for the right amount of time and gets all 40000 Twitter profiles?
Any help would be much appreciated.
#This is the for loop to sleep for 5 seconds.
#Problem with this is it simply sleeps for X seconds
for (follower in followers){
  Sys.sleep(5)
  followers_info<-lookupUsers(followers)
  followers_full<-twListToDF(followers_info)
}
Here is some code I wrote for a similar purpose. First you need to define this function, stall_rate_limit:
stall_rate_limit <- function(limit) {
  # Store the record of all the rate limits into rate
  rate = getCurRateLimitInfo()
  message("Checking Rate Limit")
  if(any(as.numeric(rate[,3]) == 0)) {
    # Get the locations of API Calls that are used up
    index = which(as.numeric(rate[,3]) == 0)
    # get the time till when rates limits Reset
    wait = as.POSIXct(min(rate[index,4]), ## Reset times in the 4th col
                      origin = "1970-01-01", ## Origin of Unix Time
                      tz = "US/Mountain") ## Replace with your Timezone
    message(paste("Waiting until", wait,"for Godot to reset rate limit"))
    # Tell the computer to sleep until the rates reset
    Sys.sleep(difftime(wait, Sys.time(), units = "secs"))
    # Set J = to 0
    J = 0
    # Return J as a counter
    return(J)
  } else {
    # Count was off, Try again
    J = limit - 1
    return(J)
  }
}
Then you can run your code something like this:
callsMade = 0 ## This is your counter to count how many calls were made
limit = 180 ## the Limit of how many calls you can make
for(i in 1:length(followers)){
  # Check to see if you have exceeded your limit
  if(callsMade >= limit){
    # If you have exceeded your limit, wait and set calls made to 0
    callsMade = stall_rate_limit(limit)
  }
  ### Execute your Code Here ... ###
  callsMade = callsMade + 1 # or however many calls you have made
}
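A complementary sketch for pacing the requests evenly instead of waiting for the limit to run out. It is an illustration only: the follower IDs are made up, the batch size of 100 IDs per request and the 15-minute window are assumptions, and stub_fetch() stands in for whatever wraps the real lookup call so that the example runs without network access.

```r
# Hypothetical stand-ins: 40,000 follower IDs and assumed API parameters
followers        <- as.character(seq_len(40000))
batch_size       <- 100       # assumption: IDs sent per request
calls_per_window <- 180       # rate limit quoted in the question
window_secs      <- 15 * 60   # assumption: length of one rate-limit window

# Split the IDs into consecutive batches of at most batch_size
batches <- split(followers, ceiling(seq_along(followers) / batch_size))

# Pausing this long between calls keeps the loop under the limit
pause <- window_secs / calls_per_window

fetch_all <- function(batches, fetch, pause) {
  results <- vector("list", length(batches))
  for (i in seq_along(batches)) {
    results[[i]] <- fetch(batches[[i]])
    if (i < length(batches)) Sys.sleep(pause)  # no pause after the last batch
  }
  results
}

# Dry run with a stub and no pause; swap in the real call and pause
stub_fetch <- function(ids) data.frame(id = ids, stringsAsFactors = FALSE)
profiles <- do.call(rbind, fetch_all(batches, stub_fetch, pause = 0))
```

Collecting each batch in a list and binding once at the end also avoids overwriting the result on every iteration.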

Loop will not execute in R

I have a loop that I want to execute, and it depends on the output of the previous loop in the code. This is the code:
holder <- list()
if (i < historyLength) movement <- movementType(relAngle, angleThreshold)
else if (i > historyLength-1) {
  # Array to store speeds
  speedHistory <- array(historyLength)
  n = historyLength-1
  # get the speeds from the previous n (hisoryLength) "Movements"
  for (j in seq(1, length(historyLength))){
    speedHistory [n] = R[i-j, 6]
    n-1
  }
  if (!bayesFilter(speedHistory, minSpeed, GPS_accy)) movement <- "non-moving"
  else if(bayesFilter(speedHistory, minSpeed, GPS_accy)) movement <- movementType(relAngle, angleThreshold)
}
holder [[i]] <- (movement)
for (t in seq(1, length(holder))){
  if (t == t-1)
    changes <- 0
  else if (t != t-1)
    changes <- 1
}
You cannot see the beginning of the loop, but it results in a column of data called 'movements'.
I have attempted to temporarily store the 'movements' in the object 'holder'. What I want is for the bottom for loop to go through 'holder' and label changes as either 0 or 1 in another column: if a 'movement' equals the previous record, the change is 0; otherwise it is 1. I think the problem is with the object 'holder', perhaps?
Currently it loops, but it only prints out a column of 1s.
Any help much appreciated! Thanks.
Currently get the following output:
Movement Changes
left 1
right 1
forward 1
non-moving 1
non-moving 1
I think the problem lies in the list where the movements are stored. Sorry, if I knew where the problem was I'd be more specific. I'm really new to this!
I end up with a data frame with column headers "Distance", "Speed", "Heading", "Movement" and "Changes". It loops fine, but for some reason Changes results in a column of 1s, as above. Is there an obvious mistake below?
holder[[i]] <- (movement)
for (t in seq(1, length(holder))){
  if (t == t-1)
    changes <- 0
  else if (t != t-1)
    changes <- 1
I have also tried this, but then it doesn't loop at all.
holder[[i]] <- (movement)
for (t in seq(1, length(holder))){
  if (holder[t] == holder[t-1])
    changes <- 0
  else if (holder[t] != holder[t-1])
    changes <- 1
I'm currently getting this error: Error in holder[[t - 1]] : attempt to select less than one element
for the following code:
holder <- list(movement)
for (t in length(holder)){
  if (holder[[t]] == holder[[t-1]])
    changes <- 0
  else changes <- 1
This is too long for a comment so I'm putting this as answer (actually it might answer your problem):
As I already mentioned in a comment to your previous question, you should have a look at what is seq(1, length(holder)) and so what you are doing when you put if (t == t-1) : you are doing something like "if 1==0" which cannot be TRUE.
You need to go with "the second version" of your loop (or, actually, without a loop...), which compares the right things, except that holder is a list so you need to either define it as a vector or use double brackets (holder[[t]]).
You don't need another if after else (what you are actually "saying" to R is "if A is true then do something, else, if 'opposite A' is true then do something else" but, necessarily, if A is not TRUE, then 'opposite A' is...
So something like:
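changes <- numeric(length(holder)) # the first element has no predecessor, so it stays 0
for (t in seq_along(holder)[-1]) { # start at 2 so that holder[[t-1]] exists
  if (holder[[t]] == holder[[t-1]]) changes[t] <- 0 else changes[t] <- 1
}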
Please consider spending some time on the answer to your previous question to understand why your solution didn't work and why the answer provided did. (This includes reading the documentation for the different functions and looking at the values your variables can take, e.g. by running the loop one "turn" at a time.)
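The loop-free version mentioned above could look like the following sketch. It assumes the movements are held in a character vector rather than a list; the sample values are taken from the output shown in the question.

```r
# Movements from the question's sample output, as a character vector
movement <- c("left", "right", "forward", "non-moving", "non-moving")

# Compare each element with its predecessor; the first has none, so it gets 0
changes <- c(0L, as.integer(movement[-1] != movement[-length(movement)]))

result <- data.frame(Movement = movement, Changes = changes)
```

Here changes comes out as 0, 1, 1, 1, 0: only the repeated "non-moving" row counts as unchanged.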

R - Arrays with variable dimension

I have a weird question.
Essentially, I have a function which takes a data frame of dimension Nx(2k) and transforms it into an array of dimension Nx2xk. I then use that array in various places in the function.
My issue is this: when k == 1, I'm left with a matrix of dimension Nx2, and even worse, if N = 1, I'm stuck with a matrix of dimension 1x2.
I would like to write myArray[thisRow,,] to select that slice of the array, but this fails for the N = 1, k = 1 case. I tried myArray[thisRow,,,drop = FALSE], but that gives an 'incorrect number of dimensions' error. The same issue arises for the Nx2 case.
Is there a workaround for this issue, or do I need to break my code into cases?
Sample Code Shown Below:
thisFunction <- function(myDF)
{
  nGroups = NCOL(myDF)/2
  afMyArray = myDF
  if(nGroups > 1)
  {
    afMyArray = abind(lapply(1:nGroups, function(g){myDF[,2*(g-1) + 1:2]}),
                      along = 3)
  }
  sapply(1:NROW(myDF),
         function(r)
         {
           thisSlice = afMyArray[r,,]
           *some operation on thisSlice*
         })
}
Thanks,
James
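One way to avoid splitting the code into cases is to always build a three-dimensional array, even for a single group, so that three-index subsetting with drop = FALSE is always valid. This is a sketch in base R; to_array3() is a made-up helper name, and it assumes all columns of the data frame are numeric.

```r
# Hypothetical helper: always returns an N x 2 x k array, even when k == 1
to_array3 <- function(myDF) {
  n <- NROW(myDF)
  k <- NCOL(myDF) / 2
  # array() fills column by column, so columns 2g-1 and 2g land in slice g
  array(as.matrix(myDF), dim = c(n, 2, k))
}

# Worst case from the question: one row, one group
one_row <- data.frame(a = 1, b = 2)
arr1    <- to_array3(one_row)
slice1  <- arr1[1, , , drop = FALSE]    # stays 1 x 2 x 1

# General case: three rows, two groups
wide   <- data.frame(a = 1:3, b = 4:6, c = 7:9, d = 10:12)
arr2   <- to_array3(wide)
slice2 <- arr2[2, , , drop = FALSE]     # 1 x 2 x 2
pairs  <- matrix(arr2[2, , ], nrow = 2) # always a 2 x k matrix
```

Wrapping the slice in matrix(..., nrow = 2) gives the per-row operation the same shape regardless of N and k.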
