Full disclosure: I've only been using Julia for about a day, so it may be too soon to ask questions.
I'm not really understanding the utility of the Dates module's Period types. Let's say I had two times and I wanted to find the number of minutes between them. It seems like the natural thing to do would be to subtract the times and then convert the result to minutes. I can deal with not having a Minute constructor (which seems most natural to my Python-addled brain), but it seems like convert should be able to do something.
The "solution" of converting from Millisecond to Int to Minute seems a little gross. What's the better/right/idiomatic way of doing this? (I did RTFM, but maybe the answer is there and I missed it.)
y, m, d = (2015, 03, 16)
hr1, min1, sec1 = (8, 14, 00)
hr2, min2, sec2 = (9, 23, 00)
t1 = DateTime(y, m, d, hr1, min1, sec1)
t2 = DateTime(y, m, d, hr2, min2, sec2)
# println(t2 - t1) # 4140000 milliseconds
# Minute(t2 - t1) # ERROR: ArgumentError("Can't convert Millisecond to Minute")
# minute(t2 - t1) # ERROR: `minute` has no method matching
# minute(::Millisecond)
# convert(Minute, (t2-t1)) # ERROR: `convert` has no method matching
# convert(::Type{Minute}, ::Millisecond)
delta_t_ms = convert(Int, t2 - t1)
function ms_to_min(time_ms)
    MS_PER_S = 1000
    S_PER_MIN = 60
    # recall that division is floating point unless you use div function
    return div(time_ms, (MS_PER_S * S_PER_MIN))
end
delta_t_min = ms_to_min(delta_t_ms)
println(Minute(delta_t_min)) # 69 minutes
(My apologies for choosing a snicker-inducing time interval. I happened to convert two friends' birthdays into hours and minutes without really thinking about it.)
Good question; seems like we should add it! (Disclosure: I made the Dates module).
For real, we had conversions in there at one point, but then for some reason or another they were taken out (I think it revolved around whether inexact conversions should throw errors or not, which has recently been cleaned up quite a bit in Base for Ints/Floats). I think it definitely makes sense to add them back in. We actually have a handful in there for other operations, so obviously they're useful.
As always, it's also a matter of who has the time to code/test/submit, and hopefully that's driven by people with real needs for the function. Feel free to submit a PR if you're feeling ambitious!
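For comparison, since the question mentions coming from Python: a minimal sketch of the same calculation with Python's standard datetime module. It says nothing about what the Julia Dates API does or will do; it only shows the "subtract, then convert to minutes" shape the question has in mind, using the same two times.

```python
from datetime import datetime, timedelta

t1 = datetime(2015, 3, 16, 8, 14, 0)
t2 = datetime(2015, 3, 16, 9, 23, 0)

delta = t2 - t1  # a timedelta

# Dividing one timedelta by another yields a plain number.
whole_minutes = delta // timedelta(minutes=1)  # floor division -> int
exact_minutes = delta / timedelta(minutes=1)   # true division -> float

print(whole_minutes, exact_minutes)
```

Floor division plays the role of div in the Julia workaround: it drops any partial minute instead of raising an error on an inexact conversion.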
I've looked at both [1] and [2] and I'm completely confused (and since the dbf file is a version
4 file, [1] should apply well). For one thing why does [1] state that the timestamp's date portion is the # of days since 1/1/4713 BC? That's just very puzzling. Secondly, assuming that it is the # of days since 4713 BC, I'm having some trouble with the value I am getting.
First off, my dbf file has a timestamp field which has an 8 byte long value. The actual
date is 2000/8/16 17:21:41. In the dbf file, the 8 byte sequence is as follows
0x42ccb20e0340df00.
From [1], it says the first 4 bytes are for the date, and 2nd 4 bytes for the time. If the original
byte sequence is actually little-endian (0x42ccb20e) then that should be 0x0eb2cc42 which
comes to the value of 246598722. So date is 0x0eb2cc42 (246598722) and time is 0x00df4003
(14630915).
I must be missing something here or calculating something wrong. 246598722 days is equivalent to about 675612 years (assuming 1 yr = 365 days, as adding leap years would confuse me, and it shouldn't really be that far off).
From [2], I shouldn't use 01/01/4713 BC as the basis but 12/31/1899 (well, 1/1/1900). But then, the date value I have isn't even in the range of what [2] shows.
Now if I take the actual value (2000/8/16) and use [1] and [2], I get the following:
method [1]: 2450501 days : (2000 - -4713) * 365 + (8 * 30) + 16
method [2]: 36756 days : [100 * 365 + 8 * 30 + 16] (over counting the # of days)
The dbf file isn't corrupted (otherwise, if I look at the timestamp in dBase, it'd crap out
and display something crazy).
I've thought of using big-endian, but that makes even less sense as the values are even larger. I've even thought of the possibility that it's actually the # of seconds elapsed since either date, but that makes even less sense, i.e. 246598722 = # of seconds elapsed (counting back from 2000/8/16) will make the base year 1812. (calculations: 246898722 / (3600 * 365) = 187.8985, so 2000 - 187.8985 = 1812.1015)
Can someone point out where I'm doing this wrong?
Thanks!
[1] - https://www.dbase.com/Knowledgebase/INT/db7_file_fmt.htm
[2] - Convert dBase Timestamp
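The two hand estimates above use 365-day years and 30-day months. As a rough cross-check, here is a small Python sketch that computes the exact day counts for 2000/8/16 with the standard library. The offset 1721425 between Python's date ordinal and the Julian Day Number is a known calendar constant, not something taken from [1] or [2].

```python
from datetime import date

d = date(2000, 8, 16)

# Proleptic Gregorian ordinal: 0001-01-01 is day 1.
ordinal = d.toordinal()

# Julian Day Number for that calendar date (count starting in 4713 BC).
jdn = ordinal + 1721425

# Days since 1899-12-31, the base mentioned for method [2].
days_since_1899 = (d - date(1899, 12, 31)).days

print(ordinal, jdn, days_since_1899)
```

Both exact counts are close to the estimates and nowhere near 246598722, which suggests the problem lies in how the 8 bytes are being split and read rather than in the choice of base date.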
For any dBASE questions, I recommend going to the dBASE newsgroups; they have a very helpful and knowledgeable community.
I've finally found the answer thanks to [3].
Basically, the timestamp 8 byte sequence is used as a whole with the following notes:
It's stored in big-endian.
The last byte is not used.
It's a Julian Day Number.
So in my case, it's 0x42ccb20e0340df00 and truncating the last byte,
I get 0x42ccb20e0340df.
Then the following python code gets the correct info:
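import datetime
# 7-byte value that corresponds to 1970/1/1, used as the base
base = 0x42cc418ba99a00
# timestamp from the file with the last byte dropped
frm_date = int('42ccb20e0340df', 16)
# the difference divided by 500 gives seconds since 1970/1/1
final_ts = (frm_date - base) / 500
final_date = datetime.datetime.utcfromtimestamp(final_ts)
print(final_date)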
which outputs 2000-8-16 17:21:41 and some milliseconds, which I just ignore.
So I'm guessing the theory is that the above code moves the 'base' date to
1970/1/1 from 1/1/1, which helps since utcfromtimestamp() doesn't
work with any value prior to 1970/1/1.
My confusion stems from the fact it doesn't use 4713BC as the
base year, instead it uses 1/1/1, though I'm still trying to figure out how to get the value 0x42cc418ba99a00 for 1970/1/1.
[3] - https://stackoverflow.com/a/60424157/10860403
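One reading that is consistent with this sample, offered as an assumption rather than something [1] or [3] confirms: the 8 bytes form a big-endian IEEE 754 double holding milliseconds, with whole days lining up with Python's proleptic Gregorian ordinal (0001-01-01 is day 1). Under that reading the base constant is simply the first 7 bytes of the double for midnight 1970/1/1. The sketch below checks both points against the value from the file.

```python
import struct
from datetime import datetime, timedelta

MS_PER_DAY = 86_400_000

raw = bytes.fromhex("42ccb20e0340df00")

# Assumption: big-endian IEEE 754 double, value in milliseconds.
(ms,) = struct.unpack(">d", raw)

# Assumption: whole days match Python's date ordinal (0001-01-01 is day 1).
days, ms_of_day = divmod(round(ms), MS_PER_DAY)
stamp = datetime.fromordinal(days) + timedelta(milliseconds=ms_of_day)
print(stamp)

# Under the same reading, derive the 'base' constant for 1970/1/1.
epoch_ms = datetime(1970, 1, 1).toordinal() * MS_PER_DAY
base_hex = struct.pack(">d", float(epoch_ms))[:7].hex()
print(base_hex)
```

This would also account for the division by 500: both doubles share the same exponent, so after dropping the low byte one integer step corresponds to 2 ms, and 500 such steps make a second.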
I am an intro to computer science student. I have learned a fair amount of Python and am now learning R. I'm not used to R, and I've figured out how to calculate overtime pay, but I am not sure what is wrong with my syntax:
computePay <- function(pay,hours){
}if (hours)>=40{
newpay = 40-hours
total=pay*1.5
return(pay*40)+newpay*total
}else{
return (pay * hours)
}
How would I code this correctly?
Without looking at things like vectorization, a direct correction of your function would look something like:
computePay <- function(pay,hours) {
  if (hours >= 40) {
    newpay = hours - 40
    total = pay * 1.5
    return(pay*40 + newpay*total)
  } else {
    return(pay * hours)
  }
}
This supports calling the function with a single pay and a single hours. You mis-calculated newpay (which really should be named something like overhours); I corrected it.
You may hear people talk about "avoiding magic constants". A "magic constant" is a hard-coded number within code that is not perfectly clear and/or might be useful to allow the caller to modify. For instance, in some contracts it might be that overtime starts at a number other than 40, so that might be configurable. You can do that by changing the formals to:
computePay <- function(pay, hours, overtime_hours = 40, overtime_factor = 1.5)
and using those variables instead of hard-coded numbers. This allows the user to specify other values, but if not provided then they resort to sane defaults.
Furthermore, it might be useful to call it with a vector of one or the other, in which case the current function will fail because if (hours >= 40) needs a single logical value, but (e.g.) c(40,50) >= 40 returns a logical vector of length 2. We do this by introducing the ifelse function. Though it has some gotchas in advanced usage, it should work just fine here:
computePay1 <- function(pay, hours, overtime_hours = 40, overtime_factor = 1.5) {
  ifelse(hours >= overtime_hours,
         overtime_hours * pay + (hours - overtime_hours) * overtime_factor * pay,
         pay * hours)
}
Because of some gotchas and deep-nested readability (I've seen ifelse stacked 12 levels deep), some people prefer other solutions. If you look at it closer, you may find that you can take further advantage of vectorization and pmax which is max applied piece-wise over each element. (Note the difference of max(c(1,3,5), c(2,4,4)) versus pmax(c(1,3,5), c(2,4,4)).)
Try something like this:
computePay2 <- function(pay, hours, overtime_hours = 40, overtime_factor = 1.5) {
  pmax(0, hours - overtime_hours) * overtime_factor * pay +
    pmin(hours, overtime_hours) * pay
}
To show how this works, I'll expand the pmax and pmin components:
hours <- c(20, 39, 41, 50)
overtime_hours <- 40
pmax(0, hours - overtime_hours)
# [1] 0 0 1 10
pmin(hours, overtime_hours)
# [1] 20 39 40 40
The rest sorts itself out.
Your "newpay*total" expression is outside the return command. You need to put it inside the parentheses. The end bracket at the beginning of the second line should be moved to the last line. You also should have "(hours>=40)" rather than "(hours)>=40". Stylistically, the variable names are poorly chosen and there's no indentation (this might have helped you notice the misplaced bracket). Also, the calculation can be simplified:
total_pay = hourly_wage*(hours+max(0,hours-40)/2)
For every hour you work, you get your hourly wage. For every hour over 40 hours, you get your hourly wage plus half your hourly wage. So the total pay is wage*(total hours + (hours over 40)/2). Hours over 40 is either going to be total hours minus 40, or zero, whichever is larger.
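To check that the one-line form agrees with the regular-plus-overtime split, here is a small Python sketch (Python rather than R, purely for illustration). The function names, the wage of 10, and the sample hours are all made up for the comparison.

```python
def pay_piecewise(wage, hours, threshold=40, factor=1.5):
    """Regular pay up to the threshold, overtime rate beyond it."""
    regular = min(hours, threshold) * wage
    overtime = max(0, hours - threshold) * factor * wage
    return regular + overtime


def pay_simplified(wage, hours):
    """Every hour at full wage, plus half a wage for each hour over 40."""
    return wage * (hours + max(0, hours - 40) / 2)


for hours in (20, 39, 40, 41, 50):
    print(hours, pay_piecewise(10, hours), pay_simplified(10, hours))
```

The two agree because 40*w + (h - 40)*1.5*w equals w*h + (h - 40)*0.5*w whenever h exceeds 40, and both reduce to w*h otherwise. Note the simplified form hard-codes 40 and the factor 1.5.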
I'm working on implementing a finance model in R. I'm using quantmod::getSymbols(), which is returning a xts object. I'm using both stock data from google (or yahoo) and economic/yield data from FRED. Right now I'm receiving errors for non-conformable arrays when attempting to do a comparison.
require(quantmod)
fiveYearsAgo = Sys.Date() - (365 * 5)
bondIndex <- getSymbols("LQD",src="google",from = fiveYearsAgo, auto.assign = FALSE)[,c(0,4)]
bondIndex$score <- 0
bondIndex$low <- runMin(bondIndex,365)
bondIndex$high <- runMax(bondIndex,365)
bondIndex$score <- ifelse(bondIndex > (bondIndex$low * 1.006), bondIndex$score + 1, bondIndex$score)
# Error in `>.default`(bondIndex, (bondIndex$low * 1.006)) :
# non-conformable arrays
bondIndex$score <- ifelse(bondIndex < (bondIndex$high * .994), bondIndex$score - 1, bondIndex$score)
# Error in `<.default`(bondIndex, (bondIndex$high * 0.994)) :
# non-conformable arrays
print (bondIndex$score)
I added the following before the offending line:
print (length(bondIndex))
print (length(bondIndex$low))
print (length(bondIndex$high))
My results were 5024, 1256, and 1256. I want them to be the same length, where every day has the close, 52-week high, and 52-week low. I additionally want to add more data so the days also have a 50-day moving average. Further still, what really put an ax in my progress was implementing yield data from FRED. My theory is that stock and bond markets have different holidays, resulting in slightly different sets of days with data. In this case, I'd like to na.spline() the missing data.
I know I'm going about this the wrong way. What's the best way to do what I'm attempting? I want to have each row be a day, then have columns for close price, high, low, moving average, a few different yields for that day and finally a "score" that has a daily value based on the other data for that day.
Thanks for the help and let me know if you want or need more information.
You need to tell your statement which variable you want. Right now you are asking whether bondIndex (the whole object, all columns) is greater or less than low or high. This doesn't make sense. Presumably you want bondIndex[,1] aka bondIndex$LQD.Close:
bondIndex$score <- ifelse(bondIndex[,1] > (bondIndex$low * 1.006), bondIndex$score + 1, bondIndex$score)
bondIndex$score <- ifelse(bondIndex[,1] < (bondIndex$high * .994), bondIndex$score - 1, bondIndex$score)
As a side note, Sys.Date() - (365 * 5) is not five years ago (hint, leap years). This will be a bug that might bite you down the line.
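A quick way to see the leap-year drift, sketched in Python for illustration (the fixed date is hypothetical; any span that crosses a February 29 behaves the same way):

```python
from datetime import date, timedelta

today = date(2015, 3, 16)  # hypothetical fixed "today"

# Subtracting 5 * 365 days ignores the leap day in 2012.
approx = today - timedelta(days=365 * 5)

# Stepping the calendar year back lands on the same month and day.
# (replace() raises ValueError for Feb 29 when the target year is not a leap year.)
exact = today.replace(year=today.year - 5)

print(approx, exact)
```

The two results differ by one day here, and the gap grows with every additional leap day the span covers.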
Assume I have a series t_1, t_2,..., t_n,..., and the numbers keep coming in. I want to calculate an approximation of the sum/average of the last t numbers, but without storing those t numbers. The only thing stored is the previous sum/average. What is the appropriate function?
E.g.
s_1 = t_1
s_2 = f(t_2, s_1)
s_3 = f(t_3, s_2)
The possible function may be like s_2 = t_2 + s_1 * (e ^ -1), but what is the optimal solution?
Note: The window size is fixed. So there is no exact solution, but an approximation, since the number out of the window is not known.
Note 2: Thanks for all the discussion. I know the answer now. It is really trivial, my fault for not thinking it through. I will delete this question later. But anyway, the answer is that I should assume the number leaving the window equals the average. Under this assumption, the new sum is
(old average)*(t-1) + new number
and the new average is
((old average)*(t-1)+(new number))/t
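The update rule above can be sketched in a few lines of Python. The names update_average and window are made up for this illustration (window is the t of the formula), and the sample values are arbitrary.

```python
def update_average(old_average, new_value, window):
    """Approximate the mean of the last `window` values from the previous mean alone.

    Assumes the value leaving the window equals the old average.
    """
    return (old_average * (window - 1) + new_value) / window


average = None
for value in [10.0, 10.0, 10.0, 20.0]:
    # Seed with the first value, then apply the update for each new one.
    average = value if average is None else update_average(average, value, window=4)

print(average)
```

Rearranged, the update is old_average + (new_value - old_average) / window, which is an exponential moving average with smoothing factor 1/window.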
First of all, this realistically is probably a question for Mathematics Stack Exchange,
but anyway, since you don't mention a programming language, I'll go with C# (with an array). Let's call your series 'mySeries':
double average = 0;
for (int i = 0; i < mySeries.Length; i++)
    // running mean: move the average toward the new value by 1/(i+1)
    average += (mySeries[i] - average) / (i + 1);
MessageBox.Show("Here is your average dawg: " + average.ToString());
Hey all, I am trying to figure out how to calculate the wage for an employee when they clock out. This is the code I am currently using:
Dim theStartTime As Date
Dim theEndTime As Date
Dim totalTime As String
theStartTime = "16:11:06"
theEndTime = "18:22:01"
totalTime = Format(CDbl((theEndTime - theStartTime) * 24), "#0.0")
So workable hours would be: 2h 11m
Right now, with my calculation code above, I get 2.2. What would I need to add in order to get it to calculate the correct time of 2:11 instead of 2:20?
David
Note that 2.2 hours is not 2:20, it's 2:12.
Change
Format(CDbl((theEndTime - theStartTime) * 24), "#0.0")
to
Format(theEndTime - theStartTime, "h:mm")
You're getting the right value, just rounding it off when you print. theEndTime - theStartTime is a time span equal to the difference between the two times. As you discovered, multiplying it by 24 will give you the difference in hours. However, you then have to divide by 24 again to use date/time formatting.
Check out all the ways to format dates and time in VB6.
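The arithmetic behind the two answers above can be checked with a short Python sketch (Python only for illustration, not VB6). It uses the two clock times from the question; the exact elapsed time comes out at 2 h 10 min 55 s, which rounds to the 2:11 the question expects.

```python
from datetime import datetime

start = datetime.strptime("16:11:06", "%H:%M:%S")
end = datetime.strptime("18:22:01", "%H:%M:%S")

elapsed_seconds = int((end - start).total_seconds())

# Decimal hours, as produced by multiplying the day fraction by 24.
decimal_hours = elapsed_seconds / 3600

# Hours, minutes, and seconds, as a clock-style display needs.
hours, remainder = divmod(elapsed_seconds, 3600)
minutes, seconds = divmod(remainder, 60)

print(round(decimal_hours, 1))                  # one decimal place, like "#0.0"
print(f"{hours}:{minutes:02d}:{seconds:02d}")   # clock-style
print(round(0.2 * 60))                          # minutes in 0.2 of an hour
```

The fractional part of a decimal-hours figure is a fraction of 60 minutes, which is why 2.2 reads as 2:12 rather than 2:20.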
First, I highly suggest going to the .NET framework (with its easy-to-use TimeSpan class) if possible.
But, dealing in VB6, you should be able to use the DateDiff function (it's been many years since I've touched VB6, so the specific syntax might be a bit off):
Dim iHours As Integer, iMins As Integer
iMins = DateDiff("n", theStartTime, theEndTime)
iHours = iMins \ 60 ' integer division; / would round to the nearest whole hour
iMins = iMins Mod 60
You should also try casting it to the Currency type, a fixed-point type that keeps 15 digits to the left of the decimal point and 4 digits to the right.
Format(CCur((theEndTime - theStartTime) * 24), "#0.00")