How do I access a local variable outside a function? - netcdf

In this part of the code I define a function to subset an area of interest. However, I want to use the variables latselect and lonselect later on in another function. So I have:
def DatasetToSubset(file, LatUpBound, LatLowBound, LonUpBound, LonLowBound):
    nc=netCDF4.Dataset(file)
    lats=nc.variables['lat'][:]; lons=nc.variables['lon'][:]
    latselect=np.logical_and(lats > LatLowBound, lats < LatUpBound)
    lonselect=np.logical_and(lons > LonLowBound, lons < LonUpBound)
    data=nc.variables['Runoff'][1000, latselect, lonselect]
    return data; return latselect; return lonselect

Once a function reaches a return statement, it returns that value and terminates immediately, meaning the subsequent two return statements will never execute. You can return the three values as a tuple, like this:
import netCDF4
import numpy as np

def DatasetToSubset(file, LatUpBound, LatLowBound, LonUpBound, LonLowBound):
    nc=netCDF4.Dataset(file)
    lats=nc.variables['lat'][:]; lons=nc.variables['lon'][:]
    # Boolean masks selecting the latitudes/longitudes inside the bounds
    latselect=np.logical_and(lats > LatLowBound, lats < LatUpBound)
    lonselect=np.logical_and(lons > LonLowBound, lons < LonUpBound)
    data=nc.variables['Runoff'][1000, latselect, lonselect]
    # Return all three values together as a single tuple
    return (data, latselect, lonselect)
When you call this function, you can unpack the three values like this:
(a, b, c) = DatasetToSubset(...)
a will hold the value of data, b that of latselect, and c that of lonselect.
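As a minimal, self-contained sketch of the same pattern without netCDF4 (the function name subset_with_mask and the sample numbers are made up for illustration), a function can hand back both the selected values and the mask it used, and the caller can then reuse the mask elsewhere:

```python
def subset_with_mask(values, low, high):
    """Return the values strictly between low and high, plus the boolean mask used."""
    mask = [low < v < high for v in values]
    selected = [v for v, keep in zip(values, mask) if keep]
    # A comma-separated return builds one tuple holding both results
    return selected, mask


# Unpack the tuple into two names at the call site
selected, mask = subset_with_mask([10, 20, 30, 40], 15, 35)

# The mask is now available outside the function and can be reused
labels = ["a", "b", "c", "d"]
selected_labels = [name for name, keep in zip(labels, mask) if keep]
```

The parentheses around the returned tuple are optional in Python; the comma is what creates the tuple.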

Related

How to include logical checks in a custom function

I have written a custom function that performs a mathematical transformation on a column of data with the inputs being the data and one other input (temperature). I would like to have 2 different logical checks. The first one is whether or not any values in the column exceed a certain threshold, because the transformation is different above and below the threshold. The second is a check if the temperature input is above a certain value and in that case, to deliver a warning that values above the threshold are unusual and to check the data.
Right now, I have the function written with a series of if/else statements. However, this produces a warning that only the first element of the vector of TRUE/FALSE values is being used. A simplified example of my function is as follows:
myfun = function(temp,data) {
  if(temp > 34){
    warning('Temperature higher than expected')
  }
  if (data > 50) {
    result = temp*data
    return(result)
  } else if(data <= 50) {
    result = temp/data
    return(result)
  }
}
myfun(temp = c(25,45,23,19,10), data = c(30,40,NA,50,10))
As you can see, because it is only using the first value for the if/else statements, it does not properly calculate the return values because it doesn't switch between the two versions of the transformation. Additionally, it's only checking if the first temp value is above the threshold. How can I get it to properly apply the logical check to every value and not just the first?
Edit: simplified the function per @The_Questioner's suggestion and changed < 50 to <= 50.
The main issue with your code is that you are passing all the values to the functions as vectors, but then are doing single element comparisons. You need to either pass the elements one by one to the function, or put some kind of vectorized comparison or for loop into your function. Below is the for loop approach, which is probably the least elegant way to do this, but at least it's easy to understand what's going on.
Another issue is that NA's apparently need to be handled in the data vector before passing to any of your conditional statements, or you'll get an error.
A final issue is what to do when data = 50. Right now you have conditional tests for greater or less than 50, but as you can see, the 4th point in data is 50, so right now you get an NA.
myfun = function(temp,data) {
  result <- rep(NA,length(temp))
  for (t in 1:length(temp)) {
    if(temp[t] > 34) {
      warning('Temperature higher than expected')
      if (!is.na(data[t])) {
        if (data[t] > 50) {
          result[t] <- temp[t]*data[t]
        } else if(data[t] < 50) {
          result[t] <- temp[t]/data[t]
        }
      }
    } else {
      if (!is.na(data[t])) {
        if (data[t] > 50) {
          result[t] <- temp[t]*data[t]
        } else if(data[t] < 50) {
          result[t] <- temp[t]/data[t]
        }
      }
    }
  }
  return(result)
}
Output:
> myfun(temp = c(25,45,23,19,10), data = c(30,40,NA,50,10))
[1] 0.8333333 1.1250000 NA NA 1.0000000
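For comparison only, here is the same element-by-element idea sketched in Python rather than R. The names transform, threshold and temp_limit are hypothetical, None stands in for NA, and the lower branch uses <= 50 as in the question's edit, so the element where data is 50 gets a value here instead of NA:

```python
import warnings


def transform(temps, data, threshold=50, temp_limit=34):
    """Apply the transformation to each (temp, data) pair; missing data stays missing."""
    # Check every temperature, not just the first one, and warn once
    if any(t > temp_limit for t in temps):
        warnings.warn("Temperature higher than expected")
    results = []
    for t, d in zip(temps, data):
        if d is None:
            results.append(None)      # propagate the missing value
        elif d > threshold:
            results.append(t * d)     # transformation above the threshold
        else:
            results.append(t / d)     # transformation at or below the threshold
    return results


# Record warnings so the example can show that exactly one was raised
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = transform([25, 45, 23, 19, 10], [30, 40, None, 50, 10])
```

The point of the sketch is only that both checks are applied per element (or with any() over the whole vector) instead of to the first element alone.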

Why does this happen when a user-defined R function does not return a value?

In the function shown below, there is no return. However, after executing it, I can confirm that the result was assigned to d as expected.
Why does this work without a return? Any suggestions in this regard will be appreciated.
Code
#installed plotly, dplyr
accumulate_by <- function(dat, var) {
  var <- lazyeval::f_eval(var, dat)
  lvls <- plotly:::getLevels(var)
  dats <- lapply(seq_along(lvls), function(x) {
    cbind(dat[var %in% lvls[seq(1, x)], ], frame = lvls[[x]])
  })
  dplyr::bind_rows(dats)
}
d <- txhousing %>%
  filter(year > 2005, city %in% c("Abilene", "Bay Area")) %>%
  accumulate_by(~date)
In the function, the last assignment creates dats, and the final expression, dplyr::bind_rows(dats), is what gets returned. We don't need an explicit return statement. If two objects need to be returned, we can place them in a list.
In some languages like Python, generators are used for memory efficiency: they yield values one at a time instead of creating the whole output in memory. Consider two functions in Python:
def get_square(n):
    result = []
    for x in range(n):
        result.append(x**2)
    return result
When we run it
get_square(4)
#[0, 1, 4, 9]
The same function can be written as a generator. Instead of returning a list, it yields one value at a time:
def get_square(n):
    for x in range(n):
        yield x**2
Running the function
get_square(4)
#<generator object get_square at 0x0000015240C2F9E8>
By wrapping the call in list(), we get the same output
list(get_square(4))
#[0, 1, 4, 9]
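To make the memory point a bit more concrete, here is a small sketch (the name squares is just for illustration). Values are produced only when they are asked for, and a generator can be consumed only once:

```python
def squares(n):
    """Yield the squares of 0..n-1 one at a time."""
    for x in range(n):
        yield x ** 2


gen = squares(4)
first = next(gen)         # computes only the first value
second = next(gen)        # computes only the second value
rest = list(gen)          # drains whatever is left
after_drain = list(gen)   # the generator is now exhausted
```

Because nothing is computed until it is requested, a generator over a very large range does not need to hold all of its results in memory at once.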
There is always a return :) You just don't have to be explicit about it.
All R expressions return something, including control structures and user-defined functions. (Control structures are just functions, by the way, so you can just remember that everything is a value or a function call, and everything evaluates to a value.)
For functions, the return value is the last expression evaluated in the execution of the function. So, for
f <- function(x) 2 + x
when you call f(3) you will invoke the function + with two parameters, 2 and x. These evaluate to 2 and 3, respectively, so `+`(2, 3) evaluates to 5, and that is the result of f(3).
When you call the return function -- and remember, this is a function -- you just leave the control-flow of a function early. So,
f <- function(x) {
  if (x < 0) return(0)
  x + 2
}
works as follows: When you call f, it will call the if function to figure out what to do in the first statement. The if function will evaluate x < 0 (which means calling the function < with parameters x and 0). If x < 0 is true, if will evaluate return(0). If it is false, it will evaluate its else part (which, because if has a special syntax when it comes to functions, isn't shown, but is NULL). If x < 0 is not true, f will evaluate x + 2 and return that. If x < 0 is true, however, the if function will evaluate return(0). This is a call to the function return, with parameter 0, and that call will terminate the execution of f and make the result 0.
Be careful with return. It is a function so
f <- function(x) {
  if (x < 0) return;
  x + 2
}
is perfectly valid R code, but it will not return when x < 0. The if call will just evaluate to the function return but not call it.
The return function is also a little special in that it can return from the parent call of control structures. Strictly speaking, return isn't evaluated in the frame of f in the examples above, but from inside the if calls. It is handled specially so that it can return from f.
With non-standard evaluation this isn't always the case.
With this function
f <- function(df) {
  with(df, if (any(x < 0)) return("foo") else return("bar"))
  "baz"
}
you might think that
f(data.frame(x = rnorm(10)))
should return either "foo" or "bar". After all, we return in either case in the if statement. However, the if statement is evaluated inside with, and it doesn't work that way. The function will return "baz".
For non-local returns like that, you need to use callCC, and then it gets more technical (as if this wasn't technical enough).
If you can, try to avoid return completely and rely on functions returning the last expression they evaluate.
Update
Just to follow up on the comment below about loops. When you call a loop, you will most likely call one of the built-in primitive functions. And, yes, they return NULL. But you can write your own, and they will follow the rule that they return the last expression they evaluate. You can, for example, implement for in terms of while like this:
`for` <- function(itr_var, seq, body) {
  itr_var <- as.character(substitute(itr_var))
  body <- substitute(body)
  e <- parent.frame()
  j <- 1
  # use <= so that the last element of seq is visited as well
  while (j <= length(seq)) {
    assign(x = itr_var, value = seq[[j]], envir = e)
    eval(body, envir = e)
    j <- j + 1
  }
  "foo"
}
This function will definitely return "foo", so this
for(i in 1:5) { print(i) }
evaluates to "foo". If you want it to return NULL, you have to be explicit about it (or just let the return value be the result of the while loop -- if that is the primitive while, it returns NULL).
The point I want to make is that the rule that functions return the last expression they evaluate has to do with how the functions are defined, not how you call them. The loops use non-standard evaluation, so the last expression in the loop body you provide them might be the last value they evaluate and might not. For the primitive loops, it is not.
Except for their special syntax, there is nothing magical about loops. They follow the rules all functions follow. With non-standard evaluation it can get a bit tricky to work out from a function call what the last expression they will evaluate might be, because the function body looks like it is what the function evaluates. It is, to a degree, if the function is sensible, but the loop body is not the function body. It is a parameter. If it wasn't for the special syntax, and you had to provide loop bodies as normal parameters, there might be less confusion.

calling variable name of a dataframe inside functions in R using "$"

I have a data frame (master) with some variables whose names I have stored in the lists below:
cont<-list("Quantity","Amt_per_qty","Trans_tax","Total_trans_amt")
catg<-list("Gender","Region_code","SubCategory")
I am trying to create a function where I can access the variables from the data frame and perform some operation on them. Although x and val in the function below seem to resolve, how can I access the variables using the $ sign inside the function?
univar<-function (x){
  for (val in cont){
    print (val)
    n<-nrow(x$val) }
  print (n) }
univar(master)
It's returning NULL. I even tried n<-nrow(x[,val]), but that doesn't seem to work either.
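i) x[val] returns a data.frame
ii) x[, val, drop = TRUE] returns a vector
iii) x[[val]] also returns a vector. An advantage of this form is that it also works with data.tables
Since nrow() of a vector is NULL, use n <- nrow(x) or n <- length(x[[val]]) instead.
x$val does not work because $ takes the name literally and looks for a column called "val" rather than the column whose name is stored in val. Also, the OP created cont as a list; it can be unlisted into a character vector and then used with [ or [[
cont <- unlist(cont)
univar <- function(x) {
  for (val in cont) {
    print(val)
    n <- length(x[[val]])
  }
  print(n)
}
univar(master)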

Groovy collections - Finding an element that matches a condition

I have a Groovy collection, an array containing values from 0 through 'n'. I need to find a particular array index at which a series of conditions occurs. I do not need to scan every value of the array; I can jump across it at pre-defined intervals, for example by checking the condition at every 10th value. Can someone tell me a way to do this?
For example, I want to do something like this:
def alltimes = [0 . . . . . 10000]
def end_time = 10000
def time = 0
while(time <= end_time)
{
    // check the condition for alltimes[time]
    if(condition_satisfied){
        println "condition satisfied at time ${time}"
        break
    }
    time = time + 50
}
When I explored the available array methods, I did not find one that lets me step over several elements at a time instead of just one, as each and eachWithIndex do.
It seems like I need to use the metaclass and create a new method?
You can use find for this:
def allTimes = 0..10000
Closure<Boolean> checkCondition = { all, single ->
    single > 300
}
(0..10000).step( 50 ).find { time -> checkCondition( allTimes, time ) }
Which is ripe for currying:
def allTimes = 0..10000
Closure<Boolean> checkCondition = { all, single ->
    single > 300
}
(0..10000).step( 50 ).find checkCondition.curry( allTimes )
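For readers who think in Python rather than Groovy, a rough equivalent of the step-then-find idea might look like the sketch below. The helper name first_match is hypothetical, and the condition t > 300 is simply carried over from the example above:

```python
def first_match(times, step, condition):
    """Check every step-th element and return the first one satisfying condition, else None."""
    # next() stops at the first hit, so later elements are never tested
    return next((t for t in times[::step] if condition(t)), None)


all_times = range(0, 10001)
hit = first_match(all_times, 50, lambda t: t > 300)
miss = first_match(all_times, 50, lambda t: t > 1_000_000)
```

As with find in Groovy, the search stops as soon as one element matches, and a fallback value (None here) is returned when nothing does.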

Stuck in an infinite loop in a function

I'm stuck in an infinite loop in this function:
let rec showGoatDoorSupport(userChoice, otherGuess, aGame) =
if( (userChoice != otherGuess) && (List.nth aGame otherGuess == "goat") ) then otherGuess
else showGoatDoorSupport(userChoice, (Random.int 3), aGame);;
And here's how I'm calling the function:
showGoatDoorSupport(1, 2, ["goat"; "goat"; "car"]);
In the first condition in the function, I compare the first 2 input parameters (1 and 2); if they are different, and if the item in the list at index "otherGuess" is not equal to "goat", I want to return that otherGuess.
Otherwise, I want to run the function again with a random number between 0-2 as the second input parameter.
The point is to keep trying to run the function until the second parameter doesn't equal the first, and that slot in the List isn't "goat", then return that slot number.
Don't use ==, it checks for physical equality. Use =. Two different strings will never be physically equal, even if they contain the same sequence of characters. (This is necessary, because strings are mutable in OCaml.)
$ ocaml
OCaml version 4.00.0
# "abc" == "abc";;
- : bool = false
# "abc" = "abc";;
- : bool = true
Another way to do that is to use String.compare. An example:
if String.compare str1 str2 = 0 then (* case equal *)
else (* case not equal *)
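The same identity-versus-contents distinction shows up in other languages. As a loose analogy only (this is Python, not OCaml, and the variable names are made up), Python's is plays roughly the role of OCaml's ==, while Python's == plays roughly the role of OCaml's =:

```python
doors_a = ["goat", "goat", "car"]
doors_b = ["goat", "goat", "car"]   # a separate list with the same contents
doors_alias = doors_a                # a second name for the very same list

same_object = doors_a is doors_b         # identity: two distinct objects
same_contents = doors_a == doors_b       # contents: element-by-element comparison
alias_same_object = doors_alias is doors_a
```

Two values built separately can have equal contents without being the same object, which is why an identity test is the wrong tool for comparing strings or lists by value.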
