I'm trying to call this macro from within a method where the parameters to the macro are passed in to the method. It works fine when I call it directly but there is something about the macro expansion which is preventing the variables lat and lon from being correctly used in the macro.
The macro I'm calling is @select, defined here: https://github.com/Alexander-Barth/NCDatasets.jl/blob/4e35e843a53cdcff7f7ef66ebc3ceab1ee1e860b/src/select.jl#L54-L168
and here is the function where the lat and lon variables are not being expanded correctly
function data_for_lat_lon(ds, region, lat_lon_pair)
    println("have latlon pair ", lat_lon_pair)
    lat = lat_lon_pair[1]
    lon = lat_lon_pair[2]
    data = []
    if ArchGDAL.contains(region[1], ArchGDAL.createpoint(lon, lat))
        println(lat, " ", lon)
        # the call below fails when made this way
        single_lat_lon = NCDatasets.@select(ds, latitude==$lat && longitude==$lon)
        for (varname, var) in single_lat_lon
            if varname in ["latitude", "longitude", "time"]
                continue
            end
            push!(var_names, varname)
            push!(data, Array[single_lat_lon[varname]][1][:])
        end
        return reduce(hcat, data)'
    end
end
This is the error & stack trace I get when calling it:
MethodError: no method matching (::NCDatasets.var"#154#155")(::Float64)
The applicable method may be too new: running in world age 32645, while current world is 32646.
Closest candidates are:
(::NCDatasets.var"#154#155")(::Any) at none:0 (method too new to be called from this world context.)
Stacktrace:
[1] _broadcast_getindex_evalf
@ .\broadcast.jl:670 [inlined]
[2] _broadcast_getindex
@ .\broadcast.jl:643 [inlined]
[3] getindex
@ .\broadcast.jl:597 [inlined]
[4] copy
@ .\broadcast.jl:899 [inlined]
[5] materialize
@ .\broadcast.jl:860 [inlined]
[6] findall(testf::NCDatasets.var"#154#155", A::Vector{Float64})
@ Base .\array.jl:2311
[7] macro expansion
@ C:\Users\scott\.julia\packages\NCDatasets\sLdiM\src\select.jl:242 [inlined]
[8] data_for_lat_lon(ds::NCDatasets.MFDataset{DeferDataset, 1, String, NCDatasets.DeferAttributes, NCDatasets.DeferDimensions, NCDatasets.DeferGroups}, region::DataFrameRow{DataFrame, DataFrames.Index}, lat_lon_pair::Tuple{Float64, Float64})
@ Main .\In[46]:8
[9] top-level scope
@ .\In[50]:3
[10] eval
@ .\boot.jl:368 [inlined]
[11] include_string(mapexpr::typeof(REPL.softscope), mod::Module, code::String, filename::String)
@ Base .\loading.jl:1428
I'm fairly confident this has something to do with the concept of "hygiene" and variable expansion within a macro, but I'm new enough to Julia that I don't understand what needs to be done in my calling function to resolve this. I have reviewed this question but am not sure it applies to this case: How to pass variable value to a macro in julia?
Thanks!
It turns out the issue was unrelated to macro hygiene; it was the known world-age problem in Julia: https://discourse.julialang.org/t/how-to-bypass-the-world-age-problem/7012
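For readers who hit the same error, here is a minimal sketch of the world-age problem using only Base. It assumes the failure comes from a function that is created with eval while another function is still running. The names make_and_call_directly and make_and_call_latest are hypothetical, and this is not the NCDatasets code itself.

```julia
# Hypothetical reproduction of a world-age error (Base only).

function make_and_call_directly()
    # eval defines a new anonymous function in a *newer* world ...
    f = Core.eval(@__MODULE__, :(x -> x + 1))
    # ... but this function body is still running in the older world,
    # so the direct call cannot see the new method and raises a MethodError.
    return f(1)
end

function make_and_call_latest()
    f = Core.eval(@__MODULE__, :(x -> x + 1))
    # Base.invokelatest dispatches in the latest world, so the call succeeds.
    return Base.invokelatest(f, 1)
end
```

In this sketch, Base.invokelatest is one way around the error. Making the call at top level, where each statement runs in the latest world, is another, which would be consistent with the macro working when called directly.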
In Julia, one can draw a boxplot using StatsPlots.jl. Assuming there is a DataFrame named df, we can draw a boxplot for one of its columns, named a, like this:
julia> @df df boxplot(["a"], :a, fillalpha=0.75, linewidth=2)
I want to put the same structure in a function:
julia> function BoxPlotColumn(col::Union{Symbol, String}, df::DataFrame)
           if isa(col, String)
               @df df boxplot([col], Symbol(col), fillalpha=0.75, linewidth=2)
           else
               @df df boxplot([String(col)], col, fillalpha=0.75, linewidth=2)
           end
       end
BoxPlotColumn (generic function with 1 method)
Then, if I say BoxPlotColumn("a", df), Julia throws an error:
ERROR: Cannot convert Symbol to series data for plotting
Stacktrace:
[1] error(s::String)
@ Base .\error.jl:35
[2] _prepare_series_data(x::Symbol)
@ RecipesPipeline C:\Users\Shayan\.julia\packages\RecipesPipeline\OXGmH\src\series.jl:8
[3] _series_data_vector(x::Symbol, plotattributes::Dict{Symbol, Any})
@ RecipesPipeline C:\Users\Shayan\.julia\packages\RecipesPipeline\OXGmH\src\series.jl:35
[4] macro expansion
@ C:\Users\Shayan\.julia\packages\RecipesPipeline\OXGmH\src\series.jl:135 [inlined]
[5] apply_recipe(plotattributes::AbstractDict{Symbol, Any}, #unused#::Type{RecipesPipeline.SliceIt}, x::Any, y::Any, z::Any)
@ RecipesPipeline C:\Users\Shayan\.julia\packages\RecipesBase\qpxEX\src\RecipesBase.jl:289
[6] _process_userrecipes!(plt::Any, plotattributes::Any, args::Any)
@ RecipesPipeline C:\Users\Shayan\.julia\packages\RecipesPipeline\OXGmH\src\user_recipe.jl:36
[7] recipe_pipeline!(plt::Any, plotattributes::Any, args::Any)
@ RecipesPipeline C:\Users\Shayan\.julia\packages\RecipesPipeline\OXGmH\src\RecipesPipeline.jl:70
[8] _plot!(plt::Plots.Plot, plotattributes::Any, args::Any)
@ Plots C:\Users\Shayan\.julia\packages\Plots\lW9ll\src\plot.jl:209
[9] #plot#145
@ C:\Users\Shayan\.julia\packages\Plots\lW9ll\src\plot.jl:91 [inlined]
[10] boxplot(::Any, ::Vararg{Any}; kw::Base.Pairs{Symbol, V, Tuple{Vararg{Symbol, N}}, NamedTuple{names, T}} where {V, N, names, T<:Tuple{Vararg{Any, N}}})
@ Plots C:\Users\Shayan\.julia\packages\RecipesBase\qpxEX\src\RecipesBase.jl:410
[11] add_label(::Vector{String}, ::typeof(boxplot), ::Vector{String}, ::Vararg{Any}; kwargs::Base.Pairs{Symbol, Real, Tuple{Symbol, Symbol}, NamedTuple{(:fillalpha, :linewidth), Tuple{Float64, Int64}}})
@ StatsPlots C:\Users\Shayan\.julia\packages\StatsPlots\faFN5\src\df.jl:153
[12] (::var"#33#34"{String})(349::DataFrame)
@ Main .\none:0
[13] BoxPlotColumn(col::String, df::DataFrame)
@ Main c:\Users\Shayan\Documents\Python Scripts\test2.jl:15
[14] top-level scope
@ c:\Users\Shayan\Documents\Python Scripts\test2.jl:22
Which is because of this: @df df boxplot([col], Symbol(col), fillalpha=0.75, linewidth=2)
How can I fix this? Why does this happen? I wrote the same thing just in a function.
I wrote the same thing just in a function.
You have not written the same thing. In your original code you use string and Symbol literals, and in the function you pass a variable. This is the key difference.
To fix this I recommend using @with from DataFramesMeta.jl:
BoxPlotColumn(col::Union{Symbol, String}, df::DataFrame) =
    @with df boxplot([string(col)], $col, fillalpha=0.75, linewidth=2)
which does what you want, as @with supports working with column names programmatically with $.
EDIT
Why Julia doesn't operate when we say boxplot(..., col, ...)
It does not work because both @df and @with are macros. A macro transforms code into other code, and only that transformed code is executed later. These macros are designed so that when they see a symbol literal, e.g. :a, they treat it specially and consider it to be a column of a data frame. When they see a variable such as col, they cannot know that it refers to a symbol, because the macro runs before the code is evaluated (remember: a macro is a way to transform code into other code before that code is executed). See https://docs.julialang.org/en/v1/manual/metaprogramming/#man-macros
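To make the difference concrete, here is a small sketch with a hypothetical macro, @argtype, that is not part of StatsPlots or DataFramesMeta. It only reports what kind of expression it was handed, which is all the information a macro has at expansion time.

```julia
# Hypothetical macro: returns the type of the expression it receives,
# as a string. It never sees run-time values, only code.
macro argtype(ex)
    return string(typeof(ex))
end

col = :a

literal_kind  = @argtype :a            # a quoted symbol literal
variable_kind = @argtype col           # just the name `col`, not its value
call_kind     = @argtype Symbol(col)   # an unevaluated call expression

println(literal_kind, " / ", variable_kind, " / ", call_kind)
```

A macro that special-cases symbol literals can recognize :a in the first call, but in the other two calls it only sees a variable name or a call expression, so it has no way to know they will evaluate to a column name.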
MethodError: no method matching isfinite(::String15)
Most likely you have a column that holds strings rather than numbers. Instead, write e.g. names(df, Real) to get only the columns that store real numbers (without missing). If you want to allow missing, write names(df, Union{Missing,Real}).
I want to use the hasmethod function to find if an object t::T supports t[!, something] syntax.
The key is that something can be of many types and I don't want to check them all; I just want a way to express hasmethod(getindex, Tuple{T, typeof{!}, S}) regardless of what S is.
How do I do that?
I think the way to get the list of methods is:
methods(getindex, Tuple{Any, typeof(!), Any})
The only problem is that a method is also listed if the type it allows for the second argument of getindex is a supertype of typeof(!). I do not think this can be avoided though, as you cannot rule out that such a getindex definition actually allows ! to be passed as the first index.
For instance if no packages are loaded the following is the result of the call above:
julia> methods(getindex, Tuple{Any, typeof(!), Any})
# 10 methods for generic function "getindex":
[1] getindex(md::Markdown.MD, args...) in Markdown at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Markdown\src\parse\parse.jl:24
[2] getindex(r::Distributed.Future, args...) in Distributed at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Distributed\src\remotecall.jl:624
[3] getindex(::Type{Any}, vals...) in Base at array.jl:357
[4] getindex(::Type{T}, x, y) where T in Base at array.jl:353
[5] getindex(::Type{T}, vals...) where T in Base at array.jl:344
[6] getindex(A::SparseArrays.SparseMatrixCSC, i, ::Colon) in SparseArrays at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\SparseArrays\src\sparsematrix.jl:1879
[7] getindex(A::AbstractArray, I...) in Base at abstractarray.jl:979
[8] getindex(t::AbstractDict, k1, k2, ks...) in Base at abstractdict.jl:476
[9] getindex(itr::Base.SkipMissing, I...) in Base at missing.jl:232
[10] getindex(r::Distributed.RemoteChannel, args...) in Distributed at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.2\Distributed\src\remotecall.jl:626
All of the cases are the ones where getindex does not put restrictions on the second argument so you cannot rule out that they could possibly allow ! as a valid value.
But if e.g. you load DataFrames.jl and restrict the first argument to AbstractDataFrame you get:
julia> methods(getindex, Tuple{AbstractDataFrame, typeof(!), Any})
# 5 methods for generic function "getindex":
[1] getindex(df::DataFrame, ::typeof(!), col_ind::Symbol) in DataFrames at D:\AppData\.julia\packages\DataFrames\yH0f6\src\dataframe\dataframe.jl:367
[2] getindex(df::DataFrame, ::typeof(!), col_ind::Union{Signed, Unsigned}) in DataFrames at D:\AppData\.julia\packages\DataFrames\yH0f6\src\dataframe\dataframe.jl:358
[3] getindex(df::DataFrame, row_ind::typeof(!), col_inds::Union{Colon, Regex, AbstractArray{T,1} where T, All, Between, InvertedIndex}) in DataFrames at D:\AppData\.julia\packages\DataFrames\yH0f6\src\dataframe\dataframe.jl:405
[4] getindex(sdf::SubDataFrame, ::typeof(!), colind::Union{Signed, Symbol, Unsigned}) in DataFrames at D:\AppData\.julia\packages\DataFrames\yH0f6\src\subdataframe\subdataframe.jl:127
[5] getindex(df::SubDataFrame, row_ind::typeof(!), col_inds::Union{Colon, Regex, AbstractArray{T,1} where T, All, Between, InvertedIndex}) in DataFrames at D:\AppData\.julia\packages\DataFrames\yH0f6\src\subdataframe\subdataframe.jl:137
which is now more informative, because in DataFrames.jl we try to be careful not to leave indexing arguments free (i.e. allow them to be Any and only internally check what is valid).
Additionally you can use methodswith to check which methods accept ! explicitly and filter only getindex instances. Here is the result after loading DataFrames.jl:
julia> filter(x -> x.name == :getindex, methodswith(typeof(!)))
[1] getindex(df::DataFrame, ::typeof(!), col_ind::Symbol) in DataFrames at D:\AppData\.julia\packages\DataFrames\yH0f6\src\dataframe\dataframe.jl:367
[2] getindex(df::DataFrame, ::typeof(!), col_ind::Union{Signed, Unsigned}) in DataFrames at D:\AppData\.julia\packages\DataFrames\yH0f6\src\dataframe\dataframe.jl:358
[3] getindex(df::DataFrame, row_ind::typeof(!), col_inds::Union{Colon, Regex, AbstractArray{T,1} where T, All, Between, InvertedIndex}) in DataFrames at D:\AppData\.julia\packages\DataFrames\yH0f6\src\dataframe\dataframe.jl:405
[4] getindex(sdf::SubDataFrame, ::typeof(!), colind::Union{Signed, Symbol, Unsigned}) in DataFrames at D:\AppData\.julia\packages\DataFrames\yH0f6\src\subdataframe\subdataframe.jl:127
[5] getindex(df::SubDataFrame, row_ind::typeof(!), col_inds::Union{Colon, Regex, AbstractArray{T,1} where T, All, Between, InvertedIndex}) in DataFrames at D:\AppData\.julia\packages\DataFrames\yH0f6\src\subdataframe\subdataframe.jl:137
Finally, note that the x[!, y] syntax can also mean a view (if preceded by e.g. @view) or a setindex! operation (if it appears on the LHS of an assignment), so you might also want to check whether these functions accept ! (and in DataFrames.jl they actually do).
You can do that in theory, if you get the syntax right:
julia> hasmethod(getindex, Tuple{Vector{Int}, typeof(!), Any})
true
This should work since Tuples are covariant.
But its returning true is obviously nonsense:
julia> getindex([1], !, 1)
ERROR: ArgumentError: invalid index: ! of type typeof(!)
Stacktrace:
[1] to_index(::Function) at ./indices.jl:270
[2] to_index(::Array{Int64,1}, ::Function) at ./indices.jl:247
[3] to_indices(::Array{Int64,1}, ::Tuple{Base.OneTo{Int64}}, ::Tuple{typeof(!),Int64}) at ./indices.jl:298
[4] to_indices at ./indices.jl:294 [inlined]
[5] getindex(::Array{Int64,1}, ::Function, ::Int64) at ./abstractarray.jl:981
[6] top-level scope at REPL[26]:1
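The reason is that the matching method is the generic getindex for arrays, which accepts indices of any type and delegates to other methods internally (to_indices and to_index in the stack trace above); the invalid index is only rejected there.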
I think it's better to constrain T to some abstract type(s) for which this kind of indexing is known to work. Or, as a last resort, use try.
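The try approach could be sketched as follows. This is only an illustration under stated assumptions: supports_bang_indexing and the Toy type are hypothetical names made up for the example, and treating MethodError and ArgumentError as "not supported" is a design choice of this sketch, not an established convention.

```julia
# Hypothetical helper: attempt t[!, col] and report whether it worked.
function supports_bang_indexing(t, col)
    try
        t[!, col]
        return true
    catch err
        # Treat "no such method" and "invalid index" as unsupported;
        # anything else (e.g. a missing column) is passed on to the caller.
        err isa Union{MethodError, ArgumentError} || rethrow()
        return false
    end
end

# Hypothetical toy container that opts in to t[!, col] indexing.
struct Toy
    cols::Dict{Symbol, Vector{Int}}
end
Base.getindex(t::Toy, ::typeof(!), col::Symbol) = t.cols[col]

toy = Toy(Dict(:a => [1, 2, 3]))
println(supports_bang_indexing(toy, :a))   # the toy type defines the method
println(supports_bang_indexing([1], 1))    # a plain Vector rejects ! as an index
```

The drawback is that the indexing operation is actually executed, so this only makes sense when that is cheap and has no side effects.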
Given an arbitrary R object, how can I obtain all the methods associated with the object?
The closest I can think of is methods (for an S3 object or function; it lists all available methods for an S3 generic function, or all methods for a class) or showMethods (for S4).
e.g.:
> A <- matrix(runif(10))
> B <- methods(class=class(A))
> B
[1] anyDuplicated.matrix as.data.frame.matrix as.raster.matrix*
[4] boxplot.matrix determinant.matrix duplicated.matrix
[7] edit.matrix* head.matrix isSymmetric.matrix
[10] relist.matrix* subset.matrix summary.matrix
[13] tail.matrix unique.matrix
Non-visible functions are asterisked
> attr(B,'info')
                     visible from
anyDuplicated.matrix    TRUE package:base
as.data.frame.matrix    TRUE package:base
as.raster.matrix       FALSE registered S3method
boxplot.matrix          TRUE package:graphics
determinant.matrix      TRUE package:base
duplicated.matrix       TRUE package:base
edit.matrix            FALSE registered S3method
head.matrix             TRUE package:utils
isSymmetric.matrix      TRUE package:base
relist.matrix          FALSE registered S3method
subset.matrix           TRUE package:base
summary.matrix          TRUE package:base
tail.matrix             TRUE package:utils
unique.matrix           TRUE package:base
Or for a function:
> methods(summary)
[1] summary.aov summary.aovlist summary.aspell*
[4] summary.connection summary.data.frame summary.Date
[7] summary.default summary.ecdf* summary.factor
[10] summary.glm summary.infl summary.lm
[13] summary.loess* summary.manova summary.matrix
[16] summary.mlm summary.nls* summary.packageStatus*
[19] summary.PDF_Dictionary* summary.PDF_Stream* summary.POSIXct
[22] summary.POSIXlt summary.ppr* summary.prcomp*
[25] summary.princomp* summary.srcfile summary.srcref
[28] summary.stepfun summary.stl* summary.table
[31] summary.tukeysmooth*
Non-visible functions are asterisked
?Methods may also prove a useful read.
The class of an R object is recovered with class. Objects do not have methods associated with them in typical R parlance. The class of an object determines what function-methods will be applied to it. In order to determine what functions have methods associated with a given class you would need to test all available functions to see whether there was a class-specific method. Even then generic functions would attempt to use a "default" method in most instances.
Some methods associated with a generic S3 function are displayed with methods. The methods of an S4 function are recovered with showMethods. So, for what most people would call "objects", your question does not make sense, but if it happened that you were including functions under the general term "objects" (which is technically fair) then I have answered.
showMethods(classes="data.frame")
methods(class="data.frame")
Then there is a group of methods that might be called "implicit", although their R name is "groupGeneric":
?groupGeneric
methods("Math") # These are "add-on" methods to the primitive Math functions
[1] Math.data.frame Math.Date Math.dates* Math.difftime Math.factor
[6] Math.mChoice Math.polynomial* Math.POSIXt Math.ratetable* Math.Surv*
[11] Math.times*
Non-visible functions are asterisked
?"+"
methods("Ops") # The binary operators such as "+", "-", "/"
[1] Ops.data.frame Ops.Date Ops.dates* Ops.difftime Ops.factor
[6] Ops.findFn Ops.mChoice Ops.numeric_version Ops.ordered Ops.polynomial*
[11] Ops.POSIXt Ops.raster* Ops.ratetable* Ops.Surv* Ops.times*
[16] Ops.ts* Ops.unit* Ops.yearmon* Ops.yearqtr* Ops.zoo*
Non-visible functions are asterisked
And even then you have not really displayed the members of the Math or the Ops family, but you would have seen them on the help page for ?groupGeneric. You do not see Ops.numeric. A somewhat lower-level view is provided by:
.Primitive("+")
# function (e1, e2) .Primitive("+")
These will throw an error if offered the wrong class argument.
Some packages define functions that are not methods but which are nevertheless intended for use with a particular class. For example, library(igraph) defines the function radius(_), which is intended for use on objects in the igraph class. Since such functions are not methods, methods(_) and showMethods(_) will not reveal them.
In such cases, lsf.str(_) can be very helpful. For example:
lsf.str("package:igraph")
includes the line:
radius : function (graph, mode = c("all", "out", "in", "total"))
I've read the documentation for parent.env() and it seems fairly straightforward - it returns the enclosing environment. However, if I use parent.env() to walk the chain of enclosing environments, I see something that I cannot explain. First, the code (taken from "R in a nutshell")
library( PerformanceAnalytics )
x = environment(chart.RelativePerformance)
while (environmentName(x) != environmentName(emptyenv()))
{
print(environmentName(parent.env(x)))
x <- parent.env(x)
}
And the results:
[1] "imports:PerformanceAnalytics"
[1] "base"
[1] "R_GlobalEnv"
[1] "package:PerformanceAnalytics"
[1] "package:xts"
[1] "package:zoo"
[1] "tools:rstudio"
[1] "package:stats"
[1] "package:graphics"
[1] "package:utils"
[1] "package:datasets"
[1] "package:grDevices"
[1] "package:roxygen2"
[1] "package:digest"
[1] "package:methods"
[1] "Autoloads"
[1] "base"
[1] "R_EmptyEnv"
How can we explain the "base" at the top and the "base" at the bottom? Also, how can we explain "package:PerformanceAnalytics" and "imports:PerformanceAnalytics"? Everything would seem consistent without the first two lines. That is, function chart.RelativePerformance is in the package:PerformanceAnalytics environment which is created by xts, which is created by zoo, ... all the way up (or down) to base and the empty environment.
Also, the documentation is not super clear on this - is the "enclosing environment" the environment in which another environment is created and thus walking parent.env() shows a "creation" chain?
Edit
Shameless plug: I wrote a blog post that explains environments, parent.env(), enclosures, namespace/package, etc. with intuitive diagrams.
1) Regarding how base could be there twice (given that environments form a tree), it's the fault of the environmentName function. The first occurrence is actually .BaseNamespaceEnv and the latter occurrence is baseenv().
> identical(baseenv(), .BaseNamespaceEnv)
[1] FALSE
2) Regarding imports:PerformanceAnalytics, that is a special environment that R sets up to hold the imports mentioned in the package's NAMESPACE or DESCRIPTION file, so that objects in it are encountered before anything else.
Try running this for some clarity. The str(p) and following if statements will give a better idea of what p is:
library( PerformanceAnalytics )
x <- environment(chart.RelativePerformance)
str(x)
while (environmentName(x) != environmentName(emptyenv())) {
p <- parent.env(x)
cat("------------------------------\n")
str(p)
if (identical(p, .BaseNamespaceEnv)) cat("Same as .BaseNamespaceEnv\n")
if (identical(p, baseenv())) cat("Same as baseenv()\n")
x <- p
}
The first few items in your results give evidence of the rules R uses to search for variables used in functions in packages with namespaces. From the R-ext manual:
The namespace controls the search strategy for variables used by functions in the package.
If not found locally, R searches the package namespace first, then the imports, then the base
namespace and then the normal search path.
Elaborating just a bit, have a look at the first few lines of chart.RelativePerformance:
head(body(chart.RelativePerformance), 5)
# {
# Ra = checkData(Ra)
# Rb = checkData(Rb)
# columns.a = ncol(Ra)
# columns.b = ncol(Rb)
# }
When a call to chart.RelativePerformance is being evaluated, each of those symbols --- whether the checkData on line 1, or the ncol on line 3 --- needs to be found somewhere on the search path. Here are the first few enclosing environments checked:
First off is namespace:PerformanceAnalytics. checkData is found there, but ncol is not.
Next stop (and the first location listed in your results) is imports:PerformanceAnalytics. This is the list of functions specified as imports in the package's NAMESPACE file. ncol is not found here either.
The base environment namespace (where ncol will be found) is the last stop before proceeding to the normal search path. Almost any R function will use some base functions, so this stop ensures that none of that functionality can be broken by objects in the global environment or in other packages. (R's designers could have left it to package authors to explicitly import the base environment in their NAMESPACE files, but adding this default pass through base does seem like the better design decision.)
The first base (the second entry in your output) is .BaseNamespaceEnv, while the second-to-last entry, also shown as base, is baseenv(). The two are much alike and differ (probably only) in their parents: the parent of .BaseNamespaceEnv is .GlobalEnv, while that of baseenv() is emptyenv().
In a package, as @Josh says, R searches the namespace of the package, then the imports, and then the base namespace (i.e., .BaseNamespaceEnv).
You can see this with, e.g.:
> library(zoo)
> packageDescription("zoo")
Package: zoo
# ... snip ...
Imports: stats, utils, graphics, grDevices, lattice (>= 0.18-1)
# ... snip ...
> x <- environment(zoo)
> x
<environment: namespace:zoo>
> ls(x) # objects in zoo
[1] "-.yearmon" "-.yearqtr" "[.yearmon"
[4] "[.yearqtr" "[.zoo" "[<-.zoo"
# ... snip ...
> y <- parent.env(x)
> y # namespace of imported packages
<environment: 0x116e37468>
attr(,"name")
[1] "imports:zoo"
> ls(y) # objects in the imported packages
[1] "?" "abline"
[3] "acf" "acf2AR"
# ... snip ...