How do I create a list in Julia?

I thought that a list could be written as follows.
But using the typeof() function, one can see that it's not a list.
julia> a = [1,"test", π]
julia> typeof(a)
Vector{Any} (alias for Array{Any, 1})

If you come from Python, then what Python calls a list corresponds to Vector{Any} in Julia, as in your example.
However, if you are interested in a linked-list data structure instead, the DataStructures.jl package provides one.
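To make the comparison concrete, here is a minimal sketch (the variable names are only illustrative). It shows that a Vector{Any} holds mixed types and grows in place like a Python list, while a literal whose elements share a type gets a concrete element type.

```julia
# A Vector{Any} can hold values of different types, much like a Python list.
a = [1, "test", π]
@show a isa Vector{Any}     # true: no common concrete type, so eltype is Any

push!(a, 2.5)               # append in place, like Python's list.append
@show length(a)             # 4

# If all elements share a type, Julia infers a concrete element type instead.
b = [1, 2, 3]
@show eltype(b)             # Int (Int64 on a 64-bit machine)

# An empty, growable "list" that accepts anything:
c = Any[]
push!(c, :symbol, "string", 3)
@show c
```

A concretely typed vector such as `b` is usually the better choice when all elements have the same type, because Julia can store and process it more efficiently.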

Related

Save a Function with JLD2

I want to save a function in a JLD2 file. Is this even possible and if so how?
using JLD2
a = [1,2,3]
function foo(x, y)
x .> y
end
foo(x) = foo(x, a)
save_object("stored.jld2", foo)
So I guess my point here is actually two things:
I want to save the function.
I want it to have the method foo(x) available, i.e. if the jld2 file is opened somewhere else it has the method foo(x) with a = [1,2,3].
When someone builds a machine learning model and saves it, it must work something like this, right?
Really hope it's understandable what I mean here. If not please let me know.
OK, after some more research and some thinking, I have come to the conclusion (happy to be proven wrong...) that this is not possible and/or not sensible. The comments on the question suggest using machine learning packages to save models, and I think this approach explains the underlying point...
You can store data in JLD2 files, but you cannot store functions in JLD2 files. What I tried here is not going to work because
foo(x) = foo(x, a)
accesses the object a in the global scope; it does not magically store the data of a within the function. Furthermore, functions cannot be saved with JLD2. Who knows, that may change in the future.
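The global lookup can be seen in a small sketch that uses only the definitions from the question (r1 and r2 are names added here for illustration). Rebinding a changes what foo(x) returns, which shows that the method does not carry its own copy of the data.

```julia
a = [1, 2, 3]
foo(x, y) = x .> y
foo(x) = foo(x, a)        # `a` is looked up in global scope when foo(x) is called

r1 = foo([2, 2, 2])       # compares against [1, 2, 3]
a = [0, 0, 0]             # rebind the global
r2 = foo([2, 2, 2])       # now compares against [0, 0, 0]

@show r1 == [true, false, false]
@show r2 == [true, true, true]
```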
Great, but what's the solution now??
What you can instead do is store the data...
using JLD2
a = [1,2,3]
save("bar.jld2", Dict("a" => a))
...and make the function available through a module or a script you include().
include("module_with_foo_function.jl")
using .module_with_foo_function  # leading dot: assumes the included file defines a module of this name
using JLD2
my_dict = load("bar.jld2")
a = my_dict["a"]
foo(x) = foo(x, a)
And if you think about it, this is exactly what packages like MLJ are doing as well. There you store an object that contains the information, but the functionality comes from the package, not from your stored object.
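This split between stored data and code-provided behaviour can be sketched with a small struct. The names ThresholdModel and predict are hypothetical, chosen only for illustration; they are not part of JLD2 or MLJ.

```julia
# Hypothetical "model": the struct carries the data, the methods carry the behaviour.
struct ThresholdModel
    a::Vector{Int}
end

# Behaviour lives in code that must be loaded wherever the model is used.
predict(m::ThresholdModel, x) = x .> m.a

# Making the object callable gives the foo(x)-style interface from the question.
(m::ThresholdModel)(x) = predict(m, x)

model = ThresholdModel([1, 2, 3])   # this object is the part one would write to disk
@show model([2, 2, 2])
```

Only the object `model` would need to be stored. The struct definition and its methods would come from a module or package that is loaded before the file is read back.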
Not sure anyone ever has a similar problem, but if so I hope these thoughts help.

How to save a file in Julia

At some point (I think Julia v0.7), you could do @save savepath thingtosave in order to save files using Julia. I tried to run this on v0.7 to see if I got a deprecation warning, but even on 0.7 it says that @save is undefined.
How can I programmatically save files using Julia?
Since you mention @save, presumably you were using JLD.jl or its successor JLD2.jl.
A simple example for using JLD2 would be
julia> using JLD2
julia> x = rand(2, 2);
julia> @save "test.jld2" x
julia> x = nothing # "forgetting" x
julia> @load "test.jld2"
1-element Array{Symbol,1}:
:x
julia> x
2×2 Array{Float64,2}:
0.698264 0.319665
0.252174 0.80799
In contrast to write, these packages build on the HDF5 format: JLD through HDF5.jl, and JLD2 through its own implementation of a subset of HDF5. They pretty much allow you to store arbitrary Julia objects. HDF5 (not necessarily JLD/JLD2) is a file format supported by almost all programming languages and many programs (Mathematica, for example). It is suitable for long-term storage, in contrast to read/write, which might change in future Julia versions.
Note that @save doesn't show up in 0.7 since it is a package feature and not part of Base (or a stdlib).
From the Julia docs, there is the write function:
write(io::IO, x)
write(filename::AbstractString, x)
Write the canonical binary representation of a value to the given I/O stream or file. Return the number of bytes written into the stream. See also print to write a text representation (with an encoding that may depend upon io).
You can write multiple values with the same write call. i.e. the following are equivalent:
write(io, x, y...)
write(io, x) + write(io, y...)
Writing a text file:
write("text.txt","this is a test")
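As a slightly fuller sketch using only Base functions, the following writes a file, checks the returned byte count, appends a line, and reads the result back. The temporary path is an assumption made so the example does not touch the working directory.

```julia
path = joinpath(mktempdir(), "text.txt")   # temporary location, assumed for the example

nbytes = write(path, "this is a test")     # returns the number of bytes written
@show nbytes                               # 14

# Writing several values in one call returns the total byte count.
io = IOBuffer()
total = write(io, "ab", "cde")
@show total                                # 5

# Append to the file, then read it back as text.
open(path, "a") do f
    write(f, "\nsecond line")
end
@show read(path, String)
@show readlines(path)
```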

Are these strings or variables?

Coming from a C / Python / Java background, I have trouble understanding some R syntax, where literals look like variables, but seem to behave like strings. For example:
library(ggplot2)
library("ggplot2")
The two lines behave equivalently. However, I would expect the first line to mean "load the library whose name is stored in the ggplot2 variable" and give an error like object 'ggplot2' not found.
Speaking of ggplot2:
ggplot(data, aes(factor(arrivalRate), responseTime, fill=factor(mode))) +
geom_violin(trim=FALSE, position=dodge)
The variables arrivalRate, responseTime and mode do not exist, but somehow R knows to look them up inside the data data frame. I assume that aes actually receives strings, that are then processed using something like eval.
How does R parse code that it ends up interpreting some literals as strings?
promises
When an argument is passed to a function it is not passed as a value but is passed as a promise which consists of
the expression or code that the caller uses as the actual argument
the environment in which that expression is to be evaluated, viz. the caller's environment.
the value that the expression represents when the expression is evaluated in the promise's environment -- this slot is not filled in until the promise is actually evaluated. It will never be filled in if the function never accesses it.
The pryr package can show the info in a promise:
library(pryr)
g <- function(x) promise_info(x)
g(ggplot2)
giving:
$code
ggplot2 <-- the promise x represents the expression ggplot2
$env
<environment: R_GlobalEnv> <-- if evaluated it will be done in this environment
$evaled
[1] FALSE <-- it has not been evaluated
$value
NULL <-- not filled in because promise has not been evaluated
The only one of the above slots in the pryr output that can be accessed at the R level without writing a C function to do it (or using a package such as pryr that accesses such C code) is the code slot. That can be done using the R function substitute(x) (or other means). In terms of the pryr output substitute applied to a promise returns the code slot without evaluating the promise. That is, the value slot is not modified. Had we accessed x in an ordinary way, i.e. not via substitute, then the code would have been evaluated in the promise's environment, stored in the value slot and then passed to the expression in the function that accesses it.
Thus either of the following result in a character string representing what was passed as an expression, i.e. the character representation of the code slot, as opposed to its value.
f <- function(x) as.character(substitute(x))
f("ggplot2")
## [1] "ggplot2"
f(ggplot2)
## [1] "ggplot2"
library
In fact, library uses this idiom, i.e. as.character(substitute(x)), to handle its first argument.
aes
The aes function uses match.call to get the entire call as an expression and so in a sense is an alternative to substitute. For example:
h <- function(x) match.call()
h(pi + 3)
## h(x = pi + 3)
Note
One cannot tell without looking at the documentation or code of a function how it will treat its arguments.
An interesting quirk of the R language is the way it evaluates expressions. In most cases, R behaves the way you'd expect. Expressions in quotes are treated as strings, anything else is treated as a variable, function, or other token. But some functions allow for "non-standard evaluation", in which an unquoted expression is evaluated, more or less, as if it were a quoted variable. The most common example of this is R's way of loading libraries (which allows for unquoted or quoted library names) and its succinct formula interface. Other packages can take advantage of NSE. Hadley Wickham makes extensive use of it throughout his extremely popular tidyverse packages. Aside from saving the user a few characters of typing, NSE has a number of useful properties for dynamic programming.
As noted in the other answer, Wickham has an excellent tutorial on how it all works. RPubs user lionel also has a great working paper on the topic.
The concept is called "non-standard evaluation", and there are many different ways in which it can be used in different R functions. See this book chapter for an introduction.
This language feature can be confusing, and arguably is not needed for the library() function, but it allows incredibly powerful code when you need to specify computations on data frames, as is the case in ggplot2 or in dplyr, for example.
The lines
library(ggplot2)
library("ggplot2")
are not equivalent. In the first line, ggplot2 is a symbol, which may or may not be bound to some value. In the second line, "ggplot2" is a character vector of length one.
A function, however, can manipulate the arguments that it gets without evaluating them, and can decide to treat both cases equivalently, which is apparently what library does.
Here's an example of how to manipulate an unevaluated expression:
> f <- function(x) match.call() # return unevaluated function call
> x <- f(foo)
> x
f(x = foo)
> mode(x)
[1] "call"
> x[[1]]
f
> x[[2]]
foo
> mode(x[[2]])
[1] "name"
> as.character(x[[2]])
[1] "foo"
> x <- f("foo")
> mode(x[[2]])
[1] "character"

Run Julia function every time Julia environment launches

I am moving from R, and I use the head() function a lot. I couldn't find a similar method in Julia, so I wrote one for Julia Arrays. There are a couple of other R functions that I'm porting to Julia as well.
I need these methods to be available for use in every Julia instance that launches, whether through IJulia or through command line. Is there a "startup script" of sorts for Julia? How can I achieve this?
PS: In case someone else is interested, this is what I wrote. A lot needs to be done for general-purpose use, but it does what I need it to for now.
function head(obj::Array; nrows=5, ncols=size(obj)[2])
if (size(obj)[1] < nrows)
println("WARNING: nrows is greater than actual number of rows in the obj Array.")
nrows = size(obj)[1]
end
obj[[1:nrows], [1:ncols]]
end
You can create a startup file that Julia runs at launch: ~/.julia/config/startup.jl in Julia 0.7 and later (it was ~/.juliarc.jl in earlier versions). See the manual for details.
As for your head function, here is how I'd do it:
function head(obj::Array; nrows=5, ncols=size(obj, 2))
    if size(obj, 1) < nrows
        # @warn replaces the warn() function, which was removed in Julia 1.0
        @warn "nrows is greater than the actual number of rows in the obj Array."
        nrows = size(obj, 1)
    end
    obj[1:nrows, 1:ncols]
end
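As a sketch of what the startup file could contain, the variant below clamps the requested size instead of warning and adds a vector method. It assumes Julia 1.6 or later, where Base provides first(itr, n); the keyword name n is a choice made for this example.

```julia
# Contents one might put in ~/.julia/config/startup.jl (Julia 0.7 and later).

# Matrix method: clamp the requested size instead of erroring.
function head(obj::AbstractMatrix; nrows=5, ncols=size(obj, 2))
    nrows = min(nrows, size(obj, 1))
    ncols = min(ncols, size(obj, 2))
    return obj[1:nrows, 1:ncols]
end

# Vector method: first(v, n) from Base already stops at the end of the collection.
head(obj::AbstractVector; n=5) = first(obj, n)

m = reshape(1:20, 10, 2)
@show head(m)
@show head(m; nrows=2, ncols=1)
@show head([1, 2, 3])
```

For plain vectors, calling first(v, 5) directly may already be enough, in which case no custom function is needed.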

I want to have a list of all the functions in my R workspace that works like the objects() command

I love working with R functions, so my workspace accumulates a lot of functions.
However, the "objects()" command seems to return strings that name my objects instead of the objects themselves. So when I have a function named "barchart00", it shows up with the objects() command, and if I test its type, it is detectable as a function, as the following code shows:
> is.function(barchart00)
[1] TRUE
> objects()
[1] "barchart00"
> OL<-objects()
> OL
[1] "barchart00"
> is.function(OL[1])
[1] FALSE
This wouldn't be a problem if I had just one or two or three functions. But in practice I have dozens of functions, AND dozens of objects that are not functions, and I want to get a list of functions that's just as convenient as the list of objects returned by objects().
My first thought was that if objects() returned a list of actual objects, I could just go through that list and test for function status. But in fact, objects() seems to return a list of strings that are the names of my objects, not the objects themselves.
Any constructive advice would be greatly appreciated. Thanks.
...Hong Ooi answered the question but I can't mark it as answered for another eight hours.
lsf.str() is the syntax I was looking for.
All credit should go to Hong Ooi.
https://stackoverflow.com/users/474349/hong-ooi
Thanks, Hong Ooi.
lsf.str looks like a fine answer. If you'd like a more general tool, here's one from my (horn-tooting here) cgwtools package. You can get a list of any particular type of object in your environment (not just closures).
lstype <- function(type='closure'){
#simple command to get only one type of object in current environment
# Note: if you foolishly create variables named 'c' ,'q' ,'t' or the like,
# this will fail because typeof finds the builtin function first
inlist<-ls(.GlobalEnv)
if (type=='function') type <-'closure'
typelist<-sapply(sapply(inlist,get),typeof)
return(names(typelist[typelist==type]))
}

Resources