R calling Fortran subroutine

I understand that .Fortran in the following code invokes a Fortran subroutine, but why is the subroutine name prefixed with C_ here? The few other examples of calling subroutines that I found on the internet simply use "stl". Can someone please explain why it is C_stl instead of stl?
z <- .Fortran(C_stl, x, n,
as.integer(period),
as.integer(s.window),
as.integer(t.window),
as.integer(l.window),
s.degree, t.degree, l.degree,
nsjump = as.integer(s.jump),
ntjump = as.integer(t.jump),
nljump = as.integer(l.jump),
ni = as.integer(inner),
no = as.integer(outer),
weights = double(n),
seasonal = double(n),
trend = double(n),
double((n+2*period)*5))

C_stl is an object in the stats package containing auxiliary information about the Fortran subroutine. It's not exported, so to see it you'll have to type stats:::C_stl.
> stats:::C_stl
$name
[1] "stl"
$address
<pointer: 0x000000000f87b950>
attr(,"class")
[1] "RegisteredNativeSymbol"
$dll
DLL name: stats
Filename: E:/apps/R/R-3.1.1/library/stats/libs/x64/stats.dll
Dynamic lookup: FALSE
$numParameters
[1] 18
attr(,"class")
[1] "FortranRoutine" "NativeSymbolInfo"

After a lot of searching I believe I found the answer. Look in the NAMESPACE file in the directory <path to R sources>/src/library/stats.
You'll see that all C/Fortran routines are referred to with names prefixed with C_. This appears to be done by useDynLib, through its .fixes = "C_" argument.
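To check this in a live session rather than in the sources, something like the following sketch should work. It assumes a stock R installation in which the stats namespace still contains the C_stl object shown above. The variable names (sym, fortran_name, prefixed) are only illustrative.

```r
# Sketch: inspect the registered-symbol object behind .Fortran(C_stl, ...).
# Assumes the stats namespace still defines C_stl, as in the output above.
sym <- stats:::C_stl

# The Fortran-level name is plain "stl"; "C_" is only a prefix on the R
# variable that holds the symbol information.
fortran_name <- sym$name
sym_class <- class(sym)

print(fortran_name)
print(sym_class)

# List the other prefixed symbol objects in the namespace (names only).
prefixed <- grep("^C_", ls(asNamespace("stats"), all.names = TRUE), value = TRUE)
print(head(prefixed))
```

So inside the package, .Fortran receives the symbol object itself rather than the string "stl", which is why the examples you found elsewhere (user code calling by name) look different.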


How to store vector of string in Julia?

I have two one-dimensional numerical vectors (A, B) of size ~800000. Through a hashing operation, each pair of elements is combined into a string of length 6. I don't know how to store these strings: whatever I do, I get an error.
I've tried using Array{String}(undef, 6) and also a Dict.
My attempt looks like this:
# import packages: read from csv, hashing
using DelimitedFiles
using GeohashHilbert
csvfilename = "C:/Users/lin/Desktop/uber-raw-data-jul14.csv"
csvdata = readdlm(csvfilename, ',', header=true)
data = csvdata[1]
header = csvdata[2]
lat = data[:,2]
long = data[:,3]
lat_len = length(lat)
#GeoHashed = GeohashHilbert.encode(lon, lat, precision, bits_per_char)
GeoHashed = Dict()
for i in 1:lat_len
GeoHashed[i] = GeohashHilbert.encode(long[i], lat[i], 6, 6)
end
What's the issue??
ERROR: LoadError: MethodError: no method matching isless(::Int64, ::SubString{String})
Closest candidates are:
isless(::AbstractString, ::AbstractString) at strings/basic.jl:344
isless(::Any, ::Missing) at missing.jl:88
isless(::Missing, ::Any) at missing.jl:87
...
It looks like you're feeding strings into the encode function which doesn't work:
julia> GeohashHilbert.encode("51.1", "0.5", 6, 6)
ERROR: MethodError: no method matching isless(::Int64, ::String)
...
Stacktrace:
[1] <(x::Int64, y::String)
@ Base .\operators.jl:352
[2] <=(x::Int64, y::String)
@ Base .\operators.jl:401
[3] encode(lon::String, lat::String, precision::Int64, bits_per_char::Int64)
@ GeohashHilbert C:\Users\ngudat\.julia\packages\GeohashHilbert\vh6xu\src\GeohashHilbert.jl:118
[4] top-level scope
@ REPL[62]:1
[5] top-level scope
@ C:\Users\ngudat\.julia\packages\CUDA\KnJGx\src\initialization.jl:52
You probably meant to do:
julia> GeohashHilbert.encode(51.1, 0.5, 6, 6)
"W13T#3"
So your problem is likely in reading the data. It's impossible to tell without having the CSV file available, but I assume that typeof(lat) would give you Vector{SubString{String}} (or a similar non-numeric type) instead of the Vector{Float64} you seem to expect.
The solution is probably either to use a more fully featured CSV reader such as CSV.jl, so that you end up with numerical data, or to call parse.(Float64, lat) to convert the data after reading it in.

R S3 generic methods set to visible FALSE

I have an R object lf which is an element of the class tbl_lazy:
library(dbplyr)
lf <- lazy_frame(a = TRUE, b = 1, c = 2, d = "z", con = simulate_hana())
> class(lf)
[1] "tbl_HDB" "tbl_lazy" "tbl"
With the help of the sloop package, I can see that the method print.tbl_lazy is set to visible = FALSE. This seems to be the reason why typing print.tbl_lazy at the console returns Error: object 'print.tbl_lazy' not found.
   generic class    visible source
   <chr>   <chr>    <lgl>   <chr>
11 print   tbl_lazy FALSE   registered S3method
When I debug print I see the call to print.tbl_lazy and can now see the content of the method.
debugging in: function (x, ...)
UseMethod("print")(x)
debug: UseMethod("print")
Browse[2]> n
debugging in: print.tbl_lazy(x)
debug: {
show_query(x)
}
My question is: why are all the methods of the class tbl_lazy set to visible = FALSE, and what are the consequences of this? It seems to me that, whatever advantages it may have, it makes the code of the method more difficult to access, which looks like a big disadvantage in a language like R that is used by so many non-technical users.
I wasn't able to find any documentation on this.

How to get the queue number from CONDOR into your R job

I think I have a simple problem because I was looking up and down the internet and couldn't find someone else asking this question:
My university has a Condor set-up. I want to run several repetitions of the same code (e.g. 100 times). My R code has a routine to store the results in a file, i.e.:
write.csv(res, file=paste(paste(paste(format(Sys.time(), '%y%m%d'),'res', queue, sep="_"), sep='/'),'.csv',sep='',collapse=''))
res are my results (a data.frame), I indicate that this file contains the results with 'res' and finally I want to add the queue number of this calculation (otherwise files would be replaced, wouldn't they?). It should look like: 140109_res_1.csv, 140109_res_2.csv, ...
My submit file to condor looks like this:
universe = vanilla
executable = /usr/bin/R
arguments = --vanilla
log = testR.log
error = testR.err
input = run_condor.r
output = testR$(Process).txt
requirements = (opsys == "LINUX") && (arch == "X86_64") && (HAS_R_2_13 =?= True)
request_memory = 1000
should_transfer_files = YES
transfer_executable = FALSE
when_to_transfer_output = ON_EXIT
queue 3
How do I get the 'queue' number into my R code? I tried a simple example with
print(queue)
print(Queue)
But there is no object found called queue or Queue. Any suggestions?
Best wishes,
Marco
Okay, I solved the problem. This is how it goes:
I had to change my submit file. I changed the arguments line to:
arguments = --vanilla --args $(Process)
Now the process number is forwarded to the R code, where you retrieve it with the following line. The value is stored as a character string, so you should convert it to a numeric value (also check whether a number like 10 is passed on as '1' and '0', in which case you should collapse the values as well).
run <- commandArgs(TRUE)
Here is an example of the code I ran.
> run <- commandArgs(TRUE)
> run
[1] "0"
> class(run)
[1] "character"
> try(as.numeric(run))
[1] 0
> try(run <- as.numeric(paste(run, collapse='')) )
> try(print(run))
[1] 0
> try(write(run, paste(run,'csv', sep='.')))
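The retrieval step above can be wrapped into two small helpers. This is only a sketch: parse_process_id and result_file are hypothetical names, and it assumes the submit file passes $(Process) as the first argument after --args. As far as I know, commandArgs(TRUE) returns each argument as one whole string, so "10" arrives as "10" and no collapsing should be needed.

```r
# Hypothetical helper: read the Condor process number from the trailing
# command-line arguments (what follows --args) and convert it to an integer.
parse_process_id <- function(args = commandArgs(trailingOnly = TRUE)) {
  if (length(args) < 1L) stop("no process number supplied after --args")
  id <- suppressWarnings(as.integer(args[1L]))
  if (is.na(id)) stop("process number is not an integer: ", args[1L])
  id
}

# Hypothetical helper: build a file name like 140109_res_1.csv, following the
# naming scheme described in the question.
result_file <- function(id, date = Sys.Date()) {
  sprintf("%s_res_%d.csv", format(date, "%y%m%d"), id)
}

# Example with explicit inputs, so it also runs outside Condor.
example_id <- parse_process_id("10")
example_name <- result_file(example_id, as.Date("2014-01-09"))
print(example_name)
```

In the real job you would call parse_process_id() without arguments and then write.csv(res, file = result_file(id)).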
You can also find information on how to pass variables/arguments to your code here: http://research.cs.wisc.edu/htcondor/manual/v7.6/condor_submit.html
I hope this helps someone.
Cheers and thanks for all other commenters!
Marco

R: Strange behavior while saving list() with save() from function output

I am currently facing a strange problem while saving lists and 'sublists' with R. The title may not be explicit, but here is what is troubling me:
Given some data (the data here is totally artificial, but the relevance of the model is not the point):
set.seed(1)
a0 = rnorm(10000,10,2)
b1 = rnorm(10000,10,2)
b2 = rnorm(10000,10,2)
b3 = rnorm(10000,10,2)
data = data.frame(a0,b1,b2,b3)
And a function returning a list of complex objects (let's say lm() objects) :
test = function(k){
tt = vector('list',k)
for(i in 1:k) tt[[i]] = lm(a0~b1+b2+b3,data = data)
tt
}
Our test function returns a list of lm() objects. Let's look at the size of this object:
ok = test(2)
> object.size(ok)
4019336 bytes
Let's create ok2, an identical object, but not built inside a function:
ok2 = vector('list',2)
ok2[[1]] = lm(a0~b1+b2+b3,data = data)
ok2[[2]] = lm(a0~b1+b2+b3,data = data)
... and check its size:
> object.size(ok2)
4019336 bytes
Here we are: ok and ok2 are exactly the same size, or so R tells us.
The problem appears when we save these objects to the hard drive as R objects (with save() or saveRDS()):
save(ok,file='ok.RData')
save(ok2,file='ok2.RData')
Their sizes on the hard drive are 3 366 005 bytes and 1 678 851 bytes respectively.
ok is twice as big as ok2, even though they are exactly the same!
Even stranger, if you save a 'sublist' of our objects, let's say ok[[1]] and ok2[[1]] (objects that are once again totally identical):
a = ok[[1]]
a2 = ok2[[1]]
save(a,file='console/a.RData')
save(a2,file='console/a2.RData')
Their sizes on the hard drive are 2 523 284 bytes and 838 977 bytes respectively.
Two things:
Why does the size of a differ from the size of a2 on the hard drive? Why does the size of ok differ from the size of ok2 on the hard drive?
And why does a, which is exactly half of ok, take 2 523 284 bytes while ok takes 3 366 005 bytes on the hard drive?
Am I missing something?
PS: I ran this test under Windows 7 32-bit with R 2.15.1, 2.15.2, 2.15.3 and 3.0.0, and under Debian with R 2.15.1 and R 2.15.2. I get this problem every time.
EDIT
Thanks to @user1609452, here is a little trick that seems to work:
test2 = function(k){
tt = vector('list',k)
for(i in 1:k){
tt[[i]] = lm(a0~b1+b2+b3,data = data)
attr(tt[[i]]$terms,".Environment") = .GlobalEnv
attr(attr(tt[[i]]$model,"terms"),".Environment") = .GlobalEnv
}
tt
}
Formula objects come with their own environment and a lot of stuff in it. Set it to NULL or to .GlobalEnv and it seems to work. Functions like predict.lm() still work and our saved objects have the right size on the hard drive. Not sure why, though.
Look at
> attr(ok[[1]]$terms,".Environment")
<environment: 0x9bcf3f8>
> attr(ok2[[1]]$terms,".Environment")
<environment: R_GlobalEnv>
also
> ls(envir = attr(ok[[1]]$terms,".Environment"))
[1] "i" "k" "tt"
so ok is dragging around the environment of the function with it.
Also read ?object.size
The calculation is of the size of the object, and excludes the
space needed to store its name in the symbol table.
Associated space (e.g. the environment of a function and what the
pointer in a ‘EXTPTRSXP’ points to) is not included in the
calculation.
For example define a test2 and an ok3
test2 = function(k){
tt = vector('list',k)
for(i in 1:k) tt[[i]] = lm(a0~b1+b2+b3,data = data)
rr = tt
tt
}
ok3 <- test2(2)
save(ok3, file = 'ok3.RData')
> file.info('ok3.RData')$size
[1] 5043933
> file.info('ok.RData')$size
[1] 3366005
> file.info('ok2.RData')$size
[1] 1678851
> ls(envir = attr(ok3[[1]]$terms,".Environment"))
[1] "i" "k" "rr" "tt"
So ok is roughly twice as big as ok2 on disk because its environment carries the extra tt, and ok3 is three times as big because its environment carries both tt and rr.
> c(object.size(ok),object.size(ok2),object.size(ok3))
[1] 4019336 4019336 4019336
There is related discussion here
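The same effect can be reproduced without lm() at all. The sketch below is a made-up minimal case (make_formula and big are hypothetical names): a formula created inside a function keeps a reference to that function's environment, so serializing it, which is what save() and saveRDS() do, also writes out every local variable. Resetting the environment to the global one drops that baggage, with the caveat that variables in the formula are then looked up in the global environment.

```r
# Hypothetical minimal case: a formula built inside a function captures the
# function's local environment, including the large vector `big`.
make_formula <- function() {
  big <- rnorm(1e5)  # about 800 kB of doubles, never used by the formula
  y ~ x
}

f_fun <- make_formula()

# Same formula, but pointing at the global environment instead.
f_fixed <- f_fun
environment(f_fixed) <- globalenv()

# What the captured environment contains.
captured <- ls(environment(f_fun))
print(captured)

# Serialized sizes in bytes: the first includes `big`, the second does not.
size_fun <- length(serialize(f_fun, NULL))
size_fixed <- length(serialize(f_fixed, NULL))
print(c(size_fun, size_fixed))
```

object.size() reports nearly the same figure for both formulas because, as quoted from ?object.size above, associated environments are not included in its calculation.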

Importing package namespace into default namespace [duplicate]

This question already has an answer here:
How can I read the source code for an R function?
(1 answer)
Closed 9 years ago.
Frequently when I am working with R and I want to find out what the function does, I type in the name of the function and scroll through the code. However, sometimes when I type in the name of the function I get a response that does not tell me anything.
> library(limma)
> plotMDS #can't get to the code
function (x, ...)
UseMethod("plotMDS")
<environment: namespace:limma>
> limma:::plotMDS
function (x, ...)
UseMethod("plotMDS")
<environment: namespace:limma>
> heatmap #I'm expecting something more like this
function (x, Rowv = NULL, Colv = if (symm) "Rowv" else NULL,
distfun = dist, hclustfun = hclust, reorderfun = function(d,
w) reorder(d, w), add.expr, symm = FALSE, revC = identical(Colv,
"Rowv"), scale = c("row", "column", "none"), na.rm = TRUE,
margins = c(5, 5), ColSideColors, RowSideColors, cexRow = 0.2 +
1/log10(nr), cexCol = 0.2 + 1/log10(nc), labRow = NULL,
labCol = NULL, main = NULL, xlab = NULL, ylab = NULL, keep.dendro = FALSE,
verbose = getOption("verbose"), ...)
{
scale <- if (symm && missing(scale))
"none"
else match.arg(scale)
/* ... many lines removed ... */
}
invisible(list(rowInd = rowInd, colInd = colInd, Rowv = if (keep.dendro &&
doRdend) ddr, Colv = if (keep.dendro && doCdend) ddc))
}
<bytecode: 0x16199b8>
<environment: namespace:stats>
Thus, I was wondering if there is a way to import a package's namespace into the default namespace so I can look at the code in functions (and debug things more easily). I've been reading up on namespaces, but most of the material is written for developers, so it talks about how to export namespaces for packages.
plotMDS is the generic function. What you access via plotMDS and limma:::plotMDS is exactly the same thing; the latter is just a less efficient way to get it. What you want to get at are the methods for this generic function.
To see the list of methods for plotMDS try
methods(plotMDS)
That will return a vector of function names. I can't install limma so here is what we see for the base plot generic [in my current session]:
> methods(plot)
[1] plot.acf* plot.correspondence* plot.data.frame*
[4] plot.decomposed.ts* plot.default plot.dendrogram*
[7] plot.density plot.ecdf plot.factor*
[10] plot.formula* plot.function plot.hclust*
[13] plot.histogram* plot.HoltWinters* plot.isoreg*
[16] plot.lda* plot.lm plot.mca*
[19] plot.medpolish* plot.mlm plot.ppr*
[22] plot.prcomp* plot.princomp* plot.profile*
[25] plot.profile.nls* plot.ridgelm* plot.spec
[28] plot.stepfun plot.stl* plot.table*
[31] plot.ts plot.tskernel* plot.TukeyHSD
Non-visible functions are asterisked
To access the code of non-starred functions we just enter the full function name, e.g.
> plot.density
function (x, main = NULL, xlab = NULL, ylab = "Density", type = "l",
zero.line = TRUE, ...)
{
....
To see the code for starred functions/methods you need the pkg:::function structure, e.g. for the plot.data.frame method
> plot.data.frame
Error: object 'plot.data.frame' not found
> graphics:::plot.data.frame
function (x, ...)
{
....
If you don't know which namespace a method belongs to, then use getAnywhere, e.g.
> getAnywhere(plot.data.frame)
A single object matching ‘plot.data.frame’ was found
It was found in the following places
registered S3 method for plot from namespace graphics
namespace:graphics
with value
function (x, ...)
{
....
The printed results indicate the relevant namespace (in this case graphics) plus return the value of the function, or the code.
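If you want the method as an object rather than printed output, utils::getS3method is another option. A small sketch, using only the plot.data.frame method from the example above (the variable names are illustrative, and "no_such_class" is a made-up class):

```r
# Look the method up by generic name and class; this works for registered
# methods even when they are not exported from their namespace.
m1 <- utils::getS3method("plot", "data.frame")

# Equivalent lookup by full name inside the namespace
# (the same thing graphics:::plot.data.frame does).
m2 <- get("plot.data.frame", envir = asNamespace("graphics"))

# With optional = TRUE a missing method yields NULL instead of an error.
missing_method <- utils::getS3method("plot", "no_such_class", optional = TRUE)

print(is.function(m1))
print(is.function(m2))
print(is.null(missing_method))
```

You can then print m1 to read the code, or pass it to debug().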
This is a really crude alternative, but it does what was requested:
Firstly, copy the contents of the namespace to a list in the global environment:
L <- as.list(asNamespace("yourpackage"))
Now you can either navigate L or copy all its contents to equally named objects in the global environment with this:
invisible(list2env(L, envir = globalenv()))
Warning: this will overwrite whatever objects you have defined with the same names! So use with care.
