Please mark the list of removed functions in the current version documents. New aspirants like myself will find it difficult to trace deprecated or modified functions!!!
What happened to functions like dec, chr2ind and ind2chr etc? Is there any doc on deprecated functions??
Up till Julia 1.0 the policy was that if function were to be removed in version 0.X, then it was deprecated in version 0.X-1.
After Julia 1.0 you can expect that breaking changes like function removal will not happen till Julia 2.0.
A practical advice is twofold:
strict: if you have a program or manual written for Julia 0.X you should install this version to run it (this is probably an advice that is the same as for any other software product)
loose: Julia 0.7 has the same functionality as Julia 1.0 but it prints deprecation warnings for functions that were removed/renamed from Julia 0.6.
Now regarding your specific questions I relate to changes between Julia 0.6 and 1.0 (as you have not specified for which Julia version you had your program/manual):
instead of dec(10, 3) use string(10, pad=3)
instead of chr2ind("αβγdef", 2) use nextind("αβγdef", 0, 2)
instead of ind2chr("αβγdef", 2) use length("αβγdef", 1, 2)
(note, however, that regarding string handling Julia 1.0 introduced some breaking changes in the infrastructure - in particular you can ingest and work with even invalid UTF-8 strings so in some cases these functions might not have exactly identical behavior)
Now regarding finding removals. What I typically do is search a Julia GitHub repo for a given function. Most of the time it is easy to find a commit that deprecates it. For example here is a commit deprecating chr2ind and ind2chr: https://github.com/JuliaLang/julia/commit/dcf9552ace3331cbd5426f91a5c84c8e810f9a91. The additional benefit from this approach is that you can get the understanding of the reason that lead to the change (as you have reference back to a specific issues/PRs). In this case you can see that the specific functions got deprecated over one year ago, which means that your sources are probably relatively old and one year in pre Julia 1.0 world was a lot of time as it was evolving very fast back then.
Related
In short: How can I call, from within Rccp C++ code, the agrep C internal function that gets called when users use the regular agrep function from base R?
In long: I have found multiple questions here about how to invoke, from within Rcpp, a C or C++ function created for another package (e.g. using C function from other package in Rcpp
and Rcpp: Call C function from a package within Rcpp).
The thing that I am trying to achieve, however, is at the same time simpler but also way less documented: it is to directly call, from within Rcpp, a .Internal C function that comes with base R rather than another package, without interfacing with R (that is, without doing what is said in Call R functions in Rcpp). How could I do that for the .Internal C function that lays underneath base R's agrep wrapper?
The specific function I am trying to call here is the agrep internal C function. And for context, what I am ultimately trying to achieve is to speed-up a call to agrep for when millions of patterns must be each checked against each of millions of x targets.
Great question. The long and short of it is "You cant" (in many cases) unless the function is visible in one of the header files in "src/include/". At least not that easily.
Not long ago I had a similar fun challenge, where I tried to get access to the do_docall function (called by do.call), and it is not a simple task. First of all, it is not directly possible to just #include <agrep.c> (or something similar). That file simply isn't available for inclusion, as it is not a part of the "src/include". It is compiled and the uncompiled file is removed (not to mention that one should never "include" a .c file).
If one is willing to go the mile, then the next step one could look at is "copying" and "altering" the source code. Basically find the function in "src/main/agrep.c", copy it into your package and then fix any errors you find.
Problems with this approach:
As documented in R-exts the internal structures of sexprec_info is not made public (this is the base structure for all objects in R). Many internal function use the fields within this structure, so one has to "copy" the structure into your source code, to make it public to your code specifically.
If you ever #include <Rcpp.h> prior to this file, you will need to go through each and every call to internal functions and likely add either R_ or Rf_.
The function may contain calls to other "internal" functions, that further needs to be copied and altered for it to work.
You will also need to get a clear understanding of what CDR, CAR and similar does. The internal functions have a documented structure, where the first argument contains the full call passed to the function, and function like those 2 are used to access parts of the call.
I did myself a solid and rewrote do_docall changing the input format, to avoid having to consider this. But this takes time. The alternative is to create a pairlist according to the documentation, set its type as a call-sexp (the exact name is lost to me at the moment) and pass the appropriate arguments for op, args and env.
And lastly, if you go through the steps, and find that it is necessary to copy the internal structures of sexprec_info (as described later), then you will need to be very careful about when you include Rinternals and Rcpp, as any one of these causes your code to crash and burn in the most beautiful and silent way if you include your header and these in the wrong order! Note that this even goes for [[Rcpp::export]], which may indeed turn out to include them in the wrong arbitrary order!
If you are willing to go this far down the drainage, I would suggest carefully reading adv-R "R's C interface" and Chapter 2, 5 and 6 of R-ext and maybe even the R internal manual, and finally once that is done take a look at do_docall from src/main/coerce.c and compare it to the implementation in my repository cmdline.arguments/src/utils/{cmd_coerce.h, cmd_coerce.c}. In this version I have
Added all the internal structures that are not public, so that I can access their unmodified form (unmodified by the current session).
This includes the table used to store the currently used SEXP's, that was used as a lookup. This caused a problem as I can't access the modified version, so my code is slightly altered with the old code blocked by the macro #if --- defined(CMDLINE_ARGUMENTS_MAYBE_IN_THE_FUTURE). Luckily the code causing a problem had a static answer, so I could work around this (but this might not always be the case).
I added quite a few Rf_s as their macro version is not available (since I #include <Rcpp.h> at some point)
The code has been split into smaller functions to make it more readable (for my own sake).
The function has one additional argument (name), that is not used in the internal function, with some added errors (for my specific need).
This implementation will be frozen "for all time to come" as I've moved on to another branch (and this one is frozen for my own future benefit, if I ever want to walk down this path again).
I spent a few days scouring the internet for information on this and found 2 different posts, talking about how this could be achieved, and my approach basically copies this. Whether this is actually allowed in a cran package, is an whole other question (and not one that I will be testing out).
This approach goes again if you want to use not-public code from other packages. While often here it is as simple as "copy-paste" their files into your repository.
As a final side note, you mention the intend is to "speed up" your code for when you have to perform millions upon millions of calls to agrep. It seems that this is a time where one should consider performing the task in parallel. Even after going through the steps outlined above, creating N parallel sessions to take care of K evaluations each (say 100.000), would be the first step to reduce computing time. Of course each session should be given a batch and not a single call to agrep.
Could somebody please explain to me what this message might mean?
I have the Julia client running in Atom, and my code works properly and it gets me the results, but for some line executions(ctrl+enter) the instant eval gives me "julia-client: can't render lazy".
It appears that the behind the scenes the code is executed, but the inline evaluations prefers not to output anything.
The lines corresponding to these messages usually should return a 2 dimensional arrays or dataframes, and in Julia usually the type and the dimensions are printed in the eval, but for some specific lines it can't render.
I could not find similar reports anywhere else.
julia version 0.5.0-rc3
This is a problem with package versions being out of sync. It's you're on the Julia release (v0.5), this will be fixed with a Pkg.update(). In the future, this kind of question is better suited for the Juno discussion board
Is there a historical precedent of internal changes to the R parser, adding new reserved words or symbols?
If I remember correctly data.table uses a serendipitous := that was once defined but left unused in R internals, but I'm not aware of others. However, as the language evolves, it would sometimes seem useful to define new symbols.
An obvious case could be made for magrittr's pipe %>% which has become ubiquitous for many, but remains a pain to type (sure, there are keyboard tricks, but still). Similarly, dplyr/rlang introduce/repurpose notations for "tidy evaluation" (!!, !!!, :=, ~, etc.).
Another case I'm seeing is the verbosity of lambda functions. Would it be possible, theoretically, to define internally something like f = λ(x) x+1 instead of f = function(x) x+1, or are there character restrictions on top of other reasons?
Why add an ergonomics feature if you risk breaking a runtime that hosts a huge ecosystem? Also, once you add one feature, you are on a slippery slope and are staring straight in the face of feature bloat.
And if you say that we can be smart and judicious about what features we add, how do we structure that decision process? R does not have a "benevolent dictator" having a final word in decisions like this so you are left with design by committee with all that it entails.
The big thing with R has always been the package ecosystem, in which if you want a feature you write it yourself -- as in your magrittr example. The language itself has remained close to its S roots and has successfully served as a stable platform for all the development that has been happening.
HI all,
I was trying to load a certain amount of Affymetrix CEL files, with the standard BioConductor command (R 2.8.1 on 64 bit linux, 72 GB of RAM)
abatch<-ReadAffy()
But I keep getting this message:
Error in read.affybatch(filenames = l$filenames, phenoData = l$phenoData, :
allocMatrix: too many elements specified
What's the general meaning of this allocMatrix error? Is there some way to increase its maximum size?
Thank you
The problem is that all the core functions use INTs instead of LONGs for generating R objects. For example, your error message comes from array.c in /src/main
if ((double)nr * (double)nc > INT_MAX)
error(_("too many elements specified"));
where nr and nc are integers generated before, standing for the number of rows and columns of your matrix:
nr = asInteger(snr);
nc = asInteger(snc);
So, to cut it short, everything in the source code should be changed to LONG, possibly not only in array.c but in most core functions, and that would require some rewriting. Sorry for not being more helpful, but i guess this is the only solution. Alternatively, you may wait for R 3.x next year, and hopefully they will implement this...
If you're trying to work on huge affymetrix datasets, you might have better luck using packages from aroma.affymetrix.
Also, bioconductor is a (particularly) fast moving project and you'll typically be asked to upgrade to the latest version of R in order to get any continued "support" (help on the BioC mailing list). I see that Thrawn also mentions having a similar problem with R 2.10, but you still might think about upgrading anyway.
I bumped into this thread by chance. No, the aroma.* framework is not limited by the allocMatrix() limitation of ints and longs, because it does not address data using the regular address space alone - instead it subsets also via the file system. It never hold and never loads the complete data set into memory at any time. Basically the file system sets the limit, not the RAM nor the address space of you OS.
/Henrik
(author of aroma.*)
I'm working with large numbers that I can't have rounded off. Using Lua's standard math library, there seem to be no convenient way to preserve precision past some internal limit. I also see there are several libraries that can be loaded to work with big numbers:
http://oss.digirati.com.br/luabignum/
http://www.tc.umn.edu/~ringx004/mapm-main.html
http://lua-users.org/lists/lua-l/2002-02/msg00312.html (might be identical to #2)
http://www.gammon.com.au/scripts/doc.php?general=lua_bc (but I can't find any source)
Further, there are many libraries in C that could be called from Lua, if the bindings where established.
Have you had any experience with one or more of these libraries?
Using lbc instead of lmapm would be easier because lbc is self-contained.
local bc = require"bc"
s=bc.pow(2,1000):tostring()
z=0
for i=1,#s do
z=z+s:byte(i)-("0"):byte(1)
end
print(z)
I used Norman Ramsey's suggestion to solve Project Euler problem #16. I don't think it's a spoiler to say that the crux of the problem is calculating a 303 digit integer accurately.
Here are the steps I needed to install and use the library:
Lua needs to be built with dynamic loading enabled. I use Cygwin, but I changed PLAT in src/Makefile to be linux. The default, none, doesn't enable dynamic loading.
The MAMP needs to be built and installed somewhere that your C compiler can find it. I put libmapm.a in /usr/local/lib/. Next m_apm.h and m_apm_lc.h went to /usr/local/include/.
The makefile for lmamp needs to be altered to the correct location of the Lua and MAMP libraries. For me, that means uncommenting the second declaration of LUA, LUAINC, LUALIB, and LUABIN and editing the declaration of MAMP.
Finally, mapm.so needs to be placed somewhere that Lua will find it. I put it at /usr/local/lib/lua/5.1/.
Thank you all for the suggestions!
The lmapm library by Luiz Figueiredo, one of the authors of the Lua language.
I can't really answer, but I will add LGMP, a GMP binding. Not used.
Not my field of expertise, but I would expect the GNU multiple precision arithmetic library to be quite a standard here, no?
Though not arbitrary precision, Lua decNumber, a Lua 5.1 wrapper for IBM decNumber, implements the proposed General Decimal Arithmetic standard IEEE 754r. It has the Lua 5.1 arithmetic operators and more, full control over rounding modes, and working precision up to 69 decimal digits.
There are several libraries for the problem, each one with your advantages
and disadvantages, the best choice depends on your requeriments. I would say
lbc is a good first pick if it
fulfills your requirements or any other by Luiz Figueiredo. For the most efficient one I guess would be any using GMP bindings as GMP is a standard C library for dealing with large integers and is very well optimized.
Nevertheless in case you are looking for a pure Lua one, lua-bint
library could be an option for dealing with big integers,
I wouldn't say it's the best because there are more efficient
and complete ones such the ones mentioned above, but usually they requires compiling C code
or can be troublesome to setup. However when comparing pure Lua big integer libraries
and depending in your use case it could perhaps be an efficient choice. The library is documented,
code fully covered by tests and have many examples. But take this recommendation with grant of
salt because I am the library author.
To install you can use luarocks if you already have it in your computer or simply download the
bint.lua
file in your project, as it has no other dependencies other than requiring Lua 5.3+.
Here is a small example using it to solve the problem #16 from Project Euler
(mentioned in previous answers):
local bint = require 'bint'(1024)
local n = bint(1) << 1000
local digits = tostring(n)
local sum = 0
for i=1,#digits do
sum = sum + tonumber(digits:sub(i,i))
end
print(sum) -- should output 1366