Writing help information for user-defined functions in R

I frequently use user-defined functions in my code.
RStudio supports automatic completion of code with the Tab key. I find this amazing because I can always quickly read what is supposed to go inside the (...) of a function call.
However, my user-defined functions only show their parameters, with no additional information and, obviously, no help page.
This isn't much of a pain for me, but I would like to share my code, and I think it would be useful to have some information at hand besides the # comments on every line.
Nowadays, when I share code, it usually looks like this:
myfun <- function(x1, x2, x3, ...) {
  # This is a function for this and that
  # x1 is a factor, x2 is an integer ...
  # This line of code is useful for transformation of x2 by x1
  some code here
  # Now we do this other thing
  more code
  # This is where the magic happens
  return(magic)
}
I think this line-by-line commenting is great, but I'd like to improve on it and make this information as handy as it is for every other function.

Not really an answer, but if you are interested in exploring this further, you should start at the rcompgen-help page (although that's not a function name) and also examine the code of:
rc.settings
Also, executing this allows you to see what the .CompletionEnv has in it for currently loaded packages:
names(rc.status())
#-----
[1] "attached_packages" "comps" "linebuffer" "start"
[5] "options" "help_topics" "isFirstArg" "fileName"
[9] "end" "token" "fguess" "settings"
And if you just look at:
rc.status()$help_topics
... you see the character items that the tab-completion mechanism uses for matching. On my machine at the moment there are 8881 items in that vector.
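As a rough sketch of how one might explore that vector, the snippet below counts the help topics currently known to the completion machinery and filters them by a prefix. It only uses rc.status() from the utils package; the variable names and the "rc." prefix are illustrative choices, and the vector may be empty in a fresh or non-interactive session, so the code guards against that.

```r
# Pull the help topics that the completion code has collected so far.
# This may be NULL or empty if no help completion has been triggered yet.
topics <- utils::rc.status()$help_topics
if (is.null(topics)) topics <- character(0)

# How many topics are available for matching right now?
n_topics <- length(topics)

# Topics that start with "rc." (an illustrative prefix, pick any you like).
rc_topics <- grep("^rc\\.", topics, value = TRUE)

print(n_topics)
print(rc_topics)
```

The count will differ from machine to machine, since it depends on which packages are installed and attached.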

Related

Plothraw PARIGP (or similar) doesn't work (latexit crash)

I'm a new user of PARI/GP, and after writing my script I wanted to make a graph of it. As my function takes an integer and returns a number, it is closer to a sequence. I didn't know how to do this, so I read the PARI/GP documentation and then ran some tests in order to obtain a graph from a list.
After reading an answer in stackoverflow (Plotting multiple lists in Pari), I wanted to test with the following code:
plothraw([0..200], apply(i->cos(i*3*Pi/200), [0..200]), 0);
But when I run it, it tries to open something in latexit, which then crashes and gives me a problem report.
I didn't even know that I had an app named latexit; maybe it was installed during the installation of PARI/GP. Anyway, how can I fix this?
PARI/GP definitely doesn't install latexit.
The way hi-res graphics work on the Win32 version of PARI/GP is to write an Enhanced Metafile (.EMF) to a temp directory and ask the system to "open" it. When you installed latexit, it probably created an association in the registry that lets it open .EMF files.
i3Pi does not mean what you think, it just creates a new variable with that name. You want i * 3 * Pi instead.
The following constructions both work in my setup
plothraw([0..200], apply(i->cos(i*3*Pi/200), [0..200]), 0);
plothraw([0..200], apply(i->cos(i*3*Pi/200), [0..200]), 1);
(the second one being more readable because a red line is drawn between successive points; I have trouble seeing the few tiny blue dots)
Instead of apply, you can use a direct constructor as in
vector(201, i, cos((i-1) * 3 * Pi / 200))
which of course can be computed more efficiently as
real( powers(exp(3*I*Pi/200), 200) )
(of course, it doesn't matter here, but compare both commands at precision \p10000 or so ...)

Is there a way to let the console in RStudio produce time stamps? [duplicate]

I wonder if there is a way to display the current time in the R command line, like in MS DOS, we can use
Prompt $T $P$G
to include the time clock in every prompt line.
Something like
options(prompt=paste(format(Sys.time(), "%H:%M:%S"),"> "))
will do it, but then it is fixed at the time it was set. I'm not sure how to make it update automatically.
Chase points the right way, as options("prompt"=...) can be used for this. But his solution sets a constant time expression, which is not what we want.
The documentation for the function taskCallbackManager has the rest:
R> h <- taskCallbackManager()
R> h$add(function(expr, value, ok, visible) {
+ options("prompt"=format(Sys.time(), "%H:%M:%S> "));
+ return(TRUE) },
+ name = "simpleHandler")
[1] "simpleHandler"
07:25:42> a <- 2
07:25:48>
We register a callback that gets evaluated after each command completes. That does the trick. More fancy documentation is in this document from the R developer site.
None of the other methods, which are based on callbacks, will update the prompt unless a top-level command is executed. So, pressing return in the console will not create a change. Such is the nature of R's standard callback handling.
If you install the tcltk2 package, you can set up a task scheduler that changes the option() as follows:
library(tcltk2)
tclTaskSchedule(1000, {options(prompt=paste(Sys.time(),"> "))}, id = "ticktock", redo = TRUE)
Voila, something like the MS DOS prompt.
NB: Inspiration came from this answer.
Note 1: The wait time (1000 in this case) refers to the # of milliseconds, not seconds. You might adjust it downward when sub-second resolution is somehow useful.
Here is an alternative callback solution:
updatePrompt <- function(...) {options(prompt=paste(Sys.time(),"> ")); return(TRUE)}
addTaskCallback(updatePrompt)
This works the same as Dirk's method, but the syntax is a bit simpler to me.
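For completeness, here is a minimal sketch of the same callback idea with a named callback, so that it can be removed again later. The names make_prompt, update_prompt and "promptClock" are made up for this example; addTaskCallback() and removeTaskCallback() are the base R functions used in the answers above.

```r
# Build the prompt string from a time stamp; kept separate so it is easy to check.
make_prompt <- function(time = Sys.time()) {
  format(time, "%H:%M:%S> ")
}

# Task callbacks receive four arguments; returning TRUE keeps the callback registered.
update_prompt <- function(expr, value, ok, visible) {
  options(prompt = make_prompt())
  TRUE
}

# Register under a name so it can be found again.
id <- addTaskCallback(update_prompt, name = "promptClock")

# Later, to go back to a static prompt:
removed <- removeTaskCallback("promptClock")
options(prompt = "> ")
```

As with the other callback-based answers, the prompt is only refreshed after a top-level command completes.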
You can change the default character that is displayed through the options() command. You may want to try something like this:
options(prompt = paste(Sys.time(), ">"))
Check out the help page for ?options for a full list of things you can set. It is a very useful thing to know about!
Assuming this is something you want to do for every R session, consider moving that to your .Rprofile. Several other good nuggets of programming happiness can be found hither on that topic.
I don't know of a native R function for doing this, but I know R has interfaces with other languages that do have system time commands. Maybe this is an option?
Thierry mentioned system.time() and there is also proc.time() depending on what you need it for, although neither of these gives you the current time.

Is it possible to place code into the console in R?

Blasphemy I know to ask IF it is possible to do something in R, but here I am!
I am interested in the ability to create a function that will place code into the console. In other words, if the user types in f("3+3") and hits enter then the console will be waiting for the next command with > 3+3. Then when the user hits enter, it will return 6 in this case. Possible? Any ideas?
I wish I had more to share but I've never even thought this functionality would be useful before...
One way you could do this is to call system2() to invoke an external utility that synthesizes keyboard input. I've written a C++ program called sendkeys that can do this on Windows by (ultimately) calling SendInput(). Demo:
system2('sendkeys','3\\\\+3');
3+3
## [1] 6
(The backslash escaping is necessary because of the way my utility parses its input; + is a metachar that must be escaped to become literal.)
Let me know if you want my C++ code.
Would that be the kind of function you would need? Maybe it is not a very elegant solution, though.
printEval <- function(x) {
  cat(">", x, "\n")
  cat("Press [enter] to continue")
  line <- readline()
  eval(parse(text = x))
}
EDIT: Sorry, I just noticed that the eval(parse()) solution was already suggested by @Ping in the comment field right under the question.
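To show just the eval(parse()) part in isolation, here is a small non-interactive sketch. It echoes the code as if it had been typed at the prompt and then evaluates it, without waiting for a key press; eval_text is a hypothetical helper name, not something from the answers above.

```r
# Echo a string of R code with a fake prompt, then evaluate it.
eval_text <- function(code, envir = parent.frame()) {
  cat(">", code, "\n")
  eval(parse(text = code), envir = envir)
}

result <- eval_text("3+3")
print(result)
```

Note that this does not place editable text on the real console input line; it only imitates the look of it, which is the same limitation as printEval above.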

SnowballC in R stems "many" and "only"

I am using SnowballC to process a text document, but I noticed that it stems words such as "many" and "only" even though they are not supposed to be stemmed.
> library(SnowballC)
> library(tm)  # stemDocument() and stemCompletion() are provided by tm
>
> str <- c("many", "only", "things")
> str.stemmed <- stemDocument(str)
> str.stemmed
[1] "mani" "onli" "thing"
>
> dic <- c("many", "only", "online", "things")
> str.complete <- stemCompletion(str.stemmed, dic)
> str.complete
mani onli thing
"" "online" "things"
You can see that after stemming, "many" and "only" became "mani" and "onli", which cannot be completed back with stemCompletion later on, since "mani" is not a prefix of "many". Notice how "onli" gets completed to "online" instead of the original "only".
Why is that? Is there a way to fix this?
Stemming is often implemented as a set of rules for stripping all affixes, both derivational and inflectional, from a word, leaving its root. Lemmatization typically only removes inflectional affixes. Stemming is a much more aggressive version of lemmatization. Given what you want, it seems like you'd prefer lemmatization.
To compare the two, most lemmatizers are limited to a few rules for dealing with affixes to nouns and verbs in English: -ed, -s, -ing, for example. There are a few irregular cases they have to handle, but with some training data, many are probably covered.
Stemmers are expected to dig deeper. As a result, the space of possible transformations they can make is bigger, so you're a lot more likely to end up with errors.
To see what's happening in your data, let's look at the specifics.
online -> onli: why on earth would this happen? Not totally sure on this one; there's probably some rule that tries to cater to words like medic-ine and medic-al, sub-mari-ne and mari-ne, imagi-ne and imagi-na-tion.
only -> onli, many -> mani: These seem particularly strange, but are probably more reasonable than the previous rule--especially in the context of dealing with verbs that end in -ed. If you're stemming the words denied, studied, modified, specified, you'll want them to be equivalent to their uninflected forms deny, study, modify, specify.
You could have a rule to transform each verb into the uninflected form, but the authors here chose to make the roots the forms ending in -i. To ensure that these match, -y endings had to be transformed to -i as well.
With a lemmatizer, you might get more predictable results. Since they only remove inflectional affixes, you'd get only, many, online, and thing, as you wanted. Both a good stemmer and lemmatizer can work well, but the stemmer does more stuff and therefore has more room for error.
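To make the -y to -i rule concrete, here is a toy sketch in base R. It is not the SnowballC implementation, only an illustration of a rule of this kind: a final y is rewritten to i when it follows a consonant that is not the first letter of the word. The function name toy_y_to_i is made up for this example.

```r
# Toy rule: replace a final "y" with "i" when the letter before it is a
# consonant and that consonant is not the first letter of the word.
toy_y_to_i <- function(words) {
  # The lookbehind requires two characters before the "y":
  # any character, then a non-vowel.
  sub("(?<=.[^aeiouy])y$", "i", words, perl = TRUE)
}

stems <- toy_y_to_i(c("many", "only", "deny", "by", "say"))
print(stems)
```

Under this rule "many", "only" and "deny" change, while "by" and "say" are left alone, which matches the behaviour described above for words such as deny and denied needing a common stem.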
That is how stemmers work. You've got a (smallish) set of rules that reduce most words to something resembling a canonical form (a stem), but not quite. There are many other corner cases you will find, so many in fact that I hesitate to call them corner cases, e.g.
many -> mani
other -> other
corner -> corner
cases -> case
in -> in
sentences -> sentenc
What you want is a lemmatiser. Have a look at this question for a more detailed explanation:
Stemmers vs Lemmatizers

R: How can I disable truncation of listing of package functions?

How can I list all of the results that used to occur when typing packageName<tab>, i.e. the full list offered via auto-completion? In R 2.15.0, I get the following for Matrix::<tab>:
> library(Matrix)
> Matrix::
Matrix::.__C__abIndex Matrix::.__C__atomicVector Matrix::.__C__BunchKaufman Matrix::.__C__CHMfactor Matrix::.__C__CHMsimpl
Matrix::.__C__CHMsuper Matrix::.__C__Cholesky Matrix::.__C__CholeskyFactorization Matrix::.__C__compMatrix Matrix::.__C__corMatrix
Matrix::.__C__CsparseMatrix Matrix::.__C__dCHMsimpl Matrix::.__C__dCHMsuper Matrix::.__C__ddenseMatrix Matrix::.__C__ddiMatrix
Matrix::.__C__denseLU Matrix::.__C__denseMatrix Matrix::.__C__dgCMatrix Matrix::.__C__dgeMatrix Matrix::.__C__dgRMatrix
Matrix::.__C__dgTMatrix Matrix::.__C__diagonalMatrix Matrix::.__C__dMatrix Matrix::.__C__dpoMatrix Matrix::.__C__dppMatrix
Matrix::.__C__dsCMatrix Matrix::.__C__dsparseMatrix Matrix::.__C__dsparseVector Matrix::.__C__dspMatrix Matrix::.__C__dsRMatrix
Matrix::.__C__dsTMatrix Matrix::.__C__dsyMatrix Matrix::.__C__dtCMatrix Matrix::.__C__dtpMatrix Matrix::.__C__dtrMatrix
Matrix::.__C__dtRMatrix Matrix::.__C__dtTMatrix Matrix::.__C__generalMatrix Matrix::.__C__iMatrix Matrix::.__C__index
Matrix::.__C__isparseVector Matrix::.__C__ldenseMatrix Matrix::.__C__ldiMatrix Matrix::.__C__lgCMatrix Matrix::.__C__lgeMatrix
Matrix::.__C__lgRMatrix Matrix::.__C__lgTMatrix Matrix::.__C__lMatrix Matrix::.__C__lsCMatrix Matrix::.__C__lsparseMatrix
[...truncated]
That [...truncated] message is irritating and I want to produce the full listing. Which option/flag/knob/configuration/incantation do I need to invoke in order to avoid the truncation? I have this impression that I used to see the full list, but not anymore - perhaps that was on a different OS (e.g. Linux).
I know that ls("package:Matrix") is one useful approach, but it is not the same as setting an option, and the list is different.
Unfortunately, on Windows, it looks like this behavior is hard-wired into the C code used to construct the console. So the answer seems to be that "no, you can't disable it" (at least not without modifying the sources and then recompiling R from scratch).
Here are the relevant lines from $RHOME/src/gnuwin32/console.c:
static void performCompletion(control c)
{
    ConsoleData p = getdata(c);
    int i, alen, alen2, max_show = 10, cursor_position = p->c - prompt_wid;
    ...
    ...
    if (alen > max_show)
        consolewrites(c, "\n[...truncated]\n");
You are correct that on some other platforms, all of the results are printed out. (I often use Emacs, for instance, and it pops all results of tab completion up in a separate buffer).
As an interesting side note, rcompgen, the backend that actually performs the tab-completion (as opposed to printing results to the console) does always find all completions. It's just that Windows doesn't then print them out for us to see.
You can verify that this happens even on Windows by typing:
library(Matrix)
Matrix::
## Then type <TAB> <TAB>
## Then type <RET>
rc.status() ## Careful not to use tab-completion to complete rc.status !
matches <- rc.status()$comps
length(matches) # -> 288
matches # -> lots of symbols starting with 'Matrix::'
For more details about the backend, and the functions and options that control its behavior, see ?rcompgen.
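If the goal is simply to see a full, untruncated list, one possible workaround is to build it in R code rather than relying on the console. The sketch below assumes that the candidates offered after pkg:: correspond roughly to the exports of the package namespace; that is an assumption on my part, and list_qualified is a hypothetical helper name. It uses the stats package so that it runs without Matrix installed.

```r
# Build "pkg::name" strings for everything a namespace exports.
list_qualified <- function(pkg) {
  paste0(pkg, "::", sort(getNamespaceExports(pkg)))
}

full <- list_qualified("stats")
print(length(full))
print(head(full))
```

Replacing "stats" with "Matrix" should give a list comparable to rc.status()$comps above, although the two are not guaranteed to be identical.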
