I want to enable the auto-complete functionality in emacs for editing my R files. For this, I need a list of all the keywords in the R language. Does someone know if this is available somewhere? I know I would have to include all the function names in the external packages I am using, but for now the list of what is in r-cran-base should be fine for me.
Thanks a lot!
apropos with an empty string argument will list all objects on the search path. It is what is used for tab completion in the default GUI.
apropos("")
[1] "-"
[2] "-.Date"
[3] "-.POSIXt"
[4] "!"
[5] "!.hexmode"
[6] "!.octmode"
...
The R Language Definition lists all of R's keywords. Note that those are also reserved.
The following identifiers have a special meaning and cannot be used for object names
if else repeat while function for in next break TRUE FALSE NULL Inf NaN NA NA_integer_ NA_real_
NA_complex_ NA_character_ ... ..1 ..2 etc.
See ?Reserved, ?Control and maybe ?Syntax and ?Ops.
Just a little heads up that what's being discussed here has nothing to do with actual R keywords, which are their own special thing. I suspect this is what @HongOoi is alluding to.
R keywords exist ostensibly to help group functions by theme but, except for the special case of internal, they aren't widely used.
If you want to see the list of valid keywords you can get it like this
readLines(file.path(R.home("doc"), "KEYWORDS.db"))
try this
ls('package:base')
list all objects in a package
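For the Emacs use case, the names can be dumped to a plain-text file, one per line. This is a minimal sketch, assuming the completion setup can read such a file; the output path and the filter that keeps only names starting with a letter are illustrative choices, not anything Emacs or ESS requires.

```r
# All objects exported by the base package
base_names <- ls("package:base")

# Keep names that start with a letter; this drops operators such as "-"
# and methods on operators such as "!.hexmode"
candidates <- grep("^[A-Za-z]", base_names, value = TRUE)

# Hypothetical output location; point this wherever your editor expects it
out_file <- file.path(tempdir(), "r-base-names.txt")
writeLines(candidates, out_file)

length(candidates)
```

The same pattern should work for any attached package by swapping "package:base" for, say, "package:stats".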
You might go to an R buffer and look at the following variable (given that you have Emacs Speaks Statistics):
ess-R-font-lock-keywords
by using C-h v ess-R-font-lock-keywords.
From there on, you can look in ess-custom.el and find everything you need on the implementation.
Just making sure you really want to do this since ?rcompgen describes the built-in functions by Deepayan Sarkar in the utils-package that already provide "tab-completion".
In Julia, how can I check whether an English word is a meaningful word? Suppose I want to know whether "Hello" is meaningful or not. In Python, one can use the enchant or nltk packages (examples: [1], [2]). Is it possible to do this in Julia as well?
What I need is a function like this:
is_english("Hello")
>>>true
is_english("Hlo")
>>>false
# Because it doesn't have meaning! We don't have such a word in English terminology!
is_english("explicit")
>>>true
is_english("eeplicit")
>>>false
Here is what I've tried so far:
I have a dataset that contains frequent five-character English words (link to Google Drive), so I decided to attach it to my question for better understanding. Although this dataset is not adequate (it contains only frequent five-character words, not all meaningful English words of any length), it is good enough to show what I want:
using CSV
using DataFrames
df = CSV.read("frequent_5_char_words.csv" , DataFrame , skipto=2)
df = [lowercase(item) for item in df[:,"0"]]
function is_english(word::String)::Bool
return lowercase(word) in df
end
Then when I try these:
julia> is_english("Helo")
false
julia> is_english("Hello")
true
But my dataset is not comprehensive, so this isn't enough. Are there any packages in Julia like the ones I mentioned above?
(not enough rep to post a comment!)
You can still use NLTK in Julia via PyCall. Or, as it seems you don't need an NLP tool but just a dictionary, you can use wiktionary to do some lookup or build the dataset.
There is a fairly new package named LanguageDetect.jl. It does not return true/false, but a list of probabilities. You could define something like:
using LanguageDetect: detect
function is_english(text, threshold=0.8)
    langs = detect(text)
    for lang in langs
        if lang.language == "en"
            return lang.probability >= threshold
        end
    end
    # No English entry among the detected languages
    return false
end
I want to clean up strings so they can be parsed as unique legal symbols. I intend to clean up a lot of strings, so there is an undesirable risk of duplicated symbols in the output. It would suffice to take every illegal character and replace it with its base 32 encoding. Desired behavior:
sanitize("_bad_symbol$not*a&list%$('")
## [1] "L4bad_symbolEQnotFIaEYlistEUSCQJY"
I think all I need is a complete list of possible characters to grep for. I know about letters and LETTERS, but what about everything else?
Does a better solution already exist? Because I would love that.
EDIT: just found out about make.names() from this post. I could go with that in a pinch, but I would rather not.
With make.names() and make.unique() together, the problem is solved.
make.unique(make.names(c("asdflkj###$", "asdflkj####")))
## [1] "asdflkj...." "asdflkj.....1"
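If the collapsing of every illegal character to a dot is a concern, an explicit encoding is an alternative. The sketch below is not the base 32 scheme from the question: it swaps in hexadecimal code points, which is simpler to write. The function name sanitize, the "s" prefix, and the "_HEX_" escape format are all assumptions made for illustration. Because the underscore itself is escaped and every result gets the same prefix, distinct inputs should map to distinct outputs.

```r
sanitize <- function(x) {
  plain <- c(letters, LETTERS, 0:9)  # characters passed through unchanged
  encode_one <- function(s) {
    chars <- strsplit(s, "", fixed = TRUE)[[1]]
    bad <- !(chars %in% plain)
    # Replace each other character (including "_") with _<hex code point>_
    chars[bad] <- vapply(
      chars[bad],
      function(ch) sprintf("_%X_", utf8ToInt(ch)),
      character(1)
    )
    # The leading "s" guarantees the result starts with a letter
    paste0("s", paste(chars, collapse = ""))
  }
  vapply(x, encode_one, character(1), USE.NAMES = FALSE)
}

out <- sanitize(c("a$b", "a_b", "asdflkj###$", "asdflkj####"))
out
```

The trade-off against make.names() is readability: the encoded names are longer and uglier, but no make.unique() pass is needed afterwards.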
I am interested in writing a few operators. Many characters are reserved and cannot be used (a or b, for example), while others are already in use and I would not like to overwrite them (+, -, >, and <, for example). Others are unavailable for less clear reasons, such as $ or #.
I would like a list of characters that can be used as user written operators.
Thanks for your help,
Francis
Both "?" and "!" can be overloaded. I'm not going to reproduce the source code here, but take a look at the sos package and at the cgwtools::splatnd function for info on how to write your own unary (single-argument) operators.
I believe there's a tutorial on how to write %foo% binary operators but I forget where I saw it :-(
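In the meantime, here is a minimal sketch of a %foo% binary operator. R treats any function bound to a name of the form %something% as an infix operator; the name %+% and its string-concatenation behaviour are made up for illustration.

```r
# Backticks are needed because %+% is not a syntactic name
`%+%` <- function(lhs, rhs) paste0(lhs, rhs)

"foo" %+% "bar"

# User-defined %op% operators chain from left to right
"a" %+% "b" %+% "c"
```

Since the text between the percent signs is largely free-form, this route avoids overwriting built-in operators such as + or <.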
What are the restrictions as to what characters (and maybe other restrictions) can be used for a variable name in R?
(This screams of general reference, but I can't seem to find the answer)
You might be looking for the discussion from ?make.names:
A syntactically valid name consists of letters, numbers and the dot or
underline characters and starts with a letter or the dot not followed
by a number. Names such as ".2way" are not valid, and neither are the
reserved words.
In the help file itself, there's a link to a list of reserved words, which are:
if else repeat while function for in next break
TRUE FALSE NULL Inf NaN NA NA_integer_ NA_real_ NA_complex_
NA_character_
Many other good notes from the comments include the point by James to the R FAQ addressing this issue and Josh's pointer to a related SO question dealing with checking for syntactically valid names.
Almost NONE! You can use 'assign' to make ridiculous variable names:
assign("1",99)
ls()
# [1] "1"
Yes, that's a variable called '1'. Digit 1. Luckily it doesn't change the value of integer 1, and you have to work slightly harder to get its value:
1
# [1] 1
get("1")
# [1] 99
The "syntactic restrictions" some people might mention are purely imposed by the parser. Fundamentally, there's very little you can't call an R object. You just can't do it via the '<-' assignment operator. "get" will set you free :)
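One caveat to the claim about '<-': quoting a non-syntactic name in backticks lets the ordinary assignment operator handle it too, so assign and get are not the only route. A short sketch, with variable names chosen purely for illustration:

```r
# Backticks make a non-syntactic name usable with <-
`my var` <- 10
`my var` + 1

# The same object is visible through get(), and vice versa
assign("2nd", 5)
`2nd` * 2
get("my var")
```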
The following may not directly address your question but is of great help.
Try the exists() command to see whether something already exists; that way you know not to reuse system names for your variables or functions.
Example...
> exists('for')
[1] TRUE
> exists('myvariable')
[1] FALSE
Using the make.names() function from the built-in base package may help:
is_valid_name <- function(x)
{
  # Maximum name length depends on the R version
  length_condition = if(getRversion() < "2.13.0") 256L else 10000L
  is_short_enough = nchar(x) <= length_condition
  # make.names() leaves a name unchanged only if it is already valid
  is_valid_name = (make.names(x) == x)
  final_condition = is_short_enough && is_valid_name
  return(final_condition)
}
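The function above uses &&, so it is meant for a single string. A vectorized variant can be sketched as follows; the name valid_names is hypothetical, and the length check is left out for brevity.

```r
# TRUE where make.names() leaves the name untouched
valid_names <- function(x) make.names(x) == x

candidates <- c("x", ".2way", "if", "my_var")
result <- valid_names(candidates)
result
# ".2way" and "if" come back FALSE: the first starts with a dot followed
# by a number, the second is a reserved word
```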
I am looking to assign objects in a loop. I've read that some form of eval(parse( is what I need to perform this, but I'm running into errors listing invalid text or no such file or directory. Below is sample code of generally what I'm attempting to do:
x <- array(seq(1,18,by=1),dim=c(3,2,3))
for (i in 1:length(x[1,1,])) {
eval(parse(paste(letters[i],"<-mean(x[,,",i,"])",sep="")
}
And when I'm finished using these objects, I would like to remove them (the actual objects are very large and cause memory problems later on...)
for (i in 1:length(x[1,1,])) eval(parse(paste("rm(",letters[i],")",sep="")))
Both eval(parse(paste( portions of this script return errors for invalid text or no such file or directory. Am I missing something in using eval(parse(? Is there an easier/better way to assign objects in a loop?
That's a pretty disgusting and frustrating way to go about it. Use assign to assign and rm's list argument to remove objects.
> for (i in 1:length(x[1,1,])) {
+ assign(letters[i],mean(x[,,i]))
+ }
> ls()
[1] "a" "b" "c" "i" "x"
> a
[1] 3.5
> b
[1] 9.5
> c
[1] 15.5
> for (i in 1:length(x[1,1,])) {
+ rm(list=letters[i])
+ }
> ls()
[1] "i" "x"
>
Whenever you feel the need to use parse, remember fortune(106):
If the answer is parse() you should
usually rethink the question.
-- Thomas Lumley, R-help (February 2005)
Although it seems there are better ways to handle this, if you really did want to use the "eval(parse(paste(" approach, what you're missing is the text argument.
parse assumes that its first argument is a path to a file which it will then parse. In your case, you don't want it to go reading a file to parse, you want to directly pass it some text to parse. So, your code, rewritten (in what has been called disgusting form above) would be
letters=c('a','b','c')
x <- array(seq(1,18,by=1),dim=c(3,2,3))
for (i in 1:length(x[1,1,])) {
eval(parse(text=paste(letters[i],"<-mean(x[,,",i,"])",sep="")))
}
In addition to not specifying "text=" you're missing a few parentheses on the right side to close your parse and eval statements.
It sounds like your problem has been solved, but for people who reach this page who really do want to use eval(parse(paste, I wanted to clarify.
Very bad idea; you should never use eval or parse in R unless you know exactly what you are doing.
Variables can be created using:
name<-"x"
assign(name,3) # Equivalent to x<-3
And removed by:
name<-"x"
rm(list=name)
But in your case, it can be done with a simple named vector:
apply(x,3,mean)->v;names(v)<-letters[1:length(v)]
v
v["b"]
#Some operations on v
rm(v)
It is best to avoid using either eval(parse(paste( or assign in this case. Doing either creates many global variables that just cause additional headaches later on.
The best approach is to use existing data structures to store your objects, lists are the most general for these types of cases.
Then you can use the [ls]apply functions to do things with the different elements, usually much quicker than looping through global variables. If you want to save all the objects created, you have just one list to save/load. When it comes time to delete them, you just delete 1 single object and everything is gone (no looping). You can name the elements of the list to refer to them by name later on, or by index.
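A minimal sketch of the list-based approach, reusing the array from the question; the names slices and means are made up for illustration.

```r
x <- array(seq(1, 18, by = 1), dim = c(3, 2, 3))

# One list element per slice along the third dimension, named a, b, c
slices <- lapply(seq_len(dim(x)[3]), function(i) x[, , i])
names(slices) <- letters[seq_along(slices)]

# Apply a function to every element without touching the global namespace
means <- sapply(slices, mean)
means["b"]

# A single rm() frees all the large objects at once
rm(slices)
```

Only the small summary vector survives the rm() call, which is the memory behaviour the question was after.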