Is there a list of all available color options for `col` in R plot? [duplicate] - r

Short question, if I have a string, how can I test if that string is a valid color representation in R?
Two things I tried, first uses the function col2rgb() to test if it is a color:
isColor <- function(x)
{
res <- try(col2rgb(x),silent=TRUE)
return(!"try-error"%in%class(res))
}
> isColor("white")
[1] TRUE
> isColor("#000000")
[1] TRUE
> isColor("foo")
[1] FALSE
Works, but doesn't seem very pretty and isn't vectorized. The second thing is to just check whether the string is in the colors() vector or is a # followed by a hexadecimal number of 6 to 8 digits:
isColor2 <- function(x)
{
return(x%in%colors() | grepl("^#(\\d|[a-f]){6,8}$",x,ignore.case=TRUE))
}
> isColor2("white")
[1] TRUE
> isColor2("#000000")
[1] TRUE
> isColor2("foo")
[1] FALSE
Which works though I am not sure how stable it is. But it seems that there should be a built in function to make this check?

Your first idea (using col2rgb() to test color names' validity for you) seems good to me, and just needs to be vectorized. As for whether it seems pretty or not ... lots/most R functions aren't particularly pretty "under the hood", which is a major reason to create a function in the first place! Hides all those ugly internals from the user.
Once you've defined areColors() below, using it is easy as can be:
areColors <- function(x) {
sapply(x, function(X) {
tryCatch(is.matrix(col2rgb(X)),
error = function(e) FALSE)
})
}
areColors(c(NA, "black", "blackk", "1", "#00", "#000000"))
# <NA> black blackk 1 #00 #000000
# TRUE TRUE FALSE TRUE FALSE TRUE

Update, given the edit
?par gives a thorough description of the ways in which colours can be specified in R. Any test for a valid colour must consider:
A named colour as listed in colors()
A hexadecimal representation, as a character, of the form "#RRGGBBAA" specifying the red, green, blue and alpha channels. The alpha channel is for transparency, which not all devices support; hence, whilst it is valid to specify a colour in this way with 8 hex values, it may not be valid on a specific device.
NA is a valid "colour". It means transparent, but as far as R is concerned it is a valid colour representation.
Likewise "transparent" is also valid, but not in colors(), so that needs to be handled as well
1 is a valid colour representation as it is the index of a colour in a small palette of colours as returned by palette()
> palette()
[1] "black" "red" "green3" "blue" "cyan" "magenta" "yellow"
[8] "gray"
Hence you need to cope with 1:8. Why is this important? Because ?par tells us that it is also valid to represent the index for these colours as a character, so you need to capture "1" as a valid colour representation. However (as noted by @hadley in the comments) this is just for the default palette. A user may set another palette, in which case you will have to consider a character index to an element of a vector of the maximum allowed length for your version of R.
Once you've handled all those you should be good to go ;-)
To the best of my knowledge there isn't a user-visible function that does this. All of this is buried away inside the C code that does the plotting; very quickly you end up in .Internal(....) land and there be dragons!
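As a rough illustration of the cases listed above, the sketch below lets col2rgb() make the decision for each element. The name is_valid_colour() and the candidates vector are made up for this example; it is one possible wrapper, not a built-in function.

```r
# Sketch: ask col2rgb() about each element and treat an error as "not a colour".
# is_valid_colour() is a hypothetical helper name, not part of base R.
is_valid_colour <- function(x) {
  vapply(x, function(el) {
    tryCatch(is.matrix(col2rgb(el)), error = function(e) FALSE)
  }, logical(1), USE.NAMES = FALSE)
}

# Named colour, hex with alpha, NA, "transparent", palette index as character,
# an unknown name, and a hex string with 7 digits.
candidates <- c("white", "#00000080", NA, "transparent", "1", "foo", "#0000000")
result <- is_valid_colour(candidates)
result
# [1]  TRUE  TRUE  TRUE  TRUE  TRUE FALSE FALSE

# "transparent" is accepted by col2rgb() even though it is not in colors()
"transparent" %in% colors()
# [1] FALSE
```

Note that the 7-digit hex string is rejected here, whereas a regular expression with a {6,8} quantifier would accept it.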
Original
[To be pedantic #000000 isn't a colour name in R.]
The only colour names R knows are those returned by colors(). Yes, #000000 is one of the colour representations that R understands but you specifically ask about a name and the definitive list or solution is x %in% colors() as you have in your second example.
This is about as stable as it gets. When you use a colour like col = "goldenrod", internally R matches this with a "proper" representation of the colour for whichever device you are plotting on. colors() returns the list of colour names that R can do this lookup for. If it isn't in colors() then it isn't a colour name.

Related

Why does empty logical vector pass the stopifnot() check?

Today I found that some of my stopifnot() tests are failing because the passed arguments evaluate to empty logical vectors.
Here is an example:
stopifnot(iris$nosuchcolumn == 2) # passes without error
This is very unintuitive and seems to contradict a few other behaviours. Consider:
isTRUE(logical())
> FALSE
stopifnot(logical())
# passes
So stopifnot() passes even when this argument is not TRUE.
But furthermore, the behaviour of the above is different with different types of empty vectors.
isTRUE(numeric())
> FALSE
stopifnot(numeric())
# Error: numeric() are not all TRUE
Is there some logic to the above, or should this be considered a bug?
The comments by akrun and r2evans are spot on.
However, to give details on why specifically this happens and why you're confused vs. isTRUE() behavior, note that stopifnot() checks for three things; the check is (where r is the result of the expression you pass):
if (!(is.logical(r) && !anyNA(r) && all(r)))
So, let's take a look:
is.logical(logical())
# [1] TRUE
!anyNA(logical())
# [1] TRUE
all(logical())
# [1] TRUE
is.logical(numeric())
# [1] FALSE
!anyNA(numeric())
# [1] TRUE
all(numeric())
# [1] TRUE
So, the only reason why logical() passes while numeric() fails is because numeric() is not "logical," as suggested by akrun. For this reason, you should avoid checks that may result in logical vectors of length 0, as suggested by r2evans.
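One way to guard against the zero-length case is to add a length test to the same three conditions. The helper name all_true() below is invented for this sketch; it is not part of base R.

```r
# Sketch: the stopifnot() conditions quoted above, plus a length check.
# all_true() is a hypothetical helper name.
all_true <- function(r) {
  is.logical(r) && length(r) > 0 && !anyNA(r) && all(r)
}

# iris has no such column, so this is NULL == 2, which is logical(0)
empty_check <- iris$nosuchcolumn == 2

all_true(empty_check)        # FALSE: empty results no longer slip through
all_true(c(TRUE, TRUE))      # TRUE
all_true(numeric())          # FALSE: not logical
all_true(c(TRUE, NA))        # FALSE: contains NA
```

With this in place, stopifnot(all_true(iris$nosuchcolumn == 2)) would stop with an error instead of passing silently.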
Other answers cover the practical reasons why stopifnot behaves the way it does; but I agree with Karolis that the thread linked by Henrik adds the real explanation of why this is the case:
As author stopifnot(), I do agree with [OP]'s "gut feeling" [...] that
stopifnot(dim(x) == c(3,4)) [...][should] stop in the case
where x is a simple vector instead of a matrix/data.frame/... with
dimensions c(3,4) ... but [...] the gut feeling is wrong because of the fundamental lemma of logic: [...]
"All statements about elements of the empty set are true"
Martin Maechler, ETH Zurich
Also, [...], any() is to "|" what sum() is to "+" and what all() is to
"&" and prod() is to "*". All the operators have an identity element,
namely FALSE, 0, TRUE, and 1 respectively, and the generic convention
is that for an empty vector, we return the identity element, for the
reason given above.
Peter D.
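The identity-element convention described in the quote can be checked directly. The variable names below are only for illustration.

```r
# Each reduction applied to an empty vector returns the identity element
# of its operator.
empty_all  <- all(logical())    # TRUE  (identity of &)
empty_any  <- any(logical())    # FALSE (identity of |)
empty_sum  <- sum(numeric())    # 0     (identity of +)
empty_prod <- prod(numeric())   # 1     (identity of *)

# The convention keeps results consistent when vectors are combined:
# all(c(x, y)) equals all(x) && all(y), even when x is empty.
x <- logical()
y <- c(TRUE, FALSE)
consistent <- all(c(x, y)) == (all(x) && all(y))
consistent
# [1] TRUE
```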

Named colors in R and ggvis

I have encountered a problem with R's named colors when I use the ggvis package. For example, if I set the ggvis fill property to "cadetblue" it works, but if I set it to "cadetblue1" it does not. Here is a small reproducible example:
library(ggvis)
pressure %>% ggvis(~temperature, ~pressure, fill := "cadetblue") %>% layer_bars()
When I change the fill property to "cadetblue1" the plot turns black. It seems that only the major named colors without a number in the name work when using ggvis. Does anybody know why, or have I misunderstood something here?
I don't know why that's happening, but you can use the named colors by converting from name to hexadecimal.
col2rgb converts the color name to its 8-bit numeric rgb values (rendered in base 10).
rgb takes the output of col2rgb (after conversion from a column vector to a row vector using t (transpose)) and converts them to the corresponding hexadecimal color code.
So, in your case, the code would be:
fill := rgb(t(col2rgb("cadetblue1")), maxColorValue=255)
Or, to see the individual steps:
x = t(col2rgb("cadetblue1"))
red green blue
[1,] 152 245 255
rgb(x, maxColorValue=255)
[1] "#98F5FF"
You can use any HTML hex color code for colors. For example, try
pressure %>% ggvis(~temperature, ~pressure, fill := "#FFFF00") %>% layer_bars()
That gives yellow; use "#5f9ea0" for cadet blue, and so on.
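The plot turns black most likely because the color input is not recognized as valid by the renderer, which then defaults to black. ggvis draws the plot in the browser, where "cadetblue" is a known HTML color name but numbered R names such as "cadetblue1" are not.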
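The name-to-hex conversion above can be wrapped in a small vectorized helper. The name to_hex() is made up for this sketch; the conversion itself uses only col2rgb() and rgb() as in the answer.

```r
# Sketch: convert any R colour names to "#RRGGBB" strings.
# to_hex() is a hypothetical helper name.
to_hex <- function(col) {
  # col2rgb() gives a 3 x n matrix; transpose so rgb() sees one row per colour
  unname(rgb(t(col2rgb(col)), maxColorValue = 255))
}

to_hex("cadetblue1")
# [1] "#98F5FF"
to_hex(c("cadetblue", "yellow"))
# [1] "#5F9EA0" "#FFFF00"
```

The result could then be passed on, for example as fill := to_hex("cadetblue1").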

Is there a function to recognize a word?

Is there a way to evaluate a string and see if it evaluates to a word in English? Here is what I am looking for
is.word("hello world")
[1] FALSE
is.word(c("hello", "world"))
[1] TRUE TRUE
The above does not work as there is no is.word logical function.
As the comments have pointed out, you need an English dictionary to match against. The GradyAugmented object in the qdapDictionaries package is one such dictionary:
A dataset containing a vector of Grady Ward's English words
augmented with ‘DICTIONARY’, Mark Kantrowitz's names list, other
proper nouns, and contractions.
library(qdapDictionaries)
is.word <- function(x) x %in% GradyAugmented
is.word(c("hello world"))
## [1] FALSE
is.word(c("hello", "world"))
## [1] TRUE TRUE
is.word(c("asfasdf"))
## [1] FALSE
No, there is no such function in R.
However, you can easily implement a naive approach that will work in 9 out of 10 cases.
Custom solution
First of all, you need a dictionary of "words" that you will match your data against. One such dictionary is compiled by GNU people and distributed under an open source license at the SCOWL (And Friends) website.
Download the data file and unzip it. Words are scattered across multiple files, with the suffix indicating region, category and commonness (or the probability that an everyday English user will not be familiar with the word). Using the list.files() function with its pattern argument, or the grepl() function, you can select the exact set of dictionaries that you care about.
# set path to extracted package
words.dir <- '/tmp/scowl-2015.08.24/final/'
words <- unlist(sapply(list.files(words.dir, pattern='[1-6][05]$', full.names=TRUE), readLines, USE.NAMES=FALSE))
# For some reason most frequent words are not in "final" dir…
words <- c(words, readLines(paste0(words.dir, '../r/special/frequent')))
length(words)
# [1] 143681
Then verifying whether a word is English is as easy as checking whether it exists in the vector of known words. The nice thing is that you get vectorization for free.
c("knight", "stack", "selfie", "l8er", "googling", "echinuliform") %in% words
# [1] TRUE TRUE TRUE FALSE TRUE FALSE
Core problem
The real problem is "what counts as a word?". Does "googling" count as a word? It is commonly used now, but that wasn't the case 15 years ago. And what about "echinuliform"? I guess that plenty of native speakers wouldn't understand it.
Discussing this issue falls outside the scope of this website, but there is some arbitrariness in language and currently no computer program is able to cope with that.
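Whichever word list is used, the lookup itself stays the same. The sketch below uses a tiny stand-in vector, known_words, invented purely for illustration (a real dictionary would be far larger), and adds case folding so that capitalized input still matches.

```r
# Sketch: dictionary lookup with case folding.
# known_words is a hypothetical stand-in for a real word list.
known_words <- c("hello", "world", "knight", "stack")

# is_word() is a made-up name; it returns one logical per input element.
is_word <- function(x, dictionary = known_words) {
  tolower(x) %in% tolower(dictionary)
}

is_word(c("Hello", "world", "hello world", "asfasdf"))
# [1]  TRUE  TRUE FALSE FALSE
```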

Use the grep function to find words which contain either blue or red

I'm using the grep function to select certain column heads. The heads I want to select should contain exactly "red" or "blue".
I got the red thing to work using (I stored the columnnames in a variable called x) ->
x <- c("Red", "Blue", "blue", "green")
grep("^red$", x, varnames=TRUE)
But I can't figure out how to look for red OR blue... Any thoughts?
grep("^(red|blue)$", x, varnames=TRUE)
This doesn't seem to work...
If the search is not supposed to be case-sensitive, then I'd suggest the following:
> x <- c("Red", "Blue", "blue", "green")
> grep("^(red|blue)$",tolower(x))
[1] 1 2 3
grep("red|blue", x, ignore.case=T, value=T) # returns [1] "Red" "Blue" "blue"
If you require the match to be case-sensitive, remove the ignore.case=T.
If you require a case-sensitive match to the entire string (which is what you get when you use the assertions ^ and $) then you are basically asking for x[x=="blue"|x=="red"], which may be more efficient than a regex.
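The variants discussed in these answers can be compared side by side on the vector from the question. The result variable names are only for illustration.

```r
x <- c("Red", "Blue", "blue", "green")

# Whole-string match, ignoring case; value = TRUE returns the matching strings
any_case <- grep("^(red|blue)$", x, ignore.case = TRUE, value = TRUE)
any_case
# [1] "Red"  "Blue" "blue"

# Whole-string, case-sensitive match without a regex
exact <- x[x %in% c("red", "blue")]
exact
# [1] "blue"

# Without the ^ and $ anchors, longer strings that merely contain a colour match too
unanchored <- grepl("red|blue", c("blueish", "green"))
unanchored
# [1]  TRUE FALSE
```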

using gsub to find all values that are NOT equal in R

I am trying to use gsub to change values in an igraph vertex variable to colors before I plot a network graph.
The issue is that my graph has 3 values that I care about, and many others that I'd just like to group as "other" and assign 1 color to.
For example, if I had data that looks like this:
Name........Value
A............1
B............2
C............3
D............4
E............5
and I had code like this:
V(g)$color=V(g)$value #assign the "Value" attribute as the vertex color
V(g)$color=gsub("1","red",V(g)$color) #1 will be red
V(g)$color=gsub("2","blue",V(g)$color) #2 will be blue
V(g)$color=gsub("3", "yellow", V(g)$color) #3 is yellow
What line of code could I add to make 4 and 5 into some other color, (green for example)? Thanks so much for any help you might have!
I would avoid sub (this is not about matching patterns) and do:
my.colors <- c("red", "blue", "yellow", "green")
V(g)$color <- my.colors[match(V(g)$value, c(1, 2, 3), nomatch = 4)]
It looks like this suffices for what you want to do:
x <- c("1","2","3","4","5")
gsub("4|5", "green", x)
[1] "1" "2" "3" "green" "green"
Or this
gsub("[^1-3]", "green", x)
[1] "1" "2" "3" "green" "green"
However as pointed out in other answers it looks like a better idea to set up a lookup table mapping numbers to colors and use match to determine the color.
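A lookup table can be sketched with a named vector, with no graph object needed to try it out. The names values, lookup and vertex_colours are invented for this example; in the question's setting, values would presumably come from V(g)$value.

```r
# Sketch: map values to colours with a named vector; unmatched values get a default.
values <- c(1, 2, 3, 4, 5)                      # stand-in for the vertex attribute
lookup <- c("1" = "red", "2" = "blue", "3" = "yellow")

vertex_colours <- unname(lookup[as.character(values)])  # NA where no entry exists
vertex_colours[is.na(vertex_colours)] <- "green"        # everything else is "other"
vertex_colours
# [1] "red"    "blue"   "yellow" "green"  "green"

# Why pattern substitution is fragile here: "1" also matches inside "10"
partial <- gsub("1", "red", "10")
partial
# [1] "red0"
```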
Assuming that after you have made the initial substitutions, the only numbers left are the ones you want to be one uniform color, you can use a regex to match all contiguous digits and put the same color for them.
V(g)$color=gsub("\\d+", "green",V(g)$color)
See this page for gsub regular expressions.
