Converting a text import of a range in format 1:5 from character with : operator to a numeric string/value for use in a function - r

I am importing a key in which each row is an argument setting for a function I have written. The goal is to batch test my function by producing outputs for all sets of arguments, but that's not terribly important. What is important is that I import a column that contains a range in each row. For instance, "1:5" is meant to be passed to an argument as the value 1:5. I tried to coerce it using as.numeric("1:5"), but that returns NA with a warning. Is there a way to coerce the character value "1:5" to the numeric vector c(1,2,3,4,5)?

Your text is valid R code, so you can parse and evaluate it with eval(parse(text = ...)):
dat$parsed <- lapply(dat$key, function(x) eval(parse(text=x)))
# key parsed
# 1 1:5 1, 2, 3, 4, 5
# 2 1:6 1, 2, 3, 4, 5, 6
# 3 1:4 1, 2, 3, 4
Data
dat <- read.table(text="key
1:5
1:6
1:4", stringsAsFactors = FALSE, header = TRUE)

Reduce(':', strsplit(x,":")[[1]])
[1] 1 2 3 4 5
If x = "1:5", we can use strsplit to separate the two numbers. We can then use Reduce to apply the : operator to the two pieces; : coerces its character arguments to numeric, so no explicit conversion is needed.
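Since eval(parse()) will run whatever code the imported text contains, a stricter parser can be useful when the key file is not fully trusted. The following is a minimal sketch of the strsplit idea with input checking added; the function name parse_range and the vector keys are hypothetical and not part of the original answers.

```r
# Hypothetical helper: turn a string such as "1:5" into the integer vector 1:5
# without evaluating it as code.
parse_range <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)[[1]]
  # as.integer() yields NA for non-numeric pieces; silence its warning
  # because the check below reports the problem instead.
  bounds <- suppressWarnings(as.integer(parts))
  if (length(bounds) != 2 || anyNA(bounds)) {
    stop("expected a string of the form 'start:end', got: ", x)
  }
  seq(bounds[1], bounds[2])
}

keys <- c("1:5", "1:6", "1:4")   # stand-in for the imported column
parsed <- lapply(keys, parse_range)
parsed[[1]]
```

Anything that is not two integers separated by a colon raises an error rather than being executed.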

Related

is Ifelse... the right function to use here?

I am trying to use ifelse to populate a new column in a data frame.
I want to extract the last digit of a character string in a column if the string is longer than 3 characters. If the character string is shorter, I just want it to give -1.
I have already figured out how to extract the last character of the string if the string is longer than 3 characters.
x<- c("ABCD1", "ABCD2", "ABCD3", "ABCD4", "BC5", "BC6", "BC7")
y<-NULL
dat<-cbind(x,y)
ifelse (nchar(x>3), y=substr(x, 5,5), y=-1)
dat<-cbind(x,y)
view(dat)
When I run this, I get the following error:
Error in ifelse(nchar(x > 3), y = substr(x, 4, 5), y = substr(x, 3)) :
formal argument "yes" matched by multiple actual arguments
What I want is for the vector "y" to contain the numbers 1,2,3,4,-1,-1,-1
so I can bind both columns later. If you have a better way of doing this, I would appreciate it.
You're almost there! This will work as long as the strings with more than 3 characters have their digit in the fifth position (as in "ABCD1").
ifelse(nchar(x) > 3, substr(x, 5, 5), -1)
If the digit might not be in the fifth position (for example, in longer strings), extract the last digit with a regular expression instead:
ifelse(nchar(x) > 3, sub(".*([0-9]).*", "\\1", x), -1)
I am guessing you need a data frame. Here's what you probably need:
x <- c("ABCD1", "ABCD2", "ABCD3", "ABCD4", "BC5", "BC6", "BC7")
dat <- data.frame(x, stringsAsFactors = FALSE)
dat$y <- ifelse(nchar(dat$x) > 3, as.numeric(substr(dat$x, 5,5)), -1)
x y
1 ABCD1 1
2 ABCD2 2
3 ABCD3 3
4 ABCD4 4
5 BC5 -1
6 BC6 -1
7 BC7 -1
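If the long strings can end in more than one digit, the fixed position in substr no longer applies. The sketch below takes all trailing digits instead; the extra element "ABCDE12" is a made-up value added for illustration, and the approach assumes every string ends in at least one digit.

```r
x <- c("ABCD1", "ABCD2", "ABCD3", "ABCD4", "BC5", "BC6", "BC7", "ABCDE12")

# Keep only the run of digits at the end of each string.
# perl = TRUE is needed for the lazy quantifier ".*?".
trailing <- sub("^.*?([0-9]+)$", "\\1", x, perl = TRUE)

# Numeric result: the trailing number for long strings, -1 otherwise.
y <- ifelse(nchar(x) > 3, as.numeric(trailing), -1)
y
```

Because both branches of ifelse are numeric here, y is a numeric vector rather than a character one.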

Sorting a string by specific values

I have the following string:
str1<-"{a{c}{b{{e}{d}}}}"
In addition, I have a vector of integers:
str_d <- c(1, 2, 2, 4, 4)
There is a one-to-one relation between the vector and the characters in the string.
That is:
a 1
c 2
b 2
e 4
d 4
I would like to sort in alphabetical order only the characters of str1 that have the same level.
That is, sorting c, b (which have the same value 2) will yield b,c
and sorting e, d (which have the same value 4) will yield d,e.
The required result will be:
str2<-"{a{b}{c{{d}{e}}}}"
In addition, a, b, c, d and e need not be single characters; they might be words, such as:
str1<-"{NSP{ARD}{BOS{{DUD}{COR}}}}"
How can I do this while keeping the braces in place?
brkts <- gsub("\\w+", "%s", str1)
strings <- regmatches(str1,gregexpr("[^{}]+",str1))[[1]]
fixed <- ave(strings, str_d, FUN=function(x) sort(x))
do.call(sprintf, as.list(c(brkts, fixed)))
[1] "{a{b}{c{{d}{e}}}}"
and
[1] "{NSP{ARD}{BOS{{COR}{DUD}}}}"
This works for both the first and the second case. We first replace each word with %s using gsub; the result is the template used later by sprintf. Next we extract the words with regmatches and gregexpr, matching every run of characters that is not a brace. We then use ave to sort the words within each group defined by the sorting vector and save them in the vector fixed. Lastly, we call sprintf with the brkts template that we created at the beginning and the sorted strings.
Data
str_d <- c(1, 2, 2, 4, 4)
str1<-"{a{c}{b{{e}{d}}}}"
# or, for the second case:
str1<-"{NSP{ARD}{BOS{{DUD}{COR}}}}"
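The steps of this answer can be wrapped into one function so that both cases run without reassigning str1. This is only a sketch: the name sort_within_levels and the length check are additions, and it assumes the words contain no % characters, which would confuse sprintf.

```r
# Hypothetical wrapper around the gsub / regmatches / ave / sprintf steps.
sort_within_levels <- function(s, levels) {
  # Template: every run of non-brace characters becomes a %s placeholder.
  template <- gsub("[^{}]+", "%s", s)
  # The words themselves, in order of appearance.
  tokens <- regmatches(s, gregexpr("[^{}]+", s))[[1]]
  stopifnot(length(tokens) == length(levels))
  # Sort alphabetically within each level, keeping each level's positions.
  sorted <- ave(tokens, levels, FUN = sort)
  do.call(sprintf, as.list(c(template, sorted)))
}

str_d <- c(1, 2, 2, 4, 4)
sort_within_levels("{a{c}{b{{e}{d}}}}", str_d)
sort_within_levels("{NSP{ARD}{BOS{{DUD}{COR}}}}", str_d)
```

Because ave writes each sorted group back into the positions that group occupied, the levels do not have to appear in increasing order along the string.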
One possible solution (using the stringr package):
library(stringr)
words <- str_extract_all(str1, '\\w+')[[1]]
ordered <- words[order(paste(str_d, words))]
formatter <- str_replace_all(str1, '\\w+', '%s')
do.call(sprintf, as.list(c(formatter, ordered)))
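words holds the words extracted from between the braces. I ordered those by sorting on the combination of str_d and the words. Note that this puts all words into one overall sorted order, so it relies on str_d already being non-decreasing along the string, as it is here. E.g. the sort keys are: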
1 a
2 c
2 b
4 e
4 d
Then I slap it all back together with sprintf().

R: edit column values by using if condition

I have a data frame with several columns. One of those contains Plotids like AEG1, AEG2,..., AEG50, HEG1, HEG2,..., HEG50, SEG1, SEG2,..., SEG50. So, the data frame has 150 rows. Now I want to change only some of these Plotids, so that there is AEG01, AEG02,... instead of AEG1, AEG2, ... So, I just want to add a "0" to some of the column entries. I tried it by using lapply, a for loop, writing a function,... but nothing did the job. There was always the error message:
In if (nchar(as.character(dat_merge$EP_Plotid)) == 4)
paste(substr(dat_merge$EP_Plotid, ... :
the condition has length > 1 and only the first element will be used
So, this was my last try:
Plotid_func <- function(x) {
if(nchar(as.character(dat_merge$EP_Plotid))==4)
paste(substr(dat_merge$EP_Plotid, 1, 3), "0", substr(dat_merge$EP_Plotid, 4, 4), sep="")
}
dat_merge$Plotid <- sapply(dat_merge$EP_Plotid, Plotid_func)
With this, I wanted to select only those column entries with four characters and add a 0 to only those selected entries. Can anybody help me? dat_merge is the name of my data frame and EP_Plotid is the column I want to edit. Thanks in advance
Just extract the "string" portion and the "numeric" portion and paste them back together after using sprintf on the numeric portion.
An example:
## "x" is the "column" of plot ids. Here I go up to 12
## to demonstrate the zero padding that it sounds like
## you're looking for
x <- c(paste0("AEG", 1:12), paste0("HEG", 1:12))
## Extract the string values
Strings <- gsub("([A-Z]+)(.*)", "\\1", x)
## Extract the numeric values
Nums <- gsub("([A-Z]+)(.*)", "\\2", x)
## Put them back together
paste0(Strings, sprintf("%02d", as.numeric(Nums)))
# [1] "AEG01" "AEG02" "AEG03" "AEG04" "AEG05" "AEG06"
# [7] "AEG07" "AEG08" "AEG09" "AEG10" "AEG11" "AEG12"
# [13] "HEG01" "HEG02" "HEG03" "HEG04" "HEG05" "HEG06"
# [19] "HEG07" "HEG08" "HEG09" "HEG10" "HEG11" "HEG12"
Or you can just modify your function to actually use the input variable x (which is not happening in your original function)
dat_merge <- data.frame(EP_Plotid = c("AEG1", "AEG2", "AEG50", "HEG1", "HEG2", "HEG50", "SEG1", "SEG2", "SEG50"))
Plotid_func <- function(x) {
if(nchar(as.character(x)) == 4){
paste(substr(x, 1, 3), "0", substr(x, 4, 4), sep="")
} else as.character(x)
}
dat_merge$Plotid <- sapply(dat_merge$EP_Plotid, Plotid_func)
dat_merge
# EP_Plotid Plotid
# 1 AEG1 AEG01
# 2 AEG2 AEG02
# 3 AEG50 AEG50
# 4 HEG1 HEG01
# 5 HEG2 HEG02
# 6 HEG50 HEG50
# 7 SEG1 SEG01
# 8 SEG2 SEG02
# 9 SEG50 SEG50
A vectorized version of your function (preferable to sapply, which is essentially a loop over the elements) would be
dat_merge$Plotid <- ifelse(nchar(as.character(dat_merge$EP_Plotid))==4, paste(substr(dat_merge$EP_Plotid, 1, 3), "0", substr(dat_merge$EP_Plotid, 4, 4), sep=""), as.character(dat_merge$EP_Plotid))
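The ifelse call above hard-codes a three-letter prefix and a single digit. As a sketch under the assumption that every id is a run of letters followed by a run of digits, the padding can be made independent of those lengths; the function name pad_plotid and its width argument are hypothetical.

```r
# Hypothetical helper: zero-pad the numeric suffix of ids such as "AEG1".
pad_plotid <- function(id, width = 2) {
  id <- as.character(id)                         # also handles factor columns
  prefix <- sub("[0-9]+$", "", id)               # drop the trailing digits
  number <- as.integer(sub("^[^0-9]*", "", id))  # drop the leading non-digits
  paste0(prefix, sprintf(paste0("%0", width, "d"), number))
}

pad_plotid(c("AEG1", "AEG50", "HEG2"))
```

It is vectorized, so it can be applied to the whole column at once, for example dat_merge$Plotid <- pad_plotid(dat_merge$EP_Plotid).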
Or use a combination of formatC with str_extract from the stringr package. Using x from Ananda's post: extract the letters and the numbers separately, zero-pad the numbers with formatC, and paste the pieces back together.
library(stringr)
paste0(str_extract(x, "[[:alpha:]]+"), formatC(as.numeric(str_extract(x,"\\d+")), width=2, flag="0"))
#[1] "AEG01" "AEG02" "AEG03" "AEG04" "AEG05" "AEG06" "AEG07" "AEG08" "AEG09"
#[10] "AEG10" "AEG11" "AEG12" "HEG01" "HEG02" "HEG03" "HEG04" "HEG05" "HEG06"
#[19] "HEG07" "HEG08" "HEG09" "HEG10" "HEG11" "HEG12"

Check if the element belongs to a vector and get its indices in R

I want to know if an element belongs to a vector in R. I can do it with %in%, right? However, I also want to know the indices of all occurrences of this element, and I want these indices as a vector as well. For example,
x<-c(1,3,5,5,7,5,8,9,0,5)
y <- myCoolFunction(x, 5)
y should be equal to c(3, 4, 6, 10), because that's where 5 is in x.
I know how to do it algorithm-wise (with ifs and loops etc.); my question is: is there an elegant R-style function to do it? Or a combination of two?
You can use which:
x <- c(1,3,5,5,7,5,8,9,0,5)
which(x == 5)
# [1] 3 4 6 10
Or using %in% for multiple values:
which(x %in% c(1,3))
# [1] 1 2
And in a function:
myCoolFunction = function(vec, value) which(vec %in% value)
myCoolFunction(x, 5)
Although this essentially makes myCoolFunction an alias for which (with slightly different syntax).
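Believe it or not, you can also use grep for this, no quotes necessary. However, it's much slower over large vectors, and because grep does pattern matching on the values converted to character, it also matches values that merely contain the digit, such as 15 or 50 (use grep("^5$", x) for exact matches). Nice for short vectors though...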
> x <- c(1, 3, 5, 5, 7, 5, 8, 9, 0, 5)
> grep(5, x)
## [1] 3 4 6 10
Another variation to the which method uses is.element. Might be easier to read.
> which(is.element(x, 5))
## [1] 3 4 6 10
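One difference between these approaches is worth seeing in code: which(x == 5) compares values, whereas grep converts the values to character and matches a pattern. The sketch below appends a made-up value, 15, to the example vector to make the difference visible.

```r
# The original vector plus a hypothetical extra element, 15.
x <- c(1, 3, 5, 5, 7, 5, 8, 9, 0, 5, 15)

exact    <- which(x == 5)    # value comparison
pattern  <- grep(5, x)       # pattern "5" also matches "15"
anchored <- grep("^5$", x)   # anchored pattern matches only "5"

exact
pattern
anchored
```

Here pattern contains the extra index 11, while exact and anchored agree.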

igraph assign a vector as an attribute for a vertex

I am trying to assign a vector as an attribute for a vertex, but without any luck:
# assignment of a numeric value (everything is ok)
g<-set.vertex.attribute(g, 'checked', 2, 3)
V(g)$checked
# assignment of a vector (is not working)
g<-set.vertex.attribute(g, 'checked', 2, c(3, 1))
V(g)$checked
Checking the manual, http://igraph.sourceforge.net/doc/R/attributes.html, it looks like this is not possible. Is there any workaround?
So far, the only things I have come up with are:
store this information in another structure
convert the vector to a string with delimiters and store it as a string
This works fine:
## replace c(3,1) by list(c(3,1))
g <- set.vertex.attribute(g, 'checked', 2, list(c(3, 1)))
V(g)[2]$checked
[1] 3 1
EDIT: Why does this work?
When you use:
g<-set.vertex.attribute(g, 'checked', 2, c(3, 1))
you get this warning:
number of items to replace is not a multiple of replacement length
Indeed, you are trying to put c(3,1), which has length 2, into a slot of length 1. So the idea is to replace c(3,1) with something similar that has length 1. For example:
length(list(c(3,1)))
[1] 1
length(data.frame(c(3,1)))
[1] 1
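The same length mismatch can be reproduced without igraph. The sketch below uses plain base R vectors as a stand-in for the attribute storage; that analogy is an assumption made for illustration, not a description of igraph's internals, and the names checked and checked_list are hypothetical.

```r
# An atomic vector has one slot per element, so a length-2 replacement warns.
checked <- c(NA, NA, NA)
msg <- tryCatch({
  checked[2] <- c(3, 1)
  "no warning"
}, warning = function(w) conditionMessage(w))
msg

# A list can hold a whole vector in one slot, and list(c(3, 1)) has length 1.
checked_list <- vector("list", 3)
checked_list[2] <- list(c(3, 1))
checked_list[[2]]
length(list(c(3, 1)))
```

Wrapping the vector in list() is what turns two values into a single item that fits one position.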
