I'm a beginner with R and I can't figure out how to do this:
I have a named vector with player names and their scores:
x <- c(3, 4, 6, 2, 3, 5, 0, 1, 1, 2)
names(x) <- c("ALBERTO", "ANTONIO", "PEPE", "JUAN", "ANDRES", "PEDRO", "MARCOS", "MATEO", "JAVIER", "FRANCISCO")
What I need is to get the scores for the players whose names start with the letter "A".
Is it possible to set a condition on the element name?
Thank you!
One way is
x[grepl("^A", names(x))]
# ALBERTO ANTONIO ANDRES
# 3 4 3
^ stands for the beginning of the string in regex. grepl returns a logical vector, which can then be used to index x.
Or (as pointed out in the comments) you could avoid regex and do
x[substr(names(x), 1, 1) == 'A']
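Another regex-free option, assuming R 3.3.0 or later, is base R's startsWith(). A minimal sketch on the same vector (the name a_scores is just for illustration):

```r
# Same named vector as in the question
x <- c(3, 4, 6, 2, 3, 5, 0, 1, 1, 2)
names(x) <- c("ALBERTO", "ANTONIO", "PEPE", "JUAN", "ANDRES",
              "PEDRO", "MARCOS", "MATEO", "JAVIER", "FRANCISCO")

# startsWith() returns one TRUE/FALSE per name, which indexes x directly
a_scores <- x[startsWith(names(x), "A")]
print(a_scores)
```

Like the substr version, this compares literal characters, so there are no regex metacharacters to escape.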
So let me describe the data:
abc and xyz are metrics.
hit is basically the index number, eg:
hit value 3 means the corresponding value in abc3 and xyz3
hit 4 means abc4, xyz4
Data
dat <- data.frame( abc1=c(7, 0, 7),
abc2=c(5, 10, 20),
abc3=c(0, 0, 10),
abc4=c(3, 5, 19),
abc5=c(2, 2, 0),
abc6=c(2, 26, 0),
xyz1=c(0, 2, 0),
xyz2=c(1, 1, 6),
xyz3=c(8, 2, 0),
xyz4=c(6, 3, 5),
xyz5=c(9, 2, 2),
xyz6=c(4, 0, 0),
hit=c(3, 4, 4))
What I need to do is find the abc and xyz values at the hit index and at the positions just before and after it.
The below for loop does the job well for small datasets, but if the data crosses 100k rows, the loop runs seemingly forever.
for (c in c('abc','xyz')){
  for (i in 1:nrow(dat)){
    for (m in -2:2){
      dat[[paste(c,'hit', m)]][i] = dat[i,paste(c, dat$hit[i]-m, sep = "")]
    }
  }
}
In the output file,
'abc hit 0' for row 1 refers to : hit=3 which in turn picks the value in abc3 and assigns to abc hit 0.
abc hit -1 translates to hit=3-1=2 which points to abc2 and xyz2
I know the three nested for loops are a bad idea. Please help me improve the code, using an apply function or any other approach that reduces the execution time.
You appear to have reversed the 'm' in your for-loop: it runs from -2 to 2, but then you've got dat$hit - m. Is the subtraction what you wanted, or should it be dat$hit + m?
You could do something like the below - I've not tested it on large datasets but do give it a go:
dat1 <- do.call(rbind,
                lapply(split(dat, 1:NROW(dat)),
                       function(x) {
                         z <- x[paste0('abc', x$hit + 2:-2)]
                         names(z) <- paste0('abc', -2:2)
                         z
                       }))
The split function gives you the rows of the dataframe, happily preserving the column names, on which you can then use the lapply function to operate line-by-line.
You can lookup the relevant columns of each line by adding -2 through 2 to hit.
Then you bash the resultant list back together into a dataframe.
Update:
This is faster than the above by about 30% for 90K rows:
dat1 <- t(sapply(split(dat, 1:NROW(dat)),
function(x) unname(x[paste0('abc', x$hit + 2:-2)])
))
dat1 <- as.data.frame(dat1)
colnames(dat1) <- paste0('abc', -2:2)
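A different technique from the two snippets above is to skip the per-row split entirely and use matrix indexing, where a two-column (row, column) matrix picks one cell per row. This is only a sketch that I have not timed on large data. The helper name lookup_around_hit and its n_cols argument are made up for illustration, it keeps the question's hit - m convention, and it assumes out-of-range positions should become NA.

```r
dat <- data.frame(abc1 = c(7, 0, 7),  abc2 = c(5, 10, 20), abc3 = c(0, 0, 10),
                  abc4 = c(3, 5, 19), abc5 = c(2, 2, 0),   abc6 = c(2, 26, 0),
                  xyz1 = c(0, 2, 0),  xyz2 = c(1, 1, 6),   xyz3 = c(8, 2, 0),
                  xyz4 = c(6, 3, 5),  xyz5 = c(9, 2, 2),   xyz6 = c(4, 0, 0),
                  hit = c(3, 4, 4))

# Hypothetical helper: one vectorised lookup per offset instead of one per cell
lookup_around_hit <- function(dat, prefix, n_cols = 6, offsets = -2:2) {
  values <- as.matrix(dat[paste0(prefix, seq_len(n_cols))])
  rows <- seq_len(nrow(dat))
  out <- vapply(offsets, function(m) {
    col <- dat$hit - m                      # same sign convention as the loop
    inside <- col >= 1 & col <= n_cols      # guard against missing columns
    picked <- rep(NA_real_, nrow(dat))
    picked[inside] <- values[cbind(rows[inside], col[inside])]
    picked
  }, numeric(nrow(dat)))
  colnames(out) <- paste(prefix, "hit", offsets)
  out
}

res <- cbind(lookup_around_hit(dat, "abc"), lookup_around_hit(dat, "xyz"))
print(res)
```

The result is a numeric matrix with one column per prefix and offset, which can be attached to the original data with cbind(dat, res) if needed.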
I am trying to use ifelse to populate a new column in a data frame.
I want to extract the last digit of a character string in a column if the string is longer than 3 characters. If the character string is shorter, I just want it to give -1.
I already figured out how to extract the last characters of the string if the string is longer than 3 characters.
x<- c("ABCD1", "ABCD2", "ABCD3", "ABCD4", "BC5", "BC6", "BC7")
y<-NULL
dat<-cbind(x,y)
ifelse (nchar(x>3), y=substr(x, 5,5), y=-1)
dat<-cbind(x,y)
view(dat)
When I run this, I get the following error:
Error in ifelse(nchar(x > 3), y = substr(x, 4, 5), y = substr(x, 3)) :
formal argument "yes" matched by multiple actual arguments
What I want is for vector "y" to get the numbers 1,2,3,4,-1,-1,-1
so I can bind both columns later. If you have a better way of doing this I would appreciate it.
You're almost there! This will work as long as the digit is the fifth character, i.e. the strings with length > 3 are exactly 5 characters long.
ifelse(nchar(x) > 3, substr(x, 5, 5), -1)
If your strings might be longer than 5 characters:
ifelse(nchar(x) > 3, sub(".*([0-9]).*", "\\1", x), -1)
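If the digit is always the final character, another way to handle strings of any length is to take the last character with substr and nchar. A minimal sketch under that assumption (the sample vector and the name last_char are illustrative, not from the question):

```r
# Illustrative input: a 5-character, a 6-character and a short string
x <- c("ABCD1", "ABCDE2", "BC5")

# Last character of every string, whatever its length
last_char <- substr(x, nchar(x), nchar(x))

# Numeric result: the trailing digit for long strings, -1 otherwise
y <- ifelse(nchar(x) > 3, as.numeric(last_char), -1)
print(y)
```

Converting with as.numeric inside ifelse keeps y numeric. Without it, ifelse would return a character vector because substr returns strings.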
I am guessing you need a data frame. Here's what you probably need:
x <- c("ABCD1", "ABCD2", "ABCD3", "ABCD4", "BC5", "BC6", "BC7")
dat <- data.frame(x, stringsAsFactors = FALSE)
dat$y <- ifelse(nchar(dat$x) > 3, as.numeric(substr(dat$x, 5,5)), -1)
      x  y
1 ABCD1  1
2 ABCD2  2
3 ABCD3  3
4 ABCD4  4
5   BC5 -1
6   BC6 -1
7   BC7 -1
I wonder (remembering that Perl 6 has everything you could wish for) whether there are built-in tools that can help to produce all the non-empty subsets (order doesn't matter) of a list.
E.g., I have a list:
my @a = 1, 2, 3;
I need a function f so that f(@a) will produce:
((1), (2), (3), (1, 2), (1, 3), (2, 3), (1, 2, 3))
@a.combinations(1..*)
will return the Seq you're looking for. Note that without the argument, an empty list would be generated as first element.
I am importing a key in which each row is a set of arguments for a function I have written. The goal is to batch-test my function by producing outputs for all sets of arguments, but that's not terribly important. What is important is that one imported column contains a range in each row. For instance, "1:5" is meant to be passed to an argument as the value 1:5. I tried to coerce it using as.numeric("1:5"), but R is not happy with this. Is there a way to coerce the character value "1:5" to the vector c(1,2,3,4,5)?
Your text is valid code, so you can eval(parse()) it
dat$parsed <- lapply(dat$key, function(x) eval(parse(text=x)))
# key parsed
# 1 1:5 1, 2, 3, 4, 5
# 2 1:6 1, 2, 3, 4, 5, 6
# 3 1:4 1, 2, 3, 4
Data
dat <- read.table(text="key
1:5
1:6
1:4", stringsAsFactors=FALSE, header=TRUE)
Reduce(':', strsplit(x,":")[[1]])
[1] 1 2 3 4 5
If x = "1:5", we can use strsplit to separate the two numbers. We can then use Reduce to execute the operator : on the split.
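The same split-then-build idea can be applied to the whole column without evaluating text as code. A minimal sketch, assuming every key has the form "start:end" with integer bounds (parse_range is a made-up helper name, and the key vector stands in for dat$key):

```r
# Hypothetical helper: turn "start:end" into the integer sequence start..end
parse_range <- function(s) {
  bounds <- as.integer(strsplit(s, ":", fixed = TRUE)[[1]])
  seq(bounds[1], bounds[2])
}

key <- c("1:5", "1:6", "1:4")      # stand-in for the imported key column
parsed <- lapply(key, parse_range)
print(parsed[[1]])
```

Because only the two bounds are converted, a malformed entry produces NA and an error from seq rather than running arbitrary code.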
I am trying to assign a vector as an attribute for a vertex, but without any luck:
# assignment of a numeric value (everything is ok)
g<-set.vertex.attribute(g, 'checked', 2, 3)
V(g)$checked
# assignment of a vector (is not working)
g<-set.vertex.attribute(g, 'checked', 2, c(3, 1))
V(g)$checked
checking the manual, http://igraph.sourceforge.net/doc/R/attributes.html
it looks like this is not possible. Is there any workaround?
So far, the only workarounds I have come up with are:
store this information in another structure
convert the vector to a string with delimiters and store it as a string
This works fine:
## replace c(3,1) by list(c(3,1))
g <- set.vertex.attribute(g, 'checked', 2, list(c(3, 1)))
V(g)[2]$checked
[1] 3 1
EDIT: Why does this work?
When you use :
g<-set.vertex.attribute(g, 'checked', 2, c(3, 1))
You get this warning :
number of items to replace is not a multiple of replacement length
Indeed, you are trying to put c(3,1), which has length 2, into a slot of length 1. So the idea is to replace c(3,1) with something equivalent but of length 1. For example:
length(list(c(3,1)))
[1] 1
length(data.frame(c(3,1)))
[1] 1
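The same length-1 wrapping can be seen in plain base R, without igraph. The sketch below is only an analogy: the slots list is a stand-in I made up, and I am assuming, not claiming, that igraph stores attributes in a comparable way.

```r
# A list with three empty slots, standing in for one attribute per vertex
slots <- vector("list", 3)

# Wrapping the vector in list() makes the replacement length 1,
# so the whole vector lands in the single slot being assigned
slots[2] <- list(c(3, 1))

print(slots[[2]])
print(length(list(c(3, 1))))
```

Here slots[[2]] holds the complete vector, which mirrors what V(g)[2]$checked returns above.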