Using paste and sum inside a for-loop - r

I need to compare a character string to multiple others and tried to do it the following way:
empty = character(0)
ps_2 = c("h2","h3")
ps_3 = c("h3", "h4")
visible = ("h2")
i = 2
ps_t = empty
ps_t <- append(ps_t, sum(visible %in% paste("ps_", i, sep="")))
My intention is to replace i = 2 with a loop, in order to cycle through ps_2, ps_3, ...
However, I think it's not working because paste() returns the string "ps_2" rather than the character vector stored under the name ps_2.
How can I fix this?
Thanks for the time and effort!
Kind regards,
A fellow datafanatic!

The function you need is get(), which returns the value of the object with the given name.
ps_t <- NULL
sapply(2:3, function(i) append(ps_t, sum(visible %in% get(paste0("ps_", i)))))
Or simply:
sapply(2:3, function(i) sum(visible %in% get(paste0("ps_", i))))
Output
[1] 1 0
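As a possible alternative to looking objects up by name, the vectors could be kept in a named list from the start. This is only a sketch: the list name ps and the choice of vapply are assumptions, not part of the original question.

```r
# Keep the candidate vectors in one named list instead of separate ps_2, ps_3 objects
ps <- list(ps_2 = c("h2", "h3"), ps_3 = c("h3", "h4"))
visible <- "h2"

# For each vector, count how many elements of `visible` it contains
ps_t <- vapply(ps, function(p) sum(visible %in% p), integer(1))
print(ps_t)
```

With a list there is no need to build object names with paste0() at all, and adding ps_4 only means adding one more list element.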

You can use eval in R to convert the string to a variable name. You can find the solution here.
Here's what your code will look like:
ps_t <- c(0, (sum(visible %in% eval(parse(text = paste("ps_", i, sep=""))))))
It will give you a numeric vector.
OR
You can use get.
ps_t <- append(0, sum(visible %in% get(paste("ps_", i, sep = ""))))
ps_t
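Since the question asks for a loop over i, the get() approach could be wrapped in a for-loop roughly like this (a sketch that assumes ps_2 and ps_3 exist in the current environment, as in the question):

```r
ps_2 <- c("h2", "h3")
ps_3 <- c("h3", "h4")
visible <- "h2"

ps_t <- integer(0)
for (i in 2:3) {
  current <- get(paste0("ps_", i))                 # look up ps_2, then ps_3
  ps_t <- append(ps_t, sum(visible %in% current))  # add the count for this vector
}
print(ps_t)
# [1] 1 0
```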

Related

explicit statement works, but function modeled on statement returns empty list

I have an explicit statement that works just fine, but fails when I try to convert it into a function, and I'm not sure where I'm going wrong. I have:
id <- c(1, 2, 3, 4, 5, 6)
string <-c("apple", "grape", "orange", "tomato", "pear", "plum")
df <- data.frame(id, string)
If I say
list_value = paste0(df$string, sep = ";")
I get the following character vector returned, which is what I want:
list_value = "apple;" "grape;" "orange;" "tomato;" "pear;" "plum;"
But if I try to write a function
concat <- function(d, n, s) {
list_value = paste0({{d$n}}, sep = s)
return(list_value)
}
then ask for
test <- concat(df, string, ";")
print(test)
all I get is ";". Why does the explicit statement work, but the function returns an empty list? I specifically need the function, because I want to loop over it for unique values in the id column of the df.
thanks
*edited to fix misplaced ;
The expression {{d$n}} doesn't mean anything special in R: it's just the same as d$n. Since your dataframe didn't have a column named "n", it gives NULL, and you get the result you saw.
You've probably been confused because some tidyverse functions evaluate things like {{d$n}} in a special way. But your function doesn't use any of the tidyverse non-standard evaluation, so it's just R you're working with.
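A small base-R check of the points above (the variable names on the left-hand side are made up for illustration):

```r
df <- data.frame(id = 1:3, string = c("apple", "grape", "orange"))
n <- "string"

missing_col <- df$n                     # looks for a column literally named "n": NULL
only_sep    <- paste0(df$n, sep = ";")  # pasting NULL with ";" leaves just ";"
via_name    <- df[[n]]                  # [[ ]] uses the value of n, i.e. the "string" column

# Outside of tidyverse functions, double braces are just nested braces
braces_same <- identical({{ df$string }}, df$string)
```

This reproduces the ";" result from the question and shows why indexing with [[ ]] is the usual way to select a column whose name is stored in a variable.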
To get what you want, you would have to write the function as
concat <- function(d, n, s) {
list_value = paste0(d[[deparse(substitute(n))]], sep = s)
return(list_value)
}
which is pretty ugly. I'd recommend specifying that the second argument needs to be a string, then you could write
concat2 <- function(d, n, s) {
list_value = paste0(d[[n]], sep = s)
return(list_value)
}
but you would have to call it as
concat2(df, "string", ";")

Show a value inside a prompt in R

I have been searching this for some time and haven't been able to find the answer. Hope you can help me:
a <- readline(prompt="No. of attributes: ")
lev <- c()
i <- 0
while (i<a) {
l <- readline(prompt="No. of levels in attribute i: ")
l <- as.numeric(strsplit(l,",")[[1]])
lev <- c(lev, l)
i=i+1
}
Inside the loop, in the prompt for l, I want the i to be replaced by the actual value of i.
Sorry for being such a noob.
Thank you!
You could use
prompt=paste0("No. of levels in attribute ", i, ":")
Edit: FYI, there's also a paste function that is very similar, but puts spaces between the strings it pastes. Also look into the collapse= parameter to paste and paste0 if you're trying to paste together a vector of strings.
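To illustrate the difference between paste, paste0 and collapse, here is a short sketch (the variable names are only for illustration):

```r
i <- 3

with_space <- paste("No. of levels in attribute", i, ":")     # default separator is a space
no_space   <- paste0("No. of levels in attribute ", i, ": ")  # no separator added
joined     <- paste(c("a", "b", "c"), collapse = ", ")        # one string from a vector

print(with_space)  # "No. of levels in attribute 3 :"
print(no_space)    # "No. of levels in attribute 3: "
print(joined)      # "a, b, c"
```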
Do you mean "assign the prompt value to 'i'"? Wouldn't this work?
(edited)
Seems like you need three changes --
use paste0 (which concatenates its arguments with no separator, hence the "0")
assign the result of readline() to "l" using a prompt built from i, so the prompt changes on each iteration
you probably want i to start at 1 and to change i<a to i<=a, so that the attributes are numbered from 1 and you still prompt once per attribute:
a <- as.numeric(readline(prompt="No. of attributes: "))  # readline returns text; convert so i<=a compares numbers
lev <- c()
i <- 1
while (i<=a) {
l <- readline(prompt=paste0("No. of levels in attribute ",i,": "))
l <- as.numeric(strsplit(l,",")[[1]])
lev <- c(lev, l)
i=i+1
}

Modify the object without using return in R function

I am trying to reverse a string without using extra space in R. Below is the code for the same. My question is how to get the ReverseString function change the input without using extra space. I even tried using <<- without any luck.
ReverseString <- function(TestString){
TestString <- unlist(strsplit(TestString, ""))
Left <- 1
Right <- length(TestString)
while (Left < Right){
Temp <- TestString[Left]
TestString[Left] <- TestString[Right]
TestString[Right] <- Temp
Left <- Left + 1
Right <- Right - 1
}
return(paste(TestString, collapse = ""))
}
## Input
a = "StackOverFlow"
## OutPut
ReverseString(a)
"wolFrevOkcatS"
##
a
"StackOverFlow"
It is always better to take advantage of the vectorization in R (instead of for or while loops). So, in base-R, without any packages, it would be something like:
ReverseString <- function(x) {
#strsplit splits the string into single characters, and rev reverses their order
out <- rev(strsplit(x, split = '')[[1]])
#paste to paste them together
paste(out, collapse = '')
}
a <- "StackOverFlow"
ReverseString(a)
#[1] "wolFrevOkcatS"
According to your comment you want to reverse the string without calling any function that does the reversal, i.e. no rev and co. Both of the solutions below do this.
I think you are also trying to modify global a from within the function, which is why you tried <<-. I'm not sure why it didn't work for you, but you might have used it incorrectly.
You should know that using <<- alone does not mean that you are using less space. To really save space you would have to call or modify global a at each step in your function where you call or modify TestString. This would entail some combination of assign, do.call, eval and parse - not to mention all the pasting you would have to do to access elements of a by integer position. Your function would end up bulky, nearly unreadable, and very likely less efficient due to the numerous function calls, despite having saved a negligible amount of space by not storing a copy of a. If you're dead set on creating such an abomination, then take a look at the functions I just listed and figure out how to use them.
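For reference, here is a minimal sketch of how ordinary assignment and <<- behave inside a function. The helper names modify_copy and modify_global are hypothetical and only serve the illustration:

```r
a <- "StackOverFlow"

# Ordinary assignment only changes the function's local copy of the argument
modify_copy <- function(x) {
  x <- "changed"
  invisible(NULL)
}
modify_copy(a)
a_after_copy <- a   # still "StackOverFlow"

# <<- assigns to a variable in an enclosing environment instead
modify_global <- function() {
  a <<- "changed"
  invisible(NULL)
}
modify_global()
a_after_global <- a # now "changed"
```

Note that <<- changes which variable is assigned to; it does not by itself avoid creating the intermediate character vector.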
Your energy would be better spent by improving upon your string-reversing function in other ways. For example, you can shorten it quite a bit by using a numerical sequence such as 13:1 in sapply:
reverse_string <- function(string) {
vec <- strsplit(string, "")[[1]]
paste(sapply(length(vec):1, function(i) vec[i]), collapse = "")
}
reverse_string("StackOverFlow")
#### OUTPUT ####
[1] "wolFrevOkcatS"
If your interviewers also have a problem with reverse sequences then here's another option that's closer to your original code, just a little cleaner. I also did my best to eliminate other areas where "extra space" was being used (indices stored in single vector, no more Temp):
reverse_string2 <- function(string){
vec <- strsplit(string, "")[[1]]
i_vec <- c(1, length(vec))
while(i_vec[1] < i_vec[2]) {
vec[i_vec] <- vec[c(i_vec[2], i_vec[1])]
i_vec <- i_vec + c(1, -1)
}
paste(vec, collapse = "")
}
reverse_string2("StackOverFlow")
#### OUTPUT ####
[1] "wolFrevOkcatS"
It can be done easily with stringi
library(stringi)
a <- "StackOverFlow"
stri_reverse(a)
#[1] "wolFrevOkcatS"
I'm not sure I understood exactly the problem, but I think you're looking for a way to reverse the string object and automatically assign it to the original object without having to do a <- ReverseString(a) (assuming this is the reason why you tried using <<-). My solution to this is using deparse(substitute()) to read the original variable name inside the function and assign (using envir = .GlobalEnv) to assign your result over the original variable.
ReverseString <- function(TestString){
nm <- deparse(substitute(TestString))
TestString <- unlist(strsplit(TestString, ""))
Left <- 1
Right <- length(TestString)
while (Left < Right){
Temp <- TestString[Left]
TestString[Left] <- TestString[Right]
TestString[Right] <- Temp
Left <- Left + 1
Right <- Right - 1
}
assign(nm, paste(TestString, collapse = ""), envir = .GlobalEnv)
}
## Input
a = "StackOverFlow"
ReverseString(a)
a
#[1] "wolFrevOkcatS"
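A variation on the same idea, sketched under the assumption that the variable should be overwritten in whatever environment the function was called from rather than always in .GlobalEnv. The name reverse_in_place is hypothetical, and rev is used only to keep the sketch short:

```r
reverse_in_place <- function(x) {
  nm <- deparse(substitute(x))   # name of the variable passed in, e.g. "a"
  caller <- parent.frame()       # environment the function was called from
  chars <- strsplit(x, "")[[1]]
  assign(nm, paste(rev(chars), collapse = ""), envir = caller)
  invisible(NULL)
}

a <- "StackOverFlow"
reverse_in_place(a)
print(a)
# [1] "wolFrevOkcatS"
```

This only makes sense when the argument is a plain variable name; calling it with a literal string would create an oddly named variable in the caller.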

Assigning new strings with conditional match

I have an issue about replacing strings with the new ones conditionally.
I have put together a short version of my real problem. So far it works, but I need a better solution since the real data has many rows.
strings <- c("ca_A33","cb_A32","cc_A31","cd_A30")
Basically, I want to replace part of each element of strings with the corresponding element of replace_strings: the first item in strings is changed using the first item in replace_strings, and so on.
replace_strings <- c("A1","A2","A3","A4")
So the final result should look like
final_string <- c("ca_A1","cb_A2","cc_A3","cd_A4")
I wrote a simple function, assign_new:
assign_new <- function(x){
ifelse(grepl("A33",x),gsub("A33","A1",x),
ifelse(grepl("A32",x),gsub("A32","A2",x),
ifelse(grepl("A31",x),gsub("A31","A3",x),
ifelse(grepl("A30",x),gsub("A30","A4",x),x))))
}
assign_new(strings)
[1] "ca_A1" "cb_A2" "cc_A3" "cd_A4"
OK, it seems we have a solution. But suppose I have A1000 down to A1 and want to replace them with A1 up to A1000: I would need 1000 nested ifelse statements. How can we tackle that?
If your vectors are ordered to be matched, then you can use:
> paste0(gsub("(.*_)(.*)","\\1", strings ), replace_strings)
[1] "ca_A1" "cb_A2" "cc_A3" "cd_A4"
You can use regmatches. First, use regexpr to find all the characters that follow the _, then replace them as shown below:
`regmatches<-`(strings,regexpr("(?<=_).*",strings,perl = TRUE),value=replace_strings)
[1] "ca_A1" "cb_A2" "cc_A3" "cd_A4"
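The same regmatches approach, broken into separate steps as a sketch (the intermediate names m, old_parts and result are only for illustration):

```r
strings <- c("ca_A33", "cb_A32", "cc_A31", "cd_A30")
replace_strings <- c("A1", "A2", "A3", "A4")

# Locate everything after the underscore (lookbehind needs perl = TRUE)
m <- regexpr("(?<=_).*", strings, perl = TRUE)

# Inspect what was matched
old_parts <- regmatches(strings, m)

# Replace the matched parts position by position
result <- strings
regmatches(result, m) <- replace_strings
print(result)
# [1] "ca_A1" "cb_A2" "cc_A3" "cd_A4"
```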
Not the fastest, but very tractable and easy to maintain:
for (i in seq_along(strings)) {
strings[i] <- gsub("\\d+$", i, strings[i])
}
"\\d+$" just matches any number at the end of the string.
EDIT: Per @Onyambu's comment, removing map2_chr as paste is a vectorized function.
foo <- function(x, y){
x <- unlist(lapply(strsplit(x, "_"), '[', 1))
paste(x, y, sep = "_")
}
foo(strings, replace_strings)
with x being strings and y being replace_strings. You first split the strings object at the _ character, and paste with the respective replace_strings object.
EDIT:
For objects where there is no positional relationship you could create a reference table (dataframe, list, etc.) and match your values.
reference_tbl <- data.frame(strings, replace_strings)
foo <- function(x){
y <- reference_tbl$replace_strings[match(x, reference_tbl$strings)]
x <- unlist(lapply(strsplit(x, "_"), '[', 1))
paste(x, y, sep = "_")
}
foo(strings)
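When the mapping depends on the old code rather than on position, a named character vector can also serve as the lookup table. This is a sketch; the name lookup and the shuffled input order are assumptions made for the example:

```r
# Old code -> new code, independent of the order of `strings`
lookup <- c(A33 = "A1", A32 = "A2", A31 = "A3", A30 = "A4")
strings <- c("cd_A30", "ca_A33", "cc_A31", "cb_A32")

prefix   <- sub("_.*$", "", strings)   # part before the underscore
old_code <- sub("^.*_", "", strings)   # part after the underscore

result <- paste(prefix, unname(lookup[old_code]), sep = "_")
print(result)
# [1] "cd_A4" "ca_A1" "cc_A3" "cb_A2"
```

For the A1000 to A1 case, the lookup vector could be generated with setNames() and paste0() instead of being typed out.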
Using the dplyr package:
library(dplyr)
strings <- c("ca_A33","cb_A32","cc_A31","cd_A30")
replace_strings <- c("A1","A2","A3","A4")
df <- data.frame(strings, replace_strings)
df <- mutate(rowwise(df),
strings = gsub("_.*",
paste0("_", replace_strings),
strings)
)
df <- select(df, strings)
Output:
# A tibble: 4 x 1
strings
<chr>
1 ca_A1
2 cb_A2
3 cc_A3
4 cd_A4
yet another way:
mapply(function(x,y) gsub("(\\w\\w_).*",paste0("\\1",y),x),strings,replace_strings,USE.NAMES=FALSE)
# [1] "ca_A1" "cb_A2" "cc_A3" "cd_A4"

R NameValue from CSV String - access value via name

I am new to R and have a question I don't know how to solve. Maybe you can help?
I have a delimited name/value input string: param1=test;param2=3;param3=140;
I would like to access a value via its name in R.
Something like
myParams["param1"]
I already tried something like:
input = "param1=test;param2=3;param3=140;"
output1 = strsplit(input,";")[[1]]
output2 = do.call(rbind, strsplit(output1, "="))
to get a matrix, but I am missing the rest.
You could define a custom function myParams:
# Your sample data
input = "param1=test;param2=3;param3=140;"
output1 = strsplit(input,";")[[1]]
output2 = do.call(rbind, strsplit(output1, "="))
# Define function
myParams <- function(par, df = output2) {
return(df[which(df[, 1] == par), 2])
}
myParams("param1");
#[1] "test"
myParams("param2");
#[1] "3"
A simple way would be to create a dataframe out of that matrix first and then access the value via row names
input = "param1=test;param2=3;param3=140;"
output1 = strsplit(input,";")[[1]]
output2 = do.call(rbind, strsplit(output1, "="))
temp = data.frame(output2,row.names = TRUE)
# X2
#param1 test
#param2 3
#param3 140
temp["param1", ]
#test
temp["param2", ]
#3
temp["param3", ]
#140
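Another option, sketched here as an assumption rather than taken from the answers above, is to build a named character vector, which supports the myParams["param1"] style of access from the question directly:

```r
input <- "param1=test;param2=3;param3=140;"

# Split into "name=value" pieces, then split each piece at "="
pairs <- strsplit(strsplit(input, ";")[[1]], "=")

# Values become the vector elements, names become the element names
myParams <- setNames(
  vapply(pairs, `[`, character(1), 2),
  vapply(pairs, `[`, character(1), 1)
)

myParams["param1"]               # named element: param1 "test"
myParams[["param2"]]             # bare value: "3"
as.numeric(myParams[["param3"]]) # values are text, so convert when a number is needed
```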
