Automatically replacing only particular characters in a string

To illustrate my problem: we have a string containing a random system of equations:
x0<-"3w+2x+y-3z=-5; 5w+x+2z=31; -2w-x+3y+4z=7; -3x-5y+z=8"
Next steps:
library(gsubfn) # provides strapply() and gsubfn()
varnames <- sort(strapply(x0, "[a-z]", simplify = unique))
spl <- strsplit(x0, ";")[[1]]
my_string<-unlist(spl)
my_string<-trimws(my_string)
ss1 <- gsubfn("[a-z]", x ~ (match(x, varnames) == seq_along(varnames))+0,
spl)
ss2 <- gsub("(\\d)c", "\\1*c", ss1)
ss3 <- sub("=.*", "", ss2)
A <- eval(parse(text = paste("rbind(", paste(ss3, collapse = ","), ")")))
b <- as.numeric(sub(".*=", "", ss2))
z<-matrix(cbind(A,b), nrow=ncol(cbind(A,b)), ncol=nrow(cbind(A,b)),
byrow=TRUE)
x1<-toString(z)
x1<-stringr::str_replace_all(x1,","," &")
x1
The output is:
3 & 2 & 1 & -3 & -5 & 5 & 1 & 0 & 2 & 31 & -2 & -1 & 3 & 4 & 7 & 0 & -3 & -5 & 1 & 8
But I want to achieve:
3 & 2 & 1 & -3 &|& -5 \\ 5 & 1 & 0 & 2 &|& 31 \\ -2 & -1 & 3 & 4 &|& 7 \\ 0 & -3 & -5 & 1 &|& 8
In other words: how can I replace every fourth (in this example) "&" in x1 (which corresponds to "=" in the x0 string) with "&|&", and every fifth "&" (which corresponds to ";" in x0) with "\\", so that I can create a LaTeX table in Markdown?
Thank you in advance.

Man that took a lot out of me, lol. Feels like there must be an easier way..
library(stringr)
x2 <- str_replace_all(x1, '(-?\\d+\\s&\\s-?\\d+\\s&\\s-?\\d+\\s&\\s)(-?\\d+\\s?)&?\\s?(-?\\d+)\\s?&?\\s?', '\\1\\2&|& \\3 \\\\ ')
x2 <- substr(x2, 1, nchar(x2)-3)
x2
#[1] "3 & 2 & 1 & -3 &|& -5 \\ 5 & 1 & 0 & 2 &|& 31 \\ -2 & -1 & 3 & 4 &|& 7 \\ 0 & -3 & -5 & 1 &|& 8"
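The regex above hard-codes four coefficients per equation. A minimal sketch of an alternative, assuming the augmented matrix is still available as a matrix (with the question's code that would be cbind(A, b)), is to format each row before flattening. The names Ab and format_row are illustrative, and the values are copied from the question's output:

```r
# Augmented matrix [A | b] for the example system (values copied from the question)
Ab <- matrix(c( 3,  2,  1, -3, -5,
                5,  1,  0,  2, 31,
               -2, -1,  3,  4,  7,
                0, -3, -5,  1,  8),
             nrow = 4, byrow = TRUE)

n <- ncol(Ab) - 1  # number of coefficient columns

# Format one row: coefficients joined by " & ", then "&|&", then the right-hand side
format_row <- function(r) {
  paste0(paste(r[seq_len(n)], collapse = " & "), " &|& ", r[n + 1])
}

# Join the rows with a LaTeX row break (two literal backslashes)
x2 <- paste(apply(Ab, 1, format_row), collapse = " \\\\ ")
cat(x2, "\n")
```

Because the row length is taken from ncol(Ab), this should not need changing when the system has more or fewer unknowns. Note that the string holds two literal backslashes between rows, which cat() prints unescaped.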

Related

Print a matrix flattened by column using a separator

I have the following matrix made from a vector of vectors, which I want to print separated by the & operator.
vec1 <- c(1, 2, 3, 4)
vec2 <- c(5, 6, 7, 8)
vec3 <- c(9, 10, 11, 12)
vec4 <- c(13, 14, 15, 16)
vec5 <- c(17, 18, 19, 20)
vec6 <- c(21, 22, 23, 24)
Mat <- matrix(c(vec1, vec2, vec3, vec4, vec5, vec6), nrow = 6, ncol = 4, byrow = TRUE)
(vect1 <- c(Mat[1,1], Mat[1,2], Mat[1,3], Mat[1,4], Mat[3,1], Mat[3,2], Mat[3,3], Mat[3,4], Mat[5,1], Mat[5,2], Mat[5,3], Mat[5,4]))
This is what I want for the above.
[1] 1 & 2 & 3 & 4 & 9 & 10 & 11 & 12 & 17 & 18 & 19 & 20
(vect2 <- c(Mat[2,1], Mat[2,2], Mat[2,3], Mat[2,4], Mat[4,1], Mat[4,2], Mat[4,3], Mat[4,4], Mat[6,1], Mat[6,2], Mat[6,3], Mat[6,4]))
This is what I want for the above.
[1] 5 & 6 & 7 & 8 & 13 & 14 & 15 & 16 & 21 & 22 & 23 & 24
I actually need this output for a LaTeX table, where the & symbol separates each element from the next.

c() is a convenient function to flatten matrices by column, so t() then c() flattens by row.
Mat |>
t() |>
c() |>
paste(collapse = " & ")
"1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 & 11 & 12 & 13 & 14 & 15 & 16 & 17 & 18 & 19 & 20 & 21 & 22 & 23 & 24"
Feel free to leave out the paste step if you do not require it in string format.
|> is the base R pipe (available from R 4.1.0), if you are unfamiliar with it.
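The question asks for the odd and the even rows as two separate strings. A minimal sketch of how the same t() then c() idea could be applied to a subset of rows follows; flatten_rows is a hypothetical helper name, and nested calls are used instead of the pipe so that it also runs on older R versions:

```r
# Same values as the question's Mat, built more compactly
Mat <- matrix(1:24, nrow = 6, ncol = 4, byrow = TRUE)

# Flatten the selected rows row by row and join the values with " & "
flatten_rows <- function(m, rows) {
  paste(c(t(m[rows, , drop = FALSE])), collapse = " & ")
}

odd_rows  <- flatten_rows(Mat, c(1, 3, 5))
even_rows <- flatten_rows(Mat, c(2, 4, 6))
odd_rows
even_rows
```

drop = FALSE keeps the selection a matrix even when only one row is requested, so t() behaves the same in that case.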

Encountering conditional if else

I have a data frame (test) with 4 rows and 2 columns. I intended to use the ifelse function to fix the data set. The code is below:
test <- data.frame(cbind(c(4,-5,-6,1),c("1","-3","4","-3")),stringsAsFactors = F)
test$X1 <- as.numeric(test$X1)
test$X2 <- as.numeric(test$X2)
test$X2 <- ifelse(test$X1<0 & test$X2>0, test$X2, test$X2*-1)
How can I write code that applies the condition in both directions: if X1 < 0 & X2 > 0, make X2 negative, and likewise if X1 > 0 & X2 < 0, make X1 negative?
The expected output is:
X1 <- 4 -5 -6 -1
X2 <- 1 -3 -4 -3
I would appreciate any ideas.

We could achieve the desired result as follows using dplyr, assuming I understood the logic correctly (if X1 < 0 & X2 > 0, then make X2 < 0, and the same for X1 the other way around):
library(dplyr)
test %>%
mutate(X2 = ifelse(X1 <0 & X2>0, -X2, X2),
X1 = ifelse(X2<0 & X1>0, -X1,X1))
X1 X2
1 4 1
2 -5 -3
3 -6 -4
4 -1 -3

You could do
test$X2 <- with(test, X2 * c(1, -1)[(X1 < 0 & X2 > 0) + 1])
test$X1 <- with(test, X1 * c(1, -1)[(X1 > 0 & X2 < 0) + 1])
test
# X1 X2
#1 4 1
#2 -5 -3
#3 -6 -4
#4 -1 -3
To explain, let's take the first case.
The condition returns a logical vector
with(test, X1 < 0 & X2 > 0)
#[1] FALSE FALSE TRUE FALSE
By adding + 1 we convert it to numerical index where FALSE becomes 1 and TRUE becomes 2
with(test, X1 < 0 & X2 > 0) + 1
#[1] 1 1 2 1
We use this index to subset c(1, -1)
c(1, -1)[with(test, X1 < 0 & X2 > 0) + 1]
#[1] 1 1 -1 1
which is then multiplied by X2
with(test, X2 * c(1, -1)[(X1 < 0 & X2 > 0) + 1])
#[1] 1 -3 -4 -3
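As a small self-contained check, the indexing trick can be compared with a plain ifelse() on the question's data. This is only a sketch; the names flip2, via_index and via_ifelse are made up for the illustration:

```r
# Data from the question, built directly as numeric columns
test <- data.frame(X1 = c(4, -5, -6, 1), X2 = c(1, -3, 4, -3))

# Rows where X2 should change sign
flip2 <- test$X1 < 0 & test$X2 > 0

# Index trick: FALSE + 1 selects 1, TRUE + 1 selects -1
via_index <- test$X2 * c(1, -1)[flip2 + 1]

# The same result written with ifelse()
via_ifelse <- ifelse(flip2, -test$X2, test$X2)

identical(via_index, via_ifelse)
```

Both forms are vectorised, so the choice between them is mostly a matter of readability.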

programming R ifelse conditions loop

Hello, I need help with R programming. I have a data.frame B with four columns:
x<- c(1,2,1,2,1,2,1,2,1,2,1,2,.......etc.)
y<-c(5,5,8,8,12,12,19,19,30,30,50,50,...etc.)
z<- c(2018-11-08,2018-11-08,2018-11-09,2018-11-09,2018-11-11,2018-11-11,2018-11-20,2018-11-20,2018-11-29,2018-11-29,2018-11-30,2018-11-30,.......etc.)
m<-c(0,1,1,0,1,1,0,1,0,1,0,1,...etc.)
It has 2 million rows, and I need to create a new column. The new column should look like this:
t<-c(0,1,0,0,0,0,0,1,0,1,0,1,....)
Written as a loop, the code looks like this:
B$t[1]=ifelse(B$y[i]==B$y[i+1] & B$z[i]==B$z[i+1] & B$x[i]==2 & B$m[1]==1,1,0)
for (i in 2:length(B$z))
{
B$t[i]<-ifelse(B$y[i]==B$y[i-1] & B$z[i]==B$z[i-1] & B$x[i]==2 & B$m[i]==1 & B$m[i]!=B$m[i-1],1,0)
}
I do not want to use a loop.
I use only base R.
I also have a second question, for a data.frame E:
x<- c(1,2,3,1,2,3,1,2,3,1,2,3,.......etc.)
y<-c(5,5,5,8,8,8,12,12,12,19,19,19,30,30,30,50,50,50,...etc.)
z<- c(2018-11-08,2018-11-08,2018-11-08,2018-11-09,2018-11-09,2018-11-09,2018-11-11,2018-11-11,2018-11-11,2018-11-20,2018-11-20,2018-11-20,2018-11-29,2018-11-29,2018-11-29,2018-11-30,2018-11-30,2018-11-30,.......etc.)
m<-c(0,1,1,0,0,1,0,1,0,1,0,1,0,0,1...etc.)
It has 2 million rows, and I need to create a new column. The new column should look like this:
t<-c(0,1,0,0,1,....)
Written as a loop, the code looks like this:
E$t[1]=ifelse(E$y[i]==E$y[i+1] & E$z[i]==E$z[i+1] & E$x[1]==2 & E$m[1]==1,1,0)
E$t[2]=ifelse(E$y[i]==E$y[i+1] & E$z[i]==E$z[i+1] & E$x[2]==3 & E$m[2]==1,1,0)
for (i in 3:length(E$y))
{
E$t[i]<-ifelse(E$y[i]==E$y[i-2] & E$z[i]==E$z[i-2] & E$x[i]==3 & E$m[i]==1 &
E$m[i-1]==0 & E$m[i-2]==0,1,0)
}
I do not want to use a loop.
I use only base R.

Here is a solution with base R:
N <- nrow(B)
B$t <- ifelse(B$y==c(NA, B$y[-N]) & B$z==c(NA, B$z[-N]) & B$x==2 & B$m==1 & B$m!=c(NA, B$m[-N]), 1, 0)
Here is a solution with data.table:
library("data.table")
B <- data.table(
x= c(1,2,1,2,1,2,1,2,1,2,1,2), y= c(5,5,8,8,12,12,19,19,30,30,50,50),
z= c("2018-11-08", "2018-11-08", "2018-11-09", "2018-11-09", "2018-11-11", "2018-11-11", "2018-11-20",
"2018-11-20", "2018-11-29", "2018-11-29", "2018-11-30", "2018-11-30"),
m= c(0,1,1,0,1,1,0,1,0,1,0,1)
)
B[, t := ifelse(y==c(NA, y[- .N]) & z==c(NA, z[- .N]) & x==2 & m==1 & m!=c(NA, m[- .N]), 1, 0)]
or (if logical is acceptable)
B[, t := (y==c(NA, y[- .N]) & z==c(NA, z[- .N]) & x==2 & m==1 & m!=c(NA, m[- .N]))]
or using shift()
B[, t := (y==shift(y) & z==shift(z) & x==2 & m==1 & m!=shift(m))]
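The second data.frame (E) compares each row with the rows one and two positions earlier. A minimal base R sketch of the same shifting idea follows. It assumes the loop condition for i >= 3 is the intended rule (the expected t shown in the question does not obviously match that loop), it uses the first 15 rows of the question's example, and lag_by is a hypothetical helper, not a base R function:

```r
# First 15 rows of the question's example for E
E <- data.frame(
  x = rep(1:3, times = 5),
  y = rep(c(5, 8, 12, 19, 30), each = 3),
  z = rep(c("2018-11-08", "2018-11-09", "2018-11-11",
            "2018-11-20", "2018-11-29"), each = 3),
  m = c(0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1)
)

# Hypothetical helper: shift a vector down by k positions, padding with NA
lag_by <- function(v, k) c(rep(NA, k), head(v, -k))

# Vectorised form of the loop condition
cond <- E$y == lag_by(E$y, 2) & E$z == lag_by(E$z, 2) &
  E$x == 3 & E$m == 1 &
  lag_by(E$m, 1) == 0 & lag_by(E$m, 2) == 0

# %in% TRUE maps any remaining NA to FALSE before converting to 0/1
E$t <- as.integer(cond %in% TRUE)
E$t
```

The first two rows have no row two positions earlier, so they come out as 0 here. If they need special handling, as in the question's loop, they could be assigned separately afterwards.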

With dplyr you can use if_else and lag:
library(dplyr)
dat %>%
mutate(t = if_else(
y == lag(y) & z == lag(z) & x == 2 & m == 1 & m != lag(m), 1, 0)
) # mutate lets you create a new variable in dat (named t here)
# x y z m t
# 1 1 5 2018-11-08 0 0
# 2 2 5 2018-11-08 1 1
# 3 1 8 2018-11-09 1 0
# 4 2 8 2018-11-09 0 0
# 5 1 12 2018-11-11 1 0
# 6 2 12 2018-11-11 1 0
# 7 1 19 2018-11-20 0 0
# 8 2 19 2018-11-20 1 1
# 9 1 30 2018-11-29 0 0
# 10 2 30 2018-11-29 1 1
# 11 1 50 2018-11-30 0 0
# 12 2 50 2018-11-30 1 1
Data:
x<- c(1,2,1,2,1,2,1,2,1,2,1,2)
y<-c(5,5,8,8,12,12,19,19,30,30,50,50)
z<- c("2018-11-08","2018-11-08","2018-11-09","2018-11-09","2018-11-11","2018-11-11","2018-11-20","2018-11-20","2018-11-29","2018-11-29","2018-11-30","2018-11-30")
m<-c(0,1,1,0,1,1,0,1,0,1,0,1)
dat <- data.frame(x, y, z, m)

Creating vectors with ifelse or if else

I still get tripped up using ifelse and if...else when I want to create a vector or new data.frame variable. The title of this question seems closely related, but does not address my issue: Why can't R's ifelse statements return vectors?
The code below shows my attempts to create the variables my.data2$v1b and my.data2$v2b. I failed with ifelse and if...else then succeeded with a for-loop and with apply.
Is there a way to create my.data2$v1b and my.data2$v2b with ifelse or if...else? I assume not based on my attempts and other Stack Overflow questions. So, what is the canonical way of creating these variables in R? Using apply works, but seems rather complex. Using a for-loop works but I get the impression for-loops are to be avoided.
There are many questions about ifelse, but I did not locate one that addressed this specific question: given that ifelse and if...else do not seem to work, what is the best solution? Sorry if this is a duplicate.
Here is my data set:
my.data2 <- read.table(text = '
refno v1 v2 state1 state2 xday first last
111 41 47 1 2 42 1 2
111 41 47 1 2 42 2 1
222 45 49 1 4 47 1 2
222 45 49 1 4 47 2 1
333 59 65 1 2 65 1 2
333 59 65 1 2 65 2 1
444 45 49 1 2 48 1 2
444 45 49 1 2 48 2 1
555 66 80 1 2 75 1 2
555 66 80 1 2 75 2 1
666 103 109 1 2 108 1 2
666 103 109 1 2 108 2 1
777 43 46 1 2 45 1 2
777 43 46 1 2 45 2 1
', header = TRUE, stringsAsFactors = FALSE)
Here are the desired vectors:
desired.data.v1b <- c(41,42, 45,47, 59,65, 45,48, 66,75, 103,108, 43,45)
desired.data.v2b <- c(42,47, 47,49, 65,65, 48,49, 75,80, 108,109, 45,46)
Here is where I start attempting to create these vectors:
v1b <- my.data2$v1
v2b <- my.data2$v2
# this ifelse does not work
my.data2$v1b < ifelse(my.data2$state1 == 1 & my.data2$state2 %in% c(2,4) & my.data2$last == 1, my.data2$xday, my.data2$v1)
my.data2$v2b < ifelse(my.data2$state1 == 1 & my.data2$state2 %in% c(2,4) & my.data2$first == 1, my.data2$xday, my.data2$v2)
# this if...else does not work
if(my.data2$state1 == 1 & my.data2$state2 %in% c(2,4) & my.data2$last == 1) {v1b = my.data2$xday} else {v1b = my.data2$v1}
if(my.data2$state1 == 1 & my.data2$state2 %in% c(2,4) & my.data2$first == 1) {v2b = my.data2$xday} else {v2b = my.data2$v2}
# this for-loop works
for(i in 1:nrow(my.data2)) {
if(my.data2$state1[i] == 1 & my.data2$state2[i] %in% c(2,4) & my.data2$last[i] == 1) {v1b[i] = my.data2$xday[i]}
if(my.data2$state1[i] == 1 & my.data2$state2[i] %in% c(2,4) & !(my.data2$last[i] == 1)) {v1b[i] = my.data2$v1[i] }
if(my.data2$state1[i] == 1 & my.data2$state2[i] %in% c(2,4) & my.data2$first[i] == 1) {v2b[i] = my.data2$xday[i]}
if(my.data2$state1[i] == 1 & my.data2$state2[i] %in% c(2,4) & !(my.data2$first[i] == 1)) {v2b[i] = my.data2$v2[i] }
}
all.equal(desired.data.v1b, v1b)
all.equal(desired.data.v2b, v2b)
my.data2$v1b <- v1b
my.data2$v2b <- v2b
# this apply works
my.v1 <- apply(my.data2, 1, function(x) {if (x['state1'] == 1 & x['state2'] %in% c(2,4) & x['last'] == 1) {x['v1b'] = x['xday']} else {x['v1b'] = x['v1']}})
my.v2 <- apply(my.data2, 1, function(x) {if (x['state1'] == 1 & x['state2'] %in% c(2,4) & x['first'] == 1) {x['v2b'] = x['xday']} else {x['v2b'] = x['v2']}})
names(my.v1) <- NULL
names(my.v2) <- NULL
all.equal(desired.data.v1b, my.v1)
all.equal(desired.data.v2b, my.v2)
EDIT
Maybe this is the canonical solution?
my.data2$v1b <- rep(-99, nrow(my.data2))
my.data2$v2b <- rep(-99, nrow(my.data2))
my.data2$v1b[(my.data2$state1 == 1 & my.data2$state2 %in% c(2,4) & my.data2$last == 1) ] <- my.data2$xday[(my.data2$state1 == 1 & my.data2$state2 %in% c(2,4) & my.data2$last == 1) ]
my.data2$v1b[(my.data2$state1 == 1 & my.data2$state2 %in% c(2,4) & !(my.data2$last == 1))] <- my.data2$v1[ (my.data2$state1 == 1 & my.data2$state2 %in% c(2,4) & !(my.data2$last == 1))]
my.data2$v2b[(my.data2$state1 == 1 & my.data2$state2 %in% c(2,4) & my.data2$first == 1) ] <- my.data2$xday[(my.data2$state1 == 1 & my.data2$state2 %in% c(2,4) & my.data2$first == 1) ]
my.data2$v2b[(my.data2$state1 == 1 & my.data2$state2 %in% c(2,4) & !(my.data2$first == 1))] <- my.data2$v2[ (my.data2$state1 == 1 & my.data2$state2 %in% c(2,4) & !(my.data2$first == 1))]
all.equal(desired.data.v1b, my.data2$v1b)
all.equal(desired.data.v2b, my.data2$v2b)
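One observation on the attempts above: the ifelse lines use < where <- was presumably intended, so they compare rather than assign. The if...else attempt fails for a different reason: if expects a single TRUE or FALSE, so passing a whole column either warns and uses only the first element (older R versions) or stops with an error (R 4.2.0 and later). A minimal sketch showing that ifelse with <- produces the desired vectors, where eligible is just an illustrative name:

```r
my.data2 <- read.table(text = '
refno v1 v2 state1 state2 xday first last
111 41 47 1 2 42 1 2
111 41 47 1 2 42 2 1
222 45 49 1 4 47 1 2
222 45 49 1 4 47 2 1
333 59 65 1 2 65 1 2
333 59 65 1 2 65 2 1
444 45 49 1 2 48 1 2
444 45 49 1 2 48 2 1
555 66 80 1 2 75 1 2
555 66 80 1 2 75 2 1
666 103 109 1 2 108 1 2
666 103 109 1 2 108 2 1
777 43 46 1 2 45 1 2
777 43 46 1 2 45 2 1
', header = TRUE, stringsAsFactors = FALSE)

# Shared part of both conditions
eligible <- my.data2$state1 == 1 & my.data2$state2 %in% c(2, 4)

# ifelse() is vectorised: it picks xday or the original value row by row
my.data2$v1b <- ifelse(eligible & my.data2$last  == 1, my.data2$xday, my.data2$v1)
my.data2$v2b <- ifelse(eligible & my.data2$first == 1, my.data2$xday, my.data2$v2)

desired.data.v1b <- c(41,42, 45,47, 59,65, 45,48, 66,75, 103,108, 43,45)
desired.data.v2b <- c(42,47, 47,49, 65,65, 48,49, 75,80, 108,109, 45,46)
all.equal(desired.data.v1b, my.data2$v1b)
all.equal(desired.data.v2b, my.data2$v2b)
```

Rows that are not eligible simply keep v1 or v2, which matches what the for-loop does, since it starts from copies of those columns.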

R similar column names ifelse

Adding reproducible code, as suggested in the answers:
Qs<-paste0("Q2_", 1:18)
set.seed(15)
maindata <- data.frame(ID=1:5)
for(q in Qs) {
maindata[,q] <- sample(1:20,5,replace=T)
}
I have the code below. Is there a better way to achieve the output without writing out each line? I thought of writing a for loop to iterate over questions 1 to 18, but felt that a for loop might not be too efficient...
ifelse(maindata$Q2_1 > 2 & maindata$Q2_1< 11 & !is.na(maindata$Q2_1), 1, 0 )+
ifelse(maindata$Q2_2 > 2 & maindata$Q2_2< 11 & !is.na(maindata$Q2_2), 1, 0)+
ifelse(maindata$Q2_3 > 2 & maindata$Q2_3< 11 & !is.na(maindata$Q2_3), 1, 0)+
ifelse(maindata$Q2_4 > 2 & maindata$Q2_4< 11 & !is.na(maindata$Q2_4), 1, 0)+
ifelse(maindata$Q2_5 > 2 & maindata$Q2_5< 11 & !is.na(maindata$Q2_5), 1, 0)+
ifelse(maindata$Q2_6 > 2 & maindata$Q2_6< 11 & !is.na(maindata$Q2_6), 1, 0)+
ifelse(maindata$Q2_7 > 2 & maindata$Q2_7< 11 & !is.na(maindata$Q2_7), 1, 0)+
ifelse(maindata$Q2_8 > 2 & maindata$Q2_8< 11 & !is.na(maindata$Q2_8), 1, 0)+
ifelse(maindata$Q2_9 > 2 & maindata$Q2_9< 11 & !is.na(maindata$Q2_9), 1, 0)+
ifelse(maindata$Q2_10 > 2 & maindata$Q2_10< 11 & !is.na(maindata$Q2_10), 1, 0)+
ifelse(maindata$Q2_11 > 2 & maindata$Q2_11< 11 & !is.na(maindata$Q2_11), 1, 0)+
ifelse(maindata$Q2_12 > 2 & maindata$Q2_12< 11 & !is.na(maindata$Q2_12), 1, 0)+
ifelse(maindata$Q2_13 > 2 & maindata$Q2_13< 11 & !is.na(maindata$Q2_13), 1, 0)+
ifelse(maindata$Q2_14 > 2 & maindata$Q2_14< 11 & !is.na(maindata$Q2_14), 1, 0)+
ifelse(maindata$Q2_15 > 2 & maindata$Q2_15< 11 & !is.na(maindata$Q2_15), 1, 0)+
ifelse(maindata$Q2_16 > 2 & maindata$Q2_16< 11 & !is.na(maindata$Q2_16), 1, 0)+
ifelse(maindata$Q2_17 > 2 & maindata$Q2_17< 11 & !is.na(maindata$Q2_17), 1, 0)+
ifelse(maindata$Q2_18 > 2 & maindata$Q2_18< 11 & !is.na(maindata$Q2_18), 1, 0)

Well, here's one way. First, let's create some sample data
Qs<-paste0("Q2_", 1:18)
set.seed(15)
maindata <- data.frame(ID=1:5)
for(q in Qs) {
maindata[,q] <- sample(1:20,5,replace=T)
}
Here we make a list of all the question names (Qs) and we create a data.frame with 5 rows where each column contains values sampled from 1:20. If we want the score for each line for each individual, we can do
score <- rowSums(sapply(Qs, function(q)
maindata[,q] > 2 & maindata[,q] <11 & !is.na(maindata[,q]) )+0)
Here I use sapply to iterate over the question names, so I write the formula once and swap in the different question names. The formula returns a logical value, and adding zero converts FALSE to 0 and TRUE to 1. Then I use rowSums to add up the scores across rows. We can see the results with
cbind(maindata[,"ID", drop=F], score)
# ID score
# 1 1 9
# 2 2 8
# 3 3 4
# 4 4 6
# 5 5 10
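A possible variant, offered only as a sketch, is to compare all question columns at once and let rowSums drop the missing values. The names in_range and score2 are illustrative, and the NA is inserted here only to show how missing values are handled:

```r
Qs <- paste0("Q2_", 1:18)
set.seed(15)
maindata <- data.frame(ID = 1:5)
for (q in Qs) {
  maindata[, q] <- sample(1:20, 5, replace = TRUE)
}
maindata$Q2_1[2] <- NA  # hypothetical missing answer, for illustration only

# Comparing a data.frame with a number gives a logical matrix (NA where data is NA)
in_range <- maindata[Qs] > 2 & maindata[Qs] < 11

# na.rm = TRUE makes a missing answer count as 0, like the !is.na() test
score2 <- rowSums(in_range, na.rm = TRUE)
score2
```

This avoids the explicit iteration over column names, at the cost of building two logical matrices of the full size.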
