R programming: Creating a list of paired elements


I have a character vector of elements, say:
l <- c("x","ya1","xb3","yb3","ab","xc3","y","xa1","yd4")
From this I would like to build a list of the matching x/y pairs, i.e.
(("xa1" "ya1") ("xb3" "yb3") ("x" "y"))
In essence, I need to capture the X elements and the Y elements, and then pair them up.
I know how to do the X and Y extraction part:
xelems <- grep("^x", l, perl=TRUE, value=TRUE)
yelems <- grep("^y", l, perl=TRUE, value=TRUE)
An X element pairs up with a Y element when either:
1. xElem == "x" and yElem == "y" # both are one character long
2. substr(xElem,2,nchar(xElem)) == substr(yElem,2,nchar(yElem)) # everything after the first character matches
The elements are in no particular order, i.e. a matching xElem and yElem can be positioned anywhere.
However, I am not sure about the next part. I am more familiar with the SKILL programming language (a LISP derivative), and this is how I would write it there:
procedure( get_xy_pairs(inputList "l")
  let(( yElem (xyPairs nil) xList yList)
    xList=setof(i inputList rexMatchp("^x" i))
    yList=setof(i inputList rexMatchp("^y" i))
    when(xList && yList
      unless(length(xList)==length(yList)
        warn("xList and yList mismatch : %d vs %d\n" length(xList) length(yList))
      )
      foreach(xElem xList
        if(xElem=="x"
          then yElem="y"
          else yElem=strcat("y" substring(xElem 2 strlen(xElem)))
        )
        if(member(yElem yList)
          then xyPairs=cons(list(xElem yElem) xyPairs)
          else warn("x element %s has no matching y element \n" xElem)
        )
      )
    )
    xyPairs
  )
)
When run on l, this would return
get_xy_pairs(l)
*WARNING* x element xc3 has no matching y element
(("xa1" "ya1") ("xb3" "yb3") ("x" "y"))
As I am still new to R, I would appreciate any help. Also, I understand that R users tend to avoid for loops in favour of lapply?

Maybe something like this would work. (Only tested on your sample data.)
## Remove any item not starting with x or y
l2 <- l[grepl("^x|^y", l)]
## Split into a list of items starting with x
## and items starting with y
L <- split(l2, grepl("^x", l2))
## Give "names" to the "starting with y" group
names(L[[1]]) <- gsub("^y", "x", L[[1]])
## Use match to match the names in the y group with
## the values from the x group. This results in a
## nice named vector with the pairs you want
Matches <- L[[1]][match(L[[2]], names(L[[1]]), nomatch=0)]
Matches
#     x   xb3   xa1
#   "y" "yb3" "ya1"
As a data.frame:
MatchesDF <- data.frame(x = names(Matches), y = unname(Matches))
MatchesDF
#     x   y
# 1   x   y
# 2 xb3 yb3
# 3 xa1 ya1
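If a list of pairs is preferred over a data.frame, the named vector can be converted with Map(). The following is a minimal sketch and not part of the original answer. It rebuilds Matches from the sample data so that it runs on its own, and the names xs, ys and pairs are chosen here purely for illustration.

```r
l <- c("x", "ya1", "xb3", "yb3", "ab", "xc3", "y", "xa1", "yd4")

# Rebuild the named vector from the steps above
l2 <- l[grepl("^x|^y", l)]
L <- split(l2, grepl("^x", l2))
ys <- L[["FALSE"]]                 # items starting with y
xs <- L[["TRUE"]]                  # items starting with x
names(ys) <- sub("^y", "x", ys)
Matches <- ys[match(xs, names(ys), nomatch = 0)]

# Pair each name (the x element) with its value (the y element).
# unname() drops the outer names that Map() adds to the list.
pairs <- unname(Map(c, names(Matches), Matches))
str(pairs)
```

Each element of pairs is a length-two character vector, which is close to the nested-list shape asked for in the question.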

I would store the tuples in a list, i.e.:
xypairs
[[1]]
[1] "x" "y"
[[2]]
[1] "xb3" "yb3"
Your procedure can be simplified with match and substring.
xends <- substring(xelems, 2)
yends <- substring(yelems, 2)
ypaired <- match(xends, yends) # Indices of yelems that match xelems
# Now we need to handle the no-matches:
ysorted <- yelems[ypaired]
yunmatched <- yelems[!(yelems %in% ysorted)]
# Pad x with one NA per unmatched y so both vectors have the same length
xsorted <- c(xelems, rep(NA, length(yunmatched)))
ysorted <- c(ysorted, yunmatched)
# Now we create the list of tuples:
xypairs <- lapply(seq_along(ysorted), function(i) {
  c(xsorted[i], ysorted[i])
})
Result:
xypairs
[[1]]
[1] "x" "y"
[[2]]
[1] "xb3" "yb3"
[[3]]
[1] "xc3" NA
[[4]]
[1] "xa1" "ya1"
[[5]]
[1] NA "yd4"
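For comparison, the SKILL procedure in the question could be translated to R roughly as follows. This is only a sketch: get_xy_pairs is the asker's name reused for a hypothetical R function, and it assumes the partner of an x element is found by swapping the leading "x" for "y", as the SKILL code does. Unmatched x elements are reported with warning() rather than returned as NA.

```r
# Hypothetical R counterpart of the SKILL get_xy_pairs procedure
get_xy_pairs <- function(input) {
  xelems <- grep("^x", input, value = TRUE)
  yelems <- grep("^y", input, value = TRUE)
  # Candidate partner for each x element: replace the leading "x" with "y"
  candidates <- sub("^x", "y", xelems)
  found <- candidates %in% yelems
  if (any(!found)) {
    warning("x element(s) with no matching y element: ",
            paste(xelems[!found], collapse = ", "))
  }
  # Map() walks both vectors in parallel; unname() drops the names it adds
  unname(Map(c, xelems[found], candidates[found]))
}

l <- c("x", "ya1", "xb3", "yb3", "ab", "xc3", "y", "xa1", "yd4")
pairs <- get_xy_pairs(l)   # warns about xc3, which has no partner
str(pairs)
```

No explicit loop is needed because sub() and %in% are vectorised over all x elements at once.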

Related

Issue matching string in list

I am trying to see if a list contains a particular string but I am having an issue.
> k
[1] "Investment"
> t
[[1]]
[1] "Investment" "Non-Investment"
> class(k)
[1] "character"
> class(t)
[1] "list"
> k %in% t
[1] FALSE
Shouldn't the above code return TRUE rather than FALSE?

You need to unlist the list:
X <- "investment"
Y <- list(c("non-investment", "investment"))
X %in% unlist(Y)
Note that I've renamed the variables to X and Y: t is a base R function (matrix transpose), so it's best not to mask it.
One thing to consider is a list with multiple vectors: decide whether you want to search across all the vectors in the list or within one specific vector. Use unlist to check all vectors at once, and double square brackets to check a specific one. To illustrate, Y below contains two vectors and the X string is in the second one. X %in% unlist(Y) returns TRUE, while X %in% Y[[1]] returns FALSE, because %in% is then only checking the first vector:
X <- "alpha"
Y <- list(c("non-investment", "investment"), c("alpha", "beta"))
X %in% unlist(Y)
X %in% Y[[1]]
Note that if you had defined Y as a plain vector, which is essentially what it is in your example because there are no other sublists, then you could just use:
X <- "investment"
Y <- c("non-investment", "investment")
X %in% Y

The problem is that t is a length-one list containing a vector (a list of one vector, not a list of lists). Try k %in% t[[1]], or use unlist().
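The three ways of testing membership discussed in the answers can be put side by side. This is a small sketch using the data from the question; the list is called lst here (a name chosen for this example) so that the base function t() is not masked, and the vapply() call at the end is an addition that reports one result per list element.

```r
k <- "Investment"
lst <- list(c("Investment", "Non-Investment"))

k %in% lst           # FALSE: the single list element is a whole vector
k %in% lst[[1]]      # TRUE: compares against the vector inside the list
k %in% unlist(lst)   # TRUE: compares against all values, flattened

# One logical result per list element
hits <- vapply(lst, function(v) k %in% v, logical(1))
any(hits)
```

The vapply() form is useful when the list holds several vectors and you need to know which of them contains the string.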

print list names when iterating lapply [duplicate]

I have four time series (x, y, z and a) in a list called dat.list. I would like to apply a function to this list using lapply. Is there a way to print the element names, i.e. x, y, z and a, as each iteration of lapply completes? Below is a reproducible example.
## Create Dummy Data
x <- ts(rnorm(40,5), start = c(1961, 1), frequency = 12)
y <- ts(rnorm(50,20), start = c(1971, 1), frequency = 12)
z <- ts(rnorm(50,39), start = c(1981, 1), frequency = 12)
a <- ts(rnorm(50,59), start = c(1991, 1), frequency = 12)
dat.list <- list(x=x,y=y,z=z,a=a)
## forecast using lapply
abc <- function(x) {
  r <- mean(x)
  print(names(x))
  return(r)
}
forl <- lapply(dat.list,abc)
Basically, I would like to print the element names x, y, z and a each time the function is executed on these elements. When I run the above code, NULL is printed instead.

lapply passes only the values to the function, not the item names. So if you want to see the names, the calling strategy needs to be different:
> abc <- function(nm, x) {
+ r <- mean(x)
+ print(nm)
+ return(r)
+ }
>
> forl <- mapply(abc, names(dat.list), dat.list)
[1] "x"
[1] "y"
[1] "z"
[1] "a"

You can use some deep digging (which I got from another answer on SO--I'll try to find the link) and do something like this. Note that it relies on how lapply builds its internal call, so it may not work in every version of R:
abc <- function(x) {
  r <- mean(x)
  print(eval.parent(quote(names(X)))[substitute(x)[[3]]])
  return(r)
}
forl <- lapply(dat.list, abc)
# [1] "x"
# [1] "y"
# [1] "z"
# [1] "a"
forl
# $x
# [1] 5.035647
#
# $y
# [1] 19.78315
#
# $z
# [1] 39.18325
#
# $a
# [1] 58.83891
Or you can just lapply across the names of the list (similar to what @BondedDust did), like this (but you lose the list names in the output):
abc <- function(x, y) {
  r <- mean(y[[x]])
  print(x)
  return(r)
}
lapply(names(dat.list), abc, y = dat.list)
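If the list names should be kept in the output, Map() is one option, since it names its result after a character first argument. The sketch below is an illustration only: it uses a shortened two-series version of dat.list, and the seed is arbitrary and set just to make the run repeatable.

```r
set.seed(42)  # arbitrary seed, only so the sketch is repeatable
dat.list <- list(
  x = ts(rnorm(40, 5),  start = c(1961, 1), frequency = 12),
  y = ts(rnorm(50, 20), start = c(1971, 1), frequency = 12)
)

abc <- function(nm, x) {
  print(nm)   # name of the element being processed
  mean(x)
}

# Map() returns a list and takes its names from names(dat.list)
forl <- Map(abc, names(dat.list), dat.list)
names(forl)
```

Unlike the mapply() call shown earlier, which simplifies to a named numeric vector, Map() always returns a list, matching what lapply() would give.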

Turning a couple of vectors into a list of vectors

Suppose I have a collection of independent vectors, of the same length. For example,
x <- 1:10
y <- rep(NA, 10)
and I wish to turn them into a list whose length is that common length (10 in the given example), in which each element is a vector whose length is the number of independent vectors that were given. In my example, assuming output is the output object, I'd expect
> str(output)
List of 10
$ : num [1:2] 1 NA
...
> output
[[1]]
[1] 1 NA
...
What's the common method of doing that?

Use mapply and c:
mapply(c, x, y, SIMPLIFY=FALSE)
[[1]]
[1] 1 NA
[[2]]
[1] 2 NA
..<cropped>..
[[10]]
[1] 10 NA

Another option:
split(cbind(x, y), seq(length(x)))
or even:
split(c(x, y), seq(length(x)))
or even (assuming x has no duplicate values as in your example):
split(c(x, y), x)

Here is a solution that allows you to zip an arbitrary number of equal-length vectors into a list, based on the position of each element:
merge_by_pos <- function(...){
  dotlist = list(...)
  # Iterate over element positions, not over the number of vectors
  lapply(seq_along(dotlist[[1]]), function(i){
    Reduce('c', lapply(dotlist, '[[', i))
  })
}
x <- 1:10
y <- rep(NA, 10)
z <- 21:30
merge_by_pos(x, y, z)
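The same idea can also be expressed by handing all vectors to Map() through do.call(). This is a sketch under the assumption that every input has the same length; zip_vectors is a hypothetical name used only for this example.

```r
# Hypothetical helper: zip any number of equal-length vectors by position
zip_vectors <- function(...) {
  vecs <- unname(list(...))
  # All inputs must share one common length
  stopifnot(length(unique(lengths(vecs))) == 1L)
  # Equivalent to Map(c, vecs[[1]], vecs[[2]], ...)
  unname(do.call(Map, c(list(f = c), vecs)))
}

x <- 1:10
y <- rep(NA, 10)
z <- 21:30
out <- zip_vectors(x, y, z)
length(out)
out[[1]]
```

The explicit length check avoids the silent recycling that mapply() and Map() would otherwise apply to inputs of different lengths.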

R populate list by its values

Say I have a list:
> fs
[[1]]
NULL
[[2]]
NULL
[[3]]
NULL
[[4]]
[1] 61.90298 58.29699 54.90104 51.70293 48.69110
I want to "reverse fill" the rest of the list by using its values. Example:
The [[3]] should have the function value of [[4]] pairs:
c( myFunction(fs[[4]][1], fs[[4]][2]), myFunction(fs[[4]][2], fs[[4]][3]), .... )
The [[2]] should have myFunction values of [[3]] etc...
I hope that's clear. What's the right way to do it? For loops? *applys? My last attempt, which leaves 1-3 empty:
n = length(fs)
for (i in rev(1:(n-1)))
child_fs = fs[[i+1]]
res = c()
for (j in 1:(i+1))
up = v(child_fs[j])
do = v(child_fs[j+1])
this_f = myFunction(up, do)
res[j] = this_f
fs[[i]] = res

Make fs easily reproducible
fs <- list(NULL, NULL, NULL, c(61.90298, 58.29699, 54.90104, 51.70293, 48.69110))
To be able to show an example, make a trivial myFunction
myFunction <- function(a, b) {a + b}
You can loop over all but the last position in fs (in reverse order) and compute each one. Just call myFunction on the next higher position's vector, once without its last element and once without its first.
for (i in rev(seq_along(fs))[-1]) {
  fs[[i]] <- myFunction(head(fs[[i+1]], -1), tail(fs[[i+1]], -1))
}
That assumes myFunction is vectorized (given vectors for inputs, will give a vector for output). If it isn't, you can easily make a version which is.
myFunction <- function(a, b) {a[[1]] + b[[1]]}
for (i in rev(seq_along(fs))[-1]) {
  fs[[i]] <- Vectorize(myFunction)(head(fs[[i+1]], -1), tail(fs[[i+1]], -1))
}
In either case, you get
> fs
[[1]]
[1] 453.2 426.8
[[2]]
[1] 233.398 219.802 206.998
[[3]]
[1] 120.200 113.198 106.604 100.394
[[4]]
[1] 61.90298 58.29699 54.90104 51.70293 48.69110
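The loop can be wrapped in a function so that the pairwise function is passed in as an argument. The sketch below is one possible arrangement, not the answerer's code: back_fill is a hypothetical name, and mapply() is used so that the pairwise function does not itself need to be vectorized.

```r
# Hypothetical wrapper around the reverse-fill loop
back_fill <- function(fs, f) {
  for (i in rev(seq_along(fs))[-1]) {
    nxt <- fs[[i + 1]]
    # Apply f to each adjacent pair (nxt[j], nxt[j + 1])
    fs[[i]] <- mapply(f, head(nxt, -1), tail(nxt, -1))
  }
  fs
}

fs <- list(NULL, NULL, NULL, c(61.90298, 58.29699, 54.90104, 51.70293, 48.69110))
filled <- back_fill(fs, function(a, b) a + b)
lengths(filled)
```

Each level ends up one element shorter than the level after it, so the number of levels that can be filled is limited by the length of the last vector.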

Really, what you have is a starting point
start <- c(61.90298, 58.29699, 54.90104, 51.70293, 48.69110)
a function you want to apply (I made this one up which adds 1 everywhere and deletes the last element)
myFunction <- function(x) head(x + 1, -1L)
and the number of times you want to apply the function (recursively):
n <- 3L
So I would write a function to apply the function n times recursively, then reverse the output list:
apply.n.times <- function(fun, n, x)
  if (n == 0L) list(x) else c(list(x), Recall(fun, n - 1L, fun(x)))
rev(apply.n.times(myFunction, n, start))
# [[1]]
# [1] 64.90298 61.29699
#
# [[2]]
# [1] 63.90298 60.29699 56.90104
#
# [[3]]
# [1] 62.90298 59.29699 55.90104 52.70293
#
# [[4]]
# [1] 61.90298 58.29699 54.90104 51.70293 48.69110
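The repeated application can also be written without explicit recursion by using Reduce() with accumulate = TRUE. This is a sketch of an alternative to the recursive function, using the same made-up myFunction; the names steps and result are chosen for this example.

```r
start <- c(61.90298, 58.29699, 54.90104, 51.70293, 48.69110)
myFunction <- function(x) head(x + 1, -1L)
n <- 3L

# Apply myFunction n times, keeping every intermediate result.
# The second argument of the reducer (the step counter) is ignored.
steps <- Reduce(function(acc, i) myFunction(acc), seq_len(n), start,
                accumulate = TRUE)
result <- rev(steps)
lengths(result)
```

Here start is supplied as the initial value, so steps holds the starting vector followed by the n successive results, and rev() puts the shortest vector first.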

Here is a one-line solution (if myFunction can be replaced with something like sum, or in this case rowSums):
Reduce( function(x,y) rowSums( embed(y,2) ), fs, right=TRUE, accumulate=TRUE )
If myFunction needs to accept 2 values and do something with them then this can be expanded a bit to:
Reduce( function(x,y) apply( embed(y,2), 1, function(z) myFunction(z[1],z[2]) ),
fs, right=TRUE, accumulate=TRUE )

Evaluate a symbolic Ryacas expression

This is a reproducible example:
a <- 0.05
za.2 <- qnorm(1-a/2)
b <- 0.20
zb <- qnorm(1-b)
lambda12 <- -log(1/2)/12
lambda18 <- -log(1/2)/18
theta <- lambda18/lambda12
(d = round(4*(za.2+zb)^2/log(theta)^2))
Tf<-36
library(Ryacas)
n <- Sym("n")
Solve(n/2*(2-exp(-lambda12*Tf)-exp(-lambda18*Tf))==d , n)
The last line returns
expression(list(n == 382/1.625))
Is there a way to extract the quotient and assign it to another variable (235.0769)?

G.Grothendieck pointed out in the comments that you'll first need to capture the expression to be operated upon:
soln <- Solve(n/2*(2-exp(-lambda12*Tf)-exp(-lambda18*Tf))==d , n)
X <- yacas(soln)$text
Then, to extract the quotient, you can take advantage of the fact that many R language objects either are or can be coerced to lists.
X <- expression(list(n == 382/1.625))
res <- eval(X[[1]][[2]][[3]])
res
[1] 235.0769
The following just shows why that sequence of indices extracts the right piece of the expression:
as.list(X)
# [[1]]
# list(n == 382/1.625)
as.list(X[[1]])
# [[1]]
# list
#
# [[2]]
# n == 382/1.625
as.list(X[[1]][[2]])
# [[1]]
# `==`
#
# [[2]]
# n
#
# [[3]]
# 382/1.625
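To make the extraction a little less dependent on bare indices, it can be wrapped in a small function that checks the shape of the expression first. This sketch uses only base R and the literal expression shown above; solution_value is a hypothetical helper name, and it assumes the solution has the form list(n == <value>).

```r
X <- expression(list(n == 382/1.625))

# Hypothetical helper: evaluate the right-hand side of the first equation
solution_value <- function(expr) {
  eqn <- expr[[1]][[2]]                           # the call n == 382/1.625
  stopifnot(identical(eqn[[1]], as.name("==")))   # guard against other shapes
  eval(eqn[[3]])                                  # evaluate 382/1.625
}

res <- solution_value(X)
res
```

If the solver returned something other than an equality, the stopifnot() guard would raise an error instead of silently evaluating the wrong piece.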
