How to transpose an array of strings in Julia?

Transposing with ' works with numbers but not with strings.
[1 2]' works but ["a" "b"]' doesn't.
Why? And how can I do that?

Why?
["a" "b"]' does not work because the ' operator actually computes the (lazy) adjoint of your matrix. Note that, as stated in the documentation, adjoint is recursive:
Base.adjoint — Function
adjoint(A)
Lazy adjoint (conjugate transposition) (also postfix '). Note that adjoint is applied recursively to elements.
This operation is intended for linear algebra usage - for general data manipulation see permutedims.
In the case of [1 2], adjoint does not only flip elements across the diagonal; it also calls itself recursively on each element, which for a number means taking its complex conjugate. No adjoint method is defined for strings, so this fails in the case of ["a" "b"].
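A minimal sketch of that recursion, assuming Julia 1.x (the names A and B are only for illustration). With a complex entry the conjugation becomes visible, and permutedims shows the same reshuffle without touching the elements:

```julia
# Hypothetical example matrix with one complex entry.
A = [1+2im 3+0im]           # 1×2 Matrix{Complex{Int}}

# ' (adjoint) flips the dimensions and conjugates every element.
B = A'
@assert size(B) == (2, 1)
@assert B[1, 1] == 1 - 2im

# The same result built by hand: conjugate elementwise, then permute the dimensions.
@assert B == permutedims(conj.(A))

# For real numbers conjugation is a no-op, so [1 2]' looks like a plain transpose.
@assert [1 2]' == permutedims([1 2])
```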
How?
As suggested by the documentation, use permutedims for general data manipulation:
julia> permutedims(["a" "b"])
2×1 Array{String,2}:
"a"
"b"
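permutedims also accepts a plain vector, which can be handy if the data did not start out as a 1×2 matrix. A small sketch, assuming Julia 1.x (the names row, col and v are illustrative):

```julia
row = ["a" "b"]             # 1×2 Matrix{String}
col = permutedims(row)      # 2×1 Matrix{String}, an eager copy rather than a lazy wrapper
@assert size(col) == (2, 1)
@assert col[2, 1] == "b"

v = ["a", "b"]              # 2-element Vector{String}
@assert size(permutedims(v)) == (1, 2)   # a vector becomes a 1×n row matrix

# Applying permutedims again restores the original layout.
@assert permutedims(col) == row
```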

Transposing is normally for linear algebra operations but in your case the easiest thing is just to drop the dimension:
julia> a = ["a" "b"]
1×2 Array{String,2}:
"a" "b"
julia> a[:]
2-element Array{String,1}:
"a"
"b"

If the string matrix is created by a literal, the easiest way is to create a vector immediately instead of a matrix.
["a", "b"]

Related

Why can't I combine Reduce with paste when using "*" as a character?

I'm trying to get the output "1*2*4*5" from (function(x) Reduce(paste0(toString("*")),x))(c(1,2,4,5)), but no matter how I manipulate Reduce, paste0, and the asterisks, I'm either getting error messages or the asterisks being treated as multiplication (giving 40). Where am I going wrong?
Reduce uses a function with two arguments to which it applies the previous result and the next element of the vector. Therefore, you need a function of both x and y:
Reduce(function(x,y)paste0(x,"*",y),c(1,2,4,5))
#[1] "1*2*4*5"
As an aside, you can provide an initial value to be applied as x for the first element of the vector with init =.
Reduce(function(x,y)paste0(x,"*",y),c(1,2,4,5), init = 0)
#[1] "0*1*2*4*5"
One thing you may have tried was this:
Reduce(paste0("*"),c(1,2,4,5))
#[1] 40
This applies the multiplication operator to x and y, because paste0("*") evaluates to "*", and Reduce looks up the function with that name (via match.fun).
Another base R option is to use paste within gsub, e.g.,
x <- 1:5
gsub("\\s","*",Reduce(paste,x))
which gives
> gsub("\\s","*",Reduce(paste,x))
[1] "1*2*3*4*5"
KISS method:
(with improvements as suggested by @nicola)
bar <- as.character(1:5)
paste0(bar, collapse='*')
#[1] "1*2*3*4*5"

How to use the split function when trying to split a string by `/` and `|` in Julia

Does anyone know the appropriate syntax for split when trying to split a string by / or | (such as VCF file genotypes)? I've tried
somestring = "1|2|3|4|5"
split(somestring, r"/||")
but the double-pipe is clearly incorrect. Thoughts?
Original question
You can use a tuple for multiple split characters!
julia> split("A/B|C", ('/', '|'))
3-element Array{SubString{String},1}:
"A"
"B"
"C"
Original answer
See the Julia docs on split() for further reading.
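As for the regex attempt in the question: | is the alternation operator, so r"/||" reads as "a slash, or nothing, or nothing". Putting both characters in a character class should avoid that. A minimal sketch, assuming Julia 1.x; the sample strings are made up:

```julia
# Inside a character class [...] the pipe is a literal character, not alternation.
parts = split("1|2/3|4", r"[/|]")
@assert parts == ["1", "2", "3", "4"]

# The tuple-of-characters form gives the same pieces.
@assert split("1|2/3|4", ('/', '|')) == parts

# Escaping the pipe inside an alternation works too.
@assert split("0/1", r"/|\|") == ["0", "1"]
```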

Julia: Check if elements from one vector are within another vector [duplicate]

This question already has answers here:
Vectorized "in" function in julia?
(5 answers)
Closed 5 years ago.
I would like to check if the elements in one vector are contained within another vector. In R there is the operator %in%.
For example the operator would do the following:
[1,3,5,7,9,4] %in% [1,2,4,5,8,9,10,11]
# [true,false,true,false,true,true]
I could easily write my own, but I am trying not to reinvent the wheel.
Probably not so nice, but you could do:
julia> [1,3,5,7,9,4] .∈ [[1,2,4,5,8,9,10,11]]
6-element BitArray{1}:
true
false
true
false
true
true
There are a number of built-ins that do something similar. indexin gives you the indices in b where the elements of a are found (nothing if an element is not there in Julia 1.0 and later, 0 in older versions; this is similar to R's match). setdiff gives you the elements in a that are not in b. It is likely you'll be able to do what you want with these. Constructing temporary boolean arrays for filtering is not as idiomatic in Julia as in R, as it generally creates an extra, unnecessary allocation.
You could use an anonymous function: map(x -> x in [1,2,4,5,8,9,10,11], [1,3,5,7,9,4])
Or a comprehension: [x in [1,2,4,5,8,9,10,11] for x = [1,3,5,7,9,4]]
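A sketch of the options mentioned above side by side, assuming Julia 1.x, where indexin marks missing elements with nothing; the names a, b, mask and idx are illustrative:

```julia
a = [1, 3, 5, 7, 9, 4]
b = [1, 2, 4, 5, 8, 9, 10, 11]

# Broadcast `in` over a only; Ref(b) makes b behave as a single value,
# much like wrapping it in an outer array as in [b].
mask = in.(a, Ref(b))
@assert mask == [true, false, true, false, true, true]

# indexin: position in b of each element of a, or nothing when absent.
idx = indexin(a, b)
@assert isequal(idx, [1, nothing, 4, nothing, 6, 3])

# Set-style helpers avoid the temporary Boolean mask altogether.
@assert setdiff(a, b) == [3, 7]
@assert intersect(a, b) == [1, 5, 9, 4]
```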

The function of parentheses (round brackets) in R

How does R interpret parentheses? As in most other programming languages, these are built-in operators, and I normally use them without thinking.
However, I came across this example. Let's say we have a data.table in R, and I would like to apply a function to its columns. Then I might write:
dt <- data.table(my_data)
important_cols <- c("col1", "col2", "col5")
dt[, (important_cols) := lapply(.SD, my_func), .SDcols = important_cols]
Obviously I can't neglect the parentheses:
dt[, important_cols := lapply(.SD, my_func), .SDcols = important_cols]
as that would introduce a new object called important_cols to my data.table, instead of modifying my existing columns in place.
My question is, why does putting ( ) around the vector "expand" it?
This question could probably be better phrased and titled. But then I would probably have found the answer by Googling if I knew the terminology to use while asking it, hence I'm here.
While we're on that topic, if someone could point out the differences between [ ], { }, etc., and how they should be used, that would be appreciated too :)
A special feature of R (compared to e.g. C++) is that the various parentheses are actually functions. What this means is that (a) and a are different expressions. The second is just a, while the first is the function ( called with an argument a. Here are a few expression trees for you to compare:
as.list(substitute( a ))
#[[1]]
#a
as.list(substitute( (a) ))
#[[1]]
#`(`
#
#[[2]]
#a
as.list(substitute( sqrt(a) ))
#[[1]]
#sqrt
#
#[[2]]
#a
Notice how similar the last two trees are: in one the function is sqrt, in the other it's "(". In most places in R, the "(" function doesn't do anything, it just returns its argument, but in the particular case of data.table, it is "overridden" (in quotes because that's not exactly how it's done, but in spirit it is) to do a variety of useful operations.
And here's one more demo to hopefully cement the point:
`(` = function(x) x*x
2
#[1] 2
(2)
#[1] 4
((2))
#[1] 16

Read specific elements of a list with R

I have the following list:
v1<-c('hello', 'bye')
v2<-c(1,2,3)
v3<-c(5,6,5,5,5,5)
l<-list(v1, v2, v3)
I want to read the second element of each member of the list. Thus the result may be:
'bye' 2 6
I did it using 'sapply' with the instruction:
sapply(1:3, function(i){l[[i]][2]})
and it works. But I would like to do it with an easier instruction, I tried
l[[1:3]][2]
But it doesn't work. What is the easiest way to obtain the second element of each member of my list?
Thank you!
I would suggest
sapply(l,`[[`,2)
#[1] "bye" "2" "6"
EDIT:
Note that result of this operation is a vector, so all elements had to be coerced to the same type (in this case, it is character). If you'd like to keep the types of the result components, you should use
lapply(l,`[[`,2)
which returns a list, and elements of a list in R could be of different types. (thanks to Richard for bringing attention to this aspect!)
If you want to vectorize this (avoid *apply loops), you could use stringi::stri_list2matrix (though, you will lose your classes)
library(stringi)
stri_list2matrix(l, byrow = TRUE)[, 2]
## [1] "bye" "2" "6"
