I have the following matrix A = [1.00 2.00; 3.00 4.00] and I need to convert it into a vector of Vectors as follows:
A1 = [1.00; 3.00]
A2 = [2.00; 4.00]
Any ideas?
tl;dr
This can be very elegantly created with a list comprehension:
A = [A[:,i] for i in 1:size(A,2)]
Explanation:
This essentially converts A from something that would be indexed as A[1,2] to something that would be indexed as A[2][1], which is what you asked.
Here I'm assigning directly back to A, which seems to me what you had in mind. But, only do this if the code is unambiguous! It's generally not a good idea to have same-named variables that represent different things at different points in the code.
NOTE: if this reversal of row / column order in the indexing isn't the order you had in mind, and you'd prefer A[1,2] to be indexed as A[1][2], then perform your list comprehension 'per row' instead, i.e.
A = [A[i,:] for i in 1:size(A,1)]
It would be much better simply to use slices of your matrix i.e. instead of A1 use
A[:,1]
and instead of A2 use
A[:,2]
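If you really need them to be "separate" objects, you could try creating a cell array like so (this is Julia 0.4 syntax; cell was removed in later versions, where Vector{Any}(undef, n) plays the same role):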
myfirstcell = cell(size(A,2))
for i in 1:size(A,2)
myfirstcell[i] = A[:,i]
end
See http://docs.julialang.org/en/release-0.4/stdlib/arrays/#Base.cell
(Cell arrays allow several different types of object to be stored in the same array)
Another option is B = [eachcol(A)...]. This returns a variable of type Vector{SubArray}, i.e. a vector of views into A, which might be fine depending on what you want to do. To get a Vector{Vector{Float64}}, try:
B = Vector{eltype(A)}[eachcol(A)...]
Common container types used in R are data.tables, data.frames, matrices and lists (probably more?!).
All these storage types have slightly different rules for indexing.
Let's say we have a simple dataset with named columns:
name1 name2
1 11
2 12
... ...
10 20
We now put this data in every container accordingly. If I want to index the number 5 which is in the name1 column it goes as follows:
lists: dataset[['name1']][5]
-> why the double brackets?!?!
data frames: dataset$name1[5] or dataset[5,'name1']
-> here are two options possible, why the ambiguity?!?
data table: dataset$name1[5]
-> why is it here only one possibility
I have often stumbled upon this problem and, coming from Python, it seems very odd. It also leads to extremely tedious debugging. In Python this is solved in a very uniform way, where indexing is pretty much standard across lists, NumPy arrays, pandas data frames, etc.
A data.frame is a list whose elements all have equal length. We use $ or [[ to extract a list element; with a single [ the result would still be a list with one element.
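A minimal sketch of that difference, using a hypothetical list shaped like the dataset in the question (the names `dataset`, `kept` and `extracted` are assumptions chosen for illustration):

```r
# Hypothetical list mirroring the question's two named columns
dataset <- list(name1 = 1:10, name2 = 11:20)

kept      <- dataset["name1"]    # `[` keeps the container: still a list, length 1
extracted <- dataset[["name1"]]  # `[[` pulls the element out: an integer vector

is.list(kept)        # TRUE
is.list(extracted)   # FALSE
extracted[5]         # 5
kept[[1]][5]         # 5, but only after a second extraction step
```

So the double brackets are not redundant: they are the step that gets you from "a list containing the column" to "the column itself".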
You reference the data.frame example in R and then say you are used to pandas, but these forms have direct, standard equivalents in pandas for exactly the same purpose, so I'm unsure where the confusion comes from.
dataset$name1[5] -> dataset['name1'][5] or dataset.name1[5]
dataset[5, 'name1'] -> dataset.loc[5, 'name1']
Using the definitions in the Note at the end these all work and give the same answer.
L[["name1"]][5]
DF[["name1"]][5]
DT[["name1"]][5]
L$name1[5]
DF$name1[5]
DT$name1[5]
It seems not unreasonable that a data frame which is conceptually a 2d object can take two subscripts whereas a list which is one dimensional takes one.
[[ and [ have different meanings so I am not sure consistency plays a role here.
Note
L <- list(name1 = 1:10, name2 = 11:20)
DF <- as.data.frame(L)
library(data.table)
DT <- as.data.table(DF)
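To illustrate the point about dimensions, here is a small sketch that rebuilds L and DF as in the Note (data.table is left out so the snippet needs only base R; `res` is a hypothetical name). It assumes nothing beyond base R behaviour:

```r
L  <- list(name1 = 1:10, name2 = 11:20)
DF <- as.data.frame(L)

# Two subscripts work on the data frame because it has rows and columns
DF[5, "name1"]            # 5

# The same call on the plain list fails; tryCatch captures the error object
res <- tryCatch(L[5, "name1"], error = function(e) e)
inherits(res, "error")    # TRUE

# One subscript followed by element indexing works on both
L[["name1"]][5]           # 5
DF[["name1"]][5]          # 5
```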
Or how to split a vector into pairs of contiguous members and combine them in a list?
Suppose you are given the vector
map <-seq(from = 1, to = 20, by = 4)
which is
1 5 9 13 17
My goal is to create the following list
path <- list(c(1,5), c(5,9), c(9,13), c(13,17))
This is supposed to represent the several path segments that the map is suggesting we follow. In order to go from 1 to 17, we must first take the first path (path[1]), then the second path (path[2]), and so on all the way to the end.
My first attempt led me to:
path <- split(aux <- data.frame(S = map[-length(map)], E = map[-1]), row(aux))
But I think it should be possible without creating this auxiliary data frame, which would also avoid the performance decrease when the initial vector (the map) is too big. It also returns a warning message, which is harmless, but I would like to avoid it.
Then I found this here on stackoverflow (not exactly like this, this is the adapted version for my problem):
mod_map <- c(map, map[c(-1,-length(map))])
mod_map <- sort(mod_map)
split(mod_map, ceiling(seq_along(mod_map)/2))
which is a simpler solution, but I have to use this modified version of my map.
Perhaps I'm asking too much, as I already have two solutions. But could there be a third one, so that I don't have to use data frames as in my first solution and can use the original map, unlike in my second solution?
We can use Map on two copies of the vector, one with the last element removed and one with the first element removed, and concatenate them elementwise. (It is better not to name the vector 'map', since that is also the name of a function in purrr.)
Map(c, map[-length(map)], map[-1])
Or, as @Sotos mentioned, split can be used, which would be faster
split(cbind(map[-length(map)], map[-1]), seq(length(map)-1))
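As a rough check that the two approaches above produce the same segments, the following sketch runs both on the vector from the question. The names `pts`, `starts`, `ends`, `by_map` and `by_split` are hypothetical, with `pts` used instead of `map` to avoid the name clash mentioned above:

```r
# `pts` stands in for the question's `map` vector
pts <- seq(from = 1, to = 20, by = 4)   # 1 5 9 13 17

starts <- pts[-length(pts)]             # 1 5 9 13
ends   <- pts[-1]                       # 5 9 13 17

# Approach 1: pair the elements up one by one
by_map <- Map(c, starts, ends)

# Approach 2: split the column-major values of the two-column matrix by row index
by_split <- split(cbind(starts, ends), seq_along(starts))

# split() names its pieces "1".."4"; once those names are dropped the results agree
isTRUE(all.equal(unname(by_split), by_map))   # TRUE
```

Note that a single segment is retrieved with `by_map[[1]]`; `by_map[1]` would return a one-element list.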
Is there an R type equivalent to the Matlab structure type?
I have a few named vectors and I try to store them in a data frame. Ideally, I would simply access one element of an object and it would return the named vectors (like a structure in Matlab). I feel that using a data frame is not the right thing to do since it can store the values of the named vectors but not the names when they differ from one vector to the other.
More generally, is it possible to store a bunch of different objects in a single one in R?
Edit: As Joran said I think that list does the job.
l = list()
l$vec1 = namedVector1
l$vec2 = namedVector2
...
If I have a list of names
name1 = 'vec1'
name2 = 'vec2'
is there any way for the interpreter to understand that when I use a variable name like name1, I am not referring to the variable name but to its content? I have tried get(name1) but it does not work.
I could still be wrong about what you're trying to do, but I think this is the best you're going to get in terms of accessing each list element by name:
> l <- list(a = 1:3, b = 1:10)
> ind <- "a"
> l[[ind]]
[1] 1 2 3
Namely, you're going to have to use [[ explicitly.
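Applied to the setup in the question, a minimal sketch might look like this. The two named vectors are made-up stand-ins for namedVector1 and namedVector2, whose contents the question does not show:

```r
# Hypothetical named vectors standing in for namedVector1 and namedVector2
named_vector_1 <- c(alpha = 1, beta = 2)
named_vector_2 <- c(gamma = 3, delta = 4, epsilon = 5)

l <- list(vec1 = named_vector_1, vec2 = named_vector_2)
name1 <- "vec1"

l$name1       # NULL: `$` looks for an element literally named "name1"
l[[name1]]    # the named vector stored under "vec1", names intact
```

This also suggests why get(name1) does not help here: it looks up a variable called vec1 in the environment, not an element of the list l.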
I have an assignment using R and have a little problem. In the assignment several matrices have to be generated with a random number of rows and later used for various calculations. Everything works perfectly, unless the number of rows is 1.
In the calculations I use nrow(matrix) in different ways, for example if (i <= nrow(matrix) ) {action} and also statements like matrix[,4] and so on.
So when the number of rows is 1 (I know it is actually a vector), R gives errors, presumably because nrow(1-dimensional matrix)=NULL. Is there a simple way to deal with this? Otherwise the whole code will probably have to be rewritten, but I'm very short on time :(
It is not that single-row or single-column matrices in R have nrow/ncol set to NULL. In R, a matrix is just a 1D vector with a dim attribute set, which is what makes it print as a matrix, accept matrix indexing, and so on. It seems otherwise because indexing a matrix down to a single row or column drops dim by default and leaves the data as a plain (1D) vector, for which nrow and ncol return NULL.
Thus you can accomplish your goal either by directly recreating the dim attribute of a vector (say it is called x):
dim(x)<-c(length(x),1)
x #Now a single column matrix
dim(x)<-c(1,length(x))
x #Now a single row matrix
OR by preventing [] operator from dropping dim by adding drop=FALSE argument:
x<-matrix(1:12,3,4)
x #OK, matrix
x[,3] #Boo, vector
x[,3,drop=FALSE] #Matrixicity saved!
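A short sketch of how this affects the nrow() checks from the question, using the same 3 x 4 matrix (the names `row_vec` and `row_mat` are hypothetical):

```r
x <- matrix(1:12, 3, 4)

row_vec <- x[1, ]                # dim is dropped: plain vector
row_mat <- x[1, , drop = FALSE]  # dim is kept: a 1 x 4 matrix

nrow(row_vec)   # NULL, which breaks tests such as i <= nrow(...)
nrow(row_mat)   # 1
row_mat[, 4]    # 10: column indexing still works on the one-row matrix
```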
Let's call your vector x. Try using matrix(x) or t(matrix(x)) to convert it into a proper (2D) matrix.
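For illustration, with a made-up vector, matrix(x) gives a single-column matrix and t(matrix(x)) a single-row one, so the second form is the one to use if your vector is meant to be a row (`col_mat` and `row_mat` are hypothetical names):

```r
x <- c(3, 1, 4)            # hypothetical data

col_mat <- matrix(x)       # one column
row_mat <- t(matrix(x))    # one row

dim(col_mat)   # 3 1
dim(row_mat)   # 1 3
nrow(row_mat)  # 1
```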
How do I create a vector like this:
a = [a_1; a_2; ...; a_n];
aNew = [a;a.^2;a.^3;...;a.^T].
Is it possible to create aNew without a loop?
So you want different powers of a, all strung out into a vector? I would create an array, where each column of the array is a different power of a. Then string it out into a vector. Something like this...
aNew = bsxfun(@power,a,1:T);
aNew = aNew(:);
This does what you want, in a simple, efficient way. bsxfun is a more efficient way of writing the expansion than are other methods, such as repmat, ndgrid and meshgrid.
The code I wrote does assume that a is a column vector, as you have constructed it.
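The idea is to use meshgrid to create two arrays of size T x n (where n is the length of a):
[n_mesh, t_mesh] = meshgrid(a, 1:T);
Now n_mesh is an array where each row is a duplicate of a, and t_mesh is an array where each column is 1:T.
Now you can use an element-wise operation on them to compute all the powers at once. The result is a T x n array whose k-th row is a.^k, so transpose it and apply (:) if you need the stacked vector from the question: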
aNew = n_mesh .^ t_mesh;