Our project has a DLL with a function that returns a 2D array flattened into a 1D array of Microsoft VARIANT structures.
For example (assume the VT type is int), the array
1 2 3
4 5 6
is returned as the array 1 2 3 4 5 6.
The number of rows and columns keeps changing (2x3, 4x5, 1000x1, and so on).
I am able to call the function from a .cpp file using Rcpp. I need a way to convert the VARIANT array into R data type(s).
One way is to convert it into a list, but to get it back into row/column format I would need to process it again, which I want to avoid.
I cannot find any R package that helps here. Any suggestions? Can we create nested lists dynamically in Rcpp?
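One possible direction, sketched on the R side only: if the flattened values and the row/column counts can be handed back to R, matrix() with byrow = TRUE rebuilds the row/column layout in a single call, with no nested lists. The names flat, n_rows and n_cols are hypothetical stand-ins for whatever the DLL wrapper actually returns, and the sketch assumes the values have already been unpacked from VARIANT into plain numbers.

```r
# Hypothetical stand-in for the values returned by the DLL, flattened row by row
flat <- c(1, 2, 3, 4, 5, 6)

# Hypothetical dimensions; in practice these would come from the DLL call
n_rows <- 2
n_cols <- 3

# byrow = TRUE fills the matrix row by row, matching the flattening order
m <- matrix(flat, nrow = n_rows, ncol = n_cols, byrow = TRUE)
print(m)
```

R stores matrices column by column, which is why byrow = TRUE is needed when the source flattens row by row.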
I have a large and messy character vector html.
I also have a data frame url_characters:
  start   end
1 35288 35333
2 36723 36768
3 38168 38213
4 39647 39692
5 41091 41136
6 42549 42594
How can I write a loop in R that calls substr on html, using each row of the data frame for the start and end arguments, and stores the results in a new character vector with each extraction as a separate element?
I am an absolute beginner, so pardon me if the solution is obvious or already exists somewhere; I couldn't find it.
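A minimal sketch of one way this could be done, assuming html is a single long string. The short html value and the small url_characters data frame below are made-up stand-ins so the example runs on its own. substring() is vectorised over its start and end arguments, so an explicit loop is optional; the loop form is shown as well for comparison.

```r
# Made-up stand-ins for the real data
html <- "abcdefghijklmnopqrstuvwxyz"
url_characters <- data.frame(start = c(1, 5, 10), end = c(3, 8, 12))

# Vectorised: one extraction per row of the data frame
urls <- substring(html, url_characters$start, url_characters$end)

# Equivalent explicit loop, filling a pre-allocated character vector
urls_loop <- character(nrow(url_characters))
for (i in seq_len(nrow(url_characters))) {
  urls_loop[i] <- substr(html, url_characters$start[i], url_characters$end[i])
}

print(urls)
```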
In Matlab I can write:
[0:n]
to get a row array of size (1,n+1). For n=2, I get:
0 1 2
How do I do the same in Julia? The goal is to get the same kind of array, here of size (1,3).
I know I can write [0 1 2], but I want something general, like in Matlab.
In Julia, the colon operator (in this context, anyway) returns a UnitRange object. This is an iterable object, which means you can use it in a for loop, collect all its contents, and so on. If you collect its contents, what you get here is a Vector.
If what you're after is explicitly a RowVector, then you can collect the contents of the UnitRange, and reshape the resulting vector accordingly (which in this case can be done via a simple transpose operation).
julia> collect(1:3).'
1×3 RowVector{Int64,Array{Int64,1}}:
1 2 3
The .' transpose operator is also defined for UnitRange arguments:
julia> (1:3).'
1×3 RowVector{Int64,UnitRange{Int64}}:
1 2 3
However, note the difference in the resulting type; if you apply .' again, you get a UnitRange object back again.
If you don't particularly like having a "RowVector" object, and want a straightforward array, use that in an Array constructor:
julia> Array((1:3).')
1×3 Array{Int64,2}:
1 2 3
(The above is as of the latest Julia 0.7 dev version.)
I'm in the enviable position of being able to set up the format for my data collection ahead of time, rather than being handed some crazy format and having to struggle with it. I'd like to make sure I'm setting it up in a way that minimizes headaches down the road, but I'm not very familiar with importing into multidimensional arrays so I'd like input. It also seems like a thought exercise that others might get some use from.
I am compiling a large number of data summaries (500+) with 23 single data values for each experiment and two additional vectors that vary between 100 and 1500 data values (these two vectors happen to always match in length for each sample, but their length is different for each sample). I'm having to store all of these in an Excel sheet which I'm currently building. I want to set it up in a way that efficiently stores this data for import into an R array.
I'm assuming that the longer dimensions, which vary in length, will be padded to the max length (1500) with a bunch of NAs at the end, rather than trying to keep track of ragged data in Excel.
My current plan would be to store these in long form in Excel, with data labels in the first column (dim1, dim2,...), and the data summaries in each subsequent column (a, b, c...), since this saves the most space. Using a smaller number of dimensions as an example (7 single values, 2 vectors of length 1500), the data would look like this in Excel:
a b c...
dim1 2 5 7...
dim2 3 6 8...
dim3 6 8 2 ...
dim4 5 6 1...
dim5 6 2 1...
dim6 0 3 8...
dim7 8 5 4...
dim8 1 1 1...
dim8 2 2 2 ...
... continued x1500
dim9 4 4 4...
dim9 5 5 5 ...
...continued x1500
Can I easily import this, using the leftmost column to identify the dimensions of the array in long form? I don't see an easy way to do this using reshape2, but perhaps I'm missing something. Or do I need to have the data in paired columns?
It isn't clear to me whether this format is the most efficient way to organize this data for import into a multidimensional array, or if there is a better way. Eventually there will be a large number of samples so I'd like to think through this now rather than struggle later.
What is the most painless way to import this...or, is there a more efficient way of setting it up for easier import?
Hmm, I can't think of a case where you would have to use melt. If you keep the current format and add a heading to the 'dim' column, you should be able to work with that data fairly easily.
If you transposed the data on 'dim', I think it would make things a lot more difficult.
It might be good to know what variable types a, b, c, etc. are in order to make a better assessment.
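As a rough sketch of how the current layout could be worked with once it is in R, assume the sheet has been read into a data frame with a heading dim on the label column and one column per sample. The tiny long data frame below is invented for illustration and is much shorter than the real data. Splitting each sample column by the dim label gives one vector per label, and dropping NA removes the padding, which assumes NA only ever appears as padding.

```r
# Invented miniature of the long layout: a label column plus one column per sample
long <- data.frame(
  dim = c("dim1", "dim2", "dim8", "dim8", "dim8", "dim9", "dim9", "dim9"),
  a   = c(2, 3, 1, 2, NA, 4, 5, NA),
  b   = c(5, 6, 1, 2, 3, 4, 5, 6),
  stringsAsFactors = FALSE
)

sample_cols <- setdiff(names(long), "dim")

# For each sample, split its values by label and drop the NA padding
by_sample <- lapply(long[sample_cols], function(values) {
  pieces <- split(values, long$dim)
  lapply(pieces, function(v) v[!is.na(v)])
})

print(by_sample$a$dim8)
print(by_sample$b$dim9)
```

The result is a nested list rather than a rectangular array, which sidesteps the ragged vector lengths.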
I want to create a list which is eight times the vector c(2,6), i.e. a list of 8 vectors.
WRONG: object = as.list(rep(c(2,6),8)) results instead in a list of 16 single numbers: 2 6 2 6 2 6 2 6 ...
I tried drop=0 but that didn't help, and I can't get lapply to work.
(My context:
I have a function in which a subfunction will call a numbered list object.
The number will be in a loop and will therefore change, and the number and loop size depend on user values, so I don't know what they will be. If the user doesn't enter a list of vector values for one of the variables, I need to set a default.
The subfunction is expecting e.g. c(2,6).
The subfunction is currently looping 8 times, so I need a list which is eight times c(2,6).)
rep(list(c(2,6)),8) is the answer. Thanks to Nicola in the comments.
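A short sketch contrasting the two calls. The variable names are only for illustration, and replicate() with simplify = FALSE is included as an alternative that is not from the original thread.

```r
# rep() repeats the vector first, so as.list() then sees 16 separate numbers
wrong <- as.list(rep(c(2, 6), 8))

# Wrapping the vector in list() first makes rep() repeat the whole vector
right <- rep(list(c(2, 6)), 8)

# Alternative: replicate() with simplify = FALSE also returns a list
also <- replicate(8, c(2, 6), simplify = FALSE)

print(length(wrong))
print(length(right))
print(right[[3]])
```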
In R, given a numeric array, how can I produce another array that contains the sorted index order into the original array? To clarify my question, consider an example. For an original array:
[33 11 55 22 44]
I want to produce another array:
[3 1 5 2 4]
Each element in the second array gives the sorted position of the corresponding element in the first array.
In MATLAB, this can be done by using [B,XI]=sort(A); where XI is the wanted array.
Try the rank() function.
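A small sketch using the example from the question. rank() gives each element's position in sorted order, which matches the desired [3 1 5 2 4]. The index output of MATLAB's sort is a different thing, namely the permutation that sorts the array, and the R counterpart of that is order(). Both are shown here, since the question mentions both.

```r
a <- c(33, 11, 55, 22, 44)

# Position of each element in the sorted order
r <- rank(a)

# Permutation that sorts the array, so that a[ord] is sorted
ord <- order(a)

print(r)
print(ord)
print(a[ord])
```

With tied values, rank() averages the tied positions by default; its ties.method argument changes that behaviour.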