Writing Data into columns in a file (IDL) - idl-programming-language

I am trying to write some data into columns in IDL.
Let's say I am calculating "k" and "k**2", then I would get:
1 1
2 4
3 9
. .
and so on.
If I write this into a file, it looks like this:
1 1 2 4 3 9 . .
My corresponding code looks like this:
pro programname
openw, 1, "filename"
.
. calculating some values
.
printf, 1, value1, value2, value3
close,1
end
best regards

You should probably read the IDL documentation on formatted output, to get all of the details.
I don't understand your "value1, value2, value3" in your printf. If I were going to do this, I would have two variables "k" and "k2". Then I would print using either a transpose or a for loop:
IDL> k = [1:10]
IDL> k2 = k^2
IDL> print, transpose([[k], [k2]])
1 1
2 4
3 9
4 16
5 25
6 36
7 49
8 64
9 81
10 100
IDL> for i=0,n_elements(k)-1 do print, k[i], k2[i]
1 1
2 4
3 9
4 16
5 25
6 36
7 49
8 64
9 81
10 100
By the way, if you are going to use stack overflow you should start "accepting" the correct answers.

It sounds like you have two vectors. One with single values and one with square values. Here's what you could do:
pro programname
openw, f, "filename", /GET_LUN
.
. calculating some values for k
.
k2 = k^2 ;<---This is where you square your original array.
;Merge the vectors into a single array. Many ways to do this.
k = strtrim(k,2) ;<----------------------------Convert to string format
k2 = strtrim(k2,2) ;<--------------------------Convert to string format
merge = strjoin(transpose([[k],[k2]]),' ') ;<--Merge the arrays
;Now output your array to file the way you want to
for i=0L, n_elements(k)-1 do printf, f, merge[i]
;Close the file and free the logical file unit
free_lun, f
end

Related

regex for searching through dataframe in R

I have a list of barcodes with the format: AAACCTGAGCGTCAAG-1
The letters can be A, C, G or T and the number after the dash can be 1 - 16.
barcode = c('AAACCTGAGCGTCAAG-1',
'AAACCTGAGTACCGGA-1',
'AAACCTGCAGCTGCTG-1',
'AAACCTGCATCACGAT-3',
'AAACCTGCATTGGGCC-5',
'AAACCTGGTATAGTAG-10',
'AAACCTGGTCGCGTGT-1',
'AAACCTGGTTTCCACC-16',
'AAACCTGTCATGCATG-14',
'AAACCTGTCGCAGGCT-15',
'AAACGGGAGAACTCGG-1')
cluster = c(6,3,6,16,17,11,14,18,9,8,14)
df <- data.frame(Barcode = barcode, Cluster = cluster)
I need to subset this dataframe based on the -# at the end of the barcode. I have been using this to subset the dataframe. The problem is this works for every number except 1.
> df[grep("([ACGT]-10){1}", df$Barcode),]
Barcode Cluster
6 AAACCTGGTATAGTAG-10 11
When I use the following, it will include all the barcodes that end in -1, as well as -10, -11, -12, -13, -14, -15 and -16.
> df[grep("([ACGT]-1){1}", df$Barcode),]
Barcode Cluster
1 AAACCTGAGCGTCAAG-1 6
2 AAACCTGAGTACCGGA-1 3
3 AAACCTGCAGCTGCTG-1 6
6 AAACCTGGTATAGTAG-10 11
7 AAACCTGGTCGCGTGT-1 14
8 AAACCTGGTTTCCACC-16 18
9 AAACCTGTCATGCATG-14 9
10 AAACCTGTCGCAGGCT-15 8
11 AAACGGGAGAACTCGG-1 14
>
Is there a regex that will include barcodes ending in -1, but exclude all other barcodes that end in numbers from 10 - 16?
I want to subset the dataframe so that I only get this:
Barcode Cluster
1 AAACCTGAGCGTCAAG-1 6
2 AAACCTGAGTACCGGA-1 3
3 AAACCTGCAGCTGCTG-1 6
7 AAACCTGGTCGCGTGT-1 14
11 AAACGGGAGAACTCGG-1 14
>
Thanks!
How about:
df[grep("-1$", df$Barcode),]
This matches 1 at the end of the string, but also requires that the digit before 1 is not 1, so you don't match 11
Barcode Cluster
1 AAACCTGAGCGTCAAG-1 6
2 AAACCTGAGTACCGGA-1 3
3 AAACCTGCAGCTGCTG-1 6
7 AAACCTGGTCGCGTGT-1 14
11 AAACGGGAGAACTCGG-1 14
I think you can just use df[grep("([ACGT]-1$){1}", df$Barcode),]
You can just use a $ to specify the end of the chain. See more information here on "pattern" use: http://www.jdatalab.com/data_science_and_data_mining/2017/03/20/regular-expression-R.html

Alternating between reading forwards and backwards in a loop

My array is 1D m in length. say m = 16
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
The way I actually interpret the array is n x n = m
0 1 2 3
4 5 6 7
8 9 10 11
12 13 14 15
I require to read the array in this manner due to the way my physical environment is set up
0 4 8 12 13 9 5 1 2 6 10 14 15 11 7 3
What I came up with works but I really don't think it is the best way to do this:
bool isFlipped = true;
int x = 0; x < m; x++
if(isFlipped)
newLine[x] = line[((n-1)-x%n)*n + x/n)]
else
newLine[x] = line[x%n*n +x/n]
if(x != 0 && x % n == 0)
isFlipped = !isFlipped
This gives me the required result but I really think there is a way to get rid of this boolean by purely using a math formula. I am stuffing this into a 8kb microcontroller and I need to conserve as much space as I can because I will have some bluetooth communication and more math going into it later on.
Edit:
Thanks to a user I got to a one line solution-ish. (the below would replace the lines in the for-loop)
c=x/n
newLine[x] = line[((c+1)%2)*((x%n)*n+c) + (c%2)*((n-1)-2*(x%n))*n ];
You should be able to utilize the fact that odd columns in the n*n matrix are read from down up, and even columns are read from up down.
A number at index x in newLine is located in column number c=floor(x/n) in the n*n matrix. c%2 is 0 for even columns and 1 for odd columns. So something like this should work:
int c = x/n;
newLine[x] = line[(x%n)*n + (c%2)*((n-1)-2*(x%n))*n + c];

Sqlite - store matrices in a table

I am quite new to Sqlite and have a dilemma about database design. Suppose we have a number of matrices (of various sizes) that is going to be stored in a table. We can further assume that no matrise is sparse.
Let's say we have:
A = [[1, 4, 5],
[8, 1‚ 4],
[1, 1, 3]]
B = [['what', 'a', 'good', 'day'],
['for', 'a', 'walk', 'outside']]
C = [['AAA', 'BBB', 'CCC', 'DDD', 'EEE'],
['FFF', 'GGG', 'HHH', 'III', 'JJJ'],
['KKK', 'LLL', 'MMM', 'NNN', 'OOO']]
And D which is [NxM]
When we create the table we do not know all the sizes that the matrices will have. I do not think it would be nice to alter the table size afterwards. What would be a recommended way to store the matrices to efficiently get them back? I wish to query out a matrix row-by-row.
I am thinking of transforming matrices into a column vector that somehow ends up in a table like this,
CREATE TABLE mat(id INT,
row INT,
col INT,
val TEXT)
How can I get them back line by line with a query in sqlite that looks like this for matrix A?
[1, 4, 5]
[8, 1‚ 4]
[1, 1, 3]
Ideas? Or could someone kindly refer to any similar problems
---------------------- UPDATE ----------------------
Okay. My question was not clear enough. That is probably the way I'm intended to arrange the data in my database. I hope you can help me find a way to organize my database,
Suppose we have some sets of data:
Compilation User BogoMips
1 Andrew 1.04
1 Klaus 1.78
1 James 1.99
1 David 2.09
. . .
. . .
1 Alex 4.71
Compilation Time Temperature Colour
2 10:20 10 Blue
2 10:28 21 Green
2 10:42 25 Red
. . . .
. . . .
2 18:16 16 Green
Compilation Colour Distance
3 Blue 4
3 Green 9
. . .
. . .
3 Yellow 12
...And there will be many more sets of data with different numbers columns and new headers. Some header names will return in another set. In advance, we have no idea what kind of sets needs to be stored. Every set has a common header 'compilation' that binds them together.
How would you structure the data in a database?
I find it hard to believe that creating a new table for each set is a good solution. or?
My idea is to have two tables, headers and data.
CREATE TABLE headers (id INT,
header TEXT
)
CREATE TABLE data (id INT,
compilation INT,
fk_header_id INT REFERENCES headers,
row INT,
col INT,
value TEXT)
So the populated tables looks like this,
SELECT * FROM headers;
id header
------------
1 User
2 BogoMips
3 Time
4 Temperature
5 Colour
6 Distance
SELECT * FROM data;
id compilation fk_header_id row col value
----------------------------------------------------
1 1 1 1 1 Andrew
2 1 2 1 2 1.04
3 1 1 2 1 Klaus
4 1 2 2 2 1.78
. . . . . .
. 2 3 1 1 10:20
. 2 4 1 2 10
. 2 5 1 3 Blue
. 2 3 2 1 10:28
. 2 4 2 2 21
. 2 5 2 3 Green
. . . . . .
. 3 5 1 1 Blue
. 3 6 1 2 4
. 3 5 2 1 Green
. 3 6 2 2 9
. . . . . .
.
and so on
The problem is that I don't know how to query out the datasets in Sqlite. Anyone (Tony) have an idea?
You'd need a pivot / cross tab query (or it's join equivalent) to get the data out.
e.g
Select c1.value as col1, c2.value as col2, c3.value as col3
from data c1 on col = 1
inner join data c2 on col = 2 and c2.compilation = c1.compilation and c2.row = c1.row
inner join data c3 on col= 3 and c3.compilation = c1.compilation and c3.row = c1.row
Where c1.compilation = 1 order by c1.row
As you can see this is less than funny. In particular with the above, you'd have to know the number of columns in advance. Crosstab or pivot would relieve you of that in terms of the sql, but you'd still have to mess about to read in the data from the query result.
Haven't seen anything is your question that indicates a need to extract a row or a column from a matrix, never mind a cell from the db
My Table would start as simple as
Compilation, Description, Matrix
Matrix would be sort of serialisation of a matrix object, Binary, xml even some sort of string eg. 1,2,3|4,5,6|7,8,9
If this was all I needed to store, I'd be looking at a NoSQL variant.

returning a list in R and functional programming behavior

I have a basic questions regarding functional programming in R.
Given a function that returns a list, such as:
myF <- function(x){
return (list(a=11,b=x))
}
why is it that the list returned when calling the function with a range or vector is always the same lenght for 'a'
Ex:
myF(1:10)
returns:
$a
[1] 11
$b
[1] 1 2 3 4 5 6 7 8 9 10
How can one change the behavior so that the 'a' list has the sample length as b's.
I am actually working with a bunch of S4 objects that do I cannot easily convert to list (using as.list) so _apply is not my first choice.
Thanks for any insight or help!
EDIT (Added further explanations)
I am not necessarily looking to just pad 'a' to makes its length equal to b's. However using the solution
as.list(data.frame(a=myA,b=x)) pads the 'a' with the same value computed first.
myF <- function(x){
myA = ceiling(runif(1, max=100))
return (as.list(data.frame(a=myA
,b=x)))
}
myF(1:5)
$a
[1] 79 79 79 79 79 79 79 79 79 79
$b
[1] 1 2 3 4 5 6 7 8 9 10
I still am not sure why that happens!
Thanks
are you just looking to have 11 repeated so that a is the same length as b? if so:
> myF <- function(x){
+ return (list(a=rep(11,length(x)),b=x))
+ }
> myF(1:10)
$a
[1] 11 11 11 11 11 11 11 11 11 11
$b
[1] 1 2 3 4 5 6 7 8 9 10
EDIT based on OP's clarification/comments. If you want 'a' to instead be a random vector with length equal to 'b':
> myF <- function(x){
+ return (list(a=ceiling(runif(length(x),max=100)),b=x))
+ }
> myF(1:10)
$a
[1] 4 31 8 45 25 74 36 95 64 32
$b
[1] 1 2 3 4 5 6 7 8 9 10
I don't quite understand what you mean by not being able to use as.list. You should be able to get a version of your function satisfying the requirement that all components of the list be equally long by doing:
myF <- function(x){
return as.list(data.frame(a=11,b=x))
}
EDIT:
The reason list does not work the way you expect is that list applied to a number of lists/vectors/e.t.c. is just that, a list of those lists/vectors/e.t.c.; it does not "inspect" their structure.
What I think you want is the additional semantics that the vectors contained in the list should match up and produce a set of "rows", each with one corresponding element from each one of your vectors. This is exactly what a data frame is suppose to be (indeed how, I think, a data frame is represented in R). The final as.list call does little but change what type its tagged as.
EDIT2:
Note that if I'm wrong above (and that's not the general behaviour you want) then Mac's solution is more appropriate, as it gives you exactly the behaviour that both the vectors should have the same length, without implying that they should "line up".
This would both be confusing to anyone reading the code (as using a data.frame implies you think of your vectors as matching up) as well as forcing any additional elements you add to the list to be converted into vectors of the appropriate length (which may or may not be what you want)
In case I did not understand you correctly last time, here is another possibility:
If you want to generate a second vector, given some function/expression, of the same length as your argument you could do something like:
myF <- function(x){
return (list(a=replicate(length(x),f),b=x))
}
in your example f could be runif(1, max=100), though in the specific case of runif you could explicitly tell it to generate a vector of appropriate length by calling runif(length(x), max=100) inside the function.
replicate simply re-evaluates f the number of times you request, and gives you the vector of all the results.
It appears that your function is "hard coding" a. So no matter what you specify it will always give 11.
If for example you changed the function to:
myF <- function(x){ return (list(a=x,b=x)) }
myF(1:10)
$a
[1] 1 2 3 4 5 6 7 8 9 10
$b
[1] 1 2 3 4 5 6 7 8 9 10
a is allowed to change like b.
or
myF <- function(x,y){ return (list(a=y,b=x)) }
myF(10:1,1:10)
$a
[1] 1 2 3 4 5 6 7 8 9 10
$b
[1] 10 9 8 7 6 5 4 3 2 1
Now a is allowed to change independent of b.

loop over columns with semi like columnnames

I have the following variable and dataframe
welltypes <- c("LC","HC")
qccast <- data.frame(
LC_mean=1:10,
HC_mean=10:1,
BC_mean=rep(0,10)
)
Now I only want to see the welltypes I selected(in this case LC and HC, but it could also be different ones.)
for(i in 1:length(welltypes)){
qccast$welltypes[i]_mean
}
This does not work, I know.
But how do i loop over those columns?
And it has to happen variable wise, because welltypes is of an unkown size.
The second argument to $ needs to be a column name of the first argument. I haven't run the code, but I would expect welltypes[i]_mean to be a syntax error. $ is similar to [[, so you can use paste to create the column name string and subset via [[.
For example:
qccast[[paste(welltypes[i],"_mean",sep="")]]
Depending on the rest of your code, you may be able to do something like this instead.
for(i in paste(welltypes,"_mean",sep="")){
qccast[[i]]
}
Here's another strategy:
qccast[ sapply(welltypes, grep, names(qccast)) ]
LC_mean HC_mean
1 1 10
2 2 9
3 3 8
4 4 7
5 5 6
6 6 5
7 7 4
8 8 3
9 9 2
10 10 1
Another easy way to access given welltypes
qccast[,paste(welltypes, '_mean', sep = "")]

Resources