R: Creating a matrix with loops/apply (Original code Fortran) - r

Any help appreciated, I have been struggling with this problem far too long today, and I hope a fresh pair of eyes and set of braincells can help. Suggestions on how to make the code more efficient will also be greatly appreciated.
I am in the process of rewriting a program from Fortran into R. The eventual matrix, once all the data comes in, will be bigger than 1000x1000.
The first element of the code looked like this:
allocate (S(nrecords))
do i=1,nrecords
S(i)=ZZ(i,i)
end do
which in R simply became S <- diag(ZZ) (in the example data, nrecords = 10).
The example dataset I am using consists of a 10x10 matrix ZZ:
167315 136626 138035 150376 137080 136561 139467 137161 151010 140947
136626 171188 139660 138286 138161 138709 139713 138422 138138 140265
138035 139660 170362 138202 138643 138168 140629 139121 137675 139288
150376 138286 138202 167354 138025 138029 140168 137797 144110 139955
137080 138161 138643 138025 168606 144637 140715 138636 142043 141936
136561 138709 138168 138029 144637 167756 140256 138348 140914 152011
139467 139713 140629 140168 140715 140256 172119 141704 140553 140769
137161 138422 139121 137797 138636 138348 141704 169635 137902 138752
151010 138138 137675 144110 142043 140914 140553 137902 169823 142444
140947 140265 139288 139955 141936 152011 140769 138752 142444 173183
so S is a vector containing the diagonal values.
I am stuck in translating this Fortran element though:
allocate(D(nrecords,nrecords))
sumD=0
do i=1,nrecords
do j=1,nrecords
D(i,j)=S(i)+S(j)-2*ZZ(i,j)
sumD=sumD+D(i,j)
end do
end do
deallocate(ZZ)
sumD=sumD/(nrecords*nrecords)
I know that at the end of the day I am supposed to end up with another 10x10 matrix, where D[1,1] equals 0 and D[1,2] equals 65251. But between reading up on for-loops, apply(), sapply() and tapply(), I am rather lost and confused.
This is another element that has already been translated, and I wanted to base the Fortran translation on it, but I think I have been staring at it too long, and I strongly suspect that there is a more efficient answer:
n <-6
sumA <- 0
for (i in 1:n) {
for (j in 1:n) {
sumA <- sumA+A[i,j]
}
}
sumA2 <- 0
for (i in 1:n) {
for (j in 1:n) {
sumA2 <- sumA2+A[i,j]^2
}
}
with the corresponding Fortran:
sumA2=0.0;sumA=0.0
do i=1,nrecords
do j=1,nrecords
if(A(i,j) > 0.0) then
sumA2=sumA2+(A(i,j)*A(i,j))
sumA=sumA+A(i,j)
end if
end do
end do
sumMMA=0.0;sumZZ=0.0
do i=1,nrecords
do j=1,nrecords
sumMMA=sumMMA+(ZZ(i,j)*A(i,j))
sumZZ=sumZZ+ZZ(i,j) !this will not work using the sum(ZZ) function
end do
end do
Matrix A is simply
1 0 0 0 0 0 0 0 0 0
0 0.75 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 0 0
0 0 0 0 1 0 0 0 0 0
0 0 0 0 0 0.5 0 0 0 0
0 0 0 0 0 0 0.75 0 0 0
0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 0 1
Thanks in advance!

The purpose of the apply functions is to improve readability. If you don't understand them you don't need to use them. They are more or less wrappers for for loops. In your case, you can almost translate your code verbatim.
R
nrecords <- 10
ZZ <- as.matrix(read.table(header=FALSE, text='
167315 136626 138035 150376 137080 136561 139467 137161 151010 140947
136626 171188 139660 138286 138161 138709 139713 138422 138138 140265
138035 139660 170362 138202 138643 138168 140629 139121 137675 139288
150376 138286 138202 167354 138025 138029 140168 137797 144110 139955
137080 138161 138643 138025 168606 144637 140715 138636 142043 141936
136561 138709 138168 138029 144637 167756 140256 138348 140914 152011
139467 139713 140629 140168 140715 140256 172119 141704 140553 140769
137161 138422 139121 137797 138636 138348 141704 169635 137902 138752
151010 138138 137675 144110 142043 140914 140553 137902 169823 142444
140947 140265 139288 139955 141936 152011 140769 138752 142444 173183
'))
S <- diag(ZZ)
Fortran
allocate(D(nrecords,nrecords))
sumD=0
do i=1,nrecords
do j=1,nrecords
D(i,j)=S(i)+S(j)-2*ZZ(i,j)
sumD=sumD+D(i,j)
end do
end do
deallocate(ZZ)
sumD=sumD/(nrecords*nrecords)
R
D <- matrix(0, nrecords, nrecords)
sumD = 0
for(i in 1:nrecords){
for(j in 1:nrecords){
D[i,j] = S[i] + S[j] - 2*ZZ[i,j]
sumD = sumD + D[i,j]
}
}
sumD = sumD/(nrecords*nrecords)
Fortran
do i=1,nrecords
do j=1,nrecords
if(A(i,j) > 0.0) then
sumA2=sumA2+(A(i,j)*A(i,j))
sumA=sumA+A(i,j)
end if
end do
end do
sumMMA=0.0;sumZZ=0.0
do i=1,nrecords
do j=1,nrecords
sumMMA=sumMMA+(ZZ(i,j)*A(i,j))
sumZZ=sumZZ+ZZ(i,j) !this will not work using the sum(ZZ) function
end do
end do
R
A <- matrix(0, nrecords, nrecords)
diag(A) <- c(1,.75,1,1,1,.5,.75,1,1,1)
sumA2 = 0
sumA = 0
for(i in 1:nrecords){
for(j in 1:nrecords){
if(A[i,j] > 0){
sumA2=sumA2+(A[i,j]*A[i,j])
sumA = sumA+A[i,j]
}
}
}
sumMMA=0
sumZZ=0
for(i in 1:nrecords){
for(j in 1:nrecords){
sumMMA=sumMMA+(ZZ[i,j]*A[i,j])
sumZZ=sumZZ+ZZ[i,j]
}
}
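For readers who want to cross-check the loop translation above, here is a minimal sketch in Python (chosen only for illustration; it is not part of the original answer). It recomputes D, sumD and the sums over A from the question's example data, with names that mirror the Fortran variables. The list-of-lists layout and the diag_A helper name are assumptions made for this sketch.

```python
# Example data from the question: the 10x10 matrix ZZ.
ZZ = [
    [167315, 136626, 138035, 150376, 137080, 136561, 139467, 137161, 151010, 140947],
    [136626, 171188, 139660, 138286, 138161, 138709, 139713, 138422, 138138, 140265],
    [138035, 139660, 170362, 138202, 138643, 138168, 140629, 139121, 137675, 139288],
    [150376, 138286, 138202, 167354, 138025, 138029, 140168, 137797, 144110, 139955],
    [137080, 138161, 138643, 138025, 168606, 144637, 140715, 138636, 142043, 141936],
    [136561, 138709, 138168, 138029, 144637, 167756, 140256, 138348, 140914, 152011],
    [139467, 139713, 140629, 140168, 140715, 140256, 172119, 141704, 140553, 140769],
    [137161, 138422, 139121, 137797, 138636, 138348, 141704, 169635, 137902, 138752],
    [151010, 138138, 137675, 144110, 142043, 140914, 140553, 137902, 169823, 142444],
    [140947, 140265, 139288, 139955, 141936, 152011, 140769, 138752, 142444, 173183],
]
n = len(ZZ)  # nrecords

# S holds the diagonal of ZZ, like diag(ZZ) in R.
S = [ZZ[i][i] for i in range(n)]

# D(i,j) = S(i) + S(j) - 2*ZZ(i,j), then the mean of all entries.
D = [[S[i] + S[j] - 2 * ZZ[i][j] for j in range(n)] for i in range(n)]
sumD = sum(map(sum, D)) / (n * n)

# A is diagonal in the example; diag_A is a hypothetical helper name.
diag_A = [1, 0.75, 1, 1, 1, 0.5, 0.75, 1, 1, 1]
A = [[diag_A[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

# Sums over the strictly positive entries of A, as in the Fortran if-block.
sumA = sum(a for row in A for a in row if a > 0)
sumA2 = sum(a * a for row in A for a in row if a > 0)

# Element-wise product sum and plain sum of ZZ.
sumMMA = sum(ZZ[i][j] * A[i][j] for i in range(n) for j in range(n))
sumZZ = sum(map(sum, ZZ))

print(D[0][0], D[0][1])  # 0 65251, the values expected in the question
```

If the loops turn out to be too slow for the full-size matrix, the same quantities should be obtainable in R without explicit loops: D <- outer(S, S, "+") - 2*ZZ, sumD <- mean(D), sumA <- sum(A[A > 0]), sumA2 <- sum(A[A > 0]^2) and sumMMA <- sum(ZZ * A).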

Related

Build xy+xz+yz using NAND port

For a homework assignment I have to write xy+xz+yz using only NAND gates.
I will use the notation NAND(x,y), sometimes with other types of bracket to make things clearer. Below is my attempt, followed by an explanation of every step. I'd like to know if I'm doing this right and if there are better ways to do it.
My Solution
NAND[NAND(NAND(NAND(x,y),NAND(x,z)),NAND(NAND(x,y),NAND(x,z))),NAND(NAND(NAND(y,z),NAND(y,z)),NAND(NAND(y,z),NAND(y,z))))]
I know this looks really impossible to read and keep track of. I'm sorry, didn't know how to make this more beautiful. Hope my explanation will clarify things.
I divided xy+xz+yz in two groups: xy+xz and yz
First Group:
xy+xz = NAND(NAND(x,y),NAND(xz)) = NOT[NOT(xy)*NOT(xz)] = xy+xz
Second Group:
yz = NAND(NAND(y,z),NAND(y,z)) = NOT(NOT(yz)*NOT(yz)) = yz (since yz+yz = yz)
Now I have to combine the first group with the second. For readability I'll call the first group (in NAND form) g1 and the second g2:
g1+g2= NAND[NAND(g1,g1),NAND(g2,g2)] = NOT[NOT(g1)*NOT(g2)] = g1+g2
So at the end:
xy+xz+yz= NAND[NAND(NAND(NAND(x,y),NAND(x,z)),NAND(NAND(x,y),NAND(x,z))),NAND(NAND(NAND(y,z),NAND(y,z)),NAND(NAND(y,z),NAND(y,z))))]
Is my reasoning right? Is there an easier way?
Thanks a lot guys
Your answer is correct (although you have some missing punctuation -- a couple commas and a parenthesis). You can confirm this by generating a truth table over all possible inputs, as shown below. I wrote a few lines of C code to confirm. As for your second question, whether there is an easier way, I don't know. Maybe someone else can help out.
x  y  z   xy+xz+yz   nands
--------------------------
0  0  0       0        0
0  0  1       0        0
0  1  0       0        0
0  1  1       1        1
1  0  0       0        0
1  0  1       1        1
1  1  0       1        1
1  1  1       1        1
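The brute-force check described above can be sketched in a few lines of Python (the answer used C; this is only an illustrative reconstruction, and the function names are made up for the sketch). It builds the asker's expression from two-input NANDs and compares it with xy+xz+yz on all eight inputs. It also tries one possible simplification: since NOT(yz) is just NAND(y,z), the second group does not need to be built and then inverted again.

```python
from itertools import product

def nand(a, b):
    """Two-input NAND on 0/1 values."""
    return 1 - (a & b)

def asker_expression(x, y, z):
    """The construction from the question, group by group."""
    g1 = nand(nand(x, y), nand(x, z))        # xy + xz
    g2 = nand(nand(y, z), nand(y, z))        # yz
    return nand(nand(g1, g1), nand(g2, g2))  # g1 + g2

def simplified_expression(x, y, z):
    """Hypothetical shorter variant: NOT(g2) equals NAND(y, z) directly."""
    g1 = nand(nand(x, y), nand(x, z))        # xy + xz
    return nand(nand(g1, g1), nand(y, z))    # g1 + yz

rows = []
for x, y, z in product((0, 1), repeat=3):
    target = (x & y) | (x & z) | (y & z)
    rows.append((x, y, z, target, asker_expression(x, y, z), simplified_expression(x, y, z)))

for row in rows:
    print(*row)
```

Under these assumptions the simplified variant uses six two-input NAND gates, because the gate computing NAND(y,z) feeds the final gate directly.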

Extracting alternating sequence from vector in R

I have data that looks like the following:
A= c(0,0,0,-1,0,0,0,1,1,1,0,0,-1,0,0,-1,-1,1,1,1,-1,0,0,0,-1,0,0,-1,-1,1,1,0,0,0,0,1,-1)
The goal is to extract alternating -1s and 1s. I want to make a function where the input vector contains 0,1, and -1. The output ideally spits out all the 0s and alternating -1s and 1s.
For instance, the desired output for the above example is:
B= c(0,0,0,-1,0,0,0,1,0,0,0,0,-1,0,0,0,0,1,0,0,-1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,-1)
The two 1s in the 9th and 10th positions of A are turned to 0 because we only keep the first 1 or -1 of each run. The -1s in the 16th and 17th positions of A are turned to 0 for the same reason.
Anyone have a good idea for making such a function?
Identify positions of nonzero values:
w = which(A != 0)
For each run of similar values, in A[w], take the position of the first:
library(data.table)
wkeep = tapply(w, rleid(A[w]), FUN = function(x) x[1])
Set all other values to zero:
# following @alexis_laz's approach
B = numeric(length(A))
B[ wkeep ] = A[ wkeep ]
This way, you don't have to make comparisons in a loop, which R is slow at, I think.
rleid comes from data.table. With base R, you can make wkeep with @alexis_laz's suggestion:
wkeep = w[c(TRUE, A[w][-1L] != A[w][-length(w)])]
Or write your own rleid, as in Josh's answer.
This is really just a Reification of GWarius's pseudo-code. (I already had a structure but logic that was failing.)
last1 <- -A[which(A != 0)[1] ] # The opposite of the first non-zero item
for (i in seq_along(A) ){
if( last1==1 && A[i]==-1 ){ last1 <- -1
} else {if (last1 == -1 && A[i] == 1) { last1 <- 1
} else {A[i] <- 0}} }
A
[1] 0 0 0 -1 0 0 0 1 0 0 0 0 -1 0 0 0 0 1 0 0 -1 0 0
[24] 0 0 0 0 0 0 1 0 0 0 0 0 0 -1
> identical(A, B)
[1] TRUE
You have to scan the whole array and use a flag variable to remember whether the last value you kept was 1 or -1.
A possible pseudo-code algorithm:
flag = -(first nonzero value of a)
i = 0
while i < length(a):
    if flag == 1 && a[i] == -1:
        b[i] = a[i]
        flag = -1
    else if flag == -1 && a[i] == 1:
        b[i] = a[i]
        flag = 1
    else:
        b[i] = 0
    i++
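The flag-based idea above can be written as a short runnable sketch. It is in Python rather than R, purely for illustration, and the function name keep_alternating is hypothetical. It assumes the input only contains -1, 0 and 1, and it starts the flag at 0 so that the first nonzero value is always kept.

```python
def keep_alternating(a):
    """Keep only the first value of each run of equal nonzero signs; zero the rest."""
    out = []
    last = 0  # last nonzero value kept so far (0 means none yet)
    for v in a:
        if v != 0 and v != last:
            out.append(v)   # sign changed: keep this value
            last = v
        else:
            out.append(0)   # zero, or a repeat of the last kept sign
    return out

# Input and desired output from the question.
A = [0,0,0,-1,0,0,0,1,1,1,0,0,-1,0,0,-1,-1,1,1,1,-1,0,0,0,-1,0,0,-1,-1,1,1,0,0,0,0,1,-1]
B = [0,0,0,-1,0,0,0,1,0,0,0,0,-1,0,0,0,0,1,0,0,-1,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,-1]

print(keep_alternating(A) == B)  # True
```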

Is it possible to convert a MathProg MIP file to a format recognised by SCIP?

I've been using GLPK to solve some mixed integer programming problems. Here's a sample input file in MathProg format:
set REACTIONS;
set REACTANTS;
param Ys {i in REACTANTS, j in REACTIONS};
param Gamma {i in REACTANTS, j in REACTIONS};
param eps;
param delt;
var w {i in REACTANTS} >=-delt <=delt;
var R0 {i in REACTIONS} >=0 <=1, integer;
var Rn {i in REACTIONS} >=0 <=1, integer;
minimize z: sum{i in REACTIONS} -Rn[i];
s.t. const1{i in REACTIONS} : sum{k in REACTANTS} w[k]*Gamma[k,i] <= delt*(1-R0[i]);
s.t. const2{i in REACTIONS} : -sum{k in REACTANTS} w[k]*Gamma[k,i] <= delt*(1-R0[i]);
s.t. const3{i in REACTIONS} : Rn[i] <= 1-R0[i];
s.t. const5{i in REACTIONS} : sum{k in REACTANTS} w[k]*Gamma[k,i] <= delt*(1-Rn[i])-eps;
s.t. const6{i in REACTIONS, j in REACTIONS: i <> j} : sum{k in REACTANTS} w[k]*(Ys[k,i]-Ys[k,j]) <= delt*(1-Rn[i]+Rn[j]+R0[j]);
data;
set REACTIONS:= 1 2 3 4 5 6;
set REACTANTS:= 1 2 3 4 5 6;
param Ys: 1 2 3 4 5 6:=
1 1 0 0 0 0 0
2 1 0 0 0 0 0
3 0 1 1 0 0 0
4 0 0 0 1 0 0
5 0 0 0 1 0 0
6 0 0 0 0 1 1;
param Gamma: 1 2 3 4 5 6:=
1 -1 1 0 0 0 1
2 -1 1 1 0 0 0
3 1 -1 -1 0 0 0
4 0 0 1 -1 1 0
5 0 0 0 -1 1 1
6 0 0 0 1 -1 -1;
param eps:=0.1;
param delt:=10;
end;
I've been running into performance problems for bigger problems of this type, and since SCIP claims to be several times faster than GLPK for MIP, it seems worth investigating. However, I haven't been able to make head or tail of the documentation when it comes to input file formats. SCIP's homepage says that it supports AMPL format, and GLPK's homepage says that MathProg is a subset of AMPL. Trying to feed the above file into SCIP 3.1.0 via scip -f file.nl returns the following error:
read problem <file.nl>
============
no reader for input file <file.nl> available
I'm not sure whether this is because I've failed to build SCIP with AMPL support, or something else. I found this blog post on building SCIP with AMPL support, but the instructions seem to be outdated as the source zip of SCIP 3.1.0 doesn't contain an interfaces folder.
So, I have two questions:
Is it possible to get SCIP to recognise my MathProg input as is?
If not, can anyone advise on how to convert it to a recognised format? An automated method would be preferable, as I don't really want to have to learn yet another format, but a manual method would be better than nothing.
Thanks for any help and apologies for my ignorance!
As I indicated in my comment above, the Ampl-interface is still included in the SCIP-distribution, and you should be able to compile it and read your problem as documented in the excellent blog post you cite.
If you feel tempted to try different file formats, I see two options for you:
Use glpk to translate your problem into a file format that SCIP recognizes. I found the methods glp_write_mps() and glp_write_lp(). SCIP can read both .lp and .mps-files. Make sure that you use exactly these file extensions, because SCIP doesn't recognize a file in .lp-format if its name ends with .txt instead.
Use Zimpl to formulate your problems instead. The two formats of Zimpl and Ampl are strikingly similar, see the documentation for examples and further reference. Problem descriptions in Zimpl can be translated into .lp-format or read directly by SCIP, if you compile SCIP with the ZIMPL=true-option, which is the default.

Repeat a loop until it satisfies a specific condition

Can anybody help me with this? Suppose "p" is totally exogenous and follows a uniform distribution. I then want to generate "z", which is a TRUE(=1) or FALSE(=0) dummy with the property that the sum of each group of three elements (1-3, 4-6, 7-9,..., 58-60) in "z" is greater than 0.
For example, if I get a "z" like {1 0 0 1 1 0 0 0 0 0 1 0...}, I hope to repeat the loop again ( since sum(z[7:9])=0 ) to draw a different "error" until I get a new "z" like {1 1 0 0 0 1 0 1 0 1 0 0...} where all summations for each three elements are greater than 0. The code I use is as follows. Where am I wrong?
set.seed(005)
p<-runif(60, 0, 1)
for (i in 1:20) {
repeat {
error= -0.2*log((1/runif(60, 0, 1))-1) # a random component
z=(p<0.5+error) # TRUE/FALSE condition
z=replace(z, z==TRUE, 1) # replace z to 1 if z is true, else z=0
if (sum(z[(3*i-2):(3*i)])>0) {break}
}
}
Your for loop generates a new z for every i. I don't think that's what you're trying to do. From what I can understand, you're trying to generate a new z and then use a for loop with the counter i to check for sums of three consecutive elements. If so, then you need to have one loop to generate new zs, and then another one inside this loop which checks for the sum of three consecutive elements.
I think this does what you want. But when I run it it seems unlikely that you will get a satisfactory z soon.
set.seed(005)
p<-runif(60, 0, 1)
invalidentriesexist =1
while(invalidentriesexist == 1) {
error = -0.2*log((1/runif(60, 0, 1))-1) # a random component
z=(p<0.5+error) # TRUE/FALSE condition
z=replace(z, z==TRUE, 1) # replace z to 1 if z is true, else z=0
z=replace(z, z==FALSE, 0) # replace z with 0 if z is false
invalidentriesexist = 0
i = 1
while ( i <=20 & invalidentriesexist == 0 ) {
invalidentriesexist = 0
if (sum(z[((3*i)-2):(3*i)])==0) {invalidentriesexist = 1}
cat(i,'\n')
cat(invalidentriesexist,'\n')
cat(paste(z,collapse = ","),'\n')
cat(z[((3*i)-2):(3*i)],'\n\n')
i = i + 1
}
}
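The structure described above (an outer loop that redraws z, and an inner check over every block of three) can be sketched as follows. This is an illustrative Python version, not the answer's R code; the function names, the max_tries cap and the clamping of the uniform draw are assumptions added for the sketch, and Python's seed 5 does not reproduce R's set.seed(005).

```python
import math
import random

def all_blocks_nonzero(z, block=3):
    """True if every consecutive block of `block` elements sums to more than 0."""
    return all(sum(z[k:k + block]) > 0 for k in range(0, len(z), block))

def draw_valid_z(p, rng, max_tries=10_000):
    """Redraw the error vector until z passes the block check; None if we give up."""
    for _ in range(max_tries):
        z = []
        for p_i in p:
            # Keep u strictly inside (0, 1) so the log below is always defined.
            u = min(max(rng.random(), 1e-12), 1 - 1e-12)
            error = -0.2 * math.log(1 / u - 1)       # a random component
            z.append(1 if p_i < 0.5 + error else 0)  # TRUE/FALSE condition as 1/0
        if all_blocks_nonzero(z):
            return z
    return None  # no valid draw within max_tries

rng = random.Random(5)                 # hypothetical seed
p = [rng.random() for _ in range(60)]  # exogenous uniform p
z = draw_valid_z(p, rng)
print(z)
```

Returning None after max_tries is a design choice for the sketch: depending on p, a valid draw can be rare, so an unbounded loop may run for a long time.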

Matlab SimBiology - Allow self connecting nodes

I'm using Matlab's SimBiology toolbox to generate biographs (which are just graphs).
When I'm trying to include a node which connects to itself, e.g:
g = [
0 1 0;
1 0 1;
0 0 1; % This one connects to itself.
]
I get the following warning message :
Warning: Self connecting nodes are not allowed, ignoring the diagonal of CM.
As my data sets include some self connecting nodes, I was wondering if this is a configurable feature.
Thanks!
Unfortunately biograph can not have self-connecting edges. If your purpose is only for visualization you could add some nodes with empty labels. Here is an example for a small graph and two self-connecting nodes:
cm = [0 1 1 0 0;1 0 0 1 1;1 0 1 0 0;0 0 0 0 1;1 0 1 0 1];
ids = {'M30931','L07625','K03454','M27323','M15390'};
sc = find(diag(cm));
cm = cm-diag(diag(cm));
n = size(cm,1);
m = numel(sc);
cm(n+m,n+m)=0;
cm(sub2ind([n+m,n+m],[sc;(1:m)'+n],[(1:m)'+n;sc]))=1;
ids((1:m)+n) = {' '};
bg = biograph(cm);
for i = 1:numel(bg.Nodes)
bg.Nodes(i).Label = ids{i};
if i>n
bg.Nodes(i).Shape = 'circle';
end
end
view(bg)
HTH
Lucio
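To make the matrix manipulation in the MATLAB snippet above easier to follow, here is a rough Python sketch of the same transformation on the connection matrix alone (no plotting). The function name expand_self_loops is hypothetical and indices are 0-based, unlike MATLAB. Each self-connecting node loses its diagonal entry and gets one extra dummy node, linked to it in both directions.

```python
def expand_self_loops(cm):
    """Replace each self-loop by a two-way link to a new dummy node."""
    n = len(cm)
    loops = [i for i in range(n) if cm[i][i]]  # nodes that connect to themselves
    size = n + len(loops)
    out = [[0] * size for _ in range(size)]
    # Copy the original matrix without its diagonal.
    for i in range(n):
        for j in range(n):
            if i != j:
                out[i][j] = cm[i][j]
    # Dummy node n+k stands in for the self-loop of node loops[k].
    for k, i in enumerate(loops):
        out[i][n + k] = 1
        out[n + k][i] = 1
    return out, loops

# Connection matrix from the answer.
cm = [
    [0, 1, 1, 0, 0],
    [1, 0, 0, 1, 1],
    [1, 0, 1, 0, 0],
    [0, 0, 0, 0, 1],
    [1, 0, 1, 0, 1],
]
expanded, loops = expand_self_loops(cm)
print(loops)          # [2, 4]
print(len(expanded))  # 7
```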
