Replacing a specific element in one array also changes the other one [duplicate] - julia

This question already has answers here:
Creating copies in Julia with = operator
(2 answers)
Closed 3 years ago.
Here is the example code. I can't understand why the first element of array B is also changed. Can I keep the elements of array B unchanged?
julia> A = [0.0 0.1 0.2 0.3];
julia> B = A;
julia> A[1] = 0.1;
julia> A
1×4 Array{Float64,2}:
0.1 0.1 0.2 0.3
julia> B
1×4 Array{Float64,2}:
0.1 0.1 0.2 0.3

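In Julia, B = A does not copy the array. It binds a second name to the same array object, so a change made through A is also visible through B. You need to create a copy: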
julia> A = [0.0 0.1 0.2 0.3];
julia> B = deepcopy(A)
1×4 Array{Float64,2}:
0.0 0.1 0.2 0.3
julia> A[1] = 0.1;
julia> A, B
([0.1 0.1 0.2 0.3], [0.0 0.1 0.2 0.3])
Note that for this code copy alone would be enough, because the elements are plain numbers. If you have an Array of mutable objects that you later mutate, deepcopy should be used, since copy shares the element objects between the two arrays.
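The difference between binding, shallow copy and deep copy is not specific to Julia. As a hedged illustration, here is a minimal sketch in Python (not Julia), where plain assignment, copy.copy and copy.deepcopy play roughly the roles of =, copy and deepcopy above. The variable names are made up for the example.

```python
import copy

# A list of mutable rows stands in for "an array of objects".
a = [[0.0, 0.1], [0.2, 0.3]]

alias = a                  # same object under a second name, nothing is copied
shallow = copy.copy(a)     # new outer list, but the inner rows are shared
deep = copy.deepcopy(a)    # outer list and inner rows are both copied

a[0][0] = 9.9              # mutate an inner object
a.append([0.4, 0.5])       # mutate the outer container

print(alias is a)          # the alias sees every change
print(len(shallow))        # the shallow copy keeps its own length ...
print(shallow[0][0])       # ... but shares the mutated inner row
print(deep[0][0])          # the deep copy is fully independent
```

Only the deep copy is unaffected by both kinds of mutation, which is why it is the safe choice for containers of mutable objects.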

Related

Lower RAM consumption for a transformation of a transition matrix

I've written the following two functions, which take as input a transition matrix and the nodes that should become absorbing states, and transform it.
The first function, set.absorbing.states(), has 3 arguments: tm is the initial transition matrix, inn is a specified initial node, and soi is the set of interest. By 'set of interest', I mean a set of nodes in that matrix that must be set as absorbing states. An example of such an initial matrix is the following:
tm <- read.table(row.names=1, header=FALSE, text="
A 0.2 0.3 0.1 0.2 0.1 0.1
B 0.3 0.1 0.1 0.2 0.2 0.1
C 0 0.2 0.4 0.1 0.2 0.1
D 0.2 0.1 0.2 0.3 0.1 0.1
E 0.2 0.2 0.1 0.2 0.1 0.2
F 0.3 0.2 0.4 0.1 0 0")
colnames(tm) <- row.names(tm)
As you can see, there are no absorbing states in that matrix. Let's say, for example, that we want to set A and E as absorbing states, with B as a randomly selected initial node.
By executing the first function, tm1 <- set.absorbing.states( tm , "B", c("A","E")), we get back a matrix in which the absorbing states have been set:
A B C D E F
A 1.0 0.0 0.0 0.0 0.0 0.0
B 0.3 0.1 0.1 0.2 0.2 0.1
C 0.0 0.2 0.4 0.1 0.2 0.1
D 0.2 0.1 0.2 0.3 0.1 0.1
E 0.0 0.0 0.0 0.0 1.0 0.0
F 0.3 0.2 0.4 0.1 0.0 0.0
As you can see, A and E have been changed into absorbing states.
The next step is to rearrange that matrix so that all absorbing-state nodes (both rows and columns) go to the end. So by running ptm <- transform.tm( tm1, c("A","E") ) we get back a matrix that looks like:
B C D F A E
B 0.1 0.1 0.2 0.1 0.3 0.2
C 0.2 0.4 0.1 0.1 0.0 0.2
D 0.1 0.2 0.3 0.1 0.2 0.1
F 0.2 0.4 0.1 0.0 0.3 0.0
A 0.0 0.0 0.0 0.0 1.0 0.0
E 0.0 0.0 0.0 0.0 0.0 1.0
You can see now clearly that A and E nodes went to the end of that matrix.
Here follows the function I'm using.
set.absorbing.states <- function ( tm, inn, soi )
{
  # indices of the nodes of interest, excluding the initial node
  # (names are compared with names, not indices with a name)
  set <- which( row.names(tm) %in% setdiff(soi, inn) )
  for (i in set )
    tm[i,] <- 0
  for (i in set)
    tm[i,i] <- 1
  tm
}
transform.tm <- function ( tm, soi )
{
  end_sets <- which(row.names(tm) %in% soi)
  ptm <- rbind( cbind(tm[-end_sets, -end_sets], tm[-end_sets, end_sets]) ,
                cbind(tm[end_sets, -end_sets], tm[end_sets, end_sets]) )
  ptm
}
With such small matrices, everything works properly. But when I tried a big matrix (20,000 x 20,000), the second function needed 32GB of RAM to execute.
So is there any way to do this in a more resource-efficient way?
Using indexing will significantly reduce the number of copies that your transformation function creates (via rbind and cbind). It is probably also a bit simpler conceptually (given a solid understanding of indexing with [).
transform.tm1 <- function ( tm, soi ) {
  newOrder <- c(setdiff(row.names(tm), soi), soi)
  tm[newOrder, newOrder]
}
Here, setdiff is used to pull the non-matching names and put them at the front of the vector. Then the matrix is simply reordered via its row/column names.
This returns
transform.tm1(tm1, c("A", "E"))
B C D F A E
B 0.1 0.1 0.2 0.1 0.3 0.2
C 0.2 0.4 0.1 0.1 0.0 0.2
D 0.1 0.2 0.3 0.1 0.2 0.1
F 0.2 0.4 0.1 0.0 0.3 0.0
A 0.0 0.0 0.0 0.0 1.0 0.0
E 0.0 0.0 0.0 0.0 0.0 1.0
Check that they return the same results:
identical(transform.tm(tm1, c("A", "E")), transform.tm1(tm1, c("A", "E")))
[1] TRUE
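The reordering idea (build one new name order, then index rows and columns with it in a single step) can also be sketched outside R. The following is a minimal Python sketch on nested lists. The helper name reorder_absorbing_last is hypothetical, and the data is the tm1 matrix from the question.

```python
def reorder_absorbing_last(names, matrix, absorbing):
    """Move the rows and columns named in `absorbing` to the end.

    Hypothetical helper: `names` labels both rows and columns of the
    square nested-list `matrix`.
    """
    # Same idea as c(setdiff(row.names(tm), soi), soi) in the R answer.
    new_order = [n for n in names if n not in absorbing] + list(absorbing)
    position = {n: i for i, n in enumerate(names)}
    idx = [position[n] for n in new_order]
    # One pass of indexing instead of stitching four blocks together.
    return new_order, [[matrix[r][c] for c in idx] for r in idx]


names = ["A", "B", "C", "D", "E", "F"]
tm1 = [
    [1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.3, 0.1, 0.1, 0.2, 0.2, 0.1],
    [0.0, 0.2, 0.4, 0.1, 0.2, 0.1],
    [0.2, 0.1, 0.2, 0.3, 0.1, 0.1],
    [0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
    [0.3, 0.2, 0.4, 0.1, 0.0, 0.0],
]

order, ptm = reorder_absorbing_last(names, tm1, ["A", "E"])
print(order)
for row in ptm:
    print(row)
```

Note that this sketch still builds one full copy of the matrix, as tm[newOrder, newOrder] does in R. It only avoids the intermediate blocks.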

Create a sparse symmetric random matrix in Julia

Is there an easy way to create a sparse symmetric random matrix in Julia?
Julia has the command
sprand(m,n,d)
which "Creates a [sparse] m-by-n random matrix (of density d) with iid non-zero elements distributed uniformly on the half-open interval [0,1)." But as far as I can tell this doesn't necessarily return a symmetric matrix.
I am looking for an equivalent command to MATLAB's
R = sprandsym(n,density)
which automatically creates a sparse symmetric random matrix. If such a command isn't implemented yet, what would be a workaround to transform the matrix returned by sprand(m,n,d) into a symmetric one?
Thank you!
You could use Symmetric(sprand(10,10,0.4)).
To avoid the extra-memory caveat mentioned in the comment on Michael Borregaard's answer, the following function takes a sparse matrix and drops the entries in the lower triangular part. If the SparseMatrixCSC format is unfamiliar, it also serves as a good illustration of how the format is manipulated:
function droplower(A::SparseMatrixCSC)
    m,n = size(A)
    rows = rowvals(A)
    vals = nonzeros(A)
    V = Vector{eltype(A)}()
    I = Vector{Int}()
    J = Vector{Int}()
    for i=1:n
        for j in nzrange(A,i)
            rows[j]>i && break
            push!(I,rows[j])
            push!(J,i)
            push!(V,vals[j])
        end
    end
    return sparse(I,J,V,m,n)
end
Example usage:
julia> a = [0.5 1.0 0.0 ; 2.0 0.0 0.0 ; 0.0 0.0 0.0]
3×3 Array{Float64,2}:
0.5 1.0 0.0
2.0 0.0 0.0
0.0 0.0 0.0
julia> b = sparse(a)
3×3 SparseMatrixCSC{Float64,Int64} with 3 stored entries:
[1, 1] = 0.5
[2, 1] = 2.0
[1, 2] = 1.0
julia> c = droplower(b)
3×3 SparseMatrixCSC{Float64,Int64} with 2 stored entries:
[1, 1] = 0.5
[1, 2] = 1.0
julia> full(Symmetric(c)) # note this is symmetric although c isn't
3×3 Array{Float64,2}:
0.5 1.0 0.0
1.0 0.0 0.0
0.0 0.0 0.0
Operations on a SparseMatrixCSC often need to be customized for maximum efficiency. So, to get from a sparse matrix A to a symmetric sparse matrix with the same upper part, here is a custom version (it is a bit cryptic, but it works):
function symmetrize(A::SparseMatrixCSC)
    m,n = size(A)
    m == n || error("argument expected to be square matrix")
    rows = rowvals(A) ; vals = nonzeros(A)
    a = zeros(Int,n) ; b = zeros(Int,n) ; c = 0
    # first pass: count the entries each column will hold
    for i=1:n
        for j in nzrange(A, i)
            if rows[j]>=i
                if rows[j]==i a[i] += 1 ; c += 1 ; end
                break
            end
            a[i] += 1 ; b[rows[j]] += 1 ; c += 2
        end
    end
    # nothing stored on or above the diagonal: return an empty sparse matrix
    c == 0 && return spzeros(eltype(A), n, n)
    ncolptr = Vector{Int}(n+1)
    nrows = Vector{Int}(c) ; nvals = Vector{eltype(A)}(c)
    idx = 1
    # second pass: fill in the upper entries and their mirrored counterparts
    for i=1:n
        ncolptr[i] = idx
        if a[i]==0 a[i] = idx ; idx += b[i] ; continue ; end
        for j in (0:a[i]-1)+first(nzrange(A, i))
            nvals[idx] = vals[j] ; nrows[idx] = rows[j] ; idx += 1
            rows[j] >= i && break
            nvals[a[rows[j]]] = vals[j] ; nrows[a[rows[j]]] = i
            a[rows[j]] += 1
        end
        a[i] = idx ; idx += b[i]
    end
    ncolptr[n+1] = idx
    return SparseMatrixCSC(n, n, ncolptr, nrows, nvals)
end
And a sample run:
julia> f = sprand(5,5,0.2)
5×5 SparseMatrixCSC{Float64,Int64} with 5 stored entries:
[1, 1] = 0.981579
[3, 1] = 0.330961
[5, 1] = 0.527683
[4, 5] = 0.196898
[5, 5] = 0.579006
julia> full(f)
5×5 Array{Float64,2}:
0.981579 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0
0.330961 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.196898
0.527683 0.0 0.0 0.0 0.579006
julia> full(symmetrize(f))
5×5 Array{Float64,2}:
0.981579 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 0.196898
0.0 0.0 0.0 0.196898 0.579006
This version should be faster than the others, but this still needs to be benchmarked (and some @inbounds added in the for loops).
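For readers who find the CSC bookkeeping hard to follow, the underlying operation (keep the entries on or above the diagonal, mirror them below it) can be sketched in a few lines on a coordinate-style representation. This is a Python sketch using a dict keyed by (row, column), not the Julia implementation above, and the function name symmetrize_upper is an assumption made for the example.

```python
def symmetrize_upper(entries):
    """Mirror the upper triangle (row <= col) of a sparse matrix.

    `entries` maps (row, col) -> value. Entries below the diagonal
    are dropped, as in the droplower/symmetrize functions above.
    """
    out = {}
    for (i, j), v in entries.items():
        if i > j:
            continue        # lower-triangular entry: discard
        out[(i, j)] = v
        out[(j, i)] = v     # mirrored entry (same key when i == j)
    return out


# Stored entries of the sample matrix f from the run above (1-based indices).
f = {
    (1, 1): 0.981579,
    (3, 1): 0.330961,
    (5, 1): 0.527683,
    (4, 5): 0.196898,
    (5, 5): 0.579006,
}

s = symmetrize_upper(f)
for key in sorted(s):
    print(key, s[key])
```

A dict is convenient for showing the idea but is much less compact than CSC storage, so this is not a substitute for the custom Julia version when performance matters.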

How do you generate a regular non-integer sequence in julia?

How are regular, non-integer sequences generated in julia?
I'm trying to get 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
In MATLAB, I would use
0.1:0.1:1
And in R
seq(0.1, 1, by = 0.1)
But I can't find anything except integer sequences in julia (e.g., 1:10). Searching for "sequence" in the docs only gives me information about how strings are sequences.
Similarly to Matlab, but with the difference that 0.1:0.1:1 defines a Range:
julia> typeof(0.1:0.1:1)
Range{Float64} (constructor with 3 methods)
and thus if an Array is needed:
julia> [0.1:0.1:1]
10-element Array{Float64,1}:
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
Unfortunately, this use of Range is only briefly mentioned at this point of the documentation.
Edit: As mentioned in the comments by @ivarne, it is possible to achieve a similar result using linspace:
julia> linspace(.1,1,10)
10-element Array{Float64,1}:
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
but note that the results are not exactly the same due to rounding differences:
julia> linspace(.1,1,10)==[0.1:0.1:1]
false
The original answer is now deprecated. You should use collect() to generate a sequence.
## In Julia
> collect(0:.1:1)
11-element Array{Float64,1}:
0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
## In R
> seq(0, 1, .1)
[1] 0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
They are generated the same way as in Matlab
julia> sequence = 0:.1:1
0.0:0.1:1.0
Alternatively, you can use the range() function, which allows you to specify the length, step size, or both
julia> range(0, 1, length = 5)
0.0:0.25:1.0
julia> range(0, 1, step = .01)
0.0:0.01:1.0
julia> range(0, step = .01, length = 5)
0.0:0.01:0.04
You can still do all of the things you would normally do with a vector, e.g. indexing
julia> sequence[4]
0.3
math and stats...
julia> sum(sequence)
5.5
julia> using Statistics
julia> mean(sequence)
0.5
This will (in most cases) work the same way as a vector, but nothing is actually allocated. It can be tempting to materialize the vector, but in most cases you shouldn't (it is less performant). This works because
julia> sequence isa AbstractArray
true
If you truly need the vector, you can collect(), splat (...) or use a comprehension:
julia> v = collect(sequence)
11-element Array{Float64,1}:
0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
julia> v == [sequence...] == [x for x in sequence]
true
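To make the "nothing is actually allocated" point concrete, here is a minimal Python sketch of a lazy arithmetic sequence that stores only start, step and length and computes each element on demand. The class name FloatRange is hypothetical. It uses the naive formula start + i*step, whereas Julia's real range types use more careful arithmetic, so the last digits can differ from Julia's output. Indexing here is 0-based, unlike Julia.

```python
import math


class FloatRange:
    """Lazy arithmetic sequence: element i is start + i * step (0-based)."""

    def __init__(self, start, step, length):
        self.start = start
        self.step = step
        self.length = length

    def __len__(self):
        return self.length

    def __getitem__(self, i):
        if not 0 <= i < self.length:
            raise IndexError(i)
        # Computed from the index, so rounding errors do not accumulate.
        return self.start + i * self.step


seq = FloatRange(0.0, 0.1, 11)   # plays the role of 0:.1:1

print(len(seq))                  # only three numbers are stored
print(seq[3])                    # close to 0.3
print(sum(seq))                  # iteration works without a stored list
v = list(seq)                    # materialize only when a real list is needed
print(len(v))
```

Computing each element from its index, rather than repeatedly adding the step, keeps the rounding error per element small, which is one reason range-like objects are preferable to hand-rolled accumulation loops.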

Connect two matrices by columns and extract a submatrix

I have two matrices (e.g., A and B). I would like to extract columns of B based on the order of A's first column:
For example
matrix A
name score
a 0.1
b 0.2
c 0.1
d 0.6
matrix B
a d b c g h
0.1 0.2 0.3 0.4 0.6 0.2
0.2 0.1 0.4 0.7 0.1 0.1
...
I want matrix B to look like this at the end
matrix B_modified
a b c d
0.1 0.3 0.4 0.2
0.2 0.4 0.7 0.1
Can this be done in either Perl or R? Thanks a lot in advance.
I've no idea what problems you're facing. Here's how I've done it.
## get data as matrix
a <- read.table(header=TRUE, text="name score
a 0.1
b 0.2
c 0.1
d 0.6", stringsAsFactors=FALSE) # load directly as characters
b <- read.table(header=TRUE, text="a d b c g h
0.1 0.2 0.3 0.4 0.6 0.2
0.2 0.1 0.4 0.7 0.1 0.1", stringsAsFactors=FALSE)
a <- as.matrix(a)
b <- as.matrix(b)
Now subset to get your final result:
b[, a[, "name"]]
# a b c d
# [1,] 0.1 0.3 0.4 0.2
# [2,] 0.2 0.4 0.7 0.1
The error:
[.data.frame(b, , a[, "name"]) : undefined columns selected
means that you are trying to select a column that is listed in a$name but does not exist in b. One solution is to use intersect with colnames(b). This also converts the factor to a string, and you get the columns in the right order.
b[, intersect(a[, "name"],colnames(b))] ## the order is important here
For example, I tested this with the following data:
b <- read.table(text='
a d b c
0.1 0.2 0.3 0.4
0.2 0.1 0.4 0.7',header=TRUE)
a <- read.table(text='name score
a 0.1
z 0.5
c 0.1
d 0.6',header=TRUE)
b[, intersect(a[, "name"],colnames(b))]
a c d
1 0.1 0.4 0.2
2 0.2 0.7 0.1
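The intersect approach (keep only the wanted names that actually exist, in the order of the wanted list) can be sketched in Python as follows. The helper select_columns is hypothetical, and the data mirrors the test data above, where z is requested but missing from b.

```python
def select_columns(header, rows, wanted):
    """Return the columns named in `wanted` that exist in `header`.

    The result follows the order of `wanted`, like
    intersect(a[, "name"], colnames(b)) in the R answer.
    """
    position = {name: i for i, name in enumerate(header)}
    keep = [name for name in wanted if name in position]
    return keep, [[row[position[name]] for name in keep] for row in rows]


header = ["a", "d", "b", "c"]
rows = [
    [0.1, 0.2, 0.3, 0.4],
    [0.2, 0.1, 0.4, 0.7],
]
wanted = ["a", "z", "c", "d"]    # "z" has no matching column

keep, selected = select_columns(header, rows, wanted)
print(keep)
for row in selected:
    print(row)
```

Silently skipping missing names is a design choice. If a missing column should be an error instead, compare keep with wanted and raise when they differ.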
If your data originates as an R data structure then it would be perverse to export it and solve this problem using Perl. However, if you have text files that look like the data you have shown, then here is a Perl solution for you.
I have split the output on spaces. That can be changed very simply if necessary.
use strict;
use warnings;
use autodie;
sub read_file {
    my ($name) = @_;
    open my $fh, '<', $name;
    my @data = map [ split ], <$fh>;
    \@data;
}
my $matrix_a = read_file('MatrixA.txt');
my @fields = map $matrix_a->[$_][0], 1 .. $#$matrix_a;
my $matrix_b = read_file('MatrixB.txt');
my @headers = @{$matrix_b->[0]};
my @indices = map {
    my $label = $_;
    grep $headers[$_] eq $label, 0..$#headers
} @fields;
for my $row (0 .. $#$matrix_b) {
    print join(' ', map $matrix_b->[$row][$_], @indices), "\n";
}
output
a b c d
0.1 0.3 0.4 0.2
0.2 0.4 0.7 0.1

modulus bug in R [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Why are these numbers not equal?
Just noticed this bug in R. I'm guessing it's the way 0.6 is represented, but anyone know exactly what's going on?
According to R:
0.3 %% 0.2 = 0.1
0.4 %% 0.2 = 0
0.5 %% 0.2 = 0.1
**0.6 %% 0.2 = 0.2**
0.7 %% 0.2 = 0.1
0.8 %% 0.2 = 0
What's going on?
In addition to @Joshua Ulrich's comment, from ?'%%':
%% and x %/% y can be used for non-integer y, e.g. 1 %/% 0.2, but the results are subject to representation error and so may be platform-dependent. Because the IEC 60559 representation of 0.2 is a binary fraction slightly larger than 0.2, the answer to 1 %/% 0.2 should be 4 but most platforms give 5.
This is also similar to why we get this:
> .1 + .1 + .1 == .3
[1] FALSE
As @Ben Bolker pointed out, you may want to use something like
> 3:8 %% 2 / 10
[1] 0.1 0.0 0.1 0.0 0.1 0.0
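The same representation effect, and two ways around it, can be reproduced in other languages that use binary floating point. The following is a small Python sketch (not R): it shows the floating-point remainder, an exact remainder with the standard-library Fraction type, and the integer-scaling trick from the answer. The variable names are illustrative only.

```python
from fractions import Fraction

# Binary floating point: 0.2 is stored slightly above 0.2, so three
# copies of it no longer fit into the stored value of 0.6.
float_mod = 0.6 % 0.2
print(float_mod)        # just under 0.2 instead of 0

# Exact rational arithmetic gives the mathematically expected result.
exact_mod = Fraction("0.6") % Fraction("0.2")
print(exact_mod)

# Integer scaling, as in 3:8 %% 2 / 10: take the remainder on integers,
# then scale back down.
scaled = [(n % 2) / 10 for n in range(3, 9)]
print(scaled)
```

Working in integer units (tenths here) and converting back at the end is usually the simplest fix when the inputs are known decimal multiples.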
