How to compute the histogram of a 2d list in BSV? - bluespec

How would I create a module that will accept a 2d array of integers and return only the items above a certain count?
interface TestIfc;
    method Action putInput(??? _input);
    method ActionValue#(???) getOutput;
endinterface

module mkTest (TestIfc);
    Vector#(PE_N, FIFO#(Vector#(Max_itemN, Item))) inputQ <- replicateM(mkFIFO);
    Vector#(PE_N, FIFO#(Vector#(Max_itemN, Item))) accQ <- replicateM(mkFIFO);
    FIFO#(Vector#(Max_itemN, Item)) mergeQ <- mkFIFO;
    FIFO#(Tuple2#(Vector#(Max_itemN, Item), Bit#(32))) outputQ <- mkFIFO;

    rule count;
        let row1 = input???.first; input???.deq;
        Vector#(???, Item) sum = replicate(0);
        for (Bit#(8) j = 0; j < ???; j = j + 1)
        begin
            let n = row1[j];
            sum[n] = sum[n] + 1;
        end
        accumulator.enq(sum);
    endrule

    rule filter;
        accumulator.deq;
        let acc_t = accumulator.first;
        Bit#(8) cnt = 0;
        Vector#(???, Item) result = replicate(0);
        for (Bit#(8) i = 0; i < ???; i = i + 1)
        begin
            if (acc_t[i] > SOME_NUMBER)
            begin
                result[cnt] = i;
                cnt = cnt + 1;
            end
        end
    endrule
...
I'm thinking of maybe splitting my input data vertically or horizontally to do more work efficiently, but right now I'm not confident in creating even a simple version and I'd like some help.
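For reference, here is a minimal Python sketch of the computation I want the module to perform (the function name and the threshold value are just illustrative, not part of the BSV interface above):

from collections import Counter

def histogram_filter(rows, threshold):
    # Count every item in a 2D list and return the items seen more than threshold times.
    counts = Counter()
    for row in rows:
        counts.update(row)   # accumulate a histogram across all rows
    return [item for item, c in counts.items() if c > threshold]

# Items 1 and 3 appear more than twice:
print(histogram_filter([[1, 2, 3], [1, 3, 3], [1, 4, 5]], 2))  # [1, 3]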

Related

Find the number of possible sums which add to N using (1,...,K)

I have the following problem to solve: given a number N and 1 <= k <= N, count the number of possible sums of (1,...,k) which add to N. Repeated terms are allowed (e.g. if N=3 and k=2, (1,1,1) is a valid sum), but permutations must not be counted (e.g. if N=3 and k=2, count (1,2) and (2,1) as a single solution). I have implemented the recursive Python code below, but I'd like to find a better solution (maybe with dynamic programming?). It seems similar to the triple step problem, but with the extra constraint of not counting permutations.
def find_num_sums_aux(n, min_k, max_k):
    # base case
    if n == 0:
        return 1
    count = 0
    # due to the lower bound min_k, we evaluate only ordered solutions and prevent permutations
    for i in range(min_k, max_k + 1):
        if n - i >= 0:
            count += find_num_sums_aux(n - i, i, max_k)
    return count

def find_num_sums(n, k):
    count = find_num_sums_aux(n, 1, k)
    return count
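As a quick sanity check, using the example values from the question:

print(find_num_sums(3, 2))  # 2 -> (1,1,1) and (1,2)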
This is a standard dynamic programming problem (a counting variant of the subset sum problem).
Let's define the function f(i,j) as the number of ways you can get the sum j using a subset of the numbers (1...i); the answer to your problem is then f(k,n).
For each number x in the range (1...i), x might be part of the sum j or might not, so we need to count both possibilities.
Note: f(i,0) = 1 for any i, which means that you can get the sum 0 in exactly one way: by not taking any number from the range (1...i).
Here is the code written in C++:
#include <iostream>
using namespace std;

int main() {
    int n = 10;
    int k = 7;
    int f[8][11];
    // initializing the array with zeroes
    for (int i = 0; i <= k; i++)
        for (int j = 0; j <= n; j++)
            f[i][j] = 0;
    f[0][0] = 1;
    for (int i = 1; i <= k; i++) {
        for (int j = 0; j <= n; j++) {
            if (j == 0)
                f[i][j] = 1;
            else {
                f[i][j] = f[i - 1][j];                   // without adding i to the sum j
                if (j - i >= 0)
                    f[i][j] = f[i][j] + f[i - 1][j - i]; // adding i to the sum j
            }
        }
    }
    cout << f[k][n] << endl; // print f(k,n)
    return 0;
}
Update
To handle the case where elements can be repeated (e.g. (1,1,1) gives the sum 3), you just need to allow picking the same element multiple times by changing the following line of code:
f[i][j] = f[i][j] + f[i - 1][j - i]; // adding i to the sum
to this:
f[i][j] = f[i][j] + f[i][j - i];
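A minimal Python sketch of the same table with the repetition-friendly recurrence from the update (the function name and the checks are my own, not from the answer):

def count_sums(n, k):
    # f[i][j] = ways to form sum j using values 1..i, each value usable any number of times
    f = [[0] * (n + 1) for _ in range(k + 1)]
    for i in range(k + 1):
        f[i][0] = 1                      # the empty sum
    for i in range(1, k + 1):
        for j in range(1, n + 1):
            f[i][j] = f[i - 1][j]        # skip value i
            if j - i >= 0:
                f[i][j] += f[i][j - i]   # use value i (possibly again)
    return f[k][n]

print(count_sums(3, 2))   # 2 -> (1,1,1) and (1,2)
print(count_sums(10, 7))  # same value the C++ example prints for f(k,n)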

How to do it recursively if function depends on only one parameter

I need to do it with recursion, but the problem is that the function depends on only ONE parameter, while inside it depends on two (k and n). Also, how do I find the minimum value if it returns only one value?
The function is: F(n) = min over 1 <= k <= n of ( F(k-1) + F(n-k) + n ).
I've already tried picking a random k, but I don't think that is really a good idea.
F1(int n) {
    Random random = new Random();
    int k = random.Next(1, 10);
    if (1 <= k && k <= n) {
        return Math.Min(F1(k - 1) + F1(n - k) + n);
    } else {
        return 0;
    }
}
You need to make a loop traversing all k values in range 1..n. Something like this:
F1(int n) {
    if (n == 0)
        return ????; // what is the starting value?
    int minn = F1(0) + F1(n - 1) + n;
    for (int k = 2; k <= n; k++)
        minn = Math.Min(minn, F1(k - 1) + F1(n - k) + n);
    return minn;
}
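A memoized Python sketch of the same loop-over-k idea; the base case F(0) = 0 is an assumption here, since the original post leaves the starting value unspecified:

from functools import lru_cache

@lru_cache(maxsize=None)
def f(n):
    # Assumed base case: F(0) = 0 (the question never states the starting value).
    if n == 0:
        return 0
    # Try every k in 1..n and keep the minimum of F(k-1) + F(n-k) + n.
    return min(f(k - 1) + f(n - k) + n for k in range(1, n + 1))

print(f(5))  # 11 under the assumed base case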

Challenge with vector: how to split a vector based on max/min conditions

I've recently come across the following problem:
Let's say I have a vector of random length (L) of 0s and 1s randomly distributed (for example [0,1,1,1,0,0,1,0]). I need to split the vector into two sub-vectors at index K so that the following conditions hold:
the left sub-vector must contain the maximum number of elements from K in reverse order such that the number of 0s is greater than or equal to the number of 1s;
the right sub-vector must contain the maximum number of elements starting from K+1 such that the number of 1s is greater than or equal to the number of 0s.
For example, for [1,1,1,1,1,1,1,1,1,0,1,0,0,0,0,0,0,0,0] the split is at index 9: the left vector is [1,0] and the right vector is [0,1].
I wrote the following solution, but its complexity is O(L^2). I think there could be a solution with worst-case complexity O(L), but I cannot find anything that helps me. Any idea? Thanks.
var max = 0;
var kMax = -1;
var firstZeroFound = false;
for (var i = 0; i < testVector.Length - 1; i++)
{
    if (!firstZeroFound)
    {
        if (testVector[i]) continue;
        firstZeroFound = true;
    }
    var maxZero = FindMax(testVector, i, -1, -1, false);
    if (maxZero == 0) continue;
    var maxOne = FindMax(testVector, i + 1, testVector.Length, 1, true);
    if (maxOne == 0) continue;
    if ((maxZero + maxOne) <= max)
        continue;
    max = maxOne + maxZero;
    kMax = i;
    if (max == testVector.Length)
        break;
}
Console.Write("The result is {0}", kMax);

int FindMax(bool[] v, int start, int end, int increment, bool maximize)
{
    var max = 0;
    var sum = 0;
    var count = 0;
    var i = start;
    while (i != end)
    {
        count++;
        if (v[i])
            sum++;
        if (maximize)
        {
            if (sum * 2 >= count)
                max = count;
        }
        else if (sum * 2 <= count)
        {
            max = count;
        }
        i += increment;
    }
    return max;
}
I think you should look at rle.
y <- c(1,1,1,1,1,1,1,1,1,0,1,0,0,0,0,0,0,0,0)
z <- rle(y)
d <- cbind(z$values, z$lengths)
d
#      [,1] [,2]
# [1,]    1    9
# [2,]    0    1
# [3,]    1    1
# [4,]    0    8
Basically, rle computes the lengths of the consecutive runs of 0's and 1's.
From here things should be easier for you.
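The same run-length step is easy to reproduce outside R; here is a minimal Python sketch using itertools.groupby (my own stand-in for rle, not something the answer uses):

from itertools import groupby

def rle(values):
    # Return (value, run_length) pairs, like R's rle().
    return [(v, sum(1 for _ in run)) for v, run in groupby(values)]

y = [1,1,1,1,1,1,1,1,1,0,1,0,0,0,0,0,0,0,0]
print(rle(y))  # [(1, 9), (0, 1), (1, 1), (0, 8)]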

R: How to compute correlation between rows of a matrix without having to transpose it?

I have a big matrix and am interested in computing the correlation between the rows of the matrix. Since the cor method computes correlation between the columns of a matrix, I am transposing the matrix before calling cor. But since the matrix is big, transposing it is expensive and is slowing down my program. Is there a way to compute the correlations among the rows without having to take transpose?
EDIT: Thanks for the responses. I thought I'd share some findings. My input matrix is 16 rows by 239766 cols and comes from a .mat file. I wrote C# code to do the same thing using the csmatio library. It looks like this:
foreach (var file in Directory.GetFiles(path, interictal_pattern))
{
    var reader = new MatFileReader(file);
    var mla = reader.Data[0] as MLStructure;
    convert(mla.AllFields[0] as MLNumericArray<double>, data);
    double sum = 0;
    for (var i = 0; i < 16; i++)
    {
        for (var j = i + 1; j < 16; j++)
        {
            sum += cor(data, i, j);
        }
    }
    var avg = sum / 120;
    if (++count == 10)
    {
        var t2 = DateTime.Now;
        var t = t2 - t1;
        Console.WriteLine(t.TotalSeconds);
        break;
    }
}

static double[][] createArray(int rows, int cols)
{
    var ans = new double[rows][];
    for (var row = 0; row < rows; row++)
    {
        ans[row] = new double[cols];
    }
    return ans;
}

static void convert(MLNumericArray<double> mla, double[][] M)
{
    var rows = M.Length;
    var cols = M[0].Length;
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++)
            M[i][j] = mla.Get(i, j);
}

static double cor(double[][] M, int i, int j)
{
    var count = M[0].Length;
    double sum1 = 0, sum2 = 0;
    for (int ctr = 0; ctr < count; ctr++)
    {
        sum1 += M[i][ctr];
        sum2 += M[j][ctr];
    }
    var mu1 = sum1 / count;
    var mu2 = sum2 / count;
    double numerator = 0, sumOfSquares1 = 0, sumOfSquares2 = 0;
    for (int ctr = 0; ctr < count; ctr++)
    {
        var x = M[i][ctr] - mu1;
        var y = M[j][ctr] - mu2;
        numerator += x * y;
        sumOfSquares1 += x * x;
        sumOfSquares2 += y * y;
    }
    return numerator / Math.Sqrt(sumOfSquares1 * sumOfSquares2);
}
This took 22.22 s for 10 files, or 2.22 s/file.
Then I profiled my R code:
ptm = proc.time()
for (file in files)
{
    i = i + 1
    mat = readMat(paste(path, file, sep=""))
    a = t(mat[[1]][[1]])
    C = cor(a)
    correlations[i] = mean(C[lower.tri(C)])
}
print(proc.time() - ptm)
To my surprise, it runs faster than the C# version: 5.7 s per 10 files, or 0.6 s/file (an improvement of almost 4x!). The bottleneck in C# is the methods inside the csmatio library that parse double values from the input stream.
And if I do not convert the csmatio classes into a double[][], the C# code runs extremely slowly (an order of magnitude slower, ~20-30 s/file).
Seeing that this problem arises from a data input issue whose details are not stated (and only hinted at in a comment), I will assume this is a comma-delimited file of unquoted numbers with the number of columns = Ncol. This does the transposition on input:
in.mat <- matrix( scan("path/to/the_file/fil.txt", what = numeric(0), sep = ","),
                  ncol = Ncol, byrow = TRUE)
cor(in.mat)
One dirty work-around would be to apply the cor function row-wise and build the correlation matrix from the results. You could try whether this is any more efficient (which I doubt, though you could fine-tune it by not double-computing everything, including the redundant diagonal cases):
# Apply 2-fold nested row-wise functions
set.seed(1)
dat <- matrix(rnorm(1000), nrow=10)
cormat <- apply(dat, MARGIN=1, FUN=function(z) apply(dat, MARGIN=1, FUN=function(y) cor(z, y)))
cormat[1:3,1:3] # Show few first
# [,1] [,2] [,3]
#[1,] 1.000000000 0.002175792 0.1559263
#[2,] 0.002175792 1.000000000 -0.1870054
#[3,] 0.155926259 -0.187005418 1.0000000
Generally, though, I would expect the transpose to have a really efficient implementation, so it's hard to imagine it being the bottleneck. But you could also dig through the implementation of the 'cor' function and call the underlying C function directly, after first making sure your rows are suitable. Type 'cor' in the terminal to see the implementation, which is mostly a wrapper that makes the input suitable for the C function:
# Row with C-call from the implementation of 'cor':
# if (method == "pearson")
# .Call(C_cor, x, y, na.method, FALSE)
You can use outer:
outer(seq(nrow(mat)), seq(nrow(mat)),
Vectorize(function(x, y) cor(mat[x , ], mat[y , ])))
where mat is the name of your matrix.
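For readers outside R, the same row-wise correlation is available directly in numpy (an assumption on my part, not something the question uses); np.corrcoef treats each row as a variable by default, so no transpose is needed:

import numpy as np

# 16 rows (variables) x many columns (observations), mirroring the question's shape
data = np.random.rand(16, 1000)

C = np.corrcoef(data)  # 16x16 correlation matrix between rows (rowvar=True is the default)
avg = C[np.tril_indices_from(C, k=-1)].mean()  # mean of the strictly lower triangle, like mean(C[lower.tri(C)])
print(C.shape, avg)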

Converting a decimal to a mixed-radix (base) number

How do you convert a decimal number to mixed radix notation?
I guess that, given an array of the bases and the decimal number as input, it should output an array with the value of each column.
Pseudocode:
bases = [24, 60, 60]
input = 86462 # One day, 1 minute, 2 seconds
output = []
for base in reverse(bases)
    output.prepend(input mod base)
    input = input div base # div is integer division (round down)
Number -> set:
factors = [52, 7, 24, 60, 60, 1000]
value = 662321
for i in n-1..0
    res[i] = value mod factors[i]
    value = value div factors[i]
And the reverse:
If you have a number like 32(52), 5(7), 7(24), 45(60), 15(60), 500(1000) and you want it converted to decimal, start from the most significant digit and repeatedly multiply the running result by the factor at the current position, then add that digit, for i = 0..n-1:
values = [32, 5, 7, 45, 15, 500]
factors = [52, 7, 24, 60, 60, 1000]
res = 0
for i in 0..n-1
    res = res * factors[i] + values[i]
And you have the number.
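A short runnable Python sketch of both directions described above (function names are mine; the steps are the same mod/div loop and Horner-style reversal):

def to_mixed_radix(value, factors):
    # Decimal -> mixed-radix digits (least significant position last).
    digits = [0] * len(factors)
    for i in range(len(factors) - 1, -1, -1):  # walk from the least significant position up
        digits[i] = value % factors[i]
        value //= factors[i]
    return digits

def from_mixed_radix(digits, factors):
    # Mixed-radix digits -> decimal, Horner-style.
    value = 0
    for d, f in zip(digits, factors):
        value = value * f + d
    return value

factors = [24, 60, 60]  # hours, minutes, seconds
print(to_mixed_radix(86462, factors))         # [0, 1, 2] -- the extra day overflows the scale (see the note at the end about maxing out the scale)
print(from_mixed_radix([5, 21, 3], factors))  # 19263, i.e. 5 h 21 min 3 s as in the Java example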
In Java you could do
public static int[] Number2MixedRadix(int[] base, int number) throws Exception {
//NB if the max number you want # a position is say 3 then the base# tha position
//in your base array should be 4 not 3
int[] RadixFigures = new int[base.length];
int[] PositionPowers = new int[base.length];
PositionPowers[base.length-1] = 1;
for (int k = base.length-2,pow = 1; k >-1; k--){
pow*=base[k+1];
PositionPowers[k]=pow;
}for (int k = 0; k<base.length; k++){
RadixFigures[k]=number/PositionPowers[k];
if(RadixFigures[k]>base[k])throw new Exception("RadixFigure#["+k+"] => ("+RadixFigures[k]+") is > base#["+k+"] => ("+base[k]+") | ( number is Illegal )");
number=number%PositionPowers[k];
}return RadixFigures;
}
Example
// e.g. mixed-radix base for 1 day
int[] base = new int[]{1, 24, 60, 60}; // max-days, max-hours, max-minutes, max-seconds
int[] MixedRadix = Number2MixedRadix(base, 19263); // 19263 seconds
// this would give [0, 5, 21, 3] => 0 days, 5 hrs, 21 mins, 3 secs
Reversal
public static int MixedRadix2Number(int[] RadixFigures, int[] base) throws Exception {
    if (RadixFigures.length != base.length)
        throw new Exception("RadixFigures.length must be = base.length");
    int number = 0;
    int[] PositionPowers = new int[base.length];
    PositionPowers[base.length - 1] = 1;
    for (int k = base.length - 2, pow = 1; k > -1; k--) {
        pow *= base[k + 1];
        PositionPowers[k] = pow;
    }
    for (int k = 0; k < base.length; k++) {
        number += RadixFigures[k] * PositionPowers[k];
        if (RadixFigures[k] > base[k])
            throw new Exception("RadixFigure#[" + k + "] => (" + RadixFigures[k] + ") is > base#[" + k + "] => (" + base[k] + ") | ( number is Illegal )");
    }
    return number;
}
I came up with a slightly different, and probably not as good, method as the others here, but I thought I'd share it anyway:
var theNumber = 313732097;
//           ms   s   m   h   d
var bases = [1000, 60, 60, 24, 365];
var placeValues = []; // initialise an array
var currPlaceValue = 1;
for (var i = 0, l = bases.length; i < l; ++i) {
    placeValues.push(currPlaceValue);
    currPlaceValue *= bases[i];
}
console.log(placeValues);

// this isn't relevant for this specific problem, but might
// be useful in related problems.
var maxNumber = currPlaceValue - 1;

var output = new Array(placeValues.length);
for (var v = placeValues.length - 1; v >= 0; --v) {
    output[v] = Math.floor(theNumber / placeValues[v]);
    theNumber %= placeValues[v];
}
console.log(output);
// [97, 52, 8, 15, 3] --> 3 days, 15 hours, 8 minutes, 52 seconds, 97 milliseconds
I tried a few of the examples above and found an edge case they don't cover: if you max out your scale, you need to prepend the result from the last step.
def intToMix(number, radix=[10]):
    mixNum = []
    radix.reverse()
    for i in range(0, len(radix)):
        mixNum.append(number % radix[i])
        number //= radix[i]
    mixNum.append(number)
    mixNum.reverse()
    radix.reverse()
    return mixNum

num = 60*60*24*7
radix = [7,24,60,60]
tmp1 = intToMix(num, radix)
