Enumerated values with a single array of numbers without looping - enumerate

I am writing some functionality for a visual node-based CAD program that does not allow me to loop, so I need a workaround to enumerate a list of numbers. I am an architect with very little programming experience, so any help would be great.
I have an array of numbers (numArray) coming in as 0,1,2,3,4... (the first column below). I need to convert each number into its counterpart in columns 1, 2, 3 and 4 without using any loops or nested loops.
numArray 1 2 3 4
-----------
0 = 0|0|0|0
1 = 0|0|0|1
2 = 0|0|0|2
3 = 0|0|0|3
4 = 0|0|1|0
5 = 0|0|1|1
6 = 0|0|1|2
7 = 0|0|1|3
8 = 0|0|2|0
9 = 0|0|2|1
10= 0|0|2|2
12= 0|0|2|3
13= 0|0|3|0
14= 0|0|3|1
15= 0|0|3|2
16= 0|1|3|3
17= 0|1|0|0
18= 0|1|0|1
19= 0|1|0|2
20= 0|1|0|3
21= 0|1|1|0
22= 0|1|1|1
23= 0|1|1|2
24= 0|1|1|3
I have figured out column 4 by implementing the following:
int column4 = numArray % 4;
This works and produces 0,1,2,3,0,1,2,3,... which is great. However, I am not sure how to use the incoming numArray to produce columns 3, 2 and 1. Again, I have very little programming experience, so any help would be great.

You're converting the input to base 4 notation, so this will do the job:
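#include <stdio.h>

int main(void)
{
    int input[] = {0,1,2,3,4,5,6,7,8,9,10,12,13,14,15,16,17,18,19,20,21,22,23,24};
    /* The loop only walks the sample inputs; each column is computed without looping. */
    for (size_t i = 0; i < sizeof(input)/sizeof(input[0]); i++)
    {
        int c4 = input[i] % 4;          /* base-4 digit for 4^0 */
        int c3 = (input[i] / 4) % 4;    /* base-4 digit for 4^1 */
        int c2 = (input[i] / 16) % 4;   /* base-4 digit for 4^2 */
        int c1 = (input[i] / 64) % 4;   /* base-4 digit for 4^3 */
        printf("%d = %d\t%d\t%d\t%d\n", input[i], c1, c2, c3, c4);
    }
    return 0;
}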
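The same idea can be sketched as a loop-free function, which may be closer to what a node-based tool needs. This is only an illustration: the name `base4_columns` is made up here, and it assumes non-negative inputs below 256 so that four base-4 digits are enough. Integer division by 4, 16 and 64 shifts the wanted digit into the lowest position, and `% 4` then isolates it.

```python
def base4_columns(n):
    """Return the four base-4 digits of n as (c1, c2, c3, c4), most significant first.

    Hypothetical helper for illustration; assumes 0 <= n < 256.
    """
    c4 = n % 4            # digit for 4^0
    c3 = (n // 4) % 4     # digit for 4^1
    c2 = (n // 16) % 4    # digit for 4^2
    c1 = (n // 64) % 4    # digit for 4^3
    return c1, c2, c3, c4


# The digits rebuild the original number, which confirms the conversion.
c1, c2, c3, c4 = base4_columns(6)
assert (c1, c2, c3, c4) == (0, 0, 1, 2)
assert c1 * 64 + c2 * 16 + c3 * 4 + c4 == 6
```

Each column depends only on the input number, so the four expressions can be evaluated independently for every element of the array.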


Counting frequency of amino acids at each position in multiple-sequence alignments

I'm wondering if anyone knows any tools which allow me to count the frequency of amino acids at any specific position in a multiple-sequence alignment.
For example if I had three sequences:
Species 1 - MMRSA
Species 2 - MMLSA
Species 3 - MMRTA
I'd like a way to search by position and get the following output:
Position 1 - M = 3;
Position 2 - M = 3;
Position 3 - R = 2, L = 1;
Position 4 - S = 2, T = 1;
Position 5 - A = 3.
Thanks! I'm familiar with R and Linux, but if there's any other software that can do this I'm sure I can learn.
Using R:
x <- read.table(text = "Species 1 - MMRSA
Species 2 - MMLSA
Species 3 - MMRTA")
ixCol = 1
table(sapply(strsplit(x$V4, ""), "[", ixCol))
# M
# 3
ixCol = 4
table(sapply(strsplit(x$V4, ""), "[", ixCol))
# S T
# 2 1
Depending on the input file format, there are likely purpose-built Bioconductor packages or functions for this.
That is really easy to parse, you can use any language of choice.
Here is an example in Python using a dict and Counter to assemble the data in a simple object.
from collections import defaultdict, Counter
msa = '''
Species 1 - MMRSA
Species 2 - MMLSA
Species 3 - MMRTA
'''
r = defaultdict(list) #dictionary having the sequences index as key and the list of aa found at that index as value
for line in msa.split('\n'):
    line = line.strip()
    if line:
        sequence = line.split(' ')[-1]
        for i, aa in enumerate(list(sequence)):
            r[i].append(aa)
count = {k:Counter(v) for k,v in r.items()}
print(count)
#{0: Counter({'M': 3}), 1: Counter({'M': 3}), 2: Counter({'R': 2, 'L': 1}), 3: Counter({'S': 2, 'T': 1}), 4: Counter({'A': 3})}
To print the output as you specified:
for k, v in count.items():
    print(f'Position {k+1} :', end=' ') #add 1 to start counting from 1 instead of 0
    for aa, c in v.items():
        print(f'{aa} = {c};', end=' ')
    print()
It prints:
Position 1 : M = 3;
Position 2 : M = 3;
Position 3 : R = 2; L = 1;
Position 4 : S = 2; T = 1;
Position 5 : A = 3;
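A more compact variant of the same counting idea is sketched below. It assumes the sequences have already been extracted as plain strings of equal length (as they would be in an alignment); the name `column_counts` is made up for this example. `zip(*sequences)` walks the alignment column by column, and note that it silently stops at the shortest sequence if the lengths differ.

```python
from collections import Counter


def column_counts(sequences):
    """Return one Counter per alignment column (hypothetical helper).

    Assumes all sequences have the same length; zip() truncates otherwise.
    """
    return [Counter(column) for column in zip(*sequences)]


seqs = ["MMRSA", "MMLSA", "MMRTA"]
counts = column_counts(seqs)

# Look up a single position (1-based, as in the question).
position = 3
print(f"Position {position}:", dict(counts[position - 1]))
```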

What are the different versions of arithmetic swap and why do they work?

I think we should all be familiar with the arithmetic swap algorithm, which swaps two variables without using a third variable. I found out that there are two variations of the arithmetic swap. Please consider the following:
Variation 1.
int a = 2;
int b = 3;
a = a + b;
b = a - b;
a = a - b;
Variation 2.
int a = 2;
int b = 3;
b = b - a;
a = a + b;
b = a - b;
I want to know why there are two distinct variations of the arithmetic swap and why they work. Are there other variations that achieve the same result? How are they related? Is there an elegant mathematical formula that justifies why the arithmetic swap works for all variations? Is there some underlying truth that connects these two variations?
Break each variable out as what it represents:
Variation 1:
a = 2
b = 3
a1 = a + b
b1 = a1 - b = (a + b) - b = a
a2 = a1 - b1 = (a + b) - a = b
Variation 2:
a = 2
b = 3
b1 = b - a
a1 = a + b1 = a + (b - a) = b
b2 = a1 - b1 = b - (b - a) = a
There's no underlying truth other than the fact that the math works out. Remember that each time you do an assignment, it's effectively a new "variable" from the math side.
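The algebra above can be checked mechanically. The sketch below is only an illustration with made-up function names; it runs both variations on a few integer pairs. Python integers have arbitrary precision, so overflow, which can matter for fixed-width integer types in other languages, is not exercised here.

```python
def swap_variation_1(a, b):
    a = a + b   # a now holds the sum of both values
    b = a - b   # sum minus old b leaves old a
    a = a - b   # sum minus old a leaves old b
    return a, b


def swap_variation_2(a, b):
    b = b - a   # b now holds the difference (old b - old a)
    a = a + b   # old a plus the difference gives old b
    b = a - b   # old b minus the difference gives old a
    return a, b


for a, b in [(2, 3), (-5, 7), (0, 0), (10, -4)]:
    assert swap_variation_1(a, b) == (b, a)
    assert swap_variation_2(a, b) == (b, a)
```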

Prime factorization of factorial

Is it possible to find prime factors of factorial without actually calculating the factorial?
My point is to find the prime factors of a factorial, not of an arbitrary big number. The algorithm should skip the step of calculating the factorial and derive the prime factors of n! directly, where n <= 4000.
Calculating the factorial and finding its prime divisors is pretty easy, but my program crashes when the input is greater than n = 22. Therefore I thought it would be convenient to do the whole process without having to calculate the factorial.
function decomp(n){
    var primeFactors = [];
    var fact = 1;
    for (var i = 2; i <= n; i++) {
        fact = fact * i;
    }
    while (fact % 2 === 0) {
        primeFactors.push(2);
        fact = fact/2;
    }
    var sqrtFact = Math.sqrt(fact);
    for (var i = 2; i <= sqrtFact; i++) {
        while (fact % i === 0) {
            primeFactors.push(i);
            fact = fact/i;
        }
    }
    return primeFactors;
}
I don't expect any code or links; examples and a brief outline are enough.
Let's consider an example: 10! = 2^8 * 3^4 * 5^2 * 7^1. I computed that by computing the factors of each number from 2 to 10:
2: 2
3: 3
4: 2,2
5: 5
6: 2,3
7: 7
8: 2,2,2
9: 3,3
10: 2,5
Then I just counted each factor. There are eight 2's (1 in 2, 2 in 4, 1 in 6, 3 in 8, and 1 in 10), four 3's (1 in 3, 1 in 6, and 2 in 9), two 5's (1 in 5, and 1 in 10), and one 7 (in 7).
In terms of writing a program, just keep an array of counters (it only needs to be as large as n, since no prime factor of n! can exceed n) and, for each number from 2 to n, add the count of its factors to the array of counters.
Does that help?
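The counting approach described above could look roughly like this. It is a minimal sketch with a made-up function name, using plain trial division on each number from 2 to n; the factorial itself is never computed, so no large numbers appear.

```python
from collections import Counter


def factorial_prime_factors(n):
    """Return {prime: exponent} for n! by factoring each k in 2..n (illustrative helper)."""
    exponents = Counter()
    for k in range(2, n + 1):
        remaining = k
        divisor = 2
        # Trial division: strip out each divisor as often as it divides.
        while divisor * divisor <= remaining:
            while remaining % divisor == 0:
                exponents[divisor] += 1
                remaining //= divisor
            divisor += 1
        # Whatever is left above 1 is itself a prime factor of k.
        if remaining > 1:
            exponents[remaining] += 1
    return dict(exponents)


print(factorial_prime_factors(10))
```

For n = 10 this reproduces the exponents from the worked example: eight 2's, four 3's, two 5's and one 7.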

Best way to find least standard deviation

I have a spreadsheet where I enter the number of verses in each paragraph of a book.
I manually distribute sequential paragraphs by number of verses, so in the spreadsheet I'll have something like this:
Verses Day
5 1
6 1
3 1
10 2
8 3
4 3
2 3
6 4
3 4
10 5
3 5
2 6
5 6
10 7
= 2,7080128015
By summing the total of verses for each day - in this case, 7 days - I get the standard deviation and try to reduce it for a better distribution of paragraphs.
The question is: what is the best way to find the least standard deviation?
I thought of using brute force to generate all possible combinations, but that is not a good idea as the number of paragraphs increases.
EDIT: The standard deviation is based on the total number of verses for each day, and the days are numbered sequentially. Day 1 has a total of 14 verses, day 2 has 10, and so on.
1 14
2 10
3 14
4 9
5 13
6 7
7 10
= 2,7080128015
Since the total number of verses and the number of days is constant, you want to minimize
$$\sum_{i} \left(\text{avg verse count} - \text{verse count of day } i\right)^2$$
avg verse count is a constant and simply the total number of verses divided by the number of days.
This problem can be solved with a dynamic program over the days. Let us build the partial solution function f(days, paragraph) that gives the minimal sum of squares for distributing paragraphs 0 through "paragraph" over "days" days. We are interested in the last value of this function, that is, the one for all days and all paragraphs.
We can build the function incrementally. Calculating f(1, p) for any p is straightforward, since we just need to take the difference between the average and the verse count of paragraphs 0 through p and square it. Then, for all other days, we can calculate
$$f(d, p) = \min_{i < p} \left[ f(d - 1, i) + \Big(\text{avg verse count} - \sum_{j = i+1}^{p} \text{verse count of paragraph } j\Big)^2 \right]$$
That means, we check the solutions for one day less and fill up the current day with the paragraphs between the previous day's end paragraph and p. While we calculate this function, we keep a pointer to the chosen minimum element (as usual for a dynamic program). When we are done calculating the entire function, we just follow the pointers back to the start, which will give us the partitioning.
The algorithm has a running time of O(d * p^2), where d is the number of days and p is the number of paragraphs.
Example Code
Here is some example C# code that implements the above algorithm:
using System;
using System.Linq;

public static class Program
{
    struct Entry
    {
        public double minCost;
        public int predecessor;
    }

    public static void Main()
    {
        //input data
        int[] versesPerParagraph = { 5, 6, 3, 10, 8, 4, 2, 6, 3, 10, 3, 2, 5, 10 };
        int days = 7;
        //calculate constants
        double avgVerses = (double)versesPerParagraph.Sum() / days;
        //set up DP table (f(d,p))
        int paragraphs = versesPerParagraph.Length;
        Entry[,] dp = new Entry[days, paragraphs];
        //initialize table: the first day takes paragraphs 0 through p
        int verseCount = 0;
        for(int p = 0; p < paragraphs; ++p)
        {
            verseCount += versesPerParagraph[p];
            double diff = avgVerses - verseCount;
            dp[0, p].minCost = diff * diff;
            dp[0, p].predecessor = -1;
        }
        //run dynamic program
        for(int d = 1; d < days; ++d)
        {
            for(int p = d; p < paragraphs; ++p)
            {
                verseCount = 0;
                dp[d, p].minCost = double.MaxValue;
                //try every start paragraph i for the current day
                for(int i = p; i >= d; --i)
                {
                    verseCount += versesPerParagraph[i];
                    double diff = avgVerses - verseCount;
                    double cost = dp[d - 1, i - 1].minCost + diff * diff;
                    if(cost < dp[d, p].minCost)
                    {
                        dp[d, p].minCost = cost;
                        dp[d, p].predecessor = i - 1;
                    }
                }
            }
        }
        //reconstruct the partitioning
        {
            int p = paragraphs - 1;
            for (int d = days - 1; d >= 0; --d)
            {
                int predecessor = dp[d, p].predecessor;
                //calculate number of verses, just to show them
                verseCount = 0;
                for (int i = predecessor + 1; i <= p; ++i)
                    verseCount += versesPerParagraph[i];
                Console.WriteLine($"Day {d} ranges from paragraph {predecessor + 1} to {p} and has {verseCount} verses.");
                p = predecessor;
            }
        }
    }
}
The output is:
Day 6 ranges from paragraph 13 to 13 and has 10 verses.
Day 5 ranges from paragraph 10 to 12 and has 10 verses.
Day 4 ranges from paragraph 9 to 9 and has 10 verses.
Day 3 ranges from paragraph 6 to 8 and has 11 verses.
Day 2 ranges from paragraph 4 to 5 and has 12 verses.
Day 1 ranges from paragraph 2 to 3 and has 13 verses.
Day 0 ranges from paragraph 0 to 1 and has 11 verses.
This partitioning gives a standard deviation of 1.15.
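As a quick cross-check of the two figures, the daily totals can be fed to Python's `statistics.stdev`, which computes the sample standard deviation (dividing by n - 1). That the spreadsheet uses the sample rather than the population formula is an inference from the numbers matching, not something stated in the question; the variable names are made up for this sketch.

```python
from statistics import stdev

# Daily totals from the manual distribution in the question.
manual_totals = [14, 10, 14, 9, 13, 7, 10]
# Daily totals from the partitioning found by the dynamic program (days 0..6).
optimized_totals = [11, 13, 12, 11, 10, 10, 10]

manual_sd = stdev(manual_totals)
optimized_sd = stdev(optimized_totals)

print(f"manual:    {manual_sd:.10f}")
print(f"optimized: {optimized_sd:.2f}")
```

Both distributions have the same total (77 verses) and the same mean (11 per day), so only the spread around the mean changes.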

Scilab code giving submatrix incorrectly defined error

I am trying to plot a 3D graph between 2 scalars and one matrix for each of its entries. On compiling it is giving me "Submatrix incorrectly defined" error on line 11. The code:
i_max= 3;
u = zeros(4,5);
a1 = 1;
a2 = 1;
a3 = 1;
b1 = 1;
hx = linspace(1D-6,1D6,13);
ht = linspace(1D-6,1D6,13);
for i = 1:i_max
    for j = 2:4
        u(i+1,j)=u(i,j)+(ht*(a1*u(i,j))+b1+(((a2*u(i,j+1))-(2*a2*u(i,j))+(a2*u(i,j-1)))*(hx^-2))+(((a3*u(i,j+1))-(a3*u(i,j-1)))*(0.5*hx^-1)));
        plot(ht,hx,u(i+1,j));
    end
end
Full error message:
-->exec('C:\Users\deba123\Documents\assignments and lecture notes\Seventh Semester\UGP\Scilab\Simulation1_Plot.sce', -1)
+(((a3*u(i,j+1))-(a3*u(i,j-1)))*(0.5*hx^-1)))
!--error 15
Submatrix incorrectly defined.
at line 11 of exec file called by :
emester\UGP\Scilab\Simulation1_Plot.sce', -1
Please help.
For a 3-dimensional figure, you need two argument vectors and a matrix of function values, so I expanded u to a 3-dimensional array.
I annotated each operation in your code with the current dimension of the term, which makes the calculation easier to follow. For plotting, you have to use the plot3d (single values) or surf (surface) command.
In a 3-D plot, you want to map two vectors (hx, ht) of lengths n and m to a scalar z, so the results form an (n x m) matrix. Is this what you want to do? Currently, you have 13 values for each u(i,j,:) entry, but you need a (13x13) matrix for every figure. Maybe the eval3d function can help you.
i_max= 3;
u = zeros(4,5,13);
a1 = 1;
a2 = 1;
a3 = 1;
b1 = 1;
hx = linspace(1D-6,1D6,13); // 1 x 13
ht = linspace(1D-6,1D6,13); // 1 x 13
for i = 1:i_max
    for j = 2:4
        u(i+1,j,:)= u(i,j)...
        + ht*(a1*u(i,j))*b1... // 1 x 13
        +(((a2*u(i,j+1)) -(2*a2*u(i,j)) +(a2*u(i,j-1)))*(hx.^-2))... // 1 x 13
        +(((a3*u(i,j+1))-(a3*u(i,j-1)))*(0.5*hx.^-1)) ... // 1 x 13
        + hx*ones(13,1)*ht; // added to get non-zero values
        z = squeeze( u(i+1,j, : ))'; // 1x13
        // for a 3d-plot: (1x13, 1x13, 13x13)
        figure()
        plot3d(ht,hx, z'* z ,'*' ); //
    end
end
