APL is great for array-type problems, but I'm curious how best to work with graphs in APL. I'm playing around with LeetCode questions, for example question 662, Maximum Width of Binary Tree. The exercise works with Node objects in a value/left/right pointer style, but the test case uses a flat array like [1,3,null,5,3]. The notation is compressed; uncompressed it would be [[1], [3,null], [5,3,null,null]]. Reading layer by layer gives [[1], [3], [5,3]] (so the maximum width is 2).
Another example,
[5,4,7,3,null,2,null,-1,null,9] gives the answer 2
So I'm not sure of the idiomatic way to work with trees. Do I use classes? Or are arrays best? In either case, how do I convert the input?
I came up with a couple of solutions, but both feel inelegant.
convert←{
    prev←{(-⌈2÷⍨≢⍵)↑⍵}            ⍝ last ⌈n÷2 items: the previous level
    nxt←{
        ⍵≡⍬:⍺                     ⍝ input exhausted: return what we built
        m←2/×prev ⍺               ⍝ two child slots per previous-level node, 0 under nulls
        cnt←+/m                   ⍝ number of children present in the input
        (⍺,(m\cnt↑⍵))nxt(cnt↓⍵)   ⍝ expand children into their slots and recurse
    }
    (1↑⍵)nxt(1↓⍵)
}
Alternatively,
convert ← {
    total←+/×⍵                    ⍝ number of non-null nodes in the input
    nxt←{
        double←×1,2↓2/0,⍵         ⍝ slot mask: root, then two slots per element (0 under nulls)
        (((+/double)↑⍺)@⊢)double  ⍝ scatter that many input values into the 1-slots
    }
    ⍵ nxt⍣{(+/×⍺)=total}1         ⍝ iterate until every non-null node is placed
}
Both solutions are limited in that they assume 0 is null.
Once I've decompressed the input, it's simply a matter of stratifying it by its order:
⌈/(1+⌈/-⌊/)∘⍸¨×nodes⊆⍨⍸2*¯1+⍳⌈2⍟≢nodes
In Python, though, I could use other methods to traverse, e.g. keeping track of the left- and right-most nodes on a per-depth basis.
NOTE: This may be two questions, one on how to decompress and one on how to traverse graphs in general, but the latter depends on the former.
Any ideas?
The work on the Co-dfns compiler has given lots of insight into working with tree/graph-like data structures in APL.
Thesis: A Data Parallel Compiler Hosted on the GPU
GitHub repo: github.com/Co-dfns/Co-dfns (Many related goodies in project README file)
However, the thesis is quite lengthy, so for this particular exercise I'll give a brief explanation of how to approach it.
the exercise works with Node objects with a value/left/right pointer style, however the test-case uses a basic array like [1,3,null,5,3].
Do we really need to build the tree with Node-type objects to answer the question? You could write the solution in something like Python and translate it to APL, but that would lose the whole point of writing it in APL...
Notice the input is already an array! It is a BFS (level-order) traversal of the binary tree. (The Co-dfns compiler uses DFS traversal order, though.)
So what we actually need to do is just build a matrix like the one below for an input like [1,3,2,5,3,null,9] (⍬ is a placeholder for null):
1 ⍬ ⍬ ⍬ ⍝ level 0
3 2 ⍬ ⍬ ⍝ level 1
5 3 ⍬ 9 ⍝ level 2
For this problem we don't need to know which node's parent is which.
We can even abuse the fact that the input has no negative values (the numbers could be negative in general; all we actually care about is whether a node is null), and change ⍬ to ¯1 or 0 to make the answer easier to compute.
So the problem becomes: compute the matrix representation of the tree as a variable tree from the input array, then calculate the width of each level with +/0<tree; the output is just 2*level (notice the first level is level 0). However, this uses the wrong definition of width; I'll show how to correct it below.
And it is actually very easy to do the conversion from input to matrix. Hint: ↑.
1 (3 2) 5
┌─┬───┬─┐
│1│3 2│5│
└─┴───┴─┘
↑1 (3 2) 5
1 0
3 2
5 0
Thanks for pointing out that my original solution had a problem constructing the tree matrix.
Here is the corrected method for constructing the tree. To distinguish 0-for-null from the padding, I add one to the input array, so 2 marks non-null and 1 marks null.
buildmatrix←{
    ⎕IO←0
    in←1+(⊂⊂'null')(≢⍤1 0)⎕JSON ⍵   ⍝ 2 for non-null, 1 for null
    ⍝ Build the matrix level by level
    loop←{
        (n acc)←⍺                   ⍝ n: size of this level; acc: levels so far
        0=≢⍵:acc
        cur←n↑⍵
        (2×+/2=cur)(acc,⊂cur)∇ n↓⍵  ⍝ two slots per non-null node on the next level
    }
    ↑1 ⍬ loop in
}
However, since the definition of width here is:
The width of one level is defined as the length between the end-nodes (the leftmost and rightmost non-null nodes), where the null nodes between the end-nodes are also counted into the length calculation.
We can just compute the width while reconstructing the tree (computing each level's width with \ and / and the mask from the previous level):
If the last level is 1 0 1 1 and the (expanded) next level is 1 0 0 0 0 0 1 0:
1 0 1 1
1 0 0 0 0 0 1 0
(2/1 0 1 1)\1 0 0 0 1 0
1 0 0 0 0 0 1 0
So it isn't necessary to construct the complete matrix, and the answer to the exercise is just:
width←{
    ⎕IO←0
    in←(⊂⊂'null')(≢⍤1 0)⎕JSON ⍵   ⍝ 1 for non-null, 0 for null
    ⍝ strip leading and trailing zeros
    strip←(⌽⍳∘1↓⊢)⍣2
    ⍝ compute each level's width without building the matrix
    loop←{
        (prev mw)←⍺               ⍝ prev: previous level's mask; mw: max width so far
        0=≢⍵:mw
        cur←⍵↑⍨n←2×+/prev         ⍝ two children per non-null node
        ((⊢,⍥⊂mw⌈≢)strip cur\⍨2/prev)∇ n↓⍵
    }
    (,1)1 loop 1↓in
}
width '[1,null,2,3,null,4,5,6]'
2
And the interesting fact is that you can probably do the same in other, non-array-based languages like Haskell. So instead of translating existing algorithms between similar-looking languages, by thinking in the APL way you find new algorithms for problems!
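For instance, here is a rough Python sketch of the same level-mask idea (my own translation, with None standing in for null), computing the corrected width that counts the interior nulls:

def max_width(values):
    """Maximum width of a binary tree given in compressed BFS form,
    e.g. [1, None, 2, 3, None, 4, 5, 6]. Mirrors the APL approach:
    expand each level under its parents, strip outer nulls, keep gaps."""
    prev = [values[0] is not None]       # mask of the current level
    rest, best = values[1:], sum(prev)
    while rest:
        n = 2 * sum(prev)                # two child slots per non-null node
        cur, rest = rest[:n], rest[n:]
        it = iter(cur)
        # like (2/prev)\cur: children land under non-null parents only
        level = [next(it, None) is not None if p else False
                 for p in prev for _ in (0, 1)]
        if not any(level):               # no nodes on this level: done
            break
        lo = level.index(True)           # strip leading/trailing nulls,
        hi = len(level) - level[::-1].index(True)
        prev = level[lo:hi]              # but keep the interior gaps
        best = max(best, len(prev))
    return best

print(max_width([1, None, 2, 3, None, 4, 5, 6]))   # -> 2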
I've recently started to learn about SUBLEQ one-instruction set computers and am currently trying to write a simple assembler for a SUBLEQ emulator I wrote. So far I've implemented DB, MOV, INC and DEC, but I'm struggling a bit with the MOV instruction when it has pointers as arguments.
For example, MOV 20, 21 moves data from address 21 to address 20; in SUBLEQ it looks like this (assuming address 100 holds zero and the program starts at address zero):
sble 20 20 3
sble 21 100 6
sble 100 20 9
The content at the target address is zeroed, and the content at the source address is added to the destination by subtracting it twice: the source is first negated into the zero cell, and that negative value is then subtracted from the destination.
Now to my problem: if one argument is a pointer, for example MOV 20, [21], so that the content of address 21 points to the real data I want to copy to address 20, how can that be represented in SUBLEQ?
I'll start off by saying I know very little about how subleq is used in practice, so take this answer with a grain of salt:
One downside of SUBLEQ is that it is notoriously difficult to use pointers, but the flip side is that it can edit its own code. This means you will have to use the code to rewrite the address being read with the value at 21.
For example, if you somehow got the code to continue at a line appended after your current code, you could use this:
# (I used the quotes to mean next line)
sble 3 3 " # set the first value of the second instruction to 0
sble 21 101 " # put the value of 21 into an unused address
sble 101 3 " # subtract the value of 101 and put it back into the code
sble 3 3 " # reset the value at 3 to 0
It might be a good decision to have a movp (move-pointer) command, so you don't accidentally mess up your mov command's code at runtime.
This means a new method of using pointers has to be thought up for every problem someone comes across, but it will usually be done by editing the code with the code.
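To make the self-modification concrete, here is a toy sketch in Python (my own minimal emulator and a made-up memory layout, not your assembler's syntax): MOV dst, [ptr] is done by patching the A operand of a later instruction at run time. I've placed the data at 61/80 and scratch cells at 100/101 so they don't collide with the program.

def run_subleq(mem, pc=0):
    """Tiny SUBLEQ interpreter: mem[b] -= mem[a]; branch to c if the
    result is <= 0, else fall through. A negative target halts."""
    while pc >= 0:
        a, b, c = mem[pc], mem[pc + 1], mem[pc + 2]
        mem[b] -= mem[a]
        pc = c if mem[b] <= 0 else pc + 3

# MOV 60, [61]: the instruction at address 18 starts with a dummy A
# operand (0) that the program itself overwrites with the pointer value.
prog = [
    60, 60, 3,      #  0: mem[60] = 0          (clear destination)
    100, 100, 6,    #  3: scratch1 = 0
    101, 101, 9,    #  6: scratch2 = 0
    61, 100, 12,    #  9: scratch1 = -mem[61]  (negated pointer)
    18, 18, 15,     # 12: zero the A operand of the instruction at 18
    100, 18, 18,    # 15: mem[18] -= scratch1, i.e. A operand := mem[61]
    0, 101, 21,     # 18: (patched) scratch2 -= mem[ptr]
    101, 60, 24,    # 21: mem[60] -= scratch2, i.e. mem[60] := mem[ptr]
    24, 24, -1,     # 24: halt (negative branch target)
]
mem = prog + [0] * (128 - len(prog))
mem[61] = 80        # the pointer: address of the real data
mem[80] = 42        # the data to copy
run_subleq(mem)
print(mem[60])      # -> 42

Each fall-through target is simply the next instruction's address, so the flow is the same whether or not a branch is taken.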
A  B
0  1  - new connection
0  0  - unchanged
1  1  - unchanged
1  0  - disruption
I'm working on an industrial engineering modelling/coding project, and I'm facing a data-processing problem with the sample data above in a GAMS model.
I need a mathematical way of finding the 1-0 patterns (which mean disruptions in my model). I cannot use logical statements such as if, as they would make my model non-linear.
I tried
sum(i,a(i)-b(i))
but it returned 0, as the values cancelled each other out. I need a pure mathematical expression to detect the disruptions. Any ideas?
EDIT: Absolute value is also not acceptable.
ANSWER: After a few hours of playing with numbers I came up with the following:
{ (a(i) + b(i)) - (a(i) * b(i)) - b(i) }
Thanks, everyone, for your contributions.
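For what it's worth, the expression simplifies to a(i)*(1-b(i)), which is 1 exactly for the 1-0 pattern and 0 otherwise. A quick sanity check of the truth table in Python (not GAMS):

# (a + b) - a*b - b  ==  a*(1 - b): fires only on the 1 -> 0 pattern
for a in (0, 1):
    for b in (0, 1):
        print(a, b, '->', (a + b) - a * b - b)
# 0 0 -> 0   unchanged
# 0 1 -> 0   new connection
# 1 0 -> 1   disruption
# 1 1 -> 0   unchanged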
This should work:
sum(i,ABS(a(i)-b(i)))
where a(i) are the old values and b(i) are the new values.
In reviewing some old ColdFusion code, I've found several instances of data being encrypted with the CFMX_COMPAT algorithm via the encrypt/decrypt functions.
After searching around for a while, I've been unable to find what kind of algorithm this is. The docs mention that it is now the least secure method, but I'd like to know why that is.
(A couple of people elsewhere have suggested that it's just MD5, but that doesn't make a lot of sense, as the data is being decrypted.)
It is an XOR-based algorithm, but not a textbook one, so blanket XOR-algorithm answers are not correct (and have been applied to these CFMX_COMPAT questions incorrectly in the past).
For a detailed look at the source code of this proprietary XOR, check this answer to "Compare Password Hashes Between C# and ColdFusion", in which @Leigh (who also commented on one of these questions) helped provide an accurate port of the algorithm, lifted directly from the Railo source.
It is a simple XOR algorithm. Technically it is crypto, but it is very, very, very, very, very weak crypto. I should have put a few more "very"s in there.
As I understand it, each byte of the plaintext is XOR'd against the next byte of the key (repeating the key as needed), and the result is the ciphertext.
So if we looked at everything in bits:
P: 1 0 1 0 1 0 1 0 0 0 1
K: 0 0 1 1 1 0 0 1 0 1 0
C: 1 0 0 1 0 0 1 1 0 1 1
P = Plaintext
K = Key
C = Ciphertext
If you are not familiar with XOR, it works like this:
0 XOR 0 -> 0
0 XOR 1 -> 1
1 XOR 0 -> 1
1 XOR 1 -> 0
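To see why this is so weak, here is a textbook repeating-key XOR in Python (a generic sketch; CFMX_COMPAT is a proprietary variation on this idea, not this exact code). The same operation both encrypts and decrypts:

from itertools import cycle

def xor_crypt(data: bytes, key: bytes) -> bytes:
    """Repeating-key XOR: applying it twice returns the plaintext."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))

ciphertext = xor_crypt(b"attack at dawn", b"sekrit")
print(xor_crypt(ciphertext, b"sekrit"))   # -> b'attack at dawn'

Because the key stream just repeats, a stretch of known plaintext immediately reveals the key, which is why it rates all those "very"s.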
It's definitely not MD5, as that's a hashing algorithm, not an encryption algorithm (as you point out).
I dunno what algorithm it uses, but you could decompile the relevant Java class in cfusion.jar and have a look. I doubt there's a better way of finding out than that. I doubt even if you open a support ticket with Adobe that they'd actually tell you.
History: I read in one of Knuth's algorithm books that the first computers used base 10. Then they switched to two's complement.
Question: Why couldn't the base be -2 instead?
Examples:
(-2)^1 = -2
(-2)^3 = -8
The problem is that a negabinary (base -2) system is more difficult to understand, and the numbers of possible positive and negative values are different. To see this latter point, consider a simple 3-bit case.
Here
the first (rightmost) bit represents the decimal 1;
the middle bit represents the decimal -2; and
the third (leftmost) bit represents the decimal 4
So
000 -> 0
001 -> 1
010 -> -2
011 -> -1
100 -> 4
101 -> 5
110 -> 2
111 -> 3
Thus the range of expressible values is -2 to 5, i.e. non-symmetric.
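A quick Python enumeration reproduces that table (bit weights 4, -2, 1, i.e. (-2)^2, (-2)^1, (-2)^0):

# decimal value of every 3-bit negabinary pattern
for n in range(8):
    bits = [(n >> i) & 1 for i in (2, 1, 0)]              # b2 b1 b0
    value = sum(b * w for b, w in zip(bits, (4, -2, 1)))
    print(''.join(map(str, bits)), '->', value)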
At its heart, digital logic is base two. A digital signal is either on or off. Supporting other bases (as in BCD) means wasted representation space, more engineering, more complex specification, etc.
Edited to add: Beyond the trivial representation of a single binary digit in digital logic, addition is easily realized in hardware, starting with the half adder, which is easily realized in Boolean logic (i.e. with transistors):
    (no carry)    (with carry)
  |  0   1         0   1
--+------------------------
0 | 00  01        01  10
1 | 01  10        10  11
(the sum digit is (A xor B) xor C, and the carry is (A and B) or (C and (A or B))); these full adders are then chained together to build a register-wide adder.
Which brings us to two's complement: negation is easy (invert and add one), and the addition of mixed positive and negative numbers follows naturally with no additional hardware. So subtraction comes almost for free.
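As a sketch of why this is cheap, here is that full adder chained into a 4-bit ripple-carry adder in Python (bit operations standing in for the transistors). Two's complement negation plus the unchanged adder already gives subtraction:

def full_adder(a, b, c):
    """One-bit full adder: sum is (A xor B) xor C, carry is (A and B) or (C and (A or B))."""
    return (a ^ b) ^ c, (a & b) | (c & (a | b))

def add4(x, y):
    """4-bit ripple-carry adder; the result wraps mod 16 like the hardware."""
    out, carry = 0, 0
    for i in range(4):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        out |= s << i
    return out

minus3 = add4(0b0011 ^ 0b1111, 1)   # two's complement: invert, then add 1
print(add4(0b0101, minus3))         # 5 + (-3) -> 2, same adder, no sign logic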
Few other representations will allow arithmetic to be implemented so cheaply, and I know of none that are easier.
Optimization in storage and optimization in processing time are often at cross purposes with each other; all other things being equal, simplicity usually trumps complexity.
Anyone can propose any storage mechanism for information they wish, but unless there are processors or algorithms that support it, it won't get used.
There are two reasons to choose base 2 over base -2:
First, in a lot of applications you don't need to represent negative numbers. By isolating their representation to a single bit you can either expand the range of representable numbers, or reduce the storage space required when negative numbers aren't needed. In base -2 you need to include the negative values even if you clip the range.
Second, two's complement hardware is simple to implement. Not only is it simple, it is super simple to build two's complement hardware that supports both signed and unsigned arithmetic, since they are the same thing. In other words, the binary representation of uint4(8) and sint4(-8) are the same, and the binary representation of uint4(7) and sint4(7) are the same, which means you can do the addition without knowing whether or not the values are signed: it works out either way. That means the hardware can avoid knowing anything about signs entirely and leave that as a language convention.
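A quick illustration of that point with 4-bit values in Python (as_sint4 is just a hypothetical helper that reinterprets the bit pattern):

def as_sint4(bits):
    """Reinterpret a 4-bit pattern as two's complement."""
    return bits - 16 if bits & 0b1000 else bits

x, y = 0b1000, 0b0111      # uint4: 8 and 7; sint4: -8 and 7
s = (x + y) & 0b1111       # one addition, truncated to 4 bits
print(s, as_sint4(s))      # 15 unsigned, -1 signed: both readings are correct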
Also, the use of the binary system has a mathematical background: consider information theory and Claude Shannon. My English skills don't qualify me to explain this topic, so better to follow the link to Wikipedia and enjoy the maths behind all this stuff.
In the end the decision was made because of voltage variance.
With base 2 it is on or off, with nothing in between.
But with base 10, how do you know what each number is? Is 0.1 volts a 1? What about 0.11? Voltage can vary and is not precise, which is why an analog signal is not as good as a digital one. (It's also why paying more than $6 for an HDMI cable is a waste: it's digital, so the signal either gets there or it doesn't. For audio it can matter, because an analog signal can degrade.)
Here is a concrete example of the complexity that dmckee pointed out: the numbers 0-9 in negabinary:
0 = 0
1 = 1
2 = 110
3 = 111
4 = 100
5 = 101
6 = 11010
7 = 11011
8 = 11000
9 = 11001
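These can be generated with the usual base-conversion loop, adjusted so the remainder stays non-negative (a quick Python sketch):

def to_negabinary(n: int) -> str:
    """Digits of n in base -2."""
    if n == 0:
        return '0'
    digits = []
    while n:
        n, r = divmod(n, -2)
        if r < 0:               # keep the remainder in {0, 1}
            n, r = n + 1, r + 2
        digits.append(str(r))
    return ''.join(reversed(digits))

print([to_negabinary(n) for n in range(10)])
# ['0', '1', '110', '111', '100', '101', '11010', '11011', '11000', '11001']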
1's complement does have 0 and -0 - is that what you're after?
CDC used to produce 1's complement machines which made negation very easy as you suggest. As I understand it, it also allowed them to produce hardware for subtraction that didn't infringe on IBM's patent on the 2's complement binary subtractor.
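For reference, one's complement negation is just a bitwise NOT, which is what made it so easy, and which is also where the two zeros come from (a 4-bit sketch in Python):

def ones_neg4(x):
    """One's complement negation of a 4-bit value: flip every bit."""
    return x ^ 0b1111

print(format(ones_neg4(0b0101), '04b'))   # 5  -> 1010 (-5)
print(format(ones_neg4(0b0000), '04b'))   # +0 -> 1111 (-0)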