How to make and, or, not, xor, plus using only subtraction - math

I read that there is a computer that uses only subtraction.
How is that possible? For the plus operator it's pretty easy.
The logical operators, I think, can be made using subtraction with a constant.
What do you guys think?

Plus +
is easy as you already have minus implemented so:
x + y = x - (0-y)
NOT !
In a standard ALU it is usual to compute subtraction by addition, using two's complement negation:
-x = !x + 1
So from this the negation is:
!x = -1 - x
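The two identities above can be checked with a minimal Python sketch. The helper names sub, add and bitwise_not are made up for illustration; sub stands in for the machine's single subtraction instruction, and Python's unbounded integers are assumed to behave like two's complement values.

```python
def sub(a, b):
    """The only arithmetic primitive this sketch allows itself."""
    return a - b

def add(x, y):
    # x + y = x - (0 - y)
    return sub(x, sub(0, y))

def bitwise_not(x):
    # !x = -1 - x, which follows from -x = !x + 1
    return sub(-1, x)
```

For an n-bit unsigned word the constant -1 is the all-ones pattern, so on 4 bits the same identity reads !x = 15 - x.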
AND &,OR |,XOR ^
Sorry, I have no clue about efficient AND, OR, XOR implementations without more information about the architecture, other than testing each bit individually from MSB to LSB. So first you need to extract the bit values from a number. For simplicity, let's assume 4-bit unsigned integers, so x=(x3,x2,x1,x0), where x3 is the MSB and x0 is the LSB.
if (x>=8) { x3=1; x-=8; } else x3=0;
if (x>=4) { x2=1; x-=4; } else x2=0;
if (x>=2) { x1=1; x-=2; } else x1=0;
if (x>=1) { x0=1; x-=1; } else x0=0;
And this is how to get the number back
x=0
if (x0) x+=1;
if (x1) x+=2;
if (x2) x+=4;
if (x3) x+=8;
or like this:
x=15
if (!x0) x-=1;
if (!x1) x-=2;
if (!x2) x-=4;
if (!x3) x-=8;
now we can do the AND,OR,XOR operations
z=x&y // AND
z0=(x0+y0==2);
z1=(x1+y1==2);
z2=(x2+y2==2);
z3=(x3+y3==2);
z=x|y // OR
z0=(x0+y0>0);
z1=(x1+y1>0);
z2=(x2+y2>0);
z3=(x3+y3>0);
z=x^y // XOR
z0=(x0+y0==1);
z1=(x1+y1==1);
z2=(x2+y2==1);
z3=(x3+y3==1);
PS: the comparison is just subtraction plus examination of the Carry and Zero flags. Also, all the + can be rewritten and optimized to use - to better suit this weird architecture.
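As a rough end-to-end check of the bit-by-bit approach, here is a Python sketch for the assumed 4-bit word. The names WEIGHTS, to_bits, from_bits and bitwise are hypothetical. Only subtraction and comparisons are used on the data, and a bit of the XOR result is set when exactly one of the input bits is set.

```python
WEIGHTS = [8, 4, 2, 1]  # bit weights of a 4-bit word, MSB first

def to_bits(x):
    """Split x into bits (MSB first) by compare-and-subtract."""
    bits = []
    for w in WEIGHTS:
        if x >= w:
            bits.append(1)
            x = x - w
        else:
            bits.append(0)
    return bits

def from_bits(bits):
    """Rebuild the number by starting at all ones and subtracting cleared bits."""
    x = 15
    for bit, w in zip(bits, WEIGHTS):
        if bit == 0:
            x = x - w
    return x

def bitwise(op, x, y):
    out = []
    for a, b in zip(to_bits(x), to_bits(y)):
        s = a - (0 - b)  # a + b written with subtraction only
        if op == "and":
            out.append(1 if s == 2 else 0)
        elif op == "or":
            out.append(1 if s > 0 else 0)
        elif op == "xor":
            out.append(1 if s == 1 else 0)
        else:
            raise ValueError(op)
    return from_bits(out)
```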
bit shift <<,>>
z=x>>1
z0=x1;
z1=x2;
z2=x3;
z3=0;
z=x<<1
z0=0;
z1=x0;
z2=x1;
z3=x2;
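The shifts do not strictly need the intermediate bit variables. A possible shortcut, again sketched in Python for the assumed 4-bit word (function names are illustrative): a left shift is a doubling after dropping the MSB, and a right shift adds half of each bit weight that is present.

```python
def shift_left_1(x):
    # Drop x3 if it is set, then double: x + x written as x - (0 - x).
    if x >= 8:
        x = x - 8
    return x - (0 - x)

def shift_right_1(x):
    # Peel off bits MSB first; a bit of weight w contributes w/2 to the result.
    z = 0
    for w, half in [(8, 4), (4, 2), (2, 1), (1, 0)]:
        if x >= w:
            x = x - w
            z = z - (0 - half)
    return z
```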

Related

How is R able to sum an integer sequence so fast?

Create a large contiguous sequence of integers:
x <- 1:1e10
How is R able to compute the sum so fast?
sum(x)
Doesn't it have to loop over 1e10 elements in the vector and sum each element?
Summing up the comments:
R introduced something called ALTREP, or ALternate REPresentation for R objects. Its intent is to do some things more efficiently. From https://www.r-project.org/dsc/2017/slides/dsc2017.pdf, some examples include:
allow vector data to be in a memory-mapped file or distributed
allow compact representation of arithmetic sequences;
allow adding meta-data to objects;
allow computations/allocations to be deferred;
support alternative representations of environments.
The second and fourth bullets seem appropriate here.
We can see a hint of this in action by looking at what I'm inferring is at the core of the R sum primitive for altreps, at https://github.com/wch/r-source/blob/7c0449d81c853f781fb13e9c7118065aedaf2f7f/src/main/altclasses.c#L262:
static SEXP compact_intseq_Sum(SEXP x, Rboolean narm)
{
#ifdef COMPACT_INTSEQ_MUTABLE
    /* If the vector has been expanded it may have been modified. */
    if (COMPACT_SEQ_EXPANDED(x) != R_NilValue)
        return NULL;
#endif
    double tmp;
    SEXP info = COMPACT_SEQ_INFO(x);
    R_xlen_t size = COMPACT_INTSEQ_INFO_LENGTH(info);
    R_xlen_t n1 = COMPACT_INTSEQ_INFO_FIRST(info);
    int inc = COMPACT_INTSEQ_INFO_INCR(info);
    tmp = (size / 2.0) * (n1 + n1 + inc * (size - 1));
    if(tmp > INT_MAX || tmp < R_INT_MIN)
        /**** check for overflow of exact integer range? */
        return ScalarReal(tmp);
    else
        return ScalarInteger((int) tmp);
}
Namely, the reduction of an integer sequence without gaps is trivial. It's when there are gaps or NAs that things become a bit more complicated.
In action:
vec <- 1:1e10
sum(vec)
# [1] 5e+19
sum(vec[-10])
# Error: cannot allocate vector of size 37.3 Gb
### win11, R-4.2.2
Ideally we would see that sum(vec) == (sum(vec[-10]) + 10), but we cannot check it: vec[-10] is no longer a gap-free sequence, so the sequence-summing optimization does not apply and R tries to allocate the full vector.
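The closed form used in compact_intseq_Sum is the ordinary arithmetic-series formula, size/2 * (2*n1 + inc*(size - 1)). A small Python sketch of the same idea follows. The function name is hypothetical, and it uses exact integer arithmetic where the R source uses a double.

```python
def compact_seq_sum(first, length, incr=1):
    """Sum of first, first+incr, ... (length terms) without looping."""
    # The product is always even, so the integer division is exact.
    return length * (2 * first + incr * (length - 1)) // 2
```

With this, compact_seq_sum(1, 10**10) returns immediately, and the sum with the tenth element dropped is simply that value minus 10.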

Efficient method for imposing (some cases of) periodic boundary conditions on floats?

Some cases of periodic boundary conditions (PBC) can be imposed very efficiently on integers by simply doing:
myWrappedWithinPeriodicBoundary = myUIntValue & mask
This works when the boundary is the half open range [0, upperBound), where the (exclusive) upperBound is 2^exp so that
mask = (1 << exp) - 1
For example:
let pbcUpperBoundExp = 2 // so the periodic boundary will be [0, 4)
let mask = (1 << pbcUpperBoundExp) - 1
for x in -7 ... 7 { print(x & mask, terminator: " ") }
(in Swift) will print:
1 2 3 0 1 2 3 0 1 2 3 0 1 2 3
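For comparison, the same masking trick sketched in Python (the name wrap_pow2 is made up here). Python's & on negative integers behaves like two's complement of unlimited width, so the values match the Swift output above.

```python
def wrap_pow2(x, exp):
    """Wrap integer x into [0, 2**exp) with a single AND."""
    mask = (1 << exp) - 1
    return x & mask

wrapped_values = [wrap_pow2(x, 2) for x in range(-7, 8)]
```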
Question: Is there any (roughly similar) efficient method for imposing (some cases of) PBCs on floating point-numbers (32 or 64-bit IEEE-754)?
There are several reasonable approaches:
fmod(x,1)
modf(x,&dummy) — has the advantage of knowing its divisor statically, but in my testing comes from libc.so.6 even with -ffast-math
x-floor(x) (suggested by Jens in a comment) — supports negative inputs directly
Manual bit-twiddling direct implementation
Manual bit-twiddling implementation of floor
The first two preserve the sign of their input; you can add 1 if it's negative.
The two bit manipulations are very similar: you identify which significand bits correspond to the integer portion, and mask them (for the direct implementation) or the rest (to implement floor) off. The direct implementation can be completed either with a floating-point division or with a shift to reassemble the double manually; the former is 28% faster even given hardware CLZ. The floor implementation can immediately reconstitute a double: floor never changes the exponent of its argument unless it returns 0. About 20 lines of C are required.
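To make the bit-twiddling floor idea more concrete, here is a rough Python sketch for 64-bit doubles. It is not the answer's C code, which is not shown. The name floor_bits and the handling of negative inputs (truncate toward zero, then subtract 1) are assumptions of this sketch.

```python
import struct

def floor_bits(x):
    """floor(x) for a double by masking off the fractional significand bits."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    exp = ((bits >> 52) & 0x7FF) - 1023      # unbiased exponent
    if exp >= 52:
        return x                             # already integral (or inf/NaN)
    if exp < 0:
        return 0.0 if x >= 0 else -1.0       # |x| < 1
    frac_mask = (1 << (52 - exp)) - 1        # bits below the binary point
    if bits & frac_mask == 0:
        return x                             # no fractional part
    truncated = struct.unpack("<d", struct.pack("<Q", bits & ~frac_mask))[0]
    # Clearing the fraction truncates toward zero; step down for negatives.
    return truncated if x > 0 else truncated - 1.0
```

The wrapped value is then x - floor_bits(x), exactly as with the library floor.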
The following timing is with double and gcc -O3, with timing loops over representative inputs into which the operative code was inlined.
fmod: 41.8 ns
modf: 19.6 ns
floor: 10.6 ns
With -ffast-math:
fmod: 26.2 ns
modf: 30.0 ns
floor: 21.9 ns
Bit manipulation:
direct: 18.0 ns
floor: 20.6 ns
The manual implementations are competitive, but the floor technique is the best. Oddly, two of the three library functions perform better without -ffast-math: that is, as a PLT function call than as an inlined builtin function.
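The library-based variants are short enough to sketch directly. This Python version only illustrates the sign handling mentioned above and is not a reproduction of the benchmarked C code. The function names are hypothetical.

```python
import math

def wrap_fmod(x):
    """fmod keeps the sign of x, so shift negative remainders up by 1."""
    r = math.fmod(x, 1.0)
    return r + 1.0 if r < 0 else r

def wrap_floor(x):
    """x - floor(x) handles negative inputs directly."""
    return x - math.floor(x)
```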
I'm adding this answer to my own question since it describes the best solution I have found at the time of writing. It's in Swift 4.1 (it should be straightforward to translate into C) and it has been tested in various use cases:
extension BinaryFloatingPoint {
    /// Returns the value after restricting it to the periodic boundary
    /// condition [0, 1).
    /// See https://forums.swift.org/t/why-no-fraction-in-floatingpoint/10337
    @_transparent
    func wrappedToUnitRange() -> Self {
        let fract = self - self.rounded(.down)
        // Have to clamp to just below 1 because very small negative values
        // will otherwise return an out of range result of 1.0.
        // Turns out this:
        if fract >= 1.0 { return Self(1).nextDown } else { return fract }
        // is faster than this:
        //return min(fract, Self(1).nextDown)
    }
    @_transparent
    func wrapped(to range: Range<Self>) -> Self {
        let measure = range.upperBound - range.lowerBound
        let recipMeasure = Self(1) / measure
        let scaled = (self - range.lowerBound) * recipMeasure
        return scaled.wrappedToUnitRange() * measure + range.lowerBound
    }
    @_transparent
    func wrappedIteratively(to range: Range<Self>) -> Self {
        var v = self
        let measure = range.upperBound - range.lowerBound
        while v >= range.upperBound { v = v - measure }
        while v < range.lowerBound { v = v + measure }
        return v
    }
}
On my MacBook Pro with a 2 GHz Intel Core i7,
a hundred million (probably inlined) calls to wrapped(to range:) on random (finite) Double values take 0.6 seconds, which is about 166 million calls per second (not multithreaded). Whether the range is statically known, or has bounds or a measure that is a power of two, etc., can make some difference, but not as much as one might have thought.
wrappedToUnitRange() takes about 0.2 seconds, meaning 500 million calls per second on my system.
Given the right scenario, wrappedIteratively(to range:) is as fast as wrappedToUnitRange().
The timings were made by comparing a baseline test (which does not wrap a value, but still uses it to compute e.g. a simple xor checksum) to the same test where the value is wrapped. The difference in time between the two is the time I have given for the wrapping calls.
I have used Swift development toolchain 2018-02-21, compiling with -O -whole-module-optimization -static-stdlib -gnone. And care has been taken to make the tests relevant, ie preventing dead code removal, using true random input of different distributions etc. Writing the wrapping functions generically, like this extension on BinaryFloatingPoint, turned out to be optimized into equivalent code as if I had written separate specialized versions for eg Float and Double.
It would be interesting to see someone more skilled than me investigating this further (C or Swift or any other language doesn't matter).
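For readers who want to experiment outside Swift, here is a rough Python transcription of wrappedToUnitRange() and wrapped(to:) for 64-bit floats. The Python names and the constant ONE_NEXT_DOWN are choices of this sketch, which makes no claim about performance.

```python
import math

ONE_NEXT_DOWN = 1.0 - 2.0 ** -53  # largest double below 1.0

def wrapped_to_unit_range(x):
    fract = x - math.floor(x)
    # Tiny negative inputs round up to exactly 1.0, so clamp just below it.
    return ONE_NEXT_DOWN if fract >= 1.0 else fract

def wrapped(x, lower, upper):
    measure = upper - lower
    scaled = (x - lower) * (1.0 / measure)
    return wrapped_to_unit_range(scaled) * measure + lower
```

For example, wrapped(5.5, 0.0, 4.0) gives 1.5, and a tiny negative input such as -1e-20 stays strictly below 1.0 in the unit range instead of rounding to it.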
EDIT:
For anyone interested, here are some versions for simd float2:
extension float2 {
    @_transparent
    func wrappedInUnitRange() -> float2 {
        return simd.fract(self)
    }
    @_transparent
    func wrappedToMinusOneToOne() -> float2 {
        let scaled = (self + float2(1, 1)) * float2(0.5, 0.5)
        let scaledFract = scaled - floor(scaled)
        let wrapped = simd_muladd(scaledFract, float2(2, 2), float2(-1, -1))
        // Note that we have to make sure the result is not out of bounds, like
        // simd fract does:
        let oneNextDown = Float(bitPattern:
            0b0_01111110_11111111111111111111111)
        let oneNextDownFloat2 = float2(oneNextDown, oneNextDown)
        return simd.min(wrapped, oneNextDownFloat2)
    }
    @_transparent
    func wrapped(toLowerBound lowerBound: float2,
                 upperBound: float2) -> float2
    {
        let measure = upperBound - lowerBound
        let recipMeasure = simd_precise_recip(measure)
        let scaled = (self - lowerBound) * recipMeasure
        let scaledFract = scaled - floor(scaled)
        // Note that we have to make sure the result is not out of bounds, like
        // simd fract does:
        let wrapped = simd_muladd(scaledFract, measure, lowerBound)
        let maxX = upperBound.x.nextDown // For some reason, this won't be
        let maxY = upperBound.y.nextDown // optimized even when upperBound is
        // statically known, and there is no similar simd function available.
        let maxValue = float2(maxX, maxY)
        return simd.min(wrapped, maxValue)
    }
}
I asked some related simd-related questions here which might be of interest.
EDIT2:
As can be seen in the above Swift Forums thread:
// Note that tiny negative values like:
let x: Float = -1e-08
// May produce results outside the [0, 1) range:
let wrapped = x - floor(x)
print(wrapped < 1.0) // false
// which may result in out-of-bounds table accesses
// in common usage, so it's probably better to use:
let correctlyWrapped = simd_fract(x)
print(correctlyWrapped < 1.0) // true
I have since updated the code to account for this.

How to use arithmetic shift & selector in verilog?

I want to use selector and arithmetic shift together.
But this code does not work as intended; the result is just a logical shift.
module multiplier(x1, x2, x1x2);
input [15:0] x1, x2;
output [15:0] x1x2;
assign x1x2 =
x2[13]? ($signed(x1)>>>4'd1) : 16'b0000000000000000;
endmodule
The arithmetic shift works correctly without the selector, as in this code.
module multiplier(x1, x2, x1x2);
input [15:0] x1, x2;
output [15:0] x1x2;
assign x1x2 = $signed(x1)>>>4'd1;
endmodule
How to use selector and arithmetic shift together?
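Verilog will almost always choose unsigned when it has a choice, and the selector logic is what gives it that choice: 16'b0 is unsigned, so the whole conditional expression, including the operand of >>>, is evaluated as unsigned and the shift becomes logical.
There are a couple of different solutions: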
Use two lines:
wire [15:0] x1_shift = $signed(x1)>>>4'd1;
assign x1x2 = x2[13] ? x1_shift : 16'b0;
Use curly braces (a concatenation) instead of parentheses:
assign x1x2 = x2[13]? { $signed(x1)>>>4'd1 } : 16'b0;
Hard coded shift:
assign x1x2 = x2[13] ? {x1[15],x1[15:1]} : 16'b0;
Sign all conditions: (As Unn pointed out 16'b0 is unsigned)
assign x1x2 = x2[13] ? ($signed(x1)>>>4'd1) : $signed(16'b0); // least recommended
With SystemVerilog you can also do size casting:
assign x1x2 = x2[13] ? 16'($signed(x1)>>>4'd1) : 16'b0; // SV only, not Verilog
Working examples here.
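To see what the two shifts are expected to produce, here is a small Python model of a 16-bit value. It is only an illustration of the bit patterns and not Verilog. asr1_16 mirrors the hard-coded form {x1[15], x1[15:1]}, and both function names are made up.

```python
def asr1_16(x):
    """Arithmetic right shift by 1: the sign bit x[15] is replicated."""
    x &= 0xFFFF
    sign = x & 0x8000
    return sign | (x >> 1)

def lsr1_16(x):
    """Logical right shift by 1: a zero is shifted into bit 15."""
    return (x & 0xFFFF) >> 1
```

For x1 = 16'h8002 the arithmetic shift should give 16'hC001, while the logical shift seen in the question gives 16'h4001.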

Prolog Basic Recursive Division

I am new to Prolog and am having some difficulty fixing the errors of my first program.
The program requirement is that it divides the 2 inputs using recursion, returning 0 if the dividend is larger than the divisor, and ignores remainders.
%Author: Justin Taylor
testquotient :-
repeat,
var(Divident), var(Divisor), var(Answer), var(End),
write('Enter Divident: '),
read(Divident),
write('Enter Divisor: '),
read(Divisor),
quotient(Divident, Divisor, Answer),
nl,
write('Quotient is = '),
write(Answer),
nl,
write('Enter 0 to quit, 1 to continue: '),
read(End),
(End =:= 0),!.
quotient(_, 0, 'Undefined').
quotient(0, _, 0).
quotient(Divisor == Divident -> Answer = 1).
quotient(Divisor < Divident -> Answer = 0).
quotient(Divident, Divisor, Answer) :-
(Divisor > Divident -> Divisor = Divisor - Divident,
quotient(Divident, Divisor, Answer + 1);
Answer = Answer).
First, read up on is. Type help(is). at the SWI-Prolog's prompt. Read the whole section about "Arithmetic" carefully. Second, your first few clauses for quotient are completely off-base, invalid syntax. I'll show you how to rewrite one of them, you'll have to do the other yourself:
%% WRONG: quotient(Divisor == Divident -> Answer = 1).
quotient(Divisor, Divident, Answer) :-
Divisor =:= Divident -> Answer = 1.
%% WRONG: quotient(Divisor < Divident -> Answer = 0).
....
Note the use of =:= instead of ==.
Your last clause for quotient looks almost right at first glance, save for the major faux pas: Prolog's unification, =, is not, repeat not, an assignment operator! We don't change values assigned to logical variables (if X is 5, what's there to change about it? It is what it is). No, instead we define a new logical variable, like this
( Divisor > Divident -> NewDivisor = Divisor - Divident,
and we use it in the recursive call,
%% WRONG: quotient(Divident, NewDivisor, Answer + 1) ;
but this is wrong too, w.r.t. the new Answer. If you add 1 on your way down (as you subtract Divident from your Divisor; btw, shouldn't it be the other way around? Check your logic or at least swap your names: the "divisor" is what you divide by), that means you should've supplied the initial value. But you seem to supply the terminal value as 0, and that means that you should build your result on your way back up from the depths of recursion:
%%not quite right yet
quotient(Divident, NewDivisor, NewAnswer), Answer = NewAnswer + 1 ;
Next, Answer = Answer succeeds always. We just write true in such cases.
Lastly, you are really supposed to use is on each recursion step, and not just at the very end:
( Divisor > Divident -> NewDivisor is Divisor - Divident, %% use "is"
quotient(Divident, NewDivisor, NewAnswer), Answer is NewAnswer+1 %% use "is"
; true ). %% is this really necessary?
Your 'Undefined' will cause an error on 0, but leave it at that, for now. Also, you don't need to "declare" your vars in Prolog. The line var(Divident), ..., var(End), serves no purpose.
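The overall shape of the recursion (base case returns 0, result built on the way back up) may be easier to see in a language with ordinary function returns. A minimal Python sketch follows, assuming a non-negative dividend and a positive divisor. It illustrates the logic only and does not replace the Prolog exercise.

```python
def quotient(dividend, divisor):
    """Integer division by repeated subtraction, ignoring the remainder."""
    if divisor == 0:
        raise ZeroDivisionError("quotient is undefined for divisor 0")
    if dividend < divisor:
        return 0  # terminal value
    # Subtract the divisor from the dividend, then add 1 on the way back up.
    return 1 + quotient(dividend - divisor, divisor)
```

Note that it is the dividend that shrinks on each step, which is the naming issue pointed out above.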

Diffie-Hellman -- Primitive root mod n -- cryptography question

In the below snippet, please explain starting with the first "for" loop what is happening and why. Why is 0 added, why is 1 added in the second loop. What is going on in the "if" statement under bigi. Finally explain the modPow method. Thank you in advance for meaningful replies.
public static boolean isPrimitive(BigInteger m, BigInteger n) {
    BigInteger bigi, vectorint;
    Vector<BigInteger> v = new Vector<BigInteger>(m.intValue());
    int i;
    for (i=0;i<m.intValue();i++)
        v.add(new BigInteger("0"));
    for (i=1;i<m.intValue();i++)
    {
        bigi = new BigInteger("" + i);
        if (m.gcd(bigi).intValue() == 1)
            v.setElementAt(new BigInteger("1"), n.modPow(bigi,m).intValue());
    }
    for (i=0;i<m.intValue();i++)
    {
        bigi = new BigInteger("" + i);
        if (m.gcd(bigi).intValue() == 1)
        {
            vectorint = v.elementAt(bigi.intValue());
            if ( vectorint.intValue() == 0)
                i = m.intValue() + 1;
        }
    }
    if (i == m.intValue() + 2)
        return false;
    else
        return true;
}
Treat the vector as a list of booleans, with one boolean for each number from 0 to m - 1. When you view it that way, it becomes obvious that each value is set to 0 to initialize it to false, and then set to 1 later to set it to true.
The last for loop tests the booleans at every index that is coprime to m. If any of them is 0 (indicating false), then the function returns false. If all are true, then the function returns true.
Explaining the if statement you asked about would require explaining what a primitive root mod n is, which is the whole point of the function. I think if your goal is to understand this program, you should first understand what it implements. If you read Wikipedia's article on it, you'll see this in the first paragraph:
In modular arithmetic, a branch of number theory, a primitive root modulo n is any number g with the property that any number coprime to n is congruent to a power of g (mod n). That is, if g is a primitive root (mod n), then for every integer a that has gcd(a, n) = 1, there is an integer k such that g^k ≡ a (mod n). k is called the index of a. That is, g is a generator of the multiplicative group of integers modulo n.
The function modPow implements modular exponentiation. Once you understand how to find a primitive root mod n, you'll understand it.
Perhaps the final piece of the puzzle for you is to know that two numbers are coprime if their greatest common divisor is 1. And so you see these checks in the algorithm you pasted.
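The definition quoted above can be turned into a direct brute-force check. The Python sketch below follows that definition rather than translating the Java line by line. The name is_primitive_root and its argument order (candidate first, modulus second) are choices made here, and the three-argument pow plays the role of modPow.

```python
from math import gcd

def is_primitive_root(g, m):
    """True if every number coprime to m is a power of g modulo m."""
    # The numbers that must be reached: everything in 1..m-1 coprime to m.
    units = {a for a in range(1, m) if gcd(a, m) == 1}
    # pow(g, k, m) is modular exponentiation: (g ** k) % m without huge intermediates.
    powers = {pow(g, k, m) for k in range(1, m)}
    return units <= powers
```

For example, 3 is a primitive root mod 7 because its powers 3, 2, 6, 4, 5, 1 cover every residue coprime to 7, while 2 is not, because its powers only cycle through 2, 4, 1.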
Bonus link: This paper has some nice background, including how to test for primitive roots near the end.
