Is there an Intel SIMD comparison function that returns 0 or 1 instead of 0 or 0xFFFFFFFF?

I'm currently using the Intel SIMD intrinsic _mm_cmplt_ps( V1, V2 ).
It returns a vector containing the result of each component test, depending on whether each V1 component is less than the corresponding V2 component. For example:
XMVECTOR Result;
Result.x = (V1.x < V2.x) ? 0xFFFFFFFF : 0;
Result.y = (V1.y < V2.y) ? 0xFFFFFFFF : 0;
Result.z = (V1.z < V2.z) ? 0xFFFFFFFF : 0;
Result.w = (V1.w < V2.w) ? 0xFFFFFFFF : 0;
return Result;
However, is there a function like this that returns 1 or 0 instead? I'm looking for one that uses SIMD directly rather than a workaround, because the code is supposed to be optimized and vectorized.

You can write that function yourself. It’s only 2 instructions:
// 1.0 for lanes where a < b, zero otherwise
inline __m128 compareLessThan_01( __m128 a, __m128 b )
{
const __m128 cmp = _mm_cmplt_ps( a, b );
return _mm_and_ps( cmp, _mm_set1_ps( 1.0f ) );
}
Here’s a more generic version which returns either of 2 values. It requires SSE 4.1, which is almost universally available by now (97.94% of users). If you have to support SSE2-only hardware, emulate the blend with _mm_and_ps, _mm_andnot_ps, and _mm_or_ps.
// y for lanes where a < b, x otherwise
inline __m128 compareLessThan_xy( __m128 a, __m128 b, float x, float y )
{
const __m128 cmp = _mm_cmplt_ps( a, b );
return _mm_blendv_ps( _mm_set1_ps( x ), _mm_set1_ps( y ), cmp );
}
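The SSE2-only emulation mentioned above could look roughly like the following. This is a minimal sketch, not part of the original answer; the name compareLessThan_xy_sse2 is hypothetical, and only intrinsics from <emmintrin.h> are used.

```cpp
#include <emmintrin.h> // SSE2 (also pulls in the SSE float intrinsics)

// y for lanes where a < b, x otherwise.
// Hypothetical SSE2-only stand-in for the _mm_blendv_ps version.
inline __m128 compareLessThan_xy_sse2( __m128 a, __m128 b, float x, float y )
{
    const __m128 cmp = _mm_cmplt_ps( a, b );                    // all ones or all zeros per lane
    const __m128 takeY = _mm_and_ps( cmp, _mm_set1_ps( y ) );    // y where the mask is set
    const __m128 takeX = _mm_andnot_ps( cmp, _mm_set1_ps( x ) ); // (~cmp) & x elsewhere
    return _mm_or_ps( takeY, takeX );
}
```

The mask from _mm_cmplt_ps is all ones or all zeros in each lane, so the AND / ANDNOT / OR combination selects whole lanes without touching the float bit patterns.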

The DirectXMath no-intrinsics version of _mm_cmplt_ps is actually:
XMVECTORU32 Control = { { {
(V1.vector4_f32[0] < V2.vector4_f32[0]) ? 0xFFFFFFFF : 0,
(V1.vector4_f32[1] < V2.vector4_f32[1]) ? 0xFFFFFFFF : 0,
(V1.vector4_f32[2] < V2.vector4_f32[2]) ? 0xFFFFFFFF : 0,
(V1.vector4_f32[3] < V2.vector4_f32[3]) ? 0xFFFFFFFF : 0
} } };
return Control.v;
XMVECTOR is the same as __m128, which holds 4 floats, so the XMVECTORU32 alias is needed to make sure integer bit patterns are written into it.
I use _mm_movemask_ps for the "Control Register" version of DirectXMath functions. It just collects the top-most bit of each SIMD value.
int result = _mm_movemask_ps(_mm_cmplt_ps( V1, V2 ));
The lower nibble of result will contain bit patterns. A 1 bit for each value that passes the test, and a 0 bit for each value that fails the test. This could be used to reconstruct 1 vs. 0.
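As a rough sketch of that reconstruction (the function names here are hypothetical, not DirectXMath or Intel API names): an integer 1 or 0 per lane can be obtained either by shifting the comparison mask right by 31 bits, or by picking a single bit out of the _mm_movemask_ps result.

```cpp
#include <emmintrin.h> // SSE2

// Integer 1 in lanes where a < b, integer 0 otherwise.
inline __m128i compareLessThan_int01( __m128 a, __m128 b )
{
    const __m128i mask = _mm_castps_si128( _mm_cmplt_ps( a, b ) );
    return _mm_srli_epi32( mask, 31 ); // 0xFFFFFFFF -> 1, 0 -> 0
}

// Scalar 0 or 1 for one lane (0..3), taken from the movemask bit pattern.
inline int laneLessThan( __m128 a, __m128 b, int lane )
{
    const int bits = _mm_movemask_ps( _mm_cmplt_ps( a, b ) );
    return ( bits >> lane ) & 1;
}
```

The first variant stays in SIMD registers; the second moves the result to a general-purpose register, which is usually only worthwhile when the result feeds scalar control flow.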

Related

How to get a logarithmic distribution from an interval

I am currently trying to cut an interval into slices of unequal width. I want the width of each slice to follow a logarithmic rule, so that, for instance, the first slice is wider than the second one, and so on.
I have a hard time remembering my mathematics lectures. Assume I know a and b, the lower and upper boundaries of my interval I, and n, the number of slices:
how can I find the lower and upper boundaries of each slice (following a logarithmic scale)?
In other words, here's what I have done to get equal-width intervals:
for (i = 1; i< p; i++) {
start = lower + i -1 + ((i-1) * size_piece);
if (i == p-1 ) {
end = upper;
} else {
end = start + size_piece;
}
//function(start, end)
}
Where: p-1= number of slices, and size_piece = |b-a|.
What I want now is the start and end values following a logarithmic scale instead of an arithmetic scale (they are then passed to some function inside the for loop).
Thanks in advance for your help.

If I have understood your question, this C++ program will show you a practical example of the algorithm that can be used:
#include <iostream>
#include <cmath>
void my_function( double a, double b ) {
// print out the lower and upper bounds of the slice
std::cout << a << " -- " << b << '\n';
}
int main() {
double start = 0.0, end = 1.0;
int n_slices = 7;
// I want to create 7 slices in a segment of length = end - start
// whose extremes are logarithmically distributed:
// | 1 | 2 | 3 | 4 | 5 |6 |7|
// +-----------------+----------+------+----+---+--+-+
// start end
double scale = (end - start) / log(1.0 + n_slices);
double lower_bound = start;
for ( int i = 0; i < n_slices; ++i ) {
// transform to the interval (1,n_slices+1):
// 1 2 3 4 5 6 7 8
// +-----------------+----------+------+----+---+--+-+
// start end
double upper_bound = start + log(2.0 + i) * scale;
// use the extremes in your function
my_function(lower_bound,upper_bound);
// update
lower_bound = upper_bound;
}
return 0;
}
The output (the extremes of the slices) is:
0 -- 0.333333
0.333333 -- 0.528321
0.528321 -- 0.666667
0.666667 -- 0.773976
0.773976 -- 0.861654
0.861654 -- 0.935785
0.935785 -- 1
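The same computation can be packaged for an arbitrary interval [a, b]. The following is only a sketch under the assumption that n_slices is at least 1; the name log_slice_bounds is hypothetical and does not come from the answer above.

```cpp
#include <cmath>
#include <vector>

// Returns n_slices + 1 boundaries; slice i spans [bounds[i], bounds[i + 1]].
// Boundary i sits at a + log(1 + i) / log(1 + n_slices) * (b - a),
// so the slices get narrower toward b.
std::vector<double> log_slice_bounds(double a, double b, int n_slices)
{
    std::vector<double> bounds;
    bounds.reserve(n_slices + 1);
    const double scale = (b - a) / std::log(1.0 + n_slices);
    for (int i = 0; i <= n_slices; ++i) {
        bounds.push_back(a + std::log(1.0 + i) * scale);
    }
    bounds.back() = b; // pin the last boundary against rounding error
    return bounds;
}
```

Calling log_slice_bounds(0.0, 1.0, 7) yields the same boundaries as the listing above, up to rounding.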

Frame the solution using Dynamic programming

Given a bag with a maximum of 100 chips, each chip has its value written on it.
Determine the fairest division between two persons. This means that the difference between the amounts each person obtains should be minimized. The value of a chip varies from 1 to 1000.
Input: The number of chips m, and the value of each chip.
Output: Minimal positive difference between the amount the two persons obtain when they divide the chips from the corresponding bag.
I am finding it difficult to form a DP solution for it. Please help me.
Initially I tried a non-DP solution; actually, I hadn't thought of solving it using DP. I simply sorted the value array, assigned the largest value to one person, and then assigned each remaining value to whichever of the two persons gave the smaller difference. But that solution didn't actually work.
I am posting my solution here :
// Descending order; std::sort requires a strict weak ordering, so > is used rather than >=
bool myfunction(int i, int j)
{
return(i > j) ;
}
int main()
{
int T, m, sum1, sum2, temp_sum1, temp_sum2,i ;
cin >> T ;
while(T--)
{
cin >> m ;
sum1 = 0 ; sum2 = 0 ; temp_sum1 = 0 ; temp_sum2 = 0 ;
vector<int> arr(m) ;
for(i=0 ; i < m ; i++)
{
cin>>arr[i] ;
}
if(m==1 )
{
if(arr[0]%2==0)
cout<<0<<endl ;
else
cout<<1<<endl ;
}
else {
sort(arr.begin(), arr.end(), myfunction) ;
// vector<int> s1 ;
// vector<int> s2 ;
for(i=0 ; i < m ; i++)
{
temp_sum1 = sum1 + arr[i] ;
temp_sum2 = sum2 + arr[i] ;
if(abs(temp_sum1 - sum2) <= abs(temp_sum2 -sum1))
{
sum1 = sum1 + arr[i] ;
}
else
{
sum2 = sum2 + arr[i] ;
}
temp_sum1 = 0 ;
temp_sum2 = 0 ;
}
cout<<abs(sum1 -sum2)<<endl ;
}
}
return 0 ;
}

What I understand from your question is that you want to divide the chips between two persons so as to minimize the difference between the sums of the numbers written on them.
If that understanding is correct, you can follow the approach below to arrive at a solution.
Sort the values array, i.e. int values[100].
Add elements from both ends of the array in a for loop, i.e. for (i = 0, j = values.length - 1; i < j; i++, j--).
The sum from odd-numbered iterations belongs to one person and the sum from even-numbered iterations to the other.
Run the loop while i < j.
The difference between the two sums obtained in odd and even iterations should then be small, because the array was sorted earlier. Note that this pairing is a heuristic and is not guaranteed to reach the true minimum.
If my understanding of the question is correct, this should resolve your problem.
Reflect as appropriate.
Thanks
Ravindra
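For the DP formulation the question asks about, one common framing is subset-sum reachability: mark every sum up to half the total that some subset of chips can reach, then take the reachable sum closest to half. The sketch below is one possible implementation, not the asker's or the answerer's code; min_partition_difference is a hypothetical name, and it returns the smallest difference, which may be 0.

```cpp
#include <numeric>
#include <vector>

// Smallest possible |sum1 - sum2| when the values are split into two groups.
// With at most 100 chips worth at most 1000 each, the total is at most 100000.
int min_partition_difference(const std::vector<int>& values)
{
    const int total = std::accumulate(values.begin(), values.end(), 0);
    const int half = total / 2;

    // reachable[s] != 0 means some subset of the chips sums to exactly s.
    std::vector<char> reachable(half + 1, 0);
    reachable[0] = 1;
    for (int v : values) {
        // Go downwards so that each chip is used at most once.
        for (int s = half; s >= v; --s) {
            if (reachable[s - v]) reachable[s] = 1;
        }
    }

    // The best split gives one person the largest reachable sum <= half.
    int best = 0;
    for (int s = half; s >= 0; --s) {
        if (reachable[s]) { best = s; break; }
    }
    return total - 2 * best;
}
```

For the values 1, 6, 11, 5 this gives 1 (11 against 1 + 6 + 5 = 12). The running time is proportional to the number of chips times half the total, which is small for the stated limits.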

Recursive method that reverses numbers?

public void printReverseDigits( int input )
Prints the digits in the input integer, in reverse order. You may
assume the input will always be greater than 0. For example:
> RecursionFun f = new RecursionFun()
> f.printReverseDigits( 12345 )
54321
> f.printReverseDigits( 20 )
02
> f.printReverseDigits( 404 )
404
> f.printReverseDigits( 1 )
1
I don't even know where to start on this ^. We can't use loops or anything of that sort... only recursion, if statements, stuff like that.
Any ideas on how to even begin? :( I don't get it...

You need to print out the units - number % 10,
then remove the units - number / 10,
and carry on if the number is non 0 using recursion instead of a loop
void printReverseDigits( int num )
{
printf( "%d", num % 10 ); // print the units digit first
num /= 10; // then drop it
if( num )
{
printReverseDigits( num ); // recurse on the remaining digits
}
}

It will be easy if the number of digits is known. If not, you could find it by checking whether the number is at most 9, then 99, then 999, and so on.
If the number is 404, if(input<=999) will return true, so we know it has three digits.
...in a loop for number of digits....
digit[i] = input % 10;
input = input / 10;
Then you could combine the digits in reverse and return it.
For recursion:
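int printReverseDigits(int input)
{
int digit, rest; // "new" is a reserved word in C++, so the variable is called rest
if(!input) return 0;
digit=input%10;
cout<<digit; // print before recursing, otherwise the digits come out in the original order
rest=printReverseDigits(input/10);
return rest+digit; // sum of the digits; not needed for printing
}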

Very simple really. Here is a C++ solution.
#include <iostream>
using namespace std;
void recursivePrintVals(const int someNum) {
if(!someNum) return;
cout << someNum % 10;
recursivePrintVals(someNum/10);
}
int main() {
recursivePrintVals(123456789);
}
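If the reversed digits are needed as a value rather than printed, the same recursion can build a string. This is a small sketch assuming a non-negative input; reverse_digits is a hypothetical name. Returning a string keeps leading zeros, such as the "02" in the question's example.

```cpp
#include <string>

// Digits of n in reverse order, e.g. 12345 -> "54321", 20 -> "02".
// Assumes n >= 0.
std::string reverse_digits(int n)
{
    // The last decimal digit comes first in the result.
    const std::string last(1, static_cast<char>('0' + n % 10));
    if (n < 10) {
        return last; // base case: a single digit
    }
    return last + reverse_digits(n / 10); // then the rest, reversed
}
```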

Codility K-Sparse Test Spoilers

Have you tried the latest Codility test?
I felt like there was an error in the definition of what a K-Sparse number is that left me confused and I wasn't sure what the right way to proceed was. So it starts out by defining a K-Sparse Number:
In the binary number "100100010000" there are at least two 0s between
any two consecutive 1s. In the binary number "100010000100010" there
are at least three 0s between any two consecutive 1s. A positive
integer N is called K-sparse if there are at least K 0s between any
two consecutive 1s in its binary representation. (My emphasis)
So the first number you see, 100100010000 is 2-sparse and the second one, 100010000100010, is 3-sparse. Pretty simple, but then it gets down into the algorithm:
Write a function:
class Solution { public int sparse_binary_count(String S,String T,int K); }
that, given:
string S containing a binary representation of some positive integer A,
string T containing a binary representation of some positive integer B,
a positive integer K.
returns the number of K-sparse integers within the range [A..B] (both
ends included)
and then states this test case:
For example, given S = "101" (A = 5), T = "1111" (B=15) and K=2, the
function should return 2, because there are just two 2-sparse integers
in the range [5..15], namely "1000" (i.e. 8) and "1001" (i.e. 9).
Basically it is saying that 8, or 1000 in base 2, is a 2-sparse number, even though it does not have two consecutive ones in its binary representation. What gives? Am I missing something here?

I tried solving that one. The problem's assumption that binary representations of powers of two are K-sparse by default is somewhat confusing and counterintuitive.
What I understood was that 8 --> 1000 is 2 to the power 3, so 8 is 3-sparse; 16 --> 10000 is 2 to the power 4, and hence 4-sparse.
Even if we assume that is true, and if you are interested, below is my solution code (C-style C++) for this problem. It doesn't handle some cases correctly when there are powers of two between the two input numbers; I am trying to see if I can fix that:
int sparse_binary_count (const string &S,const string &T,int K)
{
char buf[50];
char *str1,*tptr,*Sstr,*Tstr;
int i,len1,len2,cnt=0;
long int num1,num2;
char *pend,*ch;
Sstr = (char *)S.c_str();
Tstr = (char *)T.c_str();
str1 = (char *)malloc(300001);
tptr = str1;
num1 = strtol(Sstr,&pend,2);
num2 = strtol(Tstr,&pend,2);
for(i=0;i<K;i++)
{
buf[i] = '0';
}
buf[i] = '\0';
for(i=num1;i<=num2;i++)
{
str1 = tptr;
if( (i & (i-1))==0)
{
if(i >= (pow((float)2,(float)K)))
{
cnt++;
continue;
}
}
str1 = myitoa(i,str1,2);
ch = strstr(str1,buf);
if(ch == NULL)
continue;
else
{
if((i % 2) != 0)
cnt++;
}
}
return cnt;
}
char* myitoa(int val, char *buf, int base){
int i = 299999;
int cnt=0;
for(; val && i ; --i, val /= base)
{
buf[i] = "0123456789abcdef"[val % base];
cnt++;
}
buf[i+cnt+1] = '\0';
return &buf[i+1];
}

There was a note within the test details showing this specific case. According to it, any power of 2 is considered K-sparse for any K.
You can solve this simply with binary operations on integers. You can even determine that there are no K-sparse integers bigger than some specific integer and lower than (or equal to) the integer represented by T.
As far as I can see, you must also pay a lot of attention to performance, as there are sometimes hundreds of millions of integers to be checked.
My own solution, written in Python, which works very efficiently even on large ranges of integers and was successfully tested on many inputs, has failed. The results were not very descriptive, saying only that it does not work as required by the question (although it meets all the requirements, in my opinion).

Solution with bitwise operators:
With 32 bits per int (on a 32-bit system), check for the pattern (for K=2, like 1001 or 1000) at each shift and increment the count; repeat this for all numbers in the range.
int KsparseNumbers(int a, int b, int s) {
int nbits = sizeof(int)*8;
int slen = 0;
int lslen = pow(2, s);
int scount = 0;
int i = 0;
for (; i < s; ++i) {
slen += pow(2, i);
}
printf("\n slen = %d\n", slen);
for(; a <= b; ++a) {
int num = a;
for(i = 0 ; i < nbits-2; ++i) {
if ( (num & slen) == 0 && (num & lslen) ) {
scount++;
printf("\n Scount = %d\n", scount);
break;
}
num >>=1;
}
}
return scount;
}
int main() {
printf("\n No of 2-sparse numbers between 5 and 15 = %d\n", KsparseNumbers(5, 15, 2));
}
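The definition quoted in the question can also be checked one bit at a time: walk the bits from the least significant end and count the zeros since the previous 1. The sketch below follows that reading, under which a number with a single 1 bit (a power of two) is K-sparse for any K. The names is_k_sparse and count_k_sparse are hypothetical, and the brute-force count is only meant for small ranges; it does not address the performance concerns raised above.

```cpp
#include <cstdint>

// True if every pair of consecutive 1 bits in n is separated by at least k zeros.
bool is_k_sparse(std::uint64_t n, int k)
{
    int zeros_since_one = -1; // -1 means no 1 bit has been seen yet
    while (n != 0) {
        if (n & 1u) {
            if (zeros_since_one >= 0 && zeros_since_one < k) {
                return false; // two 1 bits too close together
            }
            zeros_since_one = 0;
        } else if (zeros_since_one >= 0) {
            ++zeros_since_one;
        }
        n >>= 1;
    }
    return true;
}

// Counts the k-sparse integers in [a, b] by brute force (small ranges only).
long long count_k_sparse(std::uint64_t a, std::uint64_t b, int k)
{
    long long count = 0;
    for (std::uint64_t n = a; n <= b; ++n) {
        if (is_k_sparse(n, k)) ++count;
    }
    return count;
}
```

For the test case from the question, count_k_sparse(5, 15, 2) counts 8 and 9 and returns 2.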

Mathematically Find Max Value without Conditional Comparison

----------Updated ------------
codymanix and moonshadow have been a big help thus far. I was able to solve my problem using the equations; instead of using a right shift, I divided by 29, because with 32-bit signed numbers 2^31 overflows to 29, which works!
Prototype in PHP
$r = $x - (($x - $y) & (($x - $y) / (29)));
Actual code for LEADS (you can only do one math function PER LINE!!! AHHHH!!!)
DERIVED1 = IMAGE1 - IMAGE2;
DERIVED2 = DERIVED1 / 29;
DERIVED3 = DERIVED1 AND DERIVED2;
MAX = IMAGE1 - DERIVED3;
----------Original Question-----------
I don't think this is quite possible with my application's limitations but I figured it's worth a shot to ask.
I'll try to make this simple. I need to find the max value between two numbers without being able to use an IF or any conditional statement.
In order to find the MAX value I can only perform the following functions
Divide, Multiply, Subtract, Add, NOT, AND ,OR
Let's say I have two numbers
A = 60;
B = 50;
Now if A is always greater than B it would be simple to find the max value
MAX = (A - B) + B;
ex.
10 = (60 - 50)
10 + 50 = 60 = MAX
The problem is that A is not always greater than B. I cannot perform ABS, MAX, MIN or conditional checks with the scripting application I am using.
Is there any way possible using the limited operation above to find a value VERY close to the max?

finding the maximum of 2 variables:
max = a-((a-b)&((a-b)>>31))
where >> is bitwise right-shift (also called SHR or ASR depending on signedness).
Instead of 31 you use the number of bits your numbers have minus one.
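A sketch of this formula in C++, with the subtraction widened to 64 bits so that a - b cannot overflow for 32-bit inputs. It assumes that >> on a negative signed value is an arithmetic shift (guaranteed since C++20, and what mainstream compilers do in practice); branchless_max is a hypothetical name.

```cpp
#include <cstdint>

// max(a, b) without comparisons or branches.
std::int32_t branchless_max(std::int32_t a, std::int32_t b)
{
    // Widen first so that a - b cannot overflow.
    const std::int64_t diff = static_cast<std::int64_t>(a) - b;
    // Arithmetic shift: all ones when diff < 0, zero otherwise.
    const std::int64_t mask = diff >> 63;
    // diff < 0: a - diff == b; otherwise a - 0 == a.
    return static_cast<std::int32_t>(a - (diff & mask));
}
```

With the question's numbers, branchless_max(60, 50) and branchless_max(50, 60) both give 60.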

I guess this would be the simplest approach, if we manage to find the difference between the two numbers (only the magnitude, not the sign):
max = ((a+b)+|a-b|)/2;
where |a-b| is a magnitude of difference between a and b.

If you can't trust your environment to generate the appropriate branchless operations when they are available, see this page for how to proceed. Note the restriction on input range; use a larger integer type for the operation if you cannot guarantee your inputs will fit.

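Solution without conditionals. Note that casting to unsigned int and back does not change the value, so the absolute value is computed with a sign mask instead (assuming 32-bit int and an arithmetic right shift).
int abs (int a) { int m = a >> 31; return (a ^ m) - m; }
int max (int a, int b) { return (a + b + abs(a - b)) / 2; }
int max3 (int a, int b, int c) { return max(max(a, b), c); }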

Using logical operations only, short circuit evaluation and assuming the C convention of rounding towards zero, it is possible to express this as:
int lt0(int x) {
return x && (!!((x-1)/x));
}
int mymax(int a, int b) {
// lt0(a-b) is 1 when a < b; its complement selects a, so a == b also works
return lt0(a-b)*b+(1-lt0(a-b))*a;
}
The basic idea is to implement a comparison operator that will return 0 or 1. It's possible to do a similar trick if your scripting language follows the convention of rounding toward the floor value like python does.

function Min(x,y:integer):integer;
Var
d:integer;
abs:integer;
begin
d:=x-y;
abs:=d*(1-2*((3*d) div (3*d+1)));
Result:=(x+y-abs) div 2;
end;

Hmmm. I assume NOT, AND, and OR are bitwise? If so, there's going to be a bitwise expression to solve this. Note that A | B will give a number >= A and >= B. Perhaps there's a pruning method for selecting the number with the most bits.
To extend, we need the following to determine whether A (0) or B (1) is greater.
truth table:
0|0 = 0
0|1 = 1
1|0 = 0
1|1 = 0
!A and B
therefore, will give the index of the greater bit. Ergo, compare each bit in both numbers, and when they are different, use the above expression (Not A And B) to determine which number was greater. Start from the most significant bit and proceed down both bytes. If you have no looping construct, manually compare each bit.
Implementing "when they are different":
(A != B) AND (my logic here)

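Try this (but beware of overflows).
(Code in C#)
public static Int32 Maximum(params Int32[] values)
{
// Start from the first element; starting from Int32.MinValue makes (i - retVal) overflow
Int32 retVal = values[0];
foreach (Int32 i in values)
// (i - retVal) >> 31 is all ones when i < retVal, so retVal is kept; otherwise i is taken
retVal = i - (((i - retVal) >> 31) & (i - retVal));
return retVal;
}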

You can express this as a series of arithmetic and bitwise operations, e.g.:
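// CHAR_BIT comes from <climits>
int myabs(const int& in) {
// tmp is all ones for negative input (arithmetic shift), zero otherwise
const int tmp = in >> ((sizeof(int) * CHAR_BIT) - 1);
return (in ^ tmp) - tmp;
}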
int mymax(int a, int b) {
return ((a+b) + myabs(b-a)) / 2;
}

//Assuming 32 bit integers
int is_diff_positive(int num)
{
return ((num & 0x80000000) >> 31) ^ 1; // if diff positive ret 1 else 0
}
int sign(int x)
{
return ((x & 0x80000000) >> 31); // 1 if x is negative, else 0
}
int flip(int x)
{
return x ^ 1;
}
int max(int a, int b)
{
// unsigned subtraction wraps instead of overflowing
int diff = (int)((unsigned int)a - (unsigned int)b);
int is_pos_a = flip(sign(a));
int is_pos_b = flip(sign(b));
int diff_pos = is_diff_positive(diff);
int diff_neg = flip(diff_pos);
// diff (a - b) will overflow / underflow if signs are opposite
// ex: a = INT_MAX , b = -3 then a - b => INT_MAX - (-3) => INT_MAX + 3
int can_overflow = is_pos_a ^ is_pos_b;
int cannot_overflow = flip(can_overflow);
// same signs: pick by the sign of diff; opposite signs: pick the non-negative one
int res = (cannot_overflow * ((a * diff_pos) + (b * diff_neg)))
+ (can_overflow * ((a * is_pos_a) + (b * is_pos_b)));
return res;
}

This is my implementation using only +, -, *, %, / operators
using static System.Console;
int Max(int a, int b) => (a + b + Abs(a - b)) / 2;
int Abs(int x) => x * ((2 * x + 1) % 2);
WriteLine(Max(-100, -2) == -2); // true
WriteLine(Max(2, -100) == 2); // true

I just came up with an expression:
(( (a-b)-|a-b| ) / (2(a-b)) )*b + (( (b-a)-|b-a| )/(2(b-a)) )*a
which is equal to a if a>b and equal to b if b>a (it is undefined for a = b, since both denominators are then zero).
When a>b:
a-b>0, a-b = |a-b|, (a-b)-|a-b| = 0, so the coefficient for b is 0
b-a<0, b-a = -|b-a|, (b-a)-|b-a| = 2(b-a),
so the coefficient for a is 2(b-a)/(2(b-a)), which is 1
so it ultimately returns 0*b+1*a if a is bigger, and vice versa.

Find MAX between n & m
MAX = ( (n/2) + (m/2) + ( ((n/2) - (m/2)) * ( (2*((n/2) - (m/2)) + 1) % 2) ) )
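Using #define in C (arguments are parenthesized so that expressions can be passed safely; because of the integer halving, the result can be off by one for odd inputs):
#define MAX(n, m) ( ((n)/2) + ((m)/2) + ( (((n)/2) - ((m)/2)) * ( (2*(((n)/2) - ((m)/2)) + 1) % 2) ) )
or
#define ABS(n) ( (n) * ( (2*(n) + 1) % 2) ) // Calculates abs value of n
#define MAX(n, m) ( ((n)/2) + ((m)/2) + ABS(((n)/2) - ((m)/2)) ) // Finds max between n & m
#define MIN(n, m) ( ((n)/2) + ((m)/2) - ABS(((n)/2) - ((m)/2)) ) // Finds min between n & m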

please look at this program.. this might be the best answer till date on this page...
#include <stdio.h>
int main()
{
int a,b;
a=3;
b=5;
printf("%d %d\n",a,b);
b = (a+b)-(a=b); // this line is doing the reversal
printf("%d %d\n",a,b);
return 0;
}

If A is always greater than B .. [ we can use] .. MAX = (A - B) + B;
No need. Just use: int maxA(int A, int B){ return A;}
(1) If conditionals are allowed you do max = a>b ? a : b.
(2) Any other method either use a defined set of numbers or rely on the implicit conditional checks.
(2a) max = a-((a-b)&((a-b)>>31)) this is neat, but it only works if you use 32 bit numbers. You can expand it arbitrary large number N, but the method will fail if you try to find max(N-1, N+1). This algorithm works for finite state automata, but not a Turing machine.
(2b) The magnitude |a-b| is itself a condition: |a-b| = (a-b > 0) ? a-b : b-a
What about:
Square root is also a condition. Whenever c>0 and c^2 = d, there is a second solution -c, because (-c)^2 = (-1)^2*c^2 = 1*c^2 = d. Square root returns the greater of the pair. It comes with a built-in int max(int c1, int c2){return max(c1, c2);}
Without comparison operator math is very symmetric as well as limited in power. Positive and negative numbers cannot be distinguished without if of some sort.

It depends which language you're using, but the Ternary Operator might be useful.
But then, if you can't perform conditional checks in your 'scripting application', you probably don't have the ternary operator.

using System;
namespace ConsoleApp2
{
class Program
{
static void Main(string[] args)
{
float a = 101, b = 15;
float max = (a + b) / 2 + ((a > b) ? a - b : b - a) / 2;
}
}
}

#region GetMaximumNumber
/// <summary>
/// Provides method to get maximum values.
/// </summary>
/// <param name="values">Integer array for getting maximum values.</param>
/// <returns>Maximum number from an array.</returns>
private int GetMaximumNumber(params int[] values)
{
// Declare to store the maximum number.
int maximumNumber = 0;
try
{
// Check that array is not null and array has an elements.
if (values != null &&
values.Length > 0)
{
// Sort the array in ascending order for getting maximum value.
Array.Sort(values);
// Get the last value from an array which is always maximum.
maximumNumber = values[values.Length - 1];
}
}
catch (Exception ex)
{
throw ex;
}
return maximumNumber;
}
#endregion
