Testing approximate equality of two doubles - math

Given these two double values
double d1 = 1.0E-24;
double d2 = 1.0000000000000029E-24;
and the following function
public static boolean isEqual(double d0, double d1, double epsilon) {
return d0 == d1 ? true : Math.abs(d0 - d1) < epsilon;
}
How can I calculate the smallest epsilon (the least double distance) so that the term Math.abs(d0 - d1) < epsilon evaluates to true?
edited in response to comment:
I'm aware of the method Double.doubleToRawLongBits but still need some clarification which bits I need to compare.
So the binary representation of both double values is
d1: 0 11101011110 0011010101111100001010011001101010001000111010100111
d2: 0 11101011110 0011010101111100001010011001101010001000111010110111
Do I just need to compare the mantissa bits like this?
isEqual(getMantissaLongBits(d1), getMantissaLongBits(d2), epsilon(getMantissaLongBits(d1), getMantissaLongBits(d2),49))?
private static long getMantissaLongBits(double x) {
return Double.doubleToRawLongBits(x) & 0x000fffffffffffffL;
}
public static double epsilon(final double one, final double other, final int bits) {
return Math.max(Math.scalb(Math.max(Math.abs(one), Math.abs(other)), -bits),
Math.scalb(Double.MIN_NORMAL, 52 - bits));
}
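One way to look at the question is in terms of adjacent representable doubles: the least epsilon for which `Math.abs(d0 - d1) < epsilon` holds is the next double above the absolute difference. The sketch below illustrates that idea in Python. The helper names `next_up` and `ulp_distance` are hypothetical, and it assumes finite, non-negative inputs for `next_up` and same-sign inputs for `ulp_distance`. In Java, `Math.nextUp` and `Math.ulp` serve the same purpose.

```python
import struct

def next_up(x: float) -> float:
    """Return the smallest double greater than x (assumes x is finite and >= 0)."""
    bits = struct.unpack("<q", struct.pack("<d", x))[0]
    return struct.unpack("<d", struct.pack("<q", bits + 1))[0]

def ulp_distance(a: float, b: float) -> int:
    """Number of representable steps between two finite doubles of the same sign."""
    bits_a = struct.unpack("<q", struct.pack("<d", a))[0]
    bits_b = struct.unpack("<q", struct.pack("<d", b))[0]
    return abs(bits_a - bits_b)

d1 = 1.0e-24
d2 = 1.0000000000000029e-24
diff = abs(d1 - d2)
# Least double for which `diff < epsilon` evaluates to true.
smallest_epsilon = next_up(diff)
print(smallest_epsilon, ulp_distance(d1, d2))
```

Comparing the raw bit patterns as integers, rather than masking out the mantissa alone, also handles pairs of values that straddle an exponent boundary.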

Related

Calculating Standard Deviation in Assignment

I have an assignment for my class that reads like this: Write a class called Stats. The constructor will take no input. There will be a method addData(double a) which will be used to add a value from the test program. Methods getCount(), getAverage() and getStandardDeviation() will return the appropriate values as doubles.
Here's what I have so far:
public class Stats
{
    // Running totals updated by addData.
    private double sum = 0;
    private double count = 0;
    private double sumsq = 0;

    public Stats()
    {
    }

    public void addData(double a)
    {
        sum = sum + a;
        sumsq = sumsq + Math.pow(a, 2);
        count = count + 1;
    }

    public double getCount()
    {
        return count;
    }

    public double getAverage()
    {
        return sum / count;
    }

    public double getStandardDeviation()
    {
        // This is the part I can't figure out.
        throw new UnsupportedOperationException("not implemented yet");
    }
}
My problem is figuring out how to calculate the standard deviation using the variables I've defined.
Thanks guys!
With only those running totals you cannot apply the defining formula directly, so the most straightforward approach is to keep the original data and compute
sigma = Math.sqrt( sum(Math.pow(x-mean, 2)) / count )
So,
(1) Create a private list (for example an ArrayList<Double> named values) and add each value to it in addData. That's all you need to do in addData.
(2) getCount = length of the list
(3) getAverage = sum of values in list / getCount()
(4) getStandardDeviation is something like
double avg = getAverage();
double cnt = getCount();
double sumsq = 0;
for (int i = 0; i < values.size(); i++) {
sumsq += Math.pow(values.get(i) - avg, 2);
}
return Math.sqrt(sumsq / cnt);
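As an alternative to storing every value, the population variance can also be written as the mean of the squares minus the square of the mean, so the three running totals from the question are enough in principle. The following Python sketch illustrates that identity. The method names are adapted to Python style and are not part of the assignment, and this form can lose precision when the mean is large relative to the spread.

```python
import math

class Stats:
    """Running statistics that keep only count, sum and sum of squares."""

    def __init__(self) -> None:
        self._count = 0
        self._sum = 0.0
        self._sumsq = 0.0

    def add_data(self, a: float) -> None:
        self._count += 1
        self._sum += a
        self._sumsq += a * a

    def get_count(self) -> float:
        return float(self._count)

    def get_average(self) -> float:
        return self._sum / self._count

    def get_standard_deviation(self) -> float:
        # Population variance: E[x^2] - (E[x])^2.
        mean = self.get_average()
        variance = self._sumsq / self._count - mean * mean
        # Clamp tiny negative values caused by rounding.
        return math.sqrt(max(variance, 0.0))

stats = Stats()
for value in (2, 4, 4, 4, 5, 5, 7, 9):
    stats.add_data(value)
print(stats.get_count(), stats.get_average(), stats.get_standard_deviation())
```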

Codility K-Sparse Test **Spoilers**

Have you tried the latest Codility test?
I felt like there was an error in the definition of what a K-Sparse number is that left me confused and I wasn't sure what the right way to proceed was. So it starts out by defining a K-Sparse Number:
In the binary number "100100010000" there are at least two 0s between
any two consecutive 1s. In the binary number "100010000100010" there
are at least three 0s between any two consecutive 1s. A positive
integer N is called K-sparse if there are at least K 0s between any
two consecutive 1s in its binary representation. (My emphasis)
So the first number you see, 100100010000 is 2-sparse and the second one, 100010000100010, is 3-sparse. Pretty simple, but then it gets down into the algorithm:
Write a function:
class Solution { public int sparse_binary_count(String S,String T,int K); }
that, given:
string S containing a binary representation of some positive integer A,
string T containing a binary representation of some positive integer B,
a positive integer K.
returns the number of K-sparse integers within the range [A..B] (both
ends included)
and then states this test case:
For example, given S = "101" (A = 5), T = "1111" (B=15) and K=2, the
function should return 2, because there are just two 2-sparse integers
in the range [5..15], namely "1000" (i.e. 8) and "1001" (i.e. 9).
Basically it is saying that 8, or 1000 in base 2, is a 2-sparse number, even though it does not have two consecutive ones in its binary representation. What gives? Am I missing something here?
Tried solving that one. The assumption the problem makes, that binary representations of powers of two are K-sparse by default, is somewhat confusing and counterintuitive.
What I understood was that 8 --> 1000 is 2 to the power 3, so 8 is 3-sparse; 16 --> 10000 is 2 to the power 4, and hence 4-sparse.
Even if we assume that is true, and if you are interested, below is my solution code (C++) for this problem. It doesn't handle some cases correctly, where there are powers of two between the two input numbers; I'm trying to see if I can fix that:
int sparse_binary_count (const string &S,const string &T,int K)
{
char buf[50];
char *str1,*tptr,*Sstr,*Tstr;
int i,len1,len2,cnt=0;
long int num1,num2;
char *pend,*ch;
Sstr = (char *)S.c_str();
Tstr = (char *)T.c_str();
str1 = (char *)malloc(300001);
tptr = str1;
num1 = strtol(Sstr,&pend,2);
num2 = strtol(Tstr,&pend,2);
for(i=0;i<K;i++)
{
buf[i] = '0';
}
buf[i] = '\0';
for(i=num1;i<=num2;i++)
{
str1 = tptr;
if( (i & (i-1))==0)
{
if(i >= (pow((float)2,(float)K)))
{
cnt++;
continue;
}
}
str1 = myitoa(i,str1,2);
ch = strstr(str1,buf);
if(ch == NULL)
continue;
else
{
if((i % 2) != 0)
cnt++;
}
}
return cnt;
}
char* myitoa(int val, char *buf, int base){
int i = 299999;
int cnt=0;
for(; val && i ; --i, val /= base)
{
buf[i] = "0123456789abcdef"[val % base];
cnt++;
}
buf[i+cnt+1] = '\0';
return &buf[i+1];
}
There was a note in the test details showing this specific case. According to it, any power of 2 is considered K-sparse for any K.
You can solve this simply with binary operations on integers. You can even tell that there are no K-sparse integers larger than some specific integer and lower than (or equal to) the integer represented by T.
As far as I can see, you must also pay a lot of attention to performance, as there are sometimes hundreds of millions of integers to be checked.
My own solution, written in Python, which works very efficiently even on large ranges of integers and was successfully tested on many inputs, has failed. The results were not very descriptive, saying only that it does not work as the question requires (although it meets all the requirements in my opinion).
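Read literally, the definition is satisfied vacuously by any number with a single 1 bit, since there is no pair of consecutive 1s to violate it. A brute-force Python sketch of that reading is shown below. The helper `is_k_sparse` is a hypothetical name, and this version checks every integer in the range, so it only illustrates the definition and is not fast enough for very large ranges.

```python
def is_k_sparse(n: int, k: int) -> bool:
    """True if any two consecutive 1 bits of n have at least k 0s between them."""
    gap = None  # 0s seen since the last 1 bit; None until the first 1 appears
    while n > 0:
        if n & 1:
            if gap is not None and gap < k:
                return False
            gap = 0
        elif gap is not None:
            gap += 1
        n >>= 1
    return True

def sparse_binary_count(s: str, t: str, k: int) -> int:
    """Count the k-sparse integers in [A..B], given A and B as binary strings."""
    a, b = int(s, 2), int(t, 2)
    return sum(1 for n in range(a, b + 1) if is_k_sparse(n, k))

print(sparse_binary_count("101", "1111", 2))  # the example from the task: 8 and 9
```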
A solution with bitwise operators: for each number in the range, shift through its bits, track the number of 0s seen since the previous 1 (for K=2, patterns like 1001 and 1000 pass), and count the number if every gap is at least K.
#include <stdio.h>

/* Returns 1 if any two consecutive 1 bits of num have at least s 0s between them. */
int isKSparse(unsigned int num, int s) {
    int gap = s; /* 0s since the previous 1; starts at s so the lowest 1 always passes */
    for (; num != 0; num >>= 1) {
        if (num & 1) {
            if (gap < s) return 0;
            gap = 0;
        } else {
            ++gap;
        }
    }
    return 1;
}

int KsparseNumbers(int a, int b, int s) {
    int scount = 0;
    for (; a <= b; ++a) {
        if (isKSparse((unsigned int)a, s)) {
            scount++;
        }
    }
    return scount;
}

int main() {
    printf("\n No of 2-sparse numbers between 5 and 15 = %d\n", KsparseNumbers(5, 15, 2));
    return 0;
}

Interpolating values between interval, interpolation as per Bezier curve

To implement a 2D animation I am looking for interpolating values between two key frames with the velocity of change defined by a Bezier curve. The problem is Bezier curve is represented in parametric form whereas requirement is to be able to evaluate the value for a particular time.
To elaborate, let's say the value is to be interpolated from 10 to 40 across 4 seconds, with the value changing not at a constant rate but as defined by a Bezier curve represented as 0,0 0.2,0.3 0.5,0.5 1,1.
Now if I am drawing at 24 frames per second, I need to evaluate the value for every frame. How can I do this? I looked at the De Casteljau algorithm and thought that dividing the curve into 24*4 pieces for 4 seconds would solve my problem, but that sounds erroneous, as time is along the "x" axis and not along the curve.
To further simplify:
If I draw the curve in a plane, the x axis represents the time and the y axis the value I am looking for. What I actually require is to be able to find the "y" corresponding to a given "x". Then I can divide x into 24 divisions and know the value for each frame.
I was facing the same problem: Every animation package out there seems to use Bézier curves to control values over time, but there is no information out there on how to implement a Bézier curve as a y(x) function. So here is what I came up with.
A standard cubic Bézier curve in 2D space can be defined by the four points P0=(x0, y0) .. P3=(x3, y3).
P0 and P3 are the end points of the curve, while P1 and P2 are the handles affecting its shape. Using a parameter t ϵ [0, 1], the x and y coordinates for any given point along the curve can then be determined using the equations
A) x = (1-t)³ x0 + 3t(1-t)² x1 + 3t²(1-t) x2 + t³ x3 and
B) y = (1-t)³ y0 + 3t(1-t)² y1 + 3t²(1-t) y2 + t³ y3.
What we want is a function y(x) that, given an x coordinate, will return the corresponding y coordinate of the curve. For this to work, the curve must move monotonically from left to right, so that it doesn't occupy the same x coordinate more than once on different y positions. The easiest way to ensure this is to restrict the input points so that x0 < x3 and x1, x2 ϵ [x0, x3]. In other words, P0 must be to the left of P3 with the two handles between them.
In order to calculate y for a given x, we must first determine t from x. Getting y from t is then a simple matter of applying t to equation B.
I see two ways of determining t for a given x.
First, you might try a binary search for t. Start with a lower bound of 0 and an upper bound of 1 and calculate x for these values for t via equation A. Keep bisecting the interval until you get a reasonably close approximation. While this should work fine, it will neither be particularly fast nor very precise (at least not both at once).
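A minimal Python sketch of this bisection idea follows, using the control points and the 10-to-40 interpolation over 4 seconds at 24 frames per second from the question. The function names and the `tolerance` parameter are assumptions made for illustration, and the sketch relies on x increasing monotonically with t, as required above.

```python
def bezier(t, p0, p1, p2, p3):
    """Evaluate one coordinate of a cubic Bezier curve at parameter t."""
    u = 1.0 - t
    return u**3 * p0 + 3 * t * u**2 * p1 + 3 * t**2 * u * p2 + t**3 * p3

def y_for_x(x, xs, ys, tolerance=1e-9):
    """Find t with x(t) close to x by bisection, then return y(t)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tolerance:
        mid = (lo + hi) / 2
        if bezier(mid, *xs) < x:
            lo = mid
        else:
            hi = mid
    return bezier((lo + hi) / 2, *ys)

xs = (0.0, 0.2, 0.5, 1.0)  # time axis of the control points
ys = (0.0, 0.3, 0.5, 1.0)  # eased progress at those control points
frames = 24 * 4            # 4 seconds at 24 frames per second
values = [10 + (40 - 10) * y_for_x(i / frames, xs, ys) for i in range(frames + 1)]
print(values[0], values[frames // 2], values[-1])
```

Each halving of the interval adds one bit of precision in t, so a tolerance of 1e-9 takes about 30 iterations per frame.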
The second approach is to actually solve equation A for t. That's a bit tough to implement because the equation is cubic. On the other hand, calculation becomes really fast and yields precise results.
Equation A can be rewritten as
(-x0+3x1-3x2+x3)t³ + (3x0-6x1+3x2)t² + (-3x0+3x1)t + (x0-x) = 0.
Inserting your actual values for x0..x3, we get a cubic equation of the form at³ + bt² + ct + d = 0 for which we know there is only one solution within [0, 1]. We can now solve this equation using an algorithm like the one posted in this Stack Overflow answer.
The following is a little C# class demonstrating this approach. It should be simple enough to convert it to a language of your choice.
using System;
public class Point {
public Point(double x, double y) {
X = x;
Y = y;
}
public double X { get; private set; }
public double Y { get; private set; }
}
public class BezierCurve {
public BezierCurve(Point p0, Point p1, Point p2, Point p3) {
P0 = p0;
P1 = p1;
P2 = p2;
P3 = p3;
}
public Point P0 { get; private set; }
public Point P1 { get; private set; }
public Point P2 { get; private set; }
public Point P3 { get; private set; }
public double? GetY(double x) {
// Determine t
double t;
if (x == P0.X) {
// Handle corner cases explicitly to prevent rounding errors
t = 0;
} else if (x == P3.X) {
t = 1;
} else {
// Calculate t
double a = -P0.X + 3 * P1.X - 3 * P2.X + P3.X;
double b = 3 * P0.X - 6 * P1.X + 3 * P2.X;
double c = -3 * P0.X + 3 * P1.X;
double d = P0.X - x;
double? tTemp = SolveCubic(a, b, c, d);
if (tTemp == null) return null;
t = tTemp.Value;
}
// Calculate y from t
return Cubed(1 - t) * P0.Y
+ 3 * t * Squared(1 - t) * P1.Y
+ 3 * Squared(t) * (1 - t) * P2.Y
+ Cubed(t) * P3.Y;
}
// Solves the equation ax³+bx²+cx+d = 0 for x ϵ ℝ
// and returns the first result in [0, 1] or null.
private static double? SolveCubic(double a, double b, double c, double d) {
if (a == 0) return SolveQuadratic(b, c, d);
if (d == 0) return 0;
b /= a;
c /= a;
d /= a;
double q = (3.0 * c - Squared(b)) / 9.0;
double r = (-27.0 * d + b * (9.0 * c - 2.0 * Squared(b))) / 54.0;
double disc = Cubed(q) + Squared(r);
double term1 = b / 3.0;
if (disc > 0) {
double s = r + Math.Sqrt(disc);
s = (s < 0) ? -CubicRoot(-s) : CubicRoot(s);
double t = r - Math.Sqrt(disc);
t = (t < 0) ? -CubicRoot(-t) : CubicRoot(t);
double result = -term1 + s + t;
if (result >= 0 && result <= 1) return result;
} else if (disc == 0) {
double r13 = (r < 0) ? -CubicRoot(-r) : CubicRoot(r);
double result = -term1 + 2.0 * r13;
if (result >= 0 && result <= 1) return result;
result = -(r13 + term1);
if (result >= 0 && result <= 1) return result;
} else {
q = -q;
double dum1 = q * q * q;
dum1 = Math.Acos(r / Math.Sqrt(dum1));
double r13 = 2.0 * Math.Sqrt(q);
double result = -term1 + r13 * Math.Cos(dum1 / 3.0);
if (result >= 0 && result <= 1) return result;
result = -term1 + r13 * Math.Cos((dum1 + 2.0 * Math.PI) / 3.0);
if (result >= 0 && result <= 1) return result;
result = -term1 + r13 * Math.Cos((dum1 + 4.0 * Math.PI) / 3.0);
if (result >= 0 && result <= 1) return result;
}
return null;
}
// Solves the equation ax² + bx + c = 0 for x ϵ ℝ
// and returns the first result in [0, 1] or null.
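private static double? SolveQuadratic(double a, double b, double c) {
if (a == 0) {
// Degenerate case: bx + c = 0 is linear (a division by b == 0 yields no valid result)
double linear = -c / b;
return (b != 0 && linear >= 0 && linear <= 1) ? linear : (double?)null;
}
double result = (-b + Math.Sqrt(Squared(b) - 4 * a * c)) / (2 * a);
if (result >= 0 && result <= 1) return result;
result = (-b - Math.Sqrt(Squared(b) - 4 * a * c)) / (2 * a);
if (result >= 0 && result <= 1) return result;
return null;
}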
private static double Squared(double f) { return f * f; }
private static double Cubed(double f) { return f * f * f; }
private static double CubicRoot(double f) { return Math.Pow(f, 1.0 / 3.0); }
}
You have a few options:
Let's say your curve function F(t) takes a parameter t that ranges from 0 to 1 where F(0) is the beginning of the curve and F(1) is the end of the curve.
You could animate motion along the curve by incrementing t at a constant change per unit of time.
So t is defined by function T(time) = Constant*time
For example, if your frame is 1/24th of a second, and you want to move along the curve at a rate of 0.1 units of t per second, then each frame you increment t by 0.1 (t/s) * 1/24 (sec/frame).
A drawback here is that your actual speed or distance traveled per unit time will not be constant. It will depend on the positions of your control points.
If you want to scale speed along the curve uniformly you can modify the constant change in t per unit time. However, if you want speeds to vary dramatically you will find it difficult to control the shape of the curve. If you want the velocity at one endpoint to be much larger, you must move the control point further away, which in turn pulls the shape of the curve towards that point. If this is a problem, you may consider using a non constant function for t. There are a variety of approaches with different trade-offs, and we need to know more details about your problem to suggest a solution. For example, in the past I have allowed users to define the speed at each keyframe and used a lookup table to translate from time to parameter t such that there is a linear change in speed between keyframe speeds (it's complicated).
Another common hangup: If you are animating by connecting several Bezier curves, and you want the velocity to be continuous when moving between curves, then you will need to constrain your control points so they are symmetrical with the adjacent curve. Catmull-Rom splines are a common approach.
I've answered a similar question here. Basically, if you know the control points beforehand then you can transform the f(t) function into a y(x) function. To avoid doing it all by hand, you can use services like Wolfram Alpha to help you with the math.

Mathematically Find Max Value without Conditional Comparison

----------Updated ------------
codymanix and moonshadow have been a big help thus far. I was able to solve my problem using the equations, and instead of using a right shift I divided by 29, because with 32-bit signed values 2^31 overflows to 29. Which works!
Prototype in PHP
$r = $x - (($x - $y) & (($x - $y) / (29)));
Actual code for LEADS (you can only do one math function PER LINE!!! AHHHH!!!)
DERIVED1 = IMAGE1 - IMAGE2;
DERIVED2 = DERIVED1 / 29;
DERIVED3 = DERIVED1 AND DERIVED2;
MAX = IMAGE1 - DERIVED3;
----------Original Question-----------
I don't think this is quite possible with my application's limitations but I figured it's worth a shot to ask.
I'll try to make this simple. I need to find the max value between two numbers without being able to use an IF or any conditional statement.
In order to find the MAX value I can only perform the following functions
Divide, Multiply, Subtract, Add, NOT, AND ,OR
Let's say I have two numbers
A = 60;
B = 50;
Now if A is always greater than B it would be simple to find the max value
MAX = (A - B) + B;
ex.
10 = (60 - 50)
10 + 50 = 60 = MAX
Problem is A is not always greater than B. I cannot perform ABS, MAX, MIN or conditional checks with the scripting application I am using.
Is there any way possible using the limited operation above to find a value VERY close to the max?
finding the maximum of 2 variables:
max = a-((a-b)&((a-b)>>31))
where >> is bitwise right-shift (also called SHR or ASR depending on signedness).
Instead of 31 you use the number of bits your numbers have minus one.
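The expression can be sketched in Python as follows. The function name and the `bits` parameter are hypothetical. The sketch assumes the difference a - b fits in a signed integer of that width, and it relies on Python's right shift of a negative integer being arithmetic.

```python
def max_branchless(a: int, b: int, bits: int = 32) -> int:
    """Maximum of a and b without comparisons, for differences that fit in `bits` bits."""
    diff = a - b
    # Arithmetic shift: mask is -1 (all ones) when diff is negative, otherwise 0.
    mask = diff >> (bits - 1)
    # If a < b this subtracts (a - b) from a, giving b; otherwise it subtracts 0.
    return a - (diff & mask)

print(max_branchless(60, 50), max_branchless(50, 60))
```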
I guess this one would be the simplest, if we manage to find the difference between two numbers (only the magnitude, not the sign)
max = ((a+b)+|a-b|)/2;
where |a-b| is a magnitude of difference between a and b.
If you can't trust your environment to generate the appropriate branchless operations when they are available, see this page for how to proceed. Note the restriction on input range; use a larger integer type for the operation if you cannot guarantee your inputs will fit.
Solution without conditionals. The absolute value is computed with a sign mask (assuming 32-bit int and an arithmetic right shift).
int abs (int a) { int mask = a >> 31; return (a ^ mask) - mask; }
int max (int a, int b) { return (a + b + abs(a - b)) / 2; }
int max3 (int a, int b, int c) { return max(max(a, b), c); }
Using logical operations only, short circuit evaluation and assuming the C convention of rounding towards zero, it is possible to express this as:
int lt0(int x) {
return x && (!!((x-1)/x));
}
int mymax(int a, int b) {
// 1-lt0(a-b) is 1 when a >= b, so the case a == b returns a rather than 0
return lt0(a-b)*b+(1-lt0(a-b))*a;
}
The basic idea is to implement a comparison operator that will return 0 or 1. It's possible to do a similar trick if your scripting language follows the convention of rounding toward the floor value like python does.
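For a language with floor division, one possible variant of the same idea is sketched below in Python. The divisor `x * x + 1` is an assumption chosen for this illustration and is not taken from the answer above: it is always positive and larger in magnitude than x, so the floored quotient is -1 for negative x and 0 otherwise.

```python
def lt0(x: int) -> int:
    """Return 1 if x < 0 and 0 otherwise, using only arithmetic and floor division."""
    # x // (x*x + 1) floors to -1 for negative x and to 0 for x >= 0.
    return -(x // (x * x + 1))

def my_max(a: int, b: int) -> int:
    """Maximum of a and b: start from a and add (b - a) only when a < b."""
    return a + lt0(a - b) * (b - a)

print(my_max(3, 5), my_max(5, 3), my_max(4, 4))
```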
function Min(x,y:integer):integer;
Var
d:integer;
abs:integer;
begin
d:=x-y;
abs:=d*(1-2*((3*d) div (3*d+1)));
Result:=(x+y-abs) div 2;
end;
Hmmm. I assume NOT, AND, and OR are bitwise? If so, there's going to be a bitwise expression to solve this. Note that A | B will give a number >= A and >= B. Perhaps there's a pruning method for selecting the number with the most bits.
To extend, we need the following to determine whether A (0) or B (1) is greater.
truth table:
0|0 = 0
0|1 = 1
1|0 = 0
1|1 = 0
!A and B
therefore, will give the index of the greater bit. Ergo, compare each bit in both numbers, and when they are different, use the above expression (Not A And B) to determine which number was greater. Start from the most significant bit and proceed down both bytes. If you have no looping construct, manually compare each bit.
Implementing "when they are different":
(A != B) AND (my logic here)
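try this (the differences are computed as 64-bit values so that they cannot overflow)
(Code in C#)
public static Int32 Maximum(params Int32[] values)
{
    Int64 retVal = Int32.MinValue;
    foreach (Int32 i in values)
    {
        Int64 diff = retVal - i;
        // (diff >> 63) is all ones when retVal < i, so retVal becomes i
        retVal -= (diff >> 63) & diff;
    }
    return (Int32)retVal;
}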
You can express this as a series of arithmetic and bitwise operations, e.g.:
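#include <climits> // CHAR_BIT
int myabs(const int& in) {
    // tmp is all ones (-1) if in is negative, otherwise 0 (assumes an arithmetic right shift)
    const int tmp = in >> ((sizeof(int) * CHAR_BIT) - 1);
    return (in ^ tmp) - tmp;
}
int mymax(int a, int b) {
    return ((a+b) + myabs(b-a)) / 2;
}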
// Assuming 32-bit integers
int sign(int x)
{
    return (int)((unsigned int)x >> 31); // 1 if x is negative, else 0
}
int flip(int bit)
{
    return bit ^ 1;
}
int max(int a, int b)
{
    // Unsigned subtraction wraps around instead of overflowing
    int diff = (int)((unsigned int)a - (unsigned int)b);
    int is_pos_a = flip(sign(a));
    int is_pos_b = flip(sign(b));
    int is_diff_positive = flip(sign(diff)); // 1 if diff >= 0, else 0
    int is_diff_negative = flip(is_diff_positive);
    // diff (a - b) will overflow / underflow if signs are opposite
    // ex: a = INT_MAX , b = -3 then a - b => INT_MAX - (-3) => INT_MAX + 3
    int can_overflow = is_pos_a ^ is_pos_b;
    int cannot_overflow = flip(can_overflow);
    // Same signs: trust the sign of diff. Opposite signs: pick the non-negative one.
    int res = cannot_overflow * ((a * is_diff_positive) + (b * is_diff_negative))
            + can_overflow * ((a * is_pos_a) + (b * is_pos_b));
    return res;
}
This is my implementation using only +, -, *, %, / operators
using static System.Console;
int Max(int a, int b) => (a + b + Abs(a - b)) / 2;
int Abs(int x) => x * ((2 * x + 1) % 2);
WriteLine(Max(-100, -2) == -2); // true
WriteLine(Max(2, -100) == 2); // true
I just came up with an expression:
(( (a-b)-|a-b| ) / (2(a-b)) )*b + (( (b-a)-|b-a| )/(2(b-a)) )*a
which is equal to a if a>b and is equal to b if b>a
when a>b:
a-b>0, a-b = |a-b|, (a-b)-|a-b| = 0 so the coefficient for b is 0
b-a<0, b-a = -|b-a|, (b-a)-|b-a| = 2(b-a)
so the coefficient for a is 2(b-a)/2(b-a) which is 1
so it would ultimately return 0*b+1*a if a is bigger and vice versa
Find MAX between n & m
MAX = ( (n/2) + (m/2) + ( ((n/2) - (m/2)) * ( (2*((n/2) - (m/2)) + 1) % 2) ) )
Using #define in c:
#define MAX(n, m) ( (n/2) + (m/2) + ( ((n/2) - (m/2)) * ( (2*((n/2) - (m/2)) + 1) % 2) ) )
or
#define ABS(n) ( n * ( (2*n + 1) % 2) ) // Calculates abs value of n
#define MAX(n, m) ( (n/2) + (m/2) + ABS((n/2) - (m/2)) ) // Finds max between n & m
#define MIN(n, m) ( (n/2) + (m/2) - ABS((n/2) - (m/2)) ) // Finds min between n & m
please look at this program.. this might be the best answer till date on this page...
#include <stdio.h>
int main()
{
int a,b;
a=3;
b=5;
printf("%d %d\n",a,b);
b = (a+b)-(a=b); // this line is doing the reversal
printf("%d %d\n",a,b);
return 0;
}
If A is always greater than B .. [ we can use] .. MAX = (A - B) + B;
No need. Just use: int maxA(int A, int B){ return A;}
(1) If conditionals are allowed you do max = a>b ? a : b.
(2) Any other method either use a defined set of numbers or rely on the implicit conditional checks.
(2a) max = a-((a-b)&((a-b)>>31)) this is neat, but it only works if you use 32 bit numbers. You can expand it arbitrary large number N, but the method will fail if you try to find max(N-1, N+1). This algorithm works for finite state automata, but not a Turing machine.
(2b) Magnitude |a-b| is a condition: |a-b| = a-b>0 ? a-b : b-a
What about:
Square root is also a condition. Whenever c>0 and c^2 = d we have a second solution -c, because (-c)^2 = (-1)^2*c^2 = 1*c^2 = d. Square root returns the greater of the pair. It comes with a built-in int max(int c1, int c2){return max(c1, c2);}
Without comparison operator math is very symmetric as well as limited in power. Positive and negative numbers cannot be distinguished without if of some sort.
It depends which language you're using, but the Ternary Operator might be useful.
But then, if you can't perform conditional checks in your 'scripting application', you probably don't have the ternary operator.
using System;
namespace ConsoleApp2
{
class Program
{
static void Main(string[] args)
{
float a = 101, b = 15;
float max = (a + b) / 2 + ((a > b) ? a - b : b - a) / 2;
}
}
}
#region GetMaximumNumber
/// <summary>
/// Provides method to get maximum values.
/// </summary>
/// <param name="values">Integer array for getting maximum values.</param>
/// <returns>Maximum number from an array.</returns>
private int GetMaximumNumber(params int[] values)
{
// Declare to store the maximum number.
int maximumNumber = 0;
try
{
// Check that array is not null and array has an elements.
if (values != null &&
values.Length > 0)
{
// Sort the array in ascending order for getting maximum value.
Array.Sort(values);
// Get the last value from an array which is always maximum.
maximumNumber = values[values.Length - 1];
}
}
catch (Exception ex)
{
throw ex;
}
return maximumNumber;
}
#endregion

Determine place values for any base

I'm looking for a function that will determine the value of a place given a number and a base. For example,
Given:
Whole Value: 1120
Base: 10
Place: Tens place
Should return: 2
Does anybody know the math for this?
Edit: The function is also expected to pass the whole value numerically, not as a string like "e328fa" or something. Also the return value should be numeric as well, so a FindInPlace(60 (whole value), 16 (base), 2 (place, 1-based index)) should return 3.
With 1-based place indexing the formula is:
placeval = floor(number / (base^(place-1))) mod base
In Python:
def FindInPlace(number, base, place):
return number//base**(place-1) % base
If the number is already converted to an integer (i.e. base 10)
// Supports up to base 36
char digits[] = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ";
char FindPlace(int number, int base, int digit)
{
if(digit < 0) return 0;
// Essentially divide the number by [base] to the [digit] power
for(int i=0; i<digit; i++)
{
number /= base;
}
// TODO: Verify that the digit is in range of digits
return digits[number % base];
}
(0 gives you the right most digit, 1 gives you the next to right-most digit, etc)
I've returned the digit as a char, to allow for bases more than 10.
Note that if you want to allow the user to input the desired digit as "1s place, 10s place, 100s place" or "1s, 16s, 256s", you simply do
digit = log(PlaceValue) / log(base);
or rewrite the code to be
char FindPlace(int number, int base, int digitAsBaseToAPower)
{
// TODO: Error checking
return digits[(number / digitAsBaseToAPower) % base];
}
int getPlace(float x, float place) {
return (int)(x/place) % 10;
}
This works for base-10, and can handle places to the right or left of the decimal. You'd use it like this:
place = getPlace(1120,10);
otherPlace = getPlace(0.1120,1e-3);
A more general solution for any base is tricky. I'd go with a string solution.
Something like this?
int place_value(int value, int base, int place)
{
int value_in_place= value;
for (int place_index= 1; place_index<place; ++place_index)
{
value_in_place/=base;
}
return value_in_place % base;
}
where place is the one-based index of the digit you want from the right.
The following method, placeValue, returns a char, because bases 11-36 have digits greater than 9. The method expects:
int value: the whole value
int base: the number base to convert the whole to; acceptable values are 2-36
int place: the index of the digit; the least significant digit has index 1
import java.math.BigInteger;
...
private static char placeValue(int value, int base, int place) {
BigInteger bigValue = BigInteger.valueOf(value);
String baseString = bigValue.toString(base);
int numDigits = baseString.length();
int digitIndex = numDigits - place;
char digit = baseString.charAt(digitIndex);
return digit;
}
