2

I have an array of floating point numbers of type T, where T could be either float or double

T x[n];

These numbers are strictly positive and sorted, i.e.

0 < x[0] < x[1] < x[2] < ... < x[n-1]

I want to find the smallest floating point number H of type T such that the following inequality is strictly satisfied:

int(x[i+1]*H) > int(x[i]*H)   // for i=0..n-2

Let's assume that the numbers x are such that there are no underflow or overflow issues to worry about.

I am approaching the problem as described below, which I am not sure if it is robust and, when it works, I think produces a suboptimal result.

Is there a robust and accurate way to solve this? I could accept a suboptimal solution, but at least it needs to be robust.


My current method

I use the notation m(x) to indicate a floating point number which approximates x and may be affected by roundoff error. In the following passages, I modify the inequality below taking as appropriate upper and lower bounds. Note that the below is not to be interpreted as source code, but as the matehmatical steps and reasoning to get to the final equations.

Let a(x) be the closest floating point number toward -inf
Let b(x) be the closest floating point number toward +inf

// Original inequality, where the operation affected by rounding error
// are marked with m()
int(m(x[i+1]*H)) > int(m(x[i]*H))   // for i=0..n-2

// I remove `atoi` subtracting and adding 0.5 
// this is a conceptual operation, not actually executed on the machine,
// hence I do not mark it with an m()
m(x[i+1]*H) - 0.5 > m(x[i]*H) + 0.5   // for i=0..n-2

// I reorganize terms
m(x[i+1]*H) - m(x[i]*H) >  1.0        // for i=0..n-2

// I would like to resolve the m() operator with strict LHS minorant and RHS majorant
// Note I cannot use `a(x)` and `b(x)` because I do not know `H` 
// I use multiplication by `(1+2*eps)` or `(1-2*eps)` as I hope this 
// will result in a number strictly greater (or lower) than the original one. 
// I already know this does not work for example if x is very small.
// So this seems like a pitfall of this approach.
x[i+1]*H(1-2*eps) - x[i]*H*(1+2*eps) >  b(1)  // for i=0..n-2

// solve the inequality for H, marking the operations affected by rounding 
// with m()
H > m( b(1) / m( x[i+1](1-2*eps) - x[i]*(1+2*eps) ) ) // for i=0..n-2

// take majorant or minorant as appropriate for rounded terms
H > b( b(1) / a( x[i+1](1-2*eps) - x[i]*(1+2*eps) ) )   // for i=0..n-2

// and eventually take a majorant which gives me the final formula for H
H = b( b( b(1) / min{ a( x[i+1](1-2*eps) - x[i]*(1+2*eps) ) } ) )
4
  • What language is this? What is atoi? The C/C++ atoi function converts a string to an integer, so it can't be that. And what do you mean by majorant and minorant? The usual mathematical definitions don't seem to fit here. Nov 3, 2016 at 20:53
  • 1
    My bad. With atoi I meant truncation to integrer. I corrected the question. As per majorant and minorant, I mean majorant(a)>a and minorant(a)<a. However, please do not focus too much on my proposed solution, which may have pitfalls. Focus on the problem statement instead.
    – Fabio
    Nov 4, 2016 at 0:27
  • 1
    Thanks for the updates! Just to make sure I understand: for an example input array of [0.99, 1.0], the optimal output would be H = 1.0, while your method produces something that's close to H = 100.0. Is that correct? This is an interesting problem ... Nov 4, 2016 at 8:41
  • Exactly! Furthermore, I am yet not convinced my method is robust.
    – Fabio
    Nov 4, 2016 at 12:13

0

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service, privacy policy and cookie policy

Browse other questions tagged or ask your own question.