# Questions tagged [floating-point]

Floating point numbers are approximations of real numbers that can represent larger ranges than integers but use the same amount of memory, at the cost of lower precision. If your question is about small arithmetic errors (e.g. why does 0.2 + 0.1 equal 0.300000001?) or decimal conversion errors, please read the "info" page linked below before posting.
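
A minimal sketch of the kind of small arithmetic error mentioned above, assuming Python, whose `float` is an IEEE 754 double on common platforms. The exact digits depend on the format and on the printing routine, so single precision or another language may show a different tail. The tolerance passed to `math.isclose` is an illustrative choice, not a recommendation.

```python
import math

# Neither 0.1 nor 0.2 is exactly representable in binary floating point,
# so their sum carries a small rounding error.
total = 0.1 + 0.2

print(total)         # 0.30000000000000004
print(total == 0.3)  # False

# Comparing with a tolerance is the usual workaround for equality tests.
print(math.isclose(total, 0.3, rel_tol=1e-9))  # True
```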

1,636 questions with no upvoted or accepted answers
196views

### Calculating floating point error bound

I have geometrical algorithms and I'm struggling with floating point inaccuracies. For example, I'm calculating whether a point lies on the left/right/on a plane (C#): const double Epsilon = 1e-10; ...
• 610
1kviews

### Incorrect float subtraction result

DataSet:

| id | qty | CheckIn | CheckOut |
| --- | --- | --- | --- |
| 5 | 10 | 1 | 0 |
| 5 | 10 | 0 | 1 |
| 5 | 1.6 | 1 | 0 |
| 5 | 0.4 | 0 | 1 |
| 5 | 0.4 | 0 | 1 |
| 5 | 0.4 | 0 | 1 |

I am trying to ...
339views

### First (smallest) even number not representable with IEEE 754 floating point?

I'm not sure how to go about this problem. I know that the smallest integer not representable by IEEE 754 would be 2^(mantissa+1) + 1 but how would I take that info and change it to an even number? ...
• 2,429
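
The premise in the excerpt above (the smallest non-representable integer is 2^(mantissa+1) + 1) can be checked with a small sketch. It assumes IEEE 754 double precision and reads "mantissa" as the 52 explicitly stored fraction bits; `MANTISSA_BITS` and `is_exact` are hypothetical names used only for illustration. The even-number part of the question is left open here.

```python
# Assumption: binary64, with 52 explicitly stored fraction bits.
MANTISSA_BITS = 52
limit = 2 ** (MANTISSA_BITS + 1)  # 2**53


def is_exact(n: int) -> bool:
    """Return True if the integer n survives a round trip through float."""
    return int(float(n)) == n


print(is_exact(limit - 1))  # True
print(is_exact(limit))      # True
print(is_exact(limit + 1))  # False: rounds to a neighbouring double
print(is_exact(limit + 2))  # True
```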
592views

### Can Float32 ever be not equal float in Objective-C?

Objective-C's library functions seem to be using Float32 and float interchangeably. For example, the Accelerate framework declares function arguments as float (and double), whereas you can see Float32 ...
• 11.2k
378views

### How to turn off denormal number support in MATLAB?

I am trying to turn off denormal number support in MATLAB, so that basically any two computations that would result in a denormal number would instead just result in zero (DAZ, FTZ). I've researched ...
• 3,410
100views

### Is rounding behavior of string-to-double methods defined?

Ideally, a string-to-double method would always yield the double whose value was closest to the exact numerical value of the specified string; for example, since "102030405060708072.99" is only 7.01 ...
• 72.8k
5kviews

### Prevent Jackson from serializing a float as a double

Jackson seems to be coercing all floats into doubles in any data structure that I am trying to serialize into JSON. Is there any way to avoid this behavior? Float f = 50.1f; System.out.println(f); ...
1kviews

### Why is there a floating point exception in an OpenCL kernel call to sin but not cos?

My OpenCL kernel is throwing a floating point exception. I've reduced it to just the lines I think are causing the problem. If I replace the line acc.x += sin(distSqr); with acc.x += cos(distSqr); ...
• 291
2kviews

### Irrational number representation in computer

We can write a simple Rational Number class using two integers representing A/B with B != 0. If we want to represent an irrational number class (storing and computing), the first thing that came to my ...
• 14.9k
991views

### Call C function from Assembly code using as88 Assembler

I'm working on a floating point calculator for 16-bit processors, specifically the 8086/8088. I'm using as88 Tracker, which doesn't implement floating point, so I can't use sscanf with "%f". I ...
1kviews

### Crazy behaviour during execution

I have been doing some inline-asm with gcc. Everything is ALMOST working, up to some behaviour that is just baffling me. I am evaluating a rational polynomial, but need to use 80-bit constants. The ...
2kviews

### Excel Solver and VBA: Floating point/decimal numbers in constraints get incorrectly converted to integers?

I'm running VBA scripts under both Excel 2007 and 2010 which involve a lot of optimization using the built-in Solver of Excel. What is the correct way to specify decimal constraints like X>=0.0001 ...
• 711
232views

### What range of numbers can I represent using scaled_float in Elasticsearch?

I'm trying to figure out what numbers I can represent using scaled_float. In the documentation here https://www.elastic.co/guide/en/elasticsearch/reference/master/number.html I first read: ...
• 18.1k
204views

### Casting float to int inconsistent across MinGw and Clang

Using C++, I'm trying to cast a float value to an int using these instructions : #include <iostream> int main() { float NbrToCast = 1.8f; int TmpNbr = NbrToCast * 10; std::cout <...
• 25.6k
2kviews

### Precise definition of float string formatting?

Is the following behavior defined in Python's documentation (Python 2.7)? >>> '{:20}'.format(1e10) ' 10000000000.0' >>> ...
• 85.3k
84views

### std::fmod(4.2, 0.12) is equal to epsilon * 1.5

auto a{ 4.2 }; auto b{ 0.12 }; auto result = std::fmod(a, b); if(result <= std::numeric_limits<double>::epsilon()) result = 0; // <-- This line isn't triggered In this example, 4.2 is ...
• 31
61views

### Reading the trailing fraction part in a nanl

I am upgrading a working multi-platform math library (compliant with the C99 standard) and have to code the long double extension to a couple of functions (like sinl for sin). In particular, I have to deal ...
29views

### IEEE integer standard

IEEE 754 defines floating point standards for computers. Is there such a similar standard for integers? Whenever I search for something like that, I end up at IEEE 754! C/C++ defines char, short, int, ...
• 1,110
96views

### How to compute trunc(a/b) with only the nearest-to-even rounding mode?

Given two IEEE-754 double-precision floating-point numbers a and b, I want to get the exact quotient a/b rounded to an integer towards zero. A C99 program to do that could look like this: #include <...
• 5,437
175views

### Rational approximation of double using int numerator and denominator in C++

A real world third party API takes a parameter of type fraction which is a struct of an int numerator and denominator. The value that I need to pass is known to me as a decimal string that is ...
• 298
164views

### Why does cv2.GaussianBlur modify the max value?

Why does cv2.GaussianBlur modify the max value in this case? Here is example code: import numpy as np import cv2 mask = np.zeros((256, 256, 1), np.uint8) mask[128:, :] = 255 np.max(mask) 255 mask = cv2....
• 17.1k
59views

### Python NumPy: how many digits does np.float64 have?

So I'm doing a Taylor expansion for exp(11.2). When I print all the terms, I find that some terms have 15 digits but some have 17 digits. Why is that happening?
57views

### I want to generate a for loop using decimals. I need the y value from the loop to create a list. This is a project and I can't use numpy

float object cannot be interpreted as an integer. I want to generate a for loop using decimals. I need the y value from the loop to create a list. This is a project and I can't use numpy. Are there any ...
243views

### Numpy matrix multiplication instability across rows

I am multiplying two float64 matrices with the following values: import numpy as np # 4x5 matrix with identical columns. x = np.zeros((4, 5,), dtype=np.float64) x[1] = 1 x[3] = -3 w = np.array([1, 1,...
• 681
105views

### What is a significance of 9.536743e-7?

I realized that the outputs of a continuous function (takes a vector, returns a scalar) that I wrote in Python are discretized at a resolution of 9.536743e-7. I googled this number and learned that some ...
• 121
69views

### How to keep number as string when creating dataframe Pandas

I am having some issues converting a multidimensional list into a Pandas dataframe. The problem is related to the numeric fields: I have some numbers in a non-standard format, as you can see from this ...
93views

### How do I parse a string with a C++ style hex float (%a) to a f64?

I need to parse a text file with a bunch of float numbers that were created via C++'s printf("%a") and look like: -0x1.68p+6 -0x1.68p+7 As I understand it, this is the mantissa and exponent ...
• 6,641
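
For reference, the two sample values quoted in the excerpt above can be decoded with a short sketch. It uses Python's built-in `float.fromhex` purely to illustrate the `%a` notation (hexadecimal significand, then `p` and a power-of-two exponent); it does not address the `f64` target the question asks about, and `samples` and `parsed` are illustrative names.

```python
# The two hex-float strings quoted in the question.
samples = ["-0x1.68p+6", "-0x1.68p+7"]

# 0x1.68 is 1 + 6/16 + 8/256 = 1.40625; "p+6" scales it by 2**6.
parsed = [float.fromhex(text) for text in samples]

for text, value in zip(samples, parsed):
    print(text, "->", value)
# -0x1.68p+6 -> -90.0
# -0x1.68p+7 -> -180.0
```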
51views

### Git installation failing, Windows 10, CryptStringToBinaryW

Git for Windows is failing to install; I've redownloaded it numerous times. The install log looks like this: 2021-05-07 09:02:33.191 -- DLL function import -- 2021-05-07 09:02:33.191 Function name: ...
• 44.8k
118views

### R vs Python differences in floating point resolution: is it possible to make Python act like R?

I am working on a personal project involving the manipulation of linear algebra concepts. This project involves, in the first instance, replicating (translating) an R (64bit version) code into a ...
• 711
47views

### Understanding IEEE 754: why convertFromInt and convertToIntegerXXX are categorized as arithmetic operations and not conversion operations?

Note: understanding IEEE 754. Please be patient. IEEE Std 754-2019: 5.4.1 Arithmetic operations ― formatOf-convertFromInt(int) ― intFormatOf-convertToIntegerXXX(source) Question: why convertFromInt ...
• 4,132
69views

### Why does the Windows x64 calling convention require XMM (FP) args copied to integer registers, for variadic functions like printf?

I am trying to assemble the following code using NASM on Windows. The printf function is supposed to take xmm0 through xmm2 for fractional point arguments. Why do I have to place fractional arguments in ...
• 175
86views

### Python float search returns the wrong result because of a precision problem?

I want to find a float in an array like this: arr = np.asarray([1351.1 , 1351.11, 1351.14, 1351.16, 1351.17]) index = np.searchsorted(arr, 1351.14, side="right") - 1 # return 2 But I find ...
• 7,635
126views

### Double values comparison w/ tolerance for "point" equality in a DXF drawing

I have a simple algorithm that fails sometimes because it is comparing doubles. I look at a DXF drawing and get all the line segments, and also break it down into a series of points. When looping ...
• 141
89views

### Go float vs uint64 comparison issue

I'm working on a problem that compares a float and a uint64, in which the float equals MaxUint64+1. The comparison works fine with a float literal. However, when the float is assigned to a variable, the ...
99views

### Is precision of pow in C dependent on scale of parameters?

Legacy code I am to maintain has a function with a variable x of type double. This variable contains a position measured in meters. It is expected to hold a value in the mm range, let's say 0.0001 <...
• 385
152views

### Which C compilers do support #pragma STDC FENV_ACCESS ON (or its equivalent)?

Which C compilers do support #pragma STDC FENV_ACCESS ON (or its equivalent)? cl (19.25.28611): does support via #pragma fenv_access (on) gcc (10.2.0): no support: warning: ignoring '#pragma STDC ...
• 4,132
193views

### Quad-precision numbers with Intel Compiler (icc)

I have been trying to work with Intel's Quad-precision floats. I have the following code, and it is returning unexpected results. #include <stdio.h> #include <math.h> int print(const char ...
• 312
198views

### GCC porting to new target using software floating point library

I am currently trying to port the GCC-9.2.0 compiler to a new architecture, "SPIM", which is similar to the MIPS architecture, using the floating point arithmetic operations described in GCC internal: 4.2 Routines ...
75views

### Is bfloat16 ever used for graphics?

Bfloat16 is a half precision floating point format that has the same 8-bit exponent as single precision, but only 7 (plus 1 implied) bits of significand. Surprisingly, this turns out to be adequate ...
• 28.2k
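
The layout described in the excerpt above (1 sign bit, the same 8-bit exponent as single precision, 7 stored significand bits) can be sketched by keeping the top 16 bits of a float32. This is only an illustration: it truncates rather than rounds, which real hardware or libraries may not do, and every function name here is hypothetical.

```python
import struct


def float32_bits(x: float) -> int:
    """Bit pattern of x after conversion to IEEE 754 single precision."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def to_bfloat16_bits(x: float) -> int:
    """Keep the top 16 bits of the float32 pattern (truncation, an assumption)."""
    return float32_bits(x) >> 16


def from_bfloat16_bits(bits: int) -> float:
    """Widen a bfloat16 pattern back to a float by zero-filling the low 16 bits."""
    return struct.unpack("<f", struct.pack("<I", bits << 16))[0]


def fields(bits: int) -> tuple:
    """Split a bfloat16 pattern into (sign, 8-bit exponent, 7-bit fraction)."""
    return bits >> 15, (bits >> 7) & 0xFF, bits & 0x7F


bits = to_bfloat16_bits(3.14159)
print(hex(bits))                 # 0x4049
print(fields(bits))              # (0, 128, 73)
print(from_bfloat16_bits(bits))  # 3.140625
```

With only 7 stored fraction bits, 3.14159 comes back as 3.140625, which shows how coarse the significand is compared with the exponent range it shares with float32.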
80views

### Invert the sign of a non-negative floating-point number: Would std::copysign(x) make a difference than (-1)*x?

Let's say we would write a function that would return the inverse of a non-negative floating-point number x. Would the following two functions make a difference in the result (say rounding issues), if ...
• 7,124
104views

### I got this error on early stopping function TypeError: '>' not supported between instances of 'NoneType' and 'float' in python

I am working on epilepsy prediction using CNN and I use early stopping in my code. I got this error line 230 in on_epoch_end if current > self.value: TypeError: '>' not supported between ...
• 467
61views

### How does MySQL represent floats internally?

I ran into this issue wherein inserting a number like 1234567 into a float column leads to a rounded-off value of 1234570. I understand that this is due to floating point precision but what confuses me is ...
253views

### Stochastically rounding a float to an integer

I want a function (using Python 3.6+ if it's relevant) that will stochastically round a floating-point number to an integer in the following manner: Given a real number, x, let a = floor(x) and let b ...
• 533
89views

### Inconsistency in coordinates representation in GeoPandas

I have a list of Shapely Points in a GeoSeries. coords.head(): 0 POINT (-26.17690 80.81700) 1 POINT (-15.54390 80.61700) 2 POINT (-20.67690 80.36700) 3 POINT (6.10610 80.83300) 4 POINT ...
• 33
1kviews

### Understanding sys.float_info for maximum floats in Python

I'm trying to understand the information given by sys.float_info to understand what the maximum floats in Python are. On my computer, this gives me the following: >>> import sys >>> ...
655views

### Handle money and currencies in JavaScript by multiplying and dividing by 100

I'm writing an app for myself. I need to handle money and currencies. I read this and I want to try the second approach: You can multiply floats into integers before you calculate, then divide them ...
• 1,325
1kviews

### How to get the maximum number of decimal places in each column of a pandas dataframe?

I need to set the global float precision to the minimum value possible. Also, I need to get the precision for each column, in part to get the global precision and on the other hand, I would like to ...
• 8,950
753views

### x86_64 asm calculate floating point to power of another floating point

Okay, so I have to do this kind of calculation: 10^(some floating point value), where the floating point value (the exponent) is stored as a double in the xmm0 register, calculated before with divsd ...
78views

### How does the "r" suffix correspond to a floating point number when using MASM?

When using MASM I find no information on how a value is formatted in floating point hex. For example: What is a decimal value of 50.1 equal to when using the "r" suffix? Also according to the MASM ...