All Questions
Tagged with floating-point c
160 questions with no upvoted or accepted answers
5 votes
0 answers
123 views
C (MIPS) - How to tell the compiler to load single-precision float immediates with GPRs?
Recently, I have been trying to write some utilities for the N64 with gcc and have run into some problems with its optimization strategy.
Please consider the following example:
// cctest.c
extern struct {
float x;
...
4 votes
0 answers
34 views
fpclassify(): what are examples of other implementation-defined categories?
N2479 C17..C2x working draft — February 5, 2020 ISO/IEC 9899:202x (E) (emphasis added):
The fpclassify macro classifies its argument value as NaN, infinite, normal, subnormal, zero, or into another ...
4 votes
3 answers
115 views
Is there a bug in controlled rounding using `exp`?
I'm observing incorrect (IMO) rounding behaviour on some platforms, as follows:
Calculate the value of log(2) under the rounding modes FE_DOWNWARD and FE_UPWARD (see <fenv.h>). In all cases I've ...
4 votes
0 answers
211 views
Any insights on this Microsoft C 5.1 floating point and DOSBox weirdness?
This is a fantastically strange bug that has been tweaking my noodle for the better part of a day; it took me some time to boil it down to this.
The setup:
Microsoft C 5.10 (~1988)
DOSBox 0.74
...
3 votes
0 answers
105 views
FLT_HAS_SUBNORM is 0: does execution of fpclassify() with manually constructed subnormal lead to UB or lead to WDB returning FP_SUBNORMAL?
In case of FLT_HAS_SUBNORM == 0 (or any XXX_HAS_SUBNORM == 0 in general) does execution of fpclassify macro with manually constructed subnormal (constructed using type punning via union, using memcpy,...
3 votes
0 answers
155 views
Floating point [in]accuracy of C program, when running on the same machine, changed over last two weeks
The following C code was compiled today on two systems with Microsoft's compiler (installed with Visual Studio 2017 Community), both of which had modern 64-bit Intel processors and were running ...
3 votes
2 answers
991 views
Call C function from Assembly code using as88 Assembler
I'm working on a floating-point calculator for 16-bit processors, specifically the 8086/8088.
I'm using as88 Tracker, which doesn't implement floating point, so I cannot use sscanf with "%f".
I ...
2 votes
0 answers
61 views
Reading the trailing fraction part in a nanl
I am upgrading a working multi-platform math library (compliant with the C99 standard) and have to code the long double extension to a couple of functions (like sinl for sin).
In particular, I have to deal ...
2 votes
1 answer
96 views
How to compute trunc(a/b) with only the nearest-to-even rounding mode?
Given two IEEE-754 double-precision floating-point numbers a and b, I want to get the exact quotient a/b rounded to an integer towards zero.
A C99 program to do that could look like this:
#include <...
2 votes
0 answers
99 views
Is precision of pow in C dependent on scale of parameters?
Legacy code I am to maintain has a function with a variable x of type double. This variable contains a position measured in meters. It is expected to hold a value in the mm range, let's say 0.0001 <...
2 votes
1 answer
152 views
Which C compilers support #pragma STDC FENV_ACCESS ON (or its equivalent)?
Which C compilers support #pragma STDC FENV_ACCESS ON (or its equivalent)?
cl (19.25.28611): supported via #pragma fenv_access (on)
gcc (10.2.0): no support: warning: ignoring '#pragma STDC ...
2 votes
2 answers
193 views
Quad-precision numbers with Intel Compiler (icc)
I have been trying to work with Intel's Quad-precision floats. I have the following code, and it is returning unexpected results.
#include <stdio.h>
#include <math.h>
int print(const char ...
2 votes
0 answers
170 views
prohibiting float to double conversion
I'm learning C programming, and while writing my code a question came up. I'm writing a program that works with floating-point numbers, but it does not require such a large mantissa ...
2 votes
0 answers
52 views
How much precision is needed to represent n/2^k, given a boundary of -2 to +2?
Apologies for the poorly phrased title. I tried to do better, but perhaps someone can suggest a better title.
As a test case for another project, I'm working with the Mandelbrot set, using only value ...
2 votes
0 answers
172 views
R's isoreg function generating more knots than unique fitted values
I have been working with R's isoreg function and have experienced a problem: the function is generating more knots than unique fitted values.
From the R help,
iKnots [is an] integer vector giving ...
2 votes
0 answers
242 views
Why is FLT_MIN_EXP not equal to emin?
The IEEE 754 standard defines the minimum and maximum values that can be represented in the exponent field, using the biased representation. For binary32, emax is defined as 127 and emin is defined to ...
2 votes
0 answers
698 views
gcc embedded cortex soft / hard float
Thanks in advance.
I'm using GCC to compile my code for an STM32F7 ARM Cortex.
Unfortunately, my result always includes floating-point emulation routines:
00200664 00000254 T __aeabi_dmul
00200664 ...
2 votes
0 answers
569 views
Facing undefined symbols linker issue with Diab compiler when I type cast an array of data from float to long long
I wrote a small example and built it with both the GCC and Diab compilers.
#include<stdio.h>
int main()
{
float a[10];
long long int b[10];
int i;
for (i =0;i<10;i++)
{
...
2 votes
0 answers
416 views
Using Windows Structured Exception Handling (SEH) to catch floating point exceptions
I am attempting to catch floating point exceptions in code compiled using Visual Studio 2008, similar to this post: Visual C++ / Weird behavior after enabling floating-point exceptions (compiler bug ?)...
2 votes
0 answers
2k views
convert int 32 to q31 or f32
I am trying to understand exactly how to do this. I know how fixed point and floating point notations work, but I was wondering how I can convert from int32 to q31 or f32.
If I understand q31 ...
2 votes
1 answer
232 views
Bitwise creation of 64-bit float
The situation is that I'm on a 32-bit embedded platform (Cortex-M4F) which has a hardware FPU. I'd really like to use the FPU, but the platform provides no hardware implementation of 64-bit float ...
2 votes
2 answers
425 views
Number Of Digits in Fractional Portion of Float/Double
I am trying to implement http://www.exploringbinary.com/correct-decimal-to-floating-point-using-big-integers/
I have read through it quite a few times and feel very comfortable with it. The first ...
1 vote
0 answers
28 views
strength reduction leading to different outcomes with respect to signaling NaNs
gcc strength-reduces floating-point expressions such as x * 1.0 into the identity function. This is correct if x is a finite or infinite value, but if x is a signaling NaN, x * 1.0 will be a "...
1 vote
0 answers
41 views
Is it considered normal that under FE_DOWNWARD or FE_TOWARDZERO the expression FLT_MAX * FLT_MAX evaluates to FLT_MAX?
Sample code:
#include <float.h>
#include <fenv.h>
#pragma STDC FENV_ACCESS ON
int main(void)
{
if (fesetround(RM) != 0) return 2;
return ((FLT_MAX * FLT_MAX) == FLT_MAX) ? 0 : 1;
...
1 vote
0 answers
34 views
Getting wrong values when calculating average values of an array in C
I am trying to read the values of a matrix stored in a file, then calculate the average of each column.
This is my output:
3.000000 -965.000000 3.000000 -1111.000000 -585.000000
2....
1 vote
0 answers
52 views
Why is the precision of long double not required to be higher than the precision of double?
C11:
DBL_DECIMAL_DIG 10
LDBL_DECIMAL_DIG 10
DBL_DIG 10
LDBL_DIG 10
#define DBL_MAX 1E+37
#define LDBL_MAX 1E+37
#define DBL_MIN 1E-37
#define LDBL_MIN 1E-37
C11:
the ...
1 vote
0 answers
38 views
HAS_SUBNORM is 0: shall FTZ (flush to zero) be done before or after tininess detection?
Consider 1.1754944E-38f - 1.1754945E-38f (both are normals).
If HAS_SUBNORM is 1, then the answer is -1E-45f (subnormal) and no exceptions are raised.
If HAS_SUBNORM is 0, then the answer is -0.0f (...
1 vote
0 answers
54 views
FLT_HAS_SUBNORM is 0: what fpclassify(<subnormal>) shall return: FP_SUBNORMAL or FP_ZERO, or lead to UB?
Follow-up question for:
FLT_HAS_SUBNORM is 0: does execution of fpclassify() with manually constructed subnormal lead to UB or lead to WDB returning FP_SUBNORMAL?
If the presence of subnormal numbers ...
1 vote
0 answers
100 views
How to use Decimal64 DPD with gcc
I have been trying to figure out how to integrate Decimal Floating Types with GCC 10.2.0.
I built gcc with --enable-decimal-float=dpd and --with-backend=libdecnumber flags, and specified DFP_ENABLE=1 ...
1 vote
0 answers
30 views
LLDB floating-point values displayed incorrectly unless printf is used
I am trying to debug a program using a core dump.
But I am having trouble getting LLDB to print floating-point variables correctly.
Here is an example of a debugging session:
(lldb) run
...
(lldb) fr v
(...
1 vote
0 answers
55 views
How to implement GNUMP in this Bernoulli numbers algorithm I wrote in C?
I wrote this C algorithm to calculate Bernoulli numbers when the input is Nth, but it only works until 136th because of the floating-point limitations of C. I'm not a programmer, more a mathematician, ...
1 vote
0 answers
191 views
"No source code lines were found at current PC 0x74e." with double in MPLAB X
I need some help with a simple calculation done on a PIC24FJ128GB204.
I am using MPLAB X 5.2, the XC16 compiler, and an ICD4.
I have a sensor that returns data in 6 bytes: [temp MSB][temp LSB][CRC][humidityMSB][...
1 vote
0 answers
66 views
How to convert strtod's 'P' binary exponent notation to decimal?
I knew about the standard 'e' exponent for decimal notation, but the Linux man page of strtod says about the hexadecimal notation:
A hexadecimal number consists of a "0x" or "0X" followed by a ...
1 vote
1 answer
180 views
How to get the number of decimal places of a float value?
I want to express the final product rounded down according to scientific notation.
So if 80.55 and 879.5689 give 70849.281250, the final print function outputs 70849.28.
Is there a way to get the ...
1 vote
1 answer
63 views
How can I make sure there is no floating point arithmetic in a C code (Visual Studio 2013 express + WDK 8.1, WDM kernel driver)?
Some years ago I came across a modified moufiltr driver (made by Povohat), which allows the user to set a customized mouse acceleration. This driver used floating point arithmetic.
I found a mouclass ...
1 vote
2 answers
729 views
How to filter out characters in C programming with exceptions
I am trying to write a C program that asks the user for an input, and if the input is within a given float range, an output will appear. In this case, if the input speed is 20.5, then the output would be you ...
1 vote
1 answer
524 views
Why is the cos function in math.h faster than the x86 fcos instruction?
The cos() in math.h runs faster than the x86 asm fcos.
The following code compares the x86 fcos with the cos() in math.h.
In this code, 1000000 calls to asm fcos cost 150 ms; 1000000 calls to cos() ...
1 vote
0 answers
1k views
C programming: creating a function to convert double to string
I am trying to code a function that converts a double into a string (a sort of dtoa function). I don't want to use any standard library function that would do the whole job for me (itoa is ok, strlen too, ...
1 vote
0 answers
75 views
Interpreting binary data as float
I have an external hardware meter that sends measured voltage values over a TCP connection as floating-point values. I am writing C code to capture this data. The first code I wrote is as ...
1 vote
0 answers
511 views
C Program : Display a Partially filled Array
I need some help displaying a float array that's partially filled. Not sure what I'm doing wrong, but here's what I have.
#define _CRT_SECURE_NO_WARNINGS
#include <stdio.h>
#include <...
1 vote
0 answers
62 views
Optimising divide operation inside Jacobi relaxation
I am trying to optimise the divide operation in the Jacobi relaxation formula.
I am also profiling it using perf.
Here is my code:
for (int l = 0; l < iter; l++) {
for (i = 1; i < height; ...
1 vote
2 answers
81 views
In elf-gcc, exp() works correctly only for the first call, then not afterwards
I'm running a time-consuming algorithm in a bare-metal program (no OS) on our board, using a processor (SPARC architecture) developed by our team and a gcc elf toolchain. With soft-float, it works ...
1 vote
1 answer
77 views
To capture a floating-point overflow exception
I have a simple C program that computes (1e200)^2, which should cause a floating-point overflow exception since the largest double is 1e308 or so.
double square(double x){
return x*x;
}
int main(...
1 vote
0 answers
306 views
x86 Assembly, how to reproduce fscale in C
Given the float numbers 31.0 and 0.85842..., how can I reproduce the following fscale operation in C to get the same result of 1843450267.9?
(gdb) info float
R7: Valid 0x3fff8000000000000000 +1
R6: ...
1 vote
1 answer
77 views
array values change after sorting in float variable
I ran this code and entered float values into array 's', but after sorting, the new values of the array elements are slightly different from the input values. Why is that?
This is the code I ran:
#include <...
1 vote
1 answer
53 views
floating point numbers in C slightly different from expected
I noticed that in C, a float can be as small as 2^-149, and as large as 2^127. If I try to set the float to anything smaller or larger than these, I get zero and inf, respectively. The 2^...
1 vote
0 answers
128 views
Redefine GCC floating point storage format
I am currently working on an ARM floating-point project which needs to change the GCC floating-point output format. The default GCC floating-point representation in memory is the IEEE 754 format, which is ...
1 vote
0 answers
105 views
strange implementation of acosf
I stumbled upon a strange problem:
I got floating-point exceptions (we enabled them explicitly and used /fp:strict) in acosf:
we have two 32-bit floating-point vectors and want to calculate the ...
1 vote
3 answers
117 views
Floating point numbers inaccuracy?
I am prompting the user to input a float number. I save the number in a float variable and multiply it by 100 to make it an integer. Only 2 decimal places are allowed, so it is a fairly easy thing. Now the ...
1 vote
0 answers
352 views
How to create a custom float with varying bit length
I'm writing a program that basically converts an int into a floating-point value with a twist: the user specifies how many bits are used for the exponent and mantissa parts. I don't understand what the ...