All Questions

160 questions with no upvoted or accepted answers
5 votes
0 answers
123 views

C (MIPS) - How to tell the compiler to load single-precision float immediates with GPRs?

Recently I have been trying to write some utilities for the N64 with gcc and have run into problems with its optimization strategy. Please consider the following example: // cctest.c extern struct { float x; ...
4 votes
0 answers
34 views

fpclassify(): what are examples of other implementation-defined categories?

N2479 C17..C2x working draft — February 5, 2020 ISO/IEC 9899:202x (E) (emphasis added): The fpclassify macro classifies its argument value as NaN, infinite, normal, subnormal, zero, or into another ...
4 votes
3 answers
115 views

Is there a bug in controlled rounding using `exp`?

I'm observing incorrect (IMO) rounding behaviour on some platforms as follows: Calculate the value of log(2) under rounding modes FE_DOWNWARD and FE_UPWARD (see <fenv.h>). In all cases I've ...
4 votes
0 answers
211 views

Any insights on this Microsoft C 5.1 floating point and DOSBox weirdness?

This is a fantastically strange bug that has been tweaking my noodle for the better part of a day; it took me some time to boil it down to this. The setup: Microsoft C 5.10 (~1988) DOSBox 0.74 ...
3 votes
0 answers
105 views

FLT_HAS_SUBNORM is 0: does execution of fpclassify() with manually constructed subnormal lead to UB or lead to WDB returning FP_SUBNORMAL?

In case of FLT_HAS_SUBNORM == 0 (or any XXX_HAS_SUBNORM == 0 in general) does execution of fpclassify macro with manually constructed subnormal (constructed using type punning via union, using memcpy,...
3 votes
0 answers
155 views

Floating point [in]accuracy of C program, when running on the same machine, changed over last two weeks

The following C code was compiled today on two systems with Microsoft's compiler (installed with Visual Studio 2017 Community), both of which had modern 64-bit Intel processors and were running ...
3 votes
2 answers
991 views

Call C function from Assembly code using as88 Assembler

I'm working on a floating-point calculator for 16-bit processors, specifically the 8086/8088. I'm using the as88 Tracker, which doesn't implement floating point, so I can't use sscanf with "%f". I ...
2 votes
0 answers
61 views

Reading the trailing fraction part in a nanl

I am upgrading a working multi-platform math library (compliant with the C99 standard) and have to code the long double extension to a couple of functions (like sinl for sin). In particular, I have to deal ...
2 votes
1 answer
96 views

How to compute trunc(a/b) with only the nearest-to-even rounding mode?

Given two IEEE-754 double-precision floating-point numbers a and b, I want to get the exact quotient a/b rounded to an integer towards zero. A C99 program to do that could look like this: #include <...
2 votes
0 answers
99 views

Is the precision of pow in C dependent on the scale of its parameters?

Legacy code I am to maintain has a function with a variable x of type double. This variable contains a position measured in meters. It is expected to hold a value in the mm range, let's say 0.0001 <...
2 votes
1 answer
152 views

Which C compilers support #pragma STDC FENV_ACCESS ON (or its equivalent)?

Which C compilers support #pragma STDC FENV_ACCESS ON (or its equivalent)? cl (19.25.28611): supported via #pragma fenv_access (on) gcc (10.2.0): no support: warning: ignoring '#pragma STDC ...
2 votes
2 answers
193 views

Quad-precision numbers with Intel Compiler (icc)

I have been trying to work with Intel's Quad-precision floats. I have the following code, and it is returning unexpected results. #include <stdio.h> #include <math.h> int print(const char ...
2 votes
0 answers
170 views

Prohibiting float-to-double conversion

I'm learning C programming, and while writing my code a question came up. I'm writing a program that works with floating-point numbers, but it does not require such a large mantissa ...
2 votes
0 answers
52 views

How much precision is needed to represent n/2^k, given a boundary of -2 to +2?

Apologies for the poorly phrased title. I tried to do better, but perhaps someone can suggest a better title. As a test case for another project, I'm working with the Mandelbrot set, using only value ...
2 votes
0 answers
172 views

R's isoreg function generating more knots than unique fitted values

I have been working with R's isoreg function and have experienced a problem: the function is generating more knots than unique fitted values. From the R help, iKnots [is an] integer vector giving ...
2 votes
0 answers
242 views

Why is FLT_MIN_EXP not equal to emin?

The IEEE 754 standard defines the minimum and maximum values that can be represented in the exponent field, using the biased representation. For binary32, emax is defined as 127 and emin is defined to ...
2 votes
0 answers
698 views

gcc embedded cortex soft / hard float

Thanks in advance. I'm using GCC to compile my code for an STM32F7 ARM Cortex. Unfortunately, my result always includes floating-point emulation routines: 00200664 00000254 T __aeabi_dmul 00200664 ...
2 votes
0 answers
569 views

Undefined symbols linker error with the Diab compiler when I cast an array of data from float to long long

I wrote a small example and built it with both the GCC and Diab compilers. #include<stdio.h> int main() { float a[10]; long long int b[10]; int i; for (i =0;i<10;i++) { ...
2 votes
0 answers
416 views

Using Windows Structured Exception Handling (SEH) to catch floating point exceptions

I am attempting to catch floating point exceptions in code compiled using Visual Studio 2008, similar to this post: Visual C++ / Weird behavior after enabling floating-point exceptions (compiler bug ?)...
2 votes
0 answers
2k views

Convert int32 to q31 or f32

I am trying to understand exactly how to do this. I know how fixed point and floating point notations work, but I was wondering how I can convert from int32 to q31 or f32. If I understand q31 ...
2 votes
1 answer
232 views

Bitwise creation of 64-bit float

The situation is that I'm on a 32-bit embedded platform (Cortex-M4F) which has a hardware FPU. I'd really like to use the FPU, but the platform provides no hardware implementation of 64-bit float ...
2 votes
2 answers
425 views

Number Of Digits in Fractional Portion of Float/Double

I am trying to implement http://www.exploringbinary.com/correct-decimal-to-floating-point-using-big-integers/ I have read through it quite a few times and feel very comfortable with it. The first ...
1 vote
0 answers
28 views

strength reduction leading to different outcomes with respect to signaling NaNs

gcc strength-reduces floating-point expressions such as x * 1.0 into the identity function. This is correct if x is a finite or infinite value, but if x is a signaling NaN, x * 1.0 will be a "...
1 vote
0 answers
41 views

Is it considered normal that under FE_DOWNWARD or FE_TOWARDZERO the expression FLT_MAX * FLT_MAX evaluates to FLT_MAX?

Sample code: #include <float.h> #include <fenv.h> #pragma STDC FENV_ACCESS ON int main(void) { if (fesetround(RM) != 0) return 2; return ((FLT_MAX * FLT_MAX) == FLT_MAX) ? 0 : 1; ...
1 vote
0 answers
34 views

Getting wrong values when calculating average values of an array in C

I am trying to read the values of a matrix stored in a file, then calculate the average of each column. This is my output: 3.000000 -965.000000 3.000000 -1111.000000 -585.000000 2....
1 vote
0 answers
52 views

Why is the precision of long double not required to be higher than the precision of double?

C11: DBL_DECIMAL_DIG 10 LDBL_DECIMAL_DIG 10 DBL_DIG 10 LDBL_DIG 10 #define DBL_MAX 1E+37 #define LDBL_MAX 1E+37 #define DBL_MIN 1E-37 #define LDBL_MIN 1E-37 C11: the ...
1 vote
0 answers
38 views

HAS_SUBNORM is 0: should FTZ (flush to zero) be done before or after tininess detection?

Consider 1.1754944E-38f - 1.1754945E-38f (both are normals). If HAS_SUBNORM is 1, then the answer is -1E-45f (subnormal) and no exceptions are raised. If HAS_SUBNORM is 0, then the answer is -0.0f (...
1 vote
0 answers
54 views

FLT_HAS_SUBNORM is 0: what fpclassify(<subnormal>) shall return: FP_SUBNORMAL or FP_ZERO, or lead to UB?

Follow-up question for: FLT_HAS_SUBNORM is 0: does execution of fpclassify() with manually constructed subnormal lead to UB or lead to WDB returning FP_SUBNORMAL? If the presence of subnormal numbers ...
1 vote
0 answers
100 views

How to use Decimal64 DPD with gcc

I have been trying to figure out how to integrate Decimal Floating Types with GCC 10.2.0. I built gcc with --enable-decimal-float=dpd and --with-backend=libdecnumber flags, and specified DFP_ENABLE=1 ...
1 vote
0 answers
30 views

LLDB floating-point values displayed incorrectly unless printf is used

I am trying to debug a program using a coredump, but I am having trouble getting LLDB to print floating-point variables correctly. Here is an example of a debugging session: (lldb) run ... (lldb) fr v (...
1 vote
0 answers
55 views

How to use GNUMP in this Bernoulli numbers algorithm I wrote in C?

I wrote this C algorithm to calculate the Nth Bernoulli number for an input N, but it only works up to the 136th because of the floating-point limitations of C. I'm not a programmer, more a mathematician, ...
1 vote
0 answers
191 views

"No source code lines were found at current PC 0x74e." with double in MPLAB X

Need some help with a simple calculation done on a PIC24FJ128GB204. Using MPLAB X 5.2, XC16 compiler, ICD4. I have a sensor that returns data in 6 bytes: [temp MSB][temp LSB][CRC][humidityMSB][...
1 vote
0 answers
66 views

How to convert strtod's 'P' binary exponent notation to decimal?

I knew about the standard 'e' exponent for decimal notation, but the Linux man page for strtod says this about the hexadecimal notation: A hexadecimal number consists of a "0x" or "0X" followed by a ...
1 vote
1 answer
180 views

How to get the number of decimal places of a float value?

I want to express the final product rounded down according to scientific notation. So if 80.55 and 879.5689 give 70849.281250, the final print function outputs 70849.28. Is there a way to get the ...
1 vote
1 answer
63 views

How can I make sure there is no floating-point arithmetic in C code (Visual Studio 2013 Express + WDK 8.1, WDM kernel driver)?

Some years ago I came across a modified moufiltr driver (made by Povohat), which allows the user to set a customized mouse acceleration. This driver used floating point arithmetic. I found a mouclass ...
1 vote
2 answers
729 views

How to filter out characters in C programming with exceptions

I am trying to write a C program that asks the user for input and, if the input falls within a given float range, prints an output. In this case, if the input speed is 20.5, then the output would be you ...
1 vote
1 answer
524 views

Why is the cos function in math.h faster than the x86 fcos instruction?

The cos() in math.h runs faster than the x86 asm fcos. The following code compares the x86 fcos with the cos() in math.h. In this code, 1000000 calls to asm fcos cost 150ms; 1000000 calls to cos() ...
1 vote
0 answers
1k views

C programming: creating a function to convert double to string

I am trying to code a function that converts a double into a string (a sort of dtoa function). I don't want to use any standard library function that would do the whole job for me (itoa is ok, strlen too, ...
1 vote
0 answers
75 views

Interpreting binary data as float

I have an external hardware meter that sends measured voltage values over a TCP connection as floating-point values. I am writing C code to capture this data. The first code I wrote is as ...
1 vote
0 answers
511 views

C Program : Display a Partially filled Array

I need some help displaying a float array that's partially filled. Not sure what I'm doing wrong, but here's what I have. #define _CRT_SECURE_NO_WARNINGS #include <stdio.h> #include <...
1 vote
0 answers
62 views

Optimising divide operation inside Jacobi relaxation

I am trying to optimise the divide operation in the Jacobi relaxation formula. I am also profiling with perf. Here is my code: for (int l = 0; l < iter; l++) { for (i = 1; i < height; ...
1 vote
2 answers
81 views

In elf-gcc, exp() works correctly only for the first call but not afterwards

I'm running a time-consuming algorithm in a bare-metal program (no OS) on our board, which uses a processor (SPARC architecture) developed by our team, with a gcc elf toolchain. With soft-float, it works ...
1 vote
1 answer
77 views

To capture a floating-point overflow exception

I have a simple C program that computes (1e200)^2, which should cause a floating-point overflow exception since the largest double is 1e308 or so. double square(double x){ return x*x; } int main(...
1 vote
0 answers
306 views

x86 Assembly, how to reproduce fscale in C

Given the float numbers 31.0 and 0.85842..., how can I reproduce the following fscale operation in C to get the same result of 1843450267.9? (gdb) info float R7: Valid 0x3fff8000000000000000 +1 R6: ...
1 vote
1 answer
77 views

Array values change after sorting float variables

I ran this code and entered float values into array 's', but after sorting, the new values of the array elements are slightly different from the input values. Why is that? This is the code I ran: #include <...
1 vote
1 answer
53 views

Floating-point numbers in C slightly different from expected

I noticed that in C, a float can be as small as 2^-149 and as large as 2^127. If I try to set the float to anything smaller or larger than these, I get zero and inf, respectively. The 2^...
1 vote
0 answers
128 views

Redefine GCC floating point storage format

I am currently working on an ARM floating-point project which needs to change the GCC floating-point output format. By default, GCC represents floating-point values in memory in IEEE 754 format, which is ...
1 vote
0 answers
105 views

Strange implementation of acosf

I stumbled upon a strange problem: I get floating-point exceptions (we enabled them explicitly and used /fp:strict) in acosf. We have two 32-bit floating-point vectors and want to calculate the ...
1 vote
3 answers
117 views

Floating point numbers inaccuracy?

I am prompting the user to input a float number. I save the number in a float variable and multiply it by 100 to make it an integer. Only 2 decimal places are allowed, so it is a fairly easy thing. Now the ...
1 vote
0 answers
352 views

How to create a custom float with varying bit length

I'm writing a program that basically converts an int into a floating-point value with a twist: the user specifies how many bits are used for the exponent and mantissa parts. I don't understand what the ...
