All Questions
Tagged with floating-point binary
43
questions with no upvoted or accepted answers
2votes
0answers
57views
Generate all numbers of the binary system (B=2, t=3, L=-2, U=3)
Suppose we have the following binary system
(B=2, t=3, L=-2, U=3)
where B is the base of the system; since it's a binary system, B is of course 2. t is the precision of the number, which usually refers to ...
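A minimal sketch of how such a system could be enumerated. The excerpt is cut off before it defines the normalization, so this assumes significands of the form 1.d1d2 (t = 3 binary digits in total) and exponents running from L to U; a textbook that normalizes as 0.1d2d3 would give the same set scaled by 1/2. The function name `enumerate_system` is made up for illustration.

```python
from fractions import Fraction

def enumerate_system(base=2, t=3, low=-2, high=3):
    """Return the sorted positive normalized values of a toy floating-point system.

    Assumption (not stated in the excerpt): significands look like
    1.d1 d2 ... d(t-1) in the given base, and the exponent ranges over low..high.
    """
    values = set()
    # Integer significands base**(t-1) .. base**t - 1 correspond to 1.00 .. 1.11 for base=2, t=3
    for m in range(base ** (t - 1), base ** t):
        significand = Fraction(m, base ** (t - 1))
        for e in range(low, high + 1):
            values.add(significand * Fraction(base) ** e)
    return sorted(values)

positives = enumerate_system()
print(len(positives), float(positives[0]), float(positives[-1]))
```

Under these assumptions the negative numbers are the mirror image of this list, and zero has to be added separately.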
1vote
0answers
60views
How to deterministically divide floats when there is a known deviation from IEEE 754?
DirectX 11 allows GPU manufacturers to deviate from the rounding behavior specified in the IEEE 754 standard. I cannot enable IEEE strictness, because I don't control the shader compilation process.
...
1vote
0answers
69views
9 Bit Floating Point to Hex
I've been running into an issue where I'm trying to convert a 9-bit floating point number to hex, where the floating point scheme is 1 sign-bit, followed by 4 bit exponent, and then a 4-bit mantissa. ...
1vote
1answer
35views
Decimal-module is not working on binary level? How?
I'm a beginner when it comes to what's going on in the background when I start the program. Right now my focus is on the difference between "decimal" and "float" in "Python"...
1vote
0answers
76views
Converting a Twos-complement number to its binary representation
I am doing some bitwise manipulation.
I am adding a 32 bit number to another 32 bit number with |= instead of += because I was thinking the number might be getting messed up when it converts to 32 ...
1vote
0answers
125views
PHP unpack float returns unexpected answer
I have some kind of binary data and am trying to get the float from that data. We already have a Java version of the program, so we already know what result we should get. When we try with the following ...
1vote
1answer
341views
Largest integer that can be stored in a double such that all integers less than it can be accurately stored as well
This is some more clarification to the question that was already answered some time ago here: biggest integer that can be stored in a double
The top answer mentions that "the largest integer such ...
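As a quick check of the property being asked about, the sketch below assumes an IEEE 754 binary64 double (which is what Python's `float` is on common platforms) and tests whether integers around 2^53 survive a round trip through a double. The helper name `round_trips` is made up for illustration.

```python
LIMIT = 2 ** 53  # 53 = 52 stored fraction bits + 1 implicit leading bit

def round_trips(n):
    """True if the integer n survives conversion to a double and back."""
    return int(float(n)) == n

# Every integer up to LIMIT converts exactly; LIMIT + 1 is the first that does not
print(round_trips(LIMIT - 1), round_trips(LIMIT), round_trips(LIMIT + 1))
```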
1vote
0answers
626views
Adding two IEEE floating point numbers in Java
I am having a difficult time figuring out how to correctly add two IEEE floating point numbers using Java. I'm not sure how to proceed in actually adding the mantissas together because I don't get how ...
1vote
0answers
729views
Create a function that converts decimals into IEEE 754 floating point precision numbers using MATLAB?
I need to create a function that does exactly as the title says and outputs a 32 character string. So far I can get the signed bit and the exponential part correctly. What I'm struggling with is ...
1vote
4answers
3kviews
C: convert a real number to 64 bit floating point binary
I'm trying to write code that converts a real number to a 64 bit floating point binary. In order to do this, the user inputs a real number (for example, 547.4242) and the program must output a 64 ...
1vote
0answers
2kviews
Converting from Double to Binary
I've been having a lot of trouble figuring out this class problem. My due date is tomorrow and I still don't know how to do it. I wrote code in which the user's input is converted into binary, octal, and ...
0votes
0answers
30views
Is anyone able to understand how to add these minifloat (1 bit sign, 3 bit exponent, 4 bit fraction) bit numbers?
I have a task where we add two binary numbers, but I do not understand the solution.
We are given binary numbers in minifloat format. In our case, the minifloat is defined like ...
0votes
0answers
40views
Binary Floating point to decimal JavaScript function
To answer the original question that was closed due to invalid moderation.
Binary Floating point to decimal JavaScript function
This function below basically does the math based on the floating ...
0votes
1answer
96views
Manually calculating IEEE-754 floating point fractions and splitting up the bits - Python
I'm trying to come up with a way to do this:
Let's say the fraction portion of my IEEE-754 floating point number is 0b10110011001100110011010
I'm trying to take each bit and multiply it by a power of 2 ...
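The bit-by-bit weighting described here can be sketched as follows, assuming the 23 bits are a binary32 fraction field with the most significant bit first, so that bit i (counting from 0) carries weight 2^-(i+1). The function name `fraction_value` is made up for illustration.

```python
def fraction_value(bits):
    """Sum bit_i * 2**-(i+1) over a string of fraction bits (most significant first)."""
    return sum(int(b) * 2.0 ** -(i + 1) for i, b in enumerate(bits))

frac = fraction_value("10110011001100110011010")
# Adding the implicit leading 1 gives the significand of a normal number
print(frac, 1 + frac)
```

The same value can be obtained in one step as `int(bits, 2) / 2**len(bits)`; the loop just makes the per-bit weights explicit.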
0votes
0answers
119views
Floating point addition / subtraction
I get confused because of the hidden bit in the mantissa.
From what I know:
Subtract the two exponents, find the smaller number and shift the mantissa with the hidden bit (?) by the result of the ...
0votes
0answers
383views
IEEE-754 Floating Point Standard: Representing Numbers
Given an IEEE-754 standard floating point number with 6 bits of exponent, and 25 bits of mantissa.
1: What's the smallest non-infinite positive integer this representation CANNOT represent? My answer:
...
0votes
1answer
435views
How to convert DEC 64bit double precision floating point to IEEE-754 (DEC is not decimal)
To clarify here DEC in this case is Digital Equipment Corporation NOT decimal. What I have is a binary representation of DEC with seeeeeeeefffff....(total f's is 54) s = sign bit, e = exponent, f = ...
0votes
1answer
175views
Floating-point mantissa and exponent base 2
I'm trying to understand how to get the mantissa and the exponent in this case. Here's an example I have in my book.
I have this formula (-1)^s * (1 + M) * b^(E-e) = x
s = 0 or 1 (the sign)
M = mantissa
...
0votes
0answers
269views
Calculating smallest positive floating point number
A 16-bit floating point representation is defined like this: 1 bit for sign, 6 bits for exponent and 9 bits for significand. Floating point number must be normalized (in the form 0.1... × 2^exp). ...
0votes
0answers
351views
Convert periodic binary number to decimal
I have a simple question that is confusing me.
Convert the periodic binary number (0.1011)_2 (imagine a bar over the digits after the comma) to a decimal representation.
If it weren't periodic, no ...
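One way to approach this, assuming the whole block 1011 repeats forever directly after the binary point: a k-bit block with integer value n contributes n/2^k + n/2^(2k) + ..., a geometric series that sums to n / (2^k - 1). The function name `repeating_binary_fraction` is made up for illustration.

```python
from fractions import Fraction

def repeating_binary_fraction(block):
    """Exact value of 0.(block)(block)... in base 2.

    A k-bit block with integer value n sums to n / (2**k - 1).
    """
    k = len(block)
    return Fraction(int(block, 2), 2 ** k - 1)

value = repeating_binary_fraction("1011")
print(value)  # 11/15
```

Converting the resulting fraction to decimal then gives a periodic decimal expansion as well.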
0votes
1answer
403views
Decimal fraction to binary with precision
Consider the following fraction:
9.8765
How do I go about converting it to precise floating point binary without losing any value?
Now consider the following fraction
9.87654321
Again, how would ...
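A short sketch of why "without losing any value" is not achievable here with binary floating point: a finite binary fraction always has a power-of-two denominator, while 9.8765 = 19753/2000 does not. The code assumes Python's `float` is an IEEE 754 double and uses `fractions.Fraction` to show the exact value that actually gets stored; `stored_value` is a name made up for illustration.

```python
from fractions import Fraction

def stored_value(x):
    """Exact rational value of the double that the literal x was rounded to."""
    return Fraction(x)

cases = (
    (9.8765, Fraction(98765, 10 ** 4)),
    (9.87654321, Fraction(987654321, 10 ** 8)),
)
for literal, exact in cases:
    stored = stored_value(literal)
    # The stored value differs from the decimal value by a tiny rounding error
    print(literal, stored == exact, float(abs(stored - exact)))
```

A value such as 9.875 = 79/8, by contrast, is stored exactly because its denominator is a power of two.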
0votes
0answers
36views
Floating point multiplication in JavaScript
I tried the below JavaScript in Chrome console:
// Case 1
0.11 * 10000 // 1100
0.14 * 10000 // 1400.000...2
// Both 0.11 and 0.14 have infinite binary expansions
// Case 2
0.14 * 10 // 1.4000...1
0....
0votes
1answer
51views
Verify floating point representation
I want to encode -(263.125) in base 10.
I encoded it and arrived at this solution :
11000011110000011100100000000000
I just want to make sure that it is correct.
Thank you in advance.
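An encoding like this can be checked mechanically. The sketch below assumes the target format is IEEE 754 single precision (binary32) and uses Python's `struct` module to encode -263.125 and to decode the bit string from the question; the helper names are made up for illustration.

```python
import struct

def float32_bits(x):
    """Return the IEEE 754 binary32 encoding of x as a 32-character bit string."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    return format(word, "032b")

def bits_to_float32(bits):
    """Decode a 32-character bit string as an IEEE 754 binary32 value."""
    (value,) = struct.unpack(">f", struct.pack(">I", int(bits, 2)))
    return value

print(float32_bits(-263.125))
print(bits_to_float32("11000011110000011100100000000000"))
```

Under that assumption the two do not match: the string in the question decodes to a different value, which would be consistent with the leading 1 of 1.00000111001 having been written into the fraction field instead of being left implicit.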
0votes
0answers
59views
Why are floats denormalized to deal with underflow instead of overflow?
When denormalized, couldn't the exponent on, let's say, single-precision floats represent 128 (instead of -126) and the mantissa (with an added 1 at the end) just be multiplied by 2^{128}? This ...
0votes
1answer
679views
How to convert decimal (with floating point) to binary in Swift 3? (self written code without third-party library and Foundation)
I am looking for a simple way to convert a decimal with floating point to binary with floating point in Swift 3. For example, this code converts decimal to binary without any problems.
func ...
0votes
1answer
212views
Converting float to binary using IEEE754 standard
How do I convert a float to 32-bit binary using the IEEE 754 standard in Elixir?
Converting integer is possible by using Integer.to_string/2 and passing base 2 as the second option
iex> Integer.to_string(5, ...
0votes
0answers
32views
Should the exponent be larger than or equal to 2^52 for the 64 bit floating point to be an integer irrespective of the value of the mantissa
I'm trying to understand if the exponent should be larger than 2^52 for the 64 bit floating point to always be an integer. It seems so to me. Here is my reasoning:
If mantissa has 52 bits, then the ...
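A small experiment along these lines, assuming IEEE 754 binary64 with a 52-bit fraction field and an exponent bias of 1023. With an unbiased exponent of 52 (values from 2^52 upward) the lowest fraction bit has weight 2^0, so any fraction pattern gives an integer; with exponent 51 the lowest bit has weight 2^-1. The helper name `make_double` is made up for illustration.

```python
import struct

def make_double(unbiased_exponent, fraction):
    """Build a positive normal double from an unbiased exponent and a 52-bit fraction field."""
    word = ((unbiased_exponent + 1023) << 52) | (fraction & ((1 << 52) - 1))
    (value,) = struct.unpack(">d", struct.pack(">Q", word))
    return value

# Exponent 52: setting only the lowest fraction bit adds exactly 1
print(make_double(52, 1), make_double(52, 1).is_integer())
# Exponent 51: setting only the lowest fraction bit adds 0.5
print(make_double(51, 1), make_double(51, 1).is_integer())
```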
0votes
0answers
77views
How does rounding off a binary number work?
I am currently learning floating points and my question is:
If I have a number like 0.01 and I want to round it off, then I have 0.1,
but if I have a number, 0.000000001, does that mean I have 0.1 as ...
0votes
0answers
47views
What is the result of (54.125) - (184)_10
I am practicing for a midterm and apparently there's no answer key for it.
However, I practiced and got a result, but I'm not sure if it is correct since the solution is really long.
Perform the following ...
0votes
0answers
207views
Is it possible to modify the bits of a float?
I'm wondering if there is a way to get the value of the mantissa or the value of the exponent and modify them in order to create a new float variable, for example say I want a float to have exponent ...
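The general idea can be sketched as follows, assuming an IEEE 754 binary32 float: reinterpret the value as a 32-bit integer, pull out the sign, biased exponent and fraction with shifts and masks, change a field, and reinterpret the result as a float again. The excerpt does not say which language is being used, so this sketch uses Python's `struct` module; the helper names are made up for illustration.

```python
import struct

def split_float32(x):
    """Split x (rounded to binary32) into (sign, biased exponent, fraction) integer fields."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    return word >> 31, (word >> 23) & 0xFF, word & 0x7FFFFF

def join_float32(sign, exponent, fraction):
    """Assemble a binary32 value from its three integer fields."""
    word = (sign << 31) | ((exponent & 0xFF) << 23) | (fraction & 0x7FFFFF)
    (value,) = struct.unpack(">f", struct.pack(">I", word))
    return value

sign, exponent, fraction = split_float32(1.5)
# Same sign and fraction, exponent field raised by one: the value doubles
print(join_float32(sign, exponent + 1, fraction))
```

Exponent fields 0 and 255 are reserved for zero/subnormals and infinity/NaN, so arbitrary edits to the exponent need to stay within 1..254 to remain a normal number.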
0votes
2answers
100views
Extracting bits and reconstructing in C
For a class project I'm trying to extract 3 bit fields from an IEEE floating point number, multiply it by 0.5, and reconstruct the number. I've gotten extraction mostly working by pushing them into ...
0votes
0answers
147views
How to convert a hexadecimal number to floating point binary by hand?
How does one convert a hexadecimal number to its floating point binary equivalent (assuming that one exists, IEEE 754) by hand?
I'd prefer simplified working rules (since this is not the main focus ...
0votes
0answers
72views
Calculate range for any n-Bit long extended floating point
I am trying to imagine an 80-bit extended precision form with a 1 bit sign, 16-bit exponent, and 63-bit fraction excluding the implied 1 before the binary point. I know that to calculate the bias for ...
0votes
0answers
163views
Convert 1.0 x 2^-140 to single precision floating point
So I'm trying to understand how I would convert a number like 1.0 x 2^-140 to single-precision floating point, considering it is out of range. Any tips on how I would go about solving this in binary?
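A sketch of one way to check this, assuming IEEE 754 binary32 with subnormal support. The smallest normal number is 2^-126, so 2^-140 is below the normal range, but subnormals (exponent field 0) represent fraction × 2^-149, and 2^-140 = 2^9 × 2^-149. The helper name `encode_float32` is made up for illustration.

```python
import struct

def encode_float32(x):
    """Return the (sign, biased exponent, fraction) fields of x rounded to binary32."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    return word >> 31, (word >> 23) & 0xFF, word & 0x7FFFFF

sign, exponent, fraction = encode_float32(2.0 ** -140)
# An exponent field of 0 marks a subnormal, whose value is fraction * 2**-149
print(sign, exponent, fraction, fraction * 2.0 ** -149 == 2.0 ** -140)
```

So under these assumptions the number is representable exactly, with an all-zero exponent field and a single 1 bit at position 9 of the fraction field.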
0votes
2answers
3kviews
read float and double from binary data in C++
I need to be able to read in a float or double from binary data in C++, similarly to Python's struct.unpack function. My issue is that the data I am receiving will always be big-endian. I have dealt ...
0votes
1answer
78views
How would I write this in IEEE standards?
I would like to know how to write 5/32 in IEEE754 standard. Is there a shortcut to do the fraction part?
The answer is 0 10000010 00100000000000000000000. But there has to be an easier way to write 5/...
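Because the denominator is a power of two, one shortcut is to write the numerator in binary and shift: 5/32 = 101 × 2^-5 = 1.01 × 2^-3, giving a biased exponent of 127 - 3 = 124 and a fraction field starting with 01. The sketch below, assuming binary32, prints the three fields so a hand calculation can be compared against it; the helper names are made up for illustration.

```python
import struct

def float32_fields(x):
    """Return the binary32 encoding of x as 'sign exponent fraction' bit groups."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    bits = format(word, "032b")
    return f"{bits[0]} {bits[1:9]} {bits[9:]}"

def decode_fields(text):
    """Decode a 'sign exponent fraction' string back into a binary32 value."""
    (value,) = struct.unpack(">f", struct.pack(">I", int(text.replace(" ", ""), 2)))
    return value

print(float32_fields(5 / 32))
print(decode_fields("0 10000010 00100000000000000000000"))
```

Under binary32 rules the string quoted in the question appears to decode to 9.0 rather than 5/32, so it may belong to a different number or a different convention.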
0votes
1answer
371views
fractional binary subtraction
I am having difficulty understanding why the following binary subtraction gives the result that it does. I keep getting a different answer. I am trying to compute 0.1-x such that x is 0....
0votes
2answers
135views
Floating point number representation in binary
I'm working on a problem out of Cracking The Coding Interview that asks: Given a 2-D graph with points on it, find a line which passes the most number of points.
The solution is to: Draw an infinite ...
-1votes
1answer
301views
reading float values from binary file (in after effects script)
I have a binary file containing data recorded using a C program.
The data stored in the file are float values.
Now I need to retrieve the float numbers from the binary file in an After Effects script.
This is ...
-1votes
2answers
775views
how to convert a binary string into two's complement and IEEE in C
So I have been tasked to write a program that scans in two 32-character binary strings from the user, which will always be 32 characters long. After reading the numbers in, I'm supposed to ask the ...
-1votes
2answers
80views
Binary representation of non-integer numbers
What is the binary representation of a float number in Java?
For example, what is the binary representation of 5.34?
Thanks.
-1votes
1answer
256views
IEEE-754 Floating Point Conversion
Just started a new class and I'm having trouble grasping the floating-point conversions. We were given a problem of a dollar amount, then to convert that to binary, then to hex, then to floating point....