# All Questions

43 questions with no upvoted or accepted answers
57views

### Generate all numbers of the binary system (B=2, t=3, L=-2, U=3)

Suppose we have the following binary system (B=2, t=3, L=-2, U=3) where B is the base of the system, since it's a binary system, B is of course 2. t is the precision of the number, usually refers to ...
1vote
60views

### How to deterministically divide floats when there is a known deviation from IEEE 754?

DirectX 11 allows GPU manufacturers to deviate from the rounding behavior specified in the IEEE 754 standard. I cannot enable IEEE strictness, because I don't control the shader compilation process. ...
1vote
69views

### 9 Bit Floating Point to Hex

I've been running into an issue where I'm trying to convert a 9-bit floating point number to hex, where the floating point scheme is 1 sign-bit, followed by 4 bit exponent, and then a 4-bit mantissa. ...
1vote
35views

### Decimal-module is not working on binary level? How?

I'm a beginner when it comes to what's going on in the background when I start the program. Right now my focus is on the difference between "decimal" and "float" in "Python" ...
1vote
76views

### Converting a Twos-complement number to its binary representation

I am doing some bitwise manipulation. I am adding a 32 bit number to another 32 bit number with |= instead of += because I was thinking the number might be getting messed up when it converts to 32 ...
1vote
125views

### PHP unpack float return unexpected answer

I have some kind of binary data, and I am trying to read a float from it. We already have a Java version of the program, so we know what result we should get. When we try with the following ...
1vote
341views

### Largest integer that can be stored in a double such that all integers less than it can be accurately stored as well

This is some more clarification to the question that was already answered some time ago here: biggest integer that can be stored in a double The top answer mentions that "the largest integer such ...
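
One quick way to probe the claim quoted in this question is to check where integers stop surviving a round trip through a double. The following is a minimal sketch, assuming the platform's Python `float` is an IEEE 754 binary64; the helper name `is_exact` is made up for illustration, and only a small window of integers just below the limit is sampled rather than every integer.

```python
def is_exact(n: int) -> bool:
    """Return True if the integer n survives a round trip through a double."""
    return int(float(n)) == n


# A binary64 significand has 53 bits (52 stored plus 1 implicit).
limit = 2 ** 53

# Sample the 1000 integers just below the limit, and the limit itself.
below_ok = all(is_exact(n) for n in range(limit - 1000, limit + 1))

# The first integer above the limit needs 54 significant bits.
next_ok = is_exact(limit + 1)

print(below_ok)  # True
print(next_ok)   # False
```

The sample only shows where the first gap appears; that every smaller integer is exact follows from the 53-bit significand, not from the loop.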
1vote
626views

### Adding two IEEE floating point numbers in Java

I am having a difficult time figuring out how to correctly add two IEEE floating point numbers using Java. I'm not sure how to proceed in actually adding the mantissas together because I don't get how ...
1vote
729views

### Create a function that converts decimals into IEEE 754 floating point precision numbers using MATLAB?

I need to create a function that does exactly as the title says and outputs a 32 character string. So far I can get the signed bit and the exponential part correctly. What I'm struggling with is ...
1vote
3kviews

### C: convert a real number to 64 bit floating point binary

I'm trying to write a code that converts a real number to a 64 bit floating point binary. In order to do this, the user inputs a real number (for example, 547.4242) and the program must output a 64 ...
1vote
2kviews

### Converting from Double to Binary

I've been having a lot of trouble figuring out this class problem. My due date is tomorrow and I still don't know how to do it. I wrote code in which the user's input is converted into binary, octal, and ...
30views

### Is anyone able to understand how to add these minifloat (1 bit sign, 3 bit exponent, 4 bit fraction) bit numbers?

I have a task where we add two binary numbers, but I do not understand the solution. We are given binary numbers in minifloat format. In our case, the minifloat is defined like ...
40views

### Binary Floating point to decimal JavaScript function

To answer the original question that was closed due to invalid moderation: Binary Floating point to decimal JavaScript function. The function below basically does the math based on the floating ...
96views

### Manually calculating IEEE-754 floating point fractions and splitting up the bits - Python

I'm trying to come up with a way to do this: Let's say the fraction portion of my IEEE-754 floating point number is 0b10110011001100110011010. I'm trying to take each bit and multiply it by a power of 2 ...
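
The bit-by-bit weighting described here can be sketched as follows. This is only an illustration under stated assumptions: the 23-bit string is the one quoted in the excerpt, the leftmost bit is assumed to carry weight 2^-1, and the function name `fraction_value` is hypothetical.

```python
# The 23-bit fraction field quoted in the question, as a string of bits.
FRACTION_BITS = "10110011001100110011010"


def fraction_value(bits: str) -> float:
    """Sum bit_k * 2**-k, where k = 1 is the leftmost fraction bit."""
    total = 0.0
    for position, bit in enumerate(bits, start=1):
        if bit == "1":
            total += 2.0 ** -position
    return total


fraction = fraction_value(FRACTION_BITS)

# For a normalized number the implicit leading 1 is added on top.
significand = 1.0 + fraction

print(fraction)
print(significand)
```

The same value can be obtained in one step as `int(FRACTION_BITS, 2) / 2**23`; the loop just makes the per-bit powers of 2 explicit.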
119views

### Floating point addition / subtraction

I get confused because of the hidden bit in the mantissa. From what I know: subtract the two exponents, find the smaller number and shift the mantissa with the hidden bit (?) by the result of the ...
383views

### IEEE-754 Floating Point Standard: Representing Numbers

Given an IEEE-754 standard floating point number with 6 bits of exponent, and 25 bits of mantissa. 1: What's the smallest non-infinite positive integer this representation CANNOT represent? My answer: ...
435views

### How to convert DEC 64bit double precision floating point to IEEE-754 (DEC is not decimal)

To clarify here DEC in this case is Digital Equipment Corporation NOT decimal. What I have is a binary representation of DEC with seeeeeeeefffff....(total f's is 54) s = sign bit, e = exponent, f = ...
175views

### Floating-point mantissa and exponent base 2

I'm trying to understand how to get the mantissa and the exponent in this case. Here's an example I have in my book. I have this formula: x = (-1)^s * (1 + M) * b^(E-e), where s = 0 or 1 (the sign), M = mantissa ...
269views

### Calculating smallest positive floating point number

A 16-bit floating point representation is defined like this: 1 bit for sign, 6 bits for exponent and 9 bits for significand. Floating point number must be normalized (in the form 0.1... × 2^exp). ...
351views

### Convert periodic binary number to decimal

I have a simple question that is confusing me. Convert the periodic binary number (0.1011)_2 (imagine a bar over the digits after the comma) to a decimal representation. If it weren't periodic, no ...
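
A repeating block of n binary digits with integer value b sums, as a geometric series, to b / (2^n - 1). The following is a minimal sketch of that identity using exact rational arithmetic; the function name `repeating_binary_to_fraction` is made up for illustration, and it assumes the repetition starts immediately after the binary point, as in the example from the question.

```python
from fractions import Fraction


def repeating_binary_to_fraction(block: str) -> Fraction:
    """Exact value of 0.(block) in base 2, where block repeats forever."""
    # b/2^n + b/2^(2n) + ... = b / (2^n - 1)
    return Fraction(int(block, 2), 2 ** len(block) - 1)


value = repeating_binary_to_fraction("1011")

print(value)         # 11/15
print(float(value))
```

Since 11/15 has a factor of 3 in its denominator, the decimal expansion is itself periodic (0.7333...).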
403views

### Decimal fraction to binary with precision

Consider the following fraction: 9.8765 How do I go about converting it to precise floating point binary without losing any value? Now consider the following fraction 9.87654321 Again, how would ...
36views

### Floating point multiplication in JavaScript

I tried the below JavaScript in Chrome console: // Case 1 0.11 * 10000 // 1100 0.14 * 10000 // 1400.000...2 // Both of 0.11 and 0.14 are infinite in binary number // Case 2 0.14 * 10 // 1.4000...1 0....
51views

### Verify floating point representation

I want to encode -(263.125) in base 10. I encoded it and arrived at this solution : 11000011110000011100100000000000 I just want to make sure that it is correct. Thank you in advance.
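
An encoding like this can be cross-checked mechanically. The following is a minimal sketch, assuming the target format is IEEE 754 single precision (binary32); the helper name `float32_bits` is hypothetical.

```python
import struct


def float32_bits(x: float) -> str:
    """Return the 32-bit IEEE 754 single-precision pattern of x as a bit string."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    return f"{raw:032b}"


bits = float32_bits(-263.125)
sign, exponent, fraction = bits[0], bits[1:9], bits[9:]

print(sign, exponent, fraction)
# 1 10000111 00000111001000000000000
```

By hand, 263.125 is 100000111.001 in binary, or 1.00000111001 × 2^8, giving a biased exponent of 127 + 8 = 135. The string posted in the question appears to keep the leading 1 in the fraction field, which would explain why it differs from this output.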
59views

### Why are floats denormalized to deal with underflow instead of overflow?

When denormalized, couldn't the exponent on, let's say, single-precision floats represent 128 (instead of -126) and the mantissa (with an added 1 at the end) just be multiplied by 2^{128}? This ...
679views

### How to convert decimal (with floating point) to binary in Swift 3? (self-written code without third-party library and Foundation)

I am looking for a simple way to convert a decimal with floating point to binary with floating point in Swift 3. For example, this code converts decimal to binary without any problems. func ...
212views

### Converting float to binary using IEEE754 standard

How do I convert a float to its 32-bit binary representation using the IEEE 754 standard in Elixir? Converting an integer is possible by using Integer.to_string/2 and passing base 2 as the second option iex> Integer.to_string(5, ...
32views

### Should the exponent be larger than or equal 2^52 for the 64 bit floating point to be an integer irrespective of the value of the mantissa

I'm trying to understand if the exponent should be larger than 2^52 for the 64 bit floating point to always get integer. It seems so to me. Here is my reasoning: If mantissa has 52 bits, then the ...
77views

### How does rounding off a binary number work?

I am currently learning floating points and my question is: if I have a number like 0.01 and I want to round it off then I have 0.1, but if I have a number like 0.000000001, does that mean I have 0.1 as ...
47views

### what is the result of (54.125) - (184)10

I am practicing for a midterm and apparently there's no answer key for it. However, I practiced and got a result, but I'm not sure if it is correct since the solution is really long. Perform the following ...
207views

### Is it possible to modify the bits of a float?

I'm wondering if there is a way to get the value of the mantissa or the value of the exponent and modify them in order to create a new float variable, for example say I want a float to have exponent ...
100views

### Extracting bits and reconstructing in C

For a class project I'm trying to extract 3 bit fields from an IEEE floating point number, multiply it by 0.5, and reconstruct the number. I've gotten extraction mostly working by pushing them into ...
147views

### How to convert a hexadecimal number to floating point binary by hand?

How does one convert a hexadecimal number to its floating point binary equivalent (assuming that one exists, IEEE 754) by hand? I'd prefer simplified working rules (since this is not the main focus ...
72views

### Calculate range for any n-Bit long extended floating point

I am trying to imagine an 80-bit extended precision form with a 1 bit sign, 16-bit exponent, and 63-bit fraction excluding the implied 1 before the binary point. I know that to calculate the bias for ...
163views

### Convert 1.0 x 2^-140 to single precision floating point

So I'm trying to understand how I would convert such a number like 1.0 x 2^-140 in single floating point precision considering it is out of range. Any tips how I would go about solving this in binary?
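
A value this small falls below the smallest normal single-precision number (2^-126) but is still reachable as a subnormal, where the exponent field is all zeros and the value is fraction × 2^-126. The following is a minimal sketch for inspecting the bit pattern, assuming IEEE 754 binary32; the helper name `float32_bits` is hypothetical.

```python
import struct


def float32_bits(x: float) -> str:
    """Return the 32-bit IEEE 754 single-precision pattern of x as a bit string."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    return f"{raw:032b}"


value = 2.0 ** -140
bits = float32_bits(value)
sign, exponent, fraction = bits[0], bits[1:9], bits[9:]

print(sign, exponent, fraction)
# 0 00000000 00000000000001000000000

# Subnormal value = (fraction field / 2**23) * 2**-126
reconstructed = (int(fraction, 2) / 2 ** 23) * 2.0 ** -126
print(reconstructed == value)  # True
```

Since 2^-140 = 2^-14 × 2^-126, the single set bit lands in the 14th position of the fraction field.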
3kviews

### read float and double from binary data in C++

I need to be able to read in a float or double from binary data in C++, similarly to Python's struct.unpack function. My issue is that the data I am receiving will always be big-endian. I have dealt ...
78views

### How would I write this in IEEE standards?

I would like to know how to write 5/32 in IEEE754 standard. Is there a shortcut to do the fraction part? The answer is 0 10000010 00100000000000000000000. But there has to be an easier way to write 5/...
371views

### fractional binary subtraction

I am having difficulty understanding why the following binary subtraction gives the result that it does. I keep getting a different answer. I am trying to compute 0.1-x such that x is 0....
135views

### Floating point number representation in binary

I'm working on a problem out of Cracking The Coding Interview that asks: Given a 2-D graph with points on it, find a line which passes the most number of points. The solution is to: Draw an infinite ...
301views

### reading float values from binary file (in after effects script)

I have a binary file containing data recorded using a C program. The data stored in the file are float values. Now I need to retrieve the float numbers from the binary file in an After Effects script. This is ...
775views

### how to convert a binary string into two's complement and IEEE in C

So I have been tasked to write a program that scans in two 32-character binary strings from the user, which will always be 32 characters long. After reading the numbers in, I'm supposed to ask the ...