
Representing Real Numbers Using Floating Point Notation
Lecture 6
CSCI 1405, CSCI 1301
Introduction to Computer Science
Fall 2009
Real Numbers in the Binary System
Remember that:
(101.001)_2
= 1*2^2 + 0*2^1 + 1*2^0 + 0*2^-1 + 0*2^-2 + 1*2^-3
= 2^2 + 2^0 + 2^-3
= 5.125
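Moving the binary point five places to the left, so that it sits just after the leading 1, gives the normalized form:
(101011.101)_2 (which is 43.625) = 1.01011101 * 2^5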
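The positional expansion above can be sketched in a few lines of Python. The function name `binary_to_decimal` is a hypothetical helper chosen for illustration and is not part of the lecture; it simply sums each digit times its power of two.

```python
def binary_to_decimal(bits: str) -> float:
    """Evaluate a binary string such as '101.001' by positional weights."""
    whole, _, frac = bits.partition(".")
    value = 0.0
    # Digits left of the point carry weights 2^0, 2^1, 2^2, ...
    for power, digit in enumerate(reversed(whole)):
        value += int(digit) * 2 ** power
    # Digits right of the point carry weights 2^-1, 2^-2, 2^-3, ...
    for power, digit in enumerate(frac, start=1):
        value += int(digit) * 2 ** -power
    return value


print(binary_to_decimal("101.001"))     # 5.125
print(binary_to_decimal("101011.101"))  # 43.625
```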
Single Precision Floating Point
• A single-precision floating-point number is a packet of 32 bits.
• It is divided into three fields of one bit, eight bits, and twenty-three bits, in that order:
Sign: 1 bit | Exponent: 8 bits | Mantissa: 23 bits
Single Precision Floating Point
• Sign Field: one bit long. It is either 0 or 1; 0 indicates that the
number is positive, 1 negative. The number 1.01011101 * 2^5 is
positive, so this field has a value of 0.
• Exponent Field: eight bits long. The stored "exponent" is actually
127 greater than the "real" exponent. For our number
1.01011101 * 2^5, the eight-bit exponent field has a decimal value
of 5 + 127 = 132. In binary this is 10000100. (Note: for normalized
numbers the real exponent ranges from -126 to +127; the all-zeros
and all-ones field values are reserved for special cases.)
• Mantissa Field: twenty-three bits long. In our number
1.01011101 * 2^5, the most significant 1 is assumed to be there and
is left out to give us just that much more precision. Thus, the
mantissa field for our number is 01011101000000000000000.
Sign: 0 | Exponent: 10000100 | Mantissa: 01011101000000000000000
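One way to check these three fields is to ask Python for the actual 32-bit pattern of 43.625. This is a minimal sketch using the standard `struct` module; the helper name `float32_fields` is an assumption made for illustration.

```python
import struct


def float32_fields(x: float) -> tuple[str, str, str]:
    """Return the (sign, exponent, mantissa) bit strings of x as a 32-bit float."""
    # Pack x as a big-endian single-precision float, then reinterpret
    # the same four bytes as an unsigned 32-bit integer.
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    bits = format(raw, "032b")
    # 1 sign bit, 8 exponent bits, 23 mantissa bits, in that order.
    return bits[0], bits[1:9], bits[9:]


sign, exponent, mantissa = float32_fields(43.625)
print(sign, exponent, mantissa)
# 0 10000100 01011101000000000000000
print(int(exponent, 2) - 127)  # 5, the real exponent
```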
Conversion from Decimal to Floating Point Representation
(329.390625)_10 = (?)_2
(329)_10 = (101001001)_2
(.390625)_10 = (0.011001)_2

Fraction * 2 = Result | Integer part
0.390625 * 2 = 0.78125 | 0
0.78125 * 2 = 1.5625 | 1
0.5625 * 2 = 1.125 | 1
0.125 * 2 = 0.25 | 0
0.25 * 2 = 0.5 | 0
0.5 * 2 = 1 | 1
Remaining fraction: 0
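The repeated multiplication by 2 can be sketched as a short loop. The function name `fraction_to_binary` and the `max_bits` cutoff are assumptions added for illustration; the cutoff only stops fractions that never terminate in binary (such as 0.1).

```python
def fraction_to_binary(frac: float, max_bits: int = 23) -> str:
    """Convert a fraction in [0, 1) to binary digits by repeated doubling."""
    digits = []
    while frac != 0 and len(digits) < max_bits:
        frac *= 2
        bit = int(frac)          # the integer part is the next binary digit
        digits.append(str(bit))
        frac -= bit              # keep only the fractional part
    return "".join(digits)


print(fraction_to_binary(0.390625))  # 011001
```

Reading the integer parts from top to bottom gives the digits after the binary point, so 0.390625 is 0.011001 in binary.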
Conversion from Decimal to Floating Point Representation
(329.390625)_10
= (101001001.011001)_2
= 1.01001001011001 * 2^8
• The sign is positive, so the sign field is 0.
• The exponent is 8. 8 + 127 = 135, so the exponent
field is 10000111.
• The mantissa is merely 01001001011001 (remember
the implied 1 of the mantissa means we don't include
the leading 1) plus however many 0s we have to add
to the right side to make that binary number 23 bits
long: 01001001011001000000000.
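The whole procedure (sign, normalize, bias the exponent, drop the leading 1) can be sketched as follows. This is an illustrative sketch, not a full IEEE 754 encoder: the name `to_float32_fields` is hypothetical, and the code assumes a nonzero value that fits exactly in 23 mantissa bits, so it truncates rather than rounds and ignores special cases.

```python
import struct


def to_float32_fields(x: float) -> tuple[str, str, str]:
    """Build (sign, exponent, mantissa) bit strings by hand for a nonzero x."""
    if x == 0:
        raise ValueError("zero has no normalized form")
    sign = "1" if x < 0 else "0"
    x = abs(x)
    # Normalize: shift the binary point until x looks like 1.f * 2^exponent.
    exponent = 0
    while x >= 2:
        x /= 2
        exponent += 1
    while x < 1:
        x *= 2
        exponent -= 1
    # Drop the implied leading 1 and emit 23 fraction bits by doubling.
    frac = x - 1
    mantissa = ""
    for _ in range(23):
        frac *= 2
        bit = int(frac)
        mantissa += str(bit)
        frac -= bit
    # Store the exponent with a bias of 127 in eight bits.
    return sign, format(exponent + 127, "08b"), mantissa


fields = to_float32_fields(329.390625)
print(fields)
# ('0', '10000111', '01001001011001000000000')

# Cross-check against the bit pattern Python's struct module produces.
(raw,) = struct.unpack(">I", struct.pack(">f", 329.390625))
print("".join(fields) == format(raw, "032b"))  # True
```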
Thank You