Casting from Float to Double

aml1138@yahoo.com · ‎05-04-2011

Where can I find a write-up about how to cast variables correctly? When I cast from float to double, the variable contains random numbers in the added precision. For example:

float a = 1.234567;

double b = 0.0;

b = (double)a;

then b is displayed as 1.23456704632568

Thanks

Kracken · ‎05-05-2011

Hi aml1138,

The result you are getting is expected behavior. A float has only about 6 to 7 significant decimal digits of precision and cannot represent 1.234567 exactly. When the number is cast to a double, you are seeing the additional digits that were already present in the float's binary value. You can show this with:

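#include <stdio.h>

int main (void)
{
    float a = 1.234567f;
    double b = 0;
    b = (double)a;   /* widening conversion; the value is unchanged */
    /* a is promoted to double when passed to printf, so both lines match */
    printf("%.15f\n%.15f\n", a, b);
    printf ("Press <Enter> to quit.");
    getchar ();
    return 0;
}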

Running this prints the same value twice, so you can see that the double is the same number as the float, but the float isn't exactly 1.234567.

I don't know of any write up that explains type casting, but the Wikipedia page on floating point numbers may be of some help.

Single precision floating-point format - Wikipedia, the free encyclopedia

Regards,

Brandon V.

Applications Engineering

National Instruments

www.ni.com/support

menchar · ‎05-06-2011

Going to increased floating-point precision can, in some cases, appear to decrease the accuracy of the decimal number being represented.

As Brandon pointed out, floating point is a binary approximation: a binary mantissa and a power-of-two exponent approximate a decimal number to some number of decimal digits of precision. Any given floating-point format can represent only a finite set of decimal numbers exactly; doubles can represent more of them than floats. When a decimal number cannot be represented exactly, the closest value that the format can express is used instead. Some decimal numbers, 0.1 for example, are not exactly representable in binary floating point at any precision.

When you view a floating-point number through a language-specific formatter (e.g. the C standard library's printf), the formatter rounds the binary value to some number of decimal digits. That binary value corresponds to a decimal number that may or may not be the decimal number you actually care about. So a float can hold a value that rounds to the exact decimal number when displayed, even though the float could not represent that number exactly. And when you simply cast the float to a double, the double doesn't somehow "know" what the original, true decimal number was; you just get the same value, as Brandon demonstrates.

If, on the other hand, you were to parse the original decimal number directly into a double (as a compiler does with a literal), you would potentially get a closer representation of the original decimal number. And if the float was displayed with limited precision, so that it rounded to the true decimal number, and you then display the cast double with more decimal digits, you can get a decimal number that appears to be less accurate than what you had with the float!

There are many papers on floating point, numeric precision, and calculation considerations. I co-authored one that IEEE published several years ago, oriented towards numeric issues in test software.

There are at least a few threads on this forum about floating-point representation. Anyone working with floating point and measurements gets bitten by this initially.

There's a very good IEEE floating-point calculator / converter / demo application on a CUNY website. I don't have the URL handy, but you can google it.

menchar · ‎05-06-2011

Here's the URL:

http://babbage.cs.qc.edu/IEEE-754/

There was an error on this webpage that I exchanged emails with the author / prof about. I believe he has fixed it.

aml1138@yahoo.com · ‎05-06-2011

Perhaps I misstated what I want. I want to convert the random numbers that were added to zero. Is there a standard method for that?

Kracken · ‎05-09-2011

Hi aml1138,

You can drop the additional digits with some rounding. You can do something like:

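#include <stdio.h>
#include <utility.h>   /* LabWindows/CVI Utility Library: RoundRealToNearestInteger */

int main (void)
{
    float a = 1.234567f;
    double b = 0, rounded;
    b = (double)a;
    /* scale 7 decimal places into the integer part, round, then scale back */
    rounded = RoundRealToNearestInteger (b*10000000);
    rounded = rounded/10000000;
    printf("%.15f\n%.15f\n%.15f\n",a,b,rounded);
    printf ("Press <Enter> to quit.");
    getchar ();
    return 0;
}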

The first two lines printed still show the extra digits; the third line, the rounded value, shows them dropped.

Regards,

Brandon V.

aml1138@yahoo.com · ‎05-09-2011

Yep, that's the way I've been doing it, but it seems too clumsy.
