Big Floating Point

Problem Description

Here's a program that performs the basic arithmetic operations on floating point numbers of arbitrary size.

Try this one on your Windows calculator!

  1234567890123456789012345678901234567890 E-10000
× 98765432109876543210987654321 E+10001
= 1.219326311370217952261850327337448559633622923332 E+69

       

Background & Techniques

There is nothing very useful about the results produced here.  It is mainly an investigation into how to combine exponents and output from our  Big Integer arithmetic unit to apply +, -, x and (maybe) ÷ operations to floating point numbers of arbitrary size. 

The program was prompted by a viewer's question about limitations of the Windows Calculator program. The Windows Calc.exe calculator limits input and output to 32 significant digits and input of exponents (using the Exp key) to 4 digits. Using the X^Y key we can enter exponents up to 43300. The largest intrinsic number type, Extended, uses an 80-bit (10-byte) number with 64 bits reserved for significant digits. This is equivalent to 19 or 20 decimal digits, so some sort of large number handling routines must be used to reach 32 digit accuracy. It is likely that the limitations are imposed by display space considerations and, perhaps, by the math library Microsoft uses to calculate function values. By the way, the PowerToy Calculator, available as a free download from Microsoft, allows users to choose 32, 64, 128 or 512 digit accuracy - with a warning that 512 digit mode may be very slow. I guess!
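As a quick check of the "19 or 20 decimal digits" figure: an n-bit significand carries roughly n × log10(2) decimal digits. The snippet below is a small Python sketch, not part of the Delphi program, and the function name is made up for illustration.

```python
import math

def decimal_digits(bits):
    """Approximate number of decimal digits that `bits` binary digits can carry."""
    return bits * math.log10(2)

# 64-bit significand of the Extended type
print(decimal_digits(64))
```

For 64 bits the value comes out a little over 19, which is why the 19th digit is always reliable and the 20th only sometimes.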

I allow up to 50 digits to be used, primarily to keep the display space to a reasonable size.

Programmer's Notes

I define a TBigFloat class which contains three fields: Exponent, the exponent of the number; DecPart, a TInteger big integer object containing the digits; and an integer field holding the number of significant digits to display.

The DecPart field represents a decimal value between -1 and +1. Exponent is the power of ten that DecPart must be multiplied by to get the real value of the number. So negative exponents reflect the number of 0's that must be inserted between the decimal point and the leftmost digit of DecPart. Positive exponents represent the position of the decimal point moving right from the leftmost digit of DecPart, adding 0's to the right end of the number if required.
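The representation just described can be sketched in a few lines. This is Python rather than Delphi, and the names BigFloatSketch, digits and to_fraction are hypothetical stand-ins, not the fields or methods of TBigFloat; the sign is kept in a separate flag purely to keep the sketch short.

```python
from dataclasses import dataclass
from fractions import Fraction

@dataclass
class BigFloatSketch:
    digits: int             # significant digits, read as 0.digits (the role of DecPart)
    exponent: int           # power of ten that 0.digits is multiplied by
    negative: bool = False  # sign flag (an assumption of this sketch)

    def to_fraction(self):
        """Exact value: 0.digits * 10**exponent, with the sign applied."""
        ndigits = len(str(self.digits))
        value = Fraction(self.digits, 10 ** ndigits) * Fraction(10) ** self.exponent
        return -value if self.negative else value

# The same digits "125" with different exponents:
print(BigFloatSketch(125, 0).to_fraction())    # 0.125   as an exact fraction
print(BigFloatSketch(125, -2).to_fraction())   # 0.00125 (two 0's inserted after the point)
print(BigFloatSketch(125, 2).to_fraction())    # 12.5    (point moved two places right)
print(BigFloatSketch(125, 5).to_fraction())    # 12500   (0's added on the right)
```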

Procedure GetNumber converts a string to its internal representation and ShowNumber builds a string version of the number in "Normal" or "Scientific" format. Split is a method that uses the number of digits and the exponent value to determine how many digits are to the left and right of the implied decimal point. This allows the appropriate operand to be shifted left (multiplied by 10) as necessary to align the decimal points before adding.
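A rough sketch of the alignment step, in Python rather than Delphi. It assumes non-negative operands given as (digits, exponent) pairs where the value is 0.digits × 10^exponent. The names split, to_scaled_int and add_aligned are illustrative only and do not reproduce the signatures of the unit's own methods.

```python
def split(digits, exponent):
    """Return (digits left of the decimal point, digits right of it)."""
    ndigits = len(str(digits))
    left = max(exponent, 0)
    right = max(ndigits - exponent, 0)
    return left, right

def to_scaled_int(digits, exponent, right):
    """Integer equal to value * 10**right, i.e. the operand shifted left."""
    shift = right - (len(str(digits)) - exponent)
    return digits * 10 ** shift

def add_aligned(a, b):
    """Add two non-negative (digits, exponent) pairs after aligning the points."""
    right = max(split(*a)[1], split(*b)[1])     # decimals needed by either operand
    total = to_scaled_int(*a, right) + to_scaled_int(*b, right)
    # total / 10**right is the sum; convert back to the 0.digits form
    return total, len(str(total)) - right

# 12.5 + 0.00125 = 12.50125, i.e. 0.1250125 x 10^2
print(add_aligned((125, 2), (125, -2)))
```

The result is not normalized: trailing zeros in the digits are left in place, which a display routine would trim.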

Addendum March 23, 2003: A viewer suggested an algorithm for division, which I implemented today. The new version divides by successive approximation. An initial range for the quotient is set with 0 as the lowest value and 10^(dividend exponent - divisor exponent) as the maximum. The algorithm loops, making quotient guesses by splitting the difference between the last guess that was too high and the last guess that was too low. During each iteration, the guess is checked by multiplying it by the divisor and comparing the result to the dividend. This may not be the optimum method, but it seems to work on the cases tested so far. A "rounding" procedure was required to trim results back to the specified number of significant digits. Otherwise exact divisors tended to produce quotients ending in 9999999999... The ShowNumber procedure was modified to remove extra trailing 0's after a decimal point, regardless of the significant digits specification.
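The successive-approximation loop can be sketched as follows. This is a Python illustration built on the standard decimal module, not the Delphi code from the unit. It handles positive operands only, and it uses a slightly looser upper bound of 10^(exponent difference + 1) so that the starting range is safe for any pair of leading digits; that bound, the tolerance and the function name are assumptions of the sketch.

```python
from decimal import Decimal, localcontext

def divide_by_bisection(dividend, divisor, sig_digits=50):
    """Approximate dividend / divisor for positive Decimals by bisection."""
    if dividend <= 0 or divisor <= 0:
        raise ValueError("this sketch handles positive operands only")
    # Difference of the leading-digit exponents of the two operands.
    exp_diff = dividend.adjusted() - divisor.adjusted()
    with localcontext() as ctx:
        ctx.prec = sig_digits + 10                  # work with guard digits
        low = Decimal(0)                            # a guess known to be too low
        high = Decimal(1).scaleb(exp_diff + 1)      # a guess known to be too high
        tolerance = Decimal(1).scaleb(exp_diff - sig_digits - 5)
        while high - low > tolerance:
            guess = (low + high) / 2                # split the difference
            if guess * divisor > dividend:          # check by multiplying back
                high = guess
            else:
                low = guess
        quotient = (low + high) / 2
    with localcontext() as ctx:
        ctx.prec = sig_digits                       # the "rounding" step
        return +quotient                            # unary plus rounds to ctx.prec

print(divide_by_bisection(Decimal(1), Decimal(4), 20))
```

For an exact quotient such as 1/4 the guesses close in on 0.25 from both sides without ever landing on it, so the final rounding step is what turns a string of trailing 9's into the expected answer.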

Addendum April 13, 2005: The TInteger class used by TBigFloat has been moved to the DFF Library, in unit UBigIntsV2. If you wish to recompile this program, a one-time download of the library will be necessary.

Addendum April 4, 2006: UBigFloatV2 was added to the DFF Library file and removed from the source zip file. The BigFloatTest program has added the Compare operation for testing.

Addendum December 5, 2006: Charles Doumar has contributed a number of additions to the UBigFloatV2 unit including Power, Log  (natural and base 10), and Exp functions.   The DFF Library file has been upgraded to DFFLibV08 to include the revised UBigFloatV2 unit and a few others.      BigFloatTest  was upgraded to test the new big float functions.

Addendum February 7, 2007: A few changes/enhancements posted today in UBigFloatV3 contained in a new library release DFFLibV10. Most were "clean-up" activities except for the change in the definition of the Round procedure.

• TFloatInt, the large integer descendant of our TInteger class, now contains several shift routines that were formerly part of TInteger, but existed only for use here. TFloatInt holds the digit values for TBigFloat numbers.
• Support procedures AssignHalf, AssignTwo, AssignThree, AssignFour, Squareraw, GetNumber, and ShowNumber were moved to the Protected section.
• "Maxsigs", maximum significant digits, parameters changed from cardinal to integer type to avoid Delphi widening both parameters when comparing to integer types.
• Moved zlog... variables used internally from the Interface to the Implementation section.
• Changed old "Round" procedure to RoundToPrec (round to a specified number of significant digits) and defined new Round procedure to agree with Trunc, Ceiling and Floor "round to" digits definition. The parameter specifies the "round to" position relative to the decimal point. So 0 returns an integer value, 1 "rounds" to 1/10, 2 "rounds" to 1/100, -1 "rounds" to a multiple of 10, etc. The specific "rounding" operation performed depends on the procedure called. The revised BigFloatTest program now allows results for all 5 procedures to be calculated.
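The "round to" position convention in the last item can be illustrated with Python's decimal module. This is only a sketch of the convention, not the unit's code: the function names are invented, and using half-up rounding for Round is an assumption, since the text does not say how ties are handled.

```python
from decimal import (Decimal, localcontext, ROUND_HALF_UP, ROUND_DOWN,
                     ROUND_CEILING, ROUND_FLOOR)

def _to_position(value, position, mode):
    """Quantize value to a multiple of 10**-position using the given mode."""
    unit = Decimal(1).scaleb(-position)   # position 1 -> 0.1, 0 -> 1, -1 -> 10
    return value.quantize(unit, rounding=mode)

def round_to(value, position):
    return _to_position(value, position, ROUND_HALF_UP)   # tie rule is assumed

def trunc_to(value, position):
    return _to_position(value, position, ROUND_DOWN)      # toward zero

def ceiling_to(value, position):
    return _to_position(value, position, ROUND_CEILING)   # toward +infinity

def floor_to(value, position):
    return _to_position(value, position, ROUND_FLOOR)     # toward -infinity

def round_to_prec(value, sig_digits):
    """Round to a number of significant digits instead of a position."""
    with localcontext() as ctx:
        ctx.prec = sig_digits
        ctx.rounding = ROUND_HALF_UP
        return +value

x = Decimal("1234.567")
print(round_to(x, 1), round_to(x, 0), round_to(x, -1))
```

Here position 1 gives 1234.6, position 0 gives 1235, and position -1 gives a value equal to 1230 (Python displays it in exponent form). The four position-based functions differ only in the rounding mode passed to quantize.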

Addendum October 16, 2009:  BigFloatTest was reposted today to incorporate two small changes to UBigFloatV3.  The Add procedure to add one TBigFloat number to another could produce erroneous results when a number with a very large exponent was added to another with a very large negative exponent (a very small number).    Also procedure Reciprocal could loop and produce an "Out of Memory" error under certain conditions.  Division uses Reciprocal to divide by multiplying and a user encountered the error when computing 1/99,999,999,999,999,999,999.   Thanks to Charles Doumar for the corrections posted yesterday in UBigFloatV3 in library file DFFLibV13.  

August 29, 2012: The "Round" procedure in UBigFloatV3 was rewritten today to correct erroneous results when 0 digits to the right of the decimal point were specified and values were between -1 and +1.

May 11, 2015:  A memory leak in UBigIntsForFloatV4 unit was corrected today.  BigFloatTest was changed to report allocated memory after each test to verify the correction and to check future changes.   Programmers can avoid re-downloading the library zip file by adding the line inherited; as the last statement of TInteger.Free method in UBigIntsForFloatV4.

September 20, 2016: A viewer recently reported a significant error with the BigFloat "Divide" and "Reciprocal" operations. Divide works by computing the reciprocal of the denominator and multiplying by the numerator. If the value passed to Reciprocal, directly or as a denominator, was negative, the values returned were incorrect. The error has existed for several months, so hopefully negative denominators are rare. If you use this unit and 1/-1 does not return -1 as the result, you need this fix!

Our DFF Library zip file has been updated with the corrected UBigFloatV3.pas file and reposted as file DFFLibV14_20Sep2016.zip.  The updated version is also included in the source code download so no need to download the library to recompile and test this program.    
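The divide-by-reciprocal idea, including the sign handling that the fix addresses, can be sketched as below. This is Python, not the code in UBigFloatV3. The text does not describe how Reciprocal computes its value, so the Newton iteration used here is a stand-in chosen for the sketch, and the function names and guard-digit counts are assumptions.

```python
from decimal import Decimal, localcontext

def reciprocal(x, sig_digits=50):
    """1/x for any non-zero Decimal, rounded to sig_digits significant digits."""
    if x == 0:
        raise ZeroDivisionError("reciprocal of zero")
    magnitude = x.copy_abs()                   # work on |x|, restore the sign last
    with localcontext() as ctx:
        ctx.prec = sig_digits + 10             # guard digits
        # magnitude < 10**(adjusted + 1), so this first guess lies below 1/magnitude
        guess = Decimal(1).scaleb(-(magnitude.adjusted() + 1))
        for _ in range(100):                   # far more passes than ever needed
            improved = guess * (2 - magnitude * guess)   # Newton step for 1/m
            if improved == guess:
                break
            guess = improved
    with localcontext() as ctx:
        ctx.prec = sig_digits
        result = +guess                        # round to the requested digits
    return result.copy_sign(x)                 # negative input, negative result

def divide(numerator, denominator, sig_digits=50):
    """numerator / denominator computed as numerator * (1 / denominator)."""
    inverse = reciprocal(denominator, sig_digits + 5)
    with localcontext() as ctx:
        ctx.prec = sig_digits
        return numerator * inverse

print(divide(Decimal(1), Decimal(-1)))
```

A check such as divide(1, -1) = -1 makes a convenient regression test for the sign handling, and 1/99,999,999,999,999,999,999 from the 2009 addendum is a reasonable second case for the loop limit.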

Running/Exploring the Program 

• Download executable
• Download source (requires DFF Library source DFFLibV14_20Sep2016 or later)
• Download current DFF Library source (DFFLibV15)

Note: The Lazarus (Free Pascal based) programs downloadable below are not currently maintained. 

• Download Lazarus source
• Download current DFF Lazarus Library source (DFFLazLib01)

Suggestions for Further Explorations

(Done 3/23/03 - see addendum note above.) Need to complete the "divide" operation just to learn a little more about what division really means. If you ever take a class in formal logic, one of the tautologies (logic theorems) you learn is named "Modus Tollens": given the statements A and B, if we know "If A is true then B is true" and "B is false", we can conclude "A is false". In other words, the statements "If A is true then B is true" and "If B is false then A is false" are logically equivalent. This reflects my attitude toward programming and problem solving in general: "If I understand the problem, I can solve it", which implies "If I can't solve a problem, then I just don't understand it!". If nothing else, this approach to problem solving inspires the persistence required.

Original Date: March 11, 2003 

Modified: July 29, 2017

 
Copyright © 2000-2018, Gary Darby    All rights reserved.