CHAPTER
FOURTEEN
INTERACTIVE INPUT EDITING AND HISTORY SUBSTITUTION
Some versions of the Python interpreter support editing of the current input line and history substitution, similar to facilities found in the Korn shell and the GNU Bash shell. This is implemented using the GNU Readline library, which supports various styles of editing. This library has its own documentation, which we won’t duplicate here.
14.1 Tab Completion and History Editing
Completion of variable and module names is automatically enabled at interpreter startup so that the Tab key invokes the completion function; it looks at Python statement names, the current local variables, and the available module names. For dotted expressions such as string.a, it will evaluate the expression up to the final '.' and then suggest completions from the attributes of the resulting object. Note that this may execute application-defined code if an object with a __getattr__() method is part of the expression. The default configuration also saves your history into a file named .python_history in your user directory. The history will be available again during the next interactive interpreter session.
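The completion step for dotted expressions can be tried by hand with the standard rlcompleter module. The following is a minimal sketch, not the interpreter's actual startup code; the helper name completions and the sample namespace are assumptions made for this example.

```python
import math
import rlcompleter


def completions(text, namespace):
    """Collect every completion that rlcompleter offers for `text`."""
    completer = rlcompleter.Completer(namespace)
    results = []
    state = 0
    while True:
        # complete() returns the state-th match, or None when exhausted.
        match = completer.complete(text, state)
        if match is None:
            break
        results.append(match)
        state += 1
    return results


# "math" is evaluated in the given namespace, then its attributes are searched.
matches = completions("math.sq", {"math": math})
print(matches)
```

Because the part before the final dot is evaluated, the same call on an object with a custom __getattr__() method could run that method.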
CHAPTER
FIFTEEN
FLOATING POINT ARITHMETIC: ISSUES AND LIMITATIONS
Floating-point numbers are represented in computer hardware as base 2 (binary) fractions. For example, the decimal fraction
0.125
has value 1/10 + 2/100 + 5/1000, and in the same way the binary fraction
0.001
has value 0/2 + 0/4 + 1/8. These two fractions have identical values, the only real difference being that the first is written in base 10 fractional notation, and the second in base 2.
Unfortunately, most decimal fractions cannot be represented exactly as binary fractions. A consequence is that, in general, the decimal floating-point numbers you enter are only approximated by the binary floating-point numbers actually stored in the machine.
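A quick check of this claim is possible with the fractions module (described later in this chapter). The sketch below compares the exact stored value of two float literals with the fractions they were meant to represent; the variable names are chosen only for this example.

```python
from fractions import Fraction

# Fraction(float) converts the stored binary value exactly, with no rounding.
eighth = Fraction(0.125)
tenth = Fraction(0.1)

# 0.125 is 1/8, a power-of-two fraction, so it is stored exactly.
eighth_is_exact = (eighth == Fraction(1, 8))
# 0.1 has no finite binary expansion, so only an approximation is stored.
tenth_is_exact = (tenth == Fraction(1, 10))

print(eighth_is_exact, tenth_is_exact)  # True False
```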
The problem is easier to understand at first in base 10. Consider the fraction 1/3. You can approximate that as a base 10 fraction:
0.3
or, better,
0.33
or, better,
0.333
and so on. No matter how many digits you’re willing to write down, the result will never be exactly 1/3, but will be an increasingly better approximation of 1/3.
In the same way, no matter how many base 2 digits you’re willing to use, the decimal value 0.1 cannot be represented exactly as a base 2 fraction. In base 2, 1/10 is the infinitely repeating fraction
0.0001100110011001100110011001100110011001100110011...
Stop at any finite number of bits, and you get an approximation. On most machines today, floats are approximated using a binary fraction with the numerator using the first 53 bits starting with the most significant bit and with the denominator as a power of two. In the case of 1/10, the binary fraction is 3602879701896397 / 2 ** 55 which is close to but not exactly equal to the true value of 1/10.
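The repeating expansion of 1/10 can be reproduced with exact rational arithmetic. This is an illustrative sketch; the function name binary_digits is an assumption of the example, and it only handles values between 0 and 1.

```python
from fractions import Fraction


def binary_digits(value, count):
    """Return the first `count` binary digits after the point, for 0 <= value < 1."""
    digits = []
    for _ in range(count):
        value *= 2             # shift the next binary digit in front of the point
        bit = int(value >= 1)  # that digit is 1 exactly when the value reached 1
        digits.append(str(bit))
        value -= bit           # keep only the fractional part
    return "".join(digits)


print("0." + binary_digits(Fraction(1, 10), 20))  # 0.00011001100110011001
print("0." + binary_digits(Fraction(1, 8), 3))    # 0.001
```

Using Fraction rather than float keeps every step exact, so the digits shown are those of the true value 1/10 and not of its stored approximation.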
Many users are not aware of the approximation because of the way values are displayed. Python only prints a decimal approximation to the true decimal value of the binary approximation stored by the machine. On most machines, if Python were to print the true decimal value of the binary approximation stored for 0.1, it would have to display
>>> 0.1
0.1000000000000000055511151231257827021181583404541015625
That is more digits than most people find useful, so Python keeps the number of digits manageable by displaying a rounded value instead:
>>> 1 / 10
0.1
Just remember, even though the printed result looks like the exact value of 1/10, the actual stored value is the nearest representable binary fraction.
Interestingly, there are many different decimal numbers that share the same nearest approximate binary fraction. For example, the numbers 0.1 and 0.10000000000000001 and 0.1000000000000000055511151231257827021181583404541015625 are all approximated by 3602879701896397 / 2 ** 55. Since all of these decimal values share the same approximation, any one of them could be displayed while still preserving the invariant eval(repr(x)) == x.
Historically, the Python prompt and built-in repr() function would choose the one with 17 significant digits, 0.10000000000000001. Starting with Python 3.1, Python (on most systems) is now able to choose the shortest of these and simply display 0.1.
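The claim that several decimal literals collapse to one stored value can be checked directly. This sketch assumes Python 3.1 or later on a platform with the shortest-repr behaviour described above.

```python
literals = [
    "0.1",
    "0.10000000000000001",
    "0.1000000000000000055511151231257827021181583404541015625",
]

values = [float(text) for text in literals]

# All three literals are converted to the same binary approximation.
all_same = all(value == values[0] for value in values)

# repr() picks the shortest string that still round-trips to that value.
shortest = repr(values[0])
round_trips = (float(shortest) == values[0])

print(all_same, shortest, round_trips)  # True 0.1 True
```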
Note that this is in the very nature of binary floating-point: this is not a bug in Python, and it is not a bug in your code either. You’ll see the same kind of thing in all languages that support your hardware’s floating-point arithmetic (although some languages may not display the difference by default, or in all output modes).
For more pleasant output, you may wish to use string formatting to produce a limited number of significant digits:
>>> import math
>>> format(math.pi, '.12g')  # give 12 significant digits
'3.14159265359'
>>> format(math.pi, '.2f') # give 2 digits after the point
'3.14'
>>> repr(math.pi)
'3.141592653589793'
It’s important to realize that this is, in a real sense, an illusion: you’re simply rounding the display of the true machine value.
One illusion may beget another. For example, since 0.1 is not exactly 1/10, summing three values of 0.1 may not yield exactly 0.3, either:
>>> .1 + .1 + .1 == .3
False
Also, since 0.1 cannot get any closer to the exact value of 1/10 and 0.3 cannot get any closer to the exact value of 3/10, pre-rounding with the round() function cannot help:
>>> round(.1, 1) + round(.1, 1) + round(.1, 1) == round(.3, 1)
False
Though the numbers cannot be made closer to their intended exact values, the round() function can be useful for post-rounding so that results with inexact values become comparable to one another:
>>> round(.1 + .1 + .1, 10) == round(.3, 10)
True
Python Tutorial, Release 3.6.4
Binary floating-point arithmetic holds many surprises like this. The problem with “0.1” is explained in precise detail below, in the “Representation Error” section. See The Perils of Floating Point for a more complete account of other common surprises.
As that says near the end, “there are no easy answers.” Still, don’t be unduly wary of floating-point! The errors in Python float operations are inherited from the floating-point hardware, and on most machines are on the order of no more than 1 part in 2**53 per operation. That’s more than adequate for most tasks, but you do need to keep in mind that it’s not decimal arithmetic and that every float operation can suffer a new rounding error.
While pathological cases do exist, for most casual use of floating-point arithmetic you’ll see the result you expect in the end if you simply round the display of your final results to the number of decimal digits you expect. str() usually suffices, and for finer control see the str.format() method’s format specifiers in Format String Syntax.
For use cases which require exact decimal representation, try using the decimal module which implements decimal arithmetic suitable for accounting applications and high-precision applications.
Another form of exact arithmetic is supported by the fractions module which implements arithmetic based on rational numbers (so the numbers like 1/3 can be represented exactly).
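As a small illustration of both modules, the sums that failed with binary floats earlier in this chapter come out exact here. Note that the Decimal values are built from strings; building them from float literals would carry the binary approximation along.

```python
from decimal import Decimal
from fractions import Fraction

# Decimal arithmetic: 0.1 is stored exactly as a decimal value.
decimal_sum = Decimal("0.1") + Decimal("0.1") + Decimal("0.1")
decimal_matches = (decimal_sum == Decimal("0.3"))

# Rational arithmetic: 1/3 is stored exactly as a numerator/denominator pair.
third = Fraction(1, 3)
fraction_matches = (third + third + third == 1)

print(decimal_sum, decimal_matches)  # 0.3 True
print(fraction_matches)              # True
```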
If you are a heavy user of floating point operations you should take a look at the Numerical Python package and many other packages for mathematical and statistical operations supplied by the SciPy project. See <https://scipy.org>.
Python provides tools that may help on those rare occasions when you really do want to know the exact value of a float. The float.as_integer_ratio() method expresses the value of a float as a fraction:
>>> x = 3.14159
>>> x.as_integer_ratio()
(3537115888337719, 1125899906842624)
Since the ratio is exact, it can be used to losslessly recreate the original value:
>>> x == 3537115888337719 / 1125899906842624
True
The float.hex() method expresses a float in hexadecimal (base 16), again giving the exact value stored by your computer:
>>> x.hex()
'0x1.921f9f01b866ep+1'
This precise hexadecimal representation can be used to reconstruct the float value exactly:
>>> x == float.fromhex('0x1.921f9f01b866ep+1')
True
Since the representation is exact, it is useful for reliably porting values across different versions of Python (platform independence) and exchanging data with other languages that support the same format (such as Java and C99).
Another helpful tool is the math.fsum() function which helps mitigate loss-of-precision during summation. It tracks “lost digits” as values are added onto a running total. That can make a difference in overall accuracy so that the errors do not accumulate to the point where they affect the final total:
>>> sum([0.1] * 10) == 1.0
False
>>> math.fsum([0.1] * 10) == 1.0
True
15.1 Representation Error
This section explains the “0.1” example in detail, and shows how you can perform an exact analysis of cases like this yourself. Basic familiarity with binary floating-point representation is assumed.
Representation error refers to the fact that some (most, actually) decimal fractions cannot be represented exactly as binary (base 2) fractions. This is the chief reason why Python (or Perl, C, C++, Java, Fortran, and many others) often won’t display the exact decimal number you expect.
Why is that? 1/10 is not exactly representable as a binary fraction. Almost all machines today (November 2000) use IEEE-754 floating point arithmetic, and almost all platforms map Python floats to IEEE-754 “double precision”. 754 doubles contain 53 bits of precision, so on input the computer strives to convert 0.1 to the closest fraction it can of the form J/2**N where J is an integer containing exactly 53 bits. Rewriting
1 / 10 ~= J / (2**N)
as
J ~= 2**N / 10
and recalling that J has exactly 53 bits (is >= 2**52 but < 2**53), the best value for N is 56:
>>> 2**52 <= 2**56 // 10 < 2**53
True
That is, 56 is the only value for N that leaves J with exactly 53 bits. The best possible value for J is then that quotient rounded:
>>> q, r = divmod(2**56, 10)
>>> r
6
Since the remainder is more than half of 10, the best approximation is obtained by rounding up:
>>> q+1
7205759403792794
Therefore the best possible approximation to 1/10 in 754 double precision is:
7205759403792794 / 2 ** 56
Dividing both the numerator and denominator by two reduces the fraction to:
3602879701896397 / 2 ** 55
Note that since we rounded up, this is actually a little bit larger than 1/10; if we had not rounded up, the quotient would have been a little bit smaller than 1/10. But in no case can it be exactly 1/10!
So the computer never “sees” 1/10: what it sees is the exact fraction given above, the best 754 double approximation it can get:
>>> 0.1 * 2 ** 55
3602879701896397.0
If we multiply that fraction by 10**55, we can see the value out to 55 decimal digits:
>>> 3602879701896397 * 10 ** 55 // 2 ** 55
1000000000000000055511151231257827021181583404541015625
meaning that the exact number stored in the computer is equal to the decimal value 0.1000000000000000055511151231257827021181583404541015625. Instead of displaying the full decimal value, many languages (including older versions of Python) round the result to 17 significant digits:
>>> format(0.1, '.17f')
'0.10000000000000001'
The fractions and decimal modules make these calculations easy:
>>> from decimal import Decimal
>>> from fractions import Fraction
>>> Fraction.from_float(0.1)
Fraction(3602879701896397, 36028797018963968)
>>> (0.1).as_integer_ratio()
(3602879701896397, 36028797018963968)
>>> Decimal.from_float(0.1)
Decimal('0.1000000000000000055511151231257827021181583404541015625')
>>> format(Decimal.from_float(0.1), '.17')
'0.10000000000000001'
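The analysis in this section can be repeated for other fractions. The function below is a sketch of the same steps (find N, divide, round the quotient); the name nearest_double is an assumption of this example, and the code only covers values strictly between 0 and 1 that are not so small as to be subnormal.

```python
from fractions import Fraction


def nearest_double(numer, denom):
    """Return (J, N) such that J / 2**N is the 53-bit fraction nearest numer/denom.

    Sketch only: assumes 0 < numer/denom < 1 and ignores subnormal values.
    """
    n = 0
    # Increase N until the truncated quotient J reaches 53 bits (J >= 2**52).
    while (numer << n) // denom < 2 ** 52:
        n += 1
    j, remainder = divmod(numer << n, denom)
    # Round to nearest; an exact tie goes to the even J, as IEEE-754 does.
    if 2 * remainder > denom or (2 * remainder == denom and j % 2 == 1):
        j += 1
    return j, n


j, n = nearest_double(1, 10)
print(j, n)                                  # 7205759403792794 56
print(Fraction(j, 2 ** n) == Fraction(0.1))  # True
```

The result for 1/10 matches the value 7205759403792794 / 2 ** 56 derived by hand above, and Fraction(0.1) confirms that it equals the float the machine actually stores.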
CHAPTER
SIXTEEN
APPENDIX
16.1 Interactive Mode
16.1.1 Error Handling
When an error occurs, the interpreter prints an error message and a stack trace. In interactive mode, it then returns to the primary prompt; when input came from a file, it exits with a nonzero exit status after printing the stack trace. (Exceptions handled by an except clause in a try statement are not errors in this context.) Some errors are unconditionally fatal and cause an exit with a nonzero exit status; this applies to internal inconsistencies and some cases of running out of memory. All error messages are written to the standard error stream; normal output from executed commands is written to standard output.
Typing the interrupt character (usually Control-C or Delete) to the primary or secondary prompt cancels the input and returns to the primary prompt.[1] Typing an interrupt while a command is executing raises the KeyboardInterrupt exception, which may be handled by a try statement.
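A minimal sketch of handling the interrupt in a try statement follows. The function names are hypothetical, and the interrupt is simulated by raising the exception directly so that the example runs without a keyboard.

```python
def run_interruptibly(task):
    """Run task() and report whether it finished or was interrupted."""
    try:
        task()
    except KeyboardInterrupt:
        # Reached when the interrupt character is typed while task() runs.
        return "interrupted"
    return "finished"


def simulated_interrupt():
    # Stand-in for a user pressing Control-C during a long computation.
    raise KeyboardInterrupt


print(run_interruptibly(simulated_interrupt))  # interrupted
print(run_interruptibly(lambda: None))         # finished
```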
16.1.2 Executable Python Scripts
On BSD’ish Unix systems, Python scripts can be made directly executable, like shell scripts, by putting the line
#!/usr/bin/env python3.5
(assuming that the interpreter is on the user’s PATH) at the beginning of the script and giving the file an executable mode. The #! must be the first two characters of the file. On some platforms, this first line must end with a Unix-style line ending ('\n'), not a Windows ('\r\n') line ending. Note that the hash, or pound, character, '#', is used to start a comment in Python.
The script can be given an executable mode, or permission, using the chmod command.
$ chmod +x myscript.py
On Windows systems, there is no notion of an “executable mode”. The Python installer automatically associates .py files with python.exe so that a double-click on a Python file will run it as a script. The extension can also be .pyw; in that case, the console window that normally appears is suppressed.
16.1.3 The Interactive Startup File
When you use Python interactively, it is frequently handy to have some standard commands executed every time the interpreter is started. You can do this by setting an environment variable named PYTHONSTARTUP to the name of a file containing your start-up commands. This is similar to the .profile feature of the Unix shells.
[1] A problem with the GNU Readline package may prevent this.
This file is only read in interactive sessions, not when Python reads commands from a script, and not when /dev/tty is given as the explicit source of commands (which otherwise behaves like an interactive session). It is executed in the same namespace where interactive commands are executed, so that objects that it defines or imports can be used without qualification in the interactive session. You can also change the prompts sys.ps1 and sys.ps2 in this file.
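For illustration, a start-up file might look like the sketch below. The prompt strings and the choice of imports are arbitrary assumptions, not recommended settings.

```python
# Hypothetical contents of the file named by PYTHONSTARTUP.
import os
import sys

# Names imported here can be used unqualified at the interactive prompt.
from pprint import pprint

# Replace the default primary and secondary prompts.
sys.ps1 = "py> "
sys.ps2 = "  | "
```

Since the file runs in the interactive namespace, os, sys and pprint would all be available at the first prompt without further imports.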
If you want to read an additional start-up file from the current directory, you can program this in the global start-up file using code like if os.path.isfile('.pythonrc.py'): exec(open('.pythonrc.py').read()). If you want to use the startup file in a script, you must do this explicitly in the script:
import os
filename = os.environ.get('PYTHONSTARTUP')
if filename and os.path.isfile(filename):
    with open(filename) as fobj:
        startup_file = fobj.read()
    exec(startup_file)
16.1.4 The Customization Modules
Python provides two hooks to let you customize it: sitecustomize and usercustomize. To see how it works, you need first to find the location of your user site-packages directory. Start Python and run this code:
>>> import site
>>> site.getusersitepackages()
'/home/user/.local/lib/python3.5/site-packages'
Now you can create a file named usercustomize.py in that directory and put anything you want in it. It will affect every invocation of Python, unless it is started with the -s option to disable the automatic import.
sitecustomize works in the same way, but is typically created by an administrator of the computer in the global site-packages directory, and is imported before usercustomize. See the documentation of the site module for more details.
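To check where a usercustomize.py would be picked up on a given installation, something like the following sketch can be used. It only inspects the file system and the site module; it does not create or import anything, and the variable names are chosen for this example.

```python
import os
import site

# Directory in which a per-user usercustomize.py would be looked up.
user_site = site.getusersitepackages()
candidate = os.path.join(user_site, "usercustomize.py")

print("user site-packages:", user_site)
# Not true when the user site directory is disabled, for example with -s.
print("user site enabled:", site.ENABLE_USER_SITE)
print("usercustomize.py present:", os.path.isfile(candidate))
```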
APPENDIX
A
GLOSSARY
>>> The default Python prompt of the interactive shell. Often seen for code examples which can be executed interactively in the interpreter.
... The default Python prompt of the interactive shell when entering code for an indented code block or within a pair of matching left and right delimiters (parentheses, square brackets or curly braces).
2to3 A tool that tries to convert Python 2.x code to Python 3.x code by handling most of the incompatibilities which can be detected by parsing the source and traversing the parse tree.
2to3 is available in the standard library as lib2to3; a standalone entry point is provided as Tools/scripts/2to3. See 2to3-reference.
abstract base class Abstract base classes complement duck-typing by providing a way to define interfaces when other techniques like hasattr() would be clumsy or subtly wrong (for example with magic methods). ABCs introduce virtual subclasses, which are classes that don’t inherit from a class but are still recognized by isinstance() and issubclass(); see the abc module documentation. Python comes with many built-in ABCs for data structures (in the collections.abc module), numbers (in the numbers module), streams (in the io module), import finders and loaders (in the importlib.abc module). You can create your own ABCs with the abc module.
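A short sketch of an ABC with one real subclass and one virtual subclass follows; the class names Shape, Square and Circle are invented for the example.

```python
from abc import ABC, abstractmethod


class Shape(ABC):
    @abstractmethod
    def area(self):
        """Return the area of the shape."""


class Square(Shape):
    """A real subclass: inherits from Shape and implements area()."""

    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


class Circle:
    """Does not inherit from Shape."""

    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return 3.14159 * self.radius ** 2


# Register Circle as a virtual subclass of Shape.
Shape.register(Circle)

print(isinstance(Square(2), Shape))  # True
print(issubclass(Circle, Shape))     # True
print(Shape in Circle.__mro__)       # False: no inheritance is involved
```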
argument A value passed to a function (or method) when calling the function. There are two kinds of argument:
• keyword argument: an argument preceded by an identifier (e.g. name=) in a function call or passed as a value in a dictionary preceded by **. For example, 3 and 5 are both keyword arguments in the following calls to complex():
complex(real=3, imag=5)
complex(**{'real': 3, 'imag': 5})
• positional argument: an argument that is not a keyword argument. Positional arguments can appear at the beginning of an argument list and/or be passed as elements of an iterable preceded by *. For example, 3 and 5 are both positional arguments in the following calls:
complex(3, 5)
complex(*(3, 5))
Arguments are assigned to the named local variables in a function body. See the calls section for the rules governing this assignment. Syntactically, any expression can be used to represent an argument; the evaluated value is assigned to the local variable.
See also the parameter glossary entry, the FAQ question on the difference between arguments and parameters, and PEP 362.
asynchronous context manager An object which controls the environment seen in an async with statement by defining __aenter__() and __aexit__() methods. Introduced by PEP 492.
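As a sketch, the class below defines both methods and records the order in which they run. The class name Session is hypothetical, and the explicit event-loop calls are just one way to drive the coroutine.

```python
import asyncio


class Session:
    """Records when the async with block is entered and left."""

    def __init__(self):
        self.events = []

    async def __aenter__(self):
        self.events.append("enter")
        return self

    async def __aexit__(self, exc_type, exc, tb):
        self.events.append("exit")
        return False  # do not suppress exceptions


async def main():
    session = Session()
    async with session as active:
        active.events.append("body")
    return session.events


loop = asyncio.new_event_loop()
try:
    events = loop.run_until_complete(main())
finally:
    loop.close()

print(events)  # ['enter', 'body', 'exit']
```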
asynchronous generator A function which returns an asynchronous generator iterator. It looks like a coroutine function defined with async def except that it contains yield expressions for producing a series of values usable in an async for loop.
Usually refers to an asynchronous generator function, but may refer to an asynchronous generator iterator in some contexts. In cases where the intended meaning isn’t clear, using the full terms avoids ambiguity.
An asynchronous generator function may contain await expressions as well as async for, and async with statements.
asynchronous generator iterator An object created by an asynchronous generator function.
This is an asynchronous iterator which when called using the __anext__() method returns an awaitable object which will execute the body of the asynchronous generator function until the next yield expression.
Each yield temporarily suspends processing, remembering the location and execution state (including local variables and pending try-statements). When the asynchronous generator iterator effectively resumes with another awaitable returned by __anext__(), it picks up where it left off. See PEP 492 and PEP 525.
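The following sketch shows an asynchronous generator function consumed by an async for loop. The name countdown is invented for the example, and asyncio.sleep(0) stands in for real asynchronous work.

```python
import asyncio


async def countdown(start):
    """Asynchronous generator function: async def with a yield inside."""
    while start > 0:
        await asyncio.sleep(0)  # await expressions are allowed in the body
        yield start             # suspends here until the next __anext__()
        start -= 1


async def main():
    collected = []
    # Calling countdown(3) returns an asynchronous generator iterator.
    async for value in countdown(3):
        collected.append(value)
    return collected


loop = asyncio.new_event_loop()
try:
    values = loop.run_until_complete(main())
finally:
    loop.close()

print(values)  # [3, 2, 1]
```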
asynchronous iterable An object that can be used in an async for statement. Must return an asynchronous iterator from its __aiter__() method. Introduced by PEP 492.
asynchronous iterator An object that implements __aiter__() and __anext__() methods. __anext__ must return an awaitable object. async for resolves the awaitables returned by an asynchronous iterator’s __anext__() method until it raises a StopAsyncIteration exception. Introduced by PEP 492.
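Written out by hand rather than with a generator, an asynchronous iterator might look like this sketch; the class name Ticker is an assumption of the example.

```python
import asyncio


class Ticker:
    """Asynchronous iterator yielding 1, 2, ..., limit."""

    def __init__(self, limit):
        self.limit = limit
        self.current = 0

    def __aiter__(self):
        # An asynchronous iterator returns itself from __aiter__().
        return self

    async def __anext__(self):
        if self.current >= self.limit:
            raise StopAsyncIteration  # tells async for to stop
        self.current += 1
        return self.current


async def main():
    ticks = []
    async for tick in Ticker(3):
        ticks.append(tick)
    return ticks


loop = asyncio.new_event_loop()
try:
    ticks = loop.run_until_complete(main())
finally:
    loop.close()

print(ticks)  # [1, 2, 3]
```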
attribute A value associated with an object which is referenced by name using dotted expressions. For example, if an object o has an attribute a it would be referenced as o.a.
awaitable An object that can be used in an await expression. Can be a coroutine or an object with an __await__() method. See also PEP 492.
BDFL Benevolent Dictator For Life, a.k.a. Guido van Rossum, Python’s creator.
binary file A file object able to read and write bytes-like objects. Examples of binary files are files opened in binary mode ('rb', 'wb' or 'rb+'), sys.stdin.buffer, sys.stdout.buffer, and instances of io.BytesIO and gzip.GzipFile.
See also:
A text file reads and writes str objects.
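The difference can be seen with the in-memory file objects from the io module. This is an illustrative sketch; io.StringIO is used here as the text counterpart of io.BytesIO.

```python
import io

binary_file = io.BytesIO()
binary_file.write(b"\x00\x01abc")  # accepts bytes-like objects

text_file = io.StringIO()
text_file.write("abc")             # accepts str objects

binary_value = binary_file.getvalue()
text_value = text_file.getvalue()

# Writing a str to a binary file is rejected.
try:
    binary_file.write("abc")
except TypeError:
    rejected_str = True
else:
    rejected_str = False

print(binary_value, text_value, rejected_str)
```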
bytes-like object An object that supports the buffer protocol and can export a C-contiguous buffer. This includes all bytes, bytearray, and array.array objects, as well as many common memoryview objects. Bytes-like objects can be used for various operations that work with binary data; these include compression, saving to a binary file, and sending over a socket.
Some operations need the binary data to be mutable. The documentation often refers to these as “read-write bytes-like objects”. Example mutable buffer objects include bytearray and a memoryview of a bytearray. Other operations require the binary data to be stored in immutable objects (“read-only bytes-like objects”); examples of these include bytes and a memoryview of a bytes object.
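A brief sketch of the read-write versus read-only distinction, using only built-in types:

```python
# Read-write: a memoryview of a bytearray writes through to the bytearray.
data = bytearray(b"hello")
view = memoryview(data)
view[0] = ord("j")
print(data)           # bytearray(b'jello')
print(view.readonly)  # False

# Read-only: a memoryview of a bytes object refuses item assignment.
frozen = memoryview(b"hello")
print(frozen.readonly)  # True
try:
    frozen[0] = ord("j")
except TypeError:
    write_rejected = True
else:
    write_rejected = False
print(write_rejected)  # True
```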
bytecode Python source code is compiled into bytecode, the internal representation of a Python program in the CPython interpreter. The bytecode is also cached in .pyc files so that executing the same file is faster the second time (recompilation from source to bytecode can be avoided). This “intermediate language” is said to run on a virtual machine that executes the machine code corresponding to each bytecode. Do note that bytecodes are not expected to work between different Python virtual machines, nor to be stable between Python releases.
A list of bytecode instructions can be found in the documentation for the dis module.
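To look at the bytecode of a small function, the dis module can be used as in the sketch below. The function add_one is invented for the example, and no expected output is shown because instruction names differ between Python releases, as noted above.

```python
import dis


def add_one(x):
    return x + 1


# get_instructions() yields one Instruction object per bytecode operation.
instructions = list(dis.get_instructions(add_one))
opnames = [instruction.opname for instruction in instructions]

print(opnames)
```

Calling dis.dis(add_one) instead prints a formatted listing of the same instructions.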
class A template for creating user-defined objects. Class definitions normally contain method definitions which operate on instances of the class.
coercion The implicit conversion of an instance of one type to another during an operation which involves two arguments of the same type. For example, int(3.15) converts the floating point number to the integer 3, but in 3+4.5, each argument is of a different type (one int, one float), and both must be converted to the same type before they can be added or it will raise a TypeError. Without coercion, all arguments of even compatible types would have to be normalized to the same value by the programmer, e.g., float(3)+4.5 rather than just 3+4.5.
complex number An extension of the familiar real number system in which all numbers are expressed as a sum of a real part and an imaginary part. Imaginary numbers are real multiples of the imaginary unit (the square root of -1), often written i in mathematics or j in engineering. Python has built-in support for complex numbers, which are written with this latter notation; the imaginary part is written with a j suffix, e.g., 3+1j. To get access to complex equivalents of the math module, use cmath. Use of complex numbers is a fairly advanced mathematical feature. If you’re not aware of a need for them, it’s almost certain you can safely ignore them.
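A few lines illustrating the notation and the cmath module:

```python
import cmath

z = 3 + 1j
print(z.real, z.imag)  # 3.0 1.0

# cmath.sqrt() accepts negative input, unlike math.sqrt().
root = cmath.sqrt(-1)
print(root)            # 1j

# abs() of a complex number is its magnitude.
print(abs(3 + 4j))     # 5.0
```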
context manager An object which controls the environment seen in a with statement by defining __enter__() and __exit__() methods. See PEP 343.
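The sketch below defines both methods and shows that __exit__() runs even when the body raises. The class name Tracker is hypothetical.

```python
class Tracker:
    """Records when the with block is entered and left."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc, tb):
        self.events.append("exit")
        return False  # a false value lets the exception propagate


tracker = Tracker()
try:
    with tracker:
        tracker.events.append("body")
        raise ValueError("boom")
except ValueError:
    tracker.events.append("propagated")

print(tracker.events)  # ['enter', 'body', 'exit', 'propagated']
```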
contiguous A buffer is considered contiguous exactly if it is either C-contiguous or Fortran contiguous.
Zero-dimensional buffers are C and Fortran contiguous. In one-dimensional arrays, the items must be laid out in memory next to each other, in order of increasing indexes starting from zero. In multidimensional C-contiguous arrays, the last index varies the fastest when visiting items in order of memory address. However, in Fortran contiguous arrays, the first index varies the fastest.
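These properties can be inspected on memoryview objects, which expose c_contiguous, f_contiguous and contiguous attributes. The sketch below views the same six bytes once as a one-dimensional buffer and once as a 2 by 3 array; the variable names are chosen for the example.

```python
# Six bytes viewed as a one-dimensional buffer.
flat = memoryview(bytes(6))
print(flat.c_contiguous, flat.f_contiguous)  # True True

# The same bytes viewed as a 2 x 3 array in C (row-major) order.
grid = flat.cast("B", shape=[2, 3])
print(grid.shape)                            # (2, 3)
print(grid.c_contiguous, grid.f_contiguous)  # True False
print(grid.contiguous)                       # True
```

The two-dimensional view is C-contiguous only, because its last index varies fastest in memory.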
coroutine Coroutines are a more generalized form of subroutines. Subroutines are entered at one point and exited at another point. Coroutines can be entered, exited, and resumed at many different points. They can be implemented with the async def statement. See also PEP 492.
coroutine function A function which returns a coroutine object. A coroutine function may be defined with the async def statement, and may contain await, async for, and async with keywords. These were introduced by PEP 492.
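A sketch of the distinction between a coroutine function and the coroutine object it returns follows. The function names are invented, and the explicit event-loop calls are just one way to run the coroutine.

```python
import asyncio
import inspect


async def double(value):
    await asyncio.sleep(0)  # a point where the coroutine can be suspended
    return value * 2


async def main():
    first = await double(2)
    second = await double(first)
    return second


# Calling a coroutine function runs nothing yet; it returns a coroutine object.
coro = main()
is_coroutine = inspect.iscoroutine(coro)

loop = asyncio.new_event_loop()
try:
    result = loop.run_until_complete(coro)
finally:
    loop.close()

print(is_coroutine, result)  # True 8
```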
CPython The canonical implementation of the Python programming language, as distributed on python.org. The term “CPython” is used when necessary to distinguish this implementation from others such as Jython or IronPython.
decorator A function returning another function, usually applied as a function transformation using the @wrapper syntax. Common examples for decorators are classmethod() and staticmethod().
The decorator syntax is merely syntactic sugar; the following two function definitions are semantically equivalent:
def f(...):
    ...
f = staticmethod(f)

@staticmethod
def f(...):
    ...
The same concept exists for classes, but is less commonly used there. See the documentation for function definitions and class definitions for more about decorators.
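For a user-defined decorator, the equivalence of the two spellings can be sketched as follows; the names shout, greet and greet_plain are invented for the example.

```python
import functools


def shout(func):
    """Decorator: return a new function that upper-cases func's result."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()
    return wrapper


# Spelled with the decorator syntax.
@shout
def greet(name):
    return "hello, " + name


# Spelled as an explicit call and rebinding.
def greet_plain(name):
    return "hello, " + name
greet_plain = shout(greet_plain)

print(greet("world"))        # HELLO, WORLD
print(greet_plain("world"))  # HELLO, WORLD
```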
descriptor Any object which defines the methods __get__(), __set__(), or __delete__(). When a class attribute is a descriptor, its special binding behavior is triggered upon attribute lookup. Normally, using a.b to get, set or delete an attribute looks up the object named b in the class dictionary for a, but if b is a descriptor, the respective descriptor method gets called. Understanding descriptors is a key