

Python Programming

The current, editable version of this book is available in Wikibooks, the open-content textbooks collection, at
https://en.wikibooks.org/wiki/Python_Programming

Permission is granted to copy, distribute, and/or modify this document under the terms of the Creative Commons Attribution-ShareAlike 3.0 License.


Overview



Python is a high-level, structured, open-source programming language that can be used for a wide variety of programming tasks. Python was created by Guido van Rossum in the early 1990s; its following has grown steadily, and interest has increased markedly in recent years. It is named after the comedy program Monty Python's Flying Circus.

Python is used extensively for system administration (many vital components of Linux distributions are written in it), and it is also a great language for teaching programming to novices. NASA has used Python for its software systems and has adopted it as the standard scripting language for its Integrated Planning System. Google uses Python extensively to implement many components of its web crawler and search engine, and Yahoo! uses it to manage its discussion groups.

Python itself is an interpreted programming language whose source code is automatically compiled into bytecode before execution. The bytecode is then normally saved to disk, just as automatically, so that compilation need not happen again unless the source changes. It is also a dynamically typed language that includes (but does not require one to use) object-oriented features and constructs.
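As a rough illustration of both points, the sketch below compiles a small expression into a code object by hand and then rebinds one name to values of two different types. The label "<example>" and the variable names are arbitrary choices made for this example.

```python
# Compile a tiny expression to a code object, then evaluate it.
source = "1 + 1"
code = compile(source, "<example>", "eval")  # "<example>" is only a label

print(type(code).__name__)   # the compiled result is a "code" object
result = eval(code)          # run the already-compiled bytecode
print(result)

# Dynamic typing: the same name can refer to values of different types.
value = 42
first_type = type(value).__name__
value = "forty-two"
second_type = type(value).__name__
print(first_type, second_type)
```

Normally you never call compile() yourself; the interpreter does this step automatically when it runs or imports a file.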

The most unusual aspect of Python is that whitespace is significant; instead of block delimiters (such as the braces "{}" in the C family of languages), indentation is used to indicate where blocks begin and end.
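A small sketch of how indentation marks blocks is shown here. The function name describe is made up for this example; only the indentation pattern matters.

```python
def describe(number):
    # The indented lines below form the body of the function.
    if number % 2 == 0:
        # This deeper block runs only when the condition is true.
        kind = "even"
    else:
        kind = "odd"
    # Back at the function's indentation level: the if/else has ended.
    return str(number) + " is " + kind

print(describe(4))
print(describe(7))
```

Removing or changing the indentation of any of these lines would change the meaning of the program or make it invalid.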

The following Python code, typed interactively at an interpreter prompt, displays the famous "Hello World!" on the user's screen:

>>> print("Hello World!")
Hello World!

Another great feature of Python is its availability on many platforms. Python runs on Microsoft Windows, macOS and Linux distributions, among others. This makes programs portable: a program written on one platform can usually be run on another with little or no change.

Python provides a powerful assortment of built-in types (e.g., lists, dictionaries and strings), a number of built-in functions, and a few constructs, mostly statements. For example, its loop constructs can iterate over the items in a collection instead of being limited to a simple range of integer values. Python also comes with a powerful standard library, which includes hundreds of modules that provide routines for a wide variety of services, including regular expressions and TCP/IP sessions.
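The following sketch touches each of these pieces once: a list, a dictionary, a loop over a collection, and the standard library's re module. The fruit names and prices are sample data invented for the example.

```python
import re

fruits = ["apple", "banana", "cherry"]            # a list
prices = {"apple": 3, "banana": 1, "cherry": 7}   # a dictionary (sample values)

# The for loop walks over the items themselves; no index counter is needed.
total = 0
for fruit in fruits:
    total = total + prices[fruit]
print(total)

# The re module from the standard library provides regular expressions.
words = re.findall(r"[a-z]+", "spam, eggs and spam")
print(words)
```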

Python is used and supported by a large community that exists on the Internet. Mailing lists and newsgroups, such as the tutor list, actively support and help new Python programmers. While they discourage doing your homework for you, they are quite helpful and are populated by the authors of many of the Python textbooks currently available on the market.

Note:

Python 2 vs Python 3: Several years ago, the Python developers made the decision to come up with a major new version of Python. Initially called “Python 3000”, this became the 3.x series of versions of Python. What was radical about this was that the new version is backward-incompatible with Python 2.x: certain old features (like the handling of Unicode strings) were deemed to be too unwieldy or broken to be worth carrying forward. Instead, new, cleaner ways of achieving the same results were added.
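To give a feel for the Unicode change, here is a minimal Python 3 sketch that separates text (str) from its encoded form (bytes). The sample word is arbitrary, and the example only shows Python 3 behaviour, not the Python 2 behaviour it replaced.

```python
text = "caf\u00e9"            # "café": a str holds Unicode characters
data = text.encode("utf-8")   # bytes hold the encoded form

print(len(text))   # number of characters
print(len(data))   # number of bytes; the accented letter takes two bytes in UTF-8
print(data.decode("utf-8") == text)
```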

 




Getting Python



In order to program in Python you need the Python interpreter. If it is not already installed or if the version you are using is obsolete, you will need to obtain and install Python using the methods below:

Python 2 vs Python 3

In 2008, a new version of Python (version 3) was published that was not entirely backward compatible with Python 2. Developers were asked to switch to the new version as soon as possible, but as of August 2010 many common external modules were not yet available for Python 3. A program called 2to3 converts the source code of a Python 2 program into Python 3 source code. Consider this fact before you start working with Python. We are now in the era of Python 3.6 and later.

The last version of Python 2, Python 2.7.x, was officially discontinued on 1 January 2020.

Installing Python in Windows

Go to the Python Homepage or the ActiveState website and get the proper version for your platform. Download it, read the instructions and get it installed.

In order to run Python from the command line, you will need to have the python directory in your PATH. Alternatively, you could use an Integrated Development Environment (IDE) for Python like DrPython, eric, PyScripter, or Python's own IDLE (which ships with every version of Python since 2.3).

The PATH variable can be modified from the Windows System control panel. To add Python to the PATH in Windows 7:

  1. Go to Start.
  2. Right-click on Computer.
  3. Click on Properties.
  4. Click on 'Advanced system settings'.
  5. Click on 'Environment Variables'.
  6. In the system variables, select Path and edit it by appending ';C:\python27' (without the quotes).

If you prefer having a temporary environment, you can create a new command prompt short-cut that automatically executes the following statement:

PATH %PATH%;c:\python27

If you downloaded a different version (such as Python 3.1), change the "27" for the version of Python you have (27 is 2.7.x, the current version of Python 2.)
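Once Python itself runs, a short script can help confirm what the PATH variable contains. This is only a sketch: the command name "python" is an assumption and may be "python3" or "py" on your system.

```python
import os
import shutil

# Split PATH into its individual directories and list them.
path_dirs = os.environ.get("PATH", "").split(os.pathsep)
for directory in path_dirs:
    print(directory)

# shutil.which returns the full path of a command, or None if it is not on PATH.
location = shutil.which("python")
print(location)
```

If the last line prints None, the directory containing the interpreter is probably missing from PATH.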

Cygwin

By default, the Cygwin installer for Windows does not include Python in the downloads. However, it can be selected from the list of packages.

Installing Python on Mac

Apple's Mac OS X already ships with Python 2.3 (OS X 10.4 Tiger) or Python 2.6.1 (OS X Snow Leopard). If you want a more recent version, head to the Python Download Page and follow the instructions on the page and in the installers. As a bonus, you will also install the Python IDE.

Installing Python on Unix environments

Python is available as a package for some Linux distributions. In some cases, the distribution CD will contain the python package for installation, while other distributions require downloading the source code and using the compilation scripts.

Gentoo Linux

Gentoo is an example of a distribution that installs Python by default: its package management system, Portage, depends on Python.

Ubuntu Linux

Users of Ubuntu will notice that Python comes installed by default, although it is sometimes not the latest version. To check which version of Python is installed, type
python -V
into the terminal.
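The same information is available from inside Python through the sys module. The sketch below reads it and decides which series the running interpreter belongs to; the variable name series is just for illustration.

```python
import sys

print(sys.version)                    # full version string
major, minor = sys.version_info[:2]   # e.g. the 3 and the 6 of "3.6"
print(major, minor)

if sys.version_info >= (3, 0):
    series = "Python 3"
else:
    series = "Python 2"
print(series)
```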

Arch Linux

Arch Linux does not come with Python pre-installed by default, but it is easily available for installation through the package manager, pacman. As root (or using sudo if you've installed and configured it), type:

pacman -S python

This will install Python 3. Python 2 can be installed with:

pacman -S python2

Other versions can be built from source from the Arch User Repository.

Source code installations

Some platforms do not have a version of Python installed, and do not have pre-compiled binaries. In these cases, you will need to download the source code from the official site. Once the download is complete, you will need to unpack the compressed archive into a folder.

To build Python, simply run the configure script (requires the Bash shell) and compile using make.

Other Distributions

Python, in its reference implementation known as CPython, is written in the C programming language. The C source code is generally portable, which means CPython can run on various platforms. More precisely, CPython can be made available on any platform that provides a compiler to translate the C source code to binary code for that platform.

Apart from CPython, there are other implementations that run on top of a virtual machine, for example on Java's JRE (Java Runtime Environment) or Microsoft's .NET CLR (Common Language Runtime). Both can access and use the libraries available on their platform. Specifically, they make use of reflection, which allows complete inspection and use of all classes and objects of the underlying platform.

Python Implementations (Platforms)

Environment   Description              Get From
Jython        Java Version of Python   Jython
IronPython    C# Version of Python     IronPython
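If you are unsure which implementation is running your code, the standard platform module can report it. A minimal sketch:

```python
import platform

# Returns a name such as "CPython", "Jython" or "IronPython".
implementation = platform.python_implementation()
print(implementation)
```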

Integrated Development Environments (IDE)

CPython ships with IDLE; however, IDLE is not considered user-friendly.[1] For Linux, KDevelop and Spyder are popular. For Windows, PyScripter is free, quick to install, and comes included with PortablePython. The most used Python GUI builder is Boa Constructor, but it has suffered significantly from bit rot, so it is best used in conjunction with LiClipse.

Some Integrated Development Environments (IDEs) for Python

Environment              Description                               Get From
ActivePython             Highly flexible, Pythonwin IDE            ActivePython
Anjuta                   IDE Linux/Unix                            Anjuta
Eclipse (PyDev plugin)   Open-source IDE                           Eclipse
Eric                     Open-source Linux/Windows IDE             Eric
KDevelop                 Cross-language IDE for KDE                KDevelop
Ninja-IDE                Cross-platform open-source IDE            Ninja-IDE
PyScripter               Free Windows IDE (portable)               PyScripter
Pythonwin                Windows-oriented environment              Pythonwin
Spyder                   Free cross-platform IDE (math-oriented)   Spyder
VisualWx                 Free GUI Builder                          VisualWx

The Python official wiki has a complete list of IDEs.

There are several commercial IDEs such as Komodo, BlackAdder, Code Crusader, Code Forge, and PyCharm. However, for beginners learning to program, purchasing a commercial IDE is unnecessary.

Trying Python online

You can try Python online in a browser-based REPL (read-eval-print loop), thereby avoiding the need to install anything.


Keeping Up to Date

Python has a very active community and the language itself is evolving continuously. Make sure to check python.org for recent releases and relevant tools. The website is an invaluable asset.

Public Python-related mailing lists are hosted at mail.python.org. Two examples of such mailing lists are Python-announce-list, for keeping up with newly released third-party modules or software for Python, and the general discussion list Python-list. These lists are mirrored to the Usenet newsgroups comp.lang.python.announce and comp.lang.python.

Notes




Setting it up



There are several IDEs available for Python. A full list can be found on the Python wiki.

Installing Python PyDev Plug-in for Eclipse IDE

You can use the Eclipse IDE as your Python IDE. The only requirement is Eclipse and the Eclipse PyDev Plug-in.

Go to http://www.eclipse.org/downloads/ and get the proper Eclipse IDE version for your OS platform. There are also updates on the site, but just look for the basic program; download and install it. Installation only requires you to unpack the downloaded Eclipse file onto your system.

You can install PyDev Plug-in two ways:

  • Suggested: Use Eclipse's update manager, found in the menu bar under "Help" -> "Install New Software". Enter http://pydev.org/updates/ in "Work with", click "Add", select PyDev, click "Next", and let Eclipse do the rest. Eclipse will now check for any updates to PyDev when it searches for updates.
    • If you get an error stating a requirement for the plugin "org.eclipse.mylyn", expand the PyDev tree, and deselect the optional mylyn components.
  • Or install PyDev manually: go to http://pydev.sourceforge.net and get the latest PyDev Plug-in version. Download it, and install it by unpacking it into the Eclipse base folder.

Python Mode for Emacs

There is also a Python mode for Emacs, which provides features such as running pieces of code and changing the tab level for blocks. You can download the mode at https://launchpad.net/python-mode

Installing new modules

Although many applications and modules have searchable webpages, there is a central repository for finding and installing packages: the Python Package Index (PyPI), also known as the "Cheese Shop".

See Also




Interactive mode



Python has two basic modes: script and interactive. In script mode, the normal mode, finished .py files are run by the Python interpreter. Interactive mode is a command-line shell that gives immediate feedback for each statement while keeping the results of previously entered statements in memory. As new lines are fed into the interpreter, the program entered so far is evaluated both in part and in whole.

Interactive mode is a good way to play around and try variations on syntax.

On macOS or Linux, open a terminal and simply type "python". On Windows, bring up the command prompt and type "py", or start an interactive Python session by selecting "Python (command line)", "IDLE", or a similar program from the task bar or app menu. IDLE is a GUI which includes both an interactive mode and options to edit and run files.

Python should print something like this:

$ python
Python 3.0b3 (r30b3:66303, Sep  8 2008, 14:01:02) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>

(If Python doesn't run, make sure it is installed and your path is set correctly. See Getting Python.)

The >>> is Python's way of telling you that you are in interactive mode. In interactive mode what you type is immediately run. Try typing 1+1 in. Python will respond with 2. Interactive mode allows you to test out and see what Python will do. If you ever feel the need to play with new Python statements, go into interactive mode and try them out.

A sample interactive session:

>>> 5
5
>>> print(5*7)
35
>>> "hello" * 4
'hellohellohellohello'
>>> "hello".__class__
<class 'str'>

However, you need to be careful in the interactive environment to avoid confusion. For example, the following is a valid Python script:

if 1:
  print("True")
print("Done")

If you try to enter this as written in the interactive environment, you might be surprised by the result:

>>> if 1:
...   print("True")
... print("Done")
  File "<stdin>", line 3
    print("Done")
        ^
SyntaxError: invalid syntax

What the interpreter is saying is that the indentation of the second print was unexpected. You should have entered a blank line to end the first (i.e., "if") statement, before you started writing the next print statement. For example, you should have entered the statements as though they were written:

if 1:
  print("True")
 
print("Done")

Which would have resulted in the following:

>>> if 1:
...   print("True")
...
True
>>> print("Done")
Done
>>>

Interactive mode

Instead of Python exiting when the program is finished, you can use the -i flag to start an interactive session. This can be very useful for debugging and prototyping.

python -i hello.py
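As a sketch of why this is handy, suppose hello.py contained the lines below. The variable greeting and the function shout are hypothetical names invented for this example.

```python
# hello.py (illustrative contents)
greeting = "Hello, World!"

def shout(text):
    """Return the text in upper case."""
    return text.upper()

print(greeting)
```

After the script finishes under python -i, the >>> prompt appears and both greeting and shout are still defined, so you could type shout(greeting) to experiment with them.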




Self Help


This book is useful for learning Python, but there might be a topic that the book does not cover. You might want to search for modules in the standard library, or inspect an unknown object's functions, or perhaps you know there is a function that you have to call inside an object but you don't know its name. This is where the interactive help comes into play.

Built-in help overview

Built-in interactive help at a glance:

help()      # Starts an interactive help
help("topics")  # Outputs the list of help topics
help("OPERATORS") # Shows help on the topic of operators
help("len")    # Shows help on len function
help("re")    # Shows help on re module
help("re.sub")  # Shows help on sub function from re module
help(len)     # Shows help on the object passed, the len function
help([].pop)   # Shows help on the pop function of a list
dir([])      # Outputs a list of attributes of a list, which includes functions
import re
help(re)     # Shows help on the re module
help(re.sub)   # Shows help on the sub function of re module
help(1)      # Shows help on int type
help([])     # Shows help on list type
help(def)     # Fails: def is a keyword that does not refer to an object
help("def")    # Shows help on function definitions

To start Python's interactive help, type "help()" at the prompt.

>>> help()

You will be presented with a greeting and a quick introduction to the help system. For Python 2.6, the prompt will look something like this:

Welcome to Python 2.6! This is the online help utility.

If this is your first time using Python, you should definitely check out
the tutorial on the Internet at http://docs.python.org/tutorial/.

Enter the name of any module, keyword, or topic to get help on writing
Python programs and using Python modules. To quit this help utility and
return to the interpreter, just type "quit".

To get a list of available modules, keywords, or topics, type "modules",
"keywords", or "topics". Each module also comes with a one-line summary
of what it does; to list the modules whose summaries contain a given word
such as "spam", type "modules spam".

Notice also that the prompt will change from ">>>" (three right angle brackets) to "help>". Typing "modules", "keywords", or "topics" lists the available items of that kind, and typing the name of any listed item prints the help page associated with it.

You can exit the help system by typing "quit" or by entering a blank line to return to the interpreter.

Help Parameter

You can obtain information on a specific command without entering interactive help.

For example, you can obtain help on a given topic simply by adding a string in quotes, such as help("object"). You may also obtain help on a given object as well, by passing it as a parameter to the help function.
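Much of what help() displays comes from documentation strings stored on the objects themselves. The sketch below reads that text directly and filters the output of dir(); the function double is a made-up example.

```python
# help() prints documentation; the underlying text is stored in __doc__.
summary = len.__doc__
print(summary)

# dir() returns attribute names as a list of strings, so it can be filtered.
list_methods = [name for name in dir([]) if not name.startswith("_")]
print(list_methods)

def double(x):
    # The string on the next line is the docstring that help(double) would show.
    """Return x multiplied by two."""
    return x * 2

print(double.__doc__)
```

Writing a docstring for your own functions makes them work with help() in the same way as the built-in ones.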

  • help in 2. Built-in Functions, docs.python.org



Creating Python programs


Welcome to Python! This tutorial will show you how to start writing programs.

Python programs are nothing more than text files, and they may be edited with a standard text editor program.[1] What text editor you use will probably depend on your operating system: any text editor can create Python programs. However, it is easier to use a text editor that includes Python syntax highlighting.


Hello, World

The very first program that beginning programmers usually write or learn is the "Hello, World!" program. This program simply outputs the phrase "Hello, World!" then terminates itself. Let's write "Hello, World!" in Python!

Open up your text editor and create a new file called hello.py containing just this line (you can copy-paste if you want):

print('Hello, World!')

Double quotes work just as well; in Python 3.x the following line is equivalent:

print("Hello, World!")

You can also add the line below to pause the program at the end until you press Enter.

input()

This program uses the print function, which simply outputs its parameters to the terminal. By default, print appends a newline character to its output, which simply moves the cursor to the next line.
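For the curious, here is a short sketch of how print treats several arguments and the trailing newline. It uses the sep, end and file keyword arguments of Python 3's print; the strings are arbitrary.

```python
import io

print("one", "two", "three")            # arguments are separated by spaces
print("one", "two", "three", sep="-")   # sep changes the separator
print("no newline here", end="")        # end replaces the trailing newline
print(" ... same line")

# Capturing output in a buffer makes the appended newline visible.
buffer = io.StringIO()
print("hi", file=buffer)
captured = buffer.getvalue()
print(repr(captured))
```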


Now that you've written your first program, let's run it in Python! This process differs slightly depending on your operating system.

Windows

  • Create a folder on your computer to use for your Python programs, such as C:\pythonpractice, and save your hello.py program in that folder.
  • In the Start menu, select "Run...", and type in cmd. This will cause the Windows terminal to open.
  • Type cd \pythonpractice to change directory to your pythonpractice folder, and hit Enter.
  • Type hello.py to run your program!

If it didn't work, make sure your PATH contains the python directory. See Getting Python.

Mac

  • Create a folder on your computer to use for your Python programs. A good suggestion would be to name it pythonpractice and place it in your Home folder (the one that contains folders for Documents, Movies, Music, Pictures, etc). Save your hello.py program into it. Open the Applications folder, go into the Utilities folder, and open the Terminal program.
  • Type cd pythonpractice to change directory to your pythonpractice folder, and hit Enter.
  • Type python ./hello.py to run your program!

Linux

  • Create a folder on your computer to use for your Python programs, such as ~/pythonpractice, and save your hello.py program in that folder.
  • Open up the terminal program. In KDE, open the main menu and select "Run Command..." to open Konsole. In GNOME, open the main menu, open the Applications folder, open the Accessories folder, and select Terminal.
  • Type cd ~/pythonpractice to change directory to your pythonpractice folder, and hit Enter.
  • Optionally, make the script executable with chmod +x hello.py. This is not needed when you start it with python as in the next step.
  • Type python ./hello.py to run your program!

Linux (advanced)

  • Create a folder on your computer to use for your Python programs, such as ~/pythonpractice.
  • Open up your favorite text editor and create a new file called hello.py containing just the following 2 lines (you can copy-paste if you want):[2]
#! /usr/bin/python
print('Hello, world!')
  • save your hello.py program in the ~/pythonpractice folder.
  • Open up the terminal program. In KDE, open the main menu and select "Run Command..." to open Konsole. In GNOME, open the main menu, open the Applications folder, open the Accessories folder, and select Terminal.
  • Type cd ~/pythonpractice to change directory to your pythonpractice folder, and hit Enter.
  • Type chmod a+x hello.py to tell Linux that it is an executable program.
  • Type ./hello.py to run your program!
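  • In addition, you can use sudo ln -s ~/pythonpractice/hello.py /usr/bin/hello to create a symbolic link to hello.py in /usr/bin under the name hello, then run it by simply executing hello. (Writing to /usr/bin normally requires root privileges, and the link target should be given as a full path.)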

Note that this should mainly be done for complete programs. If you have a script that you made and use frequently, it might be a good idea to put it somewhere in your home directory and put a link to it in /usr/bin. If you want a playground, a good idea is to invoke mkdir -p ~/.local/bin and put scripts in there. To make the contents of ~/.local/bin executable the same way as those of /usr/bin, type export PATH="$PATH:$HOME/.local/bin" (you can add this line to your shell rc file, for example ~/.bashrc).

Result

The program should print:

Hello, world!

Congratulations! You're well on your way to becoming a Python programmer.

Exercises

  1. Modify the hello.py program to say hello to someone from your family or your friends (or to Ada Lovelace).
  2. Change the program so that after the greeting, it asks, "How did you get here?".
  3. Re-write the original program to use two print statements: one for "Hello" and one for "world". The program should still only print out on one line.

Solutions

Notes

  1. Sometimes, Python programs are distributed in compiled form. We won't have to worry about that for quite a while.
  2. A Quick Introduction to Unix/My First Shell Script explains what a hash bang line does.



Variables and Strings


In this section, you will be introduced to two different kinds of data in Python: variables and strings. Please follow along by running the included programs and examining their output.

Variables

A variable is something that holds a value that may change. In simplest terms, a variable is just a box that you can put stuff in. You can use variables to store all kinds of stuff, but for now, we are just going to look at storing numbers in variables.

lucky = 7
print (lucky)
7

This code creates a variable called lucky, and assigns to it the integer number 7. When we ask Python to tell us what is stored in the variable lucky, it returns that number again.

We can also change what is inside a variable. For example:

 

changing = 3                                   
print (changing)
3 

changing = 9
print (changing)
9

different = 12
print (different)
12
print (changing)
9

changing = 15
print (changing)
15

We declare a variable called changing, put the integer 3 in it, and verify that the assignment was properly done. Then, we assign the integer 9 to changing, and ask again what is stored in changing. Python has thrown away the 3, and has replaced it with 9. Next, we create a second variable, which we call different, and put 12 in it. Now we have two independent variables, different and changing, that hold different information, i.e., assigning a new value to one of them is not affecting the other.

You can also assign the value of a variable to be the value of another variable. For example:

red = 5
blue = 10
print (red, blue)
5 10

yellow = red
print (yellow, red, blue)
5 5 10

red = blue
print (yellow, red, blue)
5 10 10

To understand this code, keep in mind that the name of the variable is always on the left side of the equals sign (the assignment operator), and the value of the variable on the right side of the equals sign. First the name, then the value.

We start out declaring that red is 5, and blue is 10. As you can see, you can pass several arguments to print to tell it to print multiple items on one line, separating them by spaces. As expected, Python reports that red stores 5, and blue holds 10.

Now we create a third variable, called yellow. To set its value, we tell Python that we want yellow to be whatever red is. (Remember: name to the left, value to the right.) Python knows that red is 5, so it also sets yellow to be 5.

Now we're going to take the red variable, and set it to the value of the blue variable. Don't get confused — name on the left, value on the right. Python looks up the value of blue, and finds that it is 10. So, Python throws away red's old value (5), and replaces it with 10. After this assignment Python reports that yellow is 5, red is 10, and blue is 10.

But didn't we say that yellow should be whatever value red is? The reason that yellow is still 5 when red is 10, is because we only said that yellow should be whatever red is at the moment of the assignment. After Python has figured out what red is and assigned that value to yellow, yellow doesn't care about red any more. yellow has a value now, and that value is going to stay the same no matter what happens to red.

String

A string is simply a sequence of characters in order. A character is anything you can type on the keyboard in one keystroke, like a letter, a number, or a backslash. For example, "hello" is a string. It is five characters long: h, e, l, l, o. Strings can also have spaces: "hello world" contains 11 characters: 10 letters and the space between "hello" and "world". There are no limits to the number of characters you can have in a string; you can have anywhere from one to a million or more. You can even have a string that has 0 characters, which is usually called an "empty string."

There are three ways you can declare a string in Python: single quotes ('), double quotes ("), and triple quotes ("""). In all cases, you start and end the string with your chosen string declaration. For example:

>>> print ('I am a single quoted string')
I am a single quoted string
>>> print ("I am a double quoted string")
I am a double quoted string
>>> print ("""I am a triple quoted string""")
I am a triple quoted string

You can use quotation marks within strings by placing a backslash directly before them, so that Python knows you want to include the quotation marks in the string, instead of ending the string there. Placing a backslash directly before another symbol like this is known as escaping the symbol.

>>> print ("So I said, \"You don't know me! You'll never understand me!\"")
So I said, "You don't know me! You'll never understand me!"
>>> print ('So I said, "You don\'t know me! You\'ll never understand me!"')
So I said, "You don't know me! You'll never understand me!"
>>> print ("""The double quotation mark (\") is used to indicate direct quotations.""")
The double quotation mark (") is used to indicate direct quotations.

If you want to include a backslash in a string, you have to escape said backslash. This tells Python that you want to include the backslash in the string, instead of using it as an escape character. For example:

>>> print ("This will result in only three backslashes: \\ \\ \\")
This will result in only three backslashes: \ \ \

As you can see from the above examples, only the specific character used to quote the string needs to be escaped. This makes for more readable code.
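A compact sketch of the same rules, with variable names chosen only for this example:

```python
single = 'He said "hi"'      # double quotes need no escape inside single quotes
double = "It's fine"         # the apostrophe needs no escape inside double quotes
escaped = "He said \"hi\""   # escaping also works and yields the same string

print(single == escaped)

backslash = "\\"             # two characters in the source, one in the string
print(len(backslash))
```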

To see how to use strings, let's go back for a moment to an old, familiar program:

>>> print("Hello, world!")
Hello, world!

Look at that! You've been using strings since the very beginning!

You can add two strings together using the + operator: this is called concatenating them.

>>> print ("Hello, " + "world!")
Hello, world!

Notice that there is a space at the end of the first string. If you don't put that in, the two words will run together, and you'll end up with Hello,world!

You can also repeat strings by using the * operator, like so:

>>> print ("bouncy " * 5)
bouncy bouncy bouncy bouncy bouncy 
>>> print ("bouncy " * 10)
bouncy bouncy bouncy bouncy bouncy bouncy bouncy bouncy bouncy bouncy

The string bouncy gets repeated 5 times in the 1st example and 10 times in the 2nd.

If you want to find out how long a string is, you use the len() function, which simply takes a string and counts the number of characters in it. (len stands for "length.") Just put the string that you want to find the length of inside the parentheses of the function. For example:

>>> print (len("Hello, world!"))
13
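The three operations can be combined. As a small sketch, the code below builds a title by concatenation and underlines it using repetition and len(); the title text is arbitrary.

```python
title = "Shopping " + "list"   # concatenation
underline = "=" * len(title)   # one "=" for each character in the title

print(title)
print(underline)
print(len(title))
```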

Strings and Variables

Now that you've learned about variables and strings separately, let's see how they work together.

Variables can store much more than just numbers. You can also use them to store strings! Here's how:

question = "What did you have for lunch?"
print (question)
What did you have for lunch?

In this program, we are creating a variable called question, and storing the string "What did you have for lunch?" in it. Then, we just tell Python to print out whatever is inside the question variable. Notice that when we tell Python to print out question, there are no quotation marks around the word question: this tells Python that we are using a variable, not a string. If we put in quotation marks around question, Python would treat it as a string, as shown below:

question = "What did you have for lunch?"
print ("question")
question

Let's try something different. Sure, it's all fine and dandy to ask the user what they had for lunch, but it doesn't make much difference if they can't respond! Let's edit this program so that the user can type in what they ate.

question = "What did you have for lunch?"
print (question)
answer = input()  # In Python 2, use raw_input() instead; Python 3 renamed raw_input() to input().

print ("You had " + answer + "! That sounds delicious!")

To ask the user to write something, we used a function called input(), which waits until the user writes something and presses enter, and then returns what the user wrote as a string. Don't forget the parentheses! Even though there's nothing inside of them, they're still important, and Python will give you an error if you don't put them in. In Python 2, this function was called raw_input(); Python 2 also had a different function called input(), which evaluated what the user typed as a Python expression instead of returning it as text.

In this program, we created a variable called answer, and put whatever the user wrote into it. Then, we print out a new string, which contains whatever the user wrote. Notice the extra space at the end of the "You had " string, and the exclamation mark at the start of the "! That sounds delicious!" string. They help format the output and make it look nice, so that the strings don't all run together.

Combining Numbers and Strings

Take a look at this program, and see if you can figure out what it's supposed to do.

print ("Please give me a number: ")
number = raw_input()

plusTen = number + 10
print ("If we add 10 to your number, we get " + plusTen)

This program should take a number from the user, add 10 to it, and print out the result. But if you try running it, it won't work! You'll get an error that looks like this:

Traceback (most recent call last):
  File "test.py", line 4, in <module>
    plusTen = number + 10
TypeError: cannot concatenate 'str' and 'int' objects

What's going on here? Python is telling us that there is a TypeError, which means there is a problem with the types of information being used. Specifically, Python can't figure out how to reconcile the two types of data that are being used simultaneously: integers and strings. For example, Python thinks that the number variable is holding a string, instead of a number. If the user enters 15, then number will contain a string that is two characters long: a 1, followed by a 5. So how can we tell Python that 15 should be a number, instead of a string?

Also, when printing out the answer, we are telling Python to concatenate together a string ("If we add 10 to your number, we get ") and a number (plusTen). Python doesn't know how to do that -- it can only concatenate strings together. How do we tell Python to treat a number as a string, so that we can print it out with another string?

Luckily, there are two functions that are perfect solutions for these problems. The int() function will take a string and turn it into an integer, while the str() function will take an integer and turn it into a string. In both cases, we put what we want to change inside the parentheses. Therefore, our modified program will look like this:

print ("Please give me a number: ")
response = raw_input()

number = int(response) 
plusTen = number + 10

print ("If we add 10 to your number, we get " + str(plusTen))
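The same two conversions can be wrapped in a small function. This is only a sketch, written for Python 3: the name add_ten is made up for illustration, and a fixed string stands in for what raw_input() or input() would return, so that the example runs without waiting for the user.

```python
def add_ten(response):
    """Turn the text the user typed into an integer, add 10, and build the reply."""
    number = int(response)      # "15" (a string) becomes 15 (an integer)
    plus_ten = number + 10      # ordinary integer arithmetic
    # str() turns the integer back into text so it can be concatenated
    return "If we add 10 to your number, we get " + str(plus_ten)

# A fixed string stands in for the user's answer.
message = add_ten("15")
print(message)
```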

That's all you need to know about strings and variables! We'll learn more about types later.

List of Learned Functions

  • print(): Print the output information to the user
  • input() or raw_input(): asks the user for a response, and returns that response. (Note that in version 3.x raw_input() does not exist and has been replaced by input())
  • len(): returns the length of a string (number of characters)
  • str(): returns the string representation of an object
  • int(): given a string or number, returns an integer

Exercises

  1. Write a program that asks the user to type in a string, and then tells the user how long that string was.
  2. Ask the user for a string, and then for a number. Print out that string, that many times. (For example, if the string is hello and the number is 3 you should print out hello hello hello .)
  3. What would happen if a mischievous user typed in a word when you ask for a number? Try it.



Basic syntax


There are five fundamental concepts in Python.

Case Sensitivity

All variables are case-sensitive. Python treats 'number' and 'Number' as separate, unrelated entities.

Spaces and tabs don't mix

Instead of block delimiters (braces → "{}" in the C family of languages), indentation is used to indicate where blocks begin and end. Because whitespace is significant, remember that spaces and tabs don't mix, so use only one or the other when indenting your programs. Mixing them is a common error: although they may look the same in an editor, the interpreter reads them differently, which results in either an error or unexpected behavior. Most decent text editors can be configured to make the Tab key insert spaces instead.

Python's style guide (PEP 8) recommends indenting with 4 spaces per level.

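Tip: If you invoke Python 2.x from the command line, you can pass the -t or -tt argument to make Python issue a warning or an error, respectively, on inconsistent tab usage. Python 3.x needs no such option: it always raises a TabError when tabs and spaces are mixed inconsistently.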

pythonprogrammer@wikibook:~$ python -tt myscript.py

This will issue an error if you have mixed spaces and tabs.


Objects

In Python, like all object-oriented languages, there are aggregations of code and data called objects, which typically represent the pieces in a conceptual model of a system.

Objects in Python are created (i.e., instantiated) from templates called classes (which are covered later, as much of the language can be used without understanding classes). They have attributes, which represent the various pieces of code and data which make up the object. To access attributes, one writes the name of the object followed by a period (henceforth called a dot), followed by the name of the attribute.

An example is the 'upper' attribute of strings, which refers to the code that returns a copy of the string in which all the letters are uppercase. To get to this, it is necessary to have a way to refer to the object (in the following example, the way is the literal string that constructs the object).

'bob'.upper

Code attributes are called methods. So in this example, upper is a method of 'bob' (as it is of all strings). To execute the code in a method, use a matched pair of parentheses surrounding a comma separated list of whatever arguments the method accepts (upper doesn't accept any arguments). So to find an uppercase version of the string 'bob', one could use the following:

'bob'.upper()
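The difference between referring to a method and calling it can be seen in a short sketch (Python 3 syntax; the variable names are chosen only for illustration):

```python
name = 'bob'
method = name.upper      # no parentheses: this is the method object itself
result = name.upper()    # parentheses: the method runs and returns a new string

print(callable(method))  # a method is something that can be called
print(result)
print(name)              # the original string is unchanged
```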

Scope

In a large system, it is important that one piece of code does not affect another in difficult to predict ways. One of the simplest ways to further this goal is to prevent one programmer's choice of a name from blocking another's use of that name. The concept of scope was invented to do this. A scope is a "region" of code in which a name can be used and outside of which the name cannot be easily accessed. There are two ways of delimiting regions in Python: with functions or with modules. They each have different ways of accessing from outside the scope useful data that was produced within the scope. With functions, that way is to return the data. The way to access names from other modules leads us to another concept.
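As a small illustration of function scope (a sketch in Python 3; the names total and add_up are invented for the example), a name assigned inside a function does not disturb a name spelled the same way outside it, and the result leaves the function through return:

```python
total = 100  # a name in the module's scope

def add_up(values):
    total = 0            # a separate, local name; it hides the module-level one
    for value in values:
        total += value
    return total         # returning is how the result leaves the function

result = add_up([1, 2, 3])
print(result)   # the value computed inside the function
print(total)    # the module-level name is untouched
```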

Namespaces

It would be possible to teach Python without the concept of namespaces because they are so similar to attributes, which we have already mentioned, but the concept of namespaces is one that transcends any particular programming language, and so it is important to teach. To begin with, there is a built-in function dir() that can be used to help one understand the concept of namespaces. When you first start the Python interpreter (i.e., in interactive mode), you can list the objects in the current (or default) namespace using this function.

Python 2.3.4 (#53, Oct 18 2004, 20:35:07) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> dir()
['__builtins__', '__doc__', '__name__']

This function can also be used to show the names available within a module's namespace. To demonstrate this, first we can use the type() function to show what kind of object __builtins__ is:

>>> type(__builtins__)
<type 'module'>

Since it is a module, it has a namespace. We can list the names within the __builtins__ namespace, again using the dir() function (note that the complete list of names has been abbreviated):

>>> dir(__builtins__)
['ArithmeticError', ... 'copyright', 'credits', ... 'help', ... 'license', ... 'zip']
>>>

Namespaces are a simple concept. A namespace is a particular place in which names specific to a module reside. Each name within a namespace is distinct from names outside of that namespace. This layering of namespaces is called scope. A name is placed within a namespace when that name is given a value. For example:

>>> dir()
['__builtins__', '__doc__', '__name__']
>>> name = "Bob"
>>> import math
>>> dir()
['__builtins__', '__doc__', '__name__', 'math', 'name']

Note that the "name" variable was added to the namespace by a simple assignment statement, while the import statement added the "math" name to the current namespace. To see what math is, we can simply evaluate it:

>>> math
<module 'math' (built-in)>

Since it is a module, it also has a namespace. To display the names within this namespace, we again use dir():

>>> dir(math)
['__doc__', '__name__', 'acos', 'asin', 'atan', 'atan2', 'ceil', 'cos', 'cosh', 'degrees', 'e',
'exp', 'fabs', 'floor', 'fmod', 'frexp', 'hypot', 'ldexp', 'log', 'log10', 'modf', 'pi', 'pow',
'radians', 'sin', 'sinh', 'sqrt', 'tan', 'tanh']
>>>

If you look closely, you will notice that both the default namespace and the math module namespace have a '__name__' object. The fact that each layer can contain an object with the same name is what scope is all about. To access objects inside a namespace, simply use the name of the module, followed by a dot, followed by the name of the object. This allows us to differentiate between the __name__ object within the current namespace, and that of the object with the same name within the math module. For example:

>>> print (__name__)
__main__
>>> print (math.__name__)
math
>>> print (math.__doc__)
This module is always available.  It provides access to the
mathematical functions defined by the C standard.
>>> print (math.pi)
3.1415926535897931
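The same idea works for names you define yourself. In the following sketch (Python 3 syntax), a variable called pi in the current namespace coexists with the pi inside the math module, and the dot selects which one is meant. The variable's value is arbitrary and only serves the illustration.

```python
import math

pi = "a name in the current namespace"   # does not clash with math.pi

print('pi' in dir(math))   # the math namespace has its own 'pi'
print(pi)                  # the name in the current namespace
print(math.pi)             # the name inside the math module
print(math.__name__)
```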



Sequences


Sequences allow you to store multiple values in an organized and efficient fashion. In Python 2.x there are seven sequence types: strings, Unicode strings, lists, tuples, bytearrays, buffers, and xrange objects. Dictionaries and sets are containers as well, but they are not sequences, because their elements are not kept in a numbered order. See the official Python documentation on sequences: Python_Documentation (there are further sequence types, but these are the most commonly used).

Strings

We already covered strings, but that was before you knew what a sequence is. In other languages, the elements in arrays and sometimes the characters in strings may be accessed with the square brackets, or subscript operator. This works in Python too:

>>> "Hello, world!"[0]
'H'
>>> "Hello, world!"[1]
'e'
>>> "Hello, world!"[2]
'l'
>>> "Hello, world!"[3]
'l'
>>> "Hello, world!"[4]
'o'

Indexes are numbered from 0 to n-1 where n is the number of items (or characters), and they are positioned between the items:

 H  e  l  l  o  ,  _  w  o  r  l  d  !
 0  1  2  3  4  5  6  7  8  9 10 11 12

The item which comes immediately after an index is the one selected by that index. Negative indexes are counted from the end of the string:

>>> "Hello, world!"[-2]
'd'
>>> "Hello, world!"[-9]
'o'
>>> "Hello, world!"[-13]
'H'
>>> "Hello, world!"[-1]
'!'

But in Python, the colon : allows the square brackets to take as many as two numbers. For any sequence which only uses numeric indexes, this will return the portion which is between the specified indexes. This is known as "slicing," and the result of slicing a string is often called a "substring."

>>> "Hello, world!"[3:9]
'lo, wo'
>>> string = "Hello, world!"
>>> string[:5]
'Hello'
>>> string[-6:-1]
'world'
>>> string[-9:]
'o, world!'
>>> string[:-8]
'Hello'
>>> string[:]
'Hello, world!'

As demonstrated above, if either number is omitted it is assumed to be the beginning or end of the sequence. Note also that the brackets are inclusive on the left but exclusive on the right: in the first example above with [3:9] the position 3, 'l', is included while position 9, 'r', is excluded.
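The inclusive-left, exclusive-right rule has a handy consequence, sketched below in Python 3 syntax: two slices that share a boundary index never overlap and never leave a gap, so together they rebuild the original string.

```python
text = "Hello, world!"

first_five = text[:5]    # start omitted: begins at index 0
last_six = text[-6:]     # end omitted: runs to the end of the string
middle = text[3:9]       # index 3 is included, index 9 is not

print(first_five)
print(last_six)
print(middle)
# Slices that meet at the same index rebuild the original.
print(text[:5] + text[5:] == text)
```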

Lists

A list is just what it sounds like: a list of values, organized in order. A list is created using square brackets. For example, an empty list would be initialized like this:

spam = []

The values of the list are separated by commas. For example:

spam = ["bacon", "eggs", 42]

Lists may contain objects of varying types. It may hold both the strings "eggs" and "bacon" as well as the number 42.

Like characters in a string, items in a list can be accessed by indexes starting at 0. To access a specific item in a list, you refer to it by the name of the list, followed by the item's number in the list inside brackets. For example:

>>> spam
['bacon', 'eggs', 42]
>>> spam[0]
'bacon'
>>> spam[1]
'eggs'
>>> spam[2]
42

You can also use negative numbers, which count backwards from the end of the list:

>>> spam[-1]
42
>>> spam[-2]
'eggs'
>>> spam[-3]
'bacon'

The len() function also works on lists, returning the number of items in the list:

>>> len(spam)
3

Note that the len() function counts the number of items inside a list, so the last item in spam (42) has the index (len(spam) - 1).

The items in a list can also be changed, just like the contents of an ordinary variable:

>>> spam = ["bacon", "eggs", 42]
>>> spam
['bacon', 'eggs', 42]
>>> spam[1]
'eggs'
>>> spam[1] = "ketchup"
>>> spam
['bacon', 'ketchup', 42]

(Strings, being immutable, are impossible to modify.) As with strings, lists may be sliced:

>>> spam = ["bacon", "eggs", 42]
>>> spam[1:]
['eggs', 42]
>>> spam[:-1]
['bacon', 'eggs']

It is also possible to add items to a list. There are many ways to do it; the easiest is to use the append() method of list:

>>> spam.append(10)
>>> spam
['bacon', 'eggs', 42, 10]

Note that you cannot manually insert an element by specifying the index outside of its range. The following code would fail:

>>> spam[4] = 10
IndexError: list assignment index out of range

Instead, use the insert() method of list, which places an item at a given index and shifts the later items along. For example:

>>> spam.insert(1, 'and')
>>> spam
['bacon', 'and', 'eggs', 42, 10]


You can also delete items from a list using the del statement:

>>> spam
['bacon', 'and', 'eggs', 42, 10]
>>> del spam[1]
>>> spam
['bacon', 'eggs', 42, 10]
>>> spam[0]
'bacon'
>>> spam[1]
'eggs'
>>> spam[2]
42
>>> spam[3]
10

As you can see, the remaining items shift down to close the gap, so there are no gaps in the numbering.

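Lists have a characteristic that often surprises newcomers: assigning a list to a second variable does not copy it. If you set b to a and then change the list through a, b shows the change as well, because both names refer to the same list object.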

>>> a=[2, 3, 4, 5]
>>> b=a
>>> del a[3]
>>> print a
[2, 3, 4]
>>> print b
[2, 3, 4]

This can easily be worked around by using b=a[:] instead.
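The difference between a second name and a real copy can be checked with the is operator. The following is a minimal sketch in Python 3 syntax; the names alias and snapshot are chosen only for illustration.

```python
a = [2, 3, 4, 5]
alias = a          # a second name for the same list object
snapshot = a[:]    # a slice of the whole list builds a new list

del a[3]           # modify the list through the name a

print(a)
print(alias)       # follows the change, since it is the same object
print(snapshot)    # keeps the original four items
print(alias is a, snapshot is a)
```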

For further explanation on lists, or to find out how to make 2D arrays, see Data Structure/Lists

Tuples

Tuples are similar to lists, except they are immutable. Once you have set a tuple, there is no way to change it whatsoever: you cannot add, change, or remove elements of a tuple. Otherwise, tuples work identically to lists.

To declare a tuple, you use commas:

unchanging = "rocks", 0, "the universe"

It is often necessary to use parentheses to differentiate between different tuples, such as when doing multiple assignments on the same line:

foo, bar = "rocks", 0, "the universe" # 3 elements here, so unpacking into 2 names raises ValueError
foo, bar = "rocks", (0, "the universe") # 2 elements here because the second element is a tuple

Unnecessary parentheses can be used without harm, but nested parentheses denote nested tuples:

>>> var = "me", "you", "us", "them"
>>> var = ("me", "you", "us", "them")

both produce:

>>> print var 
('me', 'you', 'us', 'them')

but:

>>> var = ("me", "you", ("us", "them"))
>>> print(var)
('me', 'you', ('us', 'them')) # A tuple of 3 elements, the last of which is itself a tuple.
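Immutability can be observed directly. In this sketch (Python 3 syntax, with names invented for the example), assigning to an element raises TypeError, and the only way to "extend" a tuple is to build a new one:

```python
point = ("rocks", 0, "the universe")

# Unpacking needs exactly one name per element.
first, second, third = point

# Any attempt to change an element raises TypeError.
try:
    point[1] = 42
    outcome = "modified"
except TypeError as error:
    outcome = "cannot modify: " + str(error)

# A longer tuple has to be built as a new object.
longer = point + ("and more",)

print(first, second, third)
print(outcome)
print(longer)
```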

For further explanation on tuples, see Data Structure/Tuples

Dictionaries

Dictionaries are also like lists, and they are mutable -- you can add, change, and remove elements from a dictionary. However, the elements in a dictionary are not bound to numbers, the way the elements of a list are. Every element in a dictionary has two parts: a key, and a value. Looking up a key in a dictionary returns the value linked to that key. You could consider a list to be a special kind of dictionary, in which the key of every element is a number, in numerical order.

Dictionaries are declared using curly braces, and each element is declared first by its key, then a colon, and then its value. For example:

>>> definitions = {"guava": "a tropical fruit", "python": "a programming language", "the answer": 42}
>>> definitions
{'python': 'a programming language', 'the answer': 42, 'guava': 'a tropical fruit'}
>>> definitions["the answer"]
42
>>> definitions["guava"]
'a tropical fruit'
>>> len(definitions)    
3

Also, adding an element to a dictionary is much simpler: simply declare it as you would a variable.

>>> definitions["new key"] = "new value"
>>> definitions
{'python': 'a programming language', 'the answer': 42, 'guava': 'a tropical fruit', 'new key': 'new value'}
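Changing and removing entries follows the same pattern as adding them. The sketch below uses Python 3 syntax; the sample keys and values are only for illustration.

```python
definitions = {"guava": "a tropical fruit", "python": "a programming language"}

definitions["the answer"] = 42            # add a new key
definitions["python"] = "a large snake"   # change the value of an existing key
del definitions["guava"]                  # remove a key together with its value

has_guava = "guava" in definitions               # membership tests look at the keys
fallback = definitions.get("guava", "unknown")   # get() avoids a KeyError

print(definitions)
print(has_guava, fallback)
for key in sorted(definitions):           # visit the keys in sorted order
    print(key, "->", definitions[key])
```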

For further explanation on dictionaries, see Data Structure/Dictionaries

Sets

Sets are just like lists except that they are unordered and do not allow duplicate values. Elements of a set are neither bound to a number (like list and tuple) nor to a key (like dictionary). The reason for using a set over other data types is speed: for a large number of items, insertion, deletion, and membership testing are much faster in a set than in a list or tuple. Sets also support mathematical set operations such as testing for subsets and finding the union or intersection of two sets.

>>> mind = set([42, 'a string', (23, 4)])
>>> mind
set([(23, 4), 42, 'a string'])


>>> mind = set([42, 'a string', 40, 41])
>>> mind
set([40, 41, 42, 'a string'])
>>> mind = set([42, 'a string', 40, 0])
>>> mind
set([40, 0, 42, 'a string'])
>>> mind.add('hello')
>>> mind
set([40, 0, 42, 'a string', 'hello'])

Note that sets are unordered: items you add to a set end up in an unpredictable position, and that position may change from time to time.

>>> mind.add('duplicate value')
>>> mind.add('duplicate value')
>>> mind
set([0, 'a string', 40, 42, 'hello', 'duplicate value'])

Sets cannot contain a single value more than once. Unlike lists, which can contain anything, the types of data that can be included in sets are restricted. A set can only contain hashable, immutable data types. Integers, strings, and tuples are hashable; lists, dictionaries, and other sets (except frozensets, see below) are not.
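The set operations mentioned above, and the restriction to hashable members, can be tried out as follows. This is a sketch in Python 3 syntax with made-up sample data; sorted() is used for printing only because sets have no order of their own.

```python
evens = set([0, 2, 4, 6])
small = set([0, 1, 2, 3])

union = evens | small              # items in either set
common = evens & small             # items in both sets
only_even = evens - small          # items in evens but not in small
is_subset = set([0, 2]) <= evens   # subset test

# Lists are mutable and therefore unhashable, so they cannot be set members.
try:
    set([[1, 2], [3, 4]])
    rejected = False
except TypeError:
    rejected = True

print(sorted(union), sorted(common), sorted(only_even))
print(is_subset, rejected)
```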

Frozenset

The relationship between frozenset and set is like the relationship between tuple and list. Frozenset is an immutable version of set. An example:

>>> frozen=frozenset(['life','universe','everything'])
>>> frozen
frozenset(['universe', 'life', 'everything'])
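Because a frozenset cannot change, it is hashable, which means it can be placed inside a set or used as a dictionary key. A short sketch in Python 3 syntax (the names groups and labels are invented for the example):

```python
frozen = frozenset(['life', 'universe', 'everything'])

# A frozenset has no add() method, because it cannot change.
can_add = hasattr(frozen, 'add')

# Being hashable, a frozenset can be a set member or a dictionary key.
groups = set([frozen, frozenset(['spam'])])
labels = {frozen: "the ultimate question"}

print(can_add)
print(len(groups))
# Element order does not matter when the key is looked up.
print(labels[frozenset(['everything', 'life', 'universe'])])
```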

Other data types

Python also has other types of sequences, though these are used less frequently and need to be imported from the standard library before being used. We will only brush over them here.

array
A typed-list, an array may only contain homogeneous values.
collections.defaultdict
A dictionary that, when an element is not found, returns a default value instead of error.
collections.deque
A double ended queue, allows fast manipulation on both sides of the queue.
heapq
A priority queue.
Queue
A thread-safe multi-producer, multi-consumer queue for use with multi-threaded programs. Note that a list can also be used as queue in a single-threaded code.
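To give a feel for three of these types, here is a brief sketch in Python 3 syntax. The sample data is arbitrary; only the imported names come from the standard library.

```python
from collections import defaultdict, deque
import heapq

# defaultdict: a missing key gets a default value instead of raising KeyError.
counts = defaultdict(int)
for letter in "banana":
    counts[letter] += 1

# deque: cheap appends and pops at both ends.
line = deque(["b", "c"])
line.appendleft("a")
line.append("d")
first = line.popleft()

# heapq: keeps the smallest item at the front of an ordinary list.
tasks = []
for priority in [5, 1, 3]:
    heapq.heappush(tasks, priority)
smallest = heapq.heappop(tasks)

print(dict(counts))
print(list(line), first)
print(smallest, sorted(tasks))
```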

For further explanation on sets, see Data Structure/Sets

3rd party data structure

Some useful data types in Python do not come in the standard library. Some of these are very specialized in their use. We will mention some of the more well known 3rd party types.

numpy.array
useful for heavy number crunching => see numpy section
sorteddict
like the name says, a sorted dictionary

Exercises

  1. Write a program that puts 5, 10, and "twenty" into a list. Then remove 10 from the list.
  2. Write a program that puts 5, 10, and "twenty" into a tuple.
  3. Write a program that puts 5, 10, and "twenty" into a set. Put "twenty", 10, and 5 into another set purposefully in a different order. Print both of them out and notice the ordering.
  4. Write a program that constructs a tuple, one element of which is a frozenset.
  5. Write a program that creates a dictionary mapping 1 to "Monday," 2 to "Tuesday," etc.



Data types


Data types determine whether an object can do something, or whether it just would not make sense. Other programming languages often determine whether an operation makes sense for an object by making sure the object can never be stored somewhere where the operation will be performed on the object (this type system is called static typing). Python does not do that. Instead it stores the type of an object with the object, and checks when the operation is performed whether that operation makes sense for that object (this is called dynamic typing).
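Dynamic typing can be seen in a function that never declares what kind of object it expects. In this sketch (Python 3 syntax; the function name double is made up), the check happens only when the * operation is actually performed:

```python
def double(value):
    return value * 2     # no type is declared; any object supporting * works

print(double(21))        # numbers are multiplied
print(double("ab"))      # strings are repeated
print(double([1, 2]))    # lists are repeated

try:
    double(None)         # None does not support *, so the check fails here
except TypeError as error:
    print("TypeError:", error)
```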

Built-in Data types

Python's built-in (or standard) data types can be grouped into several classes. Sticking to the hierarchy scheme used in the official Python documentation these are numeric types, sequences, sets and mappings (and a few more not discussed further here). Some of the types are only available in certain versions of the language as noted below.

  • boolean: the type of the built-in values True and False. Useful in conditional expressions, and anywhere else you want to represent the truth or falsity of some condition. Mostly interchangeable with the integers 1 and 0. In fact, conditional expressions will accept values of any type, treating special ones like boolean False, integer 0 and the empty string "" as equivalent to False, and all other values as equivalent to True. But for safety’s sake, it is best to only use boolean values in these places.

Numeric types:

  • int: Integers; equivalent to C longs in Python 2.x, non-limited length in Python 3.x
  • long: Long integers of non-limited length; exists only in Python 2.x
  • float: Floating-Point numbers, equivalent to C doubles
  • complex: Complex Numbers

Sequences:

  • str: String; represented as a sequence of 8-bit characters in Python 2.x, but as a sequence of Unicode characters (in the range of U+0000 - U+10FFFF) in Python 3.x
  • bytes: a sequence of integers in the range of 0-255; only available in Python 3.x
  • bytearray: like bytes, but mutable (see below); available since Python 2.6
  • list
  • tuple

Sets:

  • set: an unordered collection of unique objects; available as a standard type since Python 2.4
  • frozenset: like set, but immutable (see below); available as a standard type since Python 2.4

Mappings:

  • dict: Python dictionaries, also called hashmaps or associative arrays, which means that each key is associated with a value, rather like a Map in Java

Some others, such as type and callables

Mutable vs Immutable Objects

In general, data types in Python can be distinguished based on whether objects of the type are mutable or immutable. The content of objects of immutable types cannot be changed after they are created.

Some immutable types:

  • int, float, complex
  • str
  • bytes
  • tuple
  • frozenset
  • bool

Some mutable types:

  • array
  • bytearray
  • list
  • set
  • dict

Only mutable objects support methods that change the object in place, such as reassignment of a sequence slice, which will work for lists, but raise an error for tuples and strings.

It is important to understand that variables in Python are really just references to objects in memory. If you assign an object to a variable as below,

a = 1
s = 'abc'
l = ['a string', 456, ('a', 'tuple', 'inside', 'a', 'list')]

all you really do is make this variable (a, s, or l) point to the object (1, 'abc', ['a string', 456, ('a', 'tuple', 'inside', 'a', 'list')]), which is kept somewhere in memory, as a convenient way of accessing it. If you reassign a variable as below

a = 7
s = 'xyz'
l = ['a simpler list', 99, 10]

you make the variable point to a different object (newly created ones in our examples). As stated above, only mutable objects can be changed in place (l[0] = 1 is ok in our example, but s[0] = 'a' raises an error). This becomes tricky, when an operation is not explicitly asking for a change to happen in place, as is the case for the += (increment) operator, for example. When used on an immutable object (as in a += 1 or in s += 'qwertz'), Python will silently create a new object and make the variable point to it. However, when used on a mutable object (as in l += [1,2,3]), the object pointed to by the variable will be changed in place. While in most situations, you do not have to know about this different behavior, it is of relevance when several variables are pointing to the same object. In our example, assume you set p = s and m = l, then s += 'etc' and l += [9,8,7]. This will change s and leave p unaffected, but will change both m and l since both point to the same list object. Python's built-in id() function, which returns a unique object identifier for a given variable name, can be used to trace what is happening under the hood.
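The scenario just described can be traced with id(). The sketch below (Python 3 syntax) reuses the names s, l, p and m from the text; the two helper variables that remember the identifiers are added for the illustration.

```python
s = 'abc'
l = [1, 2, 3]
p = s          # p and s refer to the same string
m = l          # m and l refer to the same list

s_id_before, l_id_before = id(s), id(l)

s += 'etc'       # strings are immutable: s now refers to a new object
l += [9, 8, 7]   # lists are mutable: the existing object is extended in place

print(id(s) == s_id_before)   # False: s was rebound to a new string
print(id(l) == l_id_before)   # True: still the same list object
print(p)                      # p still refers to the old string
print(m)                      # m sees the change made through l
```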
Typically, this behavior of Python causes confusion in functions. As an illustration, consider this code:

def append_to_sequence (myseq):
    myseq += (9,9,9)
    return myseq

tuple1 = (1,2,3) # tuples are immutable
list1 = [1,2,3] # lists are mutable

tuple2 = append_to_sequence(tuple1)
list2 = append_to_sequence(list1)

print 'tuple1 = ', tuple1 # outputs (1, 2, 3)
print 'tuple2 = ', tuple2 # outputs (1, 2, 3, 9, 9, 9)
print 'list1 = ', list1 # outputs [1, 2, 3, 9, 9, 9]
print 'list2 = ', list2 # outputs [1, 2, 3, 9, 9, 9]

This will give the output indicated above, which is usually unintended. myseq is a local variable of the append_to_sequence function, but when this function gets called, myseq will nevertheless point to the same object as the variable that we pass in (tuple1 or list1 in our example). If that object is immutable (like a tuple), there is no problem. The += operator will cause the creation of a new tuple, and myseq will be set to point to it. However, if we pass in a reference to a mutable object, that object will be manipulated in place (so myseq and list1, in our case, end up pointing to the same, modified list object).

Creating Objects of Defined Types

Literal integers can be entered in three ways:

  • decimal numbers can be entered directly
  • hexadecimal numbers can be entered by prepending a 0x or 0X (0xff is hex FF, or 255 in decimal)
  • the format of octal literals depends on the version of Python:
    • Python 2.x: octals can be entered by prepending a 0 (0732 is octal 732, or 474 in decimal)
    • Python 3.x: octals can be entered by prepending a 0o or 0O (0o732 is octal 732, or 474 in decimal)

Floating point numbers can be entered directly.

In Python 2.x, long integers are entered either directly (1234567891011121314151617181920 is a long integer) or by appending an L (0L is a long integer). Computations involving short integers that overflow are automatically turned into long integers. Python 3.x has no separate long type, and the L suffix is a syntax error there.

Complex numbers are entered by adding a real number and an imaginary one, which is entered by appending a j (i.e. 10+5j is a complex number. So is 10j). Note that j by itself does not constitute a number. If this is desired, use 1j.
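The numeric literal forms can be compared side by side. This sketch uses the Python 3.x spelling of octal literals (which Python 2.6 and later also accept); the variable names are chosen only for the example.

```python
decimal_value = 255
hex_value = 0xff          # hexadecimal
octal_value = 0o732       # octal, Python 3.x spelling
float_value = 2.5e-3      # floats may use exponent notation
complex_value = 10 + 5j   # real part 10.0, imaginary part 5.0
imaginary_unit = 1j       # j on its own would be read as a variable name

print(hex_value == decimal_value)
print(octal_value)
print(float_value)
print(complex_value.real, complex_value.imag)
print(imaginary_unit * imaginary_unit)
```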

Strings can be either single or triple quoted strings. The difference is in the starting and ending delimiters, and in that single quoted strings cannot span more than one line. Single quoted strings are entered by entering either a single quote (') or a double quote (") followed by its match. So therefore

'foo' works, and
"moo" works as well,
     but
'bar" does not work, and
"baz' does not work either.
"quux'' is right out.

Triple quoted strings are like single quoted strings, but can span more than one line. Their starting and ending delimiters must also match. They are entered with three consecutive single or double quotes, so

'''foo''' works, and
"""moo""" works as well,
     but
'"'bar'"' does not work, and
"""baz''' does not work either.
'"'quux"'" is right out.

Tuples are entered in parentheses, with commas between the entries:

(10, 'Mary had a little lamb')

Also, the parentheses can be left out when it's not ambiguous to do so:

10, 'whose fleece was as white as snow'

Note that one-element tuples can be entered by surrounding the entry with parentheses and adding a comma like so:

('this is a singleton tuple',)

Lists are similar, but with brackets:

['abc', 1,2,3]

Dicts are created by surrounding with curly braces a list of key/value pairs separated from each other by a colon and from the other entries with commas:

{ 'hello': 'world', 'weight': 'African or European?' }

Any of these composite types can contain any other, to any depth:

((((((((('bob',),['Mary', 'had', 'a', 'little', 'lamb']), { 'hello' : 'world' } ),),),),),),)

Null object

The Python analogue of null pointer known from other programming languages is None. None is not a null pointer or a null reference but an actual object of which there is only one instance. One of the uses of None is in default argument values of functions, for which see Python Programming/Functions. Comparisons to None are usually made using is rather than ==.

Testing for None and assignment:

if item is None:
  ...
  another = None

if not item is None:
  ...

if item is not None: # Also possible
  ...

Using None in a default argument value:

def log(message, type = None):
  ...

PEP8 states that "Comparisons to singletons like None should always be done with is or is not, never the equality operators." Therefore, "if item == None:" is inadvisable. A class can redefine the equality operator (==) such that instances of it will equal None.

You can verify that None is an object by dir(None) or id(None).
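Both points, the default-argument idiom and the reason for preferring is over ==, can be combined in one sketch (Python 3 syntax). The log function body and the AlwaysEqual class are contrived for illustration and are not taken from any library.

```python
def log(message, kind=None):
    """Return a log line; kind is a hypothetical optional label."""
    if kind is None:          # identity test, as PEP 8 recommends
        kind = "info"
    return "[" + kind + "] " + message

class AlwaysEqual(object):
    """A contrived class whose instances claim to equal everything."""
    def __eq__(self, other):
        return True

item = AlwaysEqual()
print(log("started"))
print(log("disk full", "error"))
print(item == None)   # True, although item is not None
print(item is None)   # False: identity cannot be redefined by a class
```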

See also the Python Programming/Operators chapter.

Type conversion

Type conversion in Python by example:

v1 = int(2.7) # 2
v2 = int(-3.9) # -3
v3 = int("2") # 2
v4 = int("11", 16) # 17, base 16
v5 = long(2) # 2L; long exists only in Python 2.x
v6 = float(2) # 2.0
v7 = float("2.7") # 2.7
v8 = float("2.7E-2") # 0.027
v9 = float(False) # 0.0
vA = float(True) # 1.0
vB = str(4.5) # "4.5"
vC = str([1, 3, 5]) # "[1, 3, 5]"
vD = bool(0) # False; bool fn since Python 2.2.1
vE = bool(3) # True
vF = bool([]) # False - empty list
vG = bool([False]) # True - non-empty list
vH = bool({}) # False - empty dict; same for empty tuple
vI = bool("") # False - empty string
vJ = bool(" ") # True - non-empty string
vK = bool(None) # False
vL = bool(len) # True
vM = set([1, 2])
vN = list(vM)
vO = list({1: "a", 2: "b"}) # dict -> list of keys
vP = tuple(vN)
vQ = list("abc") # ['a', 'b', 'c']
print v1, v2, v3, type(v1), type(v2), type(v3)

Implicit type conversion:

int1 = 4
float1 = int1 + 2.1 # 4 converted to float
# str1 = "My int:" + int1 # Error: no implicit type conversion from int to string
str1 = "My int:" + str(int1)
int2 = 4 + True # 5: bool is implicitly converted to int
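Explicit conversions fail with a ValueError when the text does not describe a value of the requested type. A minimal sketch of guarding against that, in Python 3 syntax (the helper name to_int and its default parameter are invented for this example):

```python
def to_int(text, default=None):
    """Return int(text), or default when text is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return default

print(to_int("42"))
print(to_int("forty-two"))   # not a number at all: the default is returned
print(to_int("2.7"))         # int() does not parse a float written as text
print(int(float("2.7")))     # converting in two steps works
```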

Keywords: type casting.

Exercises

  1. Write a program that instantiates a single object, adds [1,2] to the object, and returns the result.
    1. Find an object that returns an output of the same length (if one exists?).
    2. Find an object that returns an output length 2 greater than it started.
    3. Find an object that causes an error.
  2. Find two data types X and Y such that X = X + Y will cause an error, but X += Y will not.



Numbers


Python 2.x supports 4 built-in numeric types - int, long, float and complex. Of these, the long type has been dropped in Python 3.x - the int type is now of unlimited length by default. You don’t have to specify what type of variable you want; Python does that automatically.

  • Int: The basic integer type in python, equivalent to the hardware 'c long' for the platform you are using in Python 2.x, unlimited in length in Python 3.x.
  • Long: Integer type with unlimited length. In python 2.2 and later, Ints are automatically turned into long ints when they overflow. Dropped since Python 3.0, use int type instead.
  • Float: This is a binary floating point number. Longs and Ints are automatically converted to floats when a float is used in an expression, and with the true-division / operator. In CPython, floats are usually implemented using the C language's double type, which often yields 52 bits of significand, 11 bits of exponent, and 1 sign bit, but this is machine dependent.
  • Complex: This is a complex number consisting of two floats. Complex literals are written as a + bj where a and b are floating-point numbers denoting the real and imaginary parts respectively.

In general, the number types are automatically 'up cast' in this order:

Int → Long → Float → Complex. The farther to the right you go, the higher the precedence.
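The up-casting can be observed by mixing types in one expression. This sketch uses Python 3 syntax, where the separate long step no longer exists; the variable names are chosen only for illustration.

```python
int_plus_float = 5 + 2.0              # the int is converted to float
float_plus_complex = 2.5 + (1 + 2j)   # the float is converted to complex
int_plus_complex = 5 + 2j             # the int is converted to complex

print(type(int_plus_float).__name__, int_plus_float)
print(type(float_plus_complex).__name__, float_plus_complex)
print(type(int_plus_complex).__name__, int_plus_complex)
```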

>>> x = 5
>>> type(x)
<type 'int'>
>>> x = 187687654564658970978909869576453
>>> type(x)
<type 'long'>
>>> x = 1.34763
>>> type(x)
<type 'float'>
>>> x = 5 + 2j
>>> type(x)
<type 'complex'>

The result of division can be confusing. In Python 2.x, using the / operator on two integers returns another integer, using floor division. For example, 5/2 gives 2. You have to make one of the operands a float to get true division: 5/2. or 5./2 (the trailing dot makes the literal a float) yields 2.5. Starting with Python 2.2, this behavior can be changed to true division with the statement from __future__ import division. In Python 3.x, the / operator always performs true division. Floor division can be requested explicitly with the // operator, available since Python 2.2.

This illustrates the behavior of the / operator in Python 2.x (2.2 and later):

>>> 5/2
2
>>> 5/2.
2.5
>>> 5./2
2.5
>>> from __future__ import division
>>> 5/2
2.5
>>> 5//2
2
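For comparison, here is a short sketch of the same operators under Python 3 semantics, where no __future__ import is needed. It also shows a detail worth knowing: floor division rounds toward negative infinity, not toward zero. The variable names are illustrative.

```python
# Python 3 semantics: / is true division, // is floor division.
true_quotient = 5 / 2      # 2.5
floor_quotient = 5 // 2    # 2
negative_floor = -5 // 2   # -3: rounds toward negative infinity
float_floor = 5.0 // 2     # 2.0: floored, but the result stays a float

print(true_quotient)
print(floor_quotient)
print(negative_floor)
print(float_floor)
```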

For operations on numbers, see chapters Basic Math and Math.



Strings


Overview

Strings in Python at a glance:

str1 = "Hello"                # A new string using double quotes
str2 = 'Hello'                # Single quotes do the same
str3 = "Hello\tworld\n"       # One with a tab and a newline
str4 = str1 + " world"        # Concatenation
str5 = str1 + str(4)          # Concatenation with a number
str6 = str1[2]                # 3rd character
str6a = str1[-1]              # Last character
#str1[0] = "M"                # No way; strings are immutable
for char in str1: print char  # For each character
str7 = str1[1:]               # Without the 1st character
str8 = str1[:-1]              # Without the last character
str9 = str1[1:4]              # Substring: 2nd to 4th character
str10 = str1 * 3              # Repetition
str11 = str1.lower()          # Lowercase
str12 = str1.upper()          # Uppercase
str13 = str1.rstrip()         # Strip right (trailing) whitespace
str14 = str1.replace('l','h') # Replacement
list15 = str1.split('l')      # Splitting
if str1 == str2: print "Equ"  # Equality test
if "el" in str1: print "In"   # Substring test
length = len(str1)            # Length
pos1 = str1.find('llo')       # Index of substring or -1
pos2 = str1.rfind('l')        # Index of substring, from the right
count = str1.count('l')       # Number of occurrences of a substring

print str1, str2, str3, str4, str5, str6, str7, str8, str9, str10 
print str11, str12, str13, str14, list15
print length, pos1, pos2, count

See also chapter Regular Expression for advanced pattern matching on strings in Python.

String operations

Equality

Two strings are equal if they have exactly the same contents, meaning that they are both the same length and each character has a one-to-one positional correspondence. Many other languages compare strings by identity instead; that is, two strings are considered equal only if they occupy the same space in memory. Python uses the is operator to test the identity of strings and any two objects in general.

Examples:

>>> a = 'hello'; b = 'hello' # Assign 'hello' to a and b.
>>> a == b                   # check for equality
True
>>> a == 'hello'             #
True
>>> a == "hello"             # (choice of delimiter is unimportant)
True
>>> a == 'hello '            # (extra space)
False
>>> a == 'Hello'             # (wrong case)
False
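The difference between equality (==) and identity (is) can be sketched as follows, in Python 3 syntax. Whether two separately created strings with equal contents are also the same object depends on the interpreter, so the sketch deliberately does not print that result.

```python
a = 'hello'
b = ''.join(['hel', 'lo'])   # same contents, built at run time

print(a == b)                # True: the contents are equal

# Identity of two separately built strings is implementation-dependent,
# so it should never be used to compare string contents.
same_object = a is b

alias = a                    # a second name for the very same object
print(alias is a)            # True
```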

Numerical

There are two quasi-numerical operations which can be done on strings: addition and multiplication. String addition is just another name for concatenation. String multiplication is repeated addition, that is, repeated concatenation. So:

>>> c = 'a'
>>> c + 'b'
'ab'
>>> c * 5
'aaaaa'

Containment

There is a simple operator 'in' that returns True if the first operand is contained in the second. This also works on substrings:

>>> x = 'hello'
>>> y = 'ell'
>>> x in y
False
>>> y in x
True

Note that 'print x in y' would have also returned the same value.

Indexing and Slicing

Much like arrays in other languages, the individual characters in a string can be accessed by an integer representing its position in the string. The first character in string s would be s[0] and the nth character would be at s[n-1].

>>> s = "Xanadu"
>>> s[1]
'a'

Unlike arrays in many other languages, Python strings can also be indexed backwards, using negative numbers. The last character has index -1, the second to last character has index -2, and so on.

>>> s[-4]
'n'

We can also use "slices" to access a substring of s. s[a:b] will give us a string starting with s[a] and ending with s[b-1].

>>> s[1:4]
'ana'

Because strings are immutable, neither individual characters nor slices can be assigned to. The string is unchanged after the failed attempts (the exact wording of the error messages varies between Python versions):

>>> print s
Xanadu
>>> s[0] = 'J'
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: object does not support item assignment
>>> s[1:3] = "up"
Traceback (most recent call last):
File "<stdin>", line 1, in ?
TypeError: object does not support slice assignment
>>> print s
Xanadu

Another feature of slices is that if the beginning or end is left empty, it will default to the first or last index, depending on context:

>>> s[2:]
'nadu'
>>> s[:3]
'Xan'
>>> s[:]
'Xanadu'

You can also use negative numbers in slices:

>>> s[-2:]
'du'

To understand slices, it's easiest not to count the elements themselves. It is a bit like counting not on your fingers, but in the spaces between them. A sequence of four elements is indexed like this:

Element:     1     2     3     4
Index:    0     1     2     3     4
         -4    -3    -2    -1

So, when we ask for the [1:3] slice, that means we start at index 1, end at index 3, and take everything in between them, which is elements 2 and 3. If you are used to indexes in C or Java, this can be a bit disconcerting until you get used to it.
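The "positions between elements" model leads to a few handy identities, sketched below in Python 3 syntax using the string s = "Xanadu" from the earlier examples.

```python
s = "Xanadu"

# For positions within range, s[a:b] contains exactly b - a characters.
middle = s[1:3]
print(middle)         # an
print(len(middle))    # 2

# Cutting at any position and rejoining gives back the original string.
for i in range(len(s) + 1):
    assert s[:i] + s[i:] == s

# A negative position counts from the end: -2 is the same cut as len(s) - 2.
print(s[-2:] == s[len(s) - 2:])   # True
```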

String constants

String constants can be found in the standard string module. An example is string.digits, which equals '0123456789'.
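A small sketch of how such constants can be used, in Python 3 syntax. The helper only_digits is a hypothetical name introduced for illustration; it is not part of the string module.

```python
import string

print(string.digits)            # 0123456789
print(string.ascii_lowercase)   # abcdefghijklmnopqrstuvwxyz

# Hypothetical helper: is the text non-empty and made of decimal digits only?
def only_digits(text):
    return len(text) > 0 and all(ch in string.digits for ch in text)

print(only_digits('2024'))   # True
print(only_digits('20x4'))   # False
```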

String methods

There are a number of methods or built-in string functions:

  • capitalize
  • center
  • count
  • decode
  • encode
  • endswith
  • expandtabs
  • find
  • index
  • isalnum
  • isalpha
  • isdigit
  • islower
  • isspace
  • istitle
  • isupper
  • join
  • ljust
  • lower
  • lstrip
  • replace
  • rfind
  • rindex
  • rjust
  • rstrip
  • split
  • splitlines
  • startswith
  • strip
  • swapcase
  • title
  • translate
  • upper
  • zfill

Only some of these methods are covered below.

is*

isalnum(), isalpha(), isdigit(), islower(), isupper(), isspace(), and istitle() fit into this category.

The string being tested must contain at least one character, or the is* methods return False. In other words, an empty string (len(string) == 0) never passes any of these tests.

  • isalnum returns True if the string is entirely composed of alphabetic and/or numeric characters (i.e. no punctuation).
  • isalpha and isdigit work similarly for alphabetic characters or numeric characters only.
  • isspace returns True if the string is composed entirely of whitespace.
  • islower, isupper, and istitle return True if the string is in lowercase, uppercase, or titlecase respectively. Uncased characters are "allowed", such as digits, but there must be at least one cased character in the string object in order to return True. Titlecase means the first cased character of each word is uppercase, and any immediately following cased characters are lowercase. Curiously, 'Y2K'.istitle() returns True. That is because uppercase characters can only follow uncased characters. Likewise, lowercase characters can only follow uppercase or lowercase characters. Hint: whitespace is uncased.

Example:

>>> '2YK'.istitle()
False
>>> 'Y2K'.istitle()
True
>>> '2Y K'.istitle()
True
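The rules above can be checked with a few more cases. This is a short sketch in Python 3 syntax; the sample strings are arbitrary.

```python
print('abc123'.isalnum())          # True: letters and digits only
print('abc 123'.isalnum())         # False: the space is neither
print(''.isalnum())                # False: empty strings never pass
print('abc'.isalpha())             # True
print('123'.isdigit())             # True
print(' \t\n'.isspace())           # True
print('hello world 42'.islower())  # True: uncased characters are allowed
print('42'.islower())              # False: no cased character at all
print('Hello World'.istitle())     # True
```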

Title, Upper, Lower, Swapcase, Capitalize

These methods return a copy of the string converted to title case, upper case or lower case, with the case of every letter inverted, or with only the first character capitalized, respectively.

The title method capitalizes the first letter of each word in the string (and makes the rest lower case). Words are identified as substrings of alphabetic characters that are separated by non-alphabetic characters, such as digits, or whitespace. This can lead to some unexpected behavior. For example, the string "x1x" will be converted to "X1X" instead of "X1x".

The swapcase method makes all uppercase letters lowercase and vice versa.

The capitalize method is like title except that it considers the entire string to be a word. (i.e. it makes the first character upper case and the rest lower case)

Example:

s = 'Hello, wOrLD'
print s              # 'Hello, wOrLD'
print s.title()      # 'Hello, World'
print s.swapcase()   # 'hELLO, WoRld'
print s.upper()      # 'HELLO, WORLD'
print s.lower()      # 'hello, world'
print s.capitalize() # 'Hello, world'

Keywords: to lower case, to upper case, lcase, ucase, downcase, upcase.

count

Returns the number of non-overlapping occurrences of the specified substring in the string. For example:

>>> s = 'Hello, world'
>>> s.count('o') # print the number of 'o's in 'Hello, world' (2)
2

Hint: .count() is case-sensitive, so this example will only count the number of lowercase letter 'o's. For example, if you ran:

>>> s = 'HELLO, WORLD'
>>> s.count('o') # print the number of lowercase 'o's in 'HELLO, WORLD' (0)
0

strip, rstrip, lstrip

Returns a copy of the string with the leading (lstrip) and trailing (rstrip) whitespace removed. strip removes both.

>>> s = '\t Hello, world\n\t '
>>> print s
         Hello, world

>>> print s.strip()
Hello, world
>>> print s.lstrip()
Hello, world
        # ends here
>>> print s.rstrip()
         Hello, world

Note the leading and trailing tabs and newlines.

Strip methods can also be used to remove other types of characters.

import string
s = 'www.wikibooks.org'
print s
print s.strip('w')                 # Removes all w's from outside
print s.strip(string.lowercase)    # Removes all lowercase letters from outside
print s.strip(string.printable)    # Removes all printable characters

Outputs:

www.wikibooks.org
.wikibooks.org
.wikibooks.
 

Note that string.lowercase and string.printable require an import string statement. In Python 3, string.lowercase is no longer available; use string.ascii_lowercase instead.

ljust, rjust, center

These methods left-justify, right-justify or center a string in a field of a given width; the rest of the field is padded with spaces.

>>> s = 'foo'
>>> s
'foo'
>>> s.ljust(7)
'foo    '
>>> s.rjust(7)
'    foo'
>>> s.center(7)
'  foo  '

join

Joins together the given sequence with the string as separator:

>>> seq = ['1', '2', '3', '4', '5']
>>> ' '.join(seq)
'1 2 3 4 5'
>>> '+'.join(seq)
'1+2+3+4+5'

If the sequence contains items that are not strings, map may be helpful; here it converts the numbers in seq into strings:

>>> seq = [1,2,3,4,5]
>>> ' '.join(map(str, seq))
'1 2 3 4 5'

This way, arbitrary objects may be in seq instead of just strings.

find, index, rfind, rindex

The find and index methods return the index of the first found occurrence of the given subsequence. If it is not found, find returns -1 but index raises a ValueError. rfind and rindex are the same as find and index except that they search through the string from right to left (i.e. they find the last occurrence)

>>> s = 'Hello, world'
>>> s.find('l')
2
>>> s[s.index('l'):]
'llo, world'
>>> s.rfind('l')
10
>>> s[:s.rindex('l')]
'Hello, wor'
>>> s[s.index('l'):s.rindex('l')]
'llo, wor'

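Because Python strings accept negative subscripts, index is probably the better choice in situations like the one shown. If the substring is missing, find returns -1, which is itself a valid subscript (the last character), so the slice would silently yield an unintended value; index raises a ValueError instead.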
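The pitfall can be sketched as follows, in Python 3 syntax. The helper text_from is a hypothetical name used only to show one way of handling the "not found" case explicitly.

```python
s = 'Hello, world'

# find returns -1 for a missing substring, and -1 is a valid index,
# so the slice silently starts at the last character.
print(s[s.find('z'):])      # d

# index raises ValueError instead, so the mistake cannot go unnoticed.
try:
    tail = s[s.index('z'):]
except ValueError:
    tail = None
print(tail)                 # None

# Hypothetical helper that handles the "not found" case explicitly.
def text_from(text, sub):
    pos = text.find(sub)
    return text[pos:] if pos != -1 else ''

print(text_from(s, 'wor'))  # world
```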

replace

Replace works just like it sounds. It returns a copy of the string with all occurrences of the first parameter replaced with the second parameter.

>>> 'Hello, world'.replace('o', 'X')
'HellX, wXrld'

Or, using variable assignment:

string = 'Hello, world'
newString = string.replace('o', 'X')
print string
print newString

Outputs:

Hello, world
HellX, wXrld

Notice, the original variable (string) remains unchanged after the call to replace.

expandtabs

Replaces tabs with the appropriate number of spaces (default number of spaces per tab = 8; this can be changed by passing the tab size as an argument).

s = 'abcdefg\tabc\ta'
print s
print len(s)
t = s.expandtabs()
print t
print len(t)

Outputs:

abcdefg abc     a
13
abcdefg abc     a
17

Notice how (although these both look the same) the second string (t) has a different length because each tab is represented by spaces not tab characters.

To use a tab size of 4 instead of 8:

v = s.expandtabs(4)
print v
print len(v)

Outputs:

abcdefg abc a
13

Please note each tab is not always counted as eight spaces. Rather a tab "pushes" the count to the next multiple of eight. For example:

s = '\t\t'
print s.expandtabs().replace(' ', '*')
print len(s.expandtabs())

Outputs:

****************
16

Another example:

s = 'abc\tabc\tabc'
print s.expandtabs().replace(' ', '*')
print len(s.expandtabs())

Outputs:

abc*****abc*****abc
19
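The "push to the next multiple" rule can be made explicit by writing the expansion by hand. The function below is only an illustrative sketch in Python 3 syntax (expand_tabs is a made-up name, and it assumes tabsize is greater than zero); in real code, simply call the built-in expandtabs method.

```python
def expand_tabs(text, tabsize=8):
    """Illustrative re-implementation of str.expandtabs (tabsize > 0)."""
    pieces = []
    column = 0
    for ch in text:
        if ch == '\t':
            # Pad with spaces up to the next multiple of tabsize.
            pad = tabsize - (column % tabsize)
            pieces.append(' ' * pad)
            column += pad
        elif ch in '\r\n':
            pieces.append(ch)
            column = 0          # a new line restarts the column count
        else:
            pieces.append(ch)
            column += 1
    return ''.join(pieces)

print(expand_tabs('abc\tabc\tabc').replace(' ', '*'))  # abc*****abc*****abc
print(len(expand_tabs('\t\t')))                        # 16
```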

split, splitlines

The split method returns a list of the words in the string. It can take a separator argument to use instead of whitespace.

>>> s = 'Hello, world'
>>> s.split()
['Hello,', 'world']
>>> s.split('l')
['He', '', 'o, wor', 'd']

Note that in neither case is the separator included in the split strings, but empty strings are allowed.
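The difference between splitting on whitespace and splitting on an explicit separator shows up when separators are repeated. A brief sketch in Python 3 syntax, with an arbitrary sample string:

```python
s = 'one  two   three'    # two, then three spaces between the words

by_whitespace = s.split()     # runs of whitespace count as one separator
by_space = s.split(' ')       # every single space is a separator

print(by_whitespace)   # ['one', 'two', 'three']
print(by_space)        # ['one', '', 'two', '', '', 'three']

# Joining with the same separator restores the original string.
print(' '.join(by_space) == s)   # True
```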

The splitlines method breaks a multiline string into many single line strings. It is analogous to split('\n') (but accepts '\r' and '\r\n' as delimiters as well) except that if the string ends in a newline character, splitlines ignores that final character (see example).

>>> s = """
... One line
... Two lines
... Red lines
... Blue lines
... Green lines
... """
>>> s.split('\n')
['', 'One line', 'Two lines', 'Red lines', 'Blue lines', 'Green lines', '']
>>> s.splitlines()
['', 'One line', 'Two lines', 'Red lines', 'Blue lines', 'Green lines']

The split method also accepts multi-character separators:

txt = 'May the force be with you'
spl = txt.split('the')
print(spl)
# ['May ', ' force be with you']

Exercises

  1. Write a program that takes a string, (1) capitalizes the first letter, (2) creates a list containing each word, and (3) searches for the last occurrence of "a" in the first word.
  2. Run the program on the string "Bananas are yellow."
  3. Write a program that replaces all instances of "one" with "one (1)". For this exercise capitalization does not matter, so it should treat "one", "One", and "oNE" identically.
  4. Run the program on the string "One banana was brown, but one was green."



Lists


A list in Python is an ordered group of items (or elements). It is a very general structure, and list elements don't have to be of the same type: you can put numbers, letters, strings and nested lists all on the same list.

Overview

Lists in Python at a glance:

list1 = []                      # A new empty list
list2 = [1, 2, 3, "cat"]        # A new non-empty list with mixed item types
list1.append("cat")             # Add a single member, at the end of the list
list1.extend(["dog", "mouse"])  # Add several members
list1.insert(0, "fly")          # Insert at the beginning
list1[0:0] = ["cow", "doe"]     # Add members at the beginning
doe = list1.pop(1)              # Remove item at index
if "cat" in list1:              # Membership test
  list1.remove("cat")           # Remove AKA delete
#list1.remove("elephant") - throws an error
for item in list1:              # Iteration AKA for each item
  print item
print "Item count:", len(list1) # Length AKA size AKA item count
list3 = [6, 7, 8, 9]
for i in range(0, len(list3)):  # Read-write iteration AKA for each item
  list3[i] += 1                 # Item access AKA element access by index
last = list3[-1]                # Last item
nextToLast = list3[-2]          # Next-to-last item
isempty = len(list3) == 0       # Test for emptiness
set1 = set(["cat", "dog"])      # Initialize set from a list
list4 = list(set1)              # Get a list from a set
list5 = list4[:]                # A shallow list copy
list4equal5 = list4==list5      # True: same by value
list4refEqual5 = list4 is list5 # False: not same by reference
list6 = list4[:]
del list6[:]                    # Clear AKA empty AKA erase
list7 = [1, 2] + [2, 3, 4]      # Concatenation
print list1, list2, list3, list4, list5, list6, list7
print list4equal5, list4refEqual5
print list3[1:3], list3[1:], list3[:2] # Slices
print max(list3 ), min(list3 ), sum(list3) # Aggregates

print [x for x in range(10)]    # List comprehension
print [x for x in range(10) if x % 2 == 1]
print [x for x in range(10) if x % 2 == 1 if x < 5]
print [x + 1 for x in range(10) if x % 2 == 1]
print [x + y for x in '123' for y in 'abc']

List creation

There are two different ways to make a list in Python. The first is through assignment ("statically"), the second is using list comprehensions ("actively").

Plain creation

To make a static list of items, write them between square brackets. For example:

[ 1,2,3,"This is a list",'c',Donkey("kong") ]

Observations:

  1. The list contains items of different data types: integer, string, and an instance of a Donkey class (assumed here to be defined elsewhere).
  2. Objects can be created 'on the fly' and added to lists. The last item is a new instance of the Donkey class.

Creation of a new list whose members are constructed from non-literal expressions:

a = 2
b = 3
myList = [a+b, b+a, len(["a","b"])]

List comprehensions

With a list comprehension, you describe the process by which the list should be created. The comprehension has two parts: the first is an expression describing what each element will look like, and the second is the loop that supplies the values it is built from.

For instance, let's say we have a list of words:

listOfWords = ["this","is","a","list","of","words"]

To take the first letter of each word and make a list out of it using list comprehension, we can do this:

>>> listOfWords = ["this","is","a","list","of","words"]
>>> items = [ word[0] for word in listOfWords ]
>>> print items
['t', 'i', 'a', 'l', 'o', 'w']

A list comprehension supports more than one for clause. The clauses are nested, with the leftmost for acting as the outermost loop, so the result contains one item for every combination of values:

>>> item = [x+y for x in 'cat' for y in 'pot']
>>> print item
['cp', 'co', 'ct', 'ap', 'ao', 'at', 'tp', 'to', 'tt']
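To see how several for clauses are evaluated, the comprehension can be compared with explicitly nested loops. A minimal sketch in Python 3 syntax; the names pairs and pairs_loop are illustrative.

```python
# The comprehension with two for clauses...
pairs = [x + y for x in 'cat' for y in 'pot']

# ...builds the same list as these explicitly nested loops.
pairs_loop = []
for x in 'cat':        # the leftmost for clause is the outer loop
    for y in 'pot':    # the next for clause is the inner loop
        pairs_loop.append(x + y)

print(pairs == pairs_loop)   # True
print(len(pairs))            # 9, one item per combination (3 * 3)
```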

List comprehension supports an if statement, to only include members into the list that fulfill a certain condition:

>>> print [x+y for x in 'cat' for y in 'pot']
['cp', 'co', 'ct', 'ap', 'ao', 'at', 'tp', 'to', 'tt']
>>> print [x+y for x in 'cat' for y in 'pot' if x != 't' and y != 'o' ]
['cp', 'ct', 'ap', 'at']
>>> print [x+y for x in 'cat' for y in 'pot' if x != 't' or y != 'o' ]
['cp', 'co', 'ct', 'ap', 'ao', 'at', 'tp', 'tt']

In version 2.x, Python's list comprehension does not define a scope. Any variables that are bound in an evaluation remain bound to whatever they were last bound to when the evaluation was completed. In version 3.x Python's list comprehension uses local variables:

>>> print x, y                         #Input to python version 2
t t                                    #Output using python 2

>>> print x, y                         #Input to python version 3
NameError: name 'x' is not defined     #Python 3 returns an error because x and y were not leaked

The Python 2 behavior is exactly the same as if the comprehension had been expanded into an explicitly nested group of one or more 'for' statements and zero or more 'if' statements.

List creation shortcuts

You can initialize a list to a size, with an initial value for each element:

>>> zeros=[0]*5
>>> print zeros
[0, 0, 0, 0, 0]

This works for any data type:

>>> foos=['foo']*3
>>> print foos
['foo', 'foo', 'foo']

But there is a caveat. When building a new list by multiplying, Python copies each item by reference. This poses a problem for mutable items, for instance in a multidimensional array where each element is itself a list. You'd guess that the easy way to generate a two dimensional array would be:

listoflists=[ [0]*4 ] *5

and this works, but probably doesn't do what you expect:

>>> listoflists=[ [0]*4 ] *5
>>> print listoflists
[[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
>>> listoflists[0][2]=1
>>> print listoflists
[[0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0]]

What's happening here is that Python is using the same reference to the inner list as the elements of the outer list. Another way of looking at this issue is to examine how Python sees the above definition:

>>> innerlist=[0]*4
>>> listoflists=[innerlist]*5
>>> print listoflists
[[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
>>> innerlist[2]=1
>>> print listoflists
[[0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0], [0, 0, 1, 0]]

Assuming the above effect is not what you intend, one way around this issue is to use list comprehensions:

>>> listoflists=[[0]*4 for i in range(5)]
>>> print listoflists
[[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
>>> listoflists[0][2]=1
>>> print listoflists
[[0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
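The difference between the two constructions can be checked directly with the is operator and the built-in id function. A short sketch in Python 3 syntax; shared and separate are illustrative names.

```python
shared = [[0] * 4] * 5                   # five references to ONE inner list
separate = [[0] * 4 for i in range(5)]   # five distinct inner lists

print(shared[0] is shared[1])        # True
print(separate[0] is separate[1])    # False

# Count how many different row objects each outer list really holds.
print(len(set(id(row) for row in shared)))     # 1
print(len(set(id(row) for row in separate)))   # 5
```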

List size

To find the length of a list, use the built-in len() function.

>>> len([1,2,3])
3
>>> a = [1,2,3,4]
>>> len( a )
4

Combining lists

Lists can be combined in several ways. The easiest is just to 'add' them. For instance:

>>> [1,2] + [3,4]
[1, 2, 3, 4]

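Another way to combine lists is with extend, which modifies the list in place and returns None. Because extend is a method call rather than a statement, it can also be used where only expressions are allowed, such as inside a lambda.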

>>> a = [1,2,3]
>>> b = [4,5,6]
>>> a.extend(b)
>>> print a
[1, 2, 3, 4, 5, 6]

To add a single value to the end of a list, use append. For example:

>>> p=[1,2]
>>> p.append([3,4])
>>> p
[1, 2, [3, 4]]
>>> # or
>>> print p
[1, 2, [3, 4]]

Note, however, that [3,4] was added as a single nested element; its items were not merged into the list. append always adds exactly one element to the end of a list. So if the intention is to concatenate two lists, always use extend.

Getting pieces of lists (slices)

Continuous slices

Like strings, lists can be indexed and sliced:

>>> list = [2, 4, "usurp", 9.0, "n"]
>>> list[2]
'usurp'
>>> list[3:]
[9.0, 'n']

Much like the slice of a string is a substring, the slice of a list is a list. However, lists differ from strings in that we can assign new values to the items in a list:

>>> list[1] = 17
>>> list
[2, 17, 'usurp', 9.0, 'n']

We can assign new values to slices of the lists, which don't even have to be the same length:

>>> list[1:4] = ["opportunistic", "elk"]
>>> list
[2, 'opportunistic', 'elk', 'n']

It's even possible to insert items at the start of a list by assigning to an empty slice:

>>> list[:0] = [3.14, 2.71]
>>> list
[3.14, 2.71, 2, 'opportunistic', 'elk', 'n']

Similarly, you can append to the end of the list by specifying an empty slice after the end:

>>> list[len(list):] = ['four', 'score']
>>> list
[3.14, 2.71, 2, 'opportunistic', 'elk', 'n', 'four', 'score']

You can also completely change the contents of a list:

>>> list[:] = ['new', 'list', 'contents']
>>> list
['new', 'list', 'contents']

The right-hand side of a list assignment statement can be any iterable type:

>>> list[:2] = ('element',('t',),[])
>>> list
['element', ('t',), [], 'contents']

With slicing you can create a copy of a list, since a slice returns a new list:

>>> original = [1, 'element', []]
>>> list_copy = original[:]
>>> list_copy
[1, 'element', []]
>>> list_copy.append('new element')
>>> list_copy
[1, 'element', [], 'new element']
>>> original
[1, 'element', []]

Note, however, that this is a shallow copy and contains references to elements from the original list, so be careful with mutable types:

>>> list_copy[2].append('something')
>>> original
[1, 'element', ['something']]

Non-Continuous slices

It is also possible to get non-continuous parts of a list. To get every n-th element of a list, add a step to the slice. The syntax is a:b:n, where a and b are the start and end of the slice to be operated upon and n is the step.

>>> list = [i for i in range(10) ]
>>> list
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list[::2]
[0, 2, 4, 6, 8]
>>> list[1:7:2]
[1, 3, 5]
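A few more step slices, sketched in Python 3 syntax. The negative step, which walks through the sequence backwards, goes slightly beyond the text above and is included as an aside; the same syntax also works on strings.

```python
numbers = list(range(10))

print(numbers[1::3])      # [1, 4, 7]
print(numbers[::-1])      # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
print(numbers[7:1:-2])    # [7, 5, 3]
print('Xanadu'[::2])      # Xnd
```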

Comparing lists

Lists can be compared for equality.

>>> [1,2] == [1,2]
True
>>> [1,2] == [3,4]
False

Lists can be compared using a less-than operator, which uses lexicographical order:

>>> [1,2] < [2,1]
True
>>> [2,2] < [2,1]
False
>>> ["a","b"] < ["b","a"]
True
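Lexicographical order also covers lists of different lengths: items are compared pairwise from the left, and if one list runs out first, the shorter list is the smaller one. A brief sketch in Python 3 syntax:

```python
print([1, 2] < [1, 2, 3])    # True: a proper prefix sorts first
print([1, 3] < [1, 2, 3])    # False: decided at the second item (3 > 2)
print([] < [0])              # True: the empty list is smaller than any other

# Sorting a list of lists uses the same ordering.
print(sorted([[2, 1], [1, 2, 3], [1, 2]]))   # [[1, 2], [1, 2, 3], [2, 1]]
```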

Sorting lists

Sorting at a glance:

list1 = [2, 3, 1, 'a', 'B']
list1.sort()                                   # list1 gets modified, case sensitive
list2 = sorted(list1)                          # list1 is unmodified; since Python 2.4
list3 = sorted(list1, key=lambda x: str(x).lower()) # case insensitive; str() is needed because not all elements are strings
list4 = sorted(list1, reverse=True)            # Reverse sorting order: descending
print list1, list2, list3, list4

Sorting lists is easy with a sort method.

>>> list1 = [2, 3, 1, 'a', 'b']
>>> list1.sort()
>>> list1
[1, 2, 3, 'a', 'b']

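Note that the list is sorted in place, and the sort() method returns None to emphasize this side effect. Sorting a list that mixes numbers and strings, as above, works only in Python 2; Python 3 raises a TypeError because the two types cannot be ordered relative to each other.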

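If you use Python 2.4 or higher, sort accepts some additional parameters:

  • sort(cmp,key,reverse)
    • cmp : comparison function taking two elements and returning a negative number, zero, or a positive number (Python 2 only; removed in Python 3)
    • key : function called on each element; the list is sorted by the function's return values
    • reverse : sort(reverse=True) sorts in descending order, sort(reverse=False) (the default) in ascending order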

Python also includes a sorted() function.

>>> list1 = [5, 2, 3, 'q', 'p']
>>> sorted(list1)
[2, 3, 5, 'p', 'q']
>>> list1
[5, 2, 3, 'q', 'p']

Note that unlike the sort() method, sorted(list) does not sort the list in place, but instead returns the sorted list. The sorted() function, like the sort() method also accepts the reverse parameter.
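The key and reverse parameters can be sketched as follows, in Python 3 syntax. The word list is an arbitrary example containing only strings, so it sorts the same way in Python 2 and Python 3.

```python
words = ['kiwi', 'banana', 'Fig']

print(sorted(words))                         # ['Fig', 'banana', 'kiwi']
print(sorted(words, key=str.lower))          # ['banana', 'Fig', 'kiwi']
print(sorted(words, key=len, reverse=True))  # ['banana', 'kiwi', 'Fig']

# sorted() leaves the original untouched; sort() changes it and returns None.
print(words)            # ['kiwi', 'banana', 'Fig']
result = words.sort()
print(result)           # None
print(words)            # ['Fig', 'banana', 'kiwi']
```

By default, uppercase letters sort before lowercase ones, which is why 'Fig' comes first unless a case-insensitive key is supplied.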

Iteration

Iteration over lists:

Read-only iteration over a list, AKA for each element of the list:

list1 = [1, 2, 3, 4]
for item in list1:
  print item

Writable iteration over a list:

list1 = [1, 2, 3, 4]
for i in range(0, len(list1)):
  list1[i]+=1 # Modify the item at an index as you see fit
print list1

From a number to a number with a step:

for i in range(1, 13+1, 3): # For i=1 to 13 step 3
  print i
for i in range(10, 5-1, -1): # For i=10 to 5 step -1
  print i

For each element of a list satisfying a condition (filtering), where condition stands for any function that returns True or False:

for item in list1:
  if not condition(item):
    continue
  print item

See also Python Programming/Loops.

Removing

Removing aka deleting an item at an index (see also pop(i) below):

list1 = [1, 2, 3, 4]
list1.pop() # Remove the last item
list1.pop(0) # Remove the first item , which is the item at index 0
print list1

list1 = [1, 2, 3, 4]
del list1[1] # Remove the 2nd element; an alternative to list1.pop(1)
print list1

Removing an element by value:

list1 = ["a", "a", "b"]
list1.remove("a") # Removes only the 1st occurrence of "a"
print list1

Keeping only items in a list satisfying a condition, and thus removing the items that do not satisfy it:

list1 = [1, 2, 3, 4]
newlist = [item for item in list1 if item > 2]
print newlist

This uses a list comprehension.

Removing items failing a condition can be done without losing the identity of the list being made shorter, by using "[:]":

list1 = [1, 2, 3, 4]
sameList = list1
list1[:] = [item for item in list1 if item > 2]
print sameList, sameList is list1

Removing items failing a condition can be done by having the condition in a separate function:

list1 = [1, 2, 3, 4]
def keepingCondition(item):
  return item > 2
sameList = list1
list1[:] = [item for item in list1 if keepingCondition(item)]
print sameList, sameList is list1

Removing items while iterating a list usually leads to unintended outcomes unless you do it carefully by using an index:

list1 = [1, 2, 3, 4]
index = len(list1)
while index > 0:
  index -= 1
  if not list1[index] < 2:
    list1.pop(index)
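The "unintended outcomes" are easy to reproduce. The sketch below, in Python 3 syntax, first shows the pitfall and then one simple way around it, iterating over a copy of the list; the variable names are illustrative.

```python
# Pitfall: removing items while looping over the same list skips elements.
broken = [1, 2, 3, 4]
for item in broken:
    if item >= 2:
        broken.remove(item)
print(broken)    # [1, 3]: the 3 was skipped when the list shifted left

# Iterating over a copy (broken[:] style) keeps the loop stable.
fixed = [1, 2, 3, 4]
for item in fixed[:]:
    if item >= 2:
        fixed.remove(item)
print(fixed)     # [1]
```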

Aggregates

There are some built-in functions for arithmetic aggregates over lists. These include minimum, maximum, and sum:

list = [1, 2, 3, 4]
print max(list), min(list), sum(list)
average = sum(list) / float(len(list)) # Provided the list is non-empty
# The float above ensures the division is a float one rather than integer one.
print average

The max and min functions also apply to lists of strings, returning maximum and minimum with respect to alphabetical order:

list = ["aa", "ab"]
print max(list), min(list) # Prints "ab aa"

Copying

Copying AKA cloning of lists:

Making a shallow copy:

list1= [1, 'element']
list2 = list1[:] # Copy using "[:]"
list2[0] = 2 # Only affects list2, not list1
print list1[0] # Displays 1

# By contrast
list1 = [1, 'element']
list2 = list1
list2[0] = 2 # Modifies the original list
print list1[0] # Displays 2

The above does not make a deep copy, which has the following consequence:

list1 = [1, [2, 3]] # Notice the second item being a nested list
list2 = list1[:] # A shallow copy
list2[1][0] = 4 # Modifies the 2nd item of list1 as well
print list1[1][0] # Displays 4 rather than 2

Making a deep copy:

import copy
list1 = [1, [2, 3]] # Notice the second item being a nested list
list2 = copy.deepcopy(list1) # A deep copy
list2[1][0] = 4 # Leaves the 2nd item of list1 unmodified
print list1[1][0] # Displays 2

See also Continuous slices above.

Clearing

Clearing a list:

del list1[:] # Clear a list
list1 = []   # Not really clear but rather assign to a new empty list

Clearing using a proper approach makes a difference when the list is passed as an argument:

def workingClear(ilist):
  del ilist[:]
def brokenClear(ilist):
  ilist = [] # Lets ilist point to a new list, losing the reference to the argument list
list1=[1, 2]; workingClear(list1); print list1
list1=[1, 2]; brokenClear(list1); print list1
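The outcome of the two approaches can be sketched as follows, in Python 3 syntax. The functions mirror workingClear and brokenClear above under slightly different names.

```python
def working_clear(ilist):
    del ilist[:]      # empties the caller's list object in place

def broken_clear(ilist):
    ilist = []        # only rebinds the local name; the caller's list is untouched

list1 = [1, 2]
working_clear(list1)
print(list1)          # []

list2 = [1, 2]
broken_clear(list2)
print(list2)          # [1, 2]
```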

Keywords: emptying a list, erasing a list, clear a list, empty a list, erase a list.

Removing duplicate items

Removing duplicate items from a list (keeping only unique items) can be achieved as follows.

If each item in the list is hashable, using list comprehension, which is fast:

list1 = [1, 4, 4, 5, 3, 2, 3, 2, 1]
seen = {}
list1[:] = [seen.setdefault(e, e) for e in list1 if e not in seen]

If each item in the list is hashable, using index iteration, much slower:

list1 = [1, 4, 4, 5, 3, 2, 3, 2, 1]
seen = set()
for i in range(len(list1) - 1, -1, -1):
  if list1[i] in seen:
    list1.pop(i)
  else:
    seen.add(list1[i])

If some items are not hashable, the set of visited items can be kept in a list:

list1 = [1, 4, 4, ["a", "b"], 5, ["a", "b"], 3, 2, 3, 2, 1]
seen = []
for i in range(len(list1) - 1, -1, -1):
  if list1[i] in seen:
    list1.pop(i)
  else:
    seen.append(list1[i])

If each item in the list is hashable and preserving element order does not matter:

list1 = [1, 4, 4, 5, 3, 2, 3, 2, 1]
list1[:] = list(set(list1))  # Modify list1
list2 = list(set(list1))

In the above examples where index iteration is used, scanning happens from the end to the beginning. When these are rewritten to scan from the beginning to the end, the result seems hugely slower.
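Note that scanning from the end keeps the last occurrence of each item. If the first occurrence should be kept instead, one possible approach is to build a new list while remembering what has been seen. This is a sketch in Python 3 syntax that assumes hashable items; unique_in_order is a hypothetical helper name.

```python
def unique_in_order(items):
    """Return a new list without duplicates, keeping first occurrences."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result

list1 = [1, 4, 4, 5, 3, 2, 3, 2, 1]
print(unique_in_order(list1))   # [1, 4, 5, 3, 2]
```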

List methods

append(x)

Add item x onto the end of the list.

>>> list = [1, 2, 3]
>>> list.append(4)
>>> list
[1, 2, 3, 4]

See pop(i)

pop(i)

Remove the item in the list at the index i and return it. If i is not given, remove the last item in the list and return it.

>>> list = [1, 2, 3, 4]
>>> a = list.pop(0)
>>> list
[2, 3, 4]
>>> a
1
>>> b = list.pop()
>>> list
[2, 3]
>>> b
4

Operators

+

Concatenates two lists.

*

Repeats a list a given number of times.

in

The operator 'in' is used for two purposes: either to iterate over every item in a list in a for loop, or to check whether a value is in a list, returning True or False.

>>> list = [1, 2, 3, 4]
>>> if 3 in list:
...     print "found"
...
found
>>> l = [0, 1, 2, 3, 4]
>>> 3 in l
True
>>> 18 in l
False
>>> for x in l:
...     print x
...
0
1
2
3
4

Difference

To get the difference between two lists, just iterate:

a = [0, 1, 2, 3, 4, 4]
b = [1, 2, 3, 4, 4, 5]
print [item for item in a if item not in b]
# [0]

Intersection

To get the intersection of two lists (preserving the order of the elements and their duplicates), compute the difference and then remove it from the first list:

a = [0, 1, 2, 3, 4, 4]
b = [1, 2, 3, 4, 4, 5]
dif = [item for item in a if item not in b]
print [item for item in a if item not in dif]
# [1, 2, 3, 4, 4]
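For long lists, the repeated "item not in b" test scans b every time. A common alternative, sketched here in Python 3 syntax under the assumption that all items are hashable, is to convert b to a set once and test membership against that; b_items is an illustrative name.

```python
a = [0, 1, 2, 3, 4, 4]
b = [1, 2, 3, 4, 4, 5]

b_items = set(b)    # membership tests on a set do not scan the whole list

difference = [item for item in a if item not in b_items]
intersection = [item for item in a if item in b_items]

print(difference)      # [0]
print(intersection)    # [1, 2, 3, 4, 4]
```

The results keep the order and duplicates of a, just like the list-only versions above.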

Exercises

  1. Use a list comprehension to construct the list ['ab', 'ac', 'ad', 'bb', 'bc', 'bd'].
  2. Use a slice on the above list to construct the list ['ab', 'ad', 'bc'].
  3. Use a list comprehension to construct the list ['1a', '2a', '3a', '4a'].
  4. Simultaneously remove the element '2a' from the above list and print it.
  5. Copy the above list and add '2a' back into the list such that the original is still missing it.
  6. Use a list comprehension to construct the list ['abe', 'abf', 'ace', 'acf', 'ade', 'adf', 'bbe', 'bbf', 'bce', 'bcf', 'bde', 'bdf']


Solutions

Question 1 :

List1 = [a + b for a in 'ab' for b in 'bcd']
print(List1)
>>> ['ab', 'ac', 'ad', 'bb', 'bc', 'bd']

Question 2 :

List2 = List1[::2]
print(List2)
>>> ['ab', 'ad', 'bc']

Question 3 :

List3 = [a + b for a in '1234' for b in 'a']
print(List3)
>>> ['1a', '2a', '3a', '4a']

Question 4 :

print(List3.pop(List3.index('2a')))
print(List3)
>>> 2a
>>> ['1a', '3a', '4a']

Question 5 :

List4 = List3[:]
List4.insert(1, '2a')
print(List4)
>>> ['1a', '2a', '3a', '4a']

Question 6 :

List5 = [a + b + c for a in 'ab' for b in 'bcd' for c in 'ef']
print(List5)
>>> ['abe', 'abf', 'ace', 'acf', 'ade', 'adf', 'bbe', 'bbf', 'bce', 'bcf', 'bde', 'bdf']



Tuples


A tuple in Python is much like a list except that it is immutable (unchangeable) once created. A tuple of hashable objects is hashable and thus suitable as a key in a dictionary and as a member of a set.

Overview

Tuples in Python at a glance:

tup1  = (1, 'a')
tup2  = 1, 'a'                # Parentheses not needed
tup3  = (1,)                  # Singleton
tup4  = 1,                    # Singleton without parentheses
tup5  = ()                    # Empty tuple
list1 = [1, 'a']
it1, it2 = tup1               # Assign items
print tup1 == tup2            # True
print tup1 == list1           # False
print tup1 == tuple(list1)    # True
print list(tup1) == list1     # True
print tup1[0]                 # First member
for item in tup1: print item  # Iteration
print (1, 2) + (3, 4)         # (1, 2, 3, 4)
tup1 += (3,)                  # Rebinds tup1 to a new, longer tuple
print tup1                    # (1, 'a', 3); the original tuple was not modified
print len(tup1)               # Length AKA size AKA item count
print 3 in tup1               # Membership - true
def myfun(): return 1, 2      # Return multiple values
r1, r2 = myfun()              # Receive multiple values
tup6 = ([1,2],)
tup6[0][0]=3
print tup6                    # The list within is mutable
set1 = set( [(1,2)] )         # A tuple can be placed into a set
#set1 = set( [([1,2], 2)] )   # Error: The list within makes the tuple unhashable

Tuple notation

Tuples may be created directly or converted from lists. Generally, tuples are enclosed in parentheses.

>>> l = [1, 'a', [6, 3.14]]
>>> t = (1, 'a', [6, 3.14])
>>> t
(1, 'a', [6, 3.14])
>>> tuple(l)
(1, 'a', [6, 3.14])
>>> t == tuple(l)
True
>>> t == l
False

A one item tuple is created by an item in parentheses followed by a comma:

>>> t = ('A single item tuple',)
>>> t
('A single item tuple',)

Also, tuples will be created from items separated by commas.

>>> t = 'A', 'tuple', 'needs', 'no', 'parens'
>>> t
('A', 'tuple', 'needs', 'no', 'parens')

Packing and Unpacking

You can also perform multiple assignment using tuples.

>>> article, noun, verb, adjective, direct_object = t   #t is defined above
>>> noun
'tuple'

Note that either side, or both sides, of an assignment operator can consist of tuples.

>>> a, b = 1, 2
>>> b
2

The example above: article, noun, verb, adjective, direct_object = t is called "tuple unpacking" because the tuple t was unpacked and its values assigned to each of the variables on the left. "Tuple packing" is the reverse: t=article, noun, verb, adjective, direct_object. When unpacking a tuple, or performing multiple assignment, you must have the same number of variables being assigned to as values being assigned.
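The packing and unpacking rules described above can be sketched as follows, in Python 3 syntax. The swap and the deliberately mismatched assignment are additions for illustration; the exact wording of the error message varies between Python versions, so it is only stored here, not shown.

```python
t = 'A', 'tuple', 'needs', 'no', 'parens'   # packing

# Unpacking: five names on the left, five values in t.
article, noun, verb, adjective, direct_object = t

# Packing and unpacking in one statement swaps two variables.
a, b = 1, 2
a, b = b, a

# A mismatch between the number of names and values raises ValueError.
try:
    first, second = t
except ValueError as error:
    mismatch = str(error)

print(noun, a, b)
print(mismatch)
```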

Operations on tuples

These are the same as for lists except that we may not assign to indices or slices, and there is no "append" method.

>>> a = (1, 2)
>>> b = (3, 4)
>>> a + b
(1, 2, 3, 4)
>>> a
(1, 2)
>>> b
(3, 4)
>>> a.append(3)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: 'tuple' object has no attribute 'append'
>>> a
(1, 2)
>>> a[0] = 0
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: object does not support item assignment
>>> a
(1, 2)

For lists we would have had:

>>> a = [1, 2]
>>> b = [3, 4]
>>> a + b
[1, 2, 3, 4]
>>> a
[1, 2]
>>> b
[3, 4]
>>> a.append(3)
>>> a
[1, 2, 3]
>>> a[0] = 0
>>> a
[0, 2, 3]

Tuple Attributes

Length: Finding the length of a tuple is the same as with lists; use the built-in len() function.

>>> len( ( 1, 2, 3) )
3
>>> a = ( 1, 2, 3, 4 )
>>> len( a )
4

Conversions

Convert a list to a tuple using the built-in tuple() function.

>>> l = [4, 5, 6]
>>> tuple(l)
(4, 5, 6)

Convert a tuple to a list using the built-in list() function:

>>> t = (4, 5, 6)
>>> list(t)
[4, 5, 6]

Dictionaries can also be converted to tuples of tuples using the items method of dictionaries:

>>> d = {'a': 1, 'b': 2}
>>> tuple(d.items())
(('a', 1), ('b', 2))

Uses of Tuples

Tuples can be used in place of lists where the number of items is known and small, for example when returning multiple values from a function. Many other languages require creating an object or container to return, but with Python's tuple assignment, multiple-value returns are easy:

def func(x, y):
    # code to compute x and y
    return x, y

This resulting tuple can be easily unpacked with the tuple assignment technique explained above:

x, y = func(1, 2)

Using List Comprehension to process Tuple elements

Occasionally, there is a need to manipulate the values contained within a tuple in order to create a new tuple. For example, if we wanted a way to double all of the values within a tuple, we can combine some of the above information in addition to list comprehension like this:

def double(T):
    'double() - return a tuple with each tuple element (e) doubled.'
    return tuple( [ e * 2 for e in T ] )
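As a usage sketch in Python 3 syntax, the function above can be called as shown below. The variant double_gen is a hypothetical name introduced for this example; it passes a generator expression straight to tuple(), which avoids building the intermediate list.

```python
def double(T):
    'double() - return a tuple with each tuple element (e) doubled.'
    return tuple([e * 2 for e in T])

def double_gen(T):
    'Same result as double(), without the intermediate list.'
    return tuple(e * 2 for e in T)

print(double((1, 2, 3)))        # (2, 4, 6)
print(double_gen((1, 2, 3)))    # (2, 4, 6)
print(double(('ab', 1.5)))      # ('abab', 3.0)
```

The last call shows that * 2 follows each element's own rules: strings are repeated, numbers are multiplied.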

Exercises

  1. Create the list ['a', 'b', 'c'], then create a tuple from that list.
  2. Create the tuple ('a', 'b', 'c'), then create a list from that tuple. (Hint: the material needed to do this has been covered, but it's not entirely obvious)
  3. Make the following instantiations simultaneously: a = 'a', b=2, c='gamma'. (That is, in one line of code).
  4. Create a tuple containing just a single element which in turn contains the three elements 'a', 'b', and 'c'. Verify that the length is actually 1 by using the len() function.



Dictionaries


A dictionary in Python is a collection of values accessed by key rather than by index. Dictionaries were traditionally unordered; since Python 3.7 they preserve the order in which keys were inserted. The keys have to be hashable: integers, floating point numbers, strings, tuples of hashable items, and frozensets are hashable, while lists, dictionaries, and sets other than frozensets are not. Dictionaries were available as early as in Python 1.4.

Overview

Dictionaries in Python at a glance:

dict1 = {}                     # Create an empty dictionary
dict2 = dict()                 # Create an empty dictionary, alternative form
dict2 = {"r": 34, "i": 56}     # Initialize to a non-empty value
dict3 = dict([("r", 34), ("i", 56)])      # Initialize from a list of tuples
dict4 = dict(r=34, i=56)       # Initialize from keyword arguments
dict1["temperature"] = 32      # Assign value to a key
if "temperature" in dict1:     # Membership test of a key AKA key exists
  del dict1["temperature"]     # Delete AKA remove
equalbyvalue = dict2 == dict3
itemcount2 = len(dict2)        # Length AKA size AKA item count
isempty2 = len(dict2) == 0     # Emptiness test
for key in dict2:              # Iterate via keys
  print (key, dict2[key])        # Print key and the associated value
  dict2[key] += 10             # Modify-access to the key-value pair
for key in sorted(dict2):      # Iterate via keys in sorted order of the keys
  print (key, dict2[key])        # Print key and the associated value
for value in dict2.values():   # Iterate via values
  print (value)
for key, value in dict2.items(): # Iterate via pairs
  print (key, value)
dict5 = {x: dict2[x] + 1 for x in dict2} # Dictionary comprehension; Python 2.7 or later
dict6 = dict2.copy()             # A shallow copy
dict6.update({"i": 60, "j": 30}) # Add or overwrite; a bit like list's extend
dict7 = dict2.copy()
dict7.clear()                  # Clear AKA empty AKA erase
sixty = dict6.pop("i")         # Remove key i, returning its value
print (dict1, dict2, dict3, dict4, dict5, dict6, dict7, equalbyvalue, itemcount2, sixty)

Dictionary notation

Dictionaries may be created directly or converted from sequences. Dictionaries are enclosed in curly braces, {}

>>> d = {'city':'Paris', 'age':38, (102,1650,1601):'A matrix coordinate'}
>>> seq = [('city','Paris'), ('age', 38), ((102,1650,1601),'A matrix coordinate')]
>>> d
{'city': 'Paris', 'age': 38, (102, 1650, 1601): 'A matrix coordinate'}
>>> dict(seq)
{'city': 'Paris', 'age': 38, (102, 1650, 1601): 'A matrix coordinate'}
>>> d == dict(seq)
True

Also, dictionaries can be easily created by zipping two sequences.

>>> seq1 = ('a','b','c','d')
>>> seq2 = [1,2,3,4]
>>> d = dict(zip(seq1,seq2))
>>> d
{'a': 1, 'c': 3, 'b': 2, 'd': 4}

Operations on Dictionaries

Dictionaries support a different set of operations from sequences. Slicing is not supported, since items are looked up by key rather than by position.

>>> d = {'a':1,'b':2, 'cat':'Fluffers'}
>>> d.keys()
['a', 'b', 'cat']
>>> d.values()
[1, 2, 'Fluffers']
>>> d['a']
1
>>> d['cat'] = 'Mr. Whiskers'
>>> d['cat']
'Mr. Whiskers'
>>> 'cat' in d
True
>>> 'dog' in d
False

Combining two Dictionaries

You can combine two dictionaries by using the update method of the primary dictionary. Note that where a key exists in both dictionaries, update overwrites the existing value with the new one.

>>> d = {'apples': 1, 'oranges': 3, 'pears': 2}
>>> ud = {'pears': 4, 'grapes': 5, 'lemons': 6}
>>> d.update(ud)
>>> d
{'grapes': 5, 'pears': 4, 'lemons': 6, 'apples': 1, 'oranges': 3}

Deleting from dictionary

del dictionaryName[membername]
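A short sketch of the del statement in use, in Python 3 syntax. The dictionary contents are invented for this example, and the pop method is shown alongside as a related way to remove a key.

```python
stock = {'apples': 1, 'oranges': 3, 'pears': 2}

del stock['apples']                   # remove the key and its value
print(stock)

# Deleting a key that is not present raises KeyError.
try:
    del stock['grapes']
except KeyError:
    print('no such key: grapes')

# pop() removes a key and returns its value; a default avoids the KeyError.
pears = stock.pop('pears')            # 2
missing = stock.pop('grapes', None)   # None, and no exception
print(pears, missing, stock)
```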

Exercises

Write a program that:

  1. Asks the user for a string, then creates the following dictionary. The values are the letters in the string, with the corresponding key being the place in the string. (Hint: see https://docs.python.org/2/tutorial/datastructures.html#looping-techniques)
  2. Replaces the entry whose key is the integer 3, with the value "Pie".
  3. Asks the user for a string of digits, then prints out the values corresponding to those digits.



Sets


Starting with version 2.3, Python comes with an implementation of the mathematical set. Initially this implementation had to be imported from the standard module sets, but with Python 2.4 the types set and frozenset became built-in types. A set is an unordered collection of objects, unlike sequence objects such as lists and tuples, in which each element is indexed. Sets cannot have duplicate members: a given object appears in a set 0 or 1 times. All members of a set have to be hashable, just like dictionary keys. Integers, floating point numbers, tuples, and strings are hashable; dictionaries, lists, and other sets (except frozensets) are not.

Overview

Sets in Python at a glance:

set1 = set()                   # A new empty set
set1.add("cat")                # Add a single member
set1.update(["dog", "mouse"])  # Add several members, like list's extend
set1 |= set(["doe", "horse"])  # Add several members 2, like list's extend
if "cat" in set1:              # Membership test
  set1.remove("cat")
#set1.remove("elephant")       # Raises KeyError
set1.discard("elephant")       # No error thrown
print set1
for item in set1:              # Iteration AKA for each element
  print item
print "Item count:", len(set1) # Length AKA size AKA item count
#1stitem = set1[0]             # Error: no indexing for sets
isempty = len(set1) == 0       # Test for emptiness
set1 = {"cat", "dog"}          # Initialize set using braces; since Python 2.7
#set1 = {}                     # Not a set: {} creates an empty dict
set1 = set(["cat", "dog"])     # Initialize set from a list
set2 = set(["dog", "mouse"])
set3 = set1 & set2             # Intersection
set4 = set1 | set2             # Union
set5 = set1 - set3             # Set difference
set6 = set1 ^ set2             # Symmetric difference
issubset = set1 <= set2        # Subset test
issuperset = set1 >= set2      # Superset test
set7 = set1.copy()             # A shallow copy
set7.remove("cat")
print set7.pop()               # Remove an arbitrary element
set8 = set1.copy()
set8.clear()                   # Clear AKA empty AKA erase
set9 = {x for x in range(10) if x % 2} # Set comprehension; since Python 2.7
print set1, set2, set3, set4, set5, set6, set7, set8, set9, issubset, issuperset

Constructing Sets

One way to construct sets is by passing any iterable object to the "set" constructor.

>>> set([0, 1, 2, 3])
set([0, 1, 2, 3])
>>> set("obtuse")
set(['b', 'e', 'o', 's', 'u', 't'])

We can also add elements to sets one by one, using the "add" function.

>>> s = set([12, 26, 54])
>>> s.add(32)
>>> s
set([32, 26, 12, 54])

Note that since a set does not contain duplicate elements, if we add one of the members of s to s again, the add function will have no effect. This same behavior occurs in the "update" function, which adds a group of elements to a set.

>>> s.update([26, 12, 9, 14])
>>> s
set([32, 9, 12, 14, 54, 26])

Note that you can give any iterable, or even another set, to the update function, regardless of what structure was used to initialize the set.

Sets also provide a "copy" method. Remember, however, that it makes a shallow copy: the set is copied, but not the individual elements.

>>> s2 = s.copy()
>>> s2
set([32, 9, 12, 14, 54, 26])

Membership Testing

We can check if an object is in the set using the same "in" operator as with sequential data types.

>>> 32 in s
True
>>> 6 in s
False
>>> 6 not in s
True

We can also test the membership of entire sets. Given two sets s1 and s2, we check if s1 is a subset or a superset of s2.

>>> s.issubset(set([32, 8, 9, 12, 14, -4, 54, 26, 19]))
True
>>> s.issuperset(set([9, 12]))
True

Note that "issubset" and "issuperset" can also accept sequential data types as arguments:

>>> s.issuperset([32, 9])
True

Note that the <= and >= operators also express the issubset and issuperset functions respectively.

>>> set([4, 5, 7]) <= set([4, 5, 7, 9])
True
>>> set([9, 12, 15]) >= set([9, 12])
True

Like lists, tuples, and strings, we can use the "len" function to find the number of items in a set.

Removing Items

There are three functions which remove individual items from a set, called pop, remove, and discard. The first, pop, simply removes an item from the set. Note that there is no defined behavior as to which element it chooses to remove.

>>> s = set([1,2,3,4,5,6])
>>> s.pop()
1
>>> s
set([2,3,4,5,6])

We also have the "remove" function to remove a specified element.

>>> s.remove(3)
>>> s
set([2,4,5,6])

However, removing an item which isn't in the set causes an error.

>>> s.remove(9)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: 9

If you wish to avoid this error, use "discard". It has the same functionality as remove, but will simply do nothing if the element isn't in the set.
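The difference between the two can be sketched as follows, in Python 3 syntax. The set contents are chosen only for illustration.

```python
s = set([2, 4, 5, 6])

s.discard(4)       # 4 is present, so it is removed
s.discard(9)       # 9 is absent; nothing happens and no error is raised
print(s == set([2, 5, 6]))   # True

try:
    s.remove(9)    # remove() raises KeyError for a missing element
except KeyError:
    print("9 is not in the set")
```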

We also have another operation for removing elements from a set, clear, which simply removes all elements from the set.

>>> s.clear()
>>> s
set([])

Iteration Over Sets

We can also have a loop move over each of the items in a set. However, since sets are unordered, it is undefined which order the iteration will follow.

>>> s = set("blerg")
>>> for n in s:
...     print n,
...
r b e l g

Set Operations

Python allows us to perform all the standard mathematical set operations, using methods of set. Note that each of these set operations has several forms. One of these forms, s1.function(s2), will return another set which is created by "function" applied to s1 and s2. The other form, s1.function_update(s2), will change s1 to be the set created by "function" of s1 and s2. Finally, some functions have equivalent special operators. For example, s1 & s2 is equivalent to s1.intersection(s2).

Intersection

Any element which is in both s1 and s2 will appear in their intersection.

>>> s1 = set([4, 6, 9])
>>> s2 = set([1, 6, 8])
>>> s1.intersection(s2)
set([6])
>>> s1 & s2
set([6])
>>> s1.intersection_update(s2)
>>> s1
set([6])

Union

The union is the merger of two sets. Any element in s1 or s2 will appear in their union.

>>> s1 = set([4, 6, 9])
>>> s2 = set([1, 6, 8])
>>> s1.union(s2)
set([1, 4, 6, 8, 9])
>>> s1 | s2
set([1, 4, 6, 8, 9])

Note that the in-place counterpart of union is the "update" function described above.

Symmetric Difference

The symmetric difference of two sets is the set of elements which are in one of either set, but not in both.

>>> s1 = set([4, 6, 9])
>>> s2 = set([1, 6, 8])
>>> s1.symmetric_difference(s2)
set([8, 1, 4, 9])
>>> s1 ^ s2
set([8, 1, 4, 9])
>>> s1.symmetric_difference_update(s2)
>>> s1
set([8, 1, 4, 9])

Set Difference

Python can also find the set difference of s1 and s2, which is the elements that are in s1 but not in s2.

>>> s1 = set([4, 6, 9])
>>> s2 = set([1, 6, 8])
>>> s1.difference(s2)
set([9, 4])
>>> s1 - s2
set([9, 4])
>>> s1.difference_update(s2)
>>> s1
set([9, 4])

Multiple sets

Starting with Python 2.6, "union", "intersection", and "difference" can work with multiple inputs. They can also be called on the set type itself, for example using "set.intersection()":

>>> s1 = set([3, 6, 7, 9])
>>> s2 = set([6, 7, 9, 10])
>>> s3 = set([7, 9, 10, 11])
>>> set.intersection(s1, s2, s3)
set([9, 7])

frozenset

A frozenset is basically the same as a set, except that it is immutable - once it is created, its members cannot be changed. Since they are immutable, they are also hashable, which means that frozensets can be used as members in other sets and as dictionary keys. frozensets have the same functions as normal sets, except none of the functions that change the contents (update, remove, pop, etc.) are available.

>>> fs = frozenset([2, 3, 4])
>>> s1 = set([fs, 4, 5, 6])
>>> s1
set([4, frozenset([2, 3, 4]), 6, 5])
>>> fs.intersection(s1)
frozenset([4])
>>> fs.add(6)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'frozenset' object has no attribute 'add'

Exercises

  1. Create the set {'cat', 1, 2, 3}, call it s.
  2. Create the set {'c', 'a', 't', '1', '2', '3'}.
  3. Create the frozen set {'cat', 1, 2, 3}, call it fs.
  4. Create a set containing the frozenset fs, it should look like {frozenset({'cat', 2, 3, 1})}.




Basic Math


Now that we know how to work with numbers and strings, let's write a program that might actually be useful! Let's say you want to find out how much you weigh in stone. A concise program can make short work of this task. Since a stone is 14 pounds, and there are about 2.2 pounds in a kilogram, the following formula should do the trick:

$$ m_{\text{stone}} = \frac{m_{\text{kg}} \times 2.2}{14} $$

So, let's turn this formula into a program!

mass_kg = int(input("What is your mass in kilograms?" ))
mass_stone = mass_kg * 2.2 / 14
print("You weigh", mass_stone, "stone.")

Run this program and get your weight in stone! Notice that applying the formula was as simple as putting in a few mathematical statements:

mass_stone = mass_kg * 2.2 / 14

Mathematical Operators

Here are some commonly used mathematical operators

Syntax | Operation Name
a+b | addition
a-b | subtraction
a*b | multiplication
a/b | division (see note below)
a//b | floor division (e.g. 5//2=2) - Available in Python 2.2 and later
a%b | modulo
-a | negation
abs(a) | absolute value
a**b | exponent
math.sqrt(a) | square root

Beware that due to the limitations of floating point arithmetic, rounding errors can cause unexpected results. For example:

 >>> print(0.6/0.2)
 3.0
 >>> print(0.6//0.2)
 2.0

For the Python 2.x series, / does "floor division" for integers and longs (e.g. 5/2=2) but "true division" for floats and complex (e.g. 5.0/2.0=2.5). For Python 3.x, / does "true division" for all types.[1][2]

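If you want floor division, use the // operator rather than wrapping / in round(): in Python 3, round() rounds exact halves to the nearest even integer (round(0.5) is 0, while round(1.5) is 2), which is documented behavior rather than an error.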

The operator // always performs floor division, which yields a quotient term (obtained from the // operator) and a remainder term (obtained from the % operator). In the previous example we have seen that the quotient term 0.6 // 0.2 is 2.0, which can be verified by extending the above example:

 >>> 0.6 == 0.2 * ( 0.6 // 0.2 ) + 0.6 % 0.2
 True
 >>> 0.6 // 0.2
 2.0
 >>> 0.6 % 0.2
 0.19999999999999996

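The difference between the operations / and // when applied to floating point numbers is due to the way such numbers are stored in binary and rounded. (The session below is from Python 2, whose print shows floats to 12 significant digits; in recent Python 3 versions, print(0.6 / 0.2) shows 2.9999999999999996.)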

 >>> print(0.6 / 0.2)
 3.0
 >>> 0.6 / 0.2
 2.9999999999999996
 >>> 2.0 + ( 0.6 % 0.2 ) / 0.2
 3.0
 >>> 0.6 / 0.2 == ( 0.6 // 0.2 ) + ( 0.6 % 0.2 ) / 0.2
 False
 >>> round( 0.6 / 0.2 ) == ( 0.6 // 0.2 ) + ( 0.6 % 0.2 ) / 0.2
 True
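When results like these matter, two common approaches are to compare floats with a tolerance or to use exact rational arithmetic. The following is a sketch assuming Python 3.5 or later (for math.isclose); it shows one option among several, not a required technique.

```python
import math
from fractions import Fraction

quotient = 0.6 / 0.2

# Compare with a tolerance instead of testing floats for exact equality.
print(quotient == 3.0)               # False
print(math.isclose(quotient, 3.0))   # True

# Fractions store exact ratios of integers, so no rounding error appears.
exact = Fraction(6, 10) / Fraction(2, 10)
print(exact == 3)                            # True
print(Fraction(6, 10) // Fraction(2, 10))    # 3
```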

Order of Operations

Python uses the standard order of operations as taught in Algebra and Geometry classes at high school or secondary school. That is, mathematical expressions are evaluated in the following order (memorized by many as PEMDAS), which is also applied to parentheticals.

(Note that operations which share a table row are performed from left to right. That is, a division to the left of a multiplication, with no parentheses between them, is performed before the multiplication simply because it is to the left.)

Name | Syntax | Description | PEMDAS Mnemonic
Parentheses | ( ... ) | Before operating on anything else, Python must evaluate all parentheticals starting at the innermost level. (This includes functions.) | Please
Exponents | ** | As an exponent is simply short multiplication or division, it should be evaluated before them. | Excuse
Multiplication and Division | * / // % | Again, multiplication is rapid addition and must, therefore, happen first. | My Dear
Addition and Subtraction | + - | | Aunt Sally

Formatting output

Wouldn't it be nice if we always worked with nice round numbers while doing math? Unfortunately, the real world is not quite so neat and tidy as we would like it to be. Sometimes, we end up with long, ugly numbers like the following:

What is your mass in kilograms? 65
You weigh 10.2142857143 stone.

By default, Python 2's print statement shows floating point numbers to 12 significant digits. But what if you only want one or two decimal places? We can use the round() function, which rounds a number to the number of decimal places you choose. round() takes two arguments: the number you want to round, and the number of decimal places to round it to. For example:

>>> print (round(3.14159265, 2))
3.14

Now, let's change our program to only print the result to two decimal places.

print ("You weigh", round(mass_stone, 2), "stone.")

This also demonstrates the concept of nesting function calls. As you can see, you can place one function call inside another, and everything will still work exactly the way you would expect. If you don't like this, you can always use multiple variables instead:

twoDecimals = round(mass_stone, 2)
numToString = str(twoDecimals)
print ("You weigh " + numToString + " stone.")
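Another way to control the number of digits is to format the number when it is converted to text, rather than rounding the value itself. The following is a sketch in Python 3 syntax using the built-in format() function and str.format(); mass_stone is set by hand here so that the example stands alone.

```python
mass_stone = 65 * 2.2 / 14

# format() with the ".2f" specification yields a string with two decimal places.
text = format(mass_stone, ".2f")
print("You weigh " + text + " stone.")

# str.format() places the formatted value into a template.
print("You weigh {:.2f} stone.".format(mass_stone))
```

Unlike round(), this always shows exactly two decimal places, so a value such as 10.5 is displayed as 10.50.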

Exercises

  1. Ask the user to specify the number of sides on a polygon and find the number of diagonals within the polygon.
  2. Take the lengths of two sides of a right-angle triangle from the user and apply the Pythagorean Theorem to find the hypotenuse.


Notes

  1. What's New in Python 2.2
  2. PEP 238 -- Changing the Division Operator



Operators


Basics

Python math works like you would expect.

>>> x = 2
>>> y = 3
>>> z = 5
>>> x * y
6
>>> x + y
5
>>> x * y + z
11
>>> (x + y) * z
25

Note that Python adheres to the PEMDAS order of operations.

Powers

There is a built in exponentiation operator **, which can take either integers, floating point or complex numbers. This occupies its proper place in the order of operations.

>>> 2**8
256

Division and Type Conversion

For Python 2.x, dividing two integers or longs uses integer division, also known as "floor division" (applying the floor function after division). So, for example, 5 / 2 is 2. Using "/" to do division this way is deprecated; if you want floor division, use "//" (available in Python 2.2 and later).

"/" does "true division" for floats and complex numbers; for example, 5.0/2.0 is 2.5.

For Python 3.x, "/" does "true division" for all types.[1][2]

Dividing by or into a floating point number (Python never uses the fractional type in such a case) will cause Python to use true division. To coerce an integer to become a float, use float() with the integer as its argument:

>>> x = 5
>>> float(x)
5.0

This can be generalized for other numeric types: int(), complex(), and, in Python 2 only, long().

Beware that due to the limitations of floating point arithmetic, rounding errors can cause unexpected results. For example:

>>> print 0.6/0.2
3.0
>>> print 0.6//0.2
2.0

Modulus

The modulus (remainder of the division of the two operands, rather than the quotient) can be found using the % operator, or by the divmod builtin function. The divmod function returns a tuple containing the quotient and remainder.

>>> 10%7
3
>>> -10%7
4
>>> divmod(10, 7)
(1, 3)

Negation

Unlike some other languages, variables can be negated directly:

>>> x = 5
>>> -x
-5

Comparison

Operation | Means
< | Less than
> | Greater than
<= | Less than or equal to
>= | Greater than or equal to
== | Equal to
!= | Not equal to

Numbers, strings and other types can be compared for equality/inequality and ordering:

>>> 2 == 3
False
>>> 3 == 3
True
>>> 3 == '3'
False
>>> 2 < 3
True
>>> "a" < "aa"
True

Identity

The operators is and is not test for object identity and stand in contrast to == (equals): x is y is true if and only if x and y are references to the same object in memory. x is not y yields the inverse truth value. Note that an identity test is more stringent than an equality test since two distinct objects may have the same value.

>>> [1,2,3] == [1,2,3]
True
>>> [1,2,3] is [1,2,3]
False

For the built-in immutable data types (like int, str and tuple) Python uses caching mechanisms to improve performance, i.e., the interpreter may decide to reuse an existing immutable object instead of generating a new one with the same value. The details of object caching are subject to changes between different Python versions and are not guaranteed to be system-independent, so identity checks on immutable objects like 'hello' is 'hello', (1,2,3) is (1,2,3), 4 is 2**2 may give different results on different machines.

In some Python implementations, the following results are applicable:

print 8 is 8            # True
print "str" is "str"    # True
print (1, 2) is (1, 2)  # False, even though tuples are immutable
print [1, 2] is [1, 2]  # False
print id(8) == id(8)    # True
int1 = 8
print int1 is 8         # True
oldid = id(int1)
int1 += 2
print id(int1) == oldid # False


Augmented Assignment

There is shorthand for assigning the output of an operation to one of the inputs:

>>> x = 2
>>> x # 2
2
>>> x *= 3
>>> x # 2 * 3
6
>>> x += 4
>>> x # 2 * 3 + 4
10
>>> x //= 5
>>> x # (2 * 3 + 4) // 5
2
>>> x **= 2
>>> x # ((2 * 3 + 4) // 5) ** 2
4
>>> x %= 3
>>> x # ((2 * 3 + 4) // 5) ** 2 % 3
1

>>> x = 'repeat this  '
>>> x  # repeat this
'repeat this  '
>>> x *= 3  # fill with x repeated three times
>>> x
'repeat this  repeat this  repeat this  '

Boolean

or

if a or b:
    do_this
else:
    do_that

and

if a and b:
    do_this
else:
    do_that

not

if not a:
    do_this
else:
    do_that

The order of operations here is: not first, and second, or third. In particular, "True or True and False or False" becomes "True or False or False" which is True.

Caution, Boolean operators are valid on things other than Booleans; for instance "1 and 6" will return 6. Specifically, "and" returns either the first value considered to be false, or the last value if all are considered true. "or" returns the first true value, or the last value if all are considered false.
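The rules for the values returned by "and" and "or" can be checked with a few assignments. This is a sketch in Python 3 syntax; the variable names are made up for the example, and the last line additionally illustrates that evaluation stops as soon as the result is known.

```python
all_true = 1 and 6            # 6: every operand is true, so the last one is returned
first_false = 0 and 6         # 0: the first false operand is returned
first_true = "" or "default"  # 'default': the first true operand is returned
all_false = 0 or []           # []: every operand is false, so the last one is returned

# Evaluation stops as soon as the result is known (short-circuiting),
# so the division below is never attempted.
safe = 0 and 1 / 0            # 0

print(all_true, first_false, first_true, all_false, safe)
```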

Exercises

  1. Use Python to calculate .
  2. Use Python to calculate .
  3. Use Python to calculate 11111111111111111111+22222222222222222222, but in one line of code with at most 15 characters. (Hint: each of those numbers is 20 digits long, so you have to find some other way to input those numbers)
  4. Exactly one of the following expressions evaluates to "cat"; the other evaluates to "dog". Trace the logic to determine which one is which, then check your answer using Python.
1 and "cat" or "dog"
0 and "cat" or "dog"




Control Flow


As with most imperative languages, there are three main categories of program control flow:

  • loops
  • branches
  • function calls

Function calls are covered in the next section.

Generators and list comprehensions are advanced forms of program control flow, but they are not covered here.

Overview

Control flow in Python at a glance:

x = -6                              # Branching
if x > 0:                           # If
  print "Positive"
elif x == 0:                        # Else if AKA elseif
  print "Zero"
else:                               # Else
  print "Negative"
list1 = [100, 200, 300]
for i in list1: print i             # A for loop
for i in range(0, 5): print i       # A for loop from 0 to 4
for i in range(5, 0, -1): print i   # A for loop from 5 to 1
for i in range(0, 5, 2): print i    # A for loop from 0 to 4, step 2
list2 = [(1, 1), (2, 4), (3, 9)]
for x, xsq in list2: print x, xsq   # A for loop unpacking each two-tuple
l1 = [1, 2]; l2 = ['a', 'b']
for i1, i2 in zip(l1, l2): print i1, i2 # A for loop iterating two lists at once.
i = 5
while i > 0:                        # A while loop
  i -= 1
list1 = ["cat", "dog", "mouse"]
i = -1                              # Stays -1 if not found
for index, item in enumerate(list1):
  if item == "dog":
    i = index
    break                           # Break; also usable with while loop
print "Index of dog:", i
for i in range(1,6):
  if i <= 4:
    continue                        # Continue; also usable with while loop
  print "Greater than 4:", i

Loops

In Python, there are two kinds of loops, 'for' loops and 'while' loops.

For loops

A for loop iterates over the elements of a sequence (such as a tuple or list) or of any other iterable. A variable is created to represent each element in turn. For example,

x = [100,200,300]
for i in x:
    print (i)   # the parentheses are required in Python 3

This will output

100
200
300

The for loop loops over each of the elements of a list or iterator, assigning the current element to the variable name given. In the example above, each of the elements in x is assigned to i.

A built-in function called range exists to make creating sequential lists such as the one above easier. The loop above is equivalent to:

l = range(100, 301,100)
for i in l:
    print (i)

The next example uses a negative step (the third argument for the built-in range function):

for i in range(5, 0, -1):
    print (i)

This will output

5
4
3
2
1

The negative step can be -2:

for i in range(10, 0, -2):
    print (i)

This will output

10
8
6
4
2

For loops can have names for each element of a tuple, if it loops over a sequence of tuples:

l = [(1, 1), (2, 4), (3, 9), (4, 16), (5, 25)]
for x, xsquared in l:
    print x, ':', xsquared

This will output

1 : 1
2 : 4
3 : 9
4 : 16
5 : 25


While loops

A while loop repeats a sequence of statements until some condition becomes false. For example:

x = 5
while x > 0:
    print (x)  # print requires parentheses in Python 3
    x = x - 1

Will output:

5
4
3
2
1

Python's while loops can also have an 'else' clause, which is a block of statements that is executed (once) when the condition of the while statement evaluates to false. If the loop is left through a break statement, the else clause is skipped. For example:

x = 5
y = x
while y > 0:
    print (y)
    y = y - 1
else:
    print (x)

This will output:

5
4
3
2
1
5

Unlike some languages, there is no post-condition loop.
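A loop whose body has to run at least once is commonly written as "while True" with a break at the end of the body. The following is a sketch in Python 3 syntax; the list of canned replies is an assumption that stands in for user input so that the example runs unattended.

```python
replies = ["", "", "yes"]        # stand-in for successive user inputs
attempts = 0

while True:
    answer = replies[attempts]   # the body always runs at least once
    attempts += 1
    if answer:                   # the condition is tested after the body
        break

print(answer, attempts)          # yes 3
```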


Breaking and continuing

Python includes statements to exit a loop (either a for loop or a while loop) prematurely. To exit a loop, use the break statement:

x = 5
while x > 0:
    print x
    break
    x -= 1
    print x

This will output

5

The statement to begin the next iteration of the loop without waiting for the end of the current loop is 'continue'.

l = [5,6,7]
for x in l:
    continue
    print x

This will not produce any output.

Else clause of loops

The else clause of loops will be executed if no break statements are met in the loop.

l = range(1,100)
for x in l:
    if x == 100:
        print x
        break
    else:
        print x," is not 100"
else:
    print "100 not found in range"


Another example of a while loop using the break statement and the else clause:

expected_str = "melon"
received_str = "apple"
basket = ["banana", "grapes", "strawberry", "melon", "orange"]
x = 0
step = int(input("Input iteration step: "))

while received_str != expected_str:
    if x >= len(basket):
        print ("No more fruits left on the basket.")
        break
    received_str = basket[x]
    x += step # Enter 3 as the step to make the while condition
              # evaluate to false, avoiding the break statements and running the else clause.
    if received_str == basket[2]:
        print ("I hate", basket[2], "!")
        break
    if received_str != expected_str:
        print ("I am waiting for my ", expected_str, ".")
else:
    print ("Finally got what I wanted! my precious ", expected_str, "!")
print ("Going back home now !")

This will output:

Input iteration step: 2
I am waiting for my  melon .
I hate strawberry !
Going back home now !

White Space

Python determines which statements belong to a loop by their indentation. Every statement indented beneath the loop header is part of the loop body; the first statement that returns to the header's indentation level is not. For example, the code below prints "1 1 2 1 1 2" (each value on its own line):

for i in [0, 1]:
    for j in ["a","b"]:
        print("1")
    print("2")

On the other hand, the code below prints "1 2 1 2 1 2 1 2"

for i in [0, 1]:
    for j in ["a","b"]:
        print("1")
        print("2")

Branches

There is basically only one kind of branch in Python, the 'if' statement. The simplest form of the if statement simply executes a block of code if a given predicate is true, and skips over it if the predicate is false.

For instance,

>>> x = 10
>>> if x > 0:
...    print ("Positive")
...
Positive
>>> if x < 0:
...    print ("Negative")
...

You can also add "elif" (short for "else if") branches onto the if statement. If the predicate on the first "if" is false, Python tests the predicate on the first elif and runs that branch if it is true. If the first elif is false, it tries the second one, and so on. Note, however, that it stops checking branches as soon as it finds a true predicate, and skips the rest of the if statement. You can also end your if statements with an "else" branch. If none of the other branches are executed, then Python will run this branch.

>>> x = -6
>>> if x > 0:
...    print ("Positive")
... elif x == 0:
...    print ("Zero")
... else:
...    print ("Negative")
...
Negative

Conclusion

Any of these loops, branches, and function calls can be nested in any way desired. A loop can loop over a loop, a branch can branch again, and a function can call other functions, or even call itself.

Exercises

  1. Print the numbers from 0 to 1000 (including both 0 and 1000).
  2. Print the numbers from 0 to 1000 that are multiples of 5.
  3. Print the numbers from 1 to 1000 that are multiples of 5.
  4. Use a nested for-loop to print the 3x3 multiplication table below.
1 2 3 
2 4 6 
3 6 9
  5. Print the 3x3 multiplication table below.
  1 2 3 
 ------
1|1 2 3 
2|2 4 6 
3|3 6 9



Decision Control


Python, like many other computer programming languages, uses Boolean logic for its decision control. That is, the Python interpreter compares one or more values in order to decide whether to execute a piece of code or not, given the proper syntax and instructions.

Decision control is then divided into two major categories, conditional and repetition. Conditional logic simply uses the keyword if and a Boolean expression to decide whether or not to execute a code block. Repetition builds on the conditional constructs by giving us a simple method in which to repeat a block of code while a Boolean expression evaluates to true.

Boolean Expressions

Here is a little example of boolean expressions (you don't have to type it in):

a = 6
b = 7
c = 42
print (1, a == 6)
print (2, a == 7)
print (3, a == 6 and b == 7)
print (4, a == 7 and b == 7)
print (5, not a == 7 and b == 7)
print (6, a == 7 or b == 7)
print (7, a == 7 or b == 6)
print (8, not (a == 7 and b == 6))
print (9, not a == 7 and b == 6)

With the output being:

1 True
2 False
3 True
4 False
5 True
6 True
7 False
8 True
9 False

What is going on? The program consists of a bunch of funny looking print statements. Each print statement prints a number and an expression. The number is to help keep track of which statement I am dealing with. Notice how each expression ends up being either True or False; these are built-in Python values. The lines:

print (1, a == 6)
print (2, a == 7)

print out True and False respectively, just as expected, since the first is true and the second is false. The third print, print (3, a == 6 and b == 7), is a little different. The operator and means if both the statement before and the statement after are true then the whole expression is true otherwise the whole expression is false. The next line, print (4, a == 7 and b == 7), shows how if part of an and expression is false, the whole thing is false. The behavior of and can be summarized as follows:

expression result
true and true true
true and false false
false and true false
false and false false

Note that if the first expression is false Python does not check the second expression since it knows the whole expression is false.

The next line, print (5, not a == 7 and b == 7), uses the not operator. not just gives the opposite of the expression (The expression could be rewritten as print (5, a != 7 and b == 7)). Here's the table:

expression result
not true false
not false true

The two following lines, print (6, a == 7 or b == 7) and print (7, a == 7 or b == 6), use the or operator. The or operator returns true if the first expression is true, or if the second expression is true or both are true. If neither are true it returns false. Here's the table:

expression result
true or true true
true or false true
false or true true
false or false false

Note that if the first expression is true Python doesn't check the second expression since it knows the whole expression is true. This works since or is true if at least one of the expressions are true. The first part is true so the second part could be either false or true, but the whole expression is still true.
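
The short-circuit behaviour of and and or can be observed directly by giving each operand a visible side effect. In this sketch the helper check() and the list calls are hypothetical names introduced only to record which operands were evaluated:

```python
calls = []

def check(label, value):
    """Record that this operand was evaluated, then return its value."""
    calls.append(label)
    return value

result_and = check("a", False) and check("b", True)   # "b" is never evaluated
result_or = check("c", True) or check("d", False)     # "d" is never evaluated

print(result_and, result_or)  # False True
print(calls)                  # ['a', 'c']
```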

The next two lines, print (8, not (a == 7 and b == 6)) and print (9, not a == 7 and b == 6), show that parentheses can be used to group expressions and force one part to be evaluated first. Notice that the parentheses changed the expression from false to true. This occurred since the parentheses forced the not to apply to the whole expression instead of just the a == 7 portion.

Here is an example of using a boolean expression:

list = ["Life","The Universe","Everything","Jack","Jill","Life","Jill"]

# Make a copy of the list.
copy = list[:]
# Sort the copy
copy.sort()
prev = copy[0]
del copy[0]

count = 0

# Go through the list searching for a match
while count < len(copy) and copy[count] != prev:
    prev = copy[count]
    count = count + 1

# If a match was not found then count can't be < len
# since the while loop continues while count is < len
# and no match is found
if count < len(copy):
    print ("First Match:",prev)

See the Lists chapter for an explanation of what [:] means in the line that copies the list.

Here is the output:

First Match: Jill

This program works by continuing to check for a match while count < len(copy) and copy[count] != prev. When either count is greater than the last index of copy or a match has been found the and is no longer true so the loop exits. The if simply checks to make sure that the while exited because a match was found.

The other 'trick' of and is used in this example. As noted under the table for and, when the first expression is false Python does not check the second. If count >= len(copy) (in other words, count < len(copy) is false) then copy[count] is never looked at. This is because Python knows that if the first expression is false then both cannot be true. This is known as a short circuit, and it is useful when the second half of the and would cause an error if it were evaluated. I used the first expression (count < len(copy)) to check that count was a valid index for copy. (If you don't believe me, remove the duplicate entries 'Jill' and 'Life', check that the program still works, and then reverse the order of count < len(copy) and copy[count] != prev to copy[count] != prev and count < len(copy).)
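
The effect of the operand order can be shown in isolation. The following sketch uses a short, made-up list and a deliberately invalid index; it is not part of the program above.

```python
items = ["Jack", "Jill"]
index = 5   # deliberately out of range

# Safe: the index test is false, so items[index] is never evaluated.
safe = index < len(items) and items[index] == "Jill"
print(safe)  # False

# Unsafe: with the operands reversed, items[index] runs first and fails.
try:
    unsafe = items[index] == "Jill" and index < len(items)
except IndexError:
    unsafe = "IndexError"
print(unsafe)  # IndexError
```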

Boolean expressions can be used when you need to check two or more different things at once.

Examples

password1.py

# This program asks a user for a name and a password.
# It then checks them to make sure that the user is allowed in.
# Note that this is a simple and insecure example,
# real password code should never be implemented this way.

name = input("What is your name? ")          # in Python 2, use raw_input() instead
password = input("What is the password? ")   # in Python 2, use raw_input() instead
if name == "Josh" and password == "Friday":
    print ("Welcome Josh")
elif name == "Fred" and password == "Rock":
    print ("Welcome Fred")
else:
    print ("I don't know you.")

Sample runs

What is your name? Josh
What is the password? Friday
Welcome Josh

What is your name? Bill
What is the password? Saturday
I don't know you.

Exercises

  1. Write a program that has a user guess your name, but they only get 3 chances to do so until the program quits.

Solutions



Conditional Statements


Decisions

A decision is when a program has more than one choice of action depending on a variable's value. Think of a traffic light. When it is green, we continue our drive. When we see the light turn yellow, we reduce our speed, and when it is red, we stop. These are logical decisions that depend on the value of the traffic light. Luckily, Python has a decision statement to help us when our application needs to make such a decision for the user.

If statements

Here is a warm-up exercise - a short program to compute the absolute value of a number:
absoval.py

n = input("Integer? ")  # Ask for an integer. In Python 2, use raw_input() here instead of input().
n = int(n)  # Convert the text that was typed into an integer. (Alternatively, you can define n yourself)
if n < 0:
    print ("The absolute value of",n,"is",-n)
else:
    print ("The absolute value of",n,"is",n)

Here is the output from the two times that I ran this program:

Integer? -34
The absolute value of -34 is 34

Integer? 1
The absolute value of 1 is 1

What does the computer do when it sees this piece of code? First it prompts the user for a number with the statement "n = input("Integer? ")". Next it reads the line "if n < 0:". If n is less than zero, Python runs the line "print ("The absolute value of",n,"is",-n)". Otherwise Python runs the line "print ("The absolute value of",n,"is",n)".

More formally, Python looks at whether the expression n < 0 is true or false. An if statement is followed by an indented block of statements that are run when the expression is true. After the if statement is an optional else statement and another indented block of statements. This 2nd block of statements is run if the expression is false.

Expressions can be tested several different ways. Here is a table of all of them:

operator function
< less than
<= less than or equal to
> greater than
>= greater than or equal to
== equal
!= not equal


Another feature of the if command is the elif statement. It stands for "else if," which means that if the original if statement is false and the elif statement is true, execute the block of code following the elif statement. Here's an example:
ifloop.py

a = 0
while a < 10:
    a = a + 1
    if a > 5:
        print (a,">",5)
    elif a <= 7:
        print (a,"<=",7)
    else:
        print ("Neither test was true")

and the output:

1 <= 7
2 <= 7
3 <= 7
4 <= 7
5 <= 7
6 > 5
7 > 5
8 > 5
9 > 5
10 > 5

Notice how the elif a <= 7 is only tested when the if statement fails to be true. elif allows multiple tests to be done in a single if statement.
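
Because only the first true branch runs, the order of the tests matters. The following sketch (the function name and thresholds are invented for illustration) relies on that: each elif only needs to check a lower bound, since the earlier tests have already ruled out higher scores.

```python
def grade(score):
    """Map a numeric score to a letter; the thresholds are illustrative only."""
    if score >= 90:
        return "A"
    elif score >= 80:      # only reached when score < 90
        return "B"
    elif score >= 70:      # only reached when score < 80
        return "C"
    else:                  # none of the tests above was true
        return "F"

print(grade(95), grade(85), grade(72), grade(10))  # A B C F
```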

If Example

High_low.py

# Plays the guessing game higher or lower
# (originally written by Josh Cogliati, improved by Quique, now improved
# by Sanjith, further improved by VorDd, with continued improvement from
# the various Wikibooks contributors.)
 
# This should actually be something that is semi random like the
# last digits of the time or something else, but that will have to
# wait till a later chapter.  (Extra Credit, modify it to be random
# after the Modules chapter)
 
# This is for demonstration purposes only.
# It is not written to handle invalid input like a full program would.
 
answer = 23
question = 'What number am I thinking of?  '
print ('Let\'s play the guessing game!')

while True:
    guess = int(input(question))

    if guess < answer:
        print ('Little higher')
    elif guess > answer:
        print ('Little lower')
    else: # guess == answer
        print ('MINDREADER!!!')
        break

Sample run:

Let's play the guessing game!
What number am I thinking of?  22
Little higher
What number am I thinking of?  25
Little lower
What number am I thinking of?  23
MINDREADER!!!

As it states in its comments, this code is not prepared to handle invalid input (i.e., strings instead of numbers). If you are wondering how you would implement such functionality in Python, you are referred to the Errors Chapter of this book, where you will learn about error handling. For the above code you may try this slight modification of the while loop:

while True:
	inp = input(question)
	try:
		guess = int(inp)
	except ValueError:
		print('Your guess should be a number')
	else:
		if guess < answer:
			print ('Little higher')
		elif guess > answer:
			print ('Little lower')
		else: # guess == answer
			print ('MINDREADER!!!')
			break

even.py

#Asks for a number.
#Prints if it is even or odd

print ("Input [x] for exit.")

while True:
	inp = input("Tell me a number: ")
	if inp == 'x':
		break
	# catch any resulting ValueError during the conversion to float
	try:
		number = float(inp)
	except ValueError:
		print('I said: Tell me a NUMBER!')
	else:
		test = number % 2
		if test == 0:
			print (int(number),"is even.")
		elif test == 1:
			print (int(number),"is odd.")
		else:
			print (number,"is very strange.")

Sample runs.

Tell me a number: 3
3 is odd.

Tell me a number: 2
2 is even.

Tell me a number: 3.14159
3.14159 is very strange.

average1.py

#Prints the average value.
 
print ("Welcome to the average calculator program")
print ("NOTE- THIS PROGRAM ONLY CALCULATES AVERAGES FOR 3 NUMBERS")
x = int(input("Please enter the first number "))
y = int(input("Please enter the second number "))
z = int(input("Please enter the third number "))
total = x+y+z  # named total rather than str, which would hide the built-in str()
print (total/3.0)
#MADE BY SANJITH sanrubik@gmail.com

Sample runs

Welcome to the average calculator program
NOTE- THIS PROGRAM ONLY CALCULATES AVERAGES FOR 3 NUMBERS
Please enter the first number 7
Please enter the second number 6
Please enter the third number 4
5.66666666667

average2.py

#Keeps asking for numbers until count numbers have been entered.
#Prints the average value.

sum = 0.0

print ("This program will take several numbers, then average them.")
count = int(input("How many numbers would you like to sum:  "))
current_count = 0
 
while current_count < count:
	print ("Number",current_count)
	number = float(input("Enter a number:  "))
	sum = sum + number
	current_count += 1
 
print("The average was:",sum/count)

Sample runs

This program will take several numbers, then average them.
How many numbers would you like to sum:  2
Number 0
Enter a number:  3
Number 1
Enter a number:  5
The average was: 4.0

This program will take several numbers, then average them.
How many numbers would you like to sum:  3
Number 0
Enter a number:  1
Number 1
Enter a number:  4
Number 2
Enter a number:  3
The average was: 2.66666666667

average3.py

#Continuously updates the average as new numbers are entered.

print ("Welcome to the Average Calculator, please insert a number")
currentaverage = 0
numofnums = 0
while True:
    newnumber = int(input("New number "))
    numofnums = numofnums + 1
    currentaverage = (round((((currentaverage * (numofnums - 1)) + newnumber) / numofnums), 3))
    print ("The current average is " + str((round(currentaverage, 3))))

Sample runs

Welcome to the Average Calculator, please insert a number
New number 1
The current average is 1.0
New number 3
The current average is 2.0
New number 6
The current average is 3.333
New number 6
The current average is 4.0
New number


If Exercises

  1. Write a password guessing program to keep track of how many times the user has entered the password wrong. If it is more than 3 times, print You have been denied access. and terminate the program. If the password is correct, print You have successfully logged in. and terminate the program.
  2. Write a program that asks for two numbers. If the sum of the numbers is greater than 100, print That is a big number and terminate the program.
  3. Write a program that asks the user their name. If they enter your name, say "That is a nice name." If they enter "John Cleese" or "Michael Palin", tell them how you feel about them ;), otherwise tell them "You have a nice name."
  4. Ask the user to enter the password. If the password is correct print "You have successfully logged in" and exit the program. If the password is wrong print "Sorry the password is wrong" and ask the user to enter the password 3 times. If the password is wrong print "You have been denied access" and exit the program.

Solutions

##   Password guessing program using if statement and while statement only
###  source by zain


guess_count = 0

correct_pass = 'dee234'

while True:
	pass_guess = input("Please enter your password: ")

	guess_count += 1

	if pass_guess == correct_pass:
		print ('You have successfully logged in.')
		break

	elif pass_guess != correct_pass:
		if guess_count >= 3:
			print ("You have been denied access.")
			break

def mard():
    for i in range(1,4):
        a = input("enter a password:  ") # to ask password
        b = "sefinew" # the password
        if a == b: # if the password entered and the password are the same to print.
            print("You have successfully logged in")
            exit()# to terminate the program.  Using 'break' instead of 'exit()' will allow your shell or idle to dump the block and continue to run.
        else: # if the password entered and the password are not the same to print.
            print("Sorry the password is wrong ")
            if i == 3:
                print("You have been denied access")
                exit() # to terminate the program

mard()


#Source by Vanchi
import time
import getpass

password = getpass.getpass("Please enter your password")

print ("Waiting for 3 seconds")
time.sleep(3)
got_it_right = False
for number_of_tries in range(1,4):
    reenter_password = getpass.getpass("Please reenter your password")
    if password == reenter_password:
        print ("You are Logged in! Welcome User :)")
        got_it_right = True
        break

if not got_it_right:
    print ("Access Denied!!")

Conditional Statements

Many languages (like Java and PHP) have the concept of a one-line conditional (called The Ternary Operator), often used to simplify conditionally accessing a value. For instance (in Java):

int number; // read from program input

// a normal conditional assignment
int res;
if(number < 0)
  res = -number;
else
  res = number;

// the same assignment using the ternary operator
int res2 = (number < 0) ? -number : number;

For many years Python did not have the same construct natively, however you could replicate it by constructing a tuple of results and calling the test as the index of the tuple, like so:

number = int(input("Enter a number to get its absolute value:"))
res = (-number, number)[number > 0]

It is important to note that, unlike a built in conditional statement, both the true and false branches are evaluated before returning, which can lead to unexpected results and slower executions if you're not careful. To resolve this issue, and as a better practice, wrap whatever you put in the tuple in anonymous function calls (lambda notation) to prevent them from being evaluated until the desired branch is called:

number = int(input("Enter a number to get its absolute value:"))
res = (lambda: number, lambda: -number)[number < 0]()

Since Python 2.5 however, there has been an equivalent operator to The Ternary Operator (though not called such, and with a totally different syntax):

number = int(input("Enter a number to get its absolute value:"))
res = -number if number < 0 else number
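
The difference in evaluation between the tuple trick and the conditional expression can be made visible by logging which branches run. In this sketch, branch() and the list evaluated are hypothetical helpers introduced only for the demonstration:

```python
evaluated = []

def branch(name, value):
    """Record that this branch was evaluated, then return its value."""
    evaluated.append(name)
    return value

number = -7

# Tuple indexing: both elements are built before one is selected.
res_tuple = (branch("keep", number), branch("negate", -number))[number < 0]
seen_by_tuple = list(evaluated)

del evaluated[:]   # reset the log

# Conditional expression: only the selected branch is evaluated.
res_expr = branch("negate", -number) if number < 0 else branch("keep", number)
seen_by_expr = list(evaluated)

print(res_tuple, seen_by_tuple)  # 7 ['keep', 'negate']
print(res_expr, seen_by_expr)    # 7 ['negate']
```

Both forms give the same result here; they differ only in how much work is done to get it.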

Switch

A switch is a control statement present in many programming languages that replaces a long chain of if-elif tests. For most of its history Python has had no such statement (a match statement was added in Python 3.10), but with the clever use of a list or dictionary of functions we can recreate a switch that depends on a value.

x = 1

def hello():
  print ("Hello")

def bye():
  print ("Bye")

def hola():
  print ("Hola is Spanish for Hello")

def adios():
  print ("Adios is Spanish for Bye")

# Notice that our switch statement is a regular variable, only that we added the function's name inside
# and there are no quotes
menu = [hello,bye,hola,adios]

# To call our switch statement, we simply make reference to the array with a pair of parentheses
# at the end to call the function
menu[3]()   # calls the adios function since is number 3 in our array.

menu[0]()   # Calls the hello function being our first element in our array.

menu[x]()   # Calls the bye function as is the second element on the array x = 1

This works because Python stores a reference to each function in the list at its particular index, and adding a pair of parentheses after the indexed element calls that function. Here the last line, menu[x](), is equivalent to calling bye() directly, because x is 1.

Another way. Using function through user Input

go = "y"
x = 0
def hello():
  print ("Hello")
 
def bye():
  print ("Bye")
 
def hola():
  print ("Hola is Spanish for Hello")
 
def adios():
  print ("Adios is Spanish for Bye")

menu = [hello, bye, hola, adios]
 

while x < len(menu) :
    print ("function", menu[x].__name__, ", press " , "[" + str(x) + "]")
    x += 1
    
while go != "n":
    c = int(input("Select Function: "))
    menu[c]()
    go = input("Try again? [y/n]: ")

print ("\nBye!")
   

#end

Another way

if x == 0:
    hello()
elif x == 1:
    bye()
elif x == 2:
    hola()
else:
    adios()

Another way

Another way is to use lambdas. Code pasted here with permission [1].

result = {
  'a': lambda x: x * 5,
  'b': lambda x: x + 7,
  'c': lambda x: x - 2
}[value](x)

For more information on lambda see anonymous functions in the function section.
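
The dictionary lookup can be wrapped in a function so that it runs on its own and copes with keys that are not in the table. This is a minimal sketch: the function name switch and the default behaviour (returning x unchanged) are assumptions made for illustration, not part of the original snippet.

```python
def switch(value, x):
    """Dispatch on value; keys that are not listed fall through to a default."""
    cases = {
        'a': lambda x: x * 5,
        'b': lambda x: x + 7,
        'c': lambda x: x - 2,
    }
    default = lambda x: x                  # hypothetical default: return x unchanged
    return cases.get(value, default)(x)    # dict.get avoids a KeyError for unknown keys

print(switch('a', 3))  # 15
print(switch('b', 3))  # 10
print(switch('z', 3))  # 3
```

Using dict.get with a default plays the role that a default case plays in a switch statement.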



Loops


While loops

This is our first control structure. The version used is 2.7. Ordinarily the computer starts with the first line and then goes down from there. Control structures change the order in which statements are executed or decide if a certain statement will be run. As a side note, decision statements (e.g., if statements) also influence whether or not a certain statement will run. Here's the source for a program that uses the while control structure:

a = 0
while a < 5:
    a += 1 # Same as a = a + 1 
    print (a)

And here is the output:

1
2
3
4
5

So what does the program do? First it sees the line a = 0 and sets the variable a to zero. Then it sees while a < 5: and so the computer checks to see if a < 5. The first time the computer sees this statement, a is zero, and zero is less than 5. In other words, while a is less than five, the computer will run the indented statements.

Here is another example of the use of while:

a = 1
s = 0
print ('Enter Numbers to add to the sum.')
print ('Enter 0 to quit.')
while a != 0:
    print ('Current Sum: ', s)
    a = input('Number? ') # In Python 3, input() replaces raw_input().
    a = float(a)
    s += a
print ('Total Sum = ',s)

Sample run:

Enter Numbers to add to the sum.
Enter 0 to quit.
Current Sum: 0
Number? 200
Current Sum: 200
Number? -15.25
Current Sum: 184.75
Number? -151.85
Current Sum: 32.9
Number? 10.00
Current Sum: 42.9
Number? 0
Total Sum = 42.9

Notice how print ('Total Sum = ',s) is only run at the end. The while statement only affects the lines that are tabbed in (a.k.a. indented). The != operator means "does not equal", so while a != 0: means: keep running the indented statements that follow until a is zero.

Now that we have while loops, it is possible to have programs that run forever. An easy way to do this is to write a program like this:

while 1 == 1:
    print ("Help, I'm stuck in a loop.")

This program will output Help, I'm stuck in a loop. until the heat death of the universe or you stop it. The way to stop it is to hit the Control (or Ctrl) button and 'c' (the letter) at the same time. This will kill the program. (Note: sometimes you will have to hit enter after the Control C.)

Examples

Fibonacci.py

#This program calculates the Fibonacci sequence
a = 0
b = 1
count = 0
max_count = 20
while count < max_count:
    count = count + 1
    #we need to keep track of a since we change it
    old_a = a
    old_b = b
    a = old_b
    b = old_a + old_b
    #Notice that the , at the end of a print statement keeps it
    # from switching to a new line
    print (old_a),

Output:

0 1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597 2584 4181
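
The temporary variables old_a and old_b can be avoided with tuple assignment, because the whole right-hand side is evaluated before either name is rebound. The sketch below collects the values in a list instead of printing them; the function name is chosen for illustration.

```python
def fibonacci(max_count):
    """Return the first max_count Fibonacci numbers, starting from 0."""
    a, b = 0, 1
    result = []
    for _ in range(max_count):
        result.append(a)
        a, b = b, a + b    # both new values are computed from the old a and b
    return result

print(fibonacci(10))  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]
```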

Password.py

# Waits until a password has been entered.  Use control-C to break out without
# the password.

# Note that this must not be the password so that the 
# while loop runs at least once.
password = "foobar"

#note that != means not equal
while password != "unicorn":
    password = input("Password: ")
print ("Welcome in")

Sample run:

Password: auo
Password: y22
Password: password
Password: open sesame
Password: unicorn
Welcome in

For Loops

The next type of loop in Python is the for loop. Unlike in most languages, for requires an iterable object, such as a set or list, to work.

onetoten = range(1,11)
for count in onetoten:
    print (count)

The output:

1
2
3
4
5
6
7
8
9
10

The output looks very familiar, but the program code looks different. The first line uses the range function. The range function uses two arguments like this range(start,finish). start is the first number that is produced. finish is one larger than the last number. Note that this program could have been done in a shorter way:

for count in range(1,11):
    print (count)

Here are some examples to show what happens with the range function:

>>> range(1,10)
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> range(-32, -20)
[-32, -31, -30, -29, -28, -27, -26, -25, -24, -23, -22, -21]
>>> range(5,21)
[5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
>>> range(21,5)
[]
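
The interactive session above shows Python 2 behaviour, where range returns a list. In Python 3 the same calls return a range object instead, so one way to display the same values is to wrap each call in list(), as in this sketch (the variable names are illustrative):

```python
one_to_nine = list(range(1, 10))
five_to_twenty = list(range(5, 21))
backwards = list(range(21, 5))   # start is larger than finish, so nothing is produced

print(one_to_nine)                            # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(five_to_twenty[0], five_to_twenty[-1])  # 5 20
print(backwards)                              # []
```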

Another way to use the range() function in a for loop is to supply only one argument:

for a in range(10):
    print (a)

The above code acts exactly the same as:

for a in range(0, 10):
    print (a)

with 0 implied as the starting point. The output is

0 1 2 3 4 5 6 7 8 9

The code would cycle through the for loop 10 times as expected, but starting with 0 instead of 1.

In the first example, the line for count in onetoten: uses the for control structure. A for control structure looks like for variable in list:. The list is gone through starting with its first element and ending with its last. As for goes through each element in the list, it puts each one into variable. That allows variable to be used on each successive pass through the for loop. Here is another example to demonstrate:

demolist = ['life',42, 'the universe', 6,'and',7,'everything']
for item in demolist:
    print ("The Current item is: %s" % item)

The output is:

The Current item is: life
The Current item is: 42
The Current item is: the universe
The Current item is: 6
The Current item is: and
The Current item is: 7
The Current item is: everything

Notice how the for loop goes through and sets item to each element in the list. (In Python 2, if you don't want print to go to the next line, add a comma at the end of the statement, for example when you want to print something else on that line.) So, what is for good for? The first use is to go through all the elements of a list and do something with each of them. Here is a quick way to add up all the elements:

list = [2,4,6,8]
sum = 0
for num in list:
    sum = sum + num
print ("The sum is: %d" % sum)

with the output simply being:

The sum is: 20
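
Python also provides a built-in sum() function that does the same job. Naming a variable sum, as the loop above does, hides that built-in for the rest of the program, so this sketch uses the names numbers and total instead (both are illustrative choices):

```python
numbers = [2, 4, 6, 8]

total = 0
for num in numbers:
    total = total + num
print("The sum is: %d" % total)   # The sum is: 20

# The built-in sum() gives the same result in one call.
builtin_total = sum(numbers)
print(builtin_total == total)     # True
```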

Or you could write a program to find out if there are any duplicates in a list like this program does:

list = [4, 5, 7, 8, 9, 1,0,7,10]
list.sort()
prev = list[0]
del list[0]
for item in list:
    if prev == item:
        print ("Duplicate of ",prev," Found")
    prev = item

and for good measure:

Duplicate of  7  Found

How does it work? Here is a special debugging version:

l = [4, 5, 7, 8, 9, 1,0,7,10]
print ("l = [4, 5, 7, 8, 9, 1,0,7,10]","\tl:",l)
l.sort()
print ("l.sort()","\tl:",l)
prev = l[0]
print ("prev = l[0]","\tprev:",prev)
del l[0]
print ("del l[0]","\tl:",l)
for item in l:
    if prev == item:
        print ("Duplicate of ",prev," Found")
    print ("if prev == item:","\tprev:",prev,"\titem:",item)
    prev = item
    print ("prev = item","\t\tprev:",prev,"\titem:",item)

with the output being:

l = [4, 5, 7, 8, 9, 1,0,7,10]   l: [4, 5, 7, 8, 9, 1, 0, 7, 10]
l.sort()        l: [0, 1, 4, 5, 7, 7, 8, 9, 10]
prev = l[0]     prev: 0
del l[0]        l: [1, 4, 5, 7, 7, 8, 9, 10]
if prev == item:        prev: 0         item: 1
prev = item             prev: 1         item: 1
if prev == item:        prev: 1         item: 4
prev = item             prev: 4         item: 4
if prev == item:        prev: 4         item: 5
prev = item             prev: 5         item: 5
if prev == item:        prev: 5         item: 7
prev = item             prev: 7         item: 7
if prev == item:        prev: 7         item: 7
Duplicate of  7  Found
prev = item             prev: 7         item: 7
if prev == item:        prev: 7         item: 8
prev = item             prev: 8         item: 8
if prev == item:        prev: 8         item: 9
prev = item             prev: 9         item: 9
if prev == item:        prev: 9         item: 10
prev = item             prev: 10        item: 10

Note: The reason there are so many print statements is that they show the value of each variable at different times, which helps when debugging the program. First the program starts with an unsorted list. Next the program sorts the list, so that any duplicates are placed next to each other. The program then initializes a prev(ious) variable. Next the first element of the list is deleted so that the first item is not incorrectly thought to be a duplicate. Then the for loop begins. Each item of the list is checked to see if it is the same as the previous one. If it is, a duplicate was found. The value of prev is then changed so that the next time through the for loop prev is the item before the current one. Sure enough, the 7 is found to be a duplicate.
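
The same sort-then-compare-neighbours idea can be packaged as a function that leaves the original list untouched. This is a sketch under a few assumptions: the function name is invented, sorted() is used instead of list.sort() so the input is not modified, and zip() pairs each element with the one after it.

```python
def find_duplicates(values):
    """Return each value that occurs more than once, in sorted order."""
    ordered = sorted(values)                       # new sorted list; values is unchanged
    duplicates = []
    for prev, item in zip(ordered, ordered[1:]):   # neighbouring pairs
        if prev == item and item not in duplicates:
            duplicates.append(item)
    return duplicates

print(find_duplicates([4, 5, 7, 8, 9, 1, 0, 7, 10]))  # [7]
```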

The other way to use for loops is to do something a certain number of times. Here is some code to print out the first 9 numbers of the Fibonacci series:

a = 1
b = 1
for c in range(1,10):
    print (a)
    n = a + b
    a = b
    b = n
print ("")

with the surprising output:

1
1
2
3
5
8
13
21
34

Everything that can be done with for loops can also be done with while loops but for loops give an easy way to go through all the elements in a list or to do something a certain number of times.


range Versus xrange

Above, you were introduced to the range function, which returns a list of all the integers in a specified range. Supposing you were to write an expression like range(0, 1000000): that would construct a list consisting of a million integers! That can take up a lot of memory.

Often, you do indeed need to process all the numbers over a very wide range. But you might only need to do so one at a time; as each number is processed, it can be discarded from memory before the next one is obtained.

To do this, you can use the xrange function instead of range. For example, the following simple loop

for i in xrange(0, 1000000) :
    print(i)
#end for

will print the million integers from 0 to 999999, but it will get them one at a time from the xrange call, instead of getting them all at once as a single list and going through that.

This is an example of an iterator, which yields values one at a time as they are needed, rather than all at once. As you learn more about Python, you will see a lot more examples of iterators in use, and you will learn how to write your own.

Python 3 Note: In Python 3.x, there is no function named xrange. Instead, the range function has the behaviour described above for xrange.
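
A minimal sketch, assuming Python 3, of what this note means in practice: range gives back a lazy range object rather than a list. The variable names are chosen here purely for illustration.

```python
# Python 3: range() produces a lazy sequence object, not a list.
numbers = range(0, 1000000)

print(type(numbers).__name__)   # the name of the type that range() returned
print(len(numbers))             # the length is known without building a list
print(numbers[10])              # individual items can still be indexed

# Values are produced one at a time as the loop asks for them.
total = 0
for i in numbers:
    total += i
print(total)
```

If an actual list is needed in Python 3, it can be built explicitly with list(range(...)).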

 

The break Statement

A while-loop checks its termination condition before each entry into the loop body, and terminates if the condition has gone False. Thus, the loop body will normally iterate zero, one or more complete times.

A for-loop iterates its body once for each value returned from the iterator expression. Again, each iteration is normally of the complete loop body.

Sometimes you want to conditionally stop the loop in the middle of the loop body. An example situation might look like this:

  1. Obtain the next value to check
  2. Is there in fact another value to check? If not, exit the loop with failure.
  3. Is the value what I’m looking for? If so, exit the loop with success.
  4. Otherwise, go back to step 1.

As you can see, there are two possible exits from this loop. You can exit from the middle of a Python while- or for-loop with the break-statement. Here is one way to write such a loop:

found = False # initial assumption
for value in values_to_check():
    if is_what_im_looking_for(value):
        found = True
        break
    #end if
#end for
# ... found is True on success, False on failure

The trouble with this is the asymmetry between the two ways out of the loop: one through normal for-loop termination, the other through the break. As a stylistic matter, it would be more consistent to follow this principle:

If one exit from a loop is represented by a break, then all exits from the loop should be represented by breaks.

In particular, this means that the loop construct itself becomes a “loop-forever” loop; the only way out of it is via a break statement.

We can do this by explicitly dealing with an iterator yielding the values to be checked. Perhaps the values_to_check() call above already yields an iterator; if not, it can be converted to one by wrapping it in a call to the iter() built-in function, and then using the next() built-in function to obtain successive values from this iterator. Then the loop becomes:

values = iter(values_to_check())
while True:
    value = next(values, None)
    if value is None:
        found = False
        break
    #end if
    if is_what_im_looking_for(value):
        found = True
        break
    #end if
#end while
# ... found is True on success, False on failure

This uses the special Python value None to indicate that the iterator has reached its end. This is a common Python convention. But if you need to use None as a valid item in your sequence of values, then it is easy enough to define some other unique value (e.g. a custom dummy class) to represent the end of the list.
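
One possible way to handle sequences that may contain None is sketched below. It uses a plain object() instance as the end marker instead of a custom class; the names _END and contains are hypothetical and only serve this illustration.

```python
# A unique sentinel object marks the end of iteration, so None can be a real value.
_END = object()  # hypothetical name; any object that is never a valid item works

def contains(values_to_check, wanted):
    """Return True if wanted occurs among the values, using break for every exit."""
    values = iter(values_to_check)
    while True:
        value = next(values, _END)
        if value is _END:      # iterator exhausted: exit with failure
            found = False
            break
        if value == wanted:    # match: exit with success
            found = True
            break
    return found

print(contains([1, None, 3], None))  # None is now an ordinary item
print(contains([1, 2, 3], None))
```

The identity test "value is _END" matters here: comparing with == could be fooled by items that define their own equality.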

Exercises

  1. Create a program to count by prime numbers. Ask the user to input a number, then print each prime number up to that number.
  2. Instruct the user to pick an arbitrary number from 1 to 100 and proceed to guess it correctly within seven tries. After each guess, the user must tell whether their number is higher than, lower than, or equal to your guess.

Solutions



Functions


Function Calls

A callable object is an object that can accept some arguments (also called parameters) and possibly return an object (often a tuple containing multiple objects).

A function is the simplest callable object in Python, but there are others, such as classes or certain class instances.

Defining Functions

A function is defined in Python by the following format:

def functionname(arg1, arg2, ...):
    statement1
    statement2
    ...

For example:

>>> def functionname(arg1, arg2):
...     return arg1+arg2
...
>>> t = functionname(24,24) # Result: 48

If a function takes no arguments, it must still include the parentheses, but without anything in them:

def functionname():
    statement1
    statement2
    ...

The arguments in the function definition bind the arguments passed at function invocation (i.e. when the function is called), which are called actual parameters, to the names given when the function is defined, which are called formal parameters. The interior of the function has no knowledge of the names given to the actual parameters; the names of the actual parameters may not even be accessible (they could be inside another function).

A function can 'return' a value, for example:

def square(x):
    return x*x

A function can define variables within the function body, which are considered 'local' to the function. The locals together with the arguments comprise all the variables within the scope of the function. Any names within the function are unbound when the function returns or reaches the end of the function body.

You can return multiple values as follows:

def first2items(list1):
  return list1[0], list1[1]
a, b = first2items(["Hello", "world", "hi", "universe"])
print a + " " + b

Keywords: returning multiple values, multiple return values.

Declaring Arguments

When calling a function that needs some values for further processing, we pass those values as function arguments. For example:

>>> def find_max(a, b):
...     if a > b:
...         return str(a) + " is greater than " + str(b)
...     elif b > a:
...         return str(b) + " is greater than " + str(a)
...     else:
...         return str(a) + " is equal to " + str(b)
...
>>> find_max(30, 45)  # 30 and 45 are the arguments passed to find_max
'45 is greater than 30'

Default Argument Values

If any of the formal parameters in the function definition are declared with the format "arg = value," then you will have the option of not specifying a value for those arguments when calling the function. If you do not specify a value, then that parameter will have the default value given when the function executes.

>>> def display_message(message, truncate_after=4):
...     print message[:truncate_after]
...
>>> display_message("message")
mess
>>> display_message("message", 6)
messag

Variable-Length Argument Lists

Python allows you to declare two special arguments which allow you to create arbitrary-length argument lists. This means that each time you call the function, you can specify any number of arguments above a certain number.

def function(first,second,*remaining):
    statement1
    statement2
    ...

When calling the above function, you must provide a value for each of the first two arguments. However, since the third parameter is marked with an asterisk, any actual parameters after the first two will be packed into a tuple and bound to "remaining."

>>> def print_tail(first,*tail):
...     print tail
...
>>> print_tail(1, 5, 2, "omega")
(5, 2, 'omega')

If we declare a formal parameter prefixed with two asterisks, then it will be bound to a dictionary containing any keyword arguments in the actual parameters which do not correspond to any formal parameters. For example, consider the function:

def make_dictionary(max_length=10, **entries):
    return dict([(key, entries[key]) for i, key in enumerate(entries.keys()) if i < max_length])

If we call this function with any keyword arguments other than max_length, they will be placed in the dictionary "entries." If we include the keyword argument of max_length, it will be bound to the formal parameter max_length, as usual.

>>> make_dictionary(max_length=2, key1=5, key2=7, key3=9)
{'key3': 9, 'key2': 7}
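
The session above comes from Python 2, where dictionary order was arbitrary. The sketch below assumes Python 3.7 or later, where dictionaries (and therefore the ** dictionary) keep the order in which the keyword arguments were given. print_tail is changed here to return its tuple instead of printing it, purely so the result can be inspected.

```python
def make_dictionary(max_length=10, **entries):
    # Keep at most max_length of the extra keyword arguments.
    return dict([(key, entries[key]) for i, key in enumerate(entries.keys()) if i < max_length])

def print_tail(first, *tail):
    return tail  # tail is a tuple holding the extra positional arguments

limited = make_dictionary(max_length=2, key1=5, key2=7, key3=9)
tail = print_tail(1, 5, 2, "omega")
print(limited)   # the first two keyword arguments, in the order they were passed
print(tail)
```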

By Value and by Reference

Objects passed as arguments to functions are passed by reference; they are not being copied around. Thus, passing a large list as an argument does not involve copying all its members to a new location in memory. Note that even integers are objects. However, the distinction of by value and by reference present in some other programming languages often serves to distinguish whether the passed arguments can be actually changed by the called function and whether the calling function can see the changes.

Passed objects of mutable types such as lists and dictionaries can be changed by the called function and the changes are visible to the calling function. Passed objects of immutable types such as integers and strings cannot be changed by the called function; the calling function can be certain that the called function will not change them. For mutability, see also Data Types chapter.

An example:

def appendItem(ilist, item):
  ilist.append(item) # Modifies ilist in a way visible to the caller

def replaceItems(ilist, newcontentlist):
  del ilist[:]                 # Modification visible to the caller
  ilist.extend(newcontentlist) # Modification visible to the caller
  ilist = [5, 6] # No outside effect; lets the local ilist point to a new list object,
                 # losing the reference to the list object passed as an argument
def clearSet(iset):
  iset.clear()

def tryToTouchAnInteger(iint):
  iint += 1 # No outside effect; lets the local iint point to a new int object,
            # losing the reference to the int object passed as an argument
  print "iint inside:",iint # 4 if iint was 3 on function entry 

list1 = [1, 2]
appendItem(list1, 3)
print list1 # [1, 2, 3]
replaceItems(list1, [3, 4])
print list1 # [3, 4]
set1 = set([1, 2])
clearSet(set1)
print set1 # set([])
int1 = 3
tryToTouchAnInteger(int1)
print int1 # 3
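
That no copy is made can be checked directly with the is operator. The following is a small sketch in Python 3 syntax; return_argument is a hypothetical helper that simply hands back whatever it receives.

```python
def return_argument(obj):
    return obj   # hands back the very object that was passed in

big_list = list(range(1000))
received = return_argument(big_list)

print(received is big_list)   # both names refer to one and the same list object
received.append(-1)           # a change made through one name ...
print(big_list[-1])           # ... is visible through the other
```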

Preventing Argument Change

Python has no way to declare an argument constant so that the called function cannot change it. If an argument is of an immutable type, it cannot be changed anyway, but if it is of a mutable type such as list, the calling function is at the mercy of the called function. Thus, if the calling function wants to make sure a passed list does not get changed, it has to pass a copy of the list.

An example:

def evil_get_length(ilist):
  length = len(ilist)
  del ilist[:] # Muhaha: clear the list
  return length

list1 = [1, 2]
print evil_get_length(list1[:]) # Pass a copy of list1
print list1 # list1 = [1, 2]
print evil_get_length(list1) # list1 gets cleared
print list1 # list1 = []

Calling Functions

A function can be called by appending the arguments in parentheses to the function name, or an empty matched set of parentheses if the function takes no arguments.

foo()
square(3)
bar(5, x)

A function's return value can be used by assigning it to a variable, like so:

x = foo()
y = bar(5,x)

When calling a function, you can also specify the parameters by name, and you can do so in any order:

def display_message(message, start=0, end=4):
   print message[start:end]

display_message("message", end=3)

The call above is valid, and start will have the default value of 0. One restriction applies: after the first named argument, all arguments that follow must also be named. The following is not valid

display_message(end=5, start=1, "my message")

because the third argument ("my message") is an unnamed argument.
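
For comparison, here is a sketch of some valid ways to call the same function. It is written in Python 3 syntax, and display_message is changed to return the slice rather than print it so that the results are easy to check; that change is an assumption made only for this illustration.

```python
def display_message(message, start=0, end=4):
    return message[start:end]

# Only end is named; start keeps its default value of 0.
short = display_message("message", end=3)

# Named arguments may appear in any order after the positional one.
middle = display_message("message", end=5, start=1)

# All arguments named, so their order does not matter at all.
named = display_message(end=5, start=1, message="my message")

print(short, middle, named)
```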

Nested functions

Nested functions are functions defined within other functions. Arbitrary level of nesting is possible.

Nested functions can read variables declared in the immediately outside function. If such a variable refers to a mutable object, the nested function can even modify that object through its methods. Assigning to such a variable inside the nested function, however, creates a new local variable rather than changing the outer one, and an augmented assignment such as += on an outer integer throws UnboundLocalError. In Python 3, a variable of the immediately outside function can be declared in the nested function to be nonlocal, in an analogy to global. Once this is done, the nested function can assign a new value to that variable and that modification is going to be seen outside of the nested function.

Nested functions can be used in closures, on which see below. Furthermore, they can be used to reduce repetition of code that pertains only to a single function, often with a reduced argument list owing to seeing the immediately outside variables.

An example of a nested function that modifies an immediately outside variable that is a list and therefore mutable:

def outside():
  outsideList = [1, 2]
  def nested():
    outsideList.append(3)
  nested()
  print outsideList

An example in which the outside variable is first accessed below the nested function definition and it still works:

def outside():
  def nested():
    outsideList.append(3)
  outsideList = [1, 2]
  nested()
  print outsideList
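
By contrast, trying to rebind an outer variable from the nested function fails. The following sketch shows the UnboundLocalError mentioned earlier; the variable name counter and the returned message are made up for this illustration.

```python
def outside():
    counter = 0
    def nested():
        counter += 1      # the assignment makes counter local to nested()
    try:
        nested()
    except UnboundLocalError as error:
        return "Error caught: " + type(error).__name__
    return counter

result = outside()
print(result)
```

The error arises because the augmented assignment makes counter a local name of nested(), and that local has no value yet when += tries to read it.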

Keywords: inner functions, internal functions, local functions.

Closures

A closure is a nested function with an after-return access to the data of the outer function, where the nested function is returned by the outer function as a function object. Thus, even when the outer function has finished its execution after being called, the closure function returned by it can refer to the values of the variables that the outer function had when it defined the closure function.

An example:

def adder(outer_argument): # outer function
  def adder_inner(inner_argument): # inner function, nested function
    return outer_argument + inner_argument # Notice outer_argument
  return adder_inner
add5 = adder(5) # a function that adds 5 to its argument
add7 = adder(7) # a function that adds 7 to its argument
print add5(3) # prints 8
print add7(3) # prints 10

Closures are possible in Python because functions are first-class objects. A function is merely an object of type function. Being an object means it is possible to pass a function object (an uncalled function) around as argument or as return value or to assign another name to the function object. A unique feature that makes closure useful is that the enclosed function may use the names defined in the parent function's scope.

Lambda Expressions

A lambda is an anonymous (unnamed) function. It is used primarily to write very short functions that are a hassle to define in the normal way. A function like this:

>>> def add(a, b):
...    return a + b
...
>>> add(4, 3)
7

may also be defined using lambda

>>> print ((lambda a, b: a + b)(4, 3))
7

Lambda is often used as an argument to other functions that expect a function object, such as sorted()'s 'key' argument.

>>> sorted([[3, 4], [3, 5], [1, 2], [7, 3]], key=lambda x: x[1])
[[1, 2], [7, 3], [3, 4], [3, 5]]

The lambda form is often useful as a closure, such as illustrated in the following example:

>>> def attribution(name):
...    return lambda x: x + ' -- ' + name
...
>>> pp = attribution('John')
>>> pp('Dinner is in the fridge')
'Dinner is in the fridge -- John'

Note that the lambda function can use the values of variables from the scope in which it was created (like name). This is the essence of closure.

Generator Functions

When discussing loops, you came across the concept of an iterator. This yields in turn each element of some sequence, rather than the entire sequence at once, allowing you to deal with sequences much larger than might be able to fit in memory at once.

You can create your own iterators, by defining what is known as a generator function. To illustrate the usefulness of this, let us start by considering a simple function to return the concatenation of two lists:

def concat(a, b):
    return a + b

print concat([5, 4, 3], ["a", "b", "c"])
# prints [5, 4, 3, 'a', 'b', 'c']

Imagine wanting to do something like concat(range(0, 1000000), range(1000000, 2000000))

That would work, but it would consume a lot of memory.

Consider an alternative definition, which takes two iterators as arguments:

def concat(a, b):
    for i in a:
        yield i
    for i in b:
        yield i

Notice the use of the yield statement, instead of return. We can now use this something like

for i in concat(xrange(0, 1000000), xrange(1000000, 2000000)):
    print i

and print out an awful lot of numbers, without using a lot of memory at all.

Note: You can still pass a list or other sequence type wherever Python expects an iterator (like to an argument of your concat function); this will still work, and makes it easy not to have to worry about the difference where you don’t need to.
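
The sketch below, in Python 3 syntax, tries both uses of the generator version of concat: once with ordinary lists and once with two large ranges from which only a few values are requested. The names combined and first_three are introduced here for illustration.

```python
def concat(a, b):
    for i in a:
        yield i
    for i in b:
        yield i

# Lists work as arguments because a for loop accepts any iterable.
combined = list(concat([5, 4, 3], ["a", "b", "c"]))
print(combined)

# The generator is lazy: only the values actually requested are produced.
numbers = concat(range(0, 1000000), range(1000000, 2000000))
first_three = [next(numbers) for _ in range(3)]
print(first_three)
```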




Scoping


Variables

Variables in Python are automatically declared by assignment. Variables are always references to objects, and are never typed. Variables exist only in the current scope or global scope. When they go out of scope, the variables are destroyed, but the objects to which they refer are not (unless the number of references to the object drops to zero).

Scope is delineated by function and class blocks. Both functions and their scopes can be nested. For example:

def foo():
    def bar():
        x = 5 # x is now in scope
        return x + y # y is defined in the enclosing scope later
    y = 10
    return bar() # now that y is defined, bar's scope includes y

Now when this code is tested,

>>> foo()
15
>>> bar()
Traceback (most recent call last):
  File "<pyshell#26>", line 1, in -toplevel-
    bar()
NameError: name 'bar' is not defined

The name 'bar' is not found because a higher scope does not have access to the names lower in the hierarchy.

A common pitfall is to try to look up an attribute (such as a method) of an object (such as a container) through a variable before that variable has been assigned the object. In its most common form:

>>> for x in range(10):
         y.append(x) # append is an attribute of lists

Traceback (most recent call last):
  File "<pyshell#46>", line 2, in -toplevel-
    y.append(x)
NameError: name 'y' is not defined

Here, to correct this problem, one must add y = [] before the for loop.

A loop does not create its own scope:

for x in [1, 2, 3]:
  inner = x
print inner # 3 rather than an error

Keyword global

Global variables of a Python module are read-accessible from functions in that module. In fact, if they are mutable, they can also be modified via a method call. However, they cannot be modified via a plain assignment unless declared global in the function.

An example to clarify:

count1 = 1
count2 = 1
list1 = []
list2 = []

def test1():
  print count1    # Read access is unproblematic, referring to the global

def test2():
  try:
    print count1  # This print would be unproblematic, but it throws an error ...
    count1 += 1   # ... since count1 += 1 causes count1 to be local.
  except UnboundLocalError as error:
    print "Error caught:", error

def test3():
  list1 = [2]     # No outside effect; this rebinds list1 to be a local variable

def test4():
  global count2, list2
  print count1    # Read access is unproblematic, referring to the global
  count2 += 1     # We can modify count2 via assignment
  list1.append(1) # Impacts the global list1 even without global declaration
  list2 = [2]     # Impacts the global list2

test1()
test2()
test3()
test4()

print "count1:", count1  # 1
print "count2:", count2  # 2
print "list1:", list1    # [1]
print "list2:", list2    # [2]

Keyword nonlocal

Keyword nonlocal, available since Python 3.0, is an analogue of global for nested scopes. It enables a nested function to assign a new value to a variable that is local to the outer function.

An example:

# Requires Python 3
def outer():
  outerint = 0
  outerint2 = 10
  def inner():
    nonlocal outerint
    outerint = 1 # Impacts outer's outerint only because of the nonlocal declaration
    outerint2 = 1 # No impact
  inner()
  print(outerint)
  print(outerint2)

outer()

Simulation of nonlocal in Python 2 via a mutable object:

def outer():
  outerint = [1]           # Technique 1: Store int in a list
  class outerNL: pass      # Technique 2: Store int in a class
  outerNL.outerint2 = 11
  def inner():
    outerint[0] = 2        # List members can be modified
    outerNL.outerint2 = 12 # Class members can be modified
  inner()
  print outerint[0]
  print outerNL.outerint2

outer()

globals and locals

To find out which variables exist in the global and local scopes, you can use locals() and globals() functions, which return dictionaries:

int1 = 1
def test1():
  int1 = 2
  globals()["int1"] = 3  # Write access seems possible
  print locals()["int1"] # 2
  
test1()

print int1               # 3

Write access to locals() dictionary is discouraged by the Python documentation.



Input and output

x = 50
def fun():
    global x
    print("x is", x)
    x = 20
    print("global changed x", x)

fun()
print("value of x", x)

Input

Note on Python version: The following uses the syntax of Python 2.x. Some of the following is not going to work with Python 3.x.

Python has two functions designed for accepting data directly from the user:

  • input()
  • raw_input()

There are also very simple ways of reading a file and, for stricter control over input, reading from stdin if necessary.

raw_input()

raw_input() asks the user for a string of data (ended with a newline), and simply returns the string. It can also take an argument, which is displayed as a prompt before the user enters the data. E.g.

print raw_input('What is your name? ')

prints out

What is your name? <user input data here>

Example: in order to assign the user's name, i.e. string data, to a variable "x" you would type

x = raw_input('What is your name?')

Once the user has entered a name, e.g. Simon, you can refer to it as x

print 'Your name is ' + x

prints out

Your name is Simon

input()

input() reads a line from the user and evaluates it as a Python expression, returning the resulting object rather than a string. So entering

[1,2,3]

would return a list containing those numbers, just as if the list had been written directly in the script.

More complicated expressions are possible. For example, if a script says:

x = input('What are the first 10 perfect squares? ')

it is possible for a user to input:

map(lambda x: x*x, range(10))

which yields the correct answer in list form. Note that no inputted statement can span more than one line.

input() should not be used for anything but the most trivial program. Turning the strings returned from raw_input() into python types using an idiom such as:

x = None
while x is None:  # "while not x" would loop forever if the user entered 0
    try:
        x = int(raw_input())
    except ValueError:
        print 'Invalid Number'

is preferable, as input() uses eval() to turn a literal into a python type. This will allow a malicious person to run arbitrary code from inside your program trivially.
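
When a program really does need to accept literals such as lists or numbers, one safer option is the standard library function ast.literal_eval, which accepts only Python literals and does not call functions. The sketch below uses Python 3 syntax; the helper name parse_literal is hypothetical, and it takes the text as a parameter instead of reading it from the user so that it can be tried without typing anything.

```python
import ast

def parse_literal(text):
    """Parse text as a Python literal; return None if it is not one."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return None

print(parse_literal("[1,2,3]"))                    # parsed into a real list
print(parse_literal("__import__('os').getcwd()"))  # rejected, nothing is executed
```

Note that this sketch cannot tell invalid input apart from the literal None; a real program might prefer to let the exception propagate.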

File Input

File Objects

Python includes a built-in file type. Files can be opened by using the file type's constructor:

f = file('test.txt', 'r')

This means f is open for reading. The first argument is the filename and the second is the mode, which can be 'r' (read), 'w' (write), or 'a' (append), among some others such as 'r+' (read and write).

The most common way to read from a file is simply to iterate over the lines of the file:

f = open('test.txt', 'r')
for line in f:
    print line[0]
f.close()

This will print the first character of each line. Note that a newline is attached to the end of each line read this way.

The newer and better way to read from a file:

with open("test.txt", "r") as txt:
    for line in txt:
        print line

The advantage is that the opened file is closed automatically when the with block is left, even if an error occurs inside it.

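In CPython, a file is usually closed automatically once its file object is no longer referenced, so short scripts often omit the explicit close. This behaviour is not guaranteed by the language, which is why the with statement is preferred. With that caveat, the loop in the previous code can also be written as: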

for line in open('test.txt', 'r'):
    print line[0]

You can read limited numbers of characters at a time like this:

c = f.read(1)
while len(c) > 0:
    if len(c.strip()) > 0: print c,
    c = f.read(1)

This will read the characters from f one at a time, and then print them if they're not whitespace.

A file object implicitly contains a marker to represent the current position. If the file marker should be moved back to the beginning, one can either close the file object and reopen it or just move the marker back to the beginning with:

f.seek(0)

Standard File Objects

Like many other languages, Python has built-in file objects representing standard input, output, and error. These are in the sys module and are called stdin, stdout, and stderr. The original objects are also kept in __stdin__, __stdout__, and __stderr__. These are for IDLE and other tools in which the standard files have been replaced.

You must import the sys module to use the special stdin, stdout, stderr I/O handles.

import sys

For finer control over input, use sys.stdin.read(). In order to implement the UNIX 'cat' program in Python, you could do something like this:

import sys
for line in sys.stdin:
    print line,

Note that sys.stdin.read() will read from standard input until EOF (which is usually signalled with Ctrl+D on Unix-like systems).

Parsing command line

Command-line arguments passed to a Python program are stored in the sys.argv list. The first item in the list is the name of the Python program, which may or may not contain the full path depending on the manner of invocation. The sys.argv list is modifiable.

Printing all passed arguments except for the program name itself:

import sys
for arg in sys.argv[1:]:
  print arg

Parsing the passed arguments for options that start with a minus sign:

import sys
option_f = False
option_p = False
option_p_argument = ""
i = 1
while i < len(sys.argv):
  if sys.argv[i] == "-f":
    option_f = True
    sys.argv.pop(i)
  elif sys.argv[i] == "-p":
    option_p = True
    sys.argv.pop(i)
    option_p_argument = sys.argv.pop(i)
  else:
    i += 1

Above, the arguments at which options are found are removed so that sys.argv can be looped for all remaining arguments.

Parsing of command-line arguments is further supported by library modules optparse (deprecated), argparse (since Python 2.7) and getopt (to make life easy for C programmers).
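
As a rough sketch of how the same two options might be handled with argparse, consider the following. The function name parse_options and the destination names are chosen here to mirror the hand-written loop above; they are assumptions for illustration. The argument list is passed in explicitly so the example does not depend on how the script is started.

```python
import argparse

def parse_options(argv):
    """Parse a -f flag and a -p option that takes a value (names are illustrative)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-f", action="store_true", dest="option_f")
    parser.add_argument("-p", dest="option_p_argument", default="")
    parser.add_argument("rest", nargs="*")   # all remaining positional arguments
    return parser.parse_args(argv)

options = parse_options(["-f", "-p", "value", "file1", "file2"])
print(options.option_f, options.option_p_argument, options.rest)
```

In a real script one would typically call parse_options(sys.argv[1:]). Unlike the manual loop, argparse reports an error if -p is given without a value.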

Output

Note on Python version: The following uses the syntax of Python 2.x. Much of the following is not going to work with Python 3.x. In particular, Python 3.x requires round brackets around arguments to "print".

The basic way to do output is the print statement.

print 'Hello, world'

To print multiple things on the same line separated by spaces, use commas between them, like this:

print 'Hello,', 'World'

This will print out the following:

Hello, World

While neither string contained a space, a space was added by the print statement because of the comma between the two objects. Arbitrary data types can be printed this way:

print 1,2,0xff,0777,(10+5j),-0.999,map,sys

This will output the following:

1 2 255 511 (10+5j) -0.999 <built-in function map> <module 'sys' (built-in)>

Objects can be printed on the same line by separate print statements if one puts a comma at the end of each print statement:

for i in range(10):
    print i,

This will output the following:

0 1 2 3 4 5 6 7 8 9

To end the printed line with a newline, add a print statement without any objects.

for i in range(10):
    print i,
print
for i in range(10,20):
    print i,

This will output the following:

0 1 2 3 4 5 6 7 8 9
10 11 12 13 14 15 16 17 18 19

If the bare print statement were not present, the above output would look like:

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19

You can use similar syntax when writing to a file instead of to standard output, like this:

print >> f, 'Hello, world'

This will print to any object that implements write(), which includes file objects.

Omitting newlines

To avoid adding spaces and newlines between objects' output with subsequent print statements, you can do one of the following:

Concatenation: Concatenate the string representations of each object, then later print the whole thing at once.

print str(1)+str(2)+str(0xff)+str(0777)+str(10+5j)+str(-0.999)+str(map)+str(sys)

This will output the following:

12255511(10+5j)-0.999<built-in function map><module 'sys' (built-in)>

Write function: You can make a shorthand for sys.stdout.write and use that for output.

import sys
write = sys.stdout.write
write('20')
write('05\n')

This will output the following:

2005

You may need sys.stdout.flush() to get that text on the screen quickly.

Examples

Examples of output with Python 2.x:

  • print "Hello"
  • print "Hello", "world"
    • Separates the two words with a space.
  • print "Hello", 34
    • Prints elements of various data types, separating them by a space.
  • print "Hello " + 34
    • Throws an error as a result of trying to concatenate a string and an integer.
  • print "Hello " + str(34)
    • Uses "+" to concatenate strings, after converting a number to a string.
  • print "Hello",
    • Prints "Hello " without a newline, with a space at the end.
  • sys.stdout.write("Hello")
    • Prints "Hello" without a newline. Doing "import sys" is a prerequisite. Needs a subsequent "sys.stdout.flush()" in order to display immediately on the user's screen.
  • sys.stdout.write("Hello\n")
    • Prints "Hello" with a newline.
  • print >> sys.stderr, "An error occurred."
    • Prints to standard error stream.
  • sys.stderr.write("Hello\n")
    • Prints to standard error stream.
  • sum=2+2; print "The sum: %i" % sum
    • Prints a string that has been formatted with the use of an integer passed as an argument.
  • formatted_string = "The sum: %i" % (2+2); print formatted_string
    • Like the previous, just that the formatting happens outside of the print statement.
  • print "Float: %6.3f" % 1.23456
    • Outputs "Float:  1.235". The number 3 after the period specifies the number of decimal digits after the period to be displayed (the value is rounded), while 6 before the period specifies the minimum total number of characters the displayed number should take, padded with spaces on the left if needed.
  • print "%s is %i years old" % ("John", 23)
    • Passes two arguments to the formatter.

Examples of output with Python 3.x:

  • from __future__ import print_function
    • Ensures Python 2.6 and later Python 2.x can use Python 3.x print function.
  • print ("Hello", "world")
    • Prints the two words separated with a space. Notice the surrounding brackets, unused in Python 2.x.
  • print ("Hello world", end="")
    • Prints without the ending newline.
  • print ("Hello", "world", sep="-")
    • Prints the two words separated with a dash.
  • print ("Error", file=sys.stderr)
    • Outputs to a file handle, in this case standard error stream.

File Output

Printing numbers from 1 to 10 to a file, one per line:

file1 = open("TestFile.txt","w")
for i in range(1,10+1):
  print >>file1, i
file1.close()

With "w", the file is opened for writing. With ">>file", print sends its output to a file rather than standard output.

Printing numbers from 1 to 10 to a file, separated with a dash:

file1 = open("TestFile.txt","w")
for i in range(1,10+1):
  if i>1:
    file1.write("-")
  file1.write(str(i))
file1.close()

Opening a file for appending rather than overwriting:

file1 = open("TestFile.txt","a")
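
The same file operations can be sketched in Python 3 syntax with the with statement, which closes the file automatically. To avoid touching any existing file, this illustration writes into a temporary directory; the path handling and the appended text are assumptions made only for the example.

```python
import os
import tempfile

# Write to a temporary directory so no existing file is touched.
path = os.path.join(tempfile.mkdtemp(), "TestFile.txt")

with open(path, "w") as file1:            # "w" creates or truncates the file
    file1.write("-".join(str(i) for i in range(1, 10 + 1)))

with open(path, "a") as file1:            # "a" appends to the existing content
    file1.write("\nappended line\n")

with open(path, "r") as file1:
    content = file1.read()
print(content)
```

Using "-".join avoids the special case for the first number that the explicit loop needs.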

See also Files chapter.

Formatting

Formatting numbers and other values as strings using the string percent operator:

v1 = "Int: %i" % 4               # 4
v2 = "Int zero padded: %03i" % 4 # 004
v3 = "Int space padded: %3i" % 4 #   4
v4 = "Hex: %x" % 31              # 1f
v5 = "Hex 2: %X" % 31            # 1F - capitalized F
v6 = "Oct: %o" % 8               # 10
v7 = "Float: %f" % 2.4           # 2.400000
v8 = "Float: %.2f" % 2.4         # 2.40
v9 = "Float in exp: %e" % 2.4    # 2.400000e+00
vA = "Float in exp: %E" % 2.4    # 2.400000E+00
vB = "List as string: %s" % [1, 2, 3]
vC = "Left padded str: %10s" % "cat"
vD = "Right padded str: %-10s" % "cat"
vE = "Truncated str: %.2s" % "cat"
vF = "Dict value str: %(age)s" % {"age": 20}
vG = "Char: %c" % 65             # A
vH = "Char: %c" % "A"            # A

Formatting numbers and other values as strings using the format() string method, since Python 2.6:

v1 = "Arg 0: {0}".format(31)     # 31
v2 = "Args 0 and 1: {0}, {1}".format(31, 65)
v3 = "Args 0 and 1: {}, {}".format(31, 65)
v4 = "Arg indexed: {0[0]}".format(["e1", "e2"])
v5 = "Arg named: {a}".format(a=31)
v6 = "Hex: {0:x}".format(31)     # 1f
v7 = "Hex: {:x}".format(31)      # 1f - arg 0 is implied
v8 = "Char: {0:c}".format(65)    # A
v9 = "Hex: {:{h}}".format(31, h="x") # 1f - nested evaluation

Formatting numbers and other values as strings using literal string interpolation, since Python 3.6:

int1 = 31; int2 = 41; str1="aaa"; myhex = "x"
v1 = f"Two ints: {int1} {int2}"
v2 = f"Int plus 1: {int1+1}"      # 32 - expression evaluation
v3 = f"Str len: {len(str1)}"      # 3 - expression evaluation
v4 = f"Hex: {int1:x}"             # 1f
v5 = f"Hex: {int1:{myhex}}"       # 1f - nested evaluation



Files


File I/O

Read entire file:

inputFileText = open("testit.txt", "r").read()
print(inputFileText)

In this case the "r" parameter means the file will be opened in read-only mode.

Read a certain number of bytes from a file:

inputFileText = open("testit.txt", "r").read(123)
print(inputFileText)

A newly opened file is read from the beginning. For random access, use seek() to change the current position in the file and tell() to report it. This is illustrated in the following example:

>>> f=open("/proc/cpuinfo","r")
>>> f.tell()
0L
>>> f.read(10)
'processor\t'
>>> f.read(10)
': 0\nvendor'
>>> f.tell()
20L
>>> f.seek(10)
>>> f.tell()
10L
>>> f.read(10)
': 0\nvendor'
>>> f.close()
>>> f
<closed file '/proc/cpuinfo', mode 'r' at 0xb7d79770>

Here a file is opened and ten bytes are read twice; tell() then shows that the current offset is 20. Next, seek() moves back to position 10 (the position where the second read started), and the same ten bytes are read and printed again. When no more operations on the file are needed, close() is used to close it.
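The session above is from Python 2 on Linux. A comparable sketch for Python 3 is shown below; it assumes binary mode, where offsets are plain byte counts, and uses a temporary file with made-up content instead of /proc/cpuinfo.

```python
import os
import tempfile

# Hypothetical sample file with 20 known bytes.
path = os.path.join(tempfile.mkdtemp(), "sample.bin")
with open(path, "wb") as f:
    f.write(b"0123456789abcdefghij")

with open(path, "rb") as f:
    start = f.tell()          # position right after opening
    first = f.read(10)        # first ten bytes
    second = f.read(10)       # next ten bytes
    after_reads = f.tell()    # position after both reads
    f.seek(10)                # go back to where the second read started
    again = f.read(10)        # the same ten bytes as in `second`

print(start, first, second, after_reads, again)
```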

Read one line at a time:

for line in open("testit.txt", "r"):
    print line

Iterating over a file object yields one line at a time. Alternatively, readlines() returns a list containing the individual lines of the file, and readline() returns the current line as a string. The example above outputs an additional blank line between the individual lines of the file, because each line read from the file keeps its trailing newline and print adds another one.
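A minimal sketch, assuming Python 3, of how the trailing newline can be stripped while iterating, and of how readline() and readlines() behave; the file name and content are made up for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "testit.txt")
with open(path, "w") as f:
    f.write("first\nsecond\nthird\n")

# Iterate line by line, removing the newline that each line carries.
with open(path) as f:
    stripped = [line.rstrip("\n") for line in f]

# readline() returns one line; readlines() returns the remaining lines as a list.
with open(path) as f:
    first_line = f.readline()
    remaining = f.readlines()

print(stripped)
print(repr(first_line), remaining)
```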

Writing to a file requires the second parameter of open() to be "w"; if the file already exists, its contents are overwritten when it is opened:

outputFileText = "Here's some text to save in a file"
open("testit.txt", "w").write(outputFileText)

Appending to a file requires the second parameter of open() to be "a" (for append):

outputFileText = "Here's some text to add to the existing file."
open("testit.txt", "a").write(outputFileText)

Note that this does not add a line break between the existing file content and the string to be added.
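If the appended text should start on its own line, the line break has to be written explicitly. A small sketch, assuming Python 3 and a made-up temporary file:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "testit.txt")

# Existing content that does not end with a newline.
with open(path, "w") as f:
    f.write("existing content")

# Write the line break ourselves before the appended text.
with open(path, "a") as f:
    f.write("\n" + "appended line\n")

with open(path) as f:
    text = f.read()
print(repr(text))
```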

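Since Python 2.5, you can use the with keyword to ensure that the file handle is released as soon as possible, even when an exception is raised (in Python 2.5 itself, the statement must first be enabled with from __future__ import with_statement):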

with open("input.txt") as file1:
  data = file1.read()
  # process the data

Or one line at a time:

with open("input.txt") as file1:
  for line in file1:
    print line

The with keyword is covered further in the Context Managers chapter.

Testing Files

Determine whether path exists:

import os
os.path.exists('<path string>')

On systems such as Microsoft Windows, the backslash used as the directory separator is also the escape character in Python string literals. To get around this, double each backslash:

import os
os.path.exists('C:\\windows\\example\\path')

A better way, however, is to use a raw string, marked with the r prefix:

import os
os.path.exists(r'C:\windows\example\path')

os.path.exists() only confirms whether or not a path exists. Other convenient functions in os.path let you know whether the path is a file, a directory, a mount point or a symlink. There is even a function, os.path.realpath(), which reveals the true destination of a symlink:

>>> import os
>>> os.path.isfile("/")
False
>>> os.path.isfile("/proc/cpuinfo")
True
>>> os.path.isdir("/")
True
>>> os.path.isdir("/proc/cpuinfo")
False
>>> os.path.ismount("/")
True
>>> os.path.islink("/")
False
>>> os.path.islink("/vmlinuz")
True
>>> os.path.realpath("/vmlinuz")
'/boot/vmlinuz-2.6.24-21-generic'
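The session above depends on paths that exist only on a particular Linux system. The following sketch, assuming Python 3, performs similar checks on a temporary directory so that it can run anywhere; the file names are hypothetical.

```python
import os
import tempfile

base = tempfile.mkdtemp()                    # a directory that certainly exists
file_path = os.path.join(base, "data.txt")
with open(file_path, "w") as f:
    f.write("x")

checks = {
    "file exists": os.path.exists(file_path),
    "file is a file": os.path.isfile(file_path),
    "file is a directory": os.path.isdir(file_path),
    "base is a directory": os.path.isdir(base),
    "missing path exists": os.path.exists(os.path.join(base, "nope.txt")),
}
for name, value in sorted(checks.items()):
    print(name, value)
```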

Common File Operations

To copy or move a file, use the shutil library.

import shutil
shutil.move("originallocation.txt","newlocation.txt")
shutil.copy("original.txt","copy.txt")

To copy a directory tree recursively, use copytree(); to remove one recursively, use rmtree():

import shutil
shutil.copytree("dir1","dir2")
shutil.rmtree("dir1")

To remove an individual file there exists the remove() function in the os module:

import os
os.remove("file.txt")
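The operations above can be tried safely inside a temporary directory. This is a minimal sketch, assuming Python 3; all file names are made up.

```python
import os
import shutil
import tempfile

base = tempfile.mkdtemp()
original = os.path.join(base, "original.txt")
with open(original, "w") as f:
    f.write("data")

copy_path = os.path.join(base, "copy.txt")
shutil.copy(original, copy_path)        # now both files exist

moved = os.path.join(base, "moved.txt")
shutil.move(original, moved)            # original.txt is gone, moved.txt exists

os.remove(copy_path)                    # delete a single file
remaining = sorted(os.listdir(base))

shutil.rmtree(base)                     # delete the directory and its contents
gone = not os.path.exists(base)
print(remaining, gone)
```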

Finding Files

Files can be found using glob:

import glob
glob.glob('*.txt') # Finds files in the current directory ending in dot txt
glob.glob('*\\*.txt') # Finds files in any of the direct subdirectories
                      # of the current directory ending in dot txt
glob.glob('C:\\Windows\\*.exe')
for fileName in glob.glob('C:\\Windows\\*.exe'):
  print(fileName)
glob.glob('C:\\Windows\\**\\*.exe', recursive=True) # Py 3.5: ** matches nested directories
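The difference between a plain * pattern and a recursive ** pattern can be checked on a small throwaway directory tree. With recursive=True, a ** that forms a whole path component matches zero or more directory levels. The sketch assumes Python 3.5 or later; the directory layout is made up.

```python
import glob
import os
import tempfile

base = tempfile.mkdtemp()
os.makedirs(os.path.join(base, "sub", "deeper"))
for rel in ("a.txt",
            os.path.join("sub", "b.txt"),
            os.path.join("sub", "deeper", "c.txt"),
            "d.log"):
    with open(os.path.join(base, rel), "w") as f:
        f.write("")

# Only the top level of `base`.
top = glob.glob(os.path.join(base, "*.txt"))
# Every level below `base`, including `base` itself.
nested = glob.glob(os.path.join(base, "**", "*.txt"), recursive=True)

top_names = sorted(os.path.basename(p) for p in top)
nested_names = sorted(os.path.basename(p) for p in nested)
print(top_names, nested_names)
```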

The content of a directory can be listed using listdir:

import os
filesAndDirectories = os.listdir('.')
for item in filesAndDirectories:
  if os.path.isfile(item) and item.endswith('.txt'):
    print "Text file: " + item
  if os.path.isdir(item):
    print "Directory: " + item

Getting a list of all items in a directory, including the nested ones:

for root, directories, files in os.walk('/user/Joe Hoe'):
  print "Root: " + root                          # e.g. /user/Joe Hoe/Docs
  for dir1 in directories:
    print "Dir.: " + dir1                        # e.g. Fin
    print "Dir. 2: " + os.path.join(root, dir1)  # e.g. /user/Joe Hoe/Docs/Fin
  for file1 in files:
    print "File: " + file1                       # e.g. MyFile.txt
    print "File 2: " + os.path.join(root, file1) # e.g. /user/Joe Hoe/Docs/MyFile.txt

Above, root takes the value of each directory under /user/Joe Hoe, including /user/Joe Hoe itself, while directories and files list only the items directly present in each root.

Getting a list of all files in a directory, including the nested ones, ending in .txt, using a list comprehension:

files = [os.path.join(r, f) for r, d, fs in os.walk(".") for f in fs
         if f.endswith(".txt")]
# As iterator
files = (os.path.join(r, f) for r, d, fs in os.walk(".") for f in fs
         if f.endswith(".txt"))

Current Directory

Getting current working directory:

os.getcwd()

Changing current working directory:

os.chdir('C:\\')



Text


To get the length of a string, we use the len() function:

>>> len("Hello Wikibooks!")
16

You can slice strings just like lists and any other sequences:

>>> "Hello Wikibooks!"[0:5]
'Hello'
>>> "Hello Wikibooks!"[5:11]
' Wikib'
>>> "Hello Wikibooks!"[:5] #equivalent of [0:5]
'Hello'

To get the ASCII code of a character, use the ord() function.

>>> ord('h')
104
>>> ord('a')
97
>>> ord('^')
94

To get the character encoded by an ASCII code number, use the chr() function.

>>> chr(104)
'h'
>>> chr(97)
'a'
>>> chr(94)
'^'

To check whether all the characters in a string are alphanumeric, that is, letters or digits, use the isalnum() method. It returns True if the string contains at least one character and all of its characters are alphanumeric.

To check whether all the characters in a string are letters, use the isalpha() method. It returns True if the string contains at least one character and all of its characters are alphabetic.
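A few sample strings show how the two checks behave, including the empty string and a string containing a space. This is a small sketch assuming Python 3.

```python
samples = ["abc123", "abc", "abc 123", ""]

# Map each sample to the pair (isalnum result, isalpha result).
results = {s: (s.isalnum(), s.isalpha()) for s in samples}
for s in samples:
    print(repr(s), results[s])
```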

Example

stringparser.py

# Add each character, and its ordinal, of user's text input, to two lists
s = input("Enter value: ")  # this line requires Python 3.x, use raw_input() instead of input() in Python 2.x
l1 = [] 
l2 = []
for c in s:   # in Python, a string is just a sequence, so we can iterate over it!
    l1.append(c) 
    l2.append(ord(c))
print(l1)
print(l2)

Or shorter (using list comprehension instead of the for block):

# Add each character, and its ordinal, of user's text input, to two lists
s = input("Enter value: ")  # this line requires Python 3.x, use raw_input() instead of input() in Python 2.x

l1=[c for c in s]   # in Python, a string is just a sequence, so we can iterate over it!
l2=[ord(c) for c in s]

print(l1)
print(l2)


Output:

Enter value: string
['s', 't', 'r', 'i', 'n', 'g']
[115, 116, 114, 105, 110, 103]

Or

Enter value: Hello, Wikibooks!
['H', 'e', 'l', 'l', 'o', ',', ' ', 'W', 'i', 'k', 'i', 'b', 'o', 'o', 'k', 's', '!']
[72, 101, 108, 108, 111, 44, 32, 87, 105, 107, 105, 98, 111, 111, 107, 115, 33]


Exercises

  1. Use Python to determine the difference in ASCII code between lowercase and upper case letters.
  2. Write a program that converts a lowercase letter to an upper case letter using the ASCII code. (Note that there are better ways to do this, but you should do it once using the ASCII code to get a feel for how the language works)



Modules


Modules are a way to structure a program and create reusable libraries. A module is usually stored in, and corresponds to, a separate .py file. Many modules are available from the standard library, and you can create your own. Python searches for modules in the current directory and other locations; the list of module search locations can be extended through the PYTHONPATH environment variable and by other means.

Importing a Module

To use the functions and classes offered by a module, you have to import the module:

import math
print math.sqrt(10)

The above imports the math standard module and makes everything it defines, functions as well as any classes and constants, available under the module name as a namespace.

You can import the module under a different name:

import math as Mathematics
print Mathematics.sqrt(10)

You can import a single function, making it available without the module name namespace:

from math import sqrt
print sqrt(10)

You can import a single function and make it available under a different name:

from math import cos as cosine
print cosine(10)

You can import multiple modules in a row:

import os, sys, re

You can make an import as late as in a function definition:

def sqrtTen():
  import math
  print math.sqrt(10)

Such an import only takes place when the function is called.

You can import all functions from the module without the module namespace, using an asterisk notation:

from math import *
print sqrt(10)

However, if you do this inside a function, you get a warning in Python 2 (and error in Python 3):

def sqrtTen():
  from math import *
  print sqrt(10)

You can guard against a module not being found:

try:
  import custommodule
except ImportError:
  pass
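Rather than silently passing, the guard is often used to record that the module is missing so that later code can fall back to something else. A minimal sketch, assuming Python 3; the module name and the describe() helper are hypothetical.

```python
try:
    # Hypothetical optional dependency that is assumed not to be installed.
    import custommodule_that_probably_does_not_exist as custommodule
except ImportError:
    custommodule = None   # remember that the module is unavailable

def describe():
    """Report which code path would be taken."""
    if custommodule is None:
        return "fallback"
    return "custom"

print(describe())
```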

Modules can be different kinds of things:

  • Python files
  • Shared Objects (under Unix and Linux) with the .so suffix
  • DLLs (under Windows) with the .pyd suffix
  • Directories

Modules are searched for in the order given by sys.path, and the first match is loaded. The directory of the running script (or the current directory in an interactive session) comes first on the path.

A directory used as a module (a package) should contain a file called __init__.py, which typically imports the other files in the directory.

Creating a DLL that interfaces with Python is covered in another section.

Imported Check

You can check whether a module has been imported as follows:

import sys
if "re" in sys.modules:
  print "Regular expression module is ready for use."

Creating a Module

From a File

The easiest way to create a module is by having a file called mymod.py either in a directory recognized by the PYTHONPATH variable or (even easier) in the same directory where you are working. If you have the following file mymod.py

class Object1:
        def __init__(self):
                self.name = 'object 1'

you can already import this module and create instances of the class Object1, either through the module name or after importing its names directly:

import mymod
myobject = mymod.Object1()
from mymod import *
myobject = Object1()

From a Directory

It is not feasible for larger projects to keep all classes in a single file. It is often easier to store the files in a directory and load them all with one command. Such a directory needs an __init__.py file, which contains Python statements that are executed when the directory is imported.

Suppose we have two more classes called Object2 and Object3 and we want to load all three with one command. We then create a directory called mymod and store three files called Object1.py, Object2.py and Object3.py in it. These files would then contain one class per file, but this is not required (although it adds clarity). We would then write the following __init__.py file:

from .Object1 import *   # explicit relative imports; required in Python 3
from .Object2 import *
from .Object3 import *

__all__ = ["Object1", "Object2", "Object3"]

The first three commands tell Python what to do when somebody loads the module. The last statement, defining __all__, tells Python which names to export when somebody executes from mymod import *. Usually we want to use parts of a module in other parts of the module, e.g. we want to use Object1 in Object2. We can do this with a from . import * command, as the following file Object2.py shows:

from . import *

class Object2:
        def __init__(self):
                self.name = 'object 2'
                self.otherObject = Object1()

We can now start python and import mymod as we have in the previous section.

Making a program usable as a module

In order to make a program usable both as a standalone program called from the command line and as a module, it is advisable to place all code in functions and methods, designate one function as the main one, and call that main function only when the built-in __name__ equals '__main__'. This ensures that the code in the main function is not run when your program is imported as a module; code placed outside of functions and methods would be run upon import.

Your program, stored in mymodule.py, can look as follows:

def reusable_function(x, y):
  return x + y

def main():
  pass
  # Any code you like

if __name__ == '__main__':
  main()

Another program can then use the above as a module:

from mymodule import reusable_function
my_result = reusable_function(4, 5)

Extending Module Path

When import is requested, modules are searched for in the directories and zip files on the module path, accessible via sys.path, a Python list. The module path can be extended as follows:

import sys
sys.path.append("/My/Path/To/Module/Directory")
from ModuleFileName import my_function

Above, if ModuleFileName.py is located at /My/Path/To/Module/Directory and contains a definition of my_function, the 2nd line ensures the 3rd line actually works.
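The effect can be reproduced end to end by writing a small module into a temporary directory first. The sketch below assumes Python 3; the module name demo_temp_module and its content are made up for illustration.

```python
import os
import sys
import tempfile

# Create a directory that is not yet on the module path and put a module in it.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "demo_temp_module.py"), "w") as f:
    f.write("def my_function():\n    return 'hello from the module'\n")

# Without this line, the import below would fail with ImportError.
sys.path.append(module_dir)

from demo_temp_module import my_function
result = my_function()
print(result)
```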

Module Names

Module names used in an import statement must be valid Python identifiers: they consist of letters, digits and underscores and do not start with a digit, so a dash cannot be used. While my-module.py can be created and run, import my-module fails. The name of a module is the name of the module file minus the .py suffix.

Module names are case sensitive. If the module file is called MyModule.py, doing "import mymodule" fails while "import MyModule" is fine.

PEP 8 recommends that module names be in all lowercase, with possible use of underscores.

Examples of module names from the standard library include math, sys, io, re, urllib, difflib, and unicodedata.

Built-in Modules

For a module to be built-in is not the same as to be part of the standard library. For instance, re is not a built-in module but rather a module written in Python. By contrast, _sre is a built-in module.

Obtaining a list of built-in module names:

import sys
print sys.builtin_module_names
print "_sre" in sys.builtin_module_names # True
print "math" in sys.builtin_module_names # True or False, depending on how Python was built



Classes


Classes are a way of aggregating similar data and functions. A class is basically a scope inside which various code (especially function definitions) is executed, and the locals to this scope become attributes of the class, and of any objects constructed by this class. An object constructed by a class is called an instance of that class.

Overview

Classes in Python at a glance:

import math
class MyComplex:
  """A complex number"""       # Class documentation
  classvar = 0.0               # A class attribute, not an instance one
  def phase(self):             # A method
    return math.atan2(self.imaginary, self.real)
  def __init__(self):          # A constructor
    """A constructor"""
    self.real = 0.0            # An instance attribute
    self.imaginary = 0.0
c1 = MyComplex()
c1.real = 3.14                 # No access protection
c1.imaginary = 2.71
phase = c1.phase()             # Method call
c1.undeclared = 9.99           # Add an instance attribute
del c1.undeclared              # Delete an instance attribute

print vars(c1)                 # Attributes as a dictionary
vars(c1)["undeclared2"] = 7.77 # Write access to an attribute
print c1.undeclared2           # 7.77, indeed

MyComplex.classvar = 1         # Class attribute access
print c1.classvar == 1         # True; class attribute access, not an instance one
print "classvar" in vars(c1)   # False
c1.classvar = -1               # An instance attribute overshadowing the class one
MyComplex.classvar = 2         # Class attribute access
print c1.classvar == -1        # True; instance attribute access
print "classvar" in vars(c1)   # True

class MyComplex2(MyComplex):   # Class derivation or inheritance
  def __init__(self, re = 0, im = 0):
    self.real = re             # A constructor with multiple arguments with defaults
    self.imaginary = im
  def phase(self):
    print "Derived phase"
    return MyComplex.phase(self) # Call to a base class; "super"
c3 = MyComplex2()
c4 = MyComplex2(1, 1)
c4.phase()                     # Call to the method in the derived class

class Record: pass             # Class as a record/struct with arbitrary attributes
record = Record()
record.name = "Joe"
record.surname = "Hoe"

Defining a Class

To define a class, use the following format:

class ClassName:
    "Here is an explanation about your class"
    pass

The capitalization in this class definition is a convention and is not required by the language. It is usually good to add at least a short explanation of what your class is supposed to do. The pass statement in the code above tells the Python interpreter to do nothing and go on. You can remove it as soon as you add your first statement.

Instance Construction

The class is a callable object that constructs an instance of the class when called. Let's say we create a class Foo.

class Foo:
    "Foo is our new toy."
    pass

To construct an instance of the class, Foo, "call" the class object:

f = Foo()

This constructs an instance of class Foo and creates a reference to it in f.

Class Members

In order to access the member of an instance of a class, use the syntax <class instance>.<member>. It is also possible to access the members of the class definition with <class name>.<member>.

Methods

A method is a function within a class. The first argument (methods must always take at least one argument) is always the instance of the class on which the function is invoked. For example

>>> class Foo:
...     def setx(self, x):
...         self.x = x
...     def bar(self):
...         print self.x

If this code were executed, nothing would happen, at least until an instance of Foo were constructed, and then bar were called on that instance.

Why a mandatory argument?

In a normal function, a variable assigned inside the function, such as test = 23, is local: it cannot be accessed once the function returns, and typing test outside the function reports that it is not defined. The same is true in methods, unless the value is stored as an attribute of the instance through the self argument.

In the previous example, if setx() assigned to a plain x instead of self.x, the value would disappear as soon as setx() returned, and bar() could not access it. Assigning to self.x stores the value on the instance, where every method called on that instance can reach it.

Why self?

The name self is not required by the language; any valid name can be used for the first parameter. However, naming it self is a strong convention.
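Both points can be seen in a short sketch, assuming Python 3. The class Counter and its methods are hypothetical: one method uses a local variable that is lost, and another deliberately names its first parameter this instead of self.

```python
class Counter:
    def increment(self):
        count = 1   # local variable: gone as soon as the method returns
        # Attribute on the instance: survives between method calls.
        self.total = getattr(self, "total", 0) + 1

    def peek(this):            # unconventional name, but it still works
        return this.total

c = Counter()
c.increment()
c.increment()
total = c.peek()
has_local = hasattr(c, "count")   # the local variable never reached the instance
print(total, has_local)
```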

Invoking Methods

Calling a method is much like calling a function, but instead of passing the instance as the first parameter like the list of formal parameters suggests, use the function as an attribute of the instance.

>>> f = Foo()
>>> f.setx(5)
>>> f.bar()

This will output

5

It is possible to call the method on an arbitrary object, by using it as an attribute of the defining class instead of an instance of that class, like so:

>>> Foo.setx(f,5)
>>> Foo.bar(f)

This will have the same output.

Dynamic Class Structure

As shown by the method setx above, the members of a Python class can change during runtime, not just their values, unlike classes in languages like C++ or Java. We can even delete f.x after running the code above.

>>> del f.x
>>> f.bar()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 5, in bar
AttributeError: Foo instance has no attribute 'x'

Another effect of this is that we can change the definition of the Foo class during program execution. In the code below, we create a member of the Foo class definition named y. If we then create a new instance of Foo, it will now have this new member.

>>> Foo.y = 10
>>> g = Foo()
>>> g.y
10

Viewing Class Dictionaries

At the heart of all this is a dictionary of members, which can be accessed with vars(): vars(ClassName) for a class and vars(instance) for an instance.

>>> vars(g)
{}

At first, this output makes no sense. We just saw that g had the member y, so why isn't it in the member dictionary? If you remember, though, we put y in the class definition, Foo, not g.

>>> vars(Foo)
{'y': 10, 'bar': <function bar at 0x4d6a3c>, '__module__': '__main__',
 'setx': <function setx at 0x4d6a04>, '__doc__': None}

And there we have all the members of the Foo class definition. When Python checks for g.member, it first checks g's vars dictionary for "member," then Foo. If we create a new member of g, it will be added to g's dictionary, but not Foo's.

>>> g.setx(5)
>>> vars(g)
{'x': 5}

Note that if we now assign a value to g.y, we are not assigning that value to Foo.y. Foo.y will still be 10, but g.y will now override Foo.y

>>> g.y = 9
>>> vars(g)
{'y': 9, 'x': 5}
>>> vars(Foo)
{'y': 10, 'bar': <function bar at 0x4d6a3c>, '__module__': '__main__',
 'setx': <function setx at 0x4d6a04>, '__doc__': None}

Sure enough, if we check the values:

>>> g.y
9
>>> Foo.y
10

Note that f.y will also be 10, as Python won't find 'y' in vars(f), so it will get the value of 'y' from vars(Foo).

Some may have also noticed that the methods in Foo appear in the class dictionary along with y. If you remember from the section on lambda functions, we can treat functions just like variables. This means that we can assign methods to a class at runtime in the same way we assigned variables. If you do this, though, remember that when we call a method on a class instance, the first parameter passed to the method will always be the instance itself.
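As a minimal sketch of assigning a method to a class at runtime, assuming Python 3: the function double_x is hypothetical and is defined outside the class, then attached to it.

```python
class Foo:
    def setx(self, x):
        self.x = x

# An ordinary function; its first parameter will receive the instance.
def double_x(self):
    return self.x * 2

Foo.double_x = double_x      # attach it to the class at runtime

f = Foo()
f.setx(21)
result = f.double_x()        # f is passed as the first argument
in_class_dict = "double_x" in vars(Foo)
print(result, in_class_dict)
```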

Changing Class Dictionaries

We can also access the members dictionary of a class or of an instance using its __dict__ attribute.

>>> g.__dict__
{'y': 9, 'x': 5}

If we add, remove, or change key-value pairs from g.__dict__, this has the same effect as if we had made those changes to the members of g.

>>> g.__dict__['z'] = -4
>>> g.z
-4

Why use classes?

Classes are useful because each instance, once made, is independent of all other instances. We could have two instances, each with a different x value, and neither would affect the other's x.

f = Foo()
f.setx(324)
f.bar()
g = Foo()
g.setx(100)
g.bar()

f.bar() and g.bar() will print different values.

New Style Classes

New-style classes were introduced in Python 2.2. A new-style class is a class that has a built-in type as its base, most commonly object. At a low level, a major difference between old and new classes is their type: old-style class instances were all of type instance, whereas for a new-style instance x, type(x) returns the same thing as x.__class__. This puts user-defined classes on a level playing field with built-ins. Old-style (classic) classes were removed in Python 3, where every class is a new-style class. With this in mind, all Python 2 development should use new-style classes. New-style classes also add constructs like properties and static methods, familiar to Java programmers.

Old/Classic Class

>>> class ClassicFoo:
...     def __init__(self):
...         pass

New Style Class

>>> class NewStyleFoo(object):
...     def __init__(self):
...         pass

Properties

Properties are attributes with getter and setter methods.

>>> class SpamWithProperties(object):
...     def __init__(self):
...         self.__egg = "MyEgg"
...     def get_egg(self):
...         return self.__egg
...     def set_egg(self, egg):
...         self.__egg = egg
...     egg = property(get_egg, set_egg)

>>> sp = SpamWithProperties()
>>> sp.egg
'MyEgg'
>>> sp.egg = "Eggs With Spam"
>>> sp.egg
'Eggs With Spam'
>>>

Since Python 2.6, the same can be written with the @property decorator:

>>> class SpamWithProperties(object):
...     def __init__(self):
...         self.__egg = "MyEgg"
...     @property
...     def egg(self):
...         return self.__egg
...     @egg.setter
...     def egg(self, egg):
...         self.__egg = egg
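One reason to prefer a property over a plain attribute is that the setter can validate the new value, and a property without a setter is read-only. The following sketch assumes Python 3; the Temperature class is a hypothetical example, not part of the text above.

```python
class Temperature(object):
    def __init__(self, celsius=0.0):
        self.celsius = celsius          # goes through the setter below

    @property
    def celsius(self):
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        if value < -273.15:
            raise ValueError("temperature below absolute zero")
        self._celsius = value

    @property
    def fahrenheit(self):               # no setter: read-only
        return self._celsius * 9.0 / 5.0 + 32.0

t = Temperature(100.0)
boiling_f = t.fahrenheit

try:
    t.celsius = -300.0                  # rejected by the setter
    rejected = False
except ValueError:
    rejected = True

print(boiling_f, rejected, t.celsius)
```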

Static Methods

Static methods in Python are similar to their counterparts in C++ or Java. They have no "self" argument and don't require you to instantiate the class before using them. They can be defined using staticmethod():

>>> class StaticSpam(object):
...     def StaticNoSpam():
...         print "You can't have the spam, spam, eggs and spam without any spam... that's disgusting"
...     NoSpam = staticmethod(StaticNoSpam)

>>> StaticSpam.NoSpam()
You can't have the spam, spam, eggs and spam without any spam... that's disgusting

They can also be defined using the function decorator @staticmethod.

>>> class StaticSpam(object):
...     @staticmethod
...     def StaticNoSpam():
...         print "You can't have the spam, spam, eggs and spam without any spam... that's disgusting"
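A static method defined with the decorator can be called on the class or on an instance; in both cases no instance is passed in. A small sketch, assuming Python 3, with a hypothetical MathUtil class:

```python
class MathUtil(object):
    @staticmethod
    def square(x):          # no self parameter
        return x * x

via_class = MathUtil.square(4)        # called on the class
via_instance = MathUtil().square(5)   # called on an instance; still no self
print(via_class, via_instance)
```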

Inheritance

Like many object-oriented languages, Python provides support for inheritance. Inheritance lets a class extend the facilities of another class or, in Python's case, of multiple other classes. Use the following format for this:

class ClassName(BaseClass1, BaseClass2, BaseClass3,...):
    ...

ClassName is what is known as the derived class, that is, derived from the base classes. The derived class will then have all the members of its base classes. If a method is defined in the derived class and in the base class, the member in the derived class will override the one in the base class. In order to use the method defined in the base class, it is necessary to call the method as an attribute on the defining class, as in Foo.setx(f,5) above:

>>> class Foo:
...     def bar(self):
...         print "I'm doing Foo.bar()"
...     x = 10
...
>>> class Bar(Foo):
...     def bar(self):
...         print "I'm doing Bar.bar()"
...         Foo.bar(self)
...     y = 9
...
>>> g = Bar()
>>> Bar.bar(g)
I'm doing Bar.bar()
I'm doing Foo.bar()
>>> g.y
9
>>> g.x
10

Once again, we can see what's going on under the hood by looking at the class dictionaries.

>>> vars(g)
{}
>>> vars(Bar)
{'y': 9, '__module__': '__main__', 'bar': <function bar at 0x4d6a04>,
 '__doc__': None}
>>> vars(Foo)
{'x': 10, '__module__': '__main__', 'bar': <function bar at 0x4d6994>,
 '__doc__': None}

When we call g.x, it first looks in the vars(g) dictionary, as usual. Also as above, it checks vars(Bar) next, since g is an instance of Bar. However, thanks to inheritance, Python will check vars(Foo) if it doesn't find x in vars(Bar).
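For new-style classes, the base class method can also be reached with the built-in super() instead of naming the base class explicitly. The sketch below assumes Python 3 (the two-argument form shown also works for new-style classes in Python 2); the methods return lists rather than printing so that the call order is easy to inspect.

```python
class Foo(object):
    def bar(self):
        return ["Foo.bar"]

class Bar(Foo):
    def bar(self):
        # super(Bar, self) looks up bar() starting after Bar in the lookup order.
        return ["Bar.bar"] + super(Bar, self).bar()

g = Bar()
calls = g.bar()
print(calls)
```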

Multiple inheritance

As shown in the Inheritance section, a class can be derived from multiple classes:

class ClassName(BaseClass1, BaseClass2, BaseClass3):
    pass

A tricky part of multiple inheritance is method resolution: upon a method call, if the method name is available from multiple base classes or their base classes, which base class method should be called?

The method resolution order depends on whether the class is an old-style class or a new-style class. For old-style classes, derived classes are considered from left to right, and base classes of base classes are considered before moving to the right. Thus, above, BaseClass1 is considered first, and if method is not found there, the base classes of BaseClass1 are considered. If that fails, BaseClass2 is considered, then its base classes, and so on. For new-style classes, see the Python documentation online.
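For new-style classes, the order actually used can be inspected through the __mro__ attribute of the class. The sketch below assumes Python 3 and uses a hypothetical diamond-shaped hierarchy; it shows a case where the result differs from the left-to-right, depth-first rule described for old-style classes.

```python
class Base(object):
    def who(self):
        return "Base"

class Left(Base):
    pass                      # does not override who()

class Right(Base):
    def who(self):
        return "Right"

class Child(Left, Right):
    pass

# The order in which classes are searched for attributes.
mro_names = [cls.__name__ for cls in Child.__mro__]
result = Child().who()
print(mro_names, result)
```

A depth-first search would reach Base through Left before looking at Right; the new-style order visits Right first, so who() returns "Right".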

Special Methods

A number of methods have reserved names and are used for special purposes, such as mimicking numerical or container operations. All of these names begin and end with two underscores. By convention, methods whose names begin with a single underscore are treated as 'private' to the scope they are introduced within.

Initialization and Deletion

__init__

One of these purposes is constructing an instance, and the special name for this is '__init__'. __init__() is called before an instance is returned (it is not necessary to return the instance manually). As an example,

class A:
    def __init__(self):
        print 'A.__init__()'
a = A()

outputs

A.__init__()

__init__() can take arguments, in which case it is necessary to pass arguments to the class in order to create an instance. For example,

class Foo:
    def __init__ (self, printme):
        print printme
foo = Foo('Hi!')

outputs

Hi!

Here is an example showing the difference between using __init__() and not using __init__():

class Foo:
    def __init__ (self, x):
         print x
foo = Foo('Hi!')
class Foo2:
    def setx(self, x):
        print x
f = Foo2()
Foo2.setx(f,'Hi!')

outputs

Hi!
Hi!

__del__

Similarly, '__del__' is called when an instance is about to be destroyed, e.g. when it is no longer referenced.

__enter__ and __exit__

These methods set up and tear down a context, much like a constructor and a destructor, but they are only executed when the instance is used in a with statement. Example:

class ConstructorsDestructors:
    def __init__(self):
        print 'init'

    def __del__(self):
        print 'del'

    def __enter__(self):
        print 'enter'

    def __exit__(self, exc_type, exc_value, traceback):
        print 'exit'

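with ConstructorsDestructors():
    pass

outputs

init
enter
exit
del

__new__

__new__ is called to create a new instance, before __init__ initializes it. It receives the class as its first argument and normally returns the new instance. It is rarely overridden, except when subclassing immutable types or when writing metaclasses.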
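The order in which instance creation and initialization run can be made visible by recording each step. This is a minimal sketch assuming Python 3; the class Tracked and the list calls are hypothetical.

```python
calls = []

class Tracked(object):
    def __new__(cls, *args, **kwargs):
        calls.append("new")
        # Delegate the actual creation to the base class.
        return super(Tracked, cls).__new__(cls)

    def __init__(self, value):
        calls.append("init")
        self.value = value

t = Tracked(5)
print(calls, t.value)
```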

Representation

__str__

Converting an object to a string, as with the print statement or with the str() conversion function, can be customized by overriding __str__. Usually, __str__ returns a formatted version of the object's content. This will NOT usually be something that can be executed.

For example:

class Bar:
    def __init__ (self, iamthis):
        self.iamthis = iamthis
    def __str__ (self):
        return self.iamthis
bar = Bar('apple')
print bar

outputs

apple

__repr__

This function is much like __str__(). If __str__ is not present but __repr__ is, the output of __repr__ is used for printing as well. __repr__ returns a representation of the object in string form; ideally, that string can be evaluated to recreate the original object.

For example:

class Bar:
    def __init__ (self, iamthis):
        self.iamthis = iamthis
    def __repr__(self):
        return "Bar('%s')" % self.iamthis
bar = Bar('apple')
print repr(bar)

At the interactive prompt, typing bar alone is enough, because the prompt displays the repr() of each expression; in a script, an explicit print is needed. Either way, this outputs

Bar('apple')

String Representation Override Functions

Function      Operator
__str__       str(A)
__repr__      repr(A)
__unicode__   unicode(x) (2.x only)
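When a class defines both methods, str() and print use __str__, while repr() and the display of objects inside containers use __repr__. A short sketch, assuming Python 3, with a hypothetical Point class:

```python
class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __repr__(self):
        # Aims to look like the expression that would recreate the object.
        return "Point(%r, %r)" % (self.x, self.y)

    def __str__(self):
        # Aims to be readable.
        return "(%s, %s)" % (self.x, self.y)

p = Point(1, 2)
as_str = str(p)
as_repr = repr(p)
in_list = str([p])     # containers show their items using repr()
print(as_str, as_repr, in_list)
```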

Attributes

__setattr__

This is the function which is in charge of setting attributes of an instance. It is given the name and the value of the attribute being assigned. Each class, of course, comes with a default __setattr__ which simply sets the value of the attribute, but we can override it.

>>> class Unchangable:
...    def __setattr__(self, name, value):
...        print "Nice try"
...
>>> u = Unchangable()
>>> u.x = 9
Nice try
>>> u.x
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AttributeError: Unchangable instance has no attribute 'x'

__getattr__

Similar to __setattr__, except that this function is called when we try to access a member that cannot be found by the normal lookup; it is not called for attributes that already exist.

>>> class HiddenMembers:
...     def __getattr__(self, name):
...         return "You don't get to see " + name
...
>>> h = HiddenMembers()
>>> h.anything
"You don't get to see anything"

__delattr__

This function is called to delete an attribute.

>>> class Permanent:
...     def __delattr__(self, name):
...         print name, "cannot be deleted"
...
>>> p = Permanent()
>>> p.x = 9
>>> del p.x
x cannot be deleted
>>> p.x
9
Attribute Override Functions
Function       Indirect form       Direct form
__getattr__    getattr(A, B)       A.B
__setattr__    setattr(A, B, C)    A.B = C
__delattr__    delattr(A, B)       del A.B
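
As an illustration of how these hooks interact, here is a small sketch under stated assumptions: the class Guarded and its attribute names are invented for this example. It shows that __getattr__ is consulted only when the normal lookup fails, and that an overriding __setattr__ can delegate to object.__setattr__ to store a value after checking it.

```python
class Guarded(object):
    """Hypothetical class: answers failed lookups and rejects negative sizes."""

    def __init__(self):
        self.size = 1  # goes through __setattr__ below

    def __getattr__(self, name):
        # Called only when normal lookup fails, so 'size' never reaches here
        return "no attribute named " + name

    def __setattr__(self, name, value):
        if name == "size" and value < 0:
            raise ValueError("size must not be negative")
        # Delegate to the default behaviour to actually store the value
        object.__setattr__(self, name, value)


g = Guarded()
print(g.size)      # 1
print(g.colour)    # no attribute named colour
try:
    g.size = -5
except ValueError as err:
    print(err)     # size must not be negative
```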

Operator Overloading

Operator overloading allows us to use the built-in Python syntax and operators to call functions which we define.

Binary Operators


If a class has the __add__ function, we can use the '+' operator to add instances of the class. This will call __add__ with the two instances of the class passed as parameters, and the return value will be the result of the addition.

>>> class FakeNumber:
...     n = 5
...     def __add__(A,B):
...         return A.n + B.n
...
>>> c = FakeNumber()
>>> d = FakeNumber()
>>> d.n = 7
>>> c + d
12

To override the augmented assignment operators, merely add 'i' in front of the normal binary operator, i.e. for '+=' use '__iadd__' instead of '__add__'. The function will be given one argument, which will be the object on the right side of the augmented assignment operator. The returned value of the function will then be assigned to the object on the left of the operator.

>>> c.__imul__ = lambda B: B.n - 6
>>> c *= d
>>> c
1

It is important to note that the augmented assignment operators will also use the normal operator functions if the augmented operator function hasn't been set directly. This will work as expected, with "__add__" being called for "+=" and so on.

>>> c = FakeNumber()
>>> c += d
>>> c
12
Binary Operator Override Functions
Function        Operator
__add__         A + B
__sub__         A - B
__mul__         A * B
__truediv__     A / B
__floordiv__    A // B
__mod__         A % B
__pow__         A ** B
__and__         A & B
__or__          A | B
__xor__         A ^ B
__eq__          A == B
__ne__          A != B
__gt__          A > B
__lt__          A < B
__ge__          A >= B
__le__          A <= B
__lshift__      A << B
__rshift__      A >> B
__contains__    A in B, A not in B
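
A slightly more realistic sketch than FakeNumber may help here. The Money class below is hypothetical; it returns new instances from __add__, modifies the object in place in __iadd__, and supports two of the comparison operators from the table.

```python
class Money(object):
    """Hypothetical value class used only for illustration."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        # a + b: return a new object and leave both operands untouched
        return Money(self.cents + other.cents)

    def __iadd__(self, other):
        # a += b: modify in place and return self, which is rebound to a
        self.cents += other.cents
        return self

    def __eq__(self, other):
        return self.cents == other.cents

    def __lt__(self, other):
        return self.cents < other.cents


a = Money(150)
b = Money(75)
total = a + b
print(total.cents)           # 225
print(total == Money(225))   # True
print(b < a)                 # True

alias = a
a += b
print(a.cents)               # 225
print(alias is a)            # True: __iadd__ changed the same object
```

Returning self from __iadd__ matters: whatever the function returns is what the name on the left ends up bound to.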
Unary Operators


Unary operators will be passed simply the instance of the class that they are called on.

>>> FakeNumber.__neg__ = lambda A : A.n + 6
>>> -d
13
Unary Operator Override Functions
Function      Operator
__pos__       +A
__neg__       -A
__invert__    ~A
__abs__       abs(A)
__len__       len(A)
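
The sketch below defines three of these functions inside a class body instead of attaching a lambda afterwards. The Offsets class is an assumption made for illustration only.

```python
class Offsets(object):
    """Hypothetical wrapper around a list of numbers."""

    def __init__(self, values):
        self.values = list(values)

    def __neg__(self):
        # -A: a new object with every value negated
        return Offsets(-v for v in self.values)

    def __abs__(self):
        # abs(A): a new object with every value made non-negative
        return Offsets(abs(v) for v in self.values)

    def __len__(self):
        # len(A)
        return len(self.values)


o = Offsets([3, -4, 5])
print((-o).values)     # [-3, 4, -5]
print(abs(o).values)   # [3, 4, 5]
print(len(o))          # 3
```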
Item Operators


It is also possible in Python to override the indexing and slicing operators. This allows us to use the class[i] and class[a:b] syntax on our own objects.

The simplest form of item operator is __getitem__. This takes as a parameter the instance of the class, then the value of the index.

>>> class FakeList:
...     def __getitem__(self,index):
...         return index * 2
...
>>> f = FakeList()
>>> f['a']
'aa'

We can also define a function for the syntax associated with assigning a value to an item. The parameters for this function include the value being assigned, in addition to the parameters from __getitem__

>>> class FakeList:
...     def __setitem__(self,index,value):
...         self.string = index + " is now " + value
...
>>> f = FakeList()
>>> f['a'] = 'gone'
>>> f.string
'a is now gone'

We can do the same thing with slices. Once again, each syntax has a different parameter list associated with it.

>>> class FakeList:
...     def __getslice__(self,start,end):
...         return str(start) + " to " + str(end)
...
>>> f = FakeList()
>>> f[1:4]
'1 to 4'

Keep in mind that one or both of the start and end parameters can be left blank in slice syntax. In that case, Python supplies a default value for the start and for the end, as shown below.

>>> f[:]
'0 to 2147483647'

Note that the default value for the end of the slice shown here is simply the largest possible signed integer on a 32-bit system, and may vary depending on your system and C compiler.

  • __setslice__ has the parameters (self,start,end,value)

We also have operators for deleting items and slices.

  • __delitem__ has the parameters (self,index)
  • __delslice__ has the parameters (self,start,end)

Note that these parameter lists are the same as those of __getitem__ and __getslice__.

Item Operator Override Functions
Function        Operator
__getitem__     C[i]
__setitem__     C[i] = v
__delitem__     del C[i]
__getslice__    C[s:e]
__setslice__    C[s:e] = v
__delslice__    del C[s:e]
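
The slice functions (__getslice__, __setslice__ and __delslice__) exist only in Python 2. In Python 3, and in Python 2 for new-style classes that do not define them, a slice expression is passed to __getitem__ as a slice object instead. The following is a sketch of that approach; the Evens class is hypothetical.

```python
class Evens(object):
    """Hypothetical virtual sequence of the first `size` even numbers."""

    def __init__(self, size):
        self.size = size

    def __len__(self):
        return self.size

    def __getitem__(self, index):
        if isinstance(index, slice):
            # slice.indices() fills in blank ends and clamps them to the length
            start, stop, step = index.indices(self.size)
            return [2 * i for i in range(start, stop, step)]
        if not 0 <= index < self.size:
            raise IndexError("index out of range")
        return 2 * index


e = Evens(10)
print(e[3])      # 6
print(e[1:4])    # [2, 4, 6]
print(e[:3])     # [0, 2, 4]
print(e[::4])    # [0, 8, 16]
```

Because blank ends are resolved by slice.indices(), the code never sees the large default end value shown earlier.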

Other Overrides

Other Override Functions
Function        Operator
__cmp__         cmp(x, y)
__hash__        hash(x)
__nonzero__     bool(x)
__call__        f(x)
__iter__        iter(x)
__reversed__    reversed(x) (2.6+)
__divmod__      divmod(x, y)
__int__         int(x)
__long__        long(x)
__float__       float(x)
__complex__     complex(x)
__hex__         hex(x)
__oct__         oct(x)
__index__
__copy__        copy.copy(x)
__deepcopy__    copy.deepcopy(x)
__sizeof__      sys.getsizeof(x) (2.6+)
__trunc__       math.trunc(x) (2.6+)
__format__      format(x, ...) (2.6+)
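
A few of these overrides are sketched below in one hypothetical class, Countdown. Note that __nonzero__ is the Python 2 name for the truth-value hook; Python 3 looks for __bool__ instead, so the sketch defines both.

```python
class Countdown(object):
    """Hypothetical class illustrating a few of the hooks in the table above."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        # iter(x) and for loops call this; a generator is the simplest iterator
        n = self.start
        while n > 0:
            yield n
            n -= 1

    def __call__(self, step):
        # x(step) returns a new, shorter countdown
        return Countdown(self.start - step)

    def __nonzero__(self):
        # bool(x) in Python 2
        return self.start > 0

    __bool__ = __nonzero__  # Python 3 looks up __bool__ instead

    def __int__(self):
        return self.start


c = Countdown(3)
print(list(c))       # [3, 2, 1]
print(list(c(2)))    # [1]
print(bool(c(3)))    # False
print(int(c))        # 3
```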

Programming Practices

The flexibility of Python classes means that classes can adopt a varied set of behaviors. For the sake of understandability, however, it's best to use many of Python's tools sparingly. Try to declare all methods in the class definition, and always use the <class>.<member> syntax instead of __dict__ whenever possible. Look at classes in C++ and Java to see what most programmers will expect from a class.

Encapsulation

Since all members of a Python class are accessible by functions and methods outside the class, there is no way to enforce encapsulation short of overriding __getattr__, __setattr__ and __delattr__. General practice, however, is for the creator of a class or module to simply trust that users will use only the intended interface, and to avoid limiting access to the workings of the module for the sake of users who do need to access them. When using parts of a class or module other than the intended interface, keep in mind that those parts may change in later versions of the module, and that you may even cause errors or undefined behavior in the module.

Doc Strings

When defining a class, it is convention to document the class using a string literal at the start of the class definition. This string will then be placed in the __doc__ attribute of the class definition.

>>> class Documented:
...     """This is a docstring"""
...     def explode(self):
...         """
...         This method is documented, too! The coder is really serious about
...         making this class usable by others who don't know the code as well
...         as he does.
...
...         """
...         print "boom"
...
>>> d = Documented()
>>> d.__doc__
'This is a docstring'

Docstrings are a very useful way to document your code. Even if you never write a single piece of separate documentation (and let's admit it, doing so is the lowest priority for many coders), including informative docstrings in your classes will go a long way toward making them usable.

Several tools exist for turning the docstrings in Python code into readable API documentation, e.g., EpyDoc.

Don't just stop at documenting the class definition, either. Each method in the class should have its own docstring as well. Note that the method explode in the example class Documented above has a fairly lengthy docstring that spans several lines. Its formatting is in accordance with the style suggestions of Python's creator, Guido van Rossum, in PEP 8.

Adding methods at runtime

To a class

It is fairly easy to add methods to a class at runtime. Let's assume that we have a class called Spam and a function cook. We want to be able to use the function cook on all instances of the class Spam:

class Spam:
  def __init__(self):
    self.myeggs = 5

def cook(self):
  print "cooking %s eggs" % self.myeggs

Spam.cook = cook   #add the function to the class Spam
eggs = Spam()      #NOW create a new instance of Spam
eggs.cook()        #and we are ready to cook!

This will output

cooking 5 eggs
To an instance of a class

It is a bit trickier to add methods to an instance of a class that has already been created. Let's assume again that we have a class called Spam and have already created eggs. We then notice that we want to cook those eggs, but rather than create a new instance we want to use the one that already exists:

class Spam:
  def __init__(self):
    self.myeggs = 5

eggs = Spam()

def cook(self):
  print "cooking %s eggs" % self.myeggs

import types
f = types.MethodType(cook, eggs, Spam)
eggs.cook = f

eggs.cook()

Now we can cook our eggs and the last statement will output:

cooking 5 eggs
Using a function

We can also write a function that will make the process of adding methods to an instance of a class easier.

import types

def attach_method(fxn, instance, myclass):
  f = types.MethodType(fxn, instance, myclass)
  setattr(instance, fxn.__name__, f)

All we now need to do is call the attach_method with the arguments of the function we want to attach, the instance we want to attach it to and the class the instance is derived from. Thus our function call might look like this:

attach_method(cook, eggs, Spam)

Note that in the function attach_method we cannot write instance.fxn = f, since this would add an attribute literally named fxn to the instance instead of one named after the function.
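
The three-argument call to types.MethodType used above is specific to Python 2; Python 3 accepts only the function and the instance. The following sketch uses the two-argument form, which should work on both. The two-parameter attach_method here is a variation written for this example, and cook returns its message instead of printing it so the result can be checked.

```python
import types


class Spam(object):
    def __init__(self):
        self.myeggs = 5


def cook(self):
    return "cooking %s eggs" % self.myeggs


def attach_method(fxn, instance):
    # Bind fxn to this one instance and store it under the function's own name
    setattr(instance, fxn.__name__, types.MethodType(fxn, instance))


eggs = Spam()
other = Spam()
attach_method(cook, eggs)

print(eggs.cook())             # cooking 5 eggs
print(hasattr(other, "cook"))  # False: only the one instance was changed
```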




Exceptions


Python 2 handles all errors with exceptions.

An exception is a signal that an error or other unusual condition has occurred. There are a number of built-in exceptions, which indicate conditions like reading past the end of a file, or dividing by zero. You can also define your own exceptions.

Overview

Exceptions in Python at a glance:

import random
try:
  ri = random.randint(0, 2)
  if ri == 0:
    infinity = 1/0
  elif ri == 1:
    raise ValueError("Message")
    #raise ValueError, "Message" # Deprecated
  elif ri == 2:
    raise ValueError # Without message
except ZeroDivisionError:
  pass
except ValueError as valerr:
# except ValueError, valerr: # Deprecated?
  print valerr
  raise # Raises the exception just caught
except: # Any other exception
  pass
finally: # Optional
  pass # Clean up

class CustomValueError(ValueError): pass # Custom exception
try:
  raise CustomValueError
  raise TypeError
except (ValueError, TypeError): # Value error catches custom, a derived class, as well
  pass                          # A tuple catches multiple exception classes

Raising exceptions

Whenever your program attempts to do something erroneous or meaningless, Python raises an exception:

>>> 1 / 0
Traceback (most recent call last):
    File "<stdin>", line 1, in ?
ZeroDivisionError: integer division or modulo by zero

This traceback indicates that the ZeroDivisionError exception is being raised. This is a built-in exception -- see below for a list of all the other ones.

Catching exceptions

In order to handle errors, you can set up exception handling blocks in your code. The keywords try and except are used to catch exceptions. When an error occurs within the try block, Python looks for a matching except block to handle it. If there is one, execution jumps there.

If you execute this code:

try:
    print 1/0
except ZeroDivisionError:
    print "You can't divide by zero!"

Then Python will print this:

You can't divide by zero!

If you don't specify an exception type on the except line, it will cheerfully catch all exceptions. This is generally a bad idea in production code, since it means your program will blissfully ignore unexpected errors as well as ones which the except block is actually prepared to handle.

Exceptions can propagate up the call stack:

def f(x):
    return g(x) + 1

def g(x):
    if x < 0: raise ValueError("I can't cope with a negative number here.")
    else: return 5

try:
    print f(-6)
except ValueError:
    print "That value was invalid."

In this code, the print statement calls the function f. That function calls the function g, which will raise an exception of type ValueError. Neither f nor g has a try/except block to handle ValueError. So the exception raised propagates out to the main code, where there is an exception-handling block waiting for it. This code prints:

That value was invalid.

Sometimes it is useful to find out exactly what went wrong, or to print the python error text yourself. For example:

try:
    the_file = open("the_parrot")
except IOError as e:
    if e.errno == 2: # file not found
        print "Sorry, 'the_parrot' has apparently joined the choir invisible."
    else:
        print "Congratulations! You have managed to trip a #%d error" % e.errno
        print e.strerror

If the file does not exist, this will print:

Sorry, 'the_parrot' has apparently joined the choir invisible.

Custom Exceptions

Code similar to that seen above can be used to create custom exceptions and pass information along with them. This can be very useful when trying to debug complicated projects. Here is how that code would look; first creating the custom exception class:

class CustomException(Exception):
    def __init__(self, value):
        self.parameter = value
    def __str__(self):
        return repr(self.parameter)

And then using that exception:

try:
    raise CustomException("My Useful Error Message")
except CustomException as instance:
    print "Caught: " + instance.parameter
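
A custom exception can also carry more than one piece of information. The sketch below is an illustration under assumptions: the InsufficientFunds class and the withdraw function are invented for this example. It passes a message to the base class so that str() on the exception works without a custom __str__.

```python
class InsufficientFunds(Exception):
    """Hypothetical exception carrying structured data about the failure."""

    def __init__(self, needed, available):
        # Passing a message to Exception.__init__ gives str(exc) a useful value
        message = "need %d but only %d available" % (needed, available)
        super(InsufficientFunds, self).__init__(message)
        self.needed = needed
        self.available = available


def withdraw(balance, amount):
    if amount > balance:
        raise InsufficientFunds(amount, balance)
    return balance - amount


try:
    withdraw(50, 80)
except InsufficientFunds as exc:
    print("Caught: %s" % exc)   # Caught: need 80 but only 50 available
    shortfall = exc.needed - exc.available
    print(shortfall)            # 30
```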

Trying over and over again

Recovering and continuing with finally

Exceptions could lead to a situation where, after raising an exception, the code block where the exception occurred might not be revisited. In some cases this might leave external resources used by the program in an unknown state.

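The finally clause allows programmers to close such resources whether or not an exception occurred. The syntax changed between Python 2.4 and 2.5: in 2.4, finally cannot be combined with except in a single try statement, so the two must be nested; from 2.5 on, except, else and finally can appear together.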

  • Python 2.4
try:
    result = None
    try:
        result = x/y
    except ZeroDivisionError:
        print "division by zero!"
    print "result is ", result
finally:
    print "executing finally clause"
  • Python 2.5
try:
    result = x / y
except ZeroDivisionError:
    print "division by zero!"
else:
    print "result is", result
finally:
    print "executing finally clause"
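
To see which clauses run in which situation, the sketch below records them in a list. The divide function and its log parameter are hypothetical helpers written for this illustration.

```python
def divide(x, y, log):
    """Hypothetical helper; `log` records which clauses ran."""
    try:
        result = x / y
    except ZeroDivisionError:
        log.append("except")
        result = None
    else:
        log.append("else")       # only when no exception was raised
    finally:
        log.append("finally")    # always, whatever happened above
    return result


ok_log, bad_log = [], []
divide(6, 3, ok_log)
print(ok_log)                  # ['else', 'finally']
print(divide(6, 0, bad_log))   # None
print(bad_log)                 # ['except', 'finally']
```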

Built-in exception classes

All built-in Python exceptions

Exotic uses of exceptions

Exceptions are good for more than just error handling. If you have a complicated piece of code to choose which of several courses of action to take, it can be useful to use exceptions to jump out of the code as soon as the decision can be made. The Python-based mailing list software Mailman does this in deciding how a message should be handled. Using exceptions like this may seem like it's a sort of GOTO -- and indeed it is, but a limited one called an escape continuation. Continuations are a powerful functional-programming tool and it can be useful to learn them.

As a simple example of how exceptions can make programming easier, say you want to add items to a list that may not have been created yet. Instead of testing for the list with an "if" statement, as here:

if hasattr(self, 'items'):
    self.items.extend(new_items)
else:
    self.items = list(new_items)

Using exceptions, we can emphasize the normal program flow—that usually we just extend the list—rather than emphasizing the unusual case:

try:
    self.items.extend(new_items)
except AttributeError:
    self.items = list(new_items)
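
Placed inside a complete class, the exception-based version might look like the following sketch. The Basket class and its add method are assumptions made so that the fragment can be run.

```python
class Basket(object):
    """Hypothetical class whose `items` list is created lazily."""

    def add(self, new_items):
        try:
            self.items.extend(new_items)    # the normal case
        except AttributeError:
            self.items = list(new_items)    # first call: no list yet


b = Basket()
b.add(["egg", "spam"])
b.add(["ham"])
print(b.items)  # ['egg', 'spam', 'ham']
```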



Errors


In Python there are three types of errors: syntax errors, logic errors and exceptions.

Syntax errors

Syntax errors are the most basic type of error. They arise when the Python parser is unable to understand a line of code. Syntax errors are almost always fatal, i.e. there is almost never a way to successfully execute a piece of code containing syntax errors. Some syntax errors can be caught and handled, like eval(""), but these are rare.

IDLE highlights where the syntax error is. Most syntax errors are typos, incorrect indentation, or incorrect arguments. If you get this error, try looking at your code for any of these.

Logic errors

These are the most difficult type of error to find, because they give unpredictable results and may crash your program. A lot of different things can happen if you have a logic error. A debugger can help you track them down: it lets you step through the program and inspect its state, although it will not fix the problem for you.

A simple example of a logic error is shown below. The while loop will compile and run; however, it will never finish and may crash Python:

#Counting Sheep
#Goal: Print number of sheep up until 101.
sheep_count=1
while sheep_count<100:
    print("%i Sheep"%sheep_count)

Logic errors are only erroneous from the perspective of the programming goal one might have; in many cases Python is working as it was intended, just not as the user intended. The above while loop is functioning exactly as Python is meant to, but the statement that moves sheep_count toward the exit condition is missing.
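
One possible correction is sketched below. It assumes the goal is to print the counts 1 through 100; the original comment leaves the exact upper bound open, so treat the <= 100 condition as an assumption.

```python
# Counting Sheep, corrected
# Goal (assumed): print the number of sheep from 1 up to and including 100.
sheep_count = 1
while sheep_count <= 100:
    print("%i Sheep" % sheep_count)
    sheep_count += 1   # without this line the loop never ends
```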

Exceptions

Exceptions arise when the Python interpreter knows what to do with a piece of code but is unable to perform the action. An example would be trying to access the internet with Python without an internet connection; the interpreter knows what to do with that command but is unable to perform it.

Dealing with exceptions

Unlike syntax errors, exceptions are not always fatal. Exceptions can be handled with the use of a try statement.

Consider the following code to display the HTML of the website 'example.com'. When execution reaches the try statement, Python attempts to run the indented code that follows. If for some reason there is an error (for example, the computer is not connected to the internet), the interpreter jumps to the indented code below the 'except:' line.

import urllib2
url = 'http://www.example.com'
try:
    req = urllib2.Request(url)
    response = urllib2.urlopen(req)
    the_page = response.read()
    print the_page
except:
    print "We have a problem."

Another way to handle an error is to catch one specific exception type.

try:
    age = int(raw_input("Enter your age: "))
    print "You must be {0} years old.".format(age)
except ValueError:
    print "Your age must be numeric."

If the user enters a numeric value as his/her age, the output should look like this:

Enter your age: 5
You must be 5 years old.

However, if the user enters a non-numeric value as his/her age, a ValueError is thrown when trying to execute the int() method on a non-numeric string, and the code under the except clause is executed:

Enter your age: five
Your age must be numeric.

You can also use a try block with a while loop to validate input:

valid = False
while not valid:
    try:
        age = int(raw_input("Enter your age: "))
        valid = True     # This statement will only execute if the above statement executes without error.
        print "You must be {0} years old.".format(age)
    except ValueError:
        print "Your age must be numeric."

The program will prompt you for your age until you enter a valid age:

Enter your age: five
Your age must be numeric.
Enter your age: abc10
Your age must be numeric.
Enter your age: 15
You must be 15 years old.
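
The same retry loop can be wrapped in a function, which makes it possible to try it out without typing at a keyboard. This is only a sketch: read_age and its ask parameter are hypothetical names, with ask standing in for raw_input (or input on Python 3).

```python
def read_age(ask):
    """Keep calling ask() until it returns text that int() accepts.

    `ask` is a hypothetical stand-in for raw_input (Python 2) or input
    (Python 3), passed in so the loop can be exercised without a keyboard.
    """
    while True:
        try:
            return int(ask("Enter your age: "))
        except ValueError:
            print("Your age must be numeric.")


# Simulate a user who needs three attempts
answers = iter(["five", "abc10", "15"])
age = read_age(lambda prompt: next(answers))
print("You must be {0} years old.".format(age))  # You must be 15 years old.
```

Returning from inside the try block replaces the valid flag: the loop ends as soon as the conversion succeeds.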

In certain other cases, it might be necessary to get more information about the exception and deal with it appropriately. In such situations the except as construct can be used.

import os

f=raw_input("enter the name of the file:")
l=raw_input("enter the name of the link:")
try:
    os.symlink(f,l)
except OSError as e:
    print "an error occurred linking %s to %s: %s\n error no %d"%(f,l,e.args[1],e.args[0])

Sample runs:

enter the name of the file:file1.txt
enter the name of the link:AlreadyExists.txt
an error occurred linking file1.txt to AlreadyExists.txt: File exists
 error no 17

enter the name of the file:file1.txt
enter the name of the link:/Cant/Write/Here/file1.txt
an error occurred linking file1.txt to /Cant/Write/Here/file1.txt: Permission denied
 error no 13
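
Instead of indexing into e.args, the error number and message are also available as the attributes e.errno and e.strerror, and the errno module provides names for the numeric codes. The sketch below triggers a "file not found" error on purpose; the file name is made up for the example.

```python
import errno
import os
import tempfile

# A path that should not exist: a made-up name inside a fresh temp directory
tmpdir = tempfile.mkdtemp()
missing = os.path.join(tmpdir, "no_such_file.txt")

try:
    open(missing)
except (IOError, OSError) as e:
    # e.errno is the numeric code, e.strerror the matching system message
    if e.errno == errno.ENOENT:
        reason = "file not found"
    else:
        reason = "other error: %s" % e.strerror
    print(reason)   # file not found

os.rmdir(tmpdir)    # clean up the temporary directory
```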



Source Documentation and Comments


Documentation is the process of leaving information about your code. The two mechanisms for doing this in Python are comments and documentation strings.

Comments

There will always be a time when you have to return to your code. Perhaps it is to fix a bug, or to add a new feature. Regardless, looking at your own code after six months is almost as bad as looking at someone else's code. What you need is a means of leaving reminders to yourself about what you were doing.

For this purpose, you leave comments. Comments are little snippets of text embedded inside your code that are ignored by the Python interpreter. A comment is denoted by the hash character (#) and extends to the end of the line. For example:

#!/usr/bin/env python
# commentexample.py

# Display the knights that come after Scene 24
print("The Knights Who Say Ni!")
# print("I will never see the light of day!")

As you can see, you can also use comments to temporarily remove segments of your code, like the second print statement.

Comment Guidelines

The following guidelines are from PEP 8, written by Guido van Rossum.

  • General
    • Comments that contradict the code are worse than no comments. Always make a priority of keeping the comments up-to-date when the code changes!
    • Comments should be complete sentences. If a comment is a phrase or sentence, its first word should be capitalized, unless it is an identifier that begins with a lower case letter (never alter the case of identifiers!).
    • If a comment is short, the period at the end can be omitted. Block comments generally consist of one or more paragraphs built out of complete sentences, and each sentence should end in a period.
    • You should use two spaces after a sentence-ending period.
    • When writing English, Strunk and White applies.
    • Python coders from non-English speaking countries: please write your comments in English, unless you are 120% sure that the code will never be read by people who don't speak your language.
  • Inline Comments
    • An inline comment is a comment on the same line as a statement. Inline comments should be separated by at least two spaces from the statement. They should start with a # and a single space.
    • Inline comments are unnecessary and in fact distracting if they state the obvious. Don't do this:
      x = x + 1  # Increment x
      
      But sometimes, this is useful:
      x = x + 1  # Compensate for border
      

Documentation Strings

But what if you just want to know how to use a function, class, or method? You could add comments before the function, but comments are inside the code, so you would have to pull up a text editor and view them that way. But you can't pull up comments from a C extension, so that is less than ideal. You could always write a separate text file with how to call the functions, but that would mean that you would have to remember to update that file. If only there was a mechanism for being able to embed the documentation and get at it easily...

Fortunately, Python has such a capability. Documentation strings (or docstrings) are used to create easily-accessible documentation. You can add a docstring to a function, class, or module by adding a string as the first statement of its body. For example:

#!/usr/bin/env python
# docstringexample.py

"""Example of using documentation strings."""

class Knight:
    """
    An example class.
    
    Call spam to get bacon.
    """
    
    def spam(self, eggs="bacon"):
        """Prints the argument given."""
        print(eggs)

The convention is to use triple-quoted strings, because it makes it easier to add more documentation spanning multiple lines.

To access the documentation, you can use the help function inside a Python shell with the object you want help on, or you can use the pydoc command from your system's shell. If we were in the directory where docstringexample.py lives, one could enter pydoc docstringexample to get documentation on that module.
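
Docstrings can also be read from within a program. The sketch below repeats the Knight class so that it is self-contained, then reads the text through the __doc__ attribute and through inspect.getdoc from the standard library, which tidies up the indentation.

```python
import inspect


class Knight(object):
    """
    An example class.

    Call spam to get bacon.
    """

    def spam(self, eggs="bacon"):
        """Prints the argument given."""
        print(eggs)


# The docstring of a method, exactly as written
print(Knight.spam.__doc__)     # Prints the argument given.

# inspect.getdoc() strips the common indentation and surrounding blank lines
print(inspect.getdoc(Knight))
```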



Idioms

Python is a strongly idiomatic language: there is generally a single optimal way of doing something (a programming idiom), rather than many ways: “There’s more than one way to do it” is not a Python motto.

This section starts with some general principles, then goes through the language, highlighting how to idiomatically use operations, data types, and modules in the standard library.

Principles

Use exceptions for error-checking, following EAFP (It's Easier to Ask Forgiveness than Permission) instead of LBYL (Look Before You Leap): put an action that may fail inside a try...except block.

Use context managers for managing resources, like files. Use finally for ad hoc cleanup, but prefer to write a context manager to encapsulate this.

Use properties, not getter/setter pairs.

Use dictionaries for dynamic records, classes for static records (for simple classes, use collections.namedtuple): if a record always has the same fields, make this explicit in a class; if the fields may vary (be present or not), use a dictionary.

Use _ for throwaway variables, like discarding a return value when a tuple is returned, or to indicate that a parameter is being ignored (when required for an interface, say). You can use *_, **__ to discard positional or keyword arguments passed to a function: these correspond to the usual *args, **kwargs parameters, but explicitly discarded. You can also use these in addition to positional or named parameters (following the ones you use), allowing you to use some and discard any excess ones.

Use implicit True/False (truthy/falsy values), except when needing to distinguish between falsy values, like None, 0, and [], in which case use an explicit check like is None or == 0.

Use the optional else clause after try, for, and while, not just after if.
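
Several of these principles are sketched together below: a property, a context manager, a throwaway variable, and else on a for loop. All names (Circle, logged, first_negative) are invented for this illustration.

```python
import contextlib


class Circle(object):
    def __init__(self, radius):
        self.radius = radius

    @property
    def diameter(self):
        # Read like an attribute, computed like a method
        return 2 * self.radius


@contextlib.contextmanager
def logged(events, name):
    """Hypothetical context manager: records entry and exit in `events`."""
    events.append("enter " + name)
    try:
        yield
    finally:
        events.append("exit " + name)   # runs even if the body raises


def first_negative(numbers):
    for n in numbers:
        if n < 0:
            break
    else:
        # Runs only when the loop finished without break
        return None
    return n


events = []
with logged(events, "demo"):
    _, remainder = divmod(17, 5)    # quotient deliberately discarded

print(Circle(2).diameter)           # 4
print(events)                       # ['enter demo', 'exit demo']
print(remainder)                    # 2
print(first_negative([3, -1, 4]))   # -1
print(first_negative([3, 1, 4]))    # None
```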

Imports

For very robust code, only import modules, not names (like functions or classes), as this creates a new (name) binding, which is not necessarily in sync with the existing binding.[1] For example, given a module m which defines a function f, importing the function with from m import f means that m.f and f can differ if either is assigned to (creating a new binding).

In practice, this is frequently ignored, particularly for small-scale code, as changing a module post-import is rare, so this is rarely a problem, and both classes and functions are imported from modules so they can be referred to without a prefix. However, for robust, large-scale code, this is an important rule, as it risks creating very subtle bugs.

For robust code with low typing, one can use a renaming import to abbreviate a long module name:

import module_with_very_long_name as vl
vl.f()  # easier than module_with_very_long_name.f, but still robust

Note that importing submodules (or subpackages) from a package using from is completely fine:

from p import sm  # completely fine
sm.f()
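
The binding problem described in this section can be demonstrated without creating files. The sketch below builds a throwaway module object with types.ModuleType and imitates what from m import f would do; the module name m and the function f are the hypothetical ones from the text.

```python
import types

# Build a throwaway module object instead of a file on disk (illustration only)
m = types.ModuleType("m")
m.f = lambda: "original"

f = m.f          # this is the binding that `from m import f` would create

# Someone later replaces the function inside the module
m.f = lambda: "patched"

print(m.f())     # patched
print(f())       # original: the imported name still refers to the old function
```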

Operations

Swap values
b, a = a, b
Attribute access on nullable value

To access an attribute (esp. to call a method) on a value that might be an object, or might be None, use the boolean shortcircuiting of and:

a and a.x
a and a.f()

Particularly useful for regex matches:

match and match.group(0)
in

in can be used for substring checking.

Data types

All sequence types

Indexing during iteration

Use enumerate() if you need to keep track of iteration cycles over an iterable:

for i, x in enumerate(l):
    # ...

Anti-idiom:

for i in range(len(l)):
    x = l[i]  # why did you go from list to numbers back to the list?
    # ...
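
A complete, runnable version of the enumerate idiom might look like this sketch. The list of names is made up; the second argument to enumerate sets the number at which counting starts.

```python
knights = ["Arthur", "Lancelot", "Galahad"]

labels = []
for i, name in enumerate(knights, 1):   # second argument: count from 1
    labels.append("%d. %s" % (i, name))

print(labels)   # ['1. Arthur', '2. Lancelot', '3. Galahad']
```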
Finding first matching element

Python sequences do have an index method, but this returns the index of the first occurrence of a specific value in the sequence. To find the first occurrence of a value that satisfies a condition, instead, use next and a generator expression:

try:
    x = next(i for i, n in enumerate(l) if n > 0)
except StopIteration:
    print('No positive numbers')
else:
    print('The index of the first positive number is', x)

If you need the value, not the index of its occurrence, you can get it directly through:

try:
    x = next(n for n in l if n > 0)
except StopIteration:
    print('No positive numbers')
else:
    print('The first positive number is', x)

The reason for this construct is twofold:

  • Exceptions let you signal “no match found” (they solve the semipredicate problem): since you're returning a single value (not an index), this can't be returned in the value.
  • Generator expressions let you use an expression without needing a lambda or introducing new grammar.
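
When a sentinel such as None can never be a legitimate match, the built-in next also accepts a default as its second argument, which avoids the try block. This is a sketch of that variation, not a replacement for the exception-based form above.

```python
l = [-3, 0, 7, 2]

# The extra parentheses are needed because next() is given two arguments
first_positive = next((n for n in l if n > 0), None)
first_big = next((n for n in l if n > 100), None)

print(first_positive)   # 7
print(first_big)        # None
```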
Truncating

For mutable sequences, use del, instead of reassigning to a slice:

del l[j:]
del l[:i]

Anti-idiom:

l = l[:j]
l = l[i:]

The simplest reason is that del makes your intention clear: you're truncating.

More subtly, slicing creates a new list (a shallow copy of the selected elements), and the old list only becomes garbage once no other references to it remain. Deleting instead modifies the list in place (which is faster than creating a slice and then assigning it to the existing variable), and lets Python release the deleted elements right away, instead of waiting for the old list to be reclaimed.

In some cases you do want 2 slices of the same list – though this is rare in basic programming, other than iterating once over a slice in a for loop – but it's rare that you'll want to make a slice of a whole list, then replace the original list variable with a slice (but not change the other slice!), as in the following funny-looking code:

m = l
l = l[i:j]  # why not m = l[i:j] ?
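
The difference between the two approaches becomes visible as soon as a second name refers to the same list, as in this small sketch:

```python
l = [0, 1, 2, 3, 4, 5]
alias = l            # second name for the same list object

del l[4:]            # truncates in place
print(alias)         # [0, 1, 2, 3]

l = l[:2]            # builds a new list and rebinds only the name l
print(l)             # [0, 1]
print(alias)         # [0, 1, 2, 3]
print(l is alias)    # False
```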
Sorted list from an iterable

You can create a sorted list directly from any iterable, without needing to first make a list and then sort it. These include sets and dictionaries (iterate on the keys):

s = {1, 'a', ...}
l = sorted(s)
d = {'a': 1, ...}
l = sorted(d)
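
A runnable sketch with concrete values follows; the data is made up. It also shows the key and reverse arguments of sorted, which are often needed together with dictionaries.

```python
s = {3, 1, 2}
d = {'pear': 3, 'apple': 7, 'fig': 5}

print(sorted(s))                          # [1, 2, 3]
print(sorted(d))                          # ['apple', 'fig', 'pear']
print(sorted(d, key=d.get))               # ['pear', 'fig', 'apple']
print(sorted(d.values(), reverse=True))   # [7, 5, 3]
```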

Tuples

Use tuples for constant sequences. This is rarely necessary (primarily when using as keys in a dictionary), but makes intention clear.

Strings

Substring

Use in for substring checking.

However, do not use in to check if a string is a single-character match, since it matches substrings and will return spurious matches – instead use a tuple of valid values. For example, the following is wrong:

def valid_sign(sign):
    return sign in '+-'  # wrong, returns true for sign == '+-'

Instead, use a tuple:

def valid_sign(sign):
    return sign in ('+', '-')
Building a string

To make a long string incrementally, build a list and then join it with '' – or with newlines, if building a text file (don't forget the final newline in this case!). This is faster and clearer than appending to a string, which is often slow. (In principle, appending can be O(nk) in the overall length of the string n and the number of additions k, which is O(n^2) if the pieces are of similar sizes.)

However, there are some optimizations in some versions of CPython that make simple string appending fast – string appending in CPython 2.5+, and bytestring appending in CPython 3.0+ are fast, but for building Unicode strings (unicode in Python 2, string in Python 3), joining is faster. If doing extensive string manipulation, be aware of this and profile your code. See Performance Tips: String Concatenation and Concatenation Test Code for details.

Don't do this:

s = ''
for x in l:
    # this makes a new string every iteration, because strings are immutable
    s += x

Instead:

# ...
# l.append(x)
s = ''.join(l)

You can even use generator expressions, which are extremely efficient:

s = ''.join(f(x) for x in l)

If you do want a mutable string-like object, you can use StringIO.
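
A minimal sketch of the StringIO approach is shown below. It uses io.StringIO, which is available in Python 2.6+ and Python 3 and works with Unicode text; the words written are arbitrary.

```python
import io

buf = io.StringIO()
for word in (u"spam", u"eggs", u"ham"):
    buf.write(word)      # appends to the in-memory buffer
    buf.write(u"\n")

text = buf.getvalue()    # the complete string built so far
print(text.splitlines())
```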

Dictionaries

To iterate through a dictionary, either keys, values, or both:

# Iterate over keys
for k in d:
    ...

# Iterate over values, Python 3
for v in d.values():
    ...

# Iterate over values, Python 2
# In Python 2, dict.values() returns a copy
for v in d.itervalues():
    ...

# Iterate over keys and values, Python 3
for k, v in d.items():
    ...

# Iterate over keys and values, Python 2
# In Python 2, dict.items() returns a copy
for k, v in d.iteritems():
    ...

Anti-patterns:

for k, _ in d.items():  # instead: for k in d:
    ...
for _, v in d.items():  # instead: for v in d.values()
    ...

When missing keys should get a default value, dict.setdefault can be used, but it is usually better to use collections.defaultdict.
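
To make the comparison between setdefault and collections.defaultdict concrete, the sketch below groups some made-up words by their first letter in both ways.

```python
import collections

words = ["apple", "avocado", "banana", "blueberry", "cherry"]

# With setdefault: a new empty list is built on every call, used or not
by_letter = {}
for w in words:
    by_letter.setdefault(w[0], []).append(w)

# With defaultdict: the factory is called only for missing keys
grouped = collections.defaultdict(list)
for w in words:
    grouped[w[0]].append(w)

print(by_letter == dict(grouped))   # True
print(grouped["a"])                 # ['apple', 'avocado']
```

One thing to keep in mind with defaultdict is that merely looking up a missing key inserts it.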

dict.get is useful, but using dict.get and then checking if it is None as a way of testing if the key is in the dictionary is an anti-idiom, as None is a potential value, and whether the key is in the dictionary can be checked directly. It's ok to use get and compare with None if this is not a potential value, however.

Simple:

if 'k' in d:
    # ... d['k']

Anti-idiom (unless None is not a potential value):

v = d.get('k')
if v is not None:
    # ... v

Dict from parallel sequences of keys and values

Use zip as: dict(zip(keys, values))
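A short sketch of this idiom, with illustrative data:

```python
keys = ["a", "b", "c"]       # illustrative data
values = [1, 2, 3]

d = dict(zip(keys, values))  # pairs keys[i] with values[i]
print(d)
```

Note that zip stops at the end of the shorter sequence, so extra keys or values are silently dropped.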

Modules

re

Match if found, else None:

match = re.match(r, s)
return match and match.group(0)

...returns None if no match, and the match contents if there is one.
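The idiom can be wrapped in a function to make it runnable; first_match is a hypothetical helper name chosen for this sketch.

```python
import re

def first_match(pattern, s):
    """Return the text matched at the start of s, or None if there is no match."""
    match = re.match(pattern, s)
    return match and match.group(0)

print(first_match(r"\d+", "123abc"))
print(first_match(r"\d+", "abc123"))
```

Keep in mind that re.match only matches at the beginning of the string; re.search looks for a match anywhere in it.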



Package management

pip is the standard Python package manager, making it easy to download and install packages from the PyPI repository. pip has been bundled with the Python distribution since Python 2.7.9 and Python 3.4. If you do not have pip, you can install it by downloading get-pip.py from bootstrap.pypa.io and running python get-pip.py. Other package managers are not covered in this chapter.

Examples of pip use:

  • pip install xlrd
    • Installs xlrd package from the PyPI repository, or from another repository if customized to search in additional repositories.
  • pip install --upgrade xlrd
    • Upgrades a package to the latest version.
  • pip install mypackage.whl
    • Installs the package from wheel file mypackage.whl. This is useful when for whatever reason installation from PyPI fails and you need to download the wheel file (.whl) of the package manually.
  • pip freeze
    • Lists installed packages and their versions.
  • pip show xlrd
    • Outputs information about an installed package (here xlrd), including version, author and license.
  • python -m pip install xlrd
    • Calls pip via python and -m option. Useful e.g. for installing packages for PyPy (a just-in-time compiler for Python), in which case you use pypy -m pip install xlrd.
  • pip --version
    • Outputs the pip version.
  • pip install --upgrade pip
    • Upgrades pip itself.

PyPI is an online repository of Python packages, many of which are published under a rather permissive license such as MIT license or one of the BSD licenses. PyPI hosts both pure-Python packages and Python packages taking advantage of the C language. Installing pure-Python packages such as xlrd is usually seamless. As for C language packages, many of them have precompiled binaries for multiple operating systems, making the installation seamless as well. However, for a C language package that has only sources published, pip needs a working and properly set up compiler to successfully install the package.

A wheel file is a package distribution. It can contain pure-Python code but also precompiled executable binaries if required. A single package can offer multiple wheels for different Python versions and operating systems. An example wheel file containing precompiled binaries is numpy-1.16.2-cp27-cp27m-win32.whl, for the numpy package, available from the PyPI Download files section for the package. If you are using pip with no problems, you do not need to worry about wheel files.



Python 2 vs. Python 3

Python 3 was designed to be backward-incompatible with Python 2.

One noticeable difference is that in Python 3, print is not a statement but a function, so invoking it requires parentheses around its arguments. Differences with deeper impact include making all strings Unicode and introducing a bytes type, making all integers big integers, and letting slash (/) denote true division rather than integer division by default. For the changes introduced in Python 3.0, see What’s New In Python 3.0; for all changes, see What’s New in Python.

Python 2 code can be made ready for a switch to Python 3 by importing features from the __future__ module. For instance, from __future__ import print_function makes Python 2 behave as if it had the Python 3 print function.

Support for Python 2.7 ended in 2020. Python 3 was first released in 2008.

A list of Python packages ready for Python 3 is available from py3readiness.org.

A survey conducted in 2018 by JetBrains and Python Software Foundation suggests significant adoption of Python 3 among Python users.



Decorators


Duplicated code is recognized as bad practice in software for lots of reasons, not least of which is that it requires more work to maintain. If you have the same algorithm operating twice on different pieces of data you can put the algorithm in a function and pass in the data to avoid having to duplicate the code. However, sometimes you find cases where the code itself changes, but two or more places still have significant chunks of duplicated boilerplate code. A typical example might be logging:

def multiply(a, b):
    result = a * b
    log("multiply has been called")
    return result

def add(a, b):
    result = a + b
    log("add has been called")
    return result

In a case like this, it's not obvious how to factor out the duplication. We can follow our earlier pattern of moving the common code to a function, but calling the function with different data is not enough to produce the different behavior we want (add or multiply). Instead, we have to pass a function to the common function. This involves a function that operates on a function, known as a higher-order function.
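One way to factor out the logging is sketched below. The names logged and wrapper are illustrative, and log here simply collects messages in a list, standing in for whatever logging facility the program actually uses.

```python
messages = []

def log(message):
    """Stand-in for a real logging call: collect messages in a list."""
    messages.append(message)

def logged(func):
    """Higher-order function: take a function, return a version that logs."""
    def wrapper(a, b):
        result = func(a, b)
        log(func.__name__ + " has been called")
        return result
    return wrapper

def multiply(a, b):
    return a * b

def add(a, b):
    return a + b

# Replace each function with its logging version.
multiply = logged(multiply)
add = logged(add)

print(multiply(3, 4))
print(add(3, 4))
print(messages)
```

The logging boilerplate now lives in one place, and each function body contains only its own arithmetic.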

A decorator in Python is syntactic sugar for applying a higher-order function to a function or method definition.

Minimal example of property decorator:

>>> class Foo(object):
...     @property
...     def bar(self):
...         return 'baz'
...
>>> F = Foo()
>>> print F.bar
baz

The above example is really just syntactic sugar for code like this:

>>> class Foo(object):
...     def bar(self):
...         return 'baz'
...     bar = property(bar)
...
>>> F = Foo()
>>> print F.bar
baz

A minimal example of a generic decorator:

>>> def decorator(f):
...     def called(*args, **kargs):
...         print 'A function is called somewhere'
...         return f(*args, **kargs)
...     return called
...
>>> class Foo(object):
...     @decorator
...     def bar(self):
...         return 'baz'
...
>>> F = Foo()
>>> print F.bar()
A function is called somewhere
baz

A good use for the decorators is to allow you to refactor your code so that common features can be moved into decorators. Consider for example, that you would like to trace all calls to some functions and print out the values of all the parameters of the functions for each invocation. Now you can implement this in a decorator as follows:

# Define the Trace class that will be
# invoked using decorators
class Trace(object):
    def __init__(self, f):
        self.f = f

    def __call__(self, *args, **kwargs):
        print "entering function " + self.f.__name__
        for i, arg in enumerate(args):
            print "arg {0}: {1}".format(i, arg)

        return self.f(*args, **kwargs)

Then you can use the decorator on any function that you defined by:

@Trace
def sum(a, b):
    print "inside sum"
    return a + b

On running this code you would see output like

>>> sum(3,2)
entering function sum
arg 0: 3
arg 1: 2
inside sum
5

Alternatively, instead of creating the decorator as a class, you could use a function.

def Trace(f):
    def my_f(*args, **kwargs):
        print "entering " +  f.__name__
        result= f(*args, **kwargs)
        print "exiting " +  f.__name__
        return result
    my_f.__name__ = f.__name__
    my_f.__doc__ = f.__doc__
    return my_f

#An example of the trace decorator
@Trace
def sum(a, b):
    print "inside sum"
    return a + b

#if you run this you should see
>>> sum(3,2)
entering sum
inside sum
exiting sum
5

Remember it is good practice to return the function or a sensible decorated replacement for the function so that decorators can be chained.
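Chaining can be sketched as follows. This example uses functools.wraps from the standard library to copy the name and docstring onto the replacement function, instead of assigning them by hand; the decorator names trace and double_result and the calls attribute are illustrative assumptions.

```python
import functools

def trace(f):
    """Decorator that records calls; functools.wraps copies __name__, __doc__, etc."""
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        wrapper.calls.append(f.__name__)
        return f(*args, **kwargs)
    wrapper.calls = []
    return wrapper

def double_result(f):
    """Decorator that doubles the wrapped function's return value."""
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        return 2 * f(*args, **kwargs)
    return wrapper

@trace            # applied second (outermost)
@double_result    # applied first (innermost)
def total(a, b):
    """Return the sum of a and b."""
    return a + b

print(total(3, 2))
print(total.__name__)
print(total.calls)
```

Because each decorator returns a callable with the original metadata, the next decorator in the chain can treat it exactly like the undecorated function.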



Context Managers


A basic issue in programming is resource management: a resource is anything in limited supply, notably file handles, network sockets, locks, etc., and a key problem is making sure these are released after they are acquired. If they are not released, you have a resource leak, and the system may slow down or crash. More generally, you may want cleanup actions to always be done, other than simply releasing resources.

Python provides special syntax for this in the with statement, which automatically manages resources encapsulated within context manager types, or more generally performs startup and cleanup actions around a block of code. You should always use a with statement for resource management. There are many built-in context manager types, including the basic example of File, and it is easy to write your own. The code is not hard, but the concepts are slightly subtle, and it is easy to make mistakes.

Basic resource management

Basic resource management uses an explicit pair of open()...close() functions, as in basic file opening and closing. Don’t do this, for the reasons we are about to explain:

f = open(filename)
# ...
f.close()

The key problem with this simple code is that it fails if there is an early return, either due to a return statement or an exception, possibly raised by called code. To fix this, ensuring that the cleanup code is called when the block is exited, one uses a try...finally clause:

f = open(filename)
try:
    # ...
finally:
    f.close()

However, this still requires manually releasing the resource, which might be forgotten, and the release code is distant from the acquisition code. The release can be done automatically by instead using with, which works because File is a context manager type:

with open(filename) as f:
    # ...

This assigns the value of open(filename) to f (this point is subtle and varies between context managers), and then automatically releases the resource, in this case calling f.close(), when the block exits.

Technical details

Newer objects are context managers (formally, instances of context manager types, that is, types that implement the context manager interface consisting of __enter__() and __exit__()), and thus can be used in with statements easily (see With Statement Context Managers).

For older file-like objects that have a close method but not __exit__(), you can use the contextlib.closing wrapper. If you need to roll your own, this is very easy, particularly using the @contextlib.contextmanager decorator.[1]

Context managers work by calling __enter__() when the with context is entered, binding the return value to the target of as, and calling __exit__() when the context is exited. There’s some subtlety about handling exceptions during exit, but you can ignore it for simple use.
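The sequence of calls can be made visible with a small class-based context manager. Managed and the events list are illustrative names for this sketch, not part of any library.

```python
class Managed(object):
    """Illustrative context manager that records the order of events."""

    def __init__(self, events):
        self.events = events

    def __enter__(self):
        self.events.append("enter")
        return "resource"          # bound to the target of `as`

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append("exit")
        return False               # do not suppress exceptions

events = []
with Managed(events) as r:
    events.append("body uses " + r)

try:
    with Managed(events):
        raise ValueError("boom")
except ValueError:
    events.append("exception propagated")

print(events)
```

The second with block shows that __exit__() runs even when the body raises, and that returning a false value from it lets the exception continue to propagate.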

More subtly, __init__() is called when an object is created, but __enter__() is called when a with context is entered.

The __init__()/__enter__() distinction is important to distinguish between single use, reusable and reentrant context managers. It’s not a meaningful distinction for the common use case of instantiating an object in the with clause, as follows:

with A() as a:
    ...

…in which case any single use context manager is fine.

However, in general it is a difference, notably when distinguishing a reusable context manager from the resource it is managing, as in here:

a_cm = A()
with a_cm as a:
   ...

Putting resource acquisition in __enter__() instead of __init__() gives a reusable context manager.
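A minimal sketch of a reusable context manager follows. Connection is a hypothetical resource that stands in for a socket, lock or file handle, and ConnectionManager is an illustrative name.

```python
class Connection(object):
    """Hypothetical resource; counts how many instances are currently open."""
    open_count = 0

    def __init__(self):
        Connection.open_count += 1

    def close(self):
        Connection.open_count -= 1

class ConnectionManager(object):
    """Reusable: a fresh resource is acquired on every entry."""

    def __enter__(self):
        self.conn = Connection()   # acquisition happens here, not in __init__
        return self.conn

    def __exit__(self, exc_type, exc_value, traceback):
        self.conn.close()
        return False

cm = ConnectionManager()           # no resource acquired yet
counts_inside = []
for _ in range(2):                 # the same manager is used twice
    with cm as conn:
        counts_inside.append(Connection.open_count)

print(counts_inside)
print(Connection.open_count)
```

Creating the manager acquires nothing; each with block opens one connection and closes it again on exit.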

Notably, file objects do the initialization in __init__() and then just return themselves when entering a context, as in def __enter__(self): return self. This is fine if you want the target of the as to be bound to an object (and allows you to use factories like open as the source of the with clause), but if you want it to be bound to something else, notably a handle (file name or file handle/file descriptor), you want to wrap the actual object in a separate context manager. For example:

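from contextlib import contextmanager

@contextmanager
def FileName(*args, **kwargs):
    # Open the file as usual, but bind the target of `as` to its name
    with open(*args, **kwargs) as f:
        yield f.name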

For simple uses you don’t need to do any __init__() code, and only need to pair __enter__()/__exit__(). For more complicated uses you can have reentrant context managers, but that’s not necessary for simple use.

Caveats

try...finally

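Note that a try...finally clause is necessary with @contextlib.contextmanager: an exception raised in the with block is re-raised inside the generator at the point of the yield, so cleanup code placed after the yield is skipped unless it is in a finally clause. It is not necessary in __exit__(), which is called even if an exception is raised.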
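The difference can be sketched with two generator-based context managers that record what they do in a list. The names without_finally, with_finally and log are illustrative.

```python
from contextlib import contextmanager

log = []

@contextmanager
def without_finally():
    log.append("acquire A")
    yield
    log.append("release A")        # skipped if the with block raises

@contextmanager
def with_finally():
    log.append("acquire B")
    try:
        yield
    finally:
        log.append("release B")    # runs even if the with block raises

for manager in (without_finally, with_finally):
    try:
        with manager():
            raise RuntimeError("failure inside the with block")
    except RuntimeError:
        pass

print(log)
```

Only the second manager releases its resource, because the exception from the with block surfaces at the yield and skips any code that is not protected by finally.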

Context, not scope

The term context manager is carefully chosen, particularly in contrast to “scope”. Local variables in Python have function scope, and thus the target of a with statement, if any, is still visible after the block has exited, though __exit__() has already been called on the context manager (the argument of the with statement), and thus is often not useful or valid. This is a technical point, but it’s worth distinguishing the with statement context from the overall function scope.
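A short sketch of this point, using a temporary file created only for the demonstration:

```python
import os
import tempfile

# Create a small temporary file to work with.
fd, path = tempfile.mkstemp()
os.close(fd)

with open(path, "w") as f:
    f.write("hello\n")

# The name f is still bound after the block, but the file is closed.
print(f.closed)
try:
    f.write("more")
except ValueError as e:
    print("ValueError: " + str(e))

os.remove(path)
```

The variable survives the block because it lives in the enclosing function or module scope; only the resource behind it has been released.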

Generators

Generators that hold or use resources are a bit tricky.

Beware that creating generators within a with statement and then using them outside the block does not work, because generators have deferred evaluation, and thus when they are evaluated, the resource has already been released. This is most easily seen using a file, as in this generator expression to convert a file to a list of lines, stripping the end-of-line character:

with open(filename) as f:
    lines = (line.rstrip('\n') for line in f)

When lines is then used – evaluation can be forced with list(lines) – this fails with ValueError: I/O operation on closed file. This is because the file is closed at the end of the with statement, but the lines are not read until the generator is evaluated.
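The failure can be reproduced with a temporary file. This is a sketch under the assumption of an ordinary text file; the file contents and variable names are illustrative.

```python
import os
import tempfile

# Write a two-line temporary file for the demonstration.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "w") as out:
    out.write("first\nsecond\n")

with open(path) as f:
    lazy = (line.rstrip("\n") for line in f)   # nothing is read yet

try:
    list(lazy)                                 # the file is already closed
    error = None
except ValueError as e:
    error = str(e)

print(error)
os.remove(path)
```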

The simplest solution is to avoid generators, and instead use lists, such as list comprehensions. This is generally appropriate in this case (reading a file) since one wishes to minimize system calls and just read the file all at once (unless the file is very large):

with open(filename) as f:
    lines = [line.rstrip('\n') for line in f]

In case that one does wish to use a resource in a generator, the resource must be held within the generator, as in this generator function:

def stripped_lines(filename):
    with open(filename) as f:
        for line in f:
            yield line.rstrip('\n')

As the nesting makes clear, the file is kept open while iterating through it.

To release the resource, the generator must be explicitly closed, using generator.close(), just as with other objects that hold resources (this is the dispose pattern). This can in turn be automated by wrapping the generator in a context manager, using contextlib.closing, as:

from contextlib import closing

with closing(stripped_lines(filename)) as lines:
    # ...

Not RAII

Resource Acquisition Is Initialization is an alternative form of resource management, particularly used in C++. In RAII, resources are acquired during object construction, and released during object destruction. In Python the analogous functions are __init__() and __del__() (finalizer), but RAII does not work in Python, and releasing resources in __del__() does not work. This is because there is no guarantee that __del__() will be called: it’s just for memory manager use, not for resource handling.

In more detail, Python object construction is two-phase, consisting of (memory) allocation in __new__() and (attribute) initialization in __init__(). CPython is garbage-collected primarily via reference counting, with objects being finalized (not destructed) by __del__(). However, finalization is non-deterministic (objects have non-deterministic lifetimes), and the finalizer may be called much later or not at all, particularly if the program crashes. Thus using __del__() for resource management will generally leak resources.

It is possible to use finalizers for resource management, but the resulting code is implementation-dependent (generally working in CPython but not other implementations, such as PyPy) and fragile to version changes. Even if this is done, it requires great care to ensure references drop to zero in all circumstances, including: exceptions, which contain references in tracebacks if caught or if running interactively; and references in global variables, which last until program termination. Prior to Python 3.4, finalizers on objects in cycles were also a serious problem, but this is no longer a problem; however, finalization of objects in cycles is not done in a deterministic order.

References



Reflection


A Python script can find out about the type, class, attributes and methods of an object. This is referred to as reflection or introspection. See also Metaclasses.

Reflection-enabling functions include type(), isinstance(), callable(), dir() and getattr().

Type

The type function returns the type of an object. The following tests return True:

  • type(3) is int
  • type(3.0) is float
  • type(10**10) is long # Python 2
  • type(1 + 1j) is complex
  • type('Hello') is str
  • type([1, 2]) is list
  • type([1, [2, 'Hello']]) is list
  • type({'city': 'Paris'}) is dict
  • type((1,2)) is tuple
  • type(set()) is set
  • type(frozenset()) is frozenset
  • ----
  • type(3).__name__ == "int"
  • type('Hello').__name__ == "str"
  • ----
  • import types, re, Tkinter # For the following examples
  • type(re) is types.ModuleType
  • type(re.sub) is types.FunctionType
  • type(Tkinter.Frame) is types.ClassType
  • type(Tkinter.Frame).__name__ == "classobj"
  • type(Tkinter.Frame()).__name__ == "instance"
  • type(re.compile('myregex')).__name__ == "SRE_Pattern"
  • type(type(3)) is types.TypeType

The type function disregards class inheritance: "type(3) is object" yields False while "isinstance(3, object)" yields True.
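The same contrast can be sketched with user-defined new-style classes (classes deriving from object, which is the only kind of class in Python 3). Plant and Tree are illustrative names.

```python
class Plant(object):
    pass

class Tree(Plant):
    pass

tree = Tree()

print(type(tree) is Tree)        # exact type
print(type(tree) is Plant)       # type() ignores base classes
print(isinstance(tree, Plant))   # isinstance() follows inheritance
print(isinstance(True, int))     # bool is a subclass of int
print(type(True) is int)
```

In most code isinstance is the better test, since it also accepts instances of subclasses.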

Isinstance

Determines whether an object is an instance of a type or class.

The following tests return True:

  • isinstance(3, int)
  • isinstance([1, 2], list)
  • isinstance(3, object)
  • isinstance([1, 2], object)
  • import Tkinter; isinstance(Tkinter.Frame(), Tkinter.Frame)
  • import Tkinter; Tkinter.Frame().__class__.__name__ == "Frame"

Note that isinstance provides a weaker condition than a comparison using type().

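The isinstance function with a user-defined class (Python 2, old-style class; with a new-style class, as in Python 3, type(tree) is Tree yields True and type(tree).__name__ is "Tree"):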

class Plant: pass                        # Dummy class
class Tree(Plant): pass                  # Dummy class derived from Plant
tree = Tree()                            # A new instance of Tree class
print isinstance(tree, Tree)             # True
print isinstance(tree, Plant)            # True
print isinstance(tree, object)           # True
print type(tree) is Tree                 # False
print type(tree).__name__ == "instance"  # True
print tree.__class__.__name__ == "Tree"  # True

Issubclass

Determines whether a class is a subclass of another class. Pertains to classes, not their instances.

class Plant: pass                        # Dummy class
class Tree(Plant): pass                  # Dummy class derived from Plant
tree = Tree()                            # A new instance of Tree class
print issubclass(Tree, Plant)            # True
print issubclass(Tree, object)           # False in Python 2
print issubclass(int, object)            # True
print issubclass(bool, int)              # True
print issubclass(int, int)               # True
print issubclass(tree, Plant)            # Error - tree is not a class

Duck typing

Duck typing provides an indirect means of reflection. It is a technique consisting in using an object as if it was of the requested type, while catching exceptions resulting from the object not supporting some of the features of the class or type.
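A minimal sketch of duck typing: the function below never checks types, it simply tries to use each item as something that has a length and handles the failure. The function name total_length and the rule of counting unsized items as 1 are illustrative choices.

```python
def total_length(items):
    """Sum len() of each item; items without a length count as 1."""
    total = 0
    for item in items:
        try:
            total += len(item)     # works for any object that supports len()
        except TypeError:          # raised when the object has no __len__
            total += 1
    return total

print(total_length(["abc", [1, 2], {"k": "v"}, 42]))
```

Any object that implements __len__(), including user-defined ones, is handled by the first branch without the function knowing its class.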

Callable

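For an object, determines whether it can be called. Instances of a class can be made callable by defining a __call__() method in the class; classes themselves are always callable, since calling a class creates an instance.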

Examples:

  • callable(2)
    • Returns False. Ditto for callable("Hello") and callable([1, 2]).
  • callable([1,2].pop)
    • Returns True, as pop without "()" returns a function object.
  • callable([1,2].pop())
    • Returns False, as [1,2].pop() returns 2 rather than a function object.
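The effect of __call__() on callable can be sketched with two small classes; Greeter and Plain are illustrative names.

```python
class Greeter(object):
    """Instances are callable because the class defines __call__."""

    def __init__(self, greeting):
        self.greeting = greeting

    def __call__(self, name):
        return self.greeting + ", " + name

class Plain(object):
    pass

greet = Greeter("Hello")
print(callable(greet))      # instance with __call__
print(greet("World"))
print(callable(Plain()))    # instance without __call__
print(callable(Plain))      # the class object itself is callable
```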

Dir

Returns the list of names of attributes of an object, which includes methods. The result is somewhat heuristic and possibly incomplete, as per python.org.

Examples:

  • dir(3)
  • dir("Hello")
  • dir([1, 2])
  • import re; dir(re)
    • Lists names of functions and other objects available in the re module for regular expressions.

Getattr

Returns the value of an attribute of an object, given the attribute name passed as a string.

An example:

  • getattr(3, "imag")

The list of attributes of an object can be obtained using dir().
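A slightly larger sketch shows looking up a method by name and supplying a default for a missing attribute. Shape, area and perimeter are illustrative names.

```python
class Shape(object):
    def area(self):
        return 0

shape = Shape()

method = getattr(shape, "area")           # look up a method by name
print(method())

print(getattr(shape, "perimeter", None))  # third argument: default if missing
print(hasattr(shape, "perimeter"))

try:
    getattr(shape, "perimeter")           # no default: AttributeError
except AttributeError:
    print("no such attribute")
```

Without the third argument, getattr behaves like ordinary attribute access and raises AttributeError for a missing name.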

Keywords

A list of Python keywords can be obtained from Python:

import keyword
pykeywords = keyword.kwlist
print keyword.iskeyword("if")      # True
print keyword.iskeyword("True")    # False in Python 2; True in Python 3

Built-ins

A list of Python built-in objects and functions can be obtained from Python:

print dir(__builtins__)           # Output the list
print type(__builtins__.list)     # = <type 'type'>
print type(__builtins__.open)     # = <type 'builtin_function_or_method'>
print list is __builtins__.list   # True
print open is __builtins__.open   # True



Metaclasses


In Python, classes are themselves objects. Just as other objects are instances of a particular class, classes themselves are instances of a metaclass.

Python 3

PEP 3115 defines the changes to metaclasses in Python 3. In Python 3, a metaclass can define a __prepare__ method that is called to create the dictionary or other mapping object used to store the class members.[1] The metaclass's __new__ method is then called to create the new class object.[2]
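The order of these calls can be sketched as follows, assuming Python 3. RecordingMeta, Example and the events list are illustrative names, and the namespace returned here is just a plain dict.

```python
class RecordingMeta(type):
    """Illustrative metaclass showing the order of __prepare__ and __new__."""
    events = []

    @classmethod
    def __prepare__(mcs, name, bases, **kwargs):
        RecordingMeta.events.append("prepare " + name)
        return {}                              # namespace for the class body

    def __new__(mcs, name, bases, namespace, **kwargs):
        RecordingMeta.events.append("new " + name)
        return super().__new__(mcs, name, bases, namespace)

class Example(metaclass=RecordingMeta):
    x = 1

print(RecordingMeta.events)
print(Example.x)
```

The class body is executed between the two calls, filling the namespace that __prepare__ returned before __new__ receives it.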

The type Metaclass

The metaclass for all standard Python types is the "type" object.

>>> type(object)
<type 'type'>
>>> type(int)
<type 'type'>
>>> type(list)
<type 'type'>

Just like list, int and object, "type" is itself a normal Python object, and is itself an instance of a class. In this case, it is in fact an instance of itself.

>>> type(type)
<type 'type'>

It can be instantiated to create new class objects similarly to the class factory example above by passing the name of the new class, the base classes to inherit from, and a dictionary defining the namespace to use.
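As a runnable sketch of the three-argument form of type, the following creates a class dynamically and uses it. BaseClass, MyClass and describe are illustrative names.

```python
class BaseClass(object):
    def describe(self):
        return "attribute is %d" % self.attribute

# Equivalent to: class MyClass(BaseClass): attribute = 42
MyClass = type("MyClass", (BaseClass,), {"attribute": 42})

obj = MyClass()
print(MyClass.__name__)
print(issubclass(MyClass, BaseClass))
print(obj.describe())
```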

For instance, the code:

>>> class MyClass(BaseClass):
...     attribute = 42

Could also be written as:

>>> MyClass = type("MyClass", (BaseClass,), {'attribute' : 42})

Metaclasses

It is possible to create a class with a different metaclass than type by passing the metaclass keyword argument when defining the class (Python 3 syntax). When this is done, the class and its subclasses are created using your custom metaclass. For example:

class CustomMetaclass(type):
    def __init__(cls, name, bases, dct):
        print("Creating class %s using CustomMetaclass" % name)
        super(CustomMetaclass, cls).__init__(name, bases, dct)

class BaseClass(metaclass=CustomMetaclass):
    pass

class Subclass1(BaseClass):
    pass

This will print

Creating class BaseClass using CustomMetaclass
Creating class Subclass1 using CustomMetaclass

By creating a custom metaclass in this way, it is possible to change how the class is constructed. This allows you to add or remove attributes and methods, register the creation of classes and subclasses, and perform various other manipulations when the class is created.
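Registration and attribute injection can be sketched with a small metaclass, assuming Python 3. RegistryMeta, the registry dictionary and the kind attribute are illustrative names, not an established API.

```python
class RegistryMeta(type):
    """Illustrative metaclass that registers every class created with it."""
    registry = {}

    def __init__(cls, name, bases, dct):
        super().__init__(name, bases, dct)
        RegistryMeta.registry[name] = cls      # record the new class
        cls.kind = name.lower()                # add an attribute at creation time

class Base(metaclass=RegistryMeta):
    pass

class Circle(Base):
    pass

print(sorted(RegistryMeta.registry))
print(Circle.kind)
```

Subclasses inherit the metaclass, so Circle is registered without mentioning RegistryMeta itself.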

References



Namespace




Performance

Since Python is an interpreted language in its most commonly used CPython implementation, it is many times slower in a variety of tasks than the most commonly used compiled non-managed languages such as C and C++; for some tasks, it is more than 100 times slower. CPython seems to be on par with Perl, another interpreted language, slower on some tasks, faster on other tasks.

Performance can be measured using benchmarks. Benchmarks are often far from representative of real-world usage and have to be taken with a grain of salt. Some benchmarks are outright wrong in that non-idiomatic code is used for a language, yielding avoidably low performance for the language.

PyPy is a just-in-time (JIT) compiler that often runs faster than CPython. Another compiler that can lead to greater speeds is Numba, which works for a subset of Python. Yet another compiler is Cython, not to be confused with CPython.



PyPy

PyPy is a Python interpreter containing a just-in-time compiler. Python programs can usually be run by PyPy without modification, except where they depend on third-party modules, since a module made for CPython, the normal Python interpreter, does not automatically work with PyPy. Furthermore, some Python programs can run into trouble because of PyPy's different strategy for deciding when to free allocated objects, including file handles.

The speed-up brought by PyPy compared to CPython depends on the nature of the task. For some computationally heavy tasks, the speed-up factor can reach as high as 50. PyPy speed center reports the geometric average speed-up factor as 7.6, calculated from a set of benchmarks.

PyPy is available both for Python 2 and Python 3, but the version for Python 3 is slower; the above speed statements pertain to Python 2.

Interactive use of PyPy is possible: you can type "pypy" into a command line, and start interacting with it just like with CPython.

The outputs from PyPy are not guaranteed to be exactly the same as from CPython. For instance, PyPy can yield items from a set in an order different from that of CPython since the item order in a set is arbitrary and not guaranteed to be the same between different Python implementations; for verification, you can compare the results of {1,2}.pop(). Dictionaries have an arbitrary key order as well in Python versions before 3.7; from Python 3.7 on, dictionaries preserve insertion order.

Floating point results may be slightly different between PyPy and CPython in some setups, namely when PyPy is built with the SSE2 instruction set enabled and CPython is not.

See also Performance.



Cython

Cython (not to be confused with CPython) is a compiler that translates Python-like source code to the C language, which is in turn compiled by a C compiler into binary code. The objective is a significant speedup compared to interpreting the Python code in CPython, the standard interpreter. Cython is usually used to create extension modules for Python. The source code language compilable by Cython is a near superset of Python.

You can install Cython using pip install Cython. However, in order for Cython to work, you will need a working C compiler. On Linux, you usually have one; on Windows, you can install and use Microsoft Visual C++ compiler or MinGW.

Beyond the normal Python, Cython-compilable Python source can contain C-like declarations of variable types, leading to speedups of the compiled code.

Cython-compilable Python source code files conventionally use extension pyx.

The compiled extension module still needs CPython (the normal Python interpreter) to run, and can call other Python modules, including pure Python modules. This is because, where required, Cython compiles to C code that uses the CPython API to achieve general Python-like behavior.



Command-line one-liners

Python can run one-liners from an operating system command line using option -c:

  • python -c "print(3.0/2)"
    • Calculates and outputs the result.
  • python -c "import math;print(math.sin(1))"
    • Imports a module required and outputs sine value.
  • python -c "for i in range(1,11):print(i)"
    • Uses a loop to output numbers from 1 to 10.
  • python -c "for i in range(1,11):for j in range(1,11): print(i,j)"
    • Does not work; two loops in a line are an invalid syntax.
  • python -c "for i, j in ((i,j) for i in range (1,11) for j in range(1,11)): print(i, j)"
    • Outputs pairs using two for clauses within a generator expression, which take the place of the nested loops.
  • echo hey | python -c "import sys,re;[sys.stdout.write(line) for line in sys.stdin if re.search('he.', line)]"
    • Acts as grep: outputs each line of the input containing a substring matching a regular expression. Not a Python one-liner strength.
  • echo hallo | python -c "import sys,re;[sys.stdout.write(re.sub('h[au]llo', 'hello', line)) for line in sys.stdin]"
    • Acts as sed: for each line of the input, performs a regex replacement and outputs the results. Again, not a Python one-liner strength.
  • python -m calendar
    • Outputs a year's calendar using calendar module.
  • python -c "import playsound as p;p.playsound(r'C:\WINDOWS\Media\notify.wav')"
    • On Windows, plays notification sound. Requires installation of playsound module. The module works across platforms; what is Windows specific above is the file path.



Tips and Tricks


There are many tips and tricks you can learn in Python:

Strings

  • Triple quotes are an easy way to define a string with both single and double quotes.
  • Repeated string concatenation can be expensive. Use percent formatting or str.join() instead:

(but don't worry about this unless your resulting string is more than 500-1000 characters long) [1]

print "Spam" + " eggs" + " and" + " spam"               # DON'T DO THIS
print " ".join(["Spam","eggs","and","spam"])            # Much faster/more
                                                        # common Python idiom
print "%s %s %s %s" % ("Spam", "eggs", "and", "spam")   # Also a pythonic way of
                                                        # doing it - very fast

Optimized C modules

Several modules have optimized versions written in C, which provide an almost-identical interface and are frequently much faster or more memory-efficient than the pure Python implementations. Module behavior does differ in some respects, but the differences are usually minor, so the C versions are frequently used.

This is primarily a Python 2.x feature, which has been largely removed in Python 3, with modules automatically using optimized implementations if available.[2] However, the cProfile/profile pair still exists (as of Python 3.4).

importing

The C version of a module named module or Module is called cModule, and frequently imported using import...as to strip off the prefix, as:

import cPickle as pickle

For compatibility, one can try to import the C version and fall back to the Python version if the C version is not available; in this case using import...as is required, so the code does not depend on which module was imported:

try:
    import cPickle as pickle
except ImportError:
    import pickle

Examples

Notable examples include:

  • (Python 2.x) cPickle for pickle, up to 1000× faster.
  • (Python 2.x) cStringIO for StringIO, replaced by io.StringIO in Python 3
  • cProfile for profile – the Python profile adds significant overhead, and thus cProfile is recommended for most use.
  • (Python 2.x and Python 3 before 3.3) cElementTree for ElementTree, 15–20 times faster and uses 2–5 times less memory;[3] not needed in Python 3.3+, which automatically uses a fast implementation if possible.

List comprehension and generators

  • List comprehensions and generator expressions are very useful for working with small, compact loops. Additionally, a list comprehension is often faster than an equivalent for-loop that appends to a list.
import os

directory = os.listdir(os.getcwd())       # Gets a list of files in the
                                          # directory the program runs from
filesInDir = [item for item in directory] # Normal For Loop rules apply, you
                                          # can add "if condition" to make a
                                          # more narrow search.
  • List comprehensions and generator expressions can be used to work with two (or more) lists with zip or, in Python 2, the lazy itertools.izip (in Python 3, zip itself is lazy)
[a - b for (a,b) in zip((1,2,3), (1,2,3))]  # will return [0, 0, 0]

Data type choice

Choosing the correct data type can be critical to the performance of an application. For example, say you have 2 lists:

list1 = [{'a': 1, 'b': 2}, {'c': 3, 'd': 4}, {'e': 5, 'f': 6}]
list2 = [{'e': 5, 'f': 6}, {'g': 7, 'h': 8}, {'i': 9, 'j': 10}]

and you want to find the entries common to both lists. You could iterate over one list, checking for common items in the other:

common = []
for entry in list1:
    if entry in list2:
        common.append(entry)

For such small lists, this will work fine, but for larger lists, for example if each contains thousands of entries, the following will be more efficient, and produces the same result:

# sorted() makes equal dictionaries produce equal tuples, whatever their key order
set1 = set(tuple(sorted(entry.items())) for entry in list1)
set2 = set(tuple(sorted(entry.items())) for entry in list2)
common = set1.intersection(set2)
common = [dict(entry) for entry in common]

Sets are optimized for speed in such functions. Dictionaries themselves cannot be used as members of a set as they are mutable, but tuples can. If one needs to do set operations on a list of dictionaries, one can convert the items to tuples and the list to a set, perform the operation, then convert back. This is often much faster than trying to replicate set operations using string functions.
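
One caveat with tuple(entry.items()) is that two equal dictionaries whose keys were inserted in a different order can produce different tuples. A possible variant, sketched here in Python 3 syntax, uses frozenset instead, which ignores order. It assumes that all dictionary values are hashable:

```python
list1 = [{'a': 1, 'b': 2}, {'c': 3, 'd': 4}, {'e': 5, 'f': 6}]
list2 = [{'f': 6, 'e': 5}, {'g': 7, 'h': 8}, {'i': 9, 'j': 10}]

# frozenset ignores the order of the key/value pairs, so two equal
# dictionaries always map to the same hashable value.
set1 = set(frozenset(entry.items()) for entry in list1)
set2 = set(frozenset(entry.items()) for entry in list2)

# Intersect the sets, then turn each member back into a dictionary.
common = [dict(items) for items in set1 & set2]
print(common == [{'e': 5, 'f': 6}])  # True
```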

Other

  • Decorators can be used for handling common concerns like logging, db access, etc.
  • While Python has no built-in function to flatten a list, you can use a recursive function to do the job.
def flatten(seq, result=None):
    """flatten(seq, result=None) -> list

    Return a flat version of the iterable `seq` appended to `result`.
    """
    if result is None:
        result = []
    if isinstance(seq, basestring):  # Strings are iterable, but iterating over
        result.append(seq)           # them would recurse forever, so keep them
        return result                # whole (use `str` in Python 3).
    try:                             # Can `seq` be iterated over?
        for item in seq:             # If so then iterate over `seq`
            flatten(item, result)    # and make the same check on each item.
    except TypeError:                # If seq isn't iterable
        result.append(seq)           # append it to the new list.
    return result
  • To keep the console window from closing as soon as a script launched on its own (for example, by double-clicking it) finishes, add this code at the end:
print 'Hit Enter to exit'
raw_input()   # input() in Python 3
  • Python already has a GUI built in: Tkinter, based on Tcl's Tk. More are available, such as PyQt4, pygtk3, and wxPython.
  • Ternary Operators:
[on_true] if [expression] else [on_false]

x, y = 50, 25

small = x if x < y else y
  • Booleans as indexes:
b = 1==1
name = "I am %s" % ["John","Doe"][b]
#returns I am Doe
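
As an illustration of the decorator tip in the list above, here is a minimal sketch of a logging decorator in Python 3 syntax. The names logged, call_log and add are hypothetical, and a plain list stands in for a real logger:

```python
import functools

call_log = []  # Hypothetical in-memory log, used here instead of a real logger.

def logged(func):
    """Record each call to `func` and its result in `call_log`."""
    @functools.wraps(func)       # Keep the wrapped function's name and docstring.
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        call_log.append((func.__name__, args, result))
        return result
    return wrapper

@logged
def add(a, b):
    return a + b

add(2, 3)
print(call_log)  # [('add', (2, 3), 5)]
```

The decorated function is used exactly like the original; the logging concern stays in one place and can be applied to any other function with @logged.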

References



Standard Library


The Python Standard Library is a collection of modules available to every Python program. It simplifies programming by removing the need to rewrite commonly used functionality. A module is made available by importing it, usually at the beginning of a script.

A list of the Standard Library modules can be found at http://www.python.org/doc/.

The following are among the most important:

  • time
  • sys
  • os
  • math
  • random
  • pickle
  • urllib
  • re
  • cgi
  • socket





Regular Expression


Python includes a module for working with regular expressions on strings. For more information about writing regular expressions and syntax not specific to Python, see the regular expressions wikibook. Python's regular expression syntax is similar to Perl's.

To start using regular expressions in your Python scripts, import the "re" module:

import re

Overview

Regular expression functions in Python at a glance:

import re
if re.search("l+","Hello"):        print 1  # Substring match suffices
if not re.match("ell.","Hello"):   print 2  # The beginning of the string has to match
if re.match(".el","Hello"):        print 3
if re.match("he..o","Hello",re.I): print 4  # Case-insensitive match
print re.sub("l+", "l", "Hello")            # Prints "Helo"; replacement AKA substitution
print re.sub(r"(.*)\1", r"\1", "HeyHey")    # Prints "Hey"; backreference
print re.sub("EY", "ey", "HEy", flags=re.I) # Prints "Hey"; case-insensitive sub
print re.sub(r"(?i)EY", r"ey", "HEy")       # Prints "Hey"; case-insensitive sub
for match in re.findall("l+.", "Hello Dolly"):
  print match                               # Prints "llo" and then "lly"
for match in re.findall("e(l+.)", "Hello Dolly"):
  print match                               # Prints "llo"; match picks group 1
for match in re.findall("(l+)(.)", "Hello Dolly"):
  print match[0], match[1]                  # The groups end up as items in a tuple
match = re.match("(Hello|Hi) (Tom|Thom)","Hello Tom Bombadil")
if match:                                 # Equivalent to if match is not None
  print match.group(0)                    # Prints the whole match disregarding groups
  print match.group(1) + match.group(2)   # Prints "HelloTom"

Matching and searching

One of the most common uses for regular expressions is extracting a part of a string or testing for the existence of a pattern in a string. Python offers several functions to do this.

The match and search functions do mostly the same thing, except that the match function will only return a result if the pattern matches at the beginning of the string being searched, while search will find a match anywhere in the string.

>>> import re
>>> foo = re.compile(r'foo(.{,5})bar', re.I+re.S)
>>> st1 = 'Foo, Bar, Baz'
>>> st2 = '2. foo is bar'
>>> search1 = foo.search(st1)
>>> search2 = foo.search(st2)
>>> match1 = foo.match(st1)
>>> match2 = foo.match(st2)

In this example, match2 will be None, because the string st2 does not start with the given pattern. The other 3 results will be Match objects (see below).

You can also match and search without compiling a regexp:

>>> search3 = re.search('oo.*ba', st1, re.I)

Here we use the search function of the re module, rather than of the pattern object. In most cases, it's best to compile the expression first. In older Python versions (before 2.7), not all of the re module functions support the flags argument, and if the expression is used more than once, compiling first is more efficient and leads to cleaner-looking code.

The compiled pattern object functions also have parameters for starting and ending the search, to search in a substring of the given string. In the first example in this section, match2 returns no result because the pattern does not start at the beginning of the string, but if we do:

>>> match3 = foo.match(st2, 3)

it works, because we tell it to start matching at index 3 of the string (the fourth character, since indexing starts at 0).
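
The difference between match and search, and the effect of the start position, can be checked with a short sketch. It reuses the pattern and st2 from above but is written in Python 3 syntax:

```python
import re

foo = re.compile(r'foo(.{,5})bar', re.I | re.S)
st2 = '2. foo is bar'

print(foo.match(st2))               # None: the pattern is not at index 0
print(foo.search(st2).group())      # foo is bar
print(foo.match(st2, 3).group())    # foo is bar (matching starts at index 3)
print(foo.search(st2, 4))           # None: no "foo" at or after index 4
```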

What if we want to search for multiple instances of the pattern? Then we have two options. We can use the start and end position parameters of the search and match function in a loop, getting the position to start at from the previous match object (see below) or we can use the findall and finditer functions. The findall function returns a list of matching strings, useful for simple searching. For anything slightly complex, the finditer function should be used. This returns an iterator object, that when used in a loop, yields Match objects. For example:

>>> str3 = 'foo, Bar Foo. BAR FoO: bar'
>>> foo.findall(str3)
[', ', '. ', ': ']
>>> for match in foo.finditer(str3):
...     match.group(1)
...
', '
'. '
': '

If you're going to be iterating over the results of the search, using the finditer function is almost always a better choice.

Match objects

Match objects are returned by the search and match functions, and include information about the pattern match.

The group function returns a string corresponding to a capture group (part of a regexp wrapped in ()) of the expression, or if no group number is given, the entire match. Using the search1 variable we defined above:

>>> search1.group()
'Foo, Bar'
>>> search1.group(1)
', '

Capture groups can also be given string names using a special syntax and referred to by matchobj.group('name'). For simple expressions this is unnecessary, but for more complex expressions it can be very useful.
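
A minimal sketch of named groups, in Python 3 syntax. The date pattern and the sample text are made up for illustration:

```python
import re

# (?P<name>...) gives a capture group a name in addition to its number.
date_pattern = re.compile(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})')
m = date_pattern.search('Released on 2010-07-03.')

print(m.group('year'))    # 2010
print(m.group(2))         # 07 (named groups are still numbered)
print(m.groupdict() == {'year': '2010', 'month': '07', 'day': '03'})  # True
```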

You can also get the position of a match or a group in a string, using the start and end functions:

>>> search1.start()
0
>>> search1.end()
8
>>> search1.start(1)
3
>>> search1.end(1)
5

This returns the start and end locations of the entire match, and the start and end of the first (and in this case only) capture group, respectively.

Replacing

Another use for regular expressions is replacing text in a string. To do this in Python, use the sub function.

The sub method of a pattern object takes up to 3 arguments: the replacement text, the text to replace in, and, optionally, the maximum number of substitutions to make. Unlike the matching and searching functions, sub returns a string, consisting of the given text with the substitution(s) made.

>>> import re
>>> mystring = 'This string has a q in it'
>>> pattern = re.compile(r'(a[n]? )(\w) ')
>>> newstring = pattern.sub(r"\1'\2' ", mystring)
>>> newstring
"This string has a 'q' in it"

This takes any single alphanumeric character (\w in regular expression syntax) preceded by "a" or "an" and wraps it in single quotes. The \1 and \2 in the replacement string are backreferences to the 2 capture groups in the expression; these would be group(1) and group(2) on a Match object from a search.

The subn function is similar to sub, except it returns a tuple, consisting of the result string and the number of replacements made. Using the string and expression from before:

>>> subresult = pattern.subn(r"\1'\2' ", mystring)
>>> subresult
("This string has a 'q' in it", 1)

Replacing without constructing and compiling a pattern object:

>>> result = re.sub(r"b.*d","z","abccde")
>>> result
'aze'
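
Two further features of sub are worth a short sketch (Python 3 syntax, with made-up sample text): the optional count argument, and the possibility of passing a function as the replacement. The helper name double is hypothetical:

```python
import re

text = 'one 1 two 22 three 333'

# count limits the number of substitutions (here: only the first match).
print(re.sub(r'\d+', '#', text, count=1))    # one # two 22 three 333

# The replacement may also be a function that receives each Match object
# and returns the replacement string.
def double(match):
    return str(int(match.group()) * 2)

print(re.sub(r'\d+', double, text))          # one 2 two 44 three 666
```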

Splitting


The split function splits a string based on a given regular expression:

>>> import re
>>> mystring = '1. First part 2. Second part 3. Third part'
>>> re.split(r'\d\.', mystring)
['', ' First part ', ' Second part ', ' Third part']

Escaping

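The escape function escapes the characters in a string that could otherwise be interpreted as regexp metacharacters (older Python versions escape all non-alphanumeric characters, as in the output below; since Python 3.7 only characters with a special meaning are escaped). This is useful if you need to take an unknown string that may contain regexp metacharacters like ( and . and create a regular expression from it.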

>>> re.escape(r'This text (and this) must be escaped with a "\" to use in a regexp.')
'This\\ text\\ \\(and\\ this\\)\\ must\\ be\\ escaped\\ with\\ a\\ \\"\\\\\\"\\ to\\ use\\ in\\ a\\ regexp\\.'
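
A typical use is searching for a user-supplied string literally. The following sketch, in Python 3 syntax with made-up input, compares an escaped and an unescaped pattern:

```python
import re

user_text = 'price (USD) 3.50'            # Illustrative input containing ( ) and .
pattern = re.compile(re.escape(user_text))

print(bool(pattern.search('Total price (USD) 3.50 today')))   # True
print(bool(pattern.search('Total price USD 3950 today')))     # False
# Without escaping, "." matches any character and "(USD)" becomes a group:
print(bool(re.search(user_text, 'price USD 3x50')))           # True
```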

Flags

The flags that can be used with regular expressions:

Abbreviation  Full name      Description
re.I          re.IGNORECASE  Makes the regexp case-insensitive.
re.L          re.LOCALE      Makes the behavior of some special sequences (\w, \W, \b, \B, \s, \S) dependent on the current locale.
re.M          re.MULTILINE   Makes the ^ and $ characters match at the beginning and end of each line, rather than just the beginning and end of the string.
re.S          re.DOTALL      Makes the . character match every character including newlines.
re.U          re.UNICODE     Makes \w, \W, \b, \B, \d, \D, \s, \S dependent on Unicode character properties.
re.X          re.VERBOSE     Ignores whitespace except when in a character class or preceded by a non-escaped backslash, and ignores # (except when in a character class or preceded by a non-escaped backslash) and everything after it to the end of a line, so it can be used as a comment. This allows for cleaner-looking regexps.
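
The effect of some of these flags can be seen in a short sketch (Python 3 syntax; the sample text and the number pattern are made up):

```python
import re

text = "first line\nsecond line"

# Without re.M, ^ only matches at the very start of the string.
print(re.findall(r'^\w+', text))          # ['first']
print(re.findall(r'^\w+', text, re.M))    # ['first', 'second']

# Without re.S, . does not match the newline between the two lines.
print(re.search(r'line.second', text))                    # None
print(re.search(r'line.second', text, re.S) is not None)  # True

# re.X allows whitespace and comments inside the pattern; flags combine with |.
number = re.compile(r"""
    (\d+)      # integer part
    \.         # literal dot
    (\d+)      # fractional part
    """, re.X | re.I)
print(number.search('pi is about 3.14').groups())   # ('3', '14')
```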

Pattern objects

If you're going to be using the same regexp more than once in a program, or if you just want to keep the regexps separated somehow, you should create a pattern object, and refer to it later when searching/replacing.

To create a pattern object, use the compile function.

import re
foo = re.compile(r'foo(.{,5})bar', re.I+re.S)

The first argument is the pattern, which matches the string "foo", followed by up to 5 of any character, then the string "bar", storing the middle characters in a capture group (see Match objects). The second, optional, argument is the flag or flags that modify the regexp's behavior. The flags themselves are simply variables referring to an integer used by the regular expression engine; several flags are usually combined with the bitwise OR operator, as in re.I | re.S, which for distinct flags gives the same result as the addition used here. In other languages, these would be constants, but Python does not have constants. In older Python versions, some of the regular expression functions do not accept flags as a parameter when the pattern is given directly to the function; if you need any of the flags there, it is best to use the compile function to create a pattern object.

The r preceding the expression string indicates that it should be treated as a raw string. This should normally be used when writing regexps, so that backslashes are interpreted literally rather than having to be escaped.




External commands

The traditional way of executing external commands is using os.system():

import os
os.system("dir")
os.system("echo Hello")
exitCode = os.system("echotypo")

The modern way, since Python 2.4, is to use the subprocess module:

import subprocess
# On Windows, echo and dir are shell built-ins, so pass shell=True there
subprocess.call(["echo", "Hello"])
exitCode = subprocess.call(["dir", "nonexistent"])

The traditional way of executing external commands and reading their output is via the popen2 module (deprecated since Python 2.6 and removed in Python 3):

import popen2
readStream, writeStream, errorStream = popen2.popen3("dir")
# allLines = readStream.readlines()
for line in readStream:
  print line.rstrip()
readStream.close()
writeStream.close()
errorStream.close()

The modern way, since Python 2.4, is to use the subprocess module:

import subprocess
process = subprocess.Popen(["echo","Hello"], stdout=subprocess.PIPE)
for line in process.stdout:
   print line.rstrip()
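
When the whole output is wanted at once, subprocess.check_output can be used instead of reading from a pipe. The sketch below is written in Python 3 syntax; it runs the Python interpreter itself as the external command, only so that the example does not depend on a particular operating system:

```python
import subprocess
import sys

# sys.executable is the running Python interpreter; it stands in for
# any external command here.
command = [sys.executable, "-c", "print('Hello')"]

# check_output runs the command and returns what it wrote to stdout.
output = subprocess.check_output(command, universal_newlines=True)
print(output.rstrip())          # Hello

# call() returns the exit code of the child process.
exit_code = subprocess.call([sys.executable, "-c", "import sys; sys.exit(3)"])
print(exit_code)                # 3
```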

Keywords: system commands, shell commands, processes, backtick, pipe.



XML Tools


Introduction

Python includes several modules for manipulating XML.

Note: the examples below use the Python 2 print statement; in Python 3, print is a function and its arguments must be enclosed in parentheses.

xml.sax.handler

import xml.sax.handler as saxhandler
import xml.sax as saxparser

class MyReport:
    def __init__(self):
        self.Y = 1


class MyCH(saxhandler.ContentHandler):
    def __init__(self, report):
        self.X = 1
        self.report = report

    def startDocument(self):
        print 'startDocument'

    def startElement(self, name, attrs):
        print 'Element:', name

report = MyReport()          #for future use
ch = MyCH(report)

xml = """\
<collection>
  <comic title=\"Sandman\" number='62'>
     <writer>Neil Gaiman</writer>
     <penciller pages='1-9,18-24'>Glyn Dillon</penciller>
     <penciller pages="10-17">Charles Vess</penciller>
  </comic>
</collection>
"""

print xml

saxparser.parseString(xml, ch)
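
The handler above prints as it parses. A handler that collects what it sees is often easier to reuse; the following sketch does so in Python 3 syntax. The class name ElementCollector and its attributes are hypothetical, and the document is a shortened copy of the one above:

```python
import xml.sax
import xml.sax.handler

class ElementCollector(xml.sax.handler.ContentHandler):
    """Collect element names and the writer's name instead of printing."""
    def __init__(self):
        xml.sax.handler.ContentHandler.__init__(self)
        self.names = []
        self.writer_chunks = []     # characters() may deliver text in pieces
        self._in_writer = False

    def startElement(self, name, attrs):
        self.names.append(name)
        self._in_writer = (name == "writer")

    def endElement(self, name):
        self._in_writer = False

    def characters(self, content):
        if self._in_writer:
            self.writer_chunks.append(content)

document = b"""<collection>
  <comic title="Sandman" number="62">
     <writer>Neil Gaiman</writer>
     <penciller pages="1-9,18-24">Glyn Dillon</penciller>
  </comic>
</collection>
"""

collector = ElementCollector()
xml.sax.parseString(document, collector)
print(collector.names)                    # ['collection', 'comic', 'writer', 'penciller']
print("".join(collector.writer_chunks))   # Neil Gaiman
```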

xml.dom.minidom

An example of doing RSS feed parsing with DOM

from xml.dom import minidom as dom
import urllib2

def fetchPage(url):
    a = urllib2.urlopen(url)
    return ''.join(a.readlines())

def extract(page):
    a = dom.parseString(page)
    item = a.getElementsByTagName('item')
    for i in item:
        if i.hasChildNodes():
            t = i.getElementsByTagName('title')[0].firstChild.wholeText
            l = i.getElementsByTagName('link')[0].firstChild.wholeText
            d = i.getElementsByTagName('description')[0].firstChild.wholeText
            print t, l, d

if __name__=='__main__':
    page = fetchPage("http://rss.slashdot.org/Slashdot/slashdot")
    extract(page)

XML document provided by pyxml documentation.
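
The DOM calls used in the RSS example can also be tried without network access. The sketch below, in Python 3 syntax, parses a tiny made-up feed from a string; the function name extract_items is hypothetical:

```python
from xml.dom import minidom

# A tiny made-up feed, so the sketch runs without network access.
rss = """<rss version="2.0"><channel>
  <item><title>First post</title><link>http://example.com/1</link></item>
  <item><title>Second post</title><link>http://example.com/2</link></item>
</channel></rss>"""

def extract_items(page):
    """Return a list of (title, link) pairs from an RSS document string."""
    doc = minidom.parseString(page)
    result = []
    for item in doc.getElementsByTagName('item'):
        # firstChild is the text node inside the element; .data is its text.
        title = item.getElementsByTagName('title')[0].firstChild.data
        link = item.getElementsByTagName('link')[0].firstChild.data
        result.append((title, link))
    return result

print(extract_items(rss))
# [('First post', 'http://example.com/1'), ('Second post', 'http://example.com/2')]
```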



Email


Python includes several modules in the standard library for working with emails and email servers.

Sending mail

Sending mail is done with Python's smtplib using an SMTP (Simple Mail Transfer Protocol) server. Actual usage varies depending on the complexity of the email and the settings of the email server; the instructions here are based on sending email through Google's Gmail.

The first step is to create an SMTP object; each object is used for a connection with one server.

import smtplib
server = smtplib.SMTP('smtp.gmail.com', 587)

The first argument is the server's hostname, the second is the port. The port used varies depending on the server.

Next, we need to do a few steps to set up the proper connection for sending mail.

server.ehlo()
server.starttls()
server.ehlo()

These steps may not be necessary depending on the server you connect to. ehlo() is used for ESMTP servers; for non-ESMTP servers, use helo() instead. See Wikipedia's article about the SMTP protocol for more information about this. The starttls() function starts Transport Layer Security mode, which is required by Gmail. Other mail systems may not use this, or it may not be available.

Next, log in to the server:

server.login("youremailusername", "password")

Then, send the mail:

msg = "\nHello!" # The \n separates the message from the headers (which we ignore for this example)
server.sendmail("you@gmail.com", "target@example.com", msg)

Note that this is a rather crude example; it doesn't include a subject or any other headers. For those, one should use the email package.

The email package

Python's email package contains many classes and functions for composing and parsing email messages; this section only covers a small subset useful for sending emails.

We start by importing only the classes we need; this also saves us from having to use the full module name later.

from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart

Then we compose some of the basic message headers:

fromaddr = "you@gmail.com"
toaddr = "target@example.com"
msg = MIMEMultipart()
msg['From'] = fromaddr
msg['To'] = toaddr
msg['Subject'] = "Python email"

Next, we attach the body of the email to the MIME message:

body = "Python test mail"
msg.attach(MIMEText(body, 'plain'))

To send the mail, we have to convert the object to a string, and then use the same procedure as above to send it using the SMTP server.

import smtplib
server = smtplib.SMTP('smtp.gmail.com', 587)
server.ehlo()
server.starttls()
server.ehlo()
server.login("youremailusername", "password")
text = msg.as_string()
server.sendmail(fromaddr, toaddr, text)

If we look at the text, we can see that it has all the headers and structure necessary for a MIME-formatted email. See MIME for more details on the standard:

The full text of our example message
>>> print text
Content-Type: multipart/mixed; boundary="===============1893313573=="
MIME-Version: 1.0
From: you@gmail.com
To: target@example.com
Subject: Python email

--===============1893313573==
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit

Python test mail
--===============1893313573==--
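
The composing steps can be tried without an SMTP server by building the message and parsing the resulting string back with email.message_from_string. The following is a sketch in Python 3 syntax; the function name build_message is hypothetical:

```python
import email
from email.mime.text import MIMEText
from email.mime.multipart import MIMEMultipart

def build_message(fromaddr, toaddr, subject, body):
    """Compose a multipart message and return it as a string."""
    msg = MIMEMultipart()
    msg['From'] = fromaddr
    msg['To'] = toaddr
    msg['Subject'] = subject
    msg.attach(MIMEText(body, 'plain'))
    return msg.as_string()

text = build_message("you@gmail.com", "target@example.com",
                     "Python email", "Python test mail")

# Parse the string back into a message object to inspect it.
parsed = email.message_from_string(text)
print(parsed['Subject'])                       # Python email
print(parsed.get_content_type())               # multipart/mixed
print(parsed.get_payload()[0].get_payload())   # Python test mail
```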




Threading


Threading in Python is used to run multiple threads (tasks, function calls) at the same time. Note that this does not mean that they are executed on different CPUs. Python threads will NOT make your program faster if it already uses 100% CPU time. In that case, you probably want to look into parallel programming, for example with the multiprocessing module of the standard library.

Python threads are used in cases where the execution of a task involves some waiting. One example would be interaction with a service hosted on another computer, such as a webserver. Threading allows Python to execute other code while waiting; this is easily simulated with the sleep function.

Examples

A Minimal Example with Function Call

Make a thread that prints numbers from 1-10 and waits a second between each print:

import threading
import time

def loop1_10():
    for i in range(1, 11):
        time.sleep(1)
        print(i)

threading.Thread(target=loop1_10).start()

A Minimal Example with Object

#!/usr/bin/env python

import threading
import time


class MyThread(threading.Thread):
    def run(self):                                         # Default called function with mythread.start()
        print("{} started!".format(self.getName()))        # "Thread-x started!"
        time.sleep(1)                                      # Pretend to work for a second
        print("{} finished!".format(self.getName()))       # "Thread-x finished!"

def main():
    for x in range(4):                                     # Four times...
        mythread = MyThread(name = "Thread-{}".format(x))  # ...Instantiate a thread and pass a unique ID to it
        mythread.start()                                   # ...Start the thread, run method will be invoked
        time.sleep(.9)                                     # ...Wait 0.9 seconds before starting another

if __name__ == '__main__':
    main()

The output looks like this:

Thread-0 started!
Thread-1 started!
Thread-0 finished!
Thread-2 started!
Thread-1 finished!
Thread-3 started!
Thread-2 finished!
Thread-3 finished!
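
To see that waiting tasks overlap, one can start several threads, wait for all of them with join(), and measure the total time. In the sketch below the function fetch is a hypothetical stand-in for a task that mostly waits, such as a network request:

```python
import threading
import time

results = []

def fetch(task_id, delay):
    """Stand-in for a task that mostly waits, such as a network request."""
    time.sleep(delay)
    results.append(task_id)     # list.append is safe to call from several threads

start = time.time()
threads = [threading.Thread(target=fetch, args=(i, 0.2)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()                    # Wait until every thread has finished
elapsed = time.time() - start

print(sorted(results))          # [0, 1, 2, 3]
print("elapsed: %.2f s" % elapsed)   # Typically close to 0.2 s rather than 4 x 0.2 s
```

Without join(), the main program would continue (and possibly read results) before the threads are done.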




Sockets


HTTP Client

Make a very simple HTTP client:

import socket
s = socket.socket()
s.connect(('localhost', 80))
s.send(b'GET / HTTP/1.1\r\nHost: localhost\r\n\r\n') # HTTP header lines end with CRLF
s.recv(40000) # receive up to 40000 bytes
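
The client above needs a web server listening on port 80. To experiment without one, a throwaway server from the standard library can be started on a free local port. The following is a sketch in Python 3 syntax; HelloHandler and http_get are hypothetical names:

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class HelloHandler(BaseHTTPRequestHandler):
    """Tiny test server that answers every GET with a fixed text."""
    def do_GET(self):
        body = b"Hello from the test server\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass                                  # Keep the console quiet

def http_get(host, port, path="/"):
    """Send a GET request over a raw socket and return the raw response."""
    request = "GET {} HTTP/1.1\r\nHost: {}\r\nConnection: close\r\n\r\n".format(path, host)
    chunks = []
    with socket.create_connection((host, port), timeout=5) as s:
        s.sendall(request.encode("ascii"))
        while True:                           # Read until the server closes
            data = s.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)

server = HTTPServer(("127.0.0.1", 0), HelloHandler)   # Port 0: pick a free port
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

response = http_get("127.0.0.1", port)
server.shutdown()
server.server_close()

status_line = response.split(b"\r\n", 1)[0]
print(status_line.decode("ascii"))            # Status line, ending in "200 OK"
```

Reading in a loop until recv returns an empty result avoids assuming that the whole response arrives in a single recv call.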

NTP/Sockets

Connecting to and reading an NTP time server, returning the time as follows

ntpps       picoseconds portion of time
ntps        seconds portion of time
ntpms       milliseconds portion of time
ntpt        64-bit ntp time, seconds in upper 32-bits, picoseconds in lower 32-bits
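
The arithmetic behind these values can be sketched without contacting a server. In a 64-bit NTP timestamp the upper 32 bits count seconds since 1 January 1900, and the lower 32 bits hold a binary fraction of a second, which can be converted to picoseconds. The function name split_ntp_time is hypothetical, and treating ntpms and ntpps as the fractional part expressed in milliseconds and picoseconds is an assumption:

```python
NTP_TO_UNIX_EPOCH = 2208988800   # Seconds between 1900-01-01 and 1970-01-01

def split_ntp_time(ntpt):
    """Split a 64-bit NTP timestamp into (ntps, ntpms, ntpps)."""
    ntps = ntpt >> 32                     # Upper 32 bits: whole seconds since 1900
    fraction = ntpt & 0xFFFFFFFF          # Lower 32 bits: fraction in units of 2**-32 s
    ntpps = (fraction * 10**12) >> 32     # Fraction converted to picoseconds
    ntpms = ntpps // 10**9                # Fraction converted to milliseconds
    return ntps, ntpms, ntpps

# Made-up timestamp: half a second after the Unix epoch.
ntpt = (NTP_TO_UNIX_EPOCH << 32) | (1 << 31)
ntps, ntpms, ntpps = split_ntp_time(ntpt)
print(ntps - NTP_TO_UNIX_EPOCH)   # 0
print(ntpms)                      # 500
print(ntpps)                      # 500000000000
```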



GUI Programming


There are various GUI toolkits usable from Python.

True GUI builders can be very productive: the programmer arranges the GUI window and other components, such as a database, using only the mouse, in an intuitive fashion like in Delphi 2.0 on Windows. Very little typing is required. For Python, only Boa Constructor follows this paradigm. wxGlade, Qt Designer, Monkey Studio and similar tools come close but remain incomplete.

Disadvantages of the toolkits described below are:

  • Difficult deployment - the apps won't run on a particular GNU/Linux installation without major additional work.
  • Breakage - apps stop working over time due to bit rot.

Tkinter

Tkinter is a Python wrapper for Tcl/Tk providing a cross-platform GUI toolkit. On Windows, it comes bundled with Python; on other operating systems, it can be installed. The set of available widgets is smaller than in some other toolkits, but since Tkinter widgets are extensible, many of the missing compound widgets, such as a combo box or a scrolling pane, can be built from the existing ones.

A minimal example:

from Tkinter import *
root = Tk()
frame = Frame(root)
frame.pack()
label = Label(frame, text="Hey there.")
label.pack()
quitButton = Button(frame, text="Quit", command=frame.quit)
quitButton.pack()
root.mainloop()

Main chapter: Tkinter.

PyGTK

See also book PyGTK For GUI Programming

PyGTK provides a convenient wrapper for the GTK+ library for use in Python programs, taking care of many of the boring details such as managing memory and type casting. The bare GTK+ toolkit runs on Linux, Windows, and Mac OS X (port in progress), but the more extensive features when combined with PyORBit and gnome-python require a GNOME install, and can be used to write full featured GNOME applications.

PyQt

PyQt is a wrapper around the cross-platform Qt C++ toolkit. It has many widgets and support classes supporting SQL, OpenGL, SVG, XML, and advanced graphics capabilities. A PyQt hello world example:

from PyQt4.QtCore import *
from PyQt4.QtGui import *

class App(QApplication):
    def __init__(self, argv):
        super(App, self).__init__(argv)
        self.msg = QLabel("Hello, World!")
        self.msg.show()

if __name__ == "__main__":
    import sys
    app = App(sys.argv)
    sys.exit(app.exec_())

PyQt is a set of bindings for the cross-platform Qt application framework. PyQt v4 supports Qt4 and PyQt v3 supports Qt3 and earlier.

wxPython

wxPython provides bindings for the cross-platform toolkit wxWidgets. wxWidgets is available on Windows, Macintosh, and Unix/Linux.

import wx

class test(wx.App):
    def __init__(self):
        wx.App.__init__(self, redirect=False)

    def OnInit(self):
        frame = wx.Frame(None, -1,
                         "Test",
                         pos=(50,50), size=(100,40),
                         style=wx.DEFAULT_FRAME_STYLE)
        button = wx.Button(frame, -1, "Hello World!", (20, 20))
        self.frame = frame
        self.frame.Show()
        return True

if __name__ == '__main__':
        app = test()
        app.MainLoop()

Dabo

Dabo is a full 3-tier application framework. Its UI layer wraps wxPython, and greatly simplifies the syntax.

import dabo
dabo.ui.loadUI("wx")

class TestForm(dabo.ui.dForm):
	def afterInit(self):
		self.Caption = "Test"
		self.Position = (50, 50)
		self.Size = (100, 40)
		self.btn = dabo.ui.dButton(self, Caption="Hello World",
		      OnHit=self.onButtonClick)
		self.Sizer.append(self.btn, halign="center", border=20)
	
	def onButtonClick(self, evt):
		dabo.ui.info("Hello World!")

if __name__ == '__main__':
        app = dabo.ui.dApp()
        app.MainFormClass = TestForm
        app.start()


pyFltk

pyFltk is a Python wrapper for FLTK, a lightweight cross-platform GUI toolkit. It is very simple to learn and allows for compact user interfaces.

The "Hello World" example in pyFltk looks like:

from fltk import *

window = Fl_Window(100, 100, 200, 90)
button = Fl_Button(9,20,180,50)
button.label("Hello World")
window.end()
window.show()
Fl.run()

Other Toolkits

  • PyKDE - Part of the kdebindings package, it provides a Python wrapper for the KDE libraries.
  • PyXPCOM provides a wrapper around the Mozilla XPCOM component architecture, thereby enabling the use of standalone XUL applications in Python. The XUL toolkit has traditionally been wrapped up in various other parts of XPCOM, but with the advent of libxul and XULRunner this should become more feasible. PyXPCOM now appears to be little used: its documentation links are dead and the related Firefox extensions are outdated and incompatible.




Tkinter

Tkinter is a Python wrapper for Tcl/Tk providing a cross-platform GUI toolkit. On Windows, it comes bundled with Python; on other operating systems, it can be installed. The set of available widgets is smaller than in some other toolkits, but since Tkinter widgets are extensible, many of the missing compound widgets, such as a combo box or a scrolling pane, can be built from the existing ones.

IDLE, Python's Integrated Development and Learning Environment, is written using Tkinter and is often distributed with Python. You can learn about features of Tkinter by playing around with menus and dialogs of IDLE. For instance, Options > Configure IDLE... dialog shows a broad variety of GUI elements including tabbed interface. You can learn about programming using Tkinter by studying IDLE source code, which, on Windows, is available e.g. in C:\Program Files\Python27\Lib\idlelib.

Python 3: The examples on this page are for Python 2. In Python 3, what was previously module Tkinter is tkinter, what was tkMessageBox is messagebox, etc.

Minimal example

A minimal example:

from Tkinter import *
root = Tk()
frame = Frame(root)
frame.pack()
label = Label(frame, text="Hey there.")
label.pack()
quitButton = Button(frame, text="Quit", command=frame.quit)
quitButton.pack()
root.mainloop()

A minimal example made more compact - later references to GUI items not required:

from Tkinter import *
root = Tk()
frame = Frame(root)
frame.pack()
Label(frame, text="Hey there.").pack()
Button(frame, text="Quit", command=frame.quit).pack()
root.mainloop()

A minimal example creating an application class derived from Frame:

from Tkinter import *

class App(Frame):
  def __init__(self, master):
    Frame.__init__(self, master)   # Make the frame a child of master
    self.pack()
    self.label = Label(self, text="Hey there.")
    self.label.pack()
    self.quitButton = Button(self, text="Quit", command=self.quit)
    self.quitButton.pack()

if __name__ == '__main__':
  root = Tk()
  app = App(root)
  root.mainloop()

Message boxes

Simple message boxes can be created using tkMessageBox as follows:

import Tkinter, tkMessageBox
Tkinter.Tk().withdraw() # Workaround: Hide the window
answer = tkMessageBox.askokcancel("Confirmation", "File not saved. Discard?")
answer = tkMessageBox.askyesno("Confirmation", "Do you really want to delete the file?")
# Above, "OK" and "Yes" yield True, and "Cancel" and "No" yield False
tkMessageBox.showwarning("Warning", "Timeout has elapsed.")
tkMessageBox.showwarning("Warning", "Timeout has elapsed.", icon=tkMessageBox.ERROR)
tkMessageBox.showerror("Warning", "Timeout has elapsed.")

File dialog

File dialogs can be created as follows:

import Tkinter, tkFileDialog
Tkinter.Tk().withdraw() # Workaround: Hide the window
filename1 = tkFileDialog.askopenfilename()
filename2 = tkFileDialog.askopenfilename(initialdir=r"C:\Users")
filename3 = tkFileDialog.asksaveasfilename()
filename4 = tkFileDialog.asksaveasfilename(initialdir=r"C:\Users")
if filename1 != "":
  for line in open(filename1): # Dummy reading of the file
    dummy = line.rstrip()

Radio button

A radio button can be used to create a simple choice dialog with multiple options:

from Tkinter import *
master = Tk()  
choices = [("Apple", "a"), ("Orange", "o"), ("Pear", "p")]
defaultChoice = "a"
userchoice = StringVar()
userchoice.set(defaultChoice)
def cancelAction(): userchoice.set("");master.quit()
Label(master, text="Choose a fruit:").pack()
for text, key in choices: 
  Radiobutton(master, text=text, variable=userchoice, value=key).pack(anchor=W)
Button(master, text="OK", command=master.quit).pack(side=LEFT, ipadx=10)
Button(master, text="Cancel", command=cancelAction).pack(side=RIGHT, ipadx=10)
mainloop()
if userchoice.get() != "":
  print userchoice.get() # "a", or "o", or "p"
else:
  print "Choice canceled."

An alternative to radio button that immediately reacts to button press:

from Tkinter import *
import os
buttons = [("Users", r"C:\Users"),
           ("Windows", r"C:\Windows"),
           ("Program Files", r"C:\Program Files")]
master = Tk()
def open(filePath):
  def openInner():
    os.chdir(filePath) # Cross platform
    #os.system('start "" "'+filePath+'"') # Windows
    master.quit()
  return openInner
Label(master, text="Choose a directory:").pack()
for buttonLabel, filePath in buttons:
  Button(master, text=buttonLabel, command=open(filePath)).pack(anchor=W)
mainloop()

List box

A list box can be used to create a simple multiple-choice dialog:

from Tkinter import *
master = Tk()  
choices = ["Apple", "Orange", "Pear"]
canceled = BooleanVar()
def cancelAction(): canceled.set(True); master.quit()
Label(master, text="Choose a fruit:").pack()
listbox = Listbox(master, selectmode=EXTENDED) # Multiple options can be chosen
for text in choices: 
  listbox.insert(END, text)
listbox.pack()    
Button(master, text="OK", command=master.quit).pack(side=LEFT, ipadx=10)
Button(master, text="Cancel", command=cancelAction).pack(side=RIGHT, ipadx=10)
mainloop()
if not canceled.get():
  # curselection() returns a tuple of choice indices starting with 0, even if
  # selectmode=SINGLE; depending on the Tkinter version the items are strings
  # or integers, so convert them to integers first
  selection = [int(i) for i in listbox.curselection()]
  print selection
  if 0 in selection: print "Apple chosen."
  if 1 in selection: print "Orange chosen."
  if 2 in selection: print "Pear chosen."
else:
  print "Choice canceled."

Checkbox

Checkbox or check button can be created as follows:

from Tkinter import *
root = Tk()
checkbuttonState = IntVar()
Checkbutton(root, text="Recursive", variable=checkbuttonState).pack()
mainloop()
print checkbuttonState.get() # 1 = checked; 0 = unchecked

Entry

Entry widget, a single-line text input field, can be used as follows:

from Tkinter import *
root = Tk()
Label(text="Enter your first name:").pack()
entryContent = StringVar()
Entry(root, textvariable=entryContent).pack()
mainloop()
print entryContent.get()

Menu

Menus can be created as follows:

from Tkinter import *
root = Tk()
def mycommand(): print "Chosen."
menubar = Menu(root)

menu1 = Menu(menubar, tearoff=0)
menu1.add_command(label="New", command=mycommand)
menu1.add_command(label="Clone", command=mycommand)
menu1.add_separator()
menu1.add_command(label="Exit", command=root.quit)
menubar.add_cascade(label="Project", menu=menu1)
  
menu2 = Menu(menubar, tearoff=0)
menu2.add_command(label="Oval", command=mycommand)
menu2.add_command(label="Rectangle", command=mycommand)
menubar.add_cascade(label="Shapes", menu=menu2)

root.config(menu=menubar)

mainloop()

LabelFrame

A frame around other elements can be created using LabelFrame widget as follows:

from Tkinter import *
root = Tk()
Label(text="Bus").pack()
frame = LabelFrame(root, text="Fruits") # text is optional
frame.pack()
Label(frame, text="Apple").pack()
Label(frame, text="Orange").pack()
mainloop()

Message

Message is like Label but ready to wrap across multiple lines. An example:

from Tkinter import *
root = Tk()
Message(text="Lazy brown fox jumped. " * 5, width=100).pack() # width is optional
mainloop()

Option menu

A drop-down list, called an option menu in Tkinter, can be created as follows:

from Tkinter import *
root = Tk()
options = ["Apple", "Orange", "Pear"]
selectedOption = StringVar()
selectedOption.set("Apple") # Default
OptionMenu(root, selectedOption, *options).pack()  
mainloop()
print selectedOption.get() # The text in the options list

Text

Text widget is a more complex one, allowing editing of both plain and formatted text, including multiple fonts.

Example to be added.


Tcl/Tk version

The Windows installer for Python 2.3 ships with Tcl/Tk 8.4.3. You can check which version is in use as follows:

import Tkinter
print Tkinter.TclVersion # Up to 8.5
print Tkinter.TkVersion # Up to 8.5




CGI interface


The Common Gateway Interface (CGI) allows an HTTP server to execute Python programs and return their output to the client.

Installation

By default, requesting a .py file over HTTP simply returns its content. To make the server execute the source code instead, the file must be placed in a directory containing an .htaccess file with the following lines[1]:

AddHandler cgi-script .py
Options +ExecCGI

Attention: on Unix-like servers, files are not executable by default, so the execute permission must be set with the command: chmod +x *.py.

Examples

The cgitb module displays detailed tracebacks in the browser, which is useful for debugging:

#!C:\Program Files (x86)\Python\python.exe
# -*- coding: UTF-8 -*-
print "Content-type: text/html; charset=utf-8\n\n"
print "<html><head><title>Local directory</title></head><body>"
import cgitb
cgitb.enable()
import os
print "The CGI file is located in:"
print os.path.dirname(__file__)
print "</body></html>"

Handling a form requires importing the cgi module[2].

Accessing a MySQL database requires importing MySQLdb[3].

The following file, called CGI_MySQL.py, uses both modules:

#!C:\Program Files (x86)\Python\python.exe
# -*- coding: UTF-8 -*-
print "Content-type: text/html; charset=utf-8\n\n"
print "<html><head><title>DB CGI</title></head><body>"
print "<h1>MySQL extraction</h1>"
print "<ul>"
import cgitb
cgitb.enable()
import cgi, MySQLdb
form = cgi.FieldStorage()
if form.getvalue('name') is None:
	print "<h2>Search for a name</h2>"
	print '''
	<form action="CGI_MySQL.py" method="post">
	<input type="text" name="name" />
	<input type="submit"></form>
		'''
else:
	name = form.getvalue('name')
	print "<h2>Result</h2>"
	# Escape user input before echoing it back into the HTML page
	print "List for " + cgi.escape(name) + " :"
	connection = MySQLdb.connect(user='login1', passwd='passwd1', db='base1')
	cursor = connection.cursor()
	# Pass the value as a parameter so that the driver quotes it (prevents SQL injection)
	cursor.execute("SELECT page_title FROM page WHERE name = %s", (name,))
	for row in cursor.fetchall():
		print "<li>%s</li>" % cgi.escape(row[0])
	connection.close()
print "</ul>"
print "</body></html>"

References

  1. "HOWTO Use Python in the web". http://docs.python.org/howto/webservers.html.
  2. http://fr.openclassrooms.com/informatique/cours/apercu-de-la-cgi-avec-python
  3. https://pypi.python.org/pypi/MySQL-python/1.2.5




WSGI web programming



External Resources

http://docs.python.org/library/wsgiref.html
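
The wsgiref module referenced above is part of the standard library. As a minimal sketch in Python 3 syntax, a WSGI application is a callable that receives the request environment and a start_response function and returns an iterable of byte strings. The names hello_app, captured and the path "/demo" are hypothetical and chosen only for illustration; the application is called directly here, without a running server.

```python
from wsgiref.util import setup_testing_defaults

def hello_app(environ, start_response):
    """Hypothetical WSGI application that greets the requested path."""
    body = ("Hello from " + environ.get("PATH_INFO", "/")).encode("utf-8")
    headers = [("Content-Type", "text/plain; charset=utf-8"),
               ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]

# Build a fake request environment instead of starting a server.
environ = {}
setup_testing_defaults(environ)     # fills in the keys a server would provide
environ["PATH_INFO"] = "/demo"
captured = {}

def start_response(status, headers):
    # Record what the application reports instead of sending it over a socket.
    captured["status"] = status
    captured["headers"] = headers

response = b"".join(hello_app(environ, start_response))
print(captured["status"], response)
```

To serve the same application locally, wsgiref.simple_server.make_server("", 8000, hello_app).serve_forever() could be used; the port number is an arbitrary choice.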



Web Page Harvesting




Internet


The urllib module, which is bundled with Python, can be used for web interaction. This module provides a file-like interface for web URLs.

Getting page text as a string

An example of reading the contents of a webpage:

import urllib  # Python 2; in Python 3, urlopen lives in urllib.request
pageText = urllib.urlopen("http://www.spam.org/eggs.html").read()
print pageText

The GET and POST methods can be used, too:

import urllib  # Python 2; in Python 3, urlencode lives in urllib.parse
params = urllib.urlencode({"plato":1, "socrates":10, "sophokles":4, "arkhimedes":11})

# Using GET method
pageText = urllib.urlopen("http://international-philosophy.com/greece?%s" % params).read()
print pageText

# Using POST method
pageText = urllib.urlopen("http://international-philosophy.com/greece", params).read()
print pageText

Downloading files

To save the content of a page on the internet directly to a file, you can read() it and write the resulting string to a file object:

import urllib2
url = "http://upload.wikimedia.org/wikibooks/en/9/91/Python_Programming.pdf"
# Not recommended for very large (1 GB+) files, because read() keeps all the data in RAM
data = urllib2.urlopen(url).read()
out = open('pythonbook.pdf', 'wb')
out.write(data)
out.close()

This will download the file from the given URL and save it to a file "pythonbook.pdf" on your hard drive.

Other functions

The urllib module includes other functions that may be helpful when writing programs that use the internet:

>>> import urllib
>>> plain_text = "This isn't suitable for putting in a URL"
>>> print urllib.quote(plain_text)
This%20isn%27t%20suitable%20for%20putting%20in%20a%20URL
>>> print urllib.quote_plus(plain_text)
This+isn%27t+suitable+for+putting+in+a+URL

The urlencode function, described above, converts a dictionary of key-value pairs into a query string to pass to a URL, while the quote and quote_plus functions encode ordinary strings. The quote_plus function uses plus signs for spaces, as needed when submitting data for form fields. The unquote and unquote_plus functions do the reverse, converting URL-encoded text back to plain text.
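
In Python 3 these helpers live in the urllib.parse module. The following is a small sketch, written in Python 3 syntax rather than the Python 2 used elsewhere in this chapter, that encodes a string, decodes it again and builds a query string. A list of pairs is passed to urlencode here only so that the parameter order is fixed.

```python
from urllib.parse import quote, quote_plus, unquote_plus, urlencode

plain_text = "This isn't suitable for putting in a URL"

percent_encoded = quote(plain_text)       # spaces become %20
form_encoded = quote_plus(plain_text)     # spaces become '+'
decoded = unquote_plus(form_encoded)      # reverses quote_plus
query = urlencode([("plato", 1), ("socrates", 10)])

print(percent_encoded)
print(form_encoded)
print(decoded)
print(query)
```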

Email

With Python, MIME-compatible emails can be sent. This requires access to an SMTP server; the example below connects to one running on the local machine.

import smtplib
from email.mime.text import MIMEText

msg = MIMEText( 
"""Hi there,

This is a test email message.

Greetings""")

me  = 'sender@example.com'
you = 'receiver@example.com'
msg['Subject'] = 'Hello!'
msg['From'] =  me
msg['To'] =  you
s = smtplib.SMTP()
s.connect()
s.sendmail(me, [you], msg.as_string())
s.quit()

This sends the sample message from 'sender@example.com' to 'receiver@example.com'.
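
When no SMTP server is at hand, the message itself can still be built and inspected. This sketch, in Python 3 syntax, constructs the same message as above and prints the raw text that sendmail would transmit; the variable name raw is chosen only for illustration.

```python
from email.mime.text import MIMEText

msg = MIMEText("Hi there,\n\nThis is a test email message.\n\nGreetings")
msg["Subject"] = "Hello!"
msg["From"] = "sender@example.com"
msg["To"] = "receiver@example.com"

raw = msg.as_string()    # the headers, a blank line, then the body
print(raw)
```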



Networks


Sockets

Python can also communicate via sockets.

Connecting to a server

This simple Python program will fetch up to 4096 bytes of an HTTP response from Google:

import socket
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.connect(("google.com", 80))

# HTTP/1.1 requires a Host header; the blank line ends the request
sock.sendall('GET / HTTP/1.1\r\n')
sock.sendall('Host: google.com\r\n')
sock.sendall('User-agent: Mozilla/5.0 (wikibooks test)\r\n\r\n')
print sock.recv(4096)
sock.close()
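
To experiment without depending on an external site, both ends of a connection can run in one process. The following is a minimal sketch in Python 3 syntax, assuming the loopback address 127.0.0.1 is usable on the machine; the helper name echo_upper_once and the message are hypothetical.

```python
import socket
import threading

def echo_upper_once(server_sock):
    """Accept one connection and send back what it receives, upper-cased."""
    conn, _addr = server_sock.accept()
    try:
        data = conn.recv(4096)
        conn.sendall(data.upper())
    finally:
        conn.close()

# Server side: bind to port 0 so the operating system picks a free port.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]
worker = threading.Thread(target=echo_upper_once, args=(server,))
worker.start()

# Client side: connect, send a short message and read the reply.
client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(("127.0.0.1", port))
client.sendall(b"hello sockets")
reply = client.recv(4096)
client.close()

worker.join()
server.close()
print(reply)
```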

High-level interfaces

Most Python developers will prefer a high-level interface, such as Twisted or urllib2, over using sockets directly.



Math

For basic math including addition, subtraction, multiplication and the like, see the Basic Math and Operators chapters. For quick reference, the built-in Python math operators include addition (+), subtraction (-), multiplication (*), division (/), floor division (//), modulo (%), and exponentiation (**). The built-in Python math functions include rounding (round()), absolute value (abs()), minimum (min()), maximum (max()), division with a remainder (divmod()), and exponentiation (pow()). A sign function can be defined as "sign = lambda n: 1 if n > 0 else -1 if n < 0 else 0".
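
As a quick illustration of the operators and built-in functions listed above, here is a short sketch in Python 3 syntax. The variable names are arbitrary.

```python
sign = lambda n: 1 if n > 0 else -1 if n < 0 else 0

floor_quotient = 7 // 2              # floor division
remainder = 7 % 2                    # modulo
both = divmod(7, 2)                  # quotient and remainder in one call
power = 2 ** 10                      # exponentiation, same as pow(2, 10)
modular_power = pow(2, 10, 1000)     # (2 ** 10) % 1000
distance = abs(-3.5)                 # absolute value
smallest, largest = min(4, 9, 2), max(4, 9, 2)
signs = [sign(-5), sign(0), sign(2)]

print(floor_quotient, remainder, both, power, modular_power)
print(distance, smallest, largest, signs)
```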

Math

A range of mathematical functions is available from the math module of the standard library:

import math

print math.sin(10)       # sine
print math.cos(10)       # cosine
print math.tan(10)       # tangent 

print math.asin(0.5)     # arc sine; the argument must lie in [-1, 1]
print math.acos(0.5)     # arc cosine; the argument must lie in [-1, 1]
print math.atan(10)      # arc tangent

print math.sinh(10)      # hyperbolic sine    
print math.cosh(10)      # hyperbolic cosine
print math.tanh(10)      # hyperbolic tangent

print math.pow(2, 4)     # 2 raised to 4
print math.exp(4)        # e ^ 4
print math.sqrt(10)      # square root
print math.pow(5, 1/3.0) # cube root of 5
print math.log(3)        # ln; natural logarithm
print math.log(100, 10)  # base 10

print math.ceil(2.3)    # ceiling
print math.floor(2.7)   # floor

print math.pi
print math.e

Cmath

The cmath module provides functions similar to those of the math module, but for complex numbers, along with some additional ones.
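
A brief sketch of what this looks like in practice, in Python 3 syntax: cmath.sqrt accepts negative input, cmath.polar converts to polar form, and cmath.exp works with complex exponents. The variable names are arbitrary.

```python
import cmath
import math

root = cmath.sqrt(-1)                # 1j, whereas math.sqrt(-1) raises ValueError
modulus, angle = cmath.polar(1j)     # length 1.0 and an angle of pi/2
euler = cmath.exp(1j * math.pi)      # very close to -1 (Euler's identity)

print(root, modulus, angle, euler)
```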

Random

Pseudo-random number generators are available from the random module:

import random
print random.random()     # Uniformly distributed random float >= 0.0 and < 1.0.
print random.random()*10  # Uniformly distributed random float >= 0.0 and < 10.0
print random.randint(0,9) # Uniformly distributed random int >= 0 and <=9
li=[1, 2, 3]; random.shuffle(li); print li # Randomly shuffled list
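
Because the generators are pseudo-random, seeding them makes a sequence reproducible, which helps in tests. The following sketch, in Python 3 syntax, uses a private random.Random instance; the seed value 42 is an arbitrary choice.

```python
import random

rng = random.Random(42)              # a private generator with a fixed seed
first = [rng.randint(0, 9) for _ in range(5)]

rng.seed(42)                         # reset to the same starting state
second = [rng.randint(0, 9) for _ in range(5)]

pick = rng.choice(["Apple", "Orange", "Pear"])   # one element chosen at random
print(first, second, pick)
```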

Decimal

The decimal module enables decimal floating point arithmetic, avoiding certain artifacts of the usual underlying binary representation of floating point numbers that are unintuitive to humans.

import decimal
plainFloat = 1/3.0
print plainFloat # 0.333333333333 (Python 3 prints 0.3333333333333333)
decFloat = decimal.Decimal("0.33333333333333333333333333333333333333")
print decFloat   # 0.33333333333333333333333333333333333333
decFloat2 = decimal.Decimal(plainFloat)
print decFloat2  # 0.333333333333333314829616256247390992939472198486328125
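
The classic artifact is that 0.1 + 0.2 does not equal 0.3 in binary floating point. This sketch, in Python 3 syntax, contrasts the two representations and also shows the context precision, which limits the number of significant digits in results; the precision of 6 is an arbitrary choice.

```python
from decimal import Decimal, getcontext

float_sum = 0.1 + 0.2                              # slightly more than 0.3
decimal_sum = Decimal("0.1") + Decimal("0.2")      # exactly 0.3

getcontext().prec = 6                              # significant digits for results
third = Decimal(1) / Decimal(3)

print(float_sum == 0.3, decimal_sum == Decimal("0.3"), third)
```

Decimal values are best constructed from strings; constructing them from a float copies the float's binary rounding error, as decFloat2 above demonstrates.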

Fractions

The fractions module provides fraction arithmetic via the Fraction class. Unlike floating point numbers, Fraction objects represent rational numbers exactly, so arithmetic on them does not lose precision.

from fractions import Fraction
oneThird = Fraction(1, 3)
floatOneThird = 1/3.0
print Fraction(0.25)                  # 1/4
print Fraction(floatOneThird)         # 6004799503160661/18014398509481984
print Fraction(1, 3) * Fraction(2, 5) # 2/15
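
A few more operations, sketched in Python 3 syntax: exact addition, recovering a simple fraction from a float with limit_denominator, and converting back to a float. The bound of 100 on the denominator is an arbitrary choice.

```python
from fractions import Fraction

total = Fraction(1, 3) + Fraction(1, 6)                  # exact sum, reduced automatically
recovered = Fraction(1 / 3.0).limit_denominator(100)     # closest fraction with denominator <= 100
as_float = float(Fraction(3, 8))

print(total, recovered, as_float)
```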

Statistics

The statistics module, available since Python 3.4, provides some basic statistical functions. It covers only the basics and does not replace full-fledged third-party libraries such as NumPy. For Python 2.7, the statistics module can be installed from PyPI.

import statistics as stats
print stats.mean([1, 2, 3, 100]) # 26.5
print stats.median([1, 2, 3, 100]) # 2.5
print stats.mode([1, 1, 2, 3]) # 1
print stats.pstdev([1, 1, 2, 3]) # 0.82915...; population standard deviation
print stats.pvariance([1, 1, 2, 3]) # 0.6875; population variance



Databases


Python has support for working with databases via a simple API. Modules for SQLite and Berkeley DB are included with Python. Modules for MySQL, PostgreSQL, FirebirdSQL and others are available as third-party modules, which have to be downloaded and installed before use. The package MySQLdb can be installed, for example, using the Debian package "python-mysqldb".

DBMS Specifics

MySQL

An example with MySQL would look like this:

import MySQLdb
db = MySQLdb.connect("host machine", "dbuser", "password", "dbname")
cursor = db.cursor()
query = """SELECT * FROM sampletable"""
lines = cursor.execute(query)
data = cursor.fetchall()
db.close()

On the first line, the module MySQLdb is imported. Then a connection to the database is set up, and on line 4 the actual SQL statement to be executed is saved in the variable query. On line 5 the query is executed, and on line 6 all the data is fetched. After this piece of code has run, lines contains the number of rows fetched (i.e. the number of rows in the table sampletable). The variable data contains all the actual data, i.e. the content of sampletable. Finally, the connection to the database is closed again. If the number of rows is large, it is better to use row = cursor.fetchone() and process the rows individually:

# the first 5 lines are the same as above
while True:
    row = cursor.fetchone()
    if row is None: break
    # do something with this row of data
db.close()

Obviously, each row has to be processed or stored inside the loop, because it is replaced by the next row on the following iteration. The result of the fetchone() command is a tuple.

In order to make the initialization of the connection easier, a configuration file can be used:

import MySQLdb
db = MySQLdb.connect(read_default_file="~/.my.cnf")
...

Here, the file .my.cnf in the home directory contains the necessary configuration information for MySQL.

SQLite

An example with SQLite is very similar to the one above, and the cursor provides much of the same functionality.

import sqlite3
db = sqlite3.connect("/path/to/file")
cursor = db.cursor()
query = """SELECT * FROM sampletable"""
cursor.execute(query)  # in sqlite3, execute() returns the cursor itself, not a row count
data = cursor.fetchall()
db.close()

When writing to the db, one has to remember to call db.commit(), otherwise the changes are not saved:

import sqlite3
db = sqlite3.connect("/path/to/file")
cursor = db.cursor()
query = """INSERT INTO sampletable (value1, value2) VALUES (1,'test')"""
cursor.execute(query)
db.commit()
db.close()
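
The steps above can be tried without creating a file by connecting to the special name ":memory:". This is a self-contained sketch in Python 3 syntax; the column types given for sampletable are assumptions, since the original text does not define the table.

```python
import sqlite3

db = sqlite3.connect(":memory:")      # a temporary database held only in RAM
cursor = db.cursor()
cursor.execute("CREATE TABLE sampletable (value1 INTEGER, value2 TEXT)")
cursor.execute("INSERT INTO sampletable (value1, value2) VALUES (1, 'test')")
cursor.execute("INSERT INTO sampletable (value1, value2) VALUES (2, 'more')")
db.commit()                           # save the inserted rows

cursor.execute("SELECT value1, value2 FROM sampletable ORDER BY value1")
rows = []
while True:
    row = cursor.fetchone()           # one tuple per call, None when exhausted
    if row is None:
        break
    rows.append(row)
db.close()
print(rows)
```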

Postgres

import psycopg2
conn = psycopg2.connect("dbname=test")
cursor = conn.cursor()
cursor.execute("select * from test")
for row in cursor:  # iterate over the result rows one at a time
    print row
conn.close()

Firebird

import firebirdsql
conn = firebirdsql.connect(dsn='localhost/3050:/var/lib/firebird/2.5/test.fdb', user='alice', password='wonderland')
cur = conn.cursor()
cur.execute("select * from baz")
for c in cur.fetchall():
    print(c)
conn.close()

General Principles

Parameter Quoting

You will frequently need to substitute dynamic data into a query string. It is important to ensure this is done correctly.

# Do not do this!
result = db.execute("SELECT name FROM employees WHERE location = '" + location + "'")

This example is wrong, because it doesn’t correctly deal with special characters, like apostrophes, in the string being substituted. If your code has to deal with potentially hostile users (like on a public-facing Web server), this could leave you open to an SQL injection attack.

For simple cases, use the automatic parameter substitution provided by the execute method, e.g.

result = db.execute("SELECT name FROM employees WHERE location = ?", [location])

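The DBMS interface itself will automatically convert the values you pass into the correct SQL syntax. Note that the placeholder style depends on the module: sqlite3 uses ?, while MySQLdb and psycopg2 use %s.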
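
To see the difference in practice, the following sketch in Python 3 syntax runs the parameterized query against an in-memory SQLite database. The table contents and the hostile input string are invented for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE employees (name TEXT, location TEXT)")
db.executemany(
    "INSERT INTO employees (name, location) VALUES (?, ?)",
    [("Alice", "O'Fallon"), ("Bob", "Springfield")],
)

location = "O'Fallon"                 # the apostrophe needs no special handling
found = db.execute(
    "SELECT name FROM employees WHERE location = ?", [location]
).fetchall()

hostile = "x' OR '1'='1"              # would match every row if pasted into the SQL text
nothing = db.execute(
    "SELECT name FROM employees WHERE location = ?", [hostile]
).fetchall()
db.close()
print(found, nothing)
```

The hostile string is treated as an ordinary value to compare against, so it matches no rows.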

For more complex cases, the DBMS module should provide a quoting function that you can explicitly call. For example, MySQLdb provides the escape_string method, while APSW (for SQLite3) provides format_sql_value. This is necessary where the query structure takes a more dynamic form:

criteria = [("company", company)] # list of tuples (fieldname, value)
if department != None :
    criteria.append(("department", department))
#end if
# ... append other optional criteria as appropriate ...

result = db.execute \
  (
        "SELECT name FROM employees WHERE "
    +
        " and ".join
          (
            # escape_string does not add the surrounding quotes, so supply them here
            "%s = '%s'" % (criterion[0], MySQLdb.escape_string(criterion[1]))
            for criterion in criteria
          )
  )

This will dynamically construct queries like “select name from employees where company = 'some company'” or “select name from employees where company = 'some company' and department = 'some department'”, depending on which fields have been filled in by the user.
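
Where the module supports placeholders, the dynamic part can be limited to the field names while the values are still passed as parameters. The sketch below, in Python 3 syntax with sqlite3, swaps in that approach instead of an explicit quoting function; the helper find_employees and the sample rows are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE employees (name TEXT, company TEXT, department TEXT)")
db.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Alice", "Acme", "Sales"), ("Bob", "Acme", "IT"), ("Carol", "Other", "Sales")],
)

def find_employees(db, company, department=None):
    """Hypothetical helper: build the WHERE clause from the criteria supplied."""
    criteria = [("company", company)]            # list of tuples (fieldname, value)
    if department is not None:
        criteria.append(("department", department))
    # Field names are fixed in the code above and never come from user input;
    # only the values vary, and they are passed separately as parameters.
    where = " AND ".join("%s = ?" % field for field, _value in criteria)
    params = [value for _field, value in criteria]
    query = "SELECT name FROM employees WHERE " + where + " ORDER BY name"
    return [row[0] for row in db.execute(query, params)]

everyone_at_acme = find_employees(db, "Acme")
acme_sales = find_employees(db, "Acme", "Sales")
db.close()
print(everyone_at_acme, acme_sales)
```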

Use Iterators

Python iterators are a natural fit for the problem of iterating over lots of database records. Here is an example of a function that performs a database query and returns an iterator for the results, instead of returning them all at once. It relies on the fact that, in APSW (Another Python SQLite Wrapper, an interface library for SQLite), the cursor.execute method itself returns an iterator for the result records. The result is that you can write very concise code for doing complex database queries in Python.

def db_iter(db, cmd, mapfn = lambda x : x) :
    "executes cmd on a new cursor from connection db and yields the results in turn."
    cu = db.cursor()
    result = cu.execute(cmd)
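    # a for loop ends cleanly when the results run out; letting next() raise
    # StopIteration inside a generator is an error in Python 3.7 and later
    for row in result :
        yield mapfn(row)
    #end for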
#end db_iter

Example uses of this function:

for \
    artist, publisher \
in \
    db_iter \
      (
        db = db,
        cmd =
                "SELECT artist, publisher FROM artists WHERE location = %s"
            %
                 apsw.format_sql_value(location)
      ) \
:
    print(artist, publisher)
#end for

and

for \
    location \
in \
    db_iter \
      (
        db = db,
        cmd = "SELECT DISTINCT location FROM artists",
        mapfn = lambda x : x[0]
      ) \
:
    print(location)
#end for

In the first example, since db_iter returns a tuple for each record, this can be directly assigned to individual variables for the record fields. In the second example, the tuple has only one element, so a custom mapfn is used to extract this element and return it instead of the tuple.
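
The same pattern also works with the standard sqlite3 module, whose cursor.execute likewise returns an iterable cursor. The following is a sketch in Python 3 syntax under that assumption; the params argument is an addition not present in the function above, and the artists rows are invented.

```python
import sqlite3

def db_iter(db, cmd, params=(), mapfn=lambda x: x):
    "executes cmd on a new cursor from connection db and yields the results in turn."
    cu = db.cursor()
    for row in cu.execute(cmd, params):
        yield mapfn(row)

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE artists (artist TEXT, publisher TEXT, location TEXT)")
db.executemany(
    "INSERT INTO artists VALUES (?, ?, ?)",
    [("Ann", "Red", "Oslo"), ("Ben", "Blue", "Oslo"), ("Cy", "Red", "Rome")],
)

# Each record arrives as a tuple that can be unpacked directly.
pairs = []
for artist, publisher in db_iter(
        db,
        "SELECT artist, publisher FROM artists WHERE location = ? ORDER BY artist",
        ["Oslo"]):
    pairs.append((artist, publisher))

# A custom mapfn extracts the single field from each one-element tuple.
locations = list(db_iter(
    db,
    "SELECT DISTINCT location FROM artists ORDER BY location",
    mapfn=lambda x: x[0]))
db.close()
print(pairs, locations)
```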

Never Use “SELECT *” in a Script

Database table definitions are frequently subject to change. As application requirements evolve, fields and even entire tables are often added, or sometimes removed. Consider a statement like

result = db.execute("select * from employees")

You may happen to know that the employees table currently contains, say, 4 fields. But tomorrow someone may add a fifth field. Did you remember to update your code to deal with this? If not, it’s liable to crash. Or even worse, produce an incorrect result!

Better to always list the specific fields you’re interested in, no matter how many there are:

result = db.execute("select name, address, department, location from employees")

That way, any extra fields added will simply be ignored. And if any of the named fields are removed, the code will at least fail with a runtime error, which is a good reminder that you forgot to update it!
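
The failure mode is easy to reproduce. In this sketch, in Python 3 syntax with an in-memory SQLite database, a column is added after the code was written; the table layout and the flag star_query_broke are invented for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE employees (name TEXT, address TEXT)")
db.execute("INSERT INTO employees VALUES ('Alice', '1 Main St')")

# Works today, while the table has exactly two fields.
name, address = db.execute("SELECT * FROM employees").fetchone()

# Later, someone adds a field.
db.execute("ALTER TABLE employees ADD COLUMN department TEXT")

try:
    name, address = db.execute("SELECT * FROM employees").fetchone()
    star_query_broke = False
except ValueError:                    # too many values to unpack
    star_query_broke = True

# Naming the fields keeps the shape of the result stable.
explicit = db.execute("SELECT name, address FROM employees").fetchone()
db.close()
print(star_query_broke, explicit)
```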

Looping on Field Breaks

Consider the following scenario: your sales company database has a table of employees, and also a table of sales made by each employee. You want to loop over these sale entries, and produce some per-employee statistics. A naïve approach might be:

  • Query the database to get a list of employees
  • For each employee, do a database query to get the list of sales for each employee.

If you have a lot of employees, then the first query may produce a large list, and the second step will involve a correspondingly large number of database queries.

In fact, the entire processing loop can run off a single database query, using the standard SQL construct called a join.

Note:

SQL programming is a specialty skill in its own right. To learn more about this, start with the Wikipedia article on SQL.

 

Here is what an example of such a loop could look like:

rows = db_iter \
  (
    db = db,
    cmd =
        "select employees.name, sales.amount, sales.date from"
        " employees left join sales on employees.id = sales.employee_id"
        " order by employees.name, sales.date"
  )
prev_employee_name = None
while True :
    row = next(rows, None)
    if row != None :
        employee_name, amount, date = row
    #end if
    if row == None or employee_name != prev_employee_name :
         if prev_employee_name != None :
              # done stats for this employee
              report(prev_employee_name, employee_stats)
         #end if
         if row == None :
              break
         # start stats for a new employee
         prev_employee_name = employee_name
         employee_stats = {"total_sales" : 0, "number_of_sales" : 0}
         if date != None :
               employee_stats["earliest_sale"] = date
         #end if
    #end if
    # another row of stats for this employee
    if amount != None :
         employee_stats["total_sales"] += amount
         employee_stats["number_of_sales"] += 1
    #end if
    if date != None :
         employee_stats["latest_sale"] = date
    #end if
#end while

Here the statistics are quite simple: earliest and latest sale, and number and total amount of sales, and could be computed directly within the SQL query. But the same loop could compute more complex statistics (like standard deviation) that cannot be represented directly within a simple SQL query.

Note how the statistics for each employee are written out under either of two conditions:

  • The employee name of the next record is different from the previous one
  • The end of the query results has been reached.

Both conditions are tested with row == None or employee_name != prev_employee_name; after writing out the employee statistics, a separate check for the second condition row == None is used to terminate the loop. If the loop doesn’t terminate, then processing is initialized for the new employee.

Note also the use of a left join in this case: if an employee has had no sales, then the join will return a single row for that employee, with SQL null values (represented by None in Python) for the fields from the sales table. This is why we need checks for such None values before processing those fields.

Alternatively, we could have used an inner join, which would have returned no results for an employee with no sales. Whether you want to omit such an employee from your report, or include them with totals of zero, is really up to your application.
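
The field-break logic can also be delegated to the standard library. The sketch below, in Python 3 syntax, swaps in itertools.groupby, a different technique from the hand-written loop above: it groups consecutive rows that share an employee name, which works because the query orders the rows by name. The table layout follows the join above, and the sample rows are invented.

```python
import sqlite3
from itertools import groupby
from operator import itemgetter

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sales (employee_id INTEGER, amount REAL, date TEXT);
    INSERT INTO employees VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO sales VALUES (1, 10.0, '2020-01-05'), (1, 5.0, '2020-02-01'),
                             (2, 7.5, '2020-01-20');
""")

rows = db.execute(
    "select employees.name, sales.amount, sales.date from"
    " employees left join sales on employees.id = sales.employee_id"
    " order by employees.name, sales.date"
)

report = {}
for name, group in groupby(rows, key=itemgetter(0)):
    # The left join yields a None amount for an employee without sales.
    amounts = [amount for _name, amount, _date in group if amount is not None]
    report[name] = {"total_sales": sum(amounts), "number_of_sales": len(amounts)}
db.close()
print(report)
```

Carol has no sales, so she appears in the report with totals of zero, matching the behaviour of the left join described above.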









numpy


NumPy is a numerical library for Python.

Installation

It is packaged by the main Linux distributions; for example, it can be installed through the Debian package python-numpy. On Windows, it can be downloaded from http://sourceforge.net/projects/numpy/files/.

Then, once the .zip archive has been unpacked, the installation is done by entering the following in the console:

python setup.py install


Histogram

import numpy
mydata = numpy.random.normal(0, 1, 10000)  # 10000 samples from the standard normal distribution
h, n = numpy.histogram(mydata, 100, (-5, 5))  # h: counts in 100 bins over [-5, 5]; n: the bin edges




Game Programming in Python


3D Game Programming


3D Game Engines with a Python binding

pyirrlicht
A ctypes-based Python module for the Irrlicht Engine SDK.
PyPi Link https://pypi.python.org/pypi/pyirrlicht
Pip command pip install pyirrlicht
  • Irrlicht Engine

The engines in this section are free, open-source C++ 3D game engines with a Python binding.

  • CrystalSpace is a free cross-platform software development kit for real-time 3D graphics, with particular focus on games. Crystal Space is accessible from Python in two ways: (1) as a Crystal Space plugin module in which C++ code can call upon Python code, and in which Python code can call upon Crystal Space; (2) as a pure Python module named ‘cspace’ which one can ‘import’ from within Python programs. To use the first option, load the ‘cspython’ plugin as you would load any other Crystal Space plugin, and interact with it via the SCF ‘iScript’ interface. The second approach allows you to write Crystal Space applications entirely in Python, without any C++ coding.

3D Game Engines written for Python

Engines designed for Python from scratch.

Blender
Open Source 3D creation. Free to use for any purpose, forever.
Download link https://www.blender.org/download/
  • Blender is an impressive 3D tool with a fully integrated 3D graphics creation suite allowing modeling, animation, rendering, post-production, real-time interactive 3D and game creation and playback, with cross-platform compatibility. The 3D game engine uses an embedded Python interpreter to make 3D games.
Panda3d
Panda3D is a game engine, a framework for 3D rendering and game development for Python and C++ programs
Download link http://www.panda3d.org/download.php
  • Panda3D is a 3D game engine. It is a library written in C++ with Python bindings. Panda3D is designed to support a short learning curve and rapid development. This software is available for free download with source code under the BSD License. Development was started by Disney. There are now many projects made with Panda3D, such as Disney's Pirates of the Caribbean Online, ToonTown, Building Virtual World, Shell Games and many others. Panda3D supports several features: procedural geometry, animated textures, render to texture, track motion, fog, particle systems, and many others.
Crystal Space
Crystal Space is a mature, full-featured Software Development Kit (SDK) providing real-time 3D graphics for applications such as games and virtual reality
Download link http://www.crystalspace3d.org/main/Download

2D Game Programming

Pygame
Python Game Development
PyPi Link https://pypi.python.org/pypi/Pygame
Pip command pip install Pygame
  • Pygame is a cross-platform Python library which wraps SDL. It provides many features, such as sprite groups, sound and image loading, and easy changing of an object's position. It also gives the programmer access to key and mouse events. A full tutorial can be found in the free book "Making Games with Python & Pygame".
pgu
Python Game Utilities
Download link https://code.google.com/archive/p/pgu/downloads
Dependencies PyGame
  • Phil's Pygame Utilities (PGU) is a collection of tools and libraries that enhance Pygame. Tools include a tile editor and a level editor (tile, isometric, hexagonal). GUI enhancements include full featured GUI, HTML rendering, document layout, and text rendering. The libraries include a sprite and tile engine (tile, isometric, hexagonal), a state engine, a timer, and a high score system. (Beta with last update March, 2007. APIs to be deprecated and isometric and hexagonal support is currently Alpha and subject to change.) [Update 27/02/08 Author indicates he is not currently actively developing this library and anyone that is willing to develop their own scrolling isometric library offering can use the existing code in PGU to get them started.]
pyglet
Cross-platform windowing and multimedia library
PyPi Link https://pypi.python.org/pypi/pyglet
Pip command pip install pyglet
  • Pyglet is a cross-platform windowing and multimedia library for Python with no external dependencies or installation requirements. Pyglet provides an object-oriented programming interface for developing games and other visually-rich applications for Windows, Mac OS X and Linux. Pyglet allows programs to open multiple windows on multiple screens, draw in those windows with OpenGL, and play back audio and video in most formats. Unlike similar libraries available, pyglet has no external dependencies (such as SDL) and is written entirely in Python. Pyglet is available under a BSD-Style license.
kivy
A software library for rapid development of hardware-accelerated multitouch applications.
PyPi Link https://pypi.python.org/pypi/kivy
Pip command pip install kivy
Dependencies docutils; pygments (auto-installed with kivy)
kivy.deps.sdl2; kivy.deps.glew (not auto-installed; run pip install kivy.deps.sdl2 kivy.deps.glew; needed for OpenGL)
kivy.deps.angle (Python 3.5+; can be substituted for kivy.deps.glew; pip install kivy.deps.angle)
kivy.deps.gstreamer (120+ MB; needed for video/audio, not needed for graphics only; pip install kivy.deps.gstreamer)
kivy_examples (optional; install with pip install kivy_examples)
  • Kivy is a library for developing multi-touch applications. It is completely cross-platform (Linux/OSX/Win & Android with OpenGL ES2). It comes with native support for many multi-touch input devices, a growing library of multi-touch aware widgets and hardware-accelerated OpenGL drawing. Kivy is designed to let you focus on building custom and highly interactive applications as quickly and easily as possible.
Rabbyt
A fast 2D sprite engine using OpenGL
PyPi Link https://pypi.python.org/pypi/Rabbyt
Pip command pip install Rabbyt
  • Rabbyt is a fast sprite library for Python with game development in mind. With Rabbyt Anims, even old graphics cards can produce very fast animations of 2,400 or more sprites handling position, rotation, scaling, and color simultaneously.



PyQt4


WARNING: The examples on this page are a mixture of PyQt3 and PyQt4 - use with caution!

This tutorial aims to provide a hands-on guide to learn the basics of building a small Qt4 application in Python.

To follow this tutorial, you should have basic Python knowledge; knowledge of Qt4, however, is not necessary. I'm using Linux in these examples and am assuming you already have a working installation of Python and PyQt4. To test this, open a Python shell (by typing 'python' in a console to start the interactive interpreter) and type:

>>> import PyQt4

If no error message appears, you should be ready to go.

The examples in this tutorial are kept as easy as possible, showing useful ways to write and structure your program. It is important that you read the source code of the example files, because most of the explanations are in the code. The best way to get comfortable with PyQt is to play around with the examples and try to change things.

Hello, world!

Let's start easy: popping up a window and displaying something. The following small program will pop up a window showing "Hello, world!".

 #!/usr/bin/env python
 
 import sys
 from PyQt4 import Qt
 
 # We instantiate a QApplication passing the arguments of the script to it:
 a = Qt.QApplication(sys.argv)
 
 # Add a basic widget to this application:
 # The first argument is the text we want this QLabel to show; the optional
 # second argument is the parent widget. Since our "hello" label is the only
 # widget we use (the so-called "main widget"), it does not have a parent.
 hello = Qt.QLabel("Hello, world!")
 
 # Tell the label that it should be shown.
 hello.show()
 
 # Now we can start it.
 a.exec_()

About 7 lines of code, and that's about as easy as it can get.

A Button

Let's add some interaction! We'll replace the label saying "Hello, World!" with a button and assign an action to it. This assignment is done by connecting a signal, an event which is sent out when the button is pushed, to a slot, which is an action, normally a function that is run in the case of that event.

#!/usr/bin/env python

import sys
from PyQt4 import Qt

a = Qt.QApplication(sys.argv)

# Our function to call when the button is clicked
def sayHello():
    print ("Hello, World!")

# Instantiate the button
hellobutton = Qt.QPushButton("Say 'Hello world!'")

# And connect the action "sayHello" to the event "button has been clicked"
hellobutton.clicked.connect(sayHello)

# The rest is known already. (PyQt3 needed a.setMainWidget(hellobutton) here;
# Qt4 has no setMainWidget, so showing the button is enough.)
hellobutton.show()
a.exec_()

You can imagine that coding this way is neither scalable nor the way you'll want to continue working. So let's make that stuff pythonic, adding structure and actually using object orientation. We create our own application class, derived from QApplication, and put the customization of the application into its methods: one method to build up the widgets, and a slot which contains the code that's executed when a signal is received.

#!/usr/bin/env python

import sys
from PyQt4 import Qt

class HelloApplication(Qt.QApplication):
    def __init__(self, args):
        """ In the constructor we're doing everything to get our application
            started, which is basically constructing a basic QApplication by 
            its __init__ method and then adding our widgets. The event loop 
            itself is started by the caller, via exec_()."""
        Qt.QApplication.__init__(self, args)
        self.addWidgets()

    def addWidgets(self):
        """ In this method, we're adding widgets and connecting signals from 
            these widgets to methods of our class, the so-called "slots" 
        """
        self.hellobutton = Qt.QPushButton("Say 'Hello world!'")
        self.hellobutton.clicked.connect(self.slotSayHello)
        self.hellobutton.show()

    def slotSayHello(self):
        """ This is an example slot, a method that gets called when a signal is 
            emitted """
        print ("Hello, World!")


# Only actually do something if this script is run standalone, so we can test our 
# application, but we're also able to import this program without actually running
# any code.
if __name__ == "__main__":
    app = HelloApplication(sys.argv)
    app.exec_()
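
To make the signal/slot idea concrete without requiring Qt, here is a toy model in plain Python. It is only a sketch of the concept and not how PyQt implements signals; the Signal and Button classes and their method names are invented for this illustration.

```python
class Signal(object):
    """Toy stand-in for a Qt signal: remembers the slots connected to it."""

    def __init__(self):
        self._slots = []

    def connect(self, slot):
        # A slot is any callable; it runs whenever the signal is emitted.
        self._slots.append(slot)

    def emit(self, *args):
        # Call every connected slot in the order it was connected.
        for slot in self._slots:
            slot(*args)


class Button(object):
    """Toy stand-in for a push button that exposes a 'clicked' signal."""

    def __init__(self, text):
        self.text = text
        self.clicked = Signal()

    def click(self):
        # In a real GUI the toolkit does this in response to a mouse click.
        self.clicked.emit()


log = []

def say_hello():
    log.append("Hello, World!")

button = Button("Say 'Hello world!'")
button.clicked.connect(say_hello)   # same shape as the PyQt4 call above
button.click()                      # simulate the user pressing the button
print(log)
```

The button knows nothing about say_hello beyond the fact that it is callable. That decoupling is what lets the same widget be reused with different slots.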

GUI Coding

We want to use Qt3 Designer for creating our GUI. In the picture, you can see a simple GUI, with the names of the widgets in green letters. What we are going to do is:

  • We compile the .ui file from Qt Designer into a Python class.
  • We subclass that class and use it as our main widget.

This way, we're able to change the user interface afterwards from Qt Designer, without it interfering with the code we added.

pyuic4 testapp_ui.ui -o testapp_ui.py

generates a Python file from the .ui file, which we can then work with.

The way our program works can be described like this:

  • We fill in the lineedit.
  • Clicking the add button will be connected to a method that reads the text from the lineedit, makes a listviewitem out of it and adds that to our listview.
  • Clicking the deletebutton will delete the currently selected item from the listview.

Here's the heavily commented code (only works in PyQt3):

#!/usr/bin/env python
 
from testapp_ui import TestAppUI
from qt import *
import sys
 
class HelloApplication(QApplication):
    def __init__(self, args):
        """ In the constructor we're doing everything to get our application
            started, which is basically constructing a basic QApplication by 
            its __init__ method, then adding our widgets and finally starting 
            the exec_loop."""
        QApplication.__init__(self, args)
        
        # We pass None since it's the top-level widget, we could in fact leave 
        # that one out, but this way it's easier to add more dialogs or widgets.
        self.maindialog = TestApp(None)
        
        self.setMainWidget(self.maindialog)
        self.maindialog.show()
        self.exec_loop()     


class TestApp(TestAppUI):
    def __init__(self, parent):
        # Run the parent constructor and connect the slots to methods.
        TestAppUI.__init__(self, parent)
        self._connectSlots()

        # The listview is initially empty, so the deletebutton will have no effect,
        # we grey it out.
        self.deletebutton.setEnabled(False)

    def _connectSlots(self):
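        # Connect our two methods to SIGNALS the GUI emits. PyQt3 has no
        # "signal.connect(slot)" syntax, so we use the SIGNAL() form.
        self.connect(self.addbutton, SIGNAL("clicked()"), self._slotAddClicked)
        self.connect(self.deletebutton, SIGNAL("clicked()"), self._slotDeleteClicked)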

    def _slotAddClicked(self):
        # Read the text from the lineedit,
        text = self.lineedit.text()
        # if the lineedit is not empty,
        if len(text):
            # insert a new listviewitem ...
            lvi = QListViewItem(self.listview)
            # with the text from the lineedit and ...
            lvi.setText(0,text)
            # clear the lineedit.
            self.lineedit.clear()

            # The deletebutton might be disabled, since we're sure that there's now
            # at least one item in it, we enable it.
            self.deletebutton.setEnabled(True)

    def _slotDeleteClicked(self):
        # Remove the currently selected item from the listview.
        self.listview.takeItem(self.listview.currentItem())

        # Check if the list is empty - if yes, disable the deletebutton.
        if self.listview.childCount() == 0:
            self.deletebutton.setEnabled(False)


if __name__ == "__main__":
    app = HelloApplication(sys.argv)

The following program, Make PyQt, works with PyQt4 (under Python 2) and offers many useful options:

#!/usr/bin/env python
# Copyright (c) 2008-10 Qtrac Ltd. All rights reserved.
# This program or module is free software: you can redistribute it and/or
# modify it under the terms of the GNU General Public License as published
# by the Free Software Foundation, either version 2 of the License, or
# version 3 of the License, or (at your option) any later version. It is
# provided for educational purposes and is distributed in the hope that
# it will be useful, but WITHOUT ANY WARRANTY; without even the implied
# warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
# the GNU General Public License for more details.
#
#
#   Versions
#
# 1.0.1 Fixed bug reported by Brian Downing where paths that contained
#       spaces were not handled correctly.
# 1.0.2 Fixed bug reported by Ben Thompson that if the UIC program
#       fails, no problem was reported; I try to report one now.
# 1.1.0 Added Remember path option; if checked the program starts with
#       the last used path, otherwise with the current directory, unless
#       overridden on the command line
# 1.1.1 Changed default path on Windows to match PyQt 4.4
# 1.2.1 Changed import style + bug fixes
# 1.2.2 Added stderr to error message output as per Michael Jackson's
#       suggestion
# 1.2.3 Tried to make the paths work on Mac OS X
# 1.2.4 Added more options
# 1.2.5 Use "new-style" connections (src.signal.connect(target.slot) instead of
#       src.connect(src, SIGNAL("signal()"), target.slot)) and improve PEP-8
#       compliance

from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals
from future_builtins import *

import os
import platform
import stat
import sys
from PyQt4.QtCore import *
from PyQt4.QtGui import *

__version__ = "1.2.5"

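Windows = sys.platform.lower().startswith(("win", "microsoft"))

# NOTE: this listing uses PYUIC4, PYRCC4, PYLUPDATE4 and LRELEASE without
# defining them. The defaults below are an assumption (bare tool names that
# are looked up on the PATH); adjust them to match your installation.
PYUIC4 = "pyuic4"
PYRCC4 = "pyrcc4"
PYLUPDATE4 = "pylupdate4"
LRELEASE = "lrelease"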

class OptionsForm(QDialog):
    def __init__(self, parent=None):
        super(OptionsForm, self).__init__(parent)

        settings = QSettings()
        if sys.platform.startswith("darwin"):
            pyuic4Label = QLabel("pyuic4 (pyuic.py)")
        else:
            pyuic4Label = QLabel("pyuic4")
        self.pyuic4Label = QLabel(settings.value("pyuic4",
                QVariant(PYUIC4)).toString())
        self.pyuic4Label.setFrameStyle(QFrame.StyledPanel|
                                       QFrame.Sunken)
        pyuic4Button = QPushButton("py&uic4...")
        pyrcc4Label = QLabel("pyrcc4")
        self.pyrcc4Label = QLabel(settings.value("pyrcc4",
                QVariant(PYRCC4)).toString())
        self.pyrcc4Label.setFrameStyle(QFrame.StyledPanel|
                                       QFrame.Sunken)
        pyrcc4Button = QPushButton("p&yrcc4...")
        pylupdate4Label = QLabel("pylupdate4")
        self.pylupdate4Label = QLabel(settings.value("pylupdate4",
                QVariant(PYLUPDATE4)).toString())
        self.pylupdate4Label.setFrameStyle(QFrame.StyledPanel|
                                           QFrame.Sunken)
        pylupdate4Button = QPushButton("&pylupdate4...")
        lreleaseLabel = QLabel("lrelease")
        self.lreleaseLabel = QLabel(settings.value("lrelease",
                QVariant("lrelease")).toString())
        self.lreleaseLabel.setFrameStyle(QFrame.StyledPanel|
                                         QFrame.Sunken)
        lreleaseButton = QPushButton("&lrelease...")
        toolPathGroupBox = QGroupBox("Tool Paths")

        pathsLayout = QGridLayout()
        pathsLayout.addWidget(pyuic4Label, 0, 0)
        pathsLayout.addWidget(self.pyuic4Label, 0, 1)
        pathsLayout.addWidget(pyuic4Button, 0, 2)
        pathsLayout.addWidget(pyrcc4Label, 1, 0)
        pathsLayout.addWidget(self.pyrcc4Label, 1, 1)
        pathsLayout.addWidget(pyrcc4Button, 1, 2)
        pathsLayout.addWidget(pylupdate4Label, 2, 0)
        pathsLayout.addWidget(self.pylupdate4Label, 2, 1)
        pathsLayout.addWidget(pylupdate4Button, 2, 2)
        pathsLayout.addWidget(lreleaseLabel, 3, 0)
        pathsLayout.addWidget(self.lreleaseLabel, 3, 1)
        pathsLayout.addWidget(lreleaseButton, 3, 2)
        toolPathGroupBox.setLayout(pathsLayout)

        resourceModuleNamesGroupBox = QGroupBox(
                "Resource Module Names")
        qrcFiles = bool(int(settings.value("qrc_resources", "1").toString()))
        self.qrcRadioButton = QRadioButton("&qrc_file.py")
        self.qrcRadioButton.setChecked(qrcFiles)
        self.rcRadioButton = QRadioButton("file_&rc.py")
        self.rcRadioButton.setChecked(not qrcFiles)

        radioLayout = QHBoxLayout()
        radioLayout.addWidget(self.qrcRadioButton)
        radioLayout.addWidget(self.rcRadioButton)
        resourceModuleNamesGroupBox.setLayout(radioLayout)

        self.pyuic4xCheckBox = QCheckBox("Run pyuic4 with -&x "
                " to make forms stand-alone runnable")
        x = bool(int(settings.value("pyuic4x", "0").toString()))
        self.pyuic4xCheckBox.setChecked(x)

        buttonBox = QDialogButtonBox(QDialogButtonBox.Ok|
                                     QDialogButtonBox.Cancel)

        layout = QVBoxLayout()
        layout.addWidget(toolPathGroupBox)
        layout.addWidget(resourceModuleNamesGroupBox)
        layout.addWidget(self.pyuic4xCheckBox)
        layout.addWidget(buttonBox)
        self.setLayout(layout)

        pyuic4Button.clicked.connect(lambda: self.setPath("pyuic4"))
        pyrcc4Button.clicked.connect(lambda: self.setPath("pyrcc4"))
        pylupdate4Button.clicked.connect(lambda: self.setPath("pylupdate4"))
        lreleaseButton.clicked.connect(lambda: self.setPath("lrelease"))
        buttonBox.accepted.connect(self.accept)
        buttonBox.rejected.connect(self.reject)

        self.setWindowTitle("Make PyQt - Options")

    def accept(self):
        settings = QSettings()
        settings.setValue("pyuic4", QVariant(self.pyuic4Label.text()))
        settings.setValue("pyrcc4", QVariant(self.pyrcc4Label.text()))
        settings.setValue("pylupdate4",
                QVariant(self.pylupdate4Label.text()))
        settings.setValue("lrelease", QVariant(self.lreleaseLabel.text()))
        settings.setValue("qrc_resources",
                "1" if self.qrcRadioButton.isChecked() else "0")
        settings.setValue("pyuic4x",
                "1" if self.pyuic4xCheckBox.isChecked() else "0")
        QDialog.accept(self)

    def setPath(self, tool):
        if tool == "pyuic4":
            label = self.pyuic4Label
        elif tool == "pyrcc4":
            label = self.pyrcc4Label
        elif tool == "pylupdate4":
            label = self.pylupdate4Label
        elif tool == "lrelease":
            label = self.lreleaseLabel
        path = QFileDialog.getOpenFileName(self,
                "Make PyQt - Set Tool Path", label.text())
        if path:
            label.setText(QDir.toNativeSeparators(path))


class Form(QMainWindow):
    def __init__(self):
        super(Form, self).__init__()

        pathLabel = QLabel("Path:")
        settings = QSettings()
        rememberPath = settings.value("rememberpath",
                QVariant(True if Windows else False)).toBool()
        if rememberPath:
            path = (unicode(settings.value("path").toString()) or
                    os.getcwd())
        else:
            path = (sys.argv[1] if len(sys.argv) > 1 and
                    QFile.exists(sys.argv[1]) else os.getcwd())
        self.pathLabel = QLabel(path)
        self.pathLabel.setFrameStyle(QFrame.StyledPanel|
                                     QFrame.Sunken)
        self.pathLabel.setToolTip("The relative path; all actions will "
                "take place here,<br>and in this path's subdirectories "
                "if the Recurse checkbox is checked")
        self.pathButton = QPushButton("&Path...")
        self.pathButton.setToolTip(self.pathLabel.toolTip().replace(
                "The", "Sets the"))
        self.recurseCheckBox = QCheckBox("&Recurse")
        self.recurseCheckBox.setToolTip("Clean or build all the files "
                "in the path directory,<br>and all its subdirectories, "
                "as deep as they go.")
        self.transCheckBox = QCheckBox("&Translate")
        self.transCheckBox.setToolTip("Runs <b>pylupdate4</b> on all "
                "<tt>.py</tt> and <tt>.pyw</tt> files in conjunction "
                "with each <tt>.ts</tt> file.<br>Then runs "
                "<b>lrelease</b> on all <tt>.ts</tt> files to produce "
                "corresponding <tt>.qm</tt> files.<br>The "
                "<tt>.ts</tt> files must have been created initially by "
                "running <b>pylupdate4</b><br>directly on a <tt>.py</tt> "
                "or <tt>.pyw</tt> file using the <tt>-ts</tt> option.")
        self.debugCheckBox = QCheckBox("&Dry Run")
        self.debugCheckBox.setToolTip("Shows the actions that would "
                "take place but does not do them.")
        self.logBrowser = QTextBrowser()
        self.logBrowser.setLineWrapMode(QTextEdit.NoWrap)
        self.buttonBox = QDialogButtonBox()
        menu = QMenu(self)
        optionsAction = menu.addAction("&Options...")
        self.rememberPathAction = menu.addAction("&Remember path")
        self.rememberPathAction.setCheckable(True)
        self.rememberPathAction.setChecked(rememberPath)
        aboutAction = menu.addAction("&About")
        moreButton = self.buttonBox.addButton("&More",
                QDialogButtonBox.ActionRole)
        moreButton.setMenu(menu)
        moreButton.setToolTip("Use <b>More-&gt;Tool paths</b> to set the "
                "paths to the tools if they are not found by default")
        self.buildButton = self.buttonBox.addButton("&Build",
                QDialogButtonBox.ActionRole)
        self.buildButton.setToolTip("Runs <b>pyuic4</b> on all "
                "<tt>.ui</tt> "
                "files and <b>pyrcc4</b> on all <tt>.qrc</tt> files "
                "that are out-of-date.<br>Also runs <b>pylupdate4</b> "
                "and <b>lrelease</b> if the Translate checkbox is "
                "checked.")
        self.cleanButton = self.buttonBox.addButton("&Clean",
                QDialogButtonBox.ActionRole)
        self.cleanButton.setToolTip("Deletes all <tt>.py</tt> files that "
                "were generated from <tt>.ui</tt> and <tt>.qrc</tt> "
                "files,<br>i.e., all files matching <tt>qrc_*.py</tt>, "
                "<tt>*_rc.py</tt> and <tt>ui_*.py</tt>.")
        quitButton = self.buttonBox.addButton("&Quit",
                QDialogButtonBox.RejectRole)

        topLayout = QHBoxLayout()
        topLayout.addWidget(pathLabel)
        topLayout.addWidget(self.pathLabel, 1)
        topLayout.addWidget(self.pathButton)
        bottomLayout = QHBoxLayout()
        bottomLayout.addWidget(self.recurseCheckBox)
        bottomLayout.addWidget(self.transCheckBox)
        bottomLayout.addWidget(self.debugCheckBox)
        bottomLayout.addStretch()
        bottomLayout.addWidget(self.buttonBox)
        layout = QVBoxLayout()
        layout.addLayout(topLayout)
        layout.addWidget(self.logBrowser)
        layout.addLayout(bottomLayout)
        widget = QWidget()
        widget.setLayout(layout)
        self.setCentralWidget(widget)

        aboutAction.triggered.connect(self.about)
        optionsAction.triggered.connect(self.setOptions)
        self.pathButton.clicked.connect(self.setPath)
        self.buildButton.clicked.connect(self.build)
        self.cleanButton.clicked.connect(self.clean)
        quitButton.clicked.connect(self.close)

        self.setWindowTitle("Make PyQt")

    def closeEvent(self, event):
        settings = QSettings()
        settings.setValue("rememberpath",
                QVariant(self.rememberPathAction.isChecked()))
        settings.setValue("path", QVariant(self.pathLabel.text()))
        event.accept()

    def about(self):
        QMessageBox.about(self, "About Make PyQt",
                """<b>Make PyQt</b> v {0}
                <p>Copyright &copy; 2007-10 Qtrac Ltd. 
                All rights reserved.
                <p>This application can be used to build PyQt
                applications.
                It runs pyuic4, pyrcc4, pylupdate4, and lrelease as
                required, although pylupdate4 must be run directly to
                create the initial .ts files.
                <p>Python {1} - Qt {2} - PyQt {3} on {4}""".format(
                __version__, platform.python_version(),
                QT_VERSION_STR, PYQT_VERSION_STR,
                platform.system()))

    def setPath(self):
        path = QFileDialog.getExistingDirectory(self,
                "Make PyQt - Set Path", self.pathLabel.text())
        if path:
            self.pathLabel.setText(QDir.toNativeSeparators(path))

    def setOptions(self):
        dlg = OptionsForm(self)
        dlg.exec_()

    def build(self):
        self.updateUi(False)
        self.logBrowser.clear()
        recurse = self.recurseCheckBox.isChecked()
        path = unicode(self.pathLabel.text())
        self._apply(recurse, self._build, path)
        if self.transCheckBox.isChecked():
            self._apply(recurse, self._translate, path)
        self.updateUi(True)

    def clean(self):
        self.updateUi(False)
        self.logBrowser.clear()
        recurse = self.recurseCheckBox.isChecked()
        path = unicode(self.pathLabel.text())
        self._apply(recurse, self._clean, path)
        self.updateUi(True)

    def updateUi(self, enable):
        for widget in (self.buildButton, self.cleanButton,
                self.pathButton, self.recurseCheckBox,
                self.transCheckBox, self.debugCheckBox):
            widget.setEnabled(enable)
        if not enable:
            QApplication.setOverrideCursor(QCursor(Qt.WaitCursor))
        else:
            QApplication.restoreOverrideCursor()
            self.buildButton.setFocus()

    def _apply(self, recurse, function, path):
        if not recurse:
            function(path)
        else:
            for root, dirs, files in os.walk(path):
                for dir in sorted(dirs):
                    function(os.path.join(root, dir))

    def _make_error_message(self, command, process):
        err = ""
        ba = process.readAllStandardError()
        if not ba.isEmpty():
            err = ": " + str(QString(ba))
        return "<font color=red>FAILED: %s%s</font>" % (command, err)

    def _build(self, path):
        settings = QSettings()
        pyuic4 = unicode(settings.value("pyuic4",
                                        QVariant(PYUIC4)).toString())
        pyrcc4 = unicode(settings.value("pyrcc4",
                                        QVariant(PYRCC4)).toString())
        prefix = unicode(self.pathLabel.text())
        pyuic4x = bool(int(settings.value("pyuic4x", "0").toString()))
        if not prefix.endswith(os.sep):
            prefix += os.sep
        failed = 0
        process = QProcess()
        for name in os.listdir(path):
            source = os.path.join(path, name)
            target = None
            if source.endswith(".ui"):
                target = os.path.join(path,
                                    "ui_" + name.replace(".ui", ".py"))
                command = pyuic4
            elif source.endswith(".qrc"):
                if bool(int(settings.value("qrc_resources", "1").toString())):
                    target = os.path.join(path,
                                        "qrc_" + name.replace(".qrc", ".py"))
                else:
                    target = os.path.join(path, name.replace(".qrc", "_rc.py"))
                command = pyrcc4
            if target is not None:
                if not os.access(target, os.F_OK) or (
                   os.stat(source)[stat.ST_MTIME] >
                   os.stat(target)[stat.ST_MTIME]):
                    args = ["-o", target, source]
                    if command == PYUIC4 and pyuic4x:
                        args.insert(0, "-x")
                    if (sys.platform.startswith("darwin") and
                        command == PYUIC4):
                        command = sys.executable
                        args = [PYUIC4] + args
                    msg = ("converted <font color=darkblue>" + source +
                           "</font> to <font color=blue>" + target +
                           "</font>")
                    if self.debugCheckBox.isChecked():
                        msg = "<font color=green># " + msg + "</font>"
                    else:
                        process.start(command, args)
                        if (not process.waitForFinished(2 * 60 * 1000) or
                            not QFile.exists(target)):
                            msg = self._make_error_message(command,
                                                           process)
                            failed += 1
                    self.logBrowser.append(msg.replace(prefix, ""))
                else:
                    self.logBrowser.append("<font color=green>"
                            "# {0} is up-to-date</font>".format(
                            source.replace(prefix, "")))
                QApplication.processEvents()
        if failed:
            QMessageBox.information(self, "Make PyQt - Failures",
                    "Try manually setting the paths to the tools "
                    "using <b>More-&gt;Options</b>")

    def _clean(self, path):
        prefix = unicode(self.pathLabel.text())
        if not prefix.endswith(os.sep):
            prefix += os.sep
        deletelist = []
        for name in os.listdir(path):
            target = os.path.join(path, name)
            source = None
            if (target.endswith(".py") or target.endswith(".pyc") or
                target.endswith(".pyo")):
                if name.startswith("ui_") and not name[-1] in "oc":
                    source = os.path.join(path, name[3:-3] + ".ui")
                elif name.startswith("qrc_"):
                    if target[-1] in "oc":
                        source = os.path.join(path, name[4:-4] + ".qrc")
                    else:
                        source = os.path.join(path, name[4:-3] + ".qrc")
                elif name.endswith(("_rc.py", "_rc.pyo", "_rc.pyc")):
                    if target[-1] in "oc":
                        source = os.path.join(path, name[:-7] + ".qrc")
                    else:
                        source = os.path.join(path, name[:-6] + ".qrc")
                elif target[-1] in "oc":
                    source = target[:-1]
                if source is not None:
                    if os.access(source, os.F_OK):
                        if self.debugCheckBox.isChecked():
                            self.logBrowser.append("<font color=green>"
                                    "# delete {0}</font>".format(
                                    target.replace(prefix, "")))
                        else:
                            deletelist.append(target)
                    else:
                        self.logBrowser.append("<font color=darkred>"
                                "will not remove "
                                "'{0}' since `{1}' not found</font>"
                                .format(target.replace(prefix, ""),
                                source.replace(prefix, "")))
        if not self.debugCheckBox.isChecked():
            for target in deletelist:
                self.logBrowser.append("deleted "
                        "<font color=red>{0}</font>".format(
                        target.replace(prefix, "")))
                os.remove(target)
                QApplication.processEvents()

    def _translate(self, path):
        prefix = unicode(self.pathLabel.text())
        if not prefix.endswith(os.sep):
            prefix += os.sep
        files = []
        tsfiles = []
        for name in os.listdir(path):
            if name.endswith((".py", ".pyw")):
                files.append(os.path.join(path, name))
            elif name.endswith(".ts"):
                tsfiles.append(os.path.join(path, name))
        if not tsfiles:
            return
        settings = QSettings()
        pylupdate4 = unicode(settings.value("pylupdate4",
                             QVariant(PYLUPDATE4)).toString())
        lrelease = unicode(settings.value("lrelease",
                           QVariant(LRELEASE)).toString())
        process = QProcess()
        failed = 0
        for ts in tsfiles:
            qm = ts[:-3] + ".qm"
            command1 = pylupdate4
            args1 = files + ["-ts", ts]
            command2 = lrelease
            args2 = ["-silent", ts, "-qm", qm]
            msg = "updated <font color=blue>{0}</font>".format(
                    ts.replace(prefix, ""))
            if self.debugCheckBox.isChecked():
                msg = "<font color=green># {0}</font>".format(msg)
            else:
                process.start(command1, args1)
                if not process.waitForFinished(2 * 60 * 1000):
                    msg = self._make_error_message(command1, process)
                    failed += 1
            self.logBrowser.append(msg)
            msg = "generated <font color=blue>{0}</font>".format(
                    qm.replace(prefix, ""))
            if self.debugCheckBox.isChecked():
                msg = "<font color=green># {0}</font>".format(msg)
            else:
                process.start(command2, args2)
                if not process.waitForFinished(2 * 60 * 1000):
                    msg = self._make_error_message(command2, process)
                    failed += 1
            self.logBrowser.append(msg)
            QApplication.processEvents()
        if failed:
            QMessageBox.information(self, "Make PyQt - Failures",
                    "Try manually setting the paths to the tools "
                    "using <b>More-&gt;Options</b>")


app = QApplication(sys.argv)
PATH = unicode(app.applicationDirPath())
if Windows:
    PATH = os.path.join(os.path.dirname(sys.executable),
                        "Lib/site-packages/PyQt4")
    if os.access(os.path.join(PATH, "bin"), os.R_OK):
        PATH = os.path.join(PATH, "bin")
if sys.platform.startswith("darwin"):
    i = PATH.find("Resources")
    if i > -1:
        PATH = PATH[:i] + "bin"
PYUIC4 = os.path.join(PATH, "pyuic4")
if sys.platform.startswith("darwin"):
    PYUIC4 = os.path.dirname(sys.executable)
    i = PYUIC4.find("Resources")
    if i > -1:
        PYUIC4 = PYUIC4[:i] + "Lib/python2.6/site-packages/PyQt4/uic/pyuic.py"
PYRCC4 = os.path.join(PATH, "pyrcc4")
PYLUPDATE4 = os.path.join(PATH, "pylupdate4")
LRELEASE = "lrelease"
if Windows:
    PYUIC4 = PYUIC4.replace("/", "\\") + ".bat"
    PYRCC4 = PYRCC4.replace("/", "\\") + ".exe"
    PYLUPDATE4 = PYLUPDATE4.replace("/", "\\") + ".exe"
app.setOrganizationName("Qtrac Ltd.")
app.setOrganizationDomain("qtrac.eu")
app.setApplicationName("Make PyQt")
if len(sys.argv) > 1 and sys.argv[1] == "-c":
    settings = QSettings()
    settings.setValue("pyuic4", QVariant(PYUIC4))
    settings.setValue("pyrcc4", QVariant(PYRCC4))
    settings.setValue("pylupdate4", QVariant(PYLUPDATE4))
    settings.setValue("lrelease", QVariant(LRELEASE))
form = Form()
form.show()
app.exec_()

Useful to Know

Creating the GUI in Qt Designer not only makes creating the GUI easier, but it's also a great learning tool. You can test what a widget looks like, see what's available in Qt, and have a look at properties you might want to use.

The C++ API documentation is also a very useful (read: necessary) tool when working with PyQt. The API translation is straightforward, and after a little experience, you'll find the developer API docs to be one of the tools you really need. When working from KDE, Konqueror's default shortcut is qt:[widgetname], so pressing [Alt]+[F2] and typing "qt:qbutton" takes you directly to the right API documentation page. Trolltech's doc section has much more documentation that you might want to look at.

The first three examples in this tutorial were created using PyQt4; the last one uses syntax that only works with PyQt3.

Note: The previous version of this page (applicable to PyQt3) is/was available at http://vizzzion.org/?id=pyqt.

This document is published under the GNU Free Documentation License.




Dbus


In Linux, Dbus is a way for processes to communicate with each other. For example, programs like Pidgin instant messenger allow other programs to find out or change the user's status (Available, Away, etc). Another example is the network-manager service that publishes which internet connection is active. Programs that sometimes connect to the internet can then pick the best time to download updates to the system.

Buses

Messages are sent along buses. Services attach themselves to these buses, and allow clients to pass messages to and from them.

There are two main buses, the system bus and session bus. Services on the system bus affect the whole system, such as providing information about the network or disk drives. Services on the session bus provide access to programs running on the desktop, like Pidgin.

import dbus

sys_bus = dbus.SystemBus()

Objects and interfaces

Services attached to a bus can be contacted using their well-known name. While this could be any string, the format is normally that of a reverse domain name: an example for a spreadsheet program called "CalcProgram" from "My Corp Inc." could be "com.mycorp.CalcProgram".

Services publish objects using slash-separated paths (this is similar to webpages). Someone on dbus can request an object if they know this path.

The object passed back is not a full object: it just refers to the service's copy of the object. It is called a proxy object.

proxy_for_cell_a2 = sys_bus.get_object('com.mycorp.CalcProgram', '/spreadsheet1/cells/a2')

Before the proxy object can be used, we need to specify what type of object it is. We do this by creating an interface object.

cell_a2 = dbus.Interface(proxy_for_cell_a2, 'com.mycorp.CalcProgram.SpreadsheetCell')

Whatever methods are set up for this type of object can be called:

cell_a2.getContents()

Name                      Example                                  Description
service well-known name   com.mycorp.CalcProgram                   Identifies the application
path of an object         /spreadsheet1/cells/a2                   Identifies an object published by a service
interface                 com.mycorp.CalcProgram.SpreadsheetCell   Identifies what type of object we expect
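
The idea of a proxy object can be illustrated without a running bus. The following is only a sketch in plain Python: the classes Service, Proxy and SpreadsheetCell are hypothetical stand-ins invented for this illustration and are not part of any Dbus library. The proxy stores just a reference (service and path) and forwards every method call to the service's own copy of the object.

```python
class SpreadsheetCell(object):
    """The 'real' object, which lives inside the service."""
    def __init__(self, contents):
        self._contents = contents

    def getContents(self):
        return self._contents


class Service(object):
    """Hypothetical stand-in for a service publishing objects under paths."""
    def __init__(self):
        self.objects = {'/spreadsheet1/cells/a2': SpreadsheetCell('=SUM(A1)')}

    def call(self, path, method_name, *args):
        # With a real bus, this step would be a message sent to the service.
        return getattr(self.objects[path], method_name)(*args)


class Proxy(object):
    """Holds only a reference (service + path), never the object itself."""
    def __init__(self, service, path):
        self._service = service
        self._path = path

    def __getattr__(self, method_name):
        # Called only for attributes the proxy does not have itself.
        def forward(*args):
            return self._service.call(self._path, method_name, *args)
        return forward


service = Service()
cell_a2 = Proxy(service, '/spreadsheet1/cells/a2')
print(cell_a2.getContents())
```

Changing the cell inside the service would be visible through the proxy on the next call, because the proxy never holds a copy of the data.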

dbus-python examples

These examples have been tested with dbus-python 0.83.0. Older library versions may not have the same interface.

Calling an interface's methods / Listing HAL Devices:

import dbus

bus = dbus.SystemBus()
hal_manager_object = bus.get_object('org.freedesktop.Hal', '/org/freedesktop/Hal/Manager')
hal_manager_interface = dbus.Interface(hal_manager_object, 'org.freedesktop.Hal.Manager')

# calling method upon interface
print hal_manager_interface.GetAllDevices()

# accessing a method through 'get_dbus_method' through proxy object by specifying interface
method = hal_manager_object.get_dbus_method('GetAllDevices', 'org.freedesktop.Hal.Manager')
print method()

# calling method upon proxy object by specifying the interface to use
print hal_manager_object.GetAllDevices(dbus_interface='org.freedesktop.Hal.Manager')

Introspecting an object:

import dbus

bus = dbus.SystemBus()
hal_manager_object = bus.get_object(
    'org.freedesktop.Hal',          # service
    '/org/freedesktop/Hal/Manager'  # published object
)

introspection_interface = dbus.Interface(
    hal_manager_object,
    dbus.INTROSPECTABLE_IFACE,
)

# Introspectable interfaces define a method 'Introspect' that
# returns an XML string describing the object's interfaces
interface = introspection_interface.Introspect()
print interface

Avahi:

import dbus

sys_bus = dbus.SystemBus()

# get an object called / in org.freedesktop.Avahi to talk to
raw_server = sys_bus.get_object('org.freedesktop.Avahi', '/')

# objects support interfaces. get the org.freedesktop.Avahi.Server interface to our org.freedesktop.Avahi object.
server = dbus.Interface(raw_server, 'org.freedesktop.Avahi.Server')

# The so-called documentation is at /usr/share/avahi/introspection/Server.introspect
print server
print server.GetVersionString()
print server.GetHostName()

pydbus examples

These examples have been tested with pydbus 0.2 and 0.3.

Calling an interface's methods / Listing systemd units:

from pydbus import SystemBus

bus = SystemBus()
systemd = bus.get(
    '.systemd1' # service name - names starting with . automatically get org.freedesktop prepended.
    # no object path - it'll be set to the service name transformed to the path format (/org/freedesktop/systemd1)
)

for unit in systemd.ListUnits()[0]:
    print(unit)
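
The naming shortcut described in the comments above can be written out as a small sketch. The helper names expand_name and default_path are hypothetical and are not part of the pydbus API; they only mirror the convention stated in the comments (a leading dot stands for org.freedesktop, and the default object path is the service name with dots turned into slashes).

```python
def expand_name(name):
    """Prepend org.freedesktop to names written with a leading dot."""
    if name.startswith('.'):
        return 'org.freedesktop' + name
    return name


def default_path(bus_name):
    """Turn a dotted service name into the matching slash-separated path."""
    return '/' + bus_name.replace('.', '/')


bus_name = expand_name('.systemd1')
print(bus_name)                # org.freedesktop.systemd1
print(default_path(bus_name))  # /org/freedesktop/systemd1
```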

Introspecting an object:

from pydbus import SystemBus

bus = SystemBus()
systemd = bus.get('.systemd1')

# Introspectable interfaces define a method 'Introspect' that
# returns an XML string describing the object's interfaces
print(systemd.Introspect()[0])

# Introspection data is automatically converted to Python's help system data
help(systemd)

Avahi:

from pydbus import SystemBus

bus = SystemBus()

# get an object called / in org.freedesktop.Avahi to talk to
avahi = bus.get('.Avahi', '/')

# See the object's API
help(avahi)

print(avahi.GetVersionString())
print(avahi.GetHostName())



pyFormex


pyFormex is a module for Python, which allows the generation, manipulation, and operation of 3D geometric models using mathematical operations. Its uses include automated 3D design and finite-element preprocessing.



matplotlib

matplotlib is a Python library that allows Python to be used like MATLAB, visualizing data on the fly. It is able to create plots, histograms, power spectra, bar charts, error charts, scatter plots, etc. It can be used from a normal Python session and also from IPython.

Examples:

Plot a data series that represents the square function:

from matplotlib import pyplot as plt
data = [x * x for x in range(20)]
plt.plot(data)
plt.show()

Plot a data series that represents the square function but in reverse order, by providing not only the series of the y-axis values but also the series of x-axis values:

from matplotlib import pyplot as plt
datax = list(range(20)) # list() so that reverse() also works where range() is not a list
datax.reverse()
datay = [x * x for x in range(20)]
plt.plot(datax, datay)
plt.show()

Plot the square function, setting the limits for the y-axis:

from matplotlib import pyplot as plt
data = [x * x for x in range(20)]
plt.ylim(-500, 500) # Set limits of y-axis
plt.plot(data)
plt.show()



Sorted Container Types

Python does not provide modules for sorted set and dictionary data types as part of its standard library. This is a conscious decision on the part of Guido van Rossum, et al. to preserve "one obvious way to do it." Instead, Python delegates this task to third-party libraries that are available on the Python Package Index. These libraries use various techniques to maintain list, dict, and set types in sorted order. Maintaining order using a specialized data structure can avoid the very slow behavior (quadratic run-time) of the naive approach of editing and constantly re-sorting.

Third-party modules supporting sorted containers:

  • SortedContainers - Pure-Python implementation that is as fast as C implementations. Implements sorted list, dict, and set. Testing includes 100% code coverage and hours of stress. Documentation includes full API reference, performance comparison, and contributing/development guidelines. License: Apache2.
  • rbtree - Provides a fast, C-implementation for sorted dict and set data types. Based on a red-black tree implementation.
  • treap - Provides a sorted dict data type. Uses a treap for implementation and improves performance using Cython.
  • bintrees - Provides several tree-based implementations for dict and set data types. Fastest implementations are based on AVL and Red-Black trees. Implemented in C. Extends the conventional API to provide set operations for dict data types.
  • banyan - Provides a fast, C-implementation for dict and set data types.
  • skiplistcollections - Pure-Python implementation based on skip-lists providing a limited API for dict and set data types.
  • blist - Provides sorted list, dict and set data types based on the "blist" data type, a B-tree implementation. Implemented in Python and C.
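
For small cases, the difference between re-sorting and keeping a list in order can be tried out with the standard library's bisect module. This is only a minimal sketch and not a replacement for the libraries above: bisect finds the insertion point by binary search, but inserting into a plain list still shifts elements, so it does not offer the guarantees of a dedicated sorted container.

```python
import bisect

scores = []
for value in [42, 7, 19, 7, 88]:
    bisect.insort(scores, value)  # insert while keeping the list sorted

print(scores)

# Binary search: index of the first element that is >= 19
index = bisect.bisect_left(scores, 19)
print(index)
```

The list never needs a call to sort(), because every insertion leaves it ordered.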



Excel

Python has multiple 3rd party libraries for reading and writing Microsoft Excel and Apache OpenOffice files.

For working with .xls files, there is xlrd for reading and xlwt for writing.

For working with .xlsx files, there is xlrd for reading, openpyxl for reading and writing, and XlsxWriter and PyExcelerate for writing.

For working with .ods files, there is pyexcel-ezodf for reading and writing.

xlrd

Supports reading .xls Excel files; xlrd versions before 2.0 can also read .xlsx files. License: BSD.

Example:

import xlrd
workbook = xlrd.open_workbook("MySpreadsheet.xls")
#for sheet in workbook.sheets(): # Loads all the sheets, unlike workbook.sheet_names()
for sheetName in workbook.sheet_names(): # Sheet iteration by name
  print "Sheet name:", sheetName
  sheet = workbook.sheet_by_name(sheetName)
  for rowno in range(sheet.nrows):
    for colno in range(sheet.ncols):
      cell = sheet.cell(rowno, colno)
      print str(cell.value) # Output as a string
      if cell.ctype == xlrd.XL_CELL_DATE:
        dateTuple = xlrd.xldate_as_tuple(cell.value, workbook.datemode)
        print dateTuple # E.g. (2017, 1, 1, 0, 0, 0)
        mydate = xlrd.xldate.xldate_as_datetime(cell.value, workbook.datemode)
        print mydate # In xlrd 0.9.3
      print
    
for sheetno in range(workbook.nsheets): # Sheet iteration by index
  sheet = workbook.sheet_by_index(sheetno)
  print "Sheet name:", sheet.name
  for notekey in sheet.cell_note_map: # In xlrd 0.7.2
    print "Note AKA comment text:", sheet.cell_note_map[notekey].text
  
print xlrd.formula.colname(1) # Column name such as A or AD, here 'B'
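
The mapping from a zero-based column index to a column name, as used in the last line above, can be sketched in plain Python. The function column_name is a hypothetical helper written for this illustration; it is not xlrd's implementation, only the usual letter-numbering scheme of spreadsheets (A to Z, then AA, AB, and so on).

```python
def column_name(index):
    """Convert a zero-based column index to a spreadsheet column name."""
    name = ""
    index += 1  # work with one-based numbers: 1 -> A, 26 -> Z, 27 -> AA
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        name = chr(ord("A") + remainder) + name
    return name


print(column_name(1))   # B
print(column_name(29))  # AD
```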

xlwt

Supports writing .xls files. License: BSD.

openpyxl

Supports reading and writing .xlsx Excel files. Does not support .xls files. License: MIT.

Reading a workbook:

from openpyxl import load_workbook
workbook = load_workbook("MyNewWorkbook.xlsx")
for worksheet in workbook.worksheets:
  print "==%s==" % worksheet.title
  for row in worksheet: # For each cell in each row
    for cell in row:
      print cell.row, cell.column, cell.value # E.g. 1 A Value
  for cell in worksheet["A"]: # For each cell in column A
    print cell.value
  print worksheet["A1"].value # A single cell
  print worksheet.cell(column=1, row=1).value # A1 value as well

Creating a new workbook:

from openpyxl import Workbook
workbook = Workbook()
worksheet = workbook.worksheets[0]
worksheet['A1'] = 'String value'
worksheet['A2'] = 42 # Numerical value
worksheet.cell(row=3, column=1).value = "New A3 Value"
workbook.save("MyNewWorkbook.xlsx") # Overrides if it exists

Changing an existing workbook:

from openpyxl import load_workbook
workbook_name = 'MyWorkbook.xlsx'
workbook = load_workbook(workbook_name)
worksheet = workbook.worksheets[0]
worksheet['A1'] = "String value"
workbook.save(workbook_name)

XlsxWriter

Supports writing of .xlsx files. License: BSD.

PyExcelerate

Supports writing .xlsx files. License: BSD.

xlutils

Supports various operations and queries on .xls files; depends on xlrd and xlwt. License: MIT.

pywin32

Supports access to Windows applications via Windows Component Object Model (COM). Thus, on Windows, if Excel is installed, PyWin32 lets you call it from Python and let it do various things. You can install PyWin32 by downloading a .exe installer from SourceForge, where it is currently hosted.



MS Word

Microsoft Word documents in the .docx format can be created or changed using the third-party python-docx module.

The .doc format can be worked with on Windows using PyWin32 via COM interface, provided Word is installed. An example:

import win32com.client
wordapp = win32com.client.gencache.EnsureDispatch("Word.Application")
# wordapp.Visible = False
worddoc = wordapp.Documents.Open(r"C:\MyFile.doc")
wdFormatHTML = 8
worddoc.SaveAs(r"C:\MyFile.html", FileFormat=wdFormatHTML)
worddoc.ActiveWindow.Close()
# wordapp.Application.Quit(-1) - No need to quit; Word quits when its last window is closed



Extending with C


Python modules can be written in pure Python but they can also be written in the C language. The following shows how to extend Python with C.

Using the Python/C API

A minimal example

To illustrate the mechanics, we will create a minimal extension module containing a single function that outputs "Hello" followed by the name passed in as the first parameter.

We will first create the C source code, placing it in hellomodule.c:

#include <Python.h>

static PyObject*
say_hello(PyObject* self, PyObject* args)
{
    const char* name;

    if (!PyArg_ParseTuple(args, "s", &name))
        return NULL;

    printf("Hello %s!\n", name);

    Py_RETURN_NONE;
}

static PyMethodDef HelloMethods[] =
{
     {"say_hello", say_hello, METH_VARARGS, "Greet somebody."},
     {NULL, NULL, 0, NULL}
};

PyMODINIT_FUNC
inithello(void)
{
     (void) Py_InitModule("hello", HelloMethods);
}

Then we will need a setup file, setup.py:

from distutils.core import setup, Extension

module1 = Extension('hello', sources = ['hellomodule.c'])

setup (name = 'PackageName',
        version = '1.0',
        description = 'This is a demo package',
        ext_modules = [module1])

Then we can build the module using a procedure whose details depend on the operating system and the compiler suite.

Building with GCC for Linux

Before our module can be compiled, you must install the Python development headers if you have not already. On Debian and Debian-based systems such as Ubuntu, these can be installed with the following command:

$ sudo apt install python-dev

On openSUSE, the required package is called python-devel and can be installed with zypper:

$ sudo zypper install python-devel


Now that Python.h is available, we can compile the module source code we created in the previous section as follows:

$ python setup.py build

This will compile the module to a file called hello.so in build/lib.linux-i686-x.y.

Building with GCC for Microsoft Windows

Microsoft Windows users can use MinGW to compile the extension module from the command line. Assuming gcc is in the path, you can build the extension as follows:

python setup.py build -cmingw32

The above will produce file hello.pyd, a Python Dynamic Module, similar to a DLL. The file will land in build\lib.win32-x.y.

An alternate way of building the module in Windows is to build a DLL. (This method does not need an extension module file). From cmd.exe, type:

gcc -c  hellomodule.c -I/PythonXY/include
gcc -shared hellomodule.o -L/PythonXY/libs -lpythonXY -o hello.dll

where XY represents the version of Python, such as "24" for version 2.4.
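
If you are unsure which value to use for XY, the running interpreter can report it. The following sketch only builds the XY string and prints the two options in the form used above; it assumes that the script is run by the same Python installation you are building against.

```python
import sys

# Major and minor version of the running interpreter, e.g. (2, 4)
major, minor = sys.version_info[:2]
xy = "%d%d" % (major, minor)

print("-I/Python%s/include" % xy)
print("-lpython%s" % xy)
```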

Building using Microsoft Visual C++

With VC8, distutils is broken. Therefore, we will use cl.exe from a command prompt instead:

cl /LD hellomodule.c /Ic:\Python24\include c:\Python24\libs\python24.lib /link /out:hello.dll

Using the extension module

Change to the subdirectory where the file hello.so resides. In an interactive Python session you can use the module as follows.

>>> import hello
>>> hello.say_hello("World")
Hello World!

A module for calculating Fibonacci numbers

In this section, we present a module for Fibonacci numbers, thereby expanding on the minimal example above. Compared to the minimal example, what is worth noting is the use of "i" in PyArg_ParseTuple() and Py_BuildValue().

The C source code (fibmodule.c):

#include <Python.h>

int
_fib(int n)
{
    if (n < 2)
        return n;
    else
        return _fib(n-1) + _fib(n-2);
}

static PyObject*
fib(PyObject* self, PyObject* args)
{
    int n;

    if (!PyArg_ParseTuple(args, "i", &n))
        return NULL;

    return Py_BuildValue("i", _fib(n));
}

static PyMethodDef FibMethods[] = {
    {"fib", fib, METH_VARARGS, "Calculate the Fibonacci numbers."},
    {NULL, NULL, 0, NULL}
};

PyMODINIT_FUNC
initfib(void)
{
    (void) Py_InitModule("fib", FibMethods);
}

The build script (setup.py):

from distutils.core import setup, Extension

module1 = Extension('fib', sources = ['fibmodule.c'])

setup (name = 'PackageName',
        version = '1.0',
        description = 'This is a demo package',
        ext_modules = [module1])

Usage:

>>> import fib
>>> fib.fib(10)
55
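
To check the results of the extension module, it can help to have the same recursion available in pure Python. This sketch mirrors _fib() from fibmodule.c; it is meant as a reference for comparison, not as an efficient implementation.

```python
def fib(n):
    """Naive recursive Fibonacci, the same recursion as _fib() in fibmodule.c."""
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)


print(fib(10))  # 55, matching the output of the C module above
```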

Using SWIG

SWIG is a tool that helps a variety of scripting and programming languages call C and C++ code. SWIG makes creation of C language modules much more straightforward.

To use SWIG, you need to get it up and running first.

You can install it on an Ubuntu system as follows:

$ sudo apt-get install swig
$ sudo apt-get install python-dev

To get SWIG for Windows, you can use binaries available from the SWIG download page.

Once you have SWIG, you need to create the module source file and the module interface file:

hellomodule.c:

#include <stdio.h>

void say_hello(const char* name) {
    printf("Hello %s!\n", name);
}

hello.i:

%module hello
extern void say_hello(const char* name);

Then we let SWIG do its work:

swig -python hello.i

The above produces files hello.py and hello_wrap.c.

The next step is compiling; substitute /usr/include/python2.4/ with the correct path to Python.h for your setup:

gcc -fpic -c hellomodule.c hello_wrap.c -I/usr/include/python2.4/
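
One way to find the correct path is to ask the interpreter itself. This sketch uses the standard sysconfig module (available in Python 2.7 and 3.2 or later) and assumes it is run with the same Python you are compiling for; Python.h is only present if the development headers are installed.

```python
import os
import sysconfig

# Directory that should contain Python.h for the running interpreter
include_dir = sysconfig.get_paths()["include"]
print("-I" + include_dir)

# True only if the development headers are actually installed
print(os.path.exists(os.path.join(include_dir, "Python.h")))
```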

As the last step, we do the linking:

gcc -shared hellomodule.o hello_wrap.o -o _hello.so -lpython

The module is used as follows:

>>> import hello
>>> hello.say_hello("World")
Hello World!



Extending with C++


There are different ways to extend Python with C and C++ code:

  • In plain C, using Python.h
  • Using SWIG
  • Using Boost.Python, optionally with Py++ preprocessing
  • Using pybind11
  • Using Cython

This page describes Boost.Python. Before the emergence of Cython, it was the most comfortable way of writing C++ extension modules.

Boost.Python comes bundled with the Boost C++ Libraries. To install it on an Ubuntu system, you might need to run the following commands:

$ sudo apt-get install libboost-python-dev 
$ sudo apt-get install python-dev

A Hello World Example

The C++ source code (hellomodule.cpp)

#include <iostream>

using namespace std;

void say_hello(const char* name) {
    cout << "Hello " <<  name << "!\n";
}

#include <boost/python/module.hpp>
#include <boost/python/def.hpp>
using namespace boost::python;

BOOST_PYTHON_MODULE(hello)
{
    def("say_hello", say_hello);
}

setup.py

#!/usr/bin/env python

from distutils.core import setup
from distutils.extension import Extension

setup(name="PackageName",
    ext_modules=[
        Extension("hello", ["hellomodule.cpp"],
        libraries = ["boost_python"])
    ])

Now we can build our module with

python setup.py build

The module hello.so will end up in e.g. build/lib.linux-i686-2.4.

Using the extension module

Change to the subdirectory where the file hello.so resides. In an interactive Python session you can use the module as follows.

>>> import hello
>>> hello.say_hello("World")
Hello World!

An example with CGAL

Some, but not all, functions of the CGAL library already have Python bindings. Here an example is provided for a case without such a binding and how it might be implemented. The example is taken from the CGAL Documentation.

// test.cpp
#include <iostream>
#include <iterator>
#include <vector>
using namespace std;

/* PYTHON */
#include <boost/python.hpp>
#include <boost/python/module.hpp>
#include <boost/python/def.hpp>
namespace python = boost::python;

/* CGAL */
#include <CGAL/Cartesian.h>
#include <CGAL/Range_segment_tree_traits.h>
#include <CGAL/Range_tree_k.h>

typedef CGAL::Cartesian<double> K;
typedef CGAL::Range_tree_map_traits_2<K, char> Traits;
typedef CGAL::Range_tree_2<Traits> Range_tree_2_type;

typedef Traits::Key Key;
typedef Traits::Interval Interval;

Range_tree_2_type *Range_tree_2 = new Range_tree_2_type;

void create_tree()   {

  typedef Traits::Key Key;                
  typedef Traits::Interval Interval;    

  std::vector<Key> InputList, OutputList;
  InputList.push_back(Key(K::Point_2(8,5.1), 'a'));
  InputList.push_back(Key(K::Point_2(1.0,1.1), 'b'));
  InputList.push_back(Key(K::Point_2(3,2.1), 'c'));

  Range_tree_2->make_tree(InputList.begin(),InputList.end());
  Interval win(Interval(K::Point_2(1,2.1),K::Point_2(8.1,8.2)));
  std::cout << "\n Window Query:\n";
  Range_tree_2->window_query(win, std::back_inserter(OutputList));
  std::vector<Key>::iterator current=OutputList.begin();
  while(current!=OutputList.end()){
      std::cout << "  " << (*current).first.x() << "," << (*current).first.y()
           << ":" << (*current).second << std::endl;
      current++;
    }
  std::cout << "\n Done\n";
}

void initcreate_tree() {;}

using namespace boost::python;
BOOST_PYTHON_MODULE(test)
{
    def("create_tree", create_tree, "");
}

setup.py

#!/usr/bin/env python
 
from distutils.core import setup
from distutils.extension import Extension
 
setup(name="PackageName",
    ext_modules=[
        Extension("test", ["test.cpp"],
        libraries = ["boost_python"])
    ])

We then compile and run the module as follows:

$ python setup.py build
$ cd build/lib*
$ python
>>> import test
>>> test.create_tree()
Window Query:
 3,2.1:c
 8,5.1:a
Done
>>>

Handling Python objects and errors

One can also handle more complex data, e.g. Python objects like lists. Attributes are accessed by applying the extract function to the result of the object's "attr" function. We can also raise errors by telling the library that an error has occurred and returning. In the following case, we have written a C++ function called "_afunction" (exposed to Python as "afunction") which we want to call. The function takes an integer N and a vector of at least N strings as input, so we have to convert the Python list to a vector of strings before calling the function.

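#include <string>
#include <vector>
#include <boost/python.hpp>
using namespace std;

// Assumed declaration of the existing C++ function we want to wrap;
// the signature is inferred from the call below.
void _afunction(int N, vector<string> strings);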

void _afunction_wrapper(int N, boost::python::list mapping) {

    int mapping_length = boost::python::extract<int>(mapping.attr("__len__")());
    //Do Error checking, the mapping needs to be at least as long as N 
    if (mapping_length < N) {
        PyErr_SetString(PyExc_ValueError,
            "The string mapping must be at least of length N");
        boost::python::throw_error_already_set();
        return;
    }

    vector<string> mystrings(mapping_length);
    for (int i=0; i<mapping_length; i++) {
        mystrings[i] = boost::python::extract<char const *>(mapping[i]);
    }

   //now call our C++ function
   _afunction(N, mystrings);

}

using namespace boost::python;
BOOST_PYTHON_MODULE(c_afunction)
{
    def("afunction", _afunction_wrapper);
}




Extending with Pyrex


Pyrex is a compiler of Python-like source code to the C language, intended to make it relatively easy for Python programmers to write fast Python extension modules without having to learn the C language.

Cython is an actively developed derivative of Pyrex. Pyrex is no longer actively developed, the last stable release being from 2010.



Extending with ctypes


ctypes is a foreign function interface module for Python (included with Python 2.5 and above), which allows you to load in dynamic libraries and call C functions. This is not technically extending Python, but it serves one of the primary reasons for extending Python: to interface with external C code.

Basics

A library is loaded using ctypes.CDLL. After you load the library, the functions inside the library are already usable as regular Python calls. For example, if we wanted to forgo the standard Python print statement and use the standard C library function, printf, we would use this:

from ctypes import *
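libName = 'libc.so.6' # If you're on Linux (a plain 'libc.so' is often a linker script that cannot be loaded)
libName = 'msvcrt.dll' # If you're on Windows
libc = CDLL(libName)
libc.printf(b"Hello, World!\n") # A byte string (Python 2.6+), so the call also works on Python 3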

Of course, you must use the libName line that matches your operating system, and delete the other. If all goes well, you should see the infamous Hello World string at your console.

Getting Return Values

ctypes assumes, by default, that any given function's return type is a signed integer of native size. Sometimes you don't want the function to return anything, and other times, you want the function to return other types. Every ctypes function has an attribute called restype. When you assign a ctypes class to restype, it automatically casts the function's return value to that type.
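
As a sketch of how restype is used, the following example calls the C standard library function atof, which returns a double rather than an int. It assumes a POSIX system such as Linux: ctypes.util.find_library is used to locate the C library, and if it returns None, CDLL(None) falls back to the symbols already loaded into the current process.

```python
import ctypes
import ctypes.util

# Locate and load the C standard library (assumption: POSIX system)
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: double atof(const char *)
libc.atof.argtypes = [ctypes.c_char_p]
libc.atof.restype = ctypes.c_double

value = libc.atof(b"2.5")
print(value)
```

Without the restype assignment, ctypes would interpret the return value as a C int, which is not meaningful for a function that returns a double.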

Common Types

ctypes name   C type               Python type   Notes
None          void                 None          the None object
c_bool        C99 _Bool            bool
c_byte        signed char          int
c_char        char                 str           length of one
c_char_p      char *               str
c_double      double               float
c_float       float                float
c_int         signed int           int
c_long        signed long          long
c_longlong    signed long long     long
c_short       signed short         int
c_ubyte       unsigned char        int
c_uint        unsigned int         int
c_ulong       unsigned long        long
c_ulonglong   unsigned long long   long
c_ushort      unsigned short       int
c_void_p      void *               int
c_wchar       wchar_t              unicode       length of one
c_wchar_p     wchar_t *            unicode
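
The types in the table can also be instantiated directly, which is a convenient way to see how Python values are converted. This is a small sketch using only the ctypes module; the variable names are chosen for illustration.

```python
import ctypes

number = ctypes.c_int(42)
ratio = ctypes.c_double(1.5)
text = ctypes.c_char_p(b"hello")

# The .value attribute converts back to the Python type from the table
print(number.value)
print(ratio.value)
print(text.value)

# Integer types do no overflow checking: the value wraps around as in C
small = ctypes.c_ubyte(256 + 3)
print(small.value)
```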



Extending with Perl


It is possible to call Perl functions and modules in Python. One way to do that is the PyPerl Module. It is not developed actively any more and for it to work in newer versions, one has to use this version and apply the patches.

import perl
perl.eval( "use lib './' ")
perl.require( 'Module::ModuleName' )

obj = perl.callm("new", 'Module::ModuleName'  )
obj[ '_attr1' ] = 9
obj[ '_attr2' ] = 42
obj.fxn1()

It is thus possible to handle Perl objects, change their attributes and call their methods.



Popularity

The popularity of a programming language is one factor that people consider when choosing a language for a task or a project. Beware that even a good programming language can be unpopular.

The popularity of Python can be estimated with the help of various indices, which do not necessarily reflect real-world popularity. While the statistics are linked from the external links section below, it is advisable that you take them with a grain of salt.

Python is the main scripting language used by Google, per the Google Python Style Guide.

Python is embedded in various software packages to support their extensibility and automation, including GIMP and Inkscape; see also the Wikipedia articles "Python (programming language)" (section "Uses") and "List of Python software" (section "Embedded as a scripting language").



Links

Web resources:

Python.org:

Online books:

Various:



Authors


Authors of Python textbook

  • Quartz25
  • Jesdisciple
  • Hannes Röst
  • David Ross
  • Lawrence D’Oliveiro
  • BLibrestez55




Library Modules

This is a list of Python modules in the standard library as of Python 3.6.

  • __future__: Future statement definitions
  • __main__: The environment where the top-level script is run.
  • _dummy_thread: Drop-in replacement for the _thread module.
  • _thread: Low-level threading API.
  • abc: Abstract base classes according to PEP 3119.
  • aifc: Read and write audio files in AIFF or AIFC format.
  • argparse: Command-line option and argument parsing library.
  • array: Space efficient arrays of uniformly typed numeric values.
  • ast: Abstract Syntax Tree classes and manipulation.
  • asynchat: Support for asynchronous command/response protocols.
  • asyncio: Asynchronous I/O, event loop, coroutines and tasks.
  • asyncore: A base class for developing asynchronous socket handling services.
  • atexit: Register and execute cleanup functions.
  • audioop: Manipulate raw audio data.
  • base64: RFC 3548: Base16, Base32, Base64 Data Encodings; Base85 and Ascii85
  • bdb: Debugger framework.
  • binascii: Tools for converting between binary and various ASCII-encoded binary representations.
  • binhex: Encode and decode files in binhex4 format.
  • bisect: Array bisection algorithms for binary searching.
  • builtins: The module that provides the built-in namespace.
  • bz2: Interfaces for bzip2 compression and decompression.
  • calendar: Functions for working with calendars, including some emulation of the Unix cal program.
  • cgi: Helpers for running Python scripts via the Common Gateway Interface.
  • cgitb: Configurable traceback handler for CGI scripts.
  • chunk: Module to read IFF chunks.
  • cmath: Mathematical functions for complex numbers.
  • cmd: Build line-oriented command interpreters.
  • code: Facilities to implement read-eval-print loops.
  • codecs: Encode and decode data and streams.
  • codeop: Compile (possibly incomplete) Python code.
  • collections: Container datatypes
  • colorsys: Conversion functions between RGB and other color systems.
  • compileall: Tools for byte-compiling all Python source files in a directory tree.
  • concurrent:
  • configparser: Configuration file parser.
  • contextlib: Utilities for with-statement contexts.
  • copy: Shallow and deep copy operations.
  • copyreg: Register pickle support functions.
  • cProfile
  • crypt (Unix): The crypt() function used to check Unix passwords.
  • csv: Write and read tabular data to and from delimited files.
  • ctypes: A foreign function library for Python.
  • curses (Unix): An interface to the curses library, providing portable terminal handling.
  • datetime: Basic date and time types.
  • dbm: Interfaces to various Unix "database" formats.
  • decimal: Implementation of the General Decimal Arithmetic Specification.
  • difflib: Helpers for computing differences between objects.
  • dis: Disassembler for Python bytecode.
  • distutils: Support for building and installing Python modules into an existing Python installation.
  • doctest: Test pieces of code within docstrings.
  • dummy_threading: Drop-in replacement for the threading module.
  • email: Package supporting the parsing, manipulation, and generation of email messages.
  • encodings:
  • ensurepip: Bootstrapping the "pip" installer into an existing Python installation or virtual environment.
  • enum: Implementation of an enumeration class.
  • errno: Standard errno system symbols.
  • faulthandler: Dump the Python traceback.
  • fcntl (Unix): The fcntl() and ioctl() system calls.
  • filecmp: Compare files efficiently.
  • fileinput: Loop over standard input or a list of files.
  • fnmatch: Unix shell style filename pattern matching.
  • formatter: Deprecated: Generic output formatter and device interface.
  • fpectl (Unix): Provide control for floating point exception handling.
  • fractions: Rational numbers.
  • ftplib: FTP protocol client (requires sockets).
  • functools: Higher-order functions and operations on callable objects.
  • gc: Interface to the cycle-detecting garbage collector.
  • getopt: Portable parser for command line options; supports both short and long option names.
  • getpass: Portable reading of passwords and retrieval of the userid.
  • gettext: Multilingual internationalization services.
  • glob: Unix shell style pathname pattern expansion.
  • grp (Unix): The group database (getgrnam() and friends).
  • gzip: Interfaces for gzip compression and decompression using file objects.
  • hashlib: Secure hash and message digest algorithms.
  • heapq: Heap queue algorithm (a.k.a. priority queue).
  • hmac: Keyed-Hashing for Message Authentication (HMAC) implementation
  • html: Helpers for manipulating HTML.
  • http: HTTP status codes and messages
  • imaplib: IMAP4 protocol client (requires sockets).
  • imghdr: Determine the type of image contained in a file or byte stream.
  • imp: Deprecated: Access the implementation of the import statement.
  • importlib: The implementation of the import machinery.
  • inspect: Extract information and source code from live objects.
  • io: Core tools for working with streams.
  • ipaddress: IPv4/IPv6 manipulation library.
  • itertools: Functions creating iterators for efficient looping.
  • json: Encode and decode the JSON format.
  • keyword: Test whether a string is a keyword in Python.
  • lib2to3: The 2to3 library.
  • linecache: This module provides random access to individual lines from text files.
  • locale: Internationalization services.
  • logging: Flexible event logging system for applications.
  • lzma: A Python wrapper for the liblzma compression library.
  • macpath: Mac OS 9 path manipulation functions.
  • mailbox: Manipulate mailboxes in various formats
  • mailcap: Mailcap file handling.
  • marshal: Convert Python objects to streams of bytes and back (with different constraints).
  • math: Mathematical functions (sin() etc.).
  • mimetypes: Mapping of filename extensions to MIME types.
  • mmap: Interface to memory-mapped files for Unix and Windows.
  • modulefinder: Find modules used by a script.
  • msilib (Windows): Creation of Microsoft Installer files, and CAB files.
  • msvcrt (Windows): Miscellaneous useful routines from the MS VC++ runtime.
  • multiprocessing: Process-based parallelism.
  • netrc: Loading of .netrc files.
  • nis (Unix): Interface to Sun's NIS (Yellow Pages) library.
  • nntplib: NNTP protocol client (requires sockets).
  • numbers: Numeric abstract base classes (Complex, Real, Integral, etc.).
  • operator: Functions corresponding to the standard operators.
  • optparse: Deprecated: Command-line option parsing library.
  • os: Miscellaneous operating system interfaces.
  • ossaudiodev (Linux, FreeBSD): Access to OSS-compatible audio devices.
  • parser: Access parse trees for Python source code.
  • pathlib: Object-oriented filesystem paths
  • pdb: The Python debugger for interactive interpreters.
  • pickle: Convert Python objects to streams of bytes and back.
  • pickletools: Contains extensive comments about the pickle protocols and pickle-machine opcodes, as well as some useful functions.
  • pipes (Unix): A Python interface to Unix shell pipelines.
  • pkgutil: Utilities for the import system.
  • platform: Retrieves as much platform identifying data as possible.
  • plistlib: Generate and parse Mac OS X plist files.
  • poplib: POP3 protocol client (requires sockets).
  • posix (Unix): The most common POSIX system calls (normally used via module os).
  • pprint: Data pretty printer.
  • profile: Python source profiler.
  • pstats: Statistics object for use with the profiler.
  • pty (Linux): Pseudo-Terminal Handling for Linux.
  • pwd (Unix): The password database (getpwnam() and friends).
  • py_compile: Generate byte-code files from Python source files.
  • pyclbr: Supports information extraction for a Python class browser.
  • pydoc: Documentation generator and online help system.
  • queue: A synchronized queue class.
  • quopri: Encode and decode files using the MIME quoted-printable encoding.
  • random: Generate pseudo-random numbers with various common distributions.
  • re: Regular expression operations.
  • readline (Unix): GNU readline support for Python.
  • reprlib: Alternate repr() implementation with size limits.
  • resource (Unix): An interface to provide resource usage information on the current process.
  • rlcompleter: Python identifier completion, suitable for the GNU readline library.
  • runpy: Locate and run Python modules without importing them first.
  • sched: General purpose event scheduler.
  • secrets: Generate secure random numbers for managing secrets.
  • select: Wait for I/O completion on multiple streams.
  • selectors: High-level I/O multiplexing.
  • shelve: Python object persistence.
  • shlex: Simple lexical analysis for Unix shell-like languages.
  • shutil: High-level file operations, including copying.
  • signal: Set handlers for asynchronous events.
  • site: Module responsible for site-specific configuration.
  • smtpd: An SMTP server implementation in Python.
  • smtplib: SMTP protocol client (requires sockets).
  • sndhdr: Determine type of a sound file.
  • socket: Low-level networking interface.
  • socketserver: A framework for network servers.
  • spwd (Unix): The shadow password database (getspnam() and friends).
  • sqlite3: A DB-API 2.0 implementation using SQLite 3.x.
  • ssl: TLS/SSL wrapper for socket objects
  • stat: Utilities for interpreting the results of os.stat(), os.lstat() and os.fstat().
  • statistics: Mathematical statistics functions.
  • string: Common string operations.
  • stringprep: String preparation, as per RFC 3454.
  • struct: Interpret bytes as packed binary data.
  • subprocess: Subprocess management.
  • sunau: Provide an interface to the Sun AU sound format.
  • symbol: Constants representing internal nodes of the parse tree.
  • symtable: Interface to the compiler's internal symbol tables.
  • sys: Access system-specific parameters and functions.
  • sysconfig: Python's configuration information
  • syslog (Unix): An interface to the Unix syslog library routines.
  • tabnanny: Tool for detecting white space related problems in Python source files in a directory tree.
  • tarfile: Read and write tar-format archive files.
  • telnetlib: Telnet client class.
  • tempfile: Generate temporary files and directories.
  • termios (Unix): POSIX style tty control.
  • test: Regression tests package containing the testing suite for Python.
  • textwrap: Text wrapping and filling
  • threading: Thread-based parallelism.
  • time: Time access and conversions.
  • timeit: Measure the execution time of small code snippets.
  • tkinter: Interface to Tcl/Tk for graphical user interfaces
  • token: Constants representing terminal nodes of the parse tree.
  • tokenize: Lexical scanner for Python source code.
  • trace: Trace or track Python statement execution.
  • traceback: Print or retrieve a stack traceback.
  • tracemalloc: Trace memory allocations.
  • tty (Unix): Utility functions that perform common terminal control operations.
  • turtle: An educational framework for simple graphics applications
  • turtledemo: A viewer for example turtle scripts
  • types: Names for built-in types.
  • typing: Support for type hints (see PEP 484).
  • unicodedata: Access the Unicode Database.
  • unittest: Unit testing framework for Python.
  • urllib:
  • uu: Encode and decode files in uuencode format.
  • uuid: UUID objects (universally unique identifiers) according to RFC 4122
  • venv: Creation of virtual environments.
  • warnings: Issue warning messages and control their disposition.
  • wave: Provide an interface to the WAV sound format.
  • weakref: Support for weak references and weak dictionaries.
  • webbrowser: Easy-to-use controller for Web browsers.
  • winreg (Windows): Routines and objects for manipulating the Windows registry.
  • winsound (Windows): Access to the sound-playing machinery for Windows.
  • wsgiref: WSGI Utilities and Reference Implementation.
  • xdrlib: Encoders and decoders for the External Data Representation (XDR).
  • xml: Package containing XML processing modules
  • zipapp: Manage executable Python zip archives.
  • zipfile: Read and write ZIP-format archive files.
  • zipimport: Support for importing Python modules from ZIP archives.
  • zlib: Low-level interface to compression and decompression routines compatible with gzip.
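
Not every module in the list above is present in every installation: some are marked as platform-specific (for example winreg on Windows or curses on Unix), and some were removed in releases later than 3.6 (asyncore, for instance, is no longer shipped as of Python 3.12). The following is a minimal sketch, not an official tool, of how one might check at run time whether a module can be located. The helper name is_available is an assumption made for this example; importlib.util.find_spec is part of the standard library.

```python
import importlib.util


def is_available(module_name):
    """Return True if a top-level module can be located for import."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # Some special modules expose no usable import spec;
        # this sketch simply treats them as unavailable.
        return False


# A handful of names taken from the list above. The result for each
# depends on the platform and on the Python version in use.
for name in ["json", "pathlib", "winreg", "curses", "asyncore"]:
    status = "available" if is_available(name) else "not available"
    print(name + ": " + status)
```

The check only locates the module without executing it, so it has no import side effects for top-level names.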



Naming conventions

There are various naming conventions used in Python programs.

PEP 8 specifies naming conventions for class names (e.g. GenericTree), package and module names (e.g. generictree and generic_tree), function and variable names (e.g. storestate or store_state; mixedCase is discouraged), and more. The Google Python Style Guide follows similar naming conventions.

The above stands in contrast to Java naming convention (e.g. storeState for method names) and C# naming convention (e.g. StoreState for method names).
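
The conventions described above can be illustrated with a short sketch. The names GenericTree and store_state come from the text; MAX_DEPTH, add_child and _saved_state are hypothetical names introduced only for this example.

```python
# By convention this file would be named generic_tree.py:
# module names are lowercase, with underscores where they aid readability.

MAX_DEPTH = 10  # constants: upper case with underscores


class GenericTree:  # class names: CapWords
    def __init__(self, value):
        self.value = value
        self.children = []         # attributes: lowercase
        self._saved_state = None   # leading underscore: non-public

    def add_child(self, value):    # methods: lowercase with underscores
        child = GenericTree(value)
        self.children.append(child)
        return child

    def store_state(self):         # store_state rather than storeState
        self._saved_state = [child.value for child in self.children]
        return self._saved_state


tree = GenericTree("root")
tree.add_child("leaf")
print(tree.store_state())
```

In Java the same method would conventionally be written storeState, and in C# StoreState.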

In Python 2, the standard library contains multiple deviations from PEP 8. For instance, the Tkinter module is spelled with a capital T; it was renamed to tkinter in Python 3.
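
Code that has to run under both Python 2 and Python 3 sometimes works around such renames by trying the new spelling first and falling back to the old one. The following is a sketch of that pattern; the helper import_first is a hypothetical name introduced for this example. ConfigParser, renamed to configparser in Python 3, is included as a second renamed module.

```python
import importlib


def import_first(*names):
    """Return the first module in names that can be imported, else None."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


# Python 3 spelling first, Python 2 spelling as a fallback.
# tk is None when Tk support is not installed at all.
tk = import_first("tkinter", "Tkinter")
configparser = import_first("configparser", "ConfigParser")

print(configparser.__name__)
```

The rest of the program can then refer to tk and configparser without caring which spelling was found.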

This article is issued from Wikibooks. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.