Knowledge Base/Python/General

From Thalesians

The Zen of Python

What is the Zen of Python? To find out, enter

>>> import this

at the Python interpreter prompt. (This is an Easter egg.) You will see the following:

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!

A simple main function

This is pretty standard:

#!/usr/bin/env python

import sys

def main(argv):
    # ...

    return 0

if __name__ == "__main__":
    sys.exit(main(sys.argv))

Note: if you know exactly where Python is installed, e.g. under /usr/local/bin/python, it is also possible to start with

#!/usr/local/bin/python

/usr/bin/env python searches $PATH for python and runs it, so it works wherever the interpreter happens to be installed.

Checking if a variable is a number (or number string)

Numbers (ints, floats, etc.) have a special __int__ method, so we can simply check if it exists:

def isNumber(x):
    return hasattr(x, '__int__')

This will return True for 5, -5 and 5.0, for instance. But what if we have a string representation? We are going to get False for each of "5", "-5" and "5.0". These are not numbers, they are strings.

What if we want to return True for strings representing numbers? We can use the following:

def isNumberOrNumberString(x):
    if isNumber(x): return True
    try:
        int(x)
    except ValueError:
        try:
            float(x)
        except ValueError:
            return False
    return True
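As an alternative to checking for __int__, here is a minimal sketch, assuming Python 3, that relies on the numbers.Number abstract base class from the standard library. The names is_number and is_number_or_number_string are made up for this illustration, and excluding bool is a choice of this sketch rather than a rule.

```python
import numbers

def is_number(x):
    # bool is a subclass of int, so we exclude it explicitly (a choice of this sketch).
    return isinstance(x, numbers.Number) and not isinstance(x, bool)

def is_number_or_number_string(x):
    if is_number(x):
        return True
    try:
        # float() accepts everything int() accepts, plus strings such as "5.0".
        float(x)
    except (TypeError, ValueError):
        # TypeError covers objects like None or lists; ValueError covers bad strings.
        return False
    return True

print(is_number(5), is_number("5"))
print(is_number_or_number_string("-5"), is_number_or_number_string("abc"))
```

Note that float() also accepts strings such as "nan" and "1e3", so this check is more permissive than it may look.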

repr() versus str()

The difference between repr() and str() in Python may not be immediately apparent.

According to the documentation, repr() returns the "official" string representation of an object. "If at all possible, this should look like a valid Python expression that could be used to recreate an object with the same value (given an appropriate environment). If this is not possible, a string of the form <...some useful description...> should be returned." In general, repr(o) should return a string representation of o such that the identity

o == eval(repr(o))

holds. eval() takes an (official) string representation of an object and returns a copy of that object constructed from this string representation.

On the other hand, str() returns an "informal" string representation of an object. "This differs from repr() in that it does not have to be a valid Python expression: a more convenient or concise representation may be used instead." In general, this representation should be human readable. There is no requirement for the identity

o == eval(str(o))

to hold.

Let us look at a few examples.

print str("Paul's test string")

prints

Paul's test string

while

print repr("Paul's test string")

prints

"Paul's test string"

The latter is a valid Python expression, the former is not.

print str(1.0 / 3.0)

prints

0.333333333333

while

print repr(1.0 / 3.0)

prints

0.33333333333333331

The latter attempts to give enough decimal figures to enable the value to be reconstructed to maximum precision.

It's a bit surprising that

print str([3, "paul's test string", 5.5, "bar", 7, 1.0 / 3.0])

and

print repr([3, "paul's test string", 5.5, "bar", 7, 1.0 / 3.0])

both print

[3, "paul's test string", 5.5, 'bar', 7, 0.33333333333333331]

on ActivePython 2.5.2.2. The reason is that str() of a list calls repr(), not str(), on each of the elements, so strings inside a list are always shown with their quotes.
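If the informal representation of the elements is what we want, we have to build it ourselves. The following is a small sketch, assuming Python 3; the helper name informal is made up for this illustration.

```python
items = [3, "paul's test string", 5.5, "bar"]

# For a list, str() and repr() give the same text: both use repr() on the elements.
same = str(items) == repr(items)

def informal(seq):
    # Hypothetical helper: format a sequence using str() on each element.
    return "[" + ", ".join(str(item) for item in seq) + "]"

print(same)
print(informal(items))  # [3, paul's test string, 5.5, bar]
```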

Finally, for user-defined classes, repr() calls the __repr__() method, while str() calls the __str__() method. Here is an implementation of a simple class that provides both __repr__() and __str__() and conforms to the requirements imposed by the documentation:

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if hasattr(other, "x") and hasattr(other, "y"):
            return (self.x == other.x) and (self.y == other.y)
        else:
            return False

    def __ne__(self, other):
        return not self.__eq__(other)

    def __str__(self):
        return "(%s, %s)" % (str(self.x), str(self.y))

    def __repr__(self):
        return "Point(%s, %s)" % (repr(self.x), repr(self.y))

Thus

pt = Point(3, 5)
print pt
print str(pt)
print repr(pt)
print eval(str(pt)) == pt
print eval(repr(pt)) == pt

prints

(3, 5)
(3, 5)
Point(3, 5)
False
True

The first two lines are identical because print calls str(), and hence __str__, on an object passed to it. Notice that the result of repr(pt) can be used to reconstruct the Point object with eval().

Overridable properties in Python

class Foo(object):
    _a = 7

    def get_a(self):
        return self._a

    def set_a(self, a):
        self._a = a

    A = property(fget=get_a, fset=set_a)

class Bar(Foo):
    _newA = 5

    def get_a(self):
        return self._newA

    def set_a(self, a):
        self._newA = a

f = Foo()
print f.A

b = Bar()
print b.A

If Foo.get_a is overridden by Bar.get_a we would expect to see the output

7
5

But instead we see

7
7

This is because in the line

A = property(fget=get_a, fset=set_a)

the binding occurs early: fget and fset are bound to Foo.get_a and Foo.set_a when the body of Foo is executed, for good. Overriding get_a in Bar does not change what the property calls.

However, Python enables one to create overridable properties. The following implementation does the trick:

class OProperty(object):
    """Based on the emulation of PyProperty_Type() in Objects/descrobject.c"""

    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel
        self.__doc__ = doc

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        if self.fget is None:
            raise AttributeError("unreadable attribute")
        if self.fget.__name__ == '<lambda>' or not self.fget.__name__:
            return self.fget(obj)
        else:
            return getattr(obj, self.fget.__name__)()

    def __set__(self, obj, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        if self.fset.__name__ == '<lambda>' or not self.fset.__name__:
            self.fset(obj, value)
        else:
            getattr(obj, self.fset.__name__)(value)

    def __delete__(self, obj):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        if self.fdel.__name__ == '<lambda>' or not self.fdel.__name__:
            self.fdel(obj)
        else:
            getattr(obj, self.fdel.__name__)()

It was taken from the article An Overridable Alternative to the property Function in Python, where you can find the full details.
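A lighter-weight way to get the same effect, sketched below under the assumption of Python 3, is to keep the built-in property but pass it lambdas that look the accessors up on the instance at call time. This is not the approach of the article mentioned above, just a common alternative.

```python
class Foo(object):
    _a = 7

    def get_a(self):
        return self._a

    def set_a(self, a):
        self._a = a

    # The lambdas resolve get_a/set_a on the instance when the property is used,
    # so overrides in subclasses are picked up.
    A = property(fget=lambda self: self.get_a(),
                 fset=lambda self, a: self.set_a(a))

class Bar(Foo):
    _newA = 5

    def get_a(self):
        return self._newA

    def set_a(self, a):
        self._newA = a

f = Foo()
b = Bar()
print(f.A)  # 7
print(b.A)  # 5
```

The price is one extra function call per access, which is usually negligible.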

Converting a list to a dict, value to index

mylist = ["foo", "bar", "baz"]
print dict([(mylist[i], i) for i in range(0, len(mylist))])

prints

{'baz': 2, 'foo': 0, 'bar': 1}
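The same mapping can be built without explicit indexing. Here is a short sketch, assuming Python 3 (dict comprehensions are also available from Python 2.7); the name index_by_value is made up for this illustration.

```python
mylist = ["foo", "bar", "baz"]

# enumerate yields (index, value) pairs; we flip them into value -> index.
index_by_value = {value: index for index, value in enumerate(mylist)}
print(index_by_value)
```

If the list contains duplicates, the last index of each value wins, just as in the dict(...) version.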

Iterating through all key-value pairs in a dict

Very often we want to iterate through all the key-value pairs in a dict:

d = {"Name": "Paul", "Surname": "Bilokon"}

for key, value in d.items():
    print "%s = %s" % (key, value)

This produces

Surname = Bilokon
Name = Paul

If we just want to iterate through the keys, we use

for key in d.keys():
    print key

If we just want to iterate through the values, we use

for value in d.values():
    print value

What happens if we use the syntax

for x in d:
    print x

Perhaps counterintuitively, this will iterate through the keys, not values.

Iterating through two or more lists in parallel

Use zip:

names = ["Isaac", "Carl Friedrich", "Evariste", "John"]
surnames = ["Newton", "Gauss", "Galois", "von Neumann"]
ages = [84, 77, 20, 53]

for n, s, a in zip(names, surnames, ages):
    print "NAME: %s, SURNAME: %s, AGE: %d" % (n, s, a)

The result looks as follows:

NAME: Isaac, SURNAME: Newton, AGE: 84
NAME: Carl Friedrich, SURNAME: Gauss, AGE: 77
NAME: Evariste, SURNAME: Galois, AGE: 20
NAME: John, SURNAME: von Neumann, AGE: 53

zip truncates the results to the length of the shortest list:

exponents = [2, 3, 5, 7, 9]
primes = [3, 7, 31, 127]

print zip(exponents, primes)

for e, p in zip(exponents, primes):
    print "2^%d - 1 ... Mersenne prime: %d" % (e, p)

This prints

[(2, 3), (3, 7), (5, 31), (7, 127)]
2^2 - 1 ... Mersenne prime: 3
2^3 - 1 ... Mersenne prime: 7
2^5 - 1 ... Mersenne prime: 31
2^7 - 1 ... Mersenne prime: 127

Alternatively, you can use map(None, exponents, primes). This will pad the shorter lists with None:

print map(None, exponents, primes)

for e, p in map(None, exponents, primes):
    print e, p

The results are as follows:

[(2, 3), (3, 7), (5, 31), (7, 127), (9, None)]
2 3
3 7
5 31
7 127
9 None
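Note that map(None, ...) is a Python 2 idiom; in Python 3 map no longer accepts None as the function. A minimal sketch of the padding behaviour, assuming Python 3, uses itertools.zip_longest (called izip_longest in Python 2.6 and 2.7):

```python
from itertools import zip_longest

exponents = [2, 3, 5, 7, 9]
primes = [3, 7, 31, 127]

# Shorter inputs are padded with None by default.
padded = list(zip_longest(exponents, primes))
print(padded)

# A different padding value can be chosen with fillvalue.
padded_with_zero = list(zip_longest(exponents, primes, fillvalue=0))
print(padded_with_zero[-1])
```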

Building a dictionary from two lists

This is easy. Use zip or map as shown above:

names = ["Isaac", "Carl Friedrich", "Evariste", "John"]
ages = [84, 77, 20, 53]
print dict(zip(names, ages))

exponents = [2, 3, 5, 7, 9]
primes = [3, 7, 31, 127]
print dict(map(None, exponents, primes))

This prints

{'Isaac': 84, 'John': 53, 'Carl Friedrich': 77, 'Evariste': 20}
{9: None, 2: 3, 3: 7, 5: 31, 7: 127}

Conditionals in list comprehensions

It is possible to use if in list comprehensions in two distinct ways. This is best illustrated by examples:

tradeSigns = [-1, 1, 1, -1, -1, -1, 1, -1]
tradeDirections = ["Sell" for ts in tradeSigns if ts == -1]

This has set tradeDirections to

['Sell', 'Sell', 'Sell', 'Sell', 'Sell']

In other words, we have pre-filtered tradeSigns and ignored its elements equal to 1. Thus we skipped the 1's and obtained five elements in the resulting tradeDirections, rather than eight.

We could also do this:

tradeDirections = ["Sell" if ts == -1 else "Buy" for ts in tradeSigns]

In this case tradeDirections is set to

['Sell', 'Buy', 'Buy', 'Sell', 'Sell', 'Sell', 'Buy', 'Sell']

perhaps in line with our original intentions. We didn't pre-filter tradeSigns and processed all its elements (thus we get eight elements in the result) but chose to replace the -1's with "Sell" and the 1's with "Buy".

In each case we used if but resorted to different syntax.

Filtering one list by another

Suppose you have defined

names = ["Paul", "Alex", "John", "Simon", "Paul", "Michael"]
surnames = ["Smith", "Jones", "Taylor", "Williams", "Brown", "Green"]

and now you want to print out the surnames of all Pauls. This can be achieved by using list comprehensions:

print [surnames[i] for i in range(len(names)) if names[i] == "Paul"]

will produce the output

['Smith', 'Brown']
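The index bookkeeping can be avoided by walking the two lists in parallel with zip. A short sketch, assuming Python 3; the variable name pauls is made up for this illustration.

```python
names = ["Paul", "Alex", "John", "Simon", "Paul", "Michael"]
surnames = ["Smith", "Jones", "Taylor", "Williams", "Brown", "Green"]

# zip pairs each name with the surname at the same position.
pauls = [surname for name, surname in zip(names, surnames) if name == "Paul"]
print(pauls)  # ['Smith', 'Brown']
```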

Checking if an object is a sequence or is iterable

If o is your object, you can use the following check:

if hasattr(o, "__iter__"):
    # ...

The following code

print hasattr(5, "__iter__")
print hasattr([1, 2, 3, 4, 5], "__iter__")
print hasattr([5], "__iter__")
print hasattr((5), "__iter__")
print hasattr((5,), "__iter__")
print hasattr((3, 2), "__iter__")
print hasattr("asdf", "__iter__")

prints

False
True
True
False
True
True
False
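Be aware that the last result is specific to Python 2: in Python 3 strings do have an __iter__ method, so the hasattr check returns True for "asdf". The sketch below, assuming Python 3, asks iter() directly (which also covers objects that only implement __getitem__) and treats strings as a separate case. Both function names are made up for this illustration.

```python
def is_iterable(o):
    # iter() raises TypeError for objects that cannot be iterated over.
    try:
        iter(o)
    except TypeError:
        return False
    return True

def is_non_string_iterable(o):
    # Strings are iterable, but we often want to treat them as scalars.
    return is_iterable(o) and not isinstance(o, (str, bytes))

for candidate in [5, [5], (5), (5,), "asdf"]:
    print(repr(candidate), is_iterable(candidate), is_non_string_iterable(candidate))
```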

Implementing functors in Python

Any object with a __call__() method may be called using the function call syntax:

class Scale(object):
    def __init__(self, factor):
        self.factor = factor

    def __call__(self, arg):
        return self.factor * arg

s = Scale(2)

print s(5)

The functor can have more than one argument:

import math

class Pythagoras(object):
    def __init__(self):
        pass

    def __call__(self, arg1, arg2):
        return math.sqrt(arg1 * arg1 + arg2 * arg2)

p = Pythagoras()

print p(3, 4)

Local variables in lambda expressions

We see that x + y is calculated twice in the following lambda expression:

func1 = lambda x, y, z: (x + y + z) / (x + y - z)

Can we compute it once and make it a local variable? One solution is to use a helper lambda expression:

func2 = lambda x, y, z: (lambda sum=x + y: (sum + z) / (sum - z))()

Now both

print func1(3.0, 5.0, 7.0)

and

print func2(3.0, 5.0, 7.0)

print the same number:

15.0

Instantiating a Python object dynamically by object class name

Use eval:

def forname(modname, classname):
    ''' Returns a class of "classname" from module "modname". '''
    module = __import__(modname)
    classobj = getattr(module, classname)
    return classobj

class Foo(object):
    def introduction(self):
        print "I am FOO"

class Bar(object):
    def introduction(self):
        print "I am BAR"

className = "Foo"
o = eval("%s()" % className)
o.introduction()

This will print

I am FOO

If, on the other hand, you set className to "Bar", you will see

I am BAR
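eval will happily execute any expression it is given, so it is worth knowing how to do this without it. The following is a sketch, assuming Python 3: forname is rewritten on top of importlib.import_module, and a plain dict (here called registry, a name made up for this illustration) maps names to classes defined in the current module.

```python
import importlib

def forname(modname, classname):
    """Return the class named classname from the module named modname."""
    module = importlib.import_module(modname)
    return getattr(module, classname)

# Look up a standard-library class by name and instantiate it.
ordered = forname("collections", "OrderedDict")()

class Foo(object):
    def introduction(self):
        return "I am FOO"

class Bar(object):
    def introduction(self):
        return "I am BAR"

# For classes defined locally, an explicit name-to-class mapping is enough.
registry = {"Foo": Foo, "Bar": Bar}

className = "Foo"
o = registry[className]()
print(o.introduction())
```

An unknown name now fails with a KeyError or AttributeError instead of running arbitrary code.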

Sending the output to STDERR rather than STDOUT

Instead of

print "Hello"

use

import sys

sys.stderr.write("Hello\n")

Making a path, rather than just making a directory

If we try the following

import os

os.mkdir("foo/bar/baz")

while foo/bar does not exist, the call fails and foo/bar/baz is not created. Depending on the operating system, we may see something like

Traceback (most recent call last):
  File "test.py", line 4, in <module>
    os.mkdir("foo/bar/baz")
WindowsError: [Error 3] The system cannot find the path specified: 'foo/bar/baz'

But instead we can use

import distutils.dir_util

distutils.dir_util.mkpath("foo/bar/baz")

mkpath will create baz and any missing ancestor directories. If the directory already exists, it will do nothing.

For more information on the useful module distutils see the official documentation.
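The os module offers the same facility through os.makedirs, which is worth knowing because distutils has been removed from the standard library as of Python 3.12. A minimal sketch, assuming Python 3.2 or later (for the exist_ok parameter); the temporary directory only serves to keep the example self-contained.

```python
import os
import tempfile

base = tempfile.mkdtemp()  # scratch directory for the example
target = os.path.join(base, "foo", "bar", "baz")

os.makedirs(target, exist_ok=True)  # creates foo, foo/bar and foo/bar/baz
os.makedirs(target, exist_ok=True)  # already there: does nothing

print(os.path.isdir(target))
```

Without exist_ok=True the second call would raise an error because the directory already exists.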

Reading a text file backwards

Reading a text file backwards is a relatively common task. Let me explain first what I mean by backwards: you read the file line by line, starting from the last line and progressing towards the first.

Why would you need this? Imagine that you have a large CSV (comma separated value) with numerous records sorted in ascending order by date/time. You want to read the last N records. Using the standard text file input/output machinery you would probably end up reading the entire file, discarding all but the last N records. Extremely wasteful. Chances are you will have more than one such file.

I have written a Python module to help you: backwards_text_file.py. You can download it from the Downloads page.
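To give a flavour of the idea, here is a rough sketch of reading the last N lines by seeking backwards from the end of the file in fixed-size blocks. It is not the backwards_text_file.py module mentioned above; the function name last_lines, the block_size parameter and the assumption of a UTF-8 file with "\n" line endings are all choices made for this illustration (Python 3).

```python
import os
import tempfile

def last_lines(path, n, block_size=4096):
    """Return the last n lines of a text file without reading all of it."""
    if n <= 0:
        return []
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        position = f.tell()
        data = b""
        # Read blocks from the end until we have seen more than n newlines
        # (so the last n lines are complete) or we reach the start of the file.
        while position > 0 and data.count(b"\n") <= n:
            step = min(block_size, position)
            position -= step
            f.seek(position)
            data = f.read(step) + data
    lines = data.splitlines()
    # The first collected line may be partial; the last n are always whole.
    return [line.decode("utf-8") for line in lines[-n:]]

# Demo on a throwaway file with 1000 records.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("".join("2009-10-14,%d\n" % i for i in range(1000)))

tail = last_lines(tmp.name, 3)
tail_small_blocks = last_lines(tmp.name, 3, block_size=7)
print(tail)
os.remove(tmp.name)
```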

Formatting exceptions and tracebacks

Sometimes you catch an exception and don't even know what it is:

try:
    1 / 0
except:
    # What kind of exception did we catch?
    pass

Of course, in this case we, the code readers, know that we have a ZeroDivisionError, but 1 / 0 could be a much more complicated code snippet.

The bottom line is, if we catch an exception we want to know what it is (Problem 1) and we want to be able to format it nicely as a string (Problem 2) so that we can log it (for example).

Problem 1 is solved by sys.exc_info():

import sys

try:
    1 / 0
except:
    exceptionType, exceptionValue, exceptionTraceBack = sys.exc_info()
    print exceptionType
    print exceptionValue
    print exceptionTraceBack

The values returned by sys.exc_info() are hardly suitable for human consumption. To solve Problem 2 (pretty formatting), we rely on the traceback module:

import logging
import sys
import traceback

def main(argv):
    try:
        1 / 0
        return 0
    except:
        exceptionType, exceptionValue, exceptionTraceBack = sys.exc_info()
        exceptionLineList = traceback.format_exception_only(exceptionType, exceptionValue)
        # Note: In the vast majority of cases ``exceptionLineList`` will
        # contain a single line
        logging.error("\n".join(exceptionLineList))
        traceBackLineList = traceback.format_tb(exceptionTraceBack)
        for traceBackLine in traceBackLineList: logging.debug(traceBackLine)
        return -1

if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG,
                        format='%(asctime)s %(levelname)s %(message)s')
    sys.exit(main(sys.argv))

For more information on the various exception printing and formatting tools provided by the traceback module read this.
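When all we need is the complete traceback as one string, traceback.format_exc() does the unpacking for us. Below is a short sketch, assuming Python 3; the helper describe_failure is made up for this illustration.

```python
import traceback

def describe_failure(func):
    """Call func(); return None on success or the formatted traceback on failure."""
    try:
        func()
    except Exception:
        # format_exc() returns the text the interpreter would have printed.
        return traceback.format_exc()
    return None

text = describe_failure(lambda: 1 / 0)
print(text)
```

Inside an except block, logging.exception("some message") similarly logs the message at ERROR level with the traceback appended.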

Listing the contents of a directory

import os

dirPath = "/path/to/my/dir"
dirChildNames = os.listdir(dirPath)
for dirChildName in dirChildNames:
    print dirChildName

This will print something like

myFile3
MYDIR1
myFile2
myFile1~
MYDir2

"." and ".." will be omitted, but things like ".foo" (file names beginning with "." on *nix) will be included.

What if we need to get the full paths of the children? We can use os.path.join:

import os

dirPath = "/path/to/my/dir"
dirChildNames = os.listdir(dirPath)
for dirChildName in dirChildNames:
    dirChildPath = os.path.join(dirPath, dirChildName)
    print dirChildPath

Now, suppose we only want to list the children that are directories:

import os

dirPath = "/path/to/my/dir"
dirChildNames = os.listdir(dirPath)
for dirChildName in dirChildNames:
    dirChildPath = os.path.join(dirPath, dirChildName)
    if os.path.isdir(dirChildPath):
        print dirChildPath

Or, on the contrary, you want to list the children that are not directories:

import os

dirPath = "/path/to/my/dir"
dirChildNames = os.listdir(dirPath)
for dirChildName in dirChildNames:
    dirChildPath = os.path.join(dirPath, dirChildName)
    if not os.path.isdir(dirChildPath):
        print dirChildPath

Next, suppose we want to limit our listing to the files whose names match a particular regular expression. For example, to list all files with extension "py", we can use this:

import os
import re

dirPath = "/path/to/my/dir"
dirChildNames = os.listdir(dirPath)
pythonNameRegEx = re.compile("^.+\\.py$")
for dirChildName in dirChildNames:
    dirChildPath = os.path.join(dirPath, dirChildName)
    if not os.path.isdir(dirChildPath):
        if pythonNameRegEx.match(dirChildName):
            print dirChildPath

Finally, in the above example, suppose you want to capture the base name (less ".py") and display it alone:

import os
import re

dirPath = "/path/to/my/dir"
dirChildNames = os.listdir(dirPath)
pythonNameRegEx = re.compile("^(.+)\\.py$")
for dirChildName in dirChildNames:
    dirChildPath = os.path.join(dirPath, dirChildName)
    if not os.path.isdir(dirChildPath):
        match = pythonNameRegEx.match(dirChildName)
        if match:
            print match.group(1)

Notice that we added the brackets to capture the group in the regular expression.

We could list the contents of the directory in alphabetical order:

import os
import re

dirPath = "/path/to/my/dir"
dirChildNames = os.listdir(dirPath)
dirChildNames.sort()
pythonNameRegEx = re.compile("^(.+)\\.py$")
for dirChildName in dirChildNames:
    dirChildPath = os.path.join(dirPath, dirChildName)
    if not os.path.isdir(dirChildPath):
        match = pythonNameRegEx.match(dirChildName)
        if match:
            print match.group(1)

Now, what if we want to display a simple progress indicator such as

Processing file 5 of 8...

We need to know how many matching files there are first. Thus we need two passes. Also, enumerate helps us obtain the index.

import os
import re

dirPath = "/path/to/my/dir"
dirChildNames = os.listdir(dirPath)
dirChildNames.sort()
matchingDirChildBaseNames = []
pythonNameRegEx = re.compile("^(.+)\\.py$")
for dirChildName in dirChildNames:
    dirChildPath = os.path.join(dirPath, dirChildName)
    if not os.path.isdir(dirChildPath):
        match = pythonNameRegEx.match(dirChildName)
        if match:
            matchingDirChildBaseNames.append(match.group(1))

matchingDirChildCount = len(matchingDirChildBaseNames)
for (index, dirChildBaseName) in enumerate(matchingDirChildBaseNames):
    print "Processing file %d of %d: %s" % (index + 1, matchingDirChildCount, dirChildBaseName)
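For simple patterns such as "everything ending in .py", the glob module can replace the regular expression. The sketch below assumes Python 3 and uses a temporary directory with a few made-up file names as a stand-in for /path/to/my/dir.

```python
import glob
import os
import tempfile

dirPath = tempfile.mkdtemp()  # stand-in for "/path/to/my/dir"
for name in ["b.py", "a.py", "notes.txt"]:
    open(os.path.join(dirPath, name), "w").close()

# glob returns full paths; sorted() gives alphabetical order.
pythonPaths = sorted(glob.glob(os.path.join(dirPath, "*.py")))
baseNames = [os.path.splitext(os.path.basename(path))[0]
             for path in pythonPaths if not os.path.isdir(path)]

for index, baseName in enumerate(baseNames):
    print("Processing file %d of %d: %s" % (index + 1, len(baseNames), baseName))
```

One difference from the regular expression: the pattern "*.py" does not match names beginning with ".", so hidden files such as ".foo.py" are skipped.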

Obtaining the index in for ... in ... loops

for ... in ... loops are very convenient:

daysOfTheWeek = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

for dayOfTheWeek in daysOfTheWeek:
    print dayOfTheWeek

But sometimes we need to know the index of the elements we are iterating over, e.g. if we want to print it.

The solution is to use enumerate:

daysOfTheWeek = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

for (index, dayOfTheWeek) in enumerate(daysOfTheWeek):
    print "%d. %s" % (index + 1, dayOfTheWeek)

Offsetting a multiline string with spaces

Suppose that you need to offset a multiline string with a given number of spaces. You can use the regular expression "^" to match the beginning of the line (to match the beginning of each line in a multiline string, you must use the re.MULTILINE modifier):

import re

lineStartRegEx = re.compile("^", re.MULTILINE)

def offsetWithSpaces(s, spaceOffset):
    return lineStartRegEx.sub(" " * spaceOffset, s)

Then

s = "foo\n bar\nbaz"
print offsetWithSpaces(s, 4)

produces

    foo
     bar
    baz
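From Python 3.3 onwards the standard library provides textwrap.indent for this purpose. A minimal sketch, assuming Python 3.3 or later:

```python
import textwrap

s = "foo\n bar\nbaz"

# The prefix is added to the beginning of every non-blank line.
indented = textwrap.indent(s, " " * 4)
print(indented)
```

Unlike the regular expression version, textwrap.indent by default leaves lines that consist solely of whitespace untouched.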

Chomping in Python

Perl users will be familiar with the chomp function, which removes the trailing newline character if it is present. They will use it like so:

$s = "hello\n";
chomp($s);

$s becomes "hello" after calling chomp.

In Python this can be achieved by using string's method rstrip:

s = "hello\n"
s = s.rstrip("\n")

Note that Perl's chomp changes the string in place, whereas Python's rstrip returns a new string, hence the need for s = ....
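There is one more difference: rstrip("\n") removes every trailing newline, whereas Perl's chomp (with the default record separator) removes a single one. If that matters, a closer equivalent can be sketched as follows; the function name chomp is of course our own, and the code assumes Python 3.

```python
def chomp(s):
    """Return s without its final newline character, if it has one."""
    if s.endswith("\n"):
        return s[:-1]
    return s

print(repr(chomp("hello\n")))          # 'hello'
print(repr(chomp("hello\n\n")))        # 'hello\n'
print(repr("hello\n\n".rstrip("\n")))  # 'hello'
```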

What is IronPython?

From the project's website:

IronPython is an implementation of the Python programming language running under .NET and Silverlight. It supports an interactive console with fully dynamic compilation. It's well integrated with the rest of the .NET Framework and makes all .NET libraries easily available to Python programmers, while maintaining compatibility with the Python language.

IronPython is an open source project freely available under the Microsoft Public License.

Drawing with Python Imaging Library (PIL)

This will produce a diagonal line:

import Image

width = 10
height = 10
im = Image.new("RGB", (width, height))
for i in xrange(0, 10):
    im.putpixel((i, i), 125)
im.save("test.png", "PNG")

lambda functions

The following example is illuminating:

# F is the "Add" function
def F(a, b):
    return a + b

print F(3, 5)
# prints 8

# Use F to make an "Add 3" function
F_3 = lambda b : F(3, b)

print F_3(5)
# prints 8
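Fixing some of the arguments of an existing function in this way is known as partial application, and the standard library has a ready-made tool for it. A short sketch, assuming Python 3 (functools.partial is also available from Python 2.5):

```python
import functools

def F(a, b):
    return a + b

# partial(F, 3) returns a callable that supplies 3 as the first argument of F.
F_3 = functools.partial(F, 3)

print(F_3(5))  # 8
```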

Starting a new instance of a COM application

Thanks to Tim Golden for posting this here: http://timgolden.me.uk/python/win32_how_do_i/start-a-new-com-instance.html and David Foster for finding it.

You will find that when you do

import win32com.client

excel = win32com.client.Dispatch("Excel.Application")

if there is an existing instance of excel.exe that instance will be used (a new excel.exe will not be started), which may not be desirable, particularly if you want a clean environment.

If you want to start a new excel.exe each time, you should replace Dispatch with DispatchEx:

import win32com.client

excel = win32com.client.DispatchEx("Excel.Application")

You can also specify further parameters, e.g. run the application on another machine, in-process/out-of-process (out-of-process by default), etc.

Redirecting STDOUT (STDERR, etc.) on Windows

The "classic" way to redirect STDOUT and STDERR in Python is illustrated below:

import sys

originalStdout = sys.stdout
sys.stdout = open("mystdout.txt", "a")

print "Hello!"
# This will go to mystdout.txt

sys.stdout = originalStdout
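In Python 3.4 and later, this save-and-restore pattern is packaged as the context manager contextlib.redirect_stdout, which also restores sys.stdout if an exception is raised inside the block. A minimal sketch, assuming Python 3.4 or later; the in-memory io.StringIO buffer stands in for the file mystdout.txt so that the example leaves nothing behind.

```python
import contextlib
import io

buffer = io.StringIO()  # stand-in for open("mystdout.txt", "a")

with contextlib.redirect_stdout(buffer):
    print("Hello!")  # goes to buffer, not to the console

print("Back on the original stdout")
captured = buffer.getvalue()
```

It replaces sys.stdout only, so it does not affect the output of external processes any more than the manual approach does.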

Instead of introducing originalStdout we could have used sys.__stdout__ to restore sys.stdout:

import sys
 
sys.stdout = open("mystdout.txt", "a")
 
print "Hello!"
# This will go to mystdout.txt
 
sys.stdout = sys.__stdout__

However, this is not recommended: "It can also be used to restore the actual files to known working file objects in case they have been overwritten with a broken object. However, the preferred way to do this is to explicitly save the previous stream before replacing it, and restore the saved object" (14th October, 2009).

Thus our original approach is preferred. However, there is a caveat: "Changing [sys.stdout] doesn’t affect the standard I/O streams of processes executed by os.popen(), os.system() or the exec*() family of functions in the os module."

Consider

import sys

originalStdout = sys.stdout
sys.stdout = open("mystdout.txt", "a")

someFunction()

sys.stdout = originalStdout

If inside someFunction() we execute an external process using the aforementioned functions or even use some code from Windows DLLs, that external code will be using the original STDOUT. Our redirect won't work for it.

On Windows, we could use the win32api functions to resolve this problem:

import pywintypes
import win32api
import win32file

# Security attributes that let child processes inherit the handle
sa = pywintypes.SECURITY_ATTRIBUTES()
sa.bInheritHandle = 1

o = win32file.CreateFile("mystdout.log", win32file.GENERIC_WRITE, 0, sa, win32file.CREATE_ALWAYS, 0, 0)
win32api.SetStdHandle(win32api.STD_OUTPUT_HANDLE, o)
# Could also redirect both STDOUT and STDERR to o:
# win32api.SetStdHandle(win32api.STD_ERROR_HANDLE, o)

someFunction()
Now we have redirected STDOUT for external processes and Windows DLLs.

Beware! The following won't work:

def redirectStreams():
    sa = pywintypes.SECURITY_ATTRIBUTES()
    sa.bInheritHandle = 1
    o = win32file.CreateFile("mystdout.log", win32file.GENERIC_WRITE, 0, sa, win32file.CREATE_ALWAYS, 0, 0)
    win32api.SetStdHandle(win32api.STD_OUTPUT_HANDLE, o)

redirectStreams()
someFunction()

mystdout.log will be closed when o goes out of scope, before you get a chance to call someFunction(). So you need to make sure that o doesn't go out of scope before the time is ripe.

However, when we redirect the "system" STDOUT and STDERR as described here, the "Python" STDOUT and STDERR won't change. Thus we need to redirect both. If we redirect them to the same file, we need to make sure that we set win32file.FILE_SHARE_READ | win32file.FILE_SHARE_WRITE (or else the file will be locked). Moreover, we can append to the end of file using win32file.SetFilePointer. Putting this all together, we obtain the following:

import sys
import pywintypes
import win32api
import win32file

originalSystemStdout = None
originalSystemStderr = None
originalPythonStdout = None
originalPythonStderr = None

systemStdoutStderr = None
pythonStdoutStderr = None

def redirectStreams(filePathName):
    global originalSystemStdout
    global originalSystemStderr
    global originalPythonStdout
    global originalPythonStderr
    global systemStdoutStderr
    global pythonStdoutStderr

    if systemStdoutStderr is None and pythonStdoutStderr is None:
        originalSystemStdout = win32api.GetStdHandle(win32api.STD_OUTPUT_HANDLE)
        originalSystemStderr = win32api.GetStdHandle(win32api.STD_ERROR_HANDLE)

        sa = pywintypes.SECURITY_ATTRIBUTES()
        sa.bInheritHandle = 1
        systemStdoutStderr = win32file.CreateFile(
            filePathName,
            win32file.GENERIC_WRITE,
            win32file.FILE_SHARE_READ | win32file.FILE_SHARE_WRITE,
            sa,
            win32file.OPEN_ALWAYS,
            0,
            0)
        win32file.SetFilePointer(systemStdoutStderr, 0, win32file.FILE_END)

        win32api.SetStdHandle(win32api.STD_OUTPUT_HANDLE, systemStdoutStderr)
        win32api.SetStdHandle(win32api.STD_ERROR_HANDLE, systemStdoutStderr)

        originalPythonStdout = sys.stdout
        originalPythonStderr = sys.stderr

        pythonStdoutStderr = open(filePathName, "a")

        sys.stdout = pythonStdoutStderr
        sys.stderr = pythonStdoutStderr
    else:
        raise RuntimeError("The streams are already redirected")

def restoreStreams():
    global originalSystemStdout
    global originalSystemStderr
    global originalPythonStdout
    global originalPythonStderr
    global systemStdoutStderr
    global pythonStdoutStderr

    if systemStdoutStderr is not None and pythonStdoutStderr is not None:
        win32api.SetStdHandle(win32api.STD_OUTPUT_HANDLE, originalSystemStdout)
        win32api.SetStdHandle(win32api.STD_ERROR_HANDLE, originalSystemStderr)

        sys.stdout = originalPythonStdout
        sys.stderr = originalPythonStderr

        systemStdoutStderr.Close()
        pythonStdoutStderr.close()

        # Reset the state so that redirectStreams() can be called again
        systemStdoutStderr = None
        pythonStdoutStderr = None
    else:
        raise RuntimeError("The streams have not been redirected")

if __name__ == "__main__":
    redirectStreams("my.log")
    print "Hello!"
    someFunction()
    print "Done!"
    restoreStreams()

Creating a list whose elements are a single value repeated n times

[5] * 10

produces

[5, 5, 5, 5, 5, 5, 5, 5, 5, 5]

10 * [5]

will produce exactly the same result, so we have commutativity.

Similarly

['a'] * 10

produces

['a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a']

and

["foo"] * 10

produces

['foo', 'foo', 'foo', 'foo', 'foo', 'foo', 'foo', 'foo', 'foo', 'foo']
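A word of caution: the repeated list contains n references to the same object. This makes no difference for immutable values such as numbers and strings, but it does for mutable ones such as lists. The sketch below (Python 3 syntax) shows the effect and one way around it.

```python
rows = [[0] * 3] * 2   # two references to the same inner list
rows[0][0] = 1
print(rows)            # [[1, 0, 0], [1, 0, 0]]

# A list comprehension creates a fresh inner list for every row.
independent = [[0] * 3 for _ in range(2)]
independent[0][0] = 1
print(independent)     # [[1, 0, 0], [0, 0, 0]]
```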

Exiting the Python interpreter quickly

You have launched python. How do you exit it quickly?

On Linux/Unix, press [Ctrl]+[D].

On Windows, press [Ctrl]+[Z] followed by [Enter]. If this doesn't work, enter

quit()

You could, of course, do

import sys
sys.exit()

but this takes longer to type!

User-defined exceptions in Python

For example:

class MyException(Exception):
    def __init__(self, message):
        self.__message = message
 
    def __getMessage(self):
        return self.__message
 
    def __setMessage(self, message):
        self.__message = message
 
    message = property(fget=__getMessage, fset=__setMessage)
 
    def __repr__(self):
        return repr(self.message)
 
    def __str__(self):
        return str(self.message)

Define functions conditionally on the modules that are present

If import fails to load a module (e.g. because it cannot be found), it will raise an ImportError. We can exploit this to tailor our implementation accordingly.

The following example module binds the name fooBar to the function fooBar_zlib if the module zlib is present, otherwise it binds it to fooBar_noZlib:

# mymodule.py
 
try:
    import zlib
except ImportError:
    ZLIB_MODULE_PRESENT = False
else:
    ZLIB_MODULE_PRESENT = True
 
def fooBar_zlib():
    # ...zlib-based implementation...
    pass
 
def fooBar_noZlib():
    # ...no-zlib implementation...
    pass
 
if ZLIB_MODULE_PRESENT:
    fooBar = fooBar_zlib
else:
    fooBar = fooBar_noZlib

Thus, when you import mymodule.py, the function fooBar will be bound to either fooBar_zlib or fooBar_noZlib depending on whether the module zlib is present on your system or not. As a user of mymodule.py you don't have to know exactly which implementation is being used. The logic which determines which of the two implementations to use is encapsulated in mymodule.py.

Define functions conditionally on the operating system

A similar idea can be used to define functions conditionally on the operating system.

If we are running on Linux or Unix, os.name should be "posix". If we are running on Windows, os.name should be "nt". So we could do the following:

import os

def fooBar_nt():
    # ...Windows implementation...
    pass
 
def fooBar_posix():
    # ...Linux implementation...
    pass
 
if os.name == "posix":
    fooBar = fooBar_posix
else:
    # Assume os.name == "nt"
    fooBar = fooBar_nt

Function decorators

For more information, see http://www.python.org/dev/peps/pep-0318/

def translate_exceptions(func):
    def wrapper(*args, **kwds):
        try:
            return func(*args, **kwds)
        except SomeError, someError:
            raise AnotherError(message=someError.message)
 
    return wrapper
 
def return_unicode(func):
    def wrapper(*args, **kwds):
        result = func(*args, **kwds)
        if isinstance(result, str):
            result = unicode(result)
        return result
 
    return wrapper
 
@translate_exceptions
def myFunc1():
    # ...
    pass
 
@translate_exceptions
@return_unicode
def myFunc2():
    # ...
    pass
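Here is a self-contained variant of translate_exceptions that can actually be run, sketched under the assumption of Python 3. SomeError and AnotherError are placeholder exception classes defined just for this illustration, and functools.wraps is added so that the decorated function keeps its own name and docstring.

```python
import functools

class SomeError(Exception):
    """Placeholder for the low-level exception."""

class AnotherError(Exception):
    """Placeholder for the exception we want callers to see."""

def translate_exceptions(func):
    @functools.wraps(func)  # copy func's __name__ and __doc__ to the wrapper
    def wrapper(*args, **kwds):
        try:
            return func(*args, **kwds)
        except SomeError as some_error:
            raise AnotherError(str(some_error))
    return wrapper

@translate_exceptions
def my_func():
    """Always fails with SomeError."""
    raise SomeError("boom")

try:
    my_func()
except AnotherError as another_error:
    print("caught AnotherError: %s" % another_error)

print(my_func.__name__)  # my_func
```

When decorators are stacked, as with myFunc2 above, the one closest to the function is applied first.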

One-liners in Python

Python, like Perl, supports one-liners that you can invoke from the command line. For example:

python -c "a = 3; b = 5; print a + b"

Waiting for a key press

raw_input('Press Enter...')