Python Things
Python is a clean and powerful semicompiled object-oriented programming language. If you hadn't heard of Python, go find out about it now!
Stop press. :) Strange "Python" twins sighted at Python 10.
Recent:
    import cgitb; cgitb.enable()
It now also supports saving tracebacks to files.
    import os
    if os.environ.has_key("GATEWAY_INTERFACE"):
        import sys, cgitb
        sys.excepthook = cgitb.excepthook
Now all Python CGI scripts on your server will magically produce pretty tracebacks in the Web browser whenever errors occur. You don't have to change any of your CGI scripts! This even works when there's a SyntaxError. (Of course, don't do this if you absolutely have to keep your scripts secret.)
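The hook mechanism itself can be sketched with the standard library alone. This is a minimal illustration and not cgitb's implementation (cgitb is no longer shipped as of Python 3.13); the name html_excepthook and its out parameter are hypothetical.

```python
import html
import io
import sys
import traceback


def html_excepthook(exc_type, exc_value, exc_tb, out=None):
    """Write an HTML-escaped traceback to `out` (default: sys.stdout)."""
    out = sys.stdout if out is None else out
    text = "".join(traceback.format_exception(exc_type, exc_value, exc_tb))
    out.write("<pre>" + html.escape(text) + "</pre>\n")


# Installing the hook is a single assignment, as with cgitb.excepthook:
#     sys.excepthook = html_excepthook

# Demonstration without installing it: catch an error and render it.
buffer = io.StringIO()
try:
    1 / 0
except ZeroDivisionError:
    html_excepthook(*sys.exc_info(), out=buffer)
rendered = buffer.getvalue()
```

Because Python calls sys.excepthook with the exception type, value, and traceback, any function taking those three arguments can be installed in the same way.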
Here is a collection of some possibly useful Python things. They're in chronological order, so scroll down to the end for the most recent stuff.
The regex module has a syntax mode labelled RE_SYNTAX_AWK that helps to make regexes more familiar for Perl hackers by allowing unbackslashed parens and pipes. But it doesn't have \d, \D, \s, or \S; and \w and \W don't work quite the same as Perl's. So this patch adds these conveniences under a new regex syntax flag named RE_EXTRA_CLASSES. For kicks, i also added \h for hexadecimal digits and \l for letters of the alphabet.
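For readers without the old regex module, the same character classes can be tried out with the standard re module. This is only a sketch: re is not the module the patch targets, and since re has no \h or \l, the HEX and LETTER names below are hypothetical stand-ins.

```python
import re

# \d, \w and \s are built into the standard re module.
digits = re.findall(r"\d+", "room 42, floor 7")
words = re.findall(r"\w+", "spam_1 eggs")
pieces = re.split(r"\s+", "a  b\tc")

# re has no \h or \l, so explicit classes stand in for them here.
HEX = r"[0-9A-Fa-f]"
LETTER = r"[A-Za-z]"
hex_runs = re.findall(HEX + "+", "0xFF beef")
letters = re.findall(LETTER + "+", "abc123def")
```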
Also, this patch adds a syntax flag named RE_MINIMAL_OPS which enables the new operators ??, *?, and +? (from Perl 5). These have similar meanings to their counterparts ?, *, and +, but the minimal versions prefer to match the shortest string possible, which can be really useful in some situations.
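The difference between the ordinary and minimal operators is easiest to see side by side. The sketch below uses the standard re module, which supports the same ??, *?, and +? operators; it does not use the patched regex module.

```python
import re

text = "<b>bold</b> and <i>italic</i>"

# The ordinary + grabs as much as it can: one match spanning every tag.
greedy = re.findall(r"<.+>", text)

# The minimal +? stops at the first possible '>', giving one match per tag.
minimal = re.findall(r"<.+?>", text)

longest = re.match(r"a+", "aaa").group()
shortest = re.match(r"a+?", "aaa").group()
```

Here greedy holds a single string covering the whole input, while minimal holds the four tags separately.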
These two features are both enabled when you select the regex syntax dubiously labelled RE_SYNTAX_PERLISH.
This patchfile should be applied to the Python 1.4 beta 3 distribution, and it alters three files: Modules/regexpr.c, Modules/regexpr.h, and Lib/regex_syntax.py. To apply the patch, simply go to the directory into which you extracted the Python tar file (probably named Python1.4beta3), and -- if you saved the patch file as, say, /home/bob/regex-perlish.patch -- go
    patch </home/bob/regex-perlish.patch
Then just run make to build the new Python interpreter.
Compiled regex objects from the regex module don't have attributes equivalent to Perl's $&, $`, or $' (the part of the string that matched, everything before what matched, and everything after what matched, respectively). It's possible to construct these strings using the indices from the regs attribute, but it's somewhat inconvenient, harder to read, and slower. So i suggested the addition of three attributes, before, found, and after, to the compiled regex objects. Andrew Kuchling almost immediately posted a patch to do just that.
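Building the three strings from match indices looks roughly like this. The sketch uses the standard re module and its match-object methods rather than the old regex module's regs attribute; the variable names simply mirror the proposed attribute names.

```python
import re

text = "spam and eggs"
match = re.search(r"and", text)

before = text[:match.start()]   # like Perl's $`
found = match.group(0)          # like Perl's $&
after = text[match.end():]      # like Perl's $'
```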
    digit + some(whitespace) + exactly(':)')
without worrying about the exact syntax or bothering to backslash dangerous characters. You might like it if you find yourself wasting a lot of time looking up regular-expression syntax.
This new version allows a more concise syntax and generates instances of a Pattern class, instead of strings; this way you can directly use methods like search on the result, and you don't need to worry about compiling and caching.
It makes regexes more convenient to use. Here's an example:
    >>> import rxb
    >>> rxb.welcome()
    >>>
    >>> pat = label.spam(some(letters)) + digit
    >>> pat.search('foo bar python8')
    8
    >>> pat.spam
    'python'
    >>> pat
    <Pattern \\(<spam>[A-Za-z]+\\)[0-9]>
    >>> rxb.banish()
or, a slightly more complex example from Grail:
    import rxb
    rxb.welcome()

    flag = member(letters, '-')
    LISTING_PATTERN = (begline +
        label.mode(
            flag +                          # file type
            flag*3 + flag*3 + flag*3) +     # owner, group, world perms
        label.data(
            somespace + anything +          # links, owner, grp, sz, date
            somespace + digit*2 +
            maybe(':') + digit*2 +          # year or hh:mm
            somespace) +
        label.file(
            anybut('->')) +                 # anything before symlink
        maybe(label.link(
            somespace + '->' + anything)) + # possible symlink
        endline)

    rxb.banish()
Thanks to William S. Lear <rael@dejanews.com> who pointed out a problem with this example which was due to a bug in the regex.symcomp() routine. The rxb module has been recently modified to produce regular expressions using backslashed (instead of bare) parentheses for grouping, as a workaround for the symcomp() bug.
The bug is this: if you open a new subgroup with a left-parenthesis immediately following the greater-than sign which ends a group label, symcomp() will miss the parenthesis and thus miscount the rest of the subgroups. The bug has been fixed in Python 1.4 final.
list.append(element) is much faster than doing list = list + [element], since the latter has to make a new object with a copy of the whole list. Unfortunately, list.append can only append one element at a time. I wrote the following patch to add the concat method to lists, which will concatenate a list argument onto another list in place.
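The distinction between in-place concatenation and rebuilding the list can be seen in current Python, where the built-in list.extend method does the in-place job. This sketch assumes nothing about the concat method from the patch, which is not part of standard Python.

```python
items = [1, 2, 3]
alias = items            # a second name for the same list object

items.extend([4, 5])     # in place: the existing list grows
items += [6]             # also in place for lists

rebuilt = items + [7]    # builds a new list and copies every element
```

After this runs, alias still refers to the same object as items and sees all six elements, while rebuilt is a separate list.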
faq2html.py
faq2html.py was written as an exercise in text-processing with Python (in part, this is what prompted me to think of the regex modifications above, but this script does not require them). The new version is quite a bit more general than the old, and should be able to convert most reasonably-formatted FAQs into HTML, provided that questions are preceded with Q. and answers preceded with A. Check the top of the script for details.
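To give a feel for the Q./A. convention, here is a much-simplified sketch of that kind of conversion. It is not the faq2html.py script: the function name faq_to_html and the choice of a definition list for the output are assumptions made for illustration.

```python
import html


def faq_to_html(text):
    """Turn lines starting with 'Q.' and 'A.' into an HTML definition list."""
    entries = []  # list of [tag, text] pairs
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Q."):
            entries.append(["dt", stripped[2:].strip()])
        elif stripped.startswith("A."):
            entries.append(["dd", stripped[2:].strip()])
        elif stripped and entries:
            # A continuation line belongs to the previous entry.
            entries[-1][1] += " " + stripped
    body = "\n".join(
        "<%s>%s</%s>" % (tag, html.escape(content), tag)
        for tag, content in entries
    )
    return "<dl>\n" + body + "\n</dl>"


sample = """Q. Is Python compiled?
A. It is compiled to bytecode,
   which the interpreter runs.
"""
page = faq_to_html(sample)
```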
This makes dis.disco() display the names of local variables along with the disassembly.
    "Here is a $string."
    "Here is a $module.member."
    "Here is an $object.member."
    "Here is a $functioncall(with, arguments)."
    "Here is an ${arbitrary + expression}."
    "Here is an $array[3] member."
    "Here is a $dictionary['member']."
You can download the module from this site. It contains a class named 'Itpl' for representing interpolated-string objects, and a function named 'printpl' which will interpolate a given string and print the results. Here is the documentation page generated from the module.
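For comparison, formatted string literals (f-strings, Python 3.6 and later) cover much the same ground in current Python. The sketch below does not use the Itpl module or printpl, whose exact signatures are not shown here; the variable names are made up to mirror the examples above.

```python
import math

string = "string"
array = [10, 20, 30, 40]
dictionary = {"member": "value"}

lines = [
    f"Here is a {string}.",                 # plain variable
    f"Here is a {math.pi:.2f}.",            # module.member
    f"Here is a {max(1, 2)}.",              # function call with arguments
    f"Here is an {1 + 2}.",                 # arbitrary expression
    f"Here is an {array[3]} member.",       # indexing
    f"Here is a {dictionary['member']}.",   # dictionary lookup
]
```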
This patch modifies the string.join() routine to accept any instance of a class that implements the __len__ and __getitem__ disciplines, rather than accepting only the built-in sequence types (list and tuple). Your __getitem__ method will be called twice for each element (once to add up the total length of the result, and once during construction), so it had better return consistent results for this to work...
This lack of safety is bad. If the returned string lengths are inconsistent, you can cause a segmentation fault. Watch here for a more robust update.
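A class implementing those two methods might look like the sketch below. It relies on the join method of strings in current Python, which accepts any iterable (including objects that only define __getitem__), rather than on the patched string.join(); the Words class is a hypothetical example.

```python
class Words:
    """A minimal sequence: only __len__ and __getitem__ are defined."""

    def __init__(self, *words):
        self._words = list(words)

    def __len__(self):
        return len(self._words)

    def __getitem__(self, index):
        # Raising IndexError past the end is what stops the iteration.
        return self._words[index]


joined = " ".join(Words("spam", "eggs", "ham"))
```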
stropjoin.patch
while and if conditions (warning: controversial!)
This patch extends the while and if statements to allow an optional from keyword to save the result of the conditional in a variable. This lets you write, for example:
    while line from sys.stdin.readline():
        do_something_with(line)

    if status from pipe.close():
        handle_the_error(status)
My goal was to put the condition where it belongs instead of having to put extra "if ... break" statements inside the loop or duplicate the condition at the end where it is less apparent. There have been a fair number of comments about this. Just for fun, i'll quote some here (with apologies to the speakers)...
"Reads like Python." (David Ascher)
"... a very elegant solution, IMHO." (Andrew Kuchling)
"... seems like a C idiom trying to work its way into Python." (Johann Hibschman)
"... can easily be emulated using a file iterator." (Fredrik Lundh)
"I like this proposal." (Anthony Baxter)
"... looks just right to me." (Konrad Hinsen)
"I don't see why grammar changes are needed for what is essentially just an addition to a class's methods..." (Tony J Ibbs)
"Is a syntax change really worth it when all you save is one (1) line of source code?" (Fredrik Lundh)
"I'd really like to see it in the official release." (Marnix Klooster)
"I don't see what all the fuss is about. I commonly use a while 1 loop with one or more if:break clauses..." (Donn Cave)
"I have to concur with Donn on this one. I'm never really been inconvenienced by using the while 1:...break idiom." (Barry Warsaw)
"... agree with Donn that this is all unnecessary and we're better off with the 'while 1' idiom." (Guido van Rossum)
"I really like the 'from' proposal." (Richard Jones)
Oh, well. Anyway, here it is. After applying the patch, you need to go into the Grammar/ subdirectory and do a make to rebuild the parser (isn't that cool?) before going back up and doing make to build the interpreter.
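For comparison, here is the "while 1" idiom that several of the comments above prefer, next to the assignment-expression form (the := operator) that Python 3.8 later added. This is only a sketch: the from syntax in the patch is not valid in any released Python, and the StringIO stream stands in for sys.stdin.

```python
import io

stream = io.StringIO("one\ntwo\n")

# The loop-and-break idiom: the exit test sits inside the loop body.
seen_loop = []
while True:
    line = stream.readline()
    if not line:
        break
    seen_loop.append(line.strip())

# The assignment-expression form: the test and the assignment share a line.
stream.seek(0)
seen_walrus = []
while line := stream.readline():
    seen_walrus.append(line.strip())
```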
ifwhilefrom.patch
The tokenize module included with Python 1.3 and Python 1.4 does not quite "match the working of the Python tokenizer exactly", as it claims. Specifically, the new double-star operator is not recognized, CR/LF is not accepted at the end of a line, FF is not accepted, and there is no support for triple-quoted strings or backslash-continuations of lines. The new module fixes the regex in tokenize.tokenprog to accept the double-star, but such a regex is only good for scanning individual lines of text.
So the new module (posted 1 April) includes a new function tokenize.tokenize() which will scan streams of text. The function accepts a readline-like method which is called to come up with the next input line (or "" for EOF) and a "token-eater" function. The "token-eater" function should accept five arguments: the type of the token, a string containing the token, the starting and ending (row, column) coordinates of the token, and the line itself. This function should match the working of the Python tokenizer, and will return INDENT and DEDENT tokens as the line indentation changes.
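A five-argument "token-eater" of the kind described above can be sketched against the tokenize module in current Python. Note the assumption: today's module exposes a generator, tokenize.generate_tokens(), instead of taking the callback directly, so the loop below feeds each token to the callback by hand. The names token_eater and seen are illustrative.

```python
import io
import token
import tokenize

source = "def f(x):\n    return x ** 2\n"
seen = []


def token_eater(tok_type, tok_string, start, end, line):
    """Record the token's type name, text, and (row, column) span."""
    seen.append((token.tok_name[tok_type], tok_string, start, end))


# generate_tokens() takes a readline-like callable and yields 5-tuples.
for tok in tokenize.generate_tokens(io.StringIO(source).readline):
    token_eater(*tok)

type_names = [entry[0] for entry in seen]
```

The recorded tokens include INDENT and DEDENT entries around the function body, and the double-star operator arrives as a single OP token.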
The information your "token-eater" function gets from tokenize.tokenize() should be enough to exactly reconstruct the original source script, if you need it. The regurgitate script below is an example of how to do this.
The cedit script below is an example of using the tokenizer to colourize Python code in a simple Tk text-editing window.
tokenize.py
regurgitate
cedit
This script uses the tokenize module, which helps to make it concise. The principle is simple -- any identifier which is seen only once in your script is considered suspect. Warnings are not generated for keywords or for built-in object methods (when used as methods); extra warnings are generated for identifiers that look like __reserved__ words but aren't known.
With the -i option, this script will also import modules whenever it sees import statements in your script, so that if you use string.split only once, there won't be a complaint about split if you have imported string in your script.
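The core principle, flagging identifiers that appear only once, can be sketched in a few lines with the tokenize module from current Python. This is a deliberately reduced illustration and not the script itself: it skips keywords and built-in names but has no handling for methods, __reserved__ names, or the -i option, and suspicious_names is a hypothetical name.

```python
import builtins
import collections
import io
import keyword
import tokenize


def suspicious_names(source):
    """Return identifiers that occur exactly once in the source text."""
    counts = collections.Counter()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            counts[tok.string] += 1
    return sorted(
        name for name, count in counts.items()
        if count == 1 and not hasattr(builtins, name)
    )


sample = (
    "total = 0\n"
    "for item in items:\n"
    "    totl = total + item\n"   # 'totl' is a typo for 'total'
    "print(total)\n"
)
suspects = suspicious_names(sample)
```

On this sample the misspelled totl is reported, along with items, which is used but never defined in the snippet.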
To use this script, you need to also have the "tokenize.py" module mentioned above.
pylint
wish
)
This uses the _tkinter.createfilehandler call and a simple Python interpreter written in Python to make it look like you're running Python the normal way in a terminal window, but still have live widgets in Tk windows, like wish. Funnily enough, this one's called pywish. With it, you can play with user interfaces in a quick and natural way:
    wheat[251]% pywish
    Python 1.4 (Mar 17 1997) [C] (pywish)
    Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
    >>> import sys
    >>> def hello():
    ...     print 'hi!'
    ...
    >>> b = Tkinter.Button(root, text='hi', command=hello)
    >>> b.pack()
    >>> hi!
    hi!
    hi!
    >>> q = Tkinter.Button(root, text='quit', command=sys.exit)
    >>> q.pack()
    >>> hi!
    hi!
    wheat[252]%
The words in boldface are not ones that i entered; they were printed on the screen when i pressed the "hi" button in the Tk window (first three times, then twice, and then i pressed the "quit" button, which exited to the shell prompt).
My thanks are due to Guido van Rossum for pointing out createfilehandler so i could produce this effect, and also for a tip on successful use of compile and exec with multi-line strings: tack on a few newlines.
pywish
I plan to clean it up a little to make it more usable as a general component in other applications, but for now i'll just post it in its current state and hope you find it useful. You can just run this script directly to pop up a Console window.
Console.py
roundup.tar.gz (32 kb)
Done:
The "htmldoc" module is actually quite small (only about 300 lines) as most of the hard work has been factored out into the "inspect" module -- a non-HTML-specific collection of routines for getting all kinds of information out of your Python objects. My favourite routine in "inspect" is inspect.getsource(object), which can get you the source code for a function, method, or class.
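As a quick illustration of inspect.getsource(object), the sketch below fetches the source of a standard-library function. It uses the inspect module shipped with current Python and assumes the standard library's .py source files are installed alongside the interpreter.

```python
import inspect
import textwrap

# Fetch the source text of a function defined in a standard-library module.
source_text = inspect.getsource(textwrap.dedent)
first_line = source_text.splitlines()[0]
```

The same call works on methods and classes, returning the text of the whole definition.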
htmldoc.py (12 kb)
inspect.py (18 kb)
    pydoc sys           # document a built-in module
    pydoc copy          # document a module written in Python
    pydoc types         # document a module written in Python
    pydoc abs           # document a built-in function
    pydoc repr.Repr     # document a single class
    pydoc -k mail       # keyword search like man -k
    pydoc -p 6789       # start a web server at http://localhost:6789/

    >>> from pydoc import help
    >>> help("getopt.getopt")   # document something you haven't imported
    >>> import calendar
    >>> help(calendar)          # document a live object
To get it, download these two files:
pydoc.py (54 kb)
inspect.py (26 kb)
copyright © by Ka-Ping Yee <ping@lfw.org>, updated Mon 20 Aug 2001