From: Ka-Ping Yee
To: Tim Peters
Subject: RE: [Python-Dev] Accessing globals without dict lookup
Date: Mon, 11 Feb 2002 07:14:09 -0600 (CST)

All right -- i have attempted to diagram a slightly more interesting
example, using my interpretation of Guido's scheme.

    http://lfw.org/repo/cells.gif
    http://lfw.org/repo/cells-big.gif   for a bigger image
    http://lfw.org/repo/cells.ai        for the source file

The diagram is supposed to represent the state of things after
"import spam", where spam.py contains

    import eggs

    i = -2
    max = 3

    def foo(n):
        y = abs(i) + max
        return eggs.ham(y + n)

How does it look?  Guido, is it anything like what you have in mind?

A couple of observations so far:

1.  There are going to be lots of global-cell objects.  Perhaps they
    should get their own allocator and free list.

2.  Maybe we don't have to change the module dict type.  We could just
    use regular dictionaries, with the special case that if retrieving
    the value yields a cell object, we then do the objptr/cellptr dance
    to find the value.  (The cell objects have to live outside the
    dictionaries anyway, since we don't want to lose them on a
    rehashing.)

3.  Could we change the name, please?  It would really suck to have
    two kinds of things called "cell objects" in the Python core.

4.  I recall Tim asked something about the cellptr-points-to-itself
    trick.  Here's what i make of it -- it saves a branch: instead of

        PyObject* cell_get(PyGlobalCell* c) {
            if (c->cell_objptr) return c->cell_objptr;
            if (c->cell_cellptr) return c->cell_cellptr->cell_objptr;
            return NULL;  /* unbound here, and no builtin cell to consult */
        }

    it's

        PyObject* cell_get(PyGlobalCell* c) {
            if (c->cell_objptr) return c->cell_objptr;
            return c->cell_cellptr->cell_objptr;
        }

    This makes no difference when c->cell_objptr is filled, but it
    saves one check when c->cell_objptr is NULL in a non-shadowed
    variable (e.g. after "del x").  I believe that's the only case in
    which it matters, and it seems fairly rare to me that a module
    function will attempt to access a variable that's been deleted
    from the module.

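A rough Python model of the second cell_get above may help when reading the diagram. It is only a sketch: the class name GlobalCell, the use of None as a stand-in for a C NULL pointer, and the particular cells created for spam.py are assumptions made for illustration, not part of Guido's scheme or of any CPython API.

```python
NULL = None  # assumption: None stands in for a C NULL pointer in this model


class GlobalCell:
    """Hypothetical model of a global cell with objptr and cellptr fields."""

    def __init__(self, obj=NULL, builtin_cell=None):
        self.objptr = obj
        # The "points to itself" trick: when there is no builtin cell to
        # fall back on, cellptr refers to the cell itself, so it is never NULL.
        self.cellptr = builtin_cell if builtin_cell is not None else self


def cell_get(c):
    # One branch: use the cell's own value if bound, else follow cellptr.
    if c.objptr is not NULL:
        return c.objptr
    return c.cellptr.objptr


# Cells as they might look after "import spam" (an assumed reading of the diagram).
builtin_abs = GlobalCell(abs)
builtin_max = GlobalCell(max)
spam_i = GlobalCell(-2)                   # i = -2, no builtin of that name
spam_max = GlobalCell(3, builtin_max)     # max = 3 shadows the builtin
spam_abs = GlobalCell(NULL, builtin_abs)  # abs is only found via the builtin

# The expression from foo(): abs(i) + max
y = cell_get(spam_abs)(cell_get(spam_i)) + cell_get(spam_max)

spam_max.objptr = NULL            # models "del max" in spam
shadow_gone = cell_get(spam_max)  # falls through to the builtin's cell

spam_i.objptr = NULL              # models "del i"
deleted = cell_get(spam_i)        # cellptr is the cell itself, so this is NULL
```

In this model the self-pointing cellptr is what lets the last lookup return NULL without a second test; a real caller would still have to turn that NULL into a NameError.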
Because the module can't know what new variables might be introduced
into __builtin__ after the module has been loaded, a failed lookup
must finally fall back to a lookup in __builtin__.  Given that, it
seems like a good idea to set c->cell_cellptr = c when c->cell_objptr
is set (for both shadowed and non-shadowed variables).  In my picture,
this would change the cell that spam.max points to, so that it points
to itself instead of __builtin__.max's cell.  That is:

    void cell_set(PyGlobalCell* c, PyObject* v) {
        c->cell_objptr = v;
        c->cell_cellptr = c;
    }

This simplifies things further:

    PyObject* cell_get(PyGlobalCell* c) {
        return c->cell_cellptr->cell_objptr;
    }

This leaves cell_get with no branches at all, which might be a really
good thing on today's speculative-execution processors.

I know i'm a few messages behind on the discussion -- i'll do some
reading to catch up before i say any more.  But i hope the diagram
is somewhat helpful, anyway.

-- ?!ng
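The branch-free variant can be modelled the same way. This is a sketch under stated assumptions: GlobalCell, the fallback field, and cell_del are hypothetical names introduced here. The message does not say how a deletion should restore cellptr, so cell_del shows just one possible choice.

```python
NULL = None  # assumption: None stands in for a C NULL pointer in this model


class GlobalCell:
    """Hypothetical model of a global cell; 'fallback' is an added field."""

    def __init__(self, builtin_cell=None):
        self.objptr = NULL
        # Remember where an unbound cell should look: the builtin's cell,
        # or the cell itself when no builtin of that name exists.
        self.fallback = builtin_cell if builtin_cell is not None else self
        self.cellptr = self.fallback


def cell_set(c, v):
    # As in the message: binding a value makes the cell point to itself.
    c.objptr = v
    c.cellptr = c


def cell_del(c):
    # Assumption, not in the message: unbinding restores the fallback pointer.
    c.objptr = NULL
    c.cellptr = c.fallback


def cell_get(c):
    # No branches: always follow cellptr.
    return c.cellptr.objptr


builtin_max = GlobalCell()
cell_set(builtin_max, max)

spam_max = GlobalCell(builtin_max)
before = cell_get(spam_max)    # unbound in spam, so the builtin is found

cell_set(spam_max, 3)          # models "max = 3" in spam.py
shadowed = cell_get(spam_max)

cell_del(spam_max)             # models "del max"
after = cell_get(spam_max)     # the builtin is visible again
```

The cost of the branch-free read is that every write has to keep cellptr consistent, which is the trade the message is proposing.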