Python Refactoring Browser

Okay, Python fans, on Smalltalk Instead Of Python it sounds like one thing the Smalltalkers have going for 'em is the Refactoring Browser. So who's working on a Python equivalent?


This page is now pretty out of date. Check out the Bicycle Repair Man page. - Phil Dawes (Jan 2002)


Current Design Plan:

Load the file(s) you want to work on into buffers on a Bicycle Repair Man object, BRM uses tokenize and parser python modules to attach the AST and important token structures to each buffer it holds. Methods of Bicycle Repair Man are then available to do the refactorings.

Status:

Half of the basic analysis methods are working, and adding the rest of them won't be difficult. I'm using the enhanced abstract syntax tree from Python2C, it's easier. I've looked at converting the ast+ back into Python code, it looks pretty hairy to me, if someone feels like contributing code towards that, I'd be most appreciative. -- Shae Erisson


Assuming that no one is working on such a thing right now, what should it be written in (I presume Python, but maybe there is a reason for doing something else), and are there existing Python development environments (IDLE, emacs python-mode, pythonwin) that could be leveraged to provide part of the environment such a creature should be written in? And if someone were to start working on making one what are the User Stories and which ones should be in the first few iterations?

Python is without question the language to write it in. IDLE and pythonwin come with all their own source directly available from themselves, so they're the obvious starting points. I don't know about the User Stories, but the obvious name, for those who are familiar with the sketch, is "Bicycle Repair Man"

Python, of course, but as for leveraging existing environments, the need for managing Objects (Python classes, instances, function libraries) in a GUI begs the question, "Why not a Zope Application Server based IDE?" The GUI extensions in the Mozilla project are a powerful enticement to take a serious look at this possibility.

From a groupware point of view, what could be more natural than managing Python projects "through the Web"? Zope's internal server (HTTP, FTP, WebDAV, XML-RPC) can run on a desktop to support a single user, or heavier hardware for a team of coders, or any combination. Lots of possibilities here.

Interesting, sounds like 'refactoring while in CVS.' Or maybe that's as close to a Smalltalk Image as Python can get.

Actually, this could become literally true. A new Zope module was recently introduced that will allow Zope components to be stored in CVS, in the form of XML snippets. This may be a step or two over the edge, but it does present some interesting possibilities, especially in the realm of groupware for software projects.


Stories

A particular method has been poorly named. Given the old name, and an improved name, all appropriate occurrences of the old name are changed to the new same, unless the new name is already in use.

A particular programmer wants to turn a collection of modules and routines that he's written into a set of objects and methods. Bicycle Repair Man provides the ability to encapsulate the functionality into each object and method, as well as allow code to be moved from the functions to the objects.

The Ubuntu GNU/Linux install process is being demonstrated to new users. The question is asked "Why is it installing something called 'bicyclereapir'?" The successful attempt to determine what the bicyclerepair package does only leads to further questions....

Martin Fowler's book details more refactorings, many of which would apply to Python. We will want to consult it before this project completes (though the above two stories are probably sufficient to get it started.)

minor noise removed

Starting with an existing codebase (IDLE), while potentially a big timesaver, poses XP issues that wouldn't exist in a project started from scratch - depending upon how tightly coupled the new code has to be with the existing code(not very), it would probably take a while to write unit tests for the existing functionality, assuming the current behavior was even understood. And the existing code may not be written to allow easy testing, particularly because of GUI issues.

Having made those negative points, I have been reading the IDLE code trying to understand it, from the perspective of adding refactoring support.[qui?]

Pages we'll write or add to when the project is well under way.

No one has commented on the fact that Python (usually) loads modules from a PYTHONPATH, and that Smalltalk uses an image. I think this has important implications for those that want to write refactoring browsers in Python. A Python-based browser will have to worry about finding and editing files, not just manipulating classes.

Just finding the files is no biggie:

imp.findModule(MyClass.__module__)

But you're right, Bicycle Repair Man will need to work with uncompiled source, or at best the output of the tokenize module, because what has to be written out is source, not bytecodes. A big deal?

Actually, there's a bytecode-to-source decompiler: www.crazy-compilers.com

I don't think so. I remember a couple of years ago, a friend of mine wrote a C++ browser in Smalltalk that worked on files. It was really nice, actually. It parsed the C files into memory, broke them up into their various component sections, and edited them in memory. You had to explicitly save them to disk again when you were done. I think the same approach is feasible in Python. -- Anthony Lander


See Don Roberts PhD thesis on the refactoring browser. Some of the key ideas are that you need an abstract syntax tree representation of a program and you need a database that can let you answer questions like what is inherited by a particular class. st-www.cs.uiuc.edu


What about the way Python classes are mutable?

some_class.some_method = some_other_class.some_method

Also, the dynamic scoping of Python can play havoc with this kind of analysis.

The first isn't exactly something you do every day. But I'm not certain how the dynamic scoping would cause trouble. Can you explain further?

It's not a big deal, you don't deal with the class/method after it's been assigned somewhere else, you only deal with it beforehand.

some_class.some_method = some_other_class.some_method

In the example in the line above, when some_other_class.some_method gets renamed to some_other_class.another_method, you just change it to:

some_class.some_method = some_other_class.another_method

and you're done.



See original on c2.com