Software

Deprecated code analyzer for perl

Robin Smidsrød

03 Jul 2009 • 4 min read

After reading NPEREZ's blog article DarkPAN SchmarkPAN -- STOP THE MEME, it suddenly comes to me: We need a tool that enable users to profile their codebase to verify if they are using any removed, deprecated or changed features of a new version of the perl interpreter. This would make the task of identifying troublesome constructs in code much easier for your average perl programmer. I, for one, do not know when a particular feature of perl was introduced, or when another particaluar construct was deprecated (or removed). And if I read the perl changelog I'm not always aware of how to test if any particular code construct is in use in my code or not. Wouldn't it be possible to make a tool (could possibly be named perldelta or perllint) that scans a piece of code (but doesn't run it) and gives out information about removed and deprecated constructs. To me this seems like a mix between warnings and Perl::Critic, but focused on identifying removed or deprecated constructs in just the perl interpreter instead of advocating general code style (as Perl::Critic does). The tool could by default run against the release it is packaged with, but optionally also against older releases (with a --version option) so that you could test which version of perl really would cause problems for your app. I'm not sure if all of the features that are deprecated/removed can (easily) be tested with this approach, but I'm fairly certain that it would go a long way towards enabling deprecation cycles in core perl development. Another useful function this tool could have was to determine the minimum (or possibly maximum) useful version of the perl interpreter that your code can use. Example:

Let's say that your code uses pseudo-hashes (I don't even know how to identify those). They were removed somewhere in the 5.9.x-series.
This means that your code is incompatible with 5.9 or newer.
Let's also assume that you use some useful UTF-8 construct that makes you require 5.8.x.
That means that as long as you have that pseudo-hash code in there your limited to running your code on perl 5.8.x.
Let us then assume that a new (junior) programmer enters the stage and has heard about this fancy given/when thingy in perl 5.10 and starts using that in the same codebase.
Suddenly you have a codebase that is in fact completely incompatible with any single perl version, but each of the parts are perfectly compatible with some perl version.

With a tool like what I have described it would be easy to determine a specific version requirement for your codebase (or CPAN module), and it would be even easier to pinpoint things that bites you now, or could possibly come and bite you in the near future. This tool would also benefit distribution packagers in determining incompatibilities when they plan on upgrading to a new version of perl (which we should hope they would do frequently). What I like about Ubuntu, as Gabor Szabo wrote about, compared to e.g. Debian, is that it has regular releases every 6 months. It makes me certain that if a certain feature comes along in some popular package, it is never more than 6 months away. Imagine if we could have this kind of predictability with perl. It would be almost like Christmas, only twice a year. I'm not saying that this tool necessarily needs to be bundled with perl, but it should be trivial to get it to run on your machine so that you could profile your code and give you the necessary advice. Chromatic says that the pumpkings need more help, and I believe this is a way to help them. If the tool could also be made to output some kind of structured output (somewhat similar in concept to Debian popularity-contest) we could harness statistics from both CPAN and the DarkPAN (I really dislike that name) to figure out what features of the language are actually in use without divulging any actual copyrighted/restricted code. Imagine, for a second, that you run this tool against your code and then you upload a set of structured data that identifies what kind of code constructs you use in that code (this could happen automatically if you have it set to do that). If any of those constructs become deprecated or removed in the future you would get an email that says so (based on your opt-in/out settings). If you want to go all out you could even store checksums (SHA1) of the files it is run against and store that value in the structured data, that way you could look up reports against a specific file without actually running it. Imagine running something like this recursively on a big project and have it generate a report of your entire codebase (including CPAN deps) and then with very little effort be able to see the obvious blockers towards upgrading to the next version of perl. I also see that this could be used to better determine Kwalitee of a CPAN module. If we actually implement those last parts about uploading structured data to a central repository it is absolutely vital that it contains no trace of either code or the identity of the owner of that piece of code. Because, if it does, no company would (probably) be willing to submit data into the repository. Identification of a particular report/datadump to a company, individual or named code should be completely optional. Anonymization is a vital key here. I'm not exactly sure how we would be able to handle foreign code (C, C++, Java, etc.), or how XS ties into the picture, but I have a feeling it should be doable. Please let me (and others) hear what you have to say about this subject in the comments.