Category: Software
Deprecated code analyzer for perl
By Robin Smidsrød on Jul 3, 2009 | In Software, Perl
After reading NPEREZ's blog article DarkPAN SchmarkPAN -- STOP THE MEME, it suddenly came to me:
We need a tool that enables users to profile their codebase to verify whether they are using any removed, deprecated or changed features of a new version of the perl interpreter.
This would make the task of identifying troublesome constructs in code much easier for your average perl programmer. I, for one, do not know when a particular feature of perl was introduced, or when another particular construct was deprecated (or removed). And if I read the perl changelog I'm not always aware of how to test whether a particular code construct is in use in my code or not.
Wouldn't it be possible to make a tool (it could be named perldelta or perllint) that scans a piece of code (but doesn't run it) and reports removed and deprecated constructs? To me this seems like a mix between warnings and Perl::Critic, but focused on identifying removed or deprecated constructs in just the perl interpreter instead of advocating general code style (as Perl::Critic does).
The tool could by default run against the release it is packaged with, but optionally also against older releases (with a --version option) so that you could test which version of perl really would cause problems for your app.
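To make the idea a bit more concrete, here is a minimal sketch of such a scan in Perl. Everything in it is an assumption for illustration: the rule table, the `scan_source` name and the naive regexes are made up, and a real tool would need a proper parser, since plain text matching cannot see every construct. The code being checked is only read as text, never run.

```perl
use strict;
use warnings;
use version;

# Hypothetical rule table. The regexes are naive text matches, not a parser.
my @RULES = (
    { name => 'given/when',  re => qr/^\s*given\s*\(/m,  introduced => '5.10.0' },
    { name => '$* variable', re => qr/(?<!->)\$\*\s*=/, removed    => '5.10.0' },
);

# Return one message per rule that matches $source and conflicts with
# the perl release named in $target (the "--version" idea from above).
sub scan_source {
    my ($source, $target) = @_;
    my $want = version->parse($target);
    my @findings;
    for my $rule (@RULES) {
        next unless $source =~ $rule->{re};
        if ($rule->{removed} && $want >= version->parse($rule->{removed})) {
            push @findings, "$rule->{name}: removed in $rule->{removed}";
        }
        if ($rule->{introduced} && $want < version->parse($rule->{introduced})) {
            push @findings, "$rule->{name}: needs $rule->{introduced} or newer";
        }
    }
    return @findings;
}

# Sample input: one removed construct and one 5.10 construct.
my $code = "\$* = 1;\ngiven (\$x) { when (1) { print 'one' } }\n";
print "$_\n" for scan_source($code, '5.8.8');
```

Checked against 5.8.8 the sample only trips the given/when rule; checked against 5.10.0 it only trips the `$*` rule.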
I'm not sure if all of the features that are deprecated/removed can (easily) be tested with this approach, but I'm fairly certain that it would go a long way towards enabling deprecation cycles in core perl development.
Another useful function this tool could have would be to determine the minimum (or possibly maximum) version of the perl interpreter that your code can run on.
Example:
- Let's say that your code uses pseudo-hashes (I don't even know how to identify those). They were removed somewhere in the 5.9.x-series.
- This means that your code is incompatible with 5.9 or newer.
- Let's also assume that you use some useful UTF-8 construct that makes you require 5.8.x.
- That means that as long as you have that pseudo-hash code in there you're limited to running your code on perl 5.8.x.
- Let us then assume that a new (junior) programmer enters the stage and has heard about this fancy given/when thingy in perl 5.10 and starts using that in the same codebase.
- Suddenly you have a codebase that is in fact completely incompatible with any single perl version, even though each of its parts is perfectly compatible with some perl version.
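The example above boils down to intersecting version ranges. Here is a small sketch of that arithmetic, assuming the scanner has already produced a list of findings. The `version_window` name and the shape of the findings are hypothetical, and the exact version numbers are just the ones from the example (`below` is an exclusive upper bound, `min` an inclusive lower bound).

```perl
use strict;
use warnings;
use version;

# Hypothetical findings mirroring the example above.
my @findings = (
    { construct => 'pseudo-hashes',   below => '5.9.0'  },
    { construct => 'UTF-8 construct', min   => '5.8.0'  },
    { construct => 'given/when',      min   => '5.10.0' },
);

# Intersect all constraints: keep the highest lower bound and the
# lowest upper bound, then check that the window is not empty.
sub version_window {
    my @constraints = @_;
    my ($min, $below);
    for my $c (@constraints) {
        if (defined $c->{min}
            && (!defined $min || version->parse($c->{min}) > version->parse($min))) {
            $min = $c->{min};
        }
        if (defined $c->{below}
            && (!defined $below || version->parse($c->{below}) < version->parse($below))) {
            $below = $c->{below};
        }
    }
    my $compatible = !defined $min || !defined $below
        || version->parse($min) < version->parse($below);
    return { min => $min, below => $below, compatible => $compatible ? 1 : 0 };
}

for my $case ([ 'without given/when', @findings[0, 1] ], [ 'with given/when', @findings ]) {
    my ($label, @c) = @$case;
    my $w = version_window(@c);
    print "$label: ", ($w->{compatible}
        ? "perl >= $w->{min} and < $w->{below}"
        : "no single perl version fits"), "\n";
}
```

With only the first two findings the window is 5.8.0 up to (but not including) 5.9.0; adding given/when pushes the lower bound to 5.10.0 and the window becomes empty.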
With a tool like the one I have described it would be easy to determine a specific version requirement for your codebase (or CPAN module), and it would be even easier to pinpoint things that bite you now, or could come and bite you in the near future.
This tool would also benefit distribution packagers in determining incompatibilities when they plan on upgrading to a new version of perl (which we should hope they would do frequently).
What I like about Ubuntu compared to e.g. Debian, as Gabor Szabo wrote, is that it has regular releases every 6 months. That assures me that if some feature lands in a popular package, it is never more than 6 months away. Imagine if we could have this kind of predictability with perl. It would be almost like Christmas, only twice a year.
I'm not saying that this tool necessarily needs to be bundled with perl, but it should be trivial to get it running on your machine so that you could profile your code and get the necessary advice. Chromatic says that the pumpkings need more help, and I believe this is a way to help them.
If the tool could also be made to emit some kind of structured output (somewhat similar in concept to Debian popularity-contest) we could harness statistics from both CPAN and the DarkPAN (I really dislike that name) to figure out what features of the language are actually in use without divulging any actual copyrighted/restricted code. Imagine, for a second, that you run this tool against your code and then upload a set of structured data that identifies what kind of code constructs you use in that code (this could happen automatically if you have it set to do that). If any of those constructs become deprecated or removed in the future you would get an email that says so (based on your opt-in/out settings).
If you want to go all out you could even compute checksums (SHA1) of the files it is run against and store them in the structured data. That way you could look up reports for a specific file without actually running the tool on it. Imagine running something like this recursively on a big project, having it generate a report of your entire codebase (including CPAN deps), and then with very little effort being able to see the obvious blockers towards upgrading to the next version of perl.
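As a rough sketch of what one entry in such a report could look like, assuming JSON as the structured format (my pick for illustration, nothing more) and a hypothetical `report_entry` helper: the entry carries only a SHA1 of the file contents and the names of the constructs found, so no code, path or owner leaks into it. Digest::SHA and JSON::PP both ship with recent perls.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha1_hex);
use JSON::PP;

# Build one anonymous report entry from a file's raw bytes and the
# construct names a (hypothetical) scanner found in it.
sub report_entry {
    my ($source_bytes, @constructs) = @_;
    return {
        sha1       => sha1_hex($source_bytes),   # checksum only, never the code
        constructs => [ sort @constructs ],      # names only, no line contents
    };
}

my $entry = report_entry("print qq{hello};\n", 'given/when', 'pseudo-hash');
print JSON::PP->new->canonical->pretty->encode($entry);
```

Anyone holding the same file can recompute the checksum and look up its report, while the report itself says nothing about where the file came from.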
I also see that this could be used to better determine Kwalitee of a CPAN module.
If we actually implement those last parts about uploading structured data to a central repository it is absolutely vital that it contains no trace of either code or the identity of the owner of that piece of code. Because, if it does, no company would (probably) be willing to submit data into the repository. Identification of a particular report/datadump to a company, individual or named code should be completely optional. Anonymization is a vital key here.
I'm not exactly sure how we would be able to handle foreign code (C, C++, Java, etc.), or how XS ties into the picture, but I have a feeling it should be doable.
Please let me (and others) hear what you have to say about this subject in the comments.
Let's make a perl appreciation survey for the masses
By Robin Smidsrød on Jun 12, 2009 | In Software, Perl
This is a response to John Napiorkowski's article: Why All the Hate?
Matt S. Trout was talking about the fact that we need to become better at marketing. I second that.
One of the tools of marketeers is the survey.
My suggestion is this: Why don't we publish a survey about perl and get it answered by as many people as possible, even the nay-sayers, redditors and diggers? This way the perl community as a whole could better grasp which parts we really need to focus on, based on user feedback. And we'd even include people outside the core community.
I believe that some of the questions should be like this (some alternatives in parenthesis):
- How much do you like perl?
- What are perl's biggest strengths? (we need to make a list here)
- How much do you dislike perl?
- What are perl's biggest weaknesses? (same here)
- How long have you been using perl?
- How are you using perl? (scripter, sysadmin, non-user, cpan dev)
- Are you a member of the perl community? (current, former, never, left)
- What is your age?
- Are you excited about perl6? (yes/no/neutral)
- Do you think the CPAN install process requires improvement? (yes/no/neutral)
- Do you follow perl-related websites? (yes/no/neutral)
It would be wonderful if EPO, P5P or the Perl Foundation would find the means to issue a survey like this. If it came from any of these organisations it would probably be taken seriously. This way we could really find out what matters to people, instead of just "guessing" what it is.
I know that something like this was done in the past, and it was what ignited the whole perl6 process. Isn't it about time we do something again?
If you have more suggestions for questions/options to the survey, please state so in the comments. Other comments are also most welcome.
X11 forwarding to Ubuntu server not working with PuTTY and Xming?
By Robin Smidsrød on Jun 7, 2009 | In Software
I was stumped as to why I couldn't run X11 software on my Ubuntu server installation and have it forwarded through my SSH connection to my Windows desktop machine running Xming.
The solution was to install the xauth package from the Ubuntu repositories:
$ sudo aptitude install xauth
Apparently installing an X11 application (gitk) didn't pull in that dependency.
Now I can finally run gitk and visualize those complicated git graphs. Yay!
Memory footprint of popular CPAN modules
By Robin Smidsrød on May 26, 2009 | In Software, Perl
I was reading Jay Kuri's article about CGI alternatives the other day, and it got me thinking: how much memory do these various modules for simple (or advanced) web serving use?
After having looked through Mark Stosberg's article on startup penalties I was even more bewildered. It was hard to track the actual cost of each module, because the perl interpreter footprint was also in there (and that we cannot do anything about).
I wrote a small script called perlbloat.pl to check how each of the mentioned modules comes out. It uses the GTop module, which is Gnome's cross-platform way of counting things such as memory.
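For readers without GTop, the measuring idea can be sketched roughly like this. This is not perlbloat.pl itself: it swaps GTop for reading `/proc/self/statm`, so it is Linux-only, it reports the growth in total virtual size, and its numbers will not necessarily match the ones below. The `process_size` and `measure` names are made up for the sketch.

```perl
use strict;
use warnings;
use POSIX qw(sysconf _SC_PAGESIZE);

# Total virtual memory size of this process in bytes (Linux only).
sub process_size {
    open my $fh, '<', '/proc/self/statm'
        or die "cannot read /proc/self/statm: $!";
    my ($pages) = split ' ', scalar <$fh>;   # first field: size in pages
    close $fh;
    return $pages * sysconf(_SC_PAGESIZE);
}

# Load each named module in turn and record how much the process grew.
# Note: the module name goes through a string eval, so only pass names
# you trust.
sub measure {
    my @modules = @_;
    my %added;
    for my $module (@modules) {
        my $before = process_size();
        eval "require $module; 1" or die "cannot load $module: $@";
        $added{$module} = process_size() - $before;
    }
    return \%added;
}

my $added = measure(@ARGV);
printf "%s added %d bytes\n", $_, $added->{$_} for @ARGV;
```

Because each module is measured after the ones before it, anything already loaded by an earlier module is not counted again, which is why loading order matters in measurements like these.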
The results below came from this command:
$ for name in $(echo CGI HTTP::Engine FCGI::Engine Catalyst CGI::Application Squatting Continuity Mojo Mojolicious Titanium HTML::Mason CGI::Simple); do perlbloat.pl $name; done
| Module | Memory (bytes) |
|---|---|
| CGI::Application | 135 168 |
| CGI::Simple | 536 576 |
| Squatting | 540 672 |
| CGI | 602 112 |
| Continuity | 1 163 264 |
| HTTP::Engine | 2 072 576 |
| Mojo | 2 719 744 |
| HTML::Mason | 2 916 352 |
| Mojolicious | 3 526 656 |
| Titanium | 3 559 424 |
| FCGI::Engine | 10 280 960 |
| Catalyst | 11 046 912 |

Version numbers are as follows (running on perl 5.10.0 on Ubuntu 8.10):
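| Module | Version |
|---|---|
| Catalyst | 5.80004 |
| CGI | 3.29 |
| CGI::Application | 4.21 |
| CGI::Simple | 1.110 |
| Continuity | 1.0 |
| FCGI::Engine | 0.08 |
| HTML::Mason | 1.42 |
| HTTP::Engine | 0.1.8 |
| Mojo | 0.9002 |
| Mojolicious | 0.9002 |
| Squatting | 0.60 |
| Titanium | 1.01 |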
What is interesting to notice here is that CGI::Application actually comes out with a lower footprint than CGI::Simple. Considering CGI::Application has a somewhat bigger API this is surprising.
It is of course no surprise that Catalyst is the most memory hungry module of them all. What seems surprising, though, is that FCGI::Engine eats so much. It would be nice to know why.
Given these numbers, I would like to hear good reasons for using Catalyst in a high-performance environment. To me it seems like the application servers will take a thrashing because of the increased memory usage of each process if you compare it to e.g. CGI::Application. Even Titanium, which is pretty feature-packed, uses roughly a third of the memory Catalyst does.
What is interesting to notice is that if you consider the typical deployment scenario for a Catalyst-based app you get these numbers:
$ perlbloat.pl Moose DBIx::Class Catalyst
Moose added 4.8M
DBIx::Class added 392k
Catalyst added 5.7M
Moose DBIx::Class Catalyst added 10.9M in total
If you consider a similar app based on HTTP::Engine you will have this overhead:
$ perlbloat.pl Moose DBIx::Class HTTP::Engine
Moose added 4.8M
DBIx::Class added 396k
HTTP::Engine added 1.4M
Moose DBIx::Class HTTP::Engine added 6.7M in total
If you turn the loading order around a little bit you get this:
$ perlbloat.pl DBIx::Class Moose HTTP::Engine
DBIx::Class added 528k
Moose added 4.7M
HTTP::Engine added 1.4M
DBIx::Class Moose HTTP::Engine added 6.7M in total
What you can see from this last dump is that Moose and DBIx::Class share some code: DBIx::Class adds 528k when loaded first but only 396k when loaded after Moose, a difference of 132k. That is mostly irrelevant when you consider the cost of the rest.
Another package that is getting wildly popular these days is MooseX::Declare (0.22 tested), and as you can see, it has a pretty large footprint as well:
$ perlbloat.pl MooseX::Declare
MooseX::Declare added 10.3M
If you separate Moose and MooseX::Declare you can see that MooseX::Declare adds a lot by itself (it's not only Moose that costs):
$ perlbloat.pl Moose MooseX::Declare
Moose added 4.8M
MooseX::Declare added 5.4M
Moose MooseX::Declare added 10.2M in total
If you have something to say about the numbers I've collected here I would love to hear it. Feel free to post comments.
Why Should I Learn This?
By Robin Smidsrød on May 5, 2009 | In Software, Software Design, Education
I was talking with my wife (who is a teacher) about how to make students want to learn a specific subject. My solution to the problem was that you have to make the student understand how learning the subject at hand can improve the quality of something they already enjoy doing.
I sat down for a few moments and jotted down this class diagram:
Basically, you have a many-to-many relationship between subjects and interests. Subjects can be history, geometry, math, grammar, music etc. Interests can be soccer, music, sports, etc.
The connection between them lists the reasons for learning that specific subject based on your hobbies or interests.
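One way to picture that model in code, as a minimal sketch: the connection between a subject and an interest becomes a nested hash holding the list of reasons. The function names and the sample reasons are invented purely for illustration.

```perl
use strict;
use warnings;

# Link table of the many-to-many relationship:
# subject => interest => list of reasons.
my %reasons;

# Attach a reason to one (subject, interest) pair.
sub add_reason {
    my ($subject, $interest, $reason) = @_;
    push @{ $reasons{$subject}{$interest} }, $reason;
    return;
}

# All reasons to learn $subject for someone with the given interests.
sub reasons_for {
    my ($subject, @interests) = @_;
    my $by_interest = $reasons{$subject} || {};
    return map { @{ $by_interest->{$_} || [] } } @interests;
}

# Invented sample data.
add_reason('geometry', 'soccer', 'Angles help you judge passes and shots.');
add_reason('math',     'music',  'Rhythm is counting and fractions.');
add_reason('history',  'music',  'Styles make more sense when you know their era.');

print "$_\n" for reasons_for('math', 'music', 'soccer');
```

A student who likes music and soccer and asks "why should I learn math?" gets every reason stored under those two interests; a website would only need a lookup form and an "add a reason" form on top of this.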
A website that allows easy lookup (and addition) of reasons could be a good resource for students, teachers and parents who regularly end up in the discussion mentioned in the title.
Want to help make it a reality?
