I usually find that you run somewhere between 1.5x and 2x as many application processes per server as you have cores - more if most requests spend a lot of time blocked waiting on the database.
The upshot of this is that on the average web node, an extra 10MB per process costs you somewhere between 60MB (1.5x, 4 cores) and 320MB (2x, 16 cores). And of course there's copy-on-write, which means you're probably losing less than that in actual memory usage.
Most real production web nodes I see have either 4 or 8GB of RAM these days, since it simply doesn't cost enough to be worth bothering with less.
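The arithmetic is straightforward; a quick sketch using the figures above (the multipliers and core counts are just the ones quoted, not measurements):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Extra resident memory per node = per-process overhead x process count,
# where process count = cores x a multiplier (1.5x to 2x, per the text above).
sub extra_mb {
    my ($per_process_mb, $cores, $multiplier) = @_;
    return $per_process_mb * $cores * $multiplier;
}

printf "low end:  %d MB\n", extra_mb(10, 4, 1.5);   # 60 MB
printf "high end: %d MB\n", extra_mb(10, 16, 2);    # 320 MB
```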
So does a couple hundred meg of RAM actually matter in terms of your overall hardware requirements if it reduces your application's development time significantly?
CGI::Application and many of its plugins use lazy loading, so startup costs are deferred until objects are actually used. If your script had called $self->query, which instantiates CGI.pm by default, CGI::Application's memory usage would go way up. Add a template module and some plugins, and it would go up even more. And realistically, you would do just that over the lifetime of a real-world application. So how much does it matter in the long run? If you want features, you have to pay, either now or later.
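The lazy-loading pattern described here can be sketched like this (Data::Dumper stands in for CGI.pm so the example is self-contained; the class and method names are invented for illustration):

```perl
package My::App;
use strict;
use warnings;

# Lazy-loading accessor: the helper module is neither parsed nor its
# object constructed until the accessor is first called, so scripts that
# never call it never pay for it. (Data::Dumper stands in for CGI.pm.)
sub new { bless {}, shift }

sub dumper {
    my $self = shift;
    $self->{dumper} ||= do {
        require Data::Dumper;      # module parsed only on first use
        Data::Dumper->new([]);
    };
    return $self->{dumper};
}

package main;
my $app = My::App->new;
my $loaded_before = exists $INC{'Data/Dumper.pm'} ? 1 : 0;
$app->dumper;                      # first use triggers the require
my $loaded_after  = exists $INC{'Data/Dumper.pm'} ? 1 : 0;
print "before: $loaded_before, after: $loaded_after\n";  # before: 0, after: 1
```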
Comment from: egor [Visitor]
Memory is cheap.
Development time is valuable.
@Matt: I'm currently in a situation at work where my application is memory-bound. That's why I'm investigating the memory footprint of CPAN modules (since we're using a lot of them). If I can choose a similar module with a lower footprint but equal performance, it's a win in the end (higher concurrency per server).
@jaldhar: I wasn't aware that CGI::Application used lazy loading. And yes, its memory cost will rise as you start using features. But that's actually a very good point. Why should you take the penalty of code you're not using? I would very much like it if more modules used this approach so that memory would better be preserved.
@robin: You said "Why should you take the penalty of code you're not using? I would very much like it if more modules used this approach so that memory would better be preserved."
You're unfortunately completely disregarding the effects of copy-on-write here. For most long-running servers, you'll be better off if as much code as possible is loaded before the fork. Modules that use lazy loading by default and don't respect e.g. the "prefork" pragma often have much worse memory (and maybe even CPU) usage, because they load their code only after the fork: every running fork of your application server not only has to parse the modules anew, it also always ends up with freshly allocated, unshared memory.
To get optimal memory usage, you actually want everything you use loaded before the fork. Unfortunately, most CPAN modules do not yet use the prefork pragma to achieve this.
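The preload-before-fork idea can be sketched like this (Data::Dumper stands in for a real application dependency; code pulled in with "use" in the parent is compiled once and its pages are shared copy-on-write with every child, while a child that require()s it after the fork re-parses it into private memory):

```perl
#!/usr/bin/perl
use strict;
use warnings;

use Data::Dumper;    # preloaded in the parent, shared with all children

my @kids;
for (1 .. 2) {
    my $pid = fork;
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: the module is already in %INC - nothing to re-parse,
        # and its compiled code still lives in pages shared with the parent.
        exit(exists $INC{'Data/Dumper.pm'} ? 0 : 1);
    }
    push @kids, $pid;
}

my $all_preloaded = 1;
for my $pid (@kids) {
    waitpid $pid, 0;
    $all_preloaded = 0 if $? != 0;
}
print $all_preloaded ? "children inherited the preloaded module\n"
                     : "a child had to load it itself\n";
```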
@markus: I like that the original CGI.pm offers a -compile option, so that you can preload if you want to and make use of copy-on-write. That way the user of the module can choose based on their intended usage and deployment scenario.
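For reference, a minimal sketch of that preload option: CGI.pm normally defers compiling its autoloaded methods until first use, but CGI->compile(':all') (equivalent to "use CGI qw(-compile :all)") forces compilation up front, so a preforking parent does the work once and children share it copy-on-write. Guarded with eval here because CGI.pm left the Perl core in 5.22 and may not be installed:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $status;
if (eval { require CGI; 1 }) {
    # Force compilation of CGI.pm's autoloaded methods now, before any
    # fork, instead of lazily on first call in each child.
    CGI->compile(':all');
    $status = 'CGI methods precompiled';
} else {
    $status = 'CGI.pm not installed; skipping';
}
print "$status\n";
```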
Let's take Moose. It's a really good package, and more and more modules use it (which is very nice). But it has broad applications, which means the writers of Moose can't assume in advance what environment you're going to use it in or what your primary concerns are. In some cases it might be startup time, in others CPU usage, and in a third memory consumption. Wouldn't it be better if the user could specify their focus, so that whenever the module (in this case Moose) had to make a decision involving a CPU/memory tradeoff, it would follow the user's preference? This way the module could cater equally well to single-instance shell scripts, long-running servers, and highly forked/threaded environments. This would of course increase the complexity of the module, but for some usage patterns it could be a huge win.
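A purely hypothetical sketch of such a user-specified focus - nothing like this exists in Moose, and every name here is invented for illustration:

```perl
package My::Module;
use strict;
use warnings;

# Hypothetical: let the caller declare an optimization focus at import
# time, and branch on it internally when trading CPU against memory.
my $FOCUS = 'memory';    # default: what most users will probably want

sub import {
    my ($class, %opts) = @_;
    $FOCUS = $opts{optimize} if $opts{optimize};  # 'memory'|'cpu'|'startup'
}

sub build_cache {
    # A module could trade space for speed based on the declared focus:
    # precompute (more memory, less CPU) only when the user asked for it.
    return $FOCUS eq 'cpu' ? { precomputed => 1 } : {};
}

package main;
My::Module->import(optimize => 'cpu');
my $cache = My::Module::build_cache();
print "focus honoured: ", (exists $cache->{precomputed} ? "yes" : "no"), "\n";
```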
But lazy loading combined with a preload/compile option (or proper use of the prefork pragma) would be a good start. Maybe this subject needs more visibility.
@everyone: In my opinion you should always make an educated choice about when to cater for increased CPU usage or increased memory usage (or IO, for that matter). If the user (in this case a developer) can choose which one to focus on, things will be better for all of us in the end. Just make the default the one the majority of users will probably want.
@egor: Memory is not cheap on embedded platforms (like cellphones, STBs, and routers). Neither is CPU or IO. Which is why developers who work on those platforms need to be very talented. Wouldn't it be nice if Perl were actually an option for them? I don't see a reason why we (as a community) should exclude embedded platforms. There is, after all, a lot of money to be made in that arena.
@Markus: "To have optimal memory usage, you actually want to have everything you use loaded before the fork."
Well, yes, but "everything you use" is kind of the key point (and the one I read Robin as making), isn't it?
For example, a lot of web frameworks carry their own mini web servers along with them. If I'm going to be running my application under apache, then the framework's built-in server will never be used, so it should never be loaded, either before or after apache forks.
I agree completely that everything you're going to use should be loaded before forking, but it would be nice if there were also a way to arrange things so that features you're not going to use don't get loaded at all.
Hey, neat! Thanks for this.
I'm not going to repeat the "memory is cheap" argument. Twitter requiring insane amounts of horsepower to do very little (and still failing from resource starvation left and right) makes it clear that that argument has limits.
CGI of course hides most of its code behind AUTOLOAD, so it's actually more of a pig than these numbers suggest.
Thank you for including Continuity. Since I don't take much time to really pimp it (or polish it), I'm always surprised when it gets a mention. On that note... of pimping it... unlike CGI and mod_perl based servers, all users share a single Continuity instance, which adds another dimension. Resource leaks hurt more on one hand, but on the other, the price of that overhead is only paid once. And of course there are other dimensions... being able to do crazy in-process debugging tricks and inspect users' data and application state. But that's tangential to this. Not that other systems don't have their own unique sets of perks.
Enlightening, interesting, and a good conversation piece. Cheers!
I did some memory surveys of my.opera.com back in the day, and back then, GTop wasn't very reliable. I don't remember how I discovered it, but I found that I had to check the Linux smaps (/proc/<pid>/smaps) to get any reasonable results. I hacked up a Munin plugin to monitor it on an Apache system.
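A minimal sketch of that smaps approach (Linux only; it sums the Private_Dirty fields, i.e. memory that is truly private to the process rather than shared copy-on-write, which is exactly what GTop-style RSS numbers overstate):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sum the Private_Dirty lines of /proc/<pid>/smaps to estimate how much
# of a process's memory is genuinely its own. Returns undef where
# /proc/<pid>/smaps doesn't exist (non-Linux systems).
sub private_dirty_kb {
    my ($pid) = @_;
    open my $fh, '<', "/proc/$pid/smaps" or return undef;
    my $kb = 0;
    while (my $line = <$fh>) {
        $kb += $1 if $line =~ /^Private_Dirty:\s+(\d+)\s+kB/;
    }
    close $fh;
    return $kb;
}

my $kb = private_dirty_kb($$);    # measure this very process
print defined $kb ? "private dirty: $kb kB\n" : "smaps not available\n";
```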
Beyond that, I think it is very important how the application scales with more users. If it can share memory between processes, you can scale sublinearly, i.e. if you double the number of users you don't need to double the hardware resources. If you can do this well, I'd say development time counts much more. Remember, 1 GB of DDR2 RAM goes for around NOK 70 these days...
BTW, I've been working in the Java world lately, and I find myself very happy if an app consumes less than half a gig of RAM - and sublinearity is not at all realistic... ;-)