People seem to always talk in absolutes, black and whites, concerning this topic.
In my experience most engineers take a situational approach. Sometimes it is worth it to optimize early, sometimes it is not. Some things are worth optimizing, some not. And some things are only worth optimizing to a certain level.
These types of articles and claims conjure up a debate because everyone is imagining a different scenario in their heads. It is entirely plausible that each claim could be the best solution in a different scenario.
Do these articles assume that engineers are incapable of thinking flexibly and need to follow some absolute truths that have to be debated? Have I been around such good engineers that I have not noticed this issue?
I think maybe yes, you've been around mostly good engineers.
It's not a problem that's exclusive to coding, you find people talking loudly about absolutes in most fields. People like definitive answers. Makes us feel secure.
Which makes absolutes easy to sell, and especially popular with people who feel out of their depth. It also makes them popular in academia.
You also find flexible, creative thinkers in the same fields. They're just not as loud, or as numerous, in my experience.
Absolutes can be indispensable when learning. An entire concept can be boiled down to black-and-white, and often can be completely forgotten about while a learner groks the rest of a subject. Part of the path to expertise for a learner is to go back and challenge those absolutes, because those absolutes are now part of the learner's assumptions.
When people espouse absolutes, I generally (not absolutely) assume either they're not experts or they think their audience can't handle nuance.
> Do these articles assume that engineers are incapable of thinking flexibly and need to follow some absolute truths that have to be debated?
I think this article assumes that there are possibly engineers that don't think flexibly and need to follow some "absolute truths".
> Have I been around such good engineers that I have not noticed this issue?
Probably. I mean, I would love to be among your colleagues.
In some places, submission to ideas rumored to be proven is valued more than flexible thinking or critical thinking of any kind. I know that some Asian countries have a rather strong culture like this. It doesn't apply to everyone.
Speculation time (not a linguist): I think this is an inherent problem in human languages; they're fundamentally not sophisticated enough to communicate nuance. We see this all the time in engineering, politics, etc. People talk in terms of rules because rules are easier to phrase, but in reality everyone makes every decision on a situation-by-situation basis. When you do need to explain those nuances, you simply need to use a lot more words.
I think this also makes sense. Take a cat: cats are clearly able to think in far more complex ways than they can communicate. They use a set of very basic body cues to communicate, but they can process information far more complex than those cues. Why would this not be true of humans too?
I have some friends who are very opinionated on coding styles, libraries, programming languages, programming "rules", "principles", etc. They're all sort of rubbish. I think it's important to know and study them because they give you different ways to think about engineering. But when it's time to commit code to production, does anyone really think "ah, this is not DRY, premature optimization, NIH, etc."? I claim no. We always think about the specific situation. Is it worth inventing this code ourselves? Is it worth optimizing for this possible future extension? Is it acceptable to repeat this particular code? We all know repeating code is "bad", but we all also know reasonable exceptions.
So yes, all rules are by nature fuzzy. In fact, whenever someone tells me a principle they hold, I immediately start thinking about fuzzy-fying it i.e. thinking of possible niche cases where this principle would reasonably NOT hold.
IMO part of this is that nuanced processing is actually a very expensive operation, and people would be unable to function without some level of abstraction, in language as in all things. That's why, for example, Jeff VanderMeer writes in his Wonderbook that each author has a "thorn": something that bugs them and compels them to write and explore infinite nuances of the topic of their focus.
I think it's systemic, not really linguistic. Complexities are fractal, and there's fundamentally no way to succinctly express them in a title or summary.
I just finished a project where literally everything had been over-engineered to the point that it took them weeks to do a release.
The code generally had one path but every method was built using dictionaries of possible classes that would implement interfaces so they wouldn’t have any duplicate code.
It doesn't sound like inner-platform-effect to me. teddyuk didn't say they were reinventing the wheel, just that they were doing awful things with dictionaries.
> "The inner-platform effect is the tendency of software architects to create a system so customizable..."
Yup, that was it, and I have seen it so many times: architects thinking that a flexible system "can just" handle any possible future requirement. This always ends in a mess of junk that isn't actually needed by the business.
> Do these articles assume that engineers are incapable of thinking flexibly
There are some developers like that: they read about some pattern or idea in a book, and then in code reviews they will jump on you because you did X wrong and did not follow their beloved pattern from the book.
I read somewhere else yesterday that "it's easier to throw hardware at a problem than people." It certainly does not hold true for all cases, but turning this aphorism into a question seems to give us some good heuristics, provided you know what the root cause of your performance issues could be, of course.
A rather large sector of software where you don't throw hardware at the problem is embedded software. Shipping hardware that is 25€ more expensive times a million is often much more expensive than optimizing the software.
I've also seen the opposite, quite often actually: Cheaping out on hardware that is going to ship maybe 1-10k (very expensive) units, then spending hundreds of thousands on optimizing software to make it not even good, just less painfully slow. The i.MX6 chip with its weak GPU and corresponding wonky drivers is an especially popular way to get user interfaces that can't keep 60 fps.
The i.MX6 is certainly a poor choice today but do you know of any good contemporary alternatives? Honest curiosity because personally I can't think of any medium-power (or at least thermal dissipation), (Mainline-) Linux capable, well (and openly) documented processor available long-term in low/medium quantities from a proven vendor.
I've recommended Toradex Tegra 2 modules to a customer (I got involved early enough to recommend hardware - a rare case) and they seem to be quite happy with it. Just a few euros more per module than i.MX6 from the same module vendor. The GPU is unsurprisingly (with nVidia making the whole SoC) pretty good. Most everything needed for software support except the user-space graphics driver is even open source. I am not an nVidia fan because of their desktop Linux driver shenanigans, but I strongly prefer Tegra 2 over i.MX6. By the way, i.MX6/6+/7 with the etnaviv driver might also be alright. I've just never used anything but the proprietary "gal3d" driver.
Regarding mainline Linux support, AFAIK you don't get mainline Linux support anyway with i.MX6 and the gal3d driver. You can only use mainline with etnaviv, which is semi-officially(?) supported by Pengutronix. I've heard others say good things about etnaviv.
It was easy to throw hardware at a sequential problem a decade ago. CPUs were still improving their single-core performance year after year, and memory access and all kinds of IO devices kept getting faster and better. I'm guessing that's the era this idea originates from.
Today, things are different. Single-threaded performance isn't going to improve much in the foreseeable future; the effort has shifted to improving parallel performance and, more recently, power consumption. So if you write slow code, perhaps by choosing a slow software stack, your code will remain slow.
It is also a bit less easy to throw hardware at a problem when your software is actually embedded firmware. Upgrading or adding hardware reduces your profit per unit.
But TBH in this particular field Moore's law still rules. Sometimes you are forced to upgrade to a better component for the same price because the component you chose five years ago has reached end-of-life.
Yet paying attention to optimization allows you to do more with what you have, or sometimes allows you to use less optimal but more convenient or simpler approaches.
In cases where you control the hardware and the problem is sufficiently parallelizable.
If you are releasing consumer applications, your program either works smoothly on the device the consumer already has, or it doesn't. You can't expect them to buy a better device, whether because they can't afford it or because it fundamentally doesn't exist (you simply won't find a phone with desktop-level performance anywhere, because there isn't one).
This is partly what I was trying to convey with the "turning this aphorism into a question" fragment. Sometimes it is just not possible, or it is too expensive, to increase hardware performance.
Premature optimization, or at least one of its manifestations, is failing to consider the opposite.
People seem to always talk in absolutes, black and whites, concerning this topic.
Maybe this is just my cynical perception, but I often hear people try to justify situational bad practice by citing outliers unrelated to the situation at hand, particularly with optimization. As you say, they usually imagine a different scenario is at hand. For those situations I quote HL Mencken:
Explanations exist; they have existed for all time; there is always a well-known solution to every human problem—neat, plausible, and wrong.
I interpret the premature optimization dictum to mean:
Really the only time to talk about optimization at the early stage of a project is to set performance objectives or when the project is about optimization. Otherwise people should seek relatively optimal solutions which optimize efficiency at delivery. Optimization starts when they have running code that doesn't meet performance goals. Other optimization is premature, in one sense or another.
One thing I haven't seen in the comments yet is the simple observation that premature optimization is, by definition, a mistake -- otherwise it'd merely be "optimization". The real question is, what makes it "premature"?
I think Knuth (in his famous "... root of all evil" quote) was referring specifically to programming, and to spending time on optimizations prior to functional completion. The mantra, "First make it. Then make it work. Then make it better." [or similar] holds water. But it's pointless to debate in general terms what precisely constitutes a "premature" optimization, vs a merely timely one, or adherence to best practices, given the infinite combinations of circumstance and context relating to software development projects.
We don't have to speculate, the paper is online[1] and he is very explicit about what he meant.
The paper is, in fact, an ode to optimisation and the necessity of optimisation, including, very specifically, micro-optimisation. The "root of all evil..." part is a "yes, but". You actually have to leave out much of the actual sentence in order to strip it of its meaning:
"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil."
If that wasn't clear, the very next sentence is as follows:
"Yet we should not pass up our opportunities in that critical 3%"
A little later in the paper:
"The conventional wisdom [..] calls for ignoring efficiency in the small; but I believe this is simply an overreaction [..] In established engineering disciplines a 12% improvement, easily obtained, is never considered marginal; and I believe the same viewpoint should prevail in software engineering."
> "First make it. Then make it work. Then make it better."
That only works when you do simple things, for harder things the "make it better" part usually requires a complete rewrite if you didn't properly think things through from the beginning.
A better mantra would be "First make it, then make it again, then make the real version".
No. Engineering is an iterative process. Even if you're 're-writing' everything, you're still not starting again from scratch, rather you're building off your previous approaches.
> The real question is, what makes it "premature"?
When this principle has come up for me it's not so much about optimization, but about the cost you're paying to have that optimization: In particular, the cost in terms of code complexity. If you've already got generic code for a self-balancing red-black tree, and you can just plug your new data structure into it, and you're reasonably sure that your thing will get 10 nodes on occasion, then sure, use it, even if you don't know whether you'll ever get much larger than that. But if you don't have that code handy, then don't write a self-balancing red-black tree unless you have benchmarks to show that it will help. And don't use a self-balancing R-B tree if you don't know whether the number of nodes ever goes past 2.
(Or, I mean, unless it's a personal project and you want to practice writing a self-balancing R-B tree; but at that point you're not optimizing, you're practicing / having fun, so the advice isn't applicable.)
If code structure A and code structure B are both about equally comprehensible, and you think code structure A will be more effecient, go ahead and use it. But if structure A is extremely complicated, and will make the understanding and maintenance of the system more difficult and prone to error, then don't use it unless you know it's actually worth the cost.
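To make the tradeoff concrete: in a language with batteries included, the small-n case rarely needs a tree at all. A rough sketch (class name and API are hypothetical, just for illustration) of the "cheap and obvious" alternative, using the standard-library `bisect` module:

```python
import bisect

class SortedBag:
    """Keeps items sorted. For tens of elements this beats hand-rolling a
    balanced tree: insertion is O(n), but the constants are tiny and the
    code is obvious. Reach for a real tree only when profiling says so."""

    def __init__(self):
        self._items = []

    def add(self, x):
        bisect.insort(self._items, x)   # binary search + list insert

    def smallest(self):
        return self._items[0]
```

The point isn't that this is fast in the asymptotic sense; it's that the simpler structure costs almost nothing to understand and maintain, which is exactly the cost the comment above is weighing.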
Word to the wise: if you are tempted to code a red-black tree, it probably means you are using the wrong language.
In a more capable language, that code has already been written and put in a library, and been very thoroughly tested and optimized already, and is faster than you could afford to do just now. Less capable languages can't express such a library in usable form, or call into one.
It is not an accident that Linux and BSD kernels have numerous hand-specialized red-black trees in them, now languishing for the attention that is needed to modernize them to perform reasonably on current hardware.
> I think Knuth (in his famous "... root of all evil" quote) was referring specifically to programming, and to spending time on optimizations prior to functional completion.
What he said is, quote, "... we should forget about the small efficiencies, say, about 97% of the time" and "we should not pass up our opportunities in that critical 3%."
What he was talking about in the article[1] is the tendency of programmers to concern themselves with the efficiency of things like the modulo operator, when much larger efficiencies are far more important, such as the design of the algorithm or data structures.
Premature optimization, as when deciding that this small efficiency is important enough to optimize, causes you to forget about the larger efficiencies that cause your program to be slow.
Optimising code without taking into account the context it runs in is what makes it premature. For instance, the complexity of the calling code, or a situation where the optimisation will never matter in practice (efficiently sorting UI elements, which must be few in number to reasonably be called UI).
And there is early optimisation that is not premature: when you know upfront the hill is steep because of the sheer amount of calculation involved in the very nature of the project (gamedev, big data, etc). This can lead to low-level optimisations (e.g. Carmack's fast inverse square root) or architectural measures (distributing code across computers).
Unless we are talking about speed I think that "premature optimization" refers to readability issues with code.
I work in web development and we never had to run a profiler even once to find performance issues. "Premature optimization" was, in my experience, always a discussion about readability. In the JavaScript world there is always that one guy who can do things in 3 clever lines instead of 5 readable methods, and then nobody can understand the code 5 months later.
"My formative memory of Python was when the Quake Live team used it for the back end work, and we wound up having serious performance problems with a few million users. My bias is that a lot (not all!) of complex “scalable” systems can be done with a simple, single C++ server."
And so a lot of people seem to feel the need to once again bring back all the discussion in absolutes of optimisation being useless, or the only thing that matters.
I've found the same as John Carmack, btw. Taking some component of a customer system and translating it to C++ can make a system that needed 10+ servers suddenly run a lot faster on a single server (meaning higher throughput AND lower latency), because in addition to the raw speed advantage of C++ there's so much you can do in C++ that's not really feasible in, say, Python. For example, mmapping files.
But C++ is not exactly the first thing I reach for when something new springs to mind, or I want some data analysis done, or ... (even though C++ is excellent for running production data analysis jobs)
Python is probably the wrong tool for this though, that doesn’t make it bad. Writing this in erlang, elixir or scala or even Golang would be a better choice. That’s not premature optimisation, that’s understanding what problems are requirements up front. Choosing most of these when you have zero users is probably wrong; python might be better as with its libraries it could get you to market faster!
It came full circle, lol. I reposted this article because of that tweet.
> But C++ is not exactly the first thing I reach for when something new springs to mind, or I want some data analysis done, or ... (even though C++ is excellent for running production data analysis jobs)
Me too. C++ is a pain in the ass (although I only touched it back in my college days, building a simple A* AI). I always go looking for libs in TS (my go-to language), or at least JS, before considering other languages.
> I've found the same as John Carmack btw. Taking some component of a customer system, translating it to C++, can make a system that needed 10+ servers suddenly run a lot faster on a single server (meaning higher throughput AND lower latency, because in addition to the raw speed advantage of C++ there's so much you can do in C++ that's not really feasible in, say Python. For example, mmapping files)
I think it's just the nature of a truly compiled language that comes with a very minimal runtime. Multi-step compilation in Java and interpretation in JS or Python always involve sacrifices. I have seen great devops engineers pulling their hair out over a Kubernetes configuration issue that happened just because a Java Spring Boot service needs an enormous amount of memory just to initialize some runtime objects.
> And so a lot of people seem to feel the need to once again bring back all the discussion in absolutes of optimisation being useless, or the only thing that matters.
And then I found Rust, which is a pretty nice language. It bridges the memory safety vs. performance issue. Although the learning curve is pretty heavy on the syntax and the new paradigm it brings, it was reasonably easy for me, being used to coding in TypeScript. I'm just surprised at how little Rust is mentioned in that tweet.
Premature optimization can hold back the release of software, and make it more complicated to debug.

Say you already have a function that returns the list of children in a node, but you need a function that brings back only the first child, if it exists.

1. Do you copy the function and return just the first item when you have it (with the associated code around the function), for efficiency's sake?
2. Or do you call the existing function and return the first item if there is at least one element?

1 will be the best answer if you have a million records to load from disk. 2 will be the best answer if you only have 20-30 records on average.

But 2 is the best answer before debugging: check that the code works properly before duplicating it and modifying it.

Answer 3 would be to add a limit parameter and call the other two functions with it. That way, if limit=0, return all records; if limit=1, return one record maximum.

Sometimes the answer is to think differently about the problem altogether.
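A minimal sketch of options 2 and 3 above; `get_children` and the dict-based node layout are hypothetical stand-ins for whatever API the project already has:

```python
def get_children(node, limit=0):
    """Option 3: one general function with a limit parameter.
    limit=0 means return all children, limit=1 means at most one."""
    children = node.get("children", [])
    return children if limit == 0 else children[:limit]

def get_first_child(node):
    """Option 2: reuse the general function instead of duplicating it.
    Returns the first child, or None if there are no children."""
    children = get_children(node, limit=1)
    return children[0] if children else None
```

Only once this is debugged and measured to be too slow (say, because "all children" means a million records off disk) is it worth pushing the limit down into the loading layer.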
To be pedantic, the answer is neither. You use an abstraction that supports either answer with similar ease, provided by the language/framework/tools you used.
You bring up a very real point, but the most obvious cases are also most obvious to the people developing the stack below you and have almost certainly gone through the trouble of solving those problems.
This issue rears its head when edge cases are non-obvious and typically manifest through profiling, not through design meetings.
The problem with this way of thinking is that those budgets can lead to pathological behaviors. Start time is a great example. Let's say you have 1 second to start, but everything you need to do takes 1.2 seconds.
Okay... We'll defer some work and show a dummy screen to be "started" in less than 1 second, except now we're loading/drawing the dummy screen, so we're usable in 1.4 seconds, and after a double-start flash, so let's smooth that out with a transition to the interactive UI and we're usable in 2.2 seconds, but we "started" the app in 0.4.
Yay?
It may sound contrived, but I've seen this very response to strict start KPI's before. People end up optimizing the micro-goal (KPI) rather than the macro-goal (better experience).
The flip side problem is, when you have multiple people responsible for parts of the work, people will stop optimizing when they hit their budgeted allotment, rather than working to better the whole. A Dev with a 400ms budget might sleep for 390ms to buy himself three years of squeezing out optimizations...
More likely, he'll stop when he gets to 350ms, even though he could have gotten to 250ms without too much effort.
So much this. I'm an advocate of performance metrics being _measured_ and tracked over time, but very careful about setting performance _targets_. It's the latter that tends to cause the dysfunctional behaviour - it's Goodhart's law [0] in action.
When it is required then focusing on user value can help: e.g. "<x> seconds to user login being available" rather than "...page loaded".
There's still the question of absolute numbers rather than statistical distributions, but that's another topic.
Reminds me of the apocryphal story of the game that was trying to get under its memory budget to be able to run on one of the early consoles. After scrimping and saving and crunching, they were still over, until one of the senior engineers commented out an unused buffer that allocated a couple KB "just in case".
If the startup screen allows you to, say, launch an "Open File/Project/Folder" dialog or load one of the recently-worked-on items, but won't let you do anything meaningful - then yes, Yay.
It's quite realistic to load that kind of UI in a lot less time than the "meat" of an app is usable. I would take 0.4 seconds for that and 2.2 for the whole app over 1.2 for the whole app - any day of the week.
My latest project has a performance budget of (on average) 150 milliseconds per click, and it's a web-app. It's going to be a tough goal to meet, considering typical web apps today take 5000ms to load!
That's what's often done in mission-critical software with critical latency elements. The problem is, you have to make sure those values are directly traced to specific customer requirements, the whole-system spec, or a normative reference. It's too easy for systems engineers to come up with hard-to-reach numbers with no grounding in the real customer need...
You also have to specify and measure exactly the correct element. Sometimes the latency budget is split in not clearly-cut parts, and on events or categories of events that are not clearly defined. Saying 'all requests should give a complete result in < 20ms' just isn't realistic and creates a world of hurt. What kind of request? Always, always? In what load conditions? What about error conditions? Fail-over conditions? Hot/cold start?... It's not simple like specifying a clear-cut feature.
Performance budget should often not be all black and white. Especially when trying to push the envelope of what modern PC hardware (with OoO execution, complex memory hierarchies, multicore CPU/GPU rapid-busy-bused hybrid archs) can do. It should probably be a risk law, and at least a critical spec, and thoroughly tested and regression-tested as such.
And here, sometimes premature optimization will be necessary. If someone gives you hard latency requirements, you'll have to say, very quickly, "I'll need a soft/hard real-time OS; it won't do extreme high throughput." You'll have to bench...
Premature /micro/-optimization is a problem, sure. Make it work, make it right, make it fast.
But global optimization must start at the system design level. How much money do you have? What can you do for $X?
In the case you can’t avoid running O(n) code on a code path that affects user experience, pick a large n that would be a reasonable value for a user to have.
How much optimization you need depends on the situation. The first question is how many concurrent users you are going to write it for. In most cases O(n^2) is too much, unless you know that in practice n is always going to be less than 10. Even then you should be quite certain that n is never going to grow, which you very often aren't. Removing constant factors, i.e., turning 3n into n, should in most cases only be done if it is not difficult and not bad for code quality. But at some point you do need to start removing those constant factors too; maybe when you need to handle more than 10000 requests a second. These numbers are highly variable depending on the application, the language and the use.
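As a concrete illustration of the O(n^2)-vs-O(n) point (a toy deduplication example of my own, not from the article): both versions below are correct and order-preserving, and for n under 10 the quadratic one is perfectly fine and arguably simpler.

```python
def dedup_quadratic(items):
    """O(n^2): each membership test scans the output list."""
    out = []
    for x in items:
        if x not in out:        # linear scan per item
            out.append(x)
    return out

def dedup_linear(items):
    """O(n): hash-set membership test, at the cost of a second structure."""
    seen, out = set(), []
    for x in items:
        if x not in seen:       # O(1) average lookup
            seen.add(x)
            out.append(x)
    return out
```

At n = 10 the difference is unmeasurable; at n = 100,000 the quadratic version is catastrophically slower. That crossover, and whether your n can ever reach it, is exactly the situational judgment being described.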
Upvoted you for the first half. Hell, if an algo is cubic or exponential out of pure simplicity, then as little as 3-10x the input will bring it to its knees. If fixing it later is inevitable anyway, deferring just adds all the surrounding costs of revisiting it.
Optimization is premature only if the code is not already fantastically stupid, which may happen too often IRL to ignore.
Couldn't agree on constant factors though. These are pretty random and n-independent (i.e. they scale), and the net expense of making code less obvious is usually bigger than throwing more/better hardware at it, but YMMV.
This falls apart when you consider one common scenario:
We are loading a number of data files and the source of truth is remote. This is O(n) the amount of data.
Two ways to do it: always request the remote data, or cache it, and only ask the remote for changes.
These differ by a constant factor, and it's always a good idea to maintain a local cache when it's possible. Otherwise there's a very good chance that startup time will be dominated by network requests.
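A minimal sketch of the cache-plus-delta idea, assuming the remote exposes a cheap version check (ETag-style); `fetch_version` and `fetch_data` are hypothetical callables standing in for whatever your remote API actually provides:

```python
class CachedLoader:
    """Keep a local copy of remote data; only re-fetch when the remote
    version changes. The cheap round-trip (version check) replaces the
    expensive one (full data transfer) on the common unchanged path."""

    def __init__(self, fetch_version, fetch_data):
        self.fetch_version = fetch_version  # cheap: returns a version token
        self.fetch_data = fetch_data        # expensive: returns the data
        self._version = None
        self._data = None

    def load(self):
        remote = self.fetch_version()       # always one small request
        if remote != self._version:         # pay O(data) only on change
            self._data = self.fetch_data()
            self._version = remote
        return self._data
```

With HTTP, the same pattern falls out of conditional requests (`If-None-Match` / 304 Not Modified), so often you don't even have to build it yourself.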
But this also falls into the "fantastically stupid" category. Just like all the web 2.0 e-stores I have to use, which re-request their dataset every time you touch a sort or filter control, when their largest category is a 150 KB JSON and the entire site's JSON is 10x smaller than their UI/ad frameworks.
This article is more current than when it was written.
We are deep into the Post-Moore's Law era, where a new generation now gives 120% of the previous generation's performance rather than 200%. Next generation we might get 115% of this, or 110%.
Another consequence is that the improvements we get have become dodgier and less reliable, so that tiny, irrelevant-looking changes in the code may mean a 2x speedup, but more commonly a 2x slowdown. (This is in no way an exaggeration.) The bargain we get from pervasive penetration of more kinds of caches is that sometimes our programs are faster, but we no longer know how fast they should be, or whether another factor of two or ten has been left on the table.
Sorting algorithms are quite mature now, so that a 20% improvement in a relevant case is important, yet a factor of 2 or more may come from the compiler choosing one instruction over another.
Compiler regression bug reports now routinely complain of a 2x performance loss from that instruction choice, but fixing them would result in 2x losses in some other set of programs, instead. We get new unportable compiler intrinsics to patch the failure, that often don't, for obscure reasons.
Moore's law was always about transistor density increases, not performance, and definitely not about serial performance in a single core, no matter how much people want to reframe it. Transistor density is still improving, just not as quickly and CPU speed is still increasing, just not on a single core.
Not so. Earlier generations came with higher, often doubled clock speeds. Many programmers today have never seen such a doubling, yet policies still assume them.
Clock speeds have nothing to do with Moore's law, which was about transistor density. I'm not sure how you can say that a doubling in density hasn't happened when there are "7nm" 64 core CPUs out there. Transistor density has slowed but not stopped. Moore also predicted an exponential increase in price for density improvements which has also seemed to happen.
I get that you are being intentionally condescending and obtuse here, but you were talking about "the end of Moore's law", which only talks about transistor density increases in addition to cost, and both of those are still increasing. Clock speeds, instructions per clock, cache, latency, prefetching, out-of-order execution and many other aspects of CPU performance were not the trend Gordon Moore outlined. You are conflating things that stem from transistor density with Moore's law.
While Moore's original, concise expression was in terms of transistor areal density, the faster clock rate was implied and expected, just as lower latency is implied by the faster clock. Thus, the stagnation of clock rates is rightly recognized as the beginning of its end. Allocation of the newly available transistors to caches, functional units, execution units, and ultimately extra cores was also implied: the transistors are not decorative: they are there to be used.
With feature sizes approaching a single lattice unit cell, its final stage will be reached shortly.
To insist that transistor count is the only point of Moore's law is to be deliberately obtuse: it was exactly the extra value provided by the extra transistors and the machinery built from them, and the faster clocks, that would (and did) generate the capital investment needed to develop each succeeding, more expensive, generation.
This is all rationalizing whatever you were trying to say about performance stalling. People only talk about 'the intent' when they realize they have been regurgitating news headlines and haven't looked any deeper. The point is that performance hasn't stalled at all just because clock rates aren't going up. Faster clocks are diminishing returns in performance due to memory latency. Transistor density was the point and performance is still increasing due to transistor density, even if you don't know how to use multiple cores.
It is an objective fact that performance increases are much, much smaller than 20 years ago. It is easy to see why, and why continuing (for now) shrinkage of transistors is failing to deliver as much as before.
You are welcome to die on your "Moore's Law is about feature size and nothing else" hill, but you will have sparse company there.
Don't you think maybe you should take a step back when you repeat the same things over and over, never back them up and have multiple people link you different Wikipedia articles to correct you?
John L. Hennessy; David A. Patterson (June 4, 2018): "The ending of Dennard Scaling and Moore’s Law also slowed this path; single core performance improved only 3% last year!"
<https://iscaconf.org/isca2018/turing_lecture.html>
You didn't confront anything I linked. Again, Moore's law is about transistor density, which hasn't stopped yet and has gone into more cores. It was never about single-core performance. Now it seems like you are including Dennard scaling after someone else mentioned it.
You have a quote about the time frame increasing which has never been disputed either.
I'm sure you can find links to some tech blogs that have the same misunderstandings as you do, why don't you go hunt those down too?
If optimization is going to be a concern, the thoughtful choice of the algorithm(s) up front is not "premature".
Currently, I'm working on a project with a ring buffer (a kind of FIFO buffer). It isn't fast (it's not lockless), but it has a well-defined interface so I can swap it for a fast one later if I need to.
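A minimal sketch of what I mean (Python, names illustrative, not the actual project code): the lock-based version is trivial to write, and nothing outside the class needs to know it isn't lockless:

```python
from collections import deque
from threading import Lock

class RingBuffer:
    """Fixed-capacity FIFO. Lock-based for now; the interface lets a
    faster (e.g. lock-free) implementation drop in later without
    touching any callers."""

    def __init__(self, capacity):
        self._buf = deque(maxlen=capacity)
        self._lock = Lock()

    def push(self, item):
        # Overwrites the oldest item when full (deque maxlen semantics).
        with self._lock:
            self._buf.append(item)

    def pop(self):
        # Returns the oldest item, or None when empty.
        with self._lock:
            return self._buf.popleft() if self._buf else None
```

The point is the boundary, not the body: as long as `push`/`pop` keep their contract, the slow implementation is a placeholder, not a commitment.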
We shouldn't reinvent a phrase. Premature optimization has always been about programming efficiently, i.e., a newbie programmer or even an expert "elite" programmer could easily fall into the trap of optimizing prematurely. That means spending too much time for very little performance gain, thinking it's important to eke out every little drop of performance. It's a programming lesson only. Experience even shows that premature, unproven tweaks can both reduce performance and make code harder to read and maintain.
Then there's the business lesson: since for the last few decades we've had Moore's Law and then some, it's been very hard to make programming for performance pay off. So businesses have learned to focus on reducing programming time. Coincidentally, it's been shown that quick lead times are beneficial in the competitive marketplace as well. This is a business lesson (not about the programming phrase "premature optimization"!).
We're seeing a reintroduction of optimization and performance, because we're meeting a ceiling in CPU cycles and need to utilize more cores efficiently. So we have Golang, Rust, WebAssembly, etc. Having a snappy website and not annoying your users will pay off as well (hello there, multiple-choice agreement notices!).
But the programming lesson still stands: For many types of workloads, it doesn't make sense to spend too much time optimizing, performance wise, unless you see proven benefit outweighing the costs. In most cases, it's actually better to optimize afterwards (ie. prototyping and iterating), unless one has a specific algorithm or solution in mind beforehand.
The business lesson is similar, with the added twist of being a criterion for whether you make it or break it. For programming, you could just do it for fun or as a hobby, and can prematurely optimize to your heart's content if you like. But it'll be harder now to fall into that trap instead of making something more worthwhile! ;-)
I've never heard anyone actually say that optimization is evil... probably because people seem to actually enjoy optimizing code.
What's "evil" is optimizing without benchmarks/profiles/etc. Naive implementations are usually easier to write, and can give you a baseline for how much there is to gain (as well as providing "correct examples" for more complicated algorithms).
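As a toy illustration (hypothetical functions, nothing from the article): the naive version doubles as a correctness oracle for the clever one and as the baseline that tells you how much there was to gain:

```python
import random
import timeit

def naive_two_sum(nums, target):
    # O(n^2) brute force: trivial to write and obviously correct.
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                return (i, j)
    return None

def fast_two_sum(nums, target):
    # O(n) hash-map version: the "clever" candidate to be validated.
    seen = {}
    for j, x in enumerate(nums):
        if target - x in seen:
            return (seen[target - x], j)
        seen[x] = j
    return None

# The naive version acts as the oracle: both must agree on solvability.
random.seed(0)
nums = [random.randrange(100) for _ in range(200)]
for target in range(50):
    a, b = naive_two_sum(nums, target), fast_two_sum(nums, target)
    assert (a is None) == (b is None)

# ...and as the timing baseline (target -1 forces the worst case).
base = timeit.timeit(lambda: naive_two_sum(nums, -1), number=20)
fast = timeit.timeit(lambda: fast_two_sum(nums, -1), number=20)
```

Without the baseline you have neither a reference for correctness nor a number to put in front of anyone when arguing the optimization was worth it.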
My team at work is extremely guilty of this, and half the time they aren't even correct.
I have coined the term "voodoo optimization" in response to a point Martin Fowler made in his book "Refactoring 2nd Edition": When running Javascript, for example, you are working with a compiler and engine that has had thousands of man-hours and millions of dollars invested into it.
The compiler is throwing away variable declarations. It's merging for-loops. What you put in is not what comes out. You need concrete numbers and FACTS to optimize your code. Deciding "this looks slow" is not an effective optimization method.
This article is terrible, just full of straw man arguments and holier-than-thou attitude. Meanwhile, the article completely fails to present any data, completely fails to identify why organizations fail to optimize, and completely fails to provide any solutions. The article basically just pretends there's some large anti-optimization movement so he can knock it down. And even while proving the obvious (that optimization is important), the article uses appeals to authority rather than actual data. Ugh.
> Today, it is not at all uncommon for software engineers to extend this maxim to "you should never optimize your code!" Funny, you don't hear too many computer application users making such statements.
Funny, you don't hear too many computer application users asking for optimization, either. The rare, high-profile cases where people complain that something is slow get a lot of attention, but the majority of the time, the feedback you get from users is a resounding silence. I wish users would give me unsolicited actionable feedback on my software, but the reality is, it takes effort to get feedback, and usually people don't complain about performance, they just have a general feeling of malaise about the software that doesn't rise to the level of an explicit complaint. In 11 years of software development, I've had only a handful of users ever complain about performance. The times I've optimized are almost always a result of performance logging. Performance logging lets me say to stakeholders, "This DB query is taking a full second, and 30% of users are exiting the application on that screen--can I spend time to optimize this?"
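The logging itself can be as simple as a decorator with a threshold (a sketch; the names and the 500 ms cutoff are made up, not from my actual system):

```python
import functools
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("perf")

def timed(threshold_ms=500):
    """Log any call slower than threshold_ms. The resulting log, not a
    hunch, is what justifies optimization work to stakeholders."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000
                if elapsed_ms >= threshold_ms:
                    log.info("%s took %.0f ms", fn.__name__, elapsed_ms)
        return wrapper
    return decorator

@timed(threshold_ms=100)
def load_dashboard():
    time.sleep(0.15)  # stand-in for the slow DB query
    return "ok"
```

In production you'd ship these records somewhere aggregatable, so "this query takes a full second for 30% of users" is a query over logs, not an anecdote.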
Funny, you don't actually hear too many software engineers actually saying "you should never optimize your code", either.
I stopped reading at the "observations" section, which were just too full of cringe.
I skimmed the rest, though, and noticed that the author suggests a few books on assembly language. Good call, guy, I'll be sure to tie my application to a specific processor architecture so I can make it extra difficult to understand why memory is getting thrashed or why thread switches are happening at such inopportune times.
I think a lot of programmers still don't realize that this idiom comes from doing micro optimizations early.
Software does need to be architected for performance up front. If your design can't operate on data with good locality, or it introduces latency between fundamental operations, you won't get fast software until you address those issues.
If you are really starting from scratch you will barely understand the problem, but once you do, you can make sure your architecture will align with what you are doing. After that, optimization becomes much lower hanging fruit.
On a modern full-sized laptop etc, sure code efficiency doesn't matter much. But in anything smaller e.g. embedded device, radio card, cheap phone, it means the world. The difference between a product and a fail.
Embedded programmers can probably be identified by their insistence on code efficiency. Speed, space, storage, all critical to them.
"On a modern full-sized laptop etc, sure code efficiency doesn't matter much"
Well, it really mattered to me when I lost my patience and replaced a piece of Python code that was doing some data manipulation/processing with a native implementation. Suddenly, what used to take an hour got down to less than a minute.
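You can see the same effect in miniature just by moving a hot loop from the interpreter into a C-implemented builtin (a toy sketch; the actual ratio depends entirely on the workload):

```python
import timeit

data = list(range(1_000_000))

def py_sum(xs):
    # Hot loop executed one bytecode at a time by the interpreter.
    total = 0
    for x in xs:
        total += x
    return total

# Same work, but the loop runs inside the C-implemented builtin.
loop_t = timeit.timeit(lambda: py_sum(data), number=5)
native_t = timeit.timeit(lambda: sum(data), number=5)
print(f"interpreted: {loop_t:.3f}s, native: {native_t:.3f}s")
```

For real data processing the gap is usually larger still, because a native rewrite also changes memory layout, not just dispatch overhead.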
I have made quite a few enterprise-grade desktop apps. They'd normally involve device control, real-time data processing, graphics, multimedia, etc., all running in many threads communicating through my own internal publish-subscribe system and in-memory EAV database. For each one, performance mattered a great deal.
Yes I understand it may not matter much for many database front end apps but the world does not end on those.
Well you're a saint. The average desktop app repaints with stuttering, hangs for seconds, and has all modal controls. So few care about these things any more.
Is 'performance' the right word for this? It's more like user experience or latency - the controls/graphics should run on a different thread than the data processing.
"Is 'performance' the right word for this? It's more like user experience or latency..."
It is both. All my desktop apps are strictly native and written with care. Many use DirectX graphics for output. Also, I've never bought into Microsoft's move-to-.NET-on-desktop propaganda. Electron and the like I would not touch with a wooden pole.
Brilliant article. I am currently implementing a programming language and have had to think very hard about representation (in particular). I would highly recommend this for any serious programmer wanting to learn a bit more about performance.
Premature optimisation, to me, is spending days working out a new algorithm for a 5% saving on something that looks inefficient to you, when you have no actual evidence it's really a problem.
I've never met any of these engineers who think optimization is always wrong.
To you premature optimization means spending days eking out 5%, but that's not how everyone sees it. Lots of people I've seen would also call e.g. spending 1 hour vs. 30 minutes on a solution that'd be (say) >= 3x faster "premature optimization" due to the fundamental design decisions (not micro-optimization).
> Lots of people I've seen would also call e.g. spending 1 hour vs. 30 minutes on a solution that'd be (say) >= 3x faster "premature optimization" due to the fundamental design decisions (not micro-optimization).
And it still might be that way, if the piece you're optimizing isn't performance critical at all, and the effort might be better spent elsewhere. But it might not.
The whole thing is a grey area and a good software developer learns to spot when they're spending time on something that doesn't need this effort right now.