Things Every Programmer Should Know
At ui-programming, one of our JCG program participant sites, articles about “Things Every Programmer Should Know” are occasionally posted. As stated in the author’s first post, the 97 Things Every Programmer Should Know project collects pearls of wisdom for programmers from leading practitioners.
The collection is intended simply to contain multiple and varied perspectives on what it is that contributors to the project feel programmers should know. This can be anything from code-focused advice to culture, from algorithm usage to agile thinking, from implementation know-how to professionalism, from style to substance, etc.
In this article we provide a summary of the first 27 “Things Every Programmer Should Know”!
So, without further ado, let’s begin…
1. Act with Prudence by Seb Rose
“Whatever you undertake, act with prudence and consider the consequences” Anon
No matter how comfortable a schedule looks at the beginning of an iteration, you can’t avoid being under pressure some of the time. If you find yourself having to choose between “doing it right” and “doing it quick” it is often appealing to “do it quick” on the understanding that you’ll come back and fix it later. When you make this promise to yourself, your team, and your customer, you mean it. But all too often the next iteration brings new problems and you become focused on them. This sort of deferred work is known as technical debt and it is not your friend. Specifically, Martin Fowler calls this deliberate technical debt in his taxonomy of technical debt, which should not be confused with inadvertent technical debt.
Technical debt is like a loan: You benefit from it in the short-term, but you have to pay interest on it until it is fully paid off. Shortcuts in the code make it harder to add features or refactor your code. They are breeding grounds for defects and brittle test cases. The longer you leave it, the worse it gets. By the time you get around to undertaking the original fix there may be a whole stack of not-quite-right design choices layered on top of the original problem making the code much harder to refactor and correct. In fact, it is often only when things have got so bad that you must fix it, that you actually do go back to fix it. And by then it is often so hard to fix that you really can’t afford the time or the risk.
There are times when you must incur technical debt to meet a deadline or implement a thin slice of a feature. Try not to be in this position, but if the situation absolutely demands it, then go ahead. But (and this is a big BUT) you must track technical debt and pay it back quickly or things go rapidly downhill. As soon as you make the decision to compromise, write a task card or log it in your issue tracking system to ensure that it does not get forgotten.
If you schedule repayment of the debt in the next iteration, the cost will be minimal. Leaving the debt unpaid will accrue interest and that interest should be tracked to make the cost visible. This will emphasize the effect on business value of the project’s technical debt and enables appropriate prioritization of the repayment. The choice of how to calculate and track the interest will depend on the particular project, but track it you must.
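One way to make that interest visible is a small debt ledger kept alongside the task cards. The sketch below is only an illustration: the `DebtItem` fields, the linear interest model, and the hour figures are all assumptions, since the essay leaves the calculation up to each project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DebtItem:
    """One deliberate shortcut, logged when the compromise is made (hypothetical model)."""
    description: str
    fix_hours: float               # estimated cost of doing it right today (the principal)
    interest_per_iteration: float  # assumed extra hours lost per iteration while unpaid

    def cost_after(self, iterations: int) -> float:
        """Principal plus linearly accrued interest after the given number of iterations."""
        return self.fix_hours + self.interest_per_iteration * iterations


def total_cost(ledger: list[DebtItem], iterations: int) -> float:
    """Estimated cost of the whole ledger if repayment is delayed by `iterations`."""
    return sum(item.cost_after(iterations) for item in ledger)


# Made-up entries, purely for illustration.
ledger = [
    DebtItem("hard-coded currency in invoice export", fix_hours=4, interest_per_iteration=1.5),
    DebtItem("skipped input validation on import", fix_hours=6, interest_per_iteration=2),
]

print(total_cost(ledger, iterations=1))  # repaying in the next iteration
print(total_cost(ledger, iterations=6))  # repaying six iterations later
```

Even with this crude model, comparing the two figures gives the team a number to weigh against new feature work when prioritizing repayment.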
Pay off technical debt as soon as possible. It would be imprudent to do otherwise.
2. Apply Functional Programming Principles by Edward Garson
Functional programming has recently enjoyed renewed interest from the mainstream programming community. Part of the reason is that emergent properties of the functional paradigm are well positioned to address the challenges posed by our industry’s shift toward multi-core. However, while that is certainly an important application, it is not the reason this piece admonishes you to know thy functional programming.
Mastery of the functional programming paradigm can greatly improve the quality of the code you write in other contexts. If you deeply understand and apply the functional paradigm, your designs will exhibit a much higher degree of referential transparency.
Referential transparency is a very desirable property: It implies that functions consistently yield the same results given the same input, irrespective of where and when they are invoked. That is, function evaluation depends less — ideally, not at all — on the side effects of mutable state.
A leading cause of defects in imperative code is attributable to mutable variables. Everyone reading this will have investigated why some value is not as expected in a particular situation. Visibility semantics can help to mitigate these insidious defects, or at least to drastically narrow down their location, but their true culprit may in fact be the providence of designs that employ inordinate mutability.
And we certainly don’t get much help from industry in this regard. Introductions to object orientation tacitly promote such design, because they often show examples composed of graphs of relatively long-lived objects that happily call mutator methods on each other, which can be dangerous. However, with astute test-driven design, particularly when being sure to “Mock Roles, not Objects”, unnecessary mutability can be designed away.
The net result is a design that typically has better responsibility allocation with more numerous, smaller functions that act on arguments passed into them, rather than referencing mutable member variables. There will be fewer defects, and furthermore they will often be simpler to debug, because it is easier to locate where a rogue value is introduced in these designs than to otherwise deduce the particular context that results in an erroneous assignment. This adds up to a much higher degree of referential transparency, and positively nothing will get these ideas as deeply into your bones as learning a functional programming language, where this model of computation is the norm.
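To make the contrast concrete, here is a minimal sketch in Python. The `MutableCart` class and the `cart_total` function are hypothetical names invented for this illustration; they are not taken from the essay.

```python
class MutableCart:
    """The result of total() depends on hidden state that any caller may have changed."""

    def __init__(self) -> None:
        self.prices: list[float] = []
        self.discount = 0.0

    def set_discount(self, discount: float) -> None:
        self.discount = discount  # mutator: silently changes what total() returns

    def total(self) -> float:
        return sum(self.prices) * (1 - self.discount)


def cart_total(prices: tuple[float, ...], discount: float) -> float:
    """Acts only on its arguments: same inputs, same result, wherever it is called."""
    return sum(prices) * (1 - discount)


cart = MutableCart()
cart.prices.extend([10.0, 20.0])
before = cart.total()
cart.set_discount(0.5)   # some other collaborator mutates the cart
after = cart.total()     # same call, different answer

print(before, after)
print(cart_total((10.0, 20.0), 0.5))
```

When `cart.total()` returns a surprising value, you have to find out who called `set_discount` and when. With `cart_total`, the rogue value can only have arrived through the argument list.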
Of course, this approach is not optimal in all situations. For example, in object-oriented systems this style often yields better results with domain model development (i.e., where collaborations serve to break down the complexity of business rules) than with user-interface development.
Master the functional programming paradigm so you are able to judiciously apply the lessons learned to other domains. Your object systems (for one) will resonate with referential transparency goodness and be much closer to their functional counterparts than many would have you believe. In fact, some would even assert that the apex of functional programming and object orientation are merely a reflection of each other, a form of computational yin and yang.
3. Ask “What Would the User Do?” (You Are not the User) by Giles Colborne
We all tend to assume that other people think like us. But they don’t. Psychologists call this the false consensus bias. When people think or act differently to us, we’re quite likely to label them (subconsciously) as defective in some way.
This bias explains why programmers have such a hard time putting themselves in the users’ position. Users don’t think like programmers. For a start, they spend much less time using computers. They neither know nor care how a computer works. This means they can’t draw on any of the battery of problem-solving techniques so familiar to programmers. They don’t recognize the patterns and cues programmers use to work with, through, and around an interface.
The best way to find out how users think is to watch one. Ask a user to complete a task using a similar piece of software to what you’re developing. Make sure the task is a real one: “Add up a column of numbers” is OK; “Calculate your expenses for the last month” is better. Avoid tasks that are too specific, such as “Can you select these spreadsheet cells and enter a SUM formula below?” — there’s a big clue in that question. Get the user to talk through his or her progress. Don’t interrupt. Don’t try to help. Keep asking yourself “Why is he doing that?” and “Why is she not doing that?”
The first thing you’ll notice is that users do a core of things similarly. They try to complete tasks in the same order — and they make the same mistakes in the same places. You should design around that core behavior. This is different from design meetings, where people tend to be listened to for saying “What if the user wants to…?” This leads to elaborate features and confusion over what users want. Watching users eliminates this confusion.
You’ll see users getting stuck. When you get stuck, you look around. When users get stuck, they narrow their focus. It becomes harder for them to see solutions elsewhere on the screen. It’s one reason why help text is a poor solution to poor user interface design. If you must have instructions or help text, make sure to locate it right next to your problem areas. A user’s narrow focus of attention is why tool tips are more useful than help menus.
Users tend to muddle through. They’ll find a way that works and stick with it no matter how convoluted. It’s better to provide one really obvious way of doing things than two or three shortcuts.
You’ll also find that there’s a gap between what users say they want and what they actually do. That’s worrying as the normal way of gathering user requirements is to ask them. It’s why the best way to capture requirements is to watch users. Spending an hour watching users is more informative than spending a day guessing what they want.
4. Automate Your Coding Standard by Filip van Laenen
You’ve probably been there too. At the beginning of a project, everybody has lots of good intentions — call them “new project’s resolutions.” Quite often, many of these resolutions are written down in documents. The ones about code end up in the project’s coding standard. During the kick-off meeting, the lead developer goes through the document and, in the best case, everybody agrees that they will try to follow them. Once the project gets underway, though, these good intentions are abandoned, one at a time. When the project is finally delivered the code looks like a mess, and nobody seems to know how it came to be this way.
When did things go wrong? Probably already at the kick-off meeting. Some of the project members didn’t pay attention. Others didn’t understand the point. Worse, some disagreed and were already planning their coding standard rebellion. Finally, some got the point and agreed but, when the pressure in the project got too high, they had to let something go. Well-formatted code doesn’t earn you points with a customer that wants more functionality. Furthermore, following a coding standard can be quite a boring task if it isn’t automated. Just try to indent a messy class by hand to find out for yourself.
But if it’s such a problem, why is it that we want to have a coding standard in the first place? One reason to format the code in a uniform way is so that nobody can “own” a piece of code just by formatting it in his or her private way. We may want to prevent developers using certain anti-patterns, in order to avoid some common bugs. In all, a coding standard should make it easier to work in the project, and maintain development speed from the beginning to the end. It follows then that everybody should agree on the coding standard too: it does not help if one developer uses three spaces to indent code, and another one four.
There exists a wealth of tools that can be used to produce code quality reports and to document and maintain the coding standard, but that isn’t the whole solution. It should be automated and enforced where possible. Here are a few examples:
- Make sure code formatting is part of the build process, so that everybody runs it automatically every time they compile the code.
- Use static code analysis tools to scan the code for unwanted anti-patterns. If any are found, break the build.
- Learn to configure those tools so that you can scan for your own, project-specific anti-patterns.
- Do not only measure test coverage, but automatically check the results too. Again, break the build if test coverage is too low.
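As a rough illustration of the last point, a build step could turn a coverage figure into a pass or fail signal. The function name `coverage_gate`, the 80% threshold, and the line counts below are assumptions made for this sketch. A real build would read the numbers from its coverage tool's report.

```python
def coverage_gate(covered_lines: int, total_lines: int, minimum_percent: float) -> int:
    """Return a process exit code: 0 if coverage meets the minimum, 1 otherwise."""
    if total_lines <= 0:
        return 1  # nothing was measured: treat that as a failure, not a pass
    percent = 100.0 * covered_lines / total_lines
    print(f"coverage: {percent:.1f}% (minimum {minimum_percent:.1f}%)")
    return 0 if percent >= minimum_percent else 1


# Hypothetical figures. A build script would hand the result to sys.exit(),
# so that a non-zero code breaks the build.
exit_code = coverage_gate(covered_lines=412, total_lines=500, minimum_percent=80.0)
print("build passes" if exit_code == 0 else "build broken")
```

Returning an exit code rather than only printing a report is what moves the check from "documented" to "enforced".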
Try to do this for everything that you consider important. You won’t be able to automate everything you really care about. As for the things that you can’t automatically flag or fix, consider them to be a set of guidelines supplementary to the coding standard that is automated, but accept that you and your colleagues may not follow them as diligently.
Finally, the coding standard should be dynamic rather than static. As the project evolves, the needs of the project change, and what may have seemed smart in the beginning, isn’t necessarily smart a few months later.
5. Beauty Is in Simplicity by Jørn Ølmheim
There is one quote that I think is particularly good for all software developers to know and keep close to their hearts:
Beauty of style and harmony and grace and good rhythm depends on simplicity. — Plato
In one sentence I think this sums up the values that we as software developers should aspire to.
There are a number of things we strive for in our code:
- Readability
- Maintainability
- Speed of development
- The elusive quality of beauty
Plato is telling us that the enabling factor for all of these qualities is simplicity.
What is beautiful code? This is potentially a very subjective question. Perception of beauty depends heavily on individual background, just as much of our perception of anything depends on our background. People educated in the arts have a different perception of (or at least approach to) beauty than people educated in the sciences. Arts majors tend to approach beauty in software by comparing software to works of art, while science majors tend to talk about symmetry and the golden ratio, trying to reduce things to formulae. In my experience, simplicity is the foundation of most of the arguments from both sides.
Think about source code that you have studied. If you haven’t spent time studying other people’s code, stop reading this right now and find some open source code to study. Seriously! I mean it! Go search the web for some code in your language of choice, written by some well-known, acknowledged expert.
You’re back? Good. Where were we? Ah yes… I have found that code that resonates with me and that I consider beautiful has a number of properties in common. Chief among these is simplicity. I find that no matter how complex the total application or system is, the individual parts have to be kept simple. Simple objects with a single responsibility containing similarly simple, focused methods with descriptive names. Some people think the idea of having short methods of five to ten lines of code is extreme, and some languages make it very hard to do this, but I think that such brevity is a desirable goal nonetheless.
The bottom line is that beautiful code is simple code. Each individual part is kept simple with simple responsibilities and simple relationships with the other parts of the system. This is the way we can keep our systems maintainable over time, with clean, simple, testable code, keeping the speed of development high throughout the lifetime of the system.
Beauty is born of and found in simplicity.
6. Before You Refactor by Rajith Attapattu
At some point every programmer will need to refactor existing code. But before you do so please think about the following, as this could save you and others a great deal of time (and pain):
- The best approach for restructuring starts by taking stock of the existing codebase and the tests written against that code. This will help you understand the strengths and weaknesses of the code as it currently stands, so you can ensure that you retain the strong points while avoiding the mistakes. We all think we can do better than the existing system… until we end up with something no better — or even worse — than the previous incarnation because we failed to learn from the existing system’s mistakes.
- Avoid the temptation to rewrite everything. It is best to reuse as much code as possible. No matter how ugly the code is, it has already been tested, reviewed, etc. Throwing away the old code — especially if it was in production — means that you are throwing away months (or years) of tested, battle-hardened code that may have had certain workarounds and bug fixes you aren’t aware of. If you don’t take this into account, the new code you write may end up showing the same mysterious bugs that were fixed in the old code. This will waste a lot of time, effort, and knowledge gained over the years.
- Many incremental changes are better than one massive change. Incremental changes allow you to gauge the impact on the system more easily through feedback, such as from tests. It is no fun to see a hundred test failures after you make a change. This can lead to frustration and pressure that can in turn result in bad decisions. A couple of test failures are easy to deal with and provide a more manageable approach.
- After each iteration, it is important to ensure that the existing tests pass. Add new tests if the existing tests are not sufficient to cover the changes you made. Do not throw away the tests from the old code without due consideration. On the surface some of these tests may not appear to be applicable to your new design, but it would be well worth the effort to dig deep down into the reasons why this particular test was added.
- Personal preferences and ego shouldn’t get in the way. If something isn’t broken, why fix it? That the style or the structure of the code does not meet your personal preference is not a valid reason for restructuring. Thinking you could do a better job than the previous programmer is not a valid reason either.
- New technology is insufficient reason to refactor. One of the worst reasons to refactor is because the current code is way behind all the cool technology we have today, and we believe that a new language or framework can do things a lot more elegantly. Unless a cost–benefit analysis shows that a new language or framework will result in significant improvements in functionality, maintainability, or productivity, it is best to leave it as it is.
- Remember that humans make mistakes. Restructuring will not always guarantee that the new code will be better — or even as good as — the previous attempt. I have seen and been a part of several failed restructuring attempts. It wasn’t pretty, but it was human.
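The advice about small steps backed by tests can be sketched as a characterization check: pin down what the old code does, then confirm that each small restructuring still matches it. Both functions and the test cases below are hypothetical stand-ins, not code from the essay.

```python
def legacy_shipping_cost(weight_kg: float, express: bool) -> float:
    """Stand-in for ugly but battle-tested code (hypothetical)."""
    cost = 5.0
    if weight_kg > 10:
        cost = cost + (weight_kg - 10) * 0.5
    if express == True:
        cost = cost * 2
    return cost


def shipping_cost(weight_kg: float, express: bool) -> float:
    """One small, incremental restructuring of the function above."""
    surcharge = max(weight_kg - 10, 0) * 0.5
    multiplier = 2 if express else 1
    return (5.0 + surcharge) * multiplier


# Characterization cases: inputs chosen to hit each branch and the boundary at 10 kg.
CASES = [(0, False), (10, False), (10, True), (12.5, False), (40, True)]


def behaviour_preserved() -> bool:
    """True if the restructured function matches the legacy one on every case."""
    return all(
        shipping_cost(weight, express) == legacy_shipping_cost(weight, express)
        for weight, express in CASES
    )


print(behaviour_preserved())
```

If a step like this fails, there are only a few lines to inspect, which is the point of keeping each change small.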
7. Beware the Share by Udi Dahan
It was my first project at the company. I’d just finished my degree and was anxious to prove myself, staying late every day going through the existing code. As I worked through my first feature I took extra care to put in place everything I had learned — commenting, logging, pulling out shared code into libraries where possible, the works. The code review that I had felt so ready for came as a rude awakening — reuse was frowned upon!
How could this be? All through college reuse was held up as the epitome of quality software engineering. All the articles I had read, the textbooks, the seasoned software professionals who taught me. Was it all wrong?
It turns out that I was missing something critical. Context.
The fact that two wildly different parts of the system performed some logic in the same way meant less than I thought. Up until I had pulled out those libraries of shared code, these parts were not dependent on each other. Each could evolve independently. Each could change its logic to suit the needs of the system’s changing business environment. Those four lines of similar code were accidental — a temporal anomaly, a coincidence. That is, until I came along.
The libraries of shared code I created tied the shoelaces of each foot to each other. Steps by one business domain could not be made without first synchronizing with the other. Maintenance costs in those independent functions used to be negligible, but the common library required an order of magnitude more testing.
While I’d decreased the absolute number of lines of code in the system, I had increased the number of dependencies. The context of these dependencies is critical — had they been localized, it may have been justified and had some positive value. When these dependencies aren’t held in check, their tendrils entangle the larger concerns of the system even though the code itself looks just fine.
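A small, invented example of such coincidental duplication follows. The billing and shipping rules and all function names are hypothetical, chosen only to show two domains that happen to share a few lines today.

```python
import math


# Billing domain: invoice totals are rounded up to the next multiple of 5.
def round_invoice_total(amount: float) -> int:
    return math.ceil(amount / 5) * 5


# Shipping domain: parcel weights happen to be rounded the same way today.
def round_parcel_weight(weight: float) -> int:
    return math.ceil(weight / 5) * 5


# Later, billing moves to multiples of 10. Because nothing was shared,
# shipping is untouched and does not need to be retested.
def round_invoice_total_v2(amount: float) -> int:
    return math.ceil(amount / 10) * 10


print(round_invoice_total(12.0), round_parcel_weight(12.0), round_invoice_total_v2(12.0))
```

Had both callers gone through one shared `round_up` helper, the billing change would have required either a new parameter or coordination with the shipping code, for the sake of saving one line.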
These mistakes are insidious in that, at their core, they sound like a good idea. When applied in the right context, these techniques are valuable. In the wrong context, they increase cost rather than value. When coming into an existing code base with no knowledge of the context where the various parts will be used, I’m much more careful these days about what is shared.
Beware the share. Check your context. Only then, proceed.
8. The Boy Scout Rule by Uncle Bob
The Boy Scouts have a rule: “Always leave the campground cleaner than you found it.” If you find a mess on the ground, you clean it up regardless of who might have made the mess. You intentionally improve the environment for the next group of campers. Actually the original form of that rule, written by Robert Stephenson Smyth Baden-Powell, the father of scouting, was “Try and leave this world a little better than you found it.”
What if we followed a similar rule in our code: “Always check a module in cleaner than when you checked it out.” No matter who the original author was, what if we always made some effort, no matter how small, to improve the module. What would be the result?
I think if we all followed that simple rule, we’d see the end of the relentless deterioration of our software systems. Instead, our systems would gradually get better and better as they evolved. We’d also see teams caring for the system as a whole, rather than just individuals caring for their own small little part.
I don’t think this rule is too much to ask. You don’t have to make every module perfect before you check it in. You simply have to make it a little bit better than when you checked it out. Of course, this means that any code you add to a module must be clean. It also means that you clean up at least one other thing before you check the module back in. You might simply improve the name of one variable, or split one long function into two smaller functions. You might break a circular dependency, or add an interface to decouple policy from detail.
Frankly, this just sounds like common decency to me — like washing your hands after you use the restroom, or putting your trash in the bin instead of dropping it on the floor. Indeed the act of leaving a mess in the code should be as socially unacceptable as littering. It should be something that just isn’t done.
But it’s more than that. Caring for our own code is one thing. Caring for the team’s code is quite another. Teams help each other, and clean up after each other. They follow the Boy Scout rule because it’s good for everyone, not just good for themselves.
9. Check Your Code First before Looking to Blame Others by Allan Kelly
Developers — all of us! — often have trouble believing our own code is broken. It is just so improbable that, for once, it must be the compiler that’s broken.
Yet in truth it is very (very) unusual that code is broken by a bug in the compiler, interpreter, OS, app server, database, memory manager, or any other piece of system software. Yes, these bugs exist, but they are far less common than we might like to believe.
I once had a genuine problem with a compiler bug optimizing away a loop variable, but I have imagined my compiler or OS had a bug many more times. I have wasted a lot of my time, support time, and management time in the process only to feel a little foolish each time it turned out to be my mistake after all.
Assuming the tools are widely used, mature, and employed in various technology stacks, there is little reason to doubt the quality. Of course, if the tool is an early release, or used by only a few people worldwide, or a piece of seldom downloaded, version 0.1, Open Source Software, there may be good reason to suspect the software. (Equally, an alpha version of commercial software might be suspect.)
Given how rare compiler bugs are, you are far better putting your time and energy into finding the error in your code than proving the compiler is wrong. All the usual debugging advice applies, so isolate the problem, stub out calls, surround it with tests; check calling conventions, shared libraries, and version numbers; explain it to someone else; look out for stack corruption and variable type mismatches; try the code on different machines and different build configurations, such as debug and release.
Question your own assumptions and the assumptions of others. Tools from different vendors might have different assumptions built into them — so too might different tools from the same vendor.
When someone else is reporting a problem you cannot duplicate, go and see what they are doing. They may be doing something you never thought of or doing something in a different order.
As a personal rule if I have a bug I can’t pin down, and I’m starting to think it’s the compiler, then it’s time to look for stack corruption. This is especially true if adding trace code makes the problem move around.
Multi-threaded problems are another source of bugs to turn hair gray and induce screaming at the machine. All the recommendations to favor simple code are multiplied when a system is multi-threaded. Debugging and unit tests cannot be relied on to find such bugs with any consistency, so simplicity of design is paramount.
So before you rush to blame the compiler, remember Sherlock Holmes’ advice, “Once you eliminate the impossible, whatever remains, no matter how improbable, must be the truth,” and prefer it to Dirk Gently’s, “Once you eliminate the improbable, whatever remains, no matter how impossible, must be the truth.”
10. Consider the Hardware by Jason P Sage
It’s a common opinion that slow software just needs faster hardware. This line of thinking is not necessarily wrong but, like misusing antibiotics, it can become a big problem over time. Most developers don’t have any idea what is really going on “under the hood.” There is often a direct conflict of interest between best programming practices and writing code that screams on the given hardware.
First, let’s look at your CPU’s prefetch cache as an example. Most prefetch caches work by constantly evaluating code that hasn’t even executed yet. They help performance by “guessing” where your code will branch to before it even has happened. When the cache “guesses” correctly, it’s amazingly fast. If it “guesses” wrong, on the other hand, all the preprocessing on this “wrong branch” is useless and a time-consuming cache invalidation occurs. Fortunately, it’s easy to start making the prefetch cache work harder for you. If you code your branch logic so that the most frequent result is the condition that is tested for, you will help your CPU’s prefetch cache be “correct” more often, leading to fewer CPU-expensive cache invalidations. This sometimes may read a little awkwardly, but systematically applying this technique over time will decrease your code’s execution time.
Now, let’s look at some of the conflicts between writing code for hardware and writing software using mainstream best practices.
Folks prefer to write many small functions rather than larger ones to ease maintainability, but all those function calls come at a price! If you use this paradigm, your software may spend more time preparing for and recovering from work than actually doing it! The much loathed goto or jmp command is the fastest method to get around, followed closely by machine language indirect addressing jump tables. Functions are great for humans but from the CPU’s point of view they’re expensive.
What about inline functions? Don’t inline functions trade program size for efficiency by copying function code inline versus jumping around? Yes they do! But even when you specify a function is to be inlined, can you be sure it was? Did you know some compilers turn regular functions into inline ones when they feel like it and vice versa? Understanding the machine code created by your compiler from your source code is extremely important if you wish to write code that will perform optimally for the platform at hand.
Many developers think abstracting code to the nth degree, and using inheritance, is just the pinnacle of great software design. Sometimes constructs that look great conceptually are terribly inefficient in practice. Take for example inherited virtual functions: They are pretty slick but, depending on the actual implementation, they can be very costly in CPU clock cycles.
What hardware are you developing for? What does your compiler do to your code as it turns it to machine code? Are you using a virtual machine? You’ll rarely find a single programming methodology that will work perfectly on all hardware platforms, real or virtual.
Computer systems are getting faster, smaller and cheaper all the time, but this does not warrant writing software without regard to performance and storage. Efforts to save CPU clock cycles and storage can pay dividends in performance and efficiency.
Here’s something else to ponder: New technologies are coming out all the time to make computers more green and ecosystem friendly. Efficient software may soon be measured in power consumption and may actually affect the environment!
Video game and embedded system developers know the hardware ramifications of their compiled code. Do you?
11. Continuous Refactoring by Michael Hunger
Code bases that are not cared for tend to rot. When a line of code is written it captures the information, knowledge, and skill you had at that moment. As you continue to learn and improve, acquiring new knowledge, many lines of code become less and less appropriate with the passage of time. Although your initial solution solved the problem, you discover better ways to do so.
It is clearly wrong to deny the code the chance to grow with knowledge and abilities.
While reading, maintaining, and writing code you begin to spot pathologies, often referred to as code smells. Do you notice any of the following?
- Duplication, near and far
- Inconsistent or uninformative names
- Long blocks of code
- Unintelligible boolean expressions
- Long sequences of conditionals
- Working in the intestines of other units (objects, modules)
- Objects exposing their internal state
When you have the opportunity, try deodorizing the smelly code. Don’t rush. Just take small steps. In Martin Fowler’s Refactoring the steps of the refactorings presented are outlined in great detail, so it’s easy to follow. I would suggest doing the steps at least once manually to get a feeling for the preconditions and side effects of each refactoring. Thinking about what you’re doing is absolutely necessary when refactoring. A small glitch can become a big deal as it may affect a larger part of the code base than anticipated.
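As one small step of this kind, an unintelligible boolean expression can be split into named predicates. The `Order` type and the discount rule below are hypothetical, invented for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    total: float
    items: int
    customer_years: int
    flagged: bool


def qualifies_before(o: Order) -> bool:
    # Smell: one dense expression, with the business rule buried in operators.
    return not o.flagged and ((o.total > 100 and o.items >= 3) or o.customer_years > 5)


def is_large_order(o: Order) -> bool:
    return o.total > 100 and o.items >= 3


def is_loyal_customer(o: Order) -> bool:
    return o.customer_years > 5


def qualifies_after(o: Order) -> bool:
    # Same rule, but each condition now has a name a reader can check.
    return not o.flagged and (is_large_order(o) or is_loyal_customer(o))


sample = Order(total=150.0, items=3, customer_years=1, flagged=False)
print(qualifies_before(sample), qualifies_after(sample))
```

The behavior is unchanged, and that is worth verifying with a handful of inputs before moving on to the next small step.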
Ask for help if your gut feeling does not guide you in the right direction. Pair with a co-worker for the refactoring session. Two pairs of eyes and sets of experience can have a significant effect — especially if one of these is unclouded by the initial implementation approach.
We often have tools we can call on to help us with automatic refactoring. Many IDEs offer an impressive range of refactorings for a variety of languages. They work on the syntactically sound parse tree of your source code, and can often refactor partially defective or unfinished source code. So there is little excuse for not refactoring.
If you have tests, make sure you keep them running while you are refactoring so that you can easily see if you broke something. If you do not have tests, this may be an opportunity to introduce them for just this reason, and more: The tests give your code an environment to be executed in and validate that the code actually does what is intended, i.e., passes the tests.
When refactoring you often encounter an epiphany at some point. This happens when suddenly all puzzle pieces fall into the place where they belong and the sum of your code is bigger than its parts. From that point it is quite easy to take a leap in the development of your system or its architecture.
Some people say that refactoring is waste in the Lean sense, as it doesn’t directly contribute to the business value for the customer. Improving the design of the code, however, is not meant for the machine. It is meant for the people who are going to read, understand, maintain, and extend the system. So every minute you invest in refactoring the code to make it more intelligible and comprehensible is time saved for the soul who has to deal with it in the future. And the time saved translates to saved costs. When refactoring you learn a lot. I use it quite often as a learning tool when working with unfamiliar codebases. Improving the design also helps you spot bugs and inconsistencies simply because you can now see them clearly. Deleting code, a common effect of refactoring, reduces the amount of code that has to be cared for in the future.
12. Continuously Align Software to Be Reusable by Vijay Narayanan
The oft-cited reason for not being able to build reusable software is the lack of time in the development process. Agility and refactoring are your friends for reuse. Take a pragmatic approach to the reuse effort and you will increase the odds of success considerably. The strategy that I have used with building reusable software is to pursue continuous alignment. What exactly is continuous alignment?
The idea of continuous alignment is very simple: Place value on making software assets reusable continuously. Pursue this across every iteration, every release, and every project. You may not make many assets reusable on day one, and that is perfectly okay. The key thing is to align software assets closer and closer to a reusable state using relentless refactoring and code reviews. Do this often and over a period of time you will transform your codebase.
You start by aligning requirements with reusable assets and do so across development iterations. Your iteration has tangible features that are being implemented. They become much more effective if they are aligned with your overall vision. This isn’t meant to make every feature reusable or every iteration produce reusable assets. You want to do just the opposite. Continuous alignment accepts that building reusable software is hard, takes time, and is iterative. You can try to fight that and attempt to produce perfectly reusable software first time. But this will not only add needless complexity, it will also needlessly increase schedule risk for projects. Instead, align assets towards reuse slowly, on demand, and in alignment with business needs.
A simple example will make this approach more concrete. Say you have a piece of code that accesses a legacy database to fetch customer email addresses and send email messages. The logic for accessing the legacy database is interspersed with the code that sends emails. Say there is a new business requirement to display customer email data on a web application. Your initial implementation can’t reuse existing code to access customer data from the legacy system. The refactoring effort required will be too high and there isn’t enough time to pursue that option. In a subsequent iteration you can refactor the email code to create two new components: One that fetches customer data and another that sends email messages. This refactored customer data component is now available for reuse with the web application. This change can be made in one, two, or many iterations. If you cannot get it done, you can add it to your list of known outstanding refactorings along with existing tasks. When the next project comes around and you get a requirement to access additional customer data from the web application, you can work on the outstanding refactoring.
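The end state of that refactoring might look roughly like the following sketch. All names here (`CustomerDirectory`, `EmailSender`, `NewsletterJob`, `CustomerEmailPage`) are hypothetical, and the legacy database is replaced by an in-memory stand-in so that the example is self-contained.

```java
import java.util.List;

// Component 1: fetches customer data (hypothetical name and signature).
interface CustomerDirectory {
    List<String> emailAddresses();
}

// Component 2: sends email messages (hypothetical name and signature).
interface EmailSender {
    void send(String address, String body);
}

// Stand-in for the legacy database access; a real version would query the legacy system.
final class InMemoryCustomerDirectory implements CustomerDirectory {
    @Override
    public List<String> emailAddresses() {
        return List.of("ann@example.com", "bob@example.com");
    }
}

// The original mailing code, now composed from the two components.
final class NewsletterJob {
    private final CustomerDirectory customers;
    private final EmailSender sender;

    NewsletterJob(CustomerDirectory customers, EmailSender sender) {
        this.customers = customers;
        this.sender = sender;
    }

    // Sends the body to every customer and returns the number of messages sent.
    int run(String body) {
        int sent = 0;
        for (String address : customers.emailAddresses()) {
            sender.send(address, body);
            sent++;
        }
        return sent;
    }
}

// The new web requirement reuses the customer data component without touching the email code.
final class CustomerEmailPage {
    private final CustomerDirectory customers;

    CustomerEmailPage(CustomerDirectory customers) {
        this.customers = customers;
    }

    String render() {
        return String.join("\n", customers.emailAddresses());
    }
}
```

Nothing in this sketch has to be built in one iteration. The interfaces can be introduced first and the callers moved over later, as the schedule allows.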
This strategy can be used when refactoring existing code, wrapping legacy service capabilities, or building a new asset’s features iteratively. The fundamental idea remains the same: Align project backlog and refactorings with reuse objectives. This won’t always be possible and that is OK! Agile practices advocate exploration and alignment rather than prediction and certainty. Continuous alignment simply extends these ideas for implementing reusable assets.
13. Code Layout Matters by Steve Freeman
An infeasible number of years ago I worked on a Cobol system where staff weren’t allowed to change the indentation unless they already had a reason to change the code, because someone once broke something by letting a line slip into one of the special columns at the beginning of a line. This applied even if the layout was misleading, which it sometimes was, so we had to read the code very carefully because we couldn’t trust it. The policy must have cost a fortune in programmer drag.
There’s research to show that we all spend much more of our programming time navigating and reading code (finding where to make the change) than actually typing, so that’s what we want to optimize for.
- Easy to scan. People are really good at visual pattern matching (a leftover from the time when we had to spot lions on the savannah), so I can help myself by making everything that isn’t directly relevant to the domain, all the “accidental complexity” that comes with most commercial languages, fade into the background by standardizing it. If code that behaves the same looks the same, then my perceptual system will help me pick out the differences. That’s why I also observe conventions about how to lay out the parts of a class within a compilation unit: constants, fields, public methods, private methods.
- Expressive layout. We’ve all learned to take the time to find the right names so that our code expresses as clearly as possible what it does, rather than just listing the steps — right? The code’s layout is part of this expressiveness too. A first cut is to have the team agree on an automatic formatter for the basics, then I might make adjustments by hand while I’m coding. Unless there’s active dissension, a team will quickly converge on a common “hand-finished” style. A formatter cannot understand my intentions (I should know, I once wrote one), and it’s more important to me that the line breaks and groupings reflect the intention of the code, not just the syntax of the language. (Kevin McGuire freed me from my bondage to automatic code formatters.)
- Compact format. The more I can get on a screen, the more I can see without breaking context by scrolling or switching files, which means I can keep less state in my head. Long procedure comments and lots of whitespace made sense for 8-character names and line printers, but now I live in an IDE that does syntax coloring and cross linking. Pixels are my limiting factor so I want every one to contribute towards my understanding of the code. I want the layout to help me understand the code, but no more than that.
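The class layout convention mentioned under “Easy to scan” can be sketched as follows. The `Thermostat` class is hypothetical; only the ordering of its parts matters. Where the constructor goes is not stated above, so placing it between the fields and the public methods is an assumption.

```java
// Hypothetical class; only the ordering of its parts matters here.
final class Thermostat {
    // 1. Constants
    private static final int MIN_CELSIUS = 5;
    private static final int MAX_CELSIUS = 30;

    // 2. Fields
    private int target;

    // Constructor placement is an assumption; the convention above does not mention it.
    Thermostat(int initialTarget) {
        this.target = clamp(initialTarget);
    }

    // 3. Public methods
    public int target() {
        return target;
    }

    public void adjustBy(int delta) {
        target = clamp(target + delta);
    }

    // 4. Private methods
    private static int clamp(int celsius) {
        return Math.max(MIN_CELSIUS, Math.min(MAX_CELSIUS, celsius));
    }
}
```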
A non-programmer friend once remarked that code looks like poetry. I get that feeling from really good code, that everything in the text has a purpose and that it’s there to help me understand the idea. Unfortunately, writing code doesn’t have the same romantic image as writing poetry.
14. Code Reviews by Mattias Karlsson
You should do code reviews. Why? Because they increase code quality and reduce defect rate. But not necessarily for the reasons you might think.
Because they may previously have had some bad experiences with reviews, many programmers tend to dislike code reviews. I have seen organizations that require that all code pass a formal review before being deployed to production. Often it is the architect or a lead developer doing this review, a practice that can be described as architect reviews everything. This is stated in their software development process manual, so therefore the programmers must comply. There may be some organizations that need such a rigid and formal process, but most do not. In most organizations such an approach is counterproductive. Reviewees can feel like they are being judged by a parole board. Reviewers need both the time to read the code and the time to keep up to date with all the details of the system. The reviewers can rapidly become the bottleneck in this process, and the process soon degenerates.
Instead of simply correcting mistakes in code, the purpose of code reviews should be to share knowledge and establish common coding guidelines. Sharing your code with other programmers enables collective code ownership. Let a random team member walk through the code with the rest of the team. Instead of looking for errors you should review the code by trying to learn it and understand it.
Be gentle during code reviews. Ensure that comments are constructive, not caustic. Introduce different review roles for the review meeting, to avoid having organizational seniority among team members affect the code review. Examples of roles could include having one reviewer focus on documentation, another on exceptions, and a third to look at the functionality. This approach helps to spread the review burden across the team members.
Have a regular code review day each week. Spend a couple of hours in a review meeting. Rotate the reviewee every meeting in a simple round-robin pattern. Remember to switch roles among team members every review meeting too. Involve newbies in code reviews. They may be inexperienced, but their fresh university knowledge can provide a different perspective. Involve experts for their experience and knowledge. They will identify error-prone code faster and with more accuracy. Code reviews will flow more easily if the team has coding conventions that are checked by tools. That way, code formatting will never be discussed during the code review meeting.
Making code reviews fun is perhaps the most important contributor to success. Reviews are about the people reviewing. If the review meeting is painful or dull it will be hard to motivate anyone. Make it an informal code review whose prime purpose is sharing knowledge between team members. Leave sarcastic comments outside and bring a cake or brown bag lunch instead.
15. Coding with Reason by Yechiel Kimchi
Trying to reason about software correctness by hand results in a formal proof that is longer than the code and is more likely to contain errors than the code. Automated tools are preferable, but not always possible. What follows describes a middle path: reasoning semi-formally about correctness.
The underlying approach is to divide all the code under consideration into short sections (from a single line, such as a function call, to blocks of fewer than ten lines) and argue about their correctness. The arguments need only be strong enough to convince your devil’s advocate peer programmer.
A section should be chosen so that at each endpoint the state of the program (namely, the program counter and the values of all “living” objects) satisfies an easily described property, and that the functionality of that section (state transformation) is easy to describe as a single task — these will make reasoning simpler. Such endpoint properties generalize concepts like precondition and postcondition for functions, and invariant for loops and classes (with respect to their instances). Striving for sections to be as independent of one another as possible simplifies reasoning and is indispensable when these sections are to be modified.
Many of the coding practices that are well known (although perhaps less well followed) and considered ‘good’ make reasoning easier. Hence, just by intending to reason about your code, you already start thinking toward a better style and structure. Unsurprisingly, most of these practices can be checked by static code analyzers:
- Avoid using goto statements, as they make remote sections highly interdependent.
- Avoid using modifiable global variables, as they make all sections that use them dependent.
- Each variable should have the smallest possible scope. For example, a local object can be declared right before its first usage.
- Make objects immutable whenever relevant.
- Make the code readable by using spacing, both horizontal and vertical. For example, aligning related structures and using an empty line to separate two sections.
- Make the code self-documenting by choosing descriptive (but relatively short) names for objects, types, functions, etc.
- If you need a nested section, make it a function.
- Make your functions short and focused on a single task. The old 24-line limit still applies. Although screen size and resolution have changed, nothing has changed in human cognition since the 1960s.
- Functions should have few parameters (four is a good upper bound). This does not restrict the data communicated to functions: Grouping related parameters into a single object lets you rely on that object’s invariants and saves reasoning about, for example, the parameters’ coherence and consistency.
- More generally, each unit of code, from a block to a library, should have a narrow interface. Less communication reduces the reasoning required. This means that getters that return internal state are a liability — don’t ask an object for information to work with. Instead, ask the object to do the work with the information it already has. In other words, encapsulation is all — and only — about narrow interfaces.
- In order to preserve class invariants, usage of setters should be discouraged, as setters tend to allow invariants that govern an object’s state to be broken.
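Several of these practices can be seen together in a minimal sketch. The `DateRange` type below is hypothetical: it groups two related parameters into one immutable object, establishes its invariant once in the constructor, offers no setters, and does the work itself rather than exposing its fields.

```java
// Hypothetical value type grouping two related parameters (days counted from some epoch).
final class DateRange {
    private final int startDay;
    private final int endDay;

    // The invariant startDay <= endDay is established here and can never be broken
    // afterwards, because the fields are final and there are no setters.
    DateRange(int startDay, int endDay) {
        if (startDay > endDay) {
            throw new IllegalArgumentException("startDay must not be after endDay");
        }
        this.startDay = startDay;
        this.endDay = endDay;
    }

    // Because of the invariant, callers need not handle a negative length
    // (int overflow aside).
    int lengthInDays() {
        return endDay - startDay;
    }

    // The object answers the question itself instead of handing out its fields.
    boolean contains(int day) {
        return startDay <= day && day <= endDay;
    }

    // "Modification" returns a new object; the constructor re-checks the invariant.
    DateRange extendedBy(int days) {
        return new DateRange(startDay, endDay + days);
    }
}
```

A function that takes one `DateRange` instead of two loose integers has one fewer thing to reason about: it never needs to ask whether the start precedes the end.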
As well as reasoning about its correctness, arguing about your code gives you understanding of it. Communicate the insights you gain for everyone’s benefit.
16. A Comment on Comments by Cal Evans
In my first programming class in college, my teacher handed out two BASIC coding sheets. On the board, the assignment read “Write a program to input and average 10 bowling scores.” Then the teacher left the room. How hard could this be? I don’t remember my final solution but I’m sure it had a FOR/NEXT loop in it and couldn’t have been more than 15 lines long in total. Coding sheets — for you kids reading this, yes, we used to write code out longhand before actually entering it into a computer — allowed for around 70 lines of code each. I was very confused as to why the teacher would have given us two sheets. Since my handwriting has always been atrocious, I used the second one to recopy my code very neatly, hoping to get a couple extra points for style.
Much to my surprise, when I received the assignment back at the start of the next class, I received a barely passing grade. (It was to be an omen to me for the rest of my time in college.) Scrawled across the top of my neatly copied code, “No comments?”
It was not enough that the teacher and I both knew what the program was supposed to do. Part of the point of the assignment was to teach me that my code should explain itself to the next programmer coming behind me. It’s a lesson I’ve not forgotten.
Comments are not evil. They are as necessary to programming as basic branching or looping constructs. Most modern languages have a tool akin to javadoc that will parse properly formatted comments to automatically build an API document. This is a very good start, but not nearly enough. Inside your code should be explanations about what the code is supposed to be doing. Coding by the old adage, “If it was hard to write, it should be hard to read,” does a disservice to your client, your employer, your colleagues, and your future self.
On the other hand, you can go too far in your commenting. Make sure that your comments clarify your code but do not obscure it. Sprinkle your code with relevant comments explaining what the code is supposed to accomplish. Your header comments should give any programmer enough information to use your code without having to read it, while your in-line comments should assist the next developer in fixing or extending it.
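To make the distinction between header and in-line comments concrete, here is a minimal sketch in Java using javadoc syntax, loosely modeled on the bowling-score assignment. The class name, method name, and the 0 to 300 score range noted in the comment are assumptions made for this example.

```java
// Hypothetical utility class for the bowling-score assignment.
final class BowlingScores {
    private BowlingScores() {
    }

    /**
     * Returns the arithmetic mean of the given bowling scores.
     *
     * @param scores one score per game, each expected to be between 0 and 300
     *               (the range is not validated here)
     * @return the mean score
     * @throws IllegalArgumentException if {@code scores} is empty
     */
    static double average(int[] scores) {
        if (scores.length == 0) {
            throw new IllegalArgumentException("at least one score is required");
        }
        // Accumulate in a long so that a large batch of scores cannot overflow an int.
        long sum = 0;
        for (int score : scores) {
            sum += score;
        }
        return (double) sum / scores.length;
    }
}
```

The header comment is enough to call `average` without reading its body, while the in-line comment records a decision that the next developer might otherwise undo.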
At one job, I disagreed with a design decision made by those above me. Feeling rather snarky, as young programmers often do, I pasted the text of the email instructing me to use their design into the header comment block of the file. It turns out that managers at this particular shop actually reviewed the code when it was committed. It was my first introduction to the term career-limiting move.
17. Comment Only What the Code Cannot Say by Kevlin Henney
The difference between theory and practice is greater in practice than it is in theory — an observation that certainly applies to comments. In theory, the general idea of commenting code sounds like a worthy one: Offer the reader detail, an explanation of what’s going on. What could be more helpful than being helpful? In practice, however, comments often become a blight. As with any other form of writing, there is a skill to writing good comments. Much of the skill is in knowing when not to write them.
When code is ill-formed, compilers, interpreters, and other tools will be sure to object. If the code is in some way functionally incorrect, reviews, static analysis, tests, and day-to-day use in a production environment will flush most bugs out. But what about comments? In The Elements of Programming Style Kernighan and Plauger noted that “a comment is of zero (or negative) value if it is wrong.” And yet such comments often litter and survive in a code base in a way that coding errors never could. They provide a constant source of distraction and misinformation, a subtle but constant drag on a programmer’s thinking.
What of comments that are not technically wrong, but add no value to the code? Such comments are noise. Comments that parrot the code offer nothing extra to the reader — stating something once in code and again in natural language does not make it any truer or more real. Commented-out code is not executable code, so it has no useful effect for either reader or runtime. It also becomes stale very quickly. Version-related comments and commented-out code try to address questions of versioning and history. These questions have already been answered (far more effectively) by version control tools.
A prevalence of noisy comments and incorrect comments in a code base encourages programmers to ignore all comments, either by skipping past them or by taking active measures to hide them. Programmers are resourceful and will route around anything perceived to be damage: folding comments up; switching coloring scheme so that comments and the background are the same color; scripting to filter out comments. To save a code base from such misapplications of programmer ingenuity, and to reduce the risk of overlooking any comments of genuine value, comments should be treated as if they were code. Each comment should add some value for the reader, otherwise it is waste that should be removed or rewritten.
What then qualifies as value? Comments should say something code does not and cannot say. A comment explaining what a piece of code should already say is an invitation to change code structure or coding conventions so the code speaks for itself. Instead of compensating for poor method or class names, rename them. Instead of commenting sections in long functions, extract smaller functions whose names capture the former sections’ intent. Try to express as much as possible through code. Any shortfall between what you can express in code and what you would like to express in total becomes a plausible candidate for a useful comment. Comment what the code cannot say, not simply what it does not say.
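A minimal sketch of this advice follows. The `PasswordPolicy` class and its rule are hypothetical. The first method needs a comment to explain its condition; the second lets method names do that job. The one comment that remains on the constant states something the code cannot say: where the number comes from (here, an assumed external security policy).

```java
// Hypothetical class, invented for illustration only.
final class PasswordPolicy {
    // Assumption for this example: the minimum of 12 is mandated by an external
    // security policy. The code cannot express that origin, so a comment does.
    private static final int MIN_LENGTH = 12;

    private PasswordPolicy() {
    }

    // Before: a comment compensates for an unnamed condition.
    static boolean checkBefore(String p) {
        // check that the password is long enough and has a digit
        return p.length() >= MIN_LENGTH && p.chars().anyMatch(Character::isDigit);
    }

    // After: the names say what the comment used to say.
    static boolean isAcceptable(String password) {
        return isLongEnough(password) && containsDigit(password);
    }

    private static boolean isLongEnough(String password) {
        return password.length() >= MIN_LENGTH;
    }

    private static boolean containsDigit(String password) {
        return password.chars().anyMatch(Character::isDigit);
    }
}
```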
18. Continuous Learning by Clint Shank
We live in interesting times. As development gets distributed across the globe, you learn there are lots of people capable of doing your job. You need to keep learning to stay marketable. Otherwise, you’ll become a dinosaur, stuck in the same job until, one day, you’ll no longer be needed or your job gets outsourced to some cheaper resource.
So what do you do about it? Some employers are generous enough to provide training to broaden your skill set. Others may not be able to spare the time or money for any training at all. To play it safe, you need to take responsibility for your own education.
Here’s a list of ways to keep you learning. Many of these can be found on the Internet for free:
- Read books, magazines, blogs, Twitter feeds, and web sites. If you want to go deeper into a subject, consider joining a mailing list or newsgroup.
- If you really want to get immersed in a technology, get hands on — write some code.
- Always try to work with a mentor, as being the top guy can hinder your education. Although you can learn something from anybody, you can learn a whole lot more from someone smarter or more experienced than you. If you can’t find a mentor, consider moving on.
- Use virtual mentors. Find authors and developers on the web who you really like and read everything they write. Subscribe to their blogs.
- Get to know the frameworks and libraries you use. Knowing how something works makes you know how to use it better. If they’re open source, you’re really in luck. Use the debugger to step through the code to see what’s going on under the hood. You’ll get to see code written and reviewed by some really smart people.
- Whenever you make a mistake, fix a bug, or run into a problem, try to really understand what happened. It’s likely that somebody else ran into the same problem and posted it somewhere on the web. Google is really useful here.
- A really good way to learn something is to teach or speak about it. When people are going to listen to you and ask you questions, you’ll be highly motivated to learn. Try a lunch-n-learn at work, a user group, or a local conference.
- Join or start a study group (à la patterns community) or a local user group for a language, technology, or discipline you are interested in.
- Go to conferences. And if you can’t go, many conferences put their talks online for free.
- Long commute? Listen to podcasts.
- Ever run a static analysis tool over the code base or look at the warnings in your IDE? Understand what they’re reporting and why.
- Follow the advice of The Pragmatic Programmers and learn a new language every year. At least learn a new technology or tool. Branching out gives you new ideas you can use in your current technology stack.
- Not everything you learn has to be about technology. Learn the domain you’re working in so you can better understand the requirements and help solve the business problem. Learning how to be more productive — how to work better — is another good option.
- Go back to school.
It would be nice to have the capability that Neo had in The Matrix, and simply download the information we needed into our brains. But we don’t, so it will take a time commitment. You don’t have to spend every waking hour learning. A little time, say each week, is better than nothing. There is (or should be) a life outside of work.
Technology changes fast. Don’t get left behind.
19. Convenience Is Not an -ility by Gregor Hohpe
Much has been said about the importance and challenges of designing good APIs. It’s difficult to get right the first time and it’s even more difficult to change later. Sort of like raising children. Most experienced programmers have learned that a good API follows a consistent level of abstraction, exhibits consistency and symmetry, and forms the vocabulary for an expressive language. Alas, being aware of the guiding principles does not automatically translate into appropriate behavior. Eating sweets is bad for you.
Instead of preaching from on high, I want to pick on a particular API design ‘strategy,’ one that I encounter time and again: the argument of convenience. It typically begins with one of the following ‘insights:’
- I don’t want other classes to have to make two separate calls to do this one thing.
- Why should I make another method if it’s almost the same as this method? I’ll just add a simple switch.
- See, it’s very easy: If the second string parameter ends with “.txt”, the method automatically assumes that the first parameter is a file name, so I really don’t need two methods.
While well intended, such arguments are prone to decrease the readability of code using the API. A method invocation like
parser.processNodes(text, false);
is virtually meaningless without knowing the implementation or at least consulting the documentation. This method was likely designed for the convenience of the implementer as opposed to the convenience of the caller — “I don’t want the caller to have to make two separate calls” translated into “I didn’t want to code up two separate methods.” There’s nothing fundamentally wrong with convenience if it’s intended to be the antidote to tediousness, clunkiness, or awkwardness. However, if we think a bit more carefully about it, the antidote to those symptoms is efficiency, consistency, and elegance, not necessarily convenience. APIs are supposed to hide underlying complexity, so we can realistically expect good API design to require some effort. A single large method could certainly be more convenient to write than a well thought-out set of operations, but would it be easier to use?
The metaphor of API as a language can guide us towards better design decisions in these situations. An API should provide an expressive language, which gives the next layer above sufficient vocabulary to ask and answer useful questions. This does not imply it should provide exactly one method, or verb, for each question that may be worth asking. A diverse vocabulary allows us to express subtleties in meaning. For example, we prefer to say run instead of walk(true), even though it could be viewed as essentially the same operation, just executed at different speeds. A consistent and well thought out API vocabulary makes for expressive and easy to understand code in the next layer up. More importantly, a composable vocabulary allows other programmers to use the API in ways you may not have anticipated — a great convenience indeed for the users of the API! Next time you are tempted to lump a few things together into one API method, remember that the English language does not have one word for MakeUpYourRoomBeQuietAndDoYourHomeWork, even though it would seem really convenient for such a frequently requested operation.
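The run versus walk(true) example can be sketched in code. The `Traveler` class and its step sizes are hypothetical; the sketch only contrasts a boolean switch with two separate verbs.

```java
// Hypothetical class; the step sizes are arbitrary values chosen for illustration.
final class Traveler {
    private int position;

    // Convenient for the implementer: one method with a switch.
    // At the call site, walk(true) says nothing about what "true" means.
    void walk(boolean fast) {
        position += fast ? 2 : 1;
    }

    // Convenient for the caller: each verb names its own behavior.
    void walk() {
        position += 1;
    }

    void run() {
        position += 2;
    }

    int position() {
        return position;
    }
}
```

A call such as `traveler.run()` can be read without opening the documentation, which is the kind of convenience the caller actually benefits from.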
20. Deploy Early and Often by Steve Berczuk
Debugging the deployment and installation processes is often put off until close to the end of a project. In some projects writing installation tools is delegated to a release engineer who takes on the task as a “necessary evil.” Reviews and demonstrations are done from a hand-crafted environment to ensure that everything works. The result is that the team gets no experience with the deployment process or the deployed environment until it may be too late to make changes.
The installation/deployment process is the first thing that the customer sees, and a simple installation/deployment process is the first step to having a reliable (or, at least, easy to debug) production environment. The deployed software is what the customer will use. By not ensuring that the deployment sets up the application correctly, you’ll raise questions with your customer before they get to use your software thoroughly.
Starting your project with an installation process will give you time to evolve the process as you move through the product development cycle, and the chance to make changes to the application code to make the installation easier. Running and testing the installation process on a clean environment periodically also provides a check that you have not made assumptions in the code that rely on the development or test environments.
Putting deployment last means that the deployment process may need to be more complicated to work around assumptions in the code. What seemed a great idea in an IDE, where you have full control over an environment, might make for a much more complicated deployment process. It is better to know all the trade-offs sooner rather than later.
While “being able to deploy” doesn’t seem to have a lot of business value early on as compared to seeing an application run on a developer’s laptop, the simple truth is that until you can demonstrate your application on the target environment, there is a lot of work to do before you can deliver business value. If your rationale for putting off a deployment process is that it is trivial, then do it anyway since it is low cost. If it’s too complicated, or if there are too many uncertainties, do what you would do with application code: experiment, evaluate, and refactor the deployment process as you go.
The installation/deployment process is essential to the productivity of your customers or your professional services team, so you should be testing and refactoring this process as you go. We test and refactor the source code throughout a project. The deployment deserves no less.
21. Distinguish Business Exceptions from Technical by Dan Bergh Johnsson
There are basically two reasons that things go wrong at runtime: technical problems that prevent us from using the application and business logic that prevents us from misusing the application. Most modern languages, such as LISP, Java, Smalltalk, and C#, use exceptions to signal both these situations. However, the two situations are so different that they should be carefully held apart. It is a potential source of confusion to represent them both using the same exception hierarchy, not to mention the same exception class.
An unresolvable technical problem can occur when there is a programming error. For example, if you try to access element 83 from an array of size 17, then the program is clearly off track, and some exception should result. The subtler version is calling some library code with inappropriate arguments, causing the same situation on the inside of the library.
It would be a mistake to attempt to resolve these situations you caused yourself. Instead we let the exception bubble up to the highest architectural level and let some general exception-handling mechanism do what it can to ensure the system is in a safe state, such as rolling back a transaction, logging and alerting administration, and reporting back (politely) to the user.
A variant of this situation is when you are in the “library situation” and a caller has broken the contract of your method, e.g., passing a totally bizarre argument or not having a dependent object set up properly. This is on a par with accessing element 83 of a 17-element array: the caller should have checked; not doing so is a programmer error on the client side. The proper response is to throw a technical exception.
A different, but still technical, situation is when the program cannot proceed because of a problem in the execution environment, such as an unresponsive database. In this situation you must assume that the infrastructure did what it could to resolve the situation — repairing connections and retrying a reasonable number of times — and failed. Even if the cause is different, the situation for the calling code is similar: there is little it can do about it. So, we signal the situation through an exception that we let bubble up to the general exception handling mechanism.
In contrast to these, we have the situation where you cannot complete the call for a domain-logical reason. In this case we have encountered a situation that is an exception, i.e., unusual and undesirable, but not bizarre or programmatically in error. An example is trying to withdraw money from an account with insufficient funds. In other words, this kind of situation is a part of the contract, and throwing an exception is just an alternative return path that is part of the model and that the client should be aware of and be prepared to handle. For these situations it is appropriate to create a specific exception or a separate exception hierarchy so that the client can handle the situation on its own terms.
Mixing technical exceptions and business exceptions in the same hierarchy blurs the distinction and confuses the caller about what the method contract is, what conditions it is required to ensure before calling, and what situations it is supposed to handle. Separating the cases gives clarity and increases the chances that technical exceptions will be handled by some application framework, while the business domain exceptions actually are considered and handled by the client code.
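As a rough illustration of that separation, here is a minimal Python sketch. The class and function names (BusinessError, InsufficientFundsError, Account, try_withdraw) are invented for this example and do not come from the original essay; the built-in ValueError stands in for a "technical" exception.

```python
class BusinessError(Exception):
    """Base for domain-level outcomes that callers are expected to handle."""


class InsufficientFundsError(BusinessError):
    def __init__(self, balance, requested):
        super().__init__(f"cannot withdraw {requested}: balance is {balance}")
        self.balance = balance
        self.requested = requested


class Account:
    def __init__(self, balance=0):
        self.balance = balance

    def withdraw(self, amount):
        # Contract violation by the caller: a programmer error, so raise a
        # technical exception (here the built-in ValueError).
        if amount <= 0:
            raise ValueError("amount must be positive")
        # Domain rule: part of the contract, so raise a business exception.
        if amount > self.balance:
            raise InsufficientFundsError(self.balance, amount)
        self.balance -= amount
        return self.balance


def try_withdraw(account, amount):
    """Client code handles only the business case; technical errors bubble up."""
    try:
        account.withdraw(amount)
        return "ok"
    except InsufficientFundsError as err:
        return f"declined: short by {err.requested - err.balance}"
```

Because the two kinds of exception live in separate hierarchies, the client can catch BusinessError subclasses without accidentally swallowing programmer errors, which continue up to whatever general handler the application provides.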
22. Do Lots of Deliberate Practice by Jon Jagger
Deliberate practice is not simply performing a task. If you ask yourself “Why am I performing this task?” and your answer is “To complete the task,” then you’re not doing deliberate practice.
You do deliberate practice to improve your ability to perform a task. It’s about skill and technique. Deliberate practice means repetition. It means performing the task with the aim of increasing your mastery of one or more aspects of the task. It means repeating the repetition. Slowly, over and over again. Until you achieve your desired level of mastery. You do deliberate practice to master the task, not to complete the task.
The principal aim of paid development is to finish a product whereas the principal aim of deliberate practice is to improve your performance. They are not the same. Ask yourself, how much of your time do you spend developing someone else’s product? How much developing yourself?
How much deliberate practice does it take to acquire expertise?
- Peter Norvig writes that “It may be that 10,000 hours […] is the magic number.”
- In Leading Lean Software Development Mary Poppendieck notes that “It takes elite performers a minimum of 10,000 hours of deliberate focused practice to become experts.”
The expertise arrives gradually over time — not all at once in the 10,000th hour! Nevertheless, 10,000 hours is a lot: about 20 hours per week for 10 years. Given this level of commitment you might be worrying that you’re just not expert material. You are. Greatness is largely a matter of conscious choice. Your choice. Research over the last two decades has shown the main factor in acquiring expertise is time spent doing deliberate practice. Innate ability is not the main factor.
- Mary: “There is broad consensus among researchers of expert performance that inborn talent does not account for much more than a threshold; you have to have a minimum amount of natural ability to get started in a sport or profession. After that, the people who excel are the ones who work the hardest.”
There is little point deliberately practicing something you are already an expert at. Deliberate practice means practicing something you are not good at.
- Peter: “The key [to developing expertise] is deliberative practice: not just doing it again and again, but challenging yourself with a task that is just beyond your current ability, trying it, analyzing your performance while and after doing it, and correcting any mistakes.”
- Mary: “Deliberate practice does not mean doing what you are good at; it means challenging yourself, doing what you are not good at. So it’s not necessarily fun.”
Deliberate practice is about learning. About learning that changes you; learning that changes your behavior. Good luck.
23. Domain-Specific Languages by Michael Hunger
Whenever you listen to a discussion by experts in any domain, be it chess players, kindergarten teachers, or insurance agents, you’ll notice that their vocabulary is quite different from everyday language. That’s part of what domain-specific languages (DSLs) are about: A specific domain has a specialized vocabulary to describe the things that are particular to that domain.
In the world of software, DSLs are about executable expressions in a language specific to a domain with limited vocabulary and grammar that is readable, understandable, and — hopefully — writable by domain experts. DSLs targeted at software developers or scientists have been around for a long time. For example, the Unix ‘little languages’ found in configuration files and the languages created with the power of LISP macros are some of the older examples.
DSLs are commonly classified as either internal or external:
- Internal DSLs are written in a general purpose programming language whose syntax has been bent to look much more like natural language. This is easier for languages that offer more syntactic sugar and formatting possibilities (e.g., Ruby and Scala) than it is for others that do not (e.g., Java). Most internal DSLs wrap existing APIs, libraries, or business code and provide a wrapper for less mind-bending access to the functionality. They are directly executable by just running them. Depending on the implementation and the domain, they are used to build data structures, define dependencies, run processes or tasks, communicate with other systems, or validate user input. The syntax of an internal DSL is constrained by the host language. There are many patterns — e.g., expression builder, method chaining, and annotation — that can help you to bend the host language to your DSL. If the host language doesn’t require recompilation, an internal DSL can be developed quite quickly working side by side with a domain expert.
- External DSLs are textual or graphical expressions of the language, although textual DSLs tend to be more common than graphical ones. Textual expressions can be processed by a tool chain that includes lexer, parser, model transformer, generators, and any other type of post-processing. External DSLs are mostly read into internal models which form the basis for further processing. It is helpful to define a grammar (e.g., in EBNF). A grammar provides the starting point for generating parts of the tool chain (e.g., editor, visualizer, parser generator). For simple DSLs, a handmade parser (using regular expressions, for instance) may be sufficient. Custom parsers can become unwieldy if too much is asked of them, so it makes sense to look at tools designed specifically for working with language grammars and DSLs, e.g., openArchitectureWare, ANTLR, SableCC, AndroMDA. Defining external DSLs as XML dialects is also quite common, although readability is often an issue, especially for non-technical readers.
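To make the internal DSL idea more tangible, here is a small Python sketch that uses method chaining to build a query over plain dictionaries. The Query class, its method names, and the sample insurance data are all assumptions made up for illustration; they are not taken from any particular library.

```python
class Query:
    """A tiny internal DSL for filtering dict records via method chaining."""

    def __init__(self, records):
        self._records = list(records)
        self._predicates = []
        self._sort_key = None

    def where(self, field, equals):
        self._predicates.append(lambda r: r.get(field) == equals)
        return self  # returning self is what enables chaining

    def order_by(self, field):
        self._sort_key = field
        return self

    def run(self):
        rows = [r for r in self._records if all(p(r) for p in self._predicates)]
        if self._sort_key is not None:
            rows.sort(key=lambda r: r[self._sort_key])
        return rows


# Hypothetical sample data for the example.
policies = [
    {"holder": "Ann", "kind": "car", "premium": 300},
    {"holder": "Bob", "kind": "home", "premium": 500},
    {"holder": "Cy", "kind": "car", "premium": 200},
]

cheap_first = Query(policies).where("kind", equals="car").order_by("premium").run()
```

The last line reads close to a sentence a domain expert might say, yet it is ordinary Python, so the syntax stays constrained by the host language as described above.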
You must always take the target audience of your DSL into account. Are they developers, managers, business customers, or end users? You have to adapt the technical level of the language, the available tools, syntax help (e.g., IntelliSense), early validation, visualization, and representation to the intended audience. By hiding technical details, DSLs can empower users by giving them the ability to adapt systems to their needs without requiring the help of developers. They can also speed up development, because work can be distributed once the initial language framework is in place. The language can be evolved gradually, and there are different migration paths available for existing expressions and grammars.
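Returning to the external DSLs described in the list above, the following Python sketch shows what a handmade, regular-expression-based parser might look like for a very small textual language. The rule syntax ("discount 10% for student"), the function names, and the pricing logic are invented purely for illustration; a real DSL of any size would more likely use a grammar and a parser generator.

```python
import re

# Hypothetical one-line rule syntax: "discount <percent>% for <customer-type>"
RULE = re.compile(r"^discount\s+(\d+)%\s+for\s+(\w+)$")


def parse_rules(text):
    """Read the DSL text into an internal model: {customer_type: percent}."""
    model = {}
    for number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = RULE.match(line)
        if match is None:
            raise SyntaxError(f"line {number}: cannot parse {line!r}")
        percent, customer_type = match.groups()
        model[customer_type] = int(percent)
    return model


def price_for(model, customer_type, list_price):
    """Interpret the model: apply the matching discount, if any."""
    percent = model.get(customer_type, 0)
    return list_price * (100 - percent) / 100
```

The text is first read into an internal model (a dictionary here), and only then interpreted, which mirrors the processing chain sketched above on a very small scale.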
24. Don’t Be Afraid to Break Things by Mike Lewis
Everyone with industry experience has undoubtedly worked on a project where the codebase was precarious at best. The system is poorly factored, and changing one thing always manages to break another unrelated feature. Whenever a module is added, the coder’s goal is to change as little as possible, and hold their breath during every release. This is the software equivalent of playing Jenga with I-beams in a skyscraper, and is bound for disaster.
Making changes is so nerve-racking because the system is sick. It needs a doctor; otherwise its condition will only worsen. You already know what is wrong with your system, but you are afraid of breaking the eggs to make your omelet. A skilled surgeon knows that cuts have to be made in order to operate, but the skilled surgeon also knows that the cuts are temporary and will heal. The end result of the operation is worth the initial pain, and the patient should heal to a better state than they were in before the surgery.
Don’t be afraid of your code. Who cares if something gets temporarily broken while you move things around? A paralyzing fear of change is what got your project into this state to begin with. Investing the time to refactor will pay for itself several times over the life cycle of your project. An added benefit is that your team’s experience dealing with the sick system makes you all experts in knowing how it should work. Apply this knowledge rather than resent it. Working on a system you hate is not how anybody should have to spend their time.
Redefine internal interfaces, restructure modules, refactor copy–pasted code, and simplify your design by reducing dependencies. You can significantly reduce code complexity by eliminating corner cases, which often result from improperly coupled features. Slowly transition the old structure into the new one, testing along the way. Trying to accomplish a large refactor in “one big shebang” will cause enough problems to make you consider abandoning the whole effort midway through.
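One way to "test along the way" is to keep the old code next to its replacement for a while and check that both agree. The Python sketch below assumes a made-up shipping-cost function; the names, rates, and sample inputs are hypothetical and only serve to show the shape of such a check.

```python
def shipping_cost_legacy(weight_kg, express):
    # Hypothetical legacy code with copy-pasted branches and special cases.
    if express:
        if weight_kg <= 1:
            return 10.0
        else:
            return 10.0 + (weight_kg - 1) * 4.0
    else:
        if weight_kg <= 1:
            return 5.0
        else:
            return 5.0 + (weight_kg - 1) * 2.0


def shipping_cost(weight_kg, express):
    # Refactored: one formula, no duplicated branches.
    base, per_extra_kg = (10.0, 4.0) if express else (5.0, 2.0)
    return base + max(weight_kg - 1, 0) * per_extra_kg


def check_same_behavior():
    """Characterization check: the new code must match the old on sample inputs."""
    for express in (False, True):
        for weight in (0, 0.5, 1, 1.5, 2, 10):
            old = shipping_cost_legacy(weight, express)
            new = shipping_cost(weight, express)
            assert old == new, (weight, express, old, new)
    return True
```

Once the check has passed for long enough to earn the team's trust, the legacy function can be deleted, and the next small step of the refactoring can begin.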
Be the surgeon who isn’t afraid to cut out the sick parts to make room for healing. The attitude is contagious and will inspire others to start working on those cleanup projects they’ve been putting off. Keep a “hygiene” list of tasks that the team feels are worthwhile for the general good of the project. Convince management that even though these tasks may not produce visible results, they will reduce expenses and expedite future releases. Never stop caring about the general “health” of the code.
25. Don’t Be Cute with Your Test Data by Rod Begbie
It was getting late. I was throwing in some placeholder data to test the page layout I’d been working on.
I appropriated the members of The Clash for the names of users. Company names? Song titles by the Sex Pistols would do. Now I needed some stock ticker symbols — just some four letter words in capital letters.
I used those four letter words.
It seemed harmless. Just something to amuse myself, and maybe the other developers the next day before I wired up the real data source.
The following morning, a project manager took some screenshots for a presentation.
Programming history is littered with these kinds of war stories. Things that developers and designers did “that no one else would see” which unexpectedly became visible.
The leak type can vary but, when it happens, it can be deadly to the person, team, or company responsible. Examples include:
- During a status meeting, a client clicks on a button which is as yet unimplemented. They are told: “Don’t click that again, you moron.”
- A programmer maintaining a legacy system has been told to add an error dialog, and decides to use the output of existing behind-the-scenes logging to power it. Users are suddenly faced with messages such as “Holy database commit failure, Batman!” when something breaks.
- Someone mixes up the test and live administration interfaces, and does some “funny” data entry. Customers spot a $1m “Bill Gates-shaped personal massager” on sale in your online store.
To appropriate the old saying that “a lie can travel halfway around the world while the truth is putting on its shoes,” in this day and age a screw-up can be Dugg, Twittered, and Flibflarbed before anyone in the developer’s timezone is awake to do anything about it.
Even your source code isn’t necessarily free of scrutiny. In 2004, when a tarball of the Windows 2000 source code made its way onto file sharing networks, some folks merrily grepped through it for profanity, insults, and other funny content. (The comment // TERRIBLE HORRIBLE NO GOOD VERY BAD HACK has, I will admit, been appropriated by me from time to time since!)
In summary, when writing any text in your code — whether comments, logging, dialogs, or test data — always ask yourself how it will look if it becomes public. It will save some red faces all round.
26. Don’t Ignore that Error! by Pete Goodliffe
I was walking down the street one evening to meet some friends in a bar. We hadn’t shared a beer in some time and I was looking forward to seeing them again. In my haste, I wasn’t looking where I was going. I tripped over the edge of a curb and ended up flat on my face. Well, it serves me right for not paying attention, I guess.
It hurt my leg, but I was in a hurry to meet my friends. So I pulled myself up and carried on. As I walked further the pain was getting worse. Although I’d initially dismissed it as shock, I rapidly realized there was something wrong.
But I hurried on to the bar regardless. I was in agony by the time I arrived. I didn’t have a great night out, because I was terribly distracted. In the morning I went to the doctor and found out I’d fractured my shin bone. Had I stopped when I felt the pain, I’d’ve prevented a lot of extra damage that I caused by walking on it. Probably the worst morning after of my life.
Too many programmers write code like my disastrous night out.
Error, what error? It won’t be serious. Honestly. I can ignore it.
This is not a winning strategy for solid code. In fact, it’s just plain laziness. (The wrong sort.) No matter how unlikely you think an error is in your code, you should always check for it, and always handle it. Every time. You’re not saving time if you don’t: You’re storing up potential problems for the future.
We report errors in our code in a number of ways, including:
- Return codes can be used as the resulting value of a function to mean “it didn’t work.” Error return codes are far too easy to ignore. You won’t see anything in the code to highlight the problem. Indeed, it’s become standard practice to ignore some standard C functions’ return values. How often do you check the return value from printf?
- errno is a curious C aberration, a separate global variable set to signal error. It’s easy to ignore, hard to use, and leads to all sorts of nasty problems — for example, what happens when you have multiple threads calling the same function? Some platforms insulate you from pain here; others do not.
- Exceptions are a more structured language-supported way of signaling and handling errors. And you can’t possibly ignore them. Or can you? I’ve seen lots of code like this:
try {
    // ...do something...
}
catch (...) {} // ignore errors
The saving grace of this awful construct is that it highlights the fact you’re doing something morally dubious.
If you ignore an error, turn a blind eye, and pretend that nothing has gone wrong, you run great risks. Just as my leg ended up in a worse state than if I’d stopped walking on it immediately, plowing on regardless can lead to very complex failures. Deal with problems at the earliest opportunity. Keep a short account.
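The difference between swallowing an error and dealing with it early can be sketched in a few lines of Python. The function names and the settings-file scenario are assumptions chosen for this example; the point is only the contrast between a catch-all handler and a handler for the one failure the code can actually act on.

```python
import logging

logger = logging.getLogger(__name__)


def read_settings_ignoring_errors(path):
    # Anti-pattern: every failure is silently turned into "no settings".
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except Exception:
        return ""


def read_settings(path, default=None):
    """Handle the one failure we can act on; let everything else propagate."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except FileNotFoundError:
        if default is None:
            raise  # no sensible fallback, so the caller must hear about it
        logger.warning("settings file %s missing, using defaults", path)
        return default
```

With the first version, a permissions problem or a corrupt file looks exactly like an empty configuration, and the failure surfaces much later and much further away. The second version either recovers in a visible, logged way or reports the problem immediately.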
Not handling errors leads to:
- Brittle code. Code that’s filled with exciting, hard-to-find bugs.
- Insecure code. Crackers often exploit poor error handling to break into software systems.
- Poor structure. If there are errors from your code that are tedious to deal with continually, you probably have a poor interface. Express it so that the errors are less intrusive and their handling is less onerous.
Just as you should check all potential errors in your code, you need to expose all potentially erroneous conditions in your interfaces. Do not hide them, pretending that your services will always work.
Why don’t we check for errors? There are a number of common excuses. Which of these do you agree with? How would you counter each one?
- Error handling clutters up the flow of the code, making it harder to read, and harder to spot the “normal” flow of execution.
- It’s extra work and I have a deadline looming.
- I know that this function call will never return an error (printf always works, malloc always returns new memory — if it fails we have bigger problems…).
- It’s only a toy program, and needn’t be written to a production-worthy level.
27. Don’t Just Learn the Language, Understand its Culture by Anders Norås
In high school, I had to learn a foreign language. At the time I thought that I’d get by nicely being good at English so I chose to sleep through three years of French class. A few years later I went to Tunisia on vacation. Arabic is the official language there and, because Tunisia is a former French colony, French is also commonly used. English is only spoken in the touristy areas. Because of my linguistic ignorance, I found myself confined at the poolside reading Finnegans Wake, James Joyce’s tour de force in form and language. Joyce’s playful blend of more than forty languages was a surprising albeit exhausting experience. Realizing how interwoven foreign words and phrases gave the author new ways of expressing himself is something I’ve kept with me in my programming career.
In their seminal book, The Pragmatic Programmer, Andy Hunt and Dave Thomas encourage us to learn a new programming language every year. I’ve tried to live by their advice and throughout the years I’ve had the experience of programming in many languages. My most important lesson from my polyglot adventures is that it takes more than just learning the syntax to learn a language: You need to understand its culture. You can write Fortran in any language, but to truly learn a language you have to embrace the language. Don’t make excuses if your C# code is a long Main method with mostly static helper methods; learn why classes make sense. Don’t shy away if you have a hard time understanding the lambda expressions used in functional languages; force yourself to use them.
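As a small illustration of "writing Fortran in any language", compare two Python functions that compute the same thing. The invoice data shape and all function names are hypothetical; the first version carries over index-driven habits from older procedural languages, while the second and third lean on what Python itself offers.

```python
def overdue_total_index_style(invoices):
    # Index-driven loop in the style of older procedural languages.
    total = 0
    i = 0
    while i < len(invoices):
        if invoices[i]["overdue"]:
            total = total + invoices[i]["amount"]
        i = i + 1
    return total


def overdue_total(invoices):
    # Idiomatic Python: a generator expression fed to the built-in sum().
    return sum(inv["amount"] for inv in invoices if inv["overdue"])


def largest_by(invoices, key=lambda inv: inv["amount"]):
    # A lambda as a default argument: behavior passed around as a value.
    return max(invoices, key=key)
```

Both totals are equally correct, but only the second reads the way the language's own community would write it, which is the cultural point being made here.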
Once you’ve learned the ropes of a new language, you’ll be surprised how you’ll start using languages you already know in new ways. I learned how to use delegates effectively in C# from programming Ruby; releasing the full potential of .NET’s generics gave me ideas on how I could make Java generics more useful; and LINQ made it a breeze to teach myself Scala.
You’ll also get a better understanding of design patterns by moving between different languages. C programmers find that C# and Java have commoditized the iterator pattern. In Ruby and other dynamic languages you might still use a visitor, but your implementation won’t look like the example from the Gang of Four book.
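To give a feel for how these patterns change shape between languages, here is a Python sketch (Python is chosen for this example; it is not one of the author's own samples). The expression-tree classes are invented for illustration. The visitor is written with functools.singledispatch from the standard library instead of the accept/visit method pairs of the Gang of Four version, and the iterator is simply a generator function.

```python
from dataclasses import dataclass
from functools import singledispatch


@dataclass
class Number:
    value: float


@dataclass
class Add:
    left: object
    right: object


@dataclass
class Multiply:
    left: object
    right: object


@singledispatch
def evaluate(node):
    # Fallback for node types that have no registered handler.
    raise TypeError(f"unsupported node: {node!r}")


@evaluate.register
def _(node: Number):
    return node.value


@evaluate.register
def _(node: Add):
    return evaluate(node.left) + evaluate(node.right)


@evaluate.register
def _(node: Multiply):
    return evaluate(node.left) * evaluate(node.right)


def leaves(node):
    """Generator: the iterator pattern as a built-in language feature."""
    if isinstance(node, Number):
        yield node.value
    else:
        yield from leaves(node.left)
        yield from leaves(node.right)
```

The node classes know nothing about evaluation, so new operations can be added as new dispatched functions without touching them, which is the intent of the visitor pattern even though none of its classic scaffolding remains.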
Some might argue that Finnegans Wake is unreadable, while others applaud it for its stylistic beauty. To make the book a less daunting read, single language translations are available. Ironically, the first of these was in French. Code is in many ways similar. If you write Wakese code with a little Python, some Java, and a hint of Erlang, your projects will be a mess. If you instead explore new languages to expand your mind and get fresh ideas on how you can solve things in different ways, you will find that the code you write in your trusty old language gets more beautiful for every new language you’ve learned.
You can find all 97 things in the book, 97 Things Every Programmer Should Know, or at the original site.
This work is licensed under a Creative Commons Attribution 3
Best Regards
Justin