Skip to main content

Posts

Showing posts from May, 2010

Clean code, clean logs: easy to read, easy to parse (10/10)

There are two groups of receivers particularly interested in your application logs: human beings (you might disagree, but programmers belong to this group as well) and computers (typically shell scripts written by system administrators). Logs should be suitable for both of these groups. If someone looking from behind your back at your application logs sees:

from Wikipedia


then you were probably not reading my tips carefully enough. The reference to famous Clean code book in the title of this series is not accidental: logs should be readable and easy to understand just like the code should.

On the other hand, if your application produces half GiB of logs each hour, no man and no graphical text editor will ever manage to read them entirely. This is where old-school grep, sed and awk come in handy. If it is possible, try to write logging messages in such a way, that they could be understood both by humans and computers, e.g. avoid formatting of numbers, use patterns that can be easily recog…

Clean code, clean logs: log exceptions properly (9/10)

First of all, avoid logging exceptions, let your framework or container (whatever it is) do it for you*. Logging exceptions is one of the most important roles of logging at all, but many programmers tend to treat logging as a way to handle the exception. They sometimes return default value (typically null, 0 or empty string) and pretend that nothing has happened. Other times they first log the exception and then wrap it and throw back:

log.error("IO exception", e);
throw new MyCustomException(e);

This construct will almost always print the same stack trace two times, because something will eventually catch MyCustomException and log its cause. Log or wrap and throw back (which is preferable), never both, otherwise your logs will be confusing.

But if we really do WANT to log the exception? For some reason (because we don’t read APIs and documentation?), about half of the logging statements I see are wrong. Quick quiz, which of the following log the NPE properly?

try {
Integer x…

Clean code, clean logs: watch out for external systems (8/10)

This is the special case of the previous tip: if you communicate with any external system, consider logging every piece of data that comes out from your application and gets in. Period. Integration is a tough job and diagnosing problems between two applications (think two different vendors, environments, technology stacks and teams) is particularly hard. Recently, for example, we've discovered that logging full messages contents, including SOAP and HTTP headers in Apache CXF web services is extremely useful during integration and system testing.

This is a big overhead and if performance is an issue, you can always disable logging. But what is the point of having fast, but broken application, that no one can fix? Be extra careful when integrating with external systems and prepare to pay that cost. If you are lucky and all your integration is handled by ESB, bus is probably the best place to log every incoming request and response. See for example Mules' <log-component/>.

So…

Impressions after GeeCON 2010

Two days ago I came back from Poznań, Poland, where second edition of GeeCON conference took place. After attending the first edition my expectations were very high and sadly I left Poznań a bit disappointed. It is most likely a matter of my personal taste, but still just a few presentations are worth mentioning.

The biggest surprise and most fabulous piece of lecture has been given by Dawid Weiss on Java in high-performance computing. He managed to combine great show with lots of non-trivial examples. Lots of humour, brilliant slides and great contact with the audience. One of the most charismatic speakers I have seen. But on the other hand it was not a stand-up comedy, where you have lots of fun during the lecture but you don’t gain anything useful after them. Dawid given plenty of examples and micro-benchmarks during his speech, making us believe that tuning, benchmarking and even studying bytecode and assembly language might be interesting. Bravo for Dawid, I was really proud to s…

Clean code, clean logs: log method arguments and return values (7/10)

When you find a bug during development, you typically run a debugger trying to track down the potential cause*. Now imagine for a while you can’t use a debugger. For example, because the bug manifested itself on a customer environment few days ago and everything you have are logs. Would be able to find anything in them?

If you follow the simple rule of logging each method input and output (arguments and return values), you don’t even need debugger any more. Of course, you must be reasonable but every method that: accesses external system (including database), blocks, waits, etc. should be considered. Simply follow this pattern:

public String printDocument(Document doc, Mode mode) {
log.debug("Entering printDocument(doc={}, mode={})", doc, mode);
String id = //Lengthy printing operation
log.debug("Leaving printDocument(): {}", id);
return id;
}

Because you are logging both the beginning and the end of method invocation, you can manually discover inefficient …

Clean code, clean logs: tune your pattern (6/10)

Logging pattern is a wonderful tool, that transparently adds meaningful context to every logging statement you make. But you must consider very carefully which information to include in your pattern. For example, logging date when your logs roll every hour is pointless as the date is already included in the log file name. On the contrary, without logging thread name you would be unable to track any process using logs when two threads work concurrently – the logs will overlap. This might be fine in single-threaded applications – that are almost dead nowadays.

From my experience, ideal logging pattern should include (of course except the logged message itself): current time (without date, milliseconds precision), logging level (if you pay attention to it), name of the thread, simple logger name (not fully qualified) and the message. In Logback it is something like:

<appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
<encoder>
<…

Clean code, clean logs: concise and descriptive (5/10)

Each logging statement should contain both data and description. Consider the following examples:

log.debug("Message processed");
log.debug(message.getJMSMessageID());
log.debug("Message with id '{}' processed", message.getJMSMessageID());

Which log would you like to see while diagnosing failure in an unknown application? Believe me, all the examples above are almost equally common. Another anti-pattern:

if(message instanceof TextMessage)
//...
else
log.warn("Unknown message type");

Was it so hard to include actual message type, message id, etc. in warning string? I know something went wrong, but what, what was the context? Third anti-pattern is "magic-log". Real life example: most programmers in the team knew that 3 ampersands followed by exclamation mark, followed by hash, followed by pseudorandom alphanumeric string log means "Message with XYZ id received". Nobody bothered to change the log, simply someone hit the keyboard an…

Clean code, clean logs: avoid side effects (4/10)

Logging statements should have no or little impact on application behavior. Recently a friend of mine gave an example of a system that threw Hibernates' LazyInitializationException only on some particular environment. As you’ve probably guessed from the context, some logging statement caused lazy initialized collection to be loaded when session was attached. On this environment the logging levels were increased and collection was no longer initialized. Think how long would it take you to find a bug without knowing this context?

Another side effect is slowing the application down. Quick answer: if you log too much or improperly use toString()/string concatenation, logging has a performance side effect. How big? Well, I have seen server restarting every 15 minutes because of a thread starvation caused by excessive logging. Now this is a side effect! From my experience, few hundreds of MiB is probably the upper limit of how much you can log onto disk per hour.

Of course if logging stat…

Clean code, clean logs: do you know what you are logging? (3/10)

Every time you issue a logging statement, take a moment and have a look what exactly will land in your log file. Read your logs afterwards and spot malformed sentences. First of all, avoid NPEs like this:

log.debug("Processing request with id: {}", request.getId());

Are you absolutely sure that request is not null here? Another pitfall is logging collections. If you fetched collection of domain objects from the database using Hibernate and carelessly log them like here:

log.debug("Returning users: {}", users);

SLF4J will call toString() only when the statement is actually printed, which is quite nice. But if it does... Out of memory error, N+1 select problem, thread starvation (logging is synchronous!), lazy initialization exception, logs storage filled completely – each of these might occur.

It is a much better idea to log, for example, only ids of domain objects (or even only size of the collection). But making a collection of ids when having a collection of objects h…

Clean code, clean logs: logging levels are there for you (2/10)

Every time you are making a logging statement, you think hard which logging level is appropriate for this type of event, do you? Somehow 90% of programmers never pay attention to logging levels, simply logging everything on the same level, typically INFO or DEBUG. Why? Logging frameworks have two major benefits over System.out: categories and levels. Both allow you to selectively filter logging statements permanently or only for diagnostics time. If you really can’t see the difference, print this table and look at it every time you start typing "log." in your IDE:

ERROR – something terribly wrong had happened, that must be investigated immediately. No system can tolerate items logged on this level. Example: NPE, database unavailable, mission critical use case cannot be continued.

WARN – the process might be continued, but take extra caution. Actually I always wanted to have two levels here: one for obvious problems where work-around exists (for example: "Current data unav…

Clean code, clean logs: use appropriate tools (1/10)

Many programmers seem to forget how logging application behavior and its current activity is important. When somebody puts:

log.info("!@#$%");

happily somewhere in the code, he probably don't realize the importance of application logs during maintenance, tuning and failure identification. Underestimating the value of good logs is a terrible mistake. I have collected few random advices that I find especially useful when it comes to writing logging routines and I will present them in a series of short articles. First tip (out of ten) is about logging libraries and tools.

In my opinion, SLF4J is the best logging API available, mostly because of a great pattern substitution support:

log.debug("Found {} records matching filter: '{}'", records, filter);

In Log4j you would have to use:

log.debug("Found " + records + " records matching filter: '" + filter + "'");

This is not only longer and less readable, but also inefficient bec…