Wednesday, June 14, 2006

Debuggers, Loggers, and Unit Tests

Several months ago I thought about writing a post here about the very same subject.

But several busy months went by and I didn't get around to it. Now here I am, thinking the exact same thoughts, thinking about blogging about them, and feeling a need to exorcise them.

Back when I first started programming (BASIC, on a TRS-80), getting output at all was cool enough. PRINT was a powerful command. If not as powerful as GOTO, considerably less harmful.

10 PRINT "HELLO"
20 GOTO 10

I'm convinced this exact program was what Dijkstra was thinking of when he wrote his ill-conceived essay. I, too, thought GOSUB was much more leet in those days.

10 GOSUB 30
20 END
30 PRINT "HELLO"
40 RETURN


Ah... memories. Somewhere in my parents' garage I have a TRS-80 (32K!) that I picked up from a customer at kfalls.net, the ISP we used to run. We only had 16K when I was a kid.
If only I could find copies of Pyramid 2000, Calixto Island, El Diablero, Madness and the Minotaur, Packet Man, Temple of ROM...

Anyway. As I was about to say, I've gone through an evolution in coding that I'm going to guess is fairly common industry-wide.

When I first started writing stuff (in BASIC) it was run, and then look for the line number in the SYNTAX ERROR that inevitably followed. That was debugging. Getting into actual logic bugs was a nightmare, especially once you started using GOSUB, and had nothing but LIST to debug with. Unix gurus of the day were probably reveling in their line editors, like 'ed', but I thought Turbo C was stupid to not use line numbers when I eventually started playing with it in high school. And DOS was a huge step down from my interactive MS BASIC ROM shell.

But eventually I learned C, and before long, my only problems weren't syntax errors and the occasional logic error. I had pointer problems! Was I ever glad to learn about the integrated debugger with Visual C. Many long nights stepping through code, watching pointers and counting refs, only to come to a breakpoint, invariably placed right before a bunch of PRINTF statements.

By the time I got into Object Oriented Programming with C++, I'd a firm grip on pointers that I don't claim to maintain today, but I was again running into lots of logic problems, particularly around instantiation. My code became littered with PRINTF (I mean COUT, of course) which would always have to be deleted en masse before turning in a project.

Asserts were a nice compromise, and your instructor smiled upon you for using them, but even though you knew they dropped out in the final project thanks to the magic of macros, they still littered your code. But really, they just weren't as powerful as PRINTFs (I mean COUTs) because they weren't meant to tell you how your code goes.

Along comes Java. I don't know who wrote Log4j, but I don't think they should get credit. Everyone and his mother wrote a logger, before they'd ever seen anyone else do it. Maybe it wasn't as efficient, but it became (practically) a no-op in production with a simple change of a couple lines. You could always delete log calls in loops and frequently created classes once they were debugged.
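To make the "no-op in production" bit concrete, here's a minimal sketch using java.util.logging from the standard library rather than Log4j. The class name, logger names, and messages are all made up for illustration; the only point is that the same three log calls produce different output depending on one level setting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical demo class; logger names and messages are invented.
class LogLevelDemo {

    // Collects messages in memory so we can see what actually got through.
    static class ListHandler extends Handler {
        final List<String> messages = new ArrayList<>();

        @Override public void publish(LogRecord record) {
            if (isLoggable(record)) {
                messages.add(record.getMessage());
            }
        }
        @Override public void flush() {}
        @Override public void close() {}
    }

    // Runs the same three log calls with the logger set to the given level.
    static List<String> run(Level level) {
        Logger log = Logger.getLogger("demo." + level.getName());
        log.setUseParentHandlers(false); // keep the console quiet
        ListHandler handler = new ListHandler();
        log.addHandler(handler);
        log.setLevel(level);             // the "couple of lines" you change

        log.fine("entering loop");       // roughly a LOG.DEBUG
        log.info("processed 3 items");
        log.warning("disk nearly full");

        log.removeHandler(handler);
        return handler.messages;
    }

    public static void main(String[] args) {
        System.out.println("development: " + run(Level.FINE));
        System.out.println("production:  " + run(Level.WARNING));
    }
}
```

At FINE all three messages come through; at WARNING only the last one does, and the other two calls cost little more than a level check.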

I imagine the idea of loggers came from using scripting languages like Perl on Unix, because it was so dead simple to redirect output to a file (or even a socket). Even shell script commands could do:

2> errors

Combine that with a miracle like grep, and you had something really powerful. A lot more powerful than the Visual C++ debugger or the obtuse gdb (and Java didn't have real debugging for a long time).
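The grep half of that is small enough to sketch in Java, for those times when the real thing isn't around. This is a hypothetical helper, not a real tool, and the log lines are invented: it just keeps the lines in which a regular expression matches somewhere.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: a bare-bones imitation of grep over a list of lines.
class MiniGrep {

    // Returns the lines in which the regex matches anywhere.
    static List<String> grep(String regex, List<String> lines) {
        Pattern pattern = Pattern.compile(regex);
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            if (pattern.matcher(line).find()) {
                hits.add(line);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up log lines for illustration.
        List<String> log = Arrays.asList(
            "10:01:02 DEBUG opening session",
            "10:01:03 ERROR connection refused",
            "10:01:04 INFO  retrying",
            "10:01:05 ERROR giving up");

        for (String hit : grep("ERROR|WARN", log)) {
            System.out.println(hit);
        }
    }
}
```

It's nowhere near as handy as a pipe, but it shows how little machinery the idea needs: lines in, pattern, lines out.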

I spend a lot of time reading log files on my current job. Hundreds of lines of log output a minute, and a clean test environment on Windows Server 2003 (which means no cygwin, or even an scp). Thank goodness for USB flash drives, which allow me to copy megabytes of logs at a time to my laptop for processing.

We won't discuss the advantages of placing all log events for several applications (always at debug) into one file, because I can't think of them. Thank goodness I don't have to turn on debugging for Hibernate or HiveMind very often.

Anyway, so I've progressed from parser/compiler errors, to ad hoc debugging on standard output (and stderr), to return codes and $!, to asserts, to interactive IDE debuggers, to loggers, and I'd say each has given me a greater level of debugging power than the one before.

I'd say all these tools are part of the development phase, as opposed to the test phase. But what about unit testing? I'd say unit tests aren't really tests, at least not in the sense that testers think of tests.

Unit tests get a bad rap sometimes, but that's because if you think of them as test cases, they're pretty weak. You can spend all week mocking up your framework to make sure string concatenation works in your language of choice.

But if you think of unit tests as part of your build process, an added step above and beyond your compiler, a tool to use in conjunction with your logger -- I'm happier now that most of my PRINTFs (I mean LOG.DEBUG statements) are in my test classes -- then it starts to make sense.

Unit tests are a poor substitute for tests. Sometimes, I'd argue, worse than no tests, because of the time it takes to write them. Their value comes in design, if you do XP, not in testing. But I say: limit your unit tests to where they will be productive. Think of a suite of unit tests as a second compiler.
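Here's roughly what I mean by "second compiler", as a sketch in plain Java with no test framework. Everything in it (the class, the clamp method, the check helper) is hypothetical. The compiler is perfectly happy with clamp whether or not the logic is right; the checks are the extra pass that isn't.

```java
// Hypothetical example: names and cases are made up for illustration.
class SecondCompiler {

    // Code under test. It type-checks fine even if min and max get swapped.
    static int clamp(int value, int low, int high) {
        return Math.max(low, Math.min(high, value));
    }

    // A tiny stand-in for a JUnit assertion, so the sketch needs no libraries.
    static void check(boolean condition, String what) {
        if (!condition) {
            throw new AssertionError("failed: " + what);
        }
    }

    // Meant to run as a build step right after compilation.
    // Any failure throws, which should stop the build.
    static int runChecks() {
        int passed = 0;
        check(clamp(5, 0, 10) == 5, "value inside range is unchanged");
        passed++;
        check(clamp(-3, 0, 10) == 0, "value below range snaps to low");
        passed++;
        check(clamp(42, 0, 10) == 10, "value above range snaps to high");
        passed++;
        return passed;
    }

    public static void main(String[] args) {
        System.out.println(runChecks() + " checks passed");
    }
}
```

Three checks on the one spot where I'm likely to get it wrong, and nothing spent on mocking up a framework to prove string concatenation works.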

Naturally, if you're using a dynamically typed language, you'll probably want more unit tests. This makes sense. Was it Paul Graham saying that a dynamically typed language was as good as or better than a statically typed language? Or compiled, or whatever.