Wednesday, June 14, 2006

Debuggers, Loggers, and Unit Tests

Several months ago I thought about writing a post here about the very same subject.

But several months of busy-ness, and I didn't get around to it, and now here I am, thinking the exact same thoughts, thinking about blogging about them, and feeling a need to exorcise them.

Back when I first started programming (BASIC, on a TRS-80), getting output at all was cool enough. PRINT was a powerful command. If not as powerful as GOTO, considerably less harmful.

10 PRINT "HELLO"
20 GOTO 10

I'm convinced this exact program was what Dijkstra was thinking of when he wrote his ill-conceived essay. I, too, thought GOSUB was much more leet in those days.

10 GOSUB 30
20 END
30 PRINT "HELLO"
40 RETURN


Ah... memories. Somewhere in my parents' garage I have a TRS-80 (32K!) that I picked up from a customer at kfalls.net, the ISP we used to run. We only had 16K when I was a kid.
If only I could find copies of Pyramid 2000, Calixto Island, El Diablero, Madness and the Minotaur, Packet Man, Temple of ROM...

Anyway. As I was about to say, I've gone through an evolution in coding, that I'm going to guess is fairly common industry wide.

When I first started writing stuff (in BASIC) it was run, and then look for the line number in the SYNTAX ERROR that inevitably followed. That was debugging. Getting into actual logic bugs was a nightmare, especially once you started using GOSUB, and had nothing but LIST to debug with. Unix gurus of the day were probably reveling in their line editors, like 'ed', but I thought Turbo C was stupid to not use line numbers when I eventually started playing with it in high school. And DOS was a huge step down from my interactive MS BASIC ROM shell.

But eventually I learned C, and before long, my problems were no longer just syntax errors and the occasional logic error. I had pointer problems! Was I ever glad to learn about the integrated debugger with Visual C. Many long nights stepping through code, watching pointers and counting refs, only to come to a breakpoint, invariably placed right before a bunch of PRINTF statements.

By the time I got into Object Oriented Programming with C++, I'd a firm grip on pointers that I don't claim to maintain today, but I was again running into lots of logic problems, particularly around instantiation. My code became littered with PRINTF (I mean COUT, of course) which would always have to be deleted en masse before turning in a project.

Asserts were a nice compromise, and your instructor smiled upon you for using them, but even though you knew they dropped out in the final project thanks to the magic of macros, they still littered your code. But really, they just weren't as powerful as PRINTFs (I mean COUTs) because they weren't meant to tell you how your code goes.

Along comes Java. I don't know who wrote Log4j, but I don't think they should get credit. Everyone and his mother wrote a logger, before they'd ever seen anyone else do it. Maybe it wasn't as efficient, but it became (practically) a no-op in production with a simple change of a couple lines. You could always delete log calls in loops and frequently created classes once they were debugged.
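For what it's worth, here is a minimal sketch of that "change a couple lines" switch, written in Python with the standard logging module rather than Log4j. The logger name "orders", the total() function, and the configure() helper are all made up for illustration.

```python
import logging

log = logging.getLogger("orders")  # hypothetical logger name


def total(prices):
    """Sum a list of prices, logging each step at DEBUG level."""
    running = 0
    for price in prices:
        running += price
        # %-style arguments are only formatted if DEBUG is actually enabled.
        log.debug("added %s, running total %s", price, running)
    return running


def configure(production):
    """The one switch: DEBUG while developing, WARNING in production."""
    log.setLevel(logging.WARNING if production else logging.DEBUG)


if __name__ == "__main__":
    logging.basicConfig()  # send log records to stderr
    configure(production=False)
    print(total([1, 2, 3]))
```

Flip configure(production=True) and the debug calls are still in the code, but they cost little more than a level check.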

I imagine the idea of loggers came from using scripting languages like Perl on Unix, because it was so dead simple to redirect output to a file (or even a socket). Even shell script commands could do:

2> errors

Combine that with a miracle like grep, and you had something really powerful. A lot more powerful than the Visual C++ debugger or the obtuse gdb (and Java didn't have real debugging for a long time).

I spend a lot of time reading log files on my current job. Hundreds of lines of log output a minute, and a clean test environment on Windows Server 2003 (which means no cygwin, or even an scp). Thank goodness for USB flash drives, which allow me to copy megabytes of logs at a time to my laptop for processing.

We won't discuss the advantages of placing all log events for several applications (always at debug) into one file, because I can't think of them. Thank goodness I don't have to turn on debugging for Hibernate or Hive Mind very often.

Anyway, so I've progressed from parser/compiler errors, to ad hoc debugging on standard output (and stderr), return codes and $!, and asserts, to interactive IDE debuggers, to loggers, and I'd say each has given me a greater level of debugging power than the one before.

I'd say all these tools are part of the development phase, as opposed to the test phase. But what about unit testing? I'd say they aren't really tests, in the sense that testers think of them as such.

Unit tests get a bad rap sometimes, but that's because if you think of them as test cases, they're pretty weak. You can spend all week mocking up your framework to make sure string concatenation works in your language of choice.

But if you think of unit tests as part of your build process, an added step above and beyond your compiler, a tool to use in conjunction with your logger -- I'm happier now that most of my PRINTFs (I mean LOG.DEBUG statements) are in my test classes -- then it starts to make sense.

Unit tests are a poor substitute for tests. Sometimes I'd argue, worse than no tests, because of the time it takes to write them. Their value comes in design, if you do XP, not in testing. But I say: limit your unit tests to where they will be productive. Think of a suite of unit tests as a second compiler.
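A rough sketch of what I mean by a second compiler, in Python with the standard unittest module. The parse_port() function and the test names are hypothetical; the point is only that the debug output lives in the test class and the suite runs as a pass/fail build step.

```python
import logging
import unittest

log = logging.getLogger("tests")  # hypothetical logger name


def parse_port(text):
    """Hypothetical function under test: parse a TCP port number."""
    port = int(text)
    if not 0 < port < 65536:
        raise ValueError("port out of range: %d" % port)
    return port


class ParsePortTest(unittest.TestCase):
    def test_valid_port(self):
        port = parse_port("8080")
        # The PRINTF lives here, in the test, not in parse_port itself.
        log.debug("parsed port %d", port)
        self.assertEqual(port, 8080)

    def test_rejects_out_of_range(self):
        with self.assertRaises(ValueError):
            parse_port("70000")


def run_checks():
    """Run the suite like a build step; True means the checks passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ParsePortTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

A build script could call run_checks() right after compiling and refuse to go further if it returns False.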

Naturally, if you're using a dynamically typed language, you'll probably want more unit tests. This makes sense. Was it Paul Graham saying that a dynamically typed language was as good or better than a statically typed language? Or compiled, or whatever.

Saturday, February 25, 2006

Open Office developer contest

Just saw this:

http://wiki.services.openoffice.org/wiki/OpenOffice.org_Developer_Article_Contest

I've been thinking of switching over to the Dark Side myself. Not Perl, Microsoft.

I often use Microsoft Office at work, and my previous post touches on something I've wanted to do frequently, namely, get data in a more manageable format. Who doesn't? And not just PDF Bug reports.

What could be more fun than taking a requirements document (probably output from Rational Requisite Pro in the distant past) and turning it into a table in a database that can be cross-referenced with my (possibly Microsoft Excel-based) test conditions?

I could think of a couple things worse (Mercury Test Director, I'm looking at you.)

Okay, maybe my life isn't exciting enough. After 6 months travelling the globe (to Fiji & Ecuador) I get back to work and in only a month, my fondest desire is to convert Word documents and Excel spreadsheets into usable data.

If I can solve that problem, maybe it'll get me back to the southern hemisphere permanently.

A satellite connection and a rugged laptop and a sailing catamaran and a telecommute contract are my wildest dream, but usable information is a close second... or maybe distant third.

So I've been toying with the idea of learning C#. Or VB.NET. I've been looking at what makes SAMIE and PAMIE tick, and I recently discovered JACOB and Jiffie, and have been meaning to look at Watir.

Note: Is there a similar PHP framework using COM? How stable is PHP with COM?

But what about Open Office? Well, nobody uses it at work. At least not yet. I'm guessing the developer contest is hoping to help change that.

I'd like to too.

And $750 is half a month's payment on a yacht.

Getting the bugs out (of PDF)

On a contract I am working at, some joker thought it would be funny to give me a list of defects for the project as a PDF file. Actually, they didn't give it to me. It was zipped up in a deployment release bundle, in a zipped-up folder called "Limitations."

I'm actually lucky to have found it at all. So I suppose I should be happy to have it.

I bet their PDF export script (using XSL-FO, no doubt) is the pride and joy of the consulting firm who used it to output their bugs (note: there is a bug in the output for 'Assign To') and while PDFs have really nice anti-aliased fonts when printed out, I didn't have access to a printer.

Having to go through a large list of defects in PDF, even if they were printed out, wouldn't have been fun.

In all fairness, it was a nice layout, and would have been easy to read on paper, but I wanted something I could manipulate.

Thankfully, Acrobat Reader isn't what it used to be: you can search a document, and there's also a feature that was added I don't know when (recently?), "save as text."

I now had a plaintext document that I could search with a text editor, and even more importantly, read more than eleven and a half lines on a page.

But I still wanted more. So I thought I'd slurp them all up into a quick and dirty database to run some queries on and maybe even eventually output in a format that's somewhere in the happy medium between .pdf and .txt.

I thought I'd try using Python for this, since I wanted to brush up on it, and I figured it wouldn't take more than an hour.

I got going quick enough: reading a file, check; connecting to database, check;
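Roughly what that first pass looks like, as a sketch only. The defect layout in SAMPLE is an assumption (the real "save as text" output was nowhere near this tidy), and I'm using Python's built-in sqlite3 with an in-memory database as a stand-in for whatever database you have handy. The table and column names are made up.

```python
import re
import sqlite3

# Assumed layout for illustration; the real export was much messier.
SAMPLE = """\
Defect 101
Summary: Login page times out
Assign To: alice
Defect 102
Summary: Report totals are wrong
Assign To: bob
"""

FIELD = re.compile(r"^(Summary|Assign To):\s*(.*)$")


def parse_defects(text):
    """Turn the flat text into a list of dicts, one per defect."""
    defects = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Defect "):
            # A new defect header starts a new record.
            defects.append({"id": int(line.split()[1]), "Summary": "", "Assign To": ""})
        elif defects:
            match = FIELD.match(line)
            if match:
                defects[-1][match.group(1)] = match.group(2)
    return defects


def load(conn, defects):
    """Create a throwaway table and insert the parsed defects."""
    conn.execute("CREATE TABLE defect (id INTEGER PRIMARY KEY, summary TEXT, assignee TEXT)")
    conn.executemany(
        "INSERT INTO defect VALUES (?, ?, ?)",
        [(d["id"], d["Summary"], d["Assign To"]) for d in defects],
    )


conn = sqlite3.connect(":memory:")
load(conn, parse_defects(SAMPLE))
rows = conn.execute("SELECT id, assignee FROM defect ORDER BY id").fetchall()
print(rows)
```

With well-behaved input like this, one pass is enough. It's the special cases that push you toward something uglier.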

But I found quick enough that this was not a nicely structured document. After a bit of regex massaging in my favorite Windows text editor, Editplus, I still wasn't getting it.

A few futile searches and I was reading Python newsgroups with people confessing to dipping their toes in the "dark side" for text processing.

This isn't a Python-bashing post, but I put it away after using up the allotted time, and went on to other things.

***

But at 6:15 this morning I was thinking of a solution, and I didn't know if it was real or just a product of that dream state of mind where everything *seems* to make sense.

So I did what any sensible person would do. I ignored it and tried to go back to sleep. It's *Saturday*.

A few minutes later I switched on the stereo, Scorpions "Best of the Ballads, Hot and Slow", a vintage cassette tape, and was soon "In Trance." Before the end of "Yellow Raven", I had my company laptop on, lying in bed, and I was working in PHP.

At eight o'clock the battery died and I got up and put it away.

How did I get from Python to Perl to PHP?

Well, it started out as a simple enough text parsing task, but I found that because of several special conditions (which I should have just ignored) you really needed an ugly procedural mess with a lot of conditions to put it up right. But you needed an object model to store it all, because it wasn't something that could be done with one pass through the file.

PHP is good at switching from a glob of spaghetti to a set of functions, to using objects as glorified hashes, and now with PHP5, it's even got a decent, though limited object model.

But what I ended up doing was working on my pet project, an ORM for PHP, only to realize when all was said and done, that what I needed in this instance was an ActiveRecord.

Maybe I should have just used Ruby?

Thursday, February 23, 2006

The OTLOTT (One True List of Testing Terms)

I started making a list of Testing terms to help me have a definite vocabulary, and it quickly grew out of hand. I soon realized that what I needed were two lists:

One comprehensive list with all the definitions of all the terms I could find.
&
One definitive list of specific terms for my own use.


This second list would be my own, though I'd draw as much as possible from common usage. I want a specific terminology that can describe what the difference is between a test case and test condition, and what the relationship is between a suite of tests and a set of results.

I see several benefits of this.

Foremost, the act of naming things concentrates your mind on the subject. It allows you to think of something abstractly, and a concrete definition gives you the power to compare it with other abstract things.

This is what I see as the real power of Object Oriented Programming and especially of Design Patterns. The whole idea is to give a name to common patterns to facilitate communication, but even more so, memory and cognition.

You might have implemented a singleton dozens of times, but until you had a name for it, you had to think about how to solve the particular problem. Once you had a name for it (though you might not have called it a singleton before hearing about GoF), it became obvious that what you needed was a singleton. A little less obvious cognitive test, but one giving even greater power is the ability to decide when you did *not* need a singleton. In fact, you needed a NotSingleton, and eventually this got a name too, perhaps it was a Factory.

And so on until you had a list of patterns with names that helped you remember programming techniques, and definitions that helped you to know when to use (or not use) them, and when to refine your dictionary by adding more terms for things that are definitely *not* whatever.

Even better, if others share your list of names and definitions, you are better able to communicate with each other, and especially, be able to further refine and grow your dictionary.

Which brings me to the first list, the comprehensive one.

Every tester dreams of having the One True List of Test Terms, but it just isn't going to happen. But just the act of making the list is, as I said, a cognitive act with its own benefits, even if it doesn't result in a general consensus.

It wasn't long before I realized a wiki would be perfect for this, and so I launched one up and started adding terms (as soon as I recovered my password.) As the list grew, I realized that of course, I don't know all the terms, and I know multiple definitions for many, and there are even many terms that I do know that I probably won't think to add to the list.

So I decided it would sure be nice to have some help. And hey! isn't that what wikis are for? So I started one here:

http://wiki.klamathsystems.com/doku.php?id=test-_terms

I'm sure the url will change eventually, but it's a start.*

If you see something that contradicts your own definition, I'd rather see two definitions. We can work on refining the list later and hopefully come to a consensus. Maybe a ranking system to determine preferred usage. And maybe individuals could mark preferred definitions so that they could have their own OTLOTT. I don't know of any existing wiki that has these features, but that doesn't worry me. If the value is there, I'm sure it could be added.


* (On a side note, does anyone know how to get nice URLs in dokuwiki? I'll probably crunch a mod_rewrite script myself if nothing else. I'd also like to have a hierarchy, though wikis aren't supposed to be hierarchical. Something like: http://example.com/wiki/test_resources/terms/foo... I'm not opposed to question marks, but I want all the info to be in the URL. And it seems a shame to implement a local search engine when you could google a site.) **

** Footnotes on a webpage, isn't that what hyperlinks are for? I want a blog that can do hyperlink footnotes.

First post after the disaster

So I set up a blog here to take notes on the open source software I use, as well as the testing methodologies, hoping eventually to combine the two into something that might be usable, hopefully to make my job easier, but something happened (probably my fault) and to make a long story short (not to mention making the blog shorter), I deleted the whole thing and started over.

So this is my first post after the disaster. But it's not much of a disaster. Ernest Hemingway lost his early papers, and the world is probably better for it. "My Old Man" wasn't a masterpiece, no matter what they say. "Big Two-Hearted River" on the other hand, was.