Searching a phone book and copying names

An index

The importance of training

In the companies where I worked as a DBA, I generally proposed – and obtained – to hold some query optimisation training for developers. I found it extremely useful. The idea of developers writing inefficient queries and an undersized DBA team trying to fix the worst ones cannot work – or at least, it cannot scale as the company grows. Developers need to understand what a database can do quickly, and what is going to be dead slow.

But then, there is a problem: most high level developers hate databases. Here I’m not going to judge anyone, but let’s admit it. They hope to ask for any data in any way and obtain it in a fraction of second, without wondering how it will happen. “It’s a kind of magic”, as Freddy Mercury sang.

Again, I’m not judging anyone. This is perfectly human, and it’s exactly the same type of expectations I have when I turn my washer on, or book a flight, or press the light switch. And you know what? Usually, it just works.

But a DBMS is a very complex piece of software. All applications are about collecting, validating, processing and/or presenting data; and the kernel of all this data work is usually a DBMS. It has a lot of tasks to do to guarantee consistency and resist crashes, and to satisfy virtually any type of search. When an application is slow, chances are… the DBMS is the bottleneck.

This means that developers cannot simply ignore database problems. If they do, at some point the application will be unusable. They need to understand the basics of how a query is executed, and think about indexes accordingly.

How to explain something to someone who doesn’t like the topic?

Easy: use analogies with their real life experience.

Databases and analogies

Does the last sentence puzzle you? It shouldn’t. It is very easy to find analogies between real life and databases. At least, in my training I do that all the times. Some analogies I use are:

  • Your intuition is a pretty good SQL optimiser.
  • An index is a phone book.
    • It is actually a two columns index, ideal for many examples.
  • Searching for John Doe in the most natural way is an approximation of a binary search (navigating a B+Tree).
  • Copying a portion of the phone book to a paper to order entries by address is an internal temporary table.
  • A book index is a secondary index in a table organised by primary key, the page numbers printed in a page corner are the primary key, text is the data.

I’ve received very positive feedback with this method. Not only because it is actually good for explaining otherwise complex concepts; but also because it is a good reference to use later, in real life, to find out if a query can use an index.:

  • Can I find all Bakers whose name is greater than Tom? (yes)
  • Can I find all Toms whose surname is smaller than Baker? (no)

Analogies are the key for me, but don’t think you understood everything from this brief explanation. If you want to learn query optimisation, take a look at my MySQL Query Optimisation training. It’s more interesting and more funny than you probably think. During that training, attendees laugh, joke and ask passionate questions.

Because, you know, it’s very hard to remember something that your mind doesn’t associate to an emotion. Be human when you teach, and you’ll do a very good service.

EDIT: I used this analogy for the talk MySQL Query Optimisation 101, which I gave at the second London Open Source Databases meetup. The talk was introductory and doesn’t cover the EXPLAIN command, JOINs, etc.

Toodle pip,
Federico

Leave a Reply

Your email address will not be published. Required fields are marked *