Compression | Bigger Isn’t Better for a Database
August 23, 2013
Bill Cole – Competitive Sales Specialist, Information Management, IBM
I’m from farm country. Much of my family is in agriculture (sounds better than riding around on tractors). There’s a saying that no farmer wants to own all the land, just the land next to his. There’s some truth in that. My cousins feel that way. I’ve ridden with them while they point out their land and what they’d do with the land just over the fence if they could buy it. That next few acres.
Databases are a lot like that. They don’t want to use all the disk on your system, just the space that’s on the SAN. And that gets expensive, especially since the volume of data is expanding at exponential rates. That database you were sure wouldn’t grow past 10GB is now pushing past 25TB with no indication that it will stop there. And you’ve got a half dozen others with the same characteristic growth path. And some of them are actually big databases!!
I brought a shirt back from a show in San Francisco for a DBA friend that said “Size Matters.” It sure does! Growing a database for the sake of growing it doesn’t make all that much sense except for the ego boost. Performance tends to degrade as size increases. Data is spread all over the physical disks. Indexes grow even faster and get spread over more and more disks. Not to mention the additional time to read, write and manage all that extra data.
DB2 10.5 with BLU Acceleration provides three kinds of compression. We can go into the specifics – and others have – but it’s the benefits we’re after. Why does a Production DBA care about compression? Here are a few reasons. If you have others or want to discuss them, please let me know!
- Space. The days when disk was cheap are long since gone. Nobody can even spell JBOD any more. Storage is expensive. We don’t buy a few disks and call it enough. We buy TBs and PBs. Whole farms of space. And none of it is cheap! We’re reminded that it’s our job to make good use of it, even if we don’t have any control over what the users are adding to the database. Not to mention that dealing with a large array of datafiles is just silly. Uh, make that difficult and time-consuming, serving no real productive purpose.
- Performance. Let’s face it, compression means every read brings in more data. Fewer reads means less waiting on the disk system. No matter how fast it is, the disk is still the slowest part of the environment. Another really cool thing about compression in DB2 10.5 is that the data stays compressed all the way through query execution. No cycles wasted on unnecessary decompression. A nice little bonus.
- Memory. If we’re bringing in compressed data and not decompressing it, we’re making more intelligent use of memory, right? Less memory for any particular process means more memory available for everyone. Workloads execute faster. No need to buy more memory or processors. You’re a hero. Take the day off. LOL
- Fewer objects. Let’s consider getting rid of a few indexes. Maybe a few MQTs. Hey, we built them for performance, right? If we’re getting better performance simply by using compression, why not test dropping seldom-used indexes or MQTs? They’re taking up space, processor cycles and your time and might not even be necessary. Occam’s Razor again. Sort of. The simplest database design might well be the best.
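To make the “no decompression” point above concrete, here’s a minimal sketch of dictionary (value) encoding, one common column-compression technique. This is an illustration of the general idea only, not DB2’s actual implementation: the predicate is translated into the encoded domain once, and the scan then compares small integer codes without ever expanding the column.

```python
# Hypothetical sketch of dictionary encoding -- NOT DB2's internals.
def dictionary_encode(values):
    # Map each distinct value to a small integer code.
    dictionary = {v: i for i, v in enumerate(sorted(set(values)))}
    codes = [dictionary[v] for v in values]
    return dictionary, codes

def count_equal(dictionary, codes, target):
    # Evaluate the predicate on the encoded form: look up the
    # target's code once, then compare small integers -- the
    # column is never decompressed.
    code = dictionary.get(target)
    if code is None:
        return 0
    return sum(1 for c in codes if c == code)

cities = ["Raleigh", "Austin", "Raleigh", "Boston", "Raleigh"]
d, codes = dictionary_encode(cities)
print(count_equal(d, codes, "Raleigh"))  # 3
```

The win is twofold: the encoded column is far smaller than the original strings, and the equality test becomes a cheap integer comparison.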
So we’ve saved lots of resources. Nice. But what about the actual compression & decompression? Doesn’t that take some horsepower? Aren’t we using lots of cycles there? Glad you asked. For years, there have been these seldom-used instructions in the chipset. These instructions process multiple streams of data at one time. In one instruction cycle. That is really, really cool. That’s sort of like changing all four tires at one time (NASCAR country talking) without jacking up the car. The official name of this group of instructions is SIMD (Single Instruction, Multiple Data) and they operate at register speeds. Even faster than main memory. I like it!
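As a rough illustration of the single-instruction-multiple-data idea (in plain Python rather than actual vector intrinsics), here’s a “SIMD within a register” sketch: four 8-bit codes packed into one 32-bit word and tested against a target with a handful of register-wide operations instead of four scalar comparisons. The lane count and packing layout are just for this example.

```python
# "SIMD within a register" sketch -- a conceptual analogy, not how
# DB2 or any particular CPU actually implements vector scans.
def pack(codes):
    # Pack four 8-bit values into one 32-bit word, lane 0 lowest.
    word = 0
    for i, c in enumerate(codes):
        word |= (c & 0xFF) << (8 * i)
    return word

def match_lanes(word, target):
    # XOR against the replicated target makes matching lanes zero;
    # the classic "has a zero byte" trick then flags those lanes in
    # a few whole-word operations. Target must fit in 8 bits.
    rep = target * 0x01010101           # replicate target into all 4 lanes
    x = (word ^ rep) & 0xFFFFFFFF       # matching lanes become 0x00
    flags = (x - 0x01010101) & ~x & 0x80808080
    return [(flags >> (8 * i + 7)) & 1 for i in range(4)]

print(match_lanes(pack([3, 1, 3, 0]), 3))  # [1, 0, 1, 0]
```

Real SIMD units do the same thing with much wider registers and dedicated compare instructions, which is why evaluating predicates over packed, encoded values is so fast.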
Finally, remember when we’d brag about the size of our databases? It was a testament to our DBA-ness. Of course, this was when a “big” database might be something less than a GB. Heck, they’d fit nicely on a DVD. We finagled and finessed to use our space wisely and pleaded with the sys admin for a few more MB. We burnt our hours figuring out secret ways to manage space. Now we have entire SANs at our disposal and we’re still using the whole thing! Careful what you wish for, eh? Compression is our friend and helps move us from the technoid arena to business enablers. A great place to be, for sure. Part of the mainstream, not overhead.
Learn more about the new version of DB2 for LUW
Follow Bill Cole on Twitter : @billcole_ibm
Download DB2 10.5 to experience all these capabilities