how to build a query optimizer for big data

It’s time to talk about what we’ve been up to!


In a series of articles, I’ll describe the motivation and background as well as the engineering tools and practices we developed over the past couple of years to attack one of those once-in-a-lifetime projects that get engineers truly excited: building a query optimizer from scratch. All database vendors at some point have to redesign one or another large component of their system. When it comes to the query optimizer, all of them have refurbished, rewritten, or remodeled theirs over the past 15 years. The first really big splash in this category was made by Microsoft with its rewrite of the entire query processor for the 7.0 release of SQL Server in the years 1994-1998. This initiative was instrumental in taking the product from negligible revenue to a billion-dollar-a-year business in only two major releases. Others followed suit, but as far as I can tell none was similarly radical; most were more a matter of refurbishing existing structures. If you’ve been part of any such initiative at, say, Oracle, I’d really like to buy you coffee and get some insights into the software engineering aspects of your project: pitfalls, ambitions, team dynamics, etc.

Anyway, for a startup like Greenplum it’s a much dicier decision to rebuild an entire component and, suffice it to say, a lot of convincing was needed before upper management gave the green light to go ahead and hire a team of engineers, design a new optimizer, and start coding. Now that the product is shaping up and we’re on the home stretch, it’s time to review some of the lessons learned! What’s with the whale, you ask? You’ll see.

So, stay tuned for a series of posts chronicling an exciting journey!

Posted in Optimizer Technology | Leave a comment

in the rearview mirror: dbtest 2012

Organizing DBTest together with my partner in crime Eric Lo from the Polytechnic University in Hong Kong was a great experience. We have both long been passionate about developing test methodologies for database systems in all shapes and forms; hence, it seemed very fitting to volunteer to organize the database test workshop!

We managed to solicit a total of 26 submissions, which is an all-time high as far as I can tell. While impressive by itself, it meant we just didn’t have enough people on the PC to keep the review load as low as we had originally promised. Luckily, our PC members proved to be great sports and agreed to review pretty much double the number of papers we had originally anticipated. Thank you very much! As a result, we managed to put together a very strong program! After some deliberation, we decided to include a whopping 12 papers in the program and to shorten the presentation time rather than reject several truly outstanding papers.


impressions from SIGMOD’12

This year’s SIGMOD conference was a good excuse to visit Phoenix, Arizona. As it turned out, the choice of Scottsdale as a venue was a pretty good one: I prefer conferences and workshops to be held in places without (m)any tourist attractions or distractions within walking distance, as it keeps the crowd together; there’s simply nowhere people could walk off to. A punishing outside temperature of 100+°F provided an additional incentive to stay within the confines of the conference hotel.


dbtest’12: call for participation

In a few weeks, on May 21, 2012, this year’s DBTest workshop will take place, co-located with SIGMOD, in Scottsdale, Arizona. DBTest is a great forum for practitioners and academics alike to exchange ideas and experiences regarding quality aspects of data management systems.


The number of submissions was distinctly larger than in previous years, indicating a pent-up demand for solutions around reliability and testability of database systems. We ended up accepting 12 papers and had to turn down a number of strong and interesting submissions. Check out the program and book your tickets!

Looking forward to seeing you in Scottsdale!


smdb’12: reprint of EMC/Greenplum paper

Check out the latest publication from Greenplum Engineering at this year’s SMDB, ICDE’s workshop on Self-managing Database Systems. One of the authors was an intern last summer, and we asked him to investigate how data is redistributed during query processing in an MPP environment. In particular, are there patterns that can be discerned and used to lay out the data more systematically than the usual, somewhat hunch-based approach of looking for “frequent join attributes”? Sure enough, the answer is yes, and this paper describes how it can be done.

Find an electronic copy here.


got phd?

After a long day of interviews, I wrapped up with a candidate the other day. Among other things, I tend to ask our guests what they’ve learned about the team, the position, and the company. Turns out, this particular candidate kept close tabs on her interviewers and noticed that all of them had Ph.D. degrees in databases or a closely related field. “So, what’s up with that?” she asked. “Is this a requirement for the job?”

Well, fact is, about 30% of our core engine development team are Ph.D.’s – with even higher numbers in some teams. Having said this, it might sound a bit unbelievable but we actually do not care much about degrees. So, how come we have such a high density of Ph.D.’s?

Well, it’s easy to confuse cause and symptom. Remember, I’ve written in the past about what kind of people I’m looking to hire: smart people who get stuff done. And although I value experience in engineers, it ranks lower than raw smarts and attitude. I know that’s quite a contrast to most of our competitors: the more established a company becomes, the more it slows down and focuses on tried-and-true rather than on can-do.


prototypical development

Ever wondered what the Golden Gate Bridge would look like if software engineers had been tasked to build it?

Chances are, after the first brainstorming, one engineer starts building a prototype — maybe over the weekend. One lane, using limestone, his favorite building material (after all ‘limestone has been known to be the best building material for centuries’), nice little arches, maybe not exactly at the right position but good enough to get a first impression of what a bridge in this location could look like. Quickly, a few other excited engineers help out. They change the material mid-way to granite (after all “granite is known to be the best building material for bridges for centuries”) and add a bicycle lane. A massive tower is added as look-out for tourists sure to visit the bridge every year — a key requirement, everybody agrees.

At the next brainstorming, the prototype is unveiled and by unanimous vote the brainstorming meetings are henceforth turned into status meetings. Program Management is delighted to see early results and upper management praises the success of agile development and rapid prototyping: “It would have taken our competitors months even just to select the site!” More resources are poured into the project and the one-car-one-bicycle-lane project is quickly advanced. To make up for the unfortunate choice of location, a sharp turn is inserted halfway: “hey, it worked for the Bay Bridge”. Adding another 5 lanes for traffic is postponed to a future release, together with widening the arches for ship traffic. Several segments need rebuilding in a different material even before the middle of the strait is reached, as granite turns out to be a great material for building castles but not so much for bridges.


expanding without reloading — online expansion of large-scale data warehouses

DBAs the world over dread the day when their boss walks into their office and announces that it is time to expand the Enterprise Data Warehouse, the company’s crown jewel. While not a pleasant operation on single-node databases, it means major surgery in conventional MPP databases that deploy a large array of shared-nothing compute and storage nodes. The standard approach is to dump all the data, provision new capacity, and then reload all the data. What sounds rather simple is actually an impressive logistical feat, if all goes well. Weeks, if not months, in advance, elaborate project plans are developed that span a number of teams across the company: from the hardware department all the way to the business customers of the database; you need all hands on deck. The process itself requires up to several days of downtime for databases in the petabyte range; that is, if all goes well. In the event that one or more things go wrong, the crew will be scrambling to get back either to the original configuration or to a makeshift solution before the scheduled window of downtime expires and the business suffers from the outage.

The sheer difficulty of this operation makes many IT organizations put off an expansion as long as possible. This usually makes things even harder, as the system will be close to capacity when it finally needs to be expanded, and spare components or capacity will be harder to come by in the heat of the battle.

With all that in mind, we developed a mechanism that allows expanding a Greenplum Database instance (1) without downtime, (2) without significant performance impact during the operation, and (3) with full transactional consistency on top of it.


dbtest 2011

A few weeks ago, the latest edition of the DBTest workshop took place. As in the years before, the workshop was held in connection with SIGMOD, meant to draw a good crowd of practitioners. And draw it did: the gathering was well attended throughout the day by both academics and folks from industry! I would guess that DBTest has evolved into the workshop with the largest crowd from industry in all of SIGMOD.

This year’s keynote presentation, by Glenn Paulley of Sybase, was centered around the question of ‘how much more complexity can database systems deal with?’ Quite an interesting outlook and effectively a call to arms to simplify and restructure database architecture as a whole. Some interesting stats: Microsoft was one of the main contributors, as was the case in previous years, and quite a number of papers were authored by attendees of last year’s Dagstuhl workshop on robust query processing.

So, we’ve had a seriously successful workshop. Now, where to go from here? It seems the organizers of the next edition(s) will face a couple of interesting challenges:


sigmod’11: preview of EMC/DCD paper

We’ve seen quite a number of papers co-authored by folks from our R&D organization. The latest addition is by Mohamed Soliman at this year’s SIGMOD conference.

Find the paper here.
