orca now open source

One of the most amazing pieces of database technology was just released into the wild—yes, I’m talking about Orca. And yes, I might be a bit biased having been involved in the project in the past.

Slide1Orca is the summary of query optimization research from the past 3 decades, implemented in a way that doesn’t cut many corners. There is still plenty of headroom to add features and functionality, but as you’ll realize very quickly when surveying the code, the architecture lets you slot new development work rapidly and without affecting the existing functionality. That is really the strength of this framework.

For a detailed discussion of the framework, check out the long string of publications ranging from SIGMOD 2009 through VLDB 2015—from the original idea of parallel optimization all the way to rather elegant implementation details around CTE’s and Partitioning.

I want to take a moment and express my gratitude and recognize the people behind the project: Joe Hellerstein, Tim Kordas, Kurt Harriman, Brian Hagenbuch, John Eshleman, and Chuck McDewitt led some of the early rounds of discussions around a new optimizer in Greenplum Database. Siva Narayanan, Subi Arumugam, Chad Whipkey, Kostas Krikellas, Lyublena Antova, Mohamed Soliman, Rhonda Baldwin, Venky Raghavan, Amr El-Helw, Zhongxian Gu, Entong Shen, Joy Kent, George Caragia, Foyzur Rahman, Carlos Garcia and Michalis Petropoulos did the heavy lifting of creating a code base from scratch and building the best query optimizer out there—in the process they valiantly put up with the static code analysis, code coverage, and unit test madness their manager imposed on them. Ravi Shankar, Suchitra Ramani, Deepa Prabhu, and Leo Tung devised and implemented sophisticated test frameworks. Ronaldo Ama provided air cover and never lost faith in the project. Don Haderle served as an external advisor. Scott Yara and Bill Cook supported the enterprise all the way from the early days on—I shall never forget when Bill asked me about the latency of the memory allocator; you wish your CEO was that detail-oriented. Many a team member in Greenplum, EMC, and Pivotal contributed to discussions that made this project a success; my apologies to anybody I may have missed. A heartfelt “Thank You” to all of you.

Lastly, a nod to the skeptics who over all these years constantly questioned the feasibility of the endeavor and declared us crazy just for trying: you challenged us to try harder, think bigger, be more thorough, and, in the end, build to a better product!

And now on to the code

This entry was posted in Uncategorized. Bookmark the permalink.

Leave a comment