orca now open source

One of the most amazing pieces of database technology was just released into the wild—yes, I’m talking about Orca. And yes, I might be a bit biased having been involved in the project in the past.

Slide1Orca is the summary of query optimization research from the past 3 decades, implemented in a way that doesn’t cut many corners. There is still plenty of headroom to add features and functionality, but as you’ll realize very quickly when surveying the code, the architecture lets you slot new development work rapidly and without affecting the existing functionality. That is really the strength of this framework.

For a detailed discussion of the framework, check out the long string of publications ranging from SIGMOD 2009 through VLDB 2015—from the original idea of parallel optimization all the way to rather elegant implementation details around CTE’s and Partitioning.

I want to take a moment and express my gratitude and recognize the people behind the project: Joe Hellerstein, Tim Kordas, Kurt Harriman, Brian Hagenbuch, John Eshleman, and Chuck McDewitt led some of the early rounds of discussions around a new optimizer in Greenplum Database. Siva Narayanan, Subi Arumugam, Chad Whipkey, Kostas Krikellas, Lyublena Antova, Mohamed Soliman, Rhonda Baldwin, Venky Raghavan, Amr El-Helw, Zhongxian Gu, Entong Shen, Joy Kent, George Caragia, Foyzur Rahman, Carlos Garcia and Michalis Petropoulos did the heavy lifting of creating a code base from scratch and building the best query optimizer out there—in the process they valiantly put up with the static code analysis, code coverage, and unit test madness their manager imposed on them. Ravi Shankar, Suchitra Ramani, Deepa Prabhu, and Leo Tung devised and implemented sophisticated test frameworks. Ronaldo Ama provided air cover and never lost faith in the project. Don Haderle served as an external advisor. Scott Yara and Bill Cook supported the enterprise all the way from the early days on—I shall never forget when Bill asked me about the latency of the memory allocator; you wish your CEO was that detail-oriented. Many a team member in Greenplum, EMC, and Pivotal contributed to discussions that made this project a success; my apologies to anybody I may have missed. A heartfelt “Thank You” to all of you.

Lastly, a nod to the skeptics who over all these years constantly questioned the feasibility of the endeavor and declared us crazy just for trying: you challenged us to try harder, think bigger, be more thorough, and, in the end, build to a better product!

And now on to the code

Advertisements

About Mike Waas

I recently founded a data virtualization startup that will change the way we use databases. Stay tuned or if you're interested in joining the revolution, contact us at info@datometry.com.
This entry was posted in Uncategorized. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s