HBaseCon: Moving Beyond the Core to Address Availability & Usability

May 19 2014, 12:58 pm

Jonathan Gray, CEO & Co-founder of Continuuity, is an entrepreneur and software engineer with a background in open source and data. Prior to Continuuity, he was at Facebook working on projects like Facebook Messages. At the startup Streamy, Jonathan was an early adopter of Hadoop and an HBase committer.

We just wrapped HBaseCon 2014, the annual event for Apache HBase™ contributors, developers, and users. As in years past, this is one of the most technical conferences that we attend, and it’s really focused on the core community of developers who are doing something meaningful with the enabling technology. What makes HBaseCon so compelling is that it’s not theoretical but rather all about overcoming real technical challenges and actual business use cases. And this year, we noticed a couple of key trends that are shaping the future of HBase.

Overall, we noticed that the HBase discussion has moved up a level, and this is a good thing. The core architecture of HBase is pretty much set at this point, so people are no longer talking about how to do the architecture better; they are talking about what to build on top of it. Last year’s conference was very focused on improvements to the core platform, such as detecting server failures more quickly and recovering from them, and on describing new use cases launching on HBase. In the year since, HBase has further stabilized into a mature platform, and those new use cases are now established production systems. Now the conversation is about building above and around HBase for higher availability and usability.

There was a lot of good discussion about increasing HBase availability. In the Facebook keynote on HydraBase, the speakers discussed using a consensus protocol for HBase reads and writes in order to tolerate individual server failures without sacrificing availability or strong consistency. Similarly, Hortonworks and others shared work they’ve been doing on timeline-consistent read replicas. For example, if a single server goes down, you can still read data that is consistent up to a given point in time, that is, the most recent snapshot of the data that the replica has. Google’s Bigtable team also touched on availability by describing their approach to the long tail of latency.
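To make the read-replica idea concrete, here is a toy Python model of a replica that replays a primary’s edits in order and can still answer reads after the primary fails. It is only a sketch of the general idea under our own assumptions: the `Primary` and `Replica` classes and their methods are hypothetical names for illustration, not HBase’s API or the implementation presented at the conference.

```python
from dataclasses import dataclass, field


@dataclass
class Primary:
    """Toy primary: accepts writes and keeps them in an ordered log."""
    log: list = field(default_factory=list)  # ordered (key, value) edits
    alive: bool = True

    def put(self, key, value):
        if not self.alive:
            raise RuntimeError("primary is down")
        self.log.append((key, value))


@dataclass
class Replica:
    """Toy read replica: replays the primary's log in order, possibly lagging."""
    data: dict = field(default_factory=dict)
    applied: int = 0  # number of log entries replayed so far

    def catch_up(self, primary, upto):
        # Apply edits strictly in log order, never skipping ahead.
        for key, value in primary.log[self.applied:upto]:
            self.data[key] = value
        self.applied = max(self.applied, min(upto, len(primary.log)))

    def get(self, key):
        # Return the value together with the point in the timeline
        # (number of edits applied) that the value reflects.
        return self.data.get(key), self.applied


# Hypothetical scenario: two writes, replica has replayed only the first.
primary = Primary()
replica = Replica()
primary.put("row1", "v1")
primary.put("row1", "v2")
replica.catch_up(primary, upto=1)

primary.alive = False  # the primary's server fails

# The read still succeeds, returning data as of edit 1 (older, but consistent).
value, as_of = replica.get("row1")
print(value, as_of)
```

The design point is that the replica never applies edits out of order, so a read may be behind the primary but always reflects a real earlier state of the data.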

Multiple approaches to availability are in progress, but they ultimately share the same goals: reducing the big latency outliers and getting to five nines (i.e., 99.999%) reliability. In addition to early adopters like Facebook, Cloudera, and Hortonworks, we’re also encouraged to see a lot of other real users step up and take an active role in the community. We’ve seen this particularly in contributions from Salesforce, Xiaomi, and Bloomberg.
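For a sense of how demanding that target is, the short Python calculation below converts a number of nines into a downtime budget. It assumes the 99.999% figure is read as availability (the fraction of time the service is up) and uses a 365-day year; the function name is ours, for illustration only.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumes a 365-day year


def downtime_budget_seconds(nines: int, period_seconds: float = SECONDS_PER_YEAR) -> float:
    """Allowed downtime in a period for an availability of `nines` nines.

    For example, nines=5 means 99.999% availability, which leaves
    a fraction of 10**-5 of the period as the downtime budget.
    """
    return period_seconds * 10 ** (-nines)


for n in (3, 4, 5):
    minutes = downtime_budget_seconds(n) / 60
    print(f"{n} nines: about {minutes:.2f} minutes of downtime per year")
```

At five nines the budget works out to roughly five and a quarter minutes per year, which is why a single slow failover can consume most of it.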

All of these companies are using HBase at very large scale, contributing to its development to continue to move it forward, and then sharing their successes with others. For us at Continuuity, HBase usability is what we’re driving at, and we’ll remain very focused on improving usability so that more developers can build their own HBase and Hadoop applications. This is where HBase is going, and we’re excited to be a part of this community and contribute to its success.
