Oracle embraces SQLite; wraps it around BerkeleyDB as SQL API (oracle.com)
56 points by gojomo on March 31, 2010 | hide | past | favorite | 25 comments


This post just generated a tremendous amount of discussion in my company. Not just for the SQLite wrapper, but also about the Berkeley XML DB. Anyone have experience using Berkeley for medium-to-large scale data stores (1-100GB?)? How's the concurrency? What's the licensing model? Is there a way we can use this without having to give away our source?


Project Voldemort uses BerkeleyDB Java Edition as one of its storage engines (the most frequently used one). The Java Edition differs from the C version in being log-structured (very fast writes, at the cost of somewhat slower reads). In one deployment, we're able to store ~130 GB of data per node (each node having 32 GB of memory, 18 GB of it going to the JVM heap, 10 GB of the heap going to the BDB cache), while keeping average read latencies (which include the overhead of decoding a request packet) under 1 ms (with some outliers due to large value sizes, etc.).

I should add, this is also while using a single BDB environment for all partitions that belong to a node -- not the most efficient way to store this data (BDB JE 4.0 does support sharing a cache between multiple environments, so we might move to a separate-environment-per-partition model). We've also built the distribution layer ourselves rather than use the built-in HA features, but if I recall correctly, the first version of Amazon SimpleDB (prior to the Erlang rewrite) was built with BDB-JE HA (not to be confused with Amazon Dynamo, which is built on regular BDB-JE and provides its own distribution mechanisms).

Overall, we've found that BDB-JE performs best when there's a 1:5 ratio between memory and data set size. With JE (as opposed to classic BDB) the trade-offs are: Java implies a certain vertical scalability limit (due to GC pauses with very large heap sizes, though the GC does get better with every release), and a log-structured B+Tree is great for writes but may mean more seeks for reads (especially if you use the range query functionality).

Licensing might be an issue if you're shipping a commercial product to a customer's site. It doesn't matter if it's used on your own servers or in an open source project (and unlike the GPL, the BerkeleyDB license is compatible with Apache-licensed projects). The licensing also doesn't apply to the Perl/Ruby/Python/other dynamic language bindings for BerkeleyDB; you're free to package them (and then ask the customer to install BerkeleyDB themselves, much as you would with MySQL).

In short, it's a great solution if you fit its use case. I should also add that interacting with the Sleepycat people online (e.g., on the forums) also makes you forget they're a part of Oracle.


I don't have experience with the Berkeley XML DB. I have used the BerkeleyDB 'Java Edition' (a different codebase and on-disk format) for multi-GB spill-to-disk queues and maps/sets.

The BerkeleyDB license is roughly: free to include in open-source projects, pay if you want to redistribute as part of a closed-source product. So like with the GPL, using strictly inside your company might not be considered 'distribution'. But also, if you are selling software licenses their bundling rates might not be bad compared to your unit price. Of course, do your own license analysis before business use.


What would be the licensing model if the product is open source, but is NOT free (i.e., customers are charged for an open-source product in which BDB is used)?


You need to do your own analysis, not trust some random guy on the internet (e.g., me). But:

(1) If people need to pay to use it, it's not "open source" by the "open source definition" of the OSI: <http://www.opensource.org/docs/osd>

(2) Contacting Oracle/BDB will let you know what they think your obligations are.


Excellent. Now I can say, "I'm using Oracle" and still get some work done too.


that phrasing just sounds wrong to me.


Is it faster? More reliable? A way to trick people into paying for SQLite?


The claims in the announcement email I received were performance-related:

- SQL Performance

-- 10s of thousands of INSERTS/UPDATES/DELETES per-second

-- 100s of thousands of SELECTs per-second

-- Approximately as fast as SQLite for reads (within 10%)

-- About 3X faster than SQLite for updates/writes

-- Has fine grained locking which leads to better read/write concurrency

-- 6-8x more transactions/second compared to SQLite when using multiple threads


I'm not sure I get the point of this. Two of SQLite's strong points (among many others) are: (a) a short dependency list, and (b) platform-independent filesystem storage.

...doesn't this negate both of those for what seems like little gain?


Bad comparison. The point is that you interact with BerkeleyDB via an SQLite interface. It's just an interface layer in this case.


BerkeleyDB doesn't have many dependencies, uses only plain files as storage, and is itself highly-portable open source (with a quasi-copyleft condition).

So you don't give up much to get the claimed benefits of this combination -- only the ability to use public-domain SQLite in proprietary distributed software.


SQLite outperforms it by far. That could also weigh in a little :)


I've used Berkeley for hundreds of concurrent queries -- it is quite good in those situations; SQLite is not -- it's just not designed for them.

Combining the two -- the easy interface that SQLite provides with the concurrent performance that BDB has -- definitely provides value.


Do you have any benchmarks to support that claim?


I do. I wrote a tool to help me understand what I could do on various storage tiers running as safely as possible. Here's one of my results from linode:

http://skitch.com/dlsspy/nh2qb/kvtest-results-on-linode

Here's the code: http://github.com/dustin/kvtest

Depending on what you're doing and how safe you want it, you can adjust inputs to select a different winner.


This is an awesome shootout! Be careful with BDB: its default tuning is geared more towards an embedded environment than a modern web app. That's not to say it can't go fast!

I took a look at your BDB demo and saw that you weren't creating the DBs in an environment, which meant every operation was straight to disk, and also meant that you weren't running with write-ahead logs (which will give you durability). I didn't look too closely to see if the other databases had caching enabled or not (or what their defaults were).

On my macbook air, configuring with caching (and no logging) yielded this from your benchmark:

  Air:kvtest jamie$ ./bdb-test
  Running test ``test test'' PASS
  Running test ``write test''
  Ran 284669 operations in 5s (56933 ops/s) PASS

Of course, page caching makes all the difference :-)

BDB has some great sample code, I'd recommend taking a look at: examples_c/ex_env.c (http://www.fiveanddime.net/berkeley-db/db-4.3.28/examples_c/...)

I really wish BDB had more sensible defaults, I think it unfairly gets a black-eye in performance shootouts.
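
For what it's worth, those defaults can be overridden without recompiling by dropping a DB_CONFIG file into the environment's home directory, which DB_ENV->open() reads automatically. A minimal sketch -- the directives are real BDB ones, but the sizes are illustrative, not recommendations:

```
# DB_CONFIG -- read from the environment home directory at DB_ENV->open()
# 256MB page cache, in a single cache region
set_cachesize 0 268435456 1
# trade a little durability for write throughput
set_flags DB_TXN_WRITE_NOSYNC
```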


What does the 'auditable' qualifier (which seems to make the difference between SQLite beating BDB, or BDB beating SQLite) mean?


auditable means every revision of every change was saved in a fashion that allows you to go back in time and what-not.

Specifically, it means this: http://github.com/dustin/kvtest/blob/master/sqlite-base.cc#L...
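
To illustrate the idea (a sketch only -- the linked code has its own schema; this just shows the pattern using stock SQLite from Python, with made-up table and key names): every write appends a new revision row rather than overwriting, and the current state is a view over the latest revisions.

```python
import sqlite3

# Hypothetical schema illustrating 'auditable' writes: nothing is ever
# overwritten; each change appends a new revision row to the history table.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE kv_history (
    rev INTEGER PRIMARY KEY AUTOINCREMENT,
    k   TEXT NOT NULL,
    v   TEXT                 -- a NULL value would mark a delete
);
-- The current state is just the newest revision of each key.
CREATE VIEW kv AS
    SELECT k, v FROM kv_history
    WHERE rev IN (SELECT MAX(rev) FROM kv_history GROUP BY k)
      AND v IS NOT NULL;
""")

db.execute("INSERT INTO kv_history (k, v) VALUES ('a', '1')")
db.execute("INSERT INTO kv_history (k, v) VALUES ('a', '2')")  # supersedes rev 1
db.execute("INSERT INTO kv_history (k, v) VALUES ('b', 'x')")

print(db.execute("SELECT v FROM kv WHERE k = 'a'").fetchone()[0])   # 2
print(db.execute("SELECT COUNT(*) FROM kv_history").fetchone()[0])  # 3: full history kept
```

The cost is obvious: the store only ever grows, which is part of why the 'auditable' variants rank differently in the shootout.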


So would that mean that you could easily and transparently use this as a backend for something with a sqlite interface, say for example Django?


Yes, but more importantly, this should allow real concurrency on the database. You can now have multiple writers to the same underlying database.

With plain SQLite, I believe a write locks the entire database file for its duration.
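
That whole-database write lock is easy to see with stock SQLite (a sketch using Python's sqlite3 module; the file name and keys are made up):

```python
import os
import sqlite3
import tempfile

# A file-backed DB so two connections contend for the same lock.
path = os.path.join(tempfile.mkdtemp(), "demo.db")

# isolation_level=None => manual transaction control; timeout=0 => fail fast.
w1 = sqlite3.connect(path, timeout=0, isolation_level=None)
w2 = sqlite3.connect(path, timeout=0, isolation_level=None)
w1.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")

w1.execute("BEGIN IMMEDIATE")                    # first writer takes the write lock
w1.execute("INSERT INTO kv VALUES ('a', '1')")

try:
    w2.execute("BEGIN IMMEDIATE")                # second writer is refused...
    blocked = False
except sqlite3.OperationalError:                 # ..."database is locked"
    blocked = True

w1.execute("COMMIT")                             # lock released
w2.execute("INSERT INTO kv VALUES ('b', '2')")   # now the second writer succeeds
print(blocked)  # True
```

Under the BDB backend's page-level locking, the claim is that the second writer would proceed instead of failing.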


I have nothing but good things to say about SQLite. Small. No dependencies. Lightning fast. Public domain. I use it exclusively for all things databases where the customer for whatever reason does not demand one of the other players. I'm glad to see a major actor in the DB industry picking up on it.


Most importantly: The best tested software out there: http://www.sqlite.org/testing.html

"As of version 3.6.23 (all statistics in the report are against that release of SQLite), the SQLite library consists of approximately 67.2 KSLOC of C code. (KSLOC means thousands of "Source Lines Of Code" or, in other words, lines of code excluding blank lines and comments.) By comparison, the project has 679 times as much test code and test scripts - 45678.3 KSLOC."


Anybody have any idea how this might compare to MySQL InnoDB or MyISAM?


It's roughly comparable to InnoDB in that it's ACID-compliant.

MyISAM doesn't support transactions or transaction logging, so it's not as durable as InnoDB or BDB.



