MongoDB and RocksDB at Parse

lpsz · on April 22, 2015

MongoDB seems to have a bad rep on HN for its various shortcomings, and yet here FB/Parse seem to use it. Is the bad rep overrated? (Honest question)

Jweb_Guru · on April 22, 2015

No, the bad reputation is not overrated. This is the essence of why appeal to authority is a fallacy. Facebook is (or was) largely written in PHP; that doesn't mean PHP's bad reputation is overrated. Many banks run on COBOL; that doesn't mean Cobol's bad reputation is overrated.

MongoDB's bad reputation is well-deserved and based on a long history of misleading (or outright false) marketing claims coupled with poor technical properties (from unavoidable data loss to horrible compaction performance to global write locks). At this point, even if all the known issues were fixed (and they aren't) it would still take a long time for their reputation to recover. Given that some of MongoDB's direct competitors do not suffer from these problems and can largely function as drop-in replacements, recommending people use something else is really the only responsible course of action.

spimmy · on April 22, 2015

God yeah, the write locks have been a pain in my ass for so long. Fortunately the storage engine API delegates locking to the engine, so we get document level locking in RocksDB or WT.

lacker · on April 22, 2015

some of MongoDB's direct competitors do not suffer from these problems and can largely function as drop-in replacements

Which competitors are you referring to? Personally I find the power and flexibility of the MongoDB query language to be a big positive that for many cases outweighs performance and reliability issues. It seems like no other competitors are quite there.

functional_test · on April 22, 2015

TokuMX supports most of Mongo's features besides the aggregation framework last I looked. Better locking, transaction support, better indices too -- you should check it out.

benjarrell · on April 22, 2015

I don't know that I would call them competitors, but DB2 and Informix both have JSON support and both implement the MonogoDB protocol.

ahachete · on April 22, 2015

Also ToroDB (https://github.com/torodb/torodb) implements the MongoDB protocol. But it uses PostgreSQL as a backend and it's open source ;P

(ToroDB developer here)

Jweb_Guru · on April 23, 2015

In addition to the others already mentioned, http://hyperdex.org/ implements the MongoDB API (though I cannot vouch for all of its claims, it seems worth including for completeness).

andrewflnr · on April 22, 2015

GP may be thinking of RethinkDB.

functional_test · on April 22, 2015

This is the last one of these posts I'll ever respond to, I promise. And I'll give you the same response I've given every other time before:

You should not base your decision of database (or anything else for that matter) on marketing copy. For something as important as your primary data store, you should at minimum read the full documentation and run some tests with dummy data to see if it will even plausibly work for your use case.

I used MongoDB successfully for years with a large data set (>1TB) and 100% production uptime for more than 3 years. I never lost data. Your claim that you will unavoidably lose data is baseless and without merit. In fact, every issue you listed has been fixed, again, counter to your claim.

Personally, these days I prefer TokuMX if I'm looking for something compatible with MongoDB, but these baseless attacks on MongoDB have to stop.

EDIT: Every time I make a post like this, I get some downvotes without responses. Please tell me why I'm wrong. If it's just that I'm abrasive... Well, you would be too if you were addressing the same thing for the Nth time.

inglor · on April 22, 2015

Not the downvoter - but I can totally understand the downvote.

The fact your anecdotal evidence is that you did not lose any data doesn't mean the internet is not full of people who have lost data with Mongo. I have no idea what your workload is, but my experience with data loss and uptime has not been as great as yours.

I'm not for bashing things either - I think there are cases Mongo might be appropriate, I just don't like countering claims with "it worked for me on this one data set". If it drops writes for one out of 100 people that's still a big reason to avoid it if that's a big concern for you. As for "these issues have been fixed" you're welcome to open the issue tracker - no one at Mongo claims all of these issues have been fixed (then again, PostgreSQL has open issues too) so your claim that "these issues have been fixed" is kind of odd...

functional_test · on April 22, 2015

I only brought that up to counter his claim that data loss is inevitable. Of course my anecdote doesn't mean it's not common =) But anecdotes are all anyone else has, and every time I've read one about someone losing data, they either hadn't read the documentation, or just didn't understand the semantics of what they were doing. Very very rarely, especially these days, has it been an actual DB bug (though I will admit I got Mongo to core one time on 2.4 doing a compaction).

And it's a little disingenuous to point at the issue tracker -- as you say, everyone has open issues. The specific things that are mentioned though have been fixed: writes are checked by default now, the global lock has been broken up into per table locks, etc. There may still be common issues that aren't being addressed, but if there are, I'm not aware of them.

thezilch · on April 22, 2015

Anecdotes are not all one has. We have researchers, and it seems you might have missed yesterday's https://aphyr.com/posts/322-call-me-maybe-mongodb-stale-read... (or previous posts on the matter). This link debunks the value of RTFM for Mongo or misunderstanding the API or the rarity of the bug being the DB.

LunaSea · on April 23, 2015

I think that it is disingenuous to say that because a fuzzer found certain obscure scenarios where there are issues that automatically everyone is going to be affected by it and that the database has no merits.

Also, if I have to choose a datastore, although marketing shouldn't be important, funding is and MongoDB has had some huge funding rounds in the past. This gives me a lot more confidence in choosing it.

functional_test · on April 23, 2015

I did miss that actually. Thank you for responding to me with something real. I'll reply again once I've had the opportunity to take a look.

nemothekid · on April 23, 2015

If I say "its inevitable that you will get into an accident while drunk driving" and you say "I've been drunk driving for years with 100% no accidents" I would assume you are being dense.

functional_test · on April 23, 2015

That is not a valid comparison, and as someone who has been affected by drunk drivers, I take great issue with your trivialization of a serious issue.

But I'd expect nothing less from HN.

nemothekid · on April 23, 2015

How is it not valid? There are documented tests of exactly how and why the database loses data (just like there are studies showing the effects of alcohol), and you have claimed that "it's fine, because it never happened to me". You said the claim was baseless when it wasn't - there is another very popular HN post recently documenting how someone ran a test, proved the database was losing data, and the issue was closed as wontfix (but later reopened). Is aphyr's entire article baseless (and the one he wrote 2 years ago).

In the face of actual data, and reproducible tests - isn't saying something like "well it didn't happen to me" dense?

The comparison might be insensitive, so excuse me for hurting your feelings, but I don't see how its invalid.

A more apt analogy then would be someone saying "My database runs with 100% uptime for 3 years, so there is no reason for me to keep backups"

functional_test · on April 23, 2015

Ah yes, downvotes outnumbering real responses. Classic HN.

viraptor · on April 22, 2015

I promise I'm not trying to troll. Given how the data loss can occur in MongoDB (partitions, silently lost writes) - how do you know that "I never lost data." How do you verify this?

I'm pretty sure that I could kill 0.01% of writes of any random application (which doesn't require extensive audits, like banking for example) and nobody would notice for a really long time. And if the effect was ever noticed, application code would be the first place to look at for the reason.

spimmy · on April 22, 2015

Amen brother. You should never base your db decisions on either marketing copy _or_ HN know-it-all complainypants. :)

We actually evaluated TokuMX extensively last year, pre pluggable storage engine. We might have pursued it if they had implemented a compatible oplog at the time, but with a migration path that consisted of dumping and importing production data -- and no way to re-elect the old primary if there were any production problems -- that made it simply a non-starter for me.

They did eventually implement a compatible oplog, which was a good product decision, but the entire TokuMX engineering team recently quit Tokutek en masse so it's still not a great option in my book. Too bad.

Frozenlock · on April 22, 2015

"TokuMX engineering team recently quit Tokutek "

I would like to know how it happened. Do you have a link to an article or any other source of info?

mushi · on April 22, 2015

They were just acquired, so presumably they may have been running out of money. http://www.percona.com/news-and-events/pressreleases/percona...

spimmy · on April 22, 2015

There's no article, sorry, I just happen to know them.

takeda · on April 22, 2015

> You should not base your decision of database (or anything else for that matter) on marketing copy. For something as important as your primary data store, you should at minimum read the full documentation and run some tests with dummy data to see if it will even plausibly work for your use case.

The issue is that they even lie in their documentation[1].

Also Mongo not necessarily loses data in a catastrophic way, you might have some old or inconsistent data here and there. If you have an authoritative source of data I would compare the data in Mongo with it. Also [1] shows how you can get data corruption even with highest safety settings due to broken design.

[1] https://aphyr.com/posts/322-call-me-maybe-mongodb-stale-read...

spimmy · on April 22, 2015

Yes and no. (That's my blog post btw.) Mongo is a young database, with some super obvious flaws and growing pains. Some of its bad rep is self-inflicted, when Mongo reps make massively overinflated claims about its reliability, scalability, and performance, and then don't back down when the entire internet laughs at them.

But there are some genuinely amazing things about MongoDB and some places where it really shines. Like flexibility -- we run over half a million different apps and therefore half a million different workloads on Mongo. Schema changes and online indexing are painless. Elections are pretty solid. And the data interface layer really can't be beat.

With the pluggable storage engine API, mongo is really growing up and becoming a real database, much like mysql did many years ago when it graduated from MyISAM to InnoDB. I'm excited about the future.

sanderjd · on April 22, 2015

Could you speak (or give me pointers) to how the data interface layer is better than everything else out there?

(By the way, thanks for the balanced, informative, and generally great comment!)

htsh · on April 22, 2015

Yes and no. Though I don't disagree with any of the criticisms or other comments at this level, for some of us MongoDB has been pretty good in production and we haven't experienced data loss. It all depends on your use case.

My first experience with it was for a social game running off of 3 mongo servers in a replica set & here at a large healthcare company we use it for several internal CRUD applications (again in 3-server replica sets) which we continue to iterate. In these use cases, Mongo's flexibility and ease of making changes trumped it's shortcomings. My understanding is that many social games still use Mongo to store player data as our studio was told to use it by a large publisher.

My take is that it's good for building prototypes and it's pretty flexible to both change & migrate data in and out & Javascript devs pick it up relatively quickly. But I also think one should be well aware of it's shortcomings and be careful not to use it where those things are of importance. More often than anything else, I recommend PostgreSQL when other people ask for a general-use db but I'd likely would pick up Mongo myself simply for speed of development.

Lastly, being part of the "MEAN stack," we can point junior devs & interns to one of the many books that cover the stack & they can learn best practices get up to speed quickly. There's an advantage to having books & a slew of SO answers to refer to in that other devs aren't pulled off of their work to teach. We literally had interns committing code on Day 1 of their jobs last summer as they came in having read up on our stack.

Regarding claims MongoDB has made, which I've only been made aware of the last two days, I don't think there's any defending that & it makes my itch to checkout rethinkdb a bit stronger than it was 2 days ago.

erichate · on April 22, 2015

We have been using MongoDB since 1.6 and has worked well for our applications and have not encountered any major issues that would motivate us back to using MySql as our defacto DB.

Knowing the limitations and behavior of Mongo can go a long way in avoiding some of the issues people have encountered.

Definitely looking forward to testing WT and RocksDB in 3.0, beyond performance improvements will drop our storage costs with compression and for a indie studio every dollar counts!

_cpancake · on April 22, 2015

MongoDB is a fun toy database right now, nothing more. Unless you have the resources to fix every problem that comes up, you should use a mature database like Postgres or MySQL (MySQL is not as terrible as some might think).

threeseed · on April 22, 2015

"Mature" doesn't make it necessarily better. I have had catastrophic data losses with Oracle and Teradata and none with Cassandra or MongoDB.

And I would disagree that Mongo is a "toy". We use it as part of our core EDW and we have petabytes of data in our data lake. No issues what so ever.

takeda · on April 23, 2015

Can't coment on your experience with Oracle and Teradata, but the danger with MongoDB is that it has non-catastrophic data corruption.

One that you don't even realize until you compare the data with canonical source.

jasondc · on April 22, 2015

For looking outside of HN, this may help: http://db-engines.com/en/ranking

polskibus · on April 22, 2015

What's the main use case for something like RocksDB ? Is it an out-of-process local caching engine (in-memory + unloads to disk if necessary) ? Or is it something different? Can it communicate between nodes? Why would I use it instead of a large dictionary in memory?

wicknicks · on April 22, 2015

I'm going to provide basic answers here. You can look at the homepage for more detailed answers (the video linked there is very helpful too) [1].

Main use case: Multi-threaded low latency data access. It is optimized for use cases where the insert/update rate is very high. In short, it has an LSM tree to quickly add new/updated records. At frequent intervals this LSM tree is merged into the on-disk pages. It is very well engineered, making the lookups very fast as well.

Queries can touch data on disk as well.

Rocks is an embedded database. Natively written in C++, but has bindings for other languages. [2]

No, nodes do not talk to each other.

Something I am personally excited about: RocksDB has a merge operator, which is (probably) still in development. It allows you to update the value of an entry by providing a merge function. It is extremely useful to merge data if you binary format supports it (for example, protobufs do this natively, and it will be very smart to store your protobuf binary natively in Rocks, and do regular merges to it).

No, an in-memory dictionary will provide far fewer guarantees and features.

[1] http://rocksdb.org/

[2] https://github.com/facebook/rocksdb/wiki/Third-party-languag...

guiomie · on April 22, 2015

I'd also like the following questions answered, I read the post, and can't figure out why? what? when? to use RockDb with MongoDb...

aikah · on April 22, 2015

By the way, does anybody knows how Parse execute sandboxes javascript on the server ? I can't find an article on that matter.

tfb · on April 22, 2015

> The RocksDB engine developed by Facebook engineers is one of the fastest, most compact and write-optimized storage engines available.

Will it fix MongoDB's data loss issues? I'm hoping that is what "write-optimized" is partly implying.

spimmy · on April 22, 2015

Which data loss issues? You're gonna have to be more specific. ;)

It doesn't fix the election rollback issue, because that's handled way above the storage layer. It does solve a whole slew of storage engine related write issues though. No more "we flush to disk every 100ms and call it good".

jbooth · on April 22, 2015

The 'compact' and 'write-optimized' are probably to differentiate RocksDB from LMDB, which pretty thoroughly smokes it for read loads and has an arguably more useful transaction model (which it pays for by single-threading writes and having a little more write-amplification for small records).

hyc_symas · on April 23, 2015

RocksDB is also single-writer. http://rocksdb.org/blog/521/lock/

"write-optimized" means they take great pains to turn all writes into sequential writes, to avoid random I/O seeks and get maximum I/O throughput to the storage device. Of course structuring data as they do makes their reads much slower.

LMDB is read-optimized, and foregoes major effort at those types of write optimizations because, quite frankly, rotating storage media is going the way of the magnetic tape drive. Solid state storage is ubiquitous, and storage seek time just isn't an issue any more.

(Literally and figuratively - HDDs are not going extinct; because of their capacity/$$ advantage they're still used for archival purposes. But everyone doing performance/latency-sensitive work has moved on to SSDs, Flash-based or otherwise.)

"compact" doesn't make much sense. There's nothing compact about the RocksDB codebase. Over 121,000 lines of source code https://www.openhub.net/p/rocksdb and over 20MB of object code. Compared to e.g. 7,000 lines of source for LMDB and 64KB of object code. https://www.openhub.net/p/MDB

jbooth · on April 23, 2015

I hear you RE: compact source code, and that as much as the benchmarks are why I use LMDB (thanks) and not Rocks when I have a need.

I was under the impression that Rocks manages more compact storage, probably as another consequence of all those sequential writes being packed right next to each other, rather than LMDB's freelist-of-4k-pages model.

Is that the case or was I misreading whatever mailing list I got that from? Don't get me wrong, I value not having compactions more than slightly less write amplification, just checking my understanding here.

hyc_symas · on April 23, 2015

RocksDB is more compact storage-wise when records are small. Notice here http://symas.com/mdb/ondisk/ that RocksDB space is smaller using 24 byte values, but same or larger at 96 byte values. By the time you get to 768 byte values, LMDB is smallest.

jbooth · on April 23, 2015

Cool, thanks for the response and for writing lmdb!

lighthawk · on April 22, 2015

"And we now have some exciting news to share: we are running MongoDB on RocksDB in production for some workloads and we’re seeing great results!"

Some stats and a way to replicate would be nice here.

spimmy · on April 22, 2015

Like I said in the post, we're preparing a series of blog posts for next week that does a deep dive into our workloads, our benchmarks, how to repro, etc.

Spoiler alert: we consistently get around 90% compression and the inserts are ~50x faster. ^_^

tszming · on April 22, 2015

Very cool. Btw, sorry to hijack, please also let your colleagues know that they need to update the SSL cert used in *.parseapp.com, e.g. http://sha1affected.com/results?server=doitfordenmark.parsea...

spimmy · on April 22, 2015

tszming -- will take care of it, thank you!

andrea_s · on April 22, 2015

Compared to MMAPv1 or compared to WiredTiger?

spimmy · on April 22, 2015

We're testing with both. Somewhat hindered by the fact that WT often dies trying to import our data, or crashes, or loses data in weird ways. :( I believe WT will eventually get there, but rocks has been unfathomably more solid for us so far.

diziet · on April 22, 2015

WT has been pretty decent for us (only TB level amount of data, so probably not as much as you folks are dealing with). The thing it has issues with is occasionally using the wrong indexes and caching them, and even putting hints in does not solve the issue in all cases. It started happening on WT, but it seems to be mongo's issue and not the storage engine.

spimmy · on April 22, 2015

We have a lot of edge cases in our data set and usage model, no question. Glad it's working well for you. :)

spimmy · on April 22, 2015

ah but those particular compression / write rate stats are compared to mmapv1, yeah

lighthawk · on April 22, 2015

> we consistently get around 90% compression

For unicode and BSON?