MongoDB and RocksDB at Parse (parse.com)
111 points by jasondc on April 22, 2015 | 54 comments



MongoDB seems to have a bad rep on HN for its various shortcomings, and yet here FB/Parse seem to use it. Is the bad rep overrated? (Honest question)


No, the bad reputation is not overrated. This is the essence of why appeal to authority is a fallacy. Facebook is (or was) largely written in PHP; that doesn't mean PHP's bad reputation is overrated. Many banks run on COBOL; that doesn't mean COBOL's bad reputation is overrated.

MongoDB's bad reputation is well-deserved and based on a long history of misleading (or outright false) marketing claims coupled with poor technical properties (from unavoidable data loss to horrible compaction performance to global write locks). At this point, even if all the known issues were fixed (and they aren't) it would still take a long time for their reputation to recover. Given that some of MongoDB's direct competitors do not suffer from these problems and can largely function as drop-in replacements, recommending people use something else is really the only responsible course of action.


God yeah, the write locks have been a pain in my ass for so long. Fortunately the storage engine API delegates locking to the engine, so we get document level locking in RocksDB or WT.


> some of MongoDB's direct competitors do not suffer from these problems and can largely function as drop-in replacements

Which competitors are you referring to? Personally I find the power and flexibility of the MongoDB query language to be a big positive that for many cases outweighs performance and reliability issues. It seems like no other competitors are quite there.


TokuMX supports most of Mongo's features besides the aggregation framework last I looked. Better locking, transaction support, better indices too -- you should check it out.


I don't know that I would call them competitors, but DB2 and Informix both have JSON support and both implement the MongoDB protocol.


Also ToroDB (https://github.com/torodb/torodb) implements the MongoDB protocol. But it uses PostgreSQL as a backend and it's open source ;P

(ToroDB developer here)


In addition to the others already mentioned, http://hyperdex.org/ implements the MongoDB API (though I cannot vouch for all of its claims, it seems worth including for completeness).


GP may be thinking of RethinkDB.


This is the last one of these posts I'll ever respond to, I promise. And I'll give you the same response I've given every other time before:

You should not base your decision of database (or anything else for that matter) on marketing copy. For something as important as your primary data store, you should at minimum read the full documentation and run some tests with dummy data to see if it will even plausibly work for your use case.

I used MongoDB successfully for years with a large data set (>1TB) and 100% production uptime for more than 3 years. I never lost data. Your claim that you will unavoidably lose data is baseless and without merit. In fact, every issue you listed has been fixed, again, counter to your claim.

Personally, these days I prefer TokuMX if I'm looking for something compatible with MongoDB, but these baseless attacks on MongoDB have to stop.

EDIT: Every time I make a post like this, I get some downvotes without responses. Please tell me why I'm wrong. If it's just that I'm abrasive... Well, you would be too if you were addressing the same thing for the Nth time.


Not the downvoter - but I can totally understand the downvote.

The fact that your anecdotal evidence is that you did not lose any data doesn't mean the internet isn't full of people who have lost data with Mongo. I have no idea what your workload is, but my experience with data loss and uptime has not been as great as yours.

I'm not for bashing things either - I think there are cases Mongo might be appropriate, I just don't like countering claims with "it worked for me on this one data set". If it drops writes for one out of 100 people that's still a big reason to avoid it if that's a big concern for you. As for "these issues have been fixed" you're welcome to open the issue tracker - no one at Mongo claims all of these issues have been fixed (then again, PostgreSQL has open issues too) so your claim that "these issues have been fixed" is kind of odd...


I only brought that up to counter his claim that data loss is inevitable. Of course my anecdote doesn't mean it's not common =) But anecdotes are all anyone else has, and every time I've read one about someone losing data, they either hadn't read the documentation, or just didn't understand the semantics of what they were doing. Very very rarely, especially these days, has it been an actual DB bug (though I will admit I got Mongo to core one time on 2.4 doing a compaction).

And it's a little disingenuous to point at the issue tracker -- as you say, everyone has open issues. The specific things that were mentioned, though, have been fixed: writes are checked by default now, the global lock has been broken up into per-table locks, etc. There may still be common issues that aren't being addressed, but if there are, I'm not aware of them.


Anecdotes are not all one has. We have researchers, and it seems you might have missed yesterday's https://aphyr.com/posts/322-call-me-maybe-mongodb-stale-read... (or previous posts on the matter). That post undercuts the usual defenses: that RTFM would have prevented it, that the user misunderstood the API's semantics, or that an actual DB bug is vanishingly rare.


I think it is disingenuous to say that because a fuzzer found certain obscure scenarios where there are issues, everyone is automatically going to be affected by them and the database has no merits.

Also, if I have to choose a datastore, although marketing shouldn't be important, funding is, and MongoDB has had some huge funding rounds in the past. This gives me a lot more confidence in choosing it.


I did miss that actually. Thank you for responding to me with something real. I'll reply again once I've had the opportunity to take a look.


If I say "its inevitable that you will get into an accident while drunk driving" and you say "I've been drunk driving for years with 100% no accidents" I would assume you are being dense.


That is not a valid comparison, and as someone who has been affected by drunk drivers, I take great issue with your trivialization of a serious issue.

But I'd expect nothing less from HN.


How is it not valid? There are documented tests of exactly how and why the database loses data (just like there are studies showing the effects of alcohol), and you have claimed that "it's fine, because it never happened to me". You said the claim was baseless when it wasn't - there is another very popular recent HN post documenting how someone ran a test, proved the database was losing data, and the issue was closed as wontfix (but later reopened). Is aphyr's entire article baseless (and the one he wrote 2 years ago)?

In the face of actual data, and reproducible tests - isn't saying something like "well it didn't happen to me" dense?

The comparison might be insensitive, so excuse me for hurting your feelings, but I don't see how it's invalid.

A more apt analogy then would be someone saying "My database runs with 100% uptime for 3 years, so there is no reason for me to keep backups"


Ah yes, downvotes outnumbering real responses. Classic HN.


I promise I'm not trying to troll. Given how the data loss can occur in MongoDB (partitions, silently lost writes) - how do you know that "I never lost data" is true? How do you verify this?

I'm pretty sure that I could kill 0.01% of writes of any random application (one that doesn't require extensive audits, unlike banking, for example) and nobody would notice for a really long time. And if the effect was ever noticed, application code would be the first place to look for the reason.


Amen brother. You should never base your db decisions on either marketing copy _or_ HN know-it-all complainypants. :)

We actually evaluated TokuMX extensively last year, before the pluggable storage engine API. We might have pursued it if they had implemented a compatible oplog at the time, but a migration path that consisted of dumping and importing production data -- with no way to re-elect the old primary if there were any production problems -- made it simply a non-starter for me.

They did eventually implement a compatible oplog, which was a good product decision, but the entire TokuMX engineering team recently quit Tokutek en masse so it's still not a great option in my book. Too bad.


"TokuMX engineering team recently quit Tokutek "

I would like to know how it happened. Do you have a link to an article or any other source of info?


They were just acquired, so presumably they may have been running out of money. http://www.percona.com/news-and-events/pressreleases/percona...


There's no article, sorry, I just happen to know them.


> You should not base your decision of database (or anything else for that matter) on marketing copy. For something as important as your primary data store, you should at minimum read the full documentation and run some tests with dummy data to see if it will even plausibly work for your use case.

The issue is that they even lie in their documentation[1].

Also, Mongo doesn't necessarily lose data in a catastrophic way; you might just have some old or inconsistent data here and there. If you have an authoritative source of data, I would compare the data in Mongo against it. [1] also shows how you can get data corruption even with the highest safety settings, due to broken design.

[1] https://aphyr.com/posts/322-call-me-maybe-mongodb-stale-read...


Yes and no. (That's my blog post btw.) Mongo is a young database, with some super obvious flaws and growing pains. Some of its bad rep is self-inflicted, when Mongo reps make massively overinflated claims about its reliability, scalability, and performance, and then don't back down when the entire internet laughs at them.

But there are some genuinely amazing things about MongoDB and some places where it really shines. Like flexibility -- we run over half a million different apps and therefore half a million different workloads on Mongo. Schema changes and online indexing are painless. Elections are pretty solid. And the data interface layer really can't be beat.

With the pluggable storage engine API, mongo is really growing up and becoming a real database, much like mysql did many years ago when it graduated from MyISAM to InnoDB. I'm excited about the future.


Could you speak (or give me pointers) to how the data interface layer is better than everything else out there?

(By the way, thanks for the balanced, informative, and generally great comment!)


Yes and no. Though I don't disagree with any of the criticisms or other comments at this level, for some of us MongoDB has been pretty good in production and we haven't experienced data loss. It all depends on your use case.

My first experience with it was for a social game running off of 3 mongo servers in a replica set & here at a large healthcare company we use it for several internal CRUD applications (again in 3-server replica sets) which we continue to iterate. In these use cases, Mongo's flexibility and ease of making changes trumped its shortcomings. My understanding is that many social games still use Mongo to store player data, as our studio was told to use it by a large publisher.

My take is that it's good for building prototypes, it's pretty flexible to both change & migrate data in and out, & Javascript devs pick it up relatively quickly. But I also think one should be well aware of its shortcomings and be careful not to use it where those things are of importance. More often than anything else, I recommend PostgreSQL when other people ask for a general-use db, but I'd likely pick up Mongo myself simply for speed of development.

Lastly, being part of the "MEAN stack," we can point junior devs & interns to one of the many books that cover the stack & they can learn best practices and get up to speed quickly. There's an advantage to having books & a slew of SO answers to refer to, in that other devs aren't pulled off of their work to teach. We literally had interns committing code on Day 1 of their jobs last summer as they came in having read up on our stack.

Regarding the claims MongoDB has made, which I've only been made aware of in the last two days, I don't think there's any defending that & it makes my itch to check out RethinkDB a bit stronger than it was 2 days ago.


We have been using MongoDB since 1.6; it has worked well for our applications, and we have not encountered any major issues that would motivate us to go back to using MySQL as our de facto DB.

Knowing the limitations and behavior of Mongo can go a long way in avoiding some of the issues people have encountered.

Definitely looking forward to testing WT and RocksDB in 3.0; beyond the performance improvements, compression will drop our storage costs, and for an indie studio every dollar counts!


MongoDB is a fun toy database right now, nothing more. Unless you have the resources to fix every problem that comes up, you should use a mature database like Postgres or MySQL (MySQL is not as terrible as some might think).


"Mature" doesn't make it necessarily better. I have had catastrophic data losses with Oracle and Teradata and none with Cassandra or MongoDB.

And I would disagree that Mongo is a "toy". We use it as part of our core EDW and we have petabytes of data in our data lake. No issues whatsoever.


Can't comment on your experience with Oracle and Teradata, but the danger with MongoDB is that it has non-catastrophic data corruption.

One that you don't even realize until you compare the data with the canonical source.


For a view from outside of HN, this may help: http://db-engines.com/en/ranking


What's the main use case for something like RocksDB? Is it an out-of-process local caching engine (in-memory + unloads to disk if necessary)? Or is it something different? Can it communicate between nodes? Why would I use it instead of a large dictionary in memory?


I'm going to provide basic answers here. You can look at the homepage for more detailed answers (the video linked there is very helpful too) [1].

Main use case: multi-threaded, low-latency data access. It is optimized for use cases where the insert/update rate is very high. In short, new and updated records go into the in-memory top of an LSM tree; at frequent intervals that buffer is flushed to disk and merged down into sorted files by compaction. It is very well engineered, making lookups very fast as well.

Queries can touch data on disk as well.

Rocks is an embedded database. Natively written in C++, but has bindings for other languages. [2]

No, nodes do not talk to each other.

Something I am personally excited about: RocksDB has a merge operator, which is (probably) still in development. It allows you to update the value of an entry by providing a merge function. It is extremely useful for merging data if your binary format supports it (for example, protobufs do this natively, so it would be very smart to store your protobuf binaries in Rocks and do regular merges on them).

No, an in-memory dictionary will provide far fewer guarantees and features.

[1] http://rocksdb.org/

[2] https://github.com/facebook/rocksdb/wiki/Third-party-languag...
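
For anyone wondering what "embedded" looks like in practice, here is a minimal C++ sketch of the basic put/get flow plus the merge operator mentioned above. The AppendOperator class, the keys, and the /tmp path are invented for illustration; the rocksdb calls are from the public API, but treat this as a sketch, not anything Parse is actually running.

    #include <cassert>
    #include <string>
    #include "rocksdb/db.h"
    #include "rocksdb/merge_operator.h"

    // Toy associative merge operator: concatenates new values onto the old
    // one with a comma. Real uses would merge e.g. protobuf payloads.
    class AppendOperator : public rocksdb::AssociativeMergeOperator {
     public:
      bool Merge(const rocksdb::Slice& /*key*/, const rocksdb::Slice* existing,
                 const rocksdb::Slice& value, std::string* new_value,
                 rocksdb::Logger* /*logger*/) const override {
        new_value->clear();
        if (existing != nullptr) {
          new_value->assign(existing->data(), existing->size());
          new_value->append(",");
        }
        new_value->append(value.data(), value.size());
        return true;  // merge succeeded
      }
      const char* Name() const override { return "AppendOperator"; }
    };

    int main() {
      rocksdb::Options options;
      options.create_if_missing = true;
      options.merge_operator.reset(new AppendOperator());

      rocksdb::DB* db;
      rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/rocks_example", &db);
      assert(s.ok());

      // Writes land in an in-memory memtable (plus a write-ahead log) and are
      // flushed/compacted into sorted on-disk files later -- the LSM tree.
      db->Put(rocksdb::WriteOptions(), "user:1", "alice");

      // Merge records an update without a read-modify-write round trip; the
      // operator is applied lazily on reads and during compaction.
      db->Merge(rocksdb::WriteOptions(), "tags:1", "red");
      db->Merge(rocksdb::WriteOptions(), "tags:1", "blue");

      std::string value;
      s = db->Get(rocksdb::ReadOptions(), "tags:1", &value);
      assert(s.ok() && value == "red,blue");

      delete db;
      return 0;
    }

The merge calls are the part worth noticing: the combine step is deferred to reads and compaction, which is why it suits the high insert/update rates described above.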


I'd also like these questions answered. I read the post and can't figure out why, what, or when to use RocksDB with MongoDB...


By the way, does anybody know how Parse executes sandboxed JavaScript on the server? I can't find an article on that matter.


> The RocksDB engine developed by Facebook engineers is one of the fastest, most compact and write-optimized storage engines available.

Will it fix MongoDB's data loss issues? I'm hoping that is what "write-optimized" is partly implying.


Which data loss issues? You're gonna have to be more specific. ;)

It doesn't fix the election rollback issue, because that's handled way above the storage layer. It does solve a whole slew of storage engine related write issues though. No more "we flush to disk every 100ms and call it good".


The 'compact' and 'write-optimized' are probably to differentiate RocksDB from LMDB, which pretty thoroughly smokes it for read loads and has an arguably more useful transaction model (which it pays for by single-threading writes and having a little more write-amplification for small records).


RocksDB is also single-writer. http://rocksdb.org/blog/521/lock/

"write-optimized" means they take great pains to turn all writes into sequential writes, to avoid random I/O seeks and get maximum I/O throughput to the storage device. Of course structuring data as they do makes their reads much slower.

LMDB is read-optimized, and foregoes major effort at those types of write optimizations because, quite frankly, rotating storage media is going the way of the magnetic tape drive. Solid state storage is ubiquitous, and storage seek time just isn't an issue any more.

(Literally and figuratively - HDDs are not going extinct; because of their capacity/$$ advantage they're still used for archival purposes. But everyone doing performance/latency-sensitive work has moved on to SSDs, Flash-based or otherwise.)

"compact" doesn't make much sense. There's nothing compact about the RocksDB codebase. Over 121,000 lines of source code https://www.openhub.net/p/rocksdb and over 20MB of object code. Compared to e.g. 7,000 lines of source for LMDB and 64KB of object code. https://www.openhub.net/p/MDB


I hear you re: compact source code, and that, as much as the benchmarks, is why I use LMDB (thanks) and not Rocks when I have a need.

I was under the impression that Rocks manages more compact storage, probably as another consequence of all those sequential writes being packed right next to each other, rather than LMDB's freelist-of-4k-pages model.

Is that the case or was I misreading whatever mailing list I got that from? Don't get me wrong, I value not having compactions more than slightly less write amplification, just checking my understanding here.


RocksDB is more compact storage-wise when records are small. Notice here http://symas.com/mdb/ondisk/ that RocksDB space is smaller using 24 byte values, but same or larger at 96 byte values. By the time you get to 768 byte values, LMDB is smallest.


Cool, thanks for the response and for writing lmdb!


"And we now have some exciting news to share: we are running MongoDB on RocksDB in production for some workloads and we’re seeing great results!"

Some stats and a way to replicate would be nice here.


Like I said in the post, we're preparing a series of blog posts for next week that does a deep dive into our workloads, our benchmarks, how to repro, etc.

Spoiler alert: we consistently get around 90% compression and the inserts are ~50x faster. ^_^


Very cool. Btw, sorry to hijack, please also let your colleagues know that they need to update the SSL cert used in *.parseapp.com, e.g. http://sha1affected.com/results?server=doitfordenmark.parsea...


tszming -- will take care of it, thank you!


Compared to MMAPv1 or compared to WiredTiger?


We're testing with both. Somewhat hindered by the fact that WT often dies trying to import our data, or crashes, or loses data in weird ways. :( I believe WT will eventually get there, but rocks has been unfathomably more solid for us so far.


WT has been pretty decent for us (only a TB-scale amount of data, so probably not as much as you folks are dealing with). The thing it has issues with is occasionally using the wrong indexes and caching them, and even putting hints in does not solve the issue in all cases. It started happening on WT, but it seems to be Mongo's issue and not the storage engine's.


We have a lot of edge cases in our data set and usage model, no question. Glad it's working well for you. :)


ah but those particular compression / write rate stats are compared to mmapv1, yeah


> we consistently get around 90% compression

For unicode and BSON?



