bobgogl.blogg.se - ArangoDB sharding

Hi, I'm the CTO of ArangoDB, so my comments are most certainly biased. I would still like to share our opinions on the issues raised, namely full-text indexes and blueprint.

(1) We do not believe that TP (TinkerPop) is helpful in a sharded environment. Gremlin is a nice language, but it requires you to move a lot of data into the client. This works very well if you can embed the database and keep it in the same process space. As soon as you need to shard the data and spread it across many servers, you will move a lot of data between Gremlin and the DB servers. Therefore we decided to create a JavaScript version of Gremlin which runs directly on the shards, thus minimising the amount of data moved. So it is indeed true that we did not add support for TP3, because we believe it will be of limited use.

(2) Fulltext indexes are not our main expertise. We think that search engines like ElasticSearch and Solr are much better at this, especially when it comes to stemming, different languages, and phonetic searches. There is an ElasticSearch plugin that lets you use ElasticSearch as the fulltext search engine for ArangoDB. The fulltext index is indeed very slow to build. We want to speed up the process and hopefully can improve there over time (see also the next bullet point). I assume that you are using a fulltext index in your example, right?

(3) We decided to keep the indexes only in memory. The alternatives are:

(1) use memory-only indexes (this is currently implemented in ArangoDB)
(2) use disk-based indexes (this is currently implemented in CouchDB)
(3) use disk-backed indexes with a file-system-like clean flag
(4) other solutions, such as keeping only parts in memory or using memory as a cache, are also possible

Option (2) is the slowest solution, because you need to ensure that there are no inconsistencies even in case of a server crash. If you have a look at what CouchDB does, you will see what I mean. You need to do much more syncing than in (1). Option (3) depends: after a clean shutdown it is as fast as (2), after a crash it is as slow as (1). So if you expect your server to crash often, then (1) might not be a good idea. If you expect your server to run stably, then (1) might be much faster during normal operations.

ArangoDB currently uses option (1), but we want to switch to option (3).
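To make the "clean flag" idea from index option (3) more concrete, here is a minimal Python sketch. It is not ArangoDB's implementation. The class name `CleanFlagIndex`, the file names `index.json` and `CLEAN`, and the toy value-to-keys index are all assumptions made up for illustration. The index lives in memory while the server runs, is written to disk together with a marker file on a clean shutdown, and is rebuilt from the documents when the marker is missing, as it would be after a crash.

```python
import json
import os
import tempfile


class CleanFlagIndex:
    """Toy in-memory index persisted on clean shutdown (hypothetical sketch)."""

    def __init__(self, directory, documents):
        # File names are illustrative assumptions, not ArangoDB's layout.
        self.index_path = os.path.join(directory, "index.json")
        self.flag_path = os.path.join(directory, "CLEAN")
        self.rebuilt = False

        if os.path.exists(self.flag_path) and os.path.exists(self.index_path):
            # Last shutdown was clean: trust the index stored on disk.
            with open(self.index_path) as f:
                self.entries = json.load(f)
        else:
            # First start or crash: rebuild the index from the documents.
            self.entries = {}
            for key, value in documents.items():
                self.entries.setdefault(value, []).append(key)
            self.rebuilt = True

        # Drop the flag while running, so a crash leaves no flag behind.
        if os.path.exists(self.flag_path):
            os.remove(self.flag_path)

    def lookup(self, value):
        """Return the document keys stored under the given value."""
        return self.entries.get(value, [])

    def close(self):
        """Clean shutdown: persist the index first, then write the flag."""
        with open(self.index_path, "w") as f:
            json.dump(self.entries, f)
            f.flush()
            os.fsync(f.fileno())
        with open(self.flag_path, "w") as f:
            f.write("clean")


if __name__ == "__main__":
    data_dir = tempfile.mkdtemp()
    docs = {"a": "red", "b": "blue", "c": "red"}

    first = CleanFlagIndex(data_dir, docs)   # no flag yet, so it rebuilds
    first.close()                            # clean shutdown
    second = CleanFlagIndex(data_dir, docs)  # flag found, loads from disk
    print(first.rebuilt, second.rebuilt)
```

Normal operations stay memory-only in this sketch, which is why it keeps the runtime speed of option (1). Only startup differs: a restart after a clean shutdown skips the rebuild, while a restart after a crash pays the full rebuild cost.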