Apache Solr vs Elasticsearch-feature

Feature

Solr 6.2.1

Elasticsearch 5.0

Format

XML, CSV, JSON

JSON

HTTP REST API

 SolrJ

JMX support

 ES specific stats are exposed through the REST API

Java

PHP, Ruby, Perl, Scala, Python, .NET, Javascript, Go, Erlang, Clojure

Drupal, Magento, Django, ColdFusion, Wordpress, OpenCMS, Plone, Typo3, ez Publish, Symfony2, Riak (via Yokozuna)

Drupal, Django, Symfony2, Wordpress, CouchBase

DataStax Enterprise Search, Cloudera Search, Hortonworks Data Platform, MapR

JSON, XML, PHP, Python, Ruby, CSV, Velocity, XSLT, native Java

Master-slave replication

 Only in non-SolrCloud. In SolrCloud, behaves identically to ES.

 Not an issue because shards are replicated across nodes.

Integrated snapshot and restore

Filesystem

Filesystem, AWS Cloud Plugin for S3 repositories, HDFS Plugin for Hadoop environments, Azure Cloud Plugin for Azure storage repositories

Data Import

DataImportHandler - JDBC, CSV, XML, Tika, URL, Flat File

[DEPRECATED in 2.x] Rivers modules - ActiveMQ, Amazon SQS, CouchDB, Dropbox, DynamoDB, FileSystem, Git, GitHub, Hazelcast, JDBC, JMS, Kafka, LDAP, MongoDB, neo4j, OAI, RabbitMQ, Redis, RSS, Sofa, Solr, St9, Subversion, Twitter, Wikipedia

ID field for updates and deduplication

 with stored fields

 with _source field

 Supports Solr and Wordnet synonym format

 4.4+

 One set of fields per schema, one schema per core

 Schemaless mode or via dynamic fields.

 Only backward-compatible changes.

 via multi-fields

 Need to programmatically create queries if going beyond Lucene query syntax.

Geo-distance Faceting

More Like This

JIRA issue: https://issues.apache.org/jira/browse/SOLR-4587

 Percolation. Distributed percolation supported in 1.0

Autocomplete

workaround: https://github.com/elasticsearch/elasticsearch/issues/1066#issuecomment-8625739

 via parent-child query

 via has_child and top_children queries

 Joined index has to be single-shard and replicated across all nodes.

 New to 4.7.0

 via scan search type

 also supports filtering by native scripts

 local params and cache property

 DisMax, eDisMax

 query_string, dis_max, match, multi_match etc

 but awkward. Involves positively boosting the inverse set of negatively-boosted documents.

Search across multiple indexes

 it can search across multiple compatible collections

Result highlighting

Term Vectors API

 via SearchComponents

Pluggable Analyzers/Tokenizers

Pluggable Field Types

Pluggable Function queries

Pluggable scoring scripts

 Installable from GitHub, maven, sonatype or elasticsearch.org

 Depends on separate ZooKeeper server

 Only Elasticsearch nodes

Automatic node discovery

 ZooKeeper

 internal Zen Discovery or ZooKeeper

Partition tolerance

 The partition without a ZooKeeper quorum will stop accepting indexing requests or cluster state changes, while the partition with a quorum continues to function.

 Partitioned clusters can diverge unless discovery.zen.minimum_master_nodes set to at least N/2+1, where N is the size of

Automatic failover

 If all nodes storing a shard and its replicas fail, client requests will fail, unless requests are made with the shards.tolerant=true parameter, in which case partial results are returned from the available shards.

Automatic leader election

Shard replication

 It can be machine, rack, availability zone, and/or data center aware. Arbitrary tags can be assigned to nodes, and it can be configured to not place a shard and its replicas on nodes with the same tags.

Change # of shards

 Shards can be added (when using implicit routing) or split (when using compositeId). Cannot be lowered. Replicas can be increased anytime.

 Each index has 5 shards by default. Number of primary shards cannot be changed once the index is created. Replicas can be increased anytime.

Shard splitting

 can be done by creating a shard replica on the desired node and then removing the shard from the source node

 can move shards and replicas to any node in the cluster on demand

 shards or _route_ parameter

 routing parameter

Pluggable shard/replica assignment

Consistency

Indexing requests are synchronous with replication. An indexing request won't return until all replicas respond. No check for downed replicas. They will catch up when they recover. When new replicas are added, they won't start accepting and responding to requests until they are finished replicating the index.

Replication between nodes is synchronous by default, thus ES is consistent by default, but it can be set to asynchronous on a per document indexing basis. Index writes can be configured to fail if there are not sufficient active shard replicas. The default is quorum, but all or one are also available.

Web Admin interface

 bundled with Solr

 Marvel or Kibana apps

Visualisation

Banana (Port of Kibana): https://github.com/LucidWorks/banana

Kibana: https://www.elastic.co/products/kibana

Hosting providers

I'm embedding my answer to this "Solr-vs-Elasticsearch" Quora question verbatim here:

1. Elasticsearch was born in the age of REST APIs. If you love REST APIs, you'll probably feel more at home with ES from the get-go. I don't actually think it's 'cleaner' or 'easier to use', but just that it is more aligned with web 2.0 developers' mindsets.

2. Elasticsearch's Query DSL syntax is really flexible and it's pretty easy to write complex queries with it, though it does border on being verbose. Solr doesn't have an equivalent, last I checked. Having said that, I've never found Solr's query syntax wanting, and I've always been able to easily write a custom SearchComponent if needed (more on this later).

3. I find Elasticsearch's documentation to be pretty awful. It doesn't help that some examples in the documentation are written in YAML and others in JSON. I wrote an ES code parser once to auto-generate documentation from Elasticsearch's source and found a number of discrepancies between code and what's documented on the website, not to mention a number of undocumented/alternative ways to specify the same config key.

By contrast, I've found Solr to be consistent and really well-documented. I've found pretty much everything I've wanted to know about querying and updating indices without having to dig into code much. Solr's schema.xml and solrconfig.xml are *extensively* documented with most if not all commonly used configurations.

4. Whilst what Rick says about ES being mostly ready to go out-of-box is true, I think that is also a possible problem with ES. Many users don't take the time to do the simplest config (e.g. type mapping) of ES because it 'just works' in dev, and end up running into issues in production.

And once you do have to do config, then I personally prefer Solr's config system over ES'. Long JSON config files can get overwhelming because of JSON's lack of support for comments. Yes you can use YAML, but it's annoying and confusing to go back and forth between YAML and JSON.

5. If your own app works/thinks in JSON, then without a doubt go for ES because ES thinks in JSON too. Solr merely supports it as an afterthought. ES has a number of nice JSON-related features such as parent-child and nested docs that make it a very natural fit. Parent-child joins are awkward in Solr, and I don't think there's a Solr equivalent for ES Inner hits.

6. ES doesn't require ZooKeeper for its 'elastic' features which is nice coz I personally find ZK unpleasant, but as a result, ES does have issues with split-brain scenarios though (google 'elasticsearch split-brain' or see this: Elasticsearch Resiliency Status).

7. Overall from working with clients as a Solr/Elasticsearch consultant, I've found that developer preferences tend to end up along language party lines: if you're a Java/C# developer, you'll be pretty happy with Solr. If you live in Javascript or Ruby, you'll probably love Elasticsearch. If you're on Python or PHP, you'll probably be fine with either.

Something to add about this: ES doesn't have a very elegant Java API IMHO (you'll basically end up using REST because it's less painful), whereas SolrJ is very satisfactory and more efficient than Solr's REST API. If you're primarily a Java dev team, do take this into consideration for your sanity. There's no scenario in which constructing JSON in Java is fun/simple, whereas in Python it's absolutely pain-free, and believe me, if you have a non-trivial app, your ES JSON query strings will be works of art.

8. ES doesn't have in-built support for pluggable 'SearchComponents', to use Solr's terminology. SearchComponents are (for me) a pretty indispensable part of Solr for anyone who needs to do anything customized and in-depth with search queries.

Yes of course, in ES you can just implement your own RestHandler, but that's just not the same as being able to plug into and rewire the way search queries are handled and parsed.

9. Whichever way you go, I highly suggest you choose a client library which is as 'close to the metal' as you can get. Both ES and Solr have *really* simple search and updating search APIs. If a client library introduces an additional DSL layer in an attempt to 'simplify', I suggest you think long and hard about using it, as it's likely to complicate matters in the long run, and make debugging and asking for help on SO more problematic.

In particular, if you're using Rails + Solr, consider using rsolr/rsolr instead of sunspot/sunspot if you can help it. ActiveRecord is complex code and sufficiently magical. The last thing you want is more magic on top of that.

To conclude, ES and Solr have more or less feature-parity and from a feature standpoint, there's rarely one reason to go one way or the other (unless your app lives/breathes JSON). Performance-wise, they are also likely to be quite similar (I'm sure there are exceptions to the rule. ES' relatively new autocomplete implementation, for example, is a pretty dramatic departure from previous Lucene/Solr implementations, and I suspect it produces faster responses at scale).

ES does offer less friction from the get-go and you feel like you have something working much quicker, but I find this to be illusory. Any time gained in this stage is lost when figuring out how to properly configure ES because of poor documentation - an inevitability when you have a non-trivial application.

Solr encourages you to understand a little more about what you're doing, and the chance of you shooting yourself in the foot is somewhat lower, mainly because you're forced to read and modify the two well-documented XML config files in order to have a working search app.

EDIT on Nov 2015:

ES has been gradually distinguishing itself from Solr when it comes to data analytics. I think it's fair to attribute this to the immense traction of the ELK stack in the logging, monitoring and analytic space. My guess is that this is where Elastic (the company) gets the majority of its revenue, so it makes perfect sense that ES (the product) reflects this.

We see this manifesting primarily in the form of aggregations, which are a more flexible and nuanced replacement for facets. Read more about aggregations here: Migrating to aggregations

Aggregations have been out for a while now (since 1.4), but with the recently released ES 2.0 comes pipeline aggregations, which let you compute aggregations such as derivatives, moving averages, and series arithmetic on the results of other aggregations. Very cool stuff, and Solr simply doesn't have an equivalent. More on pipeline aggregations here: Out of this world aggregations

If you're currently using or contemplating using Solr in an analytics app, it is worth your while to look into ES aggregation features to see if you need any of it.
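
To make the pipeline aggregation idea a little more concrete, here is a minimal Python sketch of what such a request body might look like: a date histogram, a per-bucket sum, and a derivative computed from that sum. The field names (`timestamp`, `bytes`) and the aggregation names (`per_interval`, `total`, `total_change`) are made up for illustration, and the `interval` parameter uses the spelling from the ES 2.x/5.x era discussed here. The small `derivative` helper is not part of any ES client; it only mimics locally what a first-order derivative yields per bucket.

```python
import json


def build_derivative_request(date_field, value_field, interval="day"):
    """Build a search body with a date histogram, a per-bucket sum, and a
    pipeline aggregation that takes the derivative of that sum."""
    return {
        "size": 0,  # only aggregation results are wanted, no search hits
        "aggs": {
            "per_interval": {
                # ES 2.x/5.x spelling; later releases renamed this parameter
                "date_histogram": {"field": date_field, "interval": interval},
                "aggs": {
                    # ordinary metric aggregation, computed per bucket
                    "total": {"sum": {"field": value_field}},
                    # pipeline aggregation: reads the sibling "total" metric
                    "total_change": {"derivative": {"buckets_path": "total"}},
                },
            }
        },
    }


def derivative(values):
    """Local illustration only: difference from the previous bucket,
    with no value for the first bucket."""
    if not values:
        return []
    return [None] + [b - a for a, b in zip(values, values[1:])]


if __name__ == "__main__":
    # Hypothetical field names; the printed JSON is what would be sent
    # as the body of a search request.
    body = build_derivative_request("timestamp", "bytes")
    print(json.dumps(body, indent=2))
    print(derivative([10, 15, 12]))
```

Building the body as a plain dict and serializing it with `json.dumps` also illustrates the earlier point about staying close to the metal: no client-side DSL layer is involved.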