Getting started with ElasticSearch
A quick introduction to ElasticSearch, an open source, distributed and RESTful search engine based on Lucene, and how easily you can start working with it.
Search Solution
As mentioned in the earlier post, Choosing the right search solution for your site, you should first analyze which search solution best suits your requirements. The sections below cover some of the functionality and capabilities offered by ElasticSearch.
ElasticSearch
In brief, ElasticSearch (ES) is an open source, distributed, schema-less and RESTful search engine based on Lucene. Some of its typical features are:
- Distributed: a search runs across multiple shards/indices and the results are aggregated
- Schema-less: document oriented; documents are JSON and field mappings can be detected automatically
- RESTful: exposes a REST interface
- Faceted search: supports navigational search functionality
- Replication: supports index replication
- Failover: replication and the distributed nature provide built-in failover
- Near real time: supports near-real-time updates
- Versioning: keeps a version number for each document, which can be used for optimistic concurrency control (older versions of the document are not retained)
- Percolation: lets you register queries against an index and, for a given document, returns the registered queries that match it
- Index aliasing: lets you create aliases for indices
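Percolation reverses the usual search flow: queries are stored first, and each incoming document is checked against them. The following is only a toy model of that idea in plain Java, assuming each registered "query" is a single term. The class and method names are hypothetical and are not part of the ES API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Toy model only: real percolation evaluates full ES queries, not bare terms.
class PercolatorSketch {
    // query name -> term that a document must contain to match
    private final Map<String, String> registered = new LinkedHashMap<>();

    // Stores a named single-term query for later matching.
    void register(String queryName, String term) {
        registered.put(queryName, term.toLowerCase(Locale.ROOT));
    }

    // Returns the names of all registered queries that match the document text.
    List<String> percolate(String docText) {
        Set<String> terms = new HashSet<>(
                Arrays.asList(docText.toLowerCase(Locale.ROOT).split("\\W+")));
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, String> entry : registered.entrySet()) {
            if (terms.contains(entry.getValue())) {
                matches.add(entry.getKey());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        PercolatorSketch percolator = new PercolatorSketch();
        percolator.register("alert-es", "elastic");
        percolator.register("alert-solr", "solr");
        // Only the "alert-es" query contains a term found in this document.
        System.out.println(percolator.percolate("trying out Elastic Search"));
    }
}
```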
Installing ElasticSearch
Download the latest version of ES from the download page. Refer to the installation guide for environment-specific steps. Extract the zip file to a destination folder and go to the installation folder. To start the process in the foreground:
$ bin/elasticsearch
To start the process in the background:
$ bin/elasticsearch &
Running ElasticSearch as a service
Download the Service Wrapper from its GitHub repository. Check the README file for how to install the service wrapper.
$ bin/service start/stop
Install plugin
Refer to the Plugin Guide page for a detailed list of available plugins. To install the ElasticSearch Head plugin, go to the installation directory and run:
$ bin/plugin -install mobz/elasticsearch-head
To browse the installed plugin, open http://localhost:9200/_plugin/head/
Configure ES server
Refer to the Configuration page for all the configuration options. To change the server configuration, go to the installation directory and edit:
$ vi config/elasticsearch.yml
Change the relevant settings for your environment, e.g. the name of your cluster (cluster.name: localtestsearch). Restart the server for the changes to take effect, then browse with the head plugin to see them.
Testing ES server from command line
Refer to the Index section of the online guide for creating an index and adding documents.
# Create index
$ curl -XPUT 'http://localhost:9200/twitter/'
# Add document
$ curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '{ "tweet" : { "user" : "kimchy", "post_date" : "2009-11-15T14:12:12", "message" : "trying out Elastic Search" } }'
# Get document by id
$ curl -XGET 'http://localhost:9200/twitter/tweet/1'
# Search document
$ curl -XGET 'http://localhost:9200/twitter/tweet/_search?q=user:kimchy'
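The same REST calls can be made from plain Java, without any ES library, since they are ordinary HTTP requests against port 9200. The sketch below assumes a node running locally on the default HTTP port. The class name, method names and the shortened JSON body are illustrative only.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

// Sketch only: names here are hypothetical, not part of any ES library.
class EsRestSketch {
    // Assumed default HTTP endpoint of a locally running node.
    static final String BASE = "http://localhost:9200";

    // Builds /{index}/{type}/{id}, the URL shape used by the curl calls.
    static String documentUrl(String index, String type, String id) {
        return BASE + "/" + index + "/" + type + "/" + id;
    }

    // Sends a request with an optional JSON body and returns the response body.
    static String send(String method, String url, String json) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod(method);
        if (json != null) {
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "application/json");
            try (OutputStream out = conn.getOutputStream()) {
                out.write(json.getBytes(StandardCharsets.UTF_8));
            }
        }
        try (InputStream in = conn.getInputStream();
             Scanner scanner = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            return scanner.hasNext() ? scanner.next() : "";
        }
    }

    public static void main(String[] args) throws IOException {
        String url = documentUrl("twitter", "tweet", "1");
        // Equivalent of the "Add document" and "Get document by id" curl calls.
        System.out.println(send("PUT", url,
                "{\"user\":\"kimchy\",\"message\":\"trying out Elastic Search\"}"));
        System.out.println(send("GET", url, null));
    }
}
```

Running main requires a started ES server; without one, the connection attempt fails with an IOException.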
Testing ES server using head plugin
Browse the index data using the Head plugin at http://localhost:9200/_plugin/head/.
Accessing from Java
As a Java developer, you would probably prefer to connect to the server from a quick test case or a Java application. To get started with the Java API:
Maven integration
Use the ElasticSearch Java API through a Maven dependency:
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>0.20.5</version>
</dependency>
Using Java API
To connect to the locally installed ES server,
- create client
- create index and set settings and mappings for document type
- add documents to the index
- get document
import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;

import org.elasticsearch.action.get.GetResponse;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;
import org.elasticsearch.common.xcontent.XContentBuilder;

// indexName, documentType and documentId are String variables supplied by the caller.

// Create client; cluster.name must match the value in config/elasticsearch.yml
Settings settings = ImmutableSettings.settingsBuilder().put("cluster.name", "localtestsearch").build();
Client client = new TransportClient(settings)
        .addTransportAddress(new InetSocketTransportAddress("localhost", 9300));

// Create index; settings and mappings can be set on the request builder before executing
client.admin().indices().prepareCreate(indexName).execute().actionGet();

// Add document: build the JSON source, then index it
XContentBuilder contentBuilder = jsonBuilder().prettyPrint().startObject();
contentBuilder.field("name", "jai");
contentBuilder.endObject();
IndexResponse indexResponse = client.prepareIndex(indexName, documentType, documentId)
        .setSource(contentBuilder)
        .execute().actionGet();

// Get document, fetching only the "name" field
GetResponse getResponse = client.prepareGet(indexName, documentType, documentId)
        .setFields(new String[]{"name"})
        .execute().actionGet();
String name = getResponse.field("name").getValue().toString();
In later posts, we will discuss the more advanced options of the Java API.
Online documentation
Some quick links to the ES online documentation:
- Glossary: first get familiar with the terminology, to understand the concepts better.
- Guide: the online Guide details the different sections, with quick examples to start with.
- Blog: the Blog section covers regular updates.
- Tutorials: the Tutorials section covers common concepts around ES usage.
- API docs: check the Java API section for sample examples.
- Video: check the Video section covering various ES topics.
- GitHub: check the different ElasticSearch projects on GitHub, which also include sample examples.
Starting with basic concepts
Lucene Concepts
Lucene is a text search engine library. Get familiar with the basic Lucene terminology:
- Document: a collection of fields
- Field: a string-based key-value pair
- Collection: a set of documents
- Precision: the fraction of returned documents that are relevant
- Recall: the fraction of relevant documents that are returned
- Inverted index: maps each term to the list of documents that contain it
- Lucene building blocks: IndexWriter, Analyzer, Tokenizer, QueryParser, Query, IndexSearcher etc.
- Index and segments: an index is written as a set of immutable segments
- Score: the relevancy of each document matching the query
Refer to the Lucene online documentation and wiki for further details.
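To make the inverted index, precision and recall terms concrete, here is a small self-contained Java sketch. It is a toy illustration and not how Lucene stores its index: the tokenizer is a crude lowercase split, the class and method names are hypothetical, and the set of "relevant" documents is an invented judgement for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy illustration only; Lucene's real index structures are far more elaborate.
class InvertedIndexSketch {
    // term -> sorted ids of the documents containing that term
    final Map<String, Set<Integer>> postings = new TreeMap<>();

    // Crude stand-in for an analyzer: lowercase, then split on non-alphanumerics.
    void add(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, t -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Looks up a single term; returns an empty set when the term is unknown.
    Set<Integer> search(String term) {
        return postings.getOrDefault(term.toLowerCase(Locale.ROOT), new TreeSet<>());
    }

    // precision = relevant documents among the hits / all hits
    static double precision(Set<Integer> returned, Set<Integer> relevant) {
        return returned.isEmpty() ? 0.0 : (double) overlap(returned, relevant) / returned.size();
    }

    // recall = relevant documents among the hits / all relevant documents
    static double recall(Set<Integer> returned, Set<Integer> relevant) {
        return relevant.isEmpty() ? 0.0 : (double) overlap(returned, relevant) / relevant.size();
    }

    private static int overlap(Set<Integer> a, Set<Integer> b) {
        Set<Integer> common = new HashSet<>(a);
        common.retainAll(b);
        return common.size();
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "trying out Elastic Search");
        index.add(2, "Lucene is a search library");
        index.add(3, "Elastic and distributed");
        Set<Integer> hits = index.search("search");
        // Hypothetical relevance judgement: documents 2 and 3 are what the user wanted.
        Set<Integer> relevant = new HashSet<>(Arrays.asList(2, 3));
        System.out.println("hits=" + hits
                + " precision=" + precision(hits, relevant)
                + " recall=" + recall(hits, relevant));
    }
}
```

In this example the term "search" occurs in documents 1 and 2, and only document 2 is judged relevant, so one of two hits is relevant and one of two relevant documents is found.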
ElasticSearch Concepts
- Document: a JSON document with data
- Field: a string-based key-value pair
- Type: like a table in a relational database
- Index: like a database with multiple types
- Mapping: like a schema for a database
- Distributed nature: nodes, shards, replicas etc.
Refer to the ElasticSearch Glossary section for further details.
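As a rough illustration of the distributed nature, a document is assigned to one primary shard by hashing a routing value (by default derived from the document id) and taking it modulo the number of primary shards. The sketch below shows only that general idea. It substitutes Java's String.hashCode() for the hash function that ES actually uses, and the class name and shard count are assumptions.

```java
// Sketch only: real ElasticSearch uses its own hash function, not String.hashCode().
class ShardRoutingSketch {
    // Picks a primary shard: hash(routing value) mod number of primary shards.
    static int shardFor(String routing, int primaryShards) {
        // floorMod keeps the result in [0, primaryShards) even for negative hashes.
        return Math.floorMod(routing.hashCode(), primaryShards);
    }

    public static void main(String[] args) {
        int primaryShards = 5; // assumed shard count for the example
        for (String id : new String[] {"1", "2", "kimchy"}) {
            System.out.println("doc " + id + " -> shard " + shardFor(id, primaryShards));
        }
    }
}
```

Because the shard depends on the number of primary shards, the same id can land on a different shard if that number changes, which is one reason the primary shard count is fixed when an index is created.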
Hi,
Thank you for this nice tutorial. I am new to this kind of search; could you please provide the example as a small Maven project so we can test it in Eclipse?
thanks, your help is appreciated.
Can you please give the project link on GitHub?
Please check the tutorial on github,
https://github.com/jaibeermalik/elasticsearch-tutorial
Hello,
I want to highlight the search string in the text. Is there a straightforward way to do this using the ElasticSearch Java API?
I have a text description to search in and a keyword string to search for.
Regards
Krishan
Why do the cURL examples use port 9200, but the Java example uses port 9300?
Thanks…… lreeder
Here we are connecting to the ElasticSearch server over the TCP transport port, which is 9300 by default. When we connect over HTTP, the default port is 9200.
Good website to learn..
How to delete an index based on size?