Imported Upstream version 1.0.3 (upstream/1.0.3)

author: Hilko Bengen <bengen@debian.org> 2014-06-07 12:02:12 +0200
committer: Hilko Bengen <bengen@debian.org> 2014-06-07 12:02:12 +0200
commit: d5ed89b946297270ec28abf44bef2371a06f1f4f (patch)
tree: ce2d945e4dde69af90bd9905a70d8d27f4936776 /docs/reference/setup
download: elasticsearch-d5ed89b946297270ec28abf44bef2371a06f1f4f.tar.gz
5 files changed, 485 insertions, 0 deletions
diff --git a/docs/reference/setup/as-a-service-win.asciidoc b/docs/reference/setup/as-a-service-win.asciidoc
new file mode 100644
index 0000000..f77f194
--- /dev/null
+++ b/docs/reference/setup/as-a-service-win.asciidoc
@@ -0,0 +1,63 @@
+[[setup-service-win]]
+== Running as a Service on Windows
+
+Windows users can configure Elasticsearch to run as a service, so that it runs in the background or starts
+automatically at boot without any user interaction.
+This is done with the `service.bat` script in the `bin/` folder, which lets you install, remove, manage,
+or configure the service, and start or stop it, all from the command line.
+
+[source,sh]
+--------------------------------------------------
+c:\elasticsearch-0.90.5\bin>service
+
+Usage: service.bat install|remove|start|stop|manager [SERVICE_ID]
+--------------------------------------------------
+
+The script requires one parameter (the command to execute) followed by an optional one indicating the service
+id (useful when installing multiple Elasticsearch services).
+
+The commands available are:
+
+[horizontal]
+`install`:: Install Elasticsearch as a service
+
+`remove`:: Remove the installed Elasticsearch service (and stop the service if started)
+
+`start`:: Start the Elasticsearch service (if installed)
+
+`stop`:: Stop the Elasticsearch service (if started)
+
+`manager`:: Start a GUI for managing the installed service
+
+Note that the environment configuration options available during the installation are copied and will be used during
+the service lifecycle. This means any changes made to them after the installation will not be picked up unless
+the service is reinstalled.
+
+Based on the architecture of the available JDK/JRE (set through `JAVA_HOME`), the appropriate 64-bit (x64) or 32-bit (x86)
+service will be installed. This information is shown during installation:
+
+[source,sh]
+--------------------------------------------------
+c:\elasticsearch-0.90.5\bin>service install
+Installing service      :  "elasticsearch-service-x64"
+Using JAVA_HOME (64-bit):  "c:\jvm\jdk1.7"
+The service 'elasticsearch-service-x64' has been installed.
+--------------------------------------------------
+
+NOTE: While a JRE can be used for the Elasticsearch service, due to its use of a client VM (as opposed to a server JVM, which
+offers better performance for long-running applications) its usage is discouraged and a warning will be issued.
+
+[float]
+=== Customizing service settings
+
+There are two ways to customize the service settings:
+
+Manager GUI:: accessible through the `manager` command, the GUI offers insight into the installed service including its status, startup type,
+JVM, start and stop settings among other things. Simply invoking `service.bat` from the command-line with the aforementioned option
+will open up the manager window:
+
+image::images/service-manager-win.png["Windows Service Manager GUI",align="center"]
+
+Customizing `service.bat`:: at its core, `service.bat` relies on the http://commons.apache.org/proper/commons-daemon/[Apache Commons Daemon] project
+to install the services. For full flexibility, such as customizing the user under which the service runs, one can modify the installation
+parameters in the script accordingly. Do note that this requires reinstalling the service for the new settings to be applied.
diff --git a/docs/reference/setup/as-a-service.asciidoc b/docs/reference/setup/as-a-service.asciidoc
new file mode 100644
index 0000000..71bf760
--- /dev/null
+++ b/docs/reference/setup/as-a-service.asciidoc
@@ -0,0 +1,86 @@
+[[setup-service]]
+== Running as a Service on Linux
+
+The provided packages try to make it as easy as possible to run elasticsearch as a service on your operating system, including starting and stopping elasticsearch during reboots and upgrades.
+
+[float]
+=== Linux
+
+Currently our build automatically creates a Debian package and an RPM package, both of which are available on the download page. The packages themselves do not have any dependencies, but you have to make sure that a JDK is installed.
+
+Each package features a configuration file, which allows you to set the following parameters:
+
+[horizontal]
+`ES_USER`::               The user to run as, defaults to `elasticsearch`
+`ES_GROUP`::              The group to run as, defaults to `elasticsearch`
+`ES_HEAP_SIZE`::          The heap size to start with
+`ES_HEAP_NEWSIZE`::       The size of the new generation heap
+`ES_DIRECT_SIZE`::        The maximum size of the direct memory
+`MAX_OPEN_FILES`::        Maximum number of open files, defaults to `65535`
+`MAX_LOCKED_MEMORY`::     Maximum locked memory size. Set to "unlimited" if you use the bootstrap.mlockall option in elasticsearch.yml. You must also set ES_HEAP_SIZE.
+`MAX_MAP_COUNT`::         Maximum number of memory map areas a process may have. If you use `mmapfs` as index store type, make sure this is set to a high value. For more information, check the https://github.com/torvalds/linux/blob/master/Documentation/sysctl/vm.txt[linux kernel documentation] about `max_map_count`. This is set via `sysctl` before starting elasticsearch. Defaults to `65535`
+`LOG_DIR`::               Log directory, defaults to `/var/log/elasticsearch`
+`DATA_DIR`::              Data directory, defaults to `/var/lib/elasticsearch`
+`WORK_DIR`::              Work directory, defaults to `/tmp/elasticsearch`
+`CONF_DIR`::              Configuration file directory (which needs to include `elasticsearch.yml` and `logging.yml` files), defaults to `/etc/elasticsearch`
+`CONF_FILE`::             Path to configuration file, defaults to `/etc/elasticsearch/elasticsearch.yml`
+`ES_JAVA_OPTS`::          Any additional java options you may want to apply. This may be useful if you need to set the `node.name` property but do not want to change the `elasticsearch.yml` configuration file, because it is distributed via a provisioning system like puppet or chef. Example: `ES_JAVA_OPTS="-Des.node.name=search-01"`
+`RESTART_ON_UPGRADE`::    Configure restart on package upgrade, defaults to `false`. This means you will have to restart your elasticsearch instance manually after installing a package. The reason for this is to ensure that upgrades in a cluster do not result in continuous shard reallocation, which causes high network traffic and slows the response times of your cluster.
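+
+As a rough sketch, a configuration file that sets a few of these parameters might look like the following. The values are illustrative assumptions rather than recommendations; only the variable names come from the list above.
+
+[source,sh]
+--------------------------------------------------
+# Illustrative values only; adjust them for your hardware
+ES_HEAP_SIZE=2g
+MAX_OPEN_FILES=65535
+# Needed if bootstrap.mlockall is enabled in elasticsearch.yml
+MAX_LOCKED_MEMORY=unlimited
+ES_JAVA_OPTS="-Des.node.name=search-01"
+--------------------------------------------------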
+
+[float]
+==== Debian/Ubuntu
+
+The Debian package ships with everything you need, as it uses standard Debian tools like `update-rc.d` to define the runlevels it runs on. The init script is placed at `/etc/init.d/elasticsearch`, as you would expect. The configuration file is placed at `/etc/default/elasticsearch`.
+
+The Debian package does not start up the service by default. The reason for this is to prevent the instance from accidentally joining a cluster without being configured appropriately. After installing using `dpkg -i`, you can use the following commands to ensure that elasticsearch starts when the system is booted and then start up elasticsearch:
+
+[source,sh]
+--------------------------------------------------
+sudo update-rc.d elasticsearch defaults 95 10
+sudo /etc/init.d/elasticsearch start
+--------------------------------------------------
+
+[float]
+===== Installing the Oracle JDK
+
+The usual recommendation is to run the Oracle JDK with elasticsearch. However, Ubuntu and Debian only ship the OpenJDK due to license issues. You can easily install the Oracle installer package, though. In case you are missing the `add-apt-repository` command under Debian GNU/Linux, make sure you have at least Debian Wheezy and the package `python-software-properties` installed.
+
+[source,sh]
+--------------------------------------------------
+sudo add-apt-repository ppa:webupd8team/java
+sudo apt-get update
+sudo apt-get install oracle-java7-installer
+java -version
+--------------------------------------------------
+
+The last command should verify a successful installation of the Oracle JDK.
+
+
+[float]
+==== RPM based distributions
+
+[float]
+===== Using chkconfig
+
+Some RPM based distributions use `chkconfig` to enable and disable services. The init script is located at `/etc/init.d/elasticsearch`, whereas the configuration file is placed at `/etc/sysconfig/elasticsearch`. As with the Debian package, the service is not started by default after installation; you have to do this manually by entering the following commands:
+
+[source,sh]
+--------------------------------------------------
+sudo /sbin/chkconfig --add elasticsearch
+sudo service elasticsearch start
+--------------------------------------------------
+
+
+[float]
+===== Using systemd
+
+Distributions like SUSE do not use the `chkconfig` tool to register services, but rather `systemd` and its command `/bin/systemctl` to start and stop services (at least in newer versions, otherwise use the `chkconfig` commands above). The configuration file is also placed at `/etc/sysconfig/elasticsearch`. After installing the RPM, you have to reload the systemd configuration, enable the service, and then start up elasticsearch:
+
+[source,sh]
+--------------------------------------------------
+sudo /bin/systemctl daemon-reload
+sudo /bin/systemctl enable elasticsearch.service
+sudo /bin/systemctl start elasticsearch.service
+--------------------------------------------------
+
+Also note that changing the `MAX_MAP_COUNT` setting in `/etc/sysconfig/elasticsearch` does not have any effect; you will have to change it in `/usr/lib/sysctl.d/elasticsearch.conf` in order to have it applied at startup.
diff --git a/docs/reference/setup/configuration.asciidoc b/docs/reference/setup/configuration.asciidoc
new file mode 100644
index 0000000..885dd9f
--- /dev/null
+++ b/docs/reference/setup/configuration.asciidoc
@@ -0,0 +1,237 @@
+[[setup-configuration]]
+== Configuration
+
+[float]
+=== Environment Variables
+
+Within the scripts, Elasticsearch comes with built-in `JAVA_OPTS` that are
+passed to the JVM it starts. The most important settings are `-Xmx`, which
+controls the maximum memory allowed for the process, and `-Xms`, which
+controls the minimum memory allocated to the process (_in general, the
+more memory allocated to the process, the better_).
+
+Most times it is better to leave the default `JAVA_OPTS` as they are,
+and use the `ES_JAVA_OPTS` environment variable in order to set / change
+JVM settings or arguments.
+
+The `ES_HEAP_SIZE` environment variable sets the heap memory that will
+be allocated to the elasticsearch java process. It assigns the same
+value to both the min and max settings, though those can be set
+explicitly (not recommended) by setting `ES_MIN_MEM` (defaults to
+`256m`), and `ES_MAX_MEM` (defaults to `1gb`).
+
+It is recommended to set the min and max memory to the same value, and
+enable <<setup-configuration-memory,`mlockall`>>.
+
+[float]
+[[system]]
+=== System Configuration
+
+[float]
+[[file-descriptors]]
+==== File Descriptors
+
+Make sure to increase the number of open file descriptors on the
+machine (or for the user running elasticsearch). Setting it to 32k or
+even 64k is recommended.
+
+In order to test how many open files the process can open, start it with
+`-Des.max-open-files` set to `true`. This will print the number of open
+files the process can open on startup.
+
+Alternatively, you can retrieve the `max_file_descriptors` for each node
+using the <<cluster-nodes-info>> API, with:
+
+[source,sh]
+--------------------------------------------------
+curl 'localhost:9200/_nodes/process?pretty'
+--------------------------------------------------
+
+
+[float]
+[[setup-configuration-memory]]
+==== Memory Settings
+
+There is an option to use
+http://opengroup.org/onlinepubs/007908799/xsh/mlockall.html[mlockall] to
+try to lock the process address space so it won't be swapped. For this
+to work, the `bootstrap.mlockall` should be set to `true` and it is
+recommended to set both the min and max memory allocation to be the
+same. Note: This option is only available on Linux/Unix operating
+systems.
+
+In order to see if this works or not, set the `common.jna` logging to
+DEBUG level. A solution to "Unknown mlockall error 0" can be to set
+`ulimit -l unlimited`.
+
+Note, `mlockall` might cause the JVM or shell
+session to exit if it fails to allocate the memory (because not enough
+memory is available on the machine).
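+
+A minimal sketch of the corresponding `elasticsearch.yml` entry follows. It assumes the heap
+size is fixed separately, for example through `ES_HEAP_SIZE`, so that the min and max memory match:
+
+[source,yaml]
+--------------------------------------------------
+bootstrap:
+  mlockall: true
+--------------------------------------------------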
+
+[float]
+[[settings]]
+=== Elasticsearch Settings
+
+The *elasticsearch* configuration files can be found under the `ES_HOME/config`
+folder. The folder comes with two files: `elasticsearch.yml` for
+configuring the different Elasticsearch
+<<modules,modules>>, and `logging.yml` for
+configuring the Elasticsearch logging.
+
+The configuration format is http://www.yaml.org/[YAML]. Here is an
+example of changing the address all network based modules will use to
+bind and publish to:
+
+[source,yaml]
+--------------------------------------------------
+network :
+    host : 10.0.0.4
+--------------------------------------------------
+
+
+[float]
+[[paths]]
+==== Paths
+
+In production use, you will almost certainly want to change paths for
+data and log files:
+
+[source,yaml]
+--------------------------------------------------
+path:
+  logs: /var/log/elasticsearch
+  data: /var/data/elasticsearch
+--------------------------------------------------
+
+[float]
+[[cluster-name]]
+==== Cluster name
+
+Also, don't forget to give your production cluster a name, which is used
+to discover and auto-join other nodes:
+
+[source,yaml]
+--------------------------------------------------
+cluster:
+  name: <NAME OF YOUR CLUSTER>
+--------------------------------------------------
+
+[float]
+[[node-name]]
+==== Node name
+
+You may also want to change the default node name for each node to
+something like the display hostname. By default Elasticsearch will
+randomly pick a Marvel character name from a list of around 3000 names
+when your node starts up.
+
+[source,yaml]
+--------------------------------------------------
+node:
+  name: <NAME OF YOUR NODE>
+--------------------------------------------------
+
+[float]
+[[styles]]
+==== Configuration styles
+
+Internally, all settings are collapsed into "namespaced" settings. For
+example, the above gets collapsed into `node.name`. This means that
+it's easy to support other configuration formats, for example,
+http://www.json.org[JSON]. If JSON is a preferred configuration format,
+simply rename the `elasticsearch.yml` file to `elasticsearch.json` and
+add:
+
+[source,js]
+--------------------------------------------------
+{
+    "network" : {
+        "host" : "10.0.0.4"
+    }
+}
+--------------------------------------------------
+
+It also means that it's easy to provide the settings externally, either
+using `ES_JAVA_OPTS` or as parameters to the `elasticsearch`
+command, for example:
+
+[source,sh]
+--------------------------------------------------
+$ elasticsearch -Des.network.host=10.0.0.4
+--------------------------------------------------
+
+Another option is to use the `es.default.` prefix instead of the `es.` prefix,
+which means the value is used only as a default, that is, only if the setting
+is not explicitly set in the configuration file.
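+
+For instance, reusing the address from the earlier example (chosen purely for illustration), the
+following would apply `10.0.0.4` only when `network.host` is not set in the configuration file:
+
+[source,sh]
+--------------------------------------------------
+$ elasticsearch -Des.default.network.host=10.0.0.4
+--------------------------------------------------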
+
+Another option is to use the `${...}` notation within the configuration
+file which will resolve to an environment setting, for example:
+
+[source,js]
+--------------------------------------------------
+{
+    "network" : {
+        "host" : "${ES_NET_HOST}"
+    }
+}
+--------------------------------------------------
+
+The location of the configuration file can be set externally using a
+system property:
+
+[source,sh]
+--------------------------------------------------
+$ elasticsearch -Des.config=/path/to/config/file
+--------------------------------------------------
+
+[float]
+[[configuration-index-settings]]
+=== Index Settings
+
+Indices created within the cluster can provide their own settings. For
+example, the following creates an index with memory based storage
+instead of the default file system based one (the format can be either
+YAML or JSON):
+
+[source,sh]
+--------------------------------------------------
+$ curl -XPUT http://localhost:9200/kimchy/ -d \
+'
+index :
+    store:
+        type: memory
+'
+--------------------------------------------------
+
+Index level settings can be set on the node level as well, for example,
+within the `elasticsearch.yml` file, the following can be set:
+
+[source,yaml]
+--------------------------------------------------
+index :
+    store:
+        type: memory
+--------------------------------------------------
+
+This means that every index that gets created on the specific node
+started with the mentioned configuration will store the index in memory
+*unless the index explicitly sets it*. In other words, any index level
+settings override what is set in the node configuration. Of course, the
+above can also be set as a "collapsed" setting, for example:
+
+[source,sh]
+--------------------------------------------------
+$ elasticsearch -Des.index.store.type=memory
+--------------------------------------------------
+
+All of the index level configuration can be found within each
+<<index-modules,index module>>.
+
+[float]
+[[logging]]
+=== Logging
+
+Elasticsearch uses an internal logging abstraction and comes, out of the
+box, with http://logging.apache.org/log4j/[log4j]. It tries to simplify
+log4j configuration by using http://www.yaml.org/[YAML] to configure it;
+the logging configuration file is `config/logging.yml`.
diff --git a/docs/reference/setup/dir-layout.asciidoc b/docs/reference/setup/dir-layout.asciidoc
new file mode 100644
index 0000000..2971c45
--- /dev/null
+++ b/docs/reference/setup/dir-layout.asciidoc
@@ -0,0 +1,47 @@
+[[setup-dir-layout]]
+== Directory Layout
+
+The directory layout of an installation is as follows:
+
+[cols="<,<,<,<",options="header",]
+|=======================================================================
+|Type |Description |Default Location |Setting
+|*home* |Home of elasticsearch installation | | `path.home`
+
+|*bin* |Binary scripts including `elasticsearch` to start a node | `{path.home}/bin` |
+
+|*conf* |Configuration files including `elasticsearch.yml` |`{path.home}/config` |`path.conf`
+
+|*data* |The location of the data files of each index / shard allocated
+on the node. Can hold multiple locations. |`{path.home}/data`|`path.data`
+
+|*work* |Temporary files that are used by different nodes. |`{path.home}/work` |`path.work`
+
+|*logs* |Log files location |`{path.home}/logs` |`path.logs`
+
+|*plugins* |Plugin files location. Each plugin will be contained in a subdirectory. |`{path.home}/plugins` |`path.plugins`
+|=======================================================================
+
+Multiple data locations allow the data to be striped across them. The striping
+is simple: whole files are placed in one of the locations, and the location
+for each file is chosen based on the value of the `index.store.distributor` setting:
+
+* `least_used` (default) always selects the directory with the most
+available space
+* `random` selects directories at random. The probability of selecting
+a particular directory is proportional to the amount of available space in
+that directory.
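+
+As a sketch, the distributor could be selected in `elasticsearch.yml` like this, assuming it is
+set at the node level in the same way as other index settings (the choice of `random` is only illustrative):
+
+[source,yaml]
+--------------------------------------------------
+index.store.distributor: random
+--------------------------------------------------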
+
+Note that there are no multiple copies of the same data; in that respect, it's
+similar to RAID 0. Though simple, it should provide a good solution for
+people that don't want to mess with RAID. Here is how it is configured:
+
+---------------------------------
+path.data: /mnt/first,/mnt/second
+---------------------------------
+
+Or in array format:
+
+----------------------------------------
+path.data: ["/mnt/first", "/mnt/second"]
+----------------------------------------
diff --git a/docs/reference/setup/repositories.asciidoc b/docs/reference/setup/repositories.asciidoc
new file mode 100644
index 0000000..aa0ae30
--- /dev/null
+++ b/docs/reference/setup/repositories.asciidoc
@@ -0,0 +1,52 @@
+[[setup-repositories]]
+== Repositories
+
+We also have repositories available for APT and YUM based distributions.
+
+We have split the major versions into separate URLs to avoid accidental upgrades across major versions.
+For all 0.90.x releases use 0.90 as the version number, for 1.0.x use 1.0, etc.
+
+[float]
+=== APT
+
+Download and install the Public Signing Key:
+
+[source,sh]
+--------------------------------------------------
+wget -O - http://packages.elasticsearch.org/GPG-KEY-elasticsearch | apt-key add -
+--------------------------------------------------
+
+Add the following to your `/etc/apt/sources.list` to enable the repository:
+
+[source,sh]
+--------------------------------------------------
+deb http://packages.elasticsearch.org/elasticsearch/0.90/debian stable main
+--------------------------------------------------
+
+Run `apt-get update` and the repository is ready for use.
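+
+Following the versioning scheme described at the top of this page, the entry for the 1.0.x series
+would presumably use `1.0` in place of `0.90`. This line is derived from that rule rather than
+quoted from the package site:
+
+[source,sh]
+--------------------------------------------------
+deb http://packages.elasticsearch.org/elasticsearch/1.0/debian stable main
+--------------------------------------------------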
+
+
+[float]
+=== YUM
+
+Download and install the Public Signing Key:
+
+[source,sh]
+--------------------------------------------------
+rpm --import http://packages.elasticsearch.org/GPG-KEY-elasticsearch
+--------------------------------------------------
+
+Add the following to a file with a `.repo` suffix in your `/etc/yum.repos.d/` directory:
+
+[source,sh]
+--------------------------------------------------
+[elasticsearch-0.90]
+name=Elasticsearch repository for 0.90.x packages
+baseurl=http://packages.elasticsearch.org/elasticsearch/0.90/centos
+gpgcheck=1
+gpgkey=http://packages.elasticsearch.org/GPG-KEY-elasticsearch
+enabled=1
+--------------------------------------------------
+
+And your repository is ready for use.
+