Diffstat (limited to 'docs/reference/modules/scripting.asciidoc')
-rw-r--r-- docs/reference/modules/scripting.asciidoc 316
1 files changed, 316 insertions, 0 deletions
diff --git a/docs/reference/modules/scripting.asciidoc b/docs/reference/modules/scripting.asciidoc
new file mode 100644
index 0000000..166a030
--- /dev/null
+++ b/docs/reference/modules/scripting.asciidoc
@@ -0,0 +1,316 @@
+[[modules-scripting]]
+== Scripting
+
+The scripting module allows you to use scripts to evaluate custom
+expressions. For example, scripts can be used to return "script fields"
+as part of a search request, or to evaluate a custom score for a
+query.
+
+By default, the scripting module uses http://mvel.codehaus.org/[mvel],
+with some extensions, as the scripting language. mvel is used because it
+is fast and simple to use, and in most cases only simple expressions
+are needed (for example, mathematical equations).
+
+Additional `lang` plugins allow scripts to be executed in other
+languages. Currently supported plugins are `lang-javascript`
+for JavaScript, `lang-groovy` for Groovy, and `lang-python` for Python.
+Wherever a `script` parameter can be used, a `lang` parameter
+(on the same level) can be provided to define the language of the
+script. The `lang` options are `mvel`, `js`, `groovy`, `python`, and
+`native`.
+
+[float]
+=== Default Scripting Language
+
+The default scripting language (used when no `lang` parameter is
+provided) is `mvel`. To change it, set the `script.default_lang`
+setting to the appropriate language.
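+
+As a minimal sketch, assuming the `lang-groovy` plugin is installed on
+every node, the default could be switched to Groovy in
+`config/elasticsearch.yml`. The value `groovy` is only an illustration;
+any supported `lang` value could be used instead:
+
+[source,yaml]
+-----------------------------------
+# Scripts without an explicit lang parameter are treated as Groovy
+script.default_lang: groovy
+-----------------------------------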
+
+[float]
+=== Preloaded Scripts
+
+Scripts can always be provided as part of the relevant API, but they can
+also be preloaded by placing them under `config/scripts` and then
+referencing them by the script name (instead of providing the full
+script). This helps reduce the amount of data passed between the client
+and the nodes.
+
+The name of the script is derived from the hierarchy of directories it
+exists under, and the file name without the lang extension. For example,
+a script placed under `config/scripts/group1/group2/test.py` will be
+named `group1_group2_test`.
+
+[float]
+=== Disabling dynamic scripts
+
+We recommend running Elasticsearch behind an application or proxy,
+which protects Elasticsearch from the outside world. If users are
+allowed to run dynamic scripts (even in a search request), then they
+have the same access to your box as the user that Elasticsearch is
+running as.
+
+First, you should not run Elasticsearch as the `root` user, as this
+would allow a script to access or do *anything* on your server, without
+limitations. Second, you should not expose Elasticsearch directly to
+users, but instead have a proxy application in between. If you *do*
+intend to expose Elasticsearch directly to your users, then you have
+to decide whether you trust them enough to run scripts on your box or
+not. If not, then even if you have a proxy which only allows `GET`
+requests, you should disable dynamic scripting by adding the following
+setting to the `config/elasticsearch.yml` file on every node:
+
+[source,yaml]
+-----------------------------------
+script.disable_dynamic: true
+-----------------------------------
+
+This will still allow execution of named scripts provided in the config, or
+_native_ Java scripts registered through plugins; however, it will prevent
+users from running arbitrary scripts via the API.
+
+[float]
+=== Automatic Script Reloading
+
+The `config/scripts` directory is scanned periodically for changes.
+New and changed scripts are reloaded, and deleted scripts are removed
+from the preloaded scripts cache. The reload frequency can be specified
+using the `watcher.interval` setting, which defaults to `60s`.
+To disable script reloading completely, set `script.auto_reload_enabled`
+to `false`.
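+
+For example, a sketch of the corresponding `config/elasticsearch.yml`
+entries. The `30s` value is an arbitrary illustration, and the second
+setting is left commented out because it is an alternative to the first
+rather than a companion to it:
+
+[source,yaml]
+-----------------------------------
+# Scan config/scripts every 30 seconds instead of the default 60s
+watcher.interval: 30s
+
+# Alternatively, turn off automatic script reloading entirely
+#script.auto_reload_enabled: false
+-----------------------------------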
+
+[float]
+=== Native (Java) Scripts
+
+Even though `mvel` is pretty fast, native Java based scripts can be
+registered for faster execution.
+
+To provide a native script, implement a `NativeScriptFactory` that
+constructs the script to be executed. There are two main types of
+script: one that extends `AbstractExecutableScript` and one that
+extends `AbstractSearchScript` (probably the one most users will extend,
+with additional helper classes in `AbstractLongSearchScript`,
+`AbstractDoubleSearchScript`, and `AbstractFloatSearchScript`).
+
+Native scripts can be registered in two ways. The first is through
+settings: for example, setting `script.native.my.type` to
+`sample.MyNativeScriptFactory` registers a script named `my`. The
+second is from a plugin, by accessing `ScriptModule` and calling
+`registerScript` on it.
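+
+As a sketch, the settings-based registration would look like this in
+`config/elasticsearch.yml`, assuming a factory class named
+`sample.MyNativeScriptFactory` is available on the classpath:
+
+[source,yaml]
+-----------------------------------
+# Registers a native script named "my"
+script.native.my.type: sample.MyNativeScriptFactory
+-----------------------------------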
+
+Executing the script is done by specifying the `lang` as `native`, and
+the name of the script as the `script`.
+
+Note that the scripts need to be on the classpath of Elasticsearch. One
+simple way to do this is to create a directory under plugins (choose a
+descriptive name) and place the jar or class files there; they will be
+automatically loaded.
+
+[float]
+=== Score
+
+In all scripts that can be used in facets, the current doc score can be
+accessed using `doc.score`.
+
+[float]
+=== Computing scores based on terms in scripts
+
+See the <<modules-advanced-scripting, advanced scripting documentation>>.
+
+[float]
+=== Document Fields
+
+Most scripting revolves around the use of specific document field data.
+`doc['field_name']` can be used to access specific field data within
+a document (the document in question is usually derived from the context
+in which the script is used). Document fields are very fast to access
+since they end up being loaded into memory (all the relevant field
+values/tokens are loaded into memory).
+
+The following data can be extracted from a field:
+
+[cols="<,<",options="header",]
+|=======================================================================
+|Expression |Description
+|`doc['field_name'].value` |The native value of the field. For example,
+if it's a short type, it will be short.
+
+|`doc['field_name'].values` |The native array values of the field. For
+example, if it's a short type, it will be short[]. Remember, a field can
+have several values within a single doc. Returns an empty array if the
+field has no values.
+
+|`doc['field_name'].empty` |A boolean indicating if the field has no
+values within the doc.
+
+|`doc['field_name'].multiValued` |A boolean indicating that the field
+has several values within the corpus.
+
+|`doc['field_name'].lat` |The latitude of a geo point type.
+
+|`doc['field_name'].lon` |The longitude of a geo point type.
+
+|`doc['field_name'].lats` |The latitudes of a geo point type.
+
+|`doc['field_name'].lons` |The longitudes of a geo point type.
+
+|`doc['field_name'].distance(lat, lon)` |The `plane` distance (in meters)
+of this geo point field from the provided lat/lon.
+
+|`doc['field_name'].distanceWithDefault(lat, lon, default)` |The `plane` distance (in meters)
+of this geo point field from the provided lat/lon with a default value.
+
+|`doc['field_name'].distanceInMiles(lat, lon)` |The `plane` distance (in
+miles) of this geo point field from the provided lat/lon.
+
+|`doc['field_name'].distanceInMilesWithDefault(lat, lon, default)` |The `plane` distance (in
+miles) of this geo point field from the provided lat/lon with a default value.
+
+|`doc['field_name'].distanceInKm(lat, lon)` |The `plane` distance (in
+km) of this geo point field from the provided lat/lon.
+
+|`doc['field_name'].distanceInKmWithDefault(lat, lon, default)` |The `plane` distance (in
+km) of this geo point field from the provided lat/lon with a default value.
+
+|`doc['field_name'].arcDistance(lat, lon)` |The `arc` distance (in
+meters) of this geo point field from the provided lat/lon.
+
+|`doc['field_name'].arcDistanceWithDefault(lat, lon, default)` |The `arc` distance (in
+meters) of this geo point field from the provided lat/lon with a default value.
+
+|`doc['field_name'].arcDistanceInMiles(lat, lon)` |The `arc` distance (in
+miles) of this geo point field from the provided lat/lon.
+
+|`doc['field_name'].arcDistanceInMilesWithDefault(lat, lon, default)` |The `arc` distance (in
+miles) of this geo point field from the provided lat/lon with a default value.
+
+|`doc['field_name'].arcDistanceInKm(lat, lon)` |The `arc` distance (in
+km) of this geo point field from the provided lat/lon.
+
+|`doc['field_name'].arcDistanceInKmWithDefault(lat, lon, default)` |The `arc` distance (in
+km) of this geo point field from the provided lat/lon with a default value.
+
+|`doc['field_name'].factorDistance(lat, lon)` |The distance factor of this geo point field from the provided lat/lon.
+
+|`doc['field_name'].factorDistance(lat, lon, default)` |The distance factor of this geo point field from the provided lat/lon with a default value.
+
+
+|=======================================================================
+
+[float]
+=== Stored Fields
+
+Stored fields can also be accessed when executing a script. Note that
+they are much slower to access than document fields, but are not
+loaded into memory. They can be accessed using
+`_fields['my_field_name'].value` or `_fields['my_field_name'].values`.
+
+[float]
+=== Source Field
+
+The source field can also be accessed when executing a script. The
+source field is loaded per doc, parsed, and then provided to the script
+for evaluation. The `_source` forms the context under which the source
+field can be accessed, for example `_source.obj2.obj1.field3`.
+
+Accessing `_source` is much slower than using `_doc`, but the data is
+not loaded into memory. For a single field access, `_fields` may be
+faster than `_source` because of the extra overhead of potentially
+parsing large documents. However, `_source` may be faster if you access
+multiple fields or if the source has already been loaded for other
+purposes.
+
+
+[float]
+=== mvel Built In Functions
+
+There are several built-in functions that can be used within scripts.
+They include:
+
+[cols="<,<",options="header",]
+|=======================================================================
+|Function |Description
+|`time()` |The current time in milliseconds.
+
+|`sin(a)` |Returns the trigonometric sine of an angle.
+
+|`cos(a)` |Returns the trigonometric cosine of an angle.
+
+|`tan(a)` |Returns the trigonometric tangent of an angle.
+
+|`asin(a)` |Returns the arc sine of a value.
+
+|`acos(a)` |Returns the arc cosine of a value.
+
+|`atan(a)` |Returns the arc tangent of a value.
+
+|`toRadians(angdeg)` |Converts an angle measured in degrees to an
+approximately equivalent angle measured in radians.
+
+|`toDegrees(angrad)` |Converts an angle measured in radians to an
+approximately equivalent angle measured in degrees.
+
+|`exp(a)` |Returns Euler's number _e_ raised to the power of value.
+
+|`log(a)` |Returns the natural logarithm (base _e_) of a value.
+
+|`log10(a)` |Returns the base 10 logarithm of a value.
+
+|`sqrt(a)` |Returns the correctly rounded positive square root of a
+value.
+
+|`cbrt(a)` |Returns the cube root of a double value.
+
+|`IEEEremainder(f1, f2)` |Computes the remainder operation on two
+arguments as prescribed by the IEEE 754 standard.
+
+|`ceil(a)` |Returns the smallest (closest to negative infinity) value
+that is greater than or equal to the argument and is equal to a
+mathematical integer.
+
+|`floor(a)` |Returns the largest (closest to positive infinity) value
+that is less than or equal to the argument and is equal to a
+mathematical integer.
+
+|`rint(a)` |Returns the value that is closest in value to the argument
+and is equal to a mathematical integer.
+
+|`atan2(y, x)` |Returns the angle _theta_ from the conversion of
+rectangular coordinates (_x_, _y_) to polar coordinates (_r_, _theta_).
+
+|`pow(a, b)` |Returns the value of the first argument raised to the
+power of the second argument.
+
+|`round(a)` |Returns the closest _int_ to the argument.
+
+|`random()` |Returns a random _double_ value.
+
+|`abs(a)` |Returns the absolute value of a value.
+
+|`max(a, b)` |Returns the greater of two values.
+
+|`min(a, b)` |Returns the smaller of two values.
+
+|`ulp(d)` |Returns the size of an ulp of the argument.
+
+|`signum(d)` |Returns the signum function of the argument.
+
+|`sinh(x)` |Returns the hyperbolic sine of a value.
+
+|`cosh(x)` |Returns the hyperbolic cosine of a value.
+
+|`tanh(x)` |Returns the hyperbolic tangent of a value.
+
+|`hypot(x, y)` |Returns sqrt(_x_^2^ + _y_^2^) without intermediate
+overflow or underflow.
+|=======================================================================
+
+[float]
+=== Arithmetic precision in MVEL
+
+When dividing two numbers in MVEL based scripts, the engine adheres to
+the default behaviour of Java. This means that if you divide two
+integers (you might have configured the fields as integer in the
+mapping), the result will also be an integer. For example, if a
+calculation like `1/num` happens in your script and `num` is an
+integer with the value `8`, the result is `0` even though you were
+expecting `0.125`. You may need to enforce precision by explicitly
+using a double, as in `1.0/num`, to get the expected result.