author/committer: Hilko Bengen <bengen@debian.org>, 2014-06-07 12:02:12 +0200
commit d5ed89b946297270ec28abf44bef2371a06f1f4f
Imported Upstream version 1.0.3 (tag: upstream/1.0.3)
Diffstat (limited to 'docs/reference/mapping')
28 files changed, 2923 insertions, 0 deletions
diff --git a/docs/reference/mapping/conf-mappings.asciidoc b/docs/reference/mapping/conf-mappings.asciidoc new file mode 100644 index 0000000..e9bb3f9 --- /dev/null +++ b/docs/reference/mapping/conf-mappings.asciidoc @@ -0,0 +1,19 @@

[[mapping-conf-mappings]]
== Config Mappings

New mappings can be created using the <<indices-put-mapping,Put Mapping>> API. When a document is indexed into an index with no mapping associated with it, the <<mapping-dynamic-mapping,dynamic / default mapping>> feature kicks in and automatically creates a mapping definition for it.

Mappings can also be provided at the node level, meaning that each index created will automatically start with all the mappings defined within a certain location.

Mappings can be defined in files called `[mapping_name].json` and placed either under the `config/mappings/_default` location, or under `config/mappings/[index_name]` (for mappings that should be associated only with a specific index).

diff --git a/docs/reference/mapping/date-format.asciidoc b/docs/reference/mapping/date-format.asciidoc new file mode 100644 index 0000000..eada734 --- /dev/null +++ b/docs/reference/mapping/date-format.asciidoc @@ -0,0 +1,206 @@

[[mapping-date-format]]
== Date Format

In JSON documents, dates are represented as strings. Elasticsearch uses a set of pre-configured formats to recognize and convert them, but you can change the defaults by specifying the `format` option when defining a `date` type, or by specifying `dynamic_date_formats` in the `root object` mapping (which will be used unless explicitly overridden by a `date` type). Built-in formats are supported, as well as completely custom ones.

Date parsing uses http://joda-time.sourceforge.net/[Joda]. If no format is specified, the default parser is
http://joda-time.sourceforge.net/api-release/org/joda/time/format/ISODateTimeFormat.html#dateOptionalTimeParser()[ISODateTimeFormat.dateOptionalTimeParser].

An extension to the format allows defining several formats using the `||` separator. This makes it possible to accept less strict input: for example, the `yyyy/MM/dd HH:mm:ss||yyyy/MM/dd` format will parse both `yyyy/MM/dd HH:mm:ss` and `yyyy/MM/dd`. The first format also acts as the one used to convert back from milliseconds to a string representation.

[float]
[[date-math]]
=== Date Math

The `date` type supports date math expressions in queries and filters (mainly useful in a `range` query/filter).

The expression starts with an "anchor" date, which can be either `now` or a date string (in the applicable format) ending with `||`. It can then be followed by a math expression, supporting `+`, `-` and `/` (rounding). The supported units are `y` (year), `M` (month), `w` (week), `d` (day), `h` (hour), `m` (minute), and `s` (second).

Here are some samples: `now+1h`, `now+1h+1m`, `now+1h/d`, `2012-01-01||+1M/d`.

Note, when doing `range` type searches with an inclusive upper bound, the rounding will properly round to the ceiling instead of flooring.

To change this behavior, set `"mapping.date.round_ceil": false`.
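Putting this together, here is a minimal sketch of a `range` filter using date math (the `postDate` field name is illustrative):

[source,js]
--------------------------------------------------
{
    "query" : {
        "filtered" : {
            "query" : { "match_all" : {} },
            "filter" : {
                "range" : {
                    "postDate" : {
                        "gte" : "now-1M/d",
                        "lt" : "now/d"
                    }
                }
            }
        }
    }
}
--------------------------------------------------

Here `now-1M/d` means "one month ago, rounded down to the start of that day", so the filter matches the last month of documents up to, but not including, today.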
+ + +[float] +[[built-in]] +=== Built In Formats + +The following tables lists all the defaults ISO formats supported: + +[cols="<,<",options="header",] +|======================================================================= +|Name |Description +|`basic_date`|A basic formatter for a full date as four digit year, two +digit month of year, and two digit day of month (yyyyMMdd). + +|`basic_date_time`|A basic formatter that combines a basic date and time, +separated by a 'T' (yyyyMMdd'T'HHmmss.SSSZ). + +|`basic_date_time_no_millis`|A basic formatter that combines a basic date +and time without millis, separated by a 'T' (yyyyMMdd'T'HHmmssZ). + +|`basic_ordinal_date`|A formatter for a full ordinal date, using a four +digit year and three digit dayOfYear (yyyyDDD). + +|`basic_ordinal_date_time`|A formatter for a full ordinal date and time, +using a four digit year and three digit dayOfYear +(yyyyDDD'T'HHmmss.SSSZ). + +|`basic_ordinal_date_time_no_millis`|A formatter for a full ordinal date +and time without millis, using a four digit year and three digit +dayOfYear (yyyyDDD'T'HHmmssZ). + +|`basic_time`|A basic formatter for a two digit hour of day, two digit +minute of hour, two digit second of minute, three digit millis, and time +zone offset (HHmmss.SSSZ). + +|`basic_time_no_millis`|A basic formatter for a two digit hour of day, +two digit minute of hour, two digit second of minute, and time zone +offset (HHmmssZ). + +|`basic_t_time`|A basic formatter for a two digit hour of day, two digit +minute of hour, two digit second of minute, three digit millis, and time +zone off set prefixed by 'T' ('T'HHmmss.SSSZ). + +|`basic_t_time_no_millis`|A basic formatter for a two digit hour of day, +two digit minute of hour, two digit second of minute, and time zone +offset prefixed by 'T' ('T'HHmmssZ). + +|`basic_week_date`|A basic formatter for a full date as four digit +weekyear, two digit week of weekyear, and one digit day of week +(xxxx'W'wwe). + +|`basic_week_date_time`|A basic formatter that combines a basic weekyear +date and time, separated by a 'T' (xxxx'W'wwe'T'HHmmss.SSSZ). + +|`basic_week_date_time_no_millis`|A basic formatter that combines a basic +weekyear date and time without millis, separated by a 'T' +(xxxx'W'wwe'T'HHmmssZ). + +|`date`|A formatter for a full date as four digit year, two digit month +of year, and two digit day of month (yyyy-MM-dd). + +|`date_hour`|A formatter that combines a full date and two digit hour of +day. + +|`date_hour_minute`|A formatter that combines a full date, two digit hour +of day, and two digit minute of hour. + +|`date_hour_minute_second`|A formatter that combines a full date, two +digit hour of day, two digit minute of hour, and two digit second of +minute. + +|`date_hour_minute_second_fraction`|A formatter that combines a full +date, two digit hour of day, two digit minute of hour, two digit second +of minute, and three digit fraction of second +(yyyy-MM-dd'T'HH:mm:ss.SSS). + +|`date_hour_minute_second_millis`|A formatter that combines a full date, +two digit hour of day, two digit minute of hour, two digit second of +minute, and three digit fraction of second (yyyy-MM-dd'T'HH:mm:ss.SSS). + +|`date_optional_time`|a generic ISO datetime parser where the date is +mandatory and the time is optional. + +|`date_time`|A formatter that combines a full date and time, separated by +a 'T' (yyyy-MM-dd'T'HH:mm:ss.SSSZZ). + +|`date_time_no_millis`|A formatter that combines a full date and time +without millis, separated by a 'T' (yyyy-MM-dd'T'HH:mm:ssZZ). 
+ +|`hour`|A formatter for a two digit hour of day. + +|`hour_minute`|A formatter for a two digit hour of day and two digit +minute of hour. + +|`hour_minute_second`|A formatter for a two digit hour of day, two digit +minute of hour, and two digit second of minute. + +|`hour_minute_second_fraction`|A formatter for a two digit hour of day, +two digit minute of hour, two digit second of minute, and three digit +fraction of second (HH:mm:ss.SSS). + +|`hour_minute_second_millis`|A formatter for a two digit hour of day, two +digit minute of hour, two digit second of minute, and three digit +fraction of second (HH:mm:ss.SSS). + +|`ordinal_date`|A formatter for a full ordinal date, using a four digit +year and three digit dayOfYear (yyyy-DDD). + +|`ordinal_date_time`|A formatter for a full ordinal date and time, using +a four digit year and three digit dayOfYear (yyyy-DDD'T'HH:mm:ss.SSSZZ). + +|`ordinal_date_time_no_millis`|A formatter for a full ordinal date and +time without millis, using a four digit year and three digit dayOfYear +(yyyy-DDD'T'HH:mm:ssZZ). + +|`time`|A formatter for a two digit hour of day, two digit minute of +hour, two digit second of minute, three digit fraction of second, and +time zone offset (HH:mm:ss.SSSZZ). + +|`time_no_millis`|A formatter for a two digit hour of day, two digit +minute of hour, two digit second of minute, and time zone offset +(HH:mm:ssZZ). + +|`t_time`|A formatter for a two digit hour of day, two digit minute of +hour, two digit second of minute, three digit fraction of second, and +time zone offset prefixed by 'T' ('T'HH:mm:ss.SSSZZ). + +|`t_time_no_millis`|A formatter for a two digit hour of day, two digit +minute of hour, two digit second of minute, and time zone offset +prefixed by 'T' ('T'HH:mm:ssZZ). + +|`week_date`|A formatter for a full date as four digit weekyear, two +digit week of weekyear, and one digit day of week (xxxx-'W'ww-e). + +|`week_date_time`|A formatter that combines a full weekyear date and +time, separated by a 'T' (xxxx-'W'ww-e'T'HH:mm:ss.SSSZZ). + +|`weekDateTimeNoMillis`|A formatter that combines a full weekyear date +and time without millis, separated by a 'T' (xxxx-'W'ww-e'T'HH:mm:ssZZ). + +|`week_year`|A formatter for a four digit weekyear. + +|`weekyearWeek`|A formatter for a four digit weekyear and two digit week +of weekyear. + +|`weekyearWeekDay`|A formatter for a four digit weekyear, two digit week +of weekyear, and one digit day of week. + +|`year`|A formatter for a four digit year. + +|`year_month`|A formatter for a four digit year and two digit month of +year. + +|`year_month_day`|A formatter for a four digit year, two digit month of +year, and two digit day of month. +|======================================================================= + +[float] +[[custom]] +=== Custom Format + +Allows for a completely customizable date format explained +http://joda-time.sourceforge.net/api-release/org/joda/time/format/DateTimeFormat.html[here]. diff --git a/docs/reference/mapping/dynamic-mapping.asciidoc b/docs/reference/mapping/dynamic-mapping.asciidoc new file mode 100644 index 0000000..b10bced --- /dev/null +++ b/docs/reference/mapping/dynamic-mapping.asciidoc @@ -0,0 +1,65 @@ +[[mapping-dynamic-mapping]] +== Dynamic Mapping + +Default mappings allow to automatically apply generic mapping definition +to types that do not have mapping pre defined. 
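For example (a sketch; the index, type and field names are illustrative), indexing the following document into an index that has no mapping for the `tweet` type will create one automatically:

[source,js]
--------------------------------------------------
$ curl -XPUT 'http://localhost:9200/twitter/tweet/1' -d '
{
    "message" : "trying out dynamic mapping"
}
'
--------------------------------------------------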
This works because the <<mapping-object-type,object mapping>>, and in particular the <<mapping-root-object-type,root object mapping>>, allow for schema-less dynamic addition of unmapped fields.

The default mapping definition is a plain mapping definition embedded within the distribution:

[source,js]
--------------------------------------------------
{
    "_default_" : {
    }
}
--------------------------------------------------

Pretty short, no? Basically, everything is defaulted, especially the dynamic nature of the root object mapping. The default mapping definition can be overridden in several ways. The simplest is to define a file called `default-mapping.json` and place it under the `config` directory (which can be configured to exist in a different location). It can also be set explicitly using the `index.mapper.default_mapping_location` setting.

The dynamic creation of mappings for unmapped types can be completely disabled by setting `index.mapper.dynamic` to `false`.

The dynamic creation of fields within a type can be completely disabled by setting the `dynamic` property of the type to `strict`.

Here is a <<indices-put-mapping,Put Mapping>> example that disables dynamic field creation for a `tweet`:

[source,js]
--------------------------------------------------
$ curl -XPUT 'http://localhost:9200/twitter/tweet/_mapping' -d '
{
    "tweet" : {
        "dynamic": "strict",
        "properties" : {
            "message" : {"type" : "string", "store" : true }
        }
    }
}
'
--------------------------------------------------

Here is how the default <<mapping-date-format,date_formats>> used in the root and inner object types can be changed:

[source,js]
--------------------------------------------------
{
    "_default_" : {
        "date_formats" : ["yyyy-MM-dd", "dd-MM-yyyy", "date_optional_time"]
    }
}
--------------------------------------------------

diff --git a/docs/reference/mapping/fields.asciidoc b/docs/reference/mapping/fields.asciidoc new file mode 100644 index 0000000..a1f7e98 --- /dev/null +++ b/docs/reference/mapping/fields.asciidoc @@ -0,0 +1,33 @@

[[mapping-fields]]
== Fields

Each mapping has a number of fields associated with it which can be used to control how the document metadata (eg <<mapping-all-field>>) is indexed.

include::fields/uid-field.asciidoc[]

include::fields/id-field.asciidoc[]

include::fields/type-field.asciidoc[]

include::fields/source-field.asciidoc[]

include::fields/all-field.asciidoc[]

include::fields/analyzer-field.asciidoc[]

include::fields/boost-field.asciidoc[]

include::fields/parent-field.asciidoc[]

include::fields/routing-field.asciidoc[]

include::fields/index-field.asciidoc[]

include::fields/size-field.asciidoc[]

include::fields/timestamp-field.asciidoc[]

include::fields/ttl-field.asciidoc[]

diff --git a/docs/reference/mapping/fields/all-field.asciidoc b/docs/reference/mapping/fields/all-field.asciidoc new file mode 100644 index 0000000..65453ef --- /dev/null +++ b/docs/reference/mapping/fields/all-field.asciidoc @@ -0,0 +1,78 @@

[[mapping-all-field]]
=== `_all`

The idea of the `_all` field is that it includes the text of one or more other fields within the indexed document. It comes in especially handy for search requests, where we want to run a query against the content of a document without knowing which fields to search on. This comes at the expense of CPU cycles and index size.
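For example, a query can target the `_all` field directly instead of naming specific fields (a minimal sketch):

[source,js]
--------------------------------------------------
{
    "query" : {
        "match" : {
            "_all" : "quick brown fox"
        }
    }
}
--------------------------------------------------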
+ +The `_all` fields can be completely disabled. Explicit field mapping and +object mapping can be excluded / included in the `_all` field. By +default, it is enabled and all fields are included in it for ease of +use. + +When disabling the `_all` field, it is a good practice to set +`index.query.default_field` to a different value (for example, if you +have a main "message" field in your data, set it to `message`). + +One of the nice features of the `_all` field is that it takes into +account specific fields boost levels. Meaning that if a title field is +boosted more than content, the title (part) in the `_all` field will +mean more than the content (part) in the `_all` field. + +Here is a sample mapping: + +[source,js] +-------------------------------------------------- +{ + "person" : { + "_all" : {"enabled" : true}, + "properties" : { + "name" : { + "type" : "object", + "dynamic" : false, + "properties" : { + "first" : {"type" : "string", "store" : true , "include_in_all" : false}, + "last" : {"type" : "string", "index" : "not_analyzed"} + } + }, + "address" : { + "type" : "object", + "include_in_all" : false, + "properties" : { + "first" : { + "properties" : { + "location" : {"type" : "string", "store" : true, "index_name" : "firstLocation"} + } + }, + "last" : { + "properties" : { + "location" : {"type" : "string"} + } + } + } + }, + "simple1" : {"type" : "long", "include_in_all" : true}, + "simple2" : {"type" : "long", "include_in_all" : false} + } + } +} +-------------------------------------------------- + +The `_all` fields allows for `store`, `term_vector` and `analyzer` (with +specific `index_analyzer` and `search_analyzer`) to be set. + +[float] +[[highlighting]] +==== Highlighting + +For any field to allow +<<search-request-highlighting,highlighting>> it has +to be either stored or part of the `_source` field. By default `_all` +field does not qualify for either, so highlighting for it does not yield +any data. + +Although it is possible to `store` the `_all` field, it is basically an +aggregation of all fields, which means more data will be stored, and +highlighting it might produce strange results. diff --git a/docs/reference/mapping/fields/analyzer-field.asciidoc b/docs/reference/mapping/fields/analyzer-field.asciidoc new file mode 100644 index 0000000..30bb072 --- /dev/null +++ b/docs/reference/mapping/fields/analyzer-field.asciidoc @@ -0,0 +1,41 @@ +[[mapping-analyzer-field]] +=== `_analyzer` + +The `_analyzer` mapping allows to use a document field property as the +name of the analyzer that will be used to index the document. The +analyzer will be used for any field that does not explicitly defines an +`analyzer` or `index_analyzer` when indexing. + +Here is a simple mapping: + +[source,js] +-------------------------------------------------- +{ + "type1" : { + "_analyzer" : { + "path" : "my_field" + } + } +} +-------------------------------------------------- + +The above will use the value of the `my_field` to lookup an analyzer +registered under it. For example, indexing a the following doc: + +[source,js] +-------------------------------------------------- +{ + "my_field" : "whitespace" +} +-------------------------------------------------- + +Will cause the `whitespace` analyzer to be used as the index analyzer +for all fields without explicit analyzer setting. + +The default path value is `_analyzer`, so the analyzer can be driven for +a specific document by setting `_analyzer` field in it. 
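For example, with the default path, indexing a document like the following (a sketch) would use the built-in `keyword` analyzer for all of its fields that have no explicit analyzer setting:

[source,js]
--------------------------------------------------
{
    "_analyzer" : "keyword",
    "message" : "indexed with the keyword analyzer"
}
--------------------------------------------------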
If custom json +field name is needed, an explicit mapping with a different path should +be set. + +By default, the `_analyzer` field is indexed, it can be disabled by +settings `index` to `no` in the mapping. diff --git a/docs/reference/mapping/fields/boost-field.asciidoc b/docs/reference/mapping/fields/boost-field.asciidoc new file mode 100644 index 0000000..1d00845 --- /dev/null +++ b/docs/reference/mapping/fields/boost-field.asciidoc @@ -0,0 +1,72 @@ +[[mapping-boost-field]] +=== `_boost` + +deprecated[1.0.0.RC1,See <<function-score-instead-of-boost>>] + +Boosting is the process of enhancing the relevancy of a document or +field. Field level mapping allows to define explicit boost level on a +specific field. The boost field mapping (applied on the +<<mapping-root-object-type,root object>>) allows +to define a boost field mapping where *its content will control the +boost level of the document*. For example, consider the following +mapping: + +[source,js] +-------------------------------------------------- +{ + "tweet" : { + "_boost" : {"name" : "my_boost", "null_value" : 1.0} + } +} +-------------------------------------------------- + +The above mapping defines mapping for a field named `my_boost`. If the +`my_boost` field exists within the JSON document indexed, its value will +control the boost level of the document indexed. For example, the +following JSON document will be indexed with a boost value of `2.2`: + +[source,js] +-------------------------------------------------- +{ + "my_boost" : 2.2, + "message" : "This is a tweet!" +} +-------------------------------------------------- + +[[function-score-instead-of-boost]] +==== Function score instead of boost + +Support for document boosting via the `_boost` field has been removed +from Lucene and is deprecated in Elasticsearch as of v1.0.0.RC1. The +implementation in Lucene resulted in unpredictable result when +used with multiple fields or multi-value fields. + +Instead, the <<query-dsl-function-score-query>> can be used to achieve +the desired functionality by boosting each document by the value in +any field the document: + +[source,js] +-------------------------------------------------- +{ + "query": { + "function_score": { + "query": { <1> + "match": { + "title": "your main query" + } + }, + "functions": [{ + "script_score": { <2> + "script": "doc['my_boost_field'].value" + } + }], + "score_mode": "multiply" + } + } +} +-------------------------------------------------- +<1> The original query, now wrapped in a `function_score` query. +<2> This script returns the value in `my_boost_field`, which is then + multiplied by the query `_score` for each document. + + diff --git a/docs/reference/mapping/fields/id-field.asciidoc b/docs/reference/mapping/fields/id-field.asciidoc new file mode 100644 index 0000000..1adab49 --- /dev/null +++ b/docs/reference/mapping/fields/id-field.asciidoc @@ -0,0 +1,52 @@ +[[mapping-id-field]] +=== `_id` + +Each document indexed is associated with an id and a type. The `_id` +field can be used to index just the id, and possible also store it. By +default it is not indexed and not stored (thus, not created). + +Note, even though the `_id` is not indexed, all the APIs still work +(since they work with the `_uid` field), as well as fetching by ids +using `term`, `terms` or `prefix` queries/filters (including the +specific `ids` query/filter). 
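For example, even with `_id` neither indexed nor stored, a lookup like the following `ids` query keeps working (a minimal sketch; the type and values are illustrative):

[source,js]
--------------------------------------------------
{
    "query" : {
        "ids" : {
            "type" : "tweet",
            "values" : ["1", "4", "100"]
        }
    }
}
--------------------------------------------------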
+ +The `_id` field can be enabled to be indexed, and possibly stored, +using: + +[source,js] +-------------------------------------------------- +{ + "tweet" : { + "_id" : {"index": "not_analyzed", "store" : false } + } +} +-------------------------------------------------- + +The `_id` mapping can also be associated with a `path` that will be used +to extract the id from a different location in the source document. For +example, having the following mapping: + +[source,js] +-------------------------------------------------- +{ + "tweet" : { + "_id" : { + "path" : "post_id" + } + } +} +-------------------------------------------------- + +Will cause `1` to be used as the id for: + +[source,js] +-------------------------------------------------- +{ + "message" : "You know, for Search", + "post_id" : "1" +} +-------------------------------------------------- + +This does require an additional lightweight parsing step while indexing, +in order to extract the id to decide which shard the index operation +will be executed on. diff --git a/docs/reference/mapping/fields/index-field.asciidoc b/docs/reference/mapping/fields/index-field.asciidoc new file mode 100644 index 0000000..96a320b --- /dev/null +++ b/docs/reference/mapping/fields/index-field.asciidoc @@ -0,0 +1,15 @@ +[[mapping-index-field]] +=== `_index` + +The ability to store in a document the index it belongs to. By default +it is disabled, in order to enable it, the following mapping should be +defined: + +[source,js] +-------------------------------------------------- +{ + "tweet" : { + "_index" : { "enabled" : true } + } +} +-------------------------------------------------- diff --git a/docs/reference/mapping/fields/parent-field.asciidoc b/docs/reference/mapping/fields/parent-field.asciidoc new file mode 100644 index 0000000..3225b53 --- /dev/null +++ b/docs/reference/mapping/fields/parent-field.asciidoc @@ -0,0 +1,21 @@ +[[mapping-parent-field]] +=== `_parent` + +The parent field mapping is defined on a child mapping, and points to +the parent type this child relates to. For example, in case of a `blog` +type and a `blog_tag` type child document, the mapping for `blog_tag` +should be: + +[source,js] +-------------------------------------------------- +{ + "blog_tag" : { + "_parent" : { + "type" : "blog" + } + } +} +-------------------------------------------------- + +The mapping is automatically stored and indexed (meaning it can be +searched on using the `_parent` field notation). diff --git a/docs/reference/mapping/fields/routing-field.asciidoc b/docs/reference/mapping/fields/routing-field.asciidoc new file mode 100644 index 0000000..8ca2286 --- /dev/null +++ b/docs/reference/mapping/fields/routing-field.asciidoc @@ -0,0 +1,69 @@ +[[mapping-routing-field]] +=== `_routing` + +The routing field allows to control the `_routing` aspect when indexing +data and explicit routing control is required. + +[float] +==== store / index + +The first thing the `_routing` mapping does is to store the routing +value provided (`store` set to `false`) and index it (`index` set to +`not_analyzed`). The reason why the routing is stored by default is so +reindexing data will be possible if the routing value is completely +external and not part of the docs. + +[float] +==== required + +Another aspect of the `_routing` mapping is the ability to define it as +required by setting `required` to `true`. This is very important to set +when using routing features, as it allows different APIs to make use of +it. 
For example, an index operation will be rejected if no routing value +has been provided (or derived from the doc). A delete operation will be +broadcasted to all shards if no routing value is provided and `_routing` +is required. + +[float] +==== path + +The routing value can be provided as an external value when indexing +(and still stored as part of the document, in much the same way +`_source` is stored). But, it can also be automatically extracted from +the index doc based on a `path`. For example, having the following +mapping: + +[source,js] +-------------------------------------------------- +{ + "comment" : { + "_routing" : { + "required" : true, + "path" : "blog.post_id" + } + } +} +-------------------------------------------------- + +Will cause the following doc to be routed based on the `111222` value: + +[source,js] +-------------------------------------------------- +{ + "text" : "the comment text" + "blog" : { + "post_id" : "111222" + } +} +-------------------------------------------------- + +Note, using `path` without explicit routing value provided required an +additional (though quite fast) parsing phase. + +[float] +==== id uniqueness + +When indexing documents specifying a custom `_routing`, the uniqueness +of the `_id` is not guaranteed throughout all the shards that the index +is composed of. In fact, documents with the same `_id` might end up in +different shards if indexed with different `_routing` values. diff --git a/docs/reference/mapping/fields/size-field.asciidoc b/docs/reference/mapping/fields/size-field.asciidoc new file mode 100644 index 0000000..7abfd40 --- /dev/null +++ b/docs/reference/mapping/fields/size-field.asciidoc @@ -0,0 +1,26 @@ +[[mapping-size-field]] +=== `_size` + +The `_size` field allows to automatically index the size of the original +`_source` indexed. By default, it's disabled. In order to enable it, set +the mapping to: + +[source,js] +-------------------------------------------------- +{ + "tweet" : { + "_size" : {"enabled" : true} + } +} +-------------------------------------------------- + +In order to also store it, use: + +[source,js] +-------------------------------------------------- +{ + "tweet" : { + "_size" : {"enabled" : true, "store" : true } + } +} +-------------------------------------------------- diff --git a/docs/reference/mapping/fields/source-field.asciidoc b/docs/reference/mapping/fields/source-field.asciidoc new file mode 100644 index 0000000..22bb963 --- /dev/null +++ b/docs/reference/mapping/fields/source-field.asciidoc @@ -0,0 +1,41 @@ +[[mapping-source-field]] +=== `_source` + +The `_source` field is an automatically generated field that stores the +actual JSON that was used as the indexed document. It is not indexed +(searchable), just stored. When executing "fetch" requests, like +<<docs-get,get>> or +<<search-search,search>>, the `_source` field is +returned by default. + +Though very handy to have around, the source field does incur storage +overhead within the index. For this reason, it can be disabled. For +example: + +[source,js] +-------------------------------------------------- +{ + "tweet" : { + "_source" : {"enabled" : false} + } +} +-------------------------------------------------- + +[float] +[[include-exclude]] +==== Includes / Excludes + +Allow to specify paths in the source that would be included / excluded +when it's stored, supporting `*` as wildcard annotation. 
For example: + +[source,js] +-------------------------------------------------- +{ + "my_type" : { + "_source" : { + "includes" : ["path1.*", "path2.*"], + "excludes" : ["pat3.*"] + } + } +} +-------------------------------------------------- diff --git a/docs/reference/mapping/fields/timestamp-field.asciidoc b/docs/reference/mapping/fields/timestamp-field.asciidoc new file mode 100644 index 0000000..97bca8d --- /dev/null +++ b/docs/reference/mapping/fields/timestamp-field.asciidoc @@ -0,0 +1,82 @@ +[[mapping-timestamp-field]] +=== `_timestamp` + +The `_timestamp` field allows to automatically index the timestamp of a +document. It can be provided externally via the index request or in the +`_source`. If it is not provided externally it will be automatically set +to the date the document was processed by the indexing chain. + +[float] +==== enabled + +By default it is disabled, in order to enable it, the following mapping +should be defined: + +[source,js] +-------------------------------------------------- +{ + "tweet" : { + "_timestamp" : { "enabled" : true } + } +} +-------------------------------------------------- + +[float] +==== store / index + +By default the `_timestamp` field has `store` set to `false` and `index` +set to `not_analyzed`. It can be queried as a standard date field. + +[float] +==== path + +The `_timestamp` value can be provided as an external value when +indexing. But, it can also be automatically extracted from the document +to index based on a `path`. For example, having the following mapping: + +[source,js] +-------------------------------------------------- +{ + "tweet" : { + "_timestamp" : { + "enabled" : true, + "path" : "post_date" + } + } +} +-------------------------------------------------- + +Will cause `2009-11-15T14:12:12` to be used as the timestamp value for: + +[source,js] +-------------------------------------------------- +{ + "message" : "You know, for Search", + "post_date" : "2009-11-15T14:12:12" +} +-------------------------------------------------- + +Note, using `path` without explicit timestamp value provided require an +additional (though quite fast) parsing phase. + +[float] +==== format + +You can define the <<mapping-date-format,date +format>> used to parse the provided timestamp value. For example: + +[source,js] +-------------------------------------------------- +{ + "tweet" : { + "_timestamp" : { + "enabled" : true, + "path" : "post_date", + "format" : "YYYY-MM-dd" + } + } +} +-------------------------------------------------- + +Note, the default format is `dateOptionalTime`. The timestamp value will +first be parsed as a number and if it fails the format will be tried. diff --git a/docs/reference/mapping/fields/ttl-field.asciidoc b/docs/reference/mapping/fields/ttl-field.asciidoc new file mode 100644 index 0000000..d47aaca --- /dev/null +++ b/docs/reference/mapping/fields/ttl-field.asciidoc @@ -0,0 +1,70 @@ +[[mapping-ttl-field]] +=== `_ttl` + +A lot of documents naturally come with an expiration date. Documents can +therefore have a `_ttl` (time to live), which will cause the expired +documents to be deleted automatically. 
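For example, once `_ttl` is enabled (see below), a per-document expiration can be provided on the index request; here via the `ttl` URL parameter (a sketch; the index, type, id, and the assumption that the parameter accepts a duration like `1d` are illustrative):

[source,js]
--------------------------------------------------
$ curl -XPUT 'http://localhost:9200/twitter/tweet/1?ttl=1d' -d '
{
    "message" : "this tweet expires in one day"
}
'
--------------------------------------------------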
+ +[float] +==== enabled + +By default it is disabled, in order to enable it, the following mapping +should be defined: + +[source,js] +-------------------------------------------------- +{ + "tweet" : { + "_ttl" : { "enabled" : true } + } +} +-------------------------------------------------- + +[float] +==== store / index + +By default the `_ttl` field has `store` set to `true` and `index` set to +`not_analyzed`. Note that `index` property has to be set to +`not_analyzed` in order for the purge process to work. + +[float] +==== default + +You can provide a per index/type default `_ttl` value as follows: + +[source,js] +-------------------------------------------------- +{ + "tweet" : { + "_ttl" : { "enabled" : true, "default" : "1d" } + } +} +-------------------------------------------------- + +In this case, if you don't provide a `_ttl` value in your query or in +the `_source` all tweets will have a `_ttl` of one day. + +In case you do not specify a time unit like `d` (days), `m` (minutes), +`h` (hours), `ms` (milliseconds) or `w` (weeks), milliseconds is used as +default unit. + +If no `default` is set and no `_ttl` value is given then the document +has an infinite `_ttl` and will not expire. + +You can dynamically update the `default` value using the put mapping +API. It won't change the `_ttl` of already indexed documents but will be +used for future documents. + +[float] +==== Note on documents expiration + +Expired documents will be automatically deleted regularly. You can +dynamically set the `indices.ttl.interval` to fit your needs. The +default value is `60s`. + +The deletion orders are processed by bulk. You can set +`indices.ttl.bulk_size` to fit your needs. The default value is `10000`. + +Note that the expiration procedure handle versioning properly so if a +document is updated between the collection of documents to expire and +the delete order, the document won't be deleted. diff --git a/docs/reference/mapping/fields/type-field.asciidoc b/docs/reference/mapping/fields/type-field.asciidoc new file mode 100644 index 0000000..bac7457 --- /dev/null +++ b/docs/reference/mapping/fields/type-field.asciidoc @@ -0,0 +1,31 @@ +[[mapping-type-field]] +=== Type Field + +Each document indexed is associated with an id and a type. The type, +when indexing, is automatically indexed into a `_type` field. By +default, the `_type` field is indexed (but *not* analyzed) and not +stored. This means that the `_type` field can be queried. + +The `_type` field can be stored as well, for example: + +[source,js] +-------------------------------------------------- +{ + "tweet" : { + "_type" : {"store" : true} + } +} +-------------------------------------------------- + +The `_type` field can also not be indexed, and all the APIs will still +work except for specific queries (term queries / filters) or faceting +done on the `_type` field. 
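For example, a search can be restricted to a single type with a `term` filter on `_type` (a minimal sketch):

[source,js]
--------------------------------------------------
{
    "query" : {
        "filtered" : {
            "query" : { "match_all" : {} },
            "filter" : { "term" : { "_type" : "tweet" } }
        }
    }
}
--------------------------------------------------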
+ +[source,js] +-------------------------------------------------- +{ + "tweet" : { + "_type" : {"index" : "no"} + } +} +-------------------------------------------------- diff --git a/docs/reference/mapping/fields/uid-field.asciidoc b/docs/reference/mapping/fields/uid-field.asciidoc new file mode 100644 index 0000000..f9ce245 --- /dev/null +++ b/docs/reference/mapping/fields/uid-field.asciidoc @@ -0,0 +1,11 @@ +[[mapping-uid-field]] +=== `_uid` + +Each document indexed is associated with an id and a type, the internal +`_uid` field is the unique identifier of a document within an index and +is composed of the type and the id (meaning that different types can +have the same id and still maintain uniqueness). + +The `_uid` field is automatically used when `_type` is not indexed to +perform type based filtering, and does not require the `_id` to be +indexed. diff --git a/docs/reference/mapping/meta.asciidoc b/docs/reference/mapping/meta.asciidoc new file mode 100644 index 0000000..5cb0c14 --- /dev/null +++ b/docs/reference/mapping/meta.asciidoc @@ -0,0 +1,25 @@ +[[mapping-meta]] +== Meta + +Each mapping can have custom meta data associated with it. These are +simple storage elements that are simply persisted along with the mapping +and can be retrieved when fetching the mapping definition. The meta is +defined under the `_meta` element, for example: + +[source,js] +-------------------------------------------------- +{ + "tweet" : { + "_meta" : { + "attr1" : "value1", + "attr2" : { + "attr3" : "value3" + } + } + } +} +-------------------------------------------------- + +Meta can be handy for example for client libraries that perform +serialization and deserialization to store its meta model (for example, +the class the document maps to). diff --git a/docs/reference/mapping/types.asciidoc b/docs/reference/mapping/types.asciidoc new file mode 100644 index 0000000..0cc967e --- /dev/null +++ b/docs/reference/mapping/types.asciidoc @@ -0,0 +1,24 @@ +[[mapping-types]] +== Types + +The datatype for each field in a document (eg strings, numbers, +objects etc) can be controlled via the type mapping. + +include::types/core-types.asciidoc[] + +include::types/array-type.asciidoc[] + +include::types/object-type.asciidoc[] + +include::types/root-object-type.asciidoc[] + +include::types/nested-type.asciidoc[] + +include::types/ip-type.asciidoc[] + +include::types/geo-point-type.asciidoc[] + +include::types/geo-shape-type.asciidoc[] + +include::types/attachment-type.asciidoc[] + diff --git a/docs/reference/mapping/types/array-type.asciidoc b/docs/reference/mapping/types/array-type.asciidoc new file mode 100644 index 0000000..3f887b1 --- /dev/null +++ b/docs/reference/mapping/types/array-type.asciidoc @@ -0,0 +1,74 @@ +[[mapping-array-type]] +=== Array Type + +JSON documents allow to define an array (list) of fields or objects. +Mapping array types could not be simpler since arrays gets automatically +detected and mapping them can be done either with +<<mapping-core-types,Core Types>> or +<<mapping-object-type,Object Type>> mappings. 
+For example, the following JSON defines several arrays: + +[source,js] +-------------------------------------------------- +{ + "tweet" : { + "message" : "some arrays in this tweet...", + "tags" : ["elasticsearch", "wow"], + "lists" : [ + { + "name" : "prog_list", + "description" : "programming list" + }, + { + "name" : "cool_list", + "description" : "cool stuff list" + } + ] + } +} +-------------------------------------------------- + +The above JSON has the `tags` property defining a list of a simple +`string` type, and the `lists` property is an `object` type array. Here +is a sample explicit mapping: + +[source,js] +-------------------------------------------------- +{ + "tweet" : { + "properties" : { + "message" : {"type" : "string"}, + "tags" : {"type" : "string", "index_name" : "tag"}, + "lists" : { + "properties" : { + "name" : {"type" : "string"}, + "description" : {"type" : "string"} + } + } + } + } +} +-------------------------------------------------- + +The fact that array types are automatically supported can be shown by +the fact that the following JSON document is perfectly fine: + +[source,js] +-------------------------------------------------- +{ + "tweet" : { + "message" : "some arrays in this tweet...", + "tags" : "elasticsearch", + "lists" : { + "name" : "prog_list", + "description" : "programming list" + } + } +} +-------------------------------------------------- + +Note also, that thanks to the fact that we used the `index_name` to use +the non plural form (`tag` instead of `tags`), we can actually refer to +the field using the `index_name` as well. For example, we can execute a +query using `tweet.tags:wow` or `tweet.tag:wow`. We could, of course, +name the field as `tag` and skip the `index_name` all together). diff --git a/docs/reference/mapping/types/attachment-type.asciidoc b/docs/reference/mapping/types/attachment-type.asciidoc new file mode 100644 index 0000000..54f9701 --- /dev/null +++ b/docs/reference/mapping/types/attachment-type.asciidoc @@ -0,0 +1,90 @@ +[[mapping-attachment-type]] +=== Attachment Type + +The `attachment` type allows to index different "attachment" type field +(encoded as `base64`), for example, Microsoft Office formats, open +document formats, ePub, HTML, and so on (full list can be found +http://lucene.apache.org/tika/0.10/formats.html[here]). + +The `attachment` type is provided as a +https://github.com/elasticsearch/elasticsearch-mapper-attachments[plugin +extension]. The plugin is a simple zip file that can be downloaded and +placed under `$ES_HOME/plugins` location. It will be automatically +detected and the `attachment` type will be added. + +Note, the `attachment` type is experimental. + +Using the attachment type is simple, in your mapping JSON, simply set a +certain JSON element as attachment, for example: + +[source,js] +-------------------------------------------------- +{ + "person" : { + "properties" : { + "my_attachment" : { "type" : "attachment" } + } + } +} +-------------------------------------------------- + +In this case, the JSON to index can be: + +[source,js] +-------------------------------------------------- +{ + "my_attachment" : "... base64 encoded attachment ..." 
}
--------------------------------------------------

Or it is possible to use a more elaborate JSON object if the content type or resource name needs to be set explicitly:

[source,js]
--------------------------------------------------
{
    "my_attachment" : {
        "_content_type" : "application/pdf",
        "_name" : "resource/name/of/my.pdf",
        "content" : "... base64 encoded attachment ..."
    }
}
--------------------------------------------------

The `attachment` type not only indexes the content of the doc, but also automatically adds metadata on the attachment (when available). The supported metadata fields are `date`, `title`, `author`, and `keywords`. They can be queried using "dot notation", for example: `my_attachment.author`.

Both the metadata and the actual content are simple core type mappers (string, date, ...), so they can be controlled in the mappings. For example:

[source,js]
--------------------------------------------------
{
    "person" : {
        "properties" : {
            "file" : {
                "type" : "attachment",
                "fields" : {
                    "file" : {"index" : "no"},
                    "date" : {"store" : true},
                    "author" : {"analyzer" : "myAnalyzer"}
                }
            }
        }
    }
}
--------------------------------------------------

In the above example, the actual content is mapped under the `fields` name `file`, and we decide not to index it, so it will only be available in the `_all` field. The other fields map to their respective metadata names, but there is no need to specify the `type` (like `string` or `date`) since it is already known.

The plugin uses http://lucene.apache.org/tika/[Apache Tika] to parse attachments, so many formats are supported, listed http://lucene.apache.org/tika/0.10/formats.html[here].

diff --git a/docs/reference/mapping/types/core-types.asciidoc b/docs/reference/mapping/types/core-types.asciidoc new file mode 100644 index 0000000..90ec792 --- /dev/null +++ b/docs/reference/mapping/types/core-types.asciidoc @@ -0,0 +1,754 @@

[[mapping-core-types]]
=== Core Types

Each JSON field can be mapped to a specific core type. JSON itself already provides us with some typing, with its support for `string`, `integer`/`long`, `float`/`double`, `boolean`, and `null`.

The following sample tweet JSON document will be used to explain the core types:

[source,js]
--------------------------------------------------
{
    "tweet" : {
        "user" : "kimchy",
        "message" : "This is a tweet!",
        "postDate" : "2009-11-15T14:12:12",
        "priority" : 4,
        "rank" : 12.3
    }
}
--------------------------------------------------

An explicit mapping for the above JSON tweet can be:

[source,js]
--------------------------------------------------
{
    "tweet" : {
        "properties" : {
            "user" : {"type" : "string", "index" : "not_analyzed"},
            "message" : {"type" : "string", "null_value" : "na"},
            "postDate" : {"type" : "date"},
            "priority" : {"type" : "integer"},
            "rank" : {"type" : "float"}
        }
    }
}
--------------------------------------------------

[float]
[[string]]
==== String

The text-based `string` type is the most basic type, and contains one or more characters.
An example mapping can be: + +[source,js] +-------------------------------------------------- +{ + "tweet" : { + "properties" : { + "message" : { + "type" : "string", + "store" : true, + "index" : "analyzed", + "null_value" : "na" + }, + "user" : { + "type" : "string", + "index" : "not_analyzed", + "norms" : { + "enabled" : false + } + } + } + } +} +-------------------------------------------------- + +The above mapping defines a `string` `message` property/field within the +`tweet` type. The field is stored in the index (so it can later be +retrieved using selective loading when searching), and it gets analyzed +(broken down into searchable terms). If the message has a `null` value, +then the value that will be stored is `na`. There is also a `string` `user` +which is indexed as-is (not broken down into tokens) and has norms +disabled (so that matching this field is a binary decision, no match is +better than another one). + +The following table lists all the attributes that can be used with the +`string` type: + +[cols="<,<",options="header",] +|======================================================================= +|Attribute |Description +|`index_name` |The name of the field that will be stored in the index. +Defaults to the property/field name. + +|`store` |Set to `true` to store actual field in the index, `false` to not +store it. Defaults to `false` (note, the JSON document itself is stored, +and it can be retrieved from it). + +|`index` |Set to `analyzed` for the field to be indexed and searchable +after being broken down into token using an analyzer. `not_analyzed` +means that its still searchable, but does not go through any analysis +process or broken down into tokens. `no` means that it won't be +searchable at all (as an individual field; it may still be included in +`_all`). Setting to `no` disables `include_in_all`. Defaults to +`analyzed`. + +|`doc_values` |Set to `true` to store field values in a column-stride fashion. +Automatically set to `true` when the fielddata format is `doc_values`. + +|`term_vector` |Possible values are `no`, `yes`, `with_offsets`, +`with_positions`, `with_positions_offsets`. Defaults to `no`. + +|`boost` |The boost value. Defaults to `1.0`. + +|`null_value` |When there is a (JSON) null value for the field, use the +`null_value` as the field value. Defaults to not adding the field at +all. + +|`norms.enabled` |Boolean value if norms should be enabled or not. Defaults +to `true` for `analyzed` fields, and to `false` for `not_analyzed` fields. + +|`norms.loading` |Describes how norms should be loaded, possible values are +`eager` and `lazy` (default). It is possible to change the default value to +eager for all fields by configuring the index setting `index.norms.loading` +to `eager`. + +|`index_options` | Allows to set the indexing +options, possible values are `docs` (only doc numbers are indexed), +`freqs` (doc numbers and term frequencies), and `positions` (doc +numbers, term frequencies and positions). Defaults to `positions` for +`analyzed` fields, and to `docs` for `not_analyzed` fields. It +is also possible to set it to `offsets` (doc numbers, term +frequencies, positions and offsets). + +|`analyzer` |The analyzer used to analyze the text contents when +`analyzed` during indexing and when searching using a query string. +Defaults to the globally configured analyzer. + +|`index_analyzer` |The analyzer used to analyze the text contents when +`analyzed` during indexing. 
+ +|`search_analyzer` |The analyzer used to analyze the field when part of +a query string. Can be updated on an existing field. + +|`include_in_all` |Should the field be included in the `_all` field (if +enabled). If `index` is set to `no` this defaults to `false`, otherwise, +defaults to `true` or to the parent `object` type setting. + +|`ignore_above` |The analyzer will ignore strings larger than this size. +Useful for generic `not_analyzed` fields that should ignore long text. + +|`position_offset_gap` |Position increment gap between field instances +with the same field name. Defaults to 0. +|======================================================================= + +The `string` type also support custom indexing parameters associated +with the indexed value. For example: + +[source,js] +-------------------------------------------------- +{ + "message" : { + "_value": "boosted value", + "_boost": 2.0 + } +} +-------------------------------------------------- + +The mapping is required to disambiguate the meaning of the document. +Otherwise, the structure would interpret "message" as a value of type +"object". The key `_value` (or `value`) in the inner document specifies +the real string content that should eventually be indexed. The `_boost` +(or `boost`) key specifies the per field document boost (here 2.0). + +[float] +[[number]] +==== Number + +A number based type supporting `float`, `double`, `byte`, `short`, +`integer`, and `long`. It uses specific constructs within Lucene in +order to support numeric values. The number types have the same ranges +as corresponding +http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html[Java +types]. An example mapping can be: + +[source,js] +-------------------------------------------------- +{ + "tweet" : { + "properties" : { + "rank" : { + "type" : "float", + "null_value" : 1.0 + } + } + } +} +-------------------------------------------------- + +The following table lists all the attributes that can be used with a +numbered type: + +[cols="<,<",options="header",] +|======================================================================= +|Attribute |Description +|`type` |The type of the number. Can be `float`, `double`, `integer`, +`long`, `short`, `byte`. Required. + +|`index_name` |The name of the field that will be stored in the index. +Defaults to the property/field name. + +|`store` |Set to `true` to store actual field in the index, `false` to not +store it. Defaults to `false` (note, the JSON document itself is stored, +and it can be retrieved from it). + +|`index` |Set to `no` if the value should not be indexed. Setting to +`no` disables `include_in_all`. If set to `no` the field can be stored +in `_source`, have `include_in_all` enabled, or `store` should be set to +`true` for this to be useful. + +|`doc_values` |Set to `true` to store field values in a column-stride fashion. +Automatically set to `true` when the fielddata format is `doc_values`. + +|`precision_step` |The precision step (number of terms generated for +each number value). Defaults to `4`. + +|`boost` |The boost value. Defaults to `1.0`. + +|`null_value` |When there is a (JSON) null value for the field, use the +`null_value` as the field value. Defaults to not adding the field at +all. + +|`include_in_all` |Should the field be included in the `_all` field (if +enabled). If `index` is set to `no` this defaults to `false`, otherwise, +defaults to `true` or to the parent `object` type setting. + +|`ignore_malformed` |Ignored a malformed number. 
Defaults to `false`. + +|`coerce` |Try convert strings to numbers and truncate fractions for integers. Defaults to `true`. + +|======================================================================= + +[float] +[[token_count]] +==== Token Count +The `token_count` type maps to the JSON string type but indexes and stores +the number of tokens in the string rather than the string itself. For +example: + +[source,js] +-------------------------------------------------- +{ + "tweet" : { + "properties" : { + "name" : { + "type" : "string", + "fields" : { + "word_count": { + "type" : "token_count", + "store" : "yes", + "analyzer" : "standard" + } + } + } + } + } +} +-------------------------------------------------- + +All the configuration that can be specified for a number can be specified +for a token_count. The only extra configuration is the required +`analyzer` field which specifies which analyzer to use to break the string +into tokens. For best performance, use an analyzer with no token filters. + +[NOTE] +=================================================================== +Technically the `token_count` type sums position increments rather than +counting tokens. This means that even if the analyzer filters out stop +words they are included in the count. +=================================================================== + +[float] +[[date]] +==== Date + +The date type is a special type which maps to JSON string type. It +follows a specific format that can be explicitly set. All dates are +`UTC`. Internally, a date maps to a number type `long`, with the added +parsing stage from string to long and from long to string. An example +mapping: + +[source,js] +-------------------------------------------------- +{ + "tweet" : { + "properties" : { + "postDate" : { + "type" : "date", + "format" : "YYYY-MM-dd" + } + } + } +} +-------------------------------------------------- + +The date type will also accept a long number representing UTC +milliseconds since the epoch, regardless of the format it can handle. + +The following table lists all the attributes that can be used with a +date type: + +[cols="<,<",options="header",] +|======================================================================= +|Attribute |Description +|`index_name` |The name of the field that will be stored in the index. +Defaults to the property/field name. + +|`format` |The <<mapping-date-format,date +format>>. Defaults to `dateOptionalTime`. + +|`store` |Set to `true` to store actual field in the index, `false` to not +store it. Defaults to `false` (note, the JSON document itself is stored, +and it can be retrieved from it). + +|`index` |Set to `no` if the value should not be indexed. Setting to +`no` disables `include_in_all`. If set to `no` the field can be stored +in `_source`, have `include_in_all` enabled, or `store` should be set to +`true` for this to be useful. + +|`doc_values` |Set to `true` to store field values in a column-stride fashion. +Automatically set to `true` when the fielddata format is `doc_values`. + +|`precision_step` |The precision step (number of terms generated for +each number value). Defaults to `4`. + +|`boost` |The boost value. Defaults to `1.0`. + +|`null_value` |When there is a (JSON) null value for the field, use the +`null_value` as the field value. Defaults to not adding the field at +all. + +|`include_in_all` |Should the field be included in the `_all` field (if +enabled). 
If `index` is set to `no` this defaults to `false`, otherwise, +defaults to `true` or to the parent `object` type setting. + +|`ignore_malformed` |Ignored a malformed number. Defaults to `false`. + +|======================================================================= + +[float] +[[boolean]] +==== Boolean + +The boolean type Maps to the JSON boolean type. It ends up storing +within the index either `T` or `F`, with automatic translation to `true` +and `false` respectively. + +[source,js] +-------------------------------------------------- +{ + "tweet" : { + "properties" : { + "hes_my_special_tweet" : { + "type" : "boolean", + } + } + } +} +-------------------------------------------------- + +The boolean type also supports passing the value as a number (in this +case `0` is `false`, all other values are `true`). + +The following table lists all the attributes that can be used with the +boolean type: + +[cols="<,<",options="header",] +|======================================================================= +|Attribute |Description +|`index_name` |The name of the field that will be stored in the index. +Defaults to the property/field name. + +|`store` |Set to `true` to store actual field in the index, `false` to not +store it. Defaults to `false` (note, the JSON document itself is stored, +and it can be retrieved from it). + +|`index` |Set to `no` if the value should not be indexed. Setting to +`no` disables `include_in_all`. If set to `no` the field can be stored +in `_source`, have `include_in_all` enabled, or `store` should be set to +`true` for this to be useful. + +|`boost` |The boost value. Defaults to `1.0`. + +|`null_value` |When there is a (JSON) null value for the field, use the +`null_value` as the field value. Defaults to not adding the field at +all. + +|`include_in_all` |Should the field be included in the `_all` field (if +enabled). If `index` is set to `no` this defaults to `false`, otherwise, +defaults to `true` or to the parent `object` type setting. +|======================================================================= + +[float] +[[binary]] +==== Binary + +The binary type is a base64 representation of binary data that can be +stored in the index. The field is not stored by default and not indexed at +all. + +[source,js] +-------------------------------------------------- +{ + "tweet" : { + "properties" : { + "image" : { + "type" : "binary", + } + } + } +} +-------------------------------------------------- + +The following table lists all the attributes that can be used with the +binary type: + +[cols="<,<",options="header",] +|======================================================================= +|Attribute |Description +|`index_name` |The name of the field that will be stored in the index. +Defaults to the property/field name. +|`store` |Set to `true` to store actual field in the index, `false` to not +store it. Defaults to `false` (note, the JSON document itself is stored, +and it can be retrieved from it). +|======================================================================= + +[float] +[[fielddata-filters]] +==== Fielddata filters + +It is possible to control which field values are loaded into memory, +which is particularly useful for faceting on string fields, using +fielddata filters, which are explained in detail in the +<<index-modules-fielddata,Fielddata>> section. 
[float]
[[fielddata-filters]]
==== Fielddata filters

It is possible to control which field values are loaded into memory,
which is particularly useful for faceting on string fields, using
fielddata filters, which are explained in detail in the
<<index-modules-fielddata,Fielddata>> section.

Fielddata filters can exclude terms which do not match a regex, or which
don't fall between a `min` and `max` frequency range:

[source,js]
--------------------------------------------------
{
    "tweet" : {
        "type" : "string",
        "analyzer" : "whitespace",
        "fielddata" : {
            "filter" : {
                "regex" : {
                    "pattern" : "^#.*"
                },
                "frequency" : {
                    "min" : 0.001,
                    "max" : 0.1,
                    "min_segment_size" : 500
                }
            }
        }
    }
}
--------------------------------------------------

These filters can be updated on an existing field mapping and will take
effect the next time the fielddata for a segment is loaded. Use the
<<indices-clearcache,Clear Cache>> API
to reload the fielddata using the new filters.

[float]
[[postings]]
==== Postings format

Postings formats define how fields are written into the index and how
they are represented in memory. Postings formats can be defined per
field via the `postings_format` option. Postings formats are configurable;
Elasticsearch has several built-in formats:

`direct`::
    A postings format that uses disk-based storage but loads
    its terms and postings directly into memory. Note this postings format
    is very memory intensive and has a limitation that doesn't allow
    segments to grow beyond 2.1GB; see the Lucene `DirectPostingsFormat`
    documentation for details.

`memory`::
    A postings format that stores its entire terms, postings,
    positions and payloads in a finite state transducer. This format should
    only be used for primary keys or with fields where each term is
    contained in a very low number of documents.

`pulsing`::
    A postings format that in-lines the posting lists for very low
    frequency terms in the term dictionary. This is useful to improve
    lookup performance for low-frequency terms.

`bloom_default`::
    A postings format that uses a bloom filter to
    improve term lookup performance. This is useful for primary keys or
    fields that are used as a delete key.

`bloom_pulsing`::
    A postings format that combines the advantages of
    *bloom* and *pulsing* to further improve lookup performance.

`default`::
    The default Elasticsearch postings format, offering the best
    general purpose performance. This format is used if no postings format
    is specified in the field mapping.

[float]
===== Postings format example

On all field types it is possible to configure a `postings_format`
attribute:

[source,js]
--------------------------------------------------
{
    "person" : {
        "properties" : {
            "second_person_id" : {"type" : "string", "postings_format" : "pulsing"}
        }
    }
}
--------------------------------------------------

On top of using the built-in postings formats it is possible to define
custom postings formats. See the
<<index-modules-codec,codec module>> for more
information.
[float]
==== Doc values format

Doc values formats define how fields are written into column-stride storage
in the index for the purpose of sorting or faceting. Fields that have doc
values enabled will have special field data instances, which will not be
uninverted from the inverted index, but read directly from disk. This makes
`_refresh` faster and ultimately allows for having field data stored on
disk, depending on the configured doc values format.

Doc values formats are configurable. Elasticsearch has several built-in
formats:

`memory`::
    A doc values format which stores data in memory. Compared to the
    default field data implementations, using doc values with this format
    will have similar performance but will be faster to load, making
    `_refresh` less time-consuming.

`disk`::
    A doc values format which stores all data on disk, requiring almost no
    memory from the JVM at the cost of a slight performance degradation.

`default`::
    The default Elasticsearch doc values format, offering good performance
    with low memory usage. This format is used if no format is specified in
    the field mapping.

[float]
===== Doc values format example

On all field types, it is possible to configure a `doc_values_format`
attribute:

[source,js]
--------------------------------------------------
{
    "product" : {
        "properties" : {
            "price" : {"type" : "integer", "doc_values_format" : "memory"}
        }
    }
}
--------------------------------------------------

On top of using the built-in doc values formats it is possible to define
custom doc values formats. See the
<<index-modules-codec,codec module>> for more information.

[float]
==== Similarity

Elasticsearch allows you to configure a similarity (scoring algorithm) per
field, giving users a simple way to go beyond the usual TF/IDF algorithm.
As part of this, new algorithms have been added, including BM25, and it is
now possible to define a similarity per field, giving even greater control
over scoring.

You can configure similarities via the
<<index-modules-similarity,similarity module>>.

[float]
===== Configuring Similarity per Field

Defining the similarity for a field is done via the `similarity` mapping
property, as this example shows:

[source,js]
--------------------------------------------------
{
    "book" : {
        "properties" : {
            "title" : { "type" : "string", "similarity" : "BM25" }
        }
    }
}
--------------------------------------------------

The following similarities are configured out-of-the-box:

`default`::
    The default TF/IDF algorithm used by Elasticsearch and
    Lucene in previous versions.

`BM25`::
    The BM25 algorithm.
    http://en.wikipedia.org/wiki/Okapi_BM25[See Okapi_BM25] for more
    details.


[[copy-to]]
[float]
===== Copy to field

added[1.0.0.RC2]

Adding the `copy_to` parameter to any field mapping will cause all values
of this field to be copied to the fields specified in the parameter. In the
following example all values from the fields `title` and `abstract` will be
copied to the field `meta_data`.


[source,js]
--------------------------------------------------
{
    "book" : {
        "properties" : {
            "title" : { "type" : "string", "copy_to" : "meta_data" },
            "abstract" : { "type" : "string", "copy_to" : "meta_data" },
            "meta_data" : { "type" : "string" }
        }
    }
}
--------------------------------------------------

Multiple destination fields are also supported:

[source,js]
--------------------------------------------------
{
    "book" : {
        "properties" : {
            "title" : { "type" : "string", "copy_to" : ["meta_data", "article_info"] }
        }
    }
}
--------------------------------------------------
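Values copied this way become searchable on the destination field like any
other indexed value. A minimal search sketch against the `meta_data` field
from the mapping above (the query term is hypothetical):

[source,js]
--------------------------------------------------
{
    "query" : {
        "match" : {
            "meta_data" : "elasticsearch"
        }
    }
}
--------------------------------------------------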
[float]
===== Multi fields

added[1.0.0.RC1]

The `fields` option allows mapping several core type fields onto a single
JSON source field. This can be useful if a single field needs to be
used in different ways, for example when a single field is to be used for
both free text search and sorting.

[source,js]
--------------------------------------------------
{
    "tweet" : {
        "properties" : {
            "name" : {
                "type" : "string",
                "index" : "analyzed",
                "fields" : {
                    "raw" : {"type" : "string", "index" : "not_analyzed"}
                }
            }
        }
    }
}
--------------------------------------------------

In the above example the field `name` gets processed twice. The first time
it gets processed as an analyzed string, and this version is accessible
under the field name `name`; this is the main field and is in fact just
like any other field. The second time it gets processed as a not analyzed
string and is accessible under the name `name.raw`.

[float]
==== Include in All

The `include_in_all` setting is ignored on any field that is defined in
the `fields` option. Setting `include_in_all` only makes sense on
the main field, since only the raw field value is copied to the `_all`
field; the tokens aren't copied.

[float]
==== Updating a field

In essence, a field can't be updated. However, multi fields can be
added to existing fields. This allows, for example, having a different
`index_analyzer` configuration in addition to the already configured
`index_analyzer` configuration specified in the main and other multi
fields.

Note that a new multi field will only be applied to documents added after
the multi field was added; the new multi field simply doesn't exist in
existing documents.

Another important note is that new multi fields will be merged into the
list of existing multi fields, so when adding new multi fields for a field,
previously added multi fields don't need to be specified.

[float]
==== Accessing Fields

deprecated[1.0.0,Use <<copy-to,`copy_to`>> instead]

The multi fields defined in `fields` are prefixed with the
name of the main field and can be accessed by their full path using the
navigation notation: `name.raw`, or using the typed navigation notation
`tweet.name.raw`. The `path` option allows control over how fields are
accessed. If the `path` option is set to `full`, then the full path of the
main field is prefixed, but if the `path` option is set to `just_name` the
actual multi field name without any prefix is used. The default value for
the `path` option is `full`.

The `just_name` setting, among other things, allows indexing content of
multiple fields under the same name. In the example below the content of
both fields `first_name` and `last_name` can be accessed by using
`any_name` or `tweet.any_name`.

[source,js]
--------------------------------------------------
{
    "tweet" : {
        "properties": {
            "first_name": {
                "type": "string",
                "index": "analyzed",
                "path": "just_name",
                "fields": {
                    "any_name": {"type": "string","index": "analyzed"}
                }
            },
            "last_name": {
                "type": "string",
                "index": "analyzed",
                "path": "just_name",
                "fields": {
                    "any_name": {"type": "string","index": "analyzed"}
                }
            }
        }
    }
}
--------------------------------------------------
diff --git a/docs/reference/mapping/types/geo-point-type.asciidoc b/docs/reference/mapping/types/geo-point-type.asciidoc
new file mode 100644
index 0000000..19b38e5
--- /dev/null
+++ b/docs/reference/mapping/types/geo-point-type.asciidoc
@@ -0,0 +1,207 @@

[[mapping-geo-point-type]]
=== Geo Point Type

The mapper type `geo_point` supports geo-based points. The
declaration looks as follows:

[source,js]
--------------------------------------------------
{
    "pin" : {
        "properties" : {
            "location" : {
                "type" : "geo_point"
            }
        }
    }
}
--------------------------------------------------

[float]
==== Indexed Fields

The `geo_point` mapping will index a single field with the format of
`lat,lon`. The `lat_lon` option can be set to also index the `.lat` and
`.lon` as numeric fields, and `geohash` can be set to `true` to also
index the `.geohash` value.

It is good practice to enable indexing of `lat_lon` as well, since both
the geo distance and bounding box filters can be executed using either in
memory checks or the indexed lat/lon values, and which one performs better
really depends on the data set. Note, though, that indexed lat/lon values
only make sense when there is a single geo point value for the field, not
multiple values.

[float]
==== Geohashes

Geohashes are a form of lat/lon encoding which divides the earth up into
a grid. Each cell in this grid is represented by a geohash string. Each
cell in turn can be further subdivided into smaller cells which are
represented by a longer string. So the longer the geohash, the smaller
(and thus more accurate) the cell is.

Because geohashes are just strings, they can be stored in an inverted
index like any other string, which makes querying them very efficient.

If you enable the `geohash` option, a `geohash` ``sub-field'' will be
indexed as, eg, `pin.geohash`. The length of the geohash is controlled by
the `geohash_precision` parameter, which can either be set to an absolute
length (eg `12`, the default) or to a distance (eg `1km`).

More usefully, set the `geohash_prefix` option to `true` to index not only
the geohash value, but all the enclosing cells as well. For instance, a
geohash of `u30` will be indexed as `[u,u3,u30]`. This option can be used
by the <<query-dsl-geohash-cell-filter>> to find geopoints within a
particular cell very efficiently.

[float]
==== Input Structure

The above mapping defines a `geo_point`, which accepts different
formats. The following formats are supported:

[float]
===== Lat Lon as Properties

[source,js]
--------------------------------------------------
{
    "pin" : {
        "location" : {
            "lat" : 41.12,
            "lon" : -71.34
        }
    }
}
--------------------------------------------------

[float]
===== Lat Lon as String

Format in `lat,lon`.

[source,js]
--------------------------------------------------
{
    "pin" : {
        "location" : "41.12,-71.34"
    }
}
--------------------------------------------------

[float]
===== Geohash

[source,js]
--------------------------------------------------
{
    "pin" : {
        "location" : "drm3btev3e86"
    }
}
--------------------------------------------------

[float]
===== Lat Lon as Array

Format in `[lon, lat]`. Note the order of lon/lat here, which conforms
with http://geojson.org/[GeoJSON].

[source,js]
--------------------------------------------------
{
    "pin" : {
        "location" : [-71.34, 41.12]
    }
}
--------------------------------------------------
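Once indexed, such points can be used with the geo filters. A minimal
sketch of a `geo_distance` filter against the `pin.location` field above
(the distance and coordinates are hypothetical):

[source,js]
--------------------------------------------------
{
    "query" : {
        "filtered" : {
            "query" : { "match_all" : {} },
            "filter" : {
                "geo_distance" : {
                    "distance" : "200km",
                    "pin.location" : {
                        "lat" : 40.0,
                        "lon" : -70.0
                    }
                }
            }
        }
    }
}
--------------------------------------------------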
[float]
==== Mapping Options

[cols="<,<",options="header",]
|=======================================================================
|Option |Description
|`lat_lon` |Set to `true` to also index the `.lat` and `.lon` as fields.
Defaults to `false`.

|`geohash` |Set to `true` to also index the `.geohash` as a field.
Defaults to `false`.

|`geohash_precision` |Sets the geohash precision. It can be set to an
absolute geohash length or a distance value (eg `1km`, `1m`, `1mi`)
defining the size of the smallest cell. Defaults to an absolute length
of 12.

|`geohash_prefix` |If this option is set to `true`, not only the geohash
but also all its parent cells (true prefixes) will be indexed as well. The
number of terms that will be indexed depends on the `geohash_precision`.
Defaults to `false`. *Note*: This option implicitly enables `geohash`.

|`validate` |Set to `true` to reject geo points with invalid latitude or
longitude (default is `false`). *Note*: Validation only works when
normalization has been disabled.

|`validate_lat` |Set to `true` to reject geo points with an invalid
latitude.

|`validate_lon` |Set to `true` to reject geo points with an invalid
longitude.

|`normalize` |Set to `true` to normalize latitude and longitude (default
is `true`).

|`normalize_lat` |Set to `true` to normalize latitude.

|`normalize_lon` |Set to `true` to normalize longitude.
|=======================================================================

[float]
==== Field data

By default, geo points use the `array` format which loads geo points into
two parallel double arrays, making sure there is no precision loss.
However, this can require a non-negligible amount of memory (16 bytes per
document), which is why Elasticsearch also provides a field data
implementation with lossy compression called `compressed`:

[source,js]
--------------------------------------------------
{
    "pin" : {
        "properties" : {
            "location" : {
                "type" : "geo_point",
                "fielddata" : {
                    "format" : "compressed",
                    "precision" : "1cm"
                }
            }
        }
    }
}
--------------------------------------------------

This field data format comes with a `precision` option which allows
trading precision for memory. The default value is `1cm`. The following
table presents the memory savings for various precisions:

[cols="<,<,<",options="header",]
|=============================================
|Precision |Bytes per point |Size reduction
|1km |4 |75%
|3m |6 |62.5%
|1cm |8 |50%
|1mm |10 |37.5%
|=============================================

Precision can be changed on a live index by using the update mapping API.

[float]
==== Usage in Scripts

When using `doc[geo_field_name]` (in the above mapping,
`doc['location']`), the `doc[...].value` returns a `GeoPoint`, which
then allows access to `lat` and `lon` (for example,
`doc[...].value.lat`). For performance, it is better to access the `lat`
and `lon` directly using `doc[...].lat` and `doc[...].lon`.

diff --git a/docs/reference/mapping/types/geo-shape-type.asciidoc b/docs/reference/mapping/types/geo-shape-type.asciidoc
new file mode 100644
index 0000000..600900a
--- /dev/null
+++ b/docs/reference/mapping/types/geo-shape-type.asciidoc
@@ -0,0 +1,232 @@

[[mapping-geo-shape-type]]
=== Geo Shape Type

The `geo_shape` mapping type facilitates the indexing of and searching
with arbitrary geo shapes such as rectangles and polygons. It should be
used when either the data being indexed or the queries being executed
contain shapes other than just points.

You can query documents using this type using the
<<query-dsl-geo-shape-filter,geo_shape Filter>>
or the <<query-dsl-geo-shape-query,geo_shape
Query>>.
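As a quick illustration, here is a minimal sketch of a `geo_shape` filter
that finds documents whose `location` shape intersects a given envelope
(the field name and coordinates are hypothetical):

[source,js]
--------------------------------------------------
{
    "query" : {
        "filtered" : {
            "query" : { "match_all" : {} },
            "filter" : {
                "geo_shape" : {
                    "location" : {
                        "shape" : {
                            "type" : "envelope",
                            "coordinates" : [[-45.0, 45.0], [45.0, -45.0]]
                        }
                    }
                }
            }
        }
    }
}
--------------------------------------------------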
Note, the `geo_shape` type uses
https://github.com/spatial4j/spatial4j[Spatial4J] and
http://www.vividsolutions.com/jts/jtshome.htm[JTS], both of which are
optional dependencies. Consequently you must add Spatial4J v0.3 and JTS
v1.12 to your classpath in order to use this type.

[float]
==== Mapping Options

The geo_shape mapping maps GeoJSON geometry objects to the geo_shape
type. To enable it, users must explicitly map fields to the geo_shape
type.

[cols="<,<",options="header",]
|=======================================================================
|Option |Description

|`tree` |Name of the PrefixTree implementation to be used: `geohash` for
GeohashPrefixTree and `quadtree` for QuadPrefixTree. Defaults to
`geohash`.

|`precision` |This parameter may be used instead of `tree_levels` to set
an appropriate value for the `tree_levels` parameter. The value
specifies the desired precision and Elasticsearch will calculate the
best tree_levels value to honor this precision. The value should be a
number followed by an optional distance unit. Valid distance units
include: `in`, `inch`, `yd`, `yard`, `mi`, `miles`, `km`, `kilometers`,
`m`, `meters` (default), `cm`, `centimeters`, `mm`, `millimeters`.

|`tree_levels` |Maximum number of layers to be used by the PrefixTree.
This can be used to control the precision of shape representations and
therefore how many terms are indexed. Defaults to the default value of
the chosen PrefixTree implementation. Since this parameter requires a
certain level of understanding of the underlying implementation, users
may use the `precision` parameter instead. However, Elasticsearch only
uses the tree_levels parameter internally and this is what is returned
via the mapping API even if you use the precision parameter.

|`distance_error_pct` |Used as a hint to the PrefixTree about how
precise it should be. Defaults to 0.025 (2.5%) with 0.5 as the maximum
supported value.
|=======================================================================

[float]
==== Prefix trees

To efficiently represent shapes in the index, shapes are converted into
a series of hashes representing grid squares using implementations of a
PrefixTree. The tree notion comes from the fact that the PrefixTree uses
multiple grid layers, each with an increasing level of precision to
represent the Earth.

Multiple PrefixTree implementations are provided:

* GeohashPrefixTree - Uses
http://en.wikipedia.org/wiki/Geohash[geohashes] for grid squares.
Geohashes are base32 encoded strings of the bits of the latitude and
longitude interleaved. So the longer the hash, the more precise it is.
Each character added to the geohash represents another tree level and
adds 5 bits of precision to the geohash. A geohash represents a
rectangular area and has 32 sub rectangles. The maximum number of levels
in Elasticsearch is 24.
* QuadPrefixTree - Uses a
http://en.wikipedia.org/wiki/Quadtree[quadtree] for grid squares.
Similar to geohash, quad trees interleave the bits of the latitude and
longitude, and the resulting hash is a bit set. A tree level in a quad
tree represents 2 bits in this bit set, one for each coordinate. The
maximum number of levels for the quad trees in Elasticsearch is 50.
[float]
===== Accuracy

`geo_shape` does not provide 100% accuracy; depending on how it is
configured it may return some false positives or false negatives for
certain queries. To mitigate this, it is important to select an
appropriate value for the tree_levels parameter and to adjust
expectations accordingly. For example, a point may be near the border of
a particular grid cell and may not match a query that only matches the
cell right next to it, even though the shape is very close to the point.

[float]
===== Example

[source,js]
--------------------------------------------------
{
    "properties": {
        "location": {
            "type": "geo_shape",
            "tree": "quadtree",
            "precision": "1m"
        }
    }
}
--------------------------------------------------

This mapping maps the location field to the geo_shape type using the
quadtree implementation and a precision of 1m. Elasticsearch translates
this into a tree_levels setting of 26.

[float]
===== Performance considerations

Elasticsearch uses the paths in the prefix tree as terms in the index
and in queries. The higher the level (and thus the precision), the
more terms are generated. Calculating the terms, keeping them in
memory, and storing them all have a price, of course. Especially with
higher tree levels, indices can become extremely large even with a modest
amount of data. Additionally, the size of the features also matters.
Big, complex polygons can take up a lot of space at higher tree levels.
Which setting is right depends on the use case. Generally one trades off
accuracy against index size and query performance.

The defaults in Elasticsearch for both implementations are a compromise
between index size and a reasonable level of precision of 50m at the
equator. This allows for indexing tens of millions of shapes without
overly bloating the resulting index relative to the input size.

[float]
==== Input Structure

The http://www.geojson.org[GeoJSON] format is used to represent shapes
as input, as follows:

[source,js]
--------------------------------------------------
{
    "location" : {
        "type" : "point",
        "coordinates" : [45.0, -45.0]
    }
}
--------------------------------------------------

Note, both the `type` and `coordinates` fields are required.

The supported `types` are `point`, `linestring`, `polygon`, `multipoint`
and `multipolygon`.

Note, in GeoJSON the correct coordinate order is longitude, latitude.
This differs from some APIs, such as Google Maps, that generally use
latitude, longitude.

[float]
===== Envelope

Elasticsearch supports an `envelope` type which consists of coordinates
for the upper left and lower right points of the shape:

[source,js]
--------------------------------------------------
{
    "location" : {
        "type" : "envelope",
        "coordinates" : [[-45.0, 45.0], [45.0, -45.0]]
    }
}
--------------------------------------------------
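The simpler GeoJSON geometries follow the same pattern. For instance, a
`linestring` (listed among the supported types above) is just an ordered
list of points; the coordinates below are hypothetical:

[source,js]
--------------------------------------------------
{
    "location" : {
        "type" : "linestring",
        "coordinates" : [[-77.03653, 38.897676], [-77.009051, 38.889939]]
    }
}
--------------------------------------------------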
[float]
===== http://www.geojson.org/geojson-spec.html#id4[Polygon]

A polygon is defined by a list of lists of points. The first and last
points in each list must be the same (the polygon must be closed).

[source,js]
--------------------------------------------------
{
    "location" : {
        "type" : "polygon",
        "coordinates" : [
            [ [100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0] ]
        ]
    }
}
--------------------------------------------------

The first array represents the outer boundary of the polygon; the other
arrays represent the interior shapes ("holes"):

[source,js]
--------------------------------------------------
{
    "location" : {
        "type" : "polygon",
        "coordinates" : [
            [ [100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0] ],
            [ [100.2, 0.2], [100.8, 0.2], [100.8, 0.8], [100.2, 0.8], [100.2, 0.2] ]
        ]
    }
}
--------------------------------------------------

[float]
===== http://www.geojson.org/geojson-spec.html#id7[MultiPolygon]

A list of GeoJSON polygons:

[source,js]
--------------------------------------------------
{
    "location" : {
        "type" : "multipolygon",
        "coordinates" : [
            [[[102.0, 2.0], [103.0, 2.0], [103.0, 3.0], [102.0, 3.0], [102.0, 2.0]]],
            [[[100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0]],
             [[100.2, 0.2], [100.8, 0.2], [100.8, 0.8], [100.2, 0.8], [100.2, 0.2]]]
        ]
    }
}
--------------------------------------------------

[float]
==== Sorting and Retrieving Index Shapes

Due to the complex input structure and index representation of shapes,
it is not currently possible to sort shapes or retrieve their fields
directly. The geo_shape value is only retrievable through the `_source`
field.

diff --git a/docs/reference/mapping/types/ip-type.asciidoc b/docs/reference/mapping/types/ip-type.asciidoc
new file mode 100644
index 0000000..51f3c5a
--- /dev/null
+++ b/docs/reference/mapping/types/ip-type.asciidoc
@@ -0,0 +1,36 @@

[[mapping-ip-type]]
=== IP Type

An `ip` mapping type allows storing _ipv4_ addresses in a numeric form
that makes it easy to sort by and range query on IP values.
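A minimal mapping sketch (the index and field names are hypothetical):

[source,js]
--------------------------------------------------
{
    "logs" : {
        "properties" : {
            "client_ip" : { "type" : "ip" }
        }
    }
}
--------------------------------------------------

A `range` query on `client_ip` then compares addresses by their numeric
value rather than lexicographically.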
The following table lists all the attributes that can be used with an ip
type:

[cols="<,<",options="header",]
|=======================================================================
|Attribute |Description
|`index_name` |The name of the field that will be stored in the index.
Defaults to the property/field name.

|`store` |Set to `true` to store the actual field in the index, `false` to
not store it. Defaults to `false` (note, the JSON document itself is
stored, and the field can be retrieved from it).

|`index` |Set to `no` if the value should not be indexed. In this case,
`store` should be set to `true`, since if it's not indexed and not
stored, there is nothing to do with it.

|`precision_step` |The precision step (number of terms generated for
each number value). Defaults to `4`.

|`boost` |The boost value. Defaults to `1.0`.

|`null_value` |When there is a (JSON) null value for the field, use the
`null_value` as the field value. Defaults to not adding the field at
all.

|`include_in_all` |Should the field be included in the `_all` field (if
enabled). Defaults to `true` or to the parent `object` type setting.
|=======================================================================

diff --git a/docs/reference/mapping/types/nested-type.asciidoc b/docs/reference/mapping/types/nested-type.asciidoc
new file mode 100644
index 0000000..17d8a13
--- /dev/null
+++ b/docs/reference/mapping/types/nested-type.asciidoc
@@ -0,0 +1,81 @@

[[mapping-nested-type]]
=== Nested Type

Nested objects/documents allow certain sections of a document to be
indexed as nested and then queried as if they were separate documents
joined with the owning parent document.

One of the problems when indexing inner objects that occur several times
in a doc is that "cross object" search matches will occur. For example:

[source,js]
--------------------------------------------------
{
    "obj1" : [
        {
            "name" : "blue",
            "count" : 4
        },
        {
            "name" : "green",
            "count" : 6
        }
    ]
}
--------------------------------------------------

Searching for a name set to blue and a count higher than 5 will match the
doc, because in the first element the name matches blue, and in the
second element, count matches "higher than 5".

Nested mapping allows mapping certain inner objects (usually
multi-instance ones), for example:

[source,js]
--------------------------------------------------
{
    "type1" : {
        "properties" : {
            "obj1" : {
                "type" : "nested"
            }
        }
    }
}
--------------------------------------------------

The above will cause all `obj1` objects to be indexed as nested docs. The
mapping is similar in nature to setting `type` to `object`, except that
it's `nested`.

Note: changing an object type to nested type requires reindexing.

The `nested` object fields can also be automatically added to the
immediate parent by setting `include_in_parent` to true, and also
included in the root object by setting `include_in_root` to true.

Nested docs will also automatically use the root doc `_all` field.

Searching on nested docs can be done using either the
<<query-dsl-nested-query,nested query>> or the
<<query-dsl-nested-filter,nested filter>>.

[float]
==== Internal Implementation

Internally, nested objects are indexed as additional documents, but,
since they are guaranteed to be indexed within the same "block", this
allows for extremely fast joining with parent docs.

Those internal nested documents are automatically masked away when doing
operations against the index (like searching with a match_all query),
and they bubble out when using the nested query.

Because nested docs are always masked from the parent doc, the nested
docs can never be accessed outside the scope of the `nested` query. For
example, stored fields can be enabled on fields inside nested objects,
but there is no way of retrieving them, since stored fields are fetched
outside of the `nested` query scope.

The `_source` field is always associated with the parent document, and
because of that, field values for nested objects can be fetched via the
source.
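To illustrate, here is a sketch of a nested query that resolves the
cross-object problem above: with `obj1` mapped as `nested`, the following
matches only documents where a single inner object has both a name of
blue and a count greater than 5, so the earlier example document no
longer matches:

[source,js]
--------------------------------------------------
{
    "query" : {
        "nested" : {
            "path" : "obj1",
            "query" : {
                "bool" : {
                    "must" : [
                        { "match" : { "obj1.name" : "blue" } },
                        { "range" : { "obj1.count" : { "gt" : 5 } } }
                    ]
                }
            }
        }
    }
}
--------------------------------------------------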
diff --git a/docs/reference/mapping/types/object-type.asciidoc b/docs/reference/mapping/types/object-type.asciidoc
new file mode 100644
index 0000000..ce28239
--- /dev/null
+++ b/docs/reference/mapping/types/object-type.asciidoc
@@ -0,0 +1,244 @@

[[mapping-object-type]]
=== Object Type

JSON documents are hierarchical in nature, allowing them to define inner
"objects" within the actual JSON. Elasticsearch completely understands
the nature of these inner objects and can map them easily, providing
query support for their inner fields. Because each document can have
objects with different fields each time, objects mapped this way are
known as "dynamic". Dynamic mapping is enabled by default. Let's take
the following JSON as an example:

[source,js]
--------------------------------------------------
{
    "tweet" : {
        "person" : {
            "name" : {
                "first_name" : "Shay",
                "last_name" : "Banon"
            },
            "sid" : "12345"
        },
        "message" : "This is a tweet!"
    }
}
--------------------------------------------------

The above shows an example where a tweet includes the actual `person`
details. A `person` is an object, with a `sid`, and a `name` object
which has `first_name` and `last_name`. It's important to note that
`tweet` is also an object, although it is a special
<<mapping-root-object-type,root object type>>
which allows for additional mapping definitions.

The following is an example of explicit mapping for the above JSON:

[source,js]
--------------------------------------------------
{
    "tweet" : {
        "properties" : {
            "person" : {
                "type" : "object",
                "properties" : {
                    "name" : {
                        "properties" : {
                            "first_name" : {"type" : "string"},
                            "last_name" : {"type" : "string"}
                        }
                    },
                    "sid" : {"type" : "string", "index" : "not_analyzed"}
                }
            },
            "message" : {"type" : "string"}
        }
    }
}
--------------------------------------------------

In order to mark a mapping of type `object`, set the `type` to `object`.
This is an optional step, since if there are `properties` defined for
it, it will automatically be identified as an `object` mapping.

[float]
==== properties

An object mapping can optionally define one or more properties using the
`properties` tag for a field. Each property can be either another
`object`, or one of the
<<mapping-core-types,core_types>>.

[float]
==== dynamic

One of the most important features of Elasticsearch is its ability to be
schema-less. This means that, in our example above, the `person` object
can be indexed later with a new property -- `age`, for example -- and it
will automatically be added to the mapping definitions. The same goes for
the `tweet` root object.

This feature is turned on by default; every mapped object is dynamic,
though dynamic mapping can be explicitly turned off:

[source,js]
--------------------------------------------------
{
    "tweet" : {
        "properties" : {
            "person" : {
                "type" : "object",
                "properties" : {
                    "name" : {
                        "dynamic" : false,
                        "properties" : {
                            "first_name" : {"type" : "string"},
                            "last_name" : {"type" : "string"}
                        }
                    },
                    "sid" : {"type" : "string", "index" : "not_analyzed"}
                }
            },
            "message" : {"type" : "string"}
        }
    }
}
--------------------------------------------------

In the above example, the `name` object mapped is not dynamic, meaning
that if, in the future, we try to index JSON with a `middle_name` within
the `name` object, it will get discarded and not added.

There is no performance overhead if an `object` is dynamic; the ability
to turn it off is provided as a safety mechanism so "malformed" objects
won't, by mistake, index data that we do not wish to be indexed.

If a dynamic object contains yet another inner `object`, it will be
automatically added to the index and mapped as well.
When processing dynamic new fields, their type is automatically derived.
For example, if it is a `number`, it will automatically be treated as a
number <<mapping-core-types,core_type>>. Dynamic
fields use the default attributes; for example, they are not stored and
they are always indexed.

Date fields are special since they are represented as a `string`. Date
fields are detected if they can be parsed as a date when they are first
introduced into the system. The set of date formats that are tested
against can be configured using the `dynamic_date_formats` on the root
object, which is explained later.

Note, once a field has been added, *its type can not change*. For
example, if we added age and its value is a number, then it can't be
treated as a string.

The `dynamic` parameter can also be set to `strict`, meaning that not
only will new fields not be introduced into the mapping, but parsing
(indexing) docs with such new fields will fail.

[float]
==== enabled

The `enabled` flag allows disabling parsing and indexing a named object
completely. This is handy when a portion of the JSON document contains
arbitrary JSON which should not be indexed, nor added to the mapping.
For example:

[source,js]
--------------------------------------------------
{
    "tweet" : {
        "properties" : {
            "person" : {
                "type" : "object",
                "properties" : {
                    "name" : {
                        "type" : "object",
                        "enabled" : false
                    },
                    "sid" : {"type" : "string", "index" : "not_analyzed"}
                }
            },
            "message" : {"type" : "string"}
        }
    }
}
--------------------------------------------------

In the above, `name` and its content will not be indexed at all.


[float]
==== include_in_all

`include_in_all` can be set on the `object` type level. When set, it
propagates down to all the inner mappings defined within the `object`
that do not explicitly set it.

[float]
==== path

deprecated[1.0.0,Use <<copy-to,`copy_to`>> instead]

In the <<mapping-core-types,core_types>>
section, a field can have an `index_name` associated with it in order to
control the name of the field that will be stored within the index. When
that field exists within an object (or objects) that is not the root
object, the name of the field in the index can either include the full
"path" to the field with its `index_name`, or just the `index_name`. For
example (under a mapping of _type_ `person`, with the tweet type removed
for clarity):

[source,js]
--------------------------------------------------
{
    "person" : {
        "properties" : {
            "name1" : {
                "type" : "object",
                "path" : "just_name",
                "properties" : {
                    "first1" : {"type" : "string"},
                    "last1" : {"type" : "string", "index_name" : "i_last_1"}
                }
            },
            "name2" : {
                "type" : "object",
                "path" : "full",
                "properties" : {
                    "first2" : {"type" : "string"},
                    "last2" : {"type" : "string", "index_name" : "i_last_2"}
                }
            }
        }
    }
}
--------------------------------------------------

In the above example, the `name1` and `name2` objects within the
`person` object have different combinations of `path` and `index_name`.
The document fields that will be stored in the index as a result are:

[cols="<,<",options="header",]
|=================================
|JSON Name |Document Field Name
|`name1`/`first1` |`first1`
|`name1`/`last1` |`i_last_1`
|`name2`/`first2` |`name2.first2`
|`name2`/`last2` |`name2.i_last_2`
|=================================

Note, when querying or using a field name in any of the APIs provided
(search, query, selective loading, ...), there is automatic detection
from the logical full path to the `index_name` and vice versa.
For
example, even though `name1`/`last1` defines that it is stored with
`just_name` and a different `index_name`, it can either be referred to
using `name1.last1` (the logical name), or by its actual indexed name of
`i_last_1`.

Moreover, where applicable (in queries, for example), the full path
including the type can be used, such as `person.name.last1`; in this
case, the actual indexed name will be resolved to match against the
index, and an automatic query filter will be added to only match
`person` types.

diff --git a/docs/reference/mapping/types/root-object-type.asciidoc b/docs/reference/mapping/types/root-object-type.asciidoc
new file mode 100644
index 0000000..ac368c4
--- /dev/null
+++ b/docs/reference/mapping/types/root-object-type.asciidoc
@@ -0,0 +1,224 @@

[[mapping-root-object-type]]
=== Root Object Type

The root object mapping is an
<<mapping-object-type,object type mapping>> that
maps the root object (the type itself). On top of all the different
mappings that can be set using the
<<mapping-object-type,object type mapping>>, it
allows for additional, type-level mapping definitions.

The root object mapping allows indexing a JSON document that either
starts with the actual mapping type, or only contains its fields. For
example, the following `tweet` JSON can be indexed:

[source,js]
--------------------------------------------------
{
    "message" : "This is a tweet!"
}
--------------------------------------------------

But the following JSON can also be indexed:

[source,js]
--------------------------------------------------
{
    "tweet" : {
        "message" : "This is a tweet!"
    }
}
--------------------------------------------------

Of the two, it is preferable to use the document *without* the type
explicitly set.

[float]
==== Index / Search Analyzers

The root object allows defining type-mapping-level analyzers for index
and search that will be used with all the fields that do not explicitly
set analyzers of their own. Here is an example:

[source,js]
--------------------------------------------------
{
    "tweet" : {
        "index_analyzer" : "standard",
        "search_analyzer" : "standard"
    }
}
--------------------------------------------------

The above simply explicitly defines both the `index_analyzer` and
`search_analyzer` that will be used. There is also an option to use the
`analyzer` attribute to set both the `search_analyzer` and
`index_analyzer`.

[float]
==== dynamic_date_formats

`dynamic_date_formats` (the old setting name `date_formats` still works)
sets one or more date formats that will be used to detect `date` fields.
For example:

[source,js]
--------------------------------------------------
{
    "tweet" : {
        "dynamic_date_formats" : ["yyyy-MM-dd", "dd-MM-yyyy"],
        "properties" : {
            "message" : {"type" : "string"}
        }
    }
}
--------------------------------------------------

With the above mapping, if a new JSON field of type string is detected,
the date formats specified will be used to check if it's a date.
If it passes parsing, the field will be declared with the `date` type,
and will use the matching format as its format attribute. The date
format itself is explained
<<mapping-date-format,here>>.

The default formats are: `dateOptionalTime` (ISO) and
`yyyy/MM/dd HH:mm:ss Z||yyyy/MM/dd Z`.

*Note:* `dynamic_date_formats` are used *only* for dynamically added
date fields, not for `date` fields that you specify in your mapping.
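To make the behavior concrete, here is a sketch of a document (the field
names and values are hypothetical) that, indexed against the mapping
above, would cause `postDate` to be dynamically mapped as a `date` with
format `yyyy-MM-dd`, because the string value parses under that format:

[source,js]
--------------------------------------------------
{
    "message" : "This is a tweet!",
    "postDate" : "2014-01-01"
}
--------------------------------------------------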
[float]
==== date_detection

Allows disabling automatic date type detection (where a new field is
introduced and matches one of the provided formats), for example:

[source,js]
--------------------------------------------------
{
    "tweet" : {
        "date_detection" : false,
        "properties" : {
            "message" : {"type" : "string"}
        }
    }
}
--------------------------------------------------

[float]
==== numeric_detection

Sometimes, even though JSON has support for native numeric types,
numeric values are still provided as strings. In order to try to
automatically detect numeric values from strings, `numeric_detection`
can be set to `true`. For example:

[source,js]
--------------------------------------------------
{
    "tweet" : {
        "numeric_detection" : true,
        "properties" : {
            "message" : {"type" : "string"}
        }
    }
}
--------------------------------------------------

[float]
==== dynamic_templates

Dynamic templates allow defining mapping templates that will be applied
when dynamic introduction of fields / objects happens.

For example, we might want all fields to be stored by default,
or all `string` fields to be stored, or `string` fields to always
be indexed with multi fields syntax, once analyzed and once not_analyzed.
Here is a simple example:

[source,js]
--------------------------------------------------
{
    "person" : {
        "dynamic_templates" : [
            {
                "template_1" : {
                    "match" : "multi*",
                    "mapping" : {
                        "type" : "{dynamic_type}",
                        "index" : "analyzed",
                        "fields" : {
                            "org" : {"type": "{dynamic_type}", "index" : "not_analyzed"}
                        }
                    }
                }
            },
            {
                "template_2" : {
                    "match" : "*",
                    "match_mapping_type" : "string",
                    "mapping" : {
                        "type" : "string",
                        "index" : "not_analyzed"
                    }
                }
            }
        ]
    }
}
--------------------------------------------------

The above mapping will create a field with multi fields for all field
names starting with multi, and will map all `string` types to be
`not_analyzed`.

Dynamic templates are named to allow for simple merge behavior. A new
mapping, with just a new template, can be "put", and that template will
be added; if a template with the same name already exists, it will be
replaced.

The `match` option allows matching on the field name. An `unmatch`
option is also available to exclude fields that do match on `match`.
The `match_mapping_type` controls whether this template will be applied
only to dynamic fields of the specified type (as guessed by the JSON
format).

Another option is to use `path_match`, which allows matching the dynamic
template against the "full" dot notation name of the field (for example
`obj1.*.value` or `obj1.obj2.*`), with the corresponding `path_unmatch`.

All matching uses a simple wildcard format, with `*` as the matching
element, supporting simple patterns such as `xxx*`, `*xxx` and `xxx*yyy`
(with an arbitrary number of pattern types), as well as direct equality.
The `match_pattern` option can be set to `regex` to allow
regular-expression based matching.

The `mapping` element provides the actual mapping definition. The
`{name}` keyword can be used and will be replaced with the actual
dynamic field name being introduced. The `{dynamic_type}` (or
`{dynamicType}`) keyword can be used and will be replaced with the
mapping derived from the field type (or the derived type, like `date`).
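As a sketch of the `{name}` keyword (assuming analyzers named after the
fields actually exist in the index settings), the following template
would analyze every dynamically added string field with an analyzer of
the same name as the field:

[source,js]
--------------------------------------------------
{
    "person" : {
        "dynamic_templates" : [
            {
                "analyzer_per_field" : {
                    "match" : "*",
                    "match_mapping_type" : "string",
                    "mapping" : {
                        "type" : "string",
                        "analyzer" : "{name}"
                    }
                }
            }
        ]
    }
}
--------------------------------------------------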
Complete generic settings can also be applied. For example, to have all
mappings be stored, just set:

[source,js]
--------------------------------------------------
{
    "person" : {
        "dynamic_templates" : [
            {
                "store_generic" : {
                    "match" : "*",
                    "mapping" : {
                        "store" : true
                    }
                }
            }
        ]
    }
}
--------------------------------------------------

Such generic templates should be placed at the end of the
`dynamic_templates` list because when two or more dynamic templates
match a field, only the first matching one from the list is used.