[[mapping-geo-point-type]]
=== Geo Point Type

The `geo_point` mapper type supports geographic points. It is declared
as follows:

[source,js]
--------------------------------------------------
{
    "pin" : {
        "properties" : {
            "location" : {
                "type" : "geo_point"
            }
        }
    }
}
--------------------------------------------------

[float]
==== Indexed Fields

The `geo_point` mapping indexes a single field in the format `lat,lon`.
Set the `lat_lon` option to `true` to also index `.lat` and `.lon` as
numeric fields, and set `geohash` to `true` to also index the `.geohash`
value.

It is good practice to enable `lat_lon` as well: both the geo distance
and bounding box filters can be executed either with in-memory checks or
with the indexed lat/lon values, and which one performs better depends
on the data set. Note, however, that indexed lat/lon values only make
sense when the field holds a single geo point per document, not
multiple values.
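
As a minimal sketch, the declaration shown earlier could be extended as
follows to index the numeric sub-fields. It reuses the `pin` type and
`location` field from that declaration and changes nothing else:

[source,js]
--------------------------------------------------
{
    "pin" : {
        "properties" : {
            "location" : {
                "type" : "geo_point",
                "lat_lon" : true
            }
        }
    }
}
--------------------------------------------------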

[float]
==== Geohashes

Geohashes are a form of lat/lon encoding that divides the earth into a
grid. Each cell in this grid is represented by a geohash string, and
each cell can in turn be subdivided into smaller cells, which are
represented by longer strings. The longer the geohash, the smaller (and
thus more precise) the cell.

Because geohashes are just strings, they can be stored in an inverted
index like any other string, which makes querying them very efficient.

If you enable the `geohash` option, a `geohash` ``sub-field'' will be
indexed, e.g. `location.geohash` for the mapping above. The length of
the geohash is controlled by the `geohash_precision` parameter, which
can be set either to an absolute length (e.g. `12`, the default) or to a
distance (e.g. `1km`).

More usefully, set the `geohash_prefix` option to `true` to index not
only the geohash value but also all of its enclosing cells. For
instance, a geohash of `u30` will be indexed as `[u,u3,u30]`. This option
can be used by the <<query-dsl-geohash-cell-filter>> to find geo points
within a particular cell very efficiently.
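
As an illustration, the following mapping sketch enables prefix indexing
with a distance-based precision. The `1km` value is only an example
taken from the text above, and `geohash` is set explicitly here even
though `geohash_prefix` enables it implicitly:

[source,js]
--------------------------------------------------
{
    "pin" : {
        "properties" : {
            "location" : {
                "type" : "geo_point",
                "geohash" : true,
                "geohash_prefix" : true,
                "geohash_precision" : "1km"
            }
        }
    }
}
--------------------------------------------------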

[float]
==== Input Structure

A field mapped as `geo_point` accepts its value in any of the following
formats:

[float]
===== Lat Lon as Properties

[source,js]
--------------------------------------------------
{
    "pin" : {
        "location" : {
            "lat" : 41.12,
            "lon" : -71.34
        }
    }
}
--------------------------------------------------

[float]
===== Lat Lon as String

A string in the format `lat,lon`.

[source,js]
--------------------------------------------------
{
    "pin" : {
        "location" : "41.12,-71.34"
    }
}
--------------------------------------------------

[float]
===== Geohash

[source,js]
--------------------------------------------------
{
    "pin" : {
        "location" : "drm3btev3e86"
    }
}
--------------------------------------------------

[float]
===== Lat Lon as Array

An array in the format `[lon, lat]`. Note that longitude comes first
here, in order to conform with http://geojson.org/[GeoJSON].

[source,js]
--------------------------------------------------
{
    "pin" : {
        "location" : [-71.34, 41.12]
    }
}
--------------------------------------------------

[float]
==== Mapping Options

[cols="<,<",options="header",]
|=======================================================================
|Option |Description
|`lat_lon` |Set to `true` to also index the `.lat` and `.lon` as fields.
Defaults to `false`.

|`geohash` |Set to `true` to also index the `.geohash` as a field.
Defaults to `false`.

|`geohash_precision` |Sets the geohash precision. It can be set to an
absolute geohash length or a distance value (e.g. `1km`, `1m`, `1mi`)
defining the size of the smallest cell. Defaults to an absolute length of 12.

|`geohash_prefix` |If this option is set to `true`, not only the geohash
but also all its parent cells (true prefixes) will be indexed as well. The
number of terms that will be indexed depends on the `geohash_precision`.
Defaults to `false`. *Note*: This option implicitly enables `geohash`.

|`validate` |Set to `true` to reject geo points with an invalid latitude
or longitude. Defaults to `false`. *Note*: Validation only works when
normalization has been disabled.

|`validate_lat` |Set to `true` to reject geo points with an invalid
latitude.

|`validate_lon` |Set to `true` to reject geo points with an invalid
longitude.

|`normalize` |Set to `true` to normalize latitude and longitude.
Defaults to `true`.

|`normalize_lat` |Set to `true` to normalize latitude.

|`normalize_lon` |Set to `true` to normalize longitude.
|=======================================================================
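
Because validation only takes effect when normalization is disabled, a
mapping that rejects out-of-range points has to set both options. The
following is a sketch under that reading of the table, reusing the `pin`
type and `location` field from the earlier examples:

[source,js]
--------------------------------------------------
{
    "pin" : {
        "properties" : {
            "location" : {
                "type" : "geo_point",
                "normalize" : false,
                "validate" : true
            }
        }
    }
}
--------------------------------------------------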

[float]
==== Field data

By default, geo points use the `array` format, which loads geo points into two
parallel double arrays so that no precision is lost. However, this can require
a non-negligible amount of memory (16 bytes per document), which is why
Elasticsearch also provides a field data implementation with lossy
compression called `compressed`:

[source,js]
--------------------------------------------------
{
    "pin" : {
        "properties" : {
            "location" : {
                "type" : "geo_point",
                "fielddata" : {
                    "format" : "compressed",
                    "precision" : "1cm"
                }
            }
        }
    }
}
--------------------------------------------------

This field data format comes with a `precision` option that controls how much
precision is traded for memory. The default value is `1cm`. The following
table shows the memory savings at various precisions, relative to the 16 bytes
per point used by the `array` format:

[options="header"]
|=============================================
| Precision | Bytes per point | Size reduction
|       1km |               4 |            75%
|        3m |               6 |          62.5%
|       1cm |               8 |            50%
|       1mm |              10 |          37.5%
|=============================================

Precision can be changed on a live index by using the update mapping API.
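
For example, the mapping body sent in such an update might look like the
sketch below, which lowers the precision to `3m`. The index name and the
exact update mapping request are not shown here because they depend on
the cluster; only the `precision` value differs from the mapping above.

[source,js]
--------------------------------------------------
{
    "pin" : {
        "properties" : {
            "location" : {
                "type" : "geo_point",
                "fielddata" : {
                    "format" : "compressed",
                    "precision" : "3m"
                }
            }
        }
    }
}
--------------------------------------------------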

[float]
==== Usage in Scripts

In a script, a geo point field is accessed as `doc[geo_field_name]` (in
the above mapping, `doc['location']`). `doc[...].value` returns a
`GeoPoint`, which exposes `lat` and `lon` (for example,
`doc[...].value.lat`). For better performance, access `lat` and `lon`
directly using `doc[...].lat` and `doc[...].lon`.
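
As a sketch, a search request could return both coordinates through
script fields using the direct accessors. The script field names
`point_lat` and `point_lon` are hypothetical, and the request assumes
that scripting is enabled and that the field is named `location` as in
the mapping above:

[source,js]
--------------------------------------------------
{
    "query" : {
        "match_all" : {}
    },
    "script_fields" : {
        "point_lat" : {
            "script" : "doc['location'].lat"
        },
        "point_lon" : {
            "script" : "doc['location'].lon"
        }
    }
}
--------------------------------------------------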