[[analysis-stop-tokenfilter]]
=== Stop Token Filter

A token filter of type `stop` that removes stop words from token
streams.

The `stop` token filter accepts the following settings:

[cols="<,<",options="header",]
|=======================================================================
|Setting |Description
|`stopwords` |A list of stop words to use. Defaults to the English stop
words.

|`stopwords_path` |A path (either relative to the `config` location, or
absolute) to a stop words file. Each stop word must be on its own line
(separated by a line break). The file must be UTF-8 encoded.

|`ignore_case` |Set to `true` to lowercase all words first, so that stop
words match regardless of case. Defaults to `false`.

|`remove_trailing` |Set to `false` to keep the last term of a search even
if it is a stop word. This is useful for the completion suggester, because
a query like `green a` can still be extended to `green apple` even though
stop words are removed in general. Defaults to `true`.
|=======================================================================
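
As a minimal sketch of how these settings fit together, the index settings
below define a `stop` filter and use it in a custom analyzer. The names
`my_stop` and `my_analyzer` and the word list are hypothetical and chosen
only for illustration:

[source,js]
--------------------------------------------------
{
    "settings": {
        "analysis": {
            "filter": {
                "my_stop": {
                    "type": "stop",
                    "stopwords": ["and", "is", "the"],
                    "ignore_case": true
                }
            },
            "analyzer": {
                "my_analyzer": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["my_stop"]
                }
            }
        }
    }
}
--------------------------------------------------

Because `ignore_case` is `true` in this sketch, a token such as `The` should
be treated as a stop word even though the list contains only `the`.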

The `stopwords` setting also accepts a predefined, language-specific stop
word list in place of explicit words. Such a list is referenced with the
`_lang_` notation (for example, `_english_`). The supported languages are:
arabic, armenian, basque, brazilian, bulgarian, catalan, czech, danish,
dutch, english, finnish, french, galician, german, greek, hindi, hungarian,
indonesian, italian, norwegian, persian, portuguese, romanian, russian,
spanish, swedish, turkish.
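
For example, a filter definition that selects the predefined Spanish list
through the `_lang_` notation might look like the following fragment, which
belongs under `settings.analysis` in the index settings. The filter name
`spanish_stop` is a hypothetical choice:

[source,js]
--------------------------------------------------
{
    "filter": {
        "spanish_stop": {
            "type": "stop",
            "stopwords": "_spanish_"
        }
    }
}
--------------------------------------------------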