diff options
Diffstat (limited to 'usr/src/man/man5/regex.5')
-rw-r--r-- | usr/src/man/man5/regex.5 | 1040 |
1 files changed, 0 insertions, 1040 deletions
diff --git a/usr/src/man/man5/regex.5 b/usr/src/man/man5/regex.5 deleted file mode 100644 index 077c6335f9..0000000000 --- a/usr/src/man/man5/regex.5 +++ /dev/null @@ -1,1040 +0,0 @@ -.\" -.\" Sun Microsystems, Inc. gratefully acknowledges The Open Group for -.\" permission to reproduce portions of its copyrighted documentation. -.\" Original documentation from The Open Group can be obtained online at -.\" http://www.opengroup.org/bookstore/. -.\" -.\" The Institute of Electrical and Electronics Engineers and The Open -.\" Group, have given us permission to reprint portions of their -.\" documentation. -.\" -.\" In the following statement, the phrase ``this text'' refers to portions -.\" of the system documentation. -.\" -.\" Portions of this text are reprinted and reproduced in electronic form -.\" in the SunOS Reference Manual, from IEEE Std 1003.1, 2004 Edition, -.\" Standard for Information Technology -- Portable Operating System -.\" Interface (POSIX), The Open Group Base Specifications Issue 6, -.\" Copyright (C) 2001-2004 by the Institute of Electrical and Electronics -.\" Engineers, Inc and The Open Group. In the event of any discrepancy -.\" between these versions and the original IEEE and The Open Group -.\" Standard, the original IEEE and The Open Group Standard is the referee -.\" document. The original Standard can be obtained online at -.\" http://www.opengroup.org/unix/online.html. -.\" -.\" This notice shall appear on any product containing this material. -.\" -.\" The contents of this file are subject to the terms of the -.\" Common Development and Distribution License (the "License"). -.\" You may not use this file except in compliance with the License. -.\" -.\" You can obtain a copy of the license at usr/src/OPENSOLARIS.LICENSE -.\" or http://www.opensolaris.org/os/licensing. -.\" See the License for the specific language governing permissions -.\" and limitations under the License. -.\" -.\" When distributing Covered Code, include this CDDL HEADER in each -.\" file and include the License file at usr/src/OPENSOLARIS.LICENSE. -.\" If applicable, add the following below this CDDL HEADER, with the -.\" fields enclosed by brackets "[]" replaced with your own identifying -.\" information: Portions Copyright [yyyy] [name of copyright owner] -.\" -.\" -.\" Copyright (c) 1992, X/Open Company Limited All Rights Reserved -.\" Portions Copyright (c) 1999, Sun Microsystems, Inc. All Rights Reserved -.\" Copyright 2017 Nexenta Systems, Inc. -.\" -.Dd August 14, 2020 -.Dt REGEX 5 -.Os -.Sh NAME -.Nm regex -.Nd internationalized basic and extended regular expression matching -.Sh DESCRIPTION -Regular Expressions -.Pq REs -provide a mechanism to select specific strings from a set of character strings. -The Internationalized Regular Expressions described below differ from the Simple -Regular Expressions described on the -.Xr regexp 5 -manual page in the following ways: -.Bl -bullet -.It -both Basic and Extended Regular Expressions are supported -.It -the Internationalization features -- character class, equivalence class, and -multi-character collation -- are supported. -.El -.Pp -The Basic Regular Expression -.Pq BRE -notation and construction rules described in the -.Sx BASIC REGULAR EXPRESSIONS -section apply to most utilities supporting regular expressions. -Some utilities, instead, support the Extended Regular Expressions -.Pq ERE -described in the -.Sx EXTENDED REGULAR EXPRESSIONS -section; any exceptions for both cases are noted in the descriptions of the -specific utilities using regular expressions. -Both BREs and EREs are supported by the Regular Expression Matching interfaces -.Xr regcomp 3C -and -.Xr regexec 3C . -.Sh BASIC REGULAR EXPRESSIONS -.Ss BREs Matching a Single Character -A BRE ordinary character, a special character preceded by a backslash, or a -period matches a single character. -A bracket expression matches a single character or a single collating element. -See -.Sx RE Bracket Expression , -below. -.Ss BRE Ordinary Characters -An ordinary character is a BRE that matches itself: any character in the -supported character set, except for the BRE special characters listed in -.Sx BRE Special Characters , -below. -.Pp -The interpretation of an ordinary character preceded by a backslash -.Pq Qq \e -is undefined, except for: -.Bl -enum -.It -the characters -.Qq \&) , -.Qq \&( , -.Qq { , -and -.Qq } -.It -the digits 1 to 9 inclusive -.Po see -.Sx BREs Matching Multiple Characters , -below -.Pc -.It -a character inside a bracket expression. -.El -.Ss BRE Special Characters -A BRE special character has special properties in certain contexts. -Outside those contexts, or when preceded by a backslash, such a character will -be a BRE that matches the special character itself. -The BRE special characters and the contexts in which they have their special -meaning are: -.Bl -tag -width Ds -.It Sy \&. \&[ \&\e -The period, left-bracket, and backslash are special except when used in a -bracket expression -.Po see -.Sx RE Bracket Expression , -below -.Pc . -An expression containing a -.Qq \&[ -that is not preceded by a backslash and is not part of a bracket expression -produces undefined results. -.It Sy * -The asterisk is special except when used: -.Bl -bullet -.It -in a bracket expression -.It -as the first character of an entire BRE -.Po after an initial -.Qq ^ , -if any -.Pc -.It -as the first character of a subexpression -.Po after an initial -.Qq ^ , -if any; see -.Sx BREs Matching Multiple Characters , -below -.Pc . -.El -.It Sy ^ -The circumflex is special when used: -.Bl -bullet -.It -as an anchor -.Po see -.Sx BRE Expression Anchoring , -below -.Pc . -.It -as the first character of a bracket expression -.Po see -.Sx RE Bracket Expression , -below -.Pc . -.El -.It Sy $ -The dollar sign is special when used as an anchor. -.El -.Ss Periods in BREs -A period -.Pq Qq \&. , -when used outside a bracket expression, is a BRE that matches any character in -the supported character set except NUL. -.Ss RE Bracket Expression -A bracket expression -.Po an expression enclosed in square brackets, -.Qq [] -.Pc -is an RE that matches a single collating element contained in the non-empty set -of collating elements represented by the bracket expression. -.Pp -The following rules and definitions apply to bracket expressions: -.Bl -enum -.It -A -.Em bracket expression -is either a matching list expression or a non-matching list expression. -It consists of one or more expressions: collating elements, collating symbols, -equivalence classes, character classes, or range expressions -.Pq see rule 7 below . -Portable applications must not use range expressions, even though all -implementations support them. -The right-bracket -.Pq Qq \&] -loses its special meaning and represents itself in a bracket expression if it -occurs first in the list -.Po after an initial circumflex -.Pq Qq ^ , -if any -.Pc . -Otherwise, it terminates the bracket expression, unless it appears in a -collating symbol -.Po such as -.Qq [.].] -.Pc -or is the ending right-bracket for a collating symbol, equivalence class, or -character class. -.Pp -The special characters -.Qq \&. , -.Qq * , -.Qq \&[ , -.Qq \&\e -.Pq period, asterisk, left-bracket and backslash, respectively -lose their special meaning within a bracket expression. -.Pp -The character sequences -.Qq [. , -.Qq [= , -.Qq [: -.Pq left-bracket followed by a period, equals-sign, or colon -are special inside a bracket expression and are used to delimit collating -symbols, equivalence class expressions, and character class expressions. -These symbols must be followed by a valid expression and the matching -terminating sequence -.Qq .] , -.Qq =] -or -.Qq :] , -as described in the following items. -.It -A -.Em matching list expression -specifies a list that matches any one of the expressions represented in the -list. -The first character in the list must not be the circumflex. -For example, -.Qq [abc] -is an RE that matches any of the characters -.Qq a , -.Qq b -or -.Qq c . -.It -A -.Em non-matching list expression -begins with a circumflex -.Pq Qq ^ , -and specifies a list that matches any character or collating element except for -the expressions represented in the list after the leading circumflex. -For example, -.Qq [^abc] -is an RE that matches any character or collating element except the characters -.Qq a , -.Qq b , -or -.Qq c . -The circumflex will have this special meaning only when it occurs first in the -list, immediately following the left-bracket. -.It -A -.Em collating symbol -is a collating element enclosed within bracket-period -.Pq Qq [..] -delimiters. -Multi-character collating elements must be represented as collating symbols when -it is necessary to distinguish them from a list of the individual characters -that make up the multi-character collating element. -For example, if the string -.Qq ch -is a collating element in the current collation sequence with the associated -collating symbol -.Qq Aq ch , -the expression -.Qq [[.ch.]] -will be treated as an RE matching the character sequence -.Qq ch , -while -.Qq [ch] -will be treated as an RE matching -.Qq c -or -.Qq h . -Collating symbols will be recognized only inside bracket expressions. -This implies that the RE -.Qq [[.ch.]]*c -matches the first to fifth character in the string -.Qq chchch. -If the string is not a collating element in the current collating sequence -definition, or if the collating element has no characters associated with it, -the symbol will be treated as an invalid expression. -.It -An -.Em equivalence class expression -represents the set of collating elements belonging to an equivalence class. -Only primary equivalence classes will be recognised. -The class is expressed by enclosing any one of the collating elements in the -equivalence class within bracket-equal -.Pq Qq [==] -delimiters. -For example, if -.Qq a -and -.Qq b -belong to the same equivalence class, then -.Qq [[=a=]b] , -.Qq [[==]a] -and -.Qq [[==]b] -will each be equivalent to -.Qq [ab] . -If the collating element does not belong to an equivalence class, the -equivalence class expression will be treated as a -.Em collating symbol . -.It -A -.Em character class expression -represents the set of characters belonging to a character class, as defined in -the -.Ev LC_CTYPE -category in the current locale. -All character classes specified in the current locale will be recognized. -A character class expression is expressed as a character class name enclosed -within bracket-colon -.Pq Qq [::] -delimiters. -.Pp -The following character class expressions are supported in all locales: -.Bl -column "[:alnum:]" "[:cntrl:]" "[:lower:]" "[:xdigit:]" -.It [:alnum:] Ta [:cntrl:] Ta [:lower:] Ta [:space:] -.It [:alpha:] Ta [:digit:] Ta [:print:] Ta [:upper:] -.It [:blank:] Ta [:graph:] Ta [:punct:] Ta [:xdigit:] -.El -.Pp -In addition, character class expressions of the form -.Qq [:name:] -are recognized in those locales where the -.Em name -keyword has been given a -.Em charclass -definition in the -.Ev LC_CTYPE -category. -.It -A -.Em range expression -represents the set of collating elements that fall between two elements in the -current collation sequence, inclusively. -It is expressed as the starting point and the ending point separated by a hyphen -.Pq Qq - . -.Pp -Range expressions must not be used in portable applications because their -behavior is dependent on the collating sequence. -Ranges will be treated according to the current collating sequence, and include -such characters that fall within the range based on that collating sequence, -regardless of character values. -This, however, means that the interpretation will differ depending on collating -sequence. -If, for instance, one collating sequence defines as a variant of -.Qq a , -while another defines it as a letter following -.Qq z , -then the expression -.Qq [-z] -is valid in the first language and invalid in the second. -.sp -In the following, all examples assume the collation sequence specified for the -POSIX locale, unless another collation sequence is specifically defined. -.Pp -The starting range point and the ending range point must be a collating element -or collating symbol. -An equivalence class expression used as a starting or ending point of a range -expression produces unspecified results. -An equivalence class can be used portably within a bracket expression, but only -outside the range. -For example, the unspecified expression -.Qq [[=e=]-f] -should be given as -.Qq [[=e=]e-f] . -The ending range point must collate equal to or higher than the starting range -point; otherwise, the expression will be treated as invalid. -The order used is the order in which the collating elements are specified in the -current collation definition. -One-to-many mappings -.Po see -.Xr locale 5 -.Pc -will not be performed. -For example, assuming that the character -.Qq eszet -is placed in the collation sequence after -.Qq r -and -.Qq s , -but before -.Qq t , -and that it maps to the sequence -.Qq ss -for collation purposes, then the expression -.Qq [r-s] -matches only -.Qq r -and -.Qq s , -but the expression -.Qq [s-t] -matches -.Qq s , -.Qq beta , -or -.Qq t . -.Pp -The interpretation of range expressions where the ending range point is also -the starting range point of a subsequent range expression -.Po for instance -.Qq [a-m-o] -.Pc -is undefined. -.Pp -The hyphen character will be treated as itself if it occurs first -.Po after an initial -.Qq ^ , -if any -.Pc -or last in the list, or as an ending range point in a range expression. -As examples, the expressions -.Qq [-ac] -and -.Qq [ac-] -are equivalent and match any of the characters -.Qq a , -.Qq c , -or -.Qq -; -.Qq [^-ac] -and -.Qq [^ac-] -are equivalent and match any characters except -.Qq a , -.Qq c , -or -.Qq -; -the expression -.Qq [%--] -matches any of the characters between -.Qq % -and -.Qq - -inclusive; the expression -.Qq [--@] -matches any of the characters between -.Qq - -and -.Qq @ -inclusive; and the expression -.Qq [a--@] -is invalid, because the letter -.Qq a -follows the symbol -.Qq - -in the POSIX locale. -To use a hyphen as the starting range point, it must either come first in the -bracket expression or be specified as a collating symbol, for example: -.Qq [][.-.]-0] , -which matches either a right bracket or any character or collating element that -collates between hyphen and 0, inclusive. -.Pp -If a bracket expression must specify both -.Qq - -and -.Qq \&] , -the -.Qq \&] -must be placed first -.Po after the -.Qq ^ , -if any -.Pc -and the -.Qq - -last within the bracket expression. -.El -.Pp -Note: Latin-1 characters such as -.Qq \(ga -or -.Qq ^ -are not printable in some locales, for example, the -.Em ja -locale. -.Ss BREs Matching Multiple Characters -The following rules can be used to construct BREs matching multiple characters -from BREs matching a single character: -.Bl -enum -.It -The concatenation of BREs matches the concatenation of the strings matched -by each component of the BRE. -.It -A -.Em subexpression -can be defined within a BRE by enclosing it between the character pairs -.Qq \e( -and -.Qq \e) . -Such a subexpression matches whatever it would have matched without the -.Qq \e( -and -.Qq \e) , -except that anchoring within subexpressions is optional behavior; see -.Sx BRE Expression Anchoring , -below. -Subexpressions can be arbitrarily nested. -.It -The -.Em back-reference -expression -.Qq \e Ns Em n -matches the same -.Pq possibly empty -string of characters as was matched by a subexpression enclosed between -.Qq \e( -and -.Qq \e) -preceding the -.Qq \e Ns Em n . -The character -.Qq Em n -must be a digit from 1 to 9 inclusive, -.Em n Ns th -subexpression -.Po the one that begins with the -.Em n Ns th -.Qq \e( -and ends with the corresponding paired -.Qq \e) -.Pc . -The expression is invalid if less than -.Em n -subexpressions precede the -.Qq \e Ns Em n . -For example, the expression -.Qq ^\e(.*\e)\e1$ -matches a line consisting of two adjacent appearances of the same string, and -the expression -.Qq \e(a\e)*\e1 -fails to match -.Qq a . -The limit of nine back-references to subexpressions in the RE is based on the -use of a single digit identifier. -This does not imply that only nine subexpressions are allowed in REs. -.It -When a BRE matching a single character, a subexpression or a back-reference is -followed by the special character asterisk -.Pq Qq * , -together with that asterisk it matches what zero or more consecutive occurrences -of the BRE would match. -For example, -.Qq [ab]* -and -.Qq [ab][ab] -are equivalent when matching the string -.Qq ab . -.It -When a BRE matching a single character, a subexpression, or a back-reference -is followed by an -.Em interval expression -of the format -.Qq \e{ Ns Em m Ns \e} , -.Qq \e{ Ns Em m Ns ,\e} -or -.Qq \e{ Ns Em m Ns \&, Ns Em n Ns \e} , -together with that interval expression it matches what repeated consecutive -occurrences of the BRE would match. -The values of -.Em m -and -.Em n -will be decimal integers in the range 0 <= -.Em m -<= -.Em n -<= -.Dv BRE_DUP_MAX , -where -.Em m -specifies the exact or minimum number of occurrences and -.Em n -specifies the maximum number of occurrences. -The expression -.Qq \e{ Ns Em m Ns \e} -matches exactly -.Em m -occurrences of the preceding BRE, -.Qq \e{ Ns Em m Ns ,\e} -matches at least -.Em m -occurrences and -.Qq \e{ Ns Em m Ns \&, Ns Em n Ns \e} -matches any number of occurrences between -.Em m -and -.Em n , -inclusive. -.Pp -For example, in the string -.Qq abababccccccd , -the BRE -.Qq c\e{3\e} -is matched by characters seven to nine, the BRE -.Qq \e(ab\e)\e{4,\e} -is not matched at all and the BRE -.Qq c\e{1,3\e}d -is matched by characters ten to thirteen. -.El -.Pp -The behavior of multiple adjacent duplication symbols -.Po Qq * -and intervals -.Pc -produces undefined results. -.Ss BRE Precedence -The order of precedence is as shown in the following table: -.Bl -column "BRE Precedence (from high to low)" "" -.It Sy BRE Precedence (from high to low) Ta -.It collation-related bracket symbols Ta [= =] [: :] [. .] -.It escaped characters Ta \e< Ns Em special character Ns > -.It bracket expression Ta [ ] -.It subexpressions/back-references Ta \e( \e) \e Ns Em n -.It single-character-BRE duplication Ta * \e{ Ns Em m Ns \&, Ns Em n Ns \e} -.It concatenation Ta -.It anchoring Ta ^ $ -.El -.Ss BRE Expression Anchoring -A BRE can be limited to matching strings that begin or end a line; this is -called -.Em anchoring . -The circumflex and dollar sign special characters will be considered BRE anchors -in the following contexts: -.Bl -enum -.It -A circumflex -.Pq Qq ^ -is an anchor when used as the first character of an entire BRE. -The implementation may treat circumflex as an anchor when used as the first -character of a subexpression. -The circumflex will anchor the expression to the beginning of a string; -only sequences starting at the first character of a string will be matched by -the BRE. -For example, the BRE -.Qq ^ab -matches -.Qq ab -in the string -.Qq abcdef , -but fails to match in the string -.Qq cdefab . -A portable BRE must escape a leading circumflex in a subexpression to match a -literal circumflex. -.It -A dollar sign -.Pq Qq $ -is an anchor when used as the last character of an entire BRE. -The implementation may treat a dollar sign as an anchor when used as the last -character of a subexpression. -The dollar sign will anchor the expression to the end of the string being -matched; the dollar sign can be said to match the end-of-string following the -last character. -.It -A BRE anchored by both -.Qq ^ -and -.Qq $ -matches only an entire string. -For example, the BRE -^abcdef$ -matches strings consisting only of -.Qq abcdef . -.It -.Qq ^ -and -.Qq $ -are not special in subexpressions. -.El -.Pp -Note: The Solaris implementation does not support anchoring in BRE -subexpressions. -.Sh EXTENDED REGULAR EXPRESSIONS -The rules specified for BREs apply to Extended Regular Expressions -.Pq EREs -with the following exceptions: -.Bl -bullet -.It -The characters -.Qq | , -.Qq + , -and -.Qq \&? -have special meaning, as defined below. -.It -The -.Qq { -and -.Qq } -characters, when used as the duplication operator, are not preceded by -backslashes. -The constructs -.Qq \e{ -and -.Qq \e} -simply match the characters -.Qq { -and -.Qq }, respectively. -.It -The back reference operator is not supported. -.It -Anchoring -.Pq Qq ^$ -is supported in subexpressions. -.El -.Ss EREs Matching a Single Character -An ERE ordinary character, a special character preceded by a backslash, or a -period matches a single character. -A bracket expression matches a single character or a single collating element. -An -.Em ERE matching a single character -enclosed in parentheses matches the same as the ERE without parentheses would -have matched. -.Ss ERE Ordinary Characters -An -.Em ordinary character -is an ERE that matches itself. -An ordinary character is any character in the supported character set, except -for the ERE special characters listed in -.Sx ERE Special Characters -below. -The interpretation of an ordinary character preceded by a backslash -.Pq Qq \&\e -is undefined. -.Ss ERE Special Characters -An -.Em ERE special character -has special properties in certain contexts. -Outside those contexts, or when preceded by a backslash, such a character is an -ERE that matches the special character itself. -The extended regular expression special characters and the contexts in which -they have their special meaning are: -.Bl -tag -width Ds -.It Sy \&. \&[ \&\e \&( -The period, left-bracket, backslash, and left-parenthesis are special except -when used in a bracket expression -.Po see -.Sx RE Bracket Expression , -above -.Pc . -Outside a bracket expression, a left-parenthesis immediately followed by a -right-parenthesis produces undefined results. -.It Sy \&) -The right-parenthesis is special when matched with a preceding -left-parenthesis, both outside a bracket expression. -.It Sy * + \&? { -The asterisk, plus-sign, question-mark, and left-brace are special except when -used in a bracket expression -.Po see -.Sx RE Bracket Expression , -above -.Pc . -Any of the following uses produce undefined results: -.Bl -bullet -.It -if these characters appear first in an ERE, or immediately following a -vertical-line, circumflex or left-parenthesis -.It -if a left-brace is not part of a valid interval expression. -.El -.It Sy \&| -The vertical-line is special except when used in a bracket expression -.Po see -.Sx RE Bracket Expression , -above -.Pc . -A vertical-line appearing first or last in an ERE, or immediately following a -vertical-line or a left-parenthesis, or immediately preceding a -right-parenthesis, produces undefined results. -.It Sy ^ -The circumflex is special when used: -.Bl -bullet -.It -as an anchor -.Po see -.Sx ERE Expression Anchoring , -below -.Pc . -.It -as the first character of a bracket expression -.Po see -.Sx RE Bracket Expression , -above -.Pc . -.El -.It Sy $ -The dollar sign is special when used as an anchor. -.El -.Ss Periods in EREs -A period -.Pq Qq \&. , -when used outside a bracket expression, is an ERE that matches any character in -the supported character set except NUL. -.Ss ERE Bracket Expression -The rules for ERE Bracket Expressions are the same as for Basic Regular -Expressions; see -.Sx RE Bracket Expression , -above. -.Ss EREs Matching Multiple Characters -The following rules will be used to construct EREs matching multiple characters -from EREs matching a single character: -.Bl -enum -.It -A -.Em concatenation of EREs -matches the concatenation of the character sequences matched by each component -of the ERE. -A concatenation of EREs enclosed in parentheses matches whatever the -concatenation without the parentheses matches. -For example, both the ERE -.Qq cd -and the ERE -.Qq (cd) -are matched by the third and fourth character of the string -.Qq abcdefabcdef . -.It -When an ERE matching a single character or an ERE enclosed in parentheses is -followed by the special character plus-sign -.Pq Qq + , -together with that plus-sign it matches what one or more consecutive occurrences -of the ERE would match. -For example, the ERE -.Qq b+(bc) -matches the fourth to seventh characters in the string -.Qq acabbbcde ; -.Qq [ab]+ -and -.Qq [ab][ab]* -are equivalent. -.It -When an ERE matching a single character or an ERE enclosed in parentheses is -followed by the special character asterisk -.Pq Qq * , -together with that asterisk it matches what zero or more consecutive occurrences -of the ERE would match. -For example, the ERE -.Qq b*c -matches the first character in the string -.Qq cabbbcde , -and the ERE -.Qq b*cd -matches the third to seventh characters in the string -.Qq cabbbcdebbbbbbcdbc . -And, -.Qq [ab]* -and -.Qq [ab][ab] -are equivalent when matching the string -.Qq ab . -.It -When an ERE matching a single character or an ERE enclosed in parentheses is -followed by the special character question-mark -.Pq Qq \&? , -together with that question-mark it matches what zero or one consecutive -occurrences of the ERE would match. -For example, the ERE -.Qq b?c -matches the second character in the string -.Qq acabbbcde . -.It -When an ERE matching a single character or an ERE enclosed in parentheses is -followed by an -.Em interval expression -of the format -.Qq { Ns Em m Ns } , -.Qq { Ns Em m Ns ,} -or -.Qq { Ns Em m Ns \&, Ns Em n Ns } , -together with that interval expression it matches what repeated consecutive -occurrences of the ERE would match. -The values of -.Em m -and -.Em n -will be decimal integers in the range 0 <= -.Em m -<= -.Em n -<= -.Dv RE_DUP_MAX , -where -.Em m -specifies the exact or minimum number of occurrences and -.Em n -specifies the maximum number of occurrences. -The expression -.Qq { Ns Em m Ns } -matches exactly -.Em m -occurrences of the preceding ERE, -.Qq { Ns Em m Ns ,} -matches at least -.Em m -occurrences and -.Qq { Ns m Ns \&, Ns Em n Ns } -matches any number of occurrences between -.Em m -and -.Em n , -inclusive. -.El -.Pp -For example, in the string -.Qq abababccccccd -the ERE -.Qq c{3} -is matched by characters seven to nine and the ERE -.Qq (ab){2,} -is matched by characters one to six. -.Pp -The behavior of multiple adjacent duplication symbols -.Po -.Qq + , -.Qq * , -.Qq \&? -and intervals -.Pc -produces undefined results. -.Ss ERE Alternation -Two EREs separated by the special character vertical-line -.Pq Qq | -match a string that is matched by either. -For example, the ERE -.Qq a((bc)|d) -matches the string -.Qq abc -and the string -.Qq ad . -Single characters, or expressions matching single characters, separated by the -vertical bar and enclosed in parentheses, will be treated as an ERE matching a -single character. -.Ss ERE Precedence -The order of precedence will be as shown in the following table: -.Bl -column "ERE Precedence (from high to low)" "" -.It Sy ERE Precedence (from high to low) Ta -.It collation-related bracket symbols Ta [= =] [: :] [. .] -.It escaped characters Ta \e< Ns Em special character Ns > -.It bracket expression Ta \&[ \&] -.It grouping Ta \&( \&) -.It single-character-ERE duplication Ta * + \&? { Ns Em m Ns \&, Ns Em n Ns} -.It concatenation Ta -.It anchoring Ta ^ $ -.It alternation Ta | -.El -.Pp -For example, the ERE -.Qq abba|cde -matches either the string -.Qq abba -or the string -.Qq cde -.Po rather than the string -.Qq abbade -or -.Qq abbcde , -because concatenation has a higher order of precedence than alternation -.Pc . -.Ss ERE Expression Anchoring -An ERE can be limited to matching strings that begin or end a line; this is -called -.Em anchoring . -The circumflex and dollar sign special characters are considered ERE anchors -when used anywhere outside a bracket expression. -This has the following effects: -.Bl -enum -.It -A circumflex -.Pq Qq ^ -outside a bracket expression anchors the expression or subexpression it begins -to the beginning of a string; such an expression or subexpression can match only -a sequence starting at the first character of a string. -For example, the EREs -.Qq ^ab -and -.Qq (^ab) -match -.Qq ab -in the string -.Qq abcdef , -but fail to match in the string -.Qq cdefab , -and the ERE -.Qq a^b -is valid, but can never match because the -.Qq a -prevents the expression -.Qq ^b -from matching starting at the first character. -.It -A dollar sign -.Pq Qq $ -outside a bracket expression anchors the expression or subexpression it ends to -the end of a string; such an expression or subexpression can match only a -sequence ending at the last character of a string. -For example, the EREs -.Qq ef$ -and -.Qq (ef$) -match -.Qq ef -in the string -.Qq abcdef , -but fail to match in the string -.Qq cdefab , -and the ERE -.Qq e$f -is valid, but can never match because the -.Qq f -prevents the expression -.Qq e$ -from matching ending at the last character. -.El -.Sh SEE ALSO -.Xr localedef 1 , -.Xr regcomp 3C , -.Xr attributes 5 , -.Xr environ 5 , -.Xr locale 5 , -.Xr regexp 5 |