summaryrefslogtreecommitdiff
path: root/lang/python27/patches/patch-Doc_library_urlparse.rst
blob: 7e7b9ee12dca0a1d4d80283ccc5709f93bc1f823 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
$NetBSD: patch-Doc_library_urlparse.rst,v 1.2 2022/02/25 22:41:32 gutteridge Exp $

Fix CVE-2021-23336: Add `separator` argument to parse_qs; warn with default
Via Fedora:
https://src.fedoraproject.org/rpms/python2.7/blob/rawhide/f/00359-CVE-2021-23336.patch

Fix CVE-2022-0391: urlparse does not sanitize URLs containing ASCII newline and tabs
Via Fedora:
https://src.fedoraproject.org/rpms/python2.7/raw/40dd05e5d77dbfa81777c9f84b704bc2239bf710/f/00377-CVE-2022-0391.patch

--- Doc/library/urlparse.rst.orig	2020-04-19 21:13:39.000000000 +0000
+++ Doc/library/urlparse.rst
@@ -125,6 +125,9 @@ The :mod:`urlparse` module defines the f
    decomposed before parsing, or is not a Unicode string, no error will be
    raised.
 
+   Following the `WHATWG spec`_ that updates RFC 3986, ASCII newline
+   ``\n``, ``\r`` and tab ``\t`` characters are stripped from the URL.
+
    .. versionchanged:: 2.5
       Added attributes to return value.
 
@@ -136,7 +139,7 @@ The :mod:`urlparse` module defines the f
       now raise :exc:`ValueError`.
 
 
-.. function:: parse_qs(qs[, keep_blank_values[, strict_parsing[, max_num_fields]]])
+.. function:: parse_qs(qs[, keep_blank_values[, strict_parsing[, max_num_fields[, separator]]]])
 
    Parse a query string given as a string argument (data of type
    :mimetype:`application/x-www-form-urlencoded`).  Data are returned as a
@@ -157,6 +160,15 @@ The :mod:`urlparse` module defines the f
    read. If set, then throws a :exc:`ValueError` if there are more than
    *max_num_fields* fields read.
 
+   The optional argument *separator* is the symbol to use for separating the
+   query arguments. It is recommended to set it to ``'&'`` or ``';'``.
+   It defaults to ``'&'``; a warning is raised if this default is used.
+   This default may be changed with the following environment variable settings:
+
+   - ``PYTHON_URLLIB_QS_SEPARATOR='&'``: use only ``&`` as separator, without warning (as in Python 3.6.13+ or 3.10)
+   - ``PYTHON_URLLIB_QS_SEPARATOR=';'``: use only ``;`` as separator
+   - ``PYTHON_URLLIB_QS_SEPARATOR=legacy``: use both ``&`` and ``;`` (as in previous versions of Python)
+
    Use the :func:`urllib.urlencode` function to convert such dictionaries into
    query strings.
 
@@ -186,6 +198,9 @@ The :mod:`urlparse` module defines the f
    read. If set, then throws a :exc:`ValueError` if there are more than
    *max_num_fields* fields read.
 
+   The optional argument *separator* is the symbol to use for separating the
+   query arguments. It works as in :py:func:`parse_qs`.
+
    Use the :func:`urllib.urlencode` function to convert such lists of pairs into
    query strings.
 
@@ -195,6 +210,7 @@ The :mod:`urlparse` module defines the f
    .. versionchanged:: 2.7.16
       Added *max_num_fields* parameter.
 
+
 .. function:: urlunparse(parts)
 
    Construct a URL from a tuple as returned by ``urlparse()``. The *parts* argument
@@ -308,6 +324,10 @@ The :mod:`urlparse` module defines the f
 
 .. seealso::
 
+   `WHATWG`_ -  URL Living standard
+      Working Group for the URL Standard that defines URLs, domains, IP addresses, the
+      application/x-www-form-urlencoded format, and their API.
+
    :rfc:`3986` - Uniform Resource Identifiers
       This is the current standard (STD66). Any changes to urlparse module
       should conform to this. Certain deviations could be observed, which are
@@ -332,6 +352,7 @@ The :mod:`urlparse` module defines the f
    :rfc:`1738` - Uniform Resource Locators (URL)
       This specifies the formal syntax and semantics of absolute URLs.
 
+.. _WHATWG: https://url.spec.whatwg.org/
 
 .. _urlparse-result-object: