author    bouyer <bouyer@pkgsrc.org>  2018-01-24 23:29:32 +0000
committer bouyer <bouyer@pkgsrc.org>  2018-01-24 23:29:32 +0000
commit    78d049661c95c751a8eb0c99f26b5e3c856541a2
tree      8e6935dbe6888a512ba0fade6387d663ff707758 /sysutils
parent    72ebd87a97bd54bf5ca1840395c03bc59fffd1b5
download  pkgsrc-78d049661c95c751a8eb0c99f26b5e3c856541a2.tar.gz
Update xen 4.8 packages to 4.8.3. Changes since 4.8.2: include patches from
all security advisories up to and including XSA254.
While there, pass XEN_VENDORVERSION=nb${PKGREVISION} to make so that
'xl info' shows the NetBSD PKGREVISION. If PKGREVISION is not available,
define this as 'nb0'.
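The XEN_VENDORVERSION fallback described in the commit message can be sketched outside of make as follows. This is a minimal illustration in Python, not part of the package: the function name `xen_vendorversion` is hypothetical, and the logic only mirrors the condition used in the xenkernel48 Makefile hunk (PKGREVISION defined, non-empty, and not "0").

```python
from typing import Optional


def xen_vendorversion(pkgrevision: Optional[str]) -> str:
    """Return the string passed as XEN_VENDORVERSION (hypothetical helper).

    Mirrors the Makefile condition: use nb<PKGREVISION> only when
    PKGREVISION is defined, non-empty, and not "0"; otherwise use nb0.
    None stands in for "PKGREVISION is not defined".
    """
    if pkgrevision is not None and pkgrevision != "" and pkgrevision != "0":
        return "nb" + pkgrevision
    return "nb0"


if __name__ == "__main__":
    # Show the result for an undefined, empty, zero, and non-zero revision.
    for value in (None, "", "0", "3"):
        print(repr(value), "->", xen_vendorversion(value))
```

Because this commit also removes the PKGREVISION line from the xenkernel48 Makefile, the `.else` branch should apply there until the package revision is bumped again.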
Diffstat (limited to 'sysutils')
-rw-r--r-- sysutils/xenkernel48/Makefile | 10
-rw-r--r-- sysutils/xenkernel48/distinfo | 31
-rw-r--r-- sysutils/xenkernel48/patches/patch-XSA231 | 110
-rw-r--r-- sysutils/xenkernel48/patches/patch-XSA232 | 25
-rw-r--r-- sysutils/xenkernel48/patches/patch-XSA234 | 187
-rw-r--r-- sysutils/xenkernel48/patches/patch-XSA237 | 311
-rw-r--r-- sysutils/xenkernel48/patches/patch-XSA238 | 47
-rw-r--r-- sysutils/xenkernel48/patches/patch-XSA239 | 48
-rw-r--r-- sysutils/xenkernel48/patches/patch-XSA240 | 665
-rw-r--r-- sysutils/xenkernel48/patches/patch-XSA241 | 104
-rw-r--r-- sysutils/xenkernel48/patches/patch-XSA242 | 45
-rw-r--r-- sysutils/xenkernel48/patches/patch-XSA243 | 95
-rw-r--r-- sysutils/xenkernel48/patches/patch-XSA244 | 61
-rw-r--r-- sysutils/xenkernel48/patches/patch-XSA246 | 76
-rw-r--r-- sysutils/xenkernel48/patches/patch-XSA247 | 287
-rw-r--r-- sysutils/xenkernel48/patches/patch-XSA248 | 164
-rw-r--r-- sysutils/xenkernel48/patches/patch-XSA249 | 44
-rw-r--r-- sysutils/xenkernel48/patches/patch-XSA250 | 69
-rw-r--r-- sysutils/xenkernel48/patches/patch-XSA251 | 23
-rw-r--r-- sysutils/xenkernel48/patches/patch-XSA254-1 | 389
-rw-r--r-- sysutils/xenkernel48/patches/patch-XSA254-2 | 44
-rw-r--r-- sysutils/xenkernel48/patches/patch-XSA254-3 | 758
-rw-r--r-- sysutils/xenkernel48/patches/patch-XSA254-4 | 165
-rw-r--r-- sysutils/xentools48/Makefile | 5
-rw-r--r-- sysutils/xentools48/distinfo | 10
-rw-r--r-- sysutils/xentools48/patches/patch-XSA233 | 54
-rw-r--r-- sysutils/xentools48/patches/patch-XSA240 | 56
27 files changed, 19 insertions, 3864 deletions
diff --git a/sysutils/xenkernel48/Makefile b/sysutils/xenkernel48/Makefile
index 71294c57411..22e28c9155e 100644
--- a/sysutils/xenkernel48/Makefile
+++ b/sysutils/xenkernel48/Makefile
@@ -1,9 +1,8 @@
-# $NetBSD: Makefile,v 1.11 2018/01/18 10:28:13 bouyer Exp $
+# $NetBSD: Makefile,v 1.12 2018/01/24 23:29:32 bouyer Exp $
-VERSION= 4.8.2
+VERSION= 4.8.3
DISTNAME= xen-${VERSION}
PKGNAME= xenkernel48-${VERSION}
-PKGREVISION= 3
CATEGORIES= sysutils
MASTER_SITES= https://downloads.xenproject.org/release/xen/${VERSION}/
DIST_SUBDIR= xen48
@@ -26,6 +25,11 @@ PYTHON_FOR_BUILD_ONLY= YES
PYTHON_VERSIONS_INCOMPATIBLE= 34 35 36
MAKE_ENV+= OCAML_TOOLS=no
+.if defined(PKGREVISION) && !empty(PKGREVISION) && (${PKGREVISION} != "0")
+MAKE_ENV+= XEN_VENDORVERSION=nb${PKGREVISION}
+.else
+MAKE_ENV+= XEN_VENDORVERSION=nb0
+.endif
INSTALLATION_DIRS= xen48-kernel
XENKERNELDIR= ${PREFIX}/${INSTALLATION_DIRS}
diff --git a/sysutils/xenkernel48/distinfo b/sysutils/xenkernel48/distinfo
index df11b75c684..cda4b4566a8 100644
--- a/sysutils/xenkernel48/distinfo
+++ b/sysutils/xenkernel48/distinfo
@@ -1,31 +1,10 @@
-$NetBSD: distinfo,v 1.5 2018/01/18 10:28:13 bouyer Exp $
+$NetBSD: distinfo,v 1.6 2018/01/24 23:29:32 bouyer Exp $
-SHA1 (xen48/xen-4.8.2.tar.gz) = 184c57ce9e71e34b3cbdd318524021f44946efbe
-RMD160 (xen48/xen-4.8.2.tar.gz) = f4126cb0f7ff427ed7d20ce399dcd1077c599343
-SHA512 (xen48/xen-4.8.2.tar.gz) = 7805531f73d23ecfff3439770e62d387f4254a444875670d53a0a739323e5d4d8f8fcc478f8936ee1ae8aff3e0229549e47c01c606365a8ce060dd5c503e87da
-Size (xen48/xen-4.8.2.tar.gz) = 22522336 bytes
+SHA1 (xen48/xen-4.8.3.tar.gz) = ee55e8dc1e79d16d2f85fbe1f8bbd27a2db8422f
+RMD160 (xen48/xen-4.8.3.tar.gz) = 54b7ba828d8198c2a4629eabf7acfba2e9c6561c
+SHA512 (xen48/xen-4.8.3.tar.gz) = 584d8ee6e432e291a70e8f727da6d0a71afff7509fbf2e32eeb9cfe58b8279a80770c2c5f7759dcb5c0b08ed4644039e770e280ab534673215753d598f3f6508
+Size (xen48/xen-4.8.3.tar.gz) = 22529092 bytes
SHA1 (patch-Config.mk) = abf55aa58792315e758ee3785a763cfa8c2da68f
-SHA1 (patch-XSA231) = fc249a68ea53064ff7d95f24380f66f3fc3393e7
-SHA1 (patch-XSA232) = 86d633941ac3165ca4034db660a48d60384ea252
-SHA1 (patch-XSA234) = acf4170a410d9f314c0cc0c5c092db6bb6cc69a0
-SHA1 (patch-XSA237) = 3125554b155bd650480934a37d89d1a7471dfb20
-SHA1 (patch-XSA238) = 58b6fcb73d314d7f06256ed3769210e49197aa90
-SHA1 (patch-XSA239) = 10619718e8a1536a7f52eb3838cdb490e6ba8c97
-SHA1 (patch-XSA240) = 77b398914ca79da6cd6abf34674d5476b6d3bcba
-SHA1 (patch-XSA241) = 351395135fcd30b7ba35e84a64bf6348214d4fa6
-SHA1 (patch-XSA242) = 77e224f927818adb77b8ef10329fd886ece62835
-SHA1 (patch-XSA243) = 75eef49628bc0b3bd4fe8b023cb2da75928103a7
-SHA1 (patch-XSA244) = 2739ff8a920630088853a9076f71ca2caf639320
-SHA1 (patch-XSA246) = b48433ee2213340d1bd3c810ea3e5c6de7890fd7
-SHA1 (patch-XSA247) = b92c4a7528ebd121ba2700610589df6fff40cbbf
-SHA1 (patch-XSA248) = d5787fa7fc48449ca90200811b66cb6278c750aa
-SHA1 (patch-XSA249) = 7037a35f37eb866f16fe90482e66d0eca95944c4
-SHA1 (patch-XSA250) = 25ab2e8c67ebe2b40cf073197c17f1625f5581f6
-SHA1 (patch-XSA251) = dc0786c85bcfbdd3f7a1c97a3af32c10deea8276
-SHA1 (patch-XSA254-1) = a2e1573bebd2f5e873da85d1f29a6cb5cfa2fb31
-SHA1 (patch-XSA254-2) = fddc172293fcd8cfbaaf61155bb16738fb6fdcf5
-SHA1 (patch-XSA254-3) = eaded260831b8146c7943ed5c9138d8bde256213
-SHA1 (patch-XSA254-4) = 9766e14d3e48d41d8bce969f07c9f3a7b22d9120
SHA1 (patch-xen_Makefile) = be3f4577a205b23187b91319f91c50720919f70b
SHA1 (patch-xen_Rules.mk) = 5f33a667bae67c85d997a968c0f8b014b707d13c
SHA1 (patch-xen_arch_x86_Rules.mk) = e2d148fb308c37c047ca41a678471217b6166977
diff --git a/sysutils/xenkernel48/patches/patch-XSA231 b/sysutils/xenkernel48/patches/patch-XSA231
deleted file mode 100644
index 0e750604b70..00000000000
--- a/sysutils/xenkernel48/patches/patch-XSA231
+++ /dev/null
@@ -1,110 +0,0 @@
-$NetBSD: patch-XSA231,v 1.1 2017/10/17 08:42:30 bouyer Exp $
-
-From: George Dunlap <george.dunlap@citrix.com>
-Subject: xen/mm: make sure node is less than MAX_NUMNODES
-
-The output of MEMF_get_node(memflags) can be as large as nodeid_t can
-hold (currently 255). This is then used as an index to arrays of size
-MAX_NUMNODE, which is 64 on x86 and 1 on ARM, can be passed in by an
-untrusted guest (via memory_exchange and increase_reservation) and is
-not currently bounds-checked.
-
-Check the value in page_alloc.c before using it, and also check the
-value in the hypercall call sites and return -EINVAL if appropriate.
-Don't permit domains other than the hardware or control domain to
-allocate node-constrained memory.
-
-This is XSA-231.
-
-Reported-by: Matthew Daley <mattd@bugfuzz.com>
-Signed-off-by: George Dunlap <george.dunlap@citrix.com>
-Signed-off-by: Jan Beulich <jbeulich@suse.com>
-Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
-
---- xen/common/memory.c.orig
-+++ xen/common/memory.c
-@@ -411,6 +411,31 @@ static void decrease_reservation(struct
- a->nr_done = i;
- }
-
-+static bool propagate_node(unsigned int xmf, unsigned int *memflags)
-+{
-+ const struct domain *currd = current->domain;
-+
-+ BUILD_BUG_ON(XENMEMF_get_node(0) != NUMA_NO_NODE);
-+ BUILD_BUG_ON(MEMF_get_node(0) != NUMA_NO_NODE);
-+
-+ if ( XENMEMF_get_node(xmf) == NUMA_NO_NODE )
-+ return true;
-+
-+ if ( is_hardware_domain(currd) || is_control_domain(currd) )
-+ {
-+ if ( XENMEMF_get_node(xmf) >= MAX_NUMNODES )
-+ return false;
-+
-+ *memflags |= MEMF_node(XENMEMF_get_node(xmf));
-+ if ( xmf & XENMEMF_exact_node_request )
-+ *memflags |= MEMF_exact_node;
-+ }
-+ else if ( xmf & XENMEMF_exact_node_request )
-+ return false;
-+
-+ return true;
-+}
-+
- static long memory_exchange(XEN_GUEST_HANDLE_PARAM(xen_memory_exchange_t) arg)
- {
- struct xen_memory_exchange exch;
-@@ -483,6 +508,12 @@ static long memory_exchange(XEN_GUEST_HA
- }
- }
-
-+ if ( unlikely(!propagate_node(exch.out.mem_flags, &memflags)) )
-+ {
-+ rc = -EINVAL;
-+ goto fail_early;
-+ }
-+
- d = rcu_lock_domain_by_any_id(exch.in.domid);
- if ( d == NULL )
- {
-@@ -501,7 +532,6 @@ static long memory_exchange(XEN_GUEST_HA
- d,
- XENMEMF_get_address_bits(exch.out.mem_flags) ? :
- (BITS_PER_LONG+PAGE_SHIFT)));
-- memflags |= MEMF_node(XENMEMF_get_node(exch.out.mem_flags));
-
- for ( i = (exch.nr_exchanged >> in_chunk_order);
- i < (exch.in.nr_extents >> in_chunk_order);
-@@ -864,12 +894,8 @@ static int construct_memop_from_reservat
- }
- read_unlock(&d->vnuma_rwlock);
- }
-- else
-- {
-- a->memflags |= MEMF_node(XENMEMF_get_node(r->mem_flags));
-- if ( r->mem_flags & XENMEMF_exact_node_request )
-- a->memflags |= MEMF_exact_node;
-- }
-+ else if ( unlikely(!propagate_node(r->mem_flags, &a->memflags)) )
-+ return -EINVAL;
-
- return 0;
- }
---- xen/common/page_alloc.c.orig
-+++ xen/common/page_alloc.c
-@@ -706,9 +706,13 @@ static struct page_info *alloc_heap_page
- if ( node >= MAX_NUMNODES )
- node = cpu_to_node(smp_processor_id());
- }
-+ else if ( unlikely(node >= MAX_NUMNODES) )
-+ {
-+ ASSERT_UNREACHABLE();
-+ return NULL;
-+ }
- first_node = node;
-
-- ASSERT(node < MAX_NUMNODES);
- ASSERT(zone_lo <= zone_hi);
- ASSERT(zone_hi < NR_ZONES);
-
diff --git a/sysutils/xenkernel48/patches/patch-XSA232 b/sysutils/xenkernel48/patches/patch-XSA232
deleted file mode 100644
index 91a7e8492f9..00000000000
--- a/sysutils/xenkernel48/patches/patch-XSA232
+++ /dev/null
@@ -1,25 +0,0 @@
-$NetBSD: patch-XSA232,v 1.1 2017/10/17 08:42:30 bouyer Exp $
-
-From: Andrew Cooper <andrew.cooper3@citrix.com>
-Subject: grant_table: fix GNTTABOP_cache_flush handling
-
-Don't fall over a NULL grant_table pointer when the owner of the domain
-is a system domain (DOMID_{XEN,IO} etc).
-
-This is XSA-232.
-
-Reported-by: Matthew Daley <mattd@bugfuzz.com>
-Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
-Reviewed-by: Jan Beulich <jbeulich@suse.com>
-
---- xen/common/grant_table.c.orig
-+++ xen/common/grant_table.c
-@@ -3053,7 +3053,7 @@ static int cache_flush(gnttab_cache_flus
-
- page = mfn_to_page(mfn);
- owner = page_get_owner_and_reference(page);
-- if ( !owner )
-+ if ( !owner || !owner->grant_table )
- {
- rcu_unlock_domain(d);
- return -EPERM;
diff --git a/sysutils/xenkernel48/patches/patch-XSA234 b/sysutils/xenkernel48/patches/patch-XSA234
deleted file mode 100644
index 557c6a8f447..00000000000
--- a/sysutils/xenkernel48/patches/patch-XSA234
+++ /dev/null
@@ -1,187 +0,0 @@
-$NetBSD: patch-XSA234,v 1.1 2017/10/17 08:42:30 bouyer Exp $
-
-From: Jan Beulich <jbeulich@suse.com>
-Subject: gnttab: also validate PTE permissions upon destroy/replace
-
-In order for PTE handling to match up with the reference counting done
-by common code, presence and writability of grant mapping PTEs must
-also be taken into account; validating just the frame number is not
-enough. This is in particular relevant if a guest fiddles with grant
-PTEs via non-grant hypercalls.
-
-Note that the flags being passed to replace_grant_host_mapping()
-already happen to be those of the existing mapping, so no new function
-parameter is needed.
-
-This is XSA-234.
-
-Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
-Signed-off-by: Jan Beulich <jbeulich@suse.com>
-Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
-
---- xen/arch/x86/mm.c.orig
-+++ xen/arch/x86/mm.c
-@@ -4017,7 +4017,8 @@ static int create_grant_pte_mapping(
- }
-
- static int destroy_grant_pte_mapping(
-- uint64_t addr, unsigned long frame, struct domain *d)
-+ uint64_t addr, unsigned long frame, unsigned int grant_pte_flags,
-+ struct domain *d)
- {
- int rc = GNTST_okay;
- void *va;
-@@ -4063,16 +4064,27 @@ static int destroy_grant_pte_mapping(
-
- ol1e = *(l1_pgentry_t *)va;
-
-- /* Check that the virtual address supplied is actually mapped to frame. */
-- if ( unlikely(l1e_get_pfn(ol1e) != frame) )
-+ /*
-+ * Check that the PTE supplied actually maps frame (with appropriate
-+ * permissions).
-+ */
-+ if ( unlikely(l1e_get_pfn(ol1e) != frame) ||
-+ unlikely((l1e_get_flags(ol1e) ^ grant_pte_flags) &
-+ (_PAGE_PRESENT | _PAGE_RW)) )
- {
- page_unlock(page);
-- MEM_LOG("PTE entry %lx for address %"PRIx64" doesn't match frame %lx",
-- (unsigned long)l1e_get_intpte(ol1e), addr, frame);
-+ MEM_LOG("PTE %"PRIpte" at %"PRIx64" doesn't match grant (%"PRIpte")",
-+ l1e_get_intpte(ol1e), addr,
-+ l1e_get_intpte(l1e_from_pfn(frame, grant_pte_flags)));
- rc = GNTST_general_error;
- goto failed;
- }
-
-+ if ( unlikely((l1e_get_flags(ol1e) ^ grant_pte_flags) &
-+ ~(_PAGE_AVAIL | PAGE_CACHE_ATTRS)) )
-+ MEM_LOG("PTE flags %x at %"PRIx64" don't match grant (%x)\n",
-+ l1e_get_flags(ol1e), addr, grant_pte_flags);
-+
- /* Delete pagetable entry. */
- if ( unlikely(!UPDATE_ENTRY
- (l1,
-@@ -4081,7 +4093,7 @@ static int destroy_grant_pte_mapping(
- 0)) )
- {
- page_unlock(page);
-- MEM_LOG("Cannot delete PTE entry at %p", va);
-+ MEM_LOG("Cannot delete PTE entry at %"PRIx64, addr);
- rc = GNTST_general_error;
- goto failed;
- }
-@@ -4149,7 +4161,8 @@ static int create_grant_va_mapping(
- }
-
- static int replace_grant_va_mapping(
-- unsigned long addr, unsigned long frame, l1_pgentry_t nl1e, struct vcpu *v)
-+ unsigned long addr, unsigned long frame, unsigned int grant_pte_flags,
-+ l1_pgentry_t nl1e, struct vcpu *v)
- {
- l1_pgentry_t *pl1e, ol1e;
- unsigned long gl1mfn;
-@@ -4185,19 +4198,30 @@ static int replace_grant_va_mapping(
-
- ol1e = *pl1e;
-
-- /* Check that the virtual address supplied is actually mapped to frame. */
-- if ( unlikely(l1e_get_pfn(ol1e) != frame) )
-- {
-- MEM_LOG("PTE entry %lx for address %lx doesn't match frame %lx",
-- l1e_get_pfn(ol1e), addr, frame);
-+ /*
-+ * Check that the virtual address supplied is actually mapped to frame
-+ * (with appropriate permissions).
-+ */
-+ if ( unlikely(l1e_get_pfn(ol1e) != frame) ||
-+ unlikely((l1e_get_flags(ol1e) ^ grant_pte_flags) &
-+ (_PAGE_PRESENT | _PAGE_RW)) )
-+ {
-+ MEM_LOG("PTE %"PRIpte" for %lx doesn't match grant (%"PRIpte")",
-+ l1e_get_intpte(ol1e), addr,
-+ l1e_get_intpte(l1e_from_pfn(frame, grant_pte_flags)));
- rc = GNTST_general_error;
- goto unlock_and_out;
- }
-
-+ if ( unlikely((l1e_get_flags(ol1e) ^ grant_pte_flags) &
-+ ~(_PAGE_AVAIL | PAGE_CACHE_ATTRS)) )
-+ MEM_LOG("PTE flags %x for %"PRIx64" don't match grant (%x)",
-+ l1e_get_flags(ol1e), addr, grant_pte_flags);
-+
- /* Delete pagetable entry. */
- if ( unlikely(!UPDATE_ENTRY(l1, pl1e, ol1e, nl1e, gl1mfn, v, 0)) )
- {
-- MEM_LOG("Cannot delete PTE entry at %p", (unsigned long *)pl1e);
-+ MEM_LOG("Cannot delete PTE entry for %"PRIx64, addr);
- rc = GNTST_general_error;
- goto unlock_and_out;
- }
-@@ -4211,9 +4235,11 @@ static int replace_grant_va_mapping(
- }
-
- static int destroy_grant_va_mapping(
-- unsigned long addr, unsigned long frame, struct vcpu *v)
-+ unsigned long addr, unsigned long frame, unsigned int grant_pte_flags,
-+ struct vcpu *v)
- {
-- return replace_grant_va_mapping(addr, frame, l1e_empty(), v);
-+ return replace_grant_va_mapping(addr, frame, grant_pte_flags,
-+ l1e_empty(), v);
- }
-
- static int create_grant_p2m_mapping(uint64_t addr, unsigned long frame,
-@@ -4307,21 +4333,40 @@ int replace_grant_host_mapping(
- unsigned long gl1mfn;
- struct page_info *l1pg;
- int rc;
-+ unsigned int grant_pte_flags;
-
- if ( paging_mode_external(current->domain) )
- return replace_grant_p2m_mapping(addr, frame, new_addr, flags);
-
-+ grant_pte_flags =
-+ _PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_GNTTAB | _PAGE_NX;
-+
-+ if ( flags & GNTMAP_application_map )
-+ grant_pte_flags |= _PAGE_USER;
-+ if ( !(flags & GNTMAP_readonly) )
-+ grant_pte_flags |= _PAGE_RW;
-+ /*
-+ * On top of the explicit settings done by create_grant_host_mapping()
-+ * also open-code relevant parts of adjust_guest_l1e(). Don't mirror
-+ * available and cachability flags, though.
-+ */
-+ if ( !is_pv_32bit_domain(curr->domain) )
-+ grant_pte_flags |= (grant_pte_flags & _PAGE_USER)
-+ ? _PAGE_GLOBAL
-+ : _PAGE_GUEST_KERNEL | _PAGE_USER;
-+
- if ( flags & GNTMAP_contains_pte )
- {
- if ( !new_addr )
-- return destroy_grant_pte_mapping(addr, frame, curr->domain);
-+ return destroy_grant_pte_mapping(addr, frame, grant_pte_flags,
-+ curr->domain);
-
- MEM_LOG("Unsupported grant table operation");
- return GNTST_general_error;
- }
-
- if ( !new_addr )
-- return destroy_grant_va_mapping(addr, frame, curr);
-+ return destroy_grant_va_mapping(addr, frame, grant_pte_flags, curr);
-
- pl1e = guest_map_l1e(new_addr, &gl1mfn);
- if ( !pl1e )
-@@ -4369,7 +4414,7 @@ int replace_grant_host_mapping(
- put_page(l1pg);
- guest_unmap_l1e(pl1e);
-
-- rc = replace_grant_va_mapping(addr, frame, ol1e, curr);
-+ rc = replace_grant_va_mapping(addr, frame, grant_pte_flags, ol1e, curr);
- if ( rc && !paging_mode_refcounts(curr->domain) )
- put_page_from_l1e(ol1e, curr->domain);
-
diff --git a/sysutils/xenkernel48/patches/patch-XSA237 b/sysutils/xenkernel48/patches/patch-XSA237
deleted file mode 100644
index 760a9e6a5a4..00000000000
--- a/sysutils/xenkernel48/patches/patch-XSA237
+++ /dev/null
@@ -1,311 +0,0 @@
-$NetBSD: patch-XSA237,v 1.1 2017/10/17 08:42:30 bouyer Exp $
-
-From: Jan Beulich <jbeulich@suse.com>
-Subject: x86: don't allow MSI pIRQ mapping on unowned device
-
-MSI setup should be permitted only for existing devices owned by the
-respective guest (the operation may still be carried out by the domain
-controlling that guest).
-
-This is part of XSA-237.
-
-Reported-by: HW42 <hw42@ipsumj.de>
-Signed-off-by: Jan Beulich <jbeulich@suse.com>
-Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
-
---- xen/arch/x86/irq.c.orig
-+++ xen/arch/x86/irq.c
-@@ -1964,7 +1964,10 @@ int map_domain_pirq(
- if ( !cpu_has_apic )
- goto done;
-
-- pdev = pci_get_pdev(msi->seg, msi->bus, msi->devfn);
-+ pdev = pci_get_pdev_by_domain(d, msi->seg, msi->bus, msi->devfn);
-+ if ( !pdev )
-+ goto done;
-+
- ret = pci_enable_msi(msi, &msi_desc);
- if ( ret )
- {
-From: Jan Beulich <jbeulich@suse.com>
-Subject: x86: enforce proper privilege when (un)mapping pIRQ-s
-
-(Un)mapping of IRQs, just like other RESOURCE__ADD* / RESOURCE__REMOVE*
-actions (in FLASK terms) should be XSM_DM_PRIV rather than XSM_TARGET.
-This in turn requires bypassing the XSM check in physdev_unmap_pirq()
-for the HVM emuirq case just like is being done in physdev_map_pirq().
-The primary goal security wise, however, is to no longer allow HVM
-guests, by specifying their own domain ID instead of DOMID_SELF, to
-enter code paths intended for PV guest and the control domains of HVM
-guests only.
-
-This is part of XSA-237.
-
-Reported-by: HW42 <hw42@ipsumj.de>
-Signed-off-by: Jan Beulich <jbeulich@suse.com>
-Reviewed-by: George Dunlap <george.dunlap@citrix.com>
-
---- xen/arch/x86/physdev.c.orig
-+++ xen/arch/x86/physdev.c
-@@ -110,7 +110,7 @@ int physdev_map_pirq(domid_t domid, int
- if ( d == NULL )
- return -ESRCH;
-
-- ret = xsm_map_domain_pirq(XSM_TARGET, d);
-+ ret = xsm_map_domain_pirq(XSM_DM_PRIV, d);
- if ( ret )
- goto free_domain;
-
-@@ -255,13 +255,14 @@ int physdev_map_pirq(domid_t domid, int
- int physdev_unmap_pirq(domid_t domid, int pirq)
- {
- struct domain *d;
-- int ret;
-+ int ret = 0;
-
- d = rcu_lock_domain_by_any_id(domid);
- if ( d == NULL )
- return -ESRCH;
-
-- ret = xsm_unmap_domain_pirq(XSM_TARGET, d);
-+ if ( domid != DOMID_SELF || !is_hvm_domain(d) )
-+ ret = xsm_unmap_domain_pirq(XSM_DM_PRIV, d);
- if ( ret )
- goto free_domain;
-
---- xen/include/xsm/dummy.h.orig
-+++ xen/include/xsm/dummy.h
-@@ -453,7 +453,7 @@ static XSM_INLINE char *xsm_show_irq_sid
-
- static XSM_INLINE int xsm_map_domain_pirq(XSM_DEFAULT_ARG struct domain *d)
- {
-- XSM_ASSERT_ACTION(XSM_TARGET);
-+ XSM_ASSERT_ACTION(XSM_DM_PRIV);
- return xsm_default_action(action, current->domain, d);
- }
-
-@@ -465,7 +465,7 @@ static XSM_INLINE int xsm_map_domain_irq
-
- static XSM_INLINE int xsm_unmap_domain_pirq(XSM_DEFAULT_ARG struct domain *d)
- {
-- XSM_ASSERT_ACTION(XSM_TARGET);
-+ XSM_ASSERT_ACTION(XSM_DM_PRIV);
- return xsm_default_action(action, current->domain, d);
- }
-
-From: Jan Beulich <jbeulich@suse.com>
-Subject: x86/MSI: disallow redundant enabling
-
-At the moment, Xen attempts to allow redundant enabling of MSI by
-having pci_enable_msi() return 0, and point to the existing MSI
-descriptor, when the msi already exists.
-
-Unfortunately, if subsequent errors are encountered, the cleanup
-paths assume pci_enable_msi() had done full initialization, and
-hence undo everything that was assumed to be done by that
-function without also undoing other setup that would normally
-occur only after that function was called (in map_domain_pirq()
-itself).
-
-Rather than try to make the redundant enabling case work properly, just
-forbid it entirely by having pci_enable_msi() return -EEXIST when MSI
-is already set up.
-
-This is part of XSA-237.
-
-Reported-by: HW42 <hw42@ipsumj.de>
-Signed-off-by: Jan Beulich <jbeulich@suse.com>
-Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
-Reviewed-by: George Dunlap <george.dunlap@citrix.com>
-
---- xen/arch/x86/msi.c.orig
-+++ xen/arch/x86/msi.c
-@@ -1050,11 +1050,10 @@ static int __pci_enable_msi(struct msi_i
- old_desc = find_msi_entry(pdev, msi->irq, PCI_CAP_ID_MSI);
- if ( old_desc )
- {
-- printk(XENLOG_WARNING "irq %d already mapped to MSI on %04x:%02x:%02x.%u\n",
-+ printk(XENLOG_ERR "irq %d already mapped to MSI on %04x:%02x:%02x.%u\n",
- msi->irq, msi->seg, msi->bus,
- PCI_SLOT(msi->devfn), PCI_FUNC(msi->devfn));
-- *desc = old_desc;
-- return 0;
-+ return -EEXIST;
- }
-
- old_desc = find_msi_entry(pdev, -1, PCI_CAP_ID_MSIX);
-@@ -1118,11 +1117,10 @@ static int __pci_enable_msix(struct msi_
- old_desc = find_msi_entry(pdev, msi->irq, PCI_CAP_ID_MSIX);
- if ( old_desc )
- {
-- printk(XENLOG_WARNING "irq %d already mapped to MSI-X on %04x:%02x:%02x.%u\n",
-+ printk(XENLOG_ERR "irq %d already mapped to MSI-X on %04x:%02x:%02x.%u\n",
- msi->irq, msi->seg, msi->bus,
- PCI_SLOT(msi->devfn), PCI_FUNC(msi->devfn));
-- *desc = old_desc;
-- return 0;
-+ return -EEXIST;
- }
-
- old_desc = find_msi_entry(pdev, -1, PCI_CAP_ID_MSI);
-From: Jan Beulich <jbeulich@suse.com>
-Subject: x86/IRQ: conditionally preserve irq <-> pirq mapping on map error paths
-
-Mappings that had been set up before should not be torn down when
-handling unrelated errors.
-
-This is part of XSA-237.
-
-Reported-by: HW42 <hw42@ipsumj.de>
-Signed-off-by: Jan Beulich <jbeulich@suse.com>
-Reviewed-by: George Dunlap <george.dunlap@citrix.com>
-
---- xen/arch/x86/irq.c.orig
-+++ xen/arch/x86/irq.c
-@@ -1252,7 +1252,8 @@ static int prepare_domain_irq_pirq(struc
- return -ENOMEM;
- }
- *pinfo = info;
-- return 0;
-+
-+ return !!err;
- }
-
- static void set_domain_irq_pirq(struct domain *d, int irq, struct pirq *pirq)
-@@ -1295,7 +1296,10 @@ int init_domain_irq_mapping(struct domai
- continue;
- err = prepare_domain_irq_pirq(d, i, i, &info);
- if ( err )
-+ {
-+ ASSERT(err < 0);
- break;
-+ }
- set_domain_irq_pirq(d, i, info);
- }
-
-@@ -1903,6 +1907,7 @@ int map_domain_pirq(
- struct pirq *info;
- struct irq_desc *desc;
- unsigned long flags;
-+ DECLARE_BITMAP(prepared, MAX_MSI_IRQS) = {};
-
- ASSERT(spin_is_locked(&d->event_lock));
-
-@@ -1946,8 +1951,10 @@ int map_domain_pirq(
- }
-
- ret = prepare_domain_irq_pirq(d, irq, pirq, &info);
-- if ( ret )
-+ if ( ret < 0 )
- goto revoke;
-+ if ( !ret )
-+ __set_bit(0, prepared);
-
- desc = irq_to_desc(irq);
-
-@@ -2019,8 +2026,10 @@ int map_domain_pirq(
- irq = create_irq(NUMA_NO_NODE);
- ret = irq >= 0 ? prepare_domain_irq_pirq(d, irq, pirq + nr, &info)
- : irq;
-- if ( ret )
-+ if ( ret < 0 )
- break;
-+ if ( !ret )
-+ __set_bit(nr, prepared);
- msi_desc[nr].irq = irq;
-
- if ( irq_permit_access(d, irq) != 0 )
-@@ -2053,15 +2062,15 @@ int map_domain_pirq(
- desc->msi_desc = NULL;
- spin_unlock_irqrestore(&desc->lock, flags);
- }
-- while ( nr-- )
-+ while ( nr )
- {
- if ( irq >= 0 && irq_deny_access(d, irq) )
- printk(XENLOG_G_ERR
- "dom%d: could not revoke access to IRQ%d (pirq %d)\n",
- d->domain_id, irq, pirq);
-- if ( info )
-+ if ( info && test_bit(nr, prepared) )
- cleanup_domain_irq_pirq(d, irq, info);
-- info = pirq_info(d, pirq + nr);
-+ info = pirq_info(d, pirq + --nr);
- irq = info->arch.irq;
- }
- msi_desc->irq = -1;
-@@ -2077,12 +2086,14 @@ int map_domain_pirq(
- spin_lock_irqsave(&desc->lock, flags);
- set_domain_irq_pirq(d, irq, info);
- spin_unlock_irqrestore(&desc->lock, flags);
-+ ret = 0;
- }
-
- done:
- if ( ret )
- {
-- cleanup_domain_irq_pirq(d, irq, info);
-+ if ( test_bit(0, prepared) )
-+ cleanup_domain_irq_pirq(d, irq, info);
- revoke:
- if ( irq_deny_access(d, irq) )
- printk(XENLOG_G_ERR
---- xen/arch/x86/physdev.c.orig
-+++ xen/arch/x86/physdev.c
-@@ -185,7 +185,7 @@ int physdev_map_pirq(domid_t domid, int
- }
- else if ( type == MAP_PIRQ_TYPE_MULTI_MSI )
- {
-- if ( msi->entry_nr <= 0 || msi->entry_nr > 32 )
-+ if ( msi->entry_nr <= 0 || msi->entry_nr > MAX_MSI_IRQS )
- ret = -EDOM;
- else if ( msi->entry_nr != 1 && !iommu_intremap )
- ret = -EOPNOTSUPP;
---- xen/include/asm-x86/msi.h.orig
-+++ xen/include/asm-x86/msi.h
-@@ -55,6 +55,8 @@
- /* MAX fixed pages reserved for mapping MSIX tables. */
- #define FIX_MSIX_MAX_PAGES 512
-
-+#define MAX_MSI_IRQS 32 /* limited by MSI capability struct properties */
-+
- struct msi_info {
- u16 seg;
- u8 bus;
-From: Jan Beulich <jbeulich@suse.com>
-Subject: x86/FLASK: fix unmap-domain-IRQ XSM hook
-
-The caller and the FLASK implementation of xsm_unmap_domain_irq()
-disagreed about what the "data" argument points to in the MSI case:
-Change both sides to pass/take a PCI device.
-
-This is part of XSA-237.
-
-Signed-off-by: Jan Beulich <jbeulich@suse.com>
-Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
-
---- xen/arch/x86/irq.c.orig
-+++ xen/arch/x86/irq.c
-@@ -2144,7 +2144,8 @@ int unmap_domain_pirq(struct domain *d,
- nr = msi_desc->msi.nvec;
- }
-
-- ret = xsm_unmap_domain_irq(XSM_HOOK, d, irq, msi_desc);
-+ ret = xsm_unmap_domain_irq(XSM_HOOK, d, irq,
-+ msi_desc ? msi_desc->dev : NULL);
- if ( ret )
- goto done;
-
---- xen/xsm/flask/hooks.c.orig
-+++ xen/xsm/flask/hooks.c
-@@ -915,8 +915,8 @@ static int flask_unmap_domain_msi (struc
- u32 *sid, struct avc_audit_data *ad)
- {
- #ifdef CONFIG_HAS_PCI
-- struct msi_info *msi = data;
-- u32 machine_bdf = (msi->seg << 16) | (msi->bus << 8) | msi->devfn;
-+ const struct pci_dev *pdev = data;
-+ u32 machine_bdf = (pdev->seg << 16) | (pdev->bus << 8) | pdev->devfn;
-
- AVC_AUDIT_DATA_INIT(ad, DEV);
- ad->device = machine_bdf;
diff --git a/sysutils/xenkernel48/patches/patch-XSA238 b/sysutils/xenkernel48/patches/patch-XSA238
deleted file mode 100644
index 4835be6fc7b..00000000000
--- a/sysutils/xenkernel48/patches/patch-XSA238
+++ /dev/null
@@ -1,47 +0,0 @@
-$NetBSD: patch-XSA238,v 1.1 2017/10/17 08:42:30 bouyer Exp $
-
-From cdc2887076b19b39fab9faec495082586f3113df Mon Sep 17 00:00:00 2001
-From: XenProject Security Team <security@xenproject.org>
-Date: Tue, 5 Sep 2017 13:41:37 +0200
-Subject: x86/ioreq server: correctly handle bogus
- XEN_DMOP_{,un}map_io_range_to_ioreq_server arguments
-
-Misbehaving device model can pass incorrect XEN_DMOP_map/
-unmap_io_range_to_ioreq_server arguments, namely end < start when
-specifying address range. When this happens we hit ASSERT(s <= e) in
-rangeset_contains_range()/rangeset_overlaps_range() with debug builds.
-Production builds will not trap right away but may misbehave later
-while handling such bogus ranges.
-
-This is XSA-238.
-
-Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
-Reviewed-by: Jan Beulich <jbeulich@suse.com>
----
- xen/arch/x86/hvm/ioreq.c | 6 ++++++
- 1 file changed, 6 insertions(+)
-
-diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c
-index b2a8b0e986..8c8bf1f0ec 100644
---- xen/arch/x86/hvm/ioreq.c.orig
-+++ xen/arch/x86/hvm/ioreq.c
-@@ -820,6 +820,9 @@ int hvm_map_io_range_to_ioreq_server(struct domain *d, ioservid_t id,
- struct hvm_ioreq_server *s;
- int rc;
-
-+ if ( start > end )
-+ return -EINVAL;
-+
- spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
-
- rc = -ENOENT;
-@@ -872,6 +875,9 @@ int hvm_unmap_io_range_from_ioreq_server(struct domain *d, ioservid_t id,
- struct hvm_ioreq_server *s;
- int rc;
-
-+ if ( start > end )
-+ return -EINVAL;
-+
- spin_lock_recursive(&d->arch.hvm_domain.ioreq_server.lock);
-
- rc = -ENOENT;
diff --git a/sysutils/xenkernel48/patches/patch-XSA239 b/sysutils/xenkernel48/patches/patch-XSA239
deleted file mode 100644
index dea0ad151f3..00000000000
--- a/sysutils/xenkernel48/patches/patch-XSA239
+++ /dev/null
@@ -1,48 +0,0 @@
-$NetBSD: patch-XSA239,v 1.1 2017/10/17 08:42:30 bouyer Exp $
-
-From: Jan Beulich <jbeulich@suse.com>
-Subject: x86/HVM: prefill partially used variable on emulation paths
-
-Certain handlers ignore the access size (vioapic_write() being the
-example this was found with), perhaps leading to subsequent reads
-seeing data that wasn't actually written by the guest. For
-consistency and extra safety also do this on the read path of
-hvm_process_io_intercept(), even if this doesn't directly affect what
-guests get to see, as we've supposedly already dealt with read handlers
-leaving data completely unitialized.
-
-This is XSA-239.
-
-Reported-by: Roger Pau Monné <roger.pau@citrix.com>
-Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
-Signed-off-by: Jan Beulich <jbeulich@suse.com>
-
---- xen/arch/x86/hvm/emulate.c.orig
-+++ xen/arch/x86/hvm/emulate.c
-@@ -129,7 +129,7 @@ static int hvmemul_do_io(
- .count = *reps,
- .dir = dir,
- .df = df,
-- .data = data,
-+ .data = data_is_addr ? data : 0,
- .data_is_ptr = data_is_addr, /* ioreq_t field name is misleading */
- .state = STATE_IOREQ_READY,
- };
---- xen/arch/x86/hvm/intercept.c.orig
-+++ xen/arch/x86/hvm/intercept.c
-@@ -127,6 +127,7 @@ int hvm_process_io_intercept(const struc
- addr = (p->type == IOREQ_TYPE_COPY) ?
- p->addr + step * i :
- p->addr;
-+ data = 0;
- rc = ops->read(handler, addr, p->size, &data);
- if ( rc != X86EMUL_OKAY )
- break;
-@@ -161,6 +162,7 @@ int hvm_process_io_intercept(const struc
- {
- if ( p->data_is_ptr )
- {
-+ data = 0;
- switch ( hvm_copy_from_guest_phys(&data, p->data + step * i,
- p->size) )
- {
diff --git a/sysutils/xenkernel48/patches/patch-XSA240 b/sysutils/xenkernel48/patches/patch-XSA240
deleted file mode 100644
index 8bc1f97215f..00000000000
--- a/sysutils/xenkernel48/patches/patch-XSA240
+++ /dev/null
@@ -1,665 +0,0 @@
-$NetBSD: patch-XSA240,v 1.2 2017/12/15 14:02:15 bouyer Exp $
-
-From 2315b8c651e0cc31c9153d09c9912b8fbe632ad2 Mon Sep 17 00:00:00 2001
-From: Jan Beulich <jbeulich@suse.com>
-Date: Thu, 28 Sep 2017 15:17:25 +0100
-Subject: [PATCH 1/2] x86: limit linear page table use to a single level
-
-That's the only way that they're meant to be used. Without such a
-restriction arbitrarily long chains of same-level page tables can be
-built, tearing down of which may then cause arbitrarily deep recursion,
-causing a stack overflow. To facilitate this restriction, a counter is
-being introduced to track both the number of same-level entries in a
-page table as well as the number of uses of a page table in another
-same-level one (counting into positive and negative direction
-respectively, utilizing the fact that both counts can't be non-zero at
-the same time).
-
-Note that the added accounting introduces a restriction on the number
-of times a page can be used in other same-level page tables - more than
-32k of such uses are no longer possible.
-
-Note also that some put_page_and_type[_preemptible]() calls are
-replaced with open-coded equivalents. This seemed preferrable to
-adding "parent_table" to the matrix of functions.
-
-Note further that cross-domain same-level page table references are no
-longer permitted (they probably never should have been).
-
-This is XSA-240.
-
-Reported-by: Jann Horn <jannh@google.com>
-Signed-off-by: Jan Beulich <jbeulich@suse.com>
-Signed-off-by: George Dunlap <george.dunlap@citrix.com>
----
- xen/arch/x86/domain.c | 1 +
- xen/arch/x86/mm.c | 171 ++++++++++++++++++++++++++++++++++++++-----
- xen/include/asm-x86/domain.h | 2 +
- xen/include/asm-x86/mm.h | 25 +++++--
- 4 files changed, 175 insertions(+), 24 deletions(-)
-
-diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
-index a725b43a67..5265b0496c 100644
---- xen/arch/x86/domain.c.orig
-+++ xen/arch/x86/domain.c
-@@ -1245,6 +1245,7 @@ int arch_set_info_guest(
- rc = -ERESTART;
- /* Fallthrough */
- case -ERESTART:
-+ v->arch.old_guest_ptpg = NULL;
- v->arch.old_guest_table =
- pagetable_get_page(v->arch.guest_table);
- v->arch.guest_table = pagetable_null();
-diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
-index a40461d4d6..31d4a03840 100644
---- xen/arch/x86/mm.c.orig
-+++ xen/arch/x86/mm.c
-@@ -733,6 +733,61 @@ static void put_data_page(
- put_page(page);
- }
-
-+static bool inc_linear_entries(struct page_info *pg)
-+{
-+ typeof(pg->linear_pt_count) nc = read_atomic(&pg->linear_pt_count), oc;
-+
-+ do {
-+ /*
-+ * The check below checks for the "linear use" count being non-zero
-+ * as well as overflow. Signed integer overflow is undefined behavior
-+ * according to the C spec. However, as long as linear_pt_count is
-+ * smaller in size than 'int', the arithmetic operation of the
-+ * increment below won't overflow; rather the result will be truncated
-+ * when stored. Ensure that this is always true.
-+ */
-+ BUILD_BUG_ON(sizeof(nc) >= sizeof(int));
-+ oc = nc++;
-+ if ( nc <= 0 )
-+ return false;
-+ nc = cmpxchg(&pg->linear_pt_count, oc, nc);
-+ } while ( oc != nc );
-+
-+ return true;
-+}
-+
-+static void dec_linear_entries(struct page_info *pg)
-+{
-+ typeof(pg->linear_pt_count) oc;
-+
-+ oc = arch_fetch_and_add(&pg->linear_pt_count, -1);
-+ ASSERT(oc > 0);
-+}
-+
-+static bool inc_linear_uses(struct page_info *pg)
-+{
-+ typeof(pg->linear_pt_count) nc = read_atomic(&pg->linear_pt_count), oc;
-+
-+ do {
-+ /* See the respective comment in inc_linear_entries(). */
-+ BUILD_BUG_ON(sizeof(nc) >= sizeof(int));
-+ oc = nc--;
-+ if ( nc >= 0 )
-+ return false;
-+ nc = cmpxchg(&pg->linear_pt_count, oc, nc);
-+ } while ( oc != nc );
-+
-+ return true;
-+}
-+
-+static void dec_linear_uses(struct page_info *pg)
-+{
-+ typeof(pg->linear_pt_count) oc;
-+
-+ oc = arch_fetch_and_add(&pg->linear_pt_count, 1);
-+ ASSERT(oc < 0);
-+}
-+
- /*
- * We allow root tables to map each other (a.k.a. linear page tables). It
- * needs some special care with reference counts and access permissions:
-@@ -762,15 +817,35 @@ get_##level##_linear_pagetable( \
- \
- if ( (pfn = level##e_get_pfn(pde)) != pde_pfn ) \
- { \
-+ struct page_info *ptpg = mfn_to_page(pde_pfn); \
-+ \
-+ /* Make sure the page table belongs to the correct domain. */ \
-+ if ( unlikely(page_get_owner(ptpg) != d) ) \
-+ return 0; \
-+ \
- /* Make sure the mapped frame belongs to the correct domain. */ \
- if ( unlikely(!get_page_from_pagenr(pfn, d)) ) \
- return 0; \
- \
- /* \
-- * Ensure that the mapped frame is an already-validated page table. \
-+ * Ensure that the mapped frame is an already-validated page table \
-+ * and is not itself having linear entries, as well as that the \
-+ * containing page table is not iself in use as a linear page table \
-+ * elsewhere. \
- * If so, atomically increment the count (checking for overflow). \
- */ \
- page = mfn_to_page(pfn); \
-+ if ( !inc_linear_entries(ptpg) ) \
-+ { \
-+ put_page(page); \
-+ return 0; \
-+ } \
-+ if ( !inc_linear_uses(page) ) \
-+ { \
-+ dec_linear_entries(ptpg); \
-+ put_page(page); \
-+ return 0; \
-+ } \
- y = page->u.inuse.type_info; \
- do { \
- x = y; \
-@@ -778,6 +853,8 @@ get_##level##_linear_pagetable( \
- unlikely((x & (PGT_type_mask|PGT_validated)) != \
- (PGT_##level##_page_table|PGT_validated)) ) \
- { \
-+ dec_linear_uses(page); \
-+ dec_linear_entries(ptpg); \
- put_page(page); \
- return 0; \
- } \
-@@ -1202,6 +1279,9 @@ get_page_from_l4e(
- l3e_remove_flags((pl3e), _PAGE_USER|_PAGE_RW|_PAGE_ACCESSED); \
- } while ( 0 )
-
-+static int _put_page_type(struct page_info *page, bool preemptible,
-+ struct page_info *ptpg);
-+
- void put_page_from_l1e(l1_pgentry_t l1e, struct domain *l1e_owner)
- {
- unsigned long pfn = l1e_get_pfn(l1e);
-@@ -1271,17 +1351,22 @@ static int put_page_from_l2e(l2_pgentry_t l2e, unsigned long pfn)
- if ( l2e_get_flags(l2e) & _PAGE_PSE )
- put_superpage(l2e_get_pfn(l2e));
- else
-- put_page_and_type(l2e_get_page(l2e));
-+ {
-+ struct page_info *pg = l2e_get_page(l2e);
-+ int rc = _put_page_type(pg, false, mfn_to_page(pfn));
-+
-+ ASSERT(!rc);
-+ put_page(pg);
-+ }
-
- return 0;
- }
-
--static int __put_page_type(struct page_info *, int preemptible);
--
- static int put_page_from_l3e(l3_pgentry_t l3e, unsigned long pfn,
- int partial, bool_t defer)
- {
- struct page_info *pg;
-+ int rc;
-
- if ( !(l3e_get_flags(l3e) & _PAGE_PRESENT) || (l3e_get_pfn(l3e) == pfn) )
- return 1;
-@@ -1304,21 +1389,28 @@ static int put_page_from_l3e(l3_pgentry_t l3e, unsigned long pfn,
- if ( unlikely(partial > 0) )
- {
- ASSERT(!defer);
-- return __put_page_type(pg, 1);
-+ return _put_page_type(pg, true, mfn_to_page(pfn));
- }
-
- if ( defer )
- {
-+ current->arch.old_guest_ptpg = mfn_to_page(pfn);
- current->arch.old_guest_table = pg;
- return 0;
- }
-
-- return put_page_and_type_preemptible(pg);
-+ rc = _put_page_type(pg, true, mfn_to_page(pfn));
-+ if ( likely(!rc) )
-+ put_page(pg);
-+
-+ return rc;
- }
-
- static int put_page_from_l4e(l4_pgentry_t l4e, unsigned long pfn,
- int partial, bool_t defer)
- {
-+ int rc = 1;
-+
- if ( (l4e_get_flags(l4e) & _PAGE_PRESENT) &&
- (l4e_get_pfn(l4e) != pfn) )
- {
-@@ -1327,18 +1419,22 @@ static int put_page_from_l4e(l4_pgentry_t l4e, unsigned long pfn,
- if ( unlikely(partial > 0) )
- {
- ASSERT(!defer);
-- return __put_page_type(pg, 1);
-+ return _put_page_type(pg, true, mfn_to_page(pfn));
- }
-
- if ( defer )
- {
-+ current->arch.old_guest_ptpg = mfn_to_page(pfn);
- current->arch.old_guest_table = pg;
- return 0;
- }
-
-- return put_page_and_type_preemptible(pg);
-+ rc = _put_page_type(pg, true, mfn_to_page(pfn));
-+ if ( likely(!rc) )
-+ put_page(pg);
- }
-- return 1;
-+
-+ return rc;
- }
-
- static int alloc_l1_table(struct page_info *page)
-@@ -1536,6 +1632,7 @@ static int alloc_l3_table(struct page_info *page)
- {
- page->nr_validated_ptes = i;
- page->partial_pte = 0;
-+ current->arch.old_guest_ptpg = NULL;
- current->arch.old_guest_table = page;
- }
- while ( i-- > 0 )
-@@ -1628,6 +1725,7 @@ static int alloc_l4_table(struct page_info *page)
- {
- if ( current->arch.old_guest_table )
- page->nr_validated_ptes++;
-+ current->arch.old_guest_ptpg = NULL;
- current->arch.old_guest_table = page;
- }
- }
-@@ -2370,14 +2468,20 @@ int free_page_type(struct page_info *pag
- }
-
-
--static int __put_final_page_type(
-- struct page_info *page, unsigned long type, int preemptible)
-+static int _put_final_page_type(struct page_info *page, unsigned long type,
-+ bool preemptible, struct page_info *ptpg)
- {
- int rc = free_page_type(page, type, preemptible);
-
- /* No need for atomic update of type_info here: noone else updates it. */
- if ( rc == 0 )
- {
-+ if ( ptpg && PGT_type_equal(type, ptpg->u.inuse.type_info) )
-+ {
-+ dec_linear_uses(page);
-+ dec_linear_entries(ptpg);
-+ }
-+ ASSERT(!page->linear_pt_count || page_get_owner(page)->is_dying);
- /*
- * Record TLB information for flush later. We do not stamp page tables
- * when running in shadow mode:
-@@ -2413,8 +2517,8 @@ static int __put_final_page_type(
- }
-
-
--static int __put_page_type(struct page_info *page,
-- int preemptible)
-+static int _put_page_type(struct page_info *page, bool preemptible,
-+ struct page_info *ptpg)
- {
- unsigned long nx, x, y = page->u.inuse.type_info;
- int rc = 0;
-@@ -2441,12 +2545,28 @@ static int __put_page_type(struct page_info *page,
- x, nx)) != x) )
- continue;
- /* We cleared the 'valid bit' so we do the clean up. */
-- rc = __put_final_page_type(page, x, preemptible);
-+ rc = _put_final_page_type(page, x, preemptible, ptpg);
-+ ptpg = NULL;
- if ( x & PGT_partial )
- put_page(page);
- break;
- }
-
-+ if ( ptpg && PGT_type_equal(x, ptpg->u.inuse.type_info) )
-+ {
-+ /*
-+ * page_set_tlbflush_timestamp() accesses the same union
-+ * linear_pt_count lives in. Unvalidated page table pages,
-+ * however, should occur during domain destruction only
-+ * anyway. Updating of linear_pt_count luckily is not
-+ * necessary anymore for a dying domain.
-+ */
-+ ASSERT(page_get_owner(page)->is_dying);
-+ ASSERT(page->linear_pt_count < 0);
-+ ASSERT(ptpg->linear_pt_count > 0);
-+ ptpg = NULL;
-+ }
-+
- /*
- * Record TLB information for flush later. We do not stamp page
- * tables when running in shadow mode:
-@@ -2466,6 +2586,13 @@ static int __put_page_type(struct page_info *page,
- return -EINTR;
- }
-
-+ if ( ptpg && PGT_type_equal(x, ptpg->u.inuse.type_info) )
-+ {
-+ ASSERT(!rc);
-+ dec_linear_uses(page);
-+ dec_linear_entries(ptpg);
-+ }
-+
- return rc;
- }
-
-@@ -2600,6 +2727,7 @@ static int __get_page_type(struct page_info *page, unsigned long type,
- page->nr_validated_ptes = 0;
- page->partial_pte = 0;
- }
-+ page->linear_pt_count = 0;
- rc = alloc_page_type(page, type, preemptible);
- }
-
-@@ -2614,7 +2742,7 @@ static int __get_page_type(struct page_info *page, unsigned long type,
-
- void put_page_type(struct page_info *page)
- {
-- int rc = __put_page_type(page, 0);
-+ int rc = _put_page_type(page, false, NULL);
- ASSERT(rc == 0);
- (void)rc;
- }
-@@ -2630,7 +2758,7 @@ int get_page_type(struct page_info *page, unsigned long type)
-
- int put_page_type_preemptible(struct page_info *page)
- {
-- return __put_page_type(page, 1);
-+ return _put_page_type(page, true, NULL);
- }
-
- int get_page_type_preemptible(struct page_info *page, unsigned long type)
-@@ -2836,11 +2964,14 @@ int put_old_guest_table(struct vcpu *v)
- if ( !v->arch.old_guest_table )
- return 0;
-
-- switch ( rc = put_page_and_type_preemptible(v->arch.old_guest_table) )
-+ switch ( rc = _put_page_type(v->arch.old_guest_table, true,
-+ v->arch.old_guest_ptpg) )
- {
- case -EINTR:
- case -ERESTART:
- return -ERESTART;
-+ case 0:
-+ put_page(v->arch.old_guest_table);
- }
-
- v->arch.old_guest_table = NULL;
-@@ -2997,6 +3128,7 @@ int new_guest_cr3(unsigned long mfn)
- rc = -ERESTART;
- /* fallthrough */
- case -ERESTART:
-+ curr->arch.old_guest_ptpg = NULL;
- curr->arch.old_guest_table = page;
- break;
- default:
-@@ -3264,7 +3396,10 @@ long do_mmuext_op(
- if ( type == PGT_l1_page_table )
- put_page_and_type(page);
- else
-+ {
-+ curr->arch.old_guest_ptpg = NULL;
- curr->arch.old_guest_table = page;
-+ }
- }
- }
-
-@@ -3297,6 +3432,7 @@ long do_mmuext_op(
- {
- case -EINTR:
- case -ERESTART:
-+ curr->arch.old_guest_ptpg = NULL;
- curr->arch.old_guest_table = page;
- rc = 0;
- break;
-@@ -3375,6 +3511,7 @@ long do_mmuext_op(
- rc = -ERESTART;
- /* fallthrough */
- case -ERESTART:
-+ curr->arch.old_guest_ptpg = NULL;
- curr->arch.old_guest_table = page;
- break;
- default:
-diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
-index f6a40eb881..60bb8c9014 100644
---- xen/include/asm-x86/domain.h.orig
-+++ xen/include/asm-x86/domain.h
-@@ -531,6 +531,8 @@ struct arch_vcpu
- pagetable_t guest_table_user; /* (MFN) x86/64 user-space pagetable */
- pagetable_t guest_table; /* (MFN) guest notion of cr3 */
- struct page_info *old_guest_table; /* partially destructed pagetable */
-+ struct page_info *old_guest_ptpg; /* containing page table of the */
-+ /* former, if any */
- /* guest_table holds a ref to the page, and also a type-count unless
- * shadow refcounts are in use */
- pagetable_t shadow_table[4]; /* (MFN) shadow(s) of guest */
-diff --git a/xen/include/asm-x86/mm.h b/xen/include/asm-x86/mm.h
-index 6687dbc985..63590a7716 100644
---- xen/include/asm-x86/mm.h.orig
-+++ xen/include/asm-x86/mm.h
-@@ -125,11 +125,11 @@ struct page_info
- u32 tlbflush_timestamp;
-
- /*
-- * When PGT_partial is true then this field is valid and indicates
-- * that PTEs in the range [0, @nr_validated_ptes) have been validated.
-- * An extra page reference must be acquired (or not dropped) whenever
-- * PGT_partial gets set, and it must be dropped when the flag gets
-- * cleared. This is so that a get() leaving a page in partially
-+ * When PGT_partial is true then the first two fields are valid and
-+ * indicate that PTEs in the range [0, @nr_validated_ptes) have been
-+ * validated. An extra page reference must be acquired (or not dropped)
-+ * whenever PGT_partial gets set, and it must be dropped when the flag
-+ * gets cleared. This is so that a get() leaving a page in partially
- * validated state (where the caller would drop the reference acquired
- * due to the getting of the type [apparently] failing [-ERESTART])
- * would not accidentally result in a page left with zero general
-@@ -153,10 +153,18 @@ struct page_info
- * put_page_from_lNe() (due to the apparent failure), and hence it
- * must be dropped when the put operation is resumed (and completes),
- * but it must not be acquired if picking up the page for validation.
-+ *
-+ * The 3rd field, @linear_pt_count, indicates
-+ * - by a positive value, how many same-level page table entries a page
-+ * table has,
-+ * - by a negative value, in how many same-level page tables a page is
-+ * in use.
- */
- struct {
-- u16 nr_validated_ptes;
-- s8 partial_pte;
-+ u16 nr_validated_ptes:PAGETABLE_ORDER + 1;
-+ u16 :16 - PAGETABLE_ORDER - 1 - 2;
-+ s16 partial_pte:2;
-+ s16 linear_pt_count;
- };
-
- /*
-@@ -207,6 +215,9 @@ struct page_info
- #define PGT_count_width PG_shift(9)
- #define PGT_count_mask ((1UL<<PGT_count_width)-1)
-
-+/* Are the 'type mask' bits identical? */
-+#define PGT_type_equal(x, y) (!(((x) ^ (y)) & PGT_type_mask))
-+
- /* Cleared when the owning guest 'frees' this page. */
- #define _PGC_allocated PG_shift(1)
- #define PGC_allocated PG_mask(1, 1)
---
-2.14.1
-
-From 41d579aad2fee971e5ce0279a9b559a0fdc74452 Mon Sep 17 00:00:00 2001
-From: George Dunlap <george.dunlap@citrix.com>
-Date: Fri, 22 Sep 2017 11:46:55 +0100
-Subject: [PATCH 2/2] x86/mm: Disable PV linear pagetables by default
-
-Allowing pagetables to point to other pagetables of the same level
-(often called 'linear pagetables') has been included in Xen since its
-inception. But it is not used by the most common PV guests (Linux,
-NetBSD, minios), and has been the source of a number of subtle
-reference-counting bugs.
-
-Add a command-line option to control whether PV linear pagetables are
-allowed (disabled by default).
-
-Reported-by: Jann Horn <jannh@google.com>
-Signed-off-by: George Dunlap <george.dunlap@citrix.com>
-Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
----
-Changes since v2:
-- s/_/-/; in command-line option
-- Added __read_mostly
----
- docs/misc/xen-command-line.markdown | 15 +++++++++++++++
- xen/arch/x86/mm.c | 9 +++++++++
- 2 files changed, 24 insertions(+)
-
-diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
-index 54acc60723..ffa66eb146 100644
---- docs/misc/xen-command-line.markdown.orig
-+++ docs/misc/xen-command-line.markdown
-@@ -1350,6 +1350,21 @@ The following resources are available:
- CDP, one COS will corespond two CBMs other than one with CAT, due to the
- sum of CBMs is fixed, that means actual `cos_max` in use will automatically
- reduce to half when CDP is enabled.
-+
-+### pv-linear-pt
-+> `= <boolean>`
-+
-+> Default: `false`
-+
-+Allow PV guests to have pagetable entries pointing to other pagetables
-+of the same level (i.e., allowing L2 PTEs to point to other L2 pages).
-+This technique is often called "linear pagetables", and is sometimes
-+used to allow operating systems a simple way to consistently map the
-+current process's pagetables into its own virtual address space.
-+
-+None of the most common PV operating systems (Linux, NetBSD, MiniOS)
-+use this technique, but there may be custom operating systems which
-+do.
-
- ### reboot
- > `= t[riple] | k[bd] | a[cpi] | p[ci] | P[ower] | e[fi] | n[o] [, [w]arm | [c]old]`
-diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
-index 31d4a03840..5d125cff3a 100644
---- xen/arch/x86/mm.c.orig
-+++ xen/arch/x86/mm.c
-@@ -800,6 +800,9 @@ static void dec_linear_uses(struct page_info *pg)
- * frame if it is mapped by a different root table. This is sufficient and
- * also necessary to allow validation of a root table mapping itself.
- */
-+static bool __read_mostly pv_linear_pt_enable = true;
-+boolean_param("pv-linear-pt", pv_linear_pt_enable);
-+
- #define define_get_linear_pagetable(level) \
- static int \
- get_##level##_linear_pagetable( \
-@@ -809,6 +812,12 @@ get_##level##_linear_pagetable( \
- struct page_info *page; \
- unsigned long pfn; \
- \
-+ if ( !pv_linear_pt_enable ) \
-+ { \
-+ MEM_LOG("Attempt to create linear p.t. (feature disabled)"); \
-+ return 0; \
-+ } \
-+ \
- if ( (level##e_get_flags(pde) & _PAGE_RW) ) \
- { \
- MEM_LOG("Attempt to create linear p.t. with write perms"); \
---
-2.14.1
-
-From: Jan Beulich <jbeulich@suse.com>
-Subject: x86: don't wrongly trigger linear page table assertion
-
-_put_page_type() may do multiple iterations until its cmpxchg()
-succeeds. It invokes set_tlbflush_timestamp() on the first
-iteration, however. Code inside the function takes care of this, but
-- the assertion in _put_final_page_type() would trigger on the second
- iteration if time stamps in a debug build are permitted to be
- sufficiently much wider than the default 6 bits (see WRAP_MASK in
- flushtlb.c),
-- it returning -EINTR (for a continuation to be scheduled) would leave
- the page in an inconsistent state (until the re-invocation completes).
-Make the set_tlbflush_timestamp() invocation conditional, bypassing it
-(for now) only in the case we really can't tolerate the stamp to be
-stored.
-
-This is part of XSA-240.
-
-Signed-off-by: Jan Beulich <jbeulich@suse.com>
-Reviewed-by: George Dunlap <george.dunlap@citrix.com>
-
---- xen/arch/x86/mm.c.orig
-+++ xen/arch/x86/mm.c
-@@ -2561,30 +2561,21 @@
- break;
- }
-
-- if ( ptpg && PGT_type_equal(x, ptpg->u.inuse.type_info) )
-- {
-- /*
-- * page_set_tlbflush_timestamp() accesses the same union
-- * linear_pt_count lives in. Unvalidated page table pages,
-- * however, should occur during domain destruction only
-- * anyway. Updating of linear_pt_count luckily is not
-- * necessary anymore for a dying domain.
-- */
-- ASSERT(page_get_owner(page)->is_dying);
-- ASSERT(page->linear_pt_count < 0);
-- ASSERT(ptpg->linear_pt_count > 0);
-- ptpg = NULL;
-- }
--
- /*
- * Record TLB information for flush later. We do not stamp page
- * tables when running in shadow mode:
- * 1. Pointless, since it's the shadow pt's which must be tracked.
- * 2. Shadow mode reuses this field for shadowed page tables to
- * store flags info -- we don't want to conflict with that.
-+ * Also page_set_tlbflush_timestamp() accesses the same union
-+ * linear_pt_count lives in. Pages (including page table ones),
-+ * however, don't need their flush time stamp set except when
-+ * the last reference is being dropped. For page table pages
-+ * this happens in _put_final_page_type().
- */
-- if ( !(shadow_mode_enabled(page_get_owner(page)) &&
-+ if ( (!ptpg || !PGT_type_equal(x, ptpg->u.inuse.type_info)) &&
-+ !(shadow_mode_enabled(page_get_owner(page)) &&
- (page->count_info & PGC_page_table)) )
- page->tlbflush_timestamp = tlbflush_current_time();
- }
-
-From: Jan Beulich <jbeulich@suse.com>
-Subject: x86: don't wrongly trigger linear page table assertion (2)
-
-_put_final_page_type(), when free_page_type() has exited early to allow
-for preemption, should not update the time stamp, as the page continues
-to retain the type which is in the process of being unvalidated. I can't
-see why the time stamp update was put on that path in the first place
-(albeit it may well have been me who had put it there years ago).
-
-This is part of XSA-240.
-
-Signed-off-by: Jan Beulich <jbeulich@suse.com>
-Reviewed-by: George Dunlap <george.dunlap@citrix.com>
-
---- xen/arch/x86/mm.c.orig
-+++ xen/arch/x86/mm.c
-@@ -2560,9 +2560,6 @@ static int _put_final_page_type(struct p
- {
- ASSERT((page->u.inuse.type_info &
- (PGT_count_mask|PGT_validated|PGT_partial)) == 1);
-- if ( !(shadow_mode_enabled(page_get_owner(page)) &&
-- (page->count_info & PGC_page_table)) )
-- page->tlbflush_timestamp = tlbflush_current_time();
- wmb();
- page->u.inuse.type_info |= PGT_validated;
- }
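The signed counter described in the XSA-240 commit message above can be modelled outside the hypervisor. The following is a minimal user-space sketch, assuming C11 atomics in place of Xen's read_atomic()/cmpxchg()/arch_fetch_and_add(); the type name linear_count_t is hypothetical and stands in for the s16 linear_pt_count field. Positive values count same-level entries a table holds, negative values count how often a page is used as a same-level target, and the two directions exclude each other.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the s16 linear_pt_count field of struct page_info. */
typedef _Atomic int16_t linear_count_t;

/* Positive direction: one more same-level entry inside this page table. */
static bool inc_linear_entries(linear_count_t *count)
{
    int16_t oc = atomic_load(count);
    int16_t nc;

    do {
        /* Refuse if the page is already a linear target, or on overflow. */
        if (oc < 0 || oc == INT16_MAX)
            return false;
        nc = (int16_t)(oc + 1);
        /* On failure, oc is reloaded with the current value and we retry. */
    } while (!atomic_compare_exchange_weak(count, &oc, nc));

    return true;
}

static void dec_linear_entries(linear_count_t *count)
{
    int16_t oc = atomic_fetch_sub(count, 1);

    assert(oc > 0);
    (void)oc;
}

/* Negative direction: this page is used by one more same-level table. */
static bool inc_linear_uses(linear_count_t *count)
{
    int16_t oc = atomic_load(count);
    int16_t nc;

    do {
        /* Refuse if the page itself has linear entries, or on underflow. */
        if (oc > 0 || oc == INT16_MIN)
            return false;
        nc = (int16_t)(oc - 1);
    } while (!atomic_compare_exchange_weak(count, &oc, nc));

    return true;
}

static void dec_linear_uses(linear_count_t *count)
{
    int16_t oc = atomic_fetch_add(count, 1);

    assert(oc < 0);
    (void)oc;
}
```

In this model a page whose counter is positive can never become a linear target, and a page whose counter is negative can never gain linear entries, which is how chains deeper than one level are ruled out.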
diff --git a/sysutils/xenkernel48/patches/patch-XSA241 b/sysutils/xenkernel48/patches/patch-XSA241
deleted file mode 100644
index 840b744fa43..00000000000
--- a/sysutils/xenkernel48/patches/patch-XSA241
+++ /dev/null
@@ -1,104 +0,0 @@
-$NetBSD: patch-XSA241,v 1.2 2017/12/15 14:02:15 bouyer Exp $
-
-x86: don't store possibly stale TLB flush time stamp
-
-While the timing window is extremely narrow, it is theoretically
-possible for an update to the TLB flush clock and a subsequent flush
-IPI to happen between the read and write parts of the update of the
-per-page stamp. Exclude this possibility by disabling interrupts
-across the update, preventing the IPI from being serviced in the middle.
-
-This is XSA-241.
-
-Reported-by: Jann Horn <jannh@google.com>
-Suggested-by: George Dunlap <george.dunlap@citrix.com>
-Signed-off-by: Jan Beulich <jbeulich@suse.com>
-Reviewed-by: George Dunlap <george.dunlap@citrix.com>
-
---- xen/arch/arm/smp.c.orig
-+++ xen/arch/arm/smp.c
-@@ -1,4 +1,5 @@
- #include <xen/config.h>
-+#include <xen/mm.h>
- #include <asm/system.h>
- #include <asm/smp.h>
- #include <asm/cpregs.h>
---- xen/arch/x86/mm.c.orig 2017-12-15 14:29:51.000000000 +0100
-+++ xen/arch/x86/mm.c 2017-12-15 14:30:10.000000000 +0100
-@@ -2500,7 +2500,7 @@
- */
- if ( !(shadow_mode_enabled(page_get_owner(page)) &&
- (page->count_info & PGC_page_table)) )
-- page->tlbflush_timestamp = tlbflush_current_time();
-+ page_set_tlbflush_timestamp(page);
- wmb();
- page->u.inuse.type_info--;
- }
-@@ -2573,7 +2573,7 @@
- if ( (!ptpg || !PGT_type_equal(x, ptpg->u.inuse.type_info)) &&
- !(shadow_mode_enabled(page_get_owner(page)) &&
- (page->count_info & PGC_page_table)) )
-- page->tlbflush_timestamp = tlbflush_current_time();
-+ page_set_tlbflush_timestamp(page);
- }
-
- if ( likely((y = cmpxchg(&page->u.inuse.type_info, x, nx)) == x) )
---- xen/arch/x86/mm/shadow/common.c.orig
-+++ xen/arch/x86/mm/shadow/common.c
-@@ -1464,7 +1464,7 @@ void shadow_free(struct domain *d, mfn_t
- * TLBs when we reuse the page. Because the destructors leave the
- * contents of the pages in place, we can delay TLB flushes until
- * just before the allocator hands the page out again. */
-- sp->tlbflush_timestamp = tlbflush_current_time();
-+ page_set_tlbflush_timestamp(sp);
- perfc_decr(shadow_alloc_count);
- page_list_add_tail(sp, &d->arch.paging.shadow.freelist);
- sp = next;
---- xen/common/page_alloc.c.orig
-+++ xen/common/page_alloc.c
-@@ -960,7 +960,7 @@ static void free_heap_pages(
- /* If a page has no owner it will need no safety TLB flush. */
- pg[i].u.free.need_tlbflush = (page_get_owner(&pg[i]) != NULL);
- if ( pg[i].u.free.need_tlbflush )
-- pg[i].tlbflush_timestamp = tlbflush_current_time();
-+ page_set_tlbflush_timestamp(&pg[i]);
-
- /* This page is not a guest frame any more. */
- page_set_owner(&pg[i], NULL); /* set_gpfn_from_mfn snoops pg owner */
---- xen/include/asm-arm/flushtlb.h.orig
-+++ xen/include/asm-arm/flushtlb.h
-@@ -12,6 +12,11 @@ static inline void tlbflush_filter(cpuma
-
- #define tlbflush_current_time() (0)
-
-+static inline void page_set_tlbflush_timestamp(struct page_info *page)
-+{
-+ page->tlbflush_timestamp = tlbflush_current_time();
-+}
-+
- #if defined(CONFIG_ARM_32)
- # include <asm/arm32/flushtlb.h>
- #elif defined(CONFIG_ARM_64)
---- xen/include/asm-x86/flushtlb.h.orig
-+++ xen/include/asm-x86/flushtlb.h
-@@ -23,6 +23,20 @@ DECLARE_PER_CPU(u32, tlbflush_time);
-
- #define tlbflush_current_time() tlbflush_clock
-
-+static inline void page_set_tlbflush_timestamp(struct page_info *page)
-+{
-+ /*
-+ * Prevent storing a stale time stamp, which could happen if an update
-+ * to tlbflush_clock plus a subsequent flush IPI happen between the
-+ * reading of tlbflush_clock and the writing of the struct page_info
-+ * field.
-+ */
-+ ASSERT(local_irq_is_enabled());
-+ local_irq_disable();
-+ page->tlbflush_timestamp = tlbflush_current_time();
-+ local_irq_enable();
-+}
-+
- /*
- * @cpu_stamp is the timestamp at last TLB flush for the CPU we are testing.
- * @lastuse_stamp is a timestamp taken when the PFN we are testing was last
diff --git a/sysutils/xenkernel48/patches/patch-XSA242 b/sysutils/xenkernel48/patches/patch-XSA242
deleted file mode 100644
index c5614cd0a79..00000000000
--- a/sysutils/xenkernel48/patches/patch-XSA242
+++ /dev/null
@@ -1,45 +0,0 @@
-$NetBSD: patch-XSA242,v 1.2 2017/12/15 14:02:15 bouyer Exp $
-
-From: Jan Beulich <jbeulich@suse.com>
-Subject: x86: don't allow page_unlock() to drop the last type reference
-
-Only _put_page_type() does the necessary cleanup, and hence not all
-domain pages can be released during guest cleanup (leaving around
-zombie domains) if we get this wrong.
-
-This is XSA-242.
-
-Signed-off-by: Jan Beulich <jbeulich@suse.com>
-
---- xen/arch/x86/mm.c.orig 2017-12-15 14:30:10.000000000 +0100
-+++ xen/arch/x86/mm.c 2017-12-15 14:31:32.000000000 +0100
-@@ -1906,7 +1906,11 @@
-
- do {
- x = y;
-+ ASSERT((x & PGT_count_mask) && (x & PGT_locked));
-+
- nx = x - (1 | PGT_locked);
-+ /* We must not drop the last reference here. */
-+ ASSERT(nx & PGT_count_mask);
- } while ( (y = cmpxchg(&page->u.inuse.type_info, x, nx)) != x );
- }
-
-@@ -2575,6 +2579,17 @@
- (page->count_info & PGC_page_table)) )
- page_set_tlbflush_timestamp(page);
- }
-+ else if ( unlikely((nx & (PGT_locked | PGT_count_mask)) ==
-+ (PGT_locked | 1)) )
-+ {
-+ /*
-+ * We must not drop the second to last reference when the page is
-+ * locked, as page_unlock() doesn't do any cleanup of the type.
-+ */
-+ cpu_relax();
-+ y = page->u.inuse.type_info;
-+ continue;
-+ }
-
- if ( likely((y = cmpxchg(&page->u.inuse.type_info, x, nx)) == x) )
- break;
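The two checks added for XSA-242 can be illustrated with plain integer arithmetic. This is only a sketch under assumed values: TYPE_COUNT_MASK and TYPE_LOCKED are hypothetical stand-ins for PGT_count_mask and PGT_locked, and the real bit positions in Xen differ. The helper names put_must_wait() and unlock_value() are likewise invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical layout; the real PGT_* bit positions in Xen are different. */
#define TYPE_COUNT_MASK 0xffffUL
#define TYPE_LOCKED     (1UL << 16)

/*
 * Model of the new check in _put_page_type(): dropping a reference must
 * wait if it would leave a locked page with only the lock's own type
 * reference, because page_unlock() performs no type cleanup.
 */
static bool put_must_wait(unsigned long x)
{
    unsigned long nx = x - 1;

    return (nx & (TYPE_LOCKED | TYPE_COUNT_MASK)) == (TYPE_LOCKED | 1);
}

/* Model of page_unlock(): clear the lock bit and drop its reference. */
static unsigned long unlock_value(unsigned long x)
{
    unsigned long nx;

    assert((x & TYPE_COUNT_MASK) && (x & TYPE_LOCKED));
    nx = x - (1 | TYPE_LOCKED);
    /* The unlock must never drop the last type reference. */
    assert(nx & TYPE_COUNT_MASK);

    return nx;
}
```

With a locked page holding two type references, put_must_wait() reports that the caller has to retry, while unlock_value() on the same value yields an unlocked page with one reference left for _put_page_type() to clean up.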
diff --git a/sysutils/xenkernel48/patches/patch-XSA243 b/sysutils/xenkernel48/patches/patch-XSA243
deleted file mode 100644
index 1464bba9dcb..00000000000
--- a/sysutils/xenkernel48/patches/patch-XSA243
+++ /dev/null
@@ -1,95 +0,0 @@
-$NetBSD: patch-XSA243,v 1.1 2017/10/17 08:42:30 bouyer Exp $
-
-From: Andrew Cooper <andrew.cooper3@citrix.com>
-Subject: x86/shadow: Don't create self-linear shadow mappings for 4-level translated guests
-
-When initially creating a monitor table for 4-level translated guests, don't
-install a shadow-linear mapping. This mapping is actually self-linear, and
-trips up the writeable heuristic logic into following Xen's mappings, not the
-guests' shadows it was expecting to follow.
-
-A consequence of this is that sh_guess_wrmap() needs to cope with there being
-no shadow-linear mapping present, which in practice occurs once each time a
-vcpu switches to 4-level paging from a different paging mode.
-
-An appropriate shadow-linear slot will be inserted into the monitor table
-either while constructing lower level monitor tables, or by sh_update_cr3().
-
-While fixing this, clarify the safety of the other mappings. Despite
-appearing unsafe, it is correct to create a guest-linear mapping for
-translated domains; this is self-linear and doesn't point into the translated
-domain. Drop a dead clause for translate != external guests.
-
-This is XSA-243.
-
-Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
-Acked-by: Tim Deegan <tim@xen.org>
-
-diff --git a/xen/arch/x86/mm/shadow/multi.c b/xen/arch/x86/mm/shadow/multi.c
-index d70b1c6..029e8d4 100644
---- xen/arch/x86/mm/shadow/multi.c.orig
-+++ xen/arch/x86/mm/shadow/multi.c
-@@ -1472,26 +1472,38 @@ void sh_install_xen_entries_in_l4(struct domain *d, mfn_t gl4mfn, mfn_t sl4mfn)
- sl4e[shadow_l4_table_offset(RO_MPT_VIRT_START)] = shadow_l4e_empty();
- }
-
-- /* Shadow linear mapping for 4-level shadows. N.B. for 3-level
-- * shadows on 64-bit xen, this linear mapping is later replaced by the
-- * monitor pagetable structure, which is built in make_monitor_table
-- * and maintained by sh_update_linear_entries. */
-- sl4e[shadow_l4_table_offset(SH_LINEAR_PT_VIRT_START)] =
-- shadow_l4e_from_mfn(sl4mfn, __PAGE_HYPERVISOR);
--
-- /* Self linear mapping. */
-- if ( shadow_mode_translate(d) && !shadow_mode_external(d) )
-+ /*
-+ * Linear mapping slots:
-+ *
-+ * Calling this function with gl4mfn == sl4mfn is used to construct a
-+ * monitor table for translated domains. In this case, gl4mfn forms the
-+ * self-linear mapping (i.e. not pointing into the translated domain), and
-+ * the shadow-linear slot is skipped. The shadow-linear slot is either
-+ * filled when constructing lower level monitor tables, or via
-+ * sh_update_cr3() for 4-level guests.
-+ *
-+ * Calling this function with gl4mfn != sl4mfn is used for non-translated
-+ * guests, where the shadow-linear slot is actually self-linear, and the
-+ * guest-linear slot points into the guests view of its pagetables.
-+ */
-+ if ( shadow_mode_translate(d) )
- {
-- // linear tables may not be used with translated PV guests
-- sl4e[shadow_l4_table_offset(LINEAR_PT_VIRT_START)] =
-+ ASSERT(mfn_eq(gl4mfn, sl4mfn));
-+
-+ sl4e[shadow_l4_table_offset(SH_LINEAR_PT_VIRT_START)] =
- shadow_l4e_empty();
- }
- else
- {
-- sl4e[shadow_l4_table_offset(LINEAR_PT_VIRT_START)] =
-- shadow_l4e_from_mfn(gl4mfn, __PAGE_HYPERVISOR);
-+ ASSERT(!mfn_eq(gl4mfn, sl4mfn));
-+
-+ sl4e[shadow_l4_table_offset(SH_LINEAR_PT_VIRT_START)] =
-+ shadow_l4e_from_mfn(sl4mfn, __PAGE_HYPERVISOR);
- }
-
-+ sl4e[shadow_l4_table_offset(LINEAR_PT_VIRT_START)] =
-+ shadow_l4e_from_mfn(gl4mfn, __PAGE_HYPERVISOR);
-+
- unmap_domain_page(sl4e);
- }
- #endif
-@@ -4287,6 +4299,11 @@ static int sh_guess_wrmap(struct vcpu *v, unsigned long vaddr, mfn_t gmfn)
-
- /* Carefully look in the shadow linear map for the l1e we expect */
- #if SHADOW_PAGING_LEVELS >= 4
-+ /* Is a shadow linear map installed in the first place? */
-+ sl4p = v->arch.paging.shadow.guest_vtable;
-+ sl4p += shadow_l4_table_offset(SH_LINEAR_PT_VIRT_START);
-+ if ( !(shadow_l4e_get_flags(*sl4p) & _PAGE_PRESENT) )
-+ return 0;
- sl4p = sh_linear_l4_table(v) + shadow_l4_linear_offset(vaddr);
- if ( !(shadow_l4e_get_flags(*sl4p) & _PAGE_PRESENT) )
- return 0;
diff --git a/sysutils/xenkernel48/patches/patch-XSA244 b/sysutils/xenkernel48/patches/patch-XSA244
deleted file mode 100644
index d241f0af72e..00000000000
--- a/sysutils/xenkernel48/patches/patch-XSA244
+++ /dev/null
@@ -1,61 +0,0 @@
-$NetBSD: patch-XSA244,v 1.1 2017/10/17 08:42:30 bouyer Exp $
-
-From: Andrew Cooper <andrew.cooper3@citrix.com>
-Subject: [PATCH] x86/cpu: Fix IST handling during PCPU bringup
-
-Clear IST references in newly allocated IDTs. Nothing good will come of
-having them set before the TSS is suitably constructed (although the chances
-of the CPU surviving such an IST interrupt/exception are extremely slim).
-
-Uniformly set the IST references after the TSS is in place. This fixes an
-issue on AMD hardware, where onlining a PCPU while PCPU0 is in HVM context
-will cause IST_NONE to be copied into the new IDT, making that PCPU vulnerable
-to privilege escalation from PV guests until it subsequently schedules an HVM
-guest.
-
-This is XSA-244.
-
-Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
-Reviewed-by: Jan Beulich <jbeulich@suse.com>
----
- xen/arch/x86/cpu/common.c | 5 +++++
- xen/arch/x86/smpboot.c | 3 +++
- 2 files changed, 8 insertions(+)
-
-diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
-index 78f5667..6cf3628 100644
---- xen/arch/x86/cpu/common.c.orig
-+++ xen/arch/x86/cpu/common.c
-@@ -640,6 +640,7 @@ void __init early_cpu_init(void)
- * - Sets up TSS with stack pointers, including ISTs
- * - Inserts TSS selector into regular and compat GDTs
- * - Loads GDT, IDT, TR then null LDT
-+ * - Sets up IST references in the IDT
- */
- void load_system_tables(void)
- {
-@@ -702,6 +703,10 @@ void load_system_tables(void)
- asm volatile ("ltr %w0" : : "rm" (TSS_ENTRY << 3) );
- asm volatile ("lldt %w0" : : "rm" (0) );
-
-+ set_ist(&idt_tables[cpu][TRAP_double_fault], IST_DF);
-+ set_ist(&idt_tables[cpu][TRAP_nmi], IST_NMI);
-+ set_ist(&idt_tables[cpu][TRAP_machine_check], IST_MCE);
-+
- /*
- * Bottom-of-stack must be 16-byte aligned!
- *
-diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
-index 3ca716c..1609b62 100644
---- xen/arch/x86/smpboot.c.orig
-+++ xen/arch/x86/smpboot.c
-@@ -724,6 +724,9 @@ static int cpu_smpboot_alloc(unsigned int cpu)
- if ( idt_tables[cpu] == NULL )
- goto oom;
- memcpy(idt_tables[cpu], idt_table, IDT_ENTRIES * sizeof(idt_entry_t));
-+ set_ist(&idt_tables[cpu][TRAP_double_fault], IST_NONE);
-+ set_ist(&idt_tables[cpu][TRAP_nmi], IST_NONE);
-+ set_ist(&idt_tables[cpu][TRAP_machine_check], IST_NONE);
-
- for ( stub_page = 0, i = cpu & ~(STUBS_PER_PAGE - 1);
- i < nr_cpu_ids && i <= (cpu | (STUBS_PER_PAGE - 1)); ++i )
diff --git a/sysutils/xenkernel48/patches/patch-XSA246 b/sysutils/xenkernel48/patches/patch-XSA246
deleted file mode 100644
index 4fedacc0c49..00000000000
--- a/sysutils/xenkernel48/patches/patch-XSA246
+++ /dev/null
@@ -1,76 +0,0 @@
-$NetBSD: patch-XSA246,v 1.1 2017/12/15 14:02:15 bouyer Exp $
-
-From: Julien Grall <julien.grall@linaro.org>
-Subject: x86/pod: prevent infinite loop when shattering large pages
-
-When populating pages, the PoD may need to split large ones using
-p2m_set_entry and request the caller to retry (see ept_get_entry for
-instance).
-
-p2m_set_entry may fail to shatter if it is not possible to allocate
-memory for the new page table. However, the error is not propagated
-resulting in the callers retrying the PoD infinitely.
-
-Prevent the infinite loop by returning false when it is not possible to
-shatter the large mapping.
-
-This is XSA-246.
-
-Signed-off-by: Julien Grall <julien.grall@linaro.org>
-Signed-off-by: Jan Beulich <jbeulich@suse.com>
-Reviewed-by: George Dunlap <george.dunlap@citrix.com>
-
---- xen/arch/x86/mm/p2m-pod.c.orig
-+++ xen/arch/x86/mm/p2m-pod.c
-@@ -1071,9 +1071,8 @@ p2m_pod_demand_populate(struct p2m_domai
- * NOTE: In a fine-grained p2m locking scenario this operation
- * may need to promote its locking from gfn->1g superpage
- */
-- p2m_set_entry(p2m, gfn_aligned, INVALID_MFN, PAGE_ORDER_2M,
-- p2m_populate_on_demand, p2m->default_access);
-- return 0;
-+ return p2m_set_entry(p2m, gfn_aligned, INVALID_MFN, PAGE_ORDER_2M,
-+ p2m_populate_on_demand, p2m->default_access);
- }
-
- /* Only reclaim if we're in actual need of more cache. */
-@@ -1104,8 +1103,12 @@ p2m_pod_demand_populate(struct p2m_domai
-
- gfn_aligned = (gfn >> order) << order;
-
-- p2m_set_entry(p2m, gfn_aligned, mfn, order, p2m_ram_rw,
-- p2m->default_access);
-+ if ( p2m_set_entry(p2m, gfn_aligned, mfn, order, p2m_ram_rw,
-+ p2m->default_access) )
-+ {
-+ p2m_pod_cache_add(p2m, p, order);
-+ goto out_fail;
-+ }
-
- for( i = 0; i < (1UL << order); i++ )
- {
-@@ -1150,13 +1153,18 @@ remap_and_retry:
- BUG_ON(order != PAGE_ORDER_2M);
- pod_unlock(p2m);
-
-- /* Remap this 2-meg region in singleton chunks */
-- /* NOTE: In a p2m fine-grained lock scenario this might
-- * need promoting the gfn lock from gfn->2M superpage */
-+ /*
-+ * Remap this 2-meg region in singleton chunks. See the comment on the
-+ * 1G page splitting path above for why a single call suffices.
-+ *
-+ * NOTE: In a p2m fine-grained lock scenario this might
-+ * need promoting the gfn lock from gfn->2M superpage.
-+ */
- gfn_aligned = (gfn>>order)<<order;
-- for(i=0; i<(1<<order); i++)
-- p2m_set_entry(p2m, gfn_aligned + i, INVALID_MFN, PAGE_ORDER_4K,
-- p2m_populate_on_demand, p2m->default_access);
-+ if ( p2m_set_entry(p2m, gfn_aligned, INVALID_MFN, PAGE_ORDER_4K,
-+ p2m_populate_on_demand, p2m->default_access) )
-+ return -1;
-+
- if ( tb_init_done )
- {
- struct {
diff --git a/sysutils/xenkernel48/patches/patch-XSA247 b/sysutils/xenkernel48/patches/patch-XSA247
deleted file mode 100644
index 248e2702a0d..00000000000
--- a/sysutils/xenkernel48/patches/patch-XSA247
+++ /dev/null
@@ -1,287 +0,0 @@
-$NetBSD: patch-XSA247,v 1.1 2017/12/15 14:02:15 bouyer Exp $
-
-From 0a004cf322940d99432b84284b22f3a9ea67a282 Mon Sep 17 00:00:00 2001
-From: George Dunlap <george.dunlap@citrix.com>
-Date: Fri, 10 Nov 2017 16:53:54 +0000
-Subject: [PATCH 1/2] p2m: Always check to see if removing a p2m entry actually
- worked
-
-The PoD zero-check functions speculatively remove memory from the p2m,
-then check to see if it's completely zeroed, before putting it in the
-cache.
-
-Unfortunately, the p2m_set_entry() calls may fail if the underlying
-pagetable structure needs to change and the domain has exhausted its
-p2m memory pool: for instance, if we're removing a 2MiB region out of
-a 1GiB entry (in the p2m_pod_zero_check_superpage() case), or a 4k
-region out of a 2MiB or larger entry (in the p2m_pod_zero_check()
-case); and the return value is not checked.
-
-The underlying mfn will then be added into the PoD cache, and at some
-point mapped into another location in the p2m. If the guest
-afterwards balloons out this memory, it will be freed to the hypervisor
-and potentially reused by another domain, in spite of the fact that
-the original domain still has writable mappings to it.
-
-There are several places where p2m_set_entry() shouldn't be able to
-fail, as it is guaranteed to write an entry of the same order that
-succeeded before. Add a backstop of crashing the domain just in case,
-and an ASSERT_UNREACHABLE() to flag up the broken assumption on debug
-builds.
-
-While we're here, use PAGE_ORDER_2M rather than a magic constant.
-
-This is part of XSA-247.
-
-Reported-by: George Dunlap <george.dunlap.com>
-Signed-off-by: George Dunlap <george.dunlap@citrix.com>
-Reviewed-by: Jan Beulich <jbeulich@suse.com>
----
-v4:
-- Removed some trailing whitespace
-v3:
-- Reformat reset clause to be more compact
-- Make sure to set map[i] = NULL when unmapping in case we need to bail
-v2:
-- Crash a domain if a p2m_set_entry we think cannot fail fails anyway.
----
- xen/arch/x86/mm/p2m-pod.c | 77 +++++++++++++++++++++++++++++++++++++----------
- 1 file changed, 61 insertions(+), 16 deletions(-)
-
-diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
-index 0e15290390..d73a86dde0 100644
---- xen/arch/x86/mm/p2m-pod.c.orig
-+++ xen/arch/x86/mm/p2m-pod.c
-@@ -754,8 +754,10 @@ p2m_pod_zero_check_superpage(struct p2m_domain *p2m, unsigned long gfn)
- }
-
- /* Try to remove the page, restoring old mapping if it fails. */
-- p2m_set_entry(p2m, gfn, INVALID_MFN, PAGE_ORDER_2M,
-- p2m_populate_on_demand, p2m->default_access);
-+ if ( p2m_set_entry(p2m, gfn, INVALID_MFN, PAGE_ORDER_2M,
-+ p2m_populate_on_demand, p2m->default_access) )
-+ goto out;
-+
- p2m_tlb_flush_sync(p2m);
-
- /* Make none of the MFNs are used elsewhere... for example, mapped
-@@ -812,9 +814,18 @@ p2m_pod_zero_check_superpage(struct p2m_domain *p2m, unsigned long gfn)
- ret = SUPERPAGE_PAGES;
-
- out_reset:
-- if ( reset )
-- p2m_set_entry(p2m, gfn, mfn0, 9, type0, p2m->default_access);
--
-+ /*
-+ * This p2m_set_entry() call shouldn't be able to fail, since the same order
-+ * on the same gfn succeeded above. If that turns out to be false, crashing
-+ * the domain should be the safest way of making sure we don't leak memory.
-+ */
-+ if ( reset && p2m_set_entry(p2m, gfn, mfn0, PAGE_ORDER_2M,
-+ type0, p2m->default_access) )
-+ {
-+ ASSERT_UNREACHABLE();
-+ domain_crash(d);
-+ }
-+
- out:
- gfn_unlock(p2m, gfn, SUPERPAGE_ORDER);
- return ret;
-@@ -871,19 +882,30 @@ p2m_pod_zero_check(struct p2m_domain *p2m, unsigned long *gfns, int count)
- }
-
- /* Try to remove the page, restoring old mapping if it fails. */
-- p2m_set_entry(p2m, gfns[i], INVALID_MFN, PAGE_ORDER_4K,
-- p2m_populate_on_demand, p2m->default_access);
-+ if ( p2m_set_entry(p2m, gfns[i], INVALID_MFN, PAGE_ORDER_4K,
-+ p2m_populate_on_demand, p2m->default_access) )
-+ goto skip;
-
- /* See if the page was successfully unmapped. (Allow one refcount
- * for being allocated to a domain.) */
- if ( (mfn_to_page(mfns[i])->count_info & PGC_count_mask) > 1 )
- {
-+ /*
-+ * If the previous p2m_set_entry call succeeded, this one shouldn't
-+ * be able to fail. If it does, crashing the domain should be safe.
-+ */
-+ if ( p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
-+ types[i], p2m->default_access) )
-+ {
-+ ASSERT_UNREACHABLE();
-+ domain_crash(d);
-+ goto out_unmap;
-+ }
-+
-+ skip:
- unmap_domain_page(map[i]);
- map[i] = NULL;
-
-- p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
-- types[i], p2m->default_access);
--
- continue;
- }
- }
-@@ -902,12 +924,25 @@ p2m_pod_zero_check(struct p2m_domain *p2m, unsigned long *gfns, int count)
-
- unmap_domain_page(map[i]);
-
-- /* See comment in p2m_pod_zero_check_superpage() re gnttab
-- * check timing. */
-- if ( j < PAGE_SIZE/sizeof(*map[i]) )
-+ map[i] = NULL;
-+
-+ /*
-+ * See comment in p2m_pod_zero_check_superpage() re gnttab
-+ * check timing.
-+ */
-+ if ( j < (PAGE_SIZE / sizeof(*map[i])) )
- {
-- p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
-- types[i], p2m->default_access);
-+ /*
-+ * If the previous p2m_set_entry call succeeded, this one shouldn't
-+ * be able to fail. If it does, crashing the domain should be safe.
-+ */
-+ if ( p2m_set_entry(p2m, gfns[i], mfns[i], PAGE_ORDER_4K,
-+ types[i], p2m->default_access) )
-+ {
-+ ASSERT_UNREACHABLE();
-+ domain_crash(d);
-+ goto out_unmap;
-+ }
- }
- else
- {
-@@ -931,7 +966,17 @@ p2m_pod_zero_check(struct p2m_domain *p2m, unsigned long *gfns, int count)
- p2m->pod.entry_count++;
- }
- }
--
-+
-+ return;
-+
-+out_unmap:
-+ /*
-+ * Something went wrong, probably crashing the domain. Unmap
-+ * everything and return.
-+ */
-+ for ( i = 0; i < count; i++ )
-+ if ( map[i] )
-+ unmap_domain_page(map[i]);
- }
-
- #define POD_SWEEP_LIMIT 1024
---
-2.15.0
-
-From f01b21460bdd5205e1a92552d37a276866f64f1f Mon Sep 17 00:00:00 2001
-From: George Dunlap <george.dunlap@citrix.com>
-Date: Fri, 10 Nov 2017 16:53:55 +0000
-Subject: [PATCH 2/2] p2m: Check return value of p2m_set_entry() when
- decreasing reservation
-
-If the entire range specified to p2m_pod_decrease_reservation() is marked
-populate-on-demand, then it will make a single p2m_set_entry() call,
-reducing its PoD entry count.
-
-Unfortunately, in the right circumstances, this p2m_set_entry() call
-may fail. In that case, repeated calls to decrease_reservation() may
-cause p2m->pod.entry_count to fall below zero, potentially tripping
-over BUG_ON()s to the contrary.
-
-Instead, check to see if the entry succeeded, and return false if not.
-The caller will then call guest_remove_page() on the gfns, which will
-return -EINVAL upon finding no valid memory there to return.
-
-Unfortunately if the order > 0, the entry may have partially changed.
-A domain_crash() is probably the safest thing in that case.
-
-Other p2m_set_entry() calls in the same function should be fine,
-because they are writing the entry at its current order. Nonetheless,
-check the return value and crash if our assumption turns out to be
-wrong.
-
-This is part of XSA-247.
-
-Reported-by: George Dunlap <george.dunlap.com>
-Signed-off-by: George Dunlap <george.dunlap@citrix.com>
-Reviewed-by: Jan Beulich <jbeulich@suse.com>
----
-v2: Crash the domain if we're not sure it's safe (or if we think it
-can't happen)
----
- xen/arch/x86/mm/p2m-pod.c | 42 +++++++++++++++++++++++++++++++++---------
- 1 file changed, 33 insertions(+), 9 deletions(-)
-
-diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
-index d73a86dde0..c750d0d8cc 100644
---- xen/arch/x86/mm/p2m-pod.c.orig
-+++ xen/arch/x86/mm/p2m-pod.c
-@@ -557,11 +557,23 @@ p2m_pod_decrease_reservation(struct domain *d,
-
- if ( !nonpod )
- {
-- /* All PoD: Mark the whole region invalid and tell caller
-- * we're done. */
-- p2m_set_entry(p2m, gpfn, INVALID_MFN, order, p2m_invalid,
-- p2m->default_access);
-- p2m->pod.entry_count-=(1<<order);
-+ /*
-+ * All PoD: Mark the whole region invalid and tell caller
-+ * we're done.
-+ */
-+ if ( p2m_set_entry(p2m, gpfn, INVALID_MFN, order, p2m_invalid,
-+ p2m->default_access) )
-+ {
-+ /*
-+ * If this fails, we can't tell how much of the range was changed.
-+ * Best to crash the domain unless we're sure a partial change is
-+ * impossible.
-+ */
-+ if ( order != 0 )
-+ domain_crash(d);
-+ goto out_unlock;
-+ }
-+ p2m->pod.entry_count -= 1UL << order;
- BUG_ON(p2m->pod.entry_count < 0);
- ret = 1;
- goto out_entry_check;
-@@ -602,8 +614,14 @@ p2m_pod_decrease_reservation(struct domain *d,
- n = 1UL << cur_order;
- if ( t == p2m_populate_on_demand )
- {
-- p2m_set_entry(p2m, gpfn + i, INVALID_MFN, cur_order,
-- p2m_invalid, p2m->default_access);
-+ /* This shouldn't be able to fail */
-+ if ( p2m_set_entry(p2m, gpfn + i, INVALID_MFN, cur_order,
-+ p2m_invalid, p2m->default_access) )
-+ {
-+ ASSERT_UNREACHABLE();
-+ domain_crash(d);
-+ goto out_unlock;
-+ }
- p2m->pod.entry_count -= n;
- BUG_ON(p2m->pod.entry_count < 0);
- pod -= n;
-@@ -624,8 +642,14 @@ p2m_pod_decrease_reservation(struct domain *d,
-
- page = mfn_to_page(mfn);
-
-- p2m_set_entry(p2m, gpfn + i, INVALID_MFN, cur_order,
-- p2m_invalid, p2m->default_access);
-+ /* This shouldn't be able to fail */
-+ if ( p2m_set_entry(p2m, gpfn + i, INVALID_MFN, cur_order,
-+ p2m_invalid, p2m->default_access) )
-+ {
-+ ASSERT_UNREACHABLE();
-+ domain_crash(d);
-+ goto out_unlock;
-+ }
- p2m_tlb_flush_sync(p2m);
- for ( j = 0; j < n; ++j )
- set_gpfn_from_mfn(mfn_x(mfn), INVALID_M2P_ENTRY);
---
-2.15.0
-
diff --git a/sysutils/xenkernel48/patches/patch-XSA248 b/sysutils/xenkernel48/patches/patch-XSA248
deleted file mode 100644
index b0ccf377bb2..00000000000
--- a/sysutils/xenkernel48/patches/patch-XSA248
+++ /dev/null
@@ -1,164 +0,0 @@
-$NetBSD: patch-XSA248,v 1.1 2017/12/15 14:02:15 bouyer Exp $
-
-From: Jan Beulich <jbeulich@suse.com>
-Subject: x86/mm: don't wrongly set page ownership
-
-PV domains can obtain mappings of any pages owned by the correct domain,
-including ones that aren't actually assigned as "normal" RAM, but used
-by Xen internally. At the moment such "internal" pages marked as owned
-by a guest include pages used to track logdirty bits, as well as p2m
-pages and the "unpaged pagetable" for HVM guests. Since the PV memory
-management and shadow code conflict in their use of struct page_info
-fields, and since shadow code is being used for log-dirty handling for
-PV domains, pages coming from the shadow pool must, for PV domains, not
-have the domain set as their owner.
-
-While the change could be done conditionally for just the PV case in
-shadow code, do it unconditionally (and for consistency also for HAP),
-just to be on the safe side.
-
-There's one special case though for shadow code: The page table used for
-running a HVM guest in unpaged mode is subject to get_page() (in
-set_shadow_status()) and hence must have its owner set.
-
-This is XSA-248.
-
-Signed-off-by: Jan Beulich <jbeulich@suse.com>
-Reviewed-by: Tim Deegan <tim@xen.org>
-Reviewed-by: George Dunlap <george.dunlap@citrix.com>
-
---- xen/arch/x86/mm/hap/hap.c.orig
-+++ xen/arch/x86/mm/hap/hap.c
-@@ -283,8 +283,7 @@ static struct page_info *hap_alloc_p2m_p
- {
- d->arch.paging.hap.total_pages--;
- d->arch.paging.hap.p2m_pages++;
-- page_set_owner(pg, d);
-- pg->count_info |= 1;
-+ ASSERT(!page_get_owner(pg) && !(pg->count_info & PGC_count_mask));
- }
- else if ( !d->arch.paging.p2m_alloc_failed )
- {
-@@ -299,21 +298,23 @@ static struct page_info *hap_alloc_p2m_p
-
- static void hap_free_p2m_page(struct domain *d, struct page_info *pg)
- {
-+ struct domain *owner = page_get_owner(pg);
-+
- /* This is called both from the p2m code (which never holds the
- * paging lock) and the log-dirty code (which always does). */
- paging_lock_recursive(d);
-
-- ASSERT(page_get_owner(pg) == d);
-- /* Should have just the one ref we gave it in alloc_p2m_page() */
-- if ( (pg->count_info & PGC_count_mask) != 1 ) {
-- HAP_ERROR("Odd p2m page %p count c=%#lx t=%"PRtype_info"\n",
-- pg, pg->count_info, pg->u.inuse.type_info);
-+ /* Should still have no owner and count zero. */
-+ if ( owner || (pg->count_info & PGC_count_mask) )
-+ {
-+ HAP_ERROR("d%d: Odd p2m page %"PRI_mfn" d=%d c=%lx t=%"PRtype_info"\n",
-+ d->domain_id, mfn_x(page_to_mfn(pg)),
-+ owner ? owner->domain_id : DOMID_INVALID,
-+ pg->count_info, pg->u.inuse.type_info);
- WARN();
-+ pg->count_info &= ~PGC_count_mask;
-+ page_set_owner(pg, NULL);
- }
-- pg->count_info &= ~PGC_count_mask;
-- /* Free should not decrement domain's total allocation, since
-- * these pages were allocated without an owner. */
-- page_set_owner(pg, NULL);
- d->arch.paging.hap.p2m_pages--;
- d->arch.paging.hap.total_pages++;
- hap_free(d, page_to_mfn(pg));
---- xen/arch/x86/mm/shadow/common.c.orig
-+++ xen/arch/x86/mm/shadow/common.c
-@@ -1573,32 +1573,29 @@ shadow_alloc_p2m_page(struct domain *d)
- pg = mfn_to_page(shadow_alloc(d, SH_type_p2m_table, 0));
- d->arch.paging.shadow.p2m_pages++;
- d->arch.paging.shadow.total_pages--;
-+ ASSERT(!page_get_owner(pg) && !(pg->count_info & PGC_count_mask));
-
- paging_unlock(d);
-
-- /* Unlike shadow pages, mark p2m pages as owned by the domain.
-- * Marking the domain as the owner would normally allow the guest to
-- * create mappings of these pages, but these p2m pages will never be
-- * in the domain's guest-physical address space, and so that is not
-- * believed to be a concern. */
-- page_set_owner(pg, d);
-- pg->count_info |= 1;
- return pg;
- }
-
- static void
- shadow_free_p2m_page(struct domain *d, struct page_info *pg)
- {
-- ASSERT(page_get_owner(pg) == d);
-- /* Should have just the one ref we gave it in alloc_p2m_page() */
-- if ( (pg->count_info & PGC_count_mask) != 1 )
-+ struct domain *owner = page_get_owner(pg);
-+
-+ /* Should still have no owner and count zero. */
-+ if ( owner || (pg->count_info & PGC_count_mask) )
- {
-- SHADOW_ERROR("Odd p2m page count c=%#lx t=%"PRtype_info"\n",
-+ SHADOW_ERROR("d%d: Odd p2m page %"PRI_mfn" d=%d c=%lx t=%"PRtype_info"\n",
-+ d->domain_id, mfn_x(page_to_mfn(pg)),
-+ owner ? owner->domain_id : DOMID_INVALID,
- pg->count_info, pg->u.inuse.type_info);
-+ pg->count_info &= ~PGC_count_mask;
-+ page_set_owner(pg, NULL);
- }
-- pg->count_info &= ~PGC_count_mask;
- pg->u.sh.type = SH_type_p2m_table; /* p2m code reuses type-info */
-- page_set_owner(pg, NULL);
-
- /* This is called both from the p2m code (which never holds the
- * paging lock) and the log-dirty code (which always does). */
-@@ -3216,7 +3213,9 @@ int shadow_enable(struct domain *d, u32
- | _PAGE_PRESENT | _PAGE_RW | _PAGE_USER
- | _PAGE_ACCESSED | _PAGE_DIRTY | _PAGE_PSE);
- unmap_domain_page(e);
-+ pg->count_info = 1;
- pg->u.inuse.type_info = PGT_l2_page_table | 1 | PGT_validated;
-+ page_set_owner(pg, d);
- }
-
- paging_lock(d);
-@@ -3254,7 +3253,11 @@ int shadow_enable(struct domain *d, u32
- if ( rv != 0 && !pagetable_is_null(p2m_get_pagetable(p2m)) )
- p2m_teardown(p2m);
- if ( rv != 0 && pg != NULL )
-+ {
-+ pg->count_info &= ~PGC_count_mask;
-+ page_set_owner(pg, NULL);
- shadow_free_p2m_page(d, pg);
-+ }
- domain_unpause(d);
- return rv;
- }
-@@ -3363,7 +3366,22 @@ out:
-
- /* Must be called outside the lock */
- if ( unpaged_pagetable )
-+ {
-+ if ( page_get_owner(unpaged_pagetable) == d &&
-+ (unpaged_pagetable->count_info & PGC_count_mask) == 1 )
-+ {
-+ unpaged_pagetable->count_info &= ~PGC_count_mask;
-+ page_set_owner(unpaged_pagetable, NULL);
-+ }
-+ /* Complain here in cases where shadow_free_p2m_page() won't. */
-+ else if ( !page_get_owner(unpaged_pagetable) &&
-+ !(unpaged_pagetable->count_info & PGC_count_mask) )
-+ SHADOW_ERROR("d%d: Odd unpaged pt %"PRI_mfn" c=%lx t=%"PRtype_info"\n",
-+ d->domain_id, mfn_x(page_to_mfn(unpaged_pagetable)),
-+ unpaged_pagetable->count_info,
-+ unpaged_pagetable->u.inuse.type_info);
- shadow_free_p2m_page(d, unpaged_pagetable);
-+ }
- }
-
- void shadow_final_teardown(struct domain *d)
diff --git a/sysutils/xenkernel48/patches/patch-XSA249 b/sysutils/xenkernel48/patches/patch-XSA249
deleted file mode 100644
index a0780ca267c..00000000000
--- a/sysutils/xenkernel48/patches/patch-XSA249
+++ /dev/null
@@ -1,44 +0,0 @@
-$NetBSD: patch-XSA249,v 1.1 2017/12/15 14:02:15 bouyer Exp $
-
-From: Jan Beulich <jbeulich@suse.com>
-Subject: x86/shadow: fix refcount overflow check
-
-Commit c385d27079 ("x86 shadow: for multi-page shadows, explicitly track
-the first page") reduced the refcount width to 25, without adjusting the
-overflow check. Eliminate the disconnect by using a manifest constant.
-
-Interestingly, up to commit 047782fa01 ("Out-of-sync L1 shadows: OOS
-snapshot") the refcount was 27 bits wide, yet the check was already
-using 26.
-
-This is XSA-249.
-
-Signed-off-by: Jan Beulich <jbeulich@suse.com>
-Reviewed-by: George Dunlap <george.dunlap@citrix.com>
-Reviewed-by: Tim Deegan <tim@xen.org>
----
-v2: Simplify expression back to the style it was.
-
---- xen/arch/x86/mm/shadow/private.h.orig
-+++ xen/arch/x86/mm/shadow/private.h
-@@ -529,7 +529,7 @@ static inline int sh_get_ref(struct doma
- x = sp->u.sh.count;
- nx = x + 1;
-
-- if ( unlikely(nx >= 1U<<26) )
-+ if ( unlikely(nx >= (1U << PAGE_SH_REFCOUNT_WIDTH)) )
- {
- SHADOW_PRINTK("shadow ref overflow, gmfn=%lx smfn=%lx\n",
- __backpointer(sp), mfn_x(smfn));
---- xen/include/asm-x86/mm.h.orig
-+++ xen/include/asm-x86/mm.h
-@@ -82,7 +82,8 @@ struct page_info
- unsigned long type:5; /* What kind of shadow is this? */
- unsigned long pinned:1; /* Is the shadow pinned? */
- unsigned long head:1; /* Is this the first page of the shadow? */
-- unsigned long count:25; /* Reference count */
-+#define PAGE_SH_REFCOUNT_WIDTH 25
-+ unsigned long count:PAGE_SH_REFCOUNT_WIDTH; /* Reference count */
- } sh;
-
- /* Page is on a free list: ((count_info & PGC_count_mask) == 0). */
diff --git a/sysutils/xenkernel48/patches/patch-XSA250 b/sysutils/xenkernel48/patches/patch-XSA250
deleted file mode 100644
index 0ca2deeda00..00000000000
--- a/sysutils/xenkernel48/patches/patch-XSA250
+++ /dev/null
@@ -1,69 +0,0 @@
-$NetBSD: patch-XSA250,v 1.1 2017/12/15 14:02:15 bouyer Exp $
-
-From: Jan Beulich <jbeulich@suse.com>
-Subject: x86/shadow: fix ref-counting error handling
-
-The old-Linux handling in shadow_set_l4e() mistakenly ORed together the
-results of sh_get_ref() and sh_pin(). As the latter failing is not a
-correctness problem, simply ignore its return value.
-
-In sh_set_toplevel_shadow() a failing sh_get_ref() must not be
-accompanied by installing the entry, despite the domain being crashed.
-
-This is XSA-250.
-
-Signed-off-by: Jan Beulich <jbeulich@suse.com>
-Reviewed-by: Tim Deegan <tim@xen.org>
-
---- xen/arch/x86/mm/shadow/multi.c.orig
-+++ xen/arch/x86/mm/shadow/multi.c
-@@ -923,7 +923,7 @@ static int shadow_set_l4e(struct domain
- shadow_l4e_t new_sl4e,
- mfn_t sl4mfn)
- {
-- int flags = 0, ok;
-+ int flags = 0;
- shadow_l4e_t old_sl4e;
- paddr_t paddr;
- ASSERT(sl4e != NULL);
-@@ -938,15 +938,16 @@ static int shadow_set_l4e(struct domain
- {
- /* About to install a new reference */
- mfn_t sl3mfn = shadow_l4e_get_mfn(new_sl4e);
-- ok = sh_get_ref(d, sl3mfn, paddr);
-- /* Are we pinning l3 shadows to handle wierd linux behaviour? */
-- if ( sh_type_is_pinnable(d, SH_type_l3_64_shadow) )
-- ok |= sh_pin(d, sl3mfn);
-- if ( !ok )
-+
-+ if ( !sh_get_ref(d, sl3mfn, paddr) )
- {
- domain_crash(d);
- return SHADOW_SET_ERROR;
- }
-+
-+ /* Are we pinning l3 shadows to handle weird Linux behaviour? */
-+ if ( sh_type_is_pinnable(d, SH_type_l3_64_shadow) )
-+ sh_pin(d, sl3mfn);
- }
-
- /* Write the new entry */
-@@ -3965,14 +3966,15 @@ sh_set_toplevel_shadow(struct vcpu *v,
-
- /* Take a ref to this page: it will be released in sh_detach_old_tables()
- * or the next call to set_toplevel_shadow() */
-- if ( !sh_get_ref(d, smfn, 0) )
-+ if ( sh_get_ref(d, smfn, 0) )
-+ new_entry = pagetable_from_mfn(smfn);
-+ else
- {
- SHADOW_ERROR("can't install %#lx as toplevel shadow\n", mfn_x(smfn));
- domain_crash(d);
-+ new_entry = pagetable_null();
- }
-
-- new_entry = pagetable_from_mfn(smfn);
--
- install_new_entry:
- /* Done. Install it */
- SHADOW_PRINTK("%u/%u [%u] gmfn %#"PRI_mfn" smfn %#"PRI_mfn"\n",
diff --git a/sysutils/xenkernel48/patches/patch-XSA251 b/sysutils/xenkernel48/patches/patch-XSA251
deleted file mode 100644
index 929c0901897..00000000000
--- a/sysutils/xenkernel48/patches/patch-XSA251
+++ /dev/null
@@ -1,23 +0,0 @@
-$NetBSD: patch-XSA251,v 1.1 2017/12/15 14:02:15 bouyer Exp $
-
-From: Jan Beulich <jbeulich@suse.com>
-Subject: x86/paging: don't unconditionally BUG() on finding SHARED_M2P_ENTRY
-
-PV guests can fully control the values written into the P2M.
-
-This is XSA-251.
-
-Signed-off-by: Jan Beulich <jbeulich@suse.com>
-Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
-
---- xen/arch/x86/mm/paging.c.orig
-+++ xen/arch/x86/mm/paging.c
-@@ -276,7 +276,7 @@ void paging_mark_pfn_dirty(struct domain
- return;
-
- /* Shared MFNs should NEVER be marked dirty */
-- BUG_ON(SHARED_M2P(pfn));
-+ BUG_ON(paging_mode_translate(d) && SHARED_M2P(pfn));
-
- /*
- * Values with the MSB set denote MFNs that aren't really part of the
diff --git a/sysutils/xenkernel48/patches/patch-XSA254-1 b/sysutils/xenkernel48/patches/patch-XSA254-1
deleted file mode 100644
index 1448bf48457..00000000000
--- a/sysutils/xenkernel48/patches/patch-XSA254-1
+++ /dev/null
@@ -1,389 +0,0 @@
-$NetBSD: patch-XSA254-1,v 1.1 2018/01/18 10:28:13 bouyer Exp $
-
-From: Andrew Cooper <andrew.cooper3@citrix.com>
-Date: Wed, 17 Jan 2018 16:14:16 +0000 (+0100)
-Subject: x86/entry: Remove support for partial cpu_user_regs frames
-X-Git-Url: http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff_plain;h=a7cf0a3b818377a8a49baed3606bfa2f214cd645;hp=40c02dd27a3e350197ef438b1ea6ad21f275c1c5
-
-x86/entry: Remove support for partial cpu_user_regs frames
-
-Save all GPRs on entry to Xen.
-
-The entry_int82() path is via a DPL1 gate, only usable by 32bit PV guests, so
-can get away with only saving the 32bit registers. All other entrypoints can
-be reached from 32 or 64bit contexts.
-
-This is part of XSA-254.
-
-Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
-Reviewed-by: Wei Liu <wei.liu2@citrix.com>
-Acked-by: Jan Beulich <jbeulich@suse.com>
-master commit: f9eb74789af77e985ae653193f3622263499f674
-master date: 2018-01-05 19:57:07 +0000
----
-
-diff --git a/tools/tests/x86_emulator/x86_emulate.c b/tools/tests/x86_emulator/x86_emulate.c
-index 19d8385..127a926 100644
---- tools/tests/x86_emulator/x86_emulate.c.orig
-+++ tools/tests/x86_emulator/x86_emulate.c
-@@ -33,7 +33,6 @@ typedef bool bool_t;
- #define MASK_INSR(v, m) (((v) * ((m) & -(m))) & (m))
-
- #define cpu_has_amd_erratum(nr) 0
--#define mark_regs_dirty(r) ((void)(r))
-
- /* For generic assembly code: use macros to define operation/operand sizes. */
- #ifdef __i386__
-diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
-index c8a303d..747cf65 100644
---- xen/arch/x86/domain.c.orig
-+++ xen/arch/x86/domain.c
-@@ -148,7 +148,6 @@ static void noreturn continue_idle_domain(struct vcpu *v)
- static void noreturn continue_nonidle_domain(struct vcpu *v)
- {
- check_wakeup_from_wait();
-- mark_regs_dirty(guest_cpu_user_regs());
- reset_stack_and_jump(ret_from_intr);
- }
-
-diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
-index 249932a..f4bf8b5 100644
---- xen/arch/x86/traps.c.orig
-+++ xen/arch/x86/traps.c
-@@ -3049,7 +3049,6 @@ static int emulate_privileged_op(struct cpu_user_regs *regs)
- goto fail;
- if ( admin_io_okay(port, op_bytes, currd) )
- {
-- mark_regs_dirty(regs);
- io_emul(regs);
- }
- else
-@@ -3079,7 +3078,6 @@ static int emulate_privileged_op(struct cpu_user_regs *regs)
- goto fail;
- if ( admin_io_okay(port, op_bytes, currd) )
- {
-- mark_regs_dirty(regs);
- io_emul(regs);
- if ( (op_bytes == 1) && pv_post_outb_hook )
- pv_post_outb_hook(port, regs->eax);
-diff --git a/xen/arch/x86/x86_64/compat/entry.S b/xen/arch/x86/x86_64/compat/entry.S
-index 474ffbc..df693c2 100644
---- xen/arch/x86/x86_64/compat/entry.S.orig
-+++ xen/arch/x86/x86_64/compat/entry.S
-@@ -15,7 +15,8 @@
- ENTRY(compat_hypercall)
- ASM_CLAC
- pushq $0
-- SAVE_VOLATILE type=TRAP_syscall compat=1
-+ movl $TRAP_syscall, 4(%rsp)
-+ SAVE_ALL compat=1 /* DPL1 gate, restricted to 32bit PV guests only. */
- CR4_PV32_RESTORE
-
- cmpb $0,untrusted_msi(%rip)
-@@ -66,7 +67,6 @@ compat_test_guest_events:
- /* %rbx: struct vcpu */
- compat_process_softirqs:
- sti
-- andl $~TRAP_regs_partial,UREGS_entry_vector(%rsp)
- call do_softirq
- jmp compat_test_all_events
-
-@@ -203,7 +203,8 @@ ENTRY(cstar_enter)
- pushq $FLAT_USER_CS32
- pushq %rcx
- pushq $0
-- SAVE_VOLATILE TRAP_syscall
-+ movl $TRAP_syscall, 4(%rsp)
-+ SAVE_ALL
- GET_CURRENT(bx)
- movq VCPU_domain(%rbx),%rcx
- cmpb $0,DOMAIN_is_32bit_pv(%rcx)
-diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
-index 85f1a4b..ac9ab4c 100644
---- xen/arch/x86/x86_64/entry.S.orig
-+++ xen/arch/x86/x86_64/entry.S
-@@ -97,7 +97,8 @@ ENTRY(lstar_enter)
- pushq $FLAT_KERNEL_CS64
- pushq %rcx
- pushq $0
-- SAVE_VOLATILE TRAP_syscall
-+ movl $TRAP_syscall, 4(%rsp)
-+ SAVE_ALL
- GET_CURRENT(bx)
- testb $TF_kernel_mode,VCPU_thread_flags(%rbx)
- jz switch_to_kernel
-@@ -139,7 +140,6 @@ test_guest_events:
- /* %rbx: struct vcpu */
- process_softirqs:
- sti
-- SAVE_PRESERVED
- call do_softirq
- jmp test_all_events
-
-@@ -189,7 +189,8 @@ GLOBAL(sysenter_eflags_saved)
- pushq $3 /* ring 3 null cs */
- pushq $0 /* null rip */
- pushq $0
-- SAVE_VOLATILE TRAP_syscall
-+ movl $TRAP_syscall, 4(%rsp)
-+ SAVE_ALL
- GET_CURRENT(bx)
- cmpb $0,VCPU_sysenter_disables_events(%rbx)
- movq VCPU_sysenter_addr(%rbx),%rax
-@@ -206,7 +207,6 @@ UNLIKELY_END(sysenter_nt_set)
- leal (,%rcx,TBF_INTERRUPT),%ecx
- UNLIKELY_START(z, sysenter_gpf)
- movq VCPU_trap_ctxt(%rbx),%rsi
-- SAVE_PRESERVED
- movl $TRAP_gp_fault,UREGS_entry_vector(%rsp)
- movl %eax,TRAPBOUNCE_error_code(%rdx)
- movq TRAP_gp_fault * TRAPINFO_sizeof + TRAPINFO_eip(%rsi),%rax
-@@ -224,7 +224,8 @@ UNLIKELY_END(sysenter_gpf)
- ENTRY(int80_direct_trap)
- ASM_CLAC
- pushq $0
-- SAVE_VOLATILE 0x80
-+ movl $0x80, 4(%rsp)
-+ SAVE_ALL
-
- cmpb $0,untrusted_msi(%rip)
- UNLIKELY_START(ne, msi_check)
-@@ -252,7 +253,6 @@ int80_slow_path:
- * IDT entry with DPL==0.
- */
- movl $((0x80 << 3) | X86_XEC_IDT),UREGS_error_code(%rsp)
-- SAVE_PRESERVED
- movl $TRAP_gp_fault,UREGS_entry_vector(%rsp)
- /* A GPF wouldn't have incremented the instruction pointer. */
- subq $2,UREGS_rip(%rsp)
-diff --git a/xen/arch/x86/x86_64/traps.c b/xen/arch/x86/x86_64/traps.c
-index a9b0282..df4ac81 100644
-diff --git a/xen/arch/x86/x86_emulate.c b/xen/arch/x86/x86_emulate.c
---- xen/arch/x86/x86_64/traps.c.orig 2017-09-06 12:26:35.000000000 +0200
-+++ xen/arch/x86/x86_64/traps.c 2018-01-17 20:50:17.000000000 +0100
-@@ -66,15 +66,10 @@
- regs->rbp, regs->rsp, regs->r8);
- printk("r9: %016lx r10: %016lx r11: %016lx\n",
- regs->r9, regs->r10, regs->r11);
-- if ( !(regs->entry_vector & TRAP_regs_partial) )
-- {
-- printk("r12: %016lx r13: %016lx r14: %016lx\n",
-- regs->r12, regs->r13, regs->r14);
-- printk("r15: %016lx cr0: %016lx cr4: %016lx\n",
-- regs->r15, crs[0], crs[4]);
-- }
-- else
-- printk("cr0: %016lx cr4: %016lx\n", crs[0], crs[4]);
-+ printk("r12: %016lx r13: %016lx r14: %016lx\n",
-+ regs->r12, regs->r13, regs->r14);
-+ printk("r15: %016lx cr0: %016lx cr4: %016lx\n",
-+ regs->r15, crs[0], crs[4]);
- printk("cr3: %016lx cr2: %016lx\n", crs[3], crs[2]);
- printk("ds: %04x es: %04x fs: %04x gs: %04x "
- "ss: %04x cs: %04x\n",
-index f52f543..c1e2d54 100644
---- xen/arch/x86/x86_emulate.c.orig
-+++ xen/arch/x86/x86_emulate.c
-@@ -11,7 +11,6 @@
-
- #include <xen/domain_page.h>
- #include <asm/x86_emulate.h>
--#include <asm/asm_defns.h> /* mark_regs_dirty() */
- #include <asm/processor.h> /* current_cpu_info */
- #include <asm/xstate.h>
- #include <asm/amd.h> /* cpu_has_amd_erratum() */
-diff --git a/xen/arch/x86/x86_emulate/x86_emulate.c b/xen/arch/x86/x86_emulate/x86_emulate.c
-index c4d282a..9851416 100644
---- xen/arch/x86/x86_emulate/x86_emulate.c.orig
-+++ xen/arch/x86/x86_emulate/x86_emulate.c
-@@ -1559,10 +1559,10 @@ decode_register(
- case 9: p = &regs->r9; break;
- case 10: p = &regs->r10; break;
- case 11: p = &regs->r11; break;
-- case 12: mark_regs_dirty(regs); p = &regs->r12; break;
-- case 13: mark_regs_dirty(regs); p = &regs->r13; break;
-- case 14: mark_regs_dirty(regs); p = &regs->r14; break;
-- case 15: mark_regs_dirty(regs); p = &regs->r15; break;
-+ case 12: p = &regs->r12; break;
-+ case 13: p = &regs->r13; break;
-+ case 14: p = &regs->r14; break;
-+ case 15: p = &regs->r15; break;
- #endif
- default: BUG(); p = NULL; break;
- }
-diff --git a/xen/common/wait.c b/xen/common/wait.c
-index 4ac98c0..398f653 100644
---- xen/common/wait.c.orig
-+++ xen/common/wait.c
-@@ -128,7 +128,6 @@ static void __prepare_to_wait(struct waitqueue_vcpu *wqv)
- unsigned long dummy;
- u32 entry_vector = cpu_info->guest_cpu_user_regs.entry_vector;
-
-- cpu_info->guest_cpu_user_regs.entry_vector &= ~TRAP_regs_partial;
- ASSERT(wqv->esp == 0);
-
- /* Save current VCPU affinity; force wakeup on *this* CPU only. */
-diff --git a/xen/include/asm-x86/asm_defns.h b/xen/include/asm-x86/asm_defns.h
-index f1c6fa1..99cb337 100644
---- xen/include/asm-x86/asm_defns.h.orig
-+++ xen/include/asm-x86/asm_defns.h
-@@ -17,15 +17,6 @@
- void ret_from_intr(void);
- #endif
-
--#ifdef CONFIG_FRAME_POINTER
--/* Indicate special exception stack frame by inverting the frame pointer. */
--#define SETUP_EXCEPTION_FRAME_POINTER(offs) \
-- leaq offs(%rsp),%rbp; \
-- notq %rbp
--#else
--#define SETUP_EXCEPTION_FRAME_POINTER(offs)
--#endif
--
- #ifndef NDEBUG
- #define ASSERT_INTERRUPT_STATUS(x, msg) \
- pushf; \
-@@ -42,31 +33,6 @@ void ret_from_intr(void);
- #define ASSERT_INTERRUPTS_DISABLED \
- ASSERT_INTERRUPT_STATUS(z, "INTERRUPTS DISABLED")
-
--/*
-- * This flag is set in an exception frame when registers R12-R15 did not get
-- * saved.
-- */
--#define _TRAP_regs_partial 16
--#define TRAP_regs_partial (1 << _TRAP_regs_partial)
--/*
-- * This flag gets set in an exception frame when registers R12-R15 possibly
-- * get modified from their originally saved values and hence need to be
-- * restored even if the normal call flow would restore register values.
-- *
-- * The flag being set implies _TRAP_regs_partial to be unset. Restoring
-- * R12-R15 thus is
-- * - required when this flag is set,
-- * - safe when _TRAP_regs_partial is unset.
-- */
--#define _TRAP_regs_dirty 17
--#define TRAP_regs_dirty (1 << _TRAP_regs_dirty)
--
--#define mark_regs_dirty(r) ({ \
-- struct cpu_user_regs *r__ = (r); \
-- ASSERT(!((r__)->entry_vector & TRAP_regs_partial)); \
-- r__->entry_vector |= TRAP_regs_dirty; \
--})
--
- #ifdef __ASSEMBLY__
- # define _ASM_EX(p) p-.
- #else
-@@ -236,7 +202,7 @@ static always_inline void stac(void)
- #endif
-
- #ifdef __ASSEMBLY__
--.macro SAVE_ALL op
-+.macro SAVE_ALL op, compat=0
- .ifeqs "\op", "CLAC"
- ASM_CLAC
- .else
-@@ -255,40 +221,6 @@ static always_inline void stac(void)
- movq %rdx,UREGS_rdx(%rsp)
- movq %rcx,UREGS_rcx(%rsp)
- movq %rax,UREGS_rax(%rsp)
-- movq %r8,UREGS_r8(%rsp)
-- movq %r9,UREGS_r9(%rsp)
-- movq %r10,UREGS_r10(%rsp)
-- movq %r11,UREGS_r11(%rsp)
-- movq %rbx,UREGS_rbx(%rsp)
-- movq %rbp,UREGS_rbp(%rsp)
-- SETUP_EXCEPTION_FRAME_POINTER(UREGS_rbp)
-- movq %r12,UREGS_r12(%rsp)
-- movq %r13,UREGS_r13(%rsp)
-- movq %r14,UREGS_r14(%rsp)
-- movq %r15,UREGS_r15(%rsp)
--.endm
--
--/*
-- * Save all registers not preserved by C code or used in entry/exit code. Mark
-- * the frame as partial.
-- *
-- * @type: exception type
-- * @compat: R8-R15 don't need saving, and the frame nevertheless is complete
-- */
--.macro SAVE_VOLATILE type compat=0
--.if \compat
-- movl $\type,UREGS_entry_vector-UREGS_error_code(%rsp)
--.else
-- movl $\type|TRAP_regs_partial,\
-- UREGS_entry_vector-UREGS_error_code(%rsp)
--.endif
-- addq $-(UREGS_error_code-UREGS_r15),%rsp
-- cld
-- movq %rdi,UREGS_rdi(%rsp)
-- movq %rsi,UREGS_rsi(%rsp)
-- movq %rdx,UREGS_rdx(%rsp)
-- movq %rcx,UREGS_rcx(%rsp)
-- movq %rax,UREGS_rax(%rsp)
- .if !\compat
- movq %r8,UREGS_r8(%rsp)
- movq %r9,UREGS_r9(%rsp)
-@@ -297,20 +229,17 @@ static always_inline void stac(void)
- .endif
- movq %rbx,UREGS_rbx(%rsp)
- movq %rbp,UREGS_rbp(%rsp)
-- SETUP_EXCEPTION_FRAME_POINTER(UREGS_rbp)
--.endm
--
--/*
-- * Complete a frame potentially only partially saved.
-- */
--.macro SAVE_PRESERVED
-- btrl $_TRAP_regs_partial,UREGS_entry_vector(%rsp)
-- jnc 987f
-+#ifdef CONFIG_FRAME_POINTER
-+/* Indicate special exception stack frame by inverting the frame pointer. */
-+ leaq UREGS_rbp(%rsp), %rbp
-+ notq %rbp
-+#endif
-+.if !\compat
- movq %r12,UREGS_r12(%rsp)
- movq %r13,UREGS_r13(%rsp)
- movq %r14,UREGS_r14(%rsp)
- movq %r15,UREGS_r15(%rsp)
--987:
-+.endif
- .endm
-
- #define LOAD_ONE_REG(reg, compat) \
-@@ -351,33 +280,13 @@ static always_inline void stac(void)
- * @compat: R8-R15 don't need reloading
- */
- .macro RESTORE_ALL adj=0 compat=0
--.if !\compat
-- testl $TRAP_regs_dirty,UREGS_entry_vector(%rsp)
--.endif
- LOAD_C_CLOBBERED \compat
- .if !\compat
-- jz 987f
- movq UREGS_r15(%rsp),%r15
- movq UREGS_r14(%rsp),%r14
- movq UREGS_r13(%rsp),%r13
- movq UREGS_r12(%rsp),%r12
--#ifndef NDEBUG
-- .subsection 1
--987: testl $TRAP_regs_partial,UREGS_entry_vector(%rsp)
-- jnz 987f
-- cmpq UREGS_r15(%rsp),%r15
-- jne 789f
-- cmpq UREGS_r14(%rsp),%r14
-- jne 789f
-- cmpq UREGS_r13(%rsp),%r13
-- jne 789f
-- cmpq UREGS_r12(%rsp),%r12
-- je 987f
--789: BUG /* Corruption of partial register state. */
-- .subsection 0
--#endif
- .endif
--987:
- LOAD_ONE_REG(bp, \compat)
- LOAD_ONE_REG(bx, \compat)
- subq $-(UREGS_error_code-UREGS_r15+\adj), %rsp
diff --git a/sysutils/xenkernel48/patches/patch-XSA254-2 b/sysutils/xenkernel48/patches/patch-XSA254-2
deleted file mode 100644
index 06a1f08957b..00000000000
--- a/sysutils/xenkernel48/patches/patch-XSA254-2
+++ /dev/null
@@ -1,44 +0,0 @@
-$NetBSD: patch-XSA254-2,v 1.1 2018/01/18 10:28:13 bouyer Exp $
-
-From: Andrew Cooper <andrew.cooper3@citrix.com>
-Date: Wed, 17 Jan 2018 16:15:25 +0000 (+0100)
-Subject: x86/mm: Always set _PAGE_ACCESSED on L4e updates
-X-Git-Url: http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff_plain;h=049e2f45bfa488967494466ec6506c3ecae5fe0e;hp=49a44f089c59c326828b8aed39bee9743f5802fb
-
-x86/mm: Always set _PAGE_ACCESSED on L4e updates
-
-Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
-Reviewed-by: Jan Beulich <jbeulich@suse.com>
-master commit: bd61fe94bee0556bc2f64999a4a8315b93f90f21
-master date: 2018-01-15 13:53:16 +0000
----
-
-diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
-index be51d16..c22455d 100644
---- xen/arch/x86/mm.c.orig
-+++ xen/arch/x86/mm.c
-@@ -1297,11 +1297,23 @@ get_page_from_l4e(
- _PAGE_USER|_PAGE_RW); \
- } while ( 0 )
-
-+/*
-+ * When shadowing an L4 behind the guests back (e.g. for per-pcpu
-+ * purposes), we cannot efficiently sync access bit updates from hardware
-+ * (on the shadow tables) back into the guest view.
-+ *
-+ * We therefore unconditionally set _PAGE_ACCESSED even in the guests
-+ * view. This will appear to the guest as a CPU which proactively pulls
-+ * all valid L4e's into its TLB, which is compatible with the x86 ABI.
-+ *
-+ * At the time of writing, all PV guests set the access bit anyway, so
-+ * this is no actual change in their behaviour.
-+ */
- #define adjust_guest_l4e(pl4e, d) \
- do { \
- if ( likely(l4e_get_flags((pl4e)) & _PAGE_PRESENT) && \
- likely(!is_pv_32bit_domain(d)) ) \
-- l4e_add_flags((pl4e), _PAGE_USER); \
-+ l4e_add_flags((pl4e), _PAGE_USER | _PAGE_ACCESSED); \
- } while ( 0 )
-
- #define unadjust_guest_l3e(pl3e, d) \
diff --git a/sysutils/xenkernel48/patches/patch-XSA254-3 b/sysutils/xenkernel48/patches/patch-XSA254-3
deleted file mode 100644
index a922b876ae1..00000000000
--- a/sysutils/xenkernel48/patches/patch-XSA254-3
+++ /dev/null
@@ -1,758 +0,0 @@
-$NetBSD: patch-XSA254-3,v 1.1 2018/01/18 10:28:13 bouyer Exp $
-
-From 1ba477bde737bf9b28cc455bef1e9a6bc76d66fc Mon Sep 17 00:00:00 2001
-From: Jan Beulich <jbeulich@suse.com>
-Date: Wed, 17 Jan 2018 17:16:28 +0100
-Subject: [PATCH] x86: Meltdown band-aid against malicious 64-bit PV guests
-
-This is a very simplistic change limiting the amount of memory a running
-64-bit PV guest has mapped (and hence available for attacking): Only the
-mappings of stack, IDT, and TSS are being cloned from the direct map
-into per-CPU page tables. Guest controlled parts of the page tables are
-being copied into those per-CPU page tables upon entry into the guest.
-Cross-vCPU synchronization of top level page table entry changes is
-being effected by forcing other active vCPU-s of the guest into the
-hypervisor.
-
-The change to context_switch() isn't strictly necessary, but there's no
-reason to keep switching page tables once a PV guest is being scheduled
-out.
-
-This isn't providing full isolation yet, but it should be covering all
-pieces of information exposure of which would otherwise require an XSA.
-
-There is certainly much room for improvement, especially of performance,
-here - first and foremost suppressing all the negative effects on AMD
-systems. But in the interest of backportability (including to really old
-hypervisors, which may not even have alternative patching) any such is
-being left out here.
-
-Signed-off-by: Jan Beulich <jbeulich@suse.com>
-Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
-master commit: 5784de3e2067ed73efc2fe42e62831e8ae7f46c4
-master date: 2018-01-16 17:49:03 +0100
----
- xen/arch/x86/domain.c | 5 +
- xen/arch/x86/mm.c | 17 ++++
- xen/arch/x86/smpboot.c | 198 +++++++++++++++++++++++++++++++++++++
- xen/arch/x86/x86_64/asm-offsets.c | 2 +
- xen/arch/x86/x86_64/compat/entry.S | 11 +++
- xen/arch/x86/x86_64/entry.S | 149 +++++++++++++++++++++++++++-
- xen/include/asm-x86/asm_defns.h | 30 ++++++
- xen/include/asm-x86/current.h | 12 +++
- xen/include/asm-x86/processor.h | 1 +
- xen/include/asm-x86/x86_64/page.h | 5 +-
- 10 files changed, 424 insertions(+), 6 deletions(-)
-
-diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
-index 747cf65..bf3590d 100644
---- xen/arch/x86/domain.c.orig
-+++ xen/arch/x86/domain.c
-@@ -1951,6 +1951,9 @@ static void paravirt_ctxt_switch_to(struct vcpu *v)
-
- switch_kernel_stack(v);
-
-+ this_cpu(root_pgt)[root_table_offset(PERDOMAIN_VIRT_START)] =
-+ l4e_from_page(v->domain->arch.perdomain_l3_pg, __PAGE_HYPERVISOR_RW);
-+
- cr4 = pv_guest_cr4_to_real_cr4(v);
- if ( unlikely(cr4 != read_cr4()) )
- write_cr4(cr4);
-@@ -2119,6 +2122,8 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
-
- ASSERT(local_irq_is_enabled());
-
-+ get_cpu_info()->xen_cr3 = 0;
-+
- cpumask_copy(&dirty_mask, next->vcpu_dirty_cpumask);
- /* Allow at most one CPU at a time to be dirty. */
- ASSERT(cpumask_weight(&dirty_mask) <= 1);
-diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
-index c22455d..69e1ab6 100644
---- xen/arch/x86/mm.c.orig
-+++ xen/arch/x86/mm.c
-@@ -3861,6 +3861,7 @@ long do_mmu_update(
- struct vcpu *curr = current, *v = curr;
- struct domain *d = v->domain, *pt_owner = d, *pg_owner;
- struct domain_mmap_cache mapcache;
-+ bool sync_guest = false;
- uint32_t xsm_needed = 0;
- uint32_t xsm_checked = 0;
- int rc = put_old_guest_table(curr);
-@@ -3998,6 +3998,8 @@ long do_mmu_update(
- case PGT_l4_page_table:
- rc = mod_l4_entry(va, l4e_from_intpte(req.val), mfn,
- cmd == MMU_PT_UPDATE_PRESERVE_AD, v);
-+ if ( !rc )
-+ sync_guest = true;
- break;
- case PGT_writable_page:
- perfc_incr(writable_mmu_updates);
-@@ -4111,6 +4114,20 @@ long do_mmu_update(
-
- domain_mmap_cache_destroy(&mapcache);
-
-+ if ( sync_guest )
-+ {
-+ /*
-+ * Force other vCPU-s of the affected guest to pick up L4 entry
-+ * changes (if any). Issue a flush IPI with empty operation mask to
-+ * facilitate this (including ourselves waiting for the IPI to
-+ * actually have arrived). Utilize the fact that FLUSH_VA_VALID is
-+ * meaningless without FLUSH_CACHE, but will allow to pass the no-op
-+ * check in flush_area_mask().
-+ */
-+ flush_area_mask(pt_owner->domain_dirty_cpumask,
-+ ZERO_BLOCK_PTR, FLUSH_VA_VALID);
-+ }
-+
- perfc_add(num_page_updates, i);
-
- out:
-diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
-index 144258f..327c744 100644
---- xen/arch/x86/smpboot.c.orig
-+++ xen/arch/x86/smpboot.c
-@@ -319,6 +319,9 @@ void start_secondary(void *unused)
- */
- spin_debug_disable();
-
-+ get_cpu_info()->xen_cr3 = 0;
-+ get_cpu_info()->pv_cr3 = __pa(this_cpu(root_pgt));
-+
- load_system_tables();
-
- /* Full exception support from here on in. */
-@@ -628,6 +631,187 @@ void cpu_exit_clear(unsigned int cpu)
- set_cpu_state(CPU_STATE_DEAD);
- }
-
-+static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
-+{
-+ unsigned long linear = (unsigned long)ptr, pfn;
-+ unsigned int flags;
-+ l3_pgentry_t *pl3e = l4e_to_l3e(idle_pg_table[root_table_offset(linear)]) +
-+ l3_table_offset(linear);
-+ l2_pgentry_t *pl2e;
-+ l1_pgentry_t *pl1e;
-+
-+ if ( linear < DIRECTMAP_VIRT_START )
-+ return 0;
-+
-+ flags = l3e_get_flags(*pl3e);
-+ ASSERT(flags & _PAGE_PRESENT);
-+ if ( flags & _PAGE_PSE )
-+ {
-+ pfn = (l3e_get_pfn(*pl3e) & ~((1UL << (2 * PAGETABLE_ORDER)) - 1)) |
-+ (PFN_DOWN(linear) & ((1UL << (2 * PAGETABLE_ORDER)) - 1));
-+ flags &= ~_PAGE_PSE;
-+ }
-+ else
-+ {
-+ pl2e = l3e_to_l2e(*pl3e) + l2_table_offset(linear);
-+ flags = l2e_get_flags(*pl2e);
-+ ASSERT(flags & _PAGE_PRESENT);
-+ if ( flags & _PAGE_PSE )
-+ {
-+ pfn = (l2e_get_pfn(*pl2e) & ~((1UL << PAGETABLE_ORDER) - 1)) |
-+ (PFN_DOWN(linear) & ((1UL << PAGETABLE_ORDER) - 1));
-+ flags &= ~_PAGE_PSE;
-+ }
-+ else
-+ {
-+ pl1e = l2e_to_l1e(*pl2e) + l1_table_offset(linear);
-+ flags = l1e_get_flags(*pl1e);
-+ if ( !(flags & _PAGE_PRESENT) )
-+ return 0;
-+ pfn = l1e_get_pfn(*pl1e);
-+ }
-+ }
-+
-+ if ( !(root_get_flags(rpt[root_table_offset(linear)]) & _PAGE_PRESENT) )
-+ {
-+ pl3e = alloc_xen_pagetable();
-+ if ( !pl3e )
-+ return -ENOMEM;
-+ clear_page(pl3e);
-+ l4e_write(&rpt[root_table_offset(linear)],
-+ l4e_from_paddr(__pa(pl3e), __PAGE_HYPERVISOR));
-+ }
-+ else
-+ pl3e = l4e_to_l3e(rpt[root_table_offset(linear)]);
-+
-+ pl3e += l3_table_offset(linear);
-+
-+ if ( !(l3e_get_flags(*pl3e) & _PAGE_PRESENT) )
-+ {
-+ pl2e = alloc_xen_pagetable();
-+ if ( !pl2e )
-+ return -ENOMEM;
-+ clear_page(pl2e);
-+ l3e_write(pl3e, l3e_from_paddr(__pa(pl2e), __PAGE_HYPERVISOR));
-+ }
-+ else
-+ {
-+ ASSERT(!(l3e_get_flags(*pl3e) & _PAGE_PSE));
-+ pl2e = l3e_to_l2e(*pl3e);
-+ }
-+
-+ pl2e += l2_table_offset(linear);
-+
-+ if ( !(l2e_get_flags(*pl2e) & _PAGE_PRESENT) )
-+ {
-+ pl1e = alloc_xen_pagetable();
-+ if ( !pl1e )
-+ return -ENOMEM;
-+ clear_page(pl1e);
-+ l2e_write(pl2e, l2e_from_paddr(__pa(pl1e), __PAGE_HYPERVISOR));
-+ }
-+ else
-+ {
-+ ASSERT(!(l2e_get_flags(*pl2e) & _PAGE_PSE));
-+ pl1e = l2e_to_l1e(*pl2e);
-+ }
-+
-+ pl1e += l1_table_offset(linear);
-+
-+ if ( l1e_get_flags(*pl1e) & _PAGE_PRESENT )
-+ {
-+ ASSERT(l1e_get_pfn(*pl1e) == pfn);
-+ ASSERT(l1e_get_flags(*pl1e) == flags);
-+ }
-+ else
-+ l1e_write(pl1e, l1e_from_pfn(pfn, flags));
-+
-+ return 0;
-+}
-+
-+DEFINE_PER_CPU(root_pgentry_t *, root_pgt);
-+
-+static int setup_cpu_root_pgt(unsigned int cpu)
-+{
-+ root_pgentry_t *rpt = alloc_xen_pagetable();
-+ unsigned int off;
-+ int rc;
-+
-+ if ( !rpt )
-+ return -ENOMEM;
-+
-+ clear_page(rpt);
-+ per_cpu(root_pgt, cpu) = rpt;
-+
-+ rpt[root_table_offset(RO_MPT_VIRT_START)] =
-+ idle_pg_table[root_table_offset(RO_MPT_VIRT_START)];
-+ /* SH_LINEAR_PT inserted together with guest mappings. */
-+ /* PERDOMAIN inserted during context switch. */
-+ rpt[root_table_offset(XEN_VIRT_START)] =
-+ idle_pg_table[root_table_offset(XEN_VIRT_START)];
-+
-+ /* Install direct map page table entries for stack, IDT, and TSS. */
-+ for ( off = rc = 0; !rc && off < STACK_SIZE; off += PAGE_SIZE )
-+ rc = clone_mapping(__va(__pa(stack_base[cpu])) + off, rpt);
-+
-+ if ( !rc )
-+ rc = clone_mapping(idt_tables[cpu], rpt);
-+ if ( !rc )
-+ rc = clone_mapping(&per_cpu(init_tss, cpu), rpt);
-+
-+ return rc;
-+}
-+
-+static void cleanup_cpu_root_pgt(unsigned int cpu)
-+{
-+ root_pgentry_t *rpt = per_cpu(root_pgt, cpu);
-+ unsigned int r;
-+
-+ if ( !rpt )
-+ return;
-+
-+ per_cpu(root_pgt, cpu) = NULL;
-+
-+ for ( r = root_table_offset(DIRECTMAP_VIRT_START);
-+ r < root_table_offset(HYPERVISOR_VIRT_END); ++r )
-+ {
-+ l3_pgentry_t *l3t;
-+ unsigned int i3;
-+
-+ if ( !(root_get_flags(rpt[r]) & _PAGE_PRESENT) )
-+ continue;
-+
-+ l3t = l4e_to_l3e(rpt[r]);
-+
-+ for ( i3 = 0; i3 < L3_PAGETABLE_ENTRIES; ++i3 )
-+ {
-+ l2_pgentry_t *l2t;
-+ unsigned int i2;
-+
-+ if ( !(l3e_get_flags(l3t[i3]) & _PAGE_PRESENT) )
-+ continue;
-+
-+ ASSERT(!(l3e_get_flags(l3t[i3]) & _PAGE_PSE));
-+ l2t = l3e_to_l2e(l3t[i3]);
-+
-+ for ( i2 = 0; i2 < L2_PAGETABLE_ENTRIES; ++i2 )
-+ {
-+ if ( !(l2e_get_flags(l2t[i2]) & _PAGE_PRESENT) )
-+ continue;
-+
-+ ASSERT(!(l2e_get_flags(l2t[i2]) & _PAGE_PSE));
-+ free_xen_pagetable(l2e_to_l1e(l2t[i2]));
-+ }
-+
-+ free_xen_pagetable(l2t);
-+ }
-+
-+ free_xen_pagetable(l3t);
-+ }
-+
-+ free_xen_pagetable(rpt);
-+}
-+
- static void cpu_smpboot_free(unsigned int cpu)
- {
- unsigned int order, socket = cpu_to_socket(cpu);
-@@ -664,6 +848,8 @@ static void cpu_smpboot_free(unsigned int cpu)
- free_domheap_page(mfn_to_page(mfn));
- }
-
-+ cleanup_cpu_root_pgt(cpu);
-+
- order = get_order_from_pages(NR_RESERVED_GDT_PAGES);
- free_xenheap_pages(per_cpu(gdt_table, cpu), order);
-
-@@ -719,6 +905,9 @@ static int cpu_smpboot_alloc(unsigned int cpu)
- set_ist(&idt_tables[cpu][TRAP_nmi], IST_NONE);
- set_ist(&idt_tables[cpu][TRAP_machine_check], IST_NONE);
-
-+ if ( setup_cpu_root_pgt(cpu) )
-+ goto oom;
-+
- for ( stub_page = 0, i = cpu & ~(STUBS_PER_PAGE - 1);
- i < nr_cpu_ids && i <= (cpu | (STUBS_PER_PAGE - 1)); ++i )
- if ( cpu_online(i) && cpu_to_node(i) == node )
-@@ -773,6 +962,8 @@ static struct notifier_block cpu_smpboot_nfb = {
-
- void __init smp_prepare_cpus(unsigned int max_cpus)
- {
-+ int rc;
-+
- register_cpu_notifier(&cpu_smpboot_nfb);
-
- mtrr_aps_sync_begin();
-@@ -786,6 +977,11 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
-
- stack_base[0] = stack_start;
-
-+ rc = setup_cpu_root_pgt(0);
-+ if ( rc )
-+ panic("Error %d setting up PV root page table\n", rc);
-+ get_cpu_info()->pv_cr3 = __pa(per_cpu(root_pgt, 0));
-+
- set_nr_sockets();
-
- socket_cpumask = xzalloc_array(cpumask_t *, nr_sockets);
-@@ -850,6 +1046,8 @@ void __init smp_prepare_boot_cpu(void)
- {
- cpumask_set_cpu(smp_processor_id(), &cpu_online_map);
- cpumask_set_cpu(smp_processor_id(), &cpu_present_map);
-+
-+ get_cpu_info()->xen_cr3 = 0;
- }
-
- static void
-diff --git a/xen/arch/x86/x86_64/asm-offsets.c b/xen/arch/x86/x86_64/asm-offsets.c
-index 64905c6..325abdc 100644
---- xen/arch/x86/x86_64/asm-offsets.c.orig
-+++ xen/arch/x86/x86_64/asm-offsets.c
-@@ -137,6 +137,8 @@ void __dummy__(void)
- OFFSET(CPUINFO_processor_id, struct cpu_info, processor_id);
- OFFSET(CPUINFO_current_vcpu, struct cpu_info, current_vcpu);
- OFFSET(CPUINFO_cr4, struct cpu_info, cr4);
-+ OFFSET(CPUINFO_xen_cr3, struct cpu_info, xen_cr3);
-+ OFFSET(CPUINFO_pv_cr3, struct cpu_info, pv_cr3);
- DEFINE(CPUINFO_sizeof, sizeof(struct cpu_info));
- BLANK();
-
-diff --git a/xen/arch/x86/x86_64/compat/entry.S b/xen/arch/x86/x86_64/compat/entry.S
-index df693c2..c8f68a0 100644
---- xen/arch/x86/x86_64/compat/entry.S.orig
-+++ xen/arch/x86/x86_64/compat/entry.S
-@@ -205,6 +205,17 @@ ENTRY(cstar_enter)
- pushq $0
- movl $TRAP_syscall, 4(%rsp)
- SAVE_ALL
-+
-+ GET_STACK_END(bx)
-+ mov STACK_CPUINFO_FIELD(xen_cr3)(%rbx), %rcx
-+ neg %rcx
-+ jz .Lcstar_cr3_okay
-+ mov %rcx, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
-+ neg %rcx
-+ write_cr3 rcx, rdi, rsi
-+ movq $0, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
-+.Lcstar_cr3_okay:
-+
- GET_CURRENT(bx)
- movq VCPU_domain(%rbx),%rcx
- cmpb $0,DOMAIN_is_32bit_pv(%rcx)
-diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
-index ac9ab4c..d1afb3c 100644
---- xen/arch/x86/x86_64/entry.S.orig
-+++ xen/arch/x86/x86_64/entry.S
-@@ -36,6 +36,32 @@ ENTRY(switch_to_kernel)
- /* %rbx: struct vcpu, interrupts disabled */
- restore_all_guest:
- ASSERT_INTERRUPTS_DISABLED
-+
-+ /* Copy guest mappings and switch to per-CPU root page table. */
-+ mov %cr3, %r9
-+ GET_STACK_END(dx)
-+ mov STACK_CPUINFO_FIELD(pv_cr3)(%rdx), %rdi
-+ movabs $PADDR_MASK & PAGE_MASK, %rsi
-+ movabs $DIRECTMAP_VIRT_START, %rcx
-+ mov %rdi, %rax
-+ and %rsi, %rdi
-+ and %r9, %rsi
-+ add %rcx, %rdi
-+ add %rcx, %rsi
-+ mov $ROOT_PAGETABLE_FIRST_XEN_SLOT, %ecx
-+ mov root_table_offset(SH_LINEAR_PT_VIRT_START)*8(%rsi), %r8
-+ mov %r8, root_table_offset(SH_LINEAR_PT_VIRT_START)*8(%rdi)
-+ rep movsq
-+ mov $ROOT_PAGETABLE_ENTRIES - \
-+ ROOT_PAGETABLE_LAST_XEN_SLOT - 1, %ecx
-+ sub $(ROOT_PAGETABLE_FIRST_XEN_SLOT - \
-+ ROOT_PAGETABLE_LAST_XEN_SLOT - 1) * 8, %rsi
-+ sub $(ROOT_PAGETABLE_FIRST_XEN_SLOT - \
-+ ROOT_PAGETABLE_LAST_XEN_SLOT - 1) * 8, %rdi
-+ rep movsq
-+ mov %r9, STACK_CPUINFO_FIELD(xen_cr3)(%rdx)
-+ write_cr3 rax, rdi, rsi
-+
- RESTORE_ALL
- testw $TRAP_syscall,4(%rsp)
- jz iret_exit_to_guest
-@@ -70,6 +96,22 @@ iret_exit_to_guest:
- ALIGN
- /* No special register assumptions. */
- restore_all_xen:
-+ /*
-+ * Check whether we need to switch to the per-CPU page tables, in
-+ * case we return to late PV exit code (from an NMI or #MC).
-+ */
-+ GET_STACK_END(ax)
-+ mov STACK_CPUINFO_FIELD(xen_cr3)(%rax), %rdx
-+ mov STACK_CPUINFO_FIELD(pv_cr3)(%rax), %rax
-+ test %rdx, %rdx
-+ /*
-+ * Ideally the condition would be "nsz", but such doesn't exist,
-+ * so "g" will have to do.
-+ */
-+UNLIKELY_START(g, exit_cr3)
-+ write_cr3 rax, rdi, rsi
-+UNLIKELY_END(exit_cr3)
-+
- RESTORE_ALL adj=8
- iretq
-
-@@ -99,7 +141,18 @@ ENTRY(lstar_enter)
- pushq $0
- movl $TRAP_syscall, 4(%rsp)
- SAVE_ALL
-- GET_CURRENT(bx)
-+
-+ GET_STACK_END(bx)
-+ mov STACK_CPUINFO_FIELD(xen_cr3)(%rbx), %rcx
-+ neg %rcx
-+ jz .Llstar_cr3_okay
-+ mov %rcx, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
-+ neg %rcx
-+ write_cr3 rcx, rdi, rsi
-+ movq $0, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
-+.Llstar_cr3_okay:
-+
-+ __GET_CURRENT(bx)
- testb $TF_kernel_mode,VCPU_thread_flags(%rbx)
- jz switch_to_kernel
-
-@@ -191,7 +244,18 @@ GLOBAL(sysenter_eflags_saved)
- pushq $0
- movl $TRAP_syscall, 4(%rsp)
- SAVE_ALL
-- GET_CURRENT(bx)
-+
-+ GET_STACK_END(bx)
-+ mov STACK_CPUINFO_FIELD(xen_cr3)(%rbx), %rcx
-+ neg %rcx
-+ jz .Lsyse_cr3_okay
-+ mov %rcx, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
-+ neg %rcx
-+ write_cr3 rcx, rdi, rsi
-+ movq $0, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
-+.Lsyse_cr3_okay:
-+
-+ __GET_CURRENT(bx)
- cmpb $0,VCPU_sysenter_disables_events(%rbx)
- movq VCPU_sysenter_addr(%rbx),%rax
- setne %cl
-@@ -227,13 +291,23 @@ ENTRY(int80_direct_trap)
- movl $0x80, 4(%rsp)
- SAVE_ALL
-
-+ GET_STACK_END(bx)
-+ mov STACK_CPUINFO_FIELD(xen_cr3)(%rbx), %rcx
-+ neg %rcx
-+ jz .Lint80_cr3_okay
-+ mov %rcx, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
-+ neg %rcx
-+ write_cr3 rcx, rdi, rsi
-+ movq $0, STACK_CPUINFO_FIELD(xen_cr3)(%rbx)
-+.Lint80_cr3_okay:
-+
- cmpb $0,untrusted_msi(%rip)
- UNLIKELY_START(ne, msi_check)
- movl $0x80,%edi
- call check_for_unexpected_msi
- UNLIKELY_END(msi_check)
-
-- GET_CURRENT(bx)
-+ __GET_CURRENT(bx)
-
- /* Check that the callback is non-null. */
- leaq VCPU_int80_bounce(%rbx),%rdx
-@@ -384,9 +458,27 @@ ENTRY(dom_crash_sync_extable)
-
- ENTRY(common_interrupt)
- SAVE_ALL CLAC
-+
-+ GET_STACK_END(14)
-+ mov STACK_CPUINFO_FIELD(xen_cr3)(%r14), %rcx
-+ mov %rcx, %r15
-+ neg %rcx
-+ jz .Lintr_cr3_okay
-+ jns .Lintr_cr3_load
-+ mov %rcx, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
-+ neg %rcx
-+.Lintr_cr3_load:
-+ write_cr3 rcx, rdi, rsi
-+ xor %ecx, %ecx
-+ mov %rcx, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
-+ testb $3, UREGS_cs(%rsp)
-+ cmovnz %rcx, %r15
-+.Lintr_cr3_okay:
-+
- CR4_PV32_RESTORE
- movq %rsp,%rdi
- callq do_IRQ
-+ mov %r15, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
- jmp ret_from_intr
-
- /* No special register assumptions. */
-@@ -404,6 +496,23 @@ ENTRY(page_fault)
- /* No special register assumptions. */
- GLOBAL(handle_exception)
- SAVE_ALL CLAC
-+
-+ GET_STACK_END(14)
-+ mov STACK_CPUINFO_FIELD(xen_cr3)(%r14), %rcx
-+ mov %rcx, %r15
-+ neg %rcx
-+ jz .Lxcpt_cr3_okay
-+ jns .Lxcpt_cr3_load
-+ mov %rcx, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
-+ neg %rcx
-+.Lxcpt_cr3_load:
-+ write_cr3 rcx, rdi, rsi
-+ xor %ecx, %ecx
-+ mov %rcx, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
-+ testb $3, UREGS_cs(%rsp)
-+ cmovnz %rcx, %r15
-+.Lxcpt_cr3_okay:
-+
- handle_exception_saved:
- GET_CURRENT(bx)
- testb $X86_EFLAGS_IF>>8,UREGS_eflags+1(%rsp)
-@@ -468,6 +577,7 @@ handle_exception_saved:
- leaq exception_table(%rip),%rdx
- PERFC_INCR(exceptions, %rax, %rbx)
- callq *(%rdx,%rax,8)
-+ mov %r15, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
- testb $3,UREGS_cs(%rsp)
- jz restore_all_xen
- leaq VCPU_trap_bounce(%rbx),%rdx
-@@ -500,6 +610,7 @@ exception_with_ints_disabled:
- rep; movsq # make room for ec/ev
- 1: movq UREGS_error_code(%rsp),%rax # ec/ev
- movq %rax,UREGS_kernel_sizeof(%rsp)
-+ mov %r15, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
- jmp restore_all_xen # return to fixup code
-
- /* No special register assumptions. */
-@@ -578,6 +689,17 @@ ENTRY(double_fault)
- movl $TRAP_double_fault,4(%rsp)
- /* Set AC to reduce chance of further SMAP faults */
- SAVE_ALL STAC
-+
-+ GET_STACK_END(bx)
-+ mov STACK_CPUINFO_FIELD(xen_cr3)(%rbx), %rbx
-+ test %rbx, %rbx
-+ jz .Ldblf_cr3_okay
-+ jns .Ldblf_cr3_load
-+ neg %rbx
-+.Ldblf_cr3_load:
-+ write_cr3 rbx, rdi, rsi
-+.Ldblf_cr3_okay:
-+
- movq %rsp,%rdi
- call do_double_fault
- BUG /* do_double_fault() shouldn't return. */
-@@ -596,10 +718,28 @@ ENTRY(nmi)
- movl $TRAP_nmi,4(%rsp)
- handle_ist_exception:
- SAVE_ALL CLAC
-+
-+ GET_STACK_END(14)
-+ mov STACK_CPUINFO_FIELD(xen_cr3)(%r14), %rcx
-+ mov %rcx, %r15
-+ neg %rcx
-+ jz .List_cr3_okay
-+ jns .List_cr3_load
-+ mov %rcx, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
-+ neg %rcx
-+.List_cr3_load:
-+ write_cr3 rcx, rdi, rsi
-+ movq $0, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
-+.List_cr3_okay:
-+
- CR4_PV32_RESTORE
- testb $3,UREGS_cs(%rsp)
- jz 1f
-- /* Interrupted guest context. Copy the context to stack bottom. */
-+ /*
-+ * Interrupted guest context. Clear the restore value for xen_cr3
-+ * and copy the context to stack bottom.
-+ */
-+ xor %r15, %r15
- GET_CPUINFO_FIELD(guest_cpu_user_regs,di)
- movq %rsp,%rsi
- movl $UREGS_kernel_sizeof/8,%ecx
-@@ -609,6 +749,7 @@ handle_ist_exception:
- movzbl UREGS_entry_vector(%rsp),%eax
- leaq exception_table(%rip),%rdx
- callq *(%rdx,%rax,8)
-+ mov %r15, STACK_CPUINFO_FIELD(xen_cr3)(%r14)
- cmpb $TRAP_nmi,UREGS_entry_vector(%rsp)
- jne ret_from_intr
-
-diff --git a/xen/include/asm-x86/asm_defns.h b/xen/include/asm-x86/asm_defns.h
-index 99cb337..1c8d66c 100644
---- xen/include/asm-x86/asm_defns.h.orig
-+++ xen/include/asm-x86/asm_defns.h
-@@ -93,9 +93,30 @@ void ret_from_intr(void);
- UNLIKELY_DONE(mp, tag); \
- __UNLIKELY_END(tag)
-
-+ .equ .Lrax, 0
-+ .equ .Lrcx, 1
-+ .equ .Lrdx, 2
-+ .equ .Lrbx, 3
-+ .equ .Lrsp, 4
-+ .equ .Lrbp, 5
-+ .equ .Lrsi, 6
-+ .equ .Lrdi, 7
-+ .equ .Lr8, 8
-+ .equ .Lr9, 9
-+ .equ .Lr10, 10
-+ .equ .Lr11, 11
-+ .equ .Lr12, 12
-+ .equ .Lr13, 13
-+ .equ .Lr14, 14
-+ .equ .Lr15, 15
-+
- #define STACK_CPUINFO_FIELD(field) (1 - CPUINFO_sizeof + CPUINFO_##field)
- #define GET_STACK_END(reg) \
-+ .if .Lr##reg > 8; \
-+ movq $STACK_SIZE-1, %r##reg; \
-+ .else; \
- movl $STACK_SIZE-1, %e##reg; \
-+ .endif; \
- orq %rsp, %r##reg
-
- #define GET_CPUINFO_FIELD(field, reg) \
-@@ -177,6 +198,15 @@ void ret_from_intr(void);
- #define ASM_STAC ASM_AC(STAC)
- #define ASM_CLAC ASM_AC(CLAC)
-
-+.macro write_cr3 val:req, tmp1:req, tmp2:req
-+ mov %cr4, %\tmp1
-+ mov %\tmp1, %\tmp2
-+ and $~X86_CR4_PGE, %\tmp1
-+ mov %\tmp1, %cr4
-+ mov %\val, %cr3
-+ mov %\tmp2, %cr4
-+.endm
-+
- #define CR4_PV32_RESTORE \
- 667: ASM_NOP5; \
- .pushsection .altinstr_replacement, "ax"; \
-diff --git a/xen/include/asm-x86/current.h b/xen/include/asm-x86/current.h
-index e6587e6..397fa4c 100644
---- xen/include/asm-x86/current.h.orig
-+++ xen/include/asm-x86/current.h
-@@ -42,6 +42,18 @@ struct cpu_info {
- struct vcpu *current_vcpu;
- unsigned long per_cpu_offset;
- unsigned long cr4;
-+ /*
-+ * Of the two following fields the latter is being set to the CR3 value
-+ * to be used on the given pCPU for loading whenever 64-bit PV guest
-+ * context is being entered. The value never changes once set.
-+ * The former is the value to restore when re-entering Xen, if any. IOW
-+ * its value being zero means there's nothing to restore. However, its
-+ * value can also be negative, indicating to the exit-to-Xen code that
-+ * restoring is not necessary, but allowing any nested entry code paths
-+ * to still know the value to put back into CR3.
-+ */
-+ unsigned long xen_cr3;
-+ unsigned long pv_cr3;
- /* get_stack_bottom() must be 16-byte aligned */
- };
-
-diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
-index fb0cd55..a73993c 100644
---- xen/include/asm-x86/processor.h.orig
-+++ xen/include/asm-x86/processor.h
-@@ -524,6 +524,7 @@ extern idt_entry_t idt_table[];
- extern idt_entry_t *idt_tables[];
-
- DECLARE_PER_CPU(struct tss_struct, init_tss);
-+DECLARE_PER_CPU(root_pgentry_t *, root_pgt);
-
- extern void init_int80_direct_trap(struct vcpu *v);
-
-diff --git a/xen/include/asm-x86/x86_64/page.h b/xen/include/asm-x86/x86_64/page.h
-index 589f225..afc77c3 100644
---- xen/include/asm-x86/x86_64/page.h.orig
-+++ xen/include/asm-x86/x86_64/page.h
-@@ -25,8 +25,8 @@
- /* These are architectural limits. Current CPUs support only 40-bit phys. */
- #define PADDR_BITS 52
- #define VADDR_BITS 48
--#define PADDR_MASK ((1UL << PADDR_BITS)-1)
--#define VADDR_MASK ((1UL << VADDR_BITS)-1)
-+#define PADDR_MASK ((_AC(1,UL) << PADDR_BITS) - 1)
-+#define VADDR_MASK ((_AC(1,UL) << VADDR_BITS) - 1)
-
- #define is_canonical_address(x) (((long)(x) >> 47) == ((long)(x) >> 63))
-
-@@ -117,6 +117,7 @@ typedef l4_pgentry_t root_pgentry_t;
- : (((_s) < ROOT_PAGETABLE_FIRST_XEN_SLOT) || \
- ((_s) > ROOT_PAGETABLE_LAST_XEN_SLOT)))
-
-+#define root_table_offset l4_table_offset
- #define root_get_pfn l4e_get_pfn
- #define root_get_flags l4e_get_flags
- #define root_get_intpte l4e_get_intpte
---
-2.1.4
-
diff --git a/sysutils/xenkernel48/patches/patch-XSA254-4 b/sysutils/xenkernel48/patches/patch-XSA254-4
deleted file mode 100644
index 5561ef9f93f..00000000000
--- a/sysutils/xenkernel48/patches/patch-XSA254-4
+++ /dev/null
@@ -1,165 +0,0 @@
-$NetBSD: patch-XSA254-4,v 1.1 2018/01/18 10:28:13 bouyer Exp $
-
-From 31d38d633a306b2b06767b5a5f5a8a00269f3c92 Mon Sep 17 00:00:00 2001
-From: Jan Beulich <jbeulich@suse.com>
-Date: Wed, 17 Jan 2018 17:17:26 +0100
-Subject: [PATCH] x86: allow Meltdown band-aid to be disabled
-
-First of all we don't need it on AMD systems. Additionally allow its use
-to be controlled by command line option. For best backportability, this
-intentionally doesn't use alternative instruction patching to achieve
-the intended effect - while we likely want it, this will be later
-follow-up.
-
-Signed-off-by: Jan Beulich <jbeulich@suse.com>
-Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
-master commit: e871e80c38547d9faefc6604532ba3e985e65873
-master date: 2018-01-16 17:50:59 +0100
----
- docs/misc/xen-command-line.markdown | 12 ++++++++++++
- xen/arch/x86/domain.c | 7 +++++--
- xen/arch/x86/mm.c | 2 +-
- xen/arch/x86/smpboot.c | 17 ++++++++++++++---
- xen/arch/x86/x86_64/entry.S | 2 ++
- 5 files changed, 34 insertions(+), 6 deletions(-)
-
-diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
-index 0fcdb7d..768d4f5 100644
---- docs/misc/xen-command-line.markdown.orig
-+++ docs/misc/xen-command-line.markdown
-@@ -1687,6 +1687,18 @@ In the case that x2apic is in use, this option switches between physical and
- clustered mode. The default, given no hint from the **FADT**, is cluster
- mode.
-
-+### xpti
-+> `= <boolean>`
-+
-+> Default: `false` on AMD hardware
-+> Default: `true` everywhere else
-+
-+Override default selection of whether to isolate 64-bit PV guest page
-+tables.
-+
-+** WARNING: Not yet a complete isolation implementation, but better than
-+nothing. **
-+
- ### xsave
- > `= <boolean>`
-
-diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
-index bf3590d..8817263 100644
---- xen/arch/x86/domain.c.orig
-+++ xen/arch/x86/domain.c
-@@ -1947,12 +1947,15 @@ static void paravirt_ctxt_switch_from(struct vcpu *v)
-
- static void paravirt_ctxt_switch_to(struct vcpu *v)
- {
-+ root_pgentry_t *root_pgt = this_cpu(root_pgt);
- unsigned long cr4;
-
- switch_kernel_stack(v);
-
-- this_cpu(root_pgt)[root_table_offset(PERDOMAIN_VIRT_START)] =
-- l4e_from_page(v->domain->arch.perdomain_l3_pg, __PAGE_HYPERVISOR_RW);
-+ if ( root_pgt )
-+ root_pgt[root_table_offset(PERDOMAIN_VIRT_START)] =
-+ l4e_from_page(v->domain->arch.perdomain_l3_pg,
-+ __PAGE_HYPERVISOR_RW);
-
- cr4 = pv_guest_cr4_to_real_cr4(v);
- if ( unlikely(cr4 != read_cr4()) )
-diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
-index 69e1ab6..303c551 100644
---- xen/arch/x86/mm.c.orig
-+++ xen/arch/x86/mm.c
-@@ -3999,7 +3999,7 @@ long do_mmu_update(
- rc = mod_l4_entry(va, l4e_from_intpte(req.val), mfn,
- cmd == MMU_PT_UPDATE_PRESERVE_AD, v);
- if ( !rc )
-- sync_guest = true;
-+ sync_guest = this_cpu(root_pgt);
- break;
- case PGT_writable_page:
- perfc_incr(writable_mmu_updates);
-diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
-index 327c744..c19508f 100644
---- xen/arch/x86/smpboot.c.orig
-+++ xen/arch/x86/smpboot.c
-@@ -320,7 +320,7 @@ void start_secondary(void *unused)
- spin_debug_disable();
-
- get_cpu_info()->xen_cr3 = 0;
-- get_cpu_info()->pv_cr3 = __pa(this_cpu(root_pgt));
-+ get_cpu_info()->pv_cr3 = this_cpu(root_pgt) ? __pa(this_cpu(root_pgt)) : 0;
-
- load_system_tables();
-
-@@ -729,14 +729,20 @@ static int clone_mapping(const void *ptr, root_pgentry_t *rpt)
- return 0;
- }
-
-+static __read_mostly int8_t opt_xpti = -1;
-+boolean_param("xpti", opt_xpti);
- DEFINE_PER_CPU(root_pgentry_t *, root_pgt);
-
- static int setup_cpu_root_pgt(unsigned int cpu)
- {
-- root_pgentry_t *rpt = alloc_xen_pagetable();
-+ root_pgentry_t *rpt;
- unsigned int off;
- int rc;
-
-+ if ( !opt_xpti )
-+ return 0;
-+
-+ rpt = alloc_xen_pagetable();
- if ( !rpt )
- return -ENOMEM;
-
-@@ -977,10 +983,14 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
-
- stack_base[0] = stack_start;
-
-+ if ( opt_xpti < 0 )
-+ opt_xpti = boot_cpu_data.x86_vendor != X86_VENDOR_AMD;
-+
- rc = setup_cpu_root_pgt(0);
- if ( rc )
- panic("Error %d setting up PV root page table\n", rc);
-- get_cpu_info()->pv_cr3 = __pa(per_cpu(root_pgt, 0));
-+ if ( per_cpu(root_pgt, 0) )
-+ get_cpu_info()->pv_cr3 = __pa(per_cpu(root_pgt, 0));
-
- set_nr_sockets();
-
-@@ -1048,6 +1058,7 @@ void __init smp_prepare_boot_cpu(void)
- cpumask_set_cpu(smp_processor_id(), &cpu_present_map);
-
- get_cpu_info()->xen_cr3 = 0;
-+ get_cpu_info()->pv_cr3 = 0;
- }
-
- static void
-diff --git a/xen/arch/x86/x86_64/entry.S b/xen/arch/x86/x86_64/entry.S
-index d1afb3c..505604f 100644
---- xen/arch/x86/x86_64/entry.S.orig
-+++ xen/arch/x86/x86_64/entry.S
-@@ -45,6 +45,7 @@ restore_all_guest:
- movabs $DIRECTMAP_VIRT_START, %rcx
- mov %rdi, %rax
- and %rsi, %rdi
-+ jz .Lrag_keep_cr3
- and %r9, %rsi
- add %rcx, %rdi
- add %rcx, %rsi
-@@ -61,6 +62,7 @@ restore_all_guest:
- rep movsq
- mov %r9, STACK_CPUINFO_FIELD(xen_cr3)(%rdx)
- write_cr3 rax, rdi, rsi
-+.Lrag_keep_cr3:
-
- RESTORE_ALL
- testw $TRAP_syscall,4(%rsp)
---
-2.1.4
-
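Note on the removed XSA254-4 patch: the `xpti` option is a tristate. The value -1 means "not given on the command line", and it is resolved at boot to "on unless the CPU vendor is AMD". When the result is off, setup_cpu_root_pgt() returns early, the per-CPU root_pgt stays NULL, pv_cr3 stays 0, and the entry path skips the CR3 switch. A minimal sketch of that resolution step follows, assuming a hypothetical `enum cpu_vendor` and `resolve_xpti()` helper that stand in for Xen's `boot_cpu_data.x86_vendor` check. It is not the hypervisor code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for Xen's X86_VENDOR_* constants. */
enum cpu_vendor { VENDOR_INTEL, VENDOR_AMD, VENDOR_OTHER };

/*
 * Resolve the tristate option into a final on/off decision.
 * opt is -1 when "xpti" was not given on the command line,
 * otherwise 0 or 1 as parsed from the boolean option.
 */
static bool resolve_xpti(int8_t opt, enum cpu_vendor vendor)
{
    assert(opt >= -1 && opt <= 1);

    if (opt < 0)                     /* no override: use the vendor default */
        return vendor != VENDOR_AMD;

    return opt != 0;                 /* explicit override always wins */
}
```

With this shape an explicit `xpti=0` or `xpti=1` always takes precedence, and the vendor check only matters when the administrator expressed no preference.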
diff --git a/sysutils/xentools48/Makefile b/sysutils/xentools48/Makefile
index 73df03a77f5..fc286a013af 100644
--- a/sysutils/xentools48/Makefile
+++ b/sysutils/xentools48/Makefile
@@ -1,6 +1,6 @@
-# $NetBSD: Makefile,v 1.15 2018/01/15 09:47:55 jperkin Exp $
+# $NetBSD: Makefile,v 1.16 2018/01/24 23:29:32 bouyer Exp $
#
-VERSION= 4.8.2
+VERSION= 4.8.3
VERSION_IPXE= 827dd1bfee67daa683935ce65316f7e0f057fe1c
DIST_IPXE= ipxe-git-${VERSION_IPXE}.tar.gz
DIST_NEWLIB= newlib-1.16.0.tar.gz
@@ -16,7 +16,6 @@ DIST_LIBPCI= pciutils-2.2.9.tar.bz2
DIST_SUBDIR= xen48
DISTNAME= xen-${VERSION}
PKGNAME= xentools48-${VERSION}
-PKGREVISION= 2
CATEGORIES= sysutils
MASTER_SITES= https://downloads.xenproject.org/release/xen/${VERSION}/
diff --git a/sysutils/xentools48/distinfo b/sysutils/xentools48/distinfo
index eeae956278b..4963e47adb6 100644
--- a/sysutils/xentools48/distinfo
+++ b/sysutils/xentools48/distinfo
@@ -1,4 +1,4 @@
-$NetBSD: distinfo,v 1.6 2017/10/28 04:08:46 khorben Exp $
+$NetBSD: distinfo,v 1.7 2018/01/24 23:29:32 bouyer Exp $
SHA1 (xen48/gmp-4.3.2.tar.bz2) = c011e8feaf1bb89158bd55eaabd7ef8fdd101a2c
RMD160 (xen48/gmp-4.3.2.tar.bz2) = a8f3f41501ece290c348aeb4444bbea40bc53e71
@@ -36,10 +36,10 @@ SHA1 (xen48/tpm_emulator-0.7.4.tar.gz) = ffa3aafcd833fdcd7483bbdb4ff862f30ffde57
RMD160 (xen48/tpm_emulator-0.7.4.tar.gz) = ded71632d316126138f2db4a5f2051b2489ae5ff
SHA512 (xen48/tpm_emulator-0.7.4.tar.gz) = 4928b5b82f57645be9408362706ff2c4d9baa635b21b0d41b1c82930e8c60a759b1ea4fa74d7e6c7cae1b7692d006aa5cb72df0c3b88bf049779aa2b566f9d35
Size (xen48/tpm_emulator-0.7.4.tar.gz) = 214145 bytes
-SHA1 (xen48/xen-4.8.2.tar.gz) = 184c57ce9e71e34b3cbdd318524021f44946efbe
-RMD160 (xen48/xen-4.8.2.tar.gz) = f4126cb0f7ff427ed7d20ce399dcd1077c599343
-SHA512 (xen48/xen-4.8.2.tar.gz) = 7805531f73d23ecfff3439770e62d387f4254a444875670d53a0a739323e5d4d8f8fcc478f8936ee1ae8aff3e0229549e47c01c606365a8ce060dd5c503e87da
-Size (xen48/xen-4.8.2.tar.gz) = 22522336 bytes
+SHA1 (xen48/xen-4.8.3.tar.gz) = ee55e8dc1e79d16d2f85fbe1f8bbd27a2db8422f
+RMD160 (xen48/xen-4.8.3.tar.gz) = 54b7ba828d8198c2a4629eabf7acfba2e9c6561c
+SHA512 (xen48/xen-4.8.3.tar.gz) = 584d8ee6e432e291a70e8f727da6d0a71afff7509fbf2e32eeb9cfe58b8279a80770c2c5f7759dcb5c0b08ed4644039e770e280ab534673215753d598f3f6508
+Size (xen48/xen-4.8.3.tar.gz) = 22529092 bytes
SHA1 (xen48/zlib-1.2.3.tar.gz) = 60faeaaf250642db5c0ea36cd6dcc9f99c8f3902
RMD160 (xen48/zlib-1.2.3.tar.gz) = 89a57e336c24f7f6eebda3a1724e14b71187e117
SHA512 (xen48/zlib-1.2.3.tar.gz) = 021b958fcd0d346c4ba761bcf0cc40f3522de6186cf5a0a6ea34a70504ce9622b1c2626fce40675bc8282cf5f5ade18473656abc38050f72f5d6480507a2106e
diff --git a/sysutils/xentools48/patches/patch-XSA233 b/sysutils/xentools48/patches/patch-XSA233
deleted file mode 100644
index a6104eda1ea..00000000000
--- a/sysutils/xentools48/patches/patch-XSA233
+++ /dev/null
@@ -1,54 +0,0 @@
-$NetBSD: patch-XSA233,v 1.1 2017/10/17 08:42:30 bouyer Exp $
-
-From: Juergen Gross <jgross@suse.com>
-Subject: tools/xenstore: don't unlink connection object twice
-
-A connection object of a domain with associated stubdom has two
-parents: the domain and the stubdom. When cleaning up the list of
-active domains in domain_cleanup() make sure not to unlink the
-connection twice from the same domain. This could happen when the
-domain and its stubdom are being destroyed at the same time leading
-to the domain loop being entered twice.
-
-Additionally don't use talloc_free() in this case as it will remove
-a random parent link, leading eventually to a memory leak. Use
-talloc_unlink() instead specifying the context from which the
-connection object should be removed.
-
-This is XSA-233.
-
-Reported-by: Eric Chanudet <chanudete@ainfosec.com>
-Signed-off-by: Juergen Gross <jgross@suse.com>
-Reviewed-by: Ian Jackson <ian.jackson@eu.citrix.com>
-
---- tools/xenstore/xenstored_domain.c.orig
-+++ tools/xenstore/xenstored_domain.c
-@@ -221,10 +221,11 @@ static int destroy_domain(void *_domain)
- static void domain_cleanup(void)
- {
- xc_dominfo_t dominfo;
-- struct domain *domain, *tmp;
-+ struct domain *domain;
- int notify = 0;
-
-- list_for_each_entry_safe(domain, tmp, &domains, list) {
-+ again:
-+ list_for_each_entry(domain, &domains, list) {
- if (xc_domain_getinfo(*xc_handle, domain->domid, 1,
- &dominfo) == 1 &&
- dominfo.domid == domain->domid) {
-@@ -236,8 +237,12 @@ static void domain_cleanup(void)
- if (!dominfo.dying)
- continue;
- }
-- talloc_free(domain->conn);
-- notify = 0; /* destroy_domain() fires the watch */
-+ if (domain->conn) {
-+ talloc_unlink(talloc_autofree_context(), domain->conn);
-+ domain->conn = NULL;
-+ notify = 0; /* destroy_domain() fires the watch */
-+ goto again;
-+ }
- }
-
- if (notify)
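Note on the removed XSA233 patch: it replaces the "safe" list iterator with a restart-from-the-head scan. Each connection is released at most once because the pointer is cleared immediately afterwards, and the walk starts over because releasing a connection may change the list. A minimal sketch of that pattern follows, assuming a hypothetical singly linked `struct domain` and a `release_conn()` hook in place of xenstored's list macros and `talloc_unlink()`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for xenstored's structures. */
struct conn { int released; };
struct domain {
    struct domain *next;
    int dying;
    struct conn *conn;
};

/* Hypothetical release hook; the real patch calls talloc_unlink(). */
static void release_conn(struct conn *c)
{
    assert(c != NULL);
    c->released++;
}

/* Release the connection of every dying domain; returns how many. */
static int cleanup_dying(struct domain *head)
{
    struct domain *d;
    int released = 0;

again:
    for (d = head; d != NULL; d = d->next) {
        if (!d->dying || d->conn == NULL)
            continue;
        release_conn(d->conn);
        d->conn = NULL;   /* guarantees a single release per domain */
        released++;
        goto again;       /* the list may have changed: rescan from the head */
    }
    return released;
}
```

The rescan costs extra passes over the list, which is acceptable here because domain teardown is rare compared with normal xenstore traffic.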
diff --git a/sysutils/xentools48/patches/patch-XSA240 b/sysutils/xentools48/patches/patch-XSA240
deleted file mode 100644
index 435adf582f8..00000000000
--- a/sysutils/xentools48/patches/patch-XSA240
+++ /dev/null
@@ -1,56 +0,0 @@
-$NetBSD: patch-XSA240,v 1.1 2017/10/17 08:42:30 bouyer Exp $
-
-From 41d579aad2fee971e5ce0279a9b559a0fdc74452 Mon Sep 17 00:00:00 2001
-From: George Dunlap <george.dunlap@citrix.com>
-Date: Fri, 22 Sep 2017 11:46:55 +0100
-Subject: [PATCH 2/2] x86/mm: Disable PV linear pagetables by default
-
-Allowing pagetables to point to other pagetables of the same level
-(often called 'linear pagetables') has been included in Xen since its
-inception. But it is not used by the most common PV guests (Linux,
-NetBSD, minios), and has been the source of a number of subtle
-reference-counting bugs.
-
-Add a command-line option to control whether PV linear pagetables are
-allowed (disabled by default).
-
-Reported-by: Jann Horn <jannh@google.com>
-Signed-off-by: George Dunlap <george.dunlap@citrix.com>
-Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
----
-Changes since v2:
-- s/_/-/; in command-line option
-- Added __read_mostly
----
- docs/misc/xen-command-line.markdown | 15 +++++++++++++++
- xen/arch/x86/mm.c | 9 +++++++++
- 2 files changed, 24 insertions(+)
-
-diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
-index 54acc60723..ffa66eb146 100644
---- docs/misc/xen-command-line.markdown.orig
-+++ docs/misc/xen-command-line.markdown
-@@ -1350,6 +1350,21 @@ The following resources are available:
- CDP, one COS will corespond two CBMs other than one with CAT, due to the
- sum of CBMs is fixed, that means actual `cos_max` in use will automatically
- reduce to half when CDP is enabled.
-+
-+### pv-linear-pt
-+> `= <boolean>`
-+
-+> Default: `true`
-+
-+Allow PV guests to have pagetable entries pointing to other pagetables
-+of the same level (i.e., allowing L2 PTEs to point to other L2 pages).
-+This technique is often called "linear pagetables", and is sometimes
-+used to allow operating systems a simple way to consistently map the
-+current process's pagetables into its own virtual address space.
-+
-+Most common PV operating systems (Linux, MiniOS) do not use this
-+technique, but NetBSD in PV mode does, and custom operating systems
-+may use it as well.
-
- ### reboot
- > `= t[riple] | k[bd] | a[cpi] | p[ci] | P[ower] | e[fi] | n[o] [, [w]arm | [c]old]`
-diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
-index 31d4a03840..5d125cff3a 100644