Diffstat (limited to 'usr/src/docs/admin')
-rw-r--r-- usr/src/docs/admin 252
1 files changed, 252 insertions, 0 deletions
diff --git a/usr/src/docs/admin b/usr/src/docs/admin
new file mode 100644
index 0000000000..42fb2ba045
--- /dev/null
+++ b/usr/src/docs/admin
@@ -0,0 +1,252 @@
+# Overlay Networks Administration Guide
+
+To better understand the implementation of overlay networks from a
+non-data path point of view, it's useful to put together a straw man
+proposal for how these work from an administration point of view.
+
+This goes through a bunch of options and leaves some open questions. Any
+thoughts on it should be directed to Robert Mustacchi <rm@joyent.com> or
+rmustacc on irc.freenode.net.
+
+## Overlay Devices Background
+
+It's worth going into a bit of background as to what overlay devices
+look like. An overlay device is a combination of an etherstub and an IP
+tunnel device with potentially multiple destinations. An overlay device
+has two different components: an encapsulation module and a search
+module.
+
+The encapsulation module defines how packets get transmitted on the wire
+and aspects of their transport. Consider the two rather popular
+encapsulation methods: VXLAN and NVGRE. The VXLAN specification defines
+the VXLAN header that gets put on an Ethernet frame and then defines
+that it be sent over UDP. NVGRE on the other hand defines that it should
+use a GRE header with a specific GRE protocol field, where GRE is itself
+a separate IP protocol type.
+
+In addition, today, because encapsulation modules define the mechanism
+for encapsulation and decapsulation, they also define how packets are
+received. In other words, they define what IP address and, in the case
+of VXLAN, what port they listen on.
+
+
+The search plugin, on the other hand, defines how we route encapsulated
+packets to a destination. The simplest form of this is a point to
+point tunnel where the search plugin sends everything to a single
+unicast IP address. Only slightly more complicated would be something
+that sent it to a single multicast address. However, these plugins also
+allow for custom means of looking up destinations and dynamic
+destinations. In other words, sending an Ethernet frame to a different
+destination based on its MAC address or other criteria. This may be
+implemented by having a list of mappings in files, or there may be some
+much more advanced mechanism that integrates with a broader
+orchestration suite or other database. We need to think about both of
+these different classes of search plugins, static and dynamic, as we go
+through this.
+
+On top of these devices, a series of VNICs can be created, as with an
+etherstub. All their traffic will be subject to a local virtual switch
+and then sent over the encapsulated network.
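+
+As a rough sketch, assuming an overlay named overlay0 already exists and
+that VNIC creation works on it the same way it does on an etherstub
+(that is an assumption, not something this proposal has settled), it
+might look like:
+
+```
+dladm create-vnic -l overlay0 vnic0
+dladm create-vnic -l overlay0 vnic1
+```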
+
+## Basic model
+
+We want to introduce a new object into dladm that we call an overlay. An
+overlay is its own class, like bridges, IP tunnels, and etherstubs.
+There would generally be four basic commands, much like other parts of
+dladm:
+
+ o dladm create-overlay -- Creates an overlay device
+ o dladm show-overlay -- Shows information about overlay devices
+ o dladm modify-overlay -- Modifies aspects of overlay devices
+ o dladm delete-overlay -- Removes an overlay device
+
+
+### Example: Creating a point to point VXLAN overlay network
+
+Let's consider the act of creating and manipulating a VXLAN overlay
+network. There are a handful of properties that we care about:
+
+ o The virtual network id, the unique VXLAN identifier
+ o The UDP IP address and port that we're listening on
+ o The UDP IP address and port that we're going to send to
+
+In the VXLAN case, we have a default port that we'd like to use that's
+been assigned by IANA, but it can obviously be overridden. In addition,
+we've listed two implicit properties in this example that we want to
+make first class:
+
+ o The encapsulation plugin, VXLAN
+ o The search plugin, a direct point to point tunnel we'll call 'direct'
+
+Here's one idea of how this could look:
+
+```
+dladm create-overlay -e vxlan -s direct -v 169 -p vxlan/listen_ip=a.b.c.d -p direct/dest_ip=e.f.g.h overlay0
+```
+
+If we take this apart, -e specifies the encapsulation module that we
+should use for this device, while -s specifies the 'direct' search
+plugin, which is a point to point tunnel, and -v sets the virtual
+network id. We have a default value for vxlan/listen_port, and we can
+use the same default for the search plugin, but we have no default for
+an IP address. Because of this, we have to specify it on the command line.
+
+In this world, -e, -s, and -v are sugar for properties: respectively,
+the properties 'encap', 'search', and 'vnetid'. That means this could
+look like:
+
+```
+dladm create-overlay -p encap=vxlan -p search=direct -p vnetid=169 -p vxlan/listen_ip=a.b.c.d -p direct/dest_ip=e.f.g.h overlay0
+```
+
+In addition, much like with other dladm objects, a user would also be
+able to specify other properties on the creation command line, such as
+the vxlan listen port (vxlan/listen_port) or something like the MTU.
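+
+For instance, a creation command that also overrides the listen port
+might look like the following. This is only a sketch in the proposed
+syntax; the port value 4790 is an arbitrary illustration:
+
+```
+dladm create-overlay -e vxlan -s direct -v 169 -p vxlan/listen_ip=a.b.c.d -p vxlan/listen_port=4790 -p direct/dest_ip=e.f.g.h overlay0
+```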
+
+I believe that having a summarized version with some options that are
+reflected with abbreviations, like in the first form, makes sense, but
+we don't want to elevate too much into short letters. Otherwise, even
+where plugins overlap, every new plugin will use up more letters until
+we run out.
+
+## Property namespaces
+
+In the above, you'll notice that we've namespaced various properties
+such as the vxlan/listen_ip. There is nothing special about these
+properties. This is just a way of grouping things in a way that might be
+a bit more useful for folks. This makes it easy to see which properties
+are related to the fact that we're doing vxlan encapsulation or some
+other plugin-based aspect. The unprefixed properties should all be
+generic.
+
+
+## Default and required properties
+
+Each plugin is going to have to define its own set of required
+properties and whether they have default values. For example, vxlan
+defines two properties:
+
+ o vxlan/listen_ip
+ o vxlan/listen_port
+
+The listen_port has a default value of 4789. While this is what most
+folks will want to use, some will want to override it at creation time.
+However, just like we can modify the defaults with ipadm, we'll want to
+do the same here and have some mode that allows us to modify them. This
+will allow an administrator to override a default property for their
+site.
+
+In addition to those defaults, some plugins have properties that require
+an argument, but have no default. There's no good default for an IP
+address to listen on. INADDR_ANY or loopback (and their v6 equivalents)
+are almost certainly wrong choices. As such, the plugins will need to
+define their required properties. This should probably be displayed in
+some way along with the defaults. Perhaps this is done as a new flag on
+modify-overlay and show-overlay to see the default/required sets?
+
+```
+dladm show-overlay -d
+dladm modify-overlay -d
+```
+
+
+## Working through other examples
+
+To help evaluate other behavior and choices, it's worth working through
+some other examples and seeing what kinds of questions this causes us to
+have.
+
+### NVGRE and Geneve
+
+NVGRE and Geneve are other encapsulation modules. NVGRE is already
+widely deployed, generally in Microsoft-related software. Today, it fits
+the model fairly well. NVGRE just has a single property:
+
+ o nvgre/listen_ip
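+
+Following the VXLAN example earlier, and assuming the encapsulation
+module is named 'nvgre' to match its property namespace (the name isn't
+fixed anywhere in this document), a point to point NVGRE overlay could
+be sketched as:
+
+```
+dladm create-overlay -e nvgre -s direct -v 169 -p nvgre/listen_ip=a.b.c.d -p direct/dest_ip=e.f.g.h overlay0
+```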
+
+Geneve on the other hand is yet another draft that is supposed to bring
+the titans of NVGRE and VXLAN together. It's another UDP-based protocol.
+It has properties like VXLAN:
+
+ o geneve/listen_ip
+ o geneve/listen_port
+
+
+Now, you might be saying, but wait rm, we're always redefining
+listen_ip! Can't we just make that a first class property? Oh, and maybe
+listen_port, but wait, NVGRE isn't using it. Hmm.
+
+So, I have thought about this, but I think for the time being, I'd rather
+make them specific to the plugin. There may be something new that comes
+around and actually doesn't use IP at all and in fact speaks on an
+ethertype. At which point we'll want to use dls/dlpi/vnd to manage that
+communication and there will not be any IP. Of course, this doesn't
+exist today, but I think it doesn't make sense to elevate it at first.
+If it turns out that this is all terrible, then we can go figure that
+out.
+
+### Multicast
+
+Another common thing that we're going to build out of the gate is a
+multicast module. Unlike the direct model, with the multicast model, the
+user subscribes to a multicast address to receive packets and sends
+packets out to a multicast group. In this case, the normal listen_ip
+would still exist, it would just be a multicast IP. However, instead of
+direct/dest_ip, we would have:
+
+ o multicast/dest_ip
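+
+Put together, and assuming the search plugin is called 'multicast' to
+match its property namespace, creation might be sketched as below, where
+both m.n.o.p placeholders stand for multicast addresses:
+
+```
+dladm create-overlay -e vxlan -s multicast -v 169 -p vxlan/listen_ip=m.n.o.p -p multicast/dest_ip=m.n.o.p overlay0
+```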
+
+### Search plugins and different encap types
+
+One of the things that we'd like to do is have an encapsulation
+algorithm encode what it requires to be directed, e.g. VXLAN requires an
+IP address and a port. NVGRE just requires an IP. We'll have to work out
+an interface, but the search plugin will be able to get some information
+from the encapsulation plugin which will let it know if it should
+require an IP address, an IP address and a port, or some other set of
+configuration. It will also provide a way to share defaults.
+Realistically, the search plugin should probably just inherit the
+default from the encap plugin, if appropriate. Otherwise an
+administrator should define a global default for all encapsulation
+plugins for the search module. While the former is useful, the latter
+isn't going to be very realistic.
+
+### Files plugin and dynamic destinations
+
+The files plugin is interesting, because it moves away from the world of
+static maps to dynamic destinations. In this case, the destination of a
+packet with a MAC address m is defined by some file on the system,
+similar to /etc/hosts (but an entirely different format).
+
+The interesting thing about the files model is that it requires some
+additional features, which we need to implement anyways. For example
+we'll need to:
+
+ o Be able to flush the destination map
+ o Obtain some way of getting the destination map from the kernel
+ o For the files case, tell it to refresh the mappings from the file
+ o Update the mapping file to something else
+
+While the last of these would be done through a normal modify-overlay of
+a property, the question is how would we handle all of these other
+actions. They basically have us performing actions on the device that,
+in some cases, are specific to the plugin-type.
+
+How should this be done? Should these be kind of transitive properties
+that exist, like a 'files/refresh' property that you set to true? I'm
+not sure I really like the idea of transitive properties. I'd perhaps
+rather have some alternative option space in modify-overlay that
+allows you to inject one of these options.
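+
+To make the property-based option concrete, and assuming modify-overlay
+accepts -p the same way create-overlay does (its syntax isn't specified
+anywhere in this document), a refresh might be sketched as:
+
+```
+dladm modify-overlay -p files/refresh=true overlay0
+```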
+
+Perhaps all three of these things should instead be part of some
+command that isn't dladm? I dunno, I'm not really partial to that.
+
+## Concluding Thoughts
+
+All in all, I think we have a reasonable framework to start with, but
+some of these details kind of matter and as we go further down the path
+of implementation, the tradeoffs and options that we want to make will
+make more sense. It may very well be that there will not be many
+properties except for the Joyent or other user-specific search
+mechanisms and therefore spending too much time on this isn't
+worthwhile.