diff options
Diffstat (limited to 'docs/htmldocs/Samba3-HOWTO/SambaHA.html')
| -rw-r--r-- | docs/htmldocs/Samba3-HOWTO/SambaHA.html | 271 |
1 files changed, 0 insertions, 271 deletions
diff --git a/docs/htmldocs/Samba3-HOWTO/SambaHA.html b/docs/htmldocs/Samba3-HOWTO/SambaHA.html deleted file mode 100644 index 8d77f64d59..0000000000 --- a/docs/htmldocs/Samba3-HOWTO/SambaHA.html +++ /dev/null @@ -1,271 +0,0 @@ -<html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1"><title>Chapter 32. High Availability</title><link rel="stylesheet" href="../samba.css" type="text/css"><meta name="generator" content="DocBook XSL Stylesheets V1.75.2"><link rel="home" href="index.html" title="The Official Samba 3.5.x HOWTO and Reference Guide"><link rel="up" href="optional.html" title="Part III. Advanced Configuration"><link rel="prev" href="Backup.html" title="Chapter 31. Backup Techniques"><link rel="next" href="largefile.html" title="Chapter 33. Handling Large Directories"></head><body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF"><div class="navheader"><table width="100%" summary="Navigation header"><tr><th colspan="3" align="center">Chapter 32. High Availability</th></tr><tr><td width="20%" align="left"><a accesskey="p" href="Backup.html">Prev</a> </td><th width="60%" align="center">Part III. Advanced Configuration</th><td width="20%" align="right"> <a accesskey="n" href="largefile.html">Next</a></td></tr></table><hr></div><div class="chapter" title="Chapter 32. High Availability"><div class="titlepage"><div><div><h2 class="title"><a name="SambaHA"></a>Chapter 32. High Availability</h2></div><div><div class="author"><h3 class="author"><span class="firstname">John</span> <span class="othername">H.</span> <span class="surname">Terpstra</span></h3><div class="affiliation"><span class="orgname">Samba Team<br></span><div class="address"><p><code class="email"><<a class="email" href="mailto:jht@samba.org">jht@samba.org</a>></code></p></div></div></div></div><div><div class="author"><h3 class="author"><span class="firstname">Jeremy</span> <span class="surname">Allison</span></h3><div class="affiliation"><span class="orgname">Samba Team<br></span><div class="address"><p><code class="email"><<a class="email" href="mailto:jra@samba.org">jra@samba.org</a>></code></p></div></div></div></div></div></div><div class="toc"><p><b>Table of Contents</b></p><dl><dt><span class="sect1"><a href="SambaHA.html#id434489">Features and Benefits</a></span></dt><dt><span class="sect1"><a href="SambaHA.html#id434596">Technical Discussion</a></span></dt><dd><dl><dt><span class="sect2"><a href="SambaHA.html#id434627">The Ultimate Goal</a></span></dt><dt><span class="sect2"><a href="SambaHA.html#id434749">Why Is This So Hard?</a></span></dt><dt><span class="sect2"><a href="SambaHA.html#id435417">A Simple Solution</a></span></dt><dt><span class="sect2"><a href="SambaHA.html#id435490">High-Availability Server Products</a></span></dt><dt><span class="sect2"><a href="SambaHA.html#id435618">MS-DFS: The Poor Man's Cluster</a></span></dt><dt><span class="sect2"><a href="SambaHA.html#id435651">Conclusions</a></span></dt></dl></dd></dl></div><div class="sect1" title="Features and Benefits"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="id434489"></a>Features and Benefits</h2></div></div></div><p> -<a class="indexterm" name="id434496"></a> -<a class="indexterm" name="id434503"></a> -<a class="indexterm" name="id434510"></a> -Network administrators are often concerned about the availability of file and print -services. Network users are inclined toward intolerance of the services they depend -on to perform vital task responsibilities. -</p><p> -A sign in a computer room served to remind staff of their responsibilities. It read: -</p><div class="blockquote"><blockquote class="blockquote"><p> -<a class="indexterm" name="id434528"></a> -<a class="indexterm" name="id434535"></a> -<a class="indexterm" name="id434542"></a> -<a class="indexterm" name="id434548"></a> -All humans fail, in both great and small ways we fail continually. Machines fail too. -Computers are machines that are managed by humans, the fallout from failure -can be spectacular. Your responsibility is to deal with failure, to anticipate it -and to eliminate it as far as is humanly and economically wise to achieve. -Are your actions part of the problem or part of the solution? -</p></blockquote></div><p> -If we are to deal with failure in a planned and productive manner, then first we must -understand the problem. That is the purpose of this chapter. -</p><p> -<a class="indexterm" name="id434567"></a> -<a class="indexterm" name="id434574"></a> -<a class="indexterm" name="id434581"></a> -Parenthetically, in the following discussion there are seeds of information on how to -provision a network infrastructure against failure. Our purpose here is not to provide -a lengthy dissertation on the subject of high availability. Additionally, we have made -a conscious decision to not provide detailed working examples of high availability -solutions; instead we present an overview of the issues in the hope that someone will -rise to the challenge of providing a detailed document that is focused purely on -presentation of the current state of knowledge and practice in high availability as it -applies to the deployment of Samba and other CIFS/SMB technologies. -</p></div><div class="sect1" title="Technical Discussion"><div class="titlepage"><div><div><h2 class="title" style="clear: both"><a name="id434596"></a>Technical Discussion</h2></div></div></div><p> -<a class="indexterm" name="id434603"></a> -<a class="indexterm" name="id434610"></a> -<a class="indexterm" name="id434617"></a> -The following summary was part of a presentation by Jeremy Allison at the SambaXP 2003 -conference that was held at Goettingen, Germany, in April 2003. Material has been added -from other sources, but it was Jeremy who inspired the structure that follows. -</p><div class="sect2" title="The Ultimate Goal"><div class="titlepage"><div><div><h3 class="title"><a name="id434627"></a>The Ultimate Goal</h3></div></div></div><p> -<a class="indexterm" name="id434635"></a> -<a class="indexterm" name="id434642"></a> -<a class="indexterm" name="id434648"></a> - All clustering technologies aim to achieve one or more of the following: - </p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem"><p>Obtain the maximum affordable computational power.</p></li><li class="listitem"><p>Obtain faster program execution.</p></li><li class="listitem"><p>Deliver unstoppable services.</p></li><li class="listitem"><p>Avert points of failure.</p></li><li class="listitem"><p>Exact most effective utilization of resources.</p></li></ul></div><p> - A clustered file server ideally has the following properties: -<a class="indexterm" name="id434687"></a> -<a class="indexterm" name="id434693"></a> -<a class="indexterm" name="id434700"></a> -<a class="indexterm" name="id434707"></a> - </p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem"><p>All clients can connect transparently to any server.</p></li><li class="listitem"><p>A server can fail and clients are transparently reconnected to another server.</p></li><li class="listitem"><p>All servers serve out the same set of files.</p></li><li class="listitem"><p>All file changes are immediately seen on all servers.</p><div class="itemizedlist"><ul class="itemizedlist" type="circle"><li class="listitem"><p>Requires a distributed file system.</p></li></ul></div></li><li class="listitem"><p>Infinite ability to scale by adding more servers or disks.</p></li></ul></div></div><div class="sect2" title="Why Is This So Hard?"><div class="titlepage"><div><div><h3 class="title"><a name="id434749"></a>Why Is This So Hard?</h3></div></div></div><p> - In short, the problem is one of <span class="emphasis"><em>state</em></span>. - </p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem"><p> -<a class="indexterm" name="id434768"></a> - All TCP/IP connections are dependent on state information. - </p><p> -<a class="indexterm" name="id434779"></a> - The TCP connection involves a packet sequence number. This - sequence number would need to be dynamically updated on all - machines in the cluster to effect seamless TCP failover. - </p></li><li class="listitem"><p> -<a class="indexterm" name="id434794"></a> -<a class="indexterm" name="id434801"></a> - CIFS/SMB (the Windows networking protocols) uses TCP connections. - </p><p> - This means that from a basic design perspective, failover is not - seriously considered. - </p><div class="itemizedlist"><ul class="itemizedlist" type="circle"><li class="listitem"><p> - All current SMB clusters are failover solutions - they rely on the clients to reconnect. They provide server - failover, but clients can lose information due to a server failure. -<a class="indexterm" name="id434823"></a> - </p></li></ul></div><p> - </p></li><li class="listitem"><p> - Servers keep state information about client connections. - </p><div class="itemizedlist"><a class="indexterm" name="id434840"></a><ul class="itemizedlist" type="circle"><li class="listitem"><p>CIFS/SMB involves a lot of state.</p></li><li class="listitem"><p>Every file open must be compared with other open files - to check share modes.</p></li></ul></div><p> - </p></li></ul></div><div class="sect3" title="The Front-End Challenge"><div class="titlepage"><div><div><h4 class="title"><a name="id434861"></a>The Front-End Challenge</h4></div></div></div><p> -<a class="indexterm" name="id434869"></a> -<a class="indexterm" name="id434875"></a> -<a class="indexterm" name="id434882"></a> -<a class="indexterm" name="id434889"></a> -<a class="indexterm" name="id434896"></a> -<a class="indexterm" name="id434903"></a> -<a class="indexterm" name="id434910"></a> - To make it possible for a cluster of file servers to appear as a single server that has one - name and one IP address, the incoming TCP data streams from clients must be processed by the - front-end virtual server. This server must de-multiplex the incoming packets at the SMB protocol - layer level and then feed the SMB packet to different servers in the cluster. - </p><p> -<a class="indexterm" name="id434922"></a> -<a class="indexterm" name="id434929"></a> - One could split all IPC$ connections and RPC calls to one server to handle printing and user - lookup requirements. RPC printing handles are shared between different IPC4 sessions it is - hard to split this across clustered servers! - </p><p> - Conceptually speaking, all other servers would then provide only file services. This is a simpler - problem to concentrate on. - </p></div><div class="sect3" title="Demultiplexing SMB Requests"><div class="titlepage"><div><div><h4 class="title"><a name="id434948"></a>Demultiplexing SMB Requests</h4></div></div></div><p> -<a class="indexterm" name="id434956"></a> -<a class="indexterm" name="id434962"></a> -<a class="indexterm" name="id434969"></a> -<a class="indexterm" name="id434976"></a> - De-multiplexing of SMB requests requires knowledge of SMB state information, - all of which must be held by the front-end <span class="emphasis"><em>virtual</em></span> server. - This is a perplexing and complicated problem to solve. - </p><p> -<a class="indexterm" name="id434991"></a> -<a class="indexterm" name="id434998"></a> -<a class="indexterm" name="id435004"></a> - Windows XP and later have changed semantics so state information (vuid, tid, fid) - must match for a successful operation. This makes things simpler than before and is a - positive step forward. - </p><p> -<a class="indexterm" name="id435016"></a> -<a class="indexterm" name="id435023"></a> - SMB requests are sent by vuid to their associated server. No code exists today to - effect this solution. This problem is conceptually similar to the problem of - correctly handling requests from multiple requests from Windows 2000 - Terminal Server in Samba. - </p><p> -<a class="indexterm" name="id435036"></a> - One possibility is to start by exposing the server pool to clients directly. - This could eliminate the de-multiplexing step. - </p></div><div class="sect3" title="The Distributed File System Challenge"><div class="titlepage"><div><div><h4 class="title"><a name="id435046"></a>The Distributed File System Challenge</h4></div></div></div><p> -<a class="indexterm" name="id435054"></a> - There exists many distributed file systems for UNIX and Linux. - </p><p> -<a class="indexterm" name="id435064"></a> -<a class="indexterm" name="id435071"></a> -<a class="indexterm" name="id435078"></a> -<a class="indexterm" name="id435085"></a> -<a class="indexterm" name="id435092"></a> -<a class="indexterm" name="id435098"></a> - Many could be adopted to backend our cluster, so long as awareness of SMB - semantics is kept in mind (share modes, locking, and oplock issues in particular). - Common free distributed file systems include: -<a class="indexterm" name="id435107"></a> -<a class="indexterm" name="id435114"></a> -<a class="indexterm" name="id435121"></a> -<a class="indexterm" name="id435127"></a> - </p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem"><p>NFS</p></li><li class="listitem"><p>AFS</p></li><li class="listitem"><p>OpenGFS</p></li><li class="listitem"><p>Lustre</p></li></ul></div><p> -<a class="indexterm" name="id435158"></a> - The server pool (cluster) can use any distributed file system backend if all SMB - semantics are performed within this pool. - </p></div><div class="sect3" title="Restrictive Constraints on Distributed File Systems"><div class="titlepage"><div><div><h4 class="title"><a name="id435168"></a>Restrictive Constraints on Distributed File Systems</h4></div></div></div><p> -<a class="indexterm" name="id435176"></a> -<a class="indexterm" name="id435183"></a> -<a class="indexterm" name="id435190"></a> -<a class="indexterm" name="id435197"></a> - Where a clustered server provides purely SMB services, oplock handling - may be done within the server pool without imposing a need for this to - be passed to the backend file system pool. - </p><p> -<a class="indexterm" name="id435209"></a> -<a class="indexterm" name="id435215"></a> - On the other hand, where the server pool also provides NFS or other file services, - it will be essential that the implementation be oplock-aware so it can - interoperate with SMB services. This is a significant challenge today. A failure - to provide this interoperability will result in a significant loss of performance that will be - sorely noted by users of Microsoft Windows clients. - </p><p> - Last, all state information must be shared across the server pool. - </p></div><div class="sect3" title="Server Pool Communications"><div class="titlepage"><div><div><h4 class="title"><a name="id435235"></a>Server Pool Communications</h4></div></div></div><p> -<a class="indexterm" name="id435243"></a> -<a class="indexterm" name="id435249"></a> -<a class="indexterm" name="id435256"></a> -<a class="indexterm" name="id435263"></a> - Most backend file systems support POSIX file semantics. This makes it difficult - to push SMB semantics back into the file system. POSIX locks have different properties - and semantics from SMB locks. - </p><p> -<a class="indexterm" name="id435275"></a> -<a class="indexterm" name="id435281"></a> -<a class="indexterm" name="id435288"></a> - All <code class="literal">smbd</code> processes in the server pool must of necessity communicate - very quickly. For this, the current <em class="parameter"><code>tdb</code></em> file structure that Samba - uses is not suitable for use across a network. Clustered <code class="literal">smbd</code>s must use something else. - </p></div><div class="sect3" title="Server Pool Communications Demands"><div class="titlepage"><div><div><h4 class="title"><a name="id435316"></a>Server Pool Communications Demands</h4></div></div></div><p> - High-speed interserver communications in the server pool is a design prerequisite - for a fully functional system. Possibilities for this include: - </p><div class="itemizedlist"><a class="indexterm" name="id435329"></a><a class="indexterm" name="id435336"></a><ul class="itemizedlist" type="disc"><li class="listitem"><p> - Proprietary shared memory bus (example: Myrinet or SCI [scalable coherent interface]). - These are high-cost items. - </p></li><li class="listitem"><p> - Gigabit Ethernet (now quite affordable). - </p></li><li class="listitem"><p> - Raw Ethernet framing (to bypass TCP and UDP overheads). - </p></li></ul></div><p> - We have yet to identify metrics for performance demands to enable this to happen - effectively. - </p></div><div class="sect3" title="Required Modifications to Samba"><div class="titlepage"><div><div><h4 class="title"><a name="id435366"></a>Required Modifications to Samba</h4></div></div></div><p> - Samba needs to be significantly modified to work with a high-speed server interconnect - system to permit transparent failover clustering. - </p><p> - Particular functions inside Samba that will be affected include: - </p><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem"><p> - The locking database, oplock notifications, - and the share mode database. - </p></li><li class="listitem"><p> -<a class="indexterm" name="id435391"></a> -<a class="indexterm" name="id435398"></a> - Failure semantics need to be defined. Samba behaves the same way as Windows. - When oplock messages fail, a file open request is allowed, but this is - potentially dangerous in a clustered environment. So how should interserver - pool failure semantics function, and how should such functionality be implemented? - </p></li><li class="listitem"><p> - Should this be implemented using a point-to-point lock manager, or can this - be done using multicast techniques? - </p></li></ul></div></div></div><div class="sect2" title="A Simple Solution"><div class="titlepage"><div><div><h3 class="title"><a name="id435417"></a>A Simple Solution</h3></div></div></div><p> -<a class="indexterm" name="id435425"></a> -<a class="indexterm" name="id435432"></a> -<a class="indexterm" name="id435438"></a> - Allowing failover servers to handle different functions within the exported file system - removes the problem of requiring a distributed locking protocol. - </p><p> -<a class="indexterm" name="id435450"></a> -<a class="indexterm" name="id435457"></a> - If only one server is active in a pair, the need for high-speed server interconnect is avoided. - This allows the use of existing high-availability solutions, instead of inventing a new one. - This simpler solution comes at a price the cost of which is the need to manage a more - complex file name space. Since there is now not a single file system, administrators - must remember where all services are located a complexity not easily dealt with. - </p><p> -<a class="indexterm" name="id435477"></a> - The <span class="emphasis"><em>virtual server</em></span> is still needed to redirect requests to backend - servers. Backend file space integrity is the responsibility of the administrator. - </p></div><div class="sect2" title="High-Availability Server Products"><div class="titlepage"><div><div><h3 class="title"><a name="id435490"></a>High-Availability Server Products</h3></div></div></div><p> -<a class="indexterm" name="id435498"></a> -<a class="indexterm" name="id435504"></a> -<a class="indexterm" name="id435511"></a> -<a class="indexterm" name="id435518"></a> -<a class="indexterm" name="id435525"></a> - Failover servers must communicate in order to handle resource failover. This is essential - for high-availability services. The use of a dedicated heartbeat is a common technique to - introduce some intelligence into the failover process. This is often done over a dedicated - link (LAN or serial). - </p><p> -<a class="indexterm" name="id435537"></a> -<a class="indexterm" name="id435544"></a> -<a class="indexterm" name="id435551"></a> -<a class="indexterm" name="id435558"></a> -<a class="indexterm" name="id435565"></a> - Many failover solutions (like Red Hat Cluster Manager and Microsoft Wolfpack) - can use a shared SCSI of Fiber Channel disk storage array for failover communication. - Information regarding Red Hat high availability solutions for Samba may be obtained from - <a class="ulink" href="http://www.redhat.com/docs/manuals/enterprise/RHEL-AS-2.1-Manual/cluster-manager/s1-service-samba.html" target="_top">www.redhat.com</a>. - </p><p> -<a class="indexterm" name="id435583"></a> - The Linux High Availability project is a resource worthy of consultation if your desire is - to build a highly available Samba file server solution. Please consult the home page at - <a class="ulink" href="http://www.linux-ha.org/" target="_top">www.linux-ha.org/</a>. - </p><p> -<a class="indexterm" name="id435601"></a> -<a class="indexterm" name="id435608"></a> - Front-end server complexity remains a challenge for high availability because it must deal - gracefully with backend failures, while at the same time providing continuity of service - to all network clients. - </p></div><div class="sect2" title="MS-DFS: The Poor Man's Cluster"><div class="titlepage"><div><div><h3 class="title"><a name="id435618"></a>MS-DFS: The Poor Man's Cluster</h3></div></div></div><p> -<a class="indexterm" name="id435626"></a> -<a class="indexterm" name="id435633"></a> - MS-DFS links can be used to redirect clients to disparate backend servers. This pushes - complexity back to the network client, something already included by Microsoft. - MS-DFS creates the illusion of a simple, continuous file system name space that works even - at the file level. - </p><p> - Above all, at the cost of complexity of management, a distributed system (pseudo-cluster) can - be created using existing Samba functionality. - </p></div><div class="sect2" title="Conclusions"><div class="titlepage"><div><div><h3 class="title"><a name="id435651"></a>Conclusions</h3></div></div></div><div class="itemizedlist"><ul class="itemizedlist" type="disc"><li class="listitem"><p>Transparent SMB clustering is hard to do!</p></li><li class="listitem"><p>Client failover is the best we can do today.</p></li><li class="listitem"><p>Much more work is needed before a practical and manageable high-availability transparent cluster solution will be possible.</p></li><li class="listitem"><p>MS-DFS can be used to create the illusion of a single transparent cluster.</p></li></ul></div></div></div></div><div class="navfooter"><hr><table width="100%" summary="Navigation footer"><tr><td width="40%" align="left"><a accesskey="p" href="Backup.html">Prev</a> </td><td width="20%" align="center"><a accesskey="u" href="optional.html">Up</a></td><td width="40%" align="right"> <a accesskey="n" href="largefile.html">Next</a></td></tr><tr><td width="40%" align="left" valign="top">Chapter 31. Backup Techniques </td><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td><td width="40%" align="right" valign="top"> Chapter 33. Handling Large Directories</td></tr></table></div></body></html> |
