summaryrefslogtreecommitdiff
path: root/README.md
blob: c5feb4a1684a3a8d7a2cc0c0d585d2789daebf02 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
illumos-kvm: KVM for illumos
============================

KVM is the kernel virtual machine, a framework for the in-kernel acceleration
of QEMU.  illumos-kvm is a port of KVM to illumos, taking advantage of
illumos-specific constructs like DTrace, cyclics, mdb, kstat, OS
virtualization, network virtualization, ZFS, etc.  It is derived from the KVM
source for Linux 2.6.34, the longterm source for which may be found here:

    git://git.kernel.org/pub/scm/linux/kernel/git/longterm/linux-2.6.34.y.git

To date, this implementation has been verified with a wide range of guest
operating systems including illumos itself (both SmartOS and OpenIndiana
distributions), FreeBSD, Plan 9, QNX, ChromeOS, HaikuOS, Microsoft Windows
and Linux.

The design center for this work is to use the virtualization features made
available in the microprocessor -- and in particular, Intel's VMX.  As such,
behavior on microprocessors that do not support VMX -- and more specifically,
the extended page tables (EPT) found in second generation VMX support --
should be graceful failure, not degraded operation.

Divergences from KVM
--------------------

Divergences from KVM fall into several broad categories:  some functionality
has been removed or not implemented because it is obviated by features of
illumos (e.g., the custom tracing facility built into KVM); some functionality
has been removed because it is only relevant to hardware that lacks
virtualization support (e.g., older x86 hardware) or on hardware for which
illumos lacks support (e.g., PPC, s390); and some functionality has been
removed because the implementation complexity was simply too great relative
to its value.

Of this latter category, three areas of divergence merit special note.  First,
there is no support for pageable guest memory (that is, guest memory is locked
down).  While this is an opinionated decision at some level (in our
experience, memory oversell leads to unacceptable pathologies in all but the
idlest of workloads), we would welcome the work to integrate the KVM MMU
notifier support into illumos-kvm.

Second (and relatedly), illumos itself has no support for kernel same-page
mapping (KSM) as found in Linux.  While illumos could in principle add such
support, it is our experience that the memory that accrues from this is not
sufficiently significant to pay for the increase in implementation and
operator complexity.

Finally, there is no support currently for AMD SVM.  This is not a value
judgement of AMD's technology, but rather a reflection of limited engineering
and testing resources.  (In the spirit of full disclosure, it should be said
that the sponsor of illumos-kvm, Joyent, is an Intel-funded company -- but the
lack of AMD support reflects only engineering prioritization and lack of
testing infrastructure; AMD SVM support would be most welcome should someone
in the community be so motivated as to port and test it.)

Building illumos-kvm
--------------------

### Preparation

Edit the Makefile and appropriately set the path for the `KERNEL_SOURCE`
directory to point to the root of a checked out and built illumos directory.
Building illumos KVM requires several recent additions to illumos,
so be sure your illumos is up to date.

Verify that you have gcc 4.4.4 installed that is used to build illumos.
If you are not building this with SmartOS, you may need to modify the
Makefile such that CC is pointing to the correct gcc.

Verify that you either have SUNWmake or GNU make installed.

### Building

To build, simply use the default make target:

    $ make

To check style, header files, and other various nits:

    $ make check

Installing illumos-kvm
----------------------

### System requirements

To run illumos-kvm, you will need an illumos that has the fix for issue
1347 (integrated on 2011-08-11). Further, your machine will need to
support VMX.  To see if your machine supports VMX, run `isainfo -v` and
look for `vmx`, e.g.:

      % isainfo -v
      64-bit amd64 applications
            vmx sse4.2 sse4.1 ssse3 popcnt tscp cx16 sse3 sse2 sse fxsr mmx 
            cmov amd_sysc cx8 tsc fpu 
      32-bit i386 applications
            vmx sse4.2 sse4.1 ssse3 popcnt tscp ahf cx16 sse3 sse2 sse fxsr mmx 
            cmov sep cx8 tsc fpu 

If you do not see `vmx` in this output, the `kvm` driver will be unable to
attach.

### Required binaries

There are two mandatory artifacts to install, and two optional component:

* `kvm` is the driver itself
* `kvm.conf` is the driver configuration file
* `kvm.so` is the mdb module
* `JOY_kvm_link.so` is the devfsadm plugin

On the target machine, place `kvm` in `/kernel/drv/amd64` and `kvm.conf`
in `/kernel/drv`. Place `JOY_kvm_link.so` in `/usr/lib/devfsadm/linkmod` then:

    # add_drv kvm

You can verify that the driver installed and attached properly by checking for
its presence in /dev.

    # ls -l /dev/kvm

Running illumos-kvm
-------------------

To run KVM, you will need the build product of the illumos-kvm-cmd repo:
`qemu-system-x86_64`; please follow the instructions in the illumos-kvm-cmd
repo to execute QEMU such that KVM is enabled.

Monitoring illumos-kvm
----------------------

Once one or more VMs are running, there is a variety of tooling to help
understand the operating characteristics of the system.

### kvmstat

The `kvmstat` command, found in the illumos repository, can be used to monitor
VMs. For example, here is one second of `kvmstat` output from a machine
running two VMs (one 2 VCPU instance running Linux; another 4 VCPU instance
running the illumos-derived SmartOS):

       pid vcpu |  exits :  haltx   irqx  irqwx    iox  mmiox |   irqs   emul   eptv
      4668    0 |     23 :      6      0      0      1      0 |      6     16      0
      4668    1 |     25 :      6      1      0      1      0 |      6     16      0
      5026    0 |  17833 :    223   2946    707    106      0 |   3379  13315      0
      5026    1 |  18687 :    244   2761    512      0      0 |   3085  14803      0
      5026    2 |  15696 :    194   3452    542      0      0 |   3568  11230      0
      5026    3 |  16822 :    244   2817    487      0      0 |   3100  12963      0

As for the meaning of the columns, they are explained with `kvmstat -h`:

      # kvmstat -h
      Usage: kvmstat [interval [count]]

        Displays statistics for running kernel virtual machines, with one line
        per virtual CPU.  All statistics are reported as per-second rates.

        The columns are as follows:

          pid    =>  identifier of process controlling the virtual CPU
          vcpu   =>  virtual CPU identifier relative to its virtual machine
          exits  =>  virtual machine exits for the virtual CPU
          haltx  =>  virtual machine exits due to the HLT instruction
          irqx   =>  virtual machine exits due to a pending external interrupt
          irqwx  =>  virtual machine exits due to an open interrupt window
          iox    =>  virtual machine exits due to an I/O instruction
          mmiox  =>  virtual machine exits due to memory mapped I/O 
          irqs   =>  interrupts injected into the virtual CPU
          emul   =>  instructions emulated in the kernel
          eptv   =>  extended page table violations

### kstat

As one might expect, `kvmstat` is implemented in terms of kstat.  You
can use `kstat(1)` to browse the kstats from the `kvm` module:

       # kstat -m kvm
       ...
       module: kvm                      instance: 0     
       name:   vcpu-4                   class:    misc
        crtime                          4407.142410068
        exits                           5367443
        fpu-reload                      57302
        halt-exits                      317275
        halt-wakeup                     8991
        host-state-reload               503920
        hypercalls                      0
        insn-emulation                  3043881
        inst-emulation-fail             0
        invlpg                          0
        io-exits                        237191
        irq-exits                       1668
        irq-injections                  320339
        irq-window-exits                1635
        mmio-exits                      617
        nmi-injections                  0
        nmi-window-exits                0
        pf-fixed                        163629
        pf-guest                        0
        pid                             3949
        request-irq-exits               0
        signal-exits                    460
        snaptime                        43219.723435123
        zonename                        global
       
       module: kvm                      instance: 4     
       name:   vm                       class:    misc
        crtime                          4407.1241134
        lpages                          0
        mmu-cache-miss                  950
        mmu-flooded                     0
        mmu-pte-updated                 0
        mmu-pte-write                   56360
        mmu-pte-zapped                  0
        mmu-recycled                    0
        mmu-unsync-page                 0
        pid                             3949
        remote-tlb-flush                1511
        snaptime                        43219.723875091
        zonename                        global
       

### DTrace

While there is not currently a stable KVM provider, there are many SDT probes
in KVM; `dtrace -l -m sdt:kvm` to list these.

Of these, of particular note are the `kvm-guest-entry` and `kvm-guest-exit`
probes, which fire upon entry to and exit from a guest virtual machine.  To
determine context, one can use the `vmregs` variable present in illumos.

For example, here's a simple script that shows histograms of time spent in VM
guests on a per-PID and per-VCPU basis:

        #pragma D option quiet
        
        kvm-guest-entry
        {
                self->entry = timestamp;
        }
        
        kvm-guest-exit
        /self->entry/
        {
                @[pid, vmregs[VMX_VIRTUAL_PROCESSOR_ID]] =
                    quantize(timestamp - self->entry);
        }
        
        END
        {
                printa("pid %d, vcpu %d: %@d\n", @);
        }

Here's what the output of running the above might look like:

      pid 3949, vcpu 1: 
           value  ------------- Distribution ------------- count    
             512 |                                         0        
            1024 |@@@@@@@@@@@@@                            26805    
            2048 |@@@@@                                    11641    
            4096 |@@@@@@@                                  14187    
            8192 |@                                        1559     
           16384 |@                                        2931     
           32768 |@@@                                      5653     
           65536 |@@@@                                     8385     
          131072 |@@@                                      6926     
          262144 |@@@                                      6639     
          524288 |                                         785      
         1048576 |                                         0        

There are many other ways in which DTrace can be used to understand either
host or guest behavior; see the `tools` subdirectory from some sample D
scripts.

### mdb

The `kvm.so` build product is an mdb module that contains several useful
commands, including a `kvm` walker to iterate over all `struct kvm`
structures.

Contributing to illumos-kvm
---------------------------

Unless and until its volume dictate that it be elsewhere, illumos KVM
discussion should be on the `illumos-developer` mailing list.
Contributions are happily accepted; please send patches to 
`illumos-developer`.