Diffstat (limited to 'doc/articles')
-rw-r--r-- | doc/articles/defer_panic_recover.html | 274
-rw-r--r-- | doc/articles/defer_panic_recover.tmpl | 195
-rw-r--r-- | doc/articles/error_handling.html | 433
-rw-r--r-- | doc/articles/error_handling.tmpl | 314
-rw-r--r-- | doc/articles/slice-1.png | bin | 0 -> 6334 bytes
-rw-r--r-- | doc/articles/slice-2.png | bin | 0 -> 7220 bytes
-rw-r--r-- | doc/articles/slice-3.png | bin | 0 -> 7303 bytes
-rw-r--r-- | doc/articles/slice-array.png | bin | 0 -> 1237 bytes
-rw-r--r-- | doc/articles/slice-struct.png | bin | 0 -> 3650 bytes
-rw-r--r-- | doc/articles/slices_usage_and_internals.html | 479
-rw-r--r-- | doc/articles/slices_usage_and_internals.tmpl | 438
11 files changed, 2133 insertions, 0 deletions
diff --git a/doc/articles/defer_panic_recover.html b/doc/articles/defer_panic_recover.html new file mode 100644 index 000000000..18c0de2d6 --- /dev/null +++ b/doc/articles/defer_panic_recover.html @@ -0,0 +1,274 @@ +<!--{ + "Title": "Defer, Panic, and Recover" +}--> +<!-- + DO NOT EDIT: created by + tmpltohtml articles/defer_panic_recover.tmpl +--> + +<p> +Go has the usual mechanisms for control flow: if, for, switch, goto. It also +has the go statement to run code in a separate goroutine. Here I'd like to +discuss some of the less common ones: defer, panic, and recover. +</p> + +<p> +A <b>defer statement</b> pushes a function call onto a list. The list of saved +calls is executed after the surrounding function returns. Defer is commonly +used to simplify functions that perform various clean-up actions. +</p> + +<p> +For example, let's look at a function that opens two files and copies the +contents of one file to the other: +</p> + +<pre><!--{{code "progs/defer.go" `/func CopyFile/` `/STOP/`}} +-->func CopyFile(dstName, srcName string) (written int64, err error) { + src, err := os.Open(srcName) + if err != nil { + return + } + + dst, err := os.Create(dstName) + if err != nil { + return + } + + written, err = io.Copy(dst, src) + dst.Close() + src.Close() + return +}</pre> + +<p> +This works, but there is a bug. If the call to os.Create fails, the +function will return without closing the source file. This can be easily +remedied by putting a call to src.Close() before the second return statement, +but if the function were more complex the problem might not be so easily +noticed and resolved.
By introducing defer statements we can ensure that the +files are always closed: +</p> + +<pre><!--{{code "progs/defer2.go" `/func CopyFile/` `/STOP/`}} +-->func CopyFile(dstName, srcName string) (written int64, err error) { + src, err := os.Open(srcName) + if err != nil { + return + } + defer src.Close() + + dst, err := os.Create(dstName) + if err != nil { + return + } + defer dst.Close() + + return io.Copy(dst, src) +}</pre> + +<p> +Defer statements allow us to think about closing each file right after opening +it, guaranteeing that, regardless of the number of return statements in the +function, the files <i>will</i> be closed. +</p> + +<p> +The behavior of defer statements is straightforward and predictable. There are +three simple rules: +</p> + +<p> +1. <i>A deferred function's arguments are evaluated when the defer statement is +evaluated.</i> +</p> + +<p> +In this example, the expression "i" is evaluated when the Println call is +deferred. The deferred call will print "0" after the function returns. +</p> + +<pre><!--{{code "progs/defer.go" `/func a/` `/STOP/`}} +-->func a() { + i := 0 + defer fmt.Println(i) + i++ + return +}</pre> + +<p> +2. <i>Deferred function calls are executed in Last In First Out order +</i>after<i> the surrounding function returns.</i> +</p> + +<p> +This function prints "3210": +</p> + +<pre><!--{{code "progs/defer.go" `/func b/` `/STOP/`}} +-->func b() { + for i := 0; i < 4; i++ { + defer fmt.Print(i) + } +}</pre> + +<p> +3. <i>Deferred functions may read and assign to the returning function's named +return values.</i> +</p> + +<p> +In this example, a deferred function increments the return value i <i>after</i> +the surrounding function returns. Thus, this function returns 2: +</p> + +<pre><!--{{code "progs/defer.go" `/func c/` `/STOP/`}} +-->func c() (i int) { + defer func() { i++ }() + return 1 +}</pre> + +<p> +This is convenient for modifying the error return value of a function; we will +see an example of this shortly. 
+</p> + +<p> +<b>Panic</b> is a built-in function that stops the ordinary flow of control and +begins <i>panicking</i>. When the function F calls panic, execution of F stops, +any deferred functions in F are executed normally, and then F returns to its +caller. To the caller, F then behaves like a call to panic. The process +continues up the stack until all functions in the current goroutine have +returned, at which point the program crashes. Panics can be initiated by +invoking panic directly. They can also be caused by runtime errors, such as +out-of-bounds array accesses. +</p> + +<p> +<b>Recover</b> is a built-in function that regains control of a panicking +goroutine. Recover is only useful inside deferred functions. During normal +execution, a call to recover will return nil and have no other effect. If the +current goroutine is panicking, a call to recover will capture the value given +to panic and resume normal execution. +</p> + +<p> +Here's an example program that demonstrates the mechanics of panic and defer: +</p> + +<pre><!--{{code "progs/defer2.go" `/package main/` `/STOP/`}} +-->package main + +import "fmt" + +func main() { + f() + fmt.Println("Returned normally from f.") +} + +func f() { + defer func() { + if r := recover(); r != nil { + fmt.Println("Recovered in f", r) + } + }() + fmt.Println("Calling g.") + g(0) + fmt.Println("Returned normally from g.") +} + +func g(i int) { + if i > 3 { + fmt.Println("Panicking!") + panic(fmt.Sprintf("%v", i)) + } + defer fmt.Println("Defer in g", i) + fmt.Println("Printing in g", i) + g(i + 1) +}</pre> + +<p> +The function g takes the int i, and panics if i is greater than 3, or else it +calls itself with the argument i+1. The function f defers a function that calls +recover and prints the recovered value (if it is non-nil). Try to picture what +the output of this program might be before reading on. +</p> + +<p> +The program will output: +</p> + +<pre>Calling g. 
+Printing in g 0 +Printing in g 1 +Printing in g 2 +Printing in g 3 +Panicking! +Defer in g 3 +Defer in g 2 +Defer in g 1 +Defer in g 0 +Recovered in f 4 +Returned normally from f.</pre> + +<p> +If we remove the deferred function from f the panic is not recovered and +reaches the top of the goroutine's call stack, terminating the program. This +modified program will output: +</p> + +<pre>Calling g. +Printing in g 0 +Printing in g 1 +Printing in g 2 +Printing in g 3 +Panicking! +Defer in g 3 +Defer in g 2 +Defer in g 1 +Defer in g 0 +panic: 4 + +panic PC=0x2a9cd8 +[stack trace omitted]</pre> + +<p> +For a real-world example of <b>panic</b> and <b>recover</b>, see the +<a href="/pkg/encoding/json/">json package</a> from the Go standard library. +It decodes JSON-encoded data with a set of recursive functions. +When malformed JSON is encountered, the parser calls panic to unwind the +stack to the top-level function call, which recovers from the panic and returns +an appropriate error value (see the 'error' and 'unmarshal' functions in +<a href="/src/pkg/encoding/json/decode.go">decode.go</a>). +</p> + +<p> +The convention in the Go libraries is that even when a package uses panic +internally, its external API still presents explicit error return values. +</p> + +<p> +Other uses of <b>defer</b> (beyond the file.Close() example given earlier) +include releasing a mutex: +</p> + +<pre>mu.Lock() +defer mu.Unlock()</pre> + +<p> +printing a footer: +</p> + +<pre>printHeader() +defer printFooter()</pre> + +<p> +and more. +</p> + +<p> +In summary, the defer statement (with or without panic and recover) provides an +unusual and powerful mechanism for control flow. It can be used to model a +number of features implemented by special-purpose structures in other +programming languages. Try it out.
+</p> diff --git a/doc/articles/defer_panic_recover.tmpl b/doc/articles/defer_panic_recover.tmpl new file mode 100644 index 000000000..60c8eebe0 --- /dev/null +++ b/doc/articles/defer_panic_recover.tmpl @@ -0,0 +1,195 @@ +<!--{ + "Title": "Defer, Panic, and Recover" +}--> +{{donotedit}} +<p> +Go has the usual mechanisms for control flow: if, for, switch, goto. It also +has the go statement to run code in a separate goroutine. Here I'd like to +discuss some of the less common ones: defer, panic, and recover. +</p> + +<p> +A <b>defer statement</b> pushes a function call onto a list. The list of saved +calls is executed after the surrounding function returns. Defer is commonly +used to simplify functions that perform various clean-up actions. +</p> + +<p> +For example, let's look at a function that opens two files and copies the +contents of one file to the other: +</p> + +{{code "progs/defer.go" `/func CopyFile/` `/STOP/`}} + +<p> +This works, but there is a bug. If the call to os.Create fails, the +function will return without closing the source file. This can be easily +remedied by putting a call to src.Close() before the second return statement, +but if the function were more complex the problem might not be so easily +noticed and resolved. By introducing defer statements we can ensure that the +files are always closed: +</p> + +{{code "progs/defer2.go" `/func CopyFile/` `/STOP/`}} + +<p> +Defer statements allow us to think about closing each file right after opening +it, guaranteeing that, regardless of the number of return statements in the +function, the files <i>will</i> be closed. +</p> + +<p> +The behavior of defer statements is straightforward and predictable. There are +three simple rules: +</p> + +<p> +1. <i>A deferred function's arguments are evaluated when the defer statement is +evaluated.</i> +</p> + +<p> +In this example, the expression "i" is evaluated when the Println call is +deferred.
The deferred call will print "0" after the function returns. +</p> + +{{code "progs/defer.go" `/func a/` `/STOP/`}} + +<p> +2. <i>Deferred function calls are executed in Last In First Out order +</i>after<i> the surrounding function returns.</i> +</p> + +<p> +This function prints "3210": +</p> + +{{code "progs/defer.go" `/func b/` `/STOP/`}} + +<p> +3. <i>Deferred functions may read and assign to the returning function's named +return values.</i> +</p> + +<p> +In this example, a deferred function increments the return value i <i>after</i> +the surrounding function returns. Thus, this function returns 2: +</p> + +{{code "progs/defer.go" `/func c/` `/STOP/`}} + +<p> +This is convenient for modifying the error return value of a function; we will +see an example of this shortly. +</p> + +<p> +<b>Panic</b> is a built-in function that stops the ordinary flow of control and +begins <i>panicking</i>. When the function F calls panic, execution of F stops, +any deferred functions in F are executed normally, and then F returns to its +caller. To the caller, F then behaves like a call to panic. The process +continues up the stack until all functions in the current goroutine have +returned, at which point the program crashes. Panics can be initiated by +invoking panic directly. They can also be caused by runtime errors, such as +out-of-bounds array accesses. +</p> + +<p> +<b>Recover</b> is a built-in function that regains control of a panicking +goroutine. Recover is only useful inside deferred functions. During normal +execution, a call to recover will return nil and have no other effect. If the +current goroutine is panicking, a call to recover will capture the value given +to panic and resume normal execution. 
+</p> + +<p> +Here's an example program that demonstrates the mechanics of panic and defer: +</p> + +{{code "progs/defer2.go" `/package main/` `/STOP/`}} + +<p> +The function g takes the int i, and panics if i is greater than 3, or else it +calls itself with the argument i+1. The function f defers a function that calls +recover and prints the recovered value (if it is non-nil). Try to picture what +the output of this program might be before reading on. +</p> + +<p> +The program will output: +</p> + +<pre>Calling g. +Printing in g 0 +Printing in g 1 +Printing in g 2 +Printing in g 3 +Panicking! +Defer in g 3 +Defer in g 2 +Defer in g 1 +Defer in g 0 +Recovered in f 4 +Returned normally from f.</pre> + +<p> +If we remove the deferred function from f the panic is not recovered and +reaches the top of the goroutine's call stack, terminating the program. This +modified program will output: +</p> + +<pre>Calling g. +Printing in g 0 +Printing in g 1 +Printing in g 2 +Printing in g 3 +Panicking! +Defer in g 3 +Defer in g 2 +Defer in g 1 +Defer in g 0 +panic: 4 + +panic PC=0x2a9cd8 +[stack trace omitted]</pre> + +<p> +For a real-world example of <b>panic</b> and <b>recover</b>, see the +<a href="/pkg/encoding/json/">json package</a> from the Go standard library. +It decodes JSON-encoded data with a set of recursive functions. +When malformed JSON is encountered, the parser calls panic to unwind the +stack to the top-level function call, which recovers from the panic and returns +an appropriate error value (see the 'error' and 'unmarshal' functions in +<a href="/src/pkg/encoding/json/decode.go">decode.go</a>). +</p> + +<p> +The convention in the Go libraries is that even when a package uses panic +internally, its external API still presents explicit error return values.
+</p> + +<p> +Other uses of <b>defer</b> (beyond the file.Close() example given earlier) +include releasing a mutex: +</p> + +<pre>mu.Lock() +defer mu.Unlock()</pre> + +<p> +printing a footer: +</p> + +<pre>printHeader() +defer printFooter()</pre> + +<p> +and more. +</p> + +<p> +In summary, the defer statement (with or without panic and recover) provides an +unusual and powerful mechanism for control flow. It can be used to model a +number of features implemented by special-purpose structures in other +programming languages. Try it out. +</p> diff --git a/doc/articles/error_handling.html b/doc/articles/error_handling.html new file mode 100644 index 000000000..b9393a2cb --- /dev/null +++ b/doc/articles/error_handling.html @@ -0,0 +1,433 @@ +<!--{ + "Title": "Error Handling and Go" +}--> +<!-- + DO NOT EDIT: created by + tmpltohtml articles/error_handling.tmpl +--> + +<p> +If you have written any Go code you have probably encountered the built-in +<code>error</code> type. Go code uses <code>error</code> values to +indicate an abnormal state. For example, the <code>os.Open</code> function +returns a non-nil <code>error</code> value when it fails to open a file. +</p> + +<pre><!--{{code "progs/error.go" `/func Open/`}} +-->func Open(name string) (file *File, err error)</pre> + +<p> +The following code uses <code>os.Open</code> to open a file. If an error +occurs it calls <code>log.Fatal</code> to print the error message and stop. +</p> + +<pre><!--{{code "progs/error.go" `/func openFile/` `/STOP/`}} +--> f, err := os.Open("filename.ext") + if err != nil { + log.Fatal(err) + } + // do something with the open *File f</pre> + +<p> +You can get a lot done in Go knowing just this about the <code>error</code> +type, but in this article we'll take a closer look at <code>error</code> and +discuss some good practices for error handling in Go. +</p> + +<p> +<b>The error type</b> +</p> + +<p> +The <code>error</code> type is an interface type. 
An <code>error</code> +variable represents any value that can describe itself as a string. Here is the +interface's declaration: +</p> + +<pre>type error interface { + Error() string +}</pre> + +<p> +The <code>error</code> type, as with all built in types, is +<a href="/doc/go_spec.html#Predeclared_identifiers">predeclared</a> in the +<a href="/doc/go_spec.html#Blocks">universe block</a>. +</p> + +<p> +The most commonly-used <code>error</code> implementation is the +<a href="/pkg/errors/">errors</a> package's unexported <code>errorString</code> type. +</p> + +<pre><!--{{code "progs/error.go" `/errorString/` `/STOP/`}} +-->// errorString is a trivial implementation of error. +type errorString struct { + s string +} + +func (e *errorString) Error() string { + return e.s +}</pre> + +<p> +You can construct one of these values with the <code>errors.New</code> +function. It takes a string that it converts to an <code>errors.errorString</code> +and returns as an <code>error</code> value. +</p> + +<pre><!--{{code "progs/error.go" `/New/` `/STOP/`}} +-->// New returns an error that formats as the given text. +func New(text string) error { + return &errorString{text} +}</pre> + +<p> +Here's how you might use <code>errors.New</code>: +</p> + +<pre><!--{{code "progs/error.go" `/func Sqrt/` `/STOP/`}} +-->func Sqrt(f float64) (float64, error) { + if f < 0 { + return 0, errors.New("math: square root of negative number") + } + // implementation +}</pre> + +<p> +A caller passing a negative argument to <code>Sqrt</code> receives a non-nil +<code>error</code> value (whose concrete representation is an +<code>errors.errorString</code> value). 
The caller can access the error string +("math: square root of...") by calling the <code>error</code>'s +<code>Error</code> method, or by just printing it: +</p> + +<pre><!--{{code "progs/error.go" `/func printErr/` `/STOP/`}} +--> f, err := Sqrt(-1) + if err != nil { + fmt.Println(err) + }</pre> + +<p> +The <a href="/pkg/fmt/">fmt</a> package formats an <code>error</code> value +by calling its <code>Error() string</code> method. +</p> + +<p> +It is the error implementation's responsibility to summarize the context. +The error returned by <code>os.Open</code> formats as "open /etc/passwd: +permission denied," not just "permission denied." The error returned by our +<code>Sqrt</code> is missing information about the invalid argument. +</p> + +<p> +To add that information, a useful function is the <code>fmt</code> package's +<code>Errorf</code>. It formats a string according to <code>Printf</code>'s +rules and returns it as an <code>error</code> created by +<code>errors.New</code>. +</p> + +<pre><!--{{code "progs/error.go" `/fmtError/` `/STOP/`}} +--> if f < 0 { + return 0, fmt.Errorf("math: square root of negative number %g", f) + }</pre> + +<p> +In many cases <code>fmt.Errorf</code> is good enough, but since +<code>error</code> is an interface, you can use arbitrary data structures as +error values, to allow callers to inspect the details of the error. +</p> + +<p> +For instance, our hypothetical callers might want to recover the invalid +argument passed to <code>Sqrt</code>. 
We can enable that by defining a new +error implementation instead of using <code>errors.errorString</code>: +</p> + +<pre><!--{{code "progs/error.go" `/type NegativeSqrtError/` `/STOP/`}} +-->type NegativeSqrtError float64 + +func (f NegativeSqrtError) Error() string { + return fmt.Sprintf("math: square root of negative number %g", float64(f)) +}</pre> + +<p> +A sophisticated caller can then use a +<a href="/doc/go_spec.html#Type_assertions">type assertion</a> to check for a +<code>NegativeSqrtError</code> and handle it specially, while callers that just +pass the error to <code>fmt.Println</code> or <code>log.Fatal</code> will see +no change in behavior. +</p> + +<p> +As another example, the <a href="/pkg/encoding/json/">json</a> package specifies a +<code>SyntaxError</code> type that the <code>json.Decode</code> function +returns when it encounters a syntax error parsing a JSON blob. +</p> + +<pre><!--{{code "progs/error.go" `/type SyntaxError/` `/STOP/`}} +-->type SyntaxError struct { + msg string // description of error + Offset int64 // error occurred after reading Offset bytes +} + +func (e *SyntaxError) Error() string { return e.msg }</pre> + +<p> +The <code>Offset</code> field isn't even shown in the default formatting of the +error, but callers can use it to add file and line information to their error +messages: +</p> + +<pre><!--{{code "progs/error.go" `/func decodeError/` `/STOP/`}} +--> if err := dec.Decode(&val); err != nil { + if serr, ok := err.(*json.SyntaxError); ok { + line, col := findLine(f, serr.Offset) + return fmt.Errorf("%s:%d:%d: %v", f.Name(), line, col, err) + } + return err + }</pre> + +<p> +(This is a slightly simplified version of some +<a href="http://camlistore.org/code/?p=camlistore.git;a=blob;f=lib/go/camli/jsonconfig/eval.go#l68">actual code</a> +from the <a href="http://camlistore.org">Camlistore</a> project.) 
+</p> + +<p> +The <code>error</code> interface requires only an <code>Error</code> method; +specific error implementations might have additional methods. For instance, the +<a href="/pkg/net/">net</a> package returns errors of type +<code>error</code>, following the usual convention, but some of the error +implementations have additional methods defined by the <code>net.Error</code> +interface: +</p> + +<pre>package net + +type Error interface { + error + Timeout() bool // Is the error a timeout? + Temporary() bool // Is the error temporary? +}</pre> + +<p> +Client code can test for a <code>net.Error</code> with a type assertion and +then distinguish transient network errors from permanent ones. For instance, a +web crawler might sleep and retry when it encounters a temporary error and give +up otherwise. +</p> + +<pre><!--{{code "progs/error.go" `/func netError/` `/STOP/`}} +--> if nerr, ok := err.(net.Error); ok && nerr.Temporary() { + time.Sleep(1e9) + continue + } + if err != nil { + log.Fatal(err) + }</pre> + +<p> +<b>Simplifying repetitive error handling</b> +</p> + +<p> +In Go, error handling is important. The language's design and conventions +encourage you to explicitly check for errors where they occur (as distinct from +the convention in other languages of throwing exceptions and sometimes catching +them). In some cases this makes Go code verbose, but fortunately there are some +techniques you can use to minimize repetitive error handling. +</p> + +<p> +Consider an <a href="http://code.google.com/appengine/docs/go/">App Engine</a> +application with an HTTP handler that retrieves a record from the datastore and +formats it with a template.
+</p> + +<pre><!--{{code "progs/error2.go" `/func init/` `/STOP/`}} +-->func init() { + http.HandleFunc("/view", viewRecord) +} + +func viewRecord(w http.ResponseWriter, r *http.Request) { + c := appengine.NewContext(r) + key := datastore.NewKey(c, "Record", r.FormValue("id"), 0, nil) + record := new(Record) + if err := datastore.Get(c, key, record); err != nil { + http.Error(w, err.Error(), 500) + return + } + if err := viewTemplate.Execute(w, record); err != nil { + http.Error(w, err.Error(), 500) + } +}</pre> + +<p> +This function handles errors returned by the <code>datastore.Get</code> +function and <code>viewTemplate</code>'s <code>Execute</code> method. In both +cases, it presents a simple error message to the user with the HTTP status code +500 ("Internal Server Error"). This looks like a manageable amount of code, but +add some more HTTP handlers and you quickly end up with many copies of +identical error handling code. +</p> + +<p> +To reduce the repetition we can define our own HTTP <code>appHandler</code> +type that includes an <code>error</code> return value: +</p> + +<pre><!--{{code "progs/error3.go" `/type appHandler/`}} +-->type appHandler func(http.ResponseWriter, *http.Request) error</pre> + +<p> +Then we can change our <code>viewRecord</code> function to return errors: +</p> + +<pre><!--{{code "progs/error3.go" `/func viewRecord/` `/STOP/`}} +-->func viewRecord(w http.ResponseWriter, r *http.Request) error { + c := appengine.NewContext(r) + key := datastore.NewKey(c, "Record", r.FormValue("id"), 0, nil) + record := new(Record) + if err := datastore.Get(c, key, record); err != nil { + return err + } + return viewTemplate.Execute(w, record) +}</pre> + +<p> +This is simpler than the original version, but the <a +href="/pkg/net/http/">http</a> package doesn't understand functions that return +<code>error</code>. 
+To fix this we can implement the <code>http.Handler</code> interface's +<code>ServeHTTP</code> method on <code>appHandler</code>: +</p> + +<pre><!--{{code "progs/error3.go" `/ServeHTTP/` `/STOP/`}} +-->func (fn appHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) { + if err := fn(w, r); err != nil { + http.Error(w, err.Error(), 500) + } +}</pre> + +<p> +The <code>ServeHTTP</code> method calls the <code>appHandler</code> function +and displays the returned error (if any) to the user. Notice that the method's +receiver, <code>fn</code>, is a function. (Go can do that!) The method invokes +the function by calling the receiver in the expression <code>fn(w, r)</code>. +</p> + +<p> +Now when registering <code>viewRecord</code> with the http package we use the +<code>Handle</code> function (instead of <code>HandleFunc</code>) as +<code>appHandler</code> is an <code>http.Handler</code> (not an +<code>http.HandlerFunc</code>). +</p> + +<pre><!--{{code "progs/error3.go" `/func init/` `/STOP/`}} +-->func init() { + http.Handle("/view", appHandler(viewRecord)) +}</pre> + +<p> +With this basic error handling infrastructure in place, we can make it more +user friendly. Rather than just displaying the error string, it would be better +to give the user a simple error message with an appropriate HTTP status code, +while logging the full error to the App Engine developer console for debugging +purposes. 
+</p> + +<p> +To do this we create an <code>appError</code> struct containing an +<code>error</code> and some other fields: +</p> + +<pre><!--{{code "progs/error4.go" `/type appError/` `/STOP/`}} +-->type appError struct { + Error error + Message string + Code int +}</pre> + +<p> +Next we modify the appHandler type to return <code>*appError</code> values: +</p> + +<pre><!--{{code "progs/error4.go" `/type appHandler/`}} +-->type appHandler func(http.ResponseWriter, *http.Request) *appError</pre> + +<p> +(It's usually a mistake to pass back the concrete type of an error rather than +<code>error</code>, for reasons to be discussed in another article, but +it's the right thing to do here because <code>ServeHTTP</code> is the only +place that sees the value and uses its contents.) +</p> + +<p> +And make <code>appHandler</code>'s <code>ServeHTTP</code> method display the +<code>appError</code>'s <code>Message</code> to the user with the correct HTTP +status <code>Code</code> and log the full <code>Error</code> to the developer +console: +</p> + +<pre><!--{{code "progs/error4.go" `/ServeHTTP/` `/STOP/`}} +-->func (fn appHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) { + if e := fn(w, r); e != nil { // e is *appError, not error.
+ c := appengine.NewContext(r) + c.Errorf("%v", e.Error) + http.Error(w, e.Message, e.Code) + } +}</pre> + +<p> +Finally, we update <code>viewRecord</code> to the new function signature and +have it return more context when it encounters an error: +</p> + +<pre><!--{{code "progs/error4.go" `/func viewRecord/` `/STOP/`}} +-->func viewRecord(w http.ResponseWriter, r *http.Request) *appError { + c := appengine.NewContext(r) + key := datastore.NewKey(c, "Record", r.FormValue("id"), 0, nil) + record := new(Record) + if err := datastore.Get(c, key, record); err != nil { + return &appError{err, "Record not found", 404} + } + if err := viewTemplate.Execute(w, record); err != nil { + return &appError{err, "Can't display record", 500} + } + return nil +}</pre> + +<p> +This version of <code>viewRecord</code> is the same length as the original, but +now each of those lines has specific meaning and we are providing a friendlier +user experience. +</p> + +<p> +It doesn't end there; we can further improve the error handling in our +application. Some ideas: +</p> + +<ul> +<li>give the error handler a pretty HTML template, +<li>make debugging easier by writing the stack trace to the HTTP response when +the user is an administrator, +<li>write a constructor function for <code>appError</code> that stores the +stack trace for easier debugging, +<li>recover from panics inside the <code>appHandler</code>, logging the error +to the console as "Critical," while simply telling the user "a serious error +has occurred." This is a nice touch to avoid exposing the user to inscrutable +error messages caused by programming errors. +See the <a href="defer_panic_recover.html">Defer, Panic, and Recover</a> +article for more details. +</ul> + +<p> +<b>Conclusion</b> +</p> + +<p> +Proper error handling is an essential requirement of good software. By +employing the techniques described in this post you should be able to write +more reliable and succinct Go code. 
+</p> diff --git a/doc/articles/error_handling.tmpl b/doc/articles/error_handling.tmpl new file mode 100644 index 000000000..141b4a54d --- /dev/null +++ b/doc/articles/error_handling.tmpl @@ -0,0 +1,314 @@ +<!--{ + "Title": "Error Handling and Go" +}--> +{{donotedit}} +<p> +If you have written any Go code you have probably encountered the built-in +<code>error</code> type. Go code uses <code>error</code> values to +indicate an abnormal state. For example, the <code>os.Open</code> function +returns a non-nil <code>error</code> value when it fails to open a file. +</p> + +{{code "progs/error.go" `/func Open/`}} + +<p> +The following code uses <code>os.Open</code> to open a file. If an error +occurs it calls <code>log.Fatal</code> to print the error message and stop. +</p> + +{{code "progs/error.go" `/func openFile/` `/STOP/`}} + +<p> +You can get a lot done in Go knowing just this about the <code>error</code> +type, but in this article we'll take a closer look at <code>error</code> and +discuss some good practices for error handling in Go. +</p> + +<p> +<b>The error type</b> +</p> + +<p> +The <code>error</code> type is an interface type. An <code>error</code> +variable represents any value that can describe itself as a string. Here is the +interface's declaration: +</p> + +<pre>type error interface { + Error() string +}</pre> + +<p> +The <code>error</code> type, as with all built in types, is +<a href="/doc/go_spec.html#Predeclared_identifiers">predeclared</a> in the +<a href="/doc/go_spec.html#Blocks">universe block</a>. +</p> + +<p> +The most commonly-used <code>error</code> implementation is the +<a href="/pkg/errors/">errors</a> package's unexported <code>errorString</code> type. +</p> + +{{code "progs/error.go" `/errorString/` `/STOP/`}} + +<p> +You can construct one of these values with the <code>errors.New</code> +function. It takes a string that it converts to an <code>errors.errorString</code> +and returns as an <code>error</code> value. 
+</p> + +{{code "progs/error.go" `/New/` `/STOP/`}} + +<p> +Here's how you might use <code>errors.New</code>: +</p> + +{{code "progs/error.go" `/func Sqrt/` `/STOP/`}} + +<p> +A caller passing a negative argument to <code>Sqrt</code> receives a non-nil +<code>error</code> value (whose concrete representation is an +<code>errors.errorString</code> value). The caller can access the error string +("math: square root of...") by calling the <code>error</code>'s +<code>Error</code> method, or by just printing it: +</p> + +{{code "progs/error.go" `/func printErr/` `/STOP/`}} + +<p> +The <a href="/pkg/fmt/">fmt</a> package formats an <code>error</code> value +by calling its <code>Error() string</code> method. +</p> + +<p> +It is the error implementation's responsibility to summarize the context. +The error returned by <code>os.Open</code> formats as "open /etc/passwd: +permission denied," not just "permission denied." The error returned by our +<code>Sqrt</code> is missing information about the invalid argument. +</p> + +<p> +To add that information, a useful function is the <code>fmt</code> package's +<code>Errorf</code>. It formats a string according to <code>Printf</code>'s +rules and returns it as an <code>error</code> created by +<code>errors.New</code>. +</p> + +{{code "progs/error.go" `/fmtError/` `/STOP/`}} + +<p> +In many cases <code>fmt.Errorf</code> is good enough, but since +<code>error</code> is an interface, you can use arbitrary data structures as +error values, to allow callers to inspect the details of the error. +</p> + +<p> +For instance, our hypothetical callers might want to recover the invalid +argument passed to <code>Sqrt</code>. 
We can enable that by defining a new +error implementation instead of using <code>errors.errorString</code>: +</p> + +{{code "progs/error.go" `/type NegativeSqrtError/` `/STOP/`}} + +<p> +A sophisticated caller can then use a +<a href="/doc/go_spec.html#Type_assertions">type assertion</a> to check for a +<code>NegativeSqrtError</code> and handle it specially, while callers that just +pass the error to <code>fmt.Println</code> or <code>log.Fatal</code> will see +no change in behavior. +</p> + +<p> +As another example, the <a href="/pkg/encoding/json/">json</a> package specifies a +<code>SyntaxError</code> type that the <code>Decode</code> method of <code>json.Decoder</code> +returns when it encounters a syntax error parsing a JSON blob. +</p> + +{{code "progs/error.go" `/type SyntaxError/` `/STOP/`}} + +<p> +The <code>Offset</code> field isn't even shown in the default formatting of the +error, but callers can use it to add file and line information to their error +messages: +</p> + +{{code "progs/error.go" `/func decodeError/` `/STOP/`}} + +<p> +(This is a slightly simplified version of some +<a href="http://camlistore.org/code/?p=camlistore.git;a=blob;f=lib/go/camli/jsonconfig/eval.go#l68">actual code</a> +from the <a href="http://camlistore.org">Camlistore</a> project.) +</p> + +<p> +The <code>error</code> interface requires only an <code>Error</code> method; +specific error implementations might have additional methods. For instance, the +<a href="/pkg/net/">net</a> package returns errors of type +<code>error</code>, following the usual convention, but some of the error +implementations have additional methods defined by the <code>net.Error</code> +interface: +</p> + +<pre>package net + +type Error interface { + error + Timeout() bool // Is the error a timeout? + Temporary() bool // Is the error temporary? +}</pre> + +<p> +Client code can test for a <code>net.Error</code> with a type assertion and +then distinguish transient network errors from permanent ones. 
For instance, a +web crawler might sleep and retry when it encounters a temporary error and give +up otherwise. +</p> + +{{code "progs/error.go" `/func netError/` `/STOP/`}} + +<p> +<b>Simplifying repetitive error handling</b> +</p> + +<p> +In Go, error handling is important. The language's design and conventions +encourage you to explicitly check for errors where they occur (as distinct from +the convention in other languages of throwing exceptions and sometimes catching +them). In some cases this makes Go code verbose, but fortunately there are some +techniques you can use to minimize repetitive error handling. +</p> + +<p> +Consider an <a href="http://code.google.com/appengine/docs/go/">App Engine</a> +application with an HTTP handler that retrieves a record from the datastore and +formats it with a template. +</p> + +{{code "progs/error2.go" `/func init/` `/STOP/`}} + +<p> +This function handles errors returned by the <code>datastore.Get</code> +function and <code>viewTemplate</code>'s <code>Execute</code> method. In both +cases, it presents a simple error message to the user with the HTTP status code +500 ("Internal Server Error"). This looks like a manageable amount of code, but +add some more HTTP handlers and you quickly end up with many copies of +identical error handling code. +</p> + +<p> +To reduce the repetition we can define our own HTTP <code>appHandler</code> +type that includes an <code>error</code> return value: +</p> + +{{code "progs/error3.go" `/type appHandler/`}} + +<p> +Then we can change our <code>viewRecord</code> function to return errors: +</p> + +{{code "progs/error3.go" `/func viewRecord/` `/STOP/`}} + +<p> +This is simpler than the original version, but the <a +href="/pkg/net/http/">http</a> package doesn't understand functions that return +<code>error</code>. 
+To fix this we can implement the <code>http.Handler</code> interface's +<code>ServeHTTP</code> method on <code>appHandler</code>: +</p> + +{{code "progs/error3.go" `/ServeHTTP/` `/STOP/`}} + +<p> +The <code>ServeHTTP</code> method calls the <code>appHandler</code> function +and displays the returned error (if any) to the user. Notice that the method's +receiver, <code>fn</code>, is a function. (Go can do that!) The method invokes +the function by calling the receiver in the expression <code>fn(w, r)</code>. +</p> + +<p> +Now when registering <code>viewRecord</code> with the http package we use the +<code>Handle</code> function (instead of <code>HandleFunc</code>) as +<code>appHandler</code> is an <code>http.Handler</code> (not an +<code>http.HandlerFunc</code>). +</p> + +{{code "progs/error3.go" `/func init/` `/STOP/`}} + +<p> +With this basic error handling infrastructure in place, we can make it more +user friendly. Rather than just displaying the error string, it would be better +to give the user a simple error message with an appropriate HTTP status code, +while logging the full error to the App Engine developer console for debugging +purposes. +</p> + +<p> +To do this we create an <code>appError</code> struct containing an +<code>error</code> and some other fields: +</p> + +{{code "progs/error4.go" `/type appError/` `/STOP/`}} + +<p> +Next we modify the appHandler type to return <code>*appError</code> values: +</p> + +{{code "progs/error4.go" `/type appHandler/`}} + +<p> +(It's usually a mistake to pass back the concrete type of an error rather than +<code>error</code>, for reasons to be discussed in another article, but +it's the right thing to do here because <code>ServeHTTP</code> is the only +place that sees the value and uses its contents.) 
+</p> + +<p> +And make <code>appHandler</code>'s <code>ServeHTTP</code> method display the +<code>appError</code>'s <code>Message</code> to the user with the correct HTTP +status <code>Code</code> and log the full <code>Error</code> to the developer +console: +</p> + +{{code "progs/error4.go" `/ServeHTTP/` `/STOP/`}} + +<p> +Finally, we update <code>viewRecord</code> to the new function signature and +have it return more context when it encounters an error: +</p> + +{{code "progs/error4.go" `/func viewRecord/` `/STOP/`}} + +<p> +This version of <code>viewRecord</code> is the same length as the original, but +now each of those lines has specific meaning and we are providing a friendlier +user experience. +</p> + +<p> +It doesn't end there; we can further improve the error handling in our +application. Some ideas: +</p> + +<ul> +<li>give the error handler a pretty HTML template, +<li>make debugging easier by writing the stack trace to the HTTP response when +the user is an administrator, +<li>write a constructor function for <code>appError</code> that stores the +stack trace for easier debugging, +<li>recover from panics inside the <code>appHandler</code>, logging the error +to the console as "Critical," while simply telling the user "a serious error +has occurred." This is a nice touch to avoid exposing the user to inscrutable +error messages caused by programming errors. +See the <a href="defer_panic_recover.html">Defer, Panic, and Recover</a> +article for more details. +</ul> + +<p> +<b>Conclusion</b> +</p> + +<p> +Proper error handling is an essential requirement of good software. By +employing the techniques described in this post you should be able to write +more reliable and succinct Go code. 
+</p> diff --git a/doc/articles/slice-1.png b/doc/articles/slice-1.png Binary files differ new file mode 100644 index 000000000..ba465cf71 --- /dev/null +++ b/doc/articles/slice-1.png diff --git a/doc/articles/slice-2.png b/doc/articles/slice-2.png Binary files differ new file mode 100644 index 000000000..a57581e8c --- /dev/null +++ b/doc/articles/slice-2.png diff --git a/doc/articles/slice-3.png b/doc/articles/slice-3.png Binary files differ new file mode 100644 index 000000000..64ece5e87 --- /dev/null +++ b/doc/articles/slice-3.png diff --git a/doc/articles/slice-array.png b/doc/articles/slice-array.png Binary files differ new file mode 100644 index 000000000..a533702cf --- /dev/null +++ b/doc/articles/slice-array.png diff --git a/doc/articles/slice-struct.png b/doc/articles/slice-struct.png Binary files differ new file mode 100644 index 000000000..f9141fc59 --- /dev/null +++ b/doc/articles/slice-struct.png diff --git a/doc/articles/slices_usage_and_internals.html b/doc/articles/slices_usage_and_internals.html new file mode 100644 index 000000000..c10dfe0ca --- /dev/null +++ b/doc/articles/slices_usage_and_internals.html @@ -0,0 +1,479 @@ +<!--{ + "Title": "Slices: usage and internals" +}--> +<!-- + DO NOT EDIT: created by + tmpltohtml articles/slices_usage_and_internals.tmpl +--> + + +<p> +Go's slice type provides a convenient and efficient means of working with +sequences of typed data. Slices are analogous to arrays in other languages, but +have some unusual properties. This article will look at what slices are and how +they are used. +</p> + +<p> +<b>Arrays</b> +</p> + +<p> +The slice type is an abstraction built on top of Go's array type, and so to +understand slices we must first understand arrays. +</p> + +<p> +An array type definition specifies a length and an element type. For example, +the type <code>[4]int</code> represents an array of four integers. 
An array's +size is fixed; its length is part of its type (<code>[4]int</code> and +<code>[5]int</code> are distinct, incompatible types). Arrays can be indexed in +the usual way, so the expression <code>s[n]</code> accesses the <i>n</i>th +element: +</p> + +<pre> +var a [4]int +a[0] = 1 +i := a[0] +// i == 1 +</pre> + +<p> +Arrays do not need to be initialized explicitly; the zero value of an array is +a ready-to-use array whose elements are themselves zeroed: +</p> + +<pre> +// a[2] == 0, the zero value of the int type +</pre> + +<p> +The in-memory representation of <code>[4]int</code> is just four integer values laid out sequentially: +</p> + +<p> +<img src="slice-array.png"> +</p> + +<p> +Go's arrays are values. An array variable denotes the entire array; it is not a +pointer to the first array element (as would be the case in C). This means +that when you assign or pass around an array value you will make a copy of its +contents. (To avoid the copy you could pass a <i>pointer</i> to the array, but +then that's a pointer to an array, not an array.) One way to think about arrays +is as a sort of struct but with indexed rather than named fields: a fixed-size +composite value. +</p> + +<p> +An array literal can be specified like so: +</p> + +<pre> +b := [2]string{"Penn", "Teller"} +</pre> + +<p> +Or, you can have the compiler count the array elements for you: +</p> + +<pre> +b := [...]string{"Penn", "Teller"} +</pre> + +<p> +In both cases, the type of <code>b</code> is <code>[2]string</code>. +</p> + +<p> +<b>Slices</b> +</p> + +<p> +Arrays have their place, but they're a bit inflexible, so you don't see them +too often in Go code. Slices, though, are everywhere. They build on arrays to +provide great power and convenience. +</p> + +<p> +The type specification for a slice is <code>[]T</code>, where <code>T</code> is +the type of the elements of the slice. Unlike an array type, a slice type has +no specified length. 
+</p> + +<p> +A slice literal is declared just like an array literal, except you leave out +the element count: +</p> + +<pre> +letters := []string{"a", "b", "c", "d"} +</pre> + +<p> +A slice can be created with the built-in function called <code>make</code>, +which has the signature, +</p> + +<pre> +func make([]T, len, cap) []T +</pre> + +<p> +where T stands for the element type of the slice to be created. The +<code>make</code> function takes a type, a length, and an optional capacity. +When called, <code>make</code> allocates an array and returns a slice that +refers to that array. +</p> + +<pre> +var s []byte +s = make([]byte, 5, 5) +// s == []byte{0, 0, 0, 0, 0} +</pre> + +<p> +When the capacity argument is omitted, it defaults to the specified length. +Here's a more succinct version of the same code: +</p> + +<pre> +s := make([]byte, 5) +</pre> + +<p> +The length and capacity of a slice can be inspected using the built-in +<code>len</code> and <code>cap</code> functions. +</p> + +<pre> +len(s) == 5 +cap(s) == 5 +</pre> + +<p> +The next two sections discuss the relationship between length and capacity. +</p> + +<p> +The zero value of a slice is <code>nil</code>. The <code>len</code> and +<code>cap</code> functions will both return 0 for a nil slice. +</p> + +<p> +A slice can also be formed by "slicing" an existing slice or array. Slicing is +done by specifying a half-open range with two indices separated by a colon. For +example, the expression <code>b[1:4]</code> creates a slice including elements +1 through 3 of <code>b</code> (the indices of the resulting slice will be 0 +through 2). 
+</p> + +<pre> +b := []byte{'g', 'o', 'l', 'a', 'n', 'g'} +// b[1:4] == []byte{'o', 'l', 'a'}, sharing the same storage as b +</pre> + +<p> +The start and end indices of a slice expression are optional; they default to zero and the slice's length respectively: +</p> + +<pre> +// b[:2] == []byte{'g', 'o'} +// b[2:] == []byte{'l', 'a', 'n', 'g'} +// b[:] == b +</pre> + +<p> +This is also the syntax to create a slice given an array: +</p> + +<pre> +x := [3]string{"Лайка", "Белка", "Стрелка"} +s := x[:] // a slice referencing the storage of x +</pre> + +<p> +<b>Slice internals</b> +</p> + +<p> +A slice is a descriptor of an array segment. It consists of a pointer to the +array, the length of the segment, and its capacity (the maximum length of the +segment). +</p> + +<p> +<img src="slice-struct.png"> +</p> + +<p> +Our variable <code>s</code>, created earlier by <code>make([]byte, 5)</code>, +is structured like this: +</p> + +<p> +<img src="slice-1.png"> +</p> + +<p> +The length is the number of elements referred to by the slice. The capacity is +the number of elements in the underlying array (beginning at the element +referred to by the slice pointer). The distinction between length and capacity +will be made clear as we walk through the next few examples. +</p> + +<p> +As we slice <code>s</code>, observe the changes in the slice data structure and +their relation to the underlying array: +</p> + +<pre> +s = s[2:4] +</pre> + +<p> +<img src="slice-2.png"> +</p> + +<p> +Slicing does not copy the slice's data. It creates a new slice value that +points to the original array. This makes slice operations as efficient as +manipulating array indices. 
Therefore, modifying the <i>elements</i> (not the +slice itself) of a re-slice modifies the elements of the original slice: +</p> + +<pre> +d := []byte{'r', 'o', 'a', 'd'} +e := d[2:] +// e == []byte{'a', 'd'} +e[1] = 'm' +// e == []byte{'a', 'm'} +// d == []byte{'r', 'o', 'a', 'm'} +</pre> + +<p> +Earlier we sliced <code>s</code> to a length shorter than its capacity. We can +grow <code>s</code> to its capacity by slicing it again: +</p> + +<pre> +s = s[:cap(s)] +</pre> + +<p> +<img src="slice-3.png"> +</p> + +<p> +A slice cannot be grown beyond its capacity. Attempting to do so will cause a +runtime panic, just as when indexing outside the bounds of a slice or array. +Similarly, slices cannot be re-sliced below zero to access earlier elements in +the array. +</p> + +<p> +<b>Growing slices (the copy and append functions)</b> +</p> + +<p> +To increase the capacity of a slice one must create a new, larger slice and +copy the contents of the original slice into it. This technique is how dynamic +array implementations from other languages work behind the scenes. The next +example doubles the capacity of <code>s</code> by making a new slice, +<code>t</code>, copying the contents of <code>s</code> into <code>t</code>, and +then assigning the slice value <code>t</code> to <code>s</code>: +</p> + +<pre> +t := make([]byte, len(s), (cap(s)+1)*2) // +1 in case cap(s) == 0 +for i := range s { + t[i] = s[i] +} +s = t +</pre> + +<p> +The looping piece of this common operation is made easier by the built-in copy +function. As the name suggests, copy copies data from a source slice to a +destination slice. It returns the number of elements copied. +</p> + +<pre> +func copy(dst, src []T) int +</pre> + +<p> +The <code>copy</code> function supports copying between slices of different +lengths (it will copy only up to the smaller number of elements). 
In addition, +<code>copy</code> can handle source and destination slices that share the same +underlying array, handling overlapping slices correctly. +</p> + +<p> +Using <code>copy</code>, we can simplify the code snippet above: +</p> + +<pre> +t := make([]byte, len(s), (cap(s)+1)*2) +copy(t, s) +s = t +</pre> + +<p> +A common operation is to append data to the end of a slice. This function +appends byte elements to a slice of bytes, growing the slice if necessary, and +returns the updated slice value: +</p> + +<pre><!--{{code "progs/slices.go" `/AppendByte/` `/STOP/`}} +-->func AppendByte(slice []byte, data ...byte) []byte { + m := len(slice) + n := m + len(data) + if n > cap(slice) { // if necessary, reallocate + // allocate double what's needed, for future growth. + newSlice := make([]byte, (n+1)*2) + copy(newSlice, slice) + slice = newSlice + } + slice = slice[0:n] + copy(slice[m:n], data) + return slice +}</pre> + +<p> +One could use <code>AppendByte</code> like this: +</p> + +<pre> +p := []byte{2, 3, 5} +p = AppendByte(p, 7, 11, 13) +// p == []byte{2, 3, 5, 7, 11, 13} +</pre> + +<p> +Functions like <code>AppendByte</code> are useful because they offer complete +control over the way the slice is grown. Depending on the characteristics of +the program, it may be desirable to allocate in smaller or larger chunks, or to +put a ceiling on the size of a reallocation. +</p> + +<p> +But most programs don't need complete control, so Go provides a built-in +<code>append</code> function that's good for most purposes; it has the +signature +</p> + +<pre> +func append(s []T, x ...T) []T +</pre> + +<p> +The <code>append</code> function appends the elements <code>x</code> to the end +of the slice <code>s</code>, and grows the slice if a greater capacity is +needed. 
+</p> + +<pre> +a := make([]int, 1) +// a == []int{0} +a = append(a, 1, 2, 3) +// a == []int{0, 1, 2, 3} +</pre> + +<p> +To append one slice to another, use <code>...</code> to expand the second +argument to a list of arguments. +</p> + +<pre> +a := []string{"John", "Paul"} +b := []string{"George", "Ringo", "Pete"} +a = append(a, b...) // equivalent to "append(a, b[0], b[1], b[2])" +// a == []string{"John", "Paul", "George", "Ringo", "Pete"} +</pre> + +<p> +Since the zero value of a slice (<code>nil</code>) acts like a zero-length +slice, you can declare a slice variable and then append to it in a loop: +</p> + +<pre><!--{{code "progs/slices.go" `/Filter/` `/STOP/`}} +-->// Filter returns a new slice holding only +// the elements of s that satisfy f() +func Filter(s []int, fn func(int) bool) []int { + var p []int // == nil + for _, i := range s { + if fn(i) { + p = append(p, i) + } + } + return p +}</pre> + +<p> +<b>A possible "gotcha"</b> +</p> + +<p> +As mentioned earlier, re-slicing a slice doesn't make a copy of the underlying +array. The full array will be kept in memory until it is no longer referenced. +Occasionally this can cause the program to hold all the data in memory when +only a small piece of it is needed. +</p> + +<p> +For example, this <code>FindDigits</code> function loads a file into memory and +searches it for the first group of consecutive numeric digits, returning them +as a new slice. +</p> + +<pre><!--{{code "progs/slices.go" `/digit/` `/STOP/`}} +-->var digitRegexp = regexp.MustCompile("[0-9]+") + +func FindDigits(filename string) []byte { + b, _ := ioutil.ReadFile(filename) + return digitRegexp.Find(b) +}</pre> + +<p> +This code behaves as advertised, but the returned <code>[]byte</code> points +into an array containing the entire file. Since the slice references the +original array, as long as the slice is kept around the garbage collector can't +release the array; the few useful bytes of the file keep the entire contents in +memory. 
+</p> + +<p> +To fix this problem one can copy the interesting data to a new slice before +returning it: +</p> + +<pre><!--{{code "progs/slices.go" `/CopyDigits/` `/STOP/`}} +-->func CopyDigits(filename string) []byte { + b, _ := ioutil.ReadFile(filename) + b = digitRegexp.Find(b) + c := make([]byte, len(b)) + copy(c, b) + return c +}</pre> + +<p> +A more concise version of this function could be constructed by using +<code>append</code>. This is left as an exercise for the reader. +</p> + +<p> +<b>Further Reading</b> +</p> + +<p> +<a href="/doc/effective_go.html">Effective Go</a> contains an +in-depth treatment of <a href="/doc/effective_go.html#slices">slices</a> +and <a href="/doc/effective_go.html#arrays">arrays</a>, +and the Go <a href="/doc/go_spec.html">language specification</a> +defines <a href="/doc/go_spec.html#Slice_types">slices</a> and their +<a href="/doc/go_spec.html#Length_and_capacity">associated</a> +<a href="/doc/go_spec.html#Making_slices_maps_and_channels">helper</a> +<a href="/doc/go_spec.html#Appending_and_copying_slices">functions</a>. +</p> diff --git a/doc/articles/slices_usage_and_internals.tmpl b/doc/articles/slices_usage_and_internals.tmpl new file mode 100644 index 000000000..d2f8fb7f5 --- /dev/null +++ b/doc/articles/slices_usage_and_internals.tmpl @@ -0,0 +1,438 @@ +<!--{ + "Title": "Slices: usage and internals" +}--> +{{donotedit}} + +<p> +Go's slice type provides a convenient and efficient means of working with +sequences of typed data. Slices are analogous to arrays in other languages, but +have some unusual properties. This article will look at what slices are and how +they are used. +</p> + +<p> +<b>Arrays</b> +</p> + +<p> +The slice type is an abstraction built on top of Go's array type, and so to +understand slices we must first understand arrays. +</p> + +<p> +An array type definition specifies a length and an element type. For example, +the type <code>[4]int</code> represents an array of four integers. 
An array's +size is fixed; its length is part of its type (<code>[4]int</code> and +<code>[5]int</code> are distinct, incompatible types). Arrays can be indexed in +the usual way, so the expression <code>s[n]</code> accesses the <i>n</i>th +element: +</p> + +<pre> +var a [4]int +a[0] = 1 +i := a[0] +// i == 1 +</pre> + +<p> +Arrays do not need to be initialized explicitly; the zero value of an array is +a ready-to-use array whose elements are themselves zeroed: +</p> + +<pre> +// a[2] == 0, the zero value of the int type +</pre> + +<p> +The in-memory representation of <code>[4]int</code> is just four integer values laid out sequentially: +</p> + +<p> +<img src="slice-array.png"> +</p> + +<p> +Go's arrays are values. An array variable denotes the entire array; it is not a +pointer to the first array element (as would be the case in C). This means +that when you assign or pass around an array value you will make a copy of its +contents. (To avoid the copy you could pass a <i>pointer</i> to the array, but +then that's a pointer to an array, not an array.) One way to think about arrays +is as a sort of struct but with indexed rather than named fields: a fixed-size +composite value. +</p> + +<p> +An array literal can be specified like so: +</p> + +<pre> +b := [2]string{"Penn", "Teller"} +</pre> + +<p> +Or, you can have the compiler count the array elements for you: +</p> + +<pre> +b := [...]string{"Penn", "Teller"} +</pre> + +<p> +In both cases, the type of <code>b</code> is <code>[2]string</code>. +</p> + +<p> +<b>Slices</b> +</p> + +<p> +Arrays have their place, but they're a bit inflexible, so you don't see them +too often in Go code. Slices, though, are everywhere. They build on arrays to +provide great power and convenience. +</p> + +<p> +The type specification for a slice is <code>[]T</code>, where <code>T</code> is +the type of the elements of the slice. Unlike an array type, a slice type has +no specified length. 
+</p> + +<p> +A slice literal is declared just like an array literal, except you leave out +the element count: +</p> + +<pre> +letters := []string{"a", "b", "c", "d"} +</pre> + +<p> +A slice can be created with the built-in function called <code>make</code>, +which has the signature, +</p> + +<pre> +func make([]T, len, cap) []T +</pre> + +<p> +where T stands for the element type of the slice to be created. The +<code>make</code> function takes a type, a length, and an optional capacity. +When called, <code>make</code> allocates an array and returns a slice that +refers to that array. +</p> + +<pre> +var s []byte +s = make([]byte, 5, 5) +// s == []byte{0, 0, 0, 0, 0} +</pre> + +<p> +When the capacity argument is omitted, it defaults to the specified length. +Here's a more succinct version of the same code: +</p> + +<pre> +s := make([]byte, 5) +</pre> + +<p> +The length and capacity of a slice can be inspected using the built-in +<code>len</code> and <code>cap</code> functions. +</p> + +<pre> +len(s) == 5 +cap(s) == 5 +</pre> + +<p> +The next two sections discuss the relationship between length and capacity. +</p> + +<p> +The zero value of a slice is <code>nil</code>. The <code>len</code> and +<code>cap</code> functions will both return 0 for a nil slice. +</p> + +<p> +A slice can also be formed by "slicing" an existing slice or array. Slicing is +done by specifying a half-open range with two indices separated by a colon. For +example, the expression <code>b[1:4]</code> creates a slice including elements +1 through 3 of <code>b</code> (the indices of the resulting slice will be 0 +through 2). 
+</p> + +<pre> +b := []byte{'g', 'o', 'l', 'a', 'n', 'g'} +// b[1:4] == []byte{'o', 'l', 'a'}, sharing the same storage as b +</pre> + +<p> +The start and end indices of a slice expression are optional; they default to zero and the slice's length respectively: +</p> + +<pre> +// b[:2] == []byte{'g', 'o'} +// b[2:] == []byte{'l', 'a', 'n', 'g'} +// b[:] == b +</pre> + +<p> +This is also the syntax to create a slice given an array: +</p> + +<pre> +x := [3]string{"Лайка", "Белка", "Стрелка"} +s := x[:] // a slice referencing the storage of x +</pre> + +<p> +<b>Slice internals</b> +</p> + +<p> +A slice is a descriptor of an array segment. It consists of a pointer to the +array, the length of the segment, and its capacity (the maximum length of the +segment). +</p> + +<p> +<img src="slice-struct.png"> +</p> + +<p> +Our variable <code>s</code>, created earlier by <code>make([]byte, 5)</code>, +is structured like this: +</p> + +<p> +<img src="slice-1.png"> +</p> + +<p> +The length is the number of elements referred to by the slice. The capacity is +the number of elements in the underlying array (beginning at the element +referred to by the slice pointer). The distinction between length and capacity +will be made clear as we walk through the next few examples. +</p> + +<p> +As we slice <code>s</code>, observe the changes in the slice data structure and +their relation to the underlying array: +</p> + +<pre> +s = s[2:4] +</pre> + +<p> +<img src="slice-2.png"> +</p> + +<p> +Slicing does not copy the slice's data. It creates a new slice value that +points to the original array. This makes slice operations as efficient as +manipulating array indices. 
Therefore, modifying the <i>elements</i> (not the +slice itself) of a re-slice modifies the elements of the original slice: +</p> + +<pre> +d := []byte{'r', 'o', 'a', 'd'} +e := d[2:] +// e == []byte{'a', 'd'} +e[1] = 'm' +// e == []byte{'a', 'm'} +// d == []byte{'r', 'o', 'a', 'm'} +</pre> + +<p> +Earlier we sliced <code>s</code> to a length shorter than its capacity. We can +grow <code>s</code> to its capacity by slicing it again: +</p> + +<pre> +s = s[:cap(s)] +</pre> + +<p> +<img src="slice-3.png"> +</p> + +<p> +A slice cannot be grown beyond its capacity. Attempting to do so will cause a +runtime panic, just as when indexing outside the bounds of a slice or array. +Similarly, slices cannot be re-sliced below zero to access earlier elements in +the array. +</p> + +<p> +<b>Growing slices (the copy and append functions)</b> +</p> + +<p> +To increase the capacity of a slice one must create a new, larger slice and +copy the contents of the original slice into it. This technique is how dynamic +array implementations from other languages work behind the scenes. The next +example doubles the capacity of <code>s</code> by making a new slice, +<code>t</code>, copying the contents of <code>s</code> into <code>t</code>, and +then assigning the slice value <code>t</code> to <code>s</code>: +</p> + +<pre> +t := make([]byte, len(s), (cap(s)+1)*2) // +1 in case cap(s) == 0 +for i := range s { + t[i] = s[i] +} +s = t +</pre> + +<p> +The looping piece of this common operation is made easier by the built-in copy +function. As the name suggests, copy copies data from a source slice to a +destination slice. It returns the number of elements copied. +</p> + +<pre> +func copy(dst, src []T) int +</pre> + +<p> +The <code>copy</code> function supports copying between slices of different +lengths (it will copy only up to the smaller number of elements). 
In addition, +<code>copy</code> can handle source and destination slices that share the same +underlying array, handling overlapping slices correctly. +</p> + +<p> +Using <code>copy</code>, we can simplify the code snippet above: +</p> + +<pre> +t := make([]byte, len(s), (cap(s)+1)*2) +copy(t, s) +s = t +</pre> + +<p> +A common operation is to append data to the end of a slice. This function +appends byte elements to a slice of bytes, growing the slice if necessary, and +returns the updated slice value: +</p> + +{{code "progs/slices.go" `/AppendByte/` `/STOP/`}} + +<p> +One could use <code>AppendByte</code> like this: +</p> + +<pre> +p := []byte{2, 3, 5} +p = AppendByte(p, 7, 11, 13) +// p == []byte{2, 3, 5, 7, 11, 13} +</pre> + +<p> +Functions like <code>AppendByte</code> are useful because they offer complete +control over the way the slice is grown. Depending on the characteristics of +the program, it may be desirable to allocate in smaller or larger chunks, or to +put a ceiling on the size of a reallocation. +</p> + +<p> +But most programs don't need complete control, so Go provides a built-in +<code>append</code> function that's good for most purposes; it has the +signature +</p> + +<pre> +func append(s []T, x ...T) []T +</pre> + +<p> +The <code>append</code> function appends the elements <code>x</code> to the end +of the slice <code>s</code>, and grows the slice if a greater capacity is +needed. +</p> + +<pre> +a := make([]int, 1) +// a == []int{0} +a = append(a, 1, 2, 3) +// a == []int{0, 1, 2, 3} +</pre> + +<p> +To append one slice to another, use <code>...</code> to expand the second +argument to a list of arguments. +</p> + +<pre> +a := []string{"John", "Paul"} +b := []string{"George", "Ringo", "Pete"} +a = append(a, b...) 
// equivalent to "append(a, b[0], b[1], b[2])" +// a == []string{"John", "Paul", "George", "Ringo", "Pete"} +</pre> + +<p> +Since the zero value of a slice (<code>nil</code>) acts like a zero-length +slice, you can declare a slice variable and then append to it in a loop: +</p> + +{{code "progs/slices.go" `/Filter/` `/STOP/`}} + +<p> +<b>A possible "gotcha"</b> +</p> + +<p> +As mentioned earlier, re-slicing a slice doesn't make a copy of the underlying +array. The full array will be kept in memory until it is no longer referenced. +Occasionally this can cause the program to hold all the data in memory when +only a small piece of it is needed. +</p> + +<p> +For example, this <code>FindDigits</code> function loads a file into memory and +searches it for the first group of consecutive numeric digits, returning them +as a new slice. +</p> + +{{code "progs/slices.go" `/digit/` `/STOP/`}} + +<p> +This code behaves as advertised, but the returned <code>[]byte</code> points +into an array containing the entire file. Since the slice references the +original array, as long as the slice is kept around the garbage collector can't +release the array; the few useful bytes of the file keep the entire contents in +memory. +</p> + +<p> +To fix this problem one can copy the interesting data to a new slice before +returning it: +</p> + +{{code "progs/slices.go" `/CopyDigits/` `/STOP/`}} + +<p> +A more concise version of this function could be constructed by using +<code>append</code>. This is left as an exercise for the reader. 
+</p> + +<p> +<b>Further Reading</b> +</p> + +<p> +<a href="/doc/effective_go.html">Effective Go</a> contains an +in-depth treatment of <a href="/doc/effective_go.html#slices">slices</a> +and <a href="/doc/effective_go.html#arrays">arrays</a>, +and the Go <a href="/doc/go_spec.html">language specification</a> +defines <a href="/doc/go_spec.html#Slice_types">slices</a> and their +<a href="/doc/go_spec.html#Length_and_capacity">associated</a> +<a href="/doc/go_spec.html#Making_slices_maps_and_channels">helper</a> +<a href="/doc/go_spec.html#Appending_and_copying_slices">functions</a>. +</p>
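
The slices article in this change leaves the append-based variant of CopyDigits as an exercise. One possible shape for it is sketched below; it is an illustration only and not code from the commit. The helper name copyDigits is hypothetical, and it takes the already-loaded bytes instead of a filename so that it runs without a file on disk. digitRegexp mirrors the variable of the same name in the article.

```go
package main

import (
	"fmt"
	"regexp"
)

// digitRegexp mirrors the variable of the same name in the article.
var digitRegexp = regexp.MustCompile("[0-9]+")

// copyDigits is a hypothetical helper (not part of the commit). It returns a
// freshly allocated copy of the first run of digits in b, so the result does
// not keep b's backing array alive.
func copyDigits(b []byte) []byte {
	// Appending to a nil slice allocates new storage for the matched bytes.
	return append([]byte(nil), digitRegexp.Find(b)...)
}

func main() {
	data := []byte("order 12345 shipped")
	digits := copyDigits(data)
	data[6] = 'X' // overwrite the first digit in the original buffer
	// The copy is unaffected by the write to data.
	fmt.Printf("%s %s\n", digits, data) // prints: 12345 order X2345 shipped
}
```

Because the result is built by appending to a nil slice, it has its own backing array, which is the property the article's "gotcha" section is after. When there is no match, Find returns nil and the helper returns nil as well.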