Introduction
I am often asked when to use pointers in Go and when not to. The problem is that most people try to make this decision based on what they think the performance tradeoff will be. Don’t make coding decisions based on unfounded ideas you may have about performance. Make coding decisions based on the code being idiomatic, simple, readable and reasonable.
My use of pointers is based on discoveries I have made looking at code from the standard library. There are always exceptions to these rules, but what I will show you is common practice. It starts with classifying the type of value that needs to be shared. These type classifications are built-in, struct and reference types. Let’s look at each one individually.
Built-In Types
Go’s built-in types represent primitive data values that are the building blocks for managing and working with data. I group these types collectively as the set of boolean, numeric and string types. When declaring functions and methods that accept values of these types, the standard library rarely shares them with a pointer.
Let’s start by looking at the isShellSpecialVar function from the os package (env.go):
Listing 1
http://golang.org/src/os/env.go

    func isShellSpecialVar(c uint8) bool {
        switch c {
        case '*', '#', '$', '@', '!', '?', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9':
            return true
        }
        return false
    }
The isShellSpecialVar function in listing 1 is declared to accept a value of type uint8 and return a value of type bool. To use this function, the caller must pass a copy of its uint8 value into the function. The same is true for the return value: a copy of the function’s bool value is returned to the caller.
Next, let’s look at the getShellName function from the same env.go file:
Listing 2
http://golang.org/src/os/env.go

    func getShellName(s string) (string, int) {
        switch {
        case s[0] == '{':
        . . .
            return "", 1 // Bad syntax; just eat the brace.
        case isShellSpecialVar(s[0]):
            return s[0:1], 1
        }
        . . .
        return s[:i], i
    }
The getShellName function in listing 2 is declared to accept a value of type string and return two values, one of type string and the other of type int. A string is a special built-in type in Go that represents an immutable slice of bytes. Since a string can’t grow, its header carries no capacity value, only a pointer and a length. It is best to treat values of type string the same way you treat boolean and numeric type values, as a primitive data value.
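To make the point about the missing capacity value more concrete, here is a small sketch that compares the size of a string value with the size of a slice value. The names sampleString, sampleSlice, stringWords and sliceWords are made up for this example, and the sizes it reports are an implementation detail of the standard Go toolchain, not something the language specification promises.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Hypothetical sample values, declared only so their sizes can be measured.
var (
	sampleString string
	sampleSlice  []byte
)

// wordSize is the size of one machine word on the current platform.
const wordSize = unsafe.Sizeof(uintptr(0))

// stringWords reports how many machine words a string value occupies.
func stringWords() int {
	return int(unsafe.Sizeof(sampleString) / wordSize)
}

// sliceWords reports how many machine words a slice value occupies.
func sliceWords() int {
	return int(unsafe.Sizeof(sampleSlice) / wordSize)
}

func main() {
	// With the standard toolchain, a string value holds a pointer and a
	// length, while a slice value holds a pointer, a length and a capacity.
	fmt.Println("string words:", stringWords())
	fmt.Println("slice words: ", sliceWords())
}
```

Either way, the value being copied is small and fixed in size, which is part of why copying a string is reasonable no matter how long its contents are.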
When a call is made to getShellName, the caller passes a copy of its string value into the function. The function generates a new string value and returns a copy of that value back to the caller. All the values being passed in and out of this function are copies of the original values.
This practice of sharing copies of string values is very prevalent in the strings package:
Listing 3
All of the functions in the strings package accept copies of the caller’s string values and return copies of the string values they create. Listing 3 shows the implementation of the Trim function. The function accepts copies of two string values, and returns either a copy of the first string value that is passed in or a copy of a new string value with the cutset trimmed off.
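Here is a minimal sketch of what this looks like from the caller’s side. It uses the real strings.Trim function; the wrapper trimBangs and the sample input are hypothetical and exist only for this illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// trimBangs is a hypothetical wrapper used only for this illustration.
// It receives a copy of the caller's string value and returns a string
// value with leading and trailing '!' characters removed.
func trimBangs(s string) string {
	return strings.Trim(s, "!")
}

func main() {
	original := "!!hello!!"
	trimmed := trimBangs(original)

	// The caller's value is untouched; the result is a separate value.
	fmt.Println(original)
	fmt.Println(trimmed)
}
```

No pointer is needed anywhere: the caller keeps its value, and the change comes back as a return value.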
If you review more code from the standard library that shares built-in type values, you will see how these values are rarely shared with a pointer. If a function or method needs to change the value of a built-in type, a new value reflecting that change is often returned back to the caller.
In general, don’t share built-in type values with a pointer.
Struct Types
Struct types allow for the creation of complex data types by composing different types together. This is accomplished by composing a sequence of fields, each with a name and a type. They also support embedding, which adds to the way struct types can be composed.
Struct types can be implemented to behave like built-in types. When they are, you should treat them as such. To see a struct type that behaves as a primitive data value, we can look at the time package:
Listing 4
http://golang.org/src/time/time.go

    type Time struct {
        // sec gives the number of seconds elapsed since
        // January 1, year 1 00:00:00 UTC.
        sec int64

        // nsec specifies a non-negative nanosecond
        // offset within the second named by Seconds.
        // It must be in the range [0, 999999999].
        nsec int32

        // loc specifies the Location that should be used to
        // determine the minute, hour, month, day, and year
        // that correspond to this Time.
        // Only the zero Time has a nil Location.
        // In that case it is interpreted to mean UTC.
        loc *Location
    }
Listing 4 shows the Time struct type. This type represents time and has been implemented to behave as a primitive data value. If you look at the factory function Now, you will see it returns a value of type Time, not a pointer:
Listing 5
Listing 5 shows how the Now function returns a value of type Time. This is an indication that values of type Time are safe to copy and that copying is the preferred way to share them. Next, let’s look at a method that is used to change the value of a Time value:
Listing 6
http://golang.org/src/time/time.go

    func (t Time) Add(d Duration) Time {
        t.sec += int64(d / 1e9)
        nsec := int32(t.nsec) + int32(d%1e9)
        if nsec >= 1e9 {
            t.sec++
            nsec -= 1e9
        } else if nsec < 0 {
            t.sec--
            nsec += 1e9
        }
        t.nsec = nsec
        return t
    }
Just like we saw when working with built-in types, listing 6 shows how the Add method is called against a copy of the caller’s Time value. The method changes the local copy of the receiver value and returns a copy of that change back to the caller.
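From the caller’s side, the effect can be sketched like this. The time.Date and Add calls are the real time package API; the helper names addHour and hoursBeforeAndAfter and the chosen date are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// addHour is a hypothetical helper. It accepts a copy of the caller's
// Time value and returns a new Time value one hour later.
func addHour(t time.Time) time.Time {
	return t.Add(time.Hour)
}

// hoursBeforeAndAfter returns the hour of the caller's value after the
// call to addHour, together with the hour of the value that came back.
func hoursBeforeAndAfter() (original, shifted int) {
	start := time.Date(2020, time.January, 1, 9, 0, 0, 0, time.UTC)
	later := addHour(start)
	return start.Hour(), later.Hour()
}

func main() {
	original, shifted := hoursBeforeAndAfter()

	// The caller's value still reads 9; only the returned value moved.
	fmt.Println(original, shifted)
}
```

The caller’s Time value never changes. Just as with a string or an int, the change arrives as a new value.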
Functions that accept Time values also accept copies of these values:
Listing 7
Listing 7 shows the declaration of the div function, which accepts a value of type Time and a value of type Duration. Again, values of type Time are treated like a primitive data type and are copied when shared.
Most of the time, struct types are not created to behave like a primitive data type. In these cases, sharing the value by using a pointer is a better way to go. Let’s look at an example from the os package:
Listing 8
In listing 8 we see the Open function from the os package. It opens a file for reading and returns a pointer to a value of type File. Next, let’s look at the declaration of the File struct type for the UNIX platform:
Listing 9
http://golang.org/src/os/file_unix.go

    // File represents an open file descriptor.
    type File struct {
        *file
    }

    // file is the real representation of *File.
    // The extra level of indirection ensures that no clients of os
    // can overwrite this data, which could cause the finalizer
    // to close the wrong file descriptor.
    type file struct {
        fd      int
        name    string
        dirinfo *dirInfo // nil unless directory being read
        nepipe  int32    // number of consecutive EPIPE in Write
    }
I left the comments for these type declarations in listing 9 because they really bring home the point I want to make. When you have a factory function like Open that is providing you a pointer, it is a good sign that you should not be making copies of the referenced value being returned. Open is returning a pointer because it is not safe to make copies of the referenced File value being returned. The value should always be used and shared through the pointer.
Even if a function or method is not changing the state of a File struct type value, it still needs to be shared with a pointer. Let’s look at the epipecheck function from the os package for the UNIX platform:
Listing 10
http://golang.org/src/os/file_unix.go

    func epipecheck(file *File, e error) {
        if e == syscall.EPIPE {
            if atomic.AddInt32(&file.nepipe, 1) >= 10 {
                sigpipe()
            }
        } else {
            atomic.StoreInt32(&file.nepipe, 0)
        }
    }
Here in listing 10, the epipecheck function accepts a pointer to a value of type File. The caller therefore shares its File value with the function via a pointer. Notice the epipecheck function is not changing the file that the value represents; it only updates the nepipe counter through the pointer while performing its operation.
This applies as well to the methods declared for the File type:
Listing 11
http://golang.org/src/os/file.go

    func (f *File) Chdir() error {
        if f == nil {
            return ErrInvalid
        }
        if e := syscall.Fchdir(f.fd); e != nil {
            return &PathError{"chdir", f.name, e}
        }
        return nil
    }
The Chdir method in listing 11 is using a pointer receiver to implement the method and does not change the state of the receiver value. In all these cases, to share a value of type File, it must be done with a pointer. A File value is not a primitive data value.
If you review more code from the standard library, you will see how struct types are either implemented as a primitive data value like the built-in types or implemented as a value that needs to be shared with a pointer and never copied. The factory functions for a given struct type will give you a great clue as to how the type is implemented.
In general, share struct type values with a pointer unless the struct type has been implemented to behave like a primitive data value.
If you are still not sure, here is another way to think about it. Think of every struct as having a nature. If the nature of the struct is something that should not be changed, like a time, a color or a coordinate, then implement the struct as a primitive data value. If the nature of the struct is something that can be changed, even if it never is in your program, it is not a primitive data value and should be implemented to be shared with a pointer. Don’t create structs that have a duality of nature.
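Here is a minimal sketch of the two natures side by side. Both types are hypothetical and are not taken from the standard library: Point is a coordinate and is implemented as a primitive data value, while Account can change and is implemented to be shared with a pointer.

```go
package main

import "fmt"

// Point is a hypothetical type whose nature does not change: moving a
// point produces a different point. It uses value semantics throughout.
type Point struct {
	X, Y int
}

// Translate uses a value receiver and returns a new Point.
func (p Point) Translate(dx, dy int) Point {
	p.X += dx
	p.Y += dy
	return p
}

// Account is a hypothetical type whose nature can change: a deposit
// changes this account, not a copy of it. It uses pointer semantics.
type Account struct {
	balance int
}

// NewAccount is a factory function that returns a pointer, which signals
// that Account values should be shared and not copied.
func NewAccount(opening int) *Account {
	return &Account{balance: opening}
}

// Deposit uses a pointer receiver so every holder sees the change.
func (a *Account) Deposit(amount int) {
	a.balance += amount
}

// Balance only reads state, but it still uses a pointer receiver so the
// type keeps a single, consistent nature.
func (a *Account) Balance() int {
	return a.balance
}

func main() {
	p1 := Point{X: 1, Y: 2}
	p2 := p1.Translate(3, 4)
	fmt.Println(p1, p2)

	acct := NewAccount(100)
	acct.Deposit(50)
	fmt.Println(acct.Balance())
}
```

Notice that each type picks one kind of receiver and one kind of factory return and sticks with it. Mixing the two on a single type is the duality to avoid.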
Reference Types
Reference types are slices, maps, channels, interface and function values. These are values that contain a header value that references an underlying data structure via a pointer and other meta-data. We rarely share reference type values with a pointer because the header value is designed to be copied. The header value already contains a pointer which is sharing the underlying data structure for us by default.
Let’s look at an example from the net package:
Listing 12
Listing 12 shows a named type from the net package called IP with a base type that is a slice of bytes. There is value in using a named type when you need to declare behavior around a built-in or reference type. Let’s look at the MarshalText method for the IP named type:
Listing 13
http://golang.org/src/net/ip.go

    func (ip IP) MarshalText() ([]byte, error) {
        if len(ip) == 0 {
            return []byte(""), nil
        }
        if len(ip) != IPv4len && len(ip) != IPv6len {
            return nil, errors.New("invalid IP address")
        }
        return []byte(ip.String()), nil
    }
Here in listing 13, we can see how the MarshalText method is using a value receiver. This is exactly what I would expect to see because we don’t share reference types with a pointer. If you look at the rest of the methods declared for the IP named type in the net package, you will see the use of more value receivers.
This applies to sharing reference type values as parameters to functions and methods:
Listing 14
http://golang.org/src/net/ip.go

    // ipEmptyString is like ip.String except that it returns
    // an empty string when ip is unset.
    func ipEmptyString(ip IP) string {
        if len(ip) == 0 {
            return ""
        }
        return ip.String()
    }
The ipEmptyString function in listing 14 accepts a value of named type IP. No pointer is used to share this value since the base type for IP is a slice of bytes and therefore a reference type.
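The same pattern can be sketched with a made-up named type. Scores, Total and double are hypothetical names for this example; the point is that copying the slice header still shares the underlying array.

```go
package main

import "fmt"

// Scores is a hypothetical named type whose base type is a slice of ints.
type Scores []int

// Total uses a value receiver. Only the slice header is copied.
func (s Scores) Total() int {
	sum := 0
	for _, v := range s {
		sum += v
	}
	return sum
}

// double accepts a copy of the slice header. The copy points at the same
// backing array, so writes to the elements are visible to the caller.
func double(s Scores) {
	for i := range s {
		s[i] *= 2
	}
}

func main() {
	s := Scores{1, 2, 3}
	fmt.Println(s.Total())

	double(s)
	fmt.Println(s.Total())
}
```

The header already does the sharing, so adding a pointer on top of it gains nothing here.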
There is one common exception to the rule of not sharing a reference type with a pointer:
Listing 15
http://golang.org/src/net/ip.go

    func (ip *IP) UnmarshalText(text []byte) error {
        if len(text) == 0 {
            *ip = nil
            return nil
        }
        s := string(text)
        x := ParseIP(s)
        if x == nil {
            return &ParseError{"IP address", s}
        }
        *ip = x
        return nil
    }
Anytime you are unmarshaling data into a reference type, you will need to share that reference type value with a pointer. Listing 15 shows the UnmarshalText method that is performing an unmarshal operation and is declared with a pointer receiver. The Decode and Unmarshal functions from the encoding packages would also expect to receive a pointer to a reference type.
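As a small sketch of this exception, here is the json.Unmarshal function from the encoding/json package decoding into a slice. The helper decodeIDs and the sample document are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeIDs is a hypothetical helper that decodes a JSON array of
// numbers into a slice of ints.
func decodeIDs(data []byte) ([]int, error) {
	var ids []int

	// Unmarshal has to replace the nil slice header with a new one, so it
	// needs the address of ids and not a copy of the header.
	if err := json.Unmarshal(data, &ids); err != nil {
		return nil, err
	}

	// The result goes back to the caller as a copy of the slice header.
	return ids, nil
}

func main() {
	ids, err := decodeIDs([]byte(`[10, 20, 30]`))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(ids)
}
```

The pointer exists only for the duration of the unmarshal call. Once the slice has been filled in, it is shared as a value again.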
If you review more code from the standard library, you will see how values from reference types in most cases are not shared with a pointer. Since the reference type contains a header value whose purpose is to share an underlying data structure, sharing these values with a pointer is unnecessary. There is already a pointer in use.
In general, don’t share reference type values with a pointer unless you are implementing an unmarshal type of functionality.
Slices Of Values
One thing I avoid when I can is storing data with a slice of pointers. When I retrieve data from a database, the web or even a file, I store that data in a slice of values:
Listing 16

    10 func FindRegion(s *Service, region string) ([]BuoyStation, error) {
    11     var bs []BuoyStation
    12     f := func(c *mgo.Collection) error {
    13         queryMap := bson.M{"region": region}
    14         return c.Find(queryMap).All(&bs)
    15     }
    16
    17     if err := s.DBAction(cfg.Database, "buoy_stations", f); err != nil {
    18         return nil, err
    19     }
    20
    21     return bs, nil
    22 }
Here is some code in listing 16 from one of my projects that makes a call to a MongoDB database via the mgo package. On line 14, I pass the address of the bs slice to the All method. The All method performs an unmarshal call underneath to create the values for the slice. Then the slice of data values is returned by passing a copy of the slice header value back to the caller.
Using a slice of values allows the data for the program to be stored in a contiguous block of memory. This means that more of the core data I am using can be cached by the CPU at once and hopefully stays in the cache longer. If I create a slice of pointers, there is no guarantee the memory for these core data values would be contiguous, only the pointers to those values would be stored contiguously. Though I am thinking about performance in this case, I would argue that it is more idiomatic.
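The contiguous layout can be checked with a short sketch. The Station type and the contiguous function are hypothetical stand-ins for this example, and the unsafe package is used only to look at element addresses.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Station is a hypothetical record type standing in for data that has
// been loaded from a database.
type Station struct {
	ID   int64
	Temp float64
}

// contiguous reports whether every element of the slice sits exactly one
// element size after the previous element in memory.
func contiguous(stations []Station) bool {
	size := unsafe.Sizeof(Station{})
	for i := 1; i < len(stations); i++ {
		prev := uintptr(unsafe.Pointer(&stations[i-1]))
		curr := uintptr(unsafe.Pointer(&stations[i]))
		if curr-prev != size {
			return false
		}
	}
	return true
}

func main() {
	stations := make([]Station, 4)
	fmt.Println(contiguous(stations))
}
```

With a slice of pointers to Station values, only the pointers would be laid out this way. Each Station could live anywhere on the heap.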
There are times when this is not possible. Imagine if I needed a slice of File type values. Since I can’t make copies of File type values, I would need to create a slice of File type pointers. This situation occurs often when working with struct types from the standard library and not your own.
In general, create slices and maps of values when you can.
Conclusion
The standard library is fairly consistent in how it shares values based on the type of value it is working with. Don’t use pointers with built-in data types unless you have a special need to do so. Struct types have a duality. If the struct type is implemented as a primitive data type, then don’t use a pointer. If not, then share those values with a pointer. Finally, reference types should not be shared with a pointer with very few exceptions.
I would like to reiterate three other thoughts as I close. First, make coding decisions based on the code being idiomatic, simple, readable and reasonable. Second, this is not about right and wrong. Think about the code you are writing and make sure there is a reason behind the decisions you are making. Lastly, take each situation and scenario as an individual case, and try not to apply a blanket pattern or solution.