Slices are used everywhere in my code. If I am working with data from MongoDB, it is stored in a slice. If I need to keep track of a collection of problems after running an operation, it is stored in a slice. If you don’t understand how slices work yet or have been avoiding them like I did when I started, read these two posts to learn more.
https://www.ardanlabs.com/blog/2013/08/understanding-slices-in-go-programming.html
https://www.ardanlabs.com/blog/2013/08/collections-of-unknown-length-in-go.html
A question that I am constantly asking myself when coding is, "Do I want to use a pointer to this value or do I want to make a copy?" Though Go can be used as a functional programming language, it is an imperative programming language at heart. What's the difference?
A functional programming language does not allow you to change the state of a variable or a value once it has been created and initialized. This means variables and values are immutable, they can't be changed. If you want to change the state of a variable or a value, you must make a copy and initialize the copy with the changes. Functions are always passed copies and return values are always copies too.
In an imperative programming language we can create variables and values that are mutable, or can be changed. We can pass a pointer for any variable or value to a function, which in turn can change the state as necessary. A functional programming language wants you to think in terms of mathematical functions that take input and produce a result. In an imperative programming language we can build similar functions, but we can also build functions that perform operations on state that can exist anywhere in memory.
Being able to use a pointer has advantages but can also get you in trouble. Using pointers can alleviate memory constraints and possibly improve performance. It can also create synchronization issues such as shared access to values and resources. Find the solution that works best for each individual use case. For your Go programs I recommend using pointers when it is safe and practical. Go is an imperative programming language so take advantage of that.
In Go everything is pass by value and it is really important to remember that. We can pass by value the address of an object or pass by value a copy of an object. When we use a pointer in Go it can sometime be confusing because Go handles all the dereferencing for us. Don’t get me wrong, its great that Go does this, but sometime you can forget what the value of your variable actually is.
At some point in every program, I have the need to iterate over a slice to perform some work. In Go we use the keyword range within a for loop construct to iterate over a slice. In the beginning I made some very bad mistakes iterating over slices because I misunderstood how the range keyword worked. I will show you a nasty bug I created iterating over a slice that puzzled me for a bit. Now it is obvious to me why the code misbehaved, but at the time I was shaking my head.
Let's create some simple values and place them inside of a slice. Then we will iterate over the slice and see what happens.
package main
import (
"fmt"
)
type Dog struct {
Name string
Age int
}
func main() {
jackie := Dog{
Name: "Jackie",
Age: 19,
}
fmt.Printf("Jackie Addr: %p\n", &jackie)
sammy := Dog{
Name: "Sammy",
Age: 10,
}
fmt.Printf("Sammy Addr: %p\n", &sammy)
dogs := []Dog{jackie, sammy}
fmt.Println("")
for _, dog := range dogs {
fmt.Printf("Name: %s Age: %d\n", dog.Name, dog.Age)
fmt.Printf("Addr: %p\n", &dog)
fmt.Println("")
}
}
The program creates two dog values and puts them into a slice of dogs. We display the address of each dog value. Then we iterate over the slice displaying the name, age and address of each Dog.
Here is the output for the program:
Jackie Addr: 0x2101bc000
Sammy Addr: 0x2101bc040
Name: Jackie Age: 19
Addr: 0x2101bc060
Name: Sammy Age: 10
Addr: 0x2101bc060
So why is the address of the dog value different inside the range loop and why does the same address appear twice? This all has to do with the fact that everything is pass by value in Go. In this code example we actually create 2 extra copies of each Dog value in memory.
The initial existence of each Dog value is created with a composite literal:
jackie := Dog{
Name: "Jackie",
Age: 19,
}
The first copies of the values are created when the values are placed into the slice:
dogs := []Dog{jackie, sammy}
The second copies of the values are created when we iterate over the slice:
dog := range dogs
Now we can see why the address of the dog variable inside range loop is always the same. We are displaying the address of the dog variable, which happens to be a local variable of type Dog and contains a copy of the Dog value for each index of the slice. With each iteration of the slice, the location of the dog variable is the same. The value of the dog variable is changing.
That nasty bug I was talking about earlier had to do with me thinking the address of the dog variable could be used as a pointer to each individual Dog value inside the slice. Something like this:
allDogs := []*Dog{}
for _, dog := range dogs {
allDogs = append(allDogs, &dog)
}
for _, dog := range allDogs {
fmt.Printf("Name: %s Age: %d\n", dog.Name, dog.Age)
}
I create a new slice that can hold pointers to Dog values. Then I range over the slice of dogs storing the address of each Dog value into the new slice. Or at least I think I am storing the address of each Dog value.
If I add this code to the program and run it, this is the output:
Name: Sammy Age: 10
Name: Sammy Age: 10
I end up with a slice where every element has the same address. This address is pointing to a copy of the last value that we iterated over. Yikes!!
If making all these copies is not what you want, you could use pointers. Here is the example program using pointers:
package main
import (
"fmt"
)
type Dog struct {
Name string
Age int
}
func main() {
jackie := &Dog{
Name: "Jackie",
Age: 19,
}
fmt.Printf("Jackie Addr: %p\n", jackie)
sammy := &Dog{
Name: "Sammy",
Age: 10,
}
fmt.Printf("Sammy Addr: %p\n\n", sammy)
dogs := []*Dog{jackie, sammy}
for _, dog := range dogs {
fmt.Printf("Name: %s Age: %d\n", dog.Name, dog.Age)
fmt.Printf("Addr: %p\n\n", dog)
}
}
Here is the output:
Jackie Addr: 0x2101bb000
Sammy Addr: 0x2101bb040
Name: Jackie Age: 19
Addr: 0x2101bb000
Name: Sammy Age: 10
Addr: 0x2101bb040
This time we create a slice of pointers to Dog values. When we iterate over this slice, the value of the dog variable is the address of each Dog value we stored in the slice. Instead of creating two extra copies of each Dog value, we are using the same initial Dog value we created with the composite literal.
When the slice is a collection of Dog values or a collection of pointers to Dog values, the range loop is the same.
for _, dog := range dogs {
fmt.Printf("Name: %s Age: %d\n", dog.Name, dog.Age)
}
Go handles access to the Dog value regardless of whether we are using a pointer or not. This is awesome but can sometimes lead to a bit of confusion. At least it was for me in the beginning.
I can't tell you when you should use a pointer or when you should use a copy. Just remember that Go is going to pass everything by value. That includes function parameters, return values and when iterating over a slice, map or channel.
Yes, you can also range over a channel. Take a look at this sample code I altered from a blog post written by Ewen Cheslack-Postava:
http://ewencp.org/blog/golang-iterators/package main
import (
"fmt"
)
type Dog struct {
Name string
Age int
}
type DogCollection struct {
Data []*Dog
}
func (this *DogCollection) Init() {
cloey := &Dog{"Cloey", 1}
ralph := &Dog{"Ralph", 5}
jackie := &Dog{"Jackie", 10}
bella := &Dog{"Bella", 2}
jamie := &Dog{"Jamie", 6}
this.Data = []*Dog{cloey, ralph, jackie, bella, jamie}
}
func (this *DogCollection) CollectionChannel() chan *Dog {
dataChannel := make(chan *Dog, len(this.Data))
for _, dog := range this.Data {
dataChannel <- dog
}
close(dataChannel)
return dataChannel
}
func main() {
dc := DogCollection{}
dc.Init()
for dog := range dc.CollectionChannel() {
fmt.Printf("Channel Name: %s\n", dog.Name)
}
}
If you run the program you will get the following output:
Channel Name: Cloey
Channel Name: Ralph
Channel Name: Jackie
Channel Name: Bella
Channel Name: Jamie
I really love this sample code because it shows the beauty of a closed channel. The key to making this program work is the fact that a closed channel is always in the signaled state. That means any read on the channel will return immediately. If the channel is empty a default value is returned. This is what allows the range to iterate over all the data that was passed into the channel and complete when the channel is empty. Once the channel is empty, the next read on the channel will return nil. This causes the loop to terminate.
Slices are great, lightweight and powerful. You should be using them and gaining the benefits they provide. Just remember that when you are iterating over a slice, you are getting a copy of each element of the slice. If that happens to be an object, you are getting a copy of that object. Don’t ever use the address of the local variable in the range loop. That is a local variable that contains a copy of the slice element and only has local context. Don’t make the same mistake I made.
Zero vs. Empty Slices