Series Index
Generics Part 01: Basic Syntax
Generics Part 02: Underlying Types
Generics Part 03: Struct Types and Data Semantics
Introduction
In the previous post, I showed you how to declare a user-defined type, based on an underlying type. I did this through the progression of writing different versions of the same type using concrete types, the empty interface and then finally, generics. I also provided information on how the compiler was limited in its ability to infer the substitution for the generic type during zero value construction, but it could with initialized construction.
In this post, I will share an example of how to declare a user-defined type based on a struct with generic fields. I will also talk about how using value or pointer type declarations will change the semantics. The code for this post can be found at this playground link.
Concrete Example
If you wanted to code a linked list in Go today, you would have to write complete implementations of the list for every new data type you wanted to manage. With the new generics syntax, you can now have just one implementation.
Listing 1
14 type node[T any] struct {
15     Data T
16     next *node[T]
17     prev *node[T]
18 }
In listing 1, a struct type is declared that represents a node for the linked list. Each node contains an individual piece of data that is stored and managed by the list. The identifier T is defined as a generic type (to be determined later), thanks to the square brackets attached to the type’s name on line 14. The use of the predeclared identifier any, inside the same square brackets, tells the compiler that there are no constraints on what type T can become. It states that “any” concrete type can be substituted for T.
Note: Generic type declarations require a constraint as part of the syntax.
With type T declared, the Data field on line 15 can now be defined as a field of some type T to be determined later. The next and prev fields declared on lines 16 and 17 need to point to a node of that same type T. These are the pointers to the next and previous node in the linked list, respectively. To make this connection, the fields are declared as pointers to a node that is bound to type T through the use of the square brackets.
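To make the substitution concrete, here is a minimal sketch that constructs nodes directly. The node type is copied from listing 1; constructing nodes by hand in main is only an illustration and is not part of the original program.

```go
package main

import "fmt"

// node mirrors the declaration from listing 1.
type node[T any] struct {
	Data T
	next *node[T]
	prev *node[T]
}

func main() {
	// T is substituted with int, so Data is an int and
	// next/prev are of type *node[int].
	a := node[int]{Data: 10}
	b := node[int]{Data: 20, prev: &a}
	a.next = &b

	// A separate instantiation where T is substituted with string.
	s := node[string]{Data: "hello"}

	fmt.Println(a.Data, a.next.Data, b.prev.Data, s.Data)
}
```

Because next and prev are declared as *node[T], a node[int] can only link to another node[int]. Assigning &s to a.next would fail to compile.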
Listing 2
20 type list[T any] struct {
21     first *node[T]
22     last  *node[T]
23 }
Listing 2 shows a second struct type named list, which represents a collection of nodes by pointing to the first and last node in a list. These fields need to point to a node of some type T, just like the next and prev fields from listing 1.
Once again on line 20, the identifier T is defined as a generic type (to be determined later) that can be substituted for “any” concrete type. Then on lines 21 and 22, the first and last fields are declared as pointers to a node of some type T using the square bracket syntax.
Listing 3
25 func (l *list[T]) add(data T) *node[T] {
26     n := node[T]{
27         Data: data,
28         prev: l.last,
29     }
30     if l.first == nil {
31         l.first = &n
32         l.last = &n
33         return &n
34     }
35     l.last.next = &n
36     l.last = &n
37     return &n
38 }
Listing 3 shows the implementation of a method named add for the list type. The method does not need a generic type list of its own (as a generic function would) since it is bound to the list through the receiver, which already names T. The add method’s receiver is declared as a pointer to a list of some type T and the return is declared as a pointer to a node of the same type T.
The code on lines 30 through 37 will always be the same, regardless of what type of data is being stored in the list, since that is just pointer manipulation. It’s only the construction of a new node on line 26 that is affected by the type of data that will be managed. Thanks to generics, the construction of a node can be bound to type T, which gets substituted later at compile time.
Without generics, this entire method would need to be duplicated since line 26 would need to be hard coded to a known, declared type prior to compilation. Since the amount of code (for the entire list implementation) that needs to change for different data types is very small, being able to declare a node and list to manage data of some type T reduces the cost of code duplication and maintenance.
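The following sketch puts listings 1 through 3 together and reuses the single add implementation for two different element types. The values method is a hypothetical helper added here only to make the contents of each list visible; it does not appear in the original code.

```go
package main

import "fmt"

type node[T any] struct {
	Data T
	next *node[T]
	prev *node[T]
}

type list[T any] struct {
	first *node[T]
	last  *node[T]
}

// add appends a node holding data to the end of the list (listing 3).
func (l *list[T]) add(data T) *node[T] {
	n := node[T]{
		Data: data,
		prev: l.last,
	}
	if l.first == nil {
		l.first = &n
		l.last = &n
		return &n
	}
	l.last.next = &n
	l.last = &n
	return &n
}

// values is a hypothetical helper (an assumption, not from the post).
// It walks the next pointers and collects each node's Data in order.
func (l *list[T]) values() []T {
	var out []T
	for n := l.first; n != nil; n = n.next {
		out = append(out, n.Data)
	}
	return out
}

func main() {
	// The same add method serves a list of int...
	var ints list[int]
	ints.add(1)
	ints.add(2)
	ints.add(3)

	// ...and a list of string, with no duplicated code.
	var strs list[string]
	strs.add("a")
	strs.add("b")

	fmt.Println(ints.values(), strs.values()) // [1 2 3] [a b]
}
```

Like add, the values helper declares no type list of its own. It picks up T from the receiver.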
Application
With the node and list types declared, I can now write a small application that constructs two lists where data is added and displayed.
Listing 4
44 type user struct {
45     name string
46 }
50 func main() {
51
52     // Store values of type user into the list.
53     var lv list[user]
54     n1 := lv.add(user{"bill"})
55     n2 := lv.add(user{"ale"})
56     fmt.Println(n1.Data, n2.Data)
57
58     // Store pointers of type user into the list.
59     var lp list[*user]
60     n3 := lp.add(&user{"bill"})
61     n4 := lp.add(&user{"ale"})
62     fmt.Println(n3.Data, n4.Data)
63 }
Output
{bill} {ale}
&{bill} &{ale}
Listing 4 shows that small application. On line 44, a type named user is declared and then on line 53, a list is constructed to its zero value state to manage values of type user. On line 59, a second list is constructed to its zero value state and this list manages pointers to values of type user. The only difference between these two lists is that one manages values of type user and the other manages pointers of type user.
Since type user is explicitly specified during the construction of the list on line 53, the add method in turn accepts values of type user on lines 54 and 55. Since a pointer of type user is explicitly specified during the construction of the list on line 59, the add method called on lines 60 and 61 accepts pointers of type user.
You can see in the output of the program that the Data field for the nodes in each list matches the data semantic used in the construction: the first list prints values of type user, and the second prints pointers (note the leading &).
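One way to observe the difference between the two data semantics is to change the original user value after it has been added to both lists. This is a sketch built on the types from listings 1 through 4; the variable u and the mutation step are additions for illustration only.

```go
package main

import "fmt"

type node[T any] struct {
	Data T
	next *node[T]
	prev *node[T]
}

type list[T any] struct {
	first *node[T]
	last  *node[T]
}

func (l *list[T]) add(data T) *node[T] {
	n := node[T]{
		Data: data,
		prev: l.last,
	}
	if l.first == nil {
		l.first = &n
		l.last = &n
		return &n
	}
	l.last.next = &n
	l.last = &n
	return &n
}

type user struct {
	name string
}

func main() {
	u := user{"bill"}

	// Value semantics: the node stores its own copy of u.
	var lv list[user]
	nv := lv.add(u)

	// Pointer semantics: the node shares u with the caller.
	var lp list[*user]
	np := lp.add(&u)

	// Mutate the original after both adds.
	u.name = "ale"

	fmt.Println(nv.Data.name) // bill: the copy is unaffected
	fmt.Println(np.Data.name) // ale: the shared value changed
}
```

The list implementation is identical in both cases. The caller chooses the semantic when choosing what to substitute for T.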
Forcing Pointer Semantics
What happens if I choose to change the code and declare the Data field as a pointer of type T and also have the add method accept a pointer of type T?
Playground Link
Listing 5
14 type node[T any] struct {
15     Data *T                                 <- CHANGED CODE
16     next *node[T]
17     prev *node[T]
18 }
25 func (l *list[T]) add(data *T) *node[T] {   <- CHANGED CODE
26     n := node[T]{
27         Data: data,
28         prev: l.last,
29     }
In listing 5, I changed the code on lines 15 and 25 to declare the Data field as a pointer of type T and the add method to accept a pointer of type T.
What happens when I build the program?
Listing 6
50 func main() {
51
52     // Store values of type user into the list.
53     var lv list[user]
54     n1 := lv.add(user{"bill"})     <- NOW MUST PASS A POINTER OF TYPE USER
55     n2 := lv.add(user{"ale"})      <- NOW MUST PASS A POINTER OF TYPE USER
56     fmt.Println(n1.Data, n2.Data)
57
58     // Store pointers of type user into the list.
59     var lp list[*user]
60     n3 := lp.add(&user{"bill"})    <- NOW MUST PASS A POINTER TO A POINTER OF TYPE USER
61     n4 := lp.add(&user{"ale"})     <- NOW MUST PASS A POINTER TO A POINTER OF TYPE USER
62     fmt.Println(n3.Data, n4.Data)
63 }
Output
type checking failed for main
prog.go2:54:15: cannot use (user literal) (value of type user) as *user value in argument
prog.go2:55:15: cannot use (user literal) (value of type user) as *user value in argument
prog.go2:60:15: cannot use &(user literal) (value of type *user) as **user value in argument
prog.go2:61:15: cannot use &(user literal) (value of type *user) as **user value in argument
Listing 6 shows the problem. You can see the compiler is expecting the code to pass a pointer of type user on lines 54 and 55 and a pointer to a pointer of type user on lines 60 and 61.
The type information being explicitly passed to the compiler on the construction calls no longer matches what the add method is asking for as input. The API is no longer in line with the concrete data that is being declared at construction. In my opinion, this can create confusion in how the API works and what data semantics are being used underneath. This kind of confusion in an API can lead to misuse and mistakes.
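For illustration, here is a sketch of what the calling code would have to look like for the listing 5 version to compile. It is not a recommendation; it only shows the mismatch between the type named at construction and the type the add method accepts. The variable p and the printed fields are assumptions made for this example.

```go
package main

import "fmt"

// node and add use the forced pointer declarations from listing 5.
type node[T any] struct {
	Data *T
	next *node[T]
	prev *node[T]
}

type list[T any] struct {
	first *node[T]
	last  *node[T]
}

func (l *list[T]) add(data *T) *node[T] {
	n := node[T]{
		Data: data,
		prev: l.last,
	}
	if l.first == nil {
		l.first = &n
		l.last = &n
		return &n
	}
	l.last.next = &n
	l.last = &n
	return &n
}

type user struct {
	name string
}

func main() {
	// Declared as a list of user values, yet add requires *user.
	var lv list[user]
	n1 := lv.add(&user{"bill"})
	fmt.Println(n1.Data.name)

	// Declared as a list of *user, so add requires **user.
	// The pointer must first be stored in a variable so that
	// its address can be taken.
	var lp list[*user]
	p := &user{"ale"}
	n2 := lp.add(&p)
	fmt.Println((*n2.Data).name)
}
```

Reading list[user] no longer tells you that the list stores pointers, and list[*user] now requires an extra level of indirection both when adding and when reading the data back.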
Conclusion
After reading this post, you should have a better understanding of the generics syntax for user-defined types in Go that are based on struct types. You saw how to declare fields of the generic type and how to declare fields that can point to other struct type values. You also saw how to declare a method that can accept and return values of the same generic type. Finally, you saw how forcing pointer semantics on field and method declarations of the generic type can create confusion and lead to misuse of the API.
In the next post, I will explore how you can declare behavioral constraints on generic types when generic functions require values of the generic types to exhibit behavior. If you can’t wait, I recommend you check out the repo of code that these blog posts are based on and experiment for yourself. If you have any questions, please reach out to me over email, Slack or Twitter.