Understanding Go's Escape Analysis Through an Example

1. Starting with an example

Below is a piece of C code. The function getStr builds the string a-z, and we print that string both inside the function and in main.

//Example 1.1
#include <stdio.h>

//Returns the string
char* getStr(){
    //char array, allocated on the function's stack
    char buf[27];
    int i;

    //build the string a-z
    for (i=0; i<sizeof(buf)-1; i++){
        buf[i] = i + 'a';
    }
    buf[i] = '\0';

    printf("%s\n", buf);
    return buf;
}

int main(){
    char *p;
    p = getStr();
    //p dangles here: buf's stack memory is gone once getStr returns
    printf("%s\n", p);

    return 0;
}

The output is as follows:

abcdefghijklmnopqrstuvwxyz
m

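If you have some C experience, you know why this happens: buf[27] is allocated on the function's stack, and that memory is reclaimed automatically when the function returns. Printing the string again in main is therefore undefined behavior, and the second line of output is unpredictable. The compiler also warns about this when compiling the code above: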

In function ‘getStr’:
warning: function returns address of local variable [-Wreturn-local-addr]

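One way to fix this (just one way, not a recommended practice) is to request the memory with malloc inside the function. Memory from malloc is allocated on the heap and is not reclaimed automatically when the function returns, so we get the expected result. The code is as follows: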

//Example 1.2
#include <stdio.h>
#include <stdlib.h>

char* getStr(){
    char *buf;
    int len = 27;
    //allocated on the heap, not reclaimed automatically when the function returns
    buf = (char *) malloc(len);

    int i;
    for (i=0; i<len-1; i++){
        buf[i] = i + 'a';
    }

    buf[i] = '\0';

    printf("%s\n", buf);
    return buf;
}

int main(){
    char *p;
    p = getStr();
    printf("%s\n", p);

    //free the heap memory manually
    free(p);
    return 0;
}

We can implement similar functionality in Go like this:

//Example 1.3
package main

import "fmt"

func getStr() *[26] byte{
    buf := [26]byte{}
    for i:=0; i<len(buf); i++{
        buf[i] = byte('a' + i)
    }

    return &buf
}

func main(){
    var p *[26] byte
    p = getStr();

    fmt.Printf("%s\n", *p)
}

The output is as follows:

abcdefghijklmnopqrstuvwxyz

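In this program we never told getStr that buf should be allocated on the heap, so why does it run correctly? Because Go has escape analysis.

2. What is escape analysis

Is the memory of a variable in a function allocated on the heap or on the stack? In Go the compiler decides, and the analysis it uses to decide is called escape analysis. In example 1.3 the Go compiler sees that the memory behind buf is still used after the function returns, so it automatically allocates it on the heap, which keeps the program correct. Memory that escapes to the heap is also reclaimed automatically by Go's garbage collector.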

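3. How to inspect escapes

Suppose our code is in escape.go. The following commands show the actual escape decisions.

# summary of escape decisions
go build -gcflags "-m" escape.go
# detailed reasoning
go build -gcflags "-m -m" escape.go

For the code in example 1.3, running go build -gcflags "-m" gives: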

# command-line-arguments
./c.go:20:15: inlining call to fmt.Printf
./c.go:6:5: moved to heap: buf
./c.go:20:24: *p escapes to heap
./c.go:20:15: []interface {} literal does not escape
<autogenerated>:1: .this does not escape

So buf did escape to the heap.
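The compiler diagnostics can also be cross-checked at run time. The following is a minimal sketch, not part of the original example: it uses testing.AllocsPerRun from the standard library to count heap allocations per call. getStr mirrors example 1.3, while sumLocal, sink, and the //go:noinline directives are additions made purely for illustration.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps the returned pointer reachable so the call cannot be discarded.
// It is a hypothetical helper, not part of example 1.3.
var sink *[26]byte

// getStr mirrors example 1.3: it returns the address of a local array,
// so the compiler is expected to move buf to the heap.
//
//go:noinline
func getStr() *[26]byte {
	buf := [26]byte{}
	for i := 0; i < len(buf); i++ {
		buf[i] = byte('a' + i)
	}
	return &buf
}

// sumLocal (hypothetical) uses the same kind of array, but the array never
// leaves the function, so it is expected to stay on the stack.
//
//go:noinline
func sumLocal() int {
	buf := [26]byte{}
	total := 0
	for i := range buf {
		buf[i] = byte('a' + i)
		total += int(buf[i])
	}
	return total
}

func main() {
	escaping := testing.AllocsPerRun(100, func() { sink = getStr() })
	local := testing.AllocsPerRun(100, func() { _ = sumLocal() })
	fmt.Printf("getStr: %.0f allocs/call, sumLocal: %.0f allocs/call\n", escaping, local)
}
```

On a typical toolchain getStr should report one allocation per call and sumLocal none, but the exact counts depend on the compiler version.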

4. When escapes happen

In which cases does Go move memory from the function stack to the heap? The official FAQ (How do I know whether a variable is allocated on the heap or the stack?^[1]) gives the answer:

When possible, the Go compilers will allocate variables that are local to a function in that function's stack frame. However, if the compiler cannot prove that the variable is not referenced after the function returns, then the compiler must allocate the variable on the garbage-collected heap to avoid dangling pointer errors. Also, if a local variable is very large, it might make more sense to store it on the heap rather than the stack.

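So escapes fall into two main categories:

The compiler cannot prove that the variable is not referenced after the function returns, so it allocates it on the heap.
The variable is large enough that storing it on the heap makes more sense.

4.1 Variables still used after the function returns

Closures: a closure lets a function's local variables be used from outside after the function has returned.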

package main

func main() {
    f := fibonacci()
    f()
}

func fibonacci() func() int {
    a, b := 0, 1
    return func() int {
        a, b = b, a+b
        return a
    }
}

The escape results are as follows:

go build -gcflags "-m -l" escape.go
# command-line-arguments
./escape.go:9:5: moved to heap: a
./escape.go:9:8: moved to heap: b
./escape.go:10:12: func literal escapes to heap
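The reason a and b must outlive the call to fibonacci becomes visible once the closure is called more than once. The sketch below reuses fibonacci from the example; firstN is a hypothetical helper added only to collect results.

```go
package main

import "fmt"

// fibonacci returns a closure that captures a and b. Both variables must
// survive after fibonacci returns, which is why they are moved to the heap.
func fibonacci() func() int {
	a, b := 0, 1
	return func() int {
		a, b = b, a+b
		return a
	}
}

// firstN (hypothetical helper) calls one closure n times and collects the results.
func firstN(n int) []int {
	f := fibonacci()
	out := make([]int, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, f())
	}
	return out
}

func main() {
	// The state in a and b persists between calls to the same closure.
	fmt.Println(firstN(5)) // prints [1 1 2 3 5]

	// A second closure gets its own a and b, so it starts from the beginning.
	g := fibonacci()
	fmt.Println(g()) // prints 1
}
```

Each call to fibonacci creates a fresh pair of variables, so every closure carries its own heap-allocated state.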

Returning a pointer

package main

type User struct {
    name string
    age int8
}

//Returns a pointer to User
func NewUser() *User{
    u := User{
        name: "ball",
        age: 99,
    }

    //u's memory may be used by the caller, so it goes on the heap
    return &u
}

func main() {
}

The escape results are as follows:

go build -gcflags "-m  -l" escape.go
# command-line-arguments
./escape.go:9:2: moved to heap: u

Returning an interface

package main

type Man interface{
    Show()
}

type User struct {
    name string
    age int8
}

func (u User)Show(){
}

func NewMan() (m Man){
    u:= User{
        name: "ball",
        age: 99,
    }

    m = u

    return
}

func main() {
}

The escape results are as follows:

go build -gcflags "-m -l" escape.go
# command-line-arguments
./escape.go:12:7: u does not escape
./escape.go:21:4: u escapes to heap
<autogenerated>:1: .this does not escape
<autogenerated>:1: leaking param: .this

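NewMan converts u to the interface m. An interface value in Go consists of a dynamic type and a dynamic value. The dynamic-value pointer in m points to the memory of u (more precisely, of a copy of u), and that memory may be used after the function returns, so it has to be allocated on the heap.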
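The "copy of u" detail can be observed directly. The following sketch is a variation on the example above, not the original code: the Name method and the extra assignment to u.name are added for illustration only.

```go
package main

import "fmt"

// Man is a variation of the interface in the example; Name is a
// hypothetical method added so the stored value can be inspected.
type Man interface {
	Name() string
}

type User struct {
	name string
	age  int8
}

func (u User) Name() string { return u.name }

// NewMan stores u in an interface and then modifies the local u.
func NewMan() Man {
	u := User{name: "ball", age: 99}
	var m Man = u      // the interface holds a copy of u
	u.name = "changed" // modifies only the local variable, not the copy
	return m
}

func main() {
	fmt.Println(NewMan().Name()) // prints ball
}
```

Because the interface holds its own copy, the later write to u.name is not visible through m. It is this copy that has to live on the heap once m is returned.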

4.2 Variables too large for the stack

//escape.go
package main
func Slice(){
    s := make([]int64, 8192, 8192)
    s[0] = 1
}

func main() {
    Slice()
}

The escape results are as follows:

go build -gcflags "-m -m -l" escape.go
# command-line-arguments
./escape.go:3:11: make([]int64, 8192, 8192) escapes to heap:
./escape.go:3:11:   flow: {heap} = &{storage for make([]int64, 8192, 8192)}:
./escape.go:3:11:     from make([]int64, 8192, 8192) (too large for stack) at ./escape.go:3:11
./escape.go:3:11: make([]int64, 8192, 8192) escapes to heap

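Here the slice is too large (too large for stack), so it is allocated on the heap. If you are curious, you may wonder:

Is a Go function stack really so small that it cannot hold an int64 slice of length 8192?
How large exactly is too large?

I plan to cover these questions in a separate article on function stack memory allocation. For now, just remember the conclusions.

A Go stack can grow very large: the maximum is 250 MB on 32-bit systems and 1 GB on 64-bit systems. (This is the case in 1.14.4; I have not confirmed that other Go versions are identical.)
too large here does not mean the stack has run out of memory. It means that for a variable backed by such a large block of memory, allocating it on the stack is likely to be inefficient, so the heap is the more reasonable place.
In go1.14.4, too large means >= 8 bytes * 8192, that is, 64 KiB. So with the following change, no escape occurs.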

s := make([]int64, 8191, 8191)
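The arithmetic behind the 8191 versus 8192 boundary can be written out as a small sketch. The constant implicitLimit is hard-coded from the figure given above for go1.14.4 and is an assumption; it is not read from the compiler. backingBytes and tooLarge are hypothetical helper names.

```go
package main

import (
	"fmt"
	"unsafe"
)

// implicitLimit is the threshold reported in the text for go1.14.4:
// 8 bytes * 8192 = 65536 bytes. Hard-coded here as an assumption.
const implicitLimit = 8 * 8192

// backingBytes returns the size in bytes of the backing array
// of a []int64 with n elements.
func backingBytes(n int) uintptr {
	return uintptr(n) * unsafe.Sizeof(int64(0))
}

// tooLarge applies the ">= limit" rule described in the text.
func tooLarge(n int) bool {
	return backingBytes(n) >= implicitLimit
}

func main() {
	for _, n := range []int{8191, 8192} {
		fmt.Printf("make([]int64, %d): %d bytes, too large: %v\n",
			n, backingBytes(n), tooLarge(n))
	}
}
```

With 8191 elements the backing array is 65528 bytes, just under the limit; with 8192 it is exactly 65536 bytes, which meets the >= condition.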

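5. Problems escape analysis can cause

5.1 How heap and stack allocation differ in Go

If memory is allocated on the stack, it is reclaimed automatically when the function returns.
If memory is allocated on the heap, it is left to the GC (garbage collector) after the function returns.

5.2 Possible problems

From 5.1, if too many escapes occur, more memory is allocated on the heap and the GC comes under more pressure, which can also hurt run-time performance. In a real project you need to weigh the trade-offs to decide whether avoiding an escape is worth it.

Let's look at a rather extreme example:

//benchmark.go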
package gotest

func Slice(){
    s := make([]int64, 8191, 8191)
    s[0] = 1
}

The corresponding benchmark file

//benchmark_test.go
package gotest_test

import (
    "testing"
    "gotest"
)

func BenchmarkSlice(b *testing.B){
    for i :=0; i<b.N; i++{
        gotest.Slice()
    }
}

In Slice the slice capacity is 8191, so the memory is allocated on the stack and no escape occurs.

Benchmark results

go test -bench=.

goos: linux
goarch: amd64
pkg: gotest
BenchmarkSlice-4        1000000000               0.510 ns/op
PASS
ok      gotest  0.570s

Next, we change the slice size to 8192, which is just enough to trigger the escape, so the memory is allocated on the heap

s := make([]int64, 8192, 8192)

go test -bench=.
goos: linux
goarch: amd64
pkg: gotest
BenchmarkSlice-4           80602             13799 ns/op
PASS
ok      gotest  1.275s

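So in this example, stack allocation has a huge advantage. Note, however, that 0.510 ns/op is far too short to zero roughly 64 KB of memory, so in the stack case the compiler has most likely optimized the unused slice away altogether. The measured ratio therefore overstates the real gap, even though the direction of the result holds.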
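A benchmark like this is sensitive to compiler optimizations, because a slice whose contents are never observed can be removed entirely. The sketch below is a variation rather than the author's benchmark: it uses testing.Benchmark from the standard library so that it runs as an ordinary program, and it writes to a run-time index and stores a result in sink to make the work harder to discard. The names sliceSmall, sliceLarge, measure, and sink are hypothetical.

```go
package main

import (
	"fmt"
	"testing"
)

// sink receives a value from every call so the work cannot be discarded.
var sink int64

// sliceSmall allocates 8191 int64 values, below the threshold from the text.
//
//go:noinline
func sliceSmall(i int) int64 {
	s := make([]int64, 8191, 8191)
	s[i%len(s)] = 1 // index known only at run time
	return s[0]
}

// sliceLarge allocates 8192 int64 values, at the threshold from the text.
//
//go:noinline
func sliceLarge(i int) int64 {
	s := make([]int64, 8192, 8192)
	s[i%len(s)] = 1
	return s[0]
}

// measure runs fn under testing.Benchmark and prints time and
// heap allocations per call.
func measure(name string, fn func(int) int64) {
	r := testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			sink = fn(i)
		}
	})
	fmt.Printf("%s: %d ns/op, %d allocs/op\n", name, r.NsPerOp(), r.AllocsPerOp())
}

func main() {
	measure("8191 elements", sliceSmall)
	measure("8192 elements", sliceLarge)
}
```

The allocs/op column makes it easy to see which variant reaches the heap. The timings depend on the machine and the Go version, so they should be read as relative rather than absolute.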

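6. More escape cases

The cases summarized in section 4 are only the main scenarios; there are many other ways for a variable to escape.

6.1 Escapes caused by a non-constant size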

package main

func s(){
    n := 10
    s := make([]int32, n)
    s[0] = 1
}

func main() {
}

The escape results are as follows:

go build -gcflags "-m -m -l" escape.go
# command-line-arguments
./escape.go:5:11: make([]int32, n) escapes to heap:
./escape.go:5:11:   flow: {heap} = &{storage for make([]int32, n)}:
./escape.go:5:11:     from make([]int32, n) (non-constant size) at ./escape.go:5:11
./escape.go:5:11: make([]int32, n) escapes to heap

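The compiler's explanation is non-constant size. This is understandable: a size that is not known at compile time could be very large, so to keep stack allocation efficient the compiler defensively allocates the slice on the heap.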
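One way to compare a non-constant size with a constant one is to count allocations for both. This is a minimal sketch under stated assumptions: variableSize, constantSize, and sink are hypothetical names, and the result depends on the Go version, since newer compilers may handle small variable-sized slices differently from the go1.14 behavior described here.

```go
package main

import (
	"fmt"
	"testing"
)

// sink keeps results observable so the calls are not discarded.
var sink int32

// variableSize passes a run-time value to make, which the compiler
// in the text reports as "non-constant size".
//
//go:noinline
func variableSize(n int) int32 {
	s := make([]int32, n)
	s[0] = 1
	return s[0] + int32(len(s))
}

// constantSize passes a compile-time constant to make, so the size
// of the backing array is known to the compiler.
//
//go:noinline
func constantSize() int32 {
	const n = 10
	s := make([]int32, n)
	s[0] = 1
	return s[0] + int32(len(s))
}

func main() {
	v := testing.AllocsPerRun(100, func() { sink = variableSize(10) })
	c := testing.AllocsPerRun(100, func() { sink = constantSize() })
	fmt.Printf("variable size: %.0f allocs/call, constant size: %.0f allocs/call\n", v, c)
}
```

If the size is genuinely fixed, declaring it as a constant gives the compiler the information it needs to keep the slice on the stack.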

6.2 Those surprising escapes

package main

type X struct {
    p *int
}

func main() {
    var i1 int
    x1 := &X{
        p: &i1, // GOOD: i1 does not escape
    }
    *x1.p++

    var i2 int
    x2 := &X{}
    x2.p = &i2 // BAD: Cause of i2 escape
    *x2.p++
}

The assignments to x1 and x2 do the same thing, yet just because they are written differently, one of them causes an escape!

go build -gcflags "-m -l" escape.go
# command-line-arguments
./escape.go:15:6: moved to heap: i2
./escape.go:9:8: &X literal does not escape
./escape.go:16:8: &X literal does not escape

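Benchmarking the two approaches shows a performance difference of nearly 50x. So when a struct has a pointer field, setting it in the composite literal is the more efficient way to assign it.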
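The two assignment styles can be compared by counting heap allocations. The sketch below is an illustration, not the benchmark behind the figure above: literalInit, fieldAssign, and sink are hypothetical names, and whether the second form still allocates depends on the Go version, since the escape analysis has changed between releases.

```go
package main

import (
	"fmt"
	"testing"
)

type X struct {
	p *int
}

// sink keeps results observable so the calls are not discarded.
var sink int

// literalInit sets the pointer field inside the composite literal.
//
//go:noinline
func literalInit() int {
	var i int
	x := &X{p: &i}
	*x.p++
	return i
}

// fieldAssign sets the pointer field with a separate assignment,
// the form the text reports as causing i2 to escape.
//
//go:noinline
func fieldAssign() int {
	var i int
	x := &X{}
	x.p = &i
	*x.p++
	return i
}

func main() {
	lit := testing.AllocsPerRun(100, func() { sink = literalInit() })
	asg := testing.AllocsPerRun(100, func() { sink = fieldAssign() })
	fmt.Printf("literal: %.0f allocs/call, field assignment: %.0f allocs/call\n", lit, asg)
}
```

Both functions return the same value; only the allocation count, if any, distinguishes them on a given compiler.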

More puzzling escape cases are described in Escape-Analysis Flaws^[2]. A Chinese translation is available at Go 逃逸分析的缺陷^[3].

References

[1] How do I know whether a variable is allocated on the heap or the stack?: https://golang.org/doc/faq#stack_or_heap
[2] Escape-Analysis Flaws: https://www.ardanlabs.com/blog/2018/01/escape-analysis-flaws.html
[3] Go 逃逸分析的缺陷 (Flaws in Go's escape analysis): https://studygolang.com/articles/12396
[4] 《Go专家编程》Go逃逸分析 ("Go Expert Programming": Go escape analysis): https://my.oschina.net/renhc/blog/2222104
[5] 深入理解Go-逃逸分析 (Understanding Go in depth: escape analysis): https://segmentfault.com/a/1190000020086727
[6] Go 语言机制之内存剖析 (Go language mechanics: memory profiling): https://studygolang.com/articles/12445
[7] 浅谈接口实现原理 (A brief look at how interfaces are implemented): https://www.cnblogs.com/DongDon/p/12586212.html
