A Guide to the Go Garbage Collector

1. Introduction

This guide is intended to aid advanced Go users in better understanding their application costs by providing insights into the Go garbage collector. It also provides guidance on how Go users may use these insights to improve their applications' resource utilization. It does not assume any knowledge of garbage collection, but does assume familiarity with the Go programming language.

The Go language takes responsibility for arranging the storage of Go values; in most cases, a Go developer need not care about where these values are stored, or why, if at all. In practice, however, these values often need to be stored in computer physical memory and physical memory is a finite resource. Because it is finite, memory must be managed carefully and recycled in order to avoid running out of it while executing a Go program. It's the job of a Go implementation to allocate and recycle memory as needed.

Another term for automatically recycling memory is garbage collection. At a high level, a garbage collector (or GC, for short) is a system that recycles memory on behalf of the application by identifying which parts of memory are no longer needed. The Go standard toolchain provides a runtime library that ships with every application, and this runtime library includes a garbage collector.

Note that the existence of a garbage collector as described by this guide is not guaranteed by the Go specification, only that the underlying storage for Go values is managed by the language itself. This omission is intentional and enables the use of radically different memory management techniques.

Therefore, this guide is about a specific implementation of the Go programming language and may not apply to other implementations. Specifically, this guide applies to the standard toolchain (the gc Go compiler and tools). Gccgo and Gollvm both use a very similar GC implementation, so many of the same concepts apply, but details may vary.

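Furthermore, this guide is a living document and will change over time to best reflect the latest release of Go. It currently describes the garbage collector as of Go 1.19.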

2. Where Go Values Live

Before we dive into the GC, let's first discuss the memory that doesn't need to be managed by the GC.

For instance, non-pointer Go values stored in local variables will likely not be managed by the Go GC at all, and Go will instead arrange for memory to be allocated that's tied to the lexical scope in which it's created. In general, this is more efficient than relying on the GC, because the Go compiler is able to predetermine when that memory may be freed and emit machine instructions that clean up. Typically, we refer to allocating memory for Go values this way as "stack allocation," because the space is stored on the goroutine stack.

Go values whose memory cannot be allocated this way, because the Go compiler cannot determine its lifetime, are said to escape to the heap. "The heap" can be thought of as a catch-all for memory allocation, for when Go values need to be placed somewhere. The act of allocating memory on the heap is typically referred to as "dynamic memory allocation" because both the compiler and the runtime can make very few assumptions as to how this memory is used and when it can be cleaned up. That's where a GC comes in: it's a system that specifically identifies and cleans up dynamic memory allocations.

There are many reasons why a Go value might need to escape to the heap. One reason could be that its size is dynamically determined. Consider for instance the backing array of a slice whose initial size is determined by a variable, rather than a constant. Note that escaping to the heap must also be transitive: if a reference to a Go value is written into another Go value that has already been determined to escape, that value must also escape.

Whether a Go value escapes or not is a function of the context in which it is used and the Go compiler's escape analysis algorithm. It would be fragile and difficult to try to enumerate precisely when values escape: the algorithm itself is fairly sophisticated and changes between Go releases. For more details on how to identify which values escape and which do not, see the section on eliminating heap allocations.
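
As a rough illustration of the cases described above, the sketch below contrasts a value that can stay on the stack with two that typically escape. The type and function names are hypothetical. Building with `go build -gcflags=-m` prints the compiler's escape decisions, though the exact messages and outcomes vary between Go releases and with inlining.

```go
package main

import "fmt"

// point is a small value type used only for illustration.
type point struct{ x, y int }

// sumLocal keeps p in a local variable and never lets a reference outlive
// the call, so the compiler can typically place p on the goroutine stack.
func sumLocal() int {
	p := point{x: 1, y: 2}
	return p.x + p.y
}

// newPoint returns a pointer to p, so p must outlive the call. The compiler
// typically reports this as "moved to heap: p".
func newPoint() *point {
	p := point{x: 1, y: 2}
	return &p
}

// makeBuffer allocates a backing array whose size is only known at run time,
// which is the dynamically sized case mentioned in the text.
func makeBuffer(n int) []byte {
	return make([]byte, n)
}

func main() {
	fmt.Println(sumLocal(), *newPoint(), len(makeBuffer(64)))
}
```

Because the outcome depends on the calling context, the same function may or may not allocate on the heap once it is inlined into a caller.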

3. Tracing Garbage Collection

Garbage collection may refer to many different methods of automatically recycling memory; for example, reference counting. In the context of this document, garbage collection refers to tracing garbage collection, which identifies in-use, so-called live, objects by following pointers transitively.

Let's define these terms more rigorously.

  • Object—An object is a dynamically allocated piece of memory that contains one or more Go values.

  • Pointer—A memory address that references any value within an object. This naturally includes Go values of the form *T, but also includes parts of built-in Go values. Strings, slices, channels, maps, and interface values all contain memory addresses that the GC must trace.

Together, objects and pointers to other objects form the object graph. To identify live memory, the GC walks the object graph starting at the program's roots, pointers that identify objects that are definitely in-use by the program. Two examples of roots are local variables and global variables. The process of walking the object graph is referred to as scanning.

This basic algorithm is common to all tracing GCs. Where tracing GCs differ is what they do once they discover memory is live. Go's GC uses the mark-sweep technique, which means that in order to keep track of its progress, the GC also marks the values it encounters as live. Once tracing is complete, the GC then walks over all memory in the heap and makes all memory that is not marked available for allocation. This process is called sweeping.

One alternative technique you may be familiar with is to actually move the objects to a new part of memory and leave behind a forwarding pointer that is later used to update all the application's pointers. We call a GC that moves objects in this way a moving GC; Go has a non-moving GC.
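
To make the mark-sweep idea concrete, here is a toy simulation over a hand-built object graph. It is only a sketch of the technique, not how the Go runtime is implemented: the `object` type, the explicit mark bit, and the function names are all assumptions made for illustration.

```go
package main

import "fmt"

// object is a toy heap object: a name plus pointers to other objects.
type object struct {
	name   string
	refs   []*object
	marked bool
}

// mark walks the object graph from the roots and marks every object it
// can reach by following pointers transitively.
func mark(roots []*object) {
	stack := append([]*object(nil), roots...)
	for len(stack) > 0 {
		o := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if o == nil || o.marked {
			continue
		}
		o.marked = true
		stack = append(stack, o.refs...)
	}
}

// sweep walks the whole heap, keeps marked objects (clearing their mark for
// the next cycle), and reports the names of the unmarked ones as freed.
func sweep(heap []*object) (live []*object, freed []string) {
	for _, o := range heap {
		if o.marked {
			o.marked = false
			live = append(live, o)
		} else {
			freed = append(freed, o.name)
		}
	}
	return live, freed
}

// demoHeap builds four objects: a -> b is reachable from the root a,
// while c and d only point at each other.
func demoHeap() (heap, roots []*object) {
	a, b := &object{name: "a"}, &object{name: "b"}
	c, d := &object{name: "c"}, &object{name: "d"}
	a.refs = []*object{b}
	c.refs = []*object{d}
	d.refs = []*object{c}
	return []*object{a, b, c, d}, []*object{a}
}

func main() {
	heap, roots := demoHeap()
	mark(roots)
	live, freed := sweep(heap)
	fmt.Println(len(live), freed) // prints: 2 [c d]
}
```

Note that `c` and `d` reference each other but are unreachable from the roots, so tracing frees both of them.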

4. The GC cycle

Because the Go GC is a mark-sweep GC, it broadly operates in two phases: the mark phase, and the sweep phase. While this statement might seem tautological, it contains an important insight: it's not possible to release memory back to be allocated until all memory has been traced, because there may still be an un-scanned pointer keeping an object alive. As a result, the act of sweeping must be entirely separated from the act of marking. Furthermore, the GC may also not be active at all, when there's no GC-related work to do. The GC continuously rotates through these three phases of sweeping, off, and marking in what's known as the GC cycle. For the purposes of this document, consider the GC cycle starting with sweeping, turning off, then marking.

The next few sections will focus on building intuition for the costs of the GC to aid users in tweaking GC parameters for their own benefit.
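
One simple way to watch GC cycles complete in a running program is to read the runtime's cycle counter before and after forcing a collection. The helper names below are hypothetical; `runtime.GC`, `runtime.ReadMemStats`, and the `NumGC` field are part of the standard `runtime` package.

```go
package main

import (
	"fmt"
	"runtime"
)

// completedCycles reports how many GC cycles the runtime has completed so far.
func completedCycles() uint32 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.NumGC
}

// forceCycle runs one collection and returns the cycle count before and after.
func forceCycle() (before, after uint32) {
	before = completedCycles()
	runtime.GC() // blocks until the collection is complete
	after = completedCycles()
	return before, after
}

func main() {
	before, after := forceCycle()
	fmt.Printf("GC cycles completed: before=%d after=%d\n", before, after)
}
```

Forcing a collection like this is useful for experiments, but it bypasses the pacing decisions that the rest of this guide discusses.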

5. Understanding costs

The GC is inherently a complex piece of software built on even more complex systems. It's easy to become mired in detail when trying to understand the GC and tweak its behavior. This section is intended to provide a framework for reasoning about the cost of the Go GC and tuning parameters.

To begin with, consider this model of GC cost based on three simple axioms.

  1. The GC involves only two resources: CPU time, and physical memory.

  2. The GC's memory costs consist of live heap memory, new heap memory allocated before the mark phase, and space for metadata that, even if proportional to the previous costs, is small in comparison.

    Note: live heap memory is memory that was determined to be live by the previous GC cycle, while new heap memory is any memory allocated in the current cycle, which may or may not be live by the end.

  3. The GC's CPU costs are modeled as a fixed cost per cycle, and a marginal cost that scales proportionally with the size of the live heap.

    Note: Asymptotically speaking, sweeping scales worse than marking and scanning, as it must perform work proportional to the size of the whole heap, including memory that is determined to be not live (i.e. "dead"). However, in the current implementation sweeping is so much faster than marking and scanning that its associated costs can be ignored in this discussion.

This model is simple but effective: it accurately categorizes the dominant costs of the GC. However, this model says nothing about the magnitude of these costs, nor how they interact. To model that, consider the following situation, referred to from here on as the steady-state.

  • The rate at which the application allocates new memory (in bytes per second) is constant.

    Note: it's important to understand that this allocation rate is completely separate from whether or not this new memory is live. None of it could be live, all of it could be live, or some of it could be live. (On top of this, some old heap memory could also die, so it's not necessarily the case that if that memory is live, the live heap size grows.)

    To put this more concretely, consider a web service that allocates 2 MiB of total heap memory for each request that it handles. During the request, at most 512 KiB of that 2 MiB stays live while the request is in flight, and when the service is finished handling the request, all that memory dies. Now, for the sake of simplicity suppose each request takes about 1 second to handle end-to-end. Then, a steady stream of requests, say 100 requests per second, results in an allocation rate of 200 MiB/s and a 50 MiB peak live heap.

  • The application's object graph looks roughly the same each time (objects are similarly sized, there's a roughly constant number of pointers, the maximum depth of the graph is roughly constant).

    Another way to think about this is that the marginal costs of GC are constant.

    Note: the steady-state may seem contrived, but it's representative of the behavior of an application under some constant workload. Naturally, workloads can change even while an application is executing, but typically application behavior looks like a bunch of these steady-states strung together with some transient behavior in between.

    Note: the steady-state makes no assumptions about the live heap. It may be growing with each subsequent GC cycle, it may shrink, or it may stay the same. However, trying to encompass all of these situations in the explanations to follow is tedious and not very illustrative, so the guide will focus on examples where the live heap remains constant. The GOGC section explores the non-constant live heap scenario in some more detail.
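
The web service arithmetic from the steady-state description above can be checked with a few lines of Go. The function names are hypothetical, and the calculation assumes, as the text does, that each request is in flight for about one second.

```go
package main

import "fmt"

const (
	KiB = 1 << 10
	MiB = 1 << 20
)

// allocRate returns the allocation rate in bytes per second for a service
// handling reqPerSec requests that each allocate bytesPerReq in total.
func allocRate(reqPerSec, bytesPerReq int64) int64 {
	return reqPerSec * bytesPerReq
}

// peakLive returns the peak live heap in bytes: reqPerSec*latencySec requests
// are in flight at once, and each keeps at most liveBytesPerReq live.
func peakLive(reqPerSec, latencySec, liveBytesPerReq int64) int64 {
	return reqPerSec * latencySec * liveBytesPerReq
}

func main() {
	rate := allocRate(100, 2*MiB)
	live := peakLive(100, 1, 512*KiB)
	fmt.Printf("allocation rate: %d MiB/s\n", rate/MiB) // 200 MiB/s
	fmt.Printf("peak live heap:  %d MiB\n", live/MiB)   // 50 MiB
}
```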

In the steady-state while the live heap size is constant, every GC cycle is going to look identical in the cost model as long as the GC executes after the same amount of time has passed. That's because in that fixed amount of time, with a fixed rate of allocation by the application, a fixed amount of new heap memory will be allocated. So with the live heap size constant, and that new heap memory constant, memory use is always going to be the same. And because the live heap is the same size, the marginal GC CPU costs will be the same, and the fixed costs will be incurred at some regular interval.

Now consider if the GC were to shift the point at which it runs later in time. Then, more memory would be allocated but each GC cycle would still incur the same CPU cost. However, over some other fixed window of time fewer GC cycles would finish, resulting in a lower overall CPU cost. The opposite would be true if the GC decided to start earlier in time: less memory would be allocated and CPU costs would be incurred more often.

This situation represents the fundamental trade-off between CPU time and memory that a GC can make, controlled by how often the GC actually executes. In other words, the trade-off is entirely defined by GC frequency.

One more detail remains to be defined, and that's when the GC should decide to start. Note that this directly sets the GC frequency in any particular steady-state, defining the trade-off. In Go, deciding when the GC should start is the main parameter which the user has control over.
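
The trade-off can be sketched numerically using the cost model from this section. In the example below, the `workload` type, its field names, and the per-cycle cost figures are made-up assumptions chosen only to show the shape of the relationship: letting more new heap accumulate before each cycle reduces the number of cycles (and GC CPU time) in a fixed window, at the price of a larger peak heap.

```go
package main

import "fmt"

// workload describes a hypothetical steady-state application.
type workload struct {
	allocRate  float64 // new heap memory allocated, in MiB per second
	liveHeap   float64 // constant live heap size, in MiB
	fixedCost  float64 // assumed fixed GC CPU cost per cycle, in seconds
	costPerMiB float64 // assumed marginal GC CPU cost per MiB of live heap, in seconds
}

// estimate returns the number of GC cycles, the total GC CPU time, and the
// peak heap size over a window (in seconds) when the GC starts a cycle after
// every newHeap MiB of allocation.
func (w workload) estimate(newHeap, window float64) (cycles, cpu, peak float64) {
	cycles = w.allocRate * window / newHeap
	cpu = cycles * (w.fixedCost + w.costPerMiB*w.liveHeap)
	peak = w.liveHeap + newHeap
	return cycles, cpu, peak
}

func main() {
	w := workload{allocRate: 200, liveHeap: 50, fixedCost: 0.001, costPerMiB: 0.0005}
	for _, newHeap := range []float64{50, 100, 200} {
		cycles, cpu, peak := w.estimate(newHeap, 10)
		fmt.Printf("new heap %3.0f MiB: %2.0f cycles, %.2f s GC CPU, %3.0f MiB peak\n",
			newHeap, cycles, cpu, peak)
	}
}
```

In this model the cost of each cycle is unchanged, so doubling the new heap per cycle halves the GC CPU time over the window.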

6. GOGC

At a high level, GOGC determines the trade-off between GC CPU and memory.

It works by determining the target heap size after each GC cycle, a target value for the total heap size in the next cycle. The GC's goal is to finish a collection cycle before the total heap size exceeds the target heap size. Total heap size is defined as the live heap size at the end of the previous cycle, plus any new heap memory allocated by the application since the previous cycle. Meanwhile, target heap memory is defined as:


Target heap memory = Live heap + (Live heap + GC roots) * GOGC / 100

As an example, consider a Go program with a live heap size of 8 MiB, 1 MiB of goroutine stacks, and 1 MiB of pointers in global variables. Then, with a GOGC value of 100, the amount of new memory that will be allocated before the next GC runs will be 10 MiB, or 100% of the 10 MiB of work, for a total heap footprint of 18 MiB. With a GOGC value of 50, then it'll be 50%, or 5 MiB. With a GOGC value of 200, it'll be 200%, or 20 MiB.
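
The formula and the worked example above translate directly into a small function. The name `targetHeap` is hypothetical, and the 2 MiB of GC roots is the sum of the 1 MiB of goroutine stacks and 1 MiB of global pointers from the example.

```go
package main

import "fmt"

const MiB = 1 << 20

// targetHeap applies the formula from the text:
// target = live heap + (live heap + GC roots) * GOGC / 100.
// All sizes are in bytes.
func targetHeap(liveHeap, gcRoots, gogc int64) int64 {
	return liveHeap + (liveHeap+gcRoots)*gogc/100
}

func main() {
	const live, roots = 8 * MiB, 2 * MiB
	for _, gogc := range []int64{50, 100, 200} {
		target := targetHeap(live, roots, gogc)
		fmt.Printf("GOGC=%d: %d MiB new heap, %d MiB target\n",
			gogc, (target-live)/MiB, target/MiB)
	}
}
```

For GOGC values of 50, 100, and 200 this yields 5, 10, and 20 MiB of new heap, matching the figures in the example, for targets of 13, 18, and 28 MiB.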

Note: GOGC includes the root set only as of Go 1.18. Previously, it would only count the live heap. Often, the amount of memory in goroutine stacks is quite small and the live heap size dominates all other sources of GC work, but in cases where programs had hundreds of thousands of goroutines, the GC was making poor judgements.

The heap target controls GC frequency: the bigger the target, the longer the GC can wait to start another mark phase and vice versa. While the precise formula is useful for making estimates, it's best to think of GOGC in terms of its fundamental purpose: a parameter that picks a point in the GC CPU and memory trade-off. The key takeaway is that doubling GOGC will double heap memory overheads and roughly halve GC CPU cost, and vice versa. (To see a full explanation as to why, see the appendix.)

Note: the target heap size is just a target, and there are several reasons why the GC cycle might not finish right at that target. For one, a large enough heap allocation can simply exceed the target. However, other reasons appear in GC implementations that go beyond the GC model this guide has been using thus far. For some more detail, see the latency section, but the complete details may be found in the additional resources.

GOGC may be configured through either the GOGC environment variable (which all Go programs recognize), or through the SetGCPercent API in the runtime/debug package.

Note that GOGC may also be used to turn off the GC entirely (provided the memory limit does not apply) by setting GOGC=off or calling SetGCPercent(-1). Conceptually, this setting is equivalent to setting GOGC to a value of infinity, as the amount of new memory before a GC is triggered is unbounded.
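
A minimal sketch of configuring GOGC from within a program follows. `debug.SetGCPercent` returns the previous setting, which makes it easy to restore; the `setGOGC` wrapper is a hypothetical helper added only for illustration.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// setGOGC changes the GC percentage at run time and returns the previous
// value. A negative percent turns the GC off, like GOGC=off.
func setGOGC(percent int) int {
	return debug.SetGCPercent(percent)
}

func main() {
	// The initial value comes from the GOGC environment variable, if set.
	prev := setGOGC(200)
	fmt.Println("GOGC at startup:", prev)

	// Turn the GC off entirely; the call reports the value it replaced.
	fmt.Println("GOGC before turning off:", setGOGC(-1))

	// Restore the original setting.
	setGOGC(prev)
}
```

The same effect is available without code changes by running the program with the environment variable set, for example `GOGC=200 ./myprogram`, where `myprogram` stands in for your own binary.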

To better understand everything we've discussed so far, try out the interactive visualization below that is built on the GC cost model discussed earlier. This visualization depicts the execution of some program whose non-GC work takes 10 seconds of CPU time to complete. In the first second it performs some initialization step (growing its live heap) before settling into a steady-state. The application allocates 200 MiB in total, with 20 MiB live at a time. It assumes that the only relevant GC work to complete comes from the live heap, and that (unrealistically) the application uses no additional memory.

Use the slider to adjust the value of GOGC to see how the application responds in terms of total duration and GC overhead. Each GC cycle ends while the new heap drops to zero. The time taken while the new heap drops to zero is the combined time for the mark phase for cycle N, and the sweep phase for the cycle N+1. Note that this visualization (and all the visualizations in this guide) assume the application is paused while the GC executes, so GC CPU costs are fully represented by the time it takes for new heap memory to drop to zero. This is only to make visualization simpler; the same intuition still applies. The X axis shifts to always show the full CPU-time duration of the program. Notice that additional CPU time used by the GC increases the overall duration.

Notice that the GC always incurs some CPU and peak memory overhead. As GOGC increases, CPU overhead decreases, but peak memory increases proportionally to the live heap size. As GOGC decreases, the peak memory requirement decreases at the expense of additional CPU overhead.

Note: the graph displays CPU time, not wall-clock time to complete the program. If the program runs on 1 CPU and fully utilizes its resources, then these are equivalent. A real-world program likely runs on a multi-core system and does not 100% utilize the CPUs at all times. In these cases the wall-time impact of the GC will be lower.

Note: the Go GC has a minimum total heap size of 4 MiB, so if the GOGC-set target is ever below that, it gets rounded up. The visualization reflects this detail.

Here's another example that's a little bit more dynamic and realistic. Once again, the application takes 10 CPU-seconds to complete without the GC, but the steady-state allocation rate increases dramatically half-way through, and the live heap size shifts around a bit in the first phase. This example demonstrates how the steady-state might look when the live heap size is actually changing, and how a higher allocation rate leads to more frequent GC cycles.

7. Memory limit

Until Go 1.19, GOGC was the sole parameter that could be used to modify the GC's behavior. While it works great as a way to set a trade-off, it doesn't take into account that available memory is finite. Consider what happens when there's a transient spike in the live heap size: because the GC will pick a total heap size proportional to that live heap size, GOGC must be configured for the peak live heap size, even if in the usual case a higher GOGC value provides a better trade-off.
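
To see why a transient spike forces a conservative GOGC, consider a hypothetical service with 60 MiB of available memory, a 20 MiB live heap in the usual case, and a spike to 40 MiB. The sketch below ignores GC roots for simplicity; the function names and all the numbers are assumptions made for illustration.

```go
package main

import "fmt"

// maxGOGC returns the largest GOGC value for which the total heap
// (live + live*GOGC/100, ignoring GC roots) stays within the available
// memory. It returns 0 when the live heap is empty or does not fit.
func maxGOGC(availableMiB, liveMiB int) int {
	if liveMiB <= 0 || liveMiB >= availableMiB {
		return 0
	}
	return (availableMiB - liveMiB) * 100 / liveMiB
}

// totalHeap returns the total heap size in MiB for a live heap and GOGC value.
func totalHeap(liveMiB, gogc int) int {
	return liveMiB + liveMiB*gogc/100
}

func main() {
	const available, steadyLive, spikeLive = 60, 20, 40

	fmt.Println("GOGC that fits the usual case:", maxGOGC(available, steadyLive)) // 200
	safe := maxGOGC(available, spikeLive)
	fmt.Println("GOGC that fits the spike:", safe) // 50
	fmt.Println("usual-case heap at that GOGC (MiB):", totalHeap(steadyLive, safe)) // 30
}
```

Sizing GOGC for the spike leaves half of the available memory unused in the usual case, while the GC runs more often than it would need to.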

The visualization below demonstrates this transient heap spike situation.
