Debugging Go Redis
Connection pool size
To improve performance, go-redis automatically manages a pool of network connections (sockets). By default, the pool size is 10 connections per available CPU as reported by runtime.GOMAXPROCS. In most cases, that is more than enough, and tweaking it rarely helps. If needed, you can set the pool size explicitly:
rdb := redis.NewClient(&redis.Options{
    PoolSize: 1000,
})
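To verify how the pool is actually used, you can inspect pool statistics at runtime. A minimal sketch, assuming rdb is a *redis.Client; the field names follow go-redis's PoolStats struct:

stats := rdb.PoolStats()
// Timeouts counts commands that failed to get a connection within
// PoolTimeout; a growing value points at pool exhaustion.
fmt.Printf("hits=%d misses=%d timeouts=%d total=%d idle=%d\n",
    stats.Hits, stats.Misses, stats.Timeouts, stats.TotalConns, stats.IdleConns)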
redis: connection pool timeout
You can get that error when there are no free connections in the pool for the Options.PoolTimeout duration. If you are using redis.PubSub or redis.Conn, make sure to properly release PubSub/Conn resources by calling the Close method when they are no longer needed.
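For example, a subscription should always be closed when you are done with it. A minimal sketch, assuming rdb and ctx already exist; the channel name is hypothetical:

pubsub := rdb.Subscribe(ctx, "mychannel")
// Close releases the connection back to the pool; without it the
// connection stays checked out and can eventually exhaust the pool.
defer pubsub.Close()

for msg := range pubsub.Channel() {
    fmt.Println(msg.Channel, msg.Payload)
}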
You can also get that error when Redis processes commands too slowly and all connections in the pool are blocked for more than the PoolTimeout duration.
Timeouts
Even if you are using context.Context deadlines, do NOT disable DialTimeout, ReadTimeout, and WriteTimeout, because go-redis executes some background checks without using a context and instead relies on connection timeouts.
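For example, you can keep the connection timeouts enabled and still set per-command deadlines with a context. A minimal sketch; the durations are illustrative, not recommendations:

rdb := redis.NewClient(&redis.Options{
    DialTimeout:  5 * time.Second,
    ReadTimeout:  3 * time.Second,
    WriteTimeout: 3 * time.Second,
})

// A context deadline applies per command, on top of the
// connection-level timeouts above.
ctx, cancel := context.WithTimeout(context.Background(), time.Second)
defer cancel()
if err := rdb.Ping(ctx).Err(); err != nil {
    // handle the error
}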
If you are using cloud providers like AWS or Google Cloud, don't use timeouts smaller than 1 second. Such small timeouts work well most of the time but fail miserably when the cloud is slower than usual. See Go Context timeouts can be harmful for details.
Also see Context deadline exceeded if you are getting that error.
Large number of open connections
Under high load, some commands will time out and go-redis will close such connections, because they can still receive some data later and can't be reused. Closed connections are first put into the TIME_WAIT state and remain there for double the maximum segment lifetime, which is usually 1 minute:
cat /proc/sys/net/ipv4/tcp_fin_timeout
60
To cope with that, you can increase read/write timeouts or upgrade your servers to handle more traffic. You can also increase the maximum number of open connections, but that will not make your servers or network faster.
Also see Coping with the TCP TIME-WAIT for some advice.
Pipelines
Because go-redis spends most of its time writing to, reading from, and waiting on connections, you can improve performance by sending multiple commands at once using pipelines.
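For example, the following sketch sends two commands in a single round trip, assuming rdb and ctx already exist; the key name is hypothetical:

cmds, err := rdb.Pipelined(ctx, func(pipe redis.Pipeliner) error {
    pipe.Incr(ctx, "counter")
    pipe.Expire(ctx, "counter", time.Hour)
    return nil
})
if err != nil {
    panic(err)
}
// Results become available once the whole pipeline has executed.
fmt.Println(cmds[0].(*redis.IntCmd).Val())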
Cache
If your app logic does not allow using pipelines, consider adding a local in-process cache for the most popular operations, for example, using TinyLFU.
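As a sketch, the go-redis/cache package combines Redis with a local TinyLFU cache; the API below follows that package's README, so check it against the version you use:

mycache := cache.New(&cache.Options{
    Redis: rdb,
    // Keep up to 1000 of the most popular items in-process for a minute.
    LocalCache: cache.NewTinyLFU(1000, time.Minute),
})

err := mycache.Set(&cache.Item{
    Ctx:   ctx,
    Key:   "mykey",
    Value: "hello",
    TTL:   time.Hour,
})

var value string
err = mycache.Get(ctx, "mykey", &value) // served locally when possible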
Hardware
Make sure that your servers have low network latency and fast CPUs with large caches. If you have multiple CPU cores, consider running multiple Redis instances on a single server.
See Factors impacting Redis performance for more details.
Sharding
You can also split data across multiple Redis instances so that each instance contains a subset of the keys. This way the load is spread across multiple servers and you can increase performance by adding more servers.
Redis Ring is a good option if you are using Redis for caching and can afford losing some data. Otherwise, you can try Redis Cluster.
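For example, a Ring client shards keys across several instances using consistent hashing. A minimal sketch; the shard names and addresses are hypothetical:

rdb := redis.NewRing(&redis.RingOptions{
    Addrs: map[string]string{
        "shard1": "localhost:7000",
        "shard2": "localhost:7001",
        "shard3": "localhost:7002",
    },
})
// Each key is routed to one shard based on a hash of the key.
_ = rdb.Set(ctx, "mykey", "value", 0).Err()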
Monitoring
Monitoring the performance of a Redis database is crucial for maintaining the overall health, efficiency, and reliability of your system. Proper performance monitoring helps identify and resolve potential issues before they lead to service disruptions or performance degradation.
Uptrace is an OpenTelemetry backend that supports distributed tracing, metrics, and logs. You can use it to monitor applications and troubleshoot issues.
Uptrace comes with an intuitive query builder, rich dashboards, alerting rules with notifications, and integrations for most languages and frameworks.
Uptrace can process billions of spans and metrics on a single server and allows you to monitor your applications at 10x lower cost.
In just a few minutes, you can try Uptrace by visiting the cloud demo (no login required) or running it locally with Docker. The source code is available on GitHub.