This chapter is for the system administration course only
This section covers:
- Best practices
Perhaps the most important aspect of tuning Varnish is writing effective VCL code. For now, however, we will focus on tuning Varnish for your hardware, operating system and network. To be able to do that, knowledge of Varnish architecture is helpful.
It is important to know the internal architecture of Varnish for two reasons. First, the architecture is chiefly responsible for the performance, and second, it influences how you integrate Varnish in your own architecture.
There are several aspects of the design that were unique to Varnish when it was originally implemented. The aim of Varnish is truly good solutions, whether that means reusing ancient ideas or coming up with something radically different.
Fig. 15 shows a block diagram of the Varnish architecture. The diagram shows the data flow between the principal parts of Varnish.
The main block is the Manager process, which is contained in the
varnishd binary program.
The task of the Manager process is to delegate tasks, including caching, to child processes.
The Manager process ensures that there is always a process for each task.
The main driver for these design decisions is security, which is explained in Security barriers in Varnish: https://www.varnish-cache.org/docs/trunk/phk/barriers.html.
The Manager’s command line interface (CLI) is accessible through:
1) varnishadm, as explained in The Management Interface varnishadm section,
2) the Varnish Agent vagent2, or
3) the Varnish Administration Console (VAC) (via vagent2).
The Varnish Agent vagent2 is an open source HTTP REST interface that exposes
varnishd services to allow remote control and monitoring.
vagent2 offers a web UI as shown in Fig. 16, but you can write your own UI since vagent2 is an open interface.
Some features of vagent2 are:
- VCL uploading, downloading, persisting (storing to disk).
- parameter viewing, storing (not persisting yet)
- show/clear of panic messages
- start/stop/status of varnishd
- varnishstat output in JSON format
For more information about vagent2 and installation instructions, please visit https://github.com/varnish/vagent2.
Varnish Software has a commercial offering of a fully functional web UI called Varnish Administration Console (VAC). For more information about VAC, refer to the Varnish Administration Console (VAC) section.
The Parent Process: The Manager¶
The Manager process is owned by the root user, and its main functions are:
- apply configuration changes (from VCL files and parameters)
- delegate tasks to child processes: the Cacher and the VCL to C Compiler (VCC)
- monitor Varnish
- provide a Varnish command line interface (CLI)
- initialize the child process: the Cacher
The Manager checks every few seconds whether the Cacher is still there.
If the Manager does not get a reply within a given interval defined in
ping_interval, the Manager kills the Cacher and starts it up again.
This automatic restart also happens if the Cacher exits unexpectedly, for example, from a segmentation fault or assert error.
You can ping the Cacher manually by executing varnishadm ping.
Automatic restart of child processes is a resilience property of Varnish.
This property ensures that even if Varnish contains a critical bug that crashes the child, the child starts up again usually within a few seconds.
You can toggle this property using the auto_restart parameter.
Even if you do not perceive a lengthy service downtime, you should check whether the Varnish child is being restarted.
This is important, because each child restart empties the cache, introducing extra load time while varnishd refills it.
Automatic restarts are logged to syslog.
To verify that the child process is not being restarted, you can also check its lifetime with the MAIN.uptime counter in varnishstat.
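For example, you can read that counter directly from the shell (a sketch; it assumes a running Varnish instance and varnishstat in your PATH):

```shell
# Print the child's uptime in seconds; if this value resets to a low
# number without an administrative restart, the child is crashing.
varnishstat -1 -f MAIN.uptime
```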
Varnish Software and the Varnish community at large occasionally get requests for assistance in performance tuning Varnish that turn out to be crash-issues.
The Child Process: The Cacher¶
Since the Cacher listens on public IP addresses and known ports, it is exposed to evil clients. Therefore, for security reasons, this child process is owned by an unprivileged user, and it has no backwards communication to its parent, the Manager.
The main functions of the Cacher are:
- listen to client requests
- manage worker threads
- store caches
- log traffic
- update counters for statistics
Varnish uses workspaces to reduce the contention between each thread when they need to acquire or modify memory. There are multiple workspaces, but the most important one is the session workspace, which is used to manipulate session data. An example is changing www.example.com to example.com before it is entered into the cache, to reduce the number of duplicates.
It is important to remember that even if you have 5 MB of session workspace and are using 1000 threads, the actual memory usage is not 5 GB. The virtual memory usage will indeed be 5GB, but unless you actually use the memory, this is not a problem. Your memory controller and operating system will keep track of what you actually use.
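The arithmetic above can be sketched quickly in the shell (the 5 MB workspace and 1000 threads are the example figures from the text, not defaults):

```shell
# Virtual address space reserved for session workspaces:
# per-thread workspace times the number of threads.
workspace_mb=5
threads=1000
echo "virtual reservation: $(( workspace_mb * threads / 1000 )) GB"
# Only the pages a thread actually touches consume physical memory.
```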
To communicate with the rest of the system, the child process uses the VSL accessible from the file system. This means that if a thread needs to log something, all it has to do is to grab a lock, write to a memory area and then free the lock. In addition to that, each worker thread has a cache for log-data to reduce lock contention. We will discuss more about the Threading Model later in this chapter.
The log file is usually about 80 MB and split in two parts: the first part is counters, the second part is request data. To view the actual data, a number of tools exist that parse the VSL.
Since the log-data is not meant to be written to disk in its raw form, Varnish can afford to be very verbose. You then use one of the log-parsing tools to extract the piece of information you want – either to store it permanently or to monitor Varnish in real-time.
If something goes wrong in the Cacher, it logs a detailed panic message to syslog.
For testing, you can induce a panic in varnishd by issuing the command varnishadm debug.panic.worker or by pressing the Induce Panic button in the Varnish Agent web interface.
Command to print the VCL code compiled to C and exit:

varnishd -C -f <vcl_filename>

This is useful to check whether your VCL code compiles correctly.
Configuring the caching policies of Varnish is done in the Varnish Configuration Language (VCL).
Your VCL is then translated by the VCC process to C, which is compiled by a normal C compiler – typically gcc – and linked into the running Varnish instance.
Since the VCL compilation is done outside of the child process, there is no risk of affecting the running Varnish instance by accidentally loading an ill-formatted VCL.
As a result, changing configuration while Varnish is running is very cheap. Policies of the new VCL take effect immediately. However, objects cached under an older configuration may persist until there are no more references to them or the new configuration acts on them.
A compiled VCL file is kept around until you restart Varnish completely, or until you issue
vcl.discard from the management interface.
You can only discard compiled VCL files after all references to them are gone.
You can see the number of references to each compiled VCL in the output of the vcl.list command.
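A typical management-interface session around the VCL life cycle might look like this (a sketch; the labels mycfg_v1/mycfg_v2 are arbitrary and output formats vary between Varnish versions):

```shell
varnishadm vcl.load mycfg_v2 /etc/varnish/default.vcl  # compile and load a new VCL
varnishadm vcl.use mycfg_v2                            # make it the active VCL
varnishadm vcl.list                                    # shows status and reference counts
varnishadm vcl.discard mycfg_v1                        # allowed only once its references are gone
```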
- The storage option -s defines the size of your cache and where it is stored.
Use varnishd -s followed by one of the following methods to allocate space for the cache:
- malloc: memory allocated with malloc()
- file: a file on disk mapped into memory
- mse: Varnish Massive Storage Engine (MSE), in Varnish Plus only
The -s <malloc[,size]> option calls malloc() to allocate memory space for every object that goes into the cache.
If the allocated space cannot fit in memory, the operating system automatically swaps the needed space to disk.
Varnish uses the jemalloc implementation. Although jemalloc emphasizes fragmentation avoidance, fragmentation still occurs. jemalloc's worst-case memory fragmentation is 20%, so expect up to that percentage of additional memory usage. In addition to memory fragmentation, you should consider an additional 5% overhead, as described later in this section.
Another option is -s file,<path>[,<size>]. This option creates a file on a filesystem to contain the entire cache. The operating system then maps the entire file into memory if possible. Note that the file storage method does not retain data when you stop or restart Varnish! For persistence, there is the -s persistent option. Its usage, however, is strongly discouraged, mainly because of consistency issues that might arise with it.
The Varnish Massive Storage Engine (MSE) option -s <mse,path[,path...]> is an improved storage method for Varnish Plus only. MSE's main improvements are decreased disk I/O load and lower storage fragmentation. MSE is designed to store and handle over 100 TB with persistence, which makes it very useful for video on demand setups. MSE uses a hybrid of two cache algorithms, least recently used (LRU) and least frequently used (LFU), to manage memory. Benchmarks show that this hybrid outperforms plain LRU.
MSE also implements a mechanism to eliminate internal fragmentation.
The latest version of MSE requires a bookkeeping file. The size of this bookkeeping file depends on the cache size. Cache sizes in the order of gigabytes require a bookkeeping file of around 1% of the storage size. Cache sizes in the order of terabytes should have a bookkeeping file size around 0.5% of storage size.
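Following the percentages above, a rough sizing sketch (the 100 GB and 10 TB cache sizes are arbitrary examples):

```shell
# Bookkeeping file: ~1% of cache size for GB-range caches,
# ~0.5% for TB-range caches.
cache_gb=100
echo "GB-range: $(( cache_gb / 100 )) GB of bookkeeping"

cache_tb=10
awk -v tb="$cache_tb" \
    'BEGIN { printf "TB-range: %.0f GB of bookkeeping\n", tb * 1024 * 0.005 }'
```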
For detailed instructions on how to configure MSE, please refer to the Varnish Plus documentation. For more details about its features and previous versions, please visit https://info.varnish-software.com/blog/varnish-mse-persistence.
When choosing a storage backend, use malloc if your cache will be contained entirely or mostly in memory. If your cache will exceed the available physical memory, you have two options: file or MSE. We recommend MSE, because it performs much better than the file storage backend.
There is a storage overhead in Varnish, so the actual memory footprint of Varnish exceeds what the -s argument specifies if the cache is full.
The current estimated overhead is 1kB per object.
For 1 million objects, that means 1GB extra memory usage.
This estimate might slightly vary between Varnish versions.
In addition to the overhead per object, Varnish requires memory to manage the cache and handle its own operation.
Our tests show that an estimate of 5% of overhead is accurate enough.
This overhead applies regardless of the chosen storage backend.
For more details about memory usage in Varnish, please refer to https://info.varnish-software.com/blog/understanding-varnish-cache-memory-usage.
As a rule of thumb: use malloc if the space you want to allocate fits in memory; if not, use mse or file. Remember that there is about 5% memory overhead, and do not forget to account for the memory lost to fragmentation with malloc, or the disk space for the bookkeeping file with mse.
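Putting the numbers from this section together, a back-of-the-envelope sketch for a malloc cache (the 4 GB size and 1 million objects are made-up inputs):

```shell
# Worst-case footprint ~ (cache size + ~1 kB per object)
#                        x 1.05 (operational overhead)
#                        x 1.20 (jemalloc worst-case fragmentation)
cache_gb=4
objects=1000000
awk -v c="$cache_gb" -v n="$objects" 'BEGIN {
    per_object_gb = n * 1024 / (1024 * 1024 * 1024)  # 1 kB per object
    printf "worst-case footprint: ~%.1f GB\n", (c + per_object_gb) * 1.05 * 1.20
}'
# prints: worst-case footprint: ~6.2 GB
```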
In the CLI:
- Do not fall for copy/paste tips
- Test parameters in the CLI, then store them in the configuration file
Varnish has many different parameters which can be adjusted to make
Varnish act better under specific workloads or with specific software and
hardware setups. They can all be viewed with param.show in the management CLI.
You can set parameters in two different ways. In varnishadm, use the command param.set <param> <value>. Alternatively, you can pass them at startup with varnishd -p param=value.
Remember that changes made in the management interface are not persistent. Therefore, unless you store your changes in a startup script, they will be lost when Varnish restarts.
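The two ways of setting a parameter can be sketched as follows (default_ttl is used purely as an example parameter; adapt paths and options to your installation):

```shell
# Runtime change via the management interface -- lost on restart:
varnishadm param.set default_ttl 300

# Persistent change: pass it on the varnishd command line,
# e.g. in your init script or systemd unit file:
varnishd -a :6081 -f /etc/varnish/default.vcl -p default_ttl=300
```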
The general advice with regards to parameters is to keep it simple. Most of the defaults are optimal. If you do not have a very specific need, it is generally better to use the default values.
A few debug commands exist in the CLI, which can be revealed with help -d.
These commands are meant exclusively for development or testing, and many of them are downright dangerous.
Parameters can also be configured via the Varnish Administration Console (VAC) as shown in the figure below.
Suggested values for system variables and Varnish parameters are installation specific
With or without user input
Available for Varnish Plus only
yum install varnishtuner
apt-get install varnishtuner
The biggest potential for improvement is outside Varnish. First and foremost in tuning the network stack and the TCP/IP connection handling.
Varnish Tuner is a program toolkit based on the experience and documentation we have built. The toolkit tries to gather as much information as possible from your installation and decides which parameters need tuning.
The tuning advice that the toolkit gives is specific to that system. The Varnish Tuner gathers information from the system it is running in. Based on that information, it suggests values for systems variables of your OS and parameters for your Varnish installation that can be beneficial to tune. Varnish Tuner includes the following information for each suggested system variable or parameter:
- current value
- suggested value
- text explaining why it is advised to be changed
By default, varnishtuner requires user input to produce its output. If you are not sure about the requested input, you can instruct varnishtuner not to suggest parameters that require user input; see its usage documentation for the relevant flag.
Varnish Tuner is valuable to both experts and non-experts. Varnish Tuner is available for Varnish Plus series only.
Copying Varnish Tuner suggestions to other systems might not be a good idea.
Varnish Tuner Persistence¶
The output of varnishtuner updates every time you introduce new input or execute a suggested command. However, the results of the suggested commands are not necessarily persistent, which means that they do not survive a reboot or a restart of Varnish Cache.
To make the tuning persistent, do the following:
- Specify the Varnish parameters in the configuration file.
- Specify the sysctl system variables in /etc/sysctl.conf.
To see the usage documentation of Varnish Tuner, execute:
Install Varnish Tuner¶
Ubuntu Trusty 14.04
Packages in our repositories are signed and distributed via HTTPS. You need to enable HTTPS support in the package manager and install our public key first:

apt-get install -y apt-transport-https
curl https://<username>:<password>@repo.varnish-software.com/GPG-key.txt | apt-key add -

Then add the Varnish Plus repository to your APT sources:

# Varnish Tuner
deb https://<username>:<password>@repo.varnish-software.com/ubuntu <distribution_codename> non-free

where <distribution_codename> is the codename of your Linux distribution, for example trusty or wheezy. Finally, install the package:

apt-get update
apt-get install varnishtuner
The instructions above install Varnish Tuner from our repositories on Ubuntu. Replace <username> and <password> with the credentials of your Varnish Plus subscription. If you do not know them, please send an email to our support address to recover them.
Red Hat Enterprise Linux 6
To install Varnish Tuner on RHEL 6, put the following lines into a repository file under /etc/yum.repos.d/:

[varnishtuner]
name=Varnishtuner
baseurl=https://<username>:<password>@repo.varnish-software.com/redhat/varnishtuner/el6
enabled=1
gpgcheck=0
- The child process runs multiple threads in two thread pools
- Threads accept new connections and delegate them
- One worker thread per client request – it is common to use hundreds of worker threads
- Expire-threads evict old content from the cache
| Thread name | Number of threads | Task |
|---|---|---|
| cache-worker | One per active connection | Handle requests |
| ban lurker | One | Clean bans |
| acceptor | One | Accept new connections |
| epoll/kqueue | Configurable, default: 2 | Manage thread pools |
| expire | One | Remove old content |
| backend poll | One per backend poll | Health checks |
- Thread pools can safely be ignored
- Start threads sooner rather than later
- Maximum and minimum values are per thread pool
| Parameter | Default | Description |
|---|---|---|
| thread_pool_add_delay | 0.000 [seconds] | Period of time to wait between subsequent thread creations. |
| thread_pool_destroy_delay | 1.000 [seconds] | Period of time to wait after destroying a thread before destroying another. |
| thread_pool_fail_delay | 0.200 [seconds] | Period of time to wait before retrying thread creation after a creation failure. |
| thread_pool_max | 5000 [threads] | Maximum number of worker threads per pool. |
| thread_pool_min | 100 [threads] | Minimum number of worker threads per pool. |
| thread_pool_stack | 48k [bytes] | Worker thread stack size. |
| thread_pool_timeout | 300.000 [seconds] | Period of time before idle threads are destroyed. |
| thread_pools | 2 [pools] | Number of worker thread pools. |
| thread_queue_limit | 20 [requests] | Permitted queue length per thread pool. |
| thread_stats_rate | 10 [requests] | Maximum number of jobs a worker thread may handle before it is forced to dump its accumulated stats into the global counters. |
| workspace_thread | 2k [bytes] | Bytes of auxiliary workspace per thread. |
When tuning Varnish, think about the expected traffic.
The most important thread setting is the number of cache-worker threads.
You may configure thread_pool_min and thread_pool_max. These parameters are per thread pool.
Although the Varnish threading model allows you to use multiple thread pools, we recommend that you do not modify this parameter. Based on our experience and tests, two thread pools are enough; in other words, the performance of Varnish does not increase when adding more than two pools.
If you run across tuning advice that suggests one thread pool per CPU core, rest assured that this is outdated advice. We recommend having at most two thread pools, but you may increase the number of threads per pool.
Details of Threading Parameters¶
- Default values have proved to be sufficient in most cases
- thread_pool_min and thread_pool_max are the most common threading parameters to tune
- Run extra threads to avoid creating them on demand
Varnish runs one thread per session, so the maximum number of threads is equal to the number of maximum sessions that Varnish can serve concurrently. If you seem to need more threads than the default, it is very likely that there is something wrong in your setup. Therefore, you should investigate elsewhere before you increase the maximum value.
You can observe whether the default values are enough by looking at thread counters such as MAIN.threads and MAIN.threads_limited in varnishstat.
Look at the counter over time, because it is fairly static right after startup.
When tuning the number of threads, thread_pool_min and thread_pool_max are the most important parameters.
Values of these parameters are per thread pool.
The thread_pools parameter is mainly used to calculate the total number of threads.
For the sake of keeping things simple, the current best practice is to leave
thread_pools at the default 2 [pools].
Varnish operates with multiple pools of threads. When a connection is accepted, it is delegated to one of these thread pools. The pool then either delegates the connection to an available thread, queues the request if none is available, or drops the connection if the queue is full. By default, Varnish uses 2 thread pools, and this has proven sufficient for even the busiest Varnish servers.
Varnish has the ability to spawn new worker threads on demand, and remove them once the load is reduced. This is mainly intended for traffic spikes. It’s a better approach to keep a few threads idle during regular traffic, than to run on a minimum amount of threads and constantly spawn and destroy threads as demand changes. As long as you are on a 64-bit system, the cost of running a few hundred threads extra is very low.
The thread_pool_min parameter defines how many threads run in each thread pool even when there is no load. thread_pool_max defines the maximum number of threads that may be used per thread pool.
That means that with the defaults of 100 [threads] minimum and 5000 [threads] maximum per pool, you have:
- at least 100 [threads] * 2 [pools] worker threads at any given time
- no more than 5000 [threads] * 2 [pools] worker threads ever
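The totals above follow directly from the per-pool defaults:

```shell
pools=2             # thread_pools default
min_per_pool=100    # thread_pool_min default
max_per_pool=5000   # thread_pool_max default
echo "minimum worker threads: $(( pools * min_per_pool ))"   # 200
echo "maximum worker threads: $(( pools * max_per_pool ))"   # 10000
```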
New threads use preallocated workspace, which should be large enough for the required task. If a thread runs out of workspace, the child process is unable to process the task and terminates. To avoid this situation, evaluate your setup and consider increasing the workspace_thread parameter.
Time Overhead per Thread Creation¶
thread_pool_add_delay: Wait at least this long after creating a thread.
thread_pool_timeout: Thread idle threshold.
thread_pool_fail_delay: After a failed thread creation, wait at least this long before trying to create another thread.
Varnish can use several thousand threads, and has had this capability from the very beginning.
However, not all operating system kernels were prepared to deal with this capability.
Therefore the parameter
thread_pool_add_delay was added to ensure that there is a small delay between each thread that spawns.
As operating systems have matured, this has become less important and the default value of
thread_pool_add_delay has been reduced dramatically,
from 20 ms to 2 ms.
There are a few, less important parameters related to thread timing.
thread_pool_timeout is how long a thread is kept around when there is no work for it before it is removed.
This only applies if you have more threads than the minimum, and is rarely changed.
Another less important parameter is thread_pool_fail_delay, which defines how long to wait before retrying after the operating system fails to create a new thread.
As Varnish has matured, fewer and fewer parameters require tuning.
workspace_client and workspace_backend are parameters that could still be relevant.
- workspace_client – incoming HTTP header workspace from the client
- workspace_backend – bytes of HTTP protocol workspace for backend HTTP req/resp
- Tune them if you have many big headers or a VMOD that uses too much memory
- Remember: it is virtual, not physical memory
Workspaces are some of the things you can change with parameters. Sometimes you may have to increase them to avoid running out of workspace.
The workspace_client parameter states how much memory can be allocated for each HTTP session. This space is used for tasks like string manipulation of incoming headers. The workspace_backend parameter indicates how much memory can be allocated to modify objects returned from the backend. After an object is modified, its exact size is allocated and the object is stored read-only.
- As most of the parameters can be left unchanged, we will not go through all of them.
- You can take a look at the list of parameters by issuing varnishadm param.show -l to get information about what they do.
| Parameter | Default | Description | Context |
|---|---|---|---|
| connect_timeout | 3.500 [seconds] | OS/network latency | Backend |
| first_byte_timeout | 60.000 [seconds] | Web page generation | Backend |
| timeout_idle | 5.000 [seconds] | Keep-alive timeout | Client |
| timeout_req | 2.000 [seconds] | Deadline to receive a complete request header | Client |
| cli_timeout | 60.000 [seconds] | Management thread -> child | Management |
The timeout-parameters are generally set to pretty good defaults, but you might have to adjust them for unusual applications.
The default value of connect_timeout is 3.500 [seconds]. This is more than enough when the Varnish server and the backend are in the same server room. Consider increasing connect_timeout if there is higher network latency between your Varnish server and the backend.
Keep in mind that the session timeout affects how long sessions are kept around, which in turn affects file descriptors left open. It is not wise to increase the session timeout without taking this into consideration.
cli_timeout is how long the management process waits for the child to reply before it assumes the child is dead, kills it, and starts it up again.
The default value seems to do the trick for most users today.
If connect_timeout is set too high, it does not let Varnish handle errors gracefully.
Another use case for increasing connect_timeout occurs when virtual machines are involved, as they can increase connection times significantly.
More information is available at https://www.varnish-software.com/blog/understanding-timeouts-varnish-cache.
- Delay backend responses by over 1 second
- Set first_byte_timeout to 1 second
- Check how Varnish times out the request to the backend
For the purpose of this exercise, we suggest creating a simple CGI script that inserts a response delay in a real backend.
To check how first_byte_timeout impacts the behavior of Varnish, analyze the output of varnishlog. If you need help, look at Solution: Tune first_byte_timeout and test it against your real backend.
Alternatively, you can use delay in a mock-up backend in varnishtest and assert on VSL records and counters to verify the effect of first_byte_timeout. The subsection Solution: Tune first_byte_timeout and test it against mock-up server shows you how to do it.
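A minimal varnishtest sketch of this approach could look like the following (assumptions: VTC syntax for recent Varnish versions, and that a failed fetch is answered with a 503; adjust expectations to your version):

```
varnishtest "first_byte_timeout against a slow mock-up backend"

server s1 {
    rxreq
    delay 1.5        # backend waits 1.5 s before sending the first byte
    txresp
} -start

varnish v1 -vcl+backend { } -start
varnish v1 -cliok "param.set first_byte_timeout 1"

client c1 {
    txreq
    rxresp
    expect resp.status == 503   # Varnish gave up after 1 s
} -run
```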
Exercise: Configure Threading¶
- Change the thread_pool_min and thread_pool_max parameters to get 10 threads running at any given time, but never more than 15.
- Use varnishadm param.show <parameter> to see parameter details.
These exercises are for educational purposes, and not intended as an encouragement to change the values.
You can learn from this exercise by using