- About the course
- Goals and prerequisites
- Introduction to Varnish
- Design Principles
About the course¶
The course is split in two:
- Architecture, command line tools, installation, parameters, etc
- The Varnish Configuration Language
The course has roughly 50% exercises and 50% instruction, and you will find all the information on the slides in the supplied training material.
The supplied training material also has additional information for most chapters.
The Varnish Book includes the material for both the Varnish System Administration course and the Varnish for Web developers course.
The agenda is adjusted based on the progress made. There is usually ample time to investigate specific aspects of Varnish that may be of special interest to some of the participants.
The exercises will occasionally offer multiple means to reach the same goals. Especially when you start working on VCL, you will notice that there is almost always more than one way to solve a specific problem, and the solution offered by the instructor or this course material is not necessarily better than what you might come up with yourself.
Always feel free to interrupt the instructor if something is unclear.
Goals and Prerequisites¶
Prerequisites:
- Comfortable working in a shell on a Linux/UNIX machine, including editing text files and starting daemons.
- Basic understanding of HTTP and related internet protocols
Goals:
- Thorough understanding of Varnish
- Understanding of how VCL works and how to use it
The course is oriented around a GNU/Linux server-platform, but the majority of the tasks only require minimal knowledge of GNU/Linux.
The course starts out by installing Varnish and navigating some of the common configuration files, which is perhaps the most UNIX-centric part of the course. Do not hesitate to ask for help.
The goal of the course is to make you confident when using Varnish and let you adjust Varnish to your exact needs. If you have any specific area you are particularly interested in, the course is usually flexible enough to make room for it.
Introduction to Varnish¶
- What is Varnish?
- Open Source / Free Software
- Varnish Governance Board (VGB)
Varnish is a reverse HTTP proxy, sometimes referred to as an HTTP accelerator or a web accelerator. It stores files or fragments of files in memory, allowing them to be served quickly. It is essentially a key/value store that usually uses the URL as a key. It is designed for modern hardware, modern operating systems and modern workloads.
At the same time, Varnish is flexible. The Varnish Configuration Language is lightning fast and allows administrators to express the policy they want rather than being constrained by what the Varnish developers want to cater for or could think of. Varnish has shown itself to work well both on large (and expensive) servers and tiny appliances.
Varnish is also an open source project, and free software. The development process is public and everyone can submit patches, or just take a peek at the code if there is some uncertainty as to how Varnish works. There is a community of volunteers who help each other and newcomers. The BSD-like license used by Varnish does not place significant restrictions on re-use of the code, which makes it possible to integrate Varnish into virtually any solution.
Varnish is developed and tested on GNU/Linux and FreeBSD. The code-base is kept as self-contained as possible to avoid introducing outside bugs and unneeded complexity. As a consequence of this, Varnish uses very few external libraries.
Varnish development is governed by the Varnish Governance Board (VGB), which thus far has not needed to intervene. The VGB consists of an architect, a community representative and a representative from Varnish Software. As of March 2012, the positions are filled by Poul-Henning Kamp (Architect), Rogier Mulhuijzen (Community) and Kristian Lyngstøl (Varnish Software). On a day-to-day basis, there is little need to interfere with the general flow of development.
For those interested in development, the developers arrange weekly bug washes where recent tickets and development are discussed. This usually takes place on Mondays around 12:00 CET on the IRC channel #varnish-hacking on irc.linpro.net.
Design Principles¶
- Solve real problems
- Optimize for modern hardware (64-bit, multi-core, etc)
- Work with the kernel, not against it
- Innovation, not regurgitation
The focus of Varnish has always been performance and flexibility. That has required some sacrifices.
Varnish is designed for hardware that you buy today, not the hardware you bought 15 years ago. Varnish is designed to run on 64-bit architectures and will scale almost proportionally to the number of CPU cores you have available, though CPU power is rarely a problem.
If you choose to run Varnish on a 32-bit system, you are limited to 3GB of virtual memory address space, which puts a limit on the number of threads you can run and the size of your cache. This is a trade-off to gain a simpler design and reduce the amount of work Varnish needs to do. The 3GB limit depends on the operating system kernel. The theoretical maximum is 4GB, but your OS will reserve some of that for the kernel. This is called the user/kernel split.
Varnish does not keep track of whether your cache is on disk or in memory. Instead, Varnish will request a large chunk of memory and leave it to the operating system to figure out where that memory really is. The operating system can generally do a better job than a user-space program.
Accept filters, epoll and kqueue are advanced features of the operating system that are designed for high-performance services like Varnish. By using these, Varnish can move a lot of the complexity into the OS kernel, which is also better positioned to know which threads are ready to execute and when.
In addition, Varnish uses a configuration language that is translated to C code, compiled with a normal C compiler and then dynamically linked directly into Varnish at run-time. This has several advantages, the most practical of which is the freedom you get as a system administrator. You can use VCL to decide how you want to interface with Varnish, instead of having a developer try to predict every possible scenario. The fact that it boils down to C and a C compiler also gives you very high performance, and if you really wanted to, you could bypass the VCL-to-C translation and write raw C code (this is called in-line C in VCL). In short: Varnish provides the features, and VCL allows you to specify exactly how you use and combine them.
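As a rough illustration of in-line C, the sketch below embeds a call to the C library's syslog() in a VCL subroutine. It assumes Varnish 3 syntax; the log message is made up for the example, and the choice of vcl_recv is arbitrary.

```vcl
# Assumption: Varnish 3 syntax.
# Include statements must live outside the subroutines.
C{
    #include <syslog.h>
}C

sub vcl_recv {
    # Everything between C{ and }C is passed to the C compiler as-is.
    C{
        syslog(LOG_INFO, "vcl_recv was called");
    }C
}
```

Because the C{ ... }C blocks end up in the generated C code unchecked by the VCL compiler, a mistake in them can crash Varnish, so in-line C is best treated as a last resort.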
With Varnish 3 you also have Varnish Modules, or simply vmods. These modules let you extend the functionality of VCL by pulling in custom-written features. Some examples include non-standard header manipulation, access to memcached or complex normalization of headers.
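A minimal sketch of what using a vmod looks like, assuming Varnish 3 and the std module that ships with it. The Host normalization and the log text are illustrative choices, not something the course prescribes.

```vcl
# Assumption: Varnish 3 with the bundled "std" vmod available.
import std;

sub vcl_recv {
    # Lower-case the Host header so "Example.com" and "example.com"
    # do not end up as two separate cache entries.
    if (req.http.host) {
        set req.http.host = std.tolower(req.http.host);
    }

    # Write a free-form line to the shared memory log.
    std.log("Handling request for " + req.url);
}
```

Third-party vmods are loaded with the same import statement, but they have to be built and installed separately.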
The shared memory log allows Varnish to log large amounts of information at almost no cost by having other applications parse the data and extract the useful bits. This reduces the lock-contention in the heavily threaded environment of Varnish. Lock-contention is also one of the reasons why Varnish uses a workspace-oriented memory-model instead of allocating the exact amount of space it needs at run-time.
To summarize: Varnish is designed to run on modern hardware under real work-loads and to solve real problems. Varnish does not cater to the “I want to make Varnish run on my 486 just because”-crowd. If it does work on your 486, then that’s fine, but that’s not where you will see our focus. Nor will you see us sacrifice performance or simplicity for the sake of niche use-cases that can easily be solved by other means - like using a 64-bit OS.
How objects are stored¶
- Objects in Varnish are stored in a hash
- You can control the hashing
- Multiple objects can have the same hash key
Varnish has, as mentioned, a key/value store at its core. Objects are stored in memory, and a reference to each object is kept in a hash tree.
A rather unique feature of Varnish is that you can control what goes into the hashing algorithm that Varnish uses to store data. Typically the key is made out of the HTTP Host header and the URL, but you can override this if you choose to.
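To make this concrete, here is a sketch of a custom vcl_hash, assuming Varnish 3 syntax. The first part mirrors what the default VCL does (URL plus Host, falling back to the server IP). The X-Device header is a hypothetical addition for the example, assumed to be set earlier in request handling.

```vcl
# Assumption: Varnish 3 syntax (hash_data, return (hash)).
sub vcl_hash {
    # Roughly the default behaviour: hash on URL and Host header.
    hash_data(req.url);
    if (req.http.host) {
        hash_data(req.http.host);
    } else {
        hash_data(server.ip);
    }

    # Hypothetical extra input: one cache entry per device class.
    # X-Device is an assumed header, not something Varnish sets itself.
    if (req.http.X-Device) {
        hash_data(req.http.X-Device);
    }

    return (hash);
}
```

Every extra value fed to hash_data() multiplies the number of distinct cache entries, so adding inputs tends to lower the hit rate.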
The HTTP protocol specifies that multiple objects can be served from the same URL, depending on the preferences of the client. For instance, serving gzip’ed content to a client that doesn’t indicate gzip support doesn’t make much sense, so Varnish may look at the various objects stored under that key and pick the one that matches.
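Which variant is picked is driven by the Vary header in the backend response: if the backend sends Vary: Accept-Encoding, Varnish keeps one object per distinct Accept-Encoding value under the same hash key. Since clients spell that header in many different ways, a common approach is to normalize it before lookup. The following is a sketch under the assumptions that Varnish 3 syntax is used and that Varnish's own gzip handling is not already normalizing the header for you.

```vcl
# Assumption: Varnish 3 syntax; the backend sends "Vary: Accept-Encoding".
sub vcl_recv {
    if (req.http.Accept-Encoding) {
        if (req.http.Accept-Encoding ~ "gzip") {
            # Collapse all gzip-capable clients onto one variant.
            set req.http.Accept-Encoding = "gzip";
        } elsif (req.http.Accept-Encoding ~ "deflate") {
            set req.http.Accept-Encoding = "deflate";
        } else {
            # Unknown encodings: treat the client as not supporting compression.
            unset req.http.Accept-Encoding;
        }
    }
}
```

With this in place there are at most three variants per URL (gzip, deflate, uncompressed) instead of one per unique header string.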