LVS-HOWTO

Joseph Mack

jmack (at) wm7d (dot) net

v2003.07 Jul 2003, released under GPL.

Abstract

Installing, testing and running a Linux Virtual Server with 2.2.x and 2.4.x kernels


Table of Contents

1. Introduction
1.1. ChangeLog
1.2. Thanks
1.3. About the HOWTO
1.4. Nomenclature/Abbreviations
1.5. What is an LVS?
1.6. Minimal knowledge required
1.7. Getting Technical Help
1.8. Mailing list: subscribing, unsubscribing, searching
1.9. Mailing list: posting to
1.10. Bug Fixes
1.11. ToDo List
1.12. Other load balancing solutions
1.13. Help! My LVS doesn't work
1.14. Software/Information/HOWTOs useful/related to LVS
2. Install, Configure, Setup
3. LVS Performance and Kernel Tuning
3.1. Performance Articles
3.2. Estimating throughput: 100Mbps FE is really 8000 packets/sec ethernet
3.3. Jumbo frames
3.4. Network Latency
3.5. NICs and Switches, 100Mbps (FE) and 1Gbps (GigE)
3.6. NIC problems - eepro100
3.7. NIC problems - tulip
3.8. dual/quad ethernet cards, IRQ sharing problems
3.9. what's the CPU usage/load level on the director?
3.10. Monitoring LVS throughput at the director: with ipvsadm
3.11. LVS director throughput statistics: /proc system (originally /proc/net/ip_vs_stats)
3.12. MRTG and LVSGSP
3.13. MIB/SNMP
3.14. Other output GUIs and monitoring tools
3.15. performance testing tools
3.16. Max number of realservers
3.17. FAQ: How fast/big should my director be?
3.18. FAQ: What are the minimum hardware requirements for a director?
3.19. Does SMP help?
3.20. Performance Hints from the Squid people
3.21. Conntrack, effect on throughput
3.22. sysctl flags in /proc
3.23. Don't use the Pre-emptible kernels
4. The ARP Problem
4.1. The problem
4.2. The Cure(s)
4.3. The Cure: 2.0 kernels
4.4. The Cure: 2.2.x kernels
4.5. The Cure: 2.4.x kernels
4.6. The ARP problem, the first inklings
4.7. A posting to the mailing list by Peter Kese explaining the "arp problem"
4.8. from the mailing list: arp bouncing, Lars' Method, Static Routing to Director, iproute2 arp flag, is the arp behaviour a bug, arp caching/heartbeat
4.9. How to tell if an interface is replying to arp requests
4.10. The device doesn't reply to arp requests, the kernel does.
4.11. Properties of devices for the VIP
4.12. Topologies for LVS-DR and LVS-Tun LVS's
4.13. Why do all devices broadcast the arp replies
4.14. A discussion about the arp problem
4.15. ATM/ethernet and router problems
4.16. Same IP on multiple NICs
5. Ipvsadm and Schedulers
5.1. Using ipvsadm
5.2. Compile a version of ipvsadm that matches your ipvs
5.3. put realservers in /etc/hosts
5.4. RR and LC schedulers
5.5. netmask for VIP
5.6. LBLC, DH schedulers
5.7. SH scheduler, Patches for multiple firewalls/gateways
5.8. What is an ActiveConn/InActConn (Active/Inactive) connection?
5.9. FAQ: ipvsadm shows entries in InActConn, but none in ActiveConn, connection hangs. What's wrong?
5.10. FAQ: initial connection is delayed, but once connected everything is fine. What's wrong?
5.11. unbalanced realservers: does rr and lc weighting equally distribute the load? - clients reusing ports
5.12. Changing weights with ipvsadm
5.13. Dynamically changing realserver weights
5.14. Handling kernel version dependent files e.g. System.map and ipvsadm
5.15. connection threshold
5.16. Who is connecting to my LVS?
5.17. experimental scheduling code
5.18. Scheduling TCP/UDP
6. Persistent Connection (Persistence, Affinity in cisco-speak)
6.1. netscape/database/tcpip persistence
6.2. LVS persistence
6.3. Setting up with persistence
6.4. Scheduling looks different under persistence
6.5. Persistent and regular (non-persistent) services together on the same realserver.
6.6. Tracing connections: where will the client connect next?
6.7. Bringing down persistent services.
6.8. Load Balancing time constant is longer with persistence
6.9. Resetting the persistence timeout counter (persistence behaviour for short timeout values)
6.10. Why you don't want persistence for your e-commerce site
6.11. persistence with windows realservers
6.12. IIS session management: how it works
6.13. messing with the ipvsadm table while your LVS is running
6.14. Persistence for multiport services
6.15. Proxy services, e.g. AOL
6.16. key exchanges (SSL)
6.17. About longer timeouts
6.18. passive ftp and persistence
6.19. what if a realserver holding a sticky connection crashes
7. Routing and packet delivery tricks (needed for fwmark and transparent proxy)
7.1. Introduction
7.2. Routing to and accepting packets by a VIP-less director
7.3. Transparent proxy Q and A
7.4. Other tricks
8. Fwmarks (firewall marks)
8.1. Introduction
8.2. ipvsadm syntax for fwmark
8.3. setting up routing and packet delivery to the director
8.4. single-port service: telnet with fwmarks
8.5. Grouping services: single group, active ftp(20,21)
8.6. Grouping services: two groups, active ftp(20,21) and e-commerce(80,443)
8.7. passive ftp
8.8. fwmark with LVS-NAT
8.9. collisions between fwmark and VIP rules
8.10. persistence granularity with fwmark
8.11. fwmark allows LVS-DR director to be default gw for realservers
8.12. fwmark simplifies configuration for large numbers of addresses
8.13. Example: firewall farm
8.14. Example: LVS'ing a CIDR block
8.15. Example: forwarding based on client source IP
8.16. Example: load balancing multiple class C networks
8.17. Example: proxy server
8.18. Example: transparent web cache
8.19. Example: Multiply-connected router
8.20. httpd clients (browsers)
8.21. Example: dynamically generated images in webpages
8.22. Example: Balancing many IPs/services as one block
8.23. Example: Source controlled LVS - services and realserver customised by Client IP
8.24. Appendix 1: Specifications for grouping of services with fwmarks
8.25. Appendix 2: Demonstration of grouping services with fwmarks
8.26. Appendix 3: Announcement of grouping services with fwmarks
9. Services: single-port
9.1. Introduction
9.2. setting up a new service
9.3. services must be setup for forwarding type
9.4. Realservers present the same content: Synchronising content (and config files) and backing up realservers
9.5. File Systems for Clusters
9.6. Idle timeouts for TCP/UDP connections to services
9.7. name resolution on realservers: running name resolution friendly demons on realservers
9.8. ftp, tcp 21
9.9. ssh, tcp 22
9.10. telnet, tcp 23
9.11. smtp, tcp 25; pop3, tcp 110; imap tcp/udp 143 (imap2), 220(imap3). Also sendmail, qmail, postfix, and mailfarms.
9.12. mail farms
9.13. dns, tcp/udp 53
9.14. http name and IP-based (with LVS-DR or LVS-Tun), tcp 80
9.15. http with LVS-NAT
9.16. httpd is stateless and normally closes connections
9.17. persistence with http; browser opens many connections to httpd
9.18. dynamically generated images on web pages
9.19. http: Cookies, URL rewriting and parsing, session headers (session id)
9.20. http: sanity checks, logs, shutting down, mod_proxy, indexing programs, htpasswd
9.21. HTTP 1.0 and 1.1 requests
9.22. Microsoft http clients and servers violate the RFC for TCP/IP
9.23. squids, tcp 80, 3128
9.24. authd/identd, tcp 113 and tcpwrappers (tcpd)
9.25. ntp, udp 123
9.26. samba, udp 137, udp 138, tcp 139
9.27. https, tcp 443
9.28. name based virtual hosts for https
9.29. Obtaining certificates for https
9.30. SSL Accelerators and Load Balancers
9.31. lpd, tcp 515
9.32. Databases
9.33. r commands; rsh, rcp, and their ssh replacements, tcp 514
9.34. nfs, udp 2049 (and possible replacements for nfs)
9.35. X-window, udp 177 (xdmcp), tcp 6000 (and ssh X-forwarding)
10. Services: multi-port
10.1. Introduction
10.2. ftp general, active tcp 20,21; passive 21,high_port
10.3. ftp (active) - the classic command line ftp
10.4. ftp helper modules: ip_vs_ftp/ip_masq_ftp
10.5. ftp (passive)
10.6. ftp is difficult to secure
10.7. RealNetworks streaming protocols, tcp 554, many ports
10.8. quicktime, tcp 554, many ports
10.9. Radius, udp 1645,1646
11. 3-Tier LVS
11.1. Introduction
11.2. Routes needed for 3-Tier LVS
11.3. Setting up routes using iptables and iproute2
11.4. authd/identd and other clients that are 3-Tier clients
11.5. from the mailing list
12. Authd/Identd
12.1. What is authd/identd?
12.2. symptoms of the identd problem
12.3. comp.os.linux.security FAQ on identd
12.4. Russ Nelson on identd
12.5. Why identd is a problem for LVS
12.6. tcpdumps of connections delayed by identd
12.7. There are solutions to identd problem in some cases
12.8. Turn off tcpwrappers
12.9. Identd and smtp/pop/qmail
13. LVS-NAT
13.1. Introduction
13.2. Example 1-NIC, 2 Network LVS-NAT (VIP and RIPs on different network)
13.3. All packets from the realserver to the outside world must go through the director
13.4. Run the configure script
13.5. Setting up demasquerading on the director; 2.4.x and 2.2.x
13.6. masquerading clients on realservers
13.7. re-mapping ports, rewriting is slow
13.8. masquerade timeouts
13.9. Julian's step-by-step check of a L4 LVS-NAT setup
13.10. How LVS-NAT works
13.11. In LVS-NAT, how do packets get back to the client, or how does the director choose the VIP as the source_address for the outgoing packets?
13.12. One Network LVS-NAT
13.13. Clients on Realservers connecting to services on VIP
13.14. Performance of LVS-NAT, 2.0 and 2.2 kernels
13.15. Performance of LVS-NAT, 2.4 kernels
13.16. Various debugging techniques for routes
13.17. Connecting directly from the client to a service:port on an LVS-NAT realserver
13.18. Realservers in two LVSs
13.19. Thoughts on extending NAT
13.20. Postings from the mailing list
14. LVS-DR
14.1. How LVS-DR works
14.2. Handling the arp problem for LVS-DR
14.3. LVS-DR scales well
14.4. LVS-DR director as default gw for realservers, transparent proxy and Julian's martian and forward_shared patches
14.5. Accepting packets on LVS-DR director by fwmarks
14.6. security concerns: default gw(s) and routing with LVS-DR/LVS-Tun
14.7. routing to realserver from director
14.8. Setting up NAT clients on LVS-DR realservers
15. LVS-Tun
15.1. How LVS-Tun works
15.2. Configure LVS-Tun
15.3. FreeBSD realservers with LVS-Tun
15.4. W2K realservers with LVS-Tun
15.5. LVS-Tun from the mailing list
16. Localnode
16.1. You can't rewrite ports with localnode
16.2. Testing LocalNode
17. Transparent proxy (TP or Horms' method)
17.1. setting up routing and packet delivery to the director
17.2. General
17.3. How you use TP
17.4. The original 2.2 TP setup method
17.5. Transparent proxy for 2.4.x
17.6. Experiments showing that 2.4TP is different to 2.2TP
17.7. What IP are TP packets arriving on?
17.8. Take home lesson for setting up TP on realservers
17.9. Handling identd requests from 2.4.x LVS-DR realservers using TP
17.10. Performance of Transparent Proxy
18. Transparent Bridging
19. Squid Realservers (poor man's L7 switch)
19.1. Terminology
19.2. Preview
19.3. Let's start assembling
19.4. One squid
19.5. Another squid
19.6. Combining pieces with LVS
19.7. Problems
20. Details of LVS operation: Security, DoS
20.1. Top 20 security vulnerabilities
20.2. Top 75 security tools from the people at nmap
20.3. Do I need security, really?
20.4. Can filter rules stop the intruder hopping to other machines?
20.5. Where filter rules act
20.6. /proc filesystem flags for ipv4, e.g. rp_filter
20.7. /proc file system settings for security
20.8. Director Connection Hash Table
20.9. Hash table size
20.10. Hash table timeouts
20.11. Hash Table DoS
20.12. timeouts the same for all services
20.13. tcp timeout values, don't change them
20.14. Hash table size, director will crash when it runs out of memory.
20.15. The LVS code does not swap
20.16. Other factors determining the number of connections
20.17. Port range: limitations, expanding port range on directors
20.18. apps starved for ports
20.19. realserver running out of ports
20.20. DoS
20.21. DoS, from the mailing list
20.22. Testing DoS Strategies with testlvs: Creating large numbers of InActConn
20.23. Debugging LVS
20.24. Filesystems for realserver content: the many reader, single writer problem
20.25. What is a session?
20.26. Development: Supporting IPSec on LVS
21. Writing filter rules, about netfilter hooks, using tcpdump
21.1. Introduction
21.2. LVS for Netfilter and Linux 2.4
21.3. tcpdump
21.4. Writing Filter Rules
21.5. Example ip_tables filter scripts
21.6. the design of LVS as a netfilter module
22. ICMP
22.1. MTU discovery and ICMP handling
22.2. LVS code only needs to handle icmp redirects for LVS-NAT and not for LVS-DR and LVS-Tun
22.3. ICMP checksum errors
22.4. ICMP Timeouts
23. High Availability LVS: Failover protection
23.1. Introduction
23.2. Stateful Failover
23.3. Director failure
23.4. UltraMonkey and Linux-HA
23.5. Keepalived and Vrrpd
23.6. Some vrrpd setup instructions
23.7. Vinnie's comparison between ldirectord/heartbeat and keepalived/vrrpd
24. Server State Sync Demon (saving the director's connection state on failover)
24.1. Intro
24.2. Release Notice
24.3. from the mailing list
24.4. Expiry of Connection in Backup Director
25. Realserver failure handled by Mon
25.1. Introduction
25.2. ethernet NIC failure, and channel bonding
25.3. Service/realserver failout
25.4. Mon for server/service failout
25.5. BIG CAVEAT
25.6. About Mon
25.7. Mon Install
25.8. Mon Configure
25.9. Testing mon without LVS
25.10. Can virtualserver.alert send commands to LVS?
25.11. Running mon with LVS
25.12. Why is the LVS monitored for failures/load by an external agent rather than by the kernel?
25.13. Running multiple directors (each with their own IP)
26. Setting up Linux-HA for directors (mostly by using rpms)
26.1. linux-ha howto
26.2. Fix the (possible) ethernet alias issue.
26.3. Configure /etc/ha.d/. files.
26.4. Stop ldirectord from starting, ensure heartbeat starts on reboot
26.5. starting heartbeat and verifying functionality
26.6. Test your fail-over features, understand HA.
26.7. Configuration of mon - recommended
26.8. Ard van Breeman's replacement for IPaddr
27. Director failover using heartbeat
28. Newer networking tools: Policy Routing
28.1. Policy Routing and ifconfig
28.2. Various debugging techniques for routes
28.3. checking source routed packets
28.4. handling arp problem with iproute2
28.5. ip commands you mightn't know about
29. Misc/FAQ/Wisdom from the mailing list
29.1. Having one director handling multiple LVS sites, Multiple VIPs
29.2. Limiting number of clients connecting to LVS
29.3. Setting up a fake service on the realserver with inetd
29.4. How to bring down a realserver for maintenance (eg swap disks)
29.5. How to turn your single node ftp/http server into an LVS without taking it off-line
29.6. shutdown of LVS
29.7. Other projects like LVS - Beowulf
29.8. Projects like LVS - Eddie
29.9. Recommendations for a redundant file system, RAID
29.10. Thundering herd problem, when down machine(s) come on line
29.11. on the need for extended testing
29.12. loopback on Solaris
29.13. Running clients (eg telnet) on realservers
29.14. Bringing down aliased devices
29.15. Multiple IPs on the Director
29.16. Testimonials
29.17. Transport Layer Security (TLS)
29.18. Setting up a hot spare server
29.19. An LVS of LVSs
29.20. Connecting from clients through multiple parallel links: the dead gateway problem
29.21. LVS on a Linux/IBM mainframe
29.22. How do I check to see if my kernel has the ip-vs patch installed?
29.23. Running a test LVS (director, backup director and realservers) on one box
29.24. mqseries
29.25. LVS log files
29.26. LVS and linux vlan
29.27. multi-home, multi-router LVS
30. L7 Switching
30.1. Introduction
30.2. KTCPVS
30.3. DRWS
30.4. from the mailing list about L7 switching
30.5. What is TCPSP?
31. Geographically distributed load balancing
31.1. from the mailing list
32. Linux Distributions prepatched with LVS, Unsupported LVS addons
32.1. Distributions prepatched with LVS
32.2. PB's Nutshell HOWTO for Piranha/LVS-NAT
32.3. Horms' advice for installing on RedHat systems
32.4. Recipe and LVS binaries for RedHat from Alex Kramarov
32.5. recipes for installing with RedHat from the mailing list
33. Useful things that have no other place (yet)
33.1. Ramdisk
33.2. cscope
33.3. Neutral currents in multiphase power lines for non-linear loads (like computers)
33.4. netcat/phatcat
34. Patches and Contributed code
34.1. machine readable error codes from ipvsadm
34.2. stateless ipvsadm: machine readable entries
34.3. Threshold patch
34.4. Martian modification patches
34.5. fwmark name-number translation table
34.6. ip_vs_conn.pl
34.7. Running a firewall on the director: The Antefacto Netfilter patches for 2.4
34.8. Malcolm Turnbull's ISO files
35. FAQ
35.1. When will LVS be ported to Solaris, xxxBSD...?
35.2. Is there a HOWTO in Japanese, French, Italian, Chinese...?