Application Note · December 13, 2013

Embedded Web Server Comparison

Web Server Comparisons – from http://www.acme.com/software/thttpd/benchmarks.html#chroot

“…factoring out I/O, the primary determinant to server performance is the concurrency strategy.”

– JAWS: Understanding High Performance Web Systems

Some comparisons between various web servers. We look at features, sizes, and performance. Each column in the table is explained in the notes below. On the original page, bright green entries marked something unusually good and red entries something bad, sometimes with a link to a note explaining why.

Don’t get too excited over the performance figures. Most of these servers have enough oomph to keep a T1 saturated, running on a lowly 100MHz Pentium or maybe even a 486. Very very few web applications need more power than that. So, the fact that Apache is not that fast shouldn’t be of concern to most people.

Last updated 12jul98.

| Name | Version | Released | Active devel | Model | Auto conf | Basic auth | Chroot | Throttling | Tar bytes | Source files | Source lines | Exe bytes | RPS, small files | RPS, CGI | Max users, large files |
|------|---------|----------|--------------|-------|-----------|------------|--------|------------|-----------|--------------|--------------|-----------|------------------|----------|------------------------|
| thttpd | 2.03 | 11jul98 | yes | select | yes | yes | yes | yes | 406,016 | 20 | 7,229 | 49,584 | 720 | 100 | 1000+ |
| Apache | 1.3.0 | 05jun98 | yes | pre-fork | yes | yes | no | no | 4,925,440 | 149 | 73,381 | 397,152 | 250 | 90 | 150 |
| mathopd | 1.2b9 | 23may98 | yes | select | yes | no | yes | no | 112,640 | 8 | 3,658 | 49,088 | 770 | 75 | 500 |
| Roxen | 1.2.29 | 10jun98 | yes | threads | yes | yes | yes | no | 14,274,560 | 1,480 | 247,789 | 502,284 | 170 | 11 | 50 |
| Boa | 0.92 | 23dec96 | yes | select | no | no | no | no | 225,280 | 21 | 4,103 | 75,788 | 475 | 115 | 50 |
| Jigsaw | 2.0beta1 | 08apr98 | yes | Java threads | n/a | yes | yes | no | 11,341,824 | 594 | 95,744 | 1,841,601 | 45 | n/a | 25 |
| Acme.Serve | | 18may98 | yes | Java threads | n/a | no | no | yes | 302,592 | 27 | 4,718 | 79,943 | 45 | n/a | 25 |
| CERN | 3.0A | 15jul96 | no | fork | yes | yes | no | no | 2,703,360 | 153 | 56,028 | 561,696 | 115 | 65 | 300 |
| NCSA | 1.5.2a-export | 12oct96 | no | pre-fork | no | yes | no | no | 931,840 | 77 | 23,726 | 196,040 | 260 | 70 | 350 |
| Netscape FastTrack | 3.01-export-us | 02oct97 | no | threads | n/a | yes | ? | no | 62,443,520 | n/a | n/a | 1,896,016 | n/a | n/a | n/a |
| Netscape Enterprise | 3.5.1-export-us | 02feb98 | yes | threads | n/a | yes | ? | no | 130,426,880 | n/a | n/a | 1,977,568 | n/a | n/a | n/a |
| Zeus | 3.1.4 | 10jul98 | yes | select | n/a | yes | no | yes | 3,747,840 | n/a | n/a | 1,367,976 | 1050 | 70 | 500 |

Notes:

  • Software – which web server.    
    • Name – the name of the server, and a link to its home page.
    • Version – the version tested here.
    • Released – when it was released.
    • Active devel – whether it is currently under development.

  • Features – some key server features.    
    • Model – what kind of server it is.  The models are:
      • fork – start a new process for each request.
      • pre-fork – pre-start a pool of processes which each handle multiple requests.
      • threads – use threads instead of processes.
      • Java threads – this version of the Java runtime uses “Green threads” instead of native threads, so it acts more like the select-based servers.
      • select – use non-blocking I/O and the select() system call to handle multiple requests in a single process, single thread.

    • Auto-conf – whether there’s a script to automatically configure the build process for your OS.
      • mathopd did not compile out of the box under Solaris 2.6; I had to make a minor edit.
      • Roxen did not compile under Solaris 2.6, so I had to build it under Solaris 2.5.1.

    • Basic auth – whether the server supports Basic Authentication, for password-protected web pages.
    • Chroot – whether the server lets you use the chroot() system call to enhance security.
    • Throttling – the ability to set bandwidth limits on certain pages, so they don’t use more than their fair share of the server’s resources.

  • Size – various sizes of parts of the server.
    • Tar bytes – the uncompressed source tar archive size.
    • Source files – how many source and header files.
    • Source lines – how many lines in the source and header files.
    • Exe – the executable size. For compiled programs this is the size of the main executable file, stripped. For the Java servers it’s the total size of the .class files or .zip files. For Roxen it’s the size of the Pike interpreter.

  • The benchmark test system is a 297MHz Ultra Sparc with 256MB RAM / 512MB swap running Solaris 2.6, otherwise totally quiescent. RLIMIT_NOFILE is 256 soft / 1024 hard, and v.v_maxup is 3941.
  • The test loads were generated on a separate machine connected via 100Mb/s Ethernet. The test client is http_load, doing multiple requests in parallel, with checksumming enabled.
  • All the test runs were done at least twice on different days, and all the shapes and inflection points etc. proved to be completely repeatable.
  • All programs are in their out-of-box configuration, except as needed for the tests (e.g. enabling CGI). And except for mathopd, which doesn’t have an out-of-box configuration – I used this one.
  • RPS – maximum requests per second. This is determined by running the test load at various parallel-request rates and finding the maximum response rate. See the graph below for the full data.
    • Small files – the small-file test load consists of 1000 files, each 1KB long, requested randomly.
      • Acme.Serve exited at 60 parallel requests, due to running out of file descriptors. Apparently the Java runtime uses three file descriptors per thread, or something like that. But still, the program shouldn’t exit!
      • Jigsaw mysteriously refused to serve 40 of the 1000 files in the test load. Same 40 consistently. I took those 40 out of the test and the other 960 files worked fine. This sort of thing doesn’t really build confidence in a server.

    • CGI – the CGI test load consists of a trivial “hello world” C program.
      • Roxen started returning bad checksums at only 3 parallel CGI requests.
      • Some of the servers do not implement the parts of the CGI spec that require header-parsing – thttpd, mathopd, Boa, perhaps others. This could be seen as giving them an unfair advantage in performance comparisons.

  • Max users – maximum number of simultaneous users that can be handled. This is determined by running the test load at various parallel-request rates and seeing when it starts screwing up. Typical screwups are running out of memory or processes, so that requests start timing out or getting refused.
    • Large files – the large-file test load consists of 100 files, each 1MB long, requested randomly. Also, each connection is throttled to simulate a 33.6Kbps modem. Note that 1000 33.6Kbps connections is 3/4 of a T3.

  • Some servers that I wanted to benchmark but didn’t get to:
    • phttpd – I couldn’t get it to run.
  • Netscape’s license agreement prohibits publishing benchmark results (section 6 viii), so the Netscape rows in the table have no performance figures.

Here’s a graph of the full data:

[Figure: serverperf – requests per second vs. number of parallel users, for each server]

Comments:

  • That nice diagonal line is very interesting. A bunch of very different servers follow it exactly for the first part of their curve. Its slope is 5 hits/second per user, indicating that each hit takes a minimum of 1/5th second to handle. But other servers don’t follow it, showing that it’s not an inherent limit in the test setup. thttpd-2.00 followed the line while mathopd, a very similar server, did not, so I made some changes to thttpd to try to get its latency to be more like mathopd’s. It turns out the change that made the difference was sending the response headers and the first load of data as a single packet, instead of as two separate packets. Apparently this avoids triggering TCP’s “delayed ACK”, a 1/5th-second wait to see if more packets are coming in. thttpd-2.01 has the single-packet change.
  • The sharp drop-off at 125 users for a bunch of different servers is mysterious, especially considering that it applies to servers of three different models – pre-fork, threads, and select – and that other servers are immune.