BBS SMTH (Shuimu Tsinghua): Digest Area

From: althea (痛并快乐着), Board: Linux
Subject: Linux Performance Tuning -- Linux Tuning Basics
Posted at: BBS SMTH (Shuimu Tsinghua) (Tue Nov  9 01:52:11 1999)

          Linux Tuning Basics
  Before reading anything else, make sure you have done this

1. Introduction

This site is focused on performance tuning of a Linux box.
That means attempting to wring the maximum performance possible
from the hardware and software.

There is no such thing as a typical Linux box. What we are
attempting to show is the items to look at when making
optimisations to your system. As such, we have to assume a
general base level of knowledge and system setup. The idea of
this document is to provide that. At the same time, it is a bit
of a prep into how hardware and software relate together (so that
you understand why all the rest of the stuff at this site works).

If you are using your system for typical home use, many of the
suggestions here might be a bit of "overkill". If you are
attempting to run a web server, mail server, database server,
file server, etc. you will want to make sure your system follows
the guidelines presented here. If you don't, many of the
suggestions presented throughout the rest of the site may not
have the effect that is expected. In other words, before you push
your system past the limits, make sure you are reaching the
limits first!

1.1 What is covered

This document is structured around the three basic areas that all
tuning attempts to make the most of: Disk, Networking, Memory.
Without these working to their maximum extent your CPU will be
wasting clock cycles. No point having a Pentium IX/6GHz if you
only have a 14k4 modem and an IDE disk.

1.2 Further Information

Also, the following links might be a good way to get started
before you read the rest of this:

     Benchmarking HOWTO (a bit dated)
     http://metalab.unc.edu/LDP/HOWTO/Benchmarking-HOWTO.html
     Configuration HOWTO
     http://metalab.unc.edu/LDP/HOWTO/Config-HOWTO.html
     Installation HOWTO
     http://metalab.unc.edu/LDP/HOWTO/Installation-HOWTO.html

2. Disk

Disks come in many forms, from the old MFM/RLL drives to today's
SCSI screamers. Optimising for this can be difficult. Luckily,
hard drives are a constant that you know about in the box you are
working on. Disks keep a permanent record of everything that
occurs so that you can switch the power off and restart the box
in pretty much the same state as it was before. Therefore, almost
every form of service ends up needing to use a disk somewhere.

2.1 Simple Disk I/O

Today there are two basic mainstream standards for talking to a
disk - IDE and SCSI. A third standard, FibreChannel, is available
on really, really expensive systems.

The two standards are more or less the same age, give or take a
year or two (SCSI being the slightly older). Both have been
stretched well beyond their original specs. No matter what we say
about the current best transfer rates, they will be out of date
within the next few months anyway. So, we'll concentrate on the
basic philosophical differences.

SCSI started at the high end and worked its way down. IDE started
at the bottom and worked its way up. For a given machine, a
single SCSI controller can handle more devices (15) than IDE (4).
Both can handle hard disks, removable media like CDs and tape,
and floppy drives. Where they differ is how they approach the
real low level stuff like controlling the access to the device.
SCSI devices have a controller chip that is used to process most
of the information during the transfers. IDE does not have a
controller. Therefore most of the control must be done by the
CPU. So, instantly, on a heavily loaded system, you can see that
a SCSI device will not load the system up as much as an IDE
device. If your system needs a lot of CPU grunt for processing
databases or dynamic web pages, then of two otherwise identical
systems the one with SCSI drives will be the faster.

If you are going for performance on a system, you need SCSI.
While you can do it with UDMA/IDE, anything that is high disk i/o
needs to move the disk processing from the main CPU to a SCSI
controller. This is the basic approach of many computer tasks.
Want faster 3D graphics? Move the task to a dedicated video chip.
Better sound? Get a dedicated sound card.

2.2 Single vs Multiple Disks

Hard drives are cheap. Rarely these days do you find a box
running around with only a single drive in it. There is a huge
difference between having two disks in your machine and a RAID
system. However, many of the tricks of RAID can be applied to
your humble desktop.

Put simply, if you have more than one disk in your machine, make
use of it. Even with IDE, you can make some very significant
performance gains by using both disks in parallel.

With the speed of the modern bus, a single disk is not capable of
consuming all of the available bandwidth. A well tuned system
seeks to use as much bandwidth as possible. Having two or more
drives operating in parallel means much higher bandwidth
utilisation. For example, say you have the typical / and /usr
setup on two different partitions. If they are on the same disk,
it will run slower than having one on each disk.
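
As a rough illustration of the reasoning above, this Python sketch
models how long a transfer takes when the data is spread over disks
that stream at the same time, capped by the bus. The function name
and all throughput figures are invented for illustration, and the
model ignores seek time and controller overhead.

```python
def transfer_seconds(total_mb, disk_mb_per_s, n_disks, bus_mb_per_s):
    """Estimate the time to move total_mb spread evenly over n_disks.

    Simplified model (an assumption, not a measurement): every disk
    streams at disk_mb_per_s, the disks run fully in parallel, and the
    combined rate can never exceed bus_mb_per_s.
    """
    combined = min(disk_mb_per_s * n_disks, bus_mb_per_s)
    return total_mb / combined


# Hypothetical figures: 10 MB/s disks on a 33 MB/s bus.
one = transfer_seconds(200, 10, 1, 33)   # a single disk leaves the bus idle
two = transfer_seconds(200, 10, 2, 33)   # two disks halve the time
four = transfer_seconds(200, 10, 4, 33)  # four disks hit the bus limit
print(one, two, four)
```

In this toy model the gain from adding disks stops once their
combined rate reaches what the bus can carry.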

2.3 Organise the hardware

If there is any chance in your system to make things run in
parallel - do so. All modern systems come with dual IDE channels
in them. Run one disk on each if you have two disks. This applies
to any device connected to the IDE controller: including CDROMs
and floppies.

With SCSI, this decision can be a bit tough. If you have two
controllers and two disks, should you go this parallel route?
Probably not (discounting the RAID controller and standard
controller pairings). If you need to do a lot of disk to disk
copying then keeping two SCSI disks on the same bus actually is
much quicker than having them on separate ones (the need to pass
the data over the internal system bus is usually much slower). If
there is minimal disk to disk transfer then parallelling may be a
good option.

SCSI also provides another challenge: The SCSI bus runs at the
speed of the slowest device on it. Get an old SCSI I drive and
connect it to an Ultra Wide controller alongside another UW disk
and everything runs like wet cement. Keep them on separate
controllers if at all possible. This problem does not seem to
affect IDE drives as much. Obviously, from this, you should try
to keep devices of the same speed connected together. If you have
SCSI hard drives, instead of slapping a SCSI CDROM on with them,
get an IDE CDROM, unless you happen to have an Ultra SCSI II
CDROM to put with your Ultra SCSI II hard drives.

2.4 Organise your data

The simplest rule - the thing that requires the most data should
go on the fastest drive you've got.

RAID offers the highest bandwidth of the lot. If you have
anything that does really heavy disk I/O, like web page logging,
put it on the fastest drive you've got. If you have a RAID array
and a normal drive, make sure that all your log activity writes
to the RAID array.

/ contains most of the system utilities and doesn't get used
much. It can be shipped off to the slowest disk.

/var/log contains a _lot_ of logging information. Best on a fast
disk.

/usr is typically on a separate partition anyway. Place it
wherever, but if you have a lot of clients starting lots of X
applications, some speedy disk may be in order.
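
Before moving directories around, it helps to see which of them
currently share a filesystem. The sketch below groups paths by the
device number reported by os.stat. The helper names and the list of
paths are only examples. Note that st_dev identifies the mounted
filesystem, not the physical disk: two partitions on one disk still
show up as separate groups.

```python
import os


def same_filesystem(path_a, path_b):
    """Return True if both paths live on the same mounted filesystem."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def group_by_filesystem(paths):
    """Group the existing paths by the device their filesystem is on."""
    groups = {}
    for p in paths:
        if os.path.exists(p):  # silently skip paths this box lacks
            groups.setdefault(os.stat(p).st_dev, []).append(p)
    return list(groups.values())


# Example paths only; substitute the directories your services use.
print(group_by_filesystem(["/", "/usr", "/var/log", "/home"]))
```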

Before deciding on your partitioning scheme, you really need to
know exactly what sort of applications you will be running. Here
are two typical examples of the sorts of configurations that you
need to know about:

Mail: Sendmail writes to two main locations, the mail queue
(usually /var/spool/mqueue) and the mail spool (usually
/var/spool/mail), as well as possibly having to read a user's
home directory for a procmail configuration or .forward file. If
you are attempting to boost sendmail performance, an NFS-mounted
mail queue, mail spool, or home directory might cause some
serious issues and you'll have to take a look at your mail
scheme. For more information, you should look at the section on
mail server tuning.

Apache uses several different files: two log files for logging,
plus the actual pages it serves. While you may think that you
want your web pages on the fastest drive, Apache spends quite a
bit of time writing to its log files. You will want to take this
into consideration when developing a partition scheme.

2.5 Swap Space

Swap partitions. How often do we see someone screwing up a
perfectly good system by a badly configured swap space? The
difference can be phenomenal.

Always have swap space on a separate partition(s). The use of
swap files can really grind a system to a halt under even
moderate load. A swap file adds an extra layer of system calls
for every write and read: instead of talking to the "raw" disk,
you are writing through the filesystem. Access to a raw partition
will be quicker.

Two smaller swap partitions on two disks are better than one big
one on a single disk. For the same reasons as earlier, there is
also an advantage to using RAID for your swap partition.
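
To check how swap is currently set up, the kernel lists the active
swap areas in /proc/swaps. The following sketch parses text in that
layout and flags swap files. The sample text and device names are
made up for illustration; on a real box you would pass in the
contents of /proc/swaps instead. Giving two swap partitions the same
priority lets the kernel spread pages across both of them.

```python
def parse_swaps(text):
    """Parse /proc/swaps-style text into a list of dicts."""
    areas = []
    for line in text.strip().splitlines()[1:]:  # skip the header row
        name, kind, size_kb, used_kb, priority = line.split()
        areas.append({
            "name": name,
            "type": kind,            # "partition" or "file"
            "size_kb": int(size_kb),
            "used_kb": int(used_kb),
            "priority": int(priority),
        })
    return areas


# Invented sample in the /proc/swaps layout (not from a real machine).
SAMPLE = """Filename Type Size Used Priority
/dev/hda2 partition 131064 0 1
/dev/hdc2 partition 131064 0 1
/swapfile file 65528 0 0
"""

for area in parse_swaps(SAMPLE):
    note = "swap file, consider a partition" if area["type"] == "file" else "ok"
    print(area["name"], area["priority"], note)
```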

If possible, make the swap partition the first partition on the
disk (it is physically located closest to the outer edge of the
hard drive platter). There can be up to a factor of 2 difference
in transfer rates for data out on the edge compared to the middle
of the disk. As the kernel uses swap space as extra memory, the
quicker you can get this stuff on and off disk, the better your
performance.

Have at least twice the amount of physical RAM set aside as swap
space.
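
The rule of thumb above is easy to check from /proc/meminfo, which
reports MemTotal and SwapTotal in kB. This is a minimal sketch: the
helper names and the sample values are assumptions for illustration,
and on a live system you would feed in the real file contents.

```python
def parse_meminfo(text):
    """Return the numeric /proc/meminfo fields as a dict (values in kB)."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields


def swap_meets_rule(meminfo, factor=2):
    """True if swap is at least `factor` times the physical RAM."""
    return meminfo["SwapTotal"] >= factor * meminfo["MemTotal"]


# Invented sample: roughly 128 MB of RAM with 256 MB of swap.
SAMPLE = """MemTotal:  130048 kB
MemFree:     4096 kB
SwapTotal: 262144 kB
"""

info = parse_meminfo(SAMPLE)
print(swap_meets_rule(info))
```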

2.6 Further information

     From the Configuration HOWTO: General System Setup
     http://metalab.unc.edu/LDP/HOWTO/Config-HOWTO-2.html#ss2.4
     Multi Disk System Tuning HOWTO
     http://metalab.unc.edu/LDP/HOWTO/Multi-Disk-HOWTO.html
     Large Disk HOWTO
     http://metalab.unc.edu/LDP/HOWTO/mini/Large-Disk.html
     Software-RAID HOWTO
     http://metalab.unc.edu/LDP/HOWTO/mini/Software-RAID.html
     Ultra-DMA Mini HOWTO
     http://metalab.unc.edu/LDP/HOWTO/mini/Ultra-DMA.html

3. Networking

TBD

3.1 Further information

4. Memory

The final piece of the basic tuning is to look at the use of
memory in your machine. This is not just the main memory (RAM),
but the use of caches and access between the various levels of
memory.

4.1 Unix Memory

In order to understand how memory can be optimised, you need to
know a little about how the underlying operating system allocates
memory to applications. Since Linux is a form of Unix, it follows
many of the same basic principles, so we'll talk about general
Unix models here.

Unix uses an "Always Full" model for memory. All memory is always
being used, as in, it always contains some form of data.
Typically this is used to buffer information from slower forms of
memory like hard disks etc. Whether or not that information is
relevant is inconsequential.

The first thing a unix box does when it starts up is grab all the
memory and divide it into a number of chunks. Typically this
consists of:

Kernel Space
     A memory space that is reserved exclusively for the kernel
     to use and abuse. This is the minimal breathing area needed
     for the kernel to run scheduling, keep itself permanently in
     memory and other really important stuff like device drivers
     etc.
User Space
     The memory space used to keep application level code. This
     is all applications - even server processes like web/mail
     servers, a user login shell or X Windows. Inside the user
     space, each application (process) has its own block of space
     to operate. No other application is allowed to invade this
     space. If it does, the typical result is a core dump. (There
     are caveats to this for things like Shared Memory usage that
     puts in an explicit common space.)
Buffer Space
     This is basically everything else of the memory that is not
     taken up by the requirements of user and kernel space. This
     is used to buffer I/O for disks, network cards etc. Also,
     importantly, this is used for DMA transfers between devices
     to make things much quicker. Buffering is applied to both
     read and write operations. Attempt to write something to
     disk, and it will be first written to memory and then at
     some later time it will be written to the physical media as
     well.
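
The write buffering described under Buffer Space is visible from
application code. In this sketch a write first sits in the program's
own buffer, then in the kernel's buffer space, and only reaches the
physical media once the kernel flushes it; os.fsync asks for that
flush immediately. The function name and file name are placeholders.

```python
import os
import tempfile


def write_durably(path, data):
    """Write bytes and ask the kernel to push them to the device."""
    with open(path, "wb") as f:
        f.write(data)          # lands in the program's own buffer
        f.flush()              # hands the data to the kernel's buffers
        os.fsync(f.fileno())   # waits until the kernel has written it out


path = os.path.join(tempfile.mkdtemp(), "demo.log")
write_durably(path, b"buffered, then flushed\n")
with open(path, "rb") as f:
    print(f.read())
```

Calling fsync on every write trades speed for safety, which is why
most programs leave the timing to the kernel.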

The job of the kernel is to manage all of these different spaces
according to the requirements of the currently running processes.
When you attempt to start a new application, it must first find
some memory to use that is big enough, allocate it and then start
your application. If there is not enough physical RAM, it must
make some by swapping another application out onto disk, or maybe
raiding the buffer space, and then copy your application into
memory before beginning the execution. Some of this task involves
re-arranging memory to get a contiguous chunk of space large
enough. (Incidentally, this is also why applications on Unix
machines tend to take longer to start than the equivalent Win32
or MacOS app.)

From this you should be starting to see some of the problems
that could occur that make your machine run slowly. Too little
RAM and the kernel spends its time swapping (commonly known as
thrashing when it gets bad). Badly set up swap space and that
operation slows down. Too many apps running and the kernel spends
all its time searching for new space or re-arranging memory to
fit.

4.2 Memory Types

Over the years, the typical PC memory chip has gone through a
whole swathe of acronyms. Let's look at a few of them:

     DRAM: Dynamic RAM. The original memory type. Could
     allow one operation at once: read or write.

     EDORAM: Extended Data Out RAM. Like DRAM, but keeps its
     output data valid while the next access is being set up,
     so successive reads can overlap.

     SRAM: Static RAM. Usually faster than other methods,
     but more power-hungry. Used for caches.

     SDRAM: Synchronous DRAM. Operates in sync with the CPU
     clock. Increases throughput by using pipelining to hide
     setup times. This is the most common type of RAM in use
     today.

     WRAM: Windowed RAM. A dual ported memory that allowed
     simultaneous reading and writing. Only ever appeared in
     Video cards.

     ECCRAM: Error Checking / Correcting RAM. Stores extra
     check bits alongside the data (commonly 8 check bits per
     64-bit word); the memory controller uses these to detect
     and correct single bit errors and to detect most
     multiple bit errors. Typically only used in servers,
     where the cost of an error is very high.

See also Tom's Hardware Guide.

4.3 Parity Checking

Standard main memory is Dynamic RAM of one sort or another, in
which each bit is represented by voltage on a capacitor. If a
cosmic ray happens to dump too much charge onto one of those
capacitors, you can end up with a bit error. This happens very
infrequently, so desktop systems don't need to worry too much
about it. To detect bit errors, some memory chips come with a
parity bit on each byte; the parity bit is a simple checksum of
the byte. When the computer reads the byte, and detects that the
checksum on the byte is wrong, it declares a bus error, and halts
the current program.
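
A small Python sketch of the parity check described above. It uses
even parity (the stored bit makes the total count of 1 bits even),
which is one common convention and an assumption here. It also shows
the scheme's weakness: two flipped bits cancel out and go unnoticed.

```python
def parity_bit(byte):
    """Even-parity bit: 1 if the byte holds an odd number of 1 bits."""
    return bin(byte & 0xFF).count("1") % 2


def check(byte, stored_parity):
    """True if the byte still matches the parity bit stored with it."""
    return parity_bit(byte) == stored_parity


value = 0b10110010            # four 1 bits, so the parity bit is 0
p = parity_bit(value)
flipped = value ^ 0b00001000  # one bit flipped: detected
double = value ^ 0b00011000   # two bits flipped: slips through
print(p, check(value, p), check(flipped, p), check(double, p))
# prints: 0 True False True
```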

Not many computers use simple parity checking anymore; it doesn't
detect as many errors as full ECC memory (which stores several
check bits per memory word, commonly 8 bits per 64-bit word, and
can even correct single-bit errors).
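
To illustrate how extra check bits allow correction rather than just
detection, here is a toy Hamming(7,4) code in Python. This is not the
code real ECC memory uses (that works on much wider words); it is
only a small example of the same idea, with three check bits
protecting four data bits.

```python
def encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] as a 7-bit Hamming codeword."""
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # covers codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers codeword positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def correct(word):
    """Return (data bits, error position); position 0 means no error."""
    syndrome = 0
    for pos, bit in enumerate(word, start=1):
        if bit:
            syndrome ^= pos      # XOR of the positions holding a 1
    fixed = list(word)
    if syndrome:                 # non-zero syndrome names the bad position
        fixed[syndrome - 1] ^= 1
    return [fixed[2], fixed[4], fixed[5], fixed[6]], syndrome


word = encode([1, 0, 1, 1])
damaged = list(word)
damaged[4] ^= 1                  # flip the bit at position 5
print(correct(word), correct(damaged))
```

A single flipped bit is located and repaired; simple parity could
only have reported that something was wrong.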

4.4 Leveling the Cache

In computing, the faster you want to access something, the more
expensive it is to manufacture. Have you ever wondered why we
don't see all hard drives replaced with RAM? It's cheaper to
produce the same amount of storage on a disk than it is in
memory. This price saving comes at a penalty - speed. A hard
drive is on the order of 100,000 times slower to access than
standard memory (milliseconds against tens of nanoseconds).

Your CPU screams along at 400 million instructions a second
(obviously this depends on CPU clock speed). On a good day, that
means it needs to read one instruction every 2.5 nanoseconds
(2.5 x 10^-9). An average hard drive has an access time of around
8 milliseconds (8 x 10^-3). If the CPU was going to read
everything from the hard drive, for each instruction it executes
there are about 3.2 million clock cycles that it doesn't do
anything for. Pretty obviously that's a big waste of resources.
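
The arithmetic behind those figures, written out in Python. It
assumes, as the text does, one instruction per clock cycle; the
variable names are just for illustration.

```python
instructions_per_second = 400e6   # 400 million instructions a second
disk_access_seconds = 8e-3        # 8 millisecond disk access time

seconds_per_instruction = 1 / instructions_per_second
stalled_cycles = disk_access_seconds / seconds_per_instruction

print(f"{seconds_per_instruction * 1e9:.1f} ns per instruction")
print(f"{stalled_cycles:,.0f} cycles lost per disk access")
```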

In order to keep the CPU fed with instructions, we need memory
that is going to keep up. The problem now becomes one of how to
get data from slow moving devices into the fast moving ones -
that is the job of the various levels of memory. On your typical
computer you have the following:

     ~ Level 1 - The CPU cache built into the
     microprocessor. Currently that is between 8K and 32K
     bytes. Quickest as it operates at exactly the same
     speed as the CPU core

     ~ Level 2 - External Cache. The next level is much
     bigger at between 256K and 2MB. This normally operates
     at about half the CPU core clock speed.

     ~ Level 3 - Main Memory. Your standard RAM. Anywhere
     between 16MB and a few gigabytes. Access to this is
     around 66 or 100MHz, or probably a quarter of the CPU
     core clock speed.

     ~ Level 4 - Storage devices. Hard drives, CDs, Floppies
     etc. Access speed is around 10-50MHz depending on the
     device and connection (eg IDE is 16MHz, UW SCSI is
     80MHz, FireWire/FibreChannel is around 300MHz).

There are some real over-simplifications here, but it should give
you a rough idea of how each level slows down compared to the one
before it as we move away from the CPU. As another gross
simplification, the cost of one unit at each level stays roughly
the same, at around US$250, so you can see how cost influences
size and speed.

Now, that's a long introduction to this point. As you can see, we
get to smaller and smaller memory sizes the closer we get to the
CPU. That means we fit less in, and the chance of having to
fetch something from the next level out increases. Since this
next level is slower, we pay a penalty each time.
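
One way to put numbers on that penalty is an average access time
across the levels. In the sketch below every access that reaches a
level pays that level's access time, and the fraction that misses
moves on to the next level. The access times and hit rates are
invented for illustration, not measurements of any real machine.

```python
def average_access_ns(levels):
    """Average access time over a chain of memory levels.

    levels: list of (access_time_ns, hit_rate), fastest level first.
    The last level should have a hit rate of 1.0 so nothing falls off.
    """
    total, reach = 0.0, 1.0
    for access_ns, hit_rate in levels:
        total += reach * access_ns     # accesses that get this far pay here
        reach *= (1.0 - hit_rate)      # the misses continue to the next level
    return total


# Hypothetical L1, L2 and main memory figures.
big_l1 = average_access_ns([(2.5, 0.90), (5.0, 0.80), (60.0, 1.0)])
small_l1 = average_access_ns([(2.5, 0.80), (5.0, 0.80), (60.0, 1.0)])
print(round(big_l1, 2), round(small_l1, 2))   # about 4.2 and 5.9 ns
```

With these made-up numbers, losing ten points of hit rate in the
fastest level raises the average access time by roughly 40 percent.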

The simplistic approach to tuning your memory and cache is to put
as much of the fastest memory that you can buy into the machine.
This is the reason the Xeon, Sparc and Alpha chips can have as
much as 2MB of on board cache. For server applications where
there are huge amounts of number crunching to do, the more you can
cache the quicker everything runs.

Buying big caches doesn't always gain you extra. For example, if
you are mail serving or doing only static web pages, you will
gain almost nothing compared to more reasonable "standard"
amounts. For many operations, information gets dumped straight
from the hard drive or main memory, straight to the output
device. For example, sound clips or image textures may go
straight to output devices for processing. If you are dishing up
files from a server, you are much better off trying to store the
files in main memory rather than on the hard drive. Tuning a file
server usually involves buying lots of standard RAM rather than
big caches.

4.5 Bus speed

As you saw in the previous section, accessing a byte of
information is dependent on the rate at which you can access the
information in the device. However, that device needs to be
connected to the other devices so that means it must travel over
some intermediary connection. Like a dam with a straw allowing
water to escape, it doesn't matter how much or how quick the
device might be capable of delivering information if the bus
connecting the two runs like a glacier.

One immediate and fairly easy way to get information out quicker
is to start playing with the bus speeds for everything. A normal
computer comes with adjustments for PCI, ISA, Memory and a few
other items set to fairly conservative, safe settings. On the
other hand, the devices plugged into these busses usually have
some amount of tolerance for the bus speed moving around a bit.
This gives you room to tinker.

Playing with bus speeds is a bit of a trial and error approach.
Good quality components usually have quite a margin to play with,
but el cheapo components (like your $10 NE2000 clone network
card) won't tolerate it too much.

Bus speeds are usually only available in the BIOS setup. To tune
your bus, up the speed, reboot the machine and see if things
start locking up. Assuming the machine still boots OK, run
benchmarks over it. In some cases, increasing the bus speed will
_slow_down_ your machine due to the devices not adequately
dealing with higher than specified settings. Don't just set it as
high as possible and assume everything will be better. Find
benchmarks for the particular subsystem that you've played with
and check!
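
As a starting point for that kind of before-and-after check, here is
a minimal sequential-read timing sketch in Python. It is not a
substitute for a proper benchmark: the function name is made up, and
the demo file has only just been written, so it will mostly be read
back from the kernel's buffers rather than from the disk. For a
meaningful disk figure, point it at a file larger than your RAM.

```python
import os
import tempfile
import time


def read_throughput_mb_s(path, block_size=1024 * 1024):
    """Time a sequential read of `path` and return the rate in MB/s."""
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            total += len(block)
    elapsed = time.perf_counter() - start
    return total / (1024 * 1024) / max(elapsed, 1e-9)


# Demo on a small scratch file (4 MB of random bytes).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(4 * 1024 * 1024))
rate = read_throughput_mb_s(tmp.name)
os.remove(tmp.name)
print(f"{rate:.1f} MB/s")
```

Run the same measurement several times before and after each change
and compare the results, since a single run can vary a lot.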

--
※ Source: BBS SMTH (Shuimu Tsinghua) bbs.net.tsinghua.edu.cn [FROM: 162.105.179.11]