BBS 水木清华站: Digest Area

From: minix (海盗船长), Board: Linux
Subject: Apache for Developers
Posted at: BBS 水木清华站 (Sun Mar 14 17:20:23 1999)

October 1998


  Apache for Developers

    The latest Apache Web server features a modern architecture and a
    choice of solid development environments.




by Bjorn Borud
The Apache Web server is probably the most popular Web server among Web
professionals today. Some would say that this is despite the fact that
Apache is a free product developed mainly by what they refer to as
"enthusiasts" and despite the fact that little, if any, money has been
spent promoting it. My personal opinion is that Apache is exactly what
people want because it is made by the people who use it.

Works with: Apache for Unix, NT

The Apache project grew out of an effort to improve the NCSA httpd, which,
in early 1995, was the most popular Web server. The first incarnations of
Apache were based on NCSA httpd 1.3 and the name "Apache" reflects the
state of the project early on: It was "a patchy server"—a server that
consisted of NCSA httpd 1.3 and a series of patches.
Today Apache is best viewed as an application framework on which you can
build your solutions, rather than a shrink-wrapped product with a fixed
set of features. Sure, Apache is a good Web server in itself, but the real
advantage comes from its extensibility and the fact that many people
publish their extensions so others can use them directly or learn how to
create their own extensions.
      In this article, you will learn about the general development features of
Apache, as well as two particular environments that lend themselves to
serious applications. The first, PHP, is a strong language for
database-related functions. The second, mod_perl, moves Perl scripting
into the high-performance arena and extends what you can do with it.
      Face it: If you just want to serve up files, you can use almost any
Web server. If you are looking for an advanced development platform for
solving more involved problems, you have to consider which Web server
provides the most cost-effective solution. The API and the module
framework make Apache an excellent choice as a platform on which you can
build your Web applications. The fact that it is free and distributed in
source code form also helps ensure that there are plenty of developers who
have intimate knowledge of the server and will be able to assist you in
development.
      Another important aspect to consider if you are looking for a Web platform
is that Apache won't go away any time soon. Companies go out of business,
get taken over, or discontinue products regularly. Since Apache does not
really belong to any one organization it will stay around as long as there
are people who want to use it. The fact that Apache is the most popular
     Web server in use on the Internet today, with more than one million users,
suggests that people aren't going to lose interest any time soon.
     That community has brought Apache a long way from its patchy beginnings.
The current architecture no longer suffers from NCSA's request handling,
which terminated every process upon completion of each request. Now,
Apache uses a pool of processes that it establishes at startup, a much
more efficient use of server resources.
      Get Modular
      Since version 1.0, Apache has also been divided into modules (see Table
1). An API was also published to enable third-party developers to provide
  their own modules. For the developers of Apache, this meant that they
  could move much of the code out of the core of Apache and into modules.
  While the core of Apache takes care of all the tasks having to do with
  process management etc., the modules provide the more application-oriented
  functions of the Web server. Things like authentication, access checks,
  URL-to-filename translation, sending content back to the user, and logging
  the request were now put into modules so they could easily be maintained
  independently of the Apache core. In addition to the standard modules that
  come with Apache, there are quite a few modules that have been contributed
  by third parties to provide alternative ways to authenticate users, means
  to limit the bandwidth usage of different areas on your Web server, etc.


     For developers, modules mean it's easier to add functionality to the Web
server or even alter its existing functionality. Modules are usually
compiled into the server. When building Apache you specify which modules
you'd like to include and those modules will be compiled into the Web server
binary.
      Some people found this a bit cumbersome—having to recompile the entire
server to add just one module; so in more recent versions of Apache
(1.3.x) you can compile a module into a "shared object" (what Windows
people know as a DLL) and load the module at runtime. Adding or removing a
module can now be done by compiling the module and editing the
configuration file. This functionality did exist previously, but in a more
experimental form.
      In principle, the Apache Web server is not a complicated piece of
software. Simply put, it consists of a core that takes care of all the
low-level functions; a set of modules to provide whatever functionality
you would want from your Web server; and handlers to call those modules.
But how do the modules work? How does the Web server decide what to do and
when?
      Phases: When Apache receives a request it will go through a number of
"phases" in order to serve the request, as shown in Figure 1. That
simplifies the task of the developer who wants to extend some aspect of
the server. For instance, if the developer wants to provide her own
authentication modules she will only need to write the code needed to do
the actual authentication. The developer will not have to bother with the
other tasks that need to be performed, like determining the MIME type of
the requested object or even writing the code that sends the object back to
the user.
Handlers: A module can define "handlers" for one or more phases. For each
phase the server keeps a list of handlers from various modules that should
be called during that phase. Each module defines a hard-coded data
structure that identifies what phases it can handle.
When the server calls a handler, the handler performs its task and returns
a status code indicating how things went. An OK code will be returned if
the handler performed its task successfully.
The handler can also decline to handle the request and return the DECLINED
code, in which case the Web server simply ignores the handler and calls
the next handler in the list for that phase. Should an error occur, the
handler can indicate this by returning one of the HTTP error codes. The
server will then abort further request processing, write a message to the
error log and send an error message to the browser.
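The dispatch rules just described can be modelled in a few lines. What
follows is a minimal sketch in Python, not the real Apache C API: the phase
names, the handler functions, and the dictionary standing in for a request
are all made up for illustration. Only the idea of handlers returning OK,
DECLINED, or an HTTP error code comes from the description above.

```python
# Toy model of per-phase handler dispatch; not the real Apache module API.
OK = 0            # handler did its job
DECLINED = -1     # handler does not want this request
FORBIDDEN = 403   # an HTTP error code aborts the request


def run_phase(handlers, request):
    """Call handlers in order until one returns something other than DECLINED."""
    for handler in handlers:
        status = handler(request)
        if status != DECLINED:
            return status          # OK or an HTTP error code ends the phase
    return DECLINED                # no handler claimed this phase


def process_request(phases, request):
    """Run every phase in order; stop at the first error status."""
    for phase_name, handlers in phases:
        status = run_phase(handlers, request)
        if status not in (OK, DECLINED):
            request["log"].append("error %d in phase %s" % (status, phase_name))
            return status
    return OK


# Hypothetical handlers, each standing in for one module.
def deny_private(request):
    return FORBIDDEN if request["uri"].startswith("/private/") else DECLINED


def allow_everyone(request):
    return OK


def send_text(request):
    request["body"] = "contents of " + request["uri"]
    return OK


PHASES = [
    ("access_check", [deny_private, allow_everyone]),
    ("content", [send_text]),
]

if __name__ == "__main__":
    for uri in ("/index.html", "/private/notes.html"):
        req = {"uri": uri, "log": []}
        print(uri, process_request(PHASES, req), req["log"])
```

Note how deny_private only has to decide about the paths it cares about; for
everything else it declines and the next handler in the list gets its turn.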
    To sum up, a request goes through a number of phases. For each phase the
server maintains a list of handlers. The server will call each handler in
the list until a handler signals that it has handled the request or until
an error is reported. A module can contain one or more handlers.
      Writing Your Own Modules
      If you plan on writing your own modules you should start out by reading
the "Apache API notes" section of the Apache documentation that comes with
the server. This will give you a basic idea of how to write a module.
  After that, look at some of the modules that come with Apache. If you look
under the src/modules directory in the Apache 1.3 source distribution you
will find both standard and experimental modules plus a sample module
called mod_example that is heavily commented to help you understand what
it does.

Server-Side Programming

      Modules, like the Web server itself, are generally implemented in the C
      programming language. While it makes perfect sense to write a Web server
      in C, it may not be very practical for the average Web developer to use
      it. C is rather hard to use, even harder to debug, and judging by some of
      the code I have seen, it can be terribly hard to read and understand.
      The Web industry moves at a fast pace. Customers want their Web sites
      online quickly and many of the Web developers have little or no prior
      experience in software development. Needless to say, the unforgiving
      nature of C makes it hard for inexperienced developers or those stressed
      for time to produce reasonably stable code fast enough.
      Writing modules for Apache in C may be an option for some, but for others
      time constraints or lack of C experience may rule it out for a given
      project. Fortunately, Apache gives you other ways to add functionality,
      the most general being the Common Gateway Interface (CGI) and
      FastCGI.
      Perhaps the most common solution in the past, and presumably to this day,
      is to use CGI in conjunction with some scripting language like Perl. While
      widely used, CGI is actually rather crude. It relies on the Web server to
      spawn a new process, send the pertinent data for the request to the
      process, and then read the response from the program and send it back to
      the client. As mentioned earlier, starting and stopping processes is the
      nemesis of performance, so the standard CGI mechanism is likely to
      introduce bottlenecks into your system.
      Not only will the Web server have to spawn a new process, but if you use a
      scripting language like Perl, the Perl interpreter will have to read the
      script, load the appropriate Perl modules and compile the script into byte
      code that can then be executed. Even if you use a language that lets you
      produce pre-compiled binaries, there is still the significant overhead of
      spawning a new process.
      But CGI isn't all bad. The fact that it is so simple to use is probably
      what made it so popular in the first place, and an added bonus is that you
      are not limited to any one language when creating CGI scripts. If you
      like, you can use any language that is able to read environment variables
      and communicate using the standard I/O mechanisms.
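      To make the "environment variables plus standard I/O" contract concrete,
      here is a minimal sketch of a CGI-style program. Python is used purely for
      illustration; the function name build_response and the layout of the
      output are assumptions of this example. REQUEST_METHOD, QUERY_STRING, and
      CONTENT_LENGTH are standard CGI variables.

```python
# Minimal CGI-style program: request data arrives in environment variables
# (and on standard input); the response is written to standard output.
import os
import sys


def build_response(environ, posted=""):
    """Return a complete CGI response (headers, blank line, body) as a string."""
    method = environ.get("REQUEST_METHOD", "GET")
    query = environ.get("QUERY_STRING", "")
    lines = [
        "Content-Type: text/plain",
        "",                                   # blank line ends the headers
        "Method: " + method,
        "Query string: " + query,
        "Posted characters: %d" % len(posted),
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # For POST requests the server passes the body on stdin;
    # CONTENT_LENGTH says how much to read.
    length = int(os.environ.get("CONTENT_LENGTH") or 0)
    posted = sys.stdin.read(length) if length else ""
    sys.stdout.write(build_response(os.environ, posted))
```

      The Web server starts one such process per request, which is exactly the
      overhead discussed above.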
      But before you dismiss CGI as old hat, consider the FastCGI option. When
      you use FastCGI, your CGI scripts will not terminate between requests, but
      keep running, waiting for the next request to arrive, thus eliminating the
      overhead of starting and stopping the script for each request.
      Also, your CGI scripts don't have to run on the same machine as the Web
      server when you use FastCGI. You can run your CGI scripts on a different
      host to take the load off your Web server and thereby distribute the load
      across several machines.
      The advantages of using FastCGI are first and foremost speed, but also the
      fact that you can make use of it without having to throw away your
      existing CGI code. With some simple modifications to your CGI scripts you
      can migrate them into a FastCGI environment. For more information on
      FastCGI, point your browser to www.fastcgi.org.
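      The difference in process lifetime can be shown with a toy model. This
      sketch does not speak the FastCGI protocol and uses no FastCGI library;
      the Script class and the two serve functions are hypothetical names that
      only count how often the expensive startup happens under each scheme.

```python
# Toy comparison of process lifetimes; it does not speak the FastCGI protocol.
class Script:
    """Stands in for a script with an expensive startup (parsing, loading modules)."""
    startups = 0

    def __init__(self):
        Script.startups += 1          # count every time the script is started

    def handle(self, request):
        return "handled " + request


def serve_cgi_style(requests):
    # Plain CGI: a fresh script instance (process) for every request.
    return [Script().handle(r) for r in requests]


def serve_fastcgi_style(requests):
    # FastCGI-style: start once, then loop over incoming requests.
    script = Script()
    return [script.handle(r) for r in requests]


if __name__ == "__main__":
    serve_cgi_style(["a", "b", "c"])
    print("CGI startups:", Script.startups)
    Script.startups = 0
    serve_fastcgi_style(["a", "b", "c"])
    print("FastCGI-style startups:", Script.startups)
```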
      PHP: Easy DB Access

      When building Web applications, it would be convenient if you could make
      the Web server look up data in a SQL server and insert the data into HTML
      documents or templates without having to create and maintain CGI scripts.
      To get around the performance bottlenecks of a CGI-style interface and the
      tedium of putting HTML code inside print statements, you could embed a
      parser in the server which allows you to put code into your HTML markup.
      This is exactly what PHP does.
      PHP can also interpret input fields from forms and make the values
      available as variables in the programming language. Also, the interpreter
      can be compiled into the Apache Web server as a module. Now you only need
      to create an HTML document with code embedded and point your browser at
      the page. The Web server will automatically run the embedded code on
      loading the HTML file.
      Here's how it works. Imagine you have an HTML file containing a form:

<FORM ACTION="sho [...]

      [...]y database that OpenLink (www.openlink.co.uk) supports through the
      OpenLink broker. This means that you can use the features OpenLink
      provides to get persistent connections to databases and a single interface
      to several databases of different kinds. It also means that there's an
      easy way to use databases that aren't supported on your Web server
      platform. For instance, there are no client libraries for Oracle available
      under Linux (yet). If your Web server runs Linux and your database server
      runs Oracle under Solaris, OpenLink provides a way to use the database
      server from the Linux machine.
      To make database access more efficient, PHP offers persistent connections
      to databases in order to eliminate the need for connecting to the database
      every time. PHP will stay logged into the database between requests and
      re-use connections where possible. For some databases this makes database
      access considerably faster.
      Note that the connections are specific to the Web server process. There is
      currently no mechanism to pool connections within the main process of
      Apache and have the Web server processes share connections when needed.
      This means that if you have a large number of processes with persistent
      connections to your database, you will have an equally large number of
      connections to the database.
      As mentioned above, using OpenLink to pool connections in a middle tier
      might be a good option if this is a problem. Usually it is not a problem.
      Extending PHP
      Adding native support for a database or some custom functions could be
      awkward to implement in the PHP language itself; fortunately PHP is rather
      easy to extend. Besides the API documentation that comes with the PHP
      source distribution, there are many built-in functions you can look at for
      reference.
      Perhaps the most common problems when programming in C are memory and file
      descriptor leaks. You allocate memory, but when you are done using it you
      forget to give it back to the system. PHP provides a set of wrapper
      functions for allocating memory. When using these wrapper functions PHP
      will keep track of the memory you allocate and will then be able to clean
      up after your code when the memory is not needed anymore.
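      The idea of tracked allocation can be sketched as follows. This is a toy
      model in Python, not PHP's actual wrapper API: since Python manages memory
      itself, the example tracks file handles instead, and the class
      RequestResources with its methods is invented for illustration.

```python
# Toy model of request-scoped resource tracking; names are illustrative only.
import os
import tempfile


class RequestResources:
    def __init__(self):
        self._tracked = []

    def open_file(self, path, mode="r"):
        """Wrapper around open() that remembers the handle for later cleanup."""
        handle = open(path, mode)
        self._tracked.append(handle)
        return handle

    def release_all(self):
        """Called at the end of the request: close whatever is still open."""
        released = 0
        while self._tracked:
            handle = self._tracked.pop()
            if not handle.closed:
                handle.close()
                released += 1
        return released


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.close(fd)
    resources = RequestResources()
    resources.open_file(path)          # opened and never closed by the caller
    print("cleaned up:", resources.release_all())
    os.remove(path)
```

      Because every allocation goes through the wrapper, code that forgets to
      release a resource no longer leaks it past the end of the request.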
      PHP also provides a framework to handle persistent resources like database
      connections etc. This can boost your performance considerably when using
      PHP to communicate with systems that have a considerable startup cost
      associated with initiating a connection. As we mentioned earlier, database
      accesses can often be made persistent.
      Built-in Goodies
      PHP has a lot of built-in goodies that will come in handy when you want to
      develop Web applications. Since the cookie mechanism is a popular way of
      having the browser preserve state information across HTTP requests, PHP
      provides functions to manipulate cookies. Once a cookie is set its value
      can be accessed like a normal variable.
      For instance, if your PHP code contains the following:

  setcookie("session_id", "1234");
      a cookie will be sent back to the browser, and the next time the user
      accesses the site the cookie will be available through a variable with the
      same name as the cookie:

  echo "The session id is $session_id\n";
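      Underneath, this round trip is just two HTTP headers. The sketch below
      shows them using Python's standard http.cookies module rather than PHP;
      the cookie name and value are taken from the example above, and the
      variable names are this example's own.

```python
# What a cookie round trip looks like at the HTTP header level.
from http.cookies import SimpleCookie

# Response side: build the header that sets the cookie.
outgoing = SimpleCookie()
outgoing["session_id"] = "1234"
set_cookie_header = outgoing.output()     # one "Set-Cookie: ..." line per cookie

# Next request: the browser sends the cookie back in a "Cookie:" header,
# which the server parses to recover the value.
incoming = SimpleCookie()
incoming.load("session_id=1234")          # value of the Cookie request header
session_id = incoming["session_id"].value

if __name__ == "__main__":
    print(set_cookie_header)
    print("The session id is " + session_id)
```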
      For more details on the other parameters you can pass to setcookie(),
      check out the documentation for PHP (see "Availability"). Among the other
      neat features offered in PHP are on-the-fly image generation using the GD
      library and support for TrueType fonts in images using the FreeType
      library. PHP also provides an easy interface for file uploads from
      browsers and for arbitrary-precision math. You can talk to LDAP and IMAP
      servers, query SNMP agents, and even open files on other Web or FTP
      servers simply by using a URL instead of a regular file name.
      Development Using PHP
      Developing Web applications with PHP 3.0 is very straightforward. Once you
      have installed PHP 3.0 and set up the configuration properly you are ready
      to go. Since trying to parse all HTML pages isn't too much of a
      performance hit, I usually set up the server to interpret anything with
      the .html suffix as PHP.
      Now you can just create a file containing some HTML markup and some PHP
      code and save it to a file with the suffix that Apache will identify as a
      (potential) PHP file:

<TITLE>Test page</TITLE>
Hello there, this is my test page
   and today is
<?PHP
   echo date("l F d Y");
?>
      If you make a mistake, the PHP parser will output an error message to your
      browser and tell you in what file and on what line it detected an error.
      You can also configure PHP to issue warnings when you're about to do silly
      things, like using the values of variables before they have been
      initialized.
      Once your projects start to grow in size it is a good idea to put pieces
      of code that you use often into separate files and then use the include()
      command to load the code you need. Not only does this encourage code
      reuse, but it will also make your HTML files considerably smaller and more
      readable. An added bonus is that this will allow you to write less code in
      the long run and make it considerably easier to correct your bugs.
      To automate inclusion of often used libraries and code you can use the
      configuration directives php3_include_path, php3_auto_prepend_file, and
      php3_auto_append_file to make PHP load the appropriate libraries for you.
      As duly pointed out again and again by the critics of PHP, it is not a
      general-purpose language like Perl and thus lacks the immense number of
      features and libraries available for Perl. Nor does it have the ability to
      com [...]

      [...] of DBI is to
      provide a consistent set of methods and properties that the developer can
      use to access databases with different native APIs.
      The actual database communication is done by dispatching calls from the
      DBI layer to database-specific driver modules, called DBDs. In order to
      support a new database you need only install the appropriate DBD module
      or, if none exists for your database, perhaps develop your own.
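      The split between a common front end and per-database drivers can be
      sketched as a small registry. This is a toy model written in Python, not
      Perl's DBI: the names DRIVERS, register_driver, and connect are this
      example's own, and Python's built-in sqlite3 module stands in for a
      driver. Only the "dbi:driver:database" shape of the data source name
      follows DBI's convention.

```python
# Toy version of the DBI/DBD split: one front-end function, pluggable drivers.
import sqlite3

DRIVERS = {}   # driver name -> function that opens a connection


def register_driver(name, connect_func):
    DRIVERS[name] = connect_func


def connect(dsn):
    """Open a connection for a DSN of the form 'dbi:<driver>:<database>'."""
    prefix, driver, database = dsn.split(":", 2)
    if prefix != "dbi" or driver not in DRIVERS:
        raise ValueError("no driver for " + dsn)
    return DRIVERS[driver](database)


# The only driver registered here wraps Python's built-in sqlite3 module.
register_driver("sqlite", sqlite3.connect)

if __name__ == "__main__":
    conn = connect("dbi:sqlite::memory:")
    print(conn.execute("SELECT 1 + 1").fetchone()[0])
    conn.close()
```

      Supporting another database then means registering one more driver; the
      calling code keeps using connect() unchanged.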
      In any case, to users of mod_perl this means that whatever database the
      DBI interface supports, you can use it in Apache as well. At startup you
      can have mod_perl load the DBI interface and on most systems the processes
      in the Apache server pool will be able to share the code so you won't have
      to waste memory loading DBI in each child.
      Persistent Database Connections
      Another advantage of using mod_perl in conjunction with DBI is that you
      can create persistent connections to the database thereby avoiding the
      need to connect and disconnect from the database for every request that
      you serve.
      To use persistent database connections you should install the Apache::DBI
      module. After this is done you simply add PerlModule Apache::DBI to your
      configuration file before any of the modules that actually use DBI. The
      reason you have to load this module before any module that uses DBI is
      because the DBI package checks to see if Apache::DBI has been loaded. The
      Apache::DBI module stores database handles in a global hash and ignores
      any attempts to close the connection to the database.
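      The two tricks just mentioned, a global table of handles and a close
      operation that does nothing, can be sketched like this. It is a toy model
      in Python with sqlite3 as a stand-in database, not the Apache::DBI source;
      the names _HANDLES, CachedHandle, and connect are assumptions of this
      example.

```python
# Toy model of a per-process connection cache in the spirit of Apache::DBI.
import sqlite3

_HANDLES = {}   # global table: connect arguments -> cached handle


class CachedHandle:
    """Wraps a real connection; close() is deliberately a no-op."""

    def __init__(self, real):
        self._real = real

    def execute(self, sql, params=()):
        return self._real.execute(sql, params)

    def close(self):
        pass    # ignore the request so the connection survives for the next caller


def connect(database):
    """Return the cached handle for this database, opening it on first use."""
    if database not in _HANDLES:
        _HANDLES[database] = CachedHandle(sqlite3.connect(database))
    return _HANDLES[database]
```

      Code written for the non-persistent case keeps working unchanged: it
      still calls connect() and close() on every request, but only the first
      connect() in each process does any real work.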
      The connections are on a per-process basis; that is, a persistent
      connection is established the first time each Web server process makes a
      connection to the database. If you try to initialize a database connection
      in the parent process of the Web server you may run into a lot of trouble
      when several child processes try to access the same database connection at
      the same time. Therefore, make sure you do not load any code that opens up
      database connections using PerlRequire or PerlModule in the Apache
      configuration files.
      The Apache::DBI package also comes with some convenient modules for using
      databases in authenticating users. This is a neat alternative to the
      primitive password files or the somewhat awkward DBM files.
      PHP and mod_perl are equally well suited to database connectivity. Your
      choice should be based more on the degree of support that the package has
      for your database and which language you are more comfortable with. The
      advantage Perl has over PHP is that Perl has a more uniform interface
      (remember DBI) to databases, so it is probably easier to change database
      brands without having to rewrite the code. (Then again, databases are
      diverse critters, so it may be worse than that: You may have to redesign
      your application because the new database does things differently,
      regardless of whether you use PHP or Perl.)
      Embedding Perl in HTML
      You can also embed Perl code in HTML documents as you could in PHP. This
      is done using the HTML::Embperl package. This package not only offers
      embedded Perl in HTML documents for serving pages on-the-fly, but also the
      ability to generate static HTML files that can later be served without the
      need to run Perl scripts. The latter option may be an alternative if your
      content changes at regular intervals, say once per day.
      Using HTML::Embperl to embed Perl code in HTML documents is a bit
      different from using PHP. With PHP you only need to put <?PHP
      and ?> around your code. With HTML::Embperl you have the following ways of
      embedding code:
        [+ Perl code +] replaces the code inside the [+ and +] marks with the
        result of evaluating the Perl code. You can use variables, expressions
        and even arrays and hashes.
        [- Perl code -] will execute the command inside the brackets, but no
        output will be generated.
        [! Perl code !] is basically the same as [- Perl code -] but the code
        will only be executed on the first request. This way of calling Perl
        code can be used for defining subroutines or doing initializations.
        [$ Cmd Arg $] HTML::Embperl has a set of meta-commands that allow you to
        do things like:

       [$ if $ENV{REQUEST_METHOD} eq 'GET' $]
       Method was GET
       [$ else $]
       Method other than GET used
       [$ endif $]
        that cannot be done just by inserting plain Perl code because statements
        like if (...) {...} else {...} have to be contained within a single pair
        of brackets.
      When you use HTML::Embperl, the pages are cached much like normal Perl
      code is cached: the page and its Perl code are compiled and stored in
      memory the first time the page is requested and again each time the file
      changes. The page is not cached as static output, so the code will be
      executed (but not reloaded and recompiled unless the file changes) on
      each request.
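      This "compile once, run every time" behavior can be sketched with a cache
      keyed on the file's modification time. The code is a toy model in Python,
      not HTML::Embperl's implementation; _CACHE, STATS, load_page, and run_page
      are names made up for this example, and comparing modification times is
      an assumption about how a change might be detected.

```python
# Toy page cache: compile on first use, recompile only when the file changes.
import os
import tempfile

_CACHE = {}                 # path -> (mtime, compiled code object)
STATS = {"compiles": 0}     # how many times a page was (re)compiled


def load_page(path):
    """Return compiled code for path, reusing the cached copy if still fresh."""
    mtime = os.path.getmtime(path)
    cached = _CACHE.get(path)
    if cached is None or cached[0] != mtime:
        with open(path) as source:
            code = compile(source.read(), path, "exec")
        _CACHE[path] = (mtime, code)
        STATS["compiles"] += 1
    return _CACHE[path][1]


def run_page(path, variables):
    """Execute the page on every request; only compilation is cached."""
    exec(load_page(path), variables)
    return variables


if __name__ == "__main__":
    fd, page = tempfile.mkstemp(suffix=".py")
    os.write(fd, b"hits = hits + 1\n")
    os.close(fd)
    state = {"hits": 0}
    run_page(page, state)
    run_page(page, state)
    print(state["hits"], STATS["compiles"])
    os.remove(page)
```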
      CPAN
      The Comprehensive Perl Archive Network (CPAN) is a globally mirrored
      archive that contains a huge number of Perl modules. If the modules that
      come with Perl (or even mod_perl) lack something, this is the place to
      look. For a list of CPAN sites you should check the "perlmodlib" manual
      page that comes with Perl 5 or just visit www.perl.com.
      To make it easier to find, manage, and install Perl modules there is also
      something called the "CPAN shell" that will present you with a command
      line interface through which you can search for and install modules. Be
      warned, though: The CPAN shell won't always work as well as you'd want and
      sometimes manual intervention is needed in order to make things work
      properly.
      Nevertheless, the CPAN shell is very practical in day-to-day use and it
      can simplify installation if you end up needing to install more than one
      package due to dependencies between packages.
      The mod_perl home page is located at http://perl.apache.org/ and you will
      find everything you need in terms of software, documentation, and links to
      interesting information from that site. Start by downloading just the
      mod_perl package (and of course Apache) and build an Apache Web server
      with mod_perl first. Then, as you familiarize yourself with mod_perl, you
      can download and try out the various packages that extend
      mod_perl to fit your needs.
      PHP or mod_perl?
      Throughout this article I've tried to give a balanced introduction to what
      PHP and mod_perl can offer when used in conjunction with Apache. The
      packages cater to slightly different audiences perhaps and I would
      hesitate to say that one is inherently better than the other.
      If you are considering which Web server infrastructure to offer your
      users, install both and let users use whatever they like more. If you are
      planning to undertake a large project, give both packages a spin and
      listen to your developers after they've tried out both and familiarized
      themselves with the software.
      To the new user, I would recommend starting with PHP, though. It is easy
      to understand and easy to use. Also PHP 3.0 seems to be more common these
      days than mod_perl. I have completed several large projects using PHP and
      never regretted the choice.



      Bjorn Borud is a partner of Guardian Networks (www.guardian.no) in Norway
      where he also works as a developer and consultant. Guardian Networks
      specializes in Internet security, Unix, and creating the magic behind the
      scenes for Web sites. Bjorn can be reached at borud@guardian.no.


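--
    If you want freedom, love, and happiness,
           you can only trade your faith, determination, and love for them;
                there is absolutely no other way, is there?

※ Modified: minix on Mar 18 18:24:32 [FROM:  159.226.41.165]
※ Source: BBS 水木清华站 bbs.net.tsinghua.edu.cn [FROM: 159.226.41.165]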