CMPS 335 Advanced Web Publishing Perl and CGI Programming


Processing Forms


CGI and CGI Environment Variables

The Common Gateway Interface (CGI) is the standard that defines how a Web server communicates with programs that run on the server on behalf of a Web browser.  It is the interface between a Web page being displayed by a browser and a program that runs on the server.  A Web page can call a CGI program for a service, and the CGI program usually communicates back by creating an HTML document.  The HTML document is sent back to the browser through the Web server.  All Web browsers and servers communicate using HTTP requests and responses, each of which carries a set of headers.  Every HTTP request begins with a request line that names the request method.  Browsers commonly use the GET and POST methods when requesting HTML documents.  The POST method is frequently used when passing form data from the browser to the server.

One of the keystones of CGI is a collection of data called environment variables.  These variables are the primary mechanism by which a Web server communicates with CGI scripts.  A CGI script is a program that runs on a Web server and is triggered by a request from a Web browser.  Environment variables are created by the server, and their values are set automatically by the server each time a CGI script is executed.  In Perl, these variables and their values are stored as key-value pairs in a special hash called %ENV.  Some environment variables give information about the server and its hardware and do not change from one request to the next.  Other variables contain information about the data available to a script and can be different every time a script is executed.  Your CGI script can use any of these environment variables as needed.

The following three environment variables contain information about the data that is being passed from the browser to the server and they are essential to your CGI script.
   Variable Name (key)     Value
   -----------------------------------------------------------
   REQUEST_METHOD      GET or POST            
   QUERY_STRING        Form data sent from the browser (GET method)
   CONTENT_LENGTH      Length of the posted data (POST method) 

REQUEST_METHOD has the value GET or POST, depending on how the form data is submitted.  QUERY_STRING contains the data appended to the URL of a link or to the URL in the Action attribute of the Form tag.  CONTENT_LENGTH contains the size of the data submitted with the POST method, which the script reads from the standard input, known as STDIN.  The size is the number of bytes in the input buffer.
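
As a minimal sketch of how a script might inspect these three variables, the following Perl program reports each one, or "(not set)" when the server did not supply it.  The subroutine name describe_request and the plain-text output format are assumptions made for this illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# describe_request is a hypothetical helper written for this example.
# It takes a hash of environment variables and returns one line per
# form-related variable.
sub describe_request {
    my (%env) = @_;
    my @lines;
    foreach my $key (qw(REQUEST_METHOD QUERY_STRING CONTENT_LENGTH)) {
        my $value = defined $env{$key} ? $env{$key} : '(not set)';
        push @lines, "$key = $value";
    }
    return join("\n", @lines) . "\n";
}

# Send a plain-text response header, a blank line, and then the report.
print "Content-type: text/plain\n\n";
print describe_request(%ENV);
```

When the script is run from the command line instead of through a Web server, all three variables are normally reported as "(not set)".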



URL Encoding Conventions

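URL encoding is a scheme used by a Web browser to encode data to be passed to a Web server for processing by a CGI script.  The browser collects the contents of all NAME and VALUE attributes from the form, encodes them as name=value pairs, and sends them to the server.  URL encoding follows these rules: each name is joined to its value with an equal sign (=), the name=value pairs are separated by ampersands (&), spaces are replaced by plus signs (+), and other special characters are replaced by a percent sign (%) followed by the character's two-digit hexadecimal code.  Some of the commonly encoded special characters are: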
       Character   URL Encoded String
       ------------------------------
          &          %26
          /          %2F
          :          %3A
          ;          %3B
          @          %40
          ~          %7E

   Example: Form Data (two VALUES are entered by the visitor)

      NAME Attribute  VALUE Attribute
     ----------------------------------
      studentname     David Smith
      age             21

   Encoded data:   studentname=David+Smith&age=21     
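
The encoding conventions in this section can be sketched in Perl roughly as follows.  The subroutine names url_encode and url_decode are hypothetical, and the sketch assumes the data consists of single-byte characters; a browser performs the encoding step itself, so a CGI script normally only needs the decoding half.

```perl
use strict;
use warnings;

# Encode one name or value (hypothetical helper, single-byte text assumed).
# Letters, digits, and the characters _ . * - are left alone, every other
# character except the space becomes %XX, and spaces become + signs.
sub url_encode {
    my ($text) = @_;
    $text =~ s/([^A-Za-z0-9 _.*-])/sprintf("%%%02X", ord($1))/eg;
    $text =~ tr/ /+/;
    return $text;
}

# Reverse the encoding: + becomes a space, %XX becomes one character.
sub url_decode {
    my ($text) = @_;
    $text =~ tr/+/ /;
    $text =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
    return $text;
}

# The form data from the example above, kept in the order of the form.
my @fields = ( [ 'studentname', 'David Smith' ], [ 'age', '21' ] );

my $encoded = join '&',
    map { url_encode($_->[0]) . '=' . url_encode($_->[1]) } @fields;

print "$encoded\n";                            # studentname=David+Smith&age=21
print url_encode('a&b/c:d;e@f~g'), "\n";       # a%26b%2Fc%3Ad%3Be%40f%7Eg
print url_decode('David+Smith'), "\n";         # David Smith
```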

Processing of Form Data

The easiest and most common way to get input from your visitors is with a form on your Web page.  The following figure gives a basic description of how CGI works.

      ************    (1)    ***********    (2)    ***********
      *  Web     *  ------>  *  Web    *  ------>  *  CGI    *
      *  browser *  <------  *  server *  <------  *  script *
      ************    (4)    ***********    (3)    *********** 
   Steps:
  1. Browser's request
    A visitor fills out the form in the browser.  When the Submit button is clicked, the form data is encoded and sent from the browser to the server using one of two request methods: GET or POST.  The POST method places the data in the body of the request rather than in the URL, so it is not restricted by URL length limits, and it is the one generally used.
     
  2. Data available to a CGI script
    Form data sent via the GET method is available to the CGI script in the %ENV environment hash, using the hash key QUERY_STRING.  Data transferred using the POST method is available to the CGI script in the input buffer represented by the filehandle STDIN, and the exact amount of data in the standard input, in bytes, is stored in CONTENT_LENGTH.  Most CGI scripts get their input through the POST method, but the QUERY_STRING variable is handy for special circumstances, such as passing encoded data appended to the URL in the Action attribute of the <FORM> tag or to the URL of an <A> tag.  The appended data begins with a question mark (?).
     
  3. Output from the script
    The data available to the script is encoded as a long stream of name=value pairs separated by ampersands.  The CGI script parses the stream of input data, processes the data, and outputs the requested information formatted in HTML.  The CGI script is responsible for generating the response header
              Content-type:text/html
    and a blank line that terminates the header transmission
    (Perl statement: print "Content-type:text/html\n\n";).  The response header tells the browser that the data returned by the server is HTML text.
     
  4. Server's response
    The server sends the HTML document to the browser.  The browser interprets the HTML document and displays the Web page in the browser window.
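
Steps 3 and 4 can be illustrated with a very small script.  This is only a sketch: the subroutine name build_response and the wording of the page are assumptions made for the example, and the script does no form processing yet.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# build_response is a hypothetical helper.  It returns the complete
# output of a CGI script: the response header, the blank line that ends
# the header, and a small HTML document.
sub build_response {
    my ($method) = @_;
    my $html = "<html>\n"
             . "<head><title>CGI Response</title></head>\n"
             . "<body>\n"
             . "<p>Request method: $method</p>\n"
             . "</body>\n"
             . "</html>\n";
    return "Content-type: text/html\n\n" . $html;
}

# The server sets REQUEST_METHOD; fall back to a label when run by hand.
my $method = defined $ENV{REQUEST_METHOD} ? $ENV{REQUEST_METHOD} : 'unknown';
print build_response($method);
```

Whatever the script prints after the blank line is passed by the server to the browser, which displays it as a Web page.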

When the Submit button is clicked, the browser collects all the form data and sends the data to the server in a long stream of encoded name=value pairs separated by ampersands. 

Examples:
   number1=48&operator=mul&number2=7
   studentname=Mary+Smith&class=Junior
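
To make the two example streams concrete, the following sketch splits each one into a hash of names and values.  The subroutine names decode_field and parse_pairs are hypothetical, and treating the operator value mul as multiplication is an assumption about what the form intends.

```perl
use strict;
use warnings;

# Decode one URL-encoded name or value (hypothetical helper).
sub decode_field {
    my ($text) = @_;
    $text = '' unless defined $text;
    $text =~ tr/+/ /;
    $text =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
    return $text;
}

# Split a stream of name=value pairs into a hash (hypothetical helper).
# For simplicity, a repeated name keeps only its last value here.
sub parse_pairs {
    my ($stream) = @_;
    my %data;
    foreach my $pair (split /&/, $stream) {
        my ($key, $value) = split /=/, $pair, 2;
        $data{ decode_field($key) } = decode_field($value);
    }
    return %data;
}

my %calc    = parse_pairs('number1=48&operator=mul&number2=7');
my %student = parse_pairs('studentname=Mary+Smith&class=Junior');

# Assumption: the operator value "mul" asks for a multiplication.
if ($calc{operator} eq 'mul') {
    my $product = $calc{number1} * $calc{number2};
    print "$calc{number1} x $calc{number2} = $product\n";   # 48 x 7 = 336
}
print "$student{studentname} is a $student{class}\n";       # Mary Smith is a Junior
```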

For the stream of encoded form data to be useful, a CGI script must be used to parse the encoded data.  Form parsing scripts serve as the backbone of CGI scripts that process form data. 

A form parsing script called Parse_Form_POST is shown below.  The Parse_Form_POST script parses the encoded form data sent from the browser and places name-value pairs in a hash called %formdata.  Each NAME attribute from the HTML form corresponds to a key in the %formdata hash, and each VALUE attribute (or data typed by the visitor in text boxes) corresponds to a value in the %formdata hash.  You need to include this Parse_Form_POST script, or the more general Parse_Form script that follows it, as a subroutine in your Perl CGI scripts.

The Parse_Form_POST script
 
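# The subroutine is in the /cgi-bin/subroutines.lib file
# For parsing form data sent by the POST method
# Fills the global hash %formdata with the decoded name-value pairs

sub Parse_Form_POST 
{
  my ($buffer, @pairs, $key, $value);

  # Read exactly CONTENT_LENGTH bytes of posted data from standard input
  read (STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
  @pairs = split(/&/, $buffer);
  foreach my $pair (@pairs)
  {
    # Split on the first = only; a name with no value gets an empty string
    ($key, $value) = split (/=/, $pair, 2);
    $value = '' unless defined $value;

    # Decode: + becomes a space, %XX becomes the character with hex code XX
    $key =~ tr/+/ /;
    $key =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
    $value =~ tr/+/ /;
    $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;

    # A repeated name (for example, a checkbox group) is stored as a
    # comma-separated list; exists is used so that a value of 0 is kept
    if (exists $formdata{$key}) 
      {
        $formdata{$key} .= ", $value";
      }
    else 
      {
        $formdata{$key} = $value;
      }
  }
}
1;  # a library file loaded with require must end with a true value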
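
The way a script receives POST data can be tried without a Web server.  The sketch below imitates the server by setting CONTENT_LENGTH and reopening STDIN on a string held in memory; the subroutine name read_post_body and the sample data are assumptions made for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# read_post_body is a hypothetical helper.  It reads exactly
# CONTENT_LENGTH bytes from standard input and returns them.
sub read_post_body {
    my $length = $ENV{CONTENT_LENGTH} || 0;
    my $buffer = '';
    read(STDIN, $buffer, $length) if $length > 0;
    return $buffer;
}

# Imitate what the Web server does for a POST request (sample data).
my $body = 'studentname=Mary+Smith&class=Junior';
$ENV{REQUEST_METHOD} = 'POST';
$ENV{CONTENT_LENGTH} = length $body;

# Reopen STDIN on the in-memory string so that read() sees the sample data.
close STDIN;
open(STDIN, '<', \$body) or die "Cannot reopen STDIN: $!";

my $buffer = read_post_body();
print "Read ", length($buffer), " bytes: $buffer\n";
```

On a real server the script would skip the simulation lines and call the reading code directly, because the server has already set the environment variables and connected STDIN to the posted data.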

The Parse_Form script

# The subroutine is in the /cgi-bin/subroutines.lib file
# For parsing form data sent by either the POST or GET method

sub Parse_Form 
{
  my ($buffer, @pairs, @getpairs, $key, $value);

  if ($ENV{'REQUEST_METHOD'} eq 'GET')
  {
    @pairs = split(/&/, $ENV{'QUERY_STRING'});
  } 
  elsif ($ENV{'REQUEST_METHOD'} eq 'POST') 
  {
    read (STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
    @pairs = split(/&/, $buffer);
    # A POST request may also carry data appended to the URL
    if ($ENV{'QUERY_STRING'}) 
    {
      @getpairs = split(/&/, $ENV{'QUERY_STRING'});
      push(@pairs, @getpairs);
    }
  }
  else 
  {
    print "Content-type: text/html\n\n";
    print "<P>Use POST or GET";
  }

  foreach my $pair (@pairs) 
  {
    # Split on the first = only; a name with no value gets an empty string
    ($key, $value) = split (/=/, $pair, 2);
    $value = '' unless defined $value;

    # Decode: + becomes a space, %XX becomes the character with hex code XX
    $key =~ tr/+/ /;
    $key =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
    $value =~ tr/+/ /;
    $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
    # Remove embedded HTML comments (such as server-side include directives)
    $value =~ s/<!--(.|\n)*-->//g;

    # A repeated name is stored as a comma-separated list
    if (exists $formdata{$key}) 
    {
      $formdata{$key} .= ", $value";
    }
    else 
    {
      $formdata{$key} = $value;
    }
  }
}
1;

