Query string

This is an old revision of this page, as edited by 204.244.25.251 (talk) at 16:15, 1 December 2005 (Syntax). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

On the World Wide Web, a query string is the part of a URL that contains data to be passed to CGI programs.

The Mozilla location bar showing a URL with the query string title=Main_page&action=raw

When a web page is requested via the Hypertext Transfer Protocol, the server locates a file in its file system based on the requested URL. This file may be a regular file or a program. In the second case, the server may (depending on its configuration) run the program and send its output as the requested page. The query string is the part of the URL that is passed to the program. This way, the URL can encode data that is accessible to the program generating the web page.

Syntax

URLs that identify files on the server typically have the following form (actual URLs may be considerably more complicated):

http://server/path/file

If the requested web page is to be generated by a program, the URL can be constructed as follows:

http://server/path/program?query_string

When a server receives a request for such a page, it runs the program (if configured to do so) and passes the query_string to it unchanged. The question mark is used as a separator and is not part of the query string.
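
As a rough illustration of this split, the following Python sketch separates the program path from the query string using the standard urllib.parse module. The helper name split_query is hypothetical and chosen only for this example; a real web server performs this step internally.

```python
from urllib.parse import urlsplit

def split_query(url):
    """Return (path, query_string); the '?' separator is in neither part."""
    parts = urlsplit(url)
    return parts.path, parts.query

path, query = split_query("http://server/path/program?query_string")
print(path)   # /path/program
print(query)  # query_string
```

A URL without a question mark simply yields an empty query string.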

If the URL results from submitting a web form, the query string has the following form:

parameter1=value1&parameter2=value2&parameter3=value3...
  • The query string is composed of a series of parameter=value pairs
  • Within each pair, the parameter name and its value are separated by an equals sign, '='
  • The pairs are separated from one another by the ampersand, '&' (also by ';' in the newer W3C recommendations [1])

For each field of the form, the query string contains a pair parameter=value. Web forms may include fields that are not visible to the user, and these fields are included in the query string when the form is submitted.
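
The structure described above can be sketched in Python as follows. The function split_pairs is a hypothetical helper written for this example; it only splits on '&' and '=' and does no decoding. The standard library function urllib.parse.parse_qsl is shown alongside it for comparison, since it performs the same split and also decodes the values.

```python
from urllib.parse import parse_qsl

def split_pairs(query_string):
    """Split a query string into (parameter, value) tuples without decoding."""
    pairs = []
    for item in query_string.split("&"):
        # partition() splits at the first '=' only, so a value may contain '='
        name, _, value = item.partition("=")
        pairs.append((name, value))
    return pairs

query_string = "parameter1=value1&parameter2=value2&parameter3=value3"
print(split_pairs(query_string))
# [('parameter1', 'value1'), ('parameter2', 'value2'), ('parameter3', 'value3')]
print(parse_qsl(query_string) == split_pairs(query_string))  # True
```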

URL Encoding

The parameter=value pairs in the query string are encoded according to a scheme known as URL Encoding. This is necessary because some characters cannot be part of a URL (for example, the space) and some other characters have a special meaning in a URL (for example, the character #, which is used to locate a point within a page).

In particular, RFC 1738 specifies that “only alphanumerics, the special characters "$-_.+!*'(),", and reserved characters used for their reserved purposes may be used unencoded within a URL”. Any character in a query string can be replaced by the symbol % followed by its hexadecimal value. For example, the equal sign can be replaced by %3D. For the characters that are forbidden in a query string, this replacement is not only possible but necessary.

The space character can also be represented by +.
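
A minimal sketch of these rules, using the quote, quote_plus and unquote_plus functions from Python's standard urllib.parse module. The variable names are chosen for this example only.

```python
from urllib.parse import quote, quote_plus, unquote_plus

# Percent-encoding: '%' followed by the character's hexadecimal value.
encoded_equals = quote("=", safe="")
print(encoded_equals)            # %3D

# A space can be written as %20 or, in the form-style variant, as '+'.
space_as_percent = quote("a b", safe="")
space_as_plus = quote_plus("a b")
print(space_as_percent)          # a%20b
print(space_as_plus)             # a+b

# Decoding reverses both conventions.
decoded = unquote_plus("a+b%3Dc")
print(decoded)                   # a b=c
```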

Example

If a form is embedded in an HTML page as follows:

 <form action=cgi-bin/test.cgi method=get>
 <input type=text name=first>
 <input type=text name=second>
 <input type=submit>
 </form>

and the user enters the strings “this is a field” and “was it clear (already)?” in the two text fields and presses the submit button, the program test.cgi will receive the following query string:

 first=this+is+a+field&second=was+it+clear+%28already%29%3F

On UNIX-based web servers, the program receives the query string as an environment variable named QUERY_STRING.
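
The following Python sketch shows how a program started by the server might read and decode that environment variable. It assumes a CGI-style setup in which the server sets QUERY_STRING before running the script; the function name read_form_fields is hypothetical.

```python
import os
from urllib.parse import parse_qs

def read_form_fields(environ=os.environ):
    """Decode QUERY_STRING into a dict mapping each parameter to a list of values."""
    raw = environ.get("QUERY_STRING", "")  # empty if the URL had no query string
    return parse_qs(raw)

if __name__ == "__main__":
    # A CGI program writes a header block, a blank line, then the page body.
    print("Content-Type: text/plain")
    print()
    for name, values in read_form_fields().items():
        print(name, "=", values)
```

Each value is a list because the same parameter name may appear more than once in a query string.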

Tracking

A program receiving a query string can ignore part or all of it. If the requested URL corresponds to a file and not to a program, the whole query string is ignored. However, regardless of whether the query string is used or not, the whole URL including it is stored in the server log files.

These facts allow query strings to be used to track users in a manner similar to that provided by HTTP cookies. For this to work, every time the user downloads a page, a unique identifier is chosen and added as a query string to the URLs of all links the page contains. As soon as the user follows one of these links, the corresponding URL is requested from the server. This way, the download of this page is linked with the previous one.

For example, when a web page containing the following is requested:

 <a href="frank.html">see my page!</a>
 <a href="ciccio.html">mine is better</a>

a unique string, such as sdfsd23423, is chosen, and the page is modified as follows:

 <a href="frank.html?sdfsd23423">see my page!</a>
 <a href="ciccio.html?sdfsd23423">mine is better</a>

The addition of the query string does not change the way the page is shown to the user. When the user follows, for example, the first link, the browser requests the page frank.html?sdfsd23423 from the server, which ignores what follows the ? and sends the page frank.html as expected, adding the query string to its links as well.

This way, any subsequent page request from this user will carry the same query string sdfsd23423, making it possible to establish that all these pages have been viewed by the same user. Query strings are often used in association with web beacons.
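
The link rewriting described above could be sketched in Python roughly as follows. This is an illustration under simplifying assumptions, not how any particular server does it: the names tag_links and new_token are hypothetical, links are matched with a simple regular expression rather than a full HTML parser, and links that already carry a query string or fragment are left untouched.

```python
import re
import secrets

# Matches href="..." values that contain no '?' or '#' yet.
HREF = re.compile(r'href="([^"?#]+)"')

def new_token():
    """Choose a fresh identifier for a visitor (10 hexadecimal characters)."""
    return secrets.token_hex(5)

def tag_links(html, token):
    """Append ?token to every plain link in the page."""
    return HREF.sub(lambda m: f'href="{m.group(1)}?{token}"', html)

page = ('<a href="frank.html">see my page!</a>\n'
        '<a href="ciccio.html">mine is better</a>')
print(tag_links(page, "sdfsd23423"))
# <a href="frank.html?sdfsd23423">see my page!</a>
# <a href="ciccio.html?sdfsd23423">mine is better</a>
```

When a request arrives that already carries an identifier, the same identifier would be reused for the links of the next page instead of calling new_token again.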

The main differences between query strings used for tracking and HTTP cookies are that:

  1. query strings form part of the URL, and are therefore included if the user saves the URL or sends it to another user; cookies can be maintained across browsing sessions, but are not saved or sent with the URL
  2. if the user arrives at the same web server by two (or more) independent paths, they will be assigned two different query strings, while the stored cookies are the same

See also