Query string: Difference between revisions

In the [[World Wide Web]], a '''query string''' is the part of a [[Uniform Resource Locator|URL]] that contains data to be passed to [[Common Gateway Interface|CGI]] programs.
 
[[Image:Url.png|frame|The [[Mozilla]] URL location bar showing a URL with the query string <code>title=Main_page&action=raw</code>]]
 
When a [[web page]] is requested via the [[HyperText Transfer Protocol]], the server locates a file in its [[file system]] based on the requested [[Uniform Resource Locator|URL]]. This file may be a regular file or a program. In the second case, the server may (depending on its configuration) run the program, sending its output as the required page. The query string is a part of the URL which is passed to the program. This way, the URL can encode some data that is accessible to the program generating the web page.
 
==Structure==
 
The URLs of documents to be generated by programs may contain a query string that is passed to the program. A typical such URL is as follows:
 
:<code>field<sub>1</sub>=value<sub>1</sub>&field<sub>2</sub>=value<sub>2</sub>&field<sub>3</sub>=value<sub>3</sub>...</code>
 
* The query string is composed of a series of field=value pairs.
* Within each pair, the field name and the value are separated by an [[equal sign]].
* The pairs are separated from one another by the [[ampersand]], '&' (or by ';' in the newer [[W3C]] recommendations [http://www.w3.org/TR/1999/REC-html401-19991224/appendix/notes.html#h-B.2.2]).
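
As a rough illustration of this structure, the following Python sketch splits a query string into its pairs with the standard library function <code>urllib.parse.parse_qsl</code>. The sample string is invented for the example, and only the '&' separator is handled.

```python
from urllib.parse import parse_qsl

# Hypothetical query string with three field=value pairs.
query = "field1=value1&field2=value2&field3=value3"

# parse_qsl splits the string on '&', then splits each pair on its first '='.
pairs = parse_qsl(query)
print(pairs)
```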
 
Technically, the form content is encoded as a query string when the form submission method is GET. The same encoding is used by default when the submission method is POST, but the result is not sent as a query string, that is, is not added to the action URL of the form. Rather, the string is sent as the body of the request.
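
The difference between the two submission methods can be sketched in Python as follows. The field names, field values and action URL are assumptions made for the example. No request is actually sent; the code only shows where the encoded string would be placed.

```python
from urllib.parse import urlencode

# Hypothetical form content and action URL, chosen only for illustration.
fields = {"first": "a b", "second": "c"}
action = "http://example.com/cgi-bin/test.cgi"

# The same encoding is applied for both submission methods.
encoded = urlencode(fields)

# GET: the encoded string is appended to the action URL after '?'.
get_url = action + "?" + encoded

# POST: the action URL is left unchanged and the encoded string
# becomes the body of the request.
post_url = action
post_body = encoded.encode("ascii")
```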
 
==URL encoding==
Some characters cannot be part of a URL (for example, the space) and some other characters have a special meaning in a URL: for example, the character <code>#</code> is used to locate a point within a page; the character <code>=</code> is used to separate a name from a value. A query string may need to be converted to satisfy these constraints. This can be done using a schema known as [[URL encoding]].
 
 
In particular, [[Request for Comments|RFC]] 1738 specifies that &ldquo;only alphanumerics, the special characters "$-_.+!*'(),", and reserved characters used for their reserved purposes may be used unencoded within a URL&rdquo;. Any character in a query string can be replaced by its hexadecimal value preceded by the symbol <code>%</code>. For example, the equal sign can be replaced by <code>%3D</code>. For the characters that are forbidden in a query string, this replacement is not only possible but necessary.
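
A minimal Python sketch of this replacement, using <code>quote</code> and <code>unquote</code> from the standard <code>urllib.parse</code> module. The variable names are only illustrative.

```python
from urllib.parse import quote, unquote

# safe="" asks quote() to percent-encode reserved characters as well.
encoded_equals = quote("=", safe="")   # the equal sign
encoded_hash = quote("#", safe="")     # the fragment marker
encoded_space = quote(" ", safe="")    # the space, which cannot appear in a URL

# unquote() reverses the percent-encoding.
decoded = unquote("%3D%23%20")
```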
 
==RFC==
As defined in RFC 1738, a URL of scheme <code>http</code> can contain a ''searchpart'' following the rest of the URL and separated from it by a <code>?</code> character. RFC 3986 specifies that the ''query component'' of a [[Uniform Resource Identifier|URI]] is the part between the <code>?</code> and the end of the URI or the character <code>#</code>. The term ''query string'' is commonly used to refer to this part in the case of HTTP URLs.
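
The position of the query component can be checked with a short Python sketch based on the standard function <code>urllib.parse.urlsplit</code>. The URL below is a made-up example.

```python
from urllib.parse import urlsplit

# Hypothetical URL with a path, a query component and a fragment.
url = "http://example.com/cgi-bin/test.cgi?first=a&second=b#top"

parts = urlsplit(url)

# The query is what lies between '?' and '#' (or the end of the URL);
# neither delimiter is part of it.
path = parts.path
query = parts.query
fragment = parts.fragment
```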
 
 
==Example==
 
If a form embedded in an [[HTML]] page as follows:
<form action=cgi-bin/test.cgi method=get>
<input type=text name=first>
<input type=text name=second>
<input type=submit>
</form>
 
and the user inserts the strings &ldquo;this is a field&rdquo; and &ldquo;was it clear (already)?&rdquo; in the two [[textfield]]s and presses the submit button, the program <code>test.cgi</code> will receive the following query string:
first=this+is+a+field&second=was+it+clear+%28already%29%3F
 
In [[UNIX]]-based [[web server]]s, the program receives the query string as an [[environment variable]] named <code>QUERY_STRING</code>.
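
How a program such as <code>test.cgi</code> might read and decode its input can be sketched in Python. The function name <code>read_query_fields</code> is hypothetical, and the environment is simulated with a plain dictionary, so no web server is needed to run the example.

```python
import os
from urllib.parse import parse_qs

def read_query_fields(environ=os.environ):
    """Return the decoded fields found in the QUERY_STRING variable."""
    query = environ.get("QUERY_STRING", "")
    # parse_qs turns '+' back into a space and %XX escapes into characters;
    # each field name maps to a list of values.
    return parse_qs(query)

# Simulated environment holding a query string for the fields
# named "first" and "second" in the form.
fake_environ = {
    "QUERY_STRING": "first=this+is+a+field&second=was+it+clear+%28already%29%3F"
}
fields = read_query_fields(fake_environ)
```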
 
==Tracking==
 
A program receiving a query string can ignore part or all of it. If the requested URL corresponds to a file and not to a program, the whole query string is ignored. However, regardless of whether the query string is used or not, the whole URL including it is stored in the server [[log file]]s.
 
 
For example, when a web page containing the following is requested:
<a href="frank.html">see my page!</a>
<a href="ciccio.html">mine is better</a>
 
a unique string, such as <code>sdfsd23423</code>, is chosen, and the page is modified as follows:
<a href="frank.html?sdfsd23423">see my page!</a>
<a href="ciccio.html?sdfsd23423">mine is better</a>
 
The addition of the query string does not change the way the page is shown to the user. When the user follows, for example, the first link, the browser requests the page <code>frank.html?sdfsd23423</code> from the server, which ignores what follows <code>?</code> and sends the page <code>frank.html</code> as expected, adding the query string to its links as well.
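
The link rewriting described above can be approximated with a few lines of Python. This is only a sketch: the function <code>add_tracking</code> is hypothetical, and it assumes that every <code>href</code> is double-quoted and does not already contain a query string or a fragment.

```python
import re

def add_tracking(html, token):
    """Append ?token to every double-quoted href in the given HTML text."""
    # Simplifying assumption: hrefs containing '?' or '#' are left untouched.
    pattern = r'href="([^"?#]*)"'
    return re.sub(pattern, lambda m: 'href="%s?%s"' % (m.group(1), token), html)

page = (
    '<a href="frank.html">see my page!</a>\n'
    '<a href="ciccio.html">mine is better</a>'
)
tracked = add_tracking(page, "sdfsd23423")
```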
 
The main differences between query strings used for tracking and HTTP cookies are that:
# Query strings form part of the URL, and are therefore included if the user saves or sends the URL to another user; cookies can be maintained across browsing sessions, but are not saved or sent with the URL.
# If the user arrives at the same web server by two (or more) independent paths, they will be assigned two different query strings, while the stored cookies are the same.
 
==See also==
 
* [[HyperText Transfer Protocol]]
* [[Common Gateway Interface]]
 
==External links==
 
* RFC 1738
* RFC 3986