Tags: Reverted Mobile edit Mobile web edit |
Isaidnoway (talk | contribs) Undid revision 1287006439 by 2409:40D0:1015:AB00:8000:0:0:0 (talk) Reverting unexplained content removal |
Line 67:
Some [[Character (computing)|characters]] cannot be part of a URL (for example, the space character), and others have a special meaning in a URL: for example, the character <code>#</code> can be used to further specify a subsection (or [[Fragment identifier|fragment]]) of a document. In HTML forms, the character <code>=</code> is used to separate a name from a value. The URI generic syntax uses [[Percent-encoding#Percent-encoding reserved characters|URL encoding]] to deal with this problem, while HTML forms make some additional substitutions rather than applying percent-encoding to all such characters. SPACE is encoded as '<code>+</code>' or '<code>%20</code>'.<ref name="w3schools" />
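As a small illustration of generic percent-encoding, the sketch below uses Python's standard <code>urllib.parse.quote</code> and <code>unquote</code>. The sample string is made up for this example, and passing <code>safe=""</code> is a choice made here so that every reserved character is encoded; it is not something the text above prescribes.

```python
from urllib.parse import quote, unquote

# Hypothetical sample containing a space, a fragment marker and a name/value separator.
raw = "a b#c=d"

# safe="" asks quote() to percent-encode every reserved character, including "/".
encoded = quote(raw, safe="")

# unquote() reverses the percent-encoding.
decoded = unquote(encoded)

print(encoded)
print(decoded)
```

Here the space, <code>#</code> and <code>=</code> each become a <code>%HH</code> triplet, and decoding restores the original string.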
[[HTML 5]] specifies how form data is transformed when an HTML form is submitted to a web server with the "GET" method. The following is a brief summary of the algorithm:
* Characters that cannot be converted to the correct charset are replaced with HTML [[numeric character reference]]s<ref name="html5 urlencoded" />
* SPACE is encoded as '<code>+</code>' or '<code>%20</code>'
* Letters (<code>A</code>–<code>Z</code> and <code>a</code>–<code>z</code>), digits (<code>0</code>–<code>9</code>) and the characters '<code>*</code>', '<code>-</code>', '<code>.</code>' and '<code>_</code>' are left as-is
* <code>+</code> is encoded as <code>%2B</code>
* All other characters are encoded as a <code>%HH</code> [[hexadecimal]] representation with any non-ASCII characters first encoded as UTF-8 (or other specified encoding)
The octet corresponding to the tilde ("<code>~</code>") is permitted in query strings by RFC 3986 but must be percent-encoded as "<code>%7E</code>" in HTML forms.
The encoding of SPACE as '<code>+</code>' and the selection of "as-is" characters distinguish this encoding from RFC 3986.
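The summarised algorithm can be sketched in Python roughly as follows. This is a minimal illustration, not the normative algorithm: the function names <code>form_urlencode_component</code> and <code>form_urlencode</code> are hypothetical, SPACE is always written as <code>+</code>, the tilde is percent-encoded as described above, and the set of "as-is" punctuation (<code>*</code>, <code>-</code>, <code>.</code>, <code>_</code>) is an assumption taken from the WHATWG URL Standard. It also assumes an ASCII-compatible target encoding such as UTF-8 or ISO-8859-1.

```python
import string

# Bytes left as-is: ASCII letters, digits, and (assumed here) "*", "-", "." and "_".
SAFE = frozenset(string.ascii_letters + string.digits + "*-._")


def form_urlencode_component(text, encoding="utf-8"):
    """Encode one form name or value (hypothetical helper)."""
    # Characters the charset cannot represent become numeric character
    # references such as "&#8364;", which are then encoded like any other text.
    data = text.encode(encoding, errors="xmlcharrefreplace")
    out = []
    for byte in data:
        if byte == 0x20:
            out.append("+")               # SPACE becomes "+"
        elif chr(byte) in SAFE:
            out.append(chr(byte))         # left as-is
        else:
            out.append(f"%{byte:02X}")    # everything else becomes %HH
    return "".join(out)


def form_urlencode(pairs, encoding="utf-8"):
    """Join encoded name/value pairs with "=" and "&" (hypothetical helper)."""
    return "&".join(
        f"{form_urlencode_component(name, encoding)}="
        f"{form_urlencode_component(value, encoding)}"
        for name, value in pairs
    )


print(form_urlencode([("q", "tom & jerry"), ("lang", "en")]))
```

For comparison, an RFC 3986 style encoder would write the space as <code>%20</code> and leave the tilde unencoded, which is the difference noted above.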
== Example ==