Trimming (computer programming)


In computer programming, trimming (trim) or stripping (strip) is a string manipulation in which leading and trailing whitespace is removed from a string.

For example, the string (enclosed by apostrophes)

'  this  is a test  '

would be changed, after trimming, to

'this  is a test'
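As a minimal sketch, the same example can be reproduced with Python's built-in str.strip() method; the variable names are chosen here for illustration only.

```python
# The example string from above: two leading and two trailing spaces.
original = "  this  is a test  "

# str.strip() returns a new string; whitespace inside the string is kept.
trimmed = original.strip()

print(repr(trimmed))  # 'this  is a test'
```

The original string is left unchanged, because Python strings are immutable.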

Variants

The most popular variants of the trim function strip only the beginning or only the end of the string. They are typically named ltrim and rtrim respectively, or, in the case of Python, lstrip and rstrip. C# uses TrimStart and TrimEnd, and Common Lisp uses string-left-trim and string-right-trim. Pascal and Java do not have these variants built in, although Object Pascal (Delphi) has TrimLeft and TrimRight functions.[1]
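The one-sided variants can be illustrated with Python's lstrip and rstrip methods; this is only a small sketch with an arbitrary sample string.

```python
text = "  padded  "

left_trimmed = text.lstrip()   # leading whitespace removed, trailing kept
right_trimmed = text.rstrip()  # trailing whitespace removed, leading kept

print(repr(left_trimmed))   # 'padded  '
print(repr(right_trimmed))  # '  padded'
```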

Many trim functions have an optional parameter to specify a list of characters to trim, instead of the default whitespace characters. For example, PHP and Python allow this optional parameter, while Pascal and Java do not. With Common Lisp's string-trim function, the parameter (called character-bag) is required. The C++ Boost library defines space characters according to locale, as well as offering variants with a predicate parameter (a functor) to select which characters are trimmed.
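A short sketch of the optional parameter, using Python's str.strip(chars). The sample strings are made up for illustration. Note that the argument is treated as a set of characters, not as a prefix or suffix to match.

```python
# Strip only hyphens from both ends; the slashes are not in the set and stay.
path = "--/usr/local/--"
stripped = path.strip("-")

# Any mix of the listed characters is removed from each end,
# until a character outside the set is reached.
mixed = "xyxhelloyx".strip("xy")

print(stripped)  # /usr/local/
print(mixed)     # hello
```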

An uncommon variant of trim returns a special result if no characters remain after the trim operation. For example, the StringUtils class in Apache Commons Lang (formerly part of Apache Jakarta) has a function called stripToNull which returns null in place of an empty string.
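The idea behind this variant could be sketched in Python roughly as follows. The function name strip_to_none is hypothetical and is not part of any standard library; it only imitates the behaviour described above, with None standing in for null.

```python
from typing import Optional


def strip_to_none(text: Optional[str]) -> Optional[str]:
    """Hypothetical helper: trim text, returning None if nothing remains."""
    if text is None:
        return None
    trimmed = text.strip()
    # An empty result is reported as None rather than as "".
    return trimmed if trimmed else None


print(strip_to_none("  abc  "))  # abc
print(strip_to_none("     "))    # None
```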

An alternative to trimming a string is space normalization, where in addition to removing surrounding whitespace, any sequence of whitespace characters within the string is replaced with a single space. Space normalization is done by Trim() in spreadsheet applications (including Excel, Calc, Gnumeric, and Google Docs), and by the normalize-space() function in XSLT and XPath.
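Space normalization can be sketched in Python by splitting on runs of whitespace and joining the pieces with single spaces. The function name normalize_space is chosen here to echo the XPath function. It is an approximation, since Python's notion of whitespace is broader than the one used by XPath.

```python
def normalize_space(text: str) -> str:
    """Trim text and collapse each internal whitespace run to one space."""
    # split() with no arguments splits on whitespace runs and
    # discards the empty pieces at either end.
    return " ".join(text.split())


print(repr(normalize_space("  this  is a test  ")))  # 'this is a test'
```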

While most algorithms return a new (trimmed) string, some alter the original string in-place. Notably, the Boost library allows either in-place trimming or a trimmed copy to be returned.
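The difference between the two styles can be sketched in Python. Python's str is immutable, so this illustration uses a mutable bytearray and ASCII whitespace only. The helper name trim_in_place is hypothetical.

```python
ASCII_WHITESPACE = b" \t\n\r\x0b\x0c"


def trim_in_place(buf: bytearray) -> None:
    """Hypothetical helper: strip ASCII whitespace from buf, modifying it."""
    # Remove the trailing run first so fewer bytes are shifted afterwards.
    end = len(buf)
    while end > 0 and buf[end - 1] in ASCII_WHITESPACE:
        end -= 1
    del buf[end:]

    start = 0
    while start < len(buf) and buf[start] in ASCII_WHITESPACE:
        start += 1
    del buf[:start]


data = bytearray(b"  in place  ")
alias = data                       # second reference to the same object
trim_in_place(data)                # modifies the buffer itself
copied = bytes(b"  a copy  ").strip()  # returns a new, trimmed object

print(alias)   # bytearray(b'in place')
print(copied)  # b'a copy'
```

Because the buffer is modified directly, every reference to it (such as alias above) sees the trimmed contents.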

Definition of whitespace

The characters which are considered whitespace vary between programming languages and implementations. For example, C traditionally only counts space, tab, line feed, and carriage return characters, while languages which support Unicode typically include all Unicode space characters. Some implementations also include ASCII control codes (non-printing characters) along with whitespace characters.

Java's trim method considers ASCII spaces and control codes as whitespace, while Java's [2] method recognizes Unicode space characters.

Delphi's Trim function considers characters U+0000 (NULL) through U+0020 (SPACE) to be whitespace.
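The effect of these differing definitions can be sketched in Python. The helper trim_through_space is hypothetical. It only imitates the "U+0000 through U+0020" rule described above and is not taken from any of the languages mentioned.

```python
def trim_through_space(text: str) -> str:
    """Hypothetical helper: trim characters U+0000..U+0020 from both ends."""
    start, end = 0, len(text)
    while start < end and text[start] <= "\u0020":
        start += 1
    while end > start and text[end - 1] <= "\u0020":
        end -= 1
    return text[start:end]


nbsp_padded = "\u00a0value\u00a0"  # NO-BREAK SPACE (U+00A0) on both sides

# str.strip() with no argument uses Python's Unicode-aware whitespace test.
unicode_trimmed = nbsp_padded.strip()

# U+00A0 is above U+0020, so the control-code rule leaves it alone.
control_trimmed = trim_through_space(nbsp_padded)

# Control codes such as U+0000 and U+001F are removed by that rule.
controls_removed = trim_through_space("\x00\x01 value \x1f")
```

Here unicode_trimmed is "value", control_trimmed is unchanged, and controls_removed is "value", so the same input can give different results depending on which definition a trim function uses.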

Usage

Following are examples of trimming a string using several programming languages. All of the implementations shown return a new string and do not alter the original variable.

Example usage Languages
String.Trim([chars]) C#, VB.NET, Windows PowerShell
string.strip(); D
(.trim string) Clojure
sequence [ predicate? ] trim Factor
(string-trim '(#\Space #\Tab #\Newline) string) Common Lisp
(string-trim string) Scheme
string.trim() Java, JavaScript (1.8.1+, Firefox 3.5+)
Trim(String) Pascal,[3] QBasic, Visual Basic, Delphi
string.strip() Python
strings.Trim(string, chars) Go
LTRIM(RTRIM(String)) Oracle SQL, T-SQL
strip(string [,option, char]) REXX
string:strip(string [,option, char]) Erlang
string.strip Ruby
(string =~ /\S(.*\S)?/s, $&) Perl 5
string.trim Perl 6
trim(string) PHP
[string stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]] Objective-C using Cocoa
string withBlanksTrimmed or string withoutSpaces or string withoutSeparators Smalltalk (Squeak, Pharo); Smalltalk
strip(string) SAS
string trim $string Tcl
TRIM(string) or TRIM(ADJUSTL(string)) Fortran
TRIM(string) SQL
TRIM(string) or LTrim(string) or RTrim(String) ColdFusion

Other languages

In languages without a built-in trim function, it is usually simple to create a custom function which accomplishes the same task.
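As a rough sketch of such a custom function, the following Python code scans inward from both ends and then slices. Python already has str.strip(), so this only illustrates the general technique. The name trim is chosen for illustration.

```python
def trim(text: str) -> str:
    """Return text without leading and trailing whitespace."""
    start, end = 0, len(text)

    # Advance past leading whitespace.
    while start < end and text[start].isspace():
        start += 1

    # Step back over trailing whitespace, never crossing start.
    while end > start and text[end - 1].isspace():
        end -= 1

    return text[start:end]


print(repr(trim("  this  is a test  ")))  # 'this  is a test'
```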

AWK

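In AWK, one can use regular expressions to trim a variable v in place:

 gsub(/^[ \t]+/, "", v)            # ltrim: remove leading spaces and tabs
 gsub(/[ \t]+$/, "", v)            # rtrim: remove trailing spaces and tabs
 gsub(/^[ \t]+|[ \t]+$/, "", v)    # trim: remove both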

or:

 function ltrim(s) { sub(/^[ \t]+/, "", s); return s }
 function rtrim(s) { sub(/[ \t]+$/, "", s); return s }
 function trim(s)  { return rtrim(ltrim(s)); }

C/C++

There is no standard trim function in C or C++. Most of the available string libraries[4] for C contain code which implements trimming, or functions that significantly ease an efficient implementation. In some non-standard C libraries the function has also been called EatWhitespace.

In C, programmers often combine an ltrim and an rtrim to implement trim:

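#include <ctype.h>   /* isspace */
#include <string.h>  /* strlen, memmove */

/* Remove trailing whitespace in place; returns str. */
char *
rtrim(char *str)
{
  size_t len = strlen(str);

  /* Cast to unsigned char: passing a negative value to isspace is undefined. */
  while (len > 0 && isspace((unsigned char)str[len - 1]))
    --len;

  str[len] = '\0';

  return str;
}

/* Remove leading whitespace in place; returns str. */
char *
ltrim(char *str)
{
  char *ptr;

  for (ptr = str; *ptr && isspace((unsigned char)*ptr); ++ptr)
    ;

  /* Shift the remainder down, including the terminating '\0'. */
  memmove(str, ptr, strlen(ptr) + 1);

  return str;
}

/* Remove whitespace from both ends in place; returns str. */
char *
trim(char *str)
{
  return ltrim(rtrim(str));
}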

The open source C++ library Boost has several trim variants, including a standard one[5]:

#include <string>
#include <boost/algorithm/string/trim.hpp>
std::string trimmed = boost::algorithm::trim_copy(std::string("  string  "));

Note that Boost's function named simply trim modifies the input sequence in place[6] and does not return a result.

Another open source C++ library, Qt, has several trim variants, including a standard one:[7]

#include <QString>
trimmed = s.trimmed();

The Linux kernel also includes a strip function, strstrip(), since 2.6.18-rc1, which trims the string "in place". Since 2.6.33-rc1, the kernel uses strim() instead of strstrip() to avoid false warnings.[8]

Haskell

A trim algorithm in Haskell:

 import Data.Char (isSpace)
 trim      :: String -> String
 trim      = f . f
    where f = reverse . dropWhile isSpace

may be interpreted as follows: f drops the preceding whitespace, and reverses the string. f is then again applied to its own output. Note that the type signature (the second line) is optional.
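The same "drop leading whitespace, reverse, and repeat" idea can be sketched in Python for comparison. The helper names f and trim mirror the Haskell definition above. This is an illustration of the technique, not an efficient way to trim in Python.

```python
from itertools import dropwhile


def f(chars: str) -> str:
    """Drop leading whitespace, then reverse what remains."""
    return "".join(dropwhile(str.isspace, chars))[::-1]


def trim(text: str) -> str:
    # The first pass removes leading whitespace and reverses the string.
    # The second pass removes what was trailing whitespace and reverses
    # again, which restores the original order.
    return f(f(text))


print(repr(trim("  this  is a test  ")))  # 'this  is a test'
```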

J

The trim algorithm in J is a functional description:

     trim =. #~ [: (+./\ *. +./\.) ' '&~:

That is: filter (#~) for non-space characters (' '&~:) between leading (+./\) and (*.) trailing (+./\.) spaces.

JavaScript

There is a built-in trim function in JavaScript 1.8.1 (Firefox 3.5 and later) and in the ECMAScript 5 standard. In earlier versions it can be added to the String object's prototype as follows:

String.prototype.trim = function() {
  return this.replace(/^\s+/g, "").replace(/\s+$/g, "");
};

Perl

Perl 5 has no built-in trim function. However, the functionality is commonly achieved using regular expressions.

Example:

$string =~ s/^\s+//;            # remove leading whitespace
$string =~ s/\s+$//;            # remove trailing whitespace

or:

$string =~ s/^\s+|\s+$//g ;     # remove both leading and trailing whitespace

These examples modify the value of the original variable $string.
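For comparison, the single-substitution approach can be sketched with Python's re module. The function name regex_trim is hypothetical. Unlike the Perl statements, it returns a new string instead of modifying a variable.

```python
import re

# Matches a whitespace run at the start of the string or one at the end.
_EDGE_WHITESPACE = re.compile(r"^\s+|\s+$")


def regex_trim(text: str) -> str:
    """Remove leading and trailing whitespace using one substitution."""
    return _EDGE_WHITESPACE.sub("", text)


print(repr(regex_trim("  this  is a test \n")))  # 'this  is a test'
```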

Also available for Perl is StripLTSpace in String::Strip from CPAN.

There are, however, two functions that are commonly used to strip whitespace from the end of strings, chomp and chop:

  • chop removes the last character from a string and returns it.
  • chomp removes the trailing newline character(s) from a string if present. (What constitutes a newline depends on $INPUT_RECORD_SEPARATOR.)
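The difference between the two can be sketched with rough Python analogues. The functions chop and chomp below are hypothetical helpers that return new strings. The Perl originals modify their argument and have different return values. The separator parameter stands in for $INPUT_RECORD_SEPARATOR and is assumed to default to a newline.

```python
def chop(text: str) -> str:
    """Return text without its last character, whatever that character is."""
    return text[:-1]


def chomp(text: str, separator: str = "\n") -> str:
    """Return text without one trailing separator, if it ends with one."""
    if separator and text.endswith(separator):
        return text[: -len(separator)]
    return text


with_newline = chomp("value\n")   # newline removed
without_newline = chomp("value")  # nothing to remove, so unchanged
chopped = chop("value")           # last character removed regardless
```

Here with_newline and without_newline are both "value", while chopped is "valu", which shows why chop is not a safe substitute for trimming.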

In Perl 6, the upcoming major revision of the language, strings have a trim method.

Example:

$string = $string.trim;     # remove leading and trailing whitespace
$string .= trim;            # same thing

Tcl

The Tcl string command has three relevant subcommands: trim, trimright and trimleft. For each of those commands, an additional argument may be specified: a string that represents a set of characters to remove—the default is whitespace (space, tab, newline, carriage return).

Example of trimming vowels:

set string onomatopoeia
set trimmed [string trim $string aeiou]         ;# result is nomatop
set r_trimmed [string trimright $string aeiou]  ;# result is onomatop
set l_trimmed [string trimleft $string aeiou]   ;# result is nomatopoeia

XSLT

XSLT includes the function normalize-space(string) which strips leading and trailing whitespace, in addition to replacing any whitespace sequence (including line breaks) with a single space.

Example:

<xsl:variable name='trimmed'>
   <xsl:value-of select='normalize-space(string)'/>
</xsl:variable>

XSLT 2.0 includes regular expressions, providing another mechanism to perform string trimming.

Another XSLT technique for trimming is to utilize the XPath 2.0 substring() function.

See also

Notes