Snowball (programming language)

{{Short description|String processing programming language}}
{{distinguish|SNOBOL}}
{{Primary sources|date=March 2020}}
{{Update|inaccurate=yes|updated=September 2014|date=April 2021}}
 
'''Snowball''' is a small string processing [[programming language]] designed for creating [[stemming]] algorithms for use in [[information retrieval]].<ref name=Snowball-HomePage>[http://snowball.tartarus.org/ "Snowball"], Martin Porter, web page. Retrieved 2 September 2014.</ref>
 
The name Snowball was chosen as a tribute to the [[SNOBOL]] programming language, "with which it shares the concept of string patterns delivering signals that are used to control the flow of the program."<ref name=":0" /> The creator of Snowball, [[Martin Porter|Dr. Martin Porter]], "toyed with the idea of calling it 'strippergram'", because it "effectively provides a '[[suffix]] STRIPPER GRAMmar'".<ref name="Snowball-HomePage" />
 
The Snowball [[compiler]] translates a Snowball script (an .sbl file) into a program in [[thread safety|thread-safe]] [[ANSI C]], [[Java (programming language)|Java]], Ada, C#, Go, [[JavaScript]], [[Object Pascal]], Python or Rust.<ref name=":1">{{Cite web |last=Porter |first=Martin |title=Snowball: Quick introduction |url=http://snowball.tartarus.org/texts/quickintro.html |access-date=May 4, 2025}}</ref><ref name=":2">{{Cite web |date=March 27, 2025 |title=Snowball README |url=https://github.com/snowballstem/snowball# |access-date=May 4, 2025}}</ref> For ANSI C, each Snowball script produces a program file and a corresponding header file (with .c and .h extensions).<ref name=":1" /> The Snowball compiler checks the consistency of its script, and this check was used to discover a [[typo]] in a seminal academic paper by [[Julie Beth Lovins|Lovins]] that had remained undetected for 30 years.<ref>{{Cite web|url=http://snowball.tartarus.org/algorithms/lovins/festschrift.html|title=Lovins revisited|website=snowball.tartarus.org |author1=Martin Porter |access-date=6 August 2024 |date=December 2001}}</ref>
 
The basic [[datatype]]s handled by Snowball are strings of characters, signed integers, and boolean [[truth value]]s, or more simply strings, integers and booleans. Snowball's characters are either 8 or 16 bits wide, depending on the mode of use; in particular, both [[ASCII]] and [[UTF-16|16-bit Unicode]] are supported.<ref name=":0" /> As in [[SNOBOL]], the flow of control in Snowball is arranged by the implicit use of signals (each statement returns a true or false value), rather than by the explicit use of constructs such as if, then, and break found in [[C (programming language)|C]] and many other programming languages.<ref name=":0">[http://snowball.tartarus.org/compiler/snowman.html "Snowball Manual"], Martin Porter, web page. Retrieved 2 September 2014.</ref>
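
The signal-driven control flow described above can be sketched outside Snowball. The following Python fragment is a minimal illustration, not Snowball syntax and not how the Snowball compiler is implemented: the names <code>Cursor</code>, <code>literal</code>, <code>seq</code> and <code>alt</code> are hypothetical, and the choice to restore the cursor on failure is an assumption of this sketch. Each command returns a true or false signal, and the combinators use that signal in place of explicit if statements in the calling code.

```python
# Illustrative sketch only: NOT Snowball syntax or its implementation.
# Every name here (Cursor, literal, seq, alt) is hypothetical.
from dataclasses import dataclass
from typing import Callable


@dataclass
class Cursor:
    """A string together with the current matching position."""
    text: str
    pos: int = 0


# A command inspects or advances the cursor and returns a signal:
# True on success, False on failure.
Command = Callable[[Cursor], bool]


def literal(s: str) -> Command:
    """Succeed, advancing the cursor, if the text at the cursor starts with s."""
    def run(c: Cursor) -> bool:
        if c.text.startswith(s, c.pos):
            c.pos += len(s)
            return True
        return False
    return run


def seq(*commands: Command) -> Command:
    """Run commands in order; at the first False, restore the cursor and fail."""
    def run(c: Cursor) -> bool:
        start = c.pos
        for cmd in commands:
            if not cmd(c):
                c.pos = start  # assumption of this sketch: undo partial progress
                return False
        return True
    return run


def alt(*commands: Command) -> Command:
    """Try each command from the same position; succeed at the first True."""
    def run(c: Cursor) -> bool:
        start = c.pos
        for cmd in commands:
            c.pos = start
            if cmd(c):
                return True
        c.pos = start
        return False
    return run


# "snow" followed by either "ball" or "flake"
matcher = seq(literal("snow"), alt(literal("ball"), literal("flake")))
```

Calling <code>matcher(Cursor("snowflake"))</code> returns a true signal and leaves the cursor at the end of the string, while <code>matcher(Cursor("snowman"))</code> returns a false signal and leaves the cursor where it started.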
 
Though the original [http://snowball.tartarus.org/ Snowball website] maintained by Dr. Martin Porter and colleague Richard Boulton has been closed since 2014 following Dr. Porter’s retirement,<ref name="Snowball-HomePage" /><ref name=":2" /><ref>{{Cite web |last=Porter |first=Martin |title=Snowball - Credits |url=http://snowball.tartarus.org/credits.html |access-date=May 4, 2025}}</ref> the site itself is still accessible, and the language continues to be developed as [https://github.com/snowballstem a community project on GitHub].<ref name="Snowball-HomePage" /><ref name=":2" /> Additionally, large projects like the [[Natural Language Toolkit|Natural Language Toolkit (NLTK)]] for Python employ Snowball along with stemming algorithms designed by Dr. Porter and other contributors to the Snowball language.<ref>{{Cite web |title=nltk.stem.SnowballStemmer Documentation |url=https://www.nltk.org/api/nltk.stem.SnowballStemmer.html |access-date=May 4, 2025 |website=Natural Language Toolkit}}</ref><ref>{{Cite web |title=Source code for nltk.stem.snowball |url=https://www.nltk.org/_modules/nltk/stem/snowball.html |access-date=May 4, 2025 |website=Natural Language Toolkit}}</ref>
 
One of the notable features of Snowball is its collection of stemming algorithms for various languages. These algorithms are widely used in academic research and commercial applications for natural language processing.<ref>{{Cite web |title=Snowball Stemmer – NLP |url=https://www.geeksforgeeks.org/snowball-stemmer-nlp/}}</ref>
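
As a rough illustration of what a suffix-stripping stemmer does, the following Python sketch removes a few common English endings. It is a toy example and is not the Porter algorithm or any algorithm distributed with Snowball: the rule table <code>SUFFIX_RULES</code>, the minimum stem length, and the function name <code>strip_suffix</code> are all hypothetical choices made for this example.

```python
# Toy suffix stripper for illustration only. This is NOT the Porter
# algorithm or any stemmer shipped with Snowball; the rule table and
# the minimum stem length below are hypothetical.
SUFFIX_RULES = [
    # (suffix to remove, replacement), tried in order
    ("sses", "ss"),
    ("ies", "i"),
    ("ing", ""),
    ("ed", ""),
    ("s", ""),
]

# Do not strip a suffix if fewer than this many characters would remain.
MIN_STEM_LENGTH = 3


def strip_suffix(word: str) -> str:
    """Apply the first rule whose suffix matches and leaves a long enough stem."""
    word = word.lower()
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix):
            stem = word[: len(word) - len(suffix)]
            if len(stem) >= MIN_STEM_LENGTH:
                return stem + replacement
    # No rule applied: return the (lower-cased) word unchanged.
    return word


if __name__ == "__main__":
    for w in ["caresses", "ponies", "walking", "jumped", "cats", "sing"]:
        print(w, "->", strip_suffix(w))
```

A real stemmer conditions each rule on the structure of the remaining stem rather than only on its length, which is the kind of rule the Snowball language is designed to express.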
 
== Modern use ==
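SNOBOL was designed to work with symbolic string data, and [[Pattern-matching]] problems are well suited to SNOBOL and derivative languages. Ralph Griswold, one of the creators of SNOBOL, later developed Icon, a language comparable to SNOBOL.<ref name="snobol-unacademy">{{Cite web |title=SNOBOL |url=https://unacademy.com/content/bank-exam/study-material/computer-knowledge/snobol/ |access-date=2023-11-14 |website=Unacademy |language=en-US}}</ref>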
 
==References==
{{reflist}}
* P. Willett. "The Porter Stemming Algorithm: Then and Now" (July 2006). ''Program''. Volume 40, issue 3, pages 219 et seq.
 
== External links ==
* {{official website|snowballstem.org}}
* [https://github.com/snowballstem Snowball Stemming language and algorithms project] on GitHub
* [https://github.com/snowballstem/snowball/blob/master/algorithms/porter.sbl Porter Stemmer in Snowball]