Snowball (programming language)

{{Short description|String processing programming language}}
{{distinguish|SNOBOL}}
{{Primary sources|date=March 2020}}
{{Update|inaccurate=yes|updated=September 2014|date=April 2021}}
{{Notability|date=November 2023}}
 
'''Snowball''' is a small string processing [[programming language]] designed for creating [[stemming]] algorithms for use in [[information retrieval]].<ref name=Snowball-HomePage>[http://snowball.tartarus.org/ "Snowball"], Martin Porter, web page. Retrieved 2 September 2014.</ref>
 
The name Snowball was chosen as a tribute to the [[SNOBOL]] programming language, "with which it shares the concept of string patterns delivering signals that are used to control the flow of the program."<ref name=":0" /> The creator of Snowball, [[Martin Porter|Dr. Martin Porter]], "toyed with the idea of calling it 'strippergram' ", because it "effectively provides a '[[suffix]] STRIPPER GRAMmar' ".<ref name="Snowball-HomePage" />
 
The Snowball [[compiler]] translates a Snowball script (an .sbl file) into a program in [[thread safety|thread-safe]] [[ANSI C]], [[Java (programming language)|Java]], Ada, C#, Go, [[JavaScript]], [[Object Pascal]], Python or Rust.<ref name=":1">{{Cite web |last=Porter |first=Martin |title=Snowball: Quick introduction |url=http://snowball.tartarus.org/texts/quickintro.html |access-date=May 4, 2025}}</ref><ref name=":2">{{Cite web |date=March 27, 2025 |title=Snowball README |url=https://github.com/snowballstem/snowball# |access-date=May 4, 2025}}</ref> For ANSI C, each Snowball script produces a program file and a corresponding header file (with .c and .h extensions).<ref name=":1" /> The Snowball compiler checks the consistency of its script, and this check was used to discover a [[typo]] in a seminal academic paper by [[Julie Beth Lovins|Lovins]] which had remained undetected for 30 years.<ref>{{Cite web|url=http://snowball.tartarus.org/algorithms/lovins/festschrift.html|title=Lovins revisited|website=snowball.tartarus.org |author1=Martin Porter |access-date=6 August 2024 |date=December 2001}}</ref>
 
The basic [[datatype]]s handled by Snowball are strings of characters, signed integers, and boolean [[truth value]]s, or more simply strings, integers and booleans. Snowball's characters are either 8 or 16 bits wide, depending on the mode of use. In particular, both [[ASCII]] and [[UTF-16|16-bit Unicode]] are supported.<ref name=":0">[http://snowball.tartarus.org/compiler/snowman.html "Snowball Manual"], Martin Porter, web page. Retrieved 2 September 2014.</ref> Like [[SNOBOL]], the flow of control in Snowball is arranged by the implicit use of signals (each statement returns a true or false value), rather than the explicit use of constructs such as if, then, and break found in [[C (programming language)|C]] and many other programming languages.<ref name=":0" />
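
The signal-driven style described above can be illustrated with a minimal Python sketch. It is not Snowball code and not the output of the Snowball compiler. The helper names (<code>replace_suffix</code>, <code>has_vowel</code>, <code>seq</code>, <code>either</code>, <code>attempt</code>) and the suffix rules are hypothetical, loosely modelled on the Porter stemmer, and restoring the whole state after a failed alternative is a simplification made for this example. Each command returns a true or false signal, and the combinators use only those signals to decide what runs next.

```python
# Illustrative sketch only: commands are functions that take a mutable
# state and return a boolean "signal". All names here are hypothetical.

def replace_suffix(old, new):
    """Command: if the word ends with `old`, swap it for `new` and signal true."""
    def command(state):
        word = state["word"]
        if not word.endswith(old):
            return False
        state["word"] = word[: len(word) - len(old)] + new
        return True
    return command

def has_vowel(state):
    """Command: signal true if the current word contains a vowel."""
    return any(ch in "aeiou" for ch in state["word"])

def seq(*commands):
    """Run commands in order; signal false as soon as one of them fails."""
    def command(state):
        return all(c(state) for c in commands)
    return command

def either(*commands):
    """Try alternatives in order; undo a failed one before trying the next."""
    def command(state):
        for c in commands:
            snapshot = dict(state)
            if c(state):
                return True
            state.clear()
            state.update(snapshot)  # simplification: restore the whole state
        return False
    return command

def attempt(c):
    """Run `c` but always signal true, so a failure does not propagate."""
    def command(state):
        either(c)(state)
        return True
    return command

# Hypothetical rule set: no explicit if/else appears in the rules themselves.
step = attempt(either(
    replace_suffix("sses", "ss"),
    replace_suffix("ies", "i"),
    replace_suffix("ss", "ss"),
    replace_suffix("s", ""),
    seq(replace_suffix("ing", ""), has_vowel),  # keep only if a vowel remains
))

def run(word):
    state = {"word": word}
    step(state)
    return state["word"]
```

In this sketch, <code>run("walking")</code> returns <code>"walk"</code>, while <code>run("sing")</code> leaves the word unchanged: removing "ing" succeeds, but the following vowel check signals false, so the whole alternative fails and its edit is undone.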
 
Though the original [http://snowball.tartarus.org/ Snowball website] maintained by Dr. Martin Porter and colleague Richard Boulton has been closed since 2014 following Dr. Porter’s retirement,<ref name="Snowball-HomePage" /><ref name=":2" /><ref>{{Cite web |last=Porter |first=Martin |title=Snowball - Credits |url=http://snowball.tartarus.org/credits.html |access-date=May 4, 2025}}</ref> the site itself is still accessible, and the language continues to be developed as [https://github.com/snowballstem a community project on GitHub].<ref name="Snowball-HomePage" /><ref name=":2" /> Additionally, large projects like the [[Natural Language Toolkit|Natural Language Toolkit (NLTK)]] for Python employ Snowball along with stemming algorithms designed by Dr. Porter and other contributors to the Snowball language.<ref>{{Cite web |title=nltk.stem.SnowballStemmer Documentation |url=https://www.nltk.org/api/nltk.stem.SnowballStemmer.html |access-date=May 4, 2025 |website=Natural Language Toolkit}}</ref><ref>{{Cite web |title=Source code for nltk.stem.snowball |url=https://www.nltk.org/_modules/nltk/stem/snowball.html |access-date=May 4, 2025 |website=Natural Language Toolkit}}</ref>
 
== Modern use ==
[[SNOBOL]] was designed to work with symbolic string data. In just a few lines of code, programmers can search, edit, and use string variables. [[Pattern matching]] problems are well suited to SNOBOL and its derivative languages. Attempts to revive SNOBOL have been made over the years. Ralph Griswold, one of the creators of SNOBOL, later produced [[Icon (programming language)|Icon]], a language comparable to SNOBOL. Icon never became as well known as SNOBOL, as it was considered too specialized.<ref name="unacademy-snobol">{{Cite web |title=SNOBOL |url=https://unacademy.com/content/bank-exam/study-material/computer-knowledge/snobol/ |access-date=2023-11-14 |website=Unacademy |language=en-US}}</ref>
 
==References==
{{reflist}}
*P Willett. "The Porter Stemming Algorithm: Then and Now" (July 2006) ''Program''. Volume 40. Issue 3. Pages 219 et seq.
 
== External links ==
* {{official website|http://snowballstem.org/}}
* [https://github.com/snowballstem Snowball Stemming language and algorithms project] on GitHub
* [https://github.com/snowballstem/snowball/blob/master/algorithms/porter.sbl Porter Stemmer in Snowball]