Cyclone (programming language)

{{Short description|Memory-safe dialect of the C programming language}}
{{use dmy dates|date=August 2024}}
{{more footnotes|date=August 2015}}
{{Infobox programming language
| name = Cyclone
| logo =
| logo caption =
| file ext =
| paradigm =
| released = {{Start date and age|2002}}
| designer = [[AT&T Labs]]
| developer = [[Cornell University]]
| latest release version = 1.0
| latest release date = {{Start date and age|2006|05|08}}
| latest preview version =
| latest preview date = <!-- {{start date and age|YYYY|MM|DD}} -->
| typing =
| implementations =
| dialects =
| influenced by = [[C (programming language)|C]]
| influenced = [[Rust (programming language)|Rust]], [[Project Verona]]
| programming language =
| operating system =
| license =
| website = {{URL|http://cyclone.thelanguage.org}}
| wikibooks =
| discontinued = Yes<ref>{{cite web |title=Open Access Cyclone (programming language) Journals · OA.mg |url=https://oa.mg/journals/open-access-cyclone-programming-language-journals |website=oa.mg |access-date=30 October 2022 |archive-date=30 October 2022 |archive-url=https://web.archive.org/web/20221030192542/https://oa.mg/journals/open-access-cyclone-programming-language-journals |url-status=live }}</ref>
}}
The '''Cyclone''' [[programming language]] was intended to be a safe dialect of the [[C (programming language)|C language]].<ref>{{Cite journal |last1=Jim |first1=Trevor |last2=Morrisett |first2=J. Greg |last3=Grossman |first3=Dan |last4=Hicks |first4=Michael W. |last5=Cheney |first5=James |last6=Wang |first6=Yanling |date=2002-06-10 |title=Cyclone: A Safe Dialect of C |url=https://dl.acm.org/doi/10.5555/647057.713871 |journal=Proceedings of the General Track of the Annual Conference on USENIX Annual Technical Conference |series=ATEC '02 |location=USA |publisher=USENIX Association |pages=275–288 |isbn=978-1-880446-00-3}}</ref> It was designed to avoid [[buffer overflow]]s and other vulnerabilities that are possible in C programs, without losing the power and convenience of C as a tool for [[system programming]]. It is no longer supported by its original developers, and the reference tooling does not support [[64-bit computing|64-bit platforms]]. The original developers note that the [[Rust (programming language)|Rust]] language has integrated many of the same ideas as Cyclone.<ref>{{cite web |title=Cyclone |url=http://cyclone.thelanguage.org/ |website=cyclone.thelanguage.org |access-date=11 December 2023 |archive-date=21 May 2006 |archive-url=https://web.archive.org/web/20060521202022/http://cyclone.thelanguage.org/ |url-status=live }}</ref>
 
Cyclone development was started as a joint project of Trevor Jim from [[AT&T Labs]] Research and [[Greg Morrisett]]'s group at [[Cornell University]] in 2001. Version 1.0 was released on May 8, 2006.<ref>{{cite web |title=Cyclone |url=https://www.cs.cornell.edu/Projects/cyclone/ |website=[[Cornell University]] |access-date=30 October 2022 |archive-date=15 October 2022 |archive-url=https://web.archive.org/web/20221015034248/https://www.cs.cornell.edu/Projects/cyclone/ |url-status=live }}</ref>
 
==Language features==
Cyclone attempts to avoid some of the common pitfalls of [[C (programming language)|C]], while still maintaining its look and performance. To this end, Cyclone places the following limits on programs:
* <code>[[Null pointer|NULL]]</code> checks are inserted to prevent [[segmentation fault]]s
* [[Pointer arithmetic]] is limited
* Pointers must be initialized before use (this is enforced by [[definite assignment analysis]])
* [[Dangling pointer]]s are prevented through region analysis and limits on <code>[[free()]]</code>
* Only "safe" casts and unions are allowed
* [[Control flow|<code>goto</code>]] into scopes is disallowed
* [[Control flow|<code>switch</code>]] labels in different scopes are disallowed
* Pointer-returning functions must execute <code>return</code>
* [[Setjmp.h|<code>setjmp</code> and <code>longjmp</code>]] are not supported
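
In plain C, a check like the one in the first item of the list above has to be written by hand at every place a pointer might be <code>NULL</code>. The following is a minimal sketch in C, not Cyclone compiler output; the function name <code>checked_read</code> and its error convention are assumptions made purely for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of Cyclone or standard C):
 * copies *p into *out only if neither pointer is NULL.
 * Returns false instead of dereferencing a NULL pointer. */
bool checked_read(const int *p, int *out)
{
    if (p == NULL || out == NULL)
        return false;   /* the explicit check a C programmer must remember */
    *out = *p;
    return true;
}
```

Calling <code>checked_read(NULL, &v)</code> returns <code>false</code> rather than invoking undefined behavior, which is the kind of guarantee the list above describes being enforced automatically.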
 
To maintain the tool set that C programmers are used to, Cyclone provides the following extensions:
* Never-<code>NULL</code> pointers do not require <code>NULL</code> checks
* "Fat" pointers support pointer arithmetic with run-time [[bounds checking]]
* Growable regions support a form of safe manual memory management
* [[Garbage collection (computer science)|Garbage collection]] for heap-allocated values
* [[Tagged union]]s support type-varying arguments
* Injections help automate the use of tagged unions for programmers
* [[Polymorphism (computer science)|Polymorphism]] replaces some uses of [[void pointer|<code>void *</code>]]
* varargs are implemented as fat pointers
* [[Exception handling|Exceptions]] replace some uses of <code>setjmp</code> and <code>longjmp</code>
 
For a better high-level introduction to Cyclone, the reasoning behind Cyclone and the source of these lists, see [http://www.cs.umd.edu/projects/cyclone/papers/cyclone-safety.pdf this paper].
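
Plain C has no built-in tagged unions, so programmers usually pair a <code>union</code> with a tag that they must keep in sync by hand. The sketch below shows only that C idiom, to illustrate what the tagged-union extension listed above is meant to make safe; it is not Cyclone syntax, and the names <code>value_t</code>, <code>make_int</code>, <code>make_double</code> and <code>value_as_int</code> are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tagged union: the tag records which member is valid. */
typedef enum { TAG_INT, TAG_DOUBLE } tag_t;

typedef struct {
    tag_t tag;
    union {
        int    i;
        double d;
    } as;
} value_t;

/* Constructors set the tag and the matching member together. */
value_t make_int(int i)
{
    value_t v;
    v.tag = TAG_INT;
    v.as.i = i;
    return v;
}

value_t make_double(double d)
{
    value_t v;
    v.tag = TAG_DOUBLE;
    v.as.d = d;
    return v;
}

/* Reads the int member only when the tag says it is the active one. */
bool value_as_int(const value_t *v, int *out)
{
    if (v == NULL || out == NULL || v->tag != TAG_INT)
        return false;
    *out = v->as.i;
    return true;
}
```

In this hand-written form nothing stops code from reading <code>as.i</code> while the tag says <code>TAG_DOUBLE</code>; the accessor only helps if every caller uses it.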
 
Cyclone looks, in general, much like C, but it should be viewed as a C-like language.
===Pointer types===
Cyclone implements three kinds of [[pointer (computer science)|pointer]]:
* <code>*</code> (the normal type)
* <code>@</code> (the never-<code>NULL</code> pointer), and
* <code>?</code> (the only type with [[pointer arithmetic]] allowed, [[fat pointer|"fat" pointer]]s).
The purpose of introducing these new pointer types is to avoid common problems when using pointers. Take, for instance, a function called <code>foo</code> that takes a pointer to an int:
<syntaxhighlight lang="C">
int foo(int *);
</syntaxhighlight>
 
Although the person who wrote the function <code>foo</code> could have inserted <code>NULL</code> checks, let us assume that for performance reasons they did not. Calling <code>foo(NULL);</code> will result in [[undefined behavior]] (typically, although not necessarily, a [[SIGSEGV]] [[Unix signal|signal]] being sent to the application). To avoid such problems, Cyclone introduces the <code>@</code> pointer type, which can never be <code>NULL</code>. Thus, the "safe" version of <code>foo</code> would be:
<syntaxhighlight lang="C">
int foo(int @);
</syntaxhighlight>
 
This tells the Cyclone compiler that the argument to <code>foo</code> should never be <code>NULL</code>, avoiding the aforementioned undefined behavior. The simple change of <code>*</code> to <code>@</code> saves the programmer from having to write <code>NULL</code> checks and the operating system from having to trap <code>NULL</code> pointer dereferences. The limit on pointer arithmetic, however, can be a rather large stumbling block for most C programmers, who are used to being able to manipulate their pointers directly with arithmetic. Although pointer arithmetic is convenient, it can lead to [[buffer overflow]]s and other "off-by-one"-style mistakes. To avoid this, the <code>?</code> pointer type is delimited by a known bound, the size of the array. Although this adds overhead due to the extra information stored about the pointer, it improves safety and security. Take, for instance, a simple (and naïve) <code>strlen</code> function, written in C:
<syntaxhighlight lang="C">
int strlen(const char *s)
{
    int i = 0;
    if (s == NULL)
        return 0;
    while (s[i] != '\0') {
        i++;
    }
    return i;
}
</syntaxhighlight>
 
This function assumes that the string being passed in is terminated by <code>'\0'</code>. However, what would happen if {{code|style=white-space:nowrap|2=c|1=char buf[6] = {'h','e','l','l','o','!'};}} were passed to this function? This is perfectly legal in C, yet would cause <code>strlen</code> to iterate through memory not necessarily associated with the string <code>s</code>. There are functions, such as <code>strnlen</code>, which can be used to avoid such problems, but these functions are not standard with every implementation of [[ANSI C]]. The Cyclone version of <code>strlen</code> is not so different from the C version:
<syntaxhighlight lang="C">
int strlen(const char ? s)
{
    int i, n = s.size;
    if (s == NULL)
        return 0;
    for (i = 0; i < n; i++, s++)
        if (*s == '\0')
            return i;
    return n;
}
</syntaxhighlight>
Here, <code>strlen</code> bounds itself by the length of the array passed to it, thus not going over the actual length. Each of the kinds of pointer type can be safely cast to each of the others, and arrays and strings are automatically cast to <code>?</code> by the compiler. (Casting from <code>?</code> to <code>*</code> invokes a [[bounds checking|bounds check]], and casting from <code>?</code> to <code>@</code> invokes both a <code>NULL</code> check and a bounds check. Casting from <code>*</code> to <code>?</code> results in no checks whatsoever; the resulting <code>?</code> pointer has a size of 1.)
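
The idea behind <code>?</code> pointers can be approximated in plain C by carrying the bound alongside the pointer. The following is a sketch only: the struct <code>fat_ptr</code> and the functions <code>fat_strlen</code> and <code>fat_get</code> are hypothetical names, and the layout is an assumption that does not claim to match how the Cyclone compiler actually represents fat pointers.

```c
#include <stddef.h>

/* Hypothetical stand-in for a Cyclone "?" pointer: a base address plus
 * the number of elements that may be accessed through it. */
typedef struct {
    const char *base;
    size_t      size;
} fat_ptr;

/* Bounded length: never reads past p.size elements. */
size_t fat_strlen(fat_ptr p)
{
    size_t i;
    if (p.base == NULL)
        return 0;
    for (i = 0; i < p.size; i++)
        if (p.base[i] == '\0')
            return i;
    return p.size;   /* no terminator found inside the bound */
}

/* Bounds-checked read: stores the element and returns 0 on success,
 * returns -1 if the index lies outside the bound. */
int fat_get(fat_ptr p, size_t index, char *out)
{
    if (p.base == NULL || out == NULL || index >= p.size)
        return -1;
    *out = p.base[index];
    return 0;
}
```

With the unterminated six-character <code>buf</code> from the earlier example wrapped as <code>{ buf, sizeof buf }</code>, <code>fat_strlen</code> stops at the bound and returns 6 instead of reading past the array.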
 
===Dangling pointers and region analysis===
Consider the following code, in C:
<syntaxhighlight lang="C">
char *itoa(int i)
{
    char buf[20];
    sprintf(buf,"%d",i);
    return buf;
}
</syntaxhighlight>
 
The function <code>itoa</code> allocates an array of chars <code>buf</code> on the stack and returns a pointer to the start of <code>buf</code>. However, the memory used on the stack for <code>buf</code> is deallocated when the function returns, so the returned value cannot be used safely outside of the function. While [[GNU Compiler Collection]] and other compilers will warn about such code, the following will typically compile without warnings:
<syntaxhighlight lang="C">
char *itoa(int i)
{
    char buf[20], *z;
    sprintf(buf,"%d",i);
    z = buf;
    return z;
}
</syntaxhighlight>
GNU Compiler Collection can produce warnings for such code as a side-effect of option {{code|-O2}} or {{code|-O3}}, but there are no guarantees that all such errors will be detected.
Cyclone performs region analysis on each segment of code, preventing dangling pointers such as the one returned from this version of <code>itoa</code>. All of the local variables in a given scope are considered to be part of the same region, separate from the heap or any other local region. Thus, when analyzing <code>itoa</code>, the Cyclone compiler would see that <code>z</code> is a pointer into the local stack, and would report an error.
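
In plain C, one common way to avoid returning a pointer into a dead stack frame is to have the caller supply the buffer. The sketch below illustrates that idiom only; the name <code>itoa_r</code> and its signature are assumptions made for illustration and are not part of Cyclone or of standard C.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: writes the decimal form of i into a buffer owned
 * by the caller, so the result outlives this function's stack frame.
 * Returns buf on success, or NULL if the buffer is missing or too small. */
char *itoa_r(int i, char *buf, size_t len)
{
    int written;
    if (buf == NULL || len == 0)
        return NULL;
    written = snprintf(buf, len, "%d", i);
    if (written < 0 || (size_t)written >= len)
        return NULL;   /* an error occurred or the output was truncated */
    return buf;
}
```

Because the storage belongs to the caller, the returned pointer stays valid for as long as the caller's buffer does, which is the lifetime relationship that region analysis checks automatically.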
 
==Examples==
The best example to start with is the classic [[Hello world]] program:
<syntaxhighlight lang="C">
#include <stdio.h>
#include <core.h>
using Core;
int main(int argc, string_t ? args)
{
    if (argc <= 1)
    {
        printf("Usage: hello-cyclone <name>\n");
        return 1;
    } else {
        printf("Hello from Cyclone, %s\n", args[1]);
    }
    return 0;
}
</syntaxhighlight>

==See also==
* [[C (programming language)|C]]
* [[ML (programming language)|ML]]
* [[Rust (programming language)|Rust]]

==References==
{{Reflist}}

==External links==
* [http://cyclone.thelanguage.org/ Cyclone homepage]
* [https://web.archive.org/web/20111227232825/http://www.eecs.harvard.edu/~greg/cyclone/old_cyclone.html Old web site]
* [http://cyclone.thelanguage.org/wiki/Download Cyclone - source code repositories]
* [http://cyclone.thelanguage.org/wiki/Frequently%20Asked%20Questions Cyclone - FAQ]
* [http://cyclone.thelanguage.org/wiki/Cyclone%20for%20C%20Programmers Cyclone for C programmers]
* [http://cyclone.thelanguage.org/wiki/User%20Manual Cyclone user manual]
* [http://www.cs.umd.edu/~mwh/papers/cyclone-cuj.pdf Cyclone: a Type-safe Dialect of C] by Dan Grossman, Michael Hicks, Trevor Jim, and Greg Morrisett - published January 2005

Presentations:
* [https://web.archive.org/web/20110607170455/http://www.cs.kent.ac.uk/people/staff/rej/morrisett-4.2.03.ppt Cyclone: A Type-Safe Dialect of C]
* [http://www.cs.washington.edu/homes/djg/slides/grossman_cyclone_jpl_05.ppt Cyclone: A Memory-Safe C-Level Programming Language]

{{CProLang}}
 
{{DEFAULTSORT:Cyclone (Programming Language)}}
[[Category:C dialects]]
[[Category:C programming language family]]
[[de:Cyclone]]
[[Category:Programming languages created in 2002]]