Furman - Languages Paper

Thomas Furman
COP 6557
Programming Languages and Design Paradigms
This paper covers three programming languages (Lua, Rust, and D) and analyzes some of the key
features that each language offers to the user. For each language, the history is documented,
including who designed it and what the language was modeled after. The major programming
paradigms of each language are listed along with how they are implemented, and a summary of some
unique features of each language is given.
1. The Lua Language
Lua is a scripting language created by Roberto Ierusalimschy, Luiz Henrique de Figueiredo, and
Waldemar Celes at the Pontifical Catholic University of Rio de Janeiro in Brazil in 1993. The first
public release of Lua was in July 1994. Lua 5.2 is the most recent release of the language; however,
Lua 5.1 is still heavily used and supported.
Lua is a multi-paradigm programming language. In addition to scripting, it supports imperative and
functional programming. Lua is written in ANSI C and exposes a C API that makes it both extensible
and embeddable. Lua source code is compiled into bytecode, which is then interpreted by a
register-based virtual machine. This design supports Lua’s claim to be one of the fastest
interpreted scripting languages for real-world applications. In addition, an independent
just-in-time compiler implementation, LuaJIT, is available for potentially more speed.
Lua is portable: it can be built on any platform that provides an ANSI C compiler. It can also be
embedded into programs written in many other languages by using the Lua C API, and its functionality
can be extended with libraries. For example, Lua has been used in programs written in C, C++, C#,
Java, Smalltalk, Ada, Erlang, and more. At only about 20,000 lines of C source code, Lua is small
and lightweight, so it adds little to the size or run-time overhead of a host program.
Lua uses a small set of reserved words (and, break, do, else, elseif, end, false, for, function, if, in,
local, nil, not, or, repeat, return, then, true, until, while). It is also a dynamically typed language,
meaning that variables do not have types assigned to them. Instead, every value carries its own
type. The basic data types include nil, Booleans, numbers (double-precision floating point by
default), and strings.
Lua has a single built-in data structure known as a table. The Lua table functions as a
heterogeneous associative array of key and value pairs. An entry in the table can be looked up by
either a numeric index or a string key. Tables can store values of any data type except nil, which
represents the absence of a value. Indexes into tables and strings start at 1 instead of 0.
Metatables are another advanced concept in Lua. Metatables allow the user to define special
operations to occur under specific circumstances for a value. For example, when the addition
operator is applied to an operand that is neither a number nor a string convertible to a number,
Lua checks the operand’s metatable for an __add handler and calls that function to carry out the
operation. Metatables are stored as normal Lua tables whose keys are event names and whose values
are functions. Combining tables, metatables, and functions is Lua’s way of supporting
object-oriented programming.
Functions in Lua are first-class values: they can be stored in variables, passed as arguments, and
returned from other functions like any other value. Lua functions are also lexically scoped, so an
inner function can refer to the local variables of the function that encloses it. To keep those
variables valid, Lua represents every function value as a closure. A closure keeps the captured
variables intact and available after the enclosing function has finished executing and its
activation record has been removed. An upvalue is the reference from a closure to one of these
captured variables.
Error handling can occur in both Lua code and C code because Lua runs embedded in a C host program.
If an error occurs during compilation or execution, control is handed back to the host, which can
respond appropriately, for example by printing an error message. Lua code can also generate an error
explicitly by calling the error function, and it can catch errors by running code in protected mode
with pcall. This allows the user to throw and catch errors that occur at run time.
Garbage collection occurs automatically in Lua via an incremental mark-and-sweep collector. Objects
that are no longer reachable in the program are reclaimed. Lua also supports weak references through
weak tables, whose keys, values, or both are held weakly. An object that is reachable only through
weak references is still collected. Since garbage collection is automatic, there is no need for the
user to allocate and free memory explicitly. However, parameters of the garbage collector can be
adjusted to change how collection behaves or the rate at which it runs.
Coroutines in Lua play a role similar to threads in other languages, but they are cooperative: only
one coroutine runs at a time, and it keeps running until it explicitly yields. The Lua standard
library provides three functions to create, resume, and yield a coroutine (coroutine.create,
coroutine.resume, and coroutine.yield). During runtime, each coroutine has its own stack that it
uses for its lifetime. When a coroutine yields, its stack is set aside until the coroutine resumes
execution. When a coroutine finishes or is no longer accessible, its stack is reclaimed by the
garbage collector.
Lua is a very versatile scripting language that is popular among many development communities. It
is implemented in C and can be embedded in programs written in a variety of languages. Lua is also
worthwhile for its performance and unique features. It has grown into a prominent language over the
past twenty years and will likely continue to grow in the future.
2. The Rust Language
Graydon Hoare, a developer at Mozilla, began designing and developing Rust in 2006. In 2009,
Mozilla took over development of the language with Hoare as lead developer. The language was first
released in January 2012, and the current version, Rust v0.8, came out in September 2013. Rust is
open source, and much of it is contributed by the Rust community itself. As Rust is still a new
language in its infancy, there are not many large applications currently using it. Mozilla currently
uses Rust as a research platform for new internet browser prototypes and similar applications. The
majority of Rust’s development activity occurs on GitHub.
Rust is another multi-paradigm language that is modeled after C and C++. It draws on specific
aspects of the functional, imperative, and object-oriented paradigms. Most importantly, Rust is
designed to prioritize memory safety and safe concurrency. Rust code is compiled by the rustc
compiler, which outputs an executable binary file. The Rust compiler is written in Rust itself and
is bootstrapped from a precompiled version of itself. It also uses LLVM (Low Level Virtual Machine),
a compiler infrastructure written in C++, as its back end for optimization and machine code
generation.
Rust takes the ideas and features of many other languages and implements them in its own way.
Syntactically, Rust is similar to C or C++, as it uses many of the same keywords and the same
program structure. However, the language is not similar to either of these languages semantically.
Additional features were refined from other languages as well. For example, Rust’s approach to
concurrency is similar to Erlang’s, and Rust’s type system uses traits that resemble the type
classes found in Haskell. Rust is multi-platform and can be developed on Windows, OS X, and Linux.
Rust achieves memory safety by disallowing situations that create unsafe conditions in other
languages. The semantics of the language do not allow null pointers, dangling pointers, wild
pointers, or any other situation that would compromise the integrity of memory. Objects are
considered immutable by default, but the programmer is given complete control over the mutability of
objects within a program. Any variable that is declared must be initialized with a value before it
is used; otherwise, the compiler reports an error. Rust uses a stringent set of static semantics to
enforce these safety rules.
In Rust, the unit of compilation, linking, and loading is referred to as a crate. Each crate
contains a tree of nested scopes that represents its modules. At compile time, a single crate in
source form is compiled into a single crate in binary form, either an executable or a library. If a
crate refers to modules in other source files, then those source files are brought into the compiler
and compiled as part of the crate. If a crate contains a main function, then it is compiled into an
executable. In this case, the main function cannot take arguments and must return the unit type.
Inside a crate, there is a sequence of zero or more items, which form the core of the source
code. Items are defined by the language to be modules, functions, type definitions, structures, etc. A
module is loosely comparable to a namespace or a static class in other languages, as it can contain
functions and variables. Functions in Rust are similar to functions in C or C++ in that the function
declaration must specify the parameters, their types, and the return type. In keeping with Rust’s
emphasis on safety, a function is deemed unsafe if it exhibits behavior considered to be unsafe
(dereferencing null or raw pointers, data races, etc.) or if it calls another unsafe function. The
programmer may explicitly mark a function as unsafe by placing the ‘unsafe’ keyword before the
function declaration. Within an unsafe function, the programmer may use techniques that are
otherwise not allowed. The compiler does not apply its usual safety checks to those operations and
leaves it to the programmer to guarantee the code’s integrity.
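As a minimal sketch of the ‘unsafe’ keyword described above, the following uses current stable Rust syntax rather than the 0.8 syntax. The names read_raw and read_value are hypothetical and exist only for this example.

```rust
/// Reads an integer through a raw pointer. The compiler cannot prove
/// that the pointer is valid, so the function is marked `unsafe` and
/// the caller becomes responsible for that guarantee.
unsafe fn read_raw(ptr: *const i32) -> i32 {
    // SAFETY: the caller must pass a pointer to a live, aligned i32.
    unsafe { *ptr }
}

/// A safe wrapper: it only ever builds the raw pointer from a valid
/// reference, so callers of this function need no `unsafe` code.
fn read_value(value: &i32) -> i32 {
    let ptr: *const i32 = value; // creating a raw pointer is safe
    // SAFETY: `ptr` was derived from a reference that is valid here.
    unsafe { read_raw(ptr) }
}
```

Calling read_raw outside an unsafe block or unsafe function is a compile-time error, which keeps the unchecked operations confined to clearly marked regions of the program.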
Rust’s memory model is also interwoven with its concurrency model. A Rust program can be broken
down into tasks that can be executed concurrently. Each task is equipped with a stack and a heap.
A memory allocation in the task’s stack is called a slot, whereas one in the task’s heap is called a
box. Memory that is shared between tasks must be immutable. The lifecycle of a task is determined by
its state: running, blocked, failing, or dead. A task enters the failing state when an external
event kills it while it is running or blocked. A dead task is unable to transition back into another
state and exists only for inspection by other tasks. Task scheduling gives each task a finite time
slice. If the task does not complete within that time slice, it loses its priority and is
descheduled. It will be rescheduled after a pseudo-random interval.
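The idea that concurrent units of work may share only immutable data can be sketched as follows. This example rests on an assumption: it uses the standard library threads and channels of current stable Rust (std::thread, std::sync::Arc, std::sync::mpsc) in place of the task API of Rust 0.8, which is not shown here. The function name parallel_sum is hypothetical.

```rust
use std::sync::mpsc;
use std::sync::Arc;
use std::thread;

/// Splits `data` across `workers` threads and returns the total.
/// The input is shared read-only; results come back over a channel.
fn parallel_sum(data: Vec<u64>, workers: usize) -> u64 {
    let shared = Arc::new(data); // immutable data shared by all workers
    let (tx, rx) = mpsc::channel();
    let workers = workers.max(1);
    let chunk = (shared.len() + workers - 1) / workers; // ceiling division

    let mut handles = Vec::new();
    for i in 0..workers {
        let shared = Arc::clone(&shared);
        let tx = tx.clone();
        handles.push(thread::spawn(move || {
            // Each worker sums its own slice of the shared vector.
            let start = (i * chunk).min(shared.len());
            let end = (start + chunk).min(shared.len());
            let partial: u64 = shared[start..end].iter().sum();
            tx.send(partial).expect("receiver is still alive");
        }));
    }
    drop(tx); // drop the original sender so the receiving loop can end

    for handle in handles {
        handle.join().expect("worker panicked");
    }
    rx.iter().sum()
}
```

Because the vector sits behind an Arc without any lock, the workers can only read it. An attempt to mutate it from inside a worker would be rejected by the compiler.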
In addition to its other features, Rust also sports first-class functions and closures (similar to
Lua), algebraic data types through enums, and traits that share aspects of type classes and
interfaces. Garbage collection is done by a single-threaded algorithm that scans each per-task heap.
Inaccessible identifiers, discarded pointers, and finished functions are marked and reclaimed by the
collector. Some pointer types do not need to be traversed by the garbage collector because they
release themselves and their storage as soon as they are no longer in use.
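The three language features named at the start of the preceding paragraph (enums as algebraic data types, traits, and closures) can be sketched together in a few lines. The code uses current stable Rust syntax, not the 0.8 syntax, and the names Shape, Area, scaler, and total_area are hypothetical names introduced for this example.

```rust
/// An algebraic data type: a value is exactly one of these variants.
enum Shape {
    Circle { radius: f64 },
    Rect { width: f64, height: f64 },
}

/// A trait plays a role similar to an interface or a type class.
trait Area {
    fn area(&self) -> f64;
}

impl Area for Shape {
    fn area(&self) -> f64 {
        // `match` must cover every variant of the enum.
        match self {
            Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
            Shape::Rect { width, height } => width * height,
        }
    }
}

/// Returns a closure that captures `factor` from the enclosing scope.
fn scaler(factor: f64) -> impl Fn(f64) -> f64 {
    move |x| x * factor
}

/// Functions are first-class: `f` is passed in like any other value.
fn total_area(shapes: &[Shape], f: impl Fn(f64) -> f64) -> f64 {
    shapes.iter().map(|s| f(s.area())).sum()
}
```

The closure returned by scaler keeps its captured factor alive after scaler has returned, which is the same behavior described for Lua closures in the first section.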
Rust is a language that is still in its infancy. It is built around novel concepts of safe memory
management and strict concurrency. However, it has a moderate level of complexity compared to other
languages, along with a small development community. Rust will require some time before it can grow
to become a popular and influential language.
3. The D Language
Walter Bright, the author of the D language, first released a working alpha version in December 2001. It
took six years of development to reach D version 1.0 (D1). D1 was designed to incorporate features
from the imperative and object-oriented language paradigms. Later, D2 was released and brought with it
several new features from other language paradigms, including functional programming and
metaprogramming. Currently, the D language is on version 2.063.2, which is considered a stable
release. D was intended to be a redesign of C++ and has been influenced by many languages (C, C++,
Java, Python, Ruby), and it has influenced several new languages as well (MiniD, Vala, Qore). As of
October 2013, Facebook has begun using the D language in production in its products. This is the
language’s largest commercial use to date.
The general impression of D is that it looks and feels like C and C++, and D shares many of the same
syntactic and semantic rules. D source code is compiled directly into native machine code. In
addition to the reference D compiler (DMD), which has its own back end, there is a D front end for
the GCC compiler (GDC) and an LLVM-based compiler (LDC). While D was designed under the influence of
C++, the finished result is a stand-alone language that cannot accept C/C++ source code without
revision. However, it is still close enough to access the C runtime library and to link against
existing C code.
Being a multi-paradigm programming language, D uses the concepts of other languages to implement
each paradigm. The procedural aspect of D is strikingly similar to C, which is where it takes its
imperative roots. D uses a single-inheritance class hierarchy and Java-like interfaces to achieve
object-oriented programming. Metaprogramming is supported at compile time: D code can generate and
manipulate other code through features such as templates and compile-time function execution. The
functional programming paradigm is supported through function literals, closures, and immutable data
types and structures. Lastly, the concurrent paradigm is implemented in D through the creation and
management of threads and the way that the threads communicate with each other.
A few new features added to D include automatic type inference for variable declarations, traversal
of a collection in a loop, and scope statements. When the keyword ‘auto’ is used in front of a
variable identifier instead of a data type, the compiler infers the variable’s type from its
initializer. This type inference is done at compile time, so the variable is still statically typed.
This is different from dynamically typed languages, such as JavaScript, in which a single variable
can hold values of any type. The foreach keyword is another addition to the language; it creates a
loop that visits every entry in an array or other collection. Some languages, like C#, also have
this feature. It is a convenient tool for looping through a data structure whose number of elements
is not fixed at run time.
Exception and error handling is done via try-catch blocks similar to Java. D prefers user-defined
exception handling to predetermined error codes. Try-catch-finally statements allow the user to
enclose a block of code so that any errors thrown during its execution are caught and handled
explicitly. Scope statements are a more general form of cleanup code that can be used in
conjunction with try-catch blocks. A scope statement registers code to execute when the enclosing
scope exits: on every exit, only on success, or only on failure.
D enables both automatic memory management through the garbage collector and explicit memory
management using predefined functions to allocate or release memory. Much like in other languages,
garbage collection in D scans the threads, stacks, and heaps currently in use for objects that are
no longer referenced. The garbage collector operates automatically and independently. This frees the
programmer from having to allocate and release memory manually as in C. However, there are some
drawbacks to garbage collection in D. It is unpredictable when a collection will run, and it may run
only when memory is close to running out. Also, all thread execution is halted while garbage
collection is taking place.
D turns out to be another rather versatile language that has been influenced by several prominent and
popular languages from the past twenty years. The result is a stable language that is suitable for
creating native applications on different platforms. D may not be the best language for beginners, but it
is an excellent language for intermediate programmers, who should be able to pick it up easily. With
Facebook now using and supporting the D language community, it may begin to garner more attention
in the future.
References
[1] The Lua Programming Language Website. Retrieved from http://www.lua.org/home.html
[2] Ierusalimschy et al. (2012). The Implementation of Lua 5.0.
Retrieved from http://www.lua.org/doc/jucs05.pdf
[3] Kulchenko, Paul. (2012). Lua: Good, Bad, and Ugly Parts. Retrieved from
http://notebook.kulchenko.com/programming/lua-good-different-bad-and-ugly-parts
[4] The Rust Language Website. Retrieved from http://www.rust-lang.org/
[5] Rust Reference Manual and Grammar. Retrieved from
http://static.rust-lang.org/doc/master/rust.html
[6] GitHub’s Rust Language Repository. Retrieved from https://github.com/mozilla/rust
[7] The D Language Website. Retrieved from http://dlang.org/
[8] Bridgwater, Adrian. (2013). Facebook Adopts D Language. Retrieved from
http://www.drdobbs.com/mobile/facebook-adopts-d-language/240162694
[9] Luca. (2013). Rust vs D. Retrieved from http://versusit.org/rust-vs-d
[10] GitHub’s D Programming Language Repository. Retrieved from https://github.com/D-ProgrammingLanguage