Uploaded by swangari388

Introduction Data Science Programming Handout Set 1A

Big Data Science Programming Tools
This course unit introduces learners to the programming tools that are used extensively in big data analysis.
Big data is already here, but what is possible with it can only be realized if we are able to dig into the ocean of big data.
Remember the catch phrase: “Big data is gold waiting to be mined”.
Traditional programming tools such as C, C++, Fortran and Java (up to Java 6) have had their share in software development and have contributed immensely to automating business processes. However, in their current state they might not provide the power needed to derive the benefits of big data.
This is where Scala, Hadoop and other tools such as Apache Spark come in.
Specifically, we will focus on Scala and Apache Spark.
Note
1. Scala runs on the Java Virtual Machine (JVM), so Java must be installed on your machine.
2. Spark is commonly deployed alongside Hadoop. Therefore, it is better to install Spark on a Linux-based system.
3. Spark is not a modified version of Hadoop and does not depend on Hadoop, because it has its own cluster management. Hadoop is just one of the ways to deploy Spark.
4. Apache Spark is written in Scala, so Scala must be installed as well as Java.
5. Apache Spark is an in-memory big data platform that performs especially well with iterative algorithms:
◦ It provides a 10-100x speedup over Hadoop with some algorithms, especially iterative ones as found in machine learning.
◦ It was originally developed at UC Berkeley starting in 2009 and later moved to an Apache project in 2013.
◦ Spark itself is written in Scala, and Spark jobs can be written in Scala, Python, and Java (and more recently R and SparkSQL).
◦ Other libraries include Streaming, Machine Learning, Graph Processing, etc.
◦ The reported share of Spark programmers who use each language is 88% Scala, 44% Python and 22% Java (the figures overlap because some programmers use more than one language).
◦ For a data scientist who deals with large datasets, Scala will be invaluable. It is practically the de facto language for current Big Data tools like Apache Spark, Finagle, Scalding, etc.
Many of the high-performance data science frameworks built on top of Hadoop are written in Scala or Java. Scala is used in these environments because of its strong concurrency support, which is key to parallelizing much of the processing needed for large data sets. It also runs on the JVM, which makes it a natural fit when paired with Hadoop.
With the industry adoption of Apache Spark (a leading Scala framework for cloud computing and Big Data), Scala has quickly become popular among Big Data professionals. If you are going for an interview for any Big Data job opening that requires Apache Spark experience, you should prepare for Scala interview questions as well, because Spark is written in Scala.
Introduction To Scala Programming
• The name Scala stands for “scalable language.” It is so named because it was designed to grow as the demands of its users grow.
• With Scala you can handle a wide range of programming tasks, from writing small scripts (as in JavaScript or bash scripting) to building large systems.
• Scala is a modern multi-paradigm programming language designed to express common programming patterns in a concise, elegant, and type-safe way.
• Scala was created by Martin Odersky, who released the first version in 2003.
• Scala is a hybrid programming language. It smoothly integrates the features of object-oriented and functional languages.
• Scala is compiled to run on the Java Virtual Machine.
• You can think of Scala as a first-choice programming platform for boosting your development productivity and for writing scalable, reliable applications and systems.
Prerequisites
• Scala's syntax is close to Java's, so if you are familiar with Java syntax, it is fairly easy to learn Scala.
• If you do not have expertise in Java but know another programming language such as C, C++ or Python, that will also help you grasp Scala concepts quickly.
• It is also very important that you have done CMT 203: Introduction to System Administration, because almost all big data analysis tools and concepts are based on open source solutions rather than proprietary ones.
Features of Scala
Object-oriented Programming Language
Scala is a pure object-oriented language in the sense that every value is an object and every operation is a method call. Types and behavior of objects are described by classes and traits (to be discussed later).
Classes are extended by subclassing and by a flexible mixin-based composition mechanism, which serves as a clean replacement for multiple inheritance.
For example, when you write 1 + 2 in Scala, you are actually invoking a method named + defined in class Int. You can define methods with operator-like names that clients of your API can then use in operator notation.
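The point that operators are ordinary method calls can be tried out with a minimal sketch like the one below. The type Vec2 and the object name OperatorDemo are hypothetical, introduced only for illustration.

```scala
object OperatorDemo {
  // Hypothetical 2-D vector type, used only for this illustration.
  final case class Vec2(x: Int, y: Int) {
    // A method whose name is the operator character +.
    def +(other: Vec2): Vec2 = Vec2(x + other.x, y + other.y)
  }

  // Operator notation and explicit method-call notation invoke the same method on Int.
  val viaOperator: Int = 1 + 2
  val viaMethod: Int = (1).+(2)

  // Clients of Vec2 can use the user-defined + in operator notation too.
  val sum: Vec2 = Vec2(1, 2) + Vec2(3, 4)
}
```

Here Vec2(1, 2) + Vec2(3, 4) is simply shorthand for Vec2(1, 2).+(Vec2(3, 4)).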
Functional Programming Language
Scala is also a functional language in the sense that every function is a value and every value is an object, so ultimately every function is an object.
Scala provides a lightweight syntax for defining anonymous functions, supports higher-order functions, allows functions to be nested, and supports currying (to be discussed later).
Functional programming is guided by two main ideas.
• The first idea is that functions are first-class values. In a functional language, a function is a value of the same status as, say, an integer or a string. This means you can:
◦ Pass functions as arguments to other functions,
◦ Return them as results from functions, or
◦ Store them in variables.
◦ Define a function inside another function, just as you can define an integer value inside a function.
◦ Define functions without giving them a name (anonymous functions). This means you can sprinkle your code with function literals as easily as you might write integer literals like 3, 20, 50, etc.
Functions that are first-class values provide a convenient means for abstracting over operations and creating new control structures.
This generalization of functions provides great expressiveness, which often leads to very legible and concise programs. It also plays an important role in scalability.
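The uses of first-class functions listed above can be sketched as follows. All names here (FirstClassFunctions, twice, applyTwice, adder, sumOfSquares) are hypothetical and chosen only for this illustration.

```scala
object FirstClassFunctions {
  // A function literal (anonymous function) stored in a variable.
  val twice: Int => Int = x => x * 2

  // A higher-order function: it takes another function as an argument.
  def applyTwice(f: Int => Int, x: Int): Int = f(f(x))

  // A function returned as the result of another function.
  def adder(n: Int): Int => Int = x => x + n

  // A function defined inside another function.
  def sumOfSquares(a: Int, b: Int): Int = {
    def square(v: Int): Int = v * v
    square(a) + square(b)
  }
}
```

For example, FirstClassFunctions.applyTwice(FirstClassFunctions.twice, 3) doubles 3 two times, and List(1, 2, 3).map(FirstClassFunctions.twice) passes the stored function to a library method.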
• The second main idea of functional programming is that the operations of a program should map input values to output values rather than change data in place.
Example Illustration
Consider the implementation of strings in Ruby and in Java.
In Ruby, a string is an array of characters. Characters in a string can be changed individually. For instance, you can change a semicolon character in a string to a period inside the same string object.
In Java and Scala, on the other hand, a string is a sequence of characters in the mathematical sense. Replacing a character in a string using an expression like s.replace(';', '.') yields a new string object, which is different from s.
We say that strings are immutable (cannot be changed in place) in Java, whereas they are mutable in Ruby.
A main principle of functional programming is that methods should not have any side effects. They should communicate with their environment only by taking arguments and returning results.
Methods like replace are called referentially transparent, which means that for any given input the method call could be replaced by its result without affecting the program’s semantics.
• Immutable data structures are one of the cornerstones of functional programming.
The Scala libraries define many more immutable data types on top of those found in the Java APIs. For instance, Scala has immutable lists, tuples, maps, and sets.
In addition to immutable data structures, functional languages also encourage referentially transparent methods.
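A small sketch of what immutability means in practice: each operation below returns a new value and leaves the original untouched. The object name ImmutabilityDemo and the sample values are assumptions made for illustration.

```scala
object ImmutabilityDemo {
  val s: String = "a;b;c"
  // replace returns a new String; s itself is not modified.
  val replaced: String = s.replace(';', '.')

  val xs: List[Int] = List(1, 2, 3)
  // Prepending with :: builds a new list; xs still has three elements.
  val ys: List[Int] = 0 :: xs

  val ages: Map[String, Int] = Map("ann" -> 30)
  // Adding a key yields a new map; ages still has one entry.
  val moreAges: Map[String, Int] = ages + ("bob" -> 25)
}
```

Because s.replace(';', '.') has no side effects, any occurrence of that call could be replaced by the string it returns, which is what referential transparency means.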
Scala is statically typed
Scala, unlike some other statically typed languages (C, Pascal, etc.), does not expect you to provide redundant type information. You don't have to specify a type in most cases, and you certainly don't have to repeat it.
Scala runs on the JVM
• Scala is compiled into Java bytecode, which is executed by the Java Virtual Machine (JVM). This means that Scala and Java have a common runtime platform, so you can easily move from Java to Scala.
• The Scala compiler compiles your Scala code into Java bytecode, which can then be executed with the 'scala' command. The 'scala' command is similar to the java command in that it executes your compiled Scala code.
• The combination of the object-oriented and functional styles in Scala makes it possible to express new kinds of programming patterns and component abstractions.
• Scala has a set of convenient constructs that help you get started quickly and let you program in a pleasantly concise style. At the same time, you have the assurance that you will not outgrow the language.
• You can always tailor the program to your requirements, because everything is based on library modules that you can select and adapt as needed.
Bazaar (NOT Cathedral) Design
• Scala is much more like a bazaar than a cathedral, in the sense that it is designed to be extended and adapted by the people programming in it.
• Instead of providing all constructs you might ever need in one “perfectly complete” language, Scala puts the tools for building such constructs into your hands.
Scala can Execute Java Code
Scala enables you to use all the classes of the Java SDK, as well as your own custom Java classes and your favorite Java open source projects. As we gain experience programming in Scala, we will get a taste of Java in Scala.
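As a minimal sketch of using Java classes from Scala, the example below calls only standard Java SDK classes (java.util.ArrayList, java.lang.Math and java.lang.String). The object name JavaInteropDemo and the sample values are assumptions for illustration.

```scala
import java.util.{ArrayList => JArrayList}

object JavaInteropDemo {
  // Create and use a Java collection class directly from Scala.
  def javaListSize(): Int = {
    val names = new JArrayList[String]()
    names.add("Scala")
    names.add("Java")
    names.size()
  }

  // Call a static method of a Java class.
  val larger: Int = java.lang.Math.max(3, 7)

  // Scala strings are java.lang.String, so Java's String methods are available.
  val upper: String = "big data".toUpperCase
}
```

The import clause renames ArrayList to JArrayList only to make it obvious in the code which class comes from Java.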
Scala can do Concurrent & Synchronized Processing
Scala allows you to express general programming patterns in an effective way. It reduces the number of lines of code and helps the programmer to code in a type-safe way. It encourages you to write code in an immutable style, which makes it easier to apply concurrency and parallelism.
Scala Web Frameworks
Scala is widely used, notably in enterprise web applications, and several popular web frameworks are built on it.
Scala vs Java
Scala has a set of features that differ completely from Java. Some of these are −
• All types are objects
• Type inference
• Nested Functions
• Functions are objects
• Domain specific language (DSL) support
• Traits
• Closures
• Concurrency support inspired by Erlang
Scala is Concise
Scala programs tend to be shorter and less noisy than Java programs. Scala programmers have reported reductions in the number of lines of up to a factor of ten compared to Java. A more conservative estimate would be that a typical Scala program has about half the number of lines of the same program written in Java. Fewer lines of code means:
• Less typing,
• Less effort in reading and understanding programs,
• Fewer possibilities of errors, all resulting in
• Shorter system development time
Example Illustration
Compare the example below, which shows how you write a class with a constructor in Java and in Scala.
In Java, a class with a constructor:
class MyClass {
    private int index;
    private String name;
    public MyClass(int index, String name) {
        this.index = index;
        this.name = name;
    }
}
In Scala, the same class:
class MyClass(index: Int, name: String)
Explanation
The Scala compiler will produce a class that has two private instance variables, an Int named index and a String named name, and a constructor that takes initial values for those variables as parameters. The code of this constructor will initialize the two instance variables with the values passed as parameters.
Scala is high-level
Programmers are constantly grappling with complexity. To program productively, you must understand
the code on which you are working.
Overly complex code has been the downfall of many a software project and unfortunately in some
cases we can’t avoid complex requirements while writing software. It must instead be managed.
Scala helps you manage complexity by letting you raise the level of abstraction in the interfaces you
design and use.
Example Illustration
As an example, imagine you have a String variable name, and you want to find out whether or not that String contains an upper case character. In Java, you might write this:
// this is Java
boolean nameHasUpperCase = false;
for (int i = 0; i < name.length(); ++i) {
    if (Character.isUpperCase(name.charAt(i))) {
        nameHasUpperCase = true;
        break;
    }
}
Whereas in Scala, you could write this:
val nameHasUpperCase = name.exists(_.isUpper)
Explanation
The Java code treats strings as low-level entities that are stepped through character by character in a loop. The Scala code treats the same strings as higher-level sequences of characters that can be queried with predicates.
Clearly the Scala code is much shorter and easier to understand than the Java code, so it weighs less heavily on the total complexity cost. It also reduces the opportunity to make errors.
The predicate _.isUpper is an example of a function literal in Scala. It describes a function that takes a character argument (represented by the underscore character) and tests whether it is an upper case letter.
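To try the predicate-based check as a complete program fragment, a minimal sketch follows. The object and method names (PredicateDemo, hasUpperCase, hasUpperCaseExplicit) are hypothetical; exists and isUpper come from the Scala standard library.

```scala
object PredicateDemo {
  // Placeholder syntax: the underscore stands for each character in turn.
  def hasUpperCase(name: String): Boolean = name.exists(_.isUpper)

  // The same predicate written with an explicitly named parameter.
  def hasUpperCaseExplicit(name: String): Boolean =
    name.exists((c: Char) => c.isUpper)
}
```

Both methods behave identically; the underscore form is just a shorter way of writing the function literal.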
Setup Scala - Environment
Scala can be installed on any UNIX flavored or Windows based system.
Before you start installing Scala on your machine, you must have Java 1.8 or greater installed on your
computer.
Follow the steps given below to install Scala.
Step 1: Verify Your Java Installation
First of all, you need to have the Java Software Development Kit (SDK) installed on your system. To verify this, run the command shown below for your platform.
If the Java installation has been done properly, the command displays the current version and specification of your Java installation. A sample output is given in the following table.
Platform: Windows
Command: Open Command Console and type − \>java -version
Sample Output: java version "1.8.0_31"
Platform: Linux
Command: Open Command terminal and type − $java -version
Sample Output: java version "1.8.0_31"
We assume that the readers of this tutorial have Java SDK version 1.8.0_31 installed on their system.
In case you do not have the Java SDK, download its current version from
http://www.oracle.com/technetwork/java/javase/downloads/index.html and install it.
Step 2: Set Your Java Environment
Set the environment variable JAVA_HOME to point to the base directory location where Java is installed on your machine. For example,
Sr.No  Platform & Description
1  Windows: Set JAVA_HOME to C:\Program Files\Java\jdk1.7.0_60
2  Linux: export JAVA_HOME=/usr/local/java-current
Append the full path of the Java compiler location to the system PATH.
Sr.No  Platform & Description
1  Windows: Append the string "C:\Program Files\Java\jdk1.7.0_60\bin" to the end of the system variable PATH.
2  Linux: export PATH=$PATH:$JAVA_HOME/bin/
The paths above are examples; use the directory of the JDK version actually installed on your machine.
Execute the command java -version from the command prompt as explained above.
Step 3: Install Scala
You can download Scala from http://www.scala-lang.org/downloads. Make sure you have admin privileges to proceed. Now, execute the following command at the command prompt −
Platform: Windows
Command & Output: \>java -jar scala-2.11.5-installer.jar
Description: This command will display an installation wizard, which will guide you through installing Scala on your Windows machine. During installation, it will ask you to accept the license agreement; simply accept it. It will then ask for a path where Scala will be installed. The default path “C:\Program Files\Scala” was used here; you can select a suitable path as per your convenience.
Platform: Linux
Command −
$java -jar scala-2.9.0.1-installer.jar
Output −
Welcome to the installation of Scala 2.9.0.1!
The homepage is at − http://Scala-lang.org/
press 1 to continue, 2 to quit, 3 to redisplay
1................................................
[ Starting to unpack ]
[ Processing package: Software Package Installation (1/1) ]
[ Unpacking finished ]
[ Console installation done ]
Description: During installation, it will ask you to accept the license agreement; type 1 to accept it. It will then ask for a path where Scala will be installed. The path /usr/local/share was used here; you can select a suitable path as per your convenience.
Finally, open a new command prompt, type scala -version and press Enter. You should see the following −
Platform: Windows
Command: \>scala -version
Output: Scala code runner version 2.11.5 -- Copyright 2002-2013, LAMP/EPFL
Platform: Linux
Command: $scala -version
Output: Scala code runner version 2.9.0.1 -- Copyright 2002-2013, LAMP/EPFL
Basic Scala Syntax
If you have a good understanding of Java, then it will be very easy for you to learn Scala.
The biggest syntactic difference between Scala and Java is that the ';' line end character is optional.
We can consider a Scala program as a collection of objects that communicate by invoking each other’s methods.
Let us briefly look at what class, object, method and instance variable mean before we move forward.
• Object − Objects have states and behaviors. An object is an instance of a class. Example − A Dog has states (color, name, breed) as well as behaviors (wagging, barking, and eating).
• Qn: What are the states and behaviors of a Person, a Student, a Soccer tournament?
• Class − A class can be defined as a template/blueprint that describes the behaviors/states shared by the objects of that class.
• Qn: Construct a class in Java and in Scala for each of the following: Person, Marks, Dog.
• Methods − A method is basically a behavior. A class can contain many methods. It is in methods that the logic is written, data is manipulated and all the actions are executed.
• Fields − Each object has its unique set of instance variables, which are called fields. An object's state is created by the values assigned to these fields.
• Closure − A closure is a function whose return value depends on the value of one or more variables declared outside this function.
• Traits − A trait encapsulates method and field definitions, which can then be reused by mixing them into classes. Traits are used to define object types by specifying the signature of the supported methods.
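The terms above can be seen together in one short sketch. The names Animal, Dog, factor and scale are hypothetical and serve only to illustrate a trait, a class with fields and a method, and a closure.

```scala
object SyntaxTermsDemo {
  // Trait: declares a method signature and supplies a reusable method.
  trait Animal {
    def name: String                              // signature only
    def describe: String = s"$name is an animal"  // reusable behavior
  }

  // Class: a blueprint. name and breed are fields; bark is a method.
  class Dog(val name: String, val breed: String) extends Animal {
    def bark(): String = s"$name says Woof"
  }

  // Closure: the result of scale depends on factor, declared outside it.
  var factor: Int = 3
  val scale: Int => Int = x => x * factor
}
```

An object is then an instance of the class, for example new SyntaxTermsDemo.Dog("Rex", "Collie"). Changing factor later changes what scale returns, because the closure refers to the variable itself rather than to a copy of its value.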
Scala Basic Syntax
We will get started with Scala by using the Scala interpreter, an interactive “shell” for writing Scala expressions and programs. A shell is a kind of REPL (Read-Evaluate-Print Loop), commonly found in several languages to support interactive development.
Later we will be using files just as we do in Java.
Simply type an expression into the interpreter and it will evaluate the expression and print the resulting value.
We can execute a Scala program in two modes: interactive mode and script mode.
Interactive Mode
Open the command prompt and use the following command to open Scala.
\>scala
If Scala is installed in your system, the following output will be displayed −
Welcome to Scala version 2.9.0.1
Type in expressions to have them evaluated.
Type :help for more information.
Type the following text to the right of the Scala prompt and press the Enter key −
scala> println("Hello, Scala!");
It will produce the following result −
Hello, Scala!
Setting Your Environment
Make sure you have your own account. In case of any problem, consult the technician.
For all practical work you will be logged into your account.
Create a directory Desktop/Scala in your home directory by typing the following
$mkdir -p Desktop/Scala
Move to the Scala directory
$cd Desktop/Scala
You will now be saving all your work in this directory.
Script Mode: Running a code in a file
Use the following instructions to write a Scala program in script mode. Use the cat command to create a file HelloWorld.scala with the content shown below. Press CTRL + D when complete.
$cat >HelloWorld.scala
object HelloWorld {
  /* This is my first scala program.
   * This will print 'Hello, world!' as the output
   */
  def main(args: Array[String]): Unit = {
    println("Hello, world!") // prints Hello, world!
  }
}
Once you are finished, you have created a file HelloWorld.scala in Desktop/Scala.
Note: You can use any of your favorite editors to create files. Popular choices include nano, vim, gedit, etc.
The ‘scalac’ command is used to compile the Scala program. It will generate a few class files in the current directory. One of them will be called HelloWorld.class; it contains bytecode that runs on the Java Virtual Machine (JVM) using the ‘scala’ command.
\> scalac HelloWorld.scala
Use the following command to execute your compiled Scala program.
\> scala HelloWorld
Output
Hello, world!
A bit of Scala Simple Tasks
scala> 4 + 3
res0: Int = 7
The result line consists of:
• an automatically generated or user-defined name to refer to the computed value (res0 means result 0),
• a colon ( : ), followed by the type of the expression ( Int ),
• an equals sign ( = ), and
• the value resulting from evaluating the expression ( 7 ).
The resX identifier may be used in later lines. For instance, since res0 was set to 7 previously, res0 * 3 will be 21:
scala> res0 * 3
res1: Int = 21
The type Int names the class Int in the package scala . Packages in Scala are similar to packages in
Java: they partition the global namespace and provide a mechanism for information hiding.
Values of Scala class Int correspond to Java’s int values.
Note:
1. All of Java’s primitive types have corresponding classes in the scala package. For example,
scala.Boolean corresponds to Java’s boolean, scala.Float corresponds to Java’s float.
2. And when you compile your Scala code to Java bytecodes, the Scala compiler will use Java’s
primitive types where possible to give you the performance benefits of the primitive types.
To exit the interpreter, you can do so by entering :quit or :q .
scala> :quit
$
Syntax & Coding Conventions
The following are the basic syntax and coding conventions in Scala programming.
• Case Sensitivity − Scala is case-sensitive, which means identifier Hello and hello would have
different meaning in Scala.
• Class Names − For all class names, the first letter should be in Upper Case. If several words are
used to form a name of the class, each inner word's first letter should be in Upper Case.
Example − class MyFirstScalaClass.
• Method Names − All method names should start with a Lower Case letter. If multiple words
are used to form the name of the method, then each inner word's first letter should be in Upper
Case.
Example − def myMethodName()
• Program File Name − By convention, the name of the program file should exactly match the object name. When saving the file you should save it using the object name (remember Scala is case-sensitive) and append ‘.scala’ to the end of the name. Unlike Java, the Scala compiler does not enforce this, but following the convention makes your programs easy to locate.
Example − Assume 'HelloWorld' is the object name. Then the file should be saved as 'HelloWorld.scala'.
• def main(args: Array[String]) − Scala program processing starts from the main() method, which is the entry point of a Scala program.
Using the App trait in Scala
Alternatively, Scala provides a helper trait called App that provides the main method. Instead of writing your own main method, an object can extend App to produce a concise, executable application, as shown in the following example:
object Main extends App {
  println("Hello Scala: " + (args mkString ", "))
}
Scala Identifiers
All Scala components require names. Names used for objects, classes, variables and methods are called identifiers.
A keyword cannot be used as an identifier, and identifiers are case-sensitive. Scala supports four types of identifiers.
Alphanumeric Identifiers
An alphanumeric identifier starts with a letter or an underscore, which can be followed by further letters, digits, or underscores. The '$' character also counts as a letter, but it is reserved for identifiers generated by the Scala compiler and should not be used in your own identifiers.
Following are legal alphanumeric identifiers − age, salary, _value, __1_value
Following are identifiers to avoid − 123abc and -salary are illegal; $salary compiles but should not be used
Operator Identifiers
An operator identifier consists of one or more operator characters. Operator characters are printable
ASCII characters such as +, :, ?, ~ or #.
Following are legal operator identifiers − + ++ ::: <?> :>
The Scala compiler will internally "mangle" operator identifiers to turn them into legal Java identifiers
with embedded $ characters. For instance, the identifier :-> would be represented internally as
$colon$minus$greater.
Mixed Identifiers
A mixed identifier consists of an alphanumeric identifier, which is followed by an underscore and an
operator identifier.
Following are legal mixed identifiers − unary_+, myvar_=
Here, unary_+ used as a method name defines a unary + operator and myvar_= used as method name
defines an assignment operator (operator overloading).
Literal Identifiers
A literal identifier is an arbitrary string enclosed in back ticks (` . . . `).
Following are legal literal identifiers − `x` `<clinit>` `yield`
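The identifier kinds described above can be tried out with a small sketch. The types Vec and Account and the object IdentifierDemo are hypothetical and exist only for illustration.

```scala
object IdentifierDemo {
  final case class Vec(x: Int, y: Int) {
    // Operator identifier used as a method name.
    def +(other: Vec): Vec = Vec(x + other.x, y + other.y)
    // Mixed identifier: unary_- defines the prefix minus operator.
    def unary_- : Vec = Vec(-x, -y)
  }

  class Account {
    private var cents: Int = 0
    def balance: Int = cents                     // getter
    // Mixed identifier: balance_= enables the syntax acc.balance = 250.
    def balance_=(v: Int): Unit = { cents = v }
  }

  // Literal identifier: back ticks allow a reserved word to be used as a name.
  val `yield`: Int = 5
}
```

With these definitions, -Vec(1, 2) calls unary_-, and assigning to acc.balance calls balance_= behind the scenes.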
Scala Keywords
The following list shows the reserved words in Scala. These reserved words may not be used as constant or variable or any other identifier names.
abstract   case      catch     class
def        do        else      extends
false      final     finally   for
forSome    if        implicit  import
lazy       match     new       null
object     override  package   private
protected  return    sealed    super
this       throw     trait     try
true       type      val       var
while      with      yield
_          :         =         =>
<-         <:        <%        >:
#          @
Comments in Scala
Scala supports single-line and multi-line comments, very similar to Java. Multi-line comments may be nested, but must be properly nested. All characters inside any comment are ignored by the Scala compiler.
object HelloWorld {
  /* This is my first scala program.
   * This will print 'Hello, world!' as the output
   * This is an example of multi-line comments.
   */
  def main(args: Array[String]): Unit = {
    // Prints Hello, world!
    // This is also an example of single line comment.
    println("Hello, world!")
  }
}
Blank Lines and Whitespace
A line containing only whitespace, possibly with a comment, is known as a blank line, and Scala totally
ignores it. Tokens may be separated by whitespace characters and/or comments.
Newline Characters
• Scala is a line-oriented language where statements may be terminated by semicolons (;) or newlines. A semicolon at the end of a statement is usually optional.
• You can type one if you want, but you don't have to if the statement appears by itself on a single line.
• On the other hand, a semicolon is required if you write multiple statements on a single line.
• The code below is an example of multiple statements on a single line.
val s = "hello"; println(s)
Scala Packages
A package is a named module of code. A package can also be described as a collection of related
objects.
Scala packages can be imported so that they can be referenced in the current compilation scope. The
following statement imports the contents of the scala.xml package −
import scala.xml._
You can import a single class and object, for example, HashMap from the scala.collection.mutable
package −
import scala.collection.mutable.HashMap
You can import more than one class or object from a single package, for example, TreeMap and
TreeSet from the scala.collection.immutable package −
import scala.collection.immutable.{TreeMap, TreeSet}
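As a sketch of how such imports are used once they are in scope, the example below relies only on the standard collection classes named above. The object ImportDemo and the helper wordCounts are hypothetical names chosen for illustration.

```scala
import scala.collection.mutable.HashMap
import scala.collection.immutable.{TreeMap, TreeSet}

object ImportDemo {
  // HashMap can be referred to by its short name because of the import.
  def wordCounts(words: Seq[String]): HashMap[String, Int] = {
    val counts = HashMap.empty[String, Int]
    for (w <- words) counts(w) = counts.getOrElse(w, 0) + 1
    counts
  }

  // TreeSet and TreeMap keep their elements and keys in sorted order.
  val sortedIds: TreeSet[Int] = TreeSet(3, 1, 2)
  val sortedAges: TreeMap[String, Int] = TreeMap("zed" -> 40, "amy" -> 31)
}
```

Without the import clauses, the same classes would have to be written with their full names, such as scala.collection.mutable.HashMap.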
Scala - Data Types
Scala has all the same data types as Java, with the same memory footprint and precision. The following table gives details about the data types available in Scala −
Sr.No  Data Type & Description
1   Byte: 8 bit signed value. Range from -128 to 127
2   Short: 16 bit signed value. Range from -32768 to 32767
3   Int: 32 bit signed value. Range from -2147483648 to 2147483647
4   Long: 64 bit signed value. Range from -9223372036854775808 to 9223372036854775807
5   Float: 32 bit IEEE 754 single-precision float
6   Double: 64 bit IEEE 754 double-precision float
7   Char: 16 bit unsigned Unicode character. Range from U+0000 to U+FFFF
8   String: A sequence of Chars
9   Boolean: Either the literal true or the literal false
10  Unit: Corresponds to no value
11  Null: null or empty reference
12  Nothing: The subtype of every other type; includes no values
13  Any: The supertype of any type; any object is of type Any
14  AnyRef: The supertype of any reference type
All the data types listed above are objects. There are no primitive types like in Java.
This means that you can call methods on an Int, Long, etc.
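A short sketch of calling methods directly on numeric and character values follows. The object name NumericObjectsDemo and the sample values are assumptions for illustration; the methods used are from the Scala standard library.

```scala
object NumericObjectsDemo {
  // Methods can be called directly on number and character literals.
  val asText: String = 42.toString
  val bigger: Int = 3.max(7)
  val oneToFour: List[Int] = 1.to(4).toList
  val upperA: Char = 'a'.toUpper

  // The range limits from the table are available as constants.
  val largestByte: Byte = Byte.MaxValue
  val smallestInt: Int = Int.MinValue
}
```

Even though these look like method calls on objects, the compiler still uses JVM primitive types where it can, so there is no need to avoid this style for performance reasons in ordinary code.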
Scala Basic Literals
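Scala uses simple and intuitive rules for literals. This section explains all the basic Scala literals.
Integral Literals
Integer literals are usually of type Int, or of type Long when followed by an L or l suffix. Here are some integer literals − 0, 035, 21, 0xFFFFFFFF, 0777L (the leading-zero octal forms 035 and 0777L are accepted only by older Scala versions; from Scala 2.11 onward, write decimal or hexadecimal literals instead)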
Floating Point Literal
Floating point literals are of type Float when followed by a floating point type suffix F or f, and are of
type Double otherwise. Here are some floating point literals −
0.0 , 1e30f , 3.14159f , 1.0e100, .1
Boolean Literals
The Boolean literals true and false are members of type Boolean.
Symbol Literals
A symbol literal 'x is a shorthand for the expression scala.Symbol("x"). Symbol is a case class, which is defined as follows.
package scala
final case class Symbol private (name: String) {
  override def toString: String = "'" + name
}
Character Literals
A character literal is a single character enclosed in single quotes. The character is either a printable Unicode character or is described by an escape sequence. Here are some character literals −
'a' , '\u0041', '\n' ,'\t'
String Literals
A string literal is a sequence of characters in double quotes. The characters are either printable Unicode
character or are described by escape sequences. Here are some string literals −
"Hello,\nWorld!"
"This string contains a \" character."
Multi-Line Strings
A multi-line string literal is a sequence of characters enclosed in triple quotes """ ... """. The sequence of characters is arbitrary, except that it may contain three or more consecutive quote characters only at the very end.
Characters need not be printable; newlines or other control characters are also permitted. Here is a multi-line string literal −
"""the present string
spans three
lines."""
Null Values
The null value is of type scala.Null and is thus compatible with every reference type. It denotes a
reference value which refers to a special "null" object.
Escape Sequences
The following escape sequences are recognized in character and string literals.
Escape Sequence  Unicode  Description
\b               \u0008   backspace BS
\t               \u0009   horizontal tab HT
\n               \u000a   linefeed LF
\f               \u000c   formfeed FF
\r               \u000d   carriage return CR
\"               \u0022   double quote "
\'               \u0027   single quote '
\\               \u005c   backslash \
A character with Unicode between 0 and 255 may also be represented by an octal escape, i.e., a backslash '\' followed by a sequence of up to three octal characters. The following example shows a few escape sequence characters −
Example
object Test {
  def main(args: Array[String]): Unit = {
    println("Hello\tWorld\n\n")
  }
}
When the above code is compiled and executed, it produces the following result (the tab separates the two words, and the output ends with two blank lines) −
Output
Hello   World
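The literal forms covered in this section can be checked with a minimal sketch such as the one below. The object name LiteralDemo and the value names are hypothetical; the literals themselves are taken from, or closely follow, the examples above.

```scala
object LiteralDemo {
  val hex: Int = 0xFF                       // hexadecimal Int literal
  val big: Long = 9223372036854775807L      // Long literal needs the L suffix
  val single: Float = 3.14159f              // Float literal needs the f suffix
  val dbl: Double = 1.0e100                 // Double is the default for decimals
  val letterA: Char = '\u0041'              // Unicode escape for the letter A
  val tabbed: String = "Hello\tWorld"       // \t is a single tab character
  val multi: String = """the present string
spans three
lines."""
}
```

The escape \t counts as one character, so the string "Hello\tWorld" has 11 characters rather than 12.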
Scala - Variables
Variables are reserved memory locations used to store values. This means that when you create a variable, you reserve some space in memory.
Based on the data type of a variable, the compiler allocates memory and decides what can be stored in the reserved memory.
This means you can store integers, decimals, or characters in variables.
Variable Declaration
Scala has two kinds of variables, vals and vars. A val is similar to a final variable in Java. Once initialized, a val can never be reassigned. A var, by contrast, is similar to a non-final variable in Java. A var can be reassigned throughout its lifetime.
Example1: Using val
scala> val msg = "Hello, world!"
The type of msg is java.lang.String. This is because Scala strings are implemented by Java’s String class, which is immutable.
NOTE:
Notice that in the example above we did not specify that the variable should be of type java.lang.String, yet Scala gave it the type java.lang.String.
This example illustrates type inference, Scala’s ability to infer the correct type. It is possible here because you initialized msg with a string literal.
Example 2: Using var
var myVar : String = "Foo"
Here, myVar is declared using the keyword var. It is a variable that can change value. It is an example
of a mutable variable.
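The difference between the two kinds of variables can be seen in a small sketch. The object name ValVarDemo and the variable counter are made up for illustration; they are not part of the handout's examples.

```scala
object ValVarDemo {
  def main(args: Array[String]): Unit = {
    val msg = "Hello, world!"   // val: cannot be reassigned after initialization
    // msg = "Goodbye"          // would not compile: reassignment to val

    var counter = 1             // var: may be reassigned during its lifetime
    counter = counter + 1
    println(s"$msg counter = $counter")   // prints: Hello, world! counter = 2
  }
}
```

Uncommenting the reassignment of msg makes the compiler reject the program, which is the behaviour described above for val.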
Variable Data Types
The type of a variable is specified after the variable name and before the equals sign. You can define any
type of Scala variable by mentioning its data type as follows −
Syntax
val or var VariableName : DataType = [Initial Value]
A declaration without an initial value is only valid for an abstract member of a trait or an abstract
class; a local variable, or a field of a concrete class or object, must be initialized when it is declared −
Syntax
var myVar :Int;
val myVal :String;
Example : Declare variables with basic numeric types:
val b: Byte = 1
val x: Int = 1
val l: Long = 1
val s: Short = 1
val d: Double = 2.0
val f: Float = 3.0f
In the first four examples, if you don't explicitly specify a type, the number 1 will default to an Int, so
if you want one of the other data types (Byte, Long, or Short) you need to explicitly declare
those types.
Numbers with a decimal (like 2.0) will default to a Double, so if you want a Float you need to
declare a Float and mark the literal with an f suffix, as shown in the last example.
Because Int and Double are the default numeric types, you typically create them without explicitly
declaring the data type:
val i = 123   // defaults to Int
val x = 1.0   // defaults to Double
The REPL shows that those examples default to Int and Double:
scala> val i = 123
i: Int = 123
scala> val x = 1.0
x: Double = 1.0
BigInt and BigDecimal
For large numbers Scala also includes the types BigInt and BigDecimal:
var b = BigInt(1234567890)
var b = BigDecimal(123456.789)
A great thing about BigInt and BigDecimal is that they support all the operators you’re used to
using with numeric types:
scala> var b = BigInt(1234567890)
b: scala.math.BigInt = 1234567890
scala> b + b
res0: scala.math.BigInt = 2469135780
scala> b * b
res1: scala.math.BigInt = 1524157875019052100
scala> b += 1
scala> println(b)
1234567891
String and Char
Scala also has String and Char data types, which you can generally declare with the implicit form:
val name = "Bill"
val c = 'a'
Though once again, you can use the explicit form, if you prefer:
val name: String = "Bill"
val c: Char = 'a'
As shown, enclose strings in double-quotes and a character in single-quotes.
Two Notes About Strings
Scala has a nice, Ruby-like way to merge multiple strings. Given these three variables:
val firstName = "John"
val mi = 'C'
val lastName = "Doe"
you can append them together like this, if you want to:
val name = firstName + " " + mi + " " + lastName   // Java-like
However, Scala provides this more convenient form:
val name = s"$firstName $mi $lastName"
This form creates a very readable way to print strings that contain variables:
println(s"Name: $firstName $mi $lastName")
As shown, all you have to do is to precede the string with the letter s, and then put a $ symbol before
your variable names inside the string. This feature is known as string interpolation.
String interpolation in Scala provides many more features. For example, you can also enclose your
variable names inside curly braces:
println(s"Name: ${firstName} ${mi} ${lastName}")
For some people that’s easier to read, but an even more important benefit is that you can put
expressions inside the braces, as shown in this REPL example:
scala> println(s"1+1 = ${1+1}")
1+1 = 2
A few other benefits of string interpolation are:
• You can precede strings with the letter f, which lets you use printf style formatting inside strings
• The raw interpolator performs no escaping of literals (such as \n) within the string
• You can create your own string interpolators
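The f and raw interpolators listed above can be sketched as follows. The object name InterpolatorDemo and the sample values are assumptions made for illustration only.

```scala
object InterpolatorDemo {
  val escaped = s"a\nb"      // s interpolator: \n becomes a real newline (3 characters)
  val unescaped = raw"a\nb"  // raw interpolator: backslash and n are kept (4 characters)

  def main(args: Array[String]): Unit = {
    val name = "Bill"
    val height = 1.9
    // f interpolator: a printf-style format specifier follows each variable
    println(f"$name%s is $height%.2f meters tall")
    println(escaped.length)
    println(unescaped.length)
  }
}
```

The exact text produced by the f interpolator for decimals can depend on the machine's locale settings, so no output is shown for that line.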
Multiline strings
A second great feature of Scala strings is that you can create multiline strings by including the string
inside three double-quotes:
val speech = """Four score and
               seven years ago
               our fathers ..."""
That’s very helpful for when you need to work with multiline strings. One drawback of this basic
approach is that lines after the first line are indented, as you can see in the REPL:
scala> val speech = """Four score and
     |                seven years ago
     |                our fathers ..."""
speech: String =
Four score and
               seven years ago
               our fathers ...
A simple way to fix this problem is to put a | symbol in front of all lines after the first line, and call the
stripMargin method after the string:
val speech = """Four score and
               |seven years ago
               |our fathers ...""".stripMargin
The REPL shows that when you do this, all of the lines are left-justified:
scala> val speech = """Four score and
     |                |seven years ago
     |                |our fathers ...""".stripMargin
speech: String =
Four score and
seven years ago
our fathers ...
Because this is what you generally want, this is a common way to create multiline strings.
Example : Illustrating Variable declarations
The following example program illustrates variable declaration in Scala.
It declares four variables: two are defined with a type declaration and the remaining two
without one.
Type the program below in a file named Variables.scala, then compile and execute it.
Give the output of the program.
object Variables {
   def main(args: Array[String]): Unit = {
      var myVar: Int = 10
      val myVal: String = "Hello Scala with datatype declaration."
      var myVar1 = 20
      val myVal1 = "Hello Scala new without datatype declaration."
      println(myVar); println(myVal); println(myVar1)
      println(myVal1)
   }
}
Reading input: Use of readLine method
There are several ways to read command-line input, but the easiest way is to use the readLine
method of the scala.io.StdIn object. To use it, you need to first import it, like this:
import scala.io.StdIn.readLine
Example. Put this source code in a file named HelloInteractive.scala:
import scala.io.StdIn.readLine

object HelloInteractive extends App {
   print("Enter your first name: ")
   val firstName = readLine()
   print("Enter your last name: ")
   val lastName = readLine()
   println(s"Your name is $firstName $lastName")
}
Then compile it with scalac:
$ scalac HelloInteractive.scala
Then run it with scala:
$ scala HelloInteractive
As you saw in this application, you bring classes and methods into scope in Scala just like you do with
Java and other languages, with import statements.
import scala.io.StdIn.readLine
That import statement brings the readLine method into the current scope so you can use it in the
application.
Other StdIn methods include:
1. def readBoolean(): Boolean Reads a boolean value from an entire line of the default input.
2. def readByte(): Byte Reads a byte value from an entire line of the default input.
3. def readChar(): Char Reads a char value from an entire line of the default input.
4. def readDouble(): Double Reads a double value from an entire line of the default input.
5. def readFloat(): Float Reads a float value from an entire line of the default input.
6. def readInt(): Int Reads an int value from an entire line of the default input.
7. def readLine(text: String, args: Any*): String Print and flush formatted text to the default
output, and read a full line from the default input.
8. def readLine(): String Read a full line from the default input.
9. def readLong(): Long Reads a long value from an entire line of the default input.
10. def readShort(): Short Reads a short value from an entire line of the default input.
11. def readf(format: String): List[Any] Reads in some structured input (from the default input),
specified by a format specifier.
12. def readf1(format: String): Any Reads in some structured input (from the default input),
specified by a format specifier, returning only the first value extracted, according to the format
specification.
13. def readf2(format: String): (Any, Any) Reads in some structured input (from the default
input), specified by a format specifier, returning only the first two values extracted, according to
the format specification.
14. def readf3(format: String): (Any, Any, Any) Reads in some structured input (from the
default input), specified by a format specifier, returning only the first three values extracted,
according to the format specification.
Example: Using functions to
1. Read Int input from the keyboard
2. Generate values randomly
import scala.io.StdIn.readInt

object KeyBoardInputs {
   def main(args: Array[String]): Unit = {
      println("Hello World")
      // The function is called without specifying the arguments,
      // so each default value (a keyboard read) is evaluated
      println("Returned Value : " + totalMarks1())
   }

   // A function that reads its three marks from the keyboard
   def totalMarks1(a: Double = readInt(), b: Double = readInt(),
                   c: Double = readInt()): Double = a + b + c

   // A function that generates its three marks randomly
   def totalMarks(a: Double = Math.ceil(Math.random() * 10),
                  b: Double = Math.ceil(Math.random() * 20),
                  c: Double = Math.ceil(Math.random() * 70)): Double = a + b + c
}
Save the code as KeyBoardInputs.scala, then compile and run it.
N.B. When run, the code only calls totalMarks1(). To run totalMarks(), modify the
code accordingly.
Multiple Assignments
Scala supports multiple assignments. If a code block or method returns a Tuple (a Tuple holds a
collection of objects of different types), the Tuple can be assigned to a val variable.
Syntax
val (myVar1: Int, myVar2: String) = (40, "Welcome") // the right-hand side is a tuple
And by letting Scala infer the correct types −
Syntax
val (myVar1, myVar2) = (40, "Welcome")
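As a minimal sketch of a method that returns a Tuple which is then unpacked by a multiple assignment, consider the following. The helper minMax and the object name TupleDemo are hypothetical names chosen for this illustration.

```scala
object TupleDemo {
  // Hypothetical helper: returns two values at once as a tuple
  def minMax(xs: Seq[Int]): (Int, Int) = (xs.min, xs.max)

  def main(args: Array[String]): Unit = {
    // Both values are assigned in a single val definition
    val (lowest, highest) = minMax(Seq(40, 7, 19))
    println(s"lowest = $lowest, highest = $highest")   // prints: lowest = 7, highest = 40
  }
}
```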
Variable Scope
Variables in Scala can have three different scopes depending on the location where they are used.
They can be fields, method parameters or local variables. We discuss these concepts below.
Fields
Fields are variables that belong to an object. The fields are accessible from inside every method in the
object (remember fields in Java).
Fields can also be accessible outside the object, depending on the access modifiers the fields are
declared with. Object fields can be both mutable and immutable and can be defined using either
var or val.
Method Parameters
Method parameters are variables used to pass values into a method when the method is
called.
Method parameters are only accessible from inside the method, but the objects passed in may be
accessible from the outside, if you have a reference to the object from outside the method.
Method parameters are always immutable, which means they behave as if they were defined with the val keyword.
Local Variables
Local variables are variables declared inside a method. Local variables are only accessible from inside
the method, but the objects you create may escape the method if you return them from the method.
Local variables can be both mutable and immutable types and can be defined using either var or val.
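The three scopes can be seen side by side in a short sketch. All names here (ScopeDemo, greeting, callCount, greet) are invented for illustration.

```scala
object ScopeDemo {
  val greeting = "Hello"   // field: visible in every method of the object
  var callCount = 0        // mutable field

  def greet(name: String): String = {    // name is a method parameter
    // name = "x"                         // would not compile: parameters are immutable
    val message = s"$greeting, $name"     // local variable: visible only inside greet
    callCount += 1
    message                               // the String escapes the method by being returned
  }

  def main(args: Array[String]): Unit = {
    println(greet("Scala"))   // prints: Hello, Scala
    println(callCount)        // prints: 1
  }
}
```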
Scala - Operators
An operator is a symbol that tells the compiler to perform specific mathematical or logical
manipulations. Scala is rich in built-in operators and provides the following types of operators −
• Arithmetic Operators
• Relational Operators
• Logical Operators
• Bitwise Operators
• Assignment Operators
Arithmetic Operators
The following arithmetic operators are supported. Assume variable A holds 10 and variable B holds 20,
then −

Operator   Description                                            Example
+          Adds two operands                                      A + B will give 30
-          Subtracts second operand from the first                A - B will give -10
*          Multiplies both operands                               A * B will give 200
/          Divides numerator by denominator                       B / A will give 2
%          Modulus operator finds the remainder after division    B % A will give 0
           of one number by another

Relational Operators
The following relational operators are supported. Assume variable A holds 10 and variable B holds 20,
then −

Operator   Description                                                    Example
==         Checks if the values of two operands are equal; if yes         (A == B) is not true.
           then the condition becomes true.
!=         Checks if the values of two operands are equal; if the         (A != B) is true.
           values are not equal then the condition becomes true.
>          Checks if the value of the left operand is greater than the    (A > B) is not true.
           value of the right operand; if yes then the condition
           becomes true.
<          Checks if the value of the left operand is less than the       (A < B) is true.
           value of the right operand; if yes then the condition
           becomes true.
>=         Checks if the value of the left operand is greater than or     (A >= B) is not true.
           equal to the value of the right operand; if yes then the
           condition becomes true.
<=         Checks if the value of the left operand is less than or        (A <= B) is true.
           equal to the value of the right operand; if yes then the
           condition becomes true.

Logical Operators
The following logical operators are supported by the Scala language. They work on Boolean operands.
For example, assume variable A holds true and variable B holds false, then −

Operator   Description                                                    Example
&&         Logical AND operator. If both operands are true then the       (A && B) is false.
           condition becomes true.
||         Logical OR operator. If either of the two operands is true     (A || B) is true.
           then the condition becomes true.
!          Logical NOT operator. Used to reverse the logical state of     !(A && B) is true.
           its operand. If a condition is true then the Logical NOT
           operator will make it false.

Bitwise Operators
Bitwise operators work on bits and perform bit-by-bit operations.
The truth tables for &, |, and ^ are as follows −

p   q   p&q   p|q   p^q
0   0   0     0     0
0   1   0     1     1
1   1   1     1     0
1   0   0     1     1

Assume A = 60 and B = 13; in binary format they will be as follows −
A   = 0011 1100
B   = 0000 1101
-----------------------
A&B = 0000 1100
A|B = 0011 1101
A^B = 0011 0001
~A  = 1100 0011
The bitwise operators supported by the Scala language are listed in the following table. Assume variable A
holds 60 and variable B holds 13, then −

Operator   Description                                                Example
&          Binary AND operator copies a bit to the result if it       (A & B) will give 12, which is 0000 1100
           exists in both operands.
|          Binary OR operator copies a bit if it exists in either     (A | B) will give 61, which is 0011 1101
           operand.
^          Binary XOR operator copies the bit if it is set in one     (A ^ B) will give 49, which is 0011 0001
           operand but not both.
~          Binary Ones Complement operator is unary and has the       (~A) will give -61, which is 1100 0011 in
           effect of 'flipping' bits.                                 2's complement form, because the number
                                                                      is a signed binary number.
<<         Binary Left Shift operator. The bits of the left           A << 2 will give 240, which is 1111 0000
           operand's value are moved left by the number of bits
           specified by the right operand.
>>         Binary Right Shift operator. The bits of the left          A >> 2 will give 15, which is 0000 1111
           operand's value are moved right by the number of bits
           specified by the right operand.
>>>        Shift right zero fill operator. The left operand's         A >>> 2 will give 15, which is 0000 1111
           value is moved right by the number of bits specified by
           the right operand and the vacated positions are filled
           with zeros.

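The values in the table above can be checked with a short program. The object name BitwiseDemo is an assumption; the last two lines are an added illustration of how >> and >>> differ for a negative number, which the table does not cover.

```scala
object BitwiseDemo {
  val a = 60   // 0011 1100
  val b = 13   // 0000 1101

  def main(args: Array[String]): Unit = {
    println(a & b)     // 12
    println(a | b)     // 61
    println(a ^ b)     // 49
    println(~a)        // -61
    println(a << 2)    // 240
    println(a >> 2)    // 15
    println(a >>> 2)   // 15
    // For negative Int values the two right shifts differ:
    println(-60 >> 2)    // -15 (the sign bit is copied in)
    println(-60 >>> 2)   // 1073741809 (zeros are shifted in)
  }
}
```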
Assignment Operators
The following assignment operators are supported by the Scala language −

Op     Description                                                        Example
=      Simple assignment operator. Assigns values from right side         C = A + B will assign value of A + B into C
       operands to left side operand
+=     Add AND assignment operator. It adds right operand to the left     C += A means C = C + A
       operand and assigns the result to left operand
-=     Subtract AND assignment operator. It subtracts right operand       C -= A means C = C - A
       from the left operand and assigns the result to left operand
*=     Multiply AND assignment operator. It multiplies right operand      C *= A means C = C * A
       with the left operand and assigns the result to left operand
/=     Divide AND assignment operator. It divides left operand by         C /= A means C = C / A
       the right operand and assigns the result to left operand
%=     Modulus AND assignment operator. It takes modulus using two        C %= A means C = C % A
       operands and assigns the result to left operand
<<=    Left shift AND assignment operator                                 C <<= 2 is same as C = C << 2
>>=    Right shift AND assignment operator                                C >>= 2 is same as C = C >> 2
&=     Bitwise AND assignment operator                                    C &= 2 is same as C = C & 2
^=     Bitwise exclusive OR and assignment operator                       C ^= 2 is same as C = C ^ 2
|=     Bitwise inclusive OR and assignment operator                       C |= 2 is same as C = C | 2

Operators Precedence in Scala
Operator precedence determines the grouping of terms in an expression. This affects how an expression
is evaluated. Certain operators have higher precedence than others; for example, the multiplication
operator has higher precedence than the addition operator.
For example, in x = 7 + 3 * 2, x is assigned 13, not 20, because operator * has higher precedence
than +, so 3 * 2 is evaluated first and the result is then added to 7.
Take a look at the following table. Operators with the highest precedence appear at the top of the table
and those with the lowest precedence appear at the bottom. Within an expression, higher precedence
operators will be evaluated first.

Category         Operator                              Associativity
Postfix          () []                                 Left to right
Unary            ! ~                                   Right to left
Multiplicative   * / %                                 Left to right
Additive         + -                                   Left to right
Shift            >> >>> <<                             Left to right
Relational       > >= < <=                             Left to right
Equality         == !=                                 Left to right
Bitwise AND      &                                     Left to right
Bitwise XOR      ^                                     Left to right
Bitwise OR       |                                     Left to right
Logical AND      &&                                    Left to right
Logical OR       ||                                    Left to right
Assignment       = += -= *= /= %= >>= <<= &= ^= |=     Right to left
Comma            ,                                     Left to right

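A few rows of the precedence table can be tried out in code. This is only a sketch; the object name PrecedenceDemo and the values x, y and z are chosen for illustration.

```scala
object PrecedenceDemo {
  val x = 7 + 3 * 2             // * binds tighter than +, so this is 7 + 6
  val y = (7 + 3) * 2           // parentheses force the addition first
  val z = 1 + 2 < 4 && !false   // grouped as ((1 + 2) < 4) && (!false)

  def main(args: Array[String]): Unit = {
    println(x)   // 13
    println(y)   // 20
    println(z)   // true
  }
}
```

When the grouping of an expression is not obvious, adding parentheses as in y makes the intent explicit.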
Introduction To functions
Functions are a key concept in any programming or scripting language. Here we introduce simple
functions in Scala:
Function Definition
scala> def max(x: Int, y: Int): Int = {
         if (x > y) x
         else y
       }
max: (Int,Int)Int
NOTE:
• Function definitions start with def.
• The function’s name, in this case max, is followed by a comma-separated list of
  parameters in parentheses.
• A type annotation must follow every function parameter, preceded by a colon, because the
  Scala compiler (and interpreter) does not infer function parameter types, unlike the types of
  variables.
• After the close parenthesis of max’s parameter list you’ll find another “: Int” type annotation.
  This one defines the result type of the max function itself. Sometimes the Scala compiler will
  require you to specify the result type of a function.
• If the function is recursive, for example, you must explicitly specify the function’s result type.
• In the above case, however, you may leave the result type off and the compiler will infer it.
• Also, if a function consists of just one statement, you can optionally leave off the curly braces.
  Thus, you could alternatively write the max function like this:
  scala> def max2(x: Int, y: Int) = if (x > y) x else y
  max2: (Int,Int)Int
• Following the function’s result type is an equals sign and a pair of curly braces that contain the
  body of the function.
• The equals sign that precedes the body of a function hints that in the functional world view, a
  function defines an expression that results in a value.
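The rule about result types can be sketched with one recursive and one non-recursive function. The names ResultTypeDemo, gcd and square are hypothetical and used only for this illustration.

```scala
object ResultTypeDemo {
  // Recursive: the ": Int" result type must be written explicitly,
  // otherwise the compiler reports an error at the recursive call
  def gcd(a: Int, b: Int): Int =
    if (b == 0) a else gcd(b, a % b)

  // Not recursive: the result type is inferred as Int
  def square(x: Int) = x * x

  def main(args: Array[String]): Unit = {
    println(gcd(48, 18))   // 6
    println(square(7))     // 49
  }
}
```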
Calling A function in Scala
Once you have defined a function, you can call it by name. Example
scala> max(3, 5)
res6: Int = 5
Example 2: A simple function
scala> def printHello() = {
     |   println("Hello Hello")
     | }
scala> printHello()   // Calling the function
1. Methods with multiple input parameters
A method that takes two input parameters:
def add(a: Int, b: Int) = a + b
The same method, with the method’s return type explicitly shown:
def add(a: Int, b: Int): Int = a + b
A method that takes three input parameters:
def add(a: Int, b: Int, c: Int): Int = a + b + c
Multiline Example
scala> def addThenDouble(a: Int, b: Int): Int = {
         val sum = a + b
         val doubled = sum * 2
         doubled
       }
scala> addThenDouble(4,5)
val res0: Int = 18
2. A function that takes no parameters and returns no result
Example
scala> def greet() = println("Hello, world!")
greet: ()Unit
When you define the greet() function, the interpreter will respond with greet: ()Unit. What does this
mean?
• “greet” is the name of the function.
• The empty parentheses indicate the function takes no parameters.
• Unit is greet’s result type. A result type of Unit indicates the function returns no value.
Scala’s Unit type is similar to Java’s void type, and in fact every void-returning method in Java is
mapped to a Unit-returning method in Scala.
Methods with the result type of Unit , therefore, are only executed for their side effects. In the case of
greet() , the side effect is a friendly greeting printed to the standard output.
3. Call-by-Name Functions
Normally parameters to functions are by-value parameters; that is, the value of the parameter is
determined before it is passed to the function.
But what if we need to write a function that accepts as a parameter an expression that we don't want
evaluated until it's called within our function? For this circumstance, Scala offers call-by-name
parameters.
A call-by-name mechanism passes a code block to the call, and each time the call accesses the
parameter, the code block is evaluated and the value is calculated. In the example, delayed first prints
the current system time and a message demonstrating that the method has been entered. Only then,
when delayed prints the value of t, is the time() block evaluated.
The following program shows how to implement call–by–name.
Example
object Example {
   def main(args: Array[String]): Unit = {
      delayed(time())
   }

   def time() = {
      println("Getting time in nano seconds")
      System.nanoTime
   }

   // t is a by-name parameter: time() runs only when t is used
   def delayed(t: => Long) = {
      println("System now:" + System.nanoTime)
      println("In delayed method")
      println("Time Parameter now: " + t)
   }
}
Save the above program in Example.scala.
Output (the time values differ on every run)
System now:<current time in nanoseconds>
In delayed method
Getting time in nano seconds
Time Parameter now: 2027245119786400
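To make the difference between by-value and by-name parameters visible, the following sketch counts how often the argument expression is evaluated. The names ByNameDemo, next, twiceByValue and twiceByName are invented for this illustration.

```scala
object ByNameDemo {
  var evaluations = 0

  // Each call increases the counter and returns its new value
  def next(): Int = { evaluations += 1; evaluations }

  def twiceByValue(x: Int): Int = x + x     // x is evaluated once, before the call
  def twiceByName(x: => Int): Int = x + x   // x is re-evaluated at every use

  def main(args: Array[String]): Unit = {
    println(twiceByValue(next()))   // 2  (next() ran once: 1 + 1)
    println(twiceByName(next()))    // 5  (next() ran twice: 2 + 3)
    println(evaluations)            // 3
  }
}
```

A by-name argument that is never used inside the function is never evaluated at all, which is what makes this mechanism useful for delaying expensive or side-effecting expressions.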
4. Function with Variable Arguments
Scala allows you to define a function that has a repeated last parameter . This allows clients to pass
variable length argument lists to the function.
In the example below, args inside the printStrings function, which is declared as type "String*", is
actually a Seq[String].
Example
object StringExample {
   def main(args: Array[String]): Unit = {
      printStrings("Hello", "Scala", "Python")
   }

   // The String* type marks the parameter as repeated, so
   // the function can be called with any number of String objects
   def printStrings(args: String*) = {
      var i: Int = 0
      for (arg <- args) {
         println("Arg value[" + i + "] = " + arg)
         i = i + 1
      }
   }
}
Save the above program in StringExample.scala.
Output
Arg value[0] = Hello
Arg value[1] = Scala
Arg value[2] = Python
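When the values are already held in a collection, they can be expanded into a repeated parameter with the : _* type ascription. The sketch below assumes a hypothetical function joinAll; it is not part of the example above.

```scala
object VarargsDemo {
  // Hypothetical helper: joins any number of strings with ", "
  def joinAll(parts: String*): String = parts.mkString(", ")

  def main(args: Array[String]): Unit = {
    println(joinAll("Hello", "Scala"))       // Hello, Scala

    val languages = Seq("Scala", "Python", "Java")
    // ": _*" passes each element of the sequence as a separate argument
    println(joinAll(languages: _*))          // Scala, Python, Java

    println(joinAll().isEmpty)               // true: zero arguments are allowed
  }
}
```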
5. Functions with Named Arguments
In a normal function call, the arguments in the call are matched one by one in the order of the
parameters of the called function.
Named arguments allow you to pass arguments to a function in a different order.
The syntax is simply that each argument is preceded by a parameter name and an equals sign.
Below is a simple example to show the functions with named arguments.
Example
object NamedExample {
   def main(args: Array[String]): Unit = {
      // The arguments are passed by name, in any order
      printInt(b = 5, a = 7)
   }

   def printInt(a: Int, b: Int) = {
      println("Value of a : " + a)
      println("Value of b : " + b)
   }
}
Output
Value of a : 7
Value of b : 5
6. Default Parameter Values for a Function
Scala lets you specify default values for function parameters.
The argument for such a parameter can optionally be omitted from a function call, in which case the
corresponding argument will be filled in with the default.
If you pass only one argument, it is bound to the first parameter and the second parameter
takes its default value.
Example
object DefaultExample {
   def main(args: Array[String]): Unit = {
      // The function is called without specifying the arguments
      println("Returned Value : " + addInt())
   }

   def addInt(a: Int = 5, b: Int = 7): Int = {
      a + b
   }
}
Output
Returned Value : 12
When working with default arguments, place the parameters that have defaults after all
parameters without defaults; otherwise a positional call cannot skip the defaulted parameter.
The following example raises an error:
scala> def func(a:Int=7,b:String){
| println(s"$a $b")
|}
func: (a: Int, b: String)Unit
scala> func("Ayushi")
<console>:13: error: not enough arguments for method func: (a: Int, b: String)Unit.
Unspecified value parameter b.
func("Ayushi")
^
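If a parameter with a default value does come first, the call can still be made to work by naming the remaining argument. The sketch below uses a hypothetical function describe with the same parameter layout as func above.

```scala
object DefaultOrderDemo {
  // Same layout as func: the defaulted parameter comes first
  def describe(a: Int = 7, b: String): String = s"$a $b"

  def main(args: Array[String]): Unit = {
    // Naming b lets the compiler fill in the default for a
    println(describe(b = "Ayushi"))   // 7 Ayushi
    // Supplying both arguments positionally also works
    println(describe(3, "Ayushi"))    // 3 Ayushi
  }
}
```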
7. Recursive Functions
Recursion plays a big role in pure functional programming and Scala supports recursive functions very
well. Recursion means that a function calls itself repeatedly.
Example: Factorial
object RecursiveExample {
   def main(args: Array[String]): Unit = {
      for (i <- 1 to 10)
         println("Factorial of " + i + ": = " + factorial(i))
   }

   def factorial(n: BigInt): BigInt = {
      if (n <= 1)
         1
      else
         n * factorial(n - 1)
   }
}
Output
Factorial of 1: = 1
Factorial of 2: = 2
Factorial of 3: = 6
Factorial of 4: = 24
Factorial of 5: = 120
Factorial of 6: = 720
Factorial of 7: = 5040
Factorial of 8: = 40320
Factorial of 9: = 362880
Factorial of 10: = 3628800
8. Nested Functions (Local Functions)
Scala allows you to define functions inside a function and functions defined inside other functions are
called local functions.
Like a local variable declaration in many languages, a nested method is only visible inside the
enclosing method.
If you try to call grade() outside of printResults(), you will get a compiler error.
Here is an implementation of a nested grading function within a printResults function.
Example
object NestedExample {
   def main(args: Array[String]): Unit = {
      printResults(45, 25, 30, 50, 23, 60, 76, 49, 52)
   }

   // Accepts a variable number of marks
   def printResults(args: Int*) = {
      // A nested function that returns the grade for a given mark
      def grade(mark: Int): String = {
         if (mark >= 70) "A"
         else if (mark >= 60) "B"
         else if (mark >= 50) "C"
         else if (mark >= 40) "D"
         else "F"
      } // End of grade

      var i: Int = 0
      for (mark <- args) {
         println("Mark value[" + i + "] = " + mark + " Grade=" + grade(mark))
         i = i + 1
      }
   } // End of printResults
}
Output
Mark value[0] = 45 Grade=D
Mark value[1] = 25 Grade=F
Mark value[2] = 30 Grade=F
Mark value[3] = 50 Grade=C
Mark value[4] = 23 Grade=F
Mark value[5] = 60 Grade=B
Mark value[6] = 76 Grade=A
Mark value[7] = 49 Grade=D
Mark value[8] = 52 Grade=C
9. Partially Applied Functions
When you invoke a function, you're said to be applying the function to the arguments.
• If you pass all the expected arguments, you have fully applied it.
• If you send only a few arguments, then you get back a partially applied function.
• This gives you the convenience of binding some arguments and leaving the rest to be filled in
  later.
• Consider a case where, let's say, 10 different messages from 10 different sources are received at
  different times in the same day and need to be logged. Such an application requirement is a
  good candidate for a partially applied function.
Example
Let us examine the code below
import java.util.Date

object PartFunExample {
   def main(args: Array[String]): Unit = {
      val date = new Date   // Creating a date object
      log(date, "message1")
      Thread.sleep(1000)    // Sleeping for 1000 milliseconds
      log(date, "message2")
      Thread.sleep(1000)
      log(date, "message3")
   }

   // A function that receives a date and a message and logs them
   def log(date: Date, message: String) = {
      println(s"$date----$message")
   }
}
Output
Mon Dec 02 12:52:41 CST 2013----message1
Mon Dec 02 12:52:41 CST 2013----message2
Mon Dec 02 12:52:41 CST 2013----message3
Here, the log( ) method takes two parameters: date and message.
We want to invoke the method multiple times, with the same value for date but different values for
message.
The goal is to eliminate the noise of passing the date to each call by partially applying that argument
to the log( ) method.
This can be achieved by binding a value to the date parameter and leaving the second parameter
unbound by putting an underscore in its place.
The result is a partially applied function that we've stored in a variable.
Example
import java.util.Date

object AppliedDemo {
   def main(args: Array[String]): Unit = {
      val date = new Date
      // Note the magic of the underscore
      val logWithDateBound = log(date, _: String)
      // First call to the function
      logWithDateBound("message1")
      Thread.sleep(1000)
      // Second call to the function
      logWithDateBound("message2")
      Thread.sleep(1000)
      // Third call to the function
      logWithDateBound("message3")
   }

   def log(date: Date, message: String) = {
      println(s"$date----$message")
   }
}
Output
Mon Dec 02 12:53:56 CST 2013----message1
Mon Dec 02 12:53:56 CST 2013----message2
Mon Dec 02 12:53:56 CST 2013----message3
10. Higher-Order Functions
Since Scala is a highly functional language, it treats its functions as first-class citizens.
This means that we can pass them around as parameters, or even return them from functions.
Definition
A higher-order function is a function that takes other functions as parameters, or
whose result is a function.
In the following example program, the apply() function takes another function f and a value v and
applies function f to v.
Example
object HOFExample {
   def main(args: Array[String]): Unit = {
      println(apply(layout, 10))
   }

   def apply(f: Int => String, v: Int) = f(v)

   def layout[A](x: A) = "[" + x.toString() + "]"
}
Output
[10]
NOTE: The function def layout[A](x: A) = "[" + x.toString() + "]"
is a Scala generic (parameterized) function that can take a parameter of any type.
In this case, it can take an Int, a Float, a String, etc., since every type supports the toString() method.
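The definition also covers functions whose result is a function. A minimal sketch of that second case follows; the names ReturnFunctionDemo and multiplyBy are assumptions made for this illustration.

```scala
object ReturnFunctionDemo {
  // Hypothetical helper: builds and returns a new function of type Int => Int
  def multiplyBy(factor: Int): Int => Int = (x: Int) => x * factor

  def main(args: Array[String]): Unit = {
    val triple = multiplyBy(3)                   // triple is itself a function
    println(triple(10))                          // 30
    // The returned function can be passed straight to another higher-order function
    println(List(1, 2, 3).map(multiplyBy(2)))    // List(2, 4, 6)
  }
}
```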
11. Understanding Higher Order Functions
Consider the case of an application that can be used to draw a random name from a
list of names.
A Scala method to accomplish that can be as shown below.
def randomName(names: Seq[String]): String = {
   val randomNum = util.Random.nextInt(names.length)
   names(randomNum)
}
To run the function we can use a sequence of String values:
val names = Seq("Oyugi", "Christine", "Joylene", "Moses")
val winner = randomName(names)
Now consider an application that can be used to draw a random element from a
list of elements of any type.
In such a case we make use of a generic function as shown below.
def randomElement[A](seq: Seq[A]): A = {
   val randomNum = util.Random.nextInt(seq.length)
   seq(randomNum)
}
With this change, the method can now be called on a variety of types:
randomElement(Seq("Aleka", "Christina", "Tyler", "Molly"))
randomElement(List(1,2,3))
randomElement(List(1.0,2.0,3.0))
randomElement(Vector.range('a', 'z'))
12. Anonymous Functions
Scala provides a relatively lightweight syntax for defining anonymous functions.
Anonymous functions in source code are called function literals and at run time, function literals are
instantiated into objects called function values.
Scala supports first-class functions, which means functions can be expressed in function literal syntax,
i.e., (x: Int) => x + 1, and that functions can be represented by objects, which are called function
values.
The following are examples of anonymous functions.
1. var inc = (x:Int) => x+1
Variable inc is now a function that can be used the usual way −
var x = inc(7)-1
It is also possible to define functions with multiple parameters as follows −
2. var mul = (x: Int, y: Int) => x*y
Variable mul is now a function that can be used the usual way −
println(mul(3, 4))
It is also possible to define functions with no parameter as follows −
3. var userDir = () => { System.getProperty("user.dir") }
Variable userDir is now a function that can be used the usual way −
println(userDir())
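Anonymous functions are most often passed directly to other functions, for example to the collection methods filter, map and reduce. The sketch below uses an invented list of marks and the object name AnonymousDemo for illustration.

```scala
object AnonymousDemo {
  val marks = List(45, 72, 60, 38)

  // Function literal with an explicit parameter type
  val passed = marks.filter((m: Int) => m >= 50)
  // The parameter type can be left out when it can be inferred
  val doubled = marks.map(m => m * 2)
  // Each underscore stands for one parameter of the function literal
  val total = marks.reduce(_ + _)

  def main(args: Array[String]): Unit = {
    println(passed)    // List(72, 60)
    println(doubled)   // List(90, 144, 120, 76)
    println(total)     // 215
  }
}
```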
12. Currying Functions
Currying transforms a function that takes multiple parameters into a chain of functions, each taking a
single parameter.
Curried functions are defined with multiple parameter lists, as follows −
Syntax
def strcat(s1: String)(s2: String) = s1 + s2
Alternatively, you can also use the following syntax to define a curried function −
Syntax
def strcat(s1: String) = (s2: String) => s1 + s2
Following is the syntax to call a curried function
Syntax
strcat("foo")("bar")
You can define more than two parameters on a curried function based on your requirement.
Example
object CurrExample {
   def main(args: Array[String]): Unit = {
      val str1: String = "Hello, "
      val str2: String = "Scala!"
      println("str1 + str2 = " + strcat(str1)(str2))
   }

   def strcat(s1: String)(s2: String) = {
      s1 + s2
   }
}
Output
str1 + str2 = Hello, Scala!
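One practical use of a curried function is to supply only the first parameter list and keep the resulting function for later calls. The sketch below reuses the strcat definition; the object name CurryDemo and the value greet are assumptions for illustration.

```scala
object CurryDemo {
  def strcat(s1: String)(s2: String): String = s1 + s2

  // Supplying only the first parameter list yields a function String => String
  val greet: String => String = strcat("Hello, ")

  def main(args: Array[String]): Unit = {
    println(greet("Scala!"))   // Hello, Scala!
    println(greet("Spark!"))   // Hello, Spark!
  }
}
```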
13. Closures
A closure is a function whose return value depends on the value of one or more variables declared
outside the function.
Consider the following piece of code with an anonymous function.
val multiplier = (i:Int) => i * 10
Here the only variable used in the function body, i * 10 , is i, which is defined as a parameter to the
function. Try the following code −
val multiplier = (i:Int) => i * factor
There are two variables in multiplier: i and factor. One of them, i, is a formal parameter to the
function. Hence, it is bound to a new value each time multiplier is called.
However, factor is not a formal parameter, so where does its value come from? Consider the code below.
var factor = 3
val multiplier = (i:Int) => i * factor
Now factor refers to a variable outside the function but in the enclosing scope. The function
references factor and reads its current value each time it is called.
If a function has no external references, then it is trivially closed over itself. No external context is
required.
From the point of view of the above function, factor is a free variable, because the function literal
does not itself give a meaning to it. The i variable, by contrast, is a bound variable, because it does
have a meaning in the context of the function: it is defined as the function’s lone parameter, an Int .
Example
object ClosureDemo {
  def main(args: Array[String]): Unit = {
    println("multiplier(1) value = " + multiplier(1))
    println("multiplier(2) value = " + multiplier(2))
  }
  var factor = 3
  val multiplier = (i: Int) => i * factor
}
Output
multiplier(1) value = 3
multiplier(2) value = 6
The function value (the object) that’s created at runtime from this function literal is called a closure.
The name arises from the act of “closing” the function literal by “capturing” the bindings of its free
variables.
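Since a closure captures the variable itself rather than a copy of its value, a later change to factor is visible inside multiplier. A minimal sketch of this behaviour, assuming a hypothetical object name ClosureUpdateDemo:

```scala
object ClosureUpdateDemo {
  var factor = 3
  // multiplier closes over the variable factor, not over the value 3
  val multiplier = (i: Int) => i * factor

  def main(args: Array[String]): Unit = {
    println(multiplier(2)) // 6
    factor = 5             // change the captured variable
    println(multiplier(2)) // 10
  }
}
```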
Type Casting in Scala
Scala, like many other computer languages, supports type casting or type coercion.
Please note that type casting in Scala is fraught with danger because of type erasure. As a result, if
we don’t understand how to correctly type cast, we can introduce subtle bugs.
Type Cast Mechanisms in Scala
Scala provides three main ways to convert the declared type of an object to another type:
1. Value type casting for intrinsic types such as Byte, Int, Char, and Float
2. Type casting via the asInstanceOf[T] method
3. Pattern matching to effect type casting using the match statement
Value Type Casting
Conversion between value types is defined as:
Byte -> Short -> Int -> Long -> Float -> Double
                 ^
                 |
                Char
The arrows denote that a value type on the left-hand side of an arrow can be promoted to the type on the
right-hand side; Char is promoted to Int.
For example, a Byte can be promoted to a Short. The opposite, however, is not true. Scala will not
allow us to assign in the opposite direction without an explicit conversion.
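The promotion rules can be illustrated with a short sketch. The object name ValueCastDemo and the helpers charCode and narrow are hypothetical names chosen for illustration; toByte is the standard explicit conversion on Int.

```scala
object ValueCastDemo {
  // Widening: a Char is promoted to Int without any explicit cast
  def charCode(c: Char): Int = c

  // Narrowing: must be requested explicitly and may lose information
  def narrow(i: Int): Byte = i.toByte

  def main(args: Array[String]): Unit = {
    val b: Byte = 10
    val s: Short = b   // Byte -> Short
    val i: Int = s     // Short -> Int
    val l: Long = i    // Int -> Long
    // val bad: Byte = i  // would not compile: type mismatch

    println(l)             // 10
    println(charCode('A')) // 65
    println(narrow(300))   // 44, because 300 does not fit in a Byte
  }
}
```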
Type Casting via asInstanceOf[T]
We can coerce an existing object to another type with the asInstanceOf[T] method. Additionally, Scala
supports a companion method, isInstanceOf[T], that we can use in conjunction with it.
To see these methods in action, let’s consider a few class definitions:
class T1
class T2 extends T1
class T3
Now, let’s use those classes to see how type casting via asInstanceOf[T] works:
val t2 = new T2
val t3 = new T3
val t1: T1 = t2.asInstanceOf[T1]
assert(t2.isInstanceOf[T1] == true)
assert(t3.isInstanceOf[T1] == false)
val anotherT1 = t3.asInstanceOf[T1] // Run-time error (ClassCastException)
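The third mechanism listed earlier, pattern matching, can avoid the run-time error because the cast only happens when the type test succeeds. The sketch below reuses the class names T1, T2, and T3; the object name SafeCastDemo and the helper asT1 are assumptions made for illustration.

```scala
object SafeCastDemo {
  class T1
  class T2 extends T1
  class T3

  // Hypothetical helper: returns Some(x viewed as T1) when the cast is valid, else None
  def asT1(x: Any): Option[T1] = x match {
    case t: T1 => Some(t) // matches T1 and its subclasses, such as T2
    case _     => None    // anything else, such as T3, is rejected safely
  }

  def main(args: Array[String]): Unit = {
    println(asT1(new T2).isDefined) // true
    println(asT1(new T3).isDefined) // false
  }
}
```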
Writing Some Scala scripts
Although Scala is designed to help programmers build very large-scale systems, it also scales down to
scripting.
A script is just a sequence of statements in a file that will be executed sequentially.
When working with scripts, it is recommended that you quit the interpreter or open another terminal.
Example: Put this into a file named hello.scala and then run it.
println("Hello, world, from a script!")
$ scala hello.scala
Working With Command Line Arguments
Scala provides an array named args, which is used primarily to hold the arguments passed to a script on
the command line.
In Scala, arrays are zero-based, and you access an element by specifying an index in parentheses.
So the first element in a Scala array named marks is marks(0), not marks[0] as in Java.
Example 1: Command Line Arguments
Type the following into a new file named Example1.scala :
object Example1 {
  def main(args: Array[String]) = {
    println("Hello, " + args(0) + "!")
  }
}
then run:
$ scala Example1.scala James
Here "James" is passed as a command-line argument, which is accessed in the script as args(0).
Example 2: Two Command Line Arguments
Type the following into a new file named Example2.scala :
object Example2 {
  def main(args: Array[String]) = {
    println(args(0) + ":" + args(1))
  }
}
then run:
$ scala Example2.scala James 76
Here "James" and "76" are passed as command-line arguments, which are accessed in the script as args(0)
and args(1). Both arguments must be supplied; otherwise args(1) does not exist and the script fails.
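Accessing args(0) when no argument was supplied throws an ArrayIndexOutOfBoundsException, so it can help to check args.length first. A minimal sketch, assuming a hypothetical object name SafeArgs and a hypothetical helper greeting:

```scala
object SafeArgs {
  // Hypothetical helper: builds the greeting, or a usage hint when no argument is given
  def greeting(args: Array[String]): String =
    if (args.length >= 1) "Hello, " + args(0) + "!"
    else "Usage: scala SafeArgs.scala <name>"

  def main(args: Array[String]): Unit = println(greeting(args))
}
```

Running it with James as the argument prints the greeting; running it with no argument prints the usage line instead of failing.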
REVISION QUESTIONS
1. In your commodity hardware, what do you need to install in order to work with Scala?
2. What do you need to install in order to work with Apache Spark?
3. What languages can you use to write Apache Spark jobs?
4. Java code compiles to bytecode. What about a Scala program?
5. Compiled Scala code can run anywhere on any machine as long as the machine has
………………... installed.
6. Scala is a hybrid programming language. What does this mean?
7. What do we mean when we say that a function is a first-class function?
8. What do we mean when we say that a function should not have any side effects?
i. Give three examples of such functions.
9. Give an example of a data structure in Java that is immutable.
10. Give an example of a data structure in Scala that is immutable.
11. Scala does not allow you to provide type information when declaring variables. T/F. Explain
your answer.
12. You can have the full joy of programming as a Java developer if you write your code in Scala but
maintain Java coding style and syntax throughout. Yes/No. Explain.
13. What happens when you use a Scala class for a data type such as Int, Boolean, or Double when
you compile?
14. All Scala data types listed above are objects, not primitive types as in Java. (T/F)
15. Write a Scala program that makes use of a variable-argument function that enables one to pass
through any number of integers, then prints out the integers and the total sum.
Solution
object IntDemo {
  def main(args: Array[String]): Unit = {
    printInts(1, 20, 30, 50, 23, 60)
  }

  // Variable-argument function: accepts any number of Ints
  def printInts(args: Int*): Unit = {
    var i: Int = 0
    var sum: Int = 0
    for (arg <- args) {
      sum = sum + arg // add the current element, not the whole sequence
      println("Arg value[" + i + "] = " + arg)
      i = i + 1
    }
    println("Total = " + sum)
  }
}
16. Distinguish between a variable-argument function and a nested function.
17. Write a Scala program that makes use of a nested function grade() that receives the marks passed
and returns the corresponding grade according to a defined grading criteria. The function can be
nested inside another function, say printResults(), that prints the marks scored and the
corresponding grade.
18. Write a Scala program that makes use of a variable-length function printResults() that receives a
variable set of marks and for each mark prints the corresponding grade according to a defined
grading criteria. The program uses a nested function grade() inside the printResults()
function. The nested function receives every mark passed and returns the grade.
Solution
object FunctionsDemo {
  def main(args: Array[String]): Unit = {
    printResults(36, 37, 55, 20, 30, 50, 23, 60)
  }

  // Variable-length function that can receive any number of integers as arguments
  def printResults(args: Int*): Unit = {
    // Nested function that returns the grade for a given mark as a string
    def grade(mark: Int): String = {
      if (mark >= 70) "A"
      else if (mark >= 60) "B"
      else if (mark >= 50) "C"
      else if (mark >= 40) "D"
      else "F"
    } // End of grade

    var i: Int = 0
    for (arg <- args) {
      println("Arg value[" + i + "] = " + arg + ", Grade = " + grade(arg))
      i = i + 1
    }
  } // End of printResults
}
19. Write a Scala program that makes use of an anonymous function that receives a variable set of
marks and for each mark prints the corresponding grade according to a defined
grading criteria.
20. Define the following terms as they apply in Scala programming:
i. Free variable
ii. Bound variable
iii. Closure
iv. Partially applied function
v. Recursive function