**Haskell 1**

# Background

- Haskell is a pure, statically typed, lazy functional language designed by a group of programming language researchers to unify the plethora of lazy functional languages that existed during the 1980s. Being purely functional means the language strictly isolates side effects from the rest of the code, so that the code can be reasoned about using simple equational reasoning. Being statically typed means types are checked before the program runs, so a program that compiles contains no type errors. Being lazy means an expression is not evaluated until its value is actually needed for the running program to continue. This makes highly flexible data structures possible, e.g., infinite lists.
- In this course we'll be using the [Glasgow Haskell Compiler (GHC)](https://www.haskell.org/ghc/) system to process Haskell programs. If you haven't installed it yet, go to [this page](https://www.haskell.org/ghcup) and install it on your computer.
- You can interact with GHC via the terminal by typing the command
  ```bash
  ghci
  ```
  at the shell prompt. Recent versions of `ghci` use `ghci> ` as the prompt; older versions use `Prelude> `. Every expression you type into the interpreter must be finished by hitting the `Enter` key. To cancel the evaluation of an expression (like when it gets into infinite recursion), hit `Ctrl-C`; to quit the interpreter, hit `Ctrl-D`.
- Running `ghci` puts you in a REPL (Read-Eval-Print Loop). This is the normal way an interpreter works. However, when you are writing a substantial program consisting of several function definitions, you'll want to write the program in a text editor, then run it through either an interpreter or a compiler. We'll demonstrate this in class.
- Haskell defines several data types, operators, and functions in various modules. When you run `ghci`, the `Prelude` module is loaded and brings many of these predefined names into scope. These are the names the language designers think most people need in order to start doing useful programming.
We call this the _initial environment_.

# Values, Types, Operations

- GHC understands values of various types. A type is a set of values together with operations that can be performed on them.
- The types
  ```haskell
  (), Bool, Char, Int, Integer, Float, Double
  ```
  are the _atomic types_, whose values cannot be broken down any further. GHC also has _structured types_, whose values consist of multiple smaller values combined. The structured types are lists, tuples, and functions.
- `()` is both a type name and the name of that type's only value; it is used when we want to create a function that takes no arguments.
- The `Bool` type has two values, `True` and `False`, representing the logical values true and false respectively. `Bool` values can be operated on with `not`, `&&`, and `||`. The `&&` operator means logical and; the `||` operator means logical or. They are both short-circuiting. `||` has lower precedence than `&&`, which in turn has lower precedence than `not`.
- The `Char` type contains all the Unicode characters. A `Char` value is written by surrounding the character with single quotation marks. Examples of `Char` values are `'x'`, `'有'`, `'ก'`. A string is a list of characters, and its type is written `[Char]`. A string literal is written by surrounding it with double quotation marks, e.g., `"happy"`. Inside a `Char` or string literal, you can use the usual escape sequences `\n`, `\t`, `\f`, `\r`, `\b`, or a decimal character code, like `\129`. The string concatenation operator is `++`. The `length` function, when applied to a string, returns the length of the string.
- The `Int` type contains all the integers in two's-complement notation that fit in 64 bits (on a 64-bit GHC installation). `Int` is a subrange of the mathematical integers, in which half of the range is negative and the other half nonnegative. We can use decimal or hexadecimal notation to input integers.
Operators for `Int` include addition `+` and subtraction `-`, of equal precedence; and multiplication `*`, integer division `div`, and integer modulus `mod`, of higher precedence than the additive operators. When used as infix operators, `div` and `mod` are written between backquotes, as in ``17 `div` 5``.
- The `Integer` type contains all the integers of any size that your computer's memory can hold.
- The `Float` type contains all the single-precision floating-point numbers and the `Double` type contains all the double-precision floating-point numbers. A floating-point number can be typed in using decimal point notation like `3.14` or scientific notation like `314e-2`. Floating-point numbers can be operated on by addition `+` and subtraction `-`, of equal precedence, and by multiplication `*` and real division `/`, of higher precedence.
- For integers $a$ and $b$ with $b$ nonnegative, we can do $a$ `^` $b$ to get $a$ raised to the $b$ power, e.g.,
  ```haskell
  3 ^ 2
  ```
  gives 9 as a result.
- For floating-point numbers $x$ and $y$, we can do $x$ `**` $y$ to get $x$ raised to the $y$ power, e.g.,
  ```haskell
  3.2 ** 1.2
  ```
  gives 4.038127004673237 as a result.
- All the arithmetic operators `+`, `-`, `*`, `/` associate to the left. The operators `^` and `**` associate to the right.
- The `chr` function takes an `Int` argument and returns the character having that integer as its code value. The `ord` function takes a `Char` argument and returns the integer value of the code for that character. However, these two functions are not part of the Prelude. If you want to use them, you'll have to type `import Data.Char` first.
- Values of any of the atomic types above can be compared using `==`, `/=`, `<`, `<=`, `>`, and `>=`. Both operands must have the same type.

# Variables

- A variable definition is written like so:
  ```haskell
  x = 3.0 * 5.0
  ```
  This has the effect of binding the floating-point value `15.0` to the variable `x` in the top-level environment. Once defined, a variable never changes its value. However, in `ghci` you can bind the variable name to another value, in which case the new binding simply hides the previous definition.
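As a hedged illustration of variable definitions, the sketch below binds the results of a few of the operators described in the previous section to top-level names. All of the names (`quotient`, `remainder`, `power`, `difference`, `letterCode`, `nextLetter`) are invented for this example and do not come from the course material.

```haskell
import Data.Char (chr, ord)

-- Integer division and modulus, written infix between backquotes.
quotient :: Int
quotient = 17 `div` 5               -- 3

remainder :: Int
remainder = 17 `mod` 5              -- 2

-- ^ associates to the right: 2 ^ 3 ^ 2 means 2 ^ (3 ^ 2).
power :: Integer
power = 2 ^ 3 ^ 2                   -- 512

-- - associates to the left: 10 - 4 - 3 means (10 - 4) - 3.
difference :: Int
difference = 10 - 4 - 3             -- 3

-- ord and chr convert between a Char and its code value.
letterCode :: Int
letterCode = ord 'a'                -- 97

nextLetter :: Char
nextLetter = chr (letterCode + 1)   -- 'b'
```

After loading these definitions into `ghci`, typing a name such as `power` at the prompt prints the value bound to it.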
- All variable and function names start with a lowercase letter or an underscore character. After the first character, a name may contain uppercase or lowercase letters, digits, underscores, or the single quote `'` character. Examples of valid Haskell variable or function names are
  ```haskell
  a', a_variable, some_function, not4Love, camelCase
  ```
- In contrast to variables and functions, the name of a Haskell type or typeclass starts with a capital letter. For example,
  ```haskell
  Int, Integer, Num, Eq
  ```
  We'll explain what a typeclass is later.

# Expressions

- Values, variables, and functions can be combined into an expression using appropriate operators and function applications. Function application has higher precedence than any operator.
- Every expression has a type and a value. You must learn to tell the types of expressions, just as the Haskell interpreter does. This is important for writing correct Haskell programs.
- An _if expression_ always has 2 arms: a _then clause_ and an _else clause_. The test condition must be a value of type `Bool`; the then clause and else clause expressions must both be of the same type, and that type is the type of the if expression itself. E.g.
  ```haskell
  if 3 < 4 then "true" else "false"
  ```
  is a valid expression of type `[Char]` and has value `"true"`.

# Lists

- A Haskell list is a recursively defined data value. All elements of a list must be of the same type. By definition, a list is either empty or not empty. If it is not empty, then it consists of a head element of the list element type, and a tail which is a list of the same type. We write `[]` to denote an empty list, and write
  ```haskell
  [1, 2, 3]
  ```
  for a list of 3 numbers, for example.
- If `lst` is a nonempty list, `head lst` returns the head element of `lst` and `tail lst` returns the tail of `lst`.
- The function `null`, when applied to a list, returns `True` if the list is empty; it returns `False` otherwise.
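The list functions and the if expression introduced so far can be combined as in the minimal sketch below. The names `days`, `firstDay`, `restDays`, and `describe` are hypothetical, chosen only for this illustration.

```haskell
-- A list of strings; every element has the same type, [Char].
days :: [[Char]]
days = ["Mon", "Tue", "Wed"]

-- head returns the first element; tail returns everything after it.
firstDay :: [Char]
firstDay = head days          -- "Mon"

restDays :: [[Char]]
restDays = tail days          -- ["Tue", "Wed"]

-- Both arms of the if expression are strings, so the whole expression
-- is a string. The null test keeps head away from an empty list.
describe :: [[Char]] -> [Char]
describe xs = if null xs then "empty" else head xs
```

Here `describe []` evaluates to `"empty"` and `describe days` evaluates to `"Mon"`. The `null` check matters because `head` and `tail` raise a runtime error when applied to an empty list.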
- The `:` operator takes a list element as its left operand and a list as its right operand. It returns the list that results from adding (consing) the left operand to the front of the right operand. It is right associative.
- Now that you know about the `:` operator, I can explain in more detail what I mean by saying that Haskell lists are defined recursively. When the Haskell system says that the value of a list is, e.g., `[1, 2, 3]`, it's actually using a shorthand notation for saying that the list value is `1:2:3:[]`.
- The `length` function, when applied to a list, returns the length of the list. The `++` operator takes two operands of the same list type. It returns the result of appending the second list to the first list.
- The precedence of `:` and `++` is lower than that of the additive operators such as `+` and `-`, but higher than that of the comparison operators such as `<`, `<=`, etc.
- You can access the nth element of a list with the `!!` operator. E.g.
  ```haskell
  ["zero", "one", "two", "three"] !! 2
  ```
  has the value `"two"`. Notice that positions are counted starting at 0, not 1.
- Inside the square brackets, you can use the `..` notation to generate a range of values. For example, writing
  ```haskell
  [2..5]
  ```
  gives the list `[2, 3, 4, 5]`, and writing
  ```haskell
  ['a'..'e']
  ```
  gives the string `"abcde"`.
- Haskell has a number of functions that can be applied to lists. You will want to investigate these: `sort` (which requires `import Data.List` first), `reverse`, `sum`, `product`, `take`, `drop`, `splitAt`. The `..` range notation also has other variants that can generate ranges of various kinds.
- The type of a list is written `[a]` where `a` stands for the type of the elements. For example, the type of
  ```haskell
  ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
  ```
  is `[[Char]]`.

# Tuples

- A tuple takes two or more values, possibly of different types, and combines them into one aggregate value.
Its syntax is similar to that of a list, except that it uses a pair of parentheses to surround the components instead of brackets.
- There is no 1-tuple type.
- For a 2-tuple value, it's possible to access the first and second components using the `fst` and `snd` functions respectively. Here is an example screen interaction with ghci showing this:
  ```haskell
  GHCi, version 9.2.8: https://www.haskell.org/ghc/  :? for help
  ghci> t = ('a', 3)
  ghci> fst t
  'a'
  ghci> snd t
  3
  ghci>
  ```
- For tuples with more than 2 components, you can access each component by pattern matching. More about that later.
- The type of a tuple is the Cartesian product of the types of all the components. For example, the tuple
  ```haskell
  ([True, False], 'a', "song")
  ```
  has type
  ```haskell
  ([Bool], Char, [Char])
  ```
  Note also that this 3-tuple is different from the 2-tuple
  ```haskell
  (([True, False], 'a'), "song")
  ```
  whose type is
  ```haskell
  (([Bool], Char), [Char])
  ```
  and it's different from the 2-tuple
  ```haskell
  ([True,False], ('a', "song"))
  ```
  whose type is
  ```haskell
  ([Bool], (Char, [Char]))
  ```

# Functions

- A function in Haskell is just another kind of data, and every function has a type and a value just like any other data. The type of a function describes a map from its domain to its codomain. A function value differs from other kinds of values in that it abstracts the idea of a _computational process_. So we can apply a function to an appropriate argument to get its return value. For example, `sin 1.0` returns the value of the `sin` function when its argument is 1.0. The type of `sin` is `Floating a => a -> a`. Its value is the computational process that computes the trigonometric sine of its argument. Its type is an example of a _polymorphic type_, a rather advanced concept we'll talk about later.
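To make the idea of a function as a value concrete, here is a small sketch that applies `sin`, binds it to another name, and stores it in a 2-tuple. The names `sineOfOne`, `mySine`, `labelled`, and `sineOfZero` are hypothetical, and the types are fixed to `Double` with annotations only to keep the example simple.

```haskell
-- Applying a function to an argument yields an ordinary value.
sineOfOne :: Double
sineOfOne = sin 1.0

-- A function is itself a value, so it can be bound to another name...
mySine :: Double -> Double
mySine = sin

-- ...or stored as a component of a tuple.
labelled :: (Double -> Double, [Char])
labelled = (sin, "sine")

-- fst extracts the function component, which is then applied to 0.0.
sineOfZero :: Double
sineOfZero = fst labelled 0.0       -- 0.0
```

Because function application binds tighter than anything else, `fst labelled 0.0` is read as `(fst labelled) 0.0`.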
- One hallmark of a functional language is that **it treats functions as first-class citizens**, meaning that a function can be given as an argument to another function; it can be part of an aggregate data value; and it can be returned as the result of a function application.
- All functions in Haskell take exactly one argument! A function that needs no input can take the single argument `()`. A function that needs two or more inputs can take them packed into one tuple argument having as many components as there are inputs. Alternatively, and more commonly in Haskell, it can take the first input and return another function that takes the next one; this style is called _currying_.

# Function Definition

- A function is defined like so
  ```haskell
  fact n = if n == 0 then 1 else n * fact (n-1)
  ```
- The type of a function consists of a domain and a codomain. For example, the `fact` function above has type `(Eq p, Num p) => p -> p`, a polymorphic type. Don't worry about it just yet. We'll talk about it later.

# Type Inference

- Even though Haskell is a strongly typed language, it does not force you to specify the types of your variables or functions when defining them. It has a built-in _type inference engine_ that can deduce the types of all expressions. The type inference engine will deduce the most general type possible for your expression. For example, when you write
  ```haskell
  double x = x + x
  ```
  the compiler will deduce that the type of `double` is `Num a => a -> a`.

# Type Annotation

- As explained above, Haskell allows you to define a function without specifying the types of the domain and the codomain. The type inference engine will deduce their types for you automagically. However, there are times when a programmer might want to specify the function type precisely. Perhaps she wants to write a function that has domain and codomain values of certain types only.
For example, if she wants to write a `double` function that takes a `Double` value and returns a `Double` value, she can specify her intention like this:
  ```haskell
  double :: Double -> Double
  double x = x + x
  ```
  The line
  ```haskell
  double :: Double -> Double
  ```
  is technically called a _type annotation_. It says that `double` is a function with domain `Double` and codomain `Double`.
- In fact, you are encouraged to write type annotations in your Haskell source code. They serve as good documentation.

# Garbage Collection

- While the Haskell language system is running, it uses memory for storing the values of variables and calculating temporary values. This can exhaust the RAM, so the system needs to periodically reclaim unused memory. Such a mechanism is called _garbage collection_. While the system is collecting garbage, your computer may seem to freeze. Just be patient and wait a little bit.

_---San Skulrattanakulchai_