Configuring Docker in order to run properly behind a company proxy

Quick post to remember how to set up Docker to run behind a proxy:

1. Create the Docker service configuration file:

sudo mkdir -p /etc/systemd/system/docker.service.d
sudo touch /etc/systemd/system/docker.service.d/http-proxy.conf
sudo nano /etc/systemd/system/docker.service.d/http-proxy.conf

Add the following:

[Service]
Environment="HTTP_PROXY=http://xxx:yyy" 
Environment="HTTPS_PROXY=https://xxx:yyy"
Environment="NO_PROXY=localhost,127.0.0.1"

where xxx is the host and yyy is the port of course!

2. Apply and check the configuration:

sudo systemctl daemon-reload

Check that the proxy settings have been picked up:

sudo systemctl show --property Environment docker

Restart Docker if the command output is correct:

sudo systemctl restart docker

You might also wish to set the proxy for your shell by adding the following to ~/.bashrc:

export http_proxy="http://xxx:yyy"
export https_proxy="https://xxx:yyy"

Understanding mutability and immutability in Scala

It’s been 4 months now since I started working with Scala, and although I’m far from being a Scala expert (in my experience the language is harder to master than Java or C#), it has become crystal clear to me how mutability and immutability work in the language and also in general (because this is actually a language-agnostic concept). Unfortunately it seems to me that several developers struggle to get it right, so in this post I’m going to try my best to make it simple and clear for all.
There are two levels of mutability/immutability:

1. Variable level
2. Object level

A variable is mutable if it is defined with the var keyword and immutable if it is defined with the val keyword.
An immutable variable can’t be reassigned once it has been defined, whereas a mutable variable can be made to refer to another object after its definition.
For example, the following code won’t compile:

object Foo extends App {
    val foo: String = "ciao"
    foo = "hello"
}

The compiler will reject it with this error:

error: reassignment to val

This happens because foo is immutable (it’s like a constant, it can’t change).
But if we replace val with var, all will be fine:

object Foo extends App {
    var foo: String = "ciao"
    foo = "hello"
}

This is because foo is now mutable, so we can assign another string to it (one or more times).
BUT as far as object mutability is concerned, we are not mutating the string object, because a string is an immutable object whether it is referred to by a val or by a var! The reassignment of foo does not change the string object itself (that is not possible). What happens is that the variable is pointed at a different string object (“hello”), and the previous one (“ciao”) becomes eligible for garbage collection as soon as nothing else references it.
We can see that the variable refers to a different string after each reassignment by printing the hash codes (keep in mind that String.hashCode is computed from the contents of the string, so it tells different values apart rather than proving object identity):

object Foo extends App {
    var foo: String = "ciao"
    println(foo.hashCode)
    println(foo.hashCode)
    
    foo = "hello"
    println(foo.hashCode)
    println(foo.hashCode)
    
    foo = "hi"
    println(foo.hashCode)
    println(foo.hashCode)
}

The code above prints the same hash code twice for “ciao”, twice for “hello” and twice for “hi”: three different strings, and therefore three different hash codes.
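
To compare identity rather than contents, Scala offers the reference-equality operator eq. Here is a minimal sketch of the difference (the object and value names are only illustrative, and new String is used just to force a second object with the same contents):

```scala
object IdentityDemo {
  val literal: String = "ciao"
  // new String(...) forces a distinct object holding the same characters
  val copy: String = new String("ciao")

  def main(args: Array[String]): Unit = {
    println(literal == copy)                   // true: same contents
    println(literal.hashCode == copy.hashCode) // true: the hash is content-based
    println(literal eq copy)                   // false: two different objects
  }
}
```
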
Mutability and immutability are also the reason why there are different collection implementations.
One of the most common collections is List, for example, and it’s an immutable one.
It’s possible to concatenate several lists into one, but in my opinion doing it repeatedly is not good practice, because each concatenation creates a new list and the intermediate ones become unnecessary temporary objects in memory.
For example:

object Foo extends App {
    var myList: List[Int] = List(1, 2, 3)
    myList = myList ::: List(4, 5, 6)
    myList = myList ::: List(6, 7, 8)
    myList = myList ::: List(9, 10, 11)
}

can be improved by using a collection designed for mutability like ListBuffer:

import scala.collection.mutable.ListBuffer
object Foo extends App {
    val buffer: ListBuffer[Int] = ListBuffer(1, 2, 3)
    buffer.append(4, 5, 6)
    buffer.append(6, 7, 8)
    buffer.append(9, 10, 11)
}

As you can see, I defined the buffer with val: I do want the buffer to grow (which it can by design, since it’s a mutable collection), but I don’t want it to be replaced by another buffer through a new assignment… do you get it?
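
To make the two levels tangible, here is a minimal sketch that puts them side by side (MutabilityDemo, growList and growBuffer are names made up for the illustration). The first method keeps an immutable List in a var, the second keeps a mutable ListBuffer in a val:

```scala
import scala.collection.mutable.ListBuffer

object MutabilityDemo {
  // Immutable collection held by a mutable variable:
  // ::: returns a new List and the variable is pointed at it.
  def growList(): (List[Int], List[Int]) = {
    var list: List[Int] = List(1, 2, 3)
    val before = list
    list = list ::: List(4, 5, 6)
    (before, list)
  }

  // Mutable collection held by an immutable variable:
  // the very same object grows in place.
  def growBuffer(): (ListBuffer[Int], ListBuffer[Int]) = {
    val buffer: ListBuffer[Int] = ListBuffer(1, 2, 3)
    val alias = buffer
    buffer ++= List(4, 5, 6)
    // buffer = ListBuffer(0) // would not compile: reassignment to val
    (alias, buffer)
  }

  def main(args: Array[String]): Unit = {
    val (before, after) = growList()
    println(before)          // List(1, 2, 3): the original list is untouched
    println(after)           // List(1, 2, 3, 4, 5, 6)
    val (alias, buffer) = growBuffer()
    println(alias)           // ListBuffer(1, 2, 3, 4, 5, 6)
    println(alias eq buffer) // true: still one single object
  }
}
```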

From Python Generators to Scala Streams

Preface

As a “Pythonista”, one cool feature of Python that I was looking for in Scala was generators.
Python generators are a sort of iterator over an incrementally generated collection, and they are magically created by using the yield keyword (which is also available in Scala but has a completely different behavior)*.
For example, the following Python code generates as many string objects as the given count on demand (this means that we’ll get only one object in memory on each iteration instead of a collection with multiple objects):

def stream(count):
    for i in range(count):
        yield "string_{}".format(i)

By calling the function above a generator instance will be returned:

string_generator = stream(10)
print(type(string_generator))

The output will be:

<class 'generator'>

By iterating on the returned object:

for item in string_generator:
    print(item)

We will get the following output:

string_0
string_1
string_2
string_3
string_4
string_5
string_6
string_7
string_8
string_9

So… you might wonder “what’s all the fuss about printing strings in a loop?” The cool thing is that those objects are generated one by one, on demand, if and only if they are needed. This means that you can generate a bazillion objects while having only one of them in memory at a time.
This can be tested by adding a simple print statement before the yield statement.

def stream(count):
    for i in range(count):
        print("Generating new object ({})".format(i))
        yield "string_{}".format(i)

The output of:

for item in stream(10):
    print(item)

Will be:

Generating new object (0)
string_0
Generating new object (1)
string_1
Generating new object (2)
string_2
Generating new object (3)
string_3
Generating new object (4)
string_4
Generating new object (5)
string_5
Generating new object (6)
string_6
Generating new object (7)
string_7
Generating new object (8)
string_8
Generating new object (9)
string_9

As you can see, by calling stream() you won’t get a preallocated collection of objects; instead you pick them one by one during the iteration. In practice, when the Python interpreter finds a yield inside a function, calling that function does not run its body: it returns a special object (a generator) that runs the body step by step as it is iterated.
Python generators act like iterators, so they can be “traversed” only once (after the first iteration the generator has been consumed and it won’t return any further data).

Let’s get to Scala now…

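How can we implement the same thing in Scala?
The answer is: by using Stream! (Since Scala 2.13 Stream is deprecated in favor of LazyList, which supports the same #:: syntax.)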
Here is a basic example:

def stream(): Stream[String] = {
  "one" #:: "two" #:: "three" #:: "four" #:: Stream.empty[String]
}

And by calling:

stream().foreach(println)

The output will be:

one
two
three
four

Ok, but how does it work? Well, basically the #:: operator in plain English means “next in the queue will be”: whatever is on its right-hand side is evaluated lazily, only when it is requested. The Stream.empty at the end acts as a sentinel value that indicates the end of the stream.
Obviously in the real world an approach like the code above does not make sense since it’s hard-coded. In order to make a stream dynamic like the previous Python example, in Scala we have to use recursion:

def stream(count: Int): Stream[String] = {
  if (count > 0) {
    s"string_$count" #:: stream(count - 1) // <- go on (recursion)
  } else {
    Stream.empty[String] // <- end of stream
  }
}

Which by calling stream(10).foreach(println) produces the following output:

string_10
string_9
string_8
string_7
string_6
string_5
string_4
string_3
string_2
string_1

As you may have noticed, the output differs from the Python implementation: it counts down from 10 to 1 instead of up from 0 to 9. To get the same output we need a helper inner function**:

def generate(count: Int): Stream[String] = {
    def stream(current: Int): Stream[String] = {
        if (current < count) {
            s"string_$current" #:: stream(current + 1)
        } else {
            Stream.empty[String]
        }
    }
    stream(0) // <- start the generation from 0
}

In this way the count will be the same (from 0 to 9).

One more thing…

While both Python generators and Scala streams are lazy, in Scala the generated objects are kept in memory, so it’s possible to traverse the same stream multiple times (the objects are generated only the first time).
In practice Scala streams are actually lazy collections.
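
This memoization can be observed with a small counter. The sketch below is written with LazyList, so it assumes Scala 2.13 or later; MemoDemo and the generated counter are names invented for the illustration:

```scala
// Assumes Scala 2.13 or later (LazyList is the successor of Stream).
object MemoDemo {
  var generated = 0 // counts how many elements have actually been built

  def generate(count: Int): LazyList[String] = {
    def loop(current: Int): LazyList[String] =
      if (current < count) {
        generated += 1
        s"string_$current" #:: loop(current + 1)
      } else {
        LazyList.empty[String]
      }
    loop(0)
  }

  def main(args: Array[String]): Unit = {
    val strings = generate(3)
    println(strings.toList) // List(string_0, string_1, string_2)
    println(strings.toList) // same values again, served from memory
    println(generated)      // 3: nothing has been generated twice
  }
}
```
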
Fortunately this can be easily “fixed” by returning the stream’s iterator instead of the stream itself (here the inner function is still present just to keep the same order as the Python implementation, but it’s not mandatory in other implementations):

def generate(count: Int): Iterator[String] = {
    def stream(current: Int): Stream[String] = {
        if (current < count) {
            s"string_$current" #:: stream(current + 1)
        } else {
            Stream.empty[String]
        }
    }
    stream(0).iterator
}

So later on:

val stream: Iterator[String] = generate(10)
while (stream.hasNext) {
    println(stream.next())
}

In this way we get one and only one object generated and in memory at a time.
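
If the memoizing stream is not needed at all, a plain Iterator seems to give the same generator-like behavior without any recursion. A minimal sketch, where IteratorGenerator is a hypothetical name and generate mirrors the Python stream(count) function:

```scala
object IteratorGenerator {
  // Each string is built only when next() is called
  def generate(count: Int): Iterator[String] =
    Iterator.range(0, count).map(i => s"string_$i")

  def main(args: Array[String]): Unit = {
    val strings = generate(3)
    while (strings.hasNext) {
      println(strings.next()) // string_0, string_1, string_2 (one per line)
    }
    println(strings.hasNext)  // false: like a Python generator, it is consumed
  }
}
```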

Notes

* In Scala yield is used to implement the equivalent of a Python list comprehension:

for (i <- 1 to 10; if i % 2 == 0) yield i

The code above returns a collection containing only the even numbers:

Vector(2, 4, 6, 8, 10)
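
Under the hood a for expression with yield is just sugar for method calls on the collection. As a rough sketch (ForYieldDemo and the value names are only illustrative), the comprehension above behaves like a withFilter followed by a map:

```scala
object ForYieldDemo {
  val viaFor = for (i <- 1 to 10; if i % 2 == 0) yield i
  // Roughly what the compiler turns the for expression into
  val viaMethods = (1 to 10).withFilter(i => i % 2 == 0).map(i => i)

  def main(args: Array[String]): Unit = {
    println(viaFor)               // Vector(2, 4, 6, 8, 10)
    println(viaFor == viaMethods) // true
  }
}
```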

** Honestly speaking, I’m not 100% sure that it’s the only way to do it, but it’s the cleanest and simplest way I have figured out.

Final thoughts

The examples that I’ve used are of course dumb, but I think that they are anyway more practical and more useful than the overused Fibonacci sequence generation you can find everywhere.
It’s been hard, but after 1 year I’m finally appreciating Scala and starting to master the features of the language, in this case streams, which in the end are more flexible than Python generators. Even so, as usual, the design choices of Scala make it hard to understand and remember how things work… For instance, wouldn’t one of the following alternative syntaxes be more readable:

"one" then "two" then "three" then Stream.empty[String]
// or
"one" follow "two" follow "three" follow Stream.empty[String]

Instead of:

"one" #:: "two" #:: "three" #:: Stream.empty[String]

?
…Fuck yes!

Understanding Scala Generics Variances (Invariant, Covariant, Contravariant)

Since variance annotations on generics are not as “verbose” in Scala as they might be in Java, I initially had some difficulties in understanding them, until I wrote a super-simple proof of concept in which I tried the three possible variances to “see what happens”.

So… there are three types of variances:

  1. Invariant: [T]
  2. Covariant: [+T]
  3. Contravariant: [-T]

Each one implies a different behavior that I’m going to demonstrate by creating a generic class called Box three times, one for each variance. I will also use 2 simple classes that will be placed inside the Box brackets.

1. Invariant: [T]

class Box[T]
abstract class Item
class Coin extends Item
val b1: Box[Item] = new Box[Item] // ok
val b2: Box[Coin] = new Box[Coin] // ok
val b3: Box[Item] = new Box[Coin] // won't compile!
val b4: Box[Coin] = new Box[Item] // won't compile!

Only b1 and b2 are valid statements, since the invariant [T] is the most restrictive one: it allows neither a Box of Coin to be seen as a Box of Item (the supertype) nor a Box of Item to be seen as a Box of Coin (the subtype).
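
Why would anyone want such a restriction? A minimal sketch, assuming a Box that can be both read and written (the var content field and the Key class are additions made up for the illustration), shows what invariance protects us from:

```scala
object InvarianceDemo {
  abstract class Item
  class Coin extends Item
  class Key extends Item

  // Hypothetical mutable box: T is both read (getter) and written (setter)
  class Box[T](var content: T)

  def main(args: Array[String]): Unit = {
    val coins: Box[Coin] = new Box[Coin](new Coin)
    // If the next two lines were allowed, a Key would end up in a Box[Coin]:
    // val items: Box[Item] = coins // won't compile
    // items.content = new Key
    coins.content = new Coin         // only Coins can go in...
    val coin: Coin = coins.content   // ...so only Coins can come out
    println(coin.isInstanceOf[Coin]) // true
  }
}
```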

2. Covariant: [+T]

class Box[+T]
abstract class Item
class Coin extends Item
val b1: Box[Item] = new Box[Item] // ok
val b2: Box[Coin] = new Box[Coin] // ok
val b3: Box[Item] = new Box[Coin] // ok
val b4: Box[Coin] = new Box[Item] // won't compile!

By using the covariant [+T] only the last statement won’t compile. So basically covariance follows the usual polymorphism: a Box of a specific type can be seen as a Box of a more generic one.
This is the kind of variance used, for instance, by Scala’s immutable collections, otherwise statements like:

val coins: List[Coin] = List(new Coin, new Coin)
val items: List[Item] = coins

wouldn’t be valid.
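
In practice covariance pays off when code is written against the supertype. A minimal sketch, assuming a read-only Box with a content field and an Item with a name (both are additions made up for the illustration):

```scala
object CovarianceDemo {
  abstract class Item { def name: String }
  class Coin extends Item { val name: String = "coin" }

  // Hypothetical read-only box: T only ever comes out of it
  class Box[+T](val content: T)

  // Written against the supertype...
  def describe(box: Box[Item]): String = s"a box holding a ${box.content.name}"

  def main(args: Array[String]): Unit = {
    val coinBox: Box[Coin] = new Box[Coin](new Coin)
    // ...yet it accepts a Box[Coin]
    println(describe(coinBox)) // a box holding a coin
  }
}
```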

3. Contravariant: [-T]

class Box[-T]
abstract class Item
class Coin extends Item
val b1: Box[Item] = new Box[Item] // ok
val b2: Box[Coin] = new Box[Coin] // ok
val b3: Box[Item] = new Box[Coin] // won't compile!
val b4: Box[Coin] = new Box[Item] // ok

By using the contravariant [-T], this time only the third statement won’t compile. This is the weirdest one, since it doesn’t allow a Box of the subtype to be seen as a Box of the supertype but the contrary: a Box of the supertype can be seen as a Box of its subtype!
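
It looks less weird when the generic type consumes values instead of containing them. A minimal sketch, assuming a hypothetical Printer trait (not part of the original proof of concept):

```scala
object ContravarianceDemo {
  abstract class Item { def name: String }
  class Coin extends Item { val name: String = "coin" }

  // Hypothetical consumer: T only ever goes into it
  trait Printer[-T] { def show(value: T): String }

  val itemPrinter: Printer[Item] = new Printer[Item] {
    def show(value: Item): String = s"item: ${value.name}"
  }
  // Something able to print any Item can certainly print a Coin
  val coinPrinter: Printer[Coin] = itemPrinter

  def main(args: Array[String]): Unit =
    println(coinPrinter.show(new Coin)) // item: coin
}
```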

Scala implicit for mere mortals (like me)

One of the most difficult features to master in Scala (at least for mere mortals of course) is the implicit keyword and its implications, especially because it may appear in different places and for different purposes:

  • Implicit methods
  • Implicit parameters
  • Implicit variables
  • Implicit classes
  • Implicit constructors

So what is it useful for?
Implicits are essentially a way to leverage the power of the Scala compiler, either by sparing us otherwise explicit method calls in our code or by allowing us to extend native or third-party classes “dynamically”.

Implicit methods

Implicits are used by Scala itself. For example, have you ever wondered why:

val value: Double = 1 // <- this is an Int not a Double!

compiles without any issue and returns the expected value (1.0)? Implicit in action!
If we look at the scala.Int companion object we will find the following implicit method declaration:

implicit def int2double(x: Int): Double = x.toDouble

So, at compile time, once the compiler finds a type mismatch that would otherwise be a compilation error (in this case an Int instead of a Double), it checks for an implicit in scope that is able to convert the wrong type into the correct one. If one is found, the compiler rewrites the code by invoking the method for us (strictly speaking, the language specification treats Int to Double as a built-in numeric widening, but the mechanism for user-defined conversions is the one described here):

val value: Double = int2double(1) // <- explicit replacement

Let’s try to do similar magic ourselves. Let’s make the compiler able to understand the symbols 'yes and 'no as booleans:

object MyApp {
  implicit def symbolToBoolean(symbol: Symbol): Boolean = {
    symbol match {
      case 'yes => true
      case 'no => false
    }
  }
  def main(args: Array[String]): Unit = {
    val scalaIsCool: Boolean = 'yes
    val scalaIsEasy: Boolean = 'no
    println(s"scalaIsCool: $scalaIsCool")
    println(s"scalaIsEasy: $scalaIsEasy")
  }
}

So, once the compiler encounters the following assignment:

val scalaIsCool: Boolean = 'yes

Since a Boolean can only be true or false, the compiler looks for a way to turn this symbol into a Boolean value. It starts by checking the current scope (the main() method); there are no implicit definitions there, so it goes up to the enclosing scope (MyApp), where it finds an implicit that is able to convert the symbol. It then replaces the code with:

val scalaIsCool: Boolean = symbolToBoolean('yes)

and the same for the next line.
Of course this is just a dumb example, and in real life implicits are usually defined in separate objects.
We could rewrite the example as:

SymbolImplicits.scala

object SymbolImplicits {
  implicit def symbolToBoolean(symbol: Symbol): Boolean = {
    symbol match {
      case 'yes => true
      case 'no => false
    }
  }
}

AppImportingImplicitMethods.scala

import SymbolImplicits._
object MyApp {
  def main(args: Array[String]): Unit = {
    val scalaIsCool: Boolean = 'yes
    val scalaIsEasy: Boolean = 'no
    println(s"scalaIsCool: $scalaIsCool")
    println(s"scalaIsEasy: $scalaIsEasy")
  }
}
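
A note for newer compilers: symbol literals such as 'yes are deprecated since Scala 2.13 and are no longer available in Scala 3, where Symbol("yes") has to be written instead. Here is a hedged sketch of the same conversion in that style (SafeSymbolImplicits and SafeSymbolApp are names made up for the illustration); it also adds a default case, an extra that is not in the original example, so that an unexpected symbol fails with a descriptive error instead of a MatchError:

```scala
import scala.language.implicitConversions

object SafeSymbolImplicits {
  implicit def symbolToBoolean(symbol: Symbol): Boolean = symbol match {
    case Symbol("yes") => true
    case Symbol("no")  => false
    // Added default case: reject anything that is not yes/no
    case other =>
      throw new IllegalArgumentException(s"Not a yes/no symbol: ${other.name}")
  }
}

object SafeSymbolApp {
  import SafeSymbolImplicits._

  val scalaIsCool: Boolean = Symbol("yes")
  val scalaIsEasy: Boolean = Symbol("no")

  def main(args: Array[String]): Unit = {
    println(s"scalaIsCool: $scalaIsCool") // scalaIsCool: true
    println(s"scalaIsEasy: $scalaIsEasy") // scalaIsEasy: false
  }
}
```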

Implicit classes

One of the best use cases for Scala implicits is extending native or third-party classes. Let’s pretend, for example, that we need to turn an arbitrary String into a slug. We could create an object like StringUtils with a method slugify(string: String), but wouldn’t it be much cleaner to be able to call .toSlug on a String instance, as we usually do with .toUpperCase and so on? Of course, and by defining an implicit class we can have such convenience:

object StringExtensions {
  implicit class StringWrapper(string: String) {
    def toSlug: String = {
      string
        .toLowerCase
        .replaceAll("[^a-z\\d]", " ")
        .trim
        .replaceAll("\\s+", "-")
    }
  }
}

Then later on we can:

import StringExtensions._
object MyApp {
  def main(args: Array[String]): Unit = {
    println(">> This string will be slugified! <<".toSlug)
  }
}

Bear in mind that an implicit class cannot be defined at the top level of a file, therefore it must be “wrapped” by an object (or by a class or a trait)!
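
There is no real magic involved: the compiler simply wraps the string for us. A minimal sketch (SlugDemo and the value names are made up for the illustration) comparing the implicit call with its explicit equivalent:

```scala
object SlugDemo {
  implicit class StringWrapper(string: String) {
    def toSlug: String =
      string.toLowerCase.replaceAll("[^a-z\\d]", " ").trim.replaceAll("\\s+", "-")
  }

  val viaImplicit: String = "Hello, World!".toSlug
  // Roughly what the compiler generates behind the scenes
  val viaWrapper: String = new StringWrapper("Hello, World!").toSlug

  def main(args: Array[String]): Unit = {
    println(viaImplicit)               // hello-world
    println(viaImplicit == viaWrapper) // true
  }
}
```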

Implicit parameters

Method parameters can also be defined as implicit, thus providing a sort of “native dependency injection”.
Let’s create a sample object which is responsible for creating a folder tree, naming the folders according to the current date (/yyyy/MM/dd/):

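import java.nio.file.{Files, Path, Paths}
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter
import scala.util.Try

object FoldersCreator {
  // Creates <root>/yyyy/MM/dd and returns its path, or None if the creation fails
  def createDateFolderTree(root: Path)(implicit now: LocalDateTime): Option[Path] = {
    val format = (p: String) => now.format(DateTimeFormatter.ofPattern(p))
    val path: Path = Paths.get(root.toString, format("yyyy"), format("MM"), format("dd"))
    val result: Option[Path] = Try(Files.createDirectories(path)).toOption
    result
  }
  // Overload for string paths: the implicit now is forwarded automatically
  def createDateFolderTree(root: String)(implicit now: LocalDateTime): Option[Path] = {
    createDateFolderTree(Paths.get(root))
  }
}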

As you can see, I defined a method with an overload, and both have a second parameter list containing a parameter now declared as implicit. This means that we will be able to call the methods without the second argument:

FoldersCreator.createDateFolderTree("a/root/path")

Implicit variables

…provided that an implicit variable of type LocalDateTime is found in the scope in which they get invoked (the name doesn’t matter!).
Moreover, as you can see, even in the inner call from the overloaded method (createDateFolderTree(root: String)...) to the original signature (createDateFolderTree(root: Path)...) the second argument is omitted.
Let’s see how we can use the previously created object with implicit arguments:

object MyApp {
  def main(args: Array[String]): Unit = {
    implicit val now: LocalDateTime = LocalDateTime.now()
    FoldersCreator.createDateFolderTree("/home/dave/Desktop")
    FoldersCreator.createDateFolderTree("/home/dave/Desktop/backup")
  }
}

In the code above the implicit variable has been defined just before invoking the methods, and it has been given the same name as the argument in the signature (now).
However, the implicit date can also be declared as a field and with a different name:

object MyApp {
  private implicit val d: LocalDateTime = LocalDateTime.now()
  def main(args: Array[String]): Unit = {
    FoldersCreator.createDateFolderTree("/home/dave/Desktop")
    FoldersCreator.createDateFolderTree("/home/dave/Desktop/backup")
  }
}

If we define several implicit vals, we may face a resolution error at compile time. For example, if we add a new implicit val of the same type (LocalDateTime) to the previous code:

object MyApp {
  private implicit val date: LocalDateTime = LocalDateTime.now()
  private implicit val now: LocalDateTime = LocalDateTime.of(2020, 1, 1, 20, 30)
  def main(args: Array[String]): Unit = {
    FoldersCreator.createDateFolderTree("/home/dave/Desktop")
    FoldersCreator.createDateFolderTree("/home/dave/Desktop/backup")
  }
}

We will get:

Error:(11, 40) ambiguous implicit values:
 both value date in object AppWithImplicitsVariables of type => java.time.LocalDateTime
 and value now in object AppWithImplicitsVariables of type => java.time.LocalDateTime
 match expected type java.time.LocalDateTime

This is further proof that variable names don’t matter, only the type does, and if two implicit variables of the same type are found in the scope the compiler doesn’t know which one to pick.
Obviously in this case the problem is easy to spot, but in a real application it might be harder to discover where the conflict is, especially if there are many classes with many implicits, generics and so on.
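
One way out of such an ambiguity, which is also handy in tests, is to pass the implicit argument explicitly in the second parameter list. A minimal sketch, assuming a hypothetical dateFolderPath helper that only builds the path and does not touch the file system:

```scala
import java.nio.file.{Path, Paths}
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

object ExplicitImplicitDemo {
  // Hypothetical pure variant of createDateFolderTree: it only builds the path
  def dateFolderPath(root: String)(implicit now: LocalDateTime): Path = {
    val format = (p: String) => now.format(DateTimeFormatter.ofPattern(p))
    Paths.get(root, format("yyyy"), format("MM"), format("dd"))
  }

  def main(args: Array[String]): Unit = {
    implicit val now: LocalDateTime = LocalDateTime.now()
    // Not implicit, so it does not compete with now
    val fixed: LocalDateTime = LocalDateTime.of(2020, 1, 1, 20, 30)

    println(dateFolderPath("backup"))        // resolved implicitly from now
    println(dateFolderPath("backup")(fixed)) // passed explicitly: the 2020/01/01 folders under backup
  }
}
```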

Implicit constructors

Like methods, constructors can take implicit arguments too.

// an "injectable" object
class Bar
// a class with the implicit
class Foo(implicit val bar: Bar)
// tests app
object FooBarApp {
  private implicit val bar: Bar = new Bar
  def main(args: Array[String]): Unit = {
    val foo: Foo = new Foo
    println(foo.bar.getClass.getName)
  }
}

Final thoughts

The examples that I provided in this post are intentionally dumb and short; however, with Scala implicits it is possible to achieve complex tasks and automatic data conversion. For example, Spark uses this feature quite heavily for data encoding. Anyway, this powerful tool should be used carefully and not abused, in order to avoid painful situations.