How to build a tiny Scala backend on AWS Lambda

We are going to create a simple application that posts a provided set of numbers to a Lambda function and gets them back.

Initialize AWS services

Create S3 bucket

First of all, let’s go to the AWS console and create a bucket named scala-aws-lambda. We need it to store the Lambda code.
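The console is enough for this one-off step, but if you prefer to script it, a minimal sketch using the AWS Java SDK could look like the following (it assumes an extra aws-java-sdk-s3 dependency, which the rest of the article does not need):

import com.amazonaws.regions.Regions
import com.amazonaws.services.s3.AmazonS3ClientBuilder

object CreateBucket extends App {
  // S3 client in the same region we will use for the Lambda later on
  val s3 = AmazonS3ClientBuilder.standard().withRegion(Regions.US_EAST_1).build()
  // Bucket that will hold the packaged Lambda artifact
  s3.createBucket("scala-aws-lambda")
}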

Create DynamoDB table

We are going to store our data in DynamoDB, a NoSQL database provided by Amazon. Let’s create a table named numbers-db with partition key id and sort key at.
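We will create the table through the console here, but for reference, a sketch of the same setup with the AWS Java SDK (using the aws-java-sdk-dynamodb dependency we add later in build.sbt; the provisioned throughput values are arbitrary) might look like this:

import com.amazonaws.regions.Regions
import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder
import com.amazonaws.services.dynamodbv2.model._

object CreateTable extends App {
  val client = AmazonDynamoDBClientBuilder.standard().withRegion(Regions.US_EAST_1).build()

  // Table "numbers-db": partition key "id" (string) and sort key "at" (number, stored as epoch millis)
  val request = new CreateTableRequest()
    .withTableName("numbers-db")
    .withKeySchema(
      new KeySchemaElement("id", KeyType.HASH),
      new KeySchemaElement("at", KeyType.RANGE)
    )
    .withAttributeDefinitions(
      new AttributeDefinition("id", ScalarAttributeType.S),
      new AttributeDefinition("at", ScalarAttributeType.N)
    )
    .withProvisionedThroughput(new ProvisionedThroughput(5L, 5L))

  client.createTable(request)
}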

Initialize sbt project

Let’s create an empty project:

$ sbt new scala/scala-seed.g8
Minimum Scala build.

name [My Something Project]: scala-aws-lambda

Template applied in ./scala-aws-lambda

$ cd scala-aws-lambda/

Add the following to your project/plugins.sbt file:

resolvers += "JBoss" at "https://repository.jboss.org/"

addSbtPlugin("com.gilt.sbt" % "sbt-aws-lambda" % "0.4.2")

Add the AwsLambdaPlugin auto-plugin and the S3 bucket name (the actual Lambda binary will be stored there) to your build.sbt. We also need an additional library dependency, aws-lambda-java-core, to be able to handle Lambda input.

enablePlugins(AwsLambdaPlugin)

retrieveManaged := true

s3Bucket := Some("scala-aws-lambda")

awsLambdaMemory := Some(320)

awsLambdaTimeout := Some(30)

libraryDependencies += "com.amazonaws" % "aws-lambda-java-core" % "1.1.0"

To create a new Lambda, run the following command:

AWS_ACCESS_KEY_ID=<YOUR_KEY_ID> AWS_SECRET_ACCESS_KEY=<YOUR_SECRET_KEY> sbt createLambda

To update your Lambda, run this command:

AWS_ACCESS_KEY_ID=<YOUR_KEY_ID> AWS_SECRET_ACCESS_KEY=<YOUR_SECRET_KEY> sbt updateLambda

Post method implementation

First of all, we need to specify which handlers we are actually going to use. We also need a JSON parser; let’s use circe. For all of that we need to update the build.sbt file.

lambdaHandlers += "post" -> "com.dbrsn.lambda.Main::post"

libraryDependencies ++= Seq(
  "io.circe" %% "circe-generic",
  "io.circe" %% "circe-parser",
  "io.circe" %% "circe-java8"
) map (_ % "0.8.0")

libraryDependencies += "com.amazonaws" % "aws-java-sdk-dynamodb" % "1.11.150"

Now we can write our simple Main class with a post method, which will be called by AWS Lambda. We need the following imports:

// Amazon AWS DynamoDB
import com.amazonaws.regions.Regions
import com.amazonaws.services.dynamodbv2.document.{DynamoDB, Item}
import com.amazonaws.services.dynamodbv2.{AmazonDynamoDB, AmazonDynamoDBClientBuilder}
// Amazon AWS Lambda
import com.amazonaws.services.lambda.runtime.Context
// Circe encoding/decoding
import io.circe.generic.auto._
import io.circe.java8.time._
import io.circe.parser._
import io.circe.syntax._
// Other
import org.apache.commons.io.IOUtils
import scala.collection.JavaConverters._
import scala.util.Try
// Java standard library (used by the Main class below)
import java.io.{InputStream, OutputStream}
import java.sql.Timestamp
import java.time.{Clock, LocalDateTime}
import java.util.UUID

Our input order will be the following:

import Order.ClientId

/**
  * Input order
  */
final case class Order(clientId: ClientId, numbers: Set[Int])

object Order {
  final type ClientId = String
}

Our output and persisted order will be:

import PersistedOrder.OrderId

/**
  * Output and persisted order
  */
final case class PersistedOrder(orderId: OrderId, at: LocalDateTime, order: Order)

object PersistedOrder {
  final type OrderId = String
}

Finally, the Main class itself:

class Main {
  // Initializing DynamoDB client
  lazy val client: AmazonDynamoDB = AmazonDynamoDBClientBuilder.standard().withRegion(Regions.US_EAST_1).build()
  lazy val dynamoDb: DynamoDB = new DynamoDB(client)

  lazy val clock: Clock = Clock.systemUTC()

  val tableName: String = "numbers-db"
  val encoding = "UTF-8"

  def post(input: InputStream, output: OutputStream, context: Context): Unit = {
    // Parsing order from input stream
    val order = parse(IOUtils.toString(input, encoding)).flatMap(_.as[Order])
    // Wrapping it to an object, which we would like to persist
    val persistedOrder = order.map(PersistedOrder(UUID.randomUUID().toString, LocalDateTime.now(clock), _))

    val persisted = persistedOrder.flatMap { o =>
      // Getting DynamoDB table
      val table = dynamoDb.getTable(tableName)
      Try {
        // Creating new DynamoDB item
        val item = new Item()
          .withPrimaryKey("id", o.orderId)
          .withLong("at", Timestamp.valueOf(o.at).getTime)
          .withString("clientId", o.order.clientId)
          .withList("numbers", o.order.numbers.toList.asJava)
        // Persisting item to DynamoDB
        table.putItem(item)
        o
      }.toEither
    }

    // Throw the exception if one happened, otherwise write the persisted order to the output as JSON
    persisted.map(_.asJson).fold(throw _, json => {
      IOUtils.write(json.noSpaces, output, encoding)
      output.flush()
    })
  }

}

Deploy

Now, after running the following command and answering the access questions, your Lambda will be deployed:

AWS_ACCESS_KEY_ID=<YOUR_KEY_ID> AWS_SECRET_ACCESS_KEY=<YOUR_SECRET_KEY> sbt createLambda

You can visit the AWS console and find your Lambda there:

Post lambda available

Let’s try to test it. Go to the post function and click the Test button. We can use the following example as input:

{
  "clientId": "123",
  "numbers": [
    1,
    2,
    3
  ]
}

And now we see an error:

DynamoDB authorization error

This means that our user is not authorized to perform the PutItem action on DynamoDB. Let’s authorize it. Go to the IAM management console, select our role in the Roles section (by default it’s lambda_basic_execution), click the Attach Policy button in the Permissions section and attach the AmazonDynamoDBFullAccess policy. Alternatively, you can go to the DynamoDB configuration and, in the “Access control” tab, create a policy that allows our user to perform the PutItem action.

That’s all. Now we can test it one more time and happily enjoy the following result:

Post method works

We can also see that our DynamoDB table has actually been updated:

DynamoDB items

API Gateway for post method

We created our shiny new Lambda function, but it doesn’t have any API connecting it to the outside world. Let’s fix that. Go to Amazon API Gateway and create a new API with a POST method. There you need to wire it to your concrete Lambda function and test it. Surprisingly, it just works.

Summary

In this article we created a simple project that allows us to build Lambda functions in Scala with automated sbt deployment. We also learnt how to put data into Amazon DynamoDB. It’s easy and powerful. You can find the source code of this project on my GitHub.

Articles

Static Microservice Discovery

As we saw in our previous post, Akka Cluster gives you some predefined topologies out of the box, and it’s very easy to perform their discovery. For example, with the Direct Topology, all you need to discover a service is its actor path, including the address of the node. Discovering a microservice with the Cluster Singleton Topology is even simpler: you just need to know the cluster role name and the singleton manager name (with its path).

The simplest way to implement service discovery is simply to create a class in the service-api module that encapsulates all the necessary data. Let’s call this method static service discovery and have a look at some examples.

Cluster Singleton Static Discovery

As we agreed in the post about decomposition into modules, we are going to split the service into a service implementation (stored in the service-core module) and a service gateway (an interface in the service-api module).

Singleton Service API Module

import akka.actor.{ActorRef, ActorRefFactory, ActorSystem, Props}
import akka.cluster.singleton.{ClusterSingletonProxy, ClusterSingletonProxySettings}

object FooDescriptor {
  val SingletonManagerName = "fooManager"
  val SingletonManagerPath = s"user/$SingletonManagerName"
  val ClusterRole = "foo"
}

object FooProxyFactory {
  val ProxyActorName = "fooProxy"
  
  def proxyProps(settings: ClusterSingletonProxySettings): Props = ClusterSingletonProxy.props(
    singletonManagerPath = FooDescriptor.SingletonManagerPath,
    settings = settings.withRole(FooDescriptor.ClusterRole)
  )

  def proxyProps(actorSystem: ActorSystem): Props = proxyProps(ClusterSingletonProxySettings(actorSystem))

  def createProxy(actorRefFactory: ActorRefFactory, actorSystem: ActorSystem, name: String): ActorRef = actorRefFactory.actorOf(
    proxyProps(actorSystem),
    name = name
  )

  def createProxy(actorSystem: ActorSystem, name: String = ProxyActorName): ActorRef = createProxy(actorSystem, actorSystem, name)
}

As we can see from the code above, FooDescriptor contains the full information required to create a proxy actor for accessing the Foo singleton.

Singleton Service Core Module

Since the core module has the api module as a dependency, we can use the information from FooDescriptor to assemble our own singleton.

import akka.actor.{ActorRef, ActorRefFactory, ActorSystem, PoisonPill, Props}
import akka.cluster.singleton.{ClusterSingletonManager, ClusterSingletonManagerSettings}

object FooManagerFactory {
  def managerProps(settings: ClusterSingletonManagerSettings): Props = ClusterSingletonManager.props(
    // `injected[FooActor]` comes from the author's dependency injection setup; Props[FooActor] would work for a no-argument actor
    singletonProps = Props(injected[FooActor]),
    terminationMessage = PoisonPill,
    settings = settings.withRole(FooDescriptor.ClusterRole)
  )
  
  def managerProps(actorSystem: ActorSystem): Props = managerProps(ClusterSingletonManagerSettings(actorSystem))
  
  def createManager(actorRefFactory: ActorRefFactory, actorSystem: ActorSystem): ActorRef = actorRefFactory.actorOf(
    managerProps(actorSystem),
    name = FooDescriptor.SingletonManagerName
  )

  def createManager(actorSystem: ActorSystem): ActorRef = createManager(actorSystem, actorSystem)
}

Conclusion

Let’s have a look at the pros and cons of the static discovery method.

Pros: Simplicity

This method of discovery is very simple. We don’t need complicated systems to deliver the topology to the end client.

Cons: No dynamic update on topology change

When we need to update the service topology (for example, when we rewrite a service from the Cluster Singleton topology to the Cluster Sharding topology), we need to change and recompile the service-api module. That means we need to recompile and redeploy all the client services. This might be an issue when the service has many clients, and it can easily lead to a redeployment of the whole cluster. Thus, using this method we can neutralize the benefits of using microservices.

Cons: No way to track all running services

Unless we use a service registry (we will discuss this topic later), there is no way to control and track all running services.

Microservice Topologies in Akka Cluster

Service discovery is an essential aspect of a microservice architecture. Does using Akka Cluster simplify this problem for us? To answer this question, we first need to analyse the data flows within a cluster and understand how the different elements of a microservice (service nodes or clients) can be arranged inside a cluster. Let’s call this arrangement a topology.

A microservice topology is the arrangement of the various elements of a cluster.

Let’s describe some of the possible microservice topologies within an Akka Cluster.

Direct Topology

Every actor in a cluster has a path, which includes its network address, and this path is part of the serializable ActorRef value. This means you can have an actor running on any machine, and knowing its ActorRef is enough to send a message directly to that actor from any other machine. Let’s call this topology direct.
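A minimal sketch of the idea (the actor system name, host address and actor name below are made up):

import akka.actor.{ActorSelection, ActorSystem}

object DirectTopologyExample extends App {
  val system = ActorSystem("cluster")

  // The full path includes the protocol, the actor system name and the network address of the node,
  // so any other node can reach this actor directly.
  val worker: ActorSelection =
    system.actorSelection("akka.tcp://cluster@10.0.0.1:2552/user/worker")

  worker ! "ping" // the message goes straight to that actor on that node
}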

Cluster Singleton Topology

This topology is based on the Cluster Singleton approach. In this topology we ensure that exactly one actor of a certain type is running somewhere in the cluster.

Cluster Sharding Topology

This topology is based on the Cluster Sharding approach. Cluster Sharding is useful when we need to distribute actors across several nodes in the cluster and want to interact with them using their logical identifier, without having to care about their physical location in the cluster, which might also change over time.
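A rough sketch of this with the standard Cluster Sharding API (the entity actor, envelope and all names are illustrative only):

import akka.actor.{Actor, ActorRef, ActorSystem, Props}
import akka.cluster.sharding.{ClusterSharding, ClusterShardingSettings, ShardRegion}

// Hypothetical entity actor: one instance per logical counter id
class CounterActor extends Actor {
  private var count = 0
  def receive: Receive = {
    case "increment" => count += 1
    case "get"       => sender() ! count
  }
}

// Hypothetical envelope: the entity id travels with the message payload
final case class Envelope(entityId: String, payload: Any)

object ShardingExample extends App {
  val extractEntityId: ShardRegion.ExtractEntityId = {
    case Envelope(id, payload) => (id, payload)
  }
  val extractShardId: ShardRegion.ExtractShardId = {
    case Envelope(id, _) => (math.abs(id.hashCode) % 100).toString
  }

  val system = ActorSystem("cluster")

  // The shard region is the only reference a client needs; entities are addressed by logical id
  val counters: ActorRef = ClusterSharding(system).start(
    typeName = "Counter",
    entityProps = Props[CounterActor],
    settings = ClusterShardingSettings(system),
    extractEntityId = extractEntityId,
    extractShardId = extractShardId
  )

  counters ! Envelope("counter-42", "increment") // no need to know which node hosts this entity
}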

Cluster Router Group Topology

This topology is based on Cluster Aware Routers with a Group of Routees. The router sends messages to the specified path using actor selection; the routees can be shared between routers running on different nodes in the cluster.
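A sketch of such a router defined programmatically (the routee path and the role name are made up; the same router can also be defined purely in configuration):

import akka.actor.{ActorRef, ActorSystem}
import akka.cluster.routing.{ClusterRouterGroup, ClusterRouterGroupSettings}
import akka.routing.RoundRobinGroup

object RouterGroupExample extends App {
  val system = ActorSystem("cluster")

  // Round-robin over actors registered at /user/worker on every node with the "worker" role
  val workers: ActorRef = system.actorOf(
    ClusterRouterGroup(
      RoundRobinGroup(Nil),
      ClusterRouterGroupSettings(
        totalInstances = 100,
        routeesPaths = List("/user/worker"),
        allowLocalRoutees = true,
        useRole = Some("worker")
      )
    ).props(),
    name = "workerRouter"
  )

  workers ! "job"
}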

Other Topologies

This list of topologies is of course not complete. There are other possible variations, such as message-queue-based topologies (ZMQ, NSQ), topologies involving Apache Kafka, etc.

Decomposition of a monolithic application into Akka Cluster microservices

In the first part of this series of articles about Akka Cluster microservices, I would like to cover the decomposition of a monolithic application. First of all, we need to understand what a monolithic application is and what microservices are. We also need to define a goal, and the principles and practices to achieve this goal. So, let’s start.

The Monolith

Let’s invent an abstract monolithic Scala application, mostly written in a message-driven manner on top of the Akka actor model, with Play Framework for the controller layer and Akka Cluster for node coordination. Let’s add good unit and integration test coverage (>80%) and a highly qualified agile team with UAT as part of the team workflow. Too perfect? Okay, then let’s add some complexity. The application is highly coupled, violates almost all SOLID principles and lacks a strong separation of concerns (for example, there is no service layer at all). All the source code lives in one repository without being split into modules, and so on. A classical monolith.

Before starting our dive, we first need to define our goals.

Strategic Goal

Our goal is quite simple: split the monolithic application into microservices.

To realise what we need, we first have to understand what a microservice is. The definition given by Sam Newman in his book “Building Microservices: Designing Fine-Grained Systems” is the following: microservices are small, autonomous services that work together. I also like the platform-agnostic golden rule for defining microservices: a microservice is something that could be rewritten in two weeks.

For our concrete actor-based application, a typical microservice can be described by the following figure:

Example of Microservice Data Flow

Key Benefits of microservice architecture

There are a lot of articles arguing that a microservice architecture is something you need. The key benefits of this approach are:

  • Technical heterogeneity
  • Resilience
  • Scaling
  • Ease of Deployment
  • Composability
  • Optimizing for replaceability

Architectural principles and Design and Delivery Practices

To check that we are moving in the right direction, we need to define some platform-specific and application-specific principles and practices to follow. This section might be extended, since other principles and practices may be discovered during the implementation.

Cross-modular versioning is important

Since we are going to deal with multiple instances of microservices, which are going to be modularized, we always need to know which version of a module is running on a given microservice. This becomes important when, after some API or model changes, some modules turn out to be incompatible with others. We always need to keep version compatibility in mind. There are a lot of versioning methodologies that might be useful for this concrete task.

Akka Cluster as a transport

This practice might go against the technology-agnostic approach to designing microservices: the only supported platform is the JVM and the most commonly used language is Scala. But let’s make this choice because of the asynchronous, message-driven style of collaboration implemented in Akka and extended to Akka Cluster. This style is also good to follow in the concrete module implementations; it helps to build reactive, distributed and highly scalable applications. Of course, there are a lot of other choices: synchronous and asynchronous request/response RPC (SOAP, Thrift, Protocol Buffers) or request/response REST (over HTTP). But in this article we are going to stick to the Akka and Akka Cluster message-based approach.

Solution preferences

Let’s define a list of priorities for which approach is most preferable. The key principle here is to choose the most flexible and scalable solution for the specific task.

Separate public messages and API from internal ones

In order to keep the API decoupled from the process module, we need to keep internal messages independent from the external API, even if the functionality is duplicated. Internal messages may change quite often, and API messages shouldn’t reflect these changes. The API needs to stay backward compatible for as long as possible.
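A sketch of what this separation might look like (the module layout and message names are made up):

// service-api module: the public protocol; must stay backward compatible
object FooApi {
  final case class PlaceOrder(orderId: String, amount: BigDecimal)
  final case class OrderPlaced(orderId: String)
}

// service-core module: internal messages; may change on every release and are never
// sent to (or depended on by) clients, even when they look similar to the public ones
object FooInternal {
  final case class PersistOrder(orderId: String, amount: BigDecimal, attempt: Int)
  case object FlushBuffer
}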

Decomposition

First of all, we need to understand how to split the monolithic application into modules. What strategy should be applied to keep them loosely coupled and highly cohesive? We will, of course, need some shared util and runtime libraries. But the more shared components we have, the more tightly coupled the application we will end up with. So let’s try to find a good balance for our concrete task and keep the dependency list as small as possible.

We need to extract the core microservice logic into separate module(s) while keeping the monolithic application working properly and all unit tests passing.

In my opinion, the best solution is to build a microservice from the following sbt modules:

Model

Models shared between microservices. This module might contain the following components (a minimal sketch follows the list):

  • Model classes and their companion objects.
  • Constants.
  • Serialization/deserialization formats of the shared models (let’s choose JSON for simplicity’s sake).
  • Utils. Unfortunately, some companion objects (for example, parsers or apply() methods) require some util components.
  • Unit tests. We need to make sure that serialization/deserialization and utils work fine.
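A minimal sketch of such a shared model, assuming circe is used for the JSON formats (as in the Lambda article above); the class and field names are illustrative:

import io.circe.{Decoder, Encoder}
import io.circe.generic.semiauto.{deriveDecoder, deriveEncoder}

// A shared model class with its serialization format kept next to it
final case class ClientOrder(clientId: String, numbers: Set[Int])

object ClientOrder {
  // Constants that belong to the model
  val MaxNumbers: Int = 100

  // JSON formats shared by every microservice that touches this model
  implicit val encoder: Encoder[ClientOrder] = deriveEncoder[ClientOrder]
  implicit val decoder: Decoder[ClientOrder] = deriveDecoder[ClientOrder]
}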

API

The interface to the concrete microservice. It should contain enough information to perform microservice lookup and discovery. All clients of this microservice should include this module as a dependency. Regarding where public messages are stored, there are two possible implementations:

Gateway

This module contains only the information needed to perform microservice lookup and discovery (service topology data). It doesn’t contain public messages and their serializers; those should be stored in Model. In this case Core shouldn’t have this module in its dependencies.

Protocol

This module contains the information needed to perform microservice lookup and discovery (service topology data). It also contains the public messages, their serializers, etc. In this case Core should depend on this module.

Core

The main microservice logic. It depends on the Model module and, for some concrete implementations, on the API modules. It should be able to handle public messages. Internal messages shouldn’t be exposed.

The figure below shows the dependencies between different microservices and their related modules.

Microservices dependencies

Conclusion

In this article I tried to cover the first round of decomposing a monolithic application into microservices. Of course, this is only the beginning of a big journey, and some further decomposition will probably be required. In the next articles I’ll try to cover different Akka Cluster topologies and some methods of microservice discovery.

Akka Cluster Microservices

With this post I would like to start a series of articles about designing and implementing microservices on top of Akka Cluster. Everything that follows is my personal opinion, based on my experience. These notes are an attempt to systematize my knowledge and experience in a specific domain. I’m writing them mostly for myself, to keep things in mind and avoid some issues in the future.