How to set up a Keycloak server with an external MySQL database on AWS ECS Fargate in clustered mode
Edit: This blog post was written before Keycloak moved to Quarkus. This article explains how to set up the database, load balancer, service and everything AWS related. For the Quarkus-specific configuration, check out my new article. The main differences are the task definition part and that we have to build our own Docker image. By reading both articles you should be able to get the new Quarkus distribution of Keycloak up and running on AWS.
For a project I was working on we needed to set up a Keycloak server running on an Amazon ECS Fargate instance. I could not find anything on the web describing how to do so step by step, but after reading the docs of Keycloak, JBoss, AWS and WildFly I finally got the cluster up and running like it should. This is the article I really wish I had found when I was doing my research. Here I will describe step by step how to get your Keycloak server up and running on an AWS Fargate instance in clustered mode with an external database.
What we need to do
- Create a database in Amazon RDS (or anywhere really, we just need the connection details, we will be using MySQL 8 in this tutorial).
- Create an Application Load Balancer (ALB) in the EC2 Console.
- Configure the Target Group connected to the ALB.
- Create a Task Definition for our Keycloak image.
- Create a Service where we can run our tasks (Keycloak).
Creating database
Note: If you already have a database ready for use just skip this step.
Keycloak supports many of the most popular databases out there (Oracle, Microsoft SQL Server, PostgreSQL and MySQL). In this tutorial we will be creating a MySQL 8 database on Amazon Relational Database Service (RDS).
First head over to RDS -> Databases and click on Create database. I will go through the wizard step by step.
We will be using Amazon's free tier for this tutorial, but that would probably not be a good fit for production, depending on your needs in terms of performance and reliability.
Set a name for your database instance, a username and a password. As long as we have chosen the free tier template, the DB instance class is chosen for us. If you need more resources for your database I suggest you upgrade from the free tier.
For this example just keep the defaults, but in a production environment I would consider using storage autoscaling or setting up an alert so your database does not run out of storage. If you move away from the free tier you could also set up a standby instance, which Amazon recommends for production usage.
Here we go with the default VPC and subnet group and create a new security group for our database. We will come back to the configuration of the security group later (it is important to open it up for the security group protecting Keycloak). By default it automatically opens up for your computer's IP.
We call our database keycloak (which is the name the Keycloak image expects by default, but this can be overridden with environment variables if you would like to give your database a different name). I turned off automatic backup for this example, but that is most certainly something you want in production. For the rest I just kept the defaults.
Just click Create database and AWS starts up your new keycloak database instance (this could take a few minutes).
When it is ready you can see that the status has changed from Creating to Available. This means you can now connect to your newly created database instance. Just click on your instance and you will see the connection settings:
We need the endpoint and port from this screen and the username and password we created earlier. Then we can connect to the database from our favorite database tool. I prefer IntelliJ, where it looks something like this:
Connecting to the keycloak database is useful because we can verify that the Keycloak tables get created when we launch our Keycloak image later in this tutorial.
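If the database tool cannot connect, it helps to first check whether the endpoint is reachable at all from your machine (the security group has to allow your IP). Here is a minimal Python sketch of such a check. The function name can_reach is my own, and the endpoint in the comment is a placeholder, not a real instance.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts.
        return False


# Example call (placeholder endpoint, use the one from the RDS console):
# can_reach("keycloak.example.eu-central-1.rds.amazonaws.com", 3306)
```

This only proves that the port is open. It does not verify the username or password.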
Creating an ALB
OK, now we have a database, so let us focus on getting Keycloak up and running. Before we do that we need to create an Application Load Balancer. This will balance the load between our Keycloak instances and make our Fargate instance available through DNS (we get a URL we can use in the browser to reach it). We will apply this load balancer when we create our Service later.
Load balancers are created in the AWS EC2 dashboard. Just click Create Load Balancer and you will see this screen:
Choose the Application Load Balancer and move on.
Choose a name for your load balancer and choose a VPC. We are still using our default VPC, and we place the load balancer in two availability zones for availability (adjust this to your needs). Click next. You will now get a warning since we have not configured a secure listener (HTTPS). Just click next. This tutorial will not show you how to set up a secure listener; there are many great tutorials on how to do that. It is fairly easy if your domain is hosted on AWS Route 53: add an HTTPS listener on port 443 to your load balancer, create a certificate with AWS Certificate Manager, apply it to your load balancer and you are good.
The next step is to create a Security Group for our ALB. We want it to be available on the internet, so we set the source to anywhere. Again, this is something you need to configure for your needs. If only your VPN solution should have access to Keycloak, you would obviously need stricter rules. And if you add a secure listener, you would probably open for HTTPS (443) as well.
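For those who prefer to script this, the inbound rules can be written down as plain data. The sketch below builds rules in the shape that I believe boto3's authorize_security_group_ingress expects for its IpPermissions argument. The helper name ingress_rule, port 80 for the plain HTTP listener, the VPN address range and the security group ID are all assumptions for illustration.

```python
from typing import Optional


def ingress_rule(port: int, cidr: Optional[str] = None,
                 source_group_id: Optional[str] = None) -> dict:
    """Build one inbound TCP rule, open to a CIDR range or to another security group."""
    if (cidr is None) == (source_group_id is None):
        raise ValueError("give exactly one of cidr or source_group_id")
    rule = {"IpProtocol": "tcp", "FromPort": port, "ToPort": port}
    if cidr is not None:
        rule["IpRanges"] = [{"CidrIp": cidr}]
    else:
        rule["UserIdGroupPairs"] = [{"GroupId": source_group_id}]
    return rule


# ALB open to the internet on HTTP (port 80 is an assumption).
alb_rules = [ingress_rule(80, cidr="0.0.0.0/0")]
# With a secure listener you would open HTTPS as well.
alb_rules_with_https = alb_rules + [ingress_rule(443, cidr="0.0.0.0/0")]
# Stricter variant: only a hypothetical VPN address range gets access.
vpn_only_rules = [ingress_rule(443, cidr="203.0.113.0/24")]
# Database security group: MySQL only from the (hypothetical) Keycloak security group.
db_rules = [ingress_rule(3306, source_group_id="sg-0123456789abcdef0")]
```

Referencing a source security group instead of an IP range is what lets the database rule keep working when Fargate hands out new IP addresses.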
Now we will create our target group, which the load balancer will route requests to. Give it an appropriate name and select IP as the target type. Keycloak does not have a health check endpoint (but as always there are extensions you could install), so let us just point the health check to the /auth/ endpoint, which returns status code 200 if Keycloak is up and running (do not forget the trailing slash).
Then click next, we will not register targets now, just click next and finish the wizard.
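The target group settings from the wizard can also be sketched as data. The keys below are the ones I understand boto3's create_target_group to accept. The function name target_group_params and the VPC ID in the example are made up, and port 8080 is taken from the container port used later in this tutorial.

```python
def target_group_params(name: str, vpc_id: str, health_path: str = "/auth/") -> dict:
    """Collect the target group settings for Keycloak tasks running on Fargate."""
    if not health_path.endswith("/"):
        # The trailing slash matters for the Keycloak /auth/ endpoint.
        health_path += "/"
    return {
        "Name": name,
        "Protocol": "HTTP",
        "Port": 8080,
        "VpcId": vpc_id,
        "TargetType": "ip",  # Fargate tasks are registered by IP address
        "HealthCheckProtocol": "HTTP",
        "HealthCheckPath": health_path,
        "Matcher": {"HttpCode": "200"},
    }


# Hypothetical VPC ID, replace with your default VPC.
params = target_group_params("keycloak-tg", "vpc-0123456789abcdef0")
```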
Configuring our Target Group
In the ALB creation wizard we created a target group, keycloak-tg. To get our Keycloak cluster to function properly we need to add some extra configuration.
Head over to Target Groups in the EC2 Console and click on the target group we created previously.
We need to edit the Attributes section.
As Keycloak states in its documentation (https://www.keycloak.org/docs/latest/server_installation/), sticky sessions are not mandatory when using a load balancer, but they are good for performance. Keycloak uses the cookie AUTH_SESSION_ID, so let us enable stickiness and add the cookie to the target group.
The rest is default, so just click Save changes once you have enabled Stickiness, chosen Application-based cookie and added the App cookie name.
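The same stickiness settings can be expressed as target group attributes. This is a small sketch of the attribute list, in the Key/Value form used by the ELBv2 API (for example boto3's modify_target_group_attributes). The function name and the one day duration are my own assumptions; only the cookie name comes from Keycloak.

```python
def stickiness_attributes(cookie_name: str = "AUTH_SESSION_ID",
                          duration_seconds: int = 86400) -> list:
    """Target group attributes for application-based cookie stickiness."""
    return [
        {"Key": "stickiness.enabled", "Value": "true"},
        {"Key": "stickiness.type", "Value": "app_cookie"},
        {"Key": "stickiness.app_cookie.cookie_name", "Value": cookie_name},
        # Attribute values are strings, also for numbers.
        {"Key": "stickiness.app_cookie.duration_seconds", "Value": str(duration_seconds)},
    ]


attributes = stickiness_attributes()
```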
Create Task Definition
The Task Definition is where we define which Docker image to use, how much CPU and RAM the container(s) should get, and the logging and environment variables.
Head over to the ECS Console and click on Task Definitions.
Click on Create new Task Definition.
Choose Fargate as the launch type.
Give your Task Definition an appropriate name and set up a Task execution IAM role if you have not already; if you have, just pick it. For this tutorial I have gone with the cheapest options, as little memory and CPU as possible. Again, this is something you would need to configure for your needs. Keycloak likes its resources and can be very slow on startup when it gets this little. We will handle that by extending the start period before the health checks kick in.
Let the rest stay default and click on Add container.
Edit: I would recommend using the latest Keycloak version (you can find it here: https://hub.docker.com/r/jboss/keycloak/tags). Right now I run version 15.0.1.
Give the container an appropriate name and add the jboss/keycloak image under Image (if you build your own Keycloak image and push it to ECR, this is where you would put the URL of the ECR repository). Add 8080 to port mappings and add the health check command under Command. The health check uses the /auth/ endpoint here as well. To ensure that our container health check does not fail I have added a 300-second grace period before health checks kick in. This is only a problem if you give Keycloak limited CPU and RAM, since it then takes time before it starts up and is able to respond to health checks.
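In JSON form the container health check is a small object. The sketch below builds it in Python. Note that the curl command is my assumption of a typical check against /auth/ (it requires curl to be present in the image), and interval, timeout and retries are the ECS defaults as far as I know. Only the 300-second start period comes from the setup described above.

```python
def container_health_check(port: int = 8080, path: str = "/auth/",
                           start_period: int = 300) -> dict:
    """Build the healthCheck object of an ECS container definition."""
    if not 0 <= start_period <= 300:
        raise ValueError("ECS accepts a startPeriod between 0 and 300 seconds")
    return {
        # Assumed command: fail the check when /auth/ does not answer successfully.
        "command": ["CMD-SHELL", f"curl -f http://localhost:{port}{path} || exit 1"],
        "interval": 30,
        "timeout": 5,
        "retries": 3,
        "startPeriod": start_period,
    }


health_check = container_health_check()
```

As far as I can tell 300 seconds is the maximum start period ECS allows, which is why the sketch rejects larger values.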
Now scroll down and let us take a look at the environment variables, because this is where most of the config happens.
I will go through all of them to explain why we need them.
PROXY_ADDRESS_FORWARDING
When running Keycloak behind a load balancer you need to set PROXY_ADDRESS_FORWARDING to true or else it will not forward requests correctly.
DB_ADDR
This is the endpoint URL for the database you would like to connect to your Keycloak instance(s). For the database I created earlier it was: keycloak.cjlnhvmoxcuy.eu-central-1.rds.amazonaws.com
DB_USER
The database user the Keycloak instance(s) should connect to the database with. We will use the user admin which we created in the RDS wizard.
DB_PASSWORD
The password to the admin user we created in the RDS wizard.
DB_VENDOR
This is where we specify which database is used, in our case mysql.
JGROUPS_DISCOVERY_PROTOCOL
You have to set this explicitly since Keycloak by default uses multicast, which is not supported on Fargate. This protocol determines how the cluster nodes discover each other. On AWS Fargate the only options are S3_PING and JDBC_PING. Since we have already connected our Keycloak instances to a database, the latter is preferable to creating an S3 bucket solely for clustering purposes.
JGROUPS_DISCOVERY_PROPERTIES
This property is necessary to run Keycloak in a cluster on AWS Fargate in JDBC_PING mode. We have set it to:
datasource_jndi_name=java:jboss/datasources/KeycloakDS,info_writer_sleep_time=500
This means that the Keycloak instances will use the database that we have set up and create a JGROUPSPING table in it. See https://developer.jboss.org/docs/DOC-16351 and https://www.keycloak.org/2019/08/keycloak-jdbc-ping for more documentation on JDBC_PING and how to set it up.
I would also recommend adding remove_old_coords_on_view_change=true so the environment variable looks like this:
datasource_jndi_name=java:jboss/datasources/KeycloakDS,info_writer_sleep_time=500,remove_old_coords_on_view_change=true
This was not in the original article, but it is a nice addition. I experienced some issues when, for instance, one container crashed and a new container started, but the IP address in the JGROUPSPING table still belonged to the crashed container. A new container gets a new IP, so it timed out while joining. Setting this property to true makes sure that the table is cleared so it gets updated with the most recent IP addresses.
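Since the value is just a comma-separated list of key=value pairs, it is easy to generate it instead of typing it by hand. A small sketch, where the helper name is my own:

```python
def jgroups_discovery_properties(props: dict) -> str:
    """Join properties into the key=value,key=value form used by JGROUPS_DISCOVERY_PROPERTIES."""
    def format_value(value) -> str:
        # Booleans must be lowercase (true/false), not Python's True/False.
        if isinstance(value, bool):
            return "true" if value else "false"
        return str(value)

    return ",".join(f"{key}={format_value(value)}" for key, value in props.items())


discovery_properties = jgroups_discovery_properties({
    "datasource_jndi_name": "java:jboss/datasources/KeycloakDS",
    "info_writer_sleep_time": 500,
    "remove_old_coords_on_view_change": True,
})
print(discovery_properties)
```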
KEYCLOAK_USER
Keycloak does not come with a user, so we need to create one to be able to log in and start using Keycloak.
Note: Since we are using a database, the user and password environment variables can be removed after the first startup (the user will then already exist in the database, and Keycloak will tell you so in the log on every subsequent startup).
KEYCLOAK_PASSWORD
The password to the Keycloak user.
Just scroll on down and make sure Auto-configure CloudWatch Logs is ticked so we can look at the Keycloak logs later to check if our cluster is working properly. Then scroll to the bottom and click Add, scroll to the bottom again and click Create, and your task definition is done.
As you can see a log group for Keycloak has been created.
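For reference, here is a sketch of roughly what the resulting task definition could look like as JSON, generated with Python. The environment variable names are the ones explained above. The family and container name, the log group name, the role ARN, the database endpoint, the passwords and the CPU/memory values (256/512, the smallest Fargate combination) are assumptions or placeholders. The container health check is left out to keep the sketch short.

```python
import json


def keycloak_task_definition(db_addr: str, db_user: str, db_password: str,
                             admin_user: str, admin_password: str,
                             execution_role_arn: str,
                             region: str = "eu-central-1") -> dict:
    """Assemble a Fargate task definition for the jboss/keycloak image."""
    env = {
        "PROXY_ADDRESS_FORWARDING": "true",
        "DB_VENDOR": "mysql",
        "DB_ADDR": db_addr,
        "DB_USER": db_user,
        "DB_PASSWORD": db_password,
        "JGROUPS_DISCOVERY_PROTOCOL": "JDBC_PING",
        "JGROUPS_DISCOVERY_PROPERTIES": (
            "datasource_jndi_name=java:jboss/datasources/KeycloakDS,"
            "info_writer_sleep_time=500,remove_old_coords_on_view_change=true"
        ),
        "KEYCLOAK_USER": admin_user,
        "KEYCLOAK_PASSWORD": admin_password,
    }
    return {
        "family": "keycloak",
        "networkMode": "awsvpc",
        "requiresCompatibilities": ["FARGATE"],
        "cpu": "256",      # assumed: smallest Fargate CPU setting
        "memory": "512",   # assumed: smallest Fargate memory setting
        "executionRoleArn": execution_role_arn,
        "containerDefinitions": [{
            "name": "keycloak",
            "image": "jboss/keycloak:15.0.1",
            "essential": True,
            "portMappings": [{"containerPort": 8080, "protocol": "tcp"}],
            "environment": [{"name": k, "value": v} for k, v in env.items()],
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "/ecs/keycloak",  # assumed log group name
                    "awslogs-region": region,
                    "awslogs-stream-prefix": "ecs",
                },
            },
        }],
    }


# Placeholder values only, replace all of them with your own.
task_definition = keycloak_task_definition(
    db_addr="keycloak.example.eu-central-1.rds.amazonaws.com",
    db_user="admin",
    db_password="change-me",
    admin_user="keycloak-admin",
    admin_password="change-me-too",
    execution_role_arn="arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
)
print(json.dumps(task_definition, indent=2))
```

The printed JSON should be close to what the AWS CLI's register-task-definition accepts via --cli-input-json, but I would compare it with the JSON tab of a task definition created in the console before relying on it.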
Create an ECS Cluster
Now we have a database, load balancer, target group and a task definition. The only two things missing are an ECS Cluster to run our Service in and the Service itself.
In the ECS Console click on Clusters.
Click Create Cluster.
Choose Networking only as this is a Fargate cluster, click Next step.
Give your cluster an appropriate name and click Create (we will not create a new VPC since we are using the default VPC in this tutorial).
Now you will see the new Cluster in the list, but it has 0 services and tasks running, let us change that.
Create an ECS Service
To create a Service, open our newly created cluster. You should see a screen like this:
In the Services tab, hit Create.
Choose Fargate as launch type, pick the task definition we created, give the service an appropriate name and set Number of tasks higher than 1 (otherwise it would not be a cluster, since that gives us only one Keycloak instance). Keep the other defaults and hit Next step.
Then choose the VPC (we will use the default VPC) and add subnets. The security group could be locked down to only allow traffic from the load balancer's security group and the database. You can also see that we have added an 800-second grace period for the load balancer's health checks. As mentioned earlier, that is because Keycloak does not boot very quickly with few resources (800 is probably overkill).
Then choose the load balancer we created and click Add to load balancer.
Then pick the target group and the settings we set up earlier will load. Click Next step twice and then Create Service. It could take some minutes to get it up and running.
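The service settings from the wizard can be summed up as the arguments to an ECS create_service call. The sketch below only builds the argument dictionary, using the key names I know from boto3. The service and container names, all IDs and ARNs, and assignPublicIp set to ENABLED (so tasks in the default VPC can pull the image from Docker Hub) are assumptions.

```python
def keycloak_service_params(cluster: str, task_definition: str, subnets: list,
                            security_groups: list, target_group_arn: str,
                            desired_count: int = 2,
                            grace_period_seconds: int = 800) -> dict:
    """Collect the settings for a clustered Keycloak service on Fargate."""
    if desired_count < 2:
        raise ValueError("a cluster needs at least two Keycloak tasks")
    return {
        "cluster": cluster,
        "serviceName": "keycloak",
        "taskDefinition": task_definition,
        "desiredCount": desired_count,
        "launchType": "FARGATE",
        # Time the load balancer health checks are ignored after a task starts.
        "healthCheckGracePeriodSeconds": grace_period_seconds,
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": subnets,
                "securityGroups": security_groups,
                "assignPublicIp": "ENABLED",  # assumption, see text above
            }
        },
        "loadBalancers": [{
            "targetGroupArn": target_group_arn,
            "containerName": "keycloak",
            "containerPort": 8080,
        }],
    }


# Hypothetical identifiers, replace with your own.
service_params = keycloak_service_params(
    cluster="keycloak-cluster",
    task_definition="keycloak",
    subnets=["subnet-0aaa0aaa0aaa0aaa0", "subnet-0bbb0bbb0bbb0bbb0"],
    security_groups=["sg-0123456789abcdef0"],
    target_group_arn=(
        "arn:aws:elasticloadbalancing:eu-central-1:123456789012:"
        "targetgroup/keycloak-tg/0123456789abcdef"
    ),
)
```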
Verify that your cluster is up and running
In the Keycloak documentation (https://www.keycloak.org/docs/latest/server_installation/) it states that the cluster works if you find a similar log message in both your containers:
INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] (Incoming-10,shared=udp)
ISPN000094: Received new cluster view: [node1/keycloak|1] (2) [node1/keycloak, node2/keycloak]
Let us take a look in CloudWatch:
As we can see, the IP addresses match our instances (you can see them in ECS under Cluster->Service->Task). The cluster is up and running.
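If you want to automate this check, for example on log lines exported from CloudWatch, the member list can be pulled out of the ISPN000094 message with a regular expression. This is a sketch based only on the message format shown above; the function name is my own, and other Keycloak versions may format the line slightly differently.

```python
import re
from typing import List, Optional, Tuple

# Matches "... cluster view: [coordinator|view id] (member count) [member, member]".
VIEW_PATTERN = re.compile(
    r"ISPN000094: Received new cluster view[^:]*: "
    r"\[[^\]]*\] \((\d+)\) \[([^\]]*)\]"
)


def parse_cluster_view(log_text: str) -> Optional[Tuple[int, List[str]]]:
    """Return (member count, member names) from the first cluster view message, if any."""
    match = VIEW_PATTERN.search(log_text)
    if match is None:
        return None
    size = int(match.group(1))
    members = [member.strip() for member in match.group(2).split(",")]
    return size, members


sample = (
    "INFO [org.infinispan.remoting.transport.jgroups.JGroupsTransport] "
    "(Incoming-10,shared=udp) ISPN000094: Received new cluster view: "
    "[node1/keycloak|1] (2) [node1/keycloak, node2/keycloak]"
)
view = parse_cluster_view(sample)
```

With two tasks running, a member count of 2 in both containers' logs is what we are looking for.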
Log in to Keycloak
You cannot log in to Keycloak without HTTPS, so you have to add an HTTPS listener to your load balancer in the EC2 Console (you would typically set it up on keycloak.yourdomain.com).
There is a way around this, but it should only be used for testing purposes and NOT in a production environment. You can connect to the keycloak database we created earlier and run this query:
update REALM set ssl_required = 'NONE' where id = 'master';
Now you should be able to log in at <load balancer DNS name>/auth/ after you have restarted Keycloak (that is, stopping your two containers in AWS ECS so that two new ones start up).
You can find your load balancer DNS name in the EC2 Console under Load Balancers. Click on the keycloak load balancer and you should see it.
Further steps
- Create an Amazon Elastic Container Registry (ECR) repository for Keycloak. This way you have a place to store customized Keycloak images, which is very useful if you would like to customize themes for Keycloak (login page, sign up etc.) with your company's graphical profile.
- Set up an HTTPS listener on your load balancer and connect it to your domain.
- Move the environment variables to AWS Systems Manager Parameter Store or Secrets Manager and use valueFrom in your Task Definition.
- Lock down the security groups so only necessary IP addresses/security groups are allowed inbound.
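To illustrate the valueFrom step: in a container definition, plain values stay under environment, while sensitive ones move to a secrets list that points at Parameter Store or Secrets Manager. Below is a minimal sketch. The function name, the choice of which variables count as sensitive and the parameter ARN prefix are assumptions.

```python
from typing import Dict, List, Tuple

# Assumed set of variables we do not want in plain text in the task definition.
SENSITIVE = frozenset({"DB_PASSWORD", "KEYCLOAK_PASSWORD"})


def split_environment(env: Dict[str, str], parameter_arn_prefix: str,
                      sensitive=SENSITIVE) -> Tuple[List[dict], List[dict]]:
    """Split variables into the environment and secrets lists of a container definition."""
    environment = [
        {"name": name, "value": value}
        for name, value in env.items() if name not in sensitive
    ]
    # The secret values themselves are not included; ECS reads them from valueFrom.
    secrets = [
        {"name": name, "valueFrom": f"{parameter_arn_prefix}/{name}"}
        for name in env if name in sensitive
    ]
    return environment, secrets


environment, secrets = split_environment(
    {"DB_VENDOR": "mysql", "DB_USER": "admin", "DB_PASSWORD": "change-me"},
    # Hypothetical Parameter Store prefix.
    "arn:aws:ssm:eu-central-1:123456789012:parameter/keycloak",
)
```

For this to work, the task execution role also needs permission to read the referenced parameters or secrets.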
Closing words
So now we have set up a Keycloak cluster on Fargate instances with an external database. This is my first tutorial and article on Medium. I hope you enjoyed it and that it helped someone. Just post a comment if anything is unclear. This became a long tutorial, and many of the steps could have been explained in more (or less) detail, but I hope it is possible to follow. I might update the article later with a task definition in JSON format and CLI commands, since the GUI can be cumbersome.