How to run a highly-available Typesense search in AWS

Get a multi-node Typesense cluster up & running in minutes on AWS using infrastructure as code

Retrieving relevant search results blazing-fast is a challenge. A solid search requires typo tolerance, result rankings, synonyms, filtering, faceting, geo search, and so on.

Quite often the underlying database schema is not optimized for our search needs, the relevance of search results is questionable, and running ad-hoc queries on big datasets usually leads to performance issues and slow responses.

This is why you should add a dedicated search service into the mix.

Typesense is my favorite: it's simple, fast, typo-tolerant, and open source. I've been using it on multiple production projects handling searches on datasets of millions of entries.

For production use, it is recommended to run a multi-node Typesense cluster. However, if you're looking for a lighter setup, check my other post on how to run a single Typesense node in AWS.

In this article, I will explain how to get a highly available (multi-node) Typesense cluster running in AWS using infrastructure as code.

The setup uses 2 AWS CloudFormation stacks that can be deployed into any AWS account.

How it Works

Typesense uses the Raft consensus algorithm to manage the cluster and recover from node failures. In cluster mode, Typesense automatically and continuously replicates the entire dataset to all nodes.

Raft requires a quorum for consensus, so we need to run a minimum of 3 nodes to tolerate a 1-node failure. A 5-node cluster tolerates failures of up to 2 nodes, but at the expense of slightly higher write latencies.
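The quorum arithmetic above can be sketched in a few lines of shell. This is only an illustration of the majority rule that Raft relies on; the function names are made up for this example and are not part of Typesense or the templates.

```shell
# Votes needed for a majority (quorum) in an N-node cluster: floor(N / 2) + 1
raft_quorum() {
    local nodes=$1
    echo $(( nodes / 2 + 1 ))
}

# Node failures an N-node cluster can tolerate while keeping quorum: floor((N - 1) / 2)
raft_fault_tolerance() {
    local nodes=$1
    echo $(( (nodes - 1) / 2 ))
}
```

Calling raft_fault_tolerance 3 prints 1 and raft_fault_tolerance 5 prints 2, matching the numbers above. A 4-node cluster still tolerates only 1 failure, which is why odd cluster sizes are the usual choice.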

To start a Typesense node as part of a cluster, each node needs a nodes definition file that lists all the nodes in the cluster, separated by commas, with each entry in the following format:

    <peering_address>:<peering_port>:<api_port>

  • The peering_address (the EC2 instance private IP address)

  • The peering_port (the port used for cluster operations - default 8107)

  • The api_port (the port to which clients connect - default 8108)
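As a rough sketch of how the three fields above combine into the comma-separated nodes string, the helper below joins a list of private IPs with the peering and API ports. The function name and the IP addresses in the usage note are hypothetical. It is written as standalone shell; inside the !Sub UserData block, any ${...} would have to be escaped as ${!...}.

```shell
# Join private IPs into the nodes string:
# <ip>:<peering_port>:<api_port>,<ip>:<peering_port>:<api_port>,...
# Usage: build_nodes_string <peering_port> <api_port> <ip> [<ip> ...]
build_nodes_string() {
    local peering_port=$1 api_port=$2
    shift 2
    local nodes="" ip
    for ip in "$@"; do
        # prepend a comma only when the string already holds an entry
        nodes="${nodes:+${nodes},}${ip}:${peering_port}:${api_port}"
    done
    printf '%s' "$nodes"
}
```

For example, build_nodes_string 8107 8108 10.0.1.10 10.0.2.10 10.0.3.10 prints 10.0.1.10:8107:8108,10.0.2.10:8107:8108,10.0.3.10:8107:8108.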

We will use an Auto Scaling group to launch a minimum of 3 EC2 Instances distributed across 3 Availability Zones and start a Typesense service on each.

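    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
        LaunchConfigurationName: !Ref LaunchConfig
        DesiredCapacity: 3
        MinSize: 3
        MaxSize: 6
        # notify the SNS topic when an instance is terminated
        NotificationConfigurations:
            - NotificationTypes:
                  - autoscaling:EC2_INSTANCE_TERMINATE
              TopicARN: !Ref EC2EventsTopic
        # register the instances with the load balancer target group
        TargetGroupARNs:
            - !Ref ALBTargetGroup
        # one subnet per Availability Zone
        VPCZoneIdentifier:
            - !ImportValue typesense-vpc-PublicSubnet1
            - !ImportValue typesense-vpc-PublicSubnet2
            - !ImportValue typesense-vpc-PublicSubnet3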

EC2 Instance Type

CPU capacity is important to handle concurrent search traffic and indexing operations, so Typesense requires at least 2 vCPUs of compute capacity to operate.

All the next-gen EC2 T4g instances have at least 2 vCPUs. They're powered by Arm-based AWS Graviton2 processors and are ideal for running applications with moderate CPU usage that experience temporary spikes in usage.

The amount of RAM required is completely dependent on the size of the data you index, but for Typesense to hold the whole index in memory, the instance memory should be 2-3X the size of the data set.

As the Typesense process itself is quite lightweight (20MB RAM with an empty dataset), a 2GB memory instance (t4g.small) will likely handle a data set of up to 1GB.
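The 2-3X rule of thumb can be turned into a quick back-of-the-envelope estimate. The helper below is a hypothetical sketch, not part of the templates: it multiplies the dataset size by a factor (3 by default, the conservative end of the range) and ignores the small footprint of the Typesense process itself.

```shell
# Rough RAM estimate in MB for a dataset of a given size in MB.
# Usage: estimate_ram_mb <dataset_mb> [factor]   (factor defaults to 3)
estimate_ram_mb() {
    local dataset_mb=$1 factor=${2:-3}
    echo $(( dataset_mb * factor ))
}
```

With the lower factor, estimate_ram_mb 1024 2 prints 2048, which is the 2GB instance mentioned above for a 1GB data set; with the default factor the same data set calls for 3072 MB.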

Data Storage

Typesense stores a copy of the raw data on disk and then builds the in-memory index with the data. At search time, after determining the final set of documents to return, it fetches only these documents from the disk and puts them in the API response.

We'll use the Amazon EC2 instance store for the node data, although this type of storage is ephemeral and data will be lost if the instance gets terminated. When a new instance is launched the data will be replicated from the other nodes automatically.

We'll have scheduled backups for our peace of mind.

Load Balancing

AWS Elastic Load Balancing is used to distribute the traffic equally across all nodes, and we're going to add an Application Load Balancer in front of our EC2 instances from all 3 Availability Zones. It will also help us manage the SSL termination.

    Type: AWS::ElasticLoadBalancingV2::LoadBalancer
    Properties:
        Name: !Sub ${AWS::StackName}-${EnvironmentType}
        Subnets:
            - !ImportValue typesense-vpc-PublicSubnet1
            - !ImportValue typesense-vpc-PublicSubnet2
            - !ImportValue typesense-vpc-PublicSubnet3
        SecurityGroups:
            - !GetAtt ALBSecurityGroup.GroupId

The ALB Security Group allows inbound traffic on ports 80 (HTTP) and 443 (HTTPS).

    Type: AWS::EC2::SecurityGroup
    Properties:
        GroupDescription: allow traffic on port 80
        VpcId: !ImportValue typesense-vpc-VPCID
        SecurityGroupIngress:
            # HTTP
            - IpProtocol: tcp
              FromPort: 80
              ToPort: 80
            # HTTPS
            - IpProtocol: tcp
              FromPort: 443
              ToPort: 443

We have 2 ALB listeners:

  • one for HTTPS that will forward traffic to the Target Group

  • one for HTTP that will redirect all requests to HTTPS

    # HTTPS listener: forwards traffic to the Target Group
    Type: AWS::ElasticLoadBalancingV2::Listener
    Properties:
        DefaultActions:
            - Type: forward
              TargetGroupArn: !Ref ALBTargetGroup
        LoadBalancerArn: !Ref ALB
        Certificates:
            - CertificateArn: !Ref ALBCertificate
        Port: 443
        Protocol: HTTPS

    # HTTP listener: redirects all requests to HTTPS
    Type: AWS::ElasticLoadBalancingV2::Listener
    Properties:
        DefaultActions:
            - Type: redirect
              RedirectConfig:
                  Host: "#{host}"
                  Path: "/#{path}"
                  Port: 443
                  Protocol: "HTTPS"
                  Query: "#{query}"
                  StatusCode: HTTP_301
        LoadBalancerArn: !Ref ALB
        Port: 80
        Protocol: HTTP

And finally, the Target Group routes requests to our nodes.

The Target Group also performs health checks on our nodes. It uses Typesense's /health endpoint, which returns an HTTP 200 status code and {"ok": true} if the node's search service is in a working state.
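To see what the Target Group is checking, the same endpoint can be queried by hand. The sketch below is a standalone shell illustration with hypothetical function names: it only assumes that a node is reachable on its API port and that curl is installed. It is not meant to be pasted into the !Sub UserData block as-is, where ${...} would need escaping as ${!...}.

```shell
# Succeeds when a /health response body reports {"ok": true}
health_body_ok() {
    # strip whitespace so both {"ok":true} and {"ok": true} match
    printf '%s' "$1" | tr -d '[:space:]' | grep -q '"ok":true'
}

# Query a node's /health endpoint; the API port defaults to 8108
# Usage: check_node_health <host> [api_port]
check_node_health() {
    local host=$1 port=${2:-8108} body
    body=$(curl -s --max-time 5 "http://${host}:${port}/health") || return 1
    health_body_ok "$body"
}
```

Running check_node_health with a node's private IP from inside the VPC returns success only when the node answers and reports a healthy state, which mirrors what the load balancer decides from the HTTP 200 response.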

    Type: AWS::ElasticLoadBalancingV2::TargetGroup
    Properties:
        HealthCheckIntervalSeconds: 30
        HealthCheckProtocol: HTTP
        HealthCheckTimeoutSeconds: 15
        HealthyThresholdCount: 5
        HealthCheckPath: /health
        # a node is healthy when /health answers with HTTP 200
        Matcher:
            HttpCode: 200
        Name: !Sub ${AWS::StackName}-${EnvironmentType}
        Port: 8108
        Protocol: HTTP
        UnhealthyThresholdCount: 3
        VpcId: !ImportValue typesense-vpc-VPCID

So we have two health checks in place for our highly available search service:

  • the Auto Scaling group checks whether the instance itself is healthy

  • the Load Balancer checks whether the application running on the instance is healthy

Node Launching

The Launch Configuration will be used by the Auto Scaling group to configure our EC2 instances (search nodes).

    Type: AWS::AutoScaling::LaunchConfiguration
    Properties:
        InstanceType: !Ref InstanceType
        ImageId: !Ref LatestLinuxAmiId
        IamInstanceProfile: !GetAtt EC2InstanceProfile.Arn
        SecurityGroups:
            - !Ref EC2SecurityGroup
        UserData:
            Fn::Base64: !Sub |
                # [...] removed for brevity, please check below

Launch Configuration UserData

UserData allows us to run commands on our instance at launch. When an unhealthy instance is terminated and a new one is created, the UserData commands run again so that the new cluster node is configured and brought back to a running state.

1. CloudWatch Agent

We want to monitor our nodes, so the CloudWatch Agent collects metrics and logs from our EC2 instances and sends them to CloudWatch. The CloudWatch configuration is created by the CloudFormation template and referenced in the commands below.

# install cloudwatch agent
yum -y install amazon-cloudwatch-agent
# auto configure cloudwatch agent
/opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -m ec2 -c ssm:${CloudWatchAgentConfig} -s

2. Download & Install Typesense Server

We'll run the service on T4g instances powered by Graviton2 processors running Amazon Linux. There's no package available for this setup, so we have to download the arm64 binary.

# download & unarchive typesense
curl -O${TypesenseVersion}/typesense-server-${TypesenseVersion}-linux-arm64.tar.gz
# make sure the target folder exists before extracting into it
mkdir -p /opt/typesense
tar -xzf typesense-server-${TypesenseVersion}-linux-arm64.tar.gz -C /opt/typesense
# remove archive
rm typesense-server-${TypesenseVersion}-linux-arm64.tar.gz

3. Configure the Typesense Node

We get the EC2 Instance Private IP, create the Typesense data & log folders and the nodes and server config file. Read more about Typesense server config.

# get instance private IP
EC2_INSTANCE_IP=$(curl -s http://instance-data/latest/meta-data/local-ipv4)

# create typesense folders
mkdir -p /opt/typesense/data
mkdir -p /opt/typesense/log

# create the nodes file
echo "$EC2_INSTANCE_IP:${TypesensePeeringPort}:${TypesenseApiPort}" > /opt/typesense/nodes

# create typesense server config file
echo "[server]
api-key = ${TypesenseApiKey.Value}
data-dir = /opt/typesense/data
log-dir = /opt/typesense/log
enable-cors = true
api-port = ${TypesenseApiPort}
peering-port = ${TypesensePeeringPort}
peering-address = $EC2_INSTANCE_IP
nodes = /opt/typesense/nodes" > /opt/typesense/typesense.ini

4. Create systemd service & enable the daemon

As we installed it from a binary we have to create a systemd service for the Typesense server. This will make the Typesense server service always available.

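# create typesense service
# ExecStart must live in the [Service] section; WantedBy lets systemctl enable start it at boot
echo "[Unit]
Description=Typesense service

[Service]
ExecStart=/opt/typesense/typesense-server --config=/opt/typesense/typesense.ini

[Install]
WantedBy=multi-user.target" > /etc/systemd/system/typesense.service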

# start typesense service
systemctl start typesense
# enable typesense daemon
systemctl enable typesense

5. Notify Cluster

We publish a message using AWS CLI on the SNS topic that a new Typesense node is available.

# notify sns topic of the new instance
aws sns publish --message "New Instance Ready (instanceId:$EC2_INSTANCE_IP)" --topic-arn ${EC2EventsTopic}  --region ${AWS::Region}

Cluster Events

Every Typesense node needs to be aware of the other cluster nodes in order to automatically replicate the data. When a node instance is terminated or a new node is launched we need to update the configuration on the rest of the nodes.

An SNS topic will receive notifications from the Auto Scaling Group in case of an EC2_INSTANCE_TERMINATE event and also from the AWS CLI (UserData) once the new node is running.

The SNS topic then triggers a Lambda function that sends a command through Systems Manager to all the healthy instances to update the nodes definition file.

Click here to view the cluster events listener Lambda code.

Systems Manager Run Command

AWS SSM helps us securely manage the configuration of our nodes and allows running a shell script on multiple instances at once. This facilitates updating the nodes definition of our Typesense service.

# SSM Document
    Type: AWS::SSM::Document
    Properties:
        Content:
            schemaVersion: "2.2"
            description: Update Typesense Nodes
            parameters:
                NodeList:
                    type: String
                    description: Nodes string
            mainSteps:
                - action: aws:runShellScript
                  name: updateTypesense
                  inputs:
                      runCommand:
                          # overwrite the nodes file, then restart the service
                          - echo "{{NodeList}}" > /opt/typesense/nodes
                          - sudo systemctl restart typesense

        DocumentFormat: YAML
        DocumentType: Command
        Name: !Sub ${AWS::StackName}-${EnvironmentType}-update-nodes
        TargetType: /AWS::EC2::Instance
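For illustration, the same document can be invoked by hand with the AWS CLI. This is a minimal sketch and not the Lambda used by the stack: the function names are hypothetical, and it assumes the caller has permission to run ssm:SendCommand and already knows the document name, the nodes string, and the target instance IDs.

```shell
# Build the JSON passed to `aws ssm send-command --parameters`.
# JSON is used because the nodes string contains commas, which the CLI
# shorthand syntax would split into a list. Assumes the nodes string holds
# no characters that need JSON escaping (only IPs, ports, colons, commas).
ssm_parameters_json() {
    printf '{"NodeList":["%s"]}' "$1"
}

# Push a new nodes string to the given instances through Run Command.
# Usage: update_cluster_nodes <document_name> <nodes_string> <instance_id> [...]
update_cluster_nodes() {
    local document=$1 nodes=$2
    shift 2
    aws ssm send-command \
        --document-name "$document" \
        --instance-ids "$@" \
        --parameters "$(ssm_parameters_json "$nodes")"
}
```

The NodeList key matches the parameter declared in the document, so the value ends up in /opt/typesense/nodes on every targeted instance.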

EC2 IAM Role & Security Groups

The template will create an Instance Profile, an IAM Role, and a Security Group for the EC2 instances.

The IAM Role will be assumed by the EC2 instance and contains policies that will allow only the necessary actions (principle of least privilege).

  • CloudWatchAgentServerPolicy - to read instance information and write it to CloudWatch Logs and Metrics

  • AmazonSSMManagedInstanceCore - to receive messages for Run Command and read parameters in Parameter Store (CloudWatch agent config)

  • sns:Publish - publish SNS message when a new node is ready

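    Type: AWS::IAM::Role
    Properties:
        RoleName: !Sub ${AWS::StackName}-${EnvironmentType}-instance-role
        AssumeRolePolicyDocument:
            Version: "2012-10-17"
            Statement:
                - Effect: Allow
                  # the role is assumed by the EC2 instances
                  Principal:
                      Service: ec2.amazonaws.com
                  Action:
                      - sts:AssumeRole
        Path: /
        ManagedPolicyArns:
            - arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy
            - arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore
        Policies:
            - PolicyName: !Sub ${AWS::StackName}-${EnvironmentType}-ec2-policy
              PolicyDocument:
                  Version: "2012-10-17"
                  Statement:
                      - Effect: Allow
                        Action:
                            - sns:Publish
                        Resource:
                            - !Ref EC2EventsTopic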

The Instance Profile allows passing the IAM role to the EC2 instance.

The Security Group permits the EC2 instances to:

  • receive traffic on Typesense API Port 8108 from the Load Balancer

  • communicate with the other VPC nodes on API Port 8108 and Peering Port 8107

    Type: AWS::EC2::SecurityGroup
    Properties:
        GroupDescription: allow connections from ALB and SSH, from other instances within VPC
        VpcId: !ImportValue typesense-vpc-VPCID
        SecurityGroupIngress:
            # API traffic from the load balancer
            - IpProtocol: tcp
              FromPort: 8108
              ToPort: 8108
              SourceSecurityGroupId: !GetAtt ALBSecurityGroup.GroupId
            # peering traffic between nodes in the VPC
            - IpProtocol: tcp
              FromPort: 8107
              ToPort: 8107
              CidrIp: !ImportValue typesense-vpc-VPCCidrBlock
            # API traffic between nodes in the VPC
            - IpProtocol: tcp
              FromPort: 8108
              ToPort: 8108
              CidrIp: !ImportValue typesense-vpc-VPCCidrBlock

Typesense API Key Generation

The CloudFormation stack will generate a Typesense admin API key using a CustomResource that references a RandomStringGenerator Lambda function.

The TypesenseApiKey is exported in the Outputs so it can be referenced in other stacks. Optionally, it can be stored in an AWS::SSM::Parameter to be used by other AWS services.

The admin API key provides full control over the Typesense API, so make sure you create Scoped API Keys to be used in your application.
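As a starting point for limiting what an application can do, the sketch below uses the admin key to create a search-only key through the Typesense keys endpoint; such a key can then serve as the parent for scoped keys. The function names, the description text, and the base URL passed by the caller are assumptions made for this example.

```shell
# JSON body for a key that is only allowed to search the given collection
# Usage: search_key_payload <description> <collection>
search_key_payload() {
    printf '{"description":"%s","actions":["documents:search"],"collections":["%s"]}' "$1" "$2"
}

# Create the key by calling POST /keys with the admin API key
# Usage: create_search_key <base_url> <admin_key> <collection>
create_search_key() {
    local base_url=$1 admin_key=$2 collection=$3
    curl -s "${base_url}/keys" -X POST \
        -H "X-TYPESENSE-API-KEY: ${admin_key}" \
        -H 'Content-Type: application/json' \
        -d "$(search_key_payload "Search-only key" "$collection")"
}
```

The response contains the generated key value, which is the one to ship to the application instead of the admin key.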

    Type: AWS::CloudFormation::CustomResource
    Properties:
        Length: 32
        ServiceToken: !GetAtt RandomStringGenerator.Arn

The random string generator Lambda function is deployed using inline code. This works for simple functions when the code length is up to 4096 chars.


To get started with the Typesense API once you have the service running, check the API Reference.

For educational purposes, this post and the included templates feature some common real-life implementations of the AWS Well-Architected Framework concepts such as high availability, elasticity, scalability, resiliency, security, principles of least privilege, etc.

Cover Photo by Samuel Sianipar on Unsplash