Pub/Sub with AWS SNS, SQS and Laravel
Pub/Sub is a great mechanism to establish microservice communication. By focusing on defining good, reliable topics of interest, microservices can come and go without disrupting the entire ecosystem. AWS offer two great services that help build a highly scalable, reliable and recoverable pub/sub system: Simple Notification Service and Simple Queue Service.
Introduction to microservice communication
When service A communicate with B via direct call, we can infer that there is a dependency in A coupled to B. If B is temporarily unavailable or under heavy load, it could disrupt the stability of A.
Writing coupled microservices usually means dealing with the worst of both worlds. If instead the system was built like a monolith, it would still be coupled to each other, but without the problems of a distributed system. On a monolith, a component is never unavailable or unreachable and the communication is just one function call or object instantiation away. By splitting two services apart, they become independent of each other at the cost of creating a distributed system.
With a Publisher / Subscriber system, there’s the possibility of system B to track (or subscribe) to events that happen in A without a direct coupling. The Pub/Sub system becomes a broker and serves as a contract between both parties. If system A is starting to decay, it can be fully replaced by a newer, smarter, better version without affecting B if the messages that A publishes can still be respected or kept backward compatible.
AWS Simple Notification Service is where publishers will notify about events that happens. That notification happens on a topic. As of this moment, SNS can notify subscribers by http, https, email, sms, sqs and lambda. with the HTTP(S) protocol, two services could communicate with each other via SNS without the need of SQS at all. So why bring SQS to the mix?
When using HTTP(S) as the subscription protocol for SNS, AWS will offer retries in case the subscriber cannot be reached or responds with an error. However, there’s quite some limitation on what can be achieved with this model. Particularly attention if the subscriber has a bug and cannot process the notification at all. There could be a lot of notifications lost until a patch is released. However, when subscribing an SQS to an SNS topic, AWS will guarantee the delivery to SQS. Perhaps if the worker/consumer of SQS has a bug, it will fail to process the message, but the message can stay in the queue up to 14 days and can still be delivered to a Dead Letter Queue. This mechanism offers a much safer option to never lose messages. Some microservices work with the premise that messages can always be replayed, but what if some subscribers already processed those events? Of course, replaying systems are a valid option so long the subscribers can be idempotent. For this article, I will focus primarily on avoiding message loss instead.
Laravel has a Queue system out of the box and offers a native driver that can communicate with Amazon SQS. When pushing a message to SQS, Laravel will store enough information inside the message that allows the worker to understand and process the Job (message) correctly. However, when fetching a message from SQS that was inserted by SNS, the message will lack the job attribute which is used by the Worker system to know which Job to process. If only two systems are communicating through SNS, one could argue that the publisher system could specify the job key so the Laravel subscriber would know what to do. Unfortunately that is a strong violation of the Pub / Sub system because the publisher would be required to know an implementation detail of the subscriber. It also doesn’t work with two or more subscribers, unless we want all our subscribers to have the same Job namespace to process a topic event.
To get around this, one interesting option is to tie the SNS Topic ARN to a specific Job class. Whenever a new message from topic user-authenticated is sent to SQS, the Laravel worker system should use class ProcessUserAuthenticatedJob. To achieve this, it’s necessary to create a custom Laravel Queue driver. I decided to call this custom driver sns to give the sense of working messages from AWS SNS (through SQS).
In a Service Provider we can hook an afterResolving event on Laravel’s container. That way whenever the QueueManager class is resolved out of the container for the first time we can make sure to add our custom driver to it.
The JobMap class is responsible for mapping a Topic ARN to a Job Handler.
The callback given to the QueueManager with the name of sns should return an implementation of IlluminateQueueConnectorsConnectorInterface. In here I decided to extend from SqsConnector in order to take advantage of the getDefaultConfiguration() method already provided by Laravel Worker system. There’s no actual need for this inheritance except to borrow that method.
As per Laravel’s interface, a Queue Connector should return a Queue object. Since we wish to work messages from Sqs, it is a lot easier to just inherit from Laravel’s SqsQueue object and override the pop method.
Everything else to make the Queue object work already come out of the box from the SqsQueue class. Note that we’ve loaded the queue.map from the Service Provider, sent it to the SnsConnector, SnsQueue and now we’re injecting it into the SnsJob.
Note: we could have easily used a Service Locator / Facade strategy to automatically resolve the mapping from inside the job itself and personally I don’t have any problems with that. However, if you’re working on a diverse team that dislike Facade, the explicitly dependency injection allows the Job class to not be dependent on / coupled to an external resource.
Lastly, with the SnsJob class we can extend from SqsJob provided by Laravel to default to the normal workflow of working messages from Sqs while overriding the payload method to inject the job attribute. That’s where the queue.map configuration comes in play. Whenever a specific Sns Topic injects a message to the Queue, we can instruct Laravel which job to run.
To make sure everything works as intended, we can leverage localstack with an SQS container locally. Configuring Localstack is beyond the scope of this article and will not be covered. The test I would like to run looks a bit like the following class.
The setUp method is pushing a message to localstack SQS container. The test blueprint can be translated as:
– Arrange: Add SNS topic to the queue.map on Laravel’s config system
– Act: Invoke Laravel’s Queue Worker system.
– Assert: Check that the message was correctly obtained.
The content of message.json can be seen below.
The last missing piece is the ProvidesSqs trait that helps setup the localstack integration for as many test class as we need.
When sendMessageToSqs is invoked, it will:
- Make an AWS Sqs Client.
- Create a new Sqs on localstack.
- Purge the Queue in case there’s messages from previous tests.
- Send a message to the Queue
- Configure Laravel Queue system to use sns driver by default.
- Configure Laravel Queue system to work messages from this queue.
Compiling everything together, we have:
- An external system that publishes messages to AWS SNS.
- A Queue that subscribes to the SNS Topic
- A Laravel project that is interested in working those messages on the background
- A Service Provider that creates a new sns driver for Laravel Worker
- A SnsConnector, SnsQueue and SnsJob that assembles the worker system
- A ProvidesSqs trait that uses localstack to write tests.
One caveat to pay attention: You cannot use the sns driver to push messages to the queue. It is a read-only process that will rely on the TopicARN key to figure out what job to run.
Hope you enjoy the article and the technical details. If you have any questions find me on Twitter at DeleuGyN.