Automatic monitoring of AWS Lambda functions for an Alexa Skill

If you’re not testing your services in production yourself, then you’re letting your customers test them for you!

Automatic testing of your system means you can catch issues before your customers see them. This doesn’t mean you can skip unit testing or functional testing. It simply means you have an extra layer to notify you of problems before they affect users.

What to test

The Alexa Skill that I want to test is implemented as an AWS Lambda Function. The Lambda Function will be the focus of the automated testing, since that’s where my code lives and where a problem is most likely to occur. I’m not testing anything to do with voice commands, Echo devices, or the Alexa Skill backend. I’m treating those as a black box. After all, I have no way of fixing or debugging issues inside those systems anyway.

When I’m monitoring the system, I want to know the following:

  • Is the Lambda function online and working?
  • Are its dependencies functioning correctly?
  • How long does it take to execute (latency)?

To test this, I’ve decided to take the simple approach: send an automated request to my Lambda Function that synthesises a customer request, and then use my monitoring systems to identify problems.

This works well for my use case for two reasons. I have already implemented monitoring and logging for the Lambda Function, so I have pretty good metrics. And most users interact with my skill on a weekly basis, so a frequent automated request is likely to highlight problems before users find them.

How to test it

My Lambda function exposes a single handler function in Node.js, so to test different behaviours you have to modify the values sent in the request payload (rather than calling a different API for each). There are a couple of different options I could use to test my function, but for simplicity and cost it’s easiest to use an AWS CloudWatch Event to automatically trigger my Lambda Function at a set interval and check everything is OK.

Setting up the CloudWatch event is split into two parts:

  1. Modifying the Lambda Function to allow CloudWatch to invoke it
  2. Setting up the Event itself to call the function at a set interval

Allowing the function to be invoked is pretty simple. You modify your Lambda Function and add a new trigger for “CloudWatch Events” from the menu on the left. This means your function can be invoked either from an Alexa Skill or from CloudWatch.
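The console adds this permission for you when you attach the trigger. If you would rather script it, the equivalent is a resource-based policy statement on the function. Here is a minimal sketch of the parameters for Lambda’s AddPermission API. The helper name `buildInvokePermission`, the statement ID, and the example function name and ARN are all placeholders I made up for illustration, not values from my setup.

```javascript
// Builds the parameter object for the Lambda AddPermission API so that a
// CloudWatch Events rule is allowed to invoke the function.
// Every name and ARN here is a placeholder for illustration only.
function buildInvokePermission(functionName, ruleArn, qualifier) {
  const params = {
    FunctionName: functionName,
    StatementId: 'allow-cloudwatch-health-check', // hypothetical statement ID
    Action: 'lambda:InvokeFunction',
    Principal: 'events.amazonaws.com', // the CloudWatch Events service principal
    SourceArn: ruleArn,                // restrict the permission to this one rule
  };
  // Qualifier is optional: it pins the permission to one version or alias.
  if (qualifier) {
    params.Qualifier = qualifier;
  }
  return params;
}

// Example: print the parameters you would pass to AddPermission.
const examplePermission = buildInvokePermission(
  'my-alexa-skill',                                          // placeholder
  'arn:aws:events:eu-west-1:123456789012:rule/health-check', // placeholder
  'live'                                                     // placeholder alias
);
console.log(JSON.stringify(examplePermission, null, 2));
```

Restricting `SourceArn` to the single rule keeps other rules in the account from invoking the function.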

You can then go into CloudWatch and add a new event. I’ve opted to trigger this event on a fixed schedule, every 5 minutes. Set the target to a Lambda function, select the function name from the list, and select the version of the function you want to target. I always select the same version that is live for users, to make sure I’m testing what users are experiencing.
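If you prefer to script the schedule rather than click through the console, the same setup maps onto two CloudWatch Events API calls, PutRule and PutTargets. The sketch below only builds the parameter objects for those calls. The helper names, the rule name, the target ID, and the ARN are assumptions of mine for illustration.

```javascript
// Builds PutRule parameters for a fixed-rate schedule.
// rate() expressions use the singular unit for 1 and the plural otherwise.
function buildScheduleRule(ruleName, minutes) {
  if (!Number.isInteger(minutes) || minutes < 1) {
    throw new RangeError('minutes must be a positive integer');
  }
  const unit = minutes === 1 ? 'minute' : 'minutes';
  return {
    Name: ruleName,
    ScheduleExpression: `rate(${minutes} ${unit})`,
    State: 'ENABLED',
  };
}

// Builds PutTargets parameters that point the rule at a Lambda function and
// send a constant JSON payload as the event input.
// To target a specific version or alias, pass the qualified function ARN.
function buildRuleTarget(ruleName, functionArn, payload) {
  return {
    Rule: ruleName,
    Targets: [
      {
        Id: 'health-check-target', // hypothetical target ID
        Arn: functionArn,
        Input: JSON.stringify(payload), // Input must be a JSON string
      },
    ],
  };
}

// Example with placeholder values.
const exampleRule = buildScheduleRule('alexa-skill-health-check', 5);
const exampleTarget = buildRuleTarget(
  exampleRule.Name,
  'arn:aws:lambda:eu-west-1:123456789012:function:my-alexa-skill:live',
  { version: '1.0' } // stand-in for the full health check payload
);
console.log(exampleRule.ScheduleExpression);
console.log(JSON.stringify(exampleTarget, null, 2));
```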

For the input of the event, I used a fixed JSON payload which specifies the intent name that I want to trigger.

{   
  "session": {     
    "new": true,     
    "sessionId": "SessionId.253d1fe4-9af0-45de-b767-ccfea6f0e3d4",     
    "application": {       
      "applicationId": "<YOUR SKILL ID HERE>"     
    },     
    "attributes": {},     
    "user": {      
      "userId": "HEALTH-CHECK-USER"     
    }   
  },   
  "request": {     
    "type": "IntentRequest",     
    "requestId": "EdwRequestId.cb85ea9c-1e57-4226-83b6-f1a1d9e2eb8a",     
    "intent": {      
      "name": "PlayLatestSermon",       
      "slots": {}     
    },     
    "locale": "en-GB",     
    "timestamp": "2018-01-14T14:31:58Z"   
  },   
  "context": {     
    "AudioPlayer": {       
      "playerActivity": "IDLE"     
    },     
    "System": {       
      "application": {         
        "applicationId": "<YOUR SKILL ID HERE>"       
      },        
      "user": {         
        "userId": "HEALTH-CHECK-USER"       
      },       
      "device": {         
        "supportedInterfaces": {}       
      }     
    }   
  },   
  "version": "1.0" 
}
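If you want to health-check more than one intent, typing this JSON out by hand gets tedious. As a rough sketch, a small helper can generate the payload for you to paste into the event input. The function name `buildHealthCheckEvent` is my own invention, and the session and request IDs are simply the ones from the sample above.

```javascript
// Generates a health check payload in the same shape as the sample above.
// skillId and intentName are supplied by the caller; everything else is fixed.
function buildHealthCheckEvent(skillId, intentName, now = new Date()) {
  const userId = 'HEALTH-CHECK-USER';
  // toISOString() includes milliseconds; drop them to match the sample format.
  const timestamp = now.toISOString().replace(/\.\d{3}Z$/, 'Z');
  return {
    session: {
      new: true,
      sessionId: 'SessionId.253d1fe4-9af0-45de-b767-ccfea6f0e3d4',
      application: { applicationId: skillId },
      attributes: {},
      user: { userId },
    },
    request: {
      type: 'IntentRequest',
      requestId: 'EdwRequestId.cb85ea9c-1e57-4226-83b6-f1a1d9e2eb8a',
      intent: { name: intentName, slots: {} },
      locale: 'en-GB',
      timestamp,
    },
    context: {
      AudioPlayer: { playerActivity: 'IDLE' },
      System: {
        application: { applicationId: skillId },
        user: { userId },
        device: { supportedInterfaces: {} },
      },
    },
    version: '1.0',
  };
}

// Example: print a payload ready to paste into the event's constant JSON input.
const examplePayload = buildHealthCheckEvent('<YOUR SKILL ID HERE>', 'PlayLatestSermon');
console.log(JSON.stringify(examplePayload, null, 2));
```

Because the scheduled event sends a constant payload, the timestamp is fixed at the moment you generate the JSON, just as it is in the hand-written sample.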

Once that’s in place, the specified intent should be triggered every 5 minutes!

Knowing when things go wrong

I’ve already written about monitoring Alexa Skills and creating a dashboard, but I’d rather not have to keep checking graphs to know when something is going wrong. Instead, you can use CloudWatch alarms to check the metrics for you and send an email when a configured threshold is breached.

When you add a CloudWatch event, it automatically records metrics for each configured event, including successful and failed invocations. This is perfect for knowing when the health check has failed. I followed the configuration options for the alarm and set it to trigger if there are three or more failed invocations in a 15-minute period. I’ve also configured some other alarms on errors, plus some capacity alarms. One thing I find helpful is to send a notification when the status changes to ALARM (when it goes wrong) and also when it returns to OK (when it recovers). That way, if a blip triggers the alarm, you’ll also get a follow-up telling you it has recovered.
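For anyone scripting the alarm instead of using the console, the settings above map roughly onto the CloudWatch PutMetricAlarm API. The sketch below just builds the parameter object. The helper name, alarm name, and topic ARN are placeholders, and it assumes the email is delivered through an SNS topic that you have already subscribed to. As far as I understand it, the FailedInvocations metric counts cases where the rule could not invoke the target, while errors thrown inside the function show up under the Lambda Errors metric, which is why a separate alarm on errors is worth having too.

```javascript
// Builds PutMetricAlarm parameters: alarm when a rule has three or more
// failed invocations within a single 15-minute period.
// ruleName and topicArn are placeholders supplied by the caller.
function buildFailedInvocationAlarm(ruleName, topicArn) {
  return {
    AlarmName: `${ruleName}-failed-invocations`, // hypothetical naming scheme
    Namespace: 'AWS/Events',
    MetricName: 'FailedInvocations',
    Dimensions: [{ Name: 'RuleName', Value: ruleName }],
    Statistic: 'Sum',
    Period: 900,          // seconds, i.e. one 15-minute window
    EvaluationPeriods: 1, // evaluate that single window
    Threshold: 3,
    ComparisonOperator: 'GreaterThanOrEqualToThreshold',
    // Assumption: treat periods with no data points as healthy.
    TreatMissingData: 'notBreaching',
    AlarmActions: [topicArn], // notify when the state becomes ALARM
    OKActions: [topicArn],    // notify again when it returns to OK
  };
}

// Example with placeholder values.
const exampleAlarm = buildFailedInvocationAlarm(
  'alexa-skill-health-check',
  'arn:aws:sns:eu-west-1:123456789012:skill-alerts'
);
console.log(JSON.stringify(exampleAlarm, null, 2));
```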

The beauty of this approach is that you get automatic traffic testing your code at whatever interval you pick, and you get notified when something starts to misbehave, so you can catch it before your users do. That helps ensure a reliable system and a better experience for your users.