GraphQL APIs are powerful because they use resolvers to fetch limited subsets of data, often through multiple atomic operations. When building a public API in AWS using AWS AppSync, it is common to implement GraphQL resolvers as individual Lambda functions, each of which can read data from (or write data to) persistent stores. As your API grows, the serverless application can become complex, with many moving parts; that’s why it is important to build in comprehensive logging and tracing from day one, as part of your DevOps practices. nubeGo COO, Emma Button, shares her experiences implementing distributed tracing for AppSync APIs.
What is X-Ray and why do I need it?
AWS X-Ray is a serverless, cloud-based distributed tracing tool which provides you with insight into the path that a request takes as it travels through your application. Using X-Ray can help you debug errors within complex micro-service systems and it can help you to analyse and pinpoint performance bottlenecks. At nubeGo, we encourage our customers to enable and use AWS X-Ray for tracing of serverless applications. I’ll walk you through a recent AppSync implementation we have worked on…
How to enable X-Ray
You can enable X-Ray for the components of your AWS application in the console, or in your IaC (Infrastructure as Code). Our application is constructed as a SAM (Serverless Application Model) template, so some of the syntax is a little different from plain CloudFormation, but the concepts are the same.
For the AppSync API, it is as straightforward as setting the XrayEnabled property to true. This will cause AppSync to start tracing full API requests and their latencies.
MyAppSyncAPI:
  Type: AWS::AppSync::GraphQLApi
  Properties:
    XrayEnabled: true
It is also important to trace the Lambda functions, if you’re using them to implement resolvers. Enabling X-Ray for Lambda is almost as simple:
MyFunction:
  Type: AWS::Serverless::Function
  Properties:
    Tracing: Active
If you are managing your own Lambda function roles (rather than letting SAM auto-create them) then you’ll also need to add the following permissions to the function’s execution role:
- xray:PutTraceSegments
- xray:PutTelemetryRecords
- xray:GetSamplingRules
- xray:GetSamplingTargets
- xray:GetSamplingStatisticSummaries
Once your Lambda functions are tracing to X-Ray, there are some phenomenally powerful things you can do in your code to segment your traces by the elements that are important to you. That’s a topic for a whole new blog post, but right now, let’s focus on one element: the database.
Adding in SQL tracing for MySQL
In our example API, we write AppSync queries which result in the construction of complex SQL queries against an Aurora RDS database. For us, it is invaluable to be able to see and understand the SQL queries that each request ends up executing against the database, to understand how often they’re executed, and how long they take. We add X-Ray tracing of RDS operations by instrumenting the Lambda code.
Our Lambda functions are written in Node.js and use the mysql library for connecting to Aurora. We can use the aws-xray-sdk library to wrap the mysql library, ensuring that every single MySQL query or operation gets traced using X-Ray.
You’ll need to install the library for use by your Lambda function:
npm install --save aws-xray-sdk
And then change your import statements to wrap the mysql library:
const AWSXRay = require('aws-xray-sdk');
const mysql = AWSXRay.captureMySQL(require('mysql'));
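As a rough sketch of how the wrapped module might then be used inside a resolver, the helper below turns the callback-style `query` call into a Promise. The name `queryAsync`, the environment variable names, the table and the `event.arguments.id` lookup are all illustrative assumptions, not part of our actual codebase.

```javascript
// Hypothetical helper: wraps the callback-style query() of a mysql pool or
// connection in a Promise so that it can be awaited in a Lambda handler.
function queryAsync(db, sql, values = []) {
  return new Promise((resolve, reject) => {
    db.query(sql, values, (err, rows) => (err ? reject(err) : resolve(rows)));
  });
}

// Assumed usage with the wrapped mysql module (all names are placeholders):
// const pool = mysql.createPool({
//   host: process.env.DB_HOST,
//   user: process.env.DB_USER,
//   password: process.env.DB_PASSWORD,
//   database: process.env.DB_NAME,
// });
// exports.handler = async (event) =>
//   queryAsync(pool, 'SELECT id, name FROM customers WHERE id = ?', [event.arguments.id]);
```

Because the wrapping happens at require time, the handler code itself does not need to know that X-Ray is involved.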
Now you can use mysql just as you normally would. By default, X-Ray will only surface the DB connection details and the user that the operation is executing as. If you want to trace the actual SQL query executed, you will need to set the sanitized_query field on the query’s subsegment immediately after issuing each query (making sure to remove any potentially sensitive information from the SQL before it goes into the trace).
// Retrieve the most recently created subsegment in order to add tracing
const subs = AWSXRay.getSegment().subsegments;
if (subs && subs.length > 0) {
  const sqlSub = subs[subs.length - 1];
  // Only SQL subsegments carry a sql block; sql must already be free of sensitive values
  if (sqlSub.sql) {
    sqlSub.sql.sanitized_query = sql;
  }
}
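One way to strip sensitive values before they reach the trace is to mask every literal in the SQL text. The sketch below is a minimal, regex-based illustration; `sanitizeSql` and `attachSanitizedQuery` are hypothetical helpers (not part of aws-xray-sdk), and a regex will not cover every SQL dialect quirk, so treat it as a starting point rather than a guarantee.

```javascript
// Hypothetical helper: masks literal values in a SQL string with "?".
function sanitizeSql(sql) {
  return sql
    // Quoted string literals first, so digits inside them disappear with the string
    .replace(/'(?:[^'\\]|\\.|'')*'/g, '?')
    .replace(/"(?:[^"\\]|\\.|"")*"/g, '?')
    // Standalone numeric literals (digits inside identifiers such as table1 are left alone)
    .replace(/\b\d+(?:\.\d+)?\b/g, '?')
    // Collapse whitespace so multi-line queries read cleanly in the console
    .replace(/\s+/g, ' ')
    .trim();
}

// Hypothetical helper: records the masked SQL on the newest subsegment of the
// given segment, provided that subsegment carries a sql block.
// Returns true if the query was attached, false otherwise.
function attachSanitizedQuery(segment, sql) {
  const subs = segment && segment.subsegments;
  if (!subs || subs.length === 0) return false;
  const last = subs[subs.length - 1];
  if (!last.sql) return false;
  last.sql.sanitized_query = sanitizeSql(sql);
  return true;
}
```

In a handler this would be called as `attachSanitizedQuery(AWSXRay.getSegment(), sql)` straight after issuing the query, while the query's subsegment is still the most recent one.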
What information does X-Ray provide?
You can use the AWS Management Console to explore the tracing information that X-Ray captures. The first, and perhaps shiniest, thing you’ll gravitate to is the service map. This is an automatically generated diagram which shows the flow of data through your application during recent request executions. For applications with tens or hundreds of micro-services, this alone is a powerful tool for understanding the application ecosystem, and I can think of many scenarios in the past where a visualisation such as this would have benefited an application support team. You can annotate the service map diagram to highlight health or performance hotspots with different colours, picking out any hops on a request’s journey that are returning error codes or taking longer than expected. In this diagram you can see our CI unit tests and a sample application each calling a different AppSync query; each query in turn calls a Lambda function, and both functions query the database.
But for me, the most useful element of X-Ray is the Traces tab. Here you can dive into individual request traces and understand what components call what functions, how long they spend in each phase and view details of each hop of the trace. This is where most debugging happens, and it is where you can view the details of a trace segment such as the SQL that was executed during an RDS operation.
That’s Fab – Doesn’t It Cost a Lot to Store Tracing Data?
As with most AWS services, X-Ray provides a free tier, which gives you an allowance of 100k traces recorded a month and 1M traces retrieved or scanned a month. On a complex application, the free tier will very quickly be consumed, but prices are incredibly low even after that. As someone who has grappled with installing and managing an alternative OSS distributed tracing solution in the past, I think the cost to trace is a complete bargain!
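To get a feel for the arithmetic, the sketch below subtracts the free-tier allowances quoted above from a month's usage and prices the remainder. The function name is hypothetical, and the per-million prices are deliberately left as inputs: they are assumptions you should replace with the current figures from the AWS pricing page for your region.

```javascript
// Free-tier allowances as quoted in this post.
const FREE_TRACES_RECORDED = 100000;
const FREE_TRACES_RETRIEVED = 1000000;

// Hypothetical estimator. Prices are per million traces and must be supplied
// by the caller; nothing here is an official AWS price.
function estimateMonthlyTraceCost({
  recorded,
  retrieved,
  pricePerMillionRecorded,
  pricePerMillionRetrieved,
}) {
  const billableRecorded = Math.max(0, recorded - FREE_TRACES_RECORDED);
  const billableRetrieved = Math.max(0, retrieved - FREE_TRACES_RETRIEVED);
  return (
    (billableRecorded / 1e6) * pricePerMillionRecorded +
    (billableRetrieved / 1e6) * pricePerMillionRetrieved
  );
}
```

Usage that stays inside both allowances comes out at zero, and only the excess over each allowance is charged.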
That said, there are some neat ways that you can reduce the volume of trace data being stored to help optimise costs. X-Ray allows you to define sampling rules, which means that you don’t have to trace EVERYTHING and can choose to just take a sample to aid with ongoing monitoring. For us, SQL tracing is invaluable during development but needed far less often once you go into production, so at that point we define a sampling rule that limits tracing to the requests that matter most. Rules can be added, removed or reprioritised as your needs evolve.
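A sampling rule combines a reservoir (a number of matching requests per second that are always traced) with a fixed rate applied to everything beyond it. The sketch below is a simplified local model of that interaction, not the real implementation: X-Ray and its SDKs make the actual decision. The rule name, its values and `makeSampler` are illustrative assumptions.

```javascript
// Illustrative rule using the field names of an X-Ray sampling rule.
const productionRule = {
  RuleName: 'production-baseline', // hypothetical name
  Priority: 100,
  ReservoirSize: 1, // always trace up to this many matching requests per second
  FixedRate: 0.05, // then trace this fraction of further matching requests
  ServiceName: '*',
  ServiceType: '*',
  Host: '*',
  HTTPMethod: '*',
  URLPath: '*',
  ResourceARN: '*',
  Version: 1,
};

// Simplified single-process model of "reservoir first, then fixed rate".
// `now` returns milliseconds and is injectable so the model can be tested.
function makeSampler(rule, now = () => Date.now()) {
  let currentSecond = -1;
  let usedThisSecond = 0;
  return function shouldSample(random = Math.random()) {
    const second = Math.floor(now() / 1000);
    if (second !== currentSecond) {
      // A new one-second window starts, so the reservoir refills
      currentSecond = second;
      usedThisSecond = 0;
    }
    if (usedThisSecond < rule.ReservoirSize) {
      usedThisSecond += 1;
      return true;
    }
    return random < rule.FixedRate;
  };
}
```

With the values above, the first request in any second is always traced and roughly one in twenty of the rest; raising `FixedRate` in development and lowering it in production is the lever that controls trace volume.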
As specialist AWS Cloud engineers, nubeGo can help you evaluate application logging and tracing as part of your Cloud adoption programme. Find out more about our application modernisation and migration programme, SWIFT, and our DevOps capabilities at www.nubego.io/swift