Summary
Challenge: Our client’s AWS infrastructure supported multiple customer communications platforms, but continually accumulating technical debt threatened to slow development and drive up costs if left unmanaged.
Solution: Automated technical debt and cost reduction using a series of AWS Lambda scripts. Removed obsolete AMI images and snapshots, archived CloudWatch logs automatically, and decommissioned unused Cloud9 instances.
Outcome: Reduced AWS costs by tens of thousands of dollars per month, eliminated outdated infrastructure, cut redeployment time by 99.8%, decreased maximum Lambda runtime by 85%, and reduced task frequency by 96%.
Challenge
Our customer, a major global mobile telecoms provider, uses Amazon Web Services (AWS) to host the infrastructure for a number of communications software platforms that it offers to its business customers.
Before we began working with our client, the organisation found that developing and maintaining their applications generated a constant level of new technical debt, as machine images became obsolete and old infrastructure was no longer needed.
If left uncontrolled, the technical debt could slow down the development of new functionality and incur substantial AWS costs.
We were brought in to tackle this challenge. Our goal was to reduce technical debt and costs in an automated way, without removing anything that might still be needed.
Solution
Automation Consultants developed a series of Lambda scripts to meet our client’s requirements.
Removal of obsolete AMI images and snapshots
- AMI images were building up because they were copied frequently in the course of building and testing the company’s software. Our client also received security-hardened AMIs from a supplier, which were regularly updated to contain the latest security updates and which rendered older AMIs obsolete.
- To solve this, we automated the deregistration of older AMIs with a Lambda script, which is initiated by passing it a JSON parameter file through the AWS SQS service. The script takes various parameters, such as the name or part name of the images to be deregistered and the minimum age an image must reach before it will be deregistered. The script can be run in trial mode (in which case it produces a list of images that would be deregistered) or in live mode (in which case the images are actually deregistered).
- Once the desired AMI images have been deregistered, the associated snapshots can be deleted. We created another script to perform this function. The script deletes all snapshots older than a specified retention age and can also be run in trial mode.
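The AMI selection step described above can be sketched roughly as follows. This is a minimal illustration, not the script delivered to the client: the parameter names `name_part`, `min_age_days` and `mode` are hypothetical, and we assume the Lambda is triggered by SQS with the JSON parameter file as the message body.

```python
import json
from datetime import datetime, timedelta, timezone


def select_images(images, name_part, min_age_days, now=None):
    """Return the images whose name contains name_part and that are older than min_age_days."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=min_age_days)
    selected = []
    for image in images:
        # EC2 reports CreationDate as an ISO 8601 string such as 2023-01-15T10:30:00.000Z
        created = datetime.fromisoformat(image["CreationDate"].replace("Z", "+00:00"))
        if name_part in image.get("Name", "") and created < cutoff:
            selected.append(image)
    return selected


def handler(event, context):
    """Lambda entry point for an SQS trigger; each record body is a JSON parameter file."""
    import boto3  # imported lazily so select_images can be exercised without AWS access

    ec2 = boto3.client("ec2")
    results = []
    for record in event["Records"]:
        params = json.loads(record["body"])  # hypothetical keys: name_part, min_age_days, mode
        images = ec2.describe_images(Owners=["self"])["Images"]
        targets = select_images(images, params["name_part"], params["min_age_days"])
        # Trial mode (the assumed default) only reports; live mode actually deregisters.
        if params.get("mode", "trial") == "live":
            for image in targets:
                ec2.deregister_image(ImageId=image["ImageId"])
        results.append([image["ImageId"] for image in targets])
    return results
```

Defaulting to trial mode in this sketch means a malformed or incomplete parameter file produces a report rather than a deletion, which fits the goal of not removing anything that might still be needed.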
Automated archiving of CloudWatch logs
- The customer required certain CloudWatch logs to be archived for an extended period. We achieved this with a Lambda function which copies the relevant logs to S3, from where they are transferred to Glacier Deep Archive.
- The Lambda function is triggered by a ‘cron’ expression set to run every six hours. It takes only logs produced by certain functionality and ignores any others. Other logs are automatically deleted from CloudWatch after a certain time. The Lambda function creates a CloudWatch export task to a parameterised S3 bucket. To avoid duplication, it keeps track of the timestamp of the most recent log to have been copied to S3, and will not copy any logs older than that timestamp.
- Once the logs have been copied into S3, they are transferred automatically to Glacier Deep Archive by means of settings in the S3 bucket.
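The export step above might look roughly like this. It is a sketch under stated assumptions: the log group prefixes, the function names, the lifecycle rule ID and the one-day transition delay are all hypothetical, and how the last-exported timestamp is persisted between runs is left to the caller because the case study does not specify it.

```python
import time

# Hypothetical S3 lifecycle rule that moves exported objects to Glacier Deep Archive.
# The one-day delay and rule ID are assumptions; it would be applied with
# s3.put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=LIFECYCLE).
LIFECYCLE = {
    "Rules": [
        {
            "ID": "archive-exported-logs",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},
            "Transitions": [{"Days": 1, "StorageClass": "DEEP_ARCHIVE"}],
        }
    ]
}


def select_log_groups(names, prefixes):
    """Keep only the log groups produced by the functionality we want to archive."""
    return [name for name in names if name.startswith(tuple(prefixes))]


def export_window(last_exported_ms, now_ms):
    """Return the (from, to) range in epoch milliseconds for the next export, or None if nothing is new."""
    start = last_exported_ms + 1  # start just after the last exported event to avoid duplicates
    if start > now_ms:
        return None
    return start, now_ms


def export_logs(log_group, bucket, last_exported_ms):
    """Create one export task and return the new high-water mark for the caller to store."""
    import boto3  # imported lazily so the pure helpers above can be tested without AWS access

    now_ms = int(time.time() * 1000)
    window = export_window(last_exported_ms, now_ms)
    if window is None:
        return last_exported_ms
    boto3.client("logs").create_export_task(
        taskName=f"archive-{now_ms}",
        logGroupName=log_group,
        fromTime=window[0],
        to=window[1],
        destination=bucket,
        destinationPrefix=log_group.strip("/"),
    )
    return window[1]
```

A six-hourly schedule could be expressed in EventBridge as `cron(0 0/6 * * ? *)`. Because CloudWatch Logs limits how many export tasks can be active at once, a real implementation exporting several log groups would probably need to wait for each task to finish before starting the next.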
Removal of disused Cloud9 instances
- Because of our client’s continuous development and maintenance work on its software products, large numbers of Cloud9 IDE instances were created and eventually abandoned, which became a challenge.
- We developed a Lambda function to remove disused Cloud9 instances. The solution uses Lambda, EventBridge rules and SNS. The Lambda function works only on instances containing a certain string reserved for Cloud9, which identifies them as Cloud9 instances. It then determines whether the instance is stopped, and if so for how long. Any instance that has been stopped for more than 90 days is deleted. Any instance that has been stopped for more than 30 days but less than 90 is put on a notification list. After the script has run, a list of the deleted instances and the notification list are sent by SNS to the relevant administrators.
- The script is run every 15 days using EventBridge rules.
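The 30-day and 90-day rules above can be illustrated with a short sketch. Several details here are our assumptions rather than a description of the delivered function: that the reserved string appears in the instance's `Name` tag, that the stop time is read from the date EC2 places in `StateTransitionReason`, and that an instance stopped for exactly 90 days goes on the notification list.

```python
import re
from datetime import datetime, timedelta, timezone

# EC2 typically records a stop as e.g. "User initiated (2024-01-01 00:00:00 GMT)".
_STOP_TIME = re.compile(r"\((\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) GMT\)")


def stopped_since(state_transition_reason):
    """Parse the stop time out of StateTransitionReason, or return None if it is absent."""
    match = _STOP_TIME.search(state_transition_reason or "")
    if not match:
        return None
    parsed = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc)


def classify(instances, marker, now, delete_after_days=90, notify_after_days=30):
    """Split stopped Cloud9 instances into (ids to delete, ids to notify about)."""
    to_delete, to_notify = [], []
    for inst in instances:
        name = next((t["Value"] for t in inst.get("Tags", []) if t["Key"] == "Name"), "")
        if marker not in name or inst["State"]["Name"] != "stopped":
            continue  # not a Cloud9 instance, or still in use
        since = stopped_since(inst.get("StateTransitionReason"))
        if since is None:
            continue  # stop time unknown, so leave the instance alone
        age = now - since
        if age > timedelta(days=delete_after_days):
            to_delete.append(inst["InstanceId"])
        elif age > timedelta(days=notify_after_days):
            to_notify.append(inst["InstanceId"])
    return to_delete, to_notify


def format_report(to_delete, to_notify):
    """Build the plain-text message body that would be published to the SNS topic."""
    return (
        "Deleted instances: " + (", ".join(to_delete) or "none") + "\n"
        "Stopped for more than 30 days: " + (", ".join(to_notify) or "none")
    )
```

In a Lambda handler, the instance records would come from the EC2 `describe_instances` call, the report would be sent with the SNS `publish` call, and the 15-day schedule could be expressed in EventBridge as `rate(15 days)`. Skipping instances whose stop time cannot be determined keeps the sketch on the cautious side.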
Benefits
This AWS optimisation project, delivered by our team here at Automation Consultants, resulted in significant benefits for the global mobile telecoms provider.
These included:
Cost savings: The solutions implemented by Automation Consultants have reduced AWS costs by tens of thousands of dollars per month.
Reduced technical debt: Our solutions also reduced technical debt by removing old AMIs and infrastructure, which could otherwise clutter the lists of active infrastructure and create confusion about which infrastructure and images are the correct ones to use.
Time savings and increased efficiencies: Our work resulted in a 99.8% reduction in redeployment time (from 3 days to 10 minutes), a 96% reduction in task frequency (from daily to quarterly), and an 85% decrease in maximum Lambda runtime, ensuring faster execution.




