What you should consider when storing datasets in S3

As an Amazon Web Services (AWS) developer, I am often asked what the best way to organise datasets in S3 is. A dataset could comprise data exported by business systems, or data emitted by AWS services, such as CloudFront or CloudTrail logs. Far too often I have seen datasets just dumped into one massive S3 bucket and left for someone else to tidy up later. However, with a little consideration, and empathy for those dealing with this in the future, we can do better. ...

Avoid accidental exposure of authenticated Amazon API Gateway resources

I have been working with Amazon API Gateway for a while, and one thing I have noticed is that it offers several options for authentication, which can confuse developers and lead to security issues. This post covers one of the common security pitfalls with API Gateway and how to mitigate it. If you're using AWS_IAM authentication on an API Gateway, make sure you set the default authorizer for all API resources. This avoids accidentally exposing an API if you misconfigure, or omit, an authentication method for an API resource, as the default is None. ...
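As a rough illustration of the pitfall, the sketch below scans API resources for methods whose authorization type is still NONE. It works on plain dictionaries shaped like the resource items API Gateway returns when method details are embedded; that shape, the function name `find_unauthenticated_methods`, and the sample data are assumptions for illustration and are not taken from the post.

```python
from typing import Dict, List, Tuple


def find_unauthenticated_methods(resources: List[Dict]) -> List[Tuple[str, str]]:
    """Return sorted (path, http_method) pairs with no authentication configured."""
    exposed = []
    for resource in resources:
        path = resource.get("path", "?")
        for http_method, method in resource.get("resourceMethods", {}).items():
            # CORS preflight requests are commonly left unauthenticated on purpose.
            if http_method == "OPTIONS":
                continue
            # Assumption: a missing authorizationType is treated as NONE,
            # which errs on the side of flagging too much rather than too little.
            if method.get("authorizationType", "NONE") == "NONE":
                exposed.append((path, http_method))
    return sorted(exposed)


# Hypothetical sample data, not a real API.
SAMPLE = [
    {
        "path": "/orders",
        "resourceMethods": {
            "GET": {"authorizationType": "AWS_IAM"},
            "POST": {"authorizationType": "NONE"},
            "OPTIONS": {"authorizationType": "NONE"},
        },
    },
    {"path": "/health", "resourceMethods": {"GET": {}}},
]

if __name__ == "__main__":
    for path, http_method in find_unauthenticated_methods(SAMPLE):
        print(f"{http_method} {path} has no authentication")
```

A check like this could run in a deployment pipeline as a safety net, although setting a default authorizer remains the more direct fix.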

RIP AWS Go Lambda Runtime

Amazon Web Services (AWS) is deprecating the go1.x runtime on Lambda; this is currently scheduled for December 31, 2023. Customers need to migrate their Go-based Lambda functions to the provided.al2 runtime, which uses Amazon Linux 2 as the execution environment. I think this is a bad thing for a couple of reasons. First, there is no automated migration path from existing Go Lambda functions to the new custom runtime. Customers will need to manually refactor and migrate each function to this new runtime, which is time-consuming and error-prone. Second, this will remove the go1.x name from the Lambda console; Go will now just be another “custom” runtime instead of a first-class supported language. This makes Go development on Lambda seem less official and less supported compared to other languages like Node, Python and Java. Case in point, try searching for “provided.al2 lambda” on Google and see how little documentation comes up compared to “go1.x lambda”. The migration essentially removes the branding and discoverability of Go as a Lambda language. I am sure this will improve over time, but it is still ambiguous. ...

Stop using IAM User Credentials with Terraform Cloud

I recently started using Terraform Cloud and discovered that the getting started tutorial, which describes how to integrate it with Amazon Web Services (AWS), suggests using IAM user credentials. This is not ideal, as these credentials are long-lived and can lead to security issues. What is the problem with IAM user credentials? They are long-lived, meaning once compromised they allow access for a long time. They are static, so if leaked it is difficult to revoke access immediately. But there are better alternatives. The one I recommend is OpenID Connect (OIDC), which, if you dig deep into the Terraform Cloud docs, is a supported approach. This has a few benefits: ...
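To make the OIDC alternative a little more concrete, here is a minimal sketch of the trust policy an IAM role might carry so that Terraform Cloud runs can assume it without stored keys. The function name `tfc_trust_policy` and the account, organisation, project and workspace values are hypothetical. The audience and subject formats reflect my understanding of Terraform Cloud's defaults and should be checked against the current documentation before use.

```python
import json

# Assumed defaults for Terraform Cloud's hosted service.
TFC_HOSTNAME = "app.terraform.io"
TFC_AUDIENCE = "aws.workload.identity"


def tfc_trust_policy(account_id: str, organization: str, project: str, workspace: str) -> dict:
    """Build an IAM trust policy limiting role assumption to one workspace."""
    # The trailing wildcard allows both plan and apply run phases.
    subject = (
        f"organization:{organization}:project:{project}"
        f":workspace:{workspace}:run_phase:*"
    )
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{TFC_HOSTNAME}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {f"{TFC_HOSTNAME}:aud": TFC_AUDIENCE},
                    "StringLike": {f"{TFC_HOSTNAME}:sub": subject},
                },
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical identifiers, replace with your own.
    policy = tfc_trust_policy("123456789012", "my-org", "my-project", "my-workspace")
    print(json.dumps(policy, indent=2))
```

Scoping the subject claim to a single workspace keeps the blast radius small: a token issued to any other workspace will not satisfy the condition.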

Automated Cloud Security Remediation

Recently I have been looking into automated security remediation to understand its impacts, both positive and negative. As I am a user of AWS, as well as other cloud services, I was particularly interested in how it helps maintain security in these environments. As with anything, it is good to understand what problem it is trying to solve and why it exists in the first place. So firstly, what does automated security remediation for a cloud service do? It is software which detects threats, or more specifically misconfigurations of services, and automatically remediates them. ...
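The detect-then-remediate loop described above can be sketched in a few lines. This is a toy model under stated assumptions: resources are plain in-memory dictionaries, and the `Rule` class, the `public_access_blocked` field and the rule name are all hypothetical. A real tool would call the cloud provider's APIs instead.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """A misconfiguration check paired with the action that fixes it."""
    name: str
    is_violated: Callable[[Dict], bool]
    remediate: Callable[[Dict], None]


def _block_public_access(bucket: Dict) -> None:
    # Stand-in for an API call that would enable a public access block.
    bucket["public_access_blocked"] = True


RULES = [
    Rule(
        name="s3-public-access-block",
        is_violated=lambda bucket: not bucket.get("public_access_blocked", False),
        remediate=_block_public_access,
    )
]


def remediate_all(resources: List[Dict], rules: List[Rule], dry_run: bool = False) -> List[str]:
    """Apply every rule to every resource and return the list of findings."""
    findings = []
    for resource in resources:
        for rule in rules:
            if rule.is_violated(resource):
                findings.append(f"{rule.name}:{resource['name']}")
                # In dry-run mode we only report, which makes the impact reviewable.
                if not dry_run:
                    rule.remediate(resource)
    return findings


if __name__ == "__main__":
    buckets = [{"name": "logs", "public_access_blocked": True}, {"name": "exports"}]
    print(remediate_all(buckets, RULES, dry_run=True))
```

The `dry_run` flag is there because automatic changes can be as disruptive as the misconfiguration itself, so being able to preview them first is useful.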

My Development Environment

I was inspired by others to document the tools I use working as a software developer professionally, and hacking on side projects outside of work. One thing to note is that in my day job I work on an Apple Mac, but my personal machine is a Linux laptop running PopOS. I find using Linux as a desktop works well, as most software I use is web based or supported on Linux. I also use it for IoT development, as pretty much all the tool chains I use support it. ...

Diving into AWS Billing Data

Billing is an integral part of day-to-day AWS account operation, and to most it seems like a chore. However, there is a lot to be learnt from interacting with AWS Billing data. So why would you ever want to dive into AWS Billing data in the first place? It is pretty easy for both novices and experienced developers to rack up a sizable bill in AWS, and part of the learning experience is figuring out how this happened. The billing data itself is available in Parquet format, which is a great format to query and dig into with services such as Athena. This billing data is the only way of figuring out how much a specific AWS resource costs, which again is helpful for the learning experience. The Cost Explorer in AWS is great if you just want an overview, but having SQL access to the data is better for developers looking to dive a bit deeper. The billing service has a feature which records created_by for resources; this is only available in the CUR data. If you haven't already, you can enable it via Cost Allocation Tags. Pair these points with the fact that a basic understanding of data wrangling in AWS is an invaluable skill to have in your repertoire. ...
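As a sketch of what SQL access to the billing data can look like, the helper below builds an Athena query that ranks resources by cost for one month. The table name is hypothetical, and the column names (`line_item_resource_id`, `line_item_unblended_cost`, `resource_tags_aws_created_by`) and the `year`/`month` partition columns assume a Cost and Usage Report set up with the Athena integration, resource IDs included, and the created-by cost allocation tag activated. Adjust them to match your own report.

```python
def cost_by_resource_query(table: str, year: int, month: int, limit: int = 20) -> str:
    """Build an Athena SQL query listing the most expensive resources in a month."""
    # Only allow simple identifiers such as "database.table" to avoid SQL injection.
    if not table.replace("_", "").replace(".", "").isalnum():
        raise ValueError(f"unexpected table name: {table!r}")
    # int() rejects anything that is not a plain number.
    year, month, limit = int(year), int(month), int(limit)
    return f"""
SELECT line_item_product_code,
       line_item_resource_id,
       resource_tags_aws_created_by,
       SUM(line_item_unblended_cost) AS cost
FROM {table}
WHERE year = '{year}'
  AND month = '{month}'
  AND line_item_resource_id <> ''
GROUP BY 1, 2, 3
ORDER BY cost DESC
LIMIT {limit}
""".strip()


if __name__ == "__main__":
    # Hypothetical database and table names.
    print(cost_by_resource_query("cur_db.cur_table", 2023, 9))
```

Filtering on the partition columns keeps the amount of data Athena scans, and therefore the cost of the query itself, down.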

Why isn't my S3 bucket secure?

We have all read horror stories in the popular media of Amazon Simple Storage Service (S3) buckets being “hacked”, and we have seen lots of work by Amazon Web Services (AWS) to tighten up controls and messaging around best practices. So how do the Amazon tools help you avoid some of the pitfalls with S3? Case in point: the AWS CLI, which a large number of engineers and developers rely on every day. The following command will create a bucket. ...

AWS Events reading list

For some time now I have been working on internal, and some product-related, services which use AWS events. Some of this has been paired with AppSync subscriptions, Slack and AWS SNS. To help everyone come up to speed with events, and async messaging in general, in a world of REST and synchronous APIs, I have been compiling a list of links, which I thought I would share in a post. To start out it is helpful to have an overview; this post and the associated talk, Moving to event-driven architectures (SVS308-R1), are a good place to start. ...

Getting started with Cognito?

The Amazon Cognito product enables developers to build web or API-based applications without worrying about authentication and authorisation. When setting up an application's authentication I try to keep in mind a few goals: Keep my users' data as safe as possible. Try to find something which is standards based, or supports integrating with standard protocols such as OpenID, OAuth2 and SAML. Evaluate the authentication flows I need and avoid increasing scope and risk. Try to use a service to start with, or secondarily, an open source project with a good security process and a healthy community. Limit any custom development to extensions, rather than throwing out the baby with the bath water. As you can probably tell, my primary goal is to keep authentication out of my applications. I really don’t have the time or inclination to manage a handcrafted authentication solution. ...