AWS S3 presigned URLs: Deep Dive and Use Cases for Secure File Transfers
Abstract
One AWS S3 feature is very useful, yet many developers and architects are not familiar with it: the ability to generate presigned URLs.
In this blog post, we will explore what presigned URLs are, how they are useful, and how you can leverage them in your AWS S3 workflows.
Target Architecture
Signed URLs are popular in several scenarios:
- we do not want to create and manage users for 3rd party access to S3 buckets, but need to share a file with them for a short period of time without making it public
- by creating a signed URL we offload the transfer to S3 (there is no load on an AWS Lambda that passes the file through to APIGW, or on an EKS service that reads from S3 and proxies the response to Ingress). We do not burn CPU and memory on components that are not needed.
The following diagram describes the integration of the signed URL feature:
- the user application/system requests access to an S3 file
- a Lambda signs the file URL for a given period of time and returns the URL
- the user application downloads the file directly from S3
The signing process is quick and does not consume workflow capacity.
A presigned URL remains valid for the period of time specified when the URL is generated. If you create a presigned URL with the Amazon S3 console, the expiration time can be set between 1 minute and 12 hours. If you use the AWS CLI or AWS SDKs, the expiration time can be set as high as 7 days.
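The flow and the expiration limits above can be sketched as a small Lambda-style handler. This is only a sketch under assumptions: the `make_handler` factory, the `key` and `ttl` query parameters, and the API Gateway proxy event shape are illustrative choices, not part of the original setup. The S3 client is passed in, so any object exposing `generate_presigned_url` works.

```python
import json

# Upper bound for URLs generated with the AWS CLI or SDKs: 7 days
MAX_TTL_SECONDS = 7 * 24 * 60 * 60


def make_handler(s3_client, bucket, default_ttl=300):
    """Build a handler that returns a presigned GET URL for a requested key."""

    def handler(event, context):
        # Assumed event shape: API Gateway proxy event with query parameters
        params = event.get("queryStringParameters") or {}
        key = params["key"]
        # Clamp the requested lifetime to the range [1 second, 7 days]
        ttl = min(max(int(params.get("ttl", default_ttl)), 1), MAX_TTL_SECONDS)
        url = s3_client.generate_presigned_url(
            ClientMethod="get_object",
            Params={"Bucket": bucket, "Key": key},
            ExpiresIn=ttl,
        )
        return {
            "statusCode": 200,
            "body": json.dumps({"url": url, "expires_in": ttl}),
        }

    return handler
```

In a deployed function the handler could be created once at import time, for example `handler = make_handler(boto3.client('s3'), 'strata-2024')`, so the client is reused across invocations.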
Code snippet to perform signature
import boto3

# bucket, key and expiration (in seconds) are supplied by the caller
s3 = boto3.client('s3')
url = s3.generate_presigned_url(
    ClientMethod='get_object',
    Params={
        'Bucket': bucket,
        'Key': key
    },
    ExpiresIn=expiration
)
python3 reverse-engineer.py
After execution we will see the following URLs as output. Let's dive into the analysis.
URL structure
File: data.txt
https://strata-2024.s3.amazonaws.com/data.txt?
AWSAccessKeyId=ASIAWFOD4FP2PNVEYEWT&
Signature=OxQHlg+H2gdggggXpoX&
x-amz-security-token=IQoJb3aaaZ2luX2VjEK&
Expires=1706977535
File: data2.txt
https://strata-2024.s3.amazonaws.com/data2.txt?
AWSAccessKeyId=ASIAWFOD4FP2PNVEYEWT&
Signature=OxQHlg+H2gdggggXpoX&
x-amz-security-token=IQoJb3aaaZ2luX2VjEK&
Expires=1706977684
Comparing the payload:
Param | File1 | File2 |
---|---|---|
AWSAccessKeyId | ASIAWFOD4FP2PNVEYEWT | ASIAWFOD4FP2PNVEYEWT |
x-amz-security-token | IQoJb3aaaZ2luX2VjEK | IQoJb3aaaZ2luX2VjEK |
Expires | 1706977535 | 1706977684 |
Signature | OxQHlg+H2gdggggXpoX | OxQHlg+H2gdggggXpoX |
Apart from the different HTTP path (the file), these 2 signed URLs to 2 distinct files have the same ACCESS_KEY and x-amz-security-token.
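The same comparison can be done programmatically. The sketch below uses only the Python standard library; the helper names `presigned_params` and `differing_params` are made up for illustration. Note that the values printed in this article are shortened, so the `Signature` parameter compares equal here, while two complete URLs would differ in it as well.

```python
from urllib.parse import unquote, urlsplit


def presigned_params(url):
    """Split a presigned URL into its object path and query parameters."""
    parts = urlsplit(url)
    pairs = (item.split("=", 1) for item in parts.query.split("&") if "=" in item)
    # unquote (not unquote_plus) keeps a literal '+' inside a signature
    return parts.path, {name: unquote(value) for name, value in pairs}


def differing_params(url_a, url_b):
    """Names of the query parameters whose values differ between two URLs."""
    _, a = presigned_params(url_a)
    _, b = presigned_params(url_b)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))


BASE = "https://strata-2024.s3.amazonaws.com"
COMMON = ("AWSAccessKeyId=ASIAWFOD4FP2PNVEYEWT&Signature=OxQHlg+H2gdggggXpoX"
          "&x-amz-security-token=IQoJb3aaaZ2luX2VjEK")
url1 = f"{BASE}/data.txt?{COMMON}&Expires=1706977535"
url2 = f"{BASE}/data2.txt?{COMMON}&Expires=1706977684"

print(presigned_params(url1)[0], presigned_params(url2)[0])  # /data.txt /data2.txt
print(differing_params(url1, url2))  # ['Expires']
```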
As we know from how AWS request signing works, ACCESS_KEY is sent in the payload, a header or a URL parameter, but SECRET_KEY always stays on the owning side.
When we invoked the S3 sign method, the URL was signed with a programmatic access key that is not visible among the access keys of our account, but exists: the ASIA prefix and the accompanying x-amz-security-token indicate temporary credentials. Later, when a user accesses the URL, the matching SECRET_KEY is used to recompute the request signature and compare it with the signature inside the request, which guarantees data integrity. The same key is used for both files. Since the signed payloads differ (distinct Expires and URL path), the generated signatures are not the same; the values shown above are shortened, which is why they look identical.
Let's try to modify Expires (the last parameter in the URL).
If we now try to access the file using the modified URL, S3 responds with an Error in XML format:
<Error>
<Code>SignatureDoesNotMatch</Code>
<Message>
The request signature we calculated does not match the signature you provided. Check your key and signing method.
</Message>
<AWSAccessKeyId>ASIAWFOD4FP2PNVEYEWT</AWSAccessKeyId>
<StringToSign>
GET 1706977682
x-amz-security-token:IQoJb3aaaZ2luX2VjEK////////=
/strata-2024/data2.txt
</StringToSign>
<SignatureProvided>OxQHlg+H2gdggggXpoX/M/40U3Veg=</SignatureProvided>
<StringToSignBytes>
47 45 54 0a 0a 0a 31 37 30 36 39 37 37 36 38 32 0a 78 2d 61 6d 7a 2d 73 65 63 75 72 69 74 79 2d 74 6f 6b 65 6e 3a 49 51 6f 4a 62 33 4a 70 5a 32 6c 75 58 32 56 6a 45 4b 2f 2f 2f 2f 2f 2f 2f 2f 2f
2f 2f 77 45 61 44 47 56 31 4c 57 4e 6c 62 6e 52 79 59 57 77 74 4d 53 4a 48 4d 45 55 43 49 48 67 67 61 55 35 4b 5a 63 33 71 41 42 59 67 6e 30 4f 4d 6a 30 76 65 50 47 52 52 5a 55 48 67 66 71 54 6b
6c 4f 4d 35 57 35 65 4e 41 69 45 41 78 71 43 34 38 76 74 6c 52 4a 63 5a 30 4a 78 66 33 50 52 58 4b 33 37 59 33 61 75 74 69 67 74 51 67 59 35 35 67 50 50 46 44 6d 77 71 6c 67 49 49 65 42 41 42 47
37 72 42 54 63 55 77 4c 67 70 79 39 36 67 32 39 36 50 6e 4b 6c 58 65 57 74 73 54 51 44 42 70 74 47 76 64 42 36 62 47 48 5a 53 49 33 76 4d 74 41 65 64 2b 52 74 52 69 6b 6f 57 4e 62 64 54 74 66 72
4e 77 7a 65 56 71 74 31 69 4b 57 63 4f 67 68 42 41 35 46 33 31 55 76 77 6d 51 54 62 57 78 31 4d 53 68 63 45 79 6e 53 6e 31 48 6d 6d 50 4c 45 54 76 70 78 34 49 2f 31 56 35 57 6b 79 6b 68 74 7a 6f
49 3d 0a 2f 73 74 72 61 74 61 2d 32 30 32 34 2f 64 61 74 61 32 2e 74 78 74
</StringToSignBytes>
<RequestId>22CF9WNZZ429CSRE</RequestId>
<HostId>
jgGesVaafsl6Dx7zB9Vk6OClOTglS5yGa/lZix3sM20q0eers4hdm6c+8S9UVoTLkBXeBDa4xHeGRQ==
</HostId>
</Error>
From the payload we can check what exactly is signed. StringToSign consists of the HTTP method, the expiration time, the security token and the object path, separated by newlines (the 0a bytes in StringToSignBytes).
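To make the layout concrete, here is a minimal sketch that rebuilds a string with the same field order as the StringToSignBytes dump and signs it with HMAC-SHA1. The function names are illustrative, the secret is a placeholder, and the assumption that the two empty fields stand for Content-MD5 and Content-Type comes from the legacy query-string signing scheme rather than from the response itself.

```python
import base64
import hashlib
import hmac


def string_to_sign(method, expires, token, resource):
    """Newline-separated fields, in the order seen in StringToSignBytes."""
    # The two empty fields are assumed to be Content-MD5 and Content-Type
    return "\n".join([
        method, "", "", str(expires),
        f"x-amz-security-token:{token}",
        resource,
    ])


def sign(secret_key, text):
    """Base64-encoded HMAC-SHA1 of the string to sign."""
    digest = hmac.new(secret_key.encode(), text.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


secret = "placeholder-secret-key"  # not a real key, for illustration only
token = "IQoJb3aaaZ2luX2VjEK"      # shortened token as printed above

original = string_to_sign("GET", 1706977684, token, "/strata-2024/data2.txt")
tampered = string_to_sign("GET", 1706977682, token, "/strata-2024/data2.txt")

# Changing Expires changes the signed string, so the old signature no longer matches
print(sign(secret, original) == sign(secret, tampered))  # False
```

Because Expires is part of the signed string, a client cannot extend the lifetime of a URL without also producing a new valid signature, which requires the secret key.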
Reverse-engineering the SECRET from a pre-signed URL, and why it is important to keep the expiration time as short as possible:
AWS signs requests with an HMAC-based algorithm (AWS Signature Version 4 is the current scheme; the AWSAccessKeyId, Signature and Expires parameters in the URLs above belong to the older query-string format, which uses HMAC-SHA1). Knowing the details of the algorithm, a malicious user who has access to a signed URL can try to reverse-engineer it and extract the SECRET_KEY. Having both ACCESS_KEY and SECRET_KEY opens a much wider attack surface, limited only by the IAM permissions associated with this entity, and can also serve as a starting point for privilege escalation.
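import base64
import hmac
import secrets
import string
from hashlib import sha1

def generate_random_secret_key(length=40):
    # Hypothetical helper: random candidate key (length and alphabet are assumptions)
    alphabet = string.ascii_letters + string.digits + '+/'
    return ''.join(secrets.choice(alphabet) for _ in range(length))

result = ''
access_key = 'ASIAWFOD4FP2PNVEYEWT'.encode("UTF-8")
# Fields are separated by newlines, as in StringToSignBytes
string_to_sign = (
    'GET\n\n\n1706977682\n'
    'x-amz-security-token:IQoJb3JpZ2luX2VjEK/////=\n'
    '/strata-2024/data2.txt'
).encode("UTF-8")
while 'OxQHlg+H2gdggggXpoX/M/40U3Veg=' != result:
    secret_key = generate_random_secret_key().encode("UTF-8")
    signature = base64.b64encode(
        hmac.new(secret_key, string_to_sign, sha1).digest()
    ).strip()
    result = signature.decode()
    print(f"AWS {access_key.decode()}:{signature.decode()}")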
The output is a practically endless stream of signatures computed from different generated keys:
...
AWS ASIAWFOD4FP2PNVEYEWT:YUTgj1ul1CqzE/hPhfVFHyaXfMk=
AWS ASIAWFOD4FP2PNVEYEWT:Cp47/TsLSY6z2XZW2BaYKm/lBvs=
AWS ASIAWFOD4FP2PNVEYEWT:nVdB1mSbQbbuzHo0rkq0viHf9DY=
AWS ASIAWFOD4FP2PNVEYEWT:C0McVvq0DIqEgjMbgkmTngnLUGQ=
AWS ASIAWFOD4FP2PNVEYEWT:G0Dh2Rkwz0pk4pdVPaoqPcEHwtk=
AWS ASIAWFOD4FP2PNVEYEWT:UJU1F7WYtyoYm7tYA/HFro6xfZ4=
....
This brute-force process depends on several factors:
- cryptographic strength of the algorithm
- compute resources used
- time available for processing
Since the algorithm is fixed and we do not control the amount of resources an attacker dedicates to such a task, one very important parameter remains under our control: the expiration time (ExpiresIn in the snippet above). Reverse engineering will take A LOT OF time, and by reducing the expiration time of the generated signed URL we make brute-forcing the signature even less feasible.
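A rough back-of-the-envelope calculation shows the scale involved. Every number below is an assumption chosen for illustration: a 40-character secret key drawn from a 64-character alphabet, and a hypothetical attacker testing 10^12 candidate keys per second.

```python
ALPHABET_SIZE = 64            # assumption: base64-like characters
KEY_LENGTH = 40               # assumption: 40-character secret key
GUESSES_PER_SECOND = 10**12   # assumption: hypothetical attacker throughput

KEYSPACE = ALPHABET_SIZE ** KEY_LENGTH


def expected_years(keyspace, guesses_per_second):
    """Average time to hit the key when half of the keyspace must be searched."""
    seconds = keyspace // 2 // guesses_per_second
    return seconds / (365 * 24 * 3600)


def fraction_searched(window_seconds, keyspace, guesses_per_second):
    """Share of the keyspace an attacker covers within the given time window."""
    return window_seconds * guesses_per_second / keyspace


print(f"expected search time: {expected_years(KEYSPACE, GUESSES_PER_SECOND):.2e} years")
for label, window in (("5 minutes", 300), ("7 days", 7 * 24 * 3600)):
    share = fraction_searched(window, KEYSPACE, GUESSES_PER_SECOND)
    print(f"keyspace covered in {label}: {share:.2e}")
```

Under these assumptions the share of the keyspace covered grows linearly with the time window, so a 5-minute URL leaves an attacker roughly 2000 times less opportunity than a 7-day one.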
Expiration
Once the expiration time has passed, the request returns Status: 403 Forbidden:
<Error>
<Code>AccessDenied</Code>
<Message>Request has expired</Message>
<Expires>2024-02-03T16:28:04Z</Expires>
<ServerTime>2024-02-03T16:41:16Z</ServerTime>
<RequestId>EASJ4X6YWFJEB6RZ</RequestId>
<HostId>
a8TzGs44FcojTt8e1tQhqTfhXEdGBPF+IYV4q2T7jhumYTQJu9FsD0qQml8qfm+0OONK7/UKxocQBiQ==
</HostId>
</Error>
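A client can avoid this round trip by checking the Expires parameter before using a URL. The sketch below assumes the query-string format shown earlier, where Expires is a Unix timestamp; the helper names are illustrative.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def expires_at(url):
    """Read the Expires query parameter as a timezone-aware UTC datetime."""
    query = parse_qs(urlsplit(url).query)
    return datetime.fromtimestamp(int(query["Expires"][0]), tz=timezone.utc)


def is_expired(url, now=None):
    """True when the URL's expiration time is not in the future."""
    now = now or datetime.now(timezone.utc)
    return now >= expires_at(url)


url = "https://strata-2024.s3.amazonaws.com/data2.txt?Expires=1706977684"
server_time = datetime(2024, 2, 3, 16, 41, 16, tzinfo=timezone.utc)

print(expires_at(url).isoformat())   # 2024-02-03T16:28:04+00:00
print(is_expired(url, server_time))  # True
```

The decoded value matches the Expires element in the error response, and the ServerTime from that response is later than it, which is why S3 rejected the request.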
Signed URLs can do even more
S3 supports not only signed GET requests, but also object-modifying PUT and DELETE requests. This is very useful when you need to let a 3rd party agent write some results into an S3 bucket, but do not want to create a user and want to allow the operation only for a short time.
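A minimal sketch of the upload case, assuming the S3 client is passed in by the caller. The function names and the 5-minute default are illustrative; `put_object` is the client method name used for the signed PUT.

```python
import urllib.request


def presign_upload(s3_client, bucket, key, expires_in=300):
    """Return a URL that accepts an HTTP PUT for `key` until it expires."""
    return s3_client.generate_presigned_url(
        ClientMethod="put_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )


def build_put_request(url, data):
    """Prepare the request a 3rd party would send; no AWS credentials needed."""
    return urllib.request.Request(url, data=data, method="PUT")
```

The owner of the bucket would call something like `presign_upload(boto3.client('s3'), 'strata-2024', 'results.txt')` and hand the URL to the 3rd party, who then uploads with `urllib.request.urlopen(build_put_request(url, b'...'))` or any other HTTP client.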