SFTP and SCP Gateway to Amazon S3

The SFTP Gateway is a proxy server that provides a secure and convenient way to work with S3 buckets over the SFTP (over SSH) and SCP protocols. Manage access through IAM users and authenticate with the SFTP Gateway using IAM user credentials. No separate user management is necessary.

This implementation is unique because it does not buffer files on the local EBS volume. Instead, files are streamed directly to S3. This eliminates disk IO as a potential bottleneck for throughput and allows terabyte-sized files to be transferred efficiently.

Refer to the resources listed below for further documentation, detailed instructions and FAQs.

Table of Contents

  1. Highlights
  2. Setup Instructions
  3. High Availability
  4. Security
    1. Audit Log
    2. Patching
  5. Performance
  6. FAQ
  7. Network Diagrams
  8. Limitations
  9. Recommended Clients
  10. Support

Highlights

Accessing the SFTP Gateway

The CloudFormation stack takes roughly 5 minutes to create. The Outputs section of the CloudFormation stack will provide you with the hostname of the SFTP load balancer (LoadBalancerHostName). The SFTP Gateway is running on port 22 and can be accessed through any SFTP or SCP client (see Recommended Clients).

Authentication is managed through AWS IAM user accounts. The SFTP Gateway will accept an Access Key ID as user name and the Secret Access Key as password.

The server will then assume the identity of the access key's owner. The IAM user requires policies that allow it to list buckets, read bucket locations, list objects and access objects. See below for a least-privilege IAM policy, or apply the AmazonS3FullAccess AWS managed policy.

To connect to the Linux shell for administrative purposes, use SSH on port 22 of the EC2 instance with the username ec2-user.

Password Authentication

  1. Sign in to the AWS Management Console and open the IAM console at https://console.aws.amazon.com/iam/.
  2. In the IAM console, in the navigation pane, choose Users, and from the list of users, choose your IAM user.
  3. On the user details page, choose the Security Credentials tab, and then choose Create access key.
  4. For authentication with the SFTP Gateway, use the Access key ID as username and the Secret access key as password.
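
As a minimal sketch of step 4, the following Python snippet assembles an argument list for the OpenSSH sftp client. The function name is hypothetical, the host name is a placeholder for the LoadBalancerHostName output, and the key ID is the example value used in AWS documentation. The Secret access key is not part of the command; the client prompts for it as the password.

```python
def build_sftp_command(access_key_id: str, host: str, port: int = 22) -> list[str]:
    """Return an argv list for the OpenSSH sftp client.

    The Access key ID serves as the SFTP user name. The Secret access key
    is deliberately left out; sftp asks for it interactively.
    """
    if not access_key_id or not host:
        raise ValueError("access_key_id and host are required")
    # -P sets the remote port for the OpenSSH sftp client.
    return ["sftp", "-P", str(port), f"{access_key_id}@{host}"]


# Placeholder values: substitute your own Access key ID and the
# LoadBalancerHostName output of the CloudFormation stack.
command = build_sftp_command("AKIAIOSFODNN7EXAMPLE", "s3gw.example.com")
print(" ".join(command))
```

The resulting list can be passed to subprocess.run or pasted into a terminal as a single command line.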

High Availability

For high availability, we recommend launching at least two servers in separate availability zones. To provide fault tolerance and zero downtime if one of the availability zones becomes unavailable, the two servers can be added to the same DNS record. In a sample Route 53 configuration, the DNS record s3gw.mydomain.com resolves to both SFTP Gateway servers. Additionally, a health check can be configured that automatically removes one of the DNS records if no TCP connection to the SFTP Gateway can be made. Client applications will automatically be directed to the remaining SFTP Gateway in case of a failure.
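
The TCP health check described above can be approximated locally to verify that each gateway accepts connections. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and it does not call the Route 53 API; it only performs the same kind of probe (open a TCP connection, then close it).

```python
import socket


def gateway_is_reachable(host: str, port: int = 22, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False


def healthy_gateways(hosts: list[str], port: int = 22) -> list[str]:
    """Filter a list of gateway hosts down to those accepting connections."""
    return [host for host in hosts if gateway_is_reachable(host, port)]
```

Calling healthy_gateways with the addresses of both servers returns the subset that a client could currently reach.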

Each SFTP Gateway generates its own unique SSH key pair. Make sure that the clients are aware of all SSH host keys. Otherwise, a connection could be rejected by the client. Alternatively, you can change the SSH host keys to match by overwriting them in /home/ec2-user/keys/. Run sudo service s3gw restart to apply the change.

Security

The service accepts connections on port 22 of the Network Load Balancer, which forwards them to port 2222 on the EC2 instance. No additional ports need to be opened. The SSH host key is unique to the instance. The SFTP Gateway only supports the subset of SSH commands that are required for SFTP (over SSH) and SCP connections. A full SSH shell cannot be opened. The service runs under an unprivileged user s3gw with no write access to the local file system.

The OpenSSH server is accessible on port 22 on the EC2 instance for administrative access. The username is ec2-user.

All files are uploaded with AES256 server-side encryption enabled.

Audit Log

The application writes an audit log to Amazon CloudWatch Logs, recording which users have logged on and transferred files. Each entry contains the session start date, end date, instance ID, IP address and protocol of the session. The CloudWatch Logs group is called /netcubed/s3gw.

Patching

The server can be patched manually by accessing the server via SSH. Execute sudo yum update -y to upgrade packages and the operating system. Reboot the instance to ensure that all patches are applied.

We also publish new AMIs when critical security vulnerabilities are disclosed. As a subscriber to the AMI, you will be notified immediately.

Performance

An active connection can consume up to 10 megabytes of memory when uploading a file to or downloading a file from S3, regardless of the size of the object being transferred. This is achieved by using multipart uploads and downloads. The limiting factor for transfer speed is therefore the network bandwidth of the instance.
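
The idea behind the bounded memory footprint can be sketched as follows: the stream is consumed in fixed-size parts, so only one part is held in memory at a time, however large the object is. This is an illustration only, not the gateway's actual implementation. The part size, function names and the counting stand-in for the S3 upload are all assumptions.

```python
import io
from typing import BinaryIO, Iterator

# Hypothetical part size; the gateway's real setting is not documented here.
PART_SIZE = 5 * 1024 * 1024


def iter_parts(stream: BinaryIO, part_size: int = PART_SIZE) -> Iterator[bytes]:
    """Yield successive parts of at most part_size bytes from stream."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    while True:
        part = stream.read(part_size)
        if not part:
            break
        yield part


def transfer(stream: BinaryIO, part_size: int = PART_SIZE) -> tuple[int, int]:
    """Consume the stream part by part and return (part count, total bytes).

    A real transfer would send each part to S3 at this point; the sketch
    only counts, so that no network access is needed.
    """
    parts = 0
    total = 0
    for part in iter_parts(stream, part_size):
        parts += 1
        total += len(part)
    return parts, total


# A 25-byte payload split into parts of at most 10 bytes.
print(transfer(io.BytesIO(b"x" * 25), part_size=10))
```

Because each part is discarded before the next one is read, peak memory depends on the part size rather than on the object size.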

The SFTP Gateway supports multi-core environments and therefore fully leverages instance types that provide more than one core. The server is implemented using a non-blocking event loop. Therefore, the server can handle many concurrent connections.

The EC2 instances are located behind a Network Load Balancer which provides high throughput and low latency. The Auto Scaling Group is configured to replace instances where health checks fail.

The recommended instance type for small production environments is m4.large. T2 instances should only be used for testing. Under heavy utilization they tend to run out of CPU credits and grind to a halt. The server will then become unresponsive, and you will need to reboot the instance or upgrade to a larger instance type.

Recommended Clients

We have tested SFTP Gateway successfully with the following clients. Since we are fully compliant with the SCP and SFTP (over SSH) standards, we expect other clients to work as well. Please contact us if you have trouble connecting with a client that is not listed here.

FAQ

Can I access buckets in a different region than the SFTP Gateway?

Yes, but with caveats. Buckets in regions other than the one where the SFTP Gateway was launched are accessible. However, note that this will incur cross-region data transfer charges. Due to the higher latency and reduced bandwidth between the SFTP Gateway and the S3 endpoint, expect worse performance. It is highly recommended to only access S3 buckets in the same region as the SFTP Gateway.

Can I access buckets that do not belong to my account?

Yes, you can. You can simply cd into a bucket that you have read access to, even though it doesn't show in the root directory listing. For example, try to cd into the cloudformation-examples bucket (the SFTP Gateway must be located in the us-east-1 region). In graphical SFTP clients you should be able to set a path that the client will change directory into.

What is the least-privileged IAM policy for users?

{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": [
      "s3:GetObject",
      "s3:PutObject",
      "s3:DeleteObject",
      "s3:GetBucketLocation",
      "s3:List*"
    ],
    "Resource": "*"
  }]
}
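
If access should be limited to a single bucket, the policy above can be narrowed by replacing the wildcard resource. The sketch below generates such a variant in Python. The function name and the bucket name are placeholders, and it is an assumption that the SFTP Gateway works with a bucket-scoped policy, so test it before relying on it. The s3:ListAllMyBuckets action stays on "*" because it cannot be granted on an individual bucket.

```python
import json


def scoped_policy(bucket: str) -> dict:
    """Build a variant of the policy above restricted to one bucket."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Needed for the root directory listing; only valid on "*".
                "Effect": "Allow",
                "Action": ["s3:ListAllMyBuckets"],
                "Resource": "*",
            },
            {
                # Same actions as the policy above, limited to the bucket
                # itself and to the objects inside it.
                "Effect": "Allow",
                "Action": [
                    "s3:GetObject",
                    "s3:PutObject",
                    "s3:DeleteObject",
                    "s3:GetBucketLocation",
                    "s3:List*",
                ],
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
            },
        ],
    }


# "my-example-bucket" is a placeholder bucket name.
print(json.dumps(scoped_policy("my-example-bucket"), indent=2))
```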

Network Diagrams

These network diagrams illustrate how the SFTP Gateway can be deployed.

Internet Facing

A common scenario is granting an external entity access to your S3 buckets. The SFTP Gateway can be launched in a public subnet and is therefore also accessible to clients that connect from the internet.

Hybrid Cloud

This topology is especially useful in hybrid networks where on-premises applications need to work with objects on S3 but are not allowed to connect to S3's public endpoints. In that case, the on-premises application connects to the SFTP Gateway in the securely connected VPC (via VPN or Direct Connect), which relays the request to S3.

Limitations

Buckets cannot be created, deleted or renamed

The server will not attempt to create, delete or rename buckets if instructed to do so. Instead, an error is returned to the client.

File attributes cannot be set

File attributes, such as file modes, cannot be changed since S3 does not support them. Similarly, symbolic links are not supported.

Folders cannot be renamed

S3 is not a real file system, and therefore some operations cannot be implemented efficiently. Renaming a directory on a regular file system is a single system call. Renaming a directory on S3 requires every file it contains to be moved to the new directory, which is slow and can be costly (since you are charged for each COPY request).
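
To make the cost concrete, the sketch below lists the operations that a client-side "rename" of a folder would require: one copy and one delete per object under the old prefix. The function name and the keys are hypothetical. The gateway itself does not perform this operation, and the code only plans the steps without contacting S3.

```python
def plan_prefix_rename(
    keys: list[str], old_prefix: str, new_prefix: str
) -> list[tuple[str, ...]]:
    """Return the per-object operations needed to move keys to a new prefix."""
    operations: list[tuple[str, ...]] = []
    for key in keys:
        if key.startswith(old_prefix):
            new_key = new_prefix + key[len(old_prefix):]
            # Each object needs its own copy request followed by a delete.
            operations.append(("copy", key, new_key))
            operations.append(("delete", key))
    return operations


# Hypothetical bucket contents.
keys = ["logs/a.txt", "logs/sub/b.txt", "other/c.txt"]
for operation in plan_prefix_rename(keys, "logs/", "archive/"):
    print(operation)
```

The number of requests grows linearly with the number of objects under the prefix, whereas a rename on a local file system is a single call.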

Is public key authentication supported?

Not at this point. Currently, only authentication through AWS IAM credentials is available. Please contact support@netcubed.de if you are interested in public key authentication.

Support

For paid support, email sales@netcubed.de for further information. Free support is provided via support@netcubed.de.

For free support, we do not provide a guaranteed response time. However, we do our best to respond to questions within 24 hours, Monday through Friday.

Changes

v1.0.0