Machine Learning Workbench

Ubuntu 20.04 with Jupyter, JupyterLab, TensorBoard, RStudio Server and preconfigured conda environments for TensorFlow 2.3.1, MXNet 1.7.0 and PyTorch 1.7.0 including the latest matching versions of CUDA 11.0, 10.1 and 10.2, cuDNN and NCCL for GPU-accelerated computing. There is no need for an SSH session to launch a notebook. All services, including the Ubuntu MATE desktop, can be accessed through the browser. Check out the video to see how to start training your first model on a GPU in less than 5 minutes.

Additionally, this environment provides a full-featured Ubuntu MATE desktop environment and direct console access through the browser with the latest versions of Google Chrome, Visual Studio Code and Docker (including the NVIDIA Container Runtime).

The Machine Learning Workbench puts a focus on security. All communication with the instance is encrypted, and launching and accessing the Jupyter Notebook requires username and password authentication.

If you want to use Ubuntu 18.04 instead, make sure that you launch version 1.4.0 or below of this AMI.

Description

This workspace is ideal for users who require the power of EC2 instances (with or without GPUs) while being able to use a familiar graphical user interface. No client software needs to be installed to access the remote desktop; all that is required is a modern browser (including Internet Explorer 11). Additionally, a terminal session can be opened through the browser.

The browser interface as well as the remote desktop protocol support copy and paste between the virtual desktop environment and the client. The environment is multi-user capable and supports a large number of concurrent users on appropriately sized instances. Please follow the documentation for instructions on how to add additional users to the system.

Setup Instructions

Launch the image on any of the available instance types. Consider changing the size of the root EBS volume. Make sure that the attached security group allows traffic on ports 80 and 443 if you plan to access the server via the browser interface, and on port 3389 if you plan to use the Remote Desktop Protocol. Allow a couple of minutes for the instance to boot.
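
As a rough sketch, the security group rules described above could be added with the AWS CLI. The group ID and the CIDR range below are placeholders (assumptions, not values from this product), and the script only prints the commands unless DRY_RUN is set to 0.

```shell
# Placeholders (assumptions): replace with your own security group and client network.
SG_ID="${SG_ID:-sg-0123456789abcdef0}"
CIDR="${CIDR:-203.0.113.0/24}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands instead of running them

open_port() {
  # $1 is the TCP port to allow; build the AWS CLI call as the argument list
  set -- aws ec2 authorize-security-group-ingress \
    --group-id "$SG_ID" --protocol tcp --port "$1" --cidr "$CIDR"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# 80 and 443 for the browser interface, 3389 for the Remote Desktop Protocol
for port in 80 443 3389; do
  open_port "$port"
done
```

Restricting the CIDR range to your own network rather than 0.0.0.0/0 keeps the login page from being exposed to the whole internet.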

The web interface can be accessed through your browser on the public IP (or private IP for enterprise VPCs) of the EC2 instance (e.g. https://18.245.21.43 where 18.245.21.43 is the public IP address of the instance).

Most browsers will display a certificate warning. The warning indicates that the certificate is self-signed rather than signed by a trusted Certificate Authority. The connection is still encrypted, so you can proceed past the warning by clicking "Continue to this webpage" (Internet Explorer) or "Advanced" and then "Proceed to website" (Chrome).

The default user is ubuntu and the default password is the instance ID.

Alternatively, you can reach your desktop environment through a Remote Desktop Client (such as the Microsoft Remote Desktop Application, preinstalled on Windows and available on the Apple App Store for macOS). The hostname is simply the public IP (or private IP for enterprise VPCs). Make sure you enter the username and password before you connect, otherwise the connection will not be established (on Windows, click Show Options and fill in the username and password).

Available Environments

conda is used to provide isolated environments for the individual machine learning frameworks. To switch between them in Jupyter notebook, select New from the front-page and select the appropriate environment. The following environments are available.

Display Name        Name        Python  CUDA  Comment
Python 3            base        3.8     -     Base Environment
TensorFlow 2.4.0    tensorflow  3.8     11.0
MXNet 1.7.0         mxnet       3.8     10.2
PyTorch 1.7.0       pytorch     3.8     11.0
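
The same environments can also be used from a terminal. The following is a minimal sketch that assumes the environment names from the table above and a shell in which conda is initialized; gpu_check is a hypothetical helper name, not something shipped with the AMI.

```shell
# Hypothetical helper: activate the pytorch environment listed above and
# report whether PyTorch can see a CUDA device (prints True or False).
gpu_check() {
  conda activate pytorch &&
    python -c 'import torch; print(torch.cuda.is_available())'
}

# Usage in a workbench terminal:
#   gpu_check
# List all environments available to your user:
#   conda env list
```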

Custom Environment

We recommend that you clone an existing environment if you want to install additional packages. To clone an environment for your user, open a Terminal in Jupyter (New / Terminal) and execute the following commands:

conda create --clone mxnet --name my-mxnet
# install additional packages into your environment
conda install --name my-mxnet boto3
# export list of installed packages
conda env export --name my-mxnet > my-mxnet.yml
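
The exported file can later be used to rebuild the environment, for example on a fresh instance. This is a sketch only; recreate_env and the target name my-mxnet-copy are hypothetical.

```shell
# Hypothetical helper: recreate an environment from an exported file.
#   $1 = exported YAML file, $2 = name for the new environment
recreate_env() {
  conda env create --file "$1" --name "$2"
}

# Usage (my-mxnet-copy is a placeholder name):
#   recreate_env my-mxnet.yml my-mxnet-copy
```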

Upload a File

Once authenticated, you can simply drag and drop files or directories into the browser window. An upload window will appear. Uploaded files will be stored in your home directory (e.g. /home/ubuntu). Uploads will never overwrite existing files in the home directory. Instead the upload will fail. Alternatively, you can also use the Jupyter Notebook interface to upload and download files.

Let's Encrypt

The AMI ships with the Let's Encrypt certbot client preinstalled. To obtain certificates from Let's Encrypt, execute the following steps:

  1. Log in as the default user (ubuntu) and change to the directory /etc/xworkspace.
  2. Run docker-compose run certbot certonly --webroot --webroot-path /var/www/letsencrypt and follow the instructions on the screen.
  3. Update the SSL certificates to point at the Let's Encrypt repository (replace {DOMAINNAME} with your domain name).
    sudo rm -f /etc/xworkspace/var/server.*
    sudo ln -s /etc/letsencrypt/live/{DOMAINNAME}/fullchain.pem /etc/xworkspace/var/server.crt
    sudo ln -s /etc/letsencrypt/live/{DOMAINNAME}/privkey.pem /etc/xworkspace/var/server.key
    sudo chown -R 1000:1000 /etc/letsencrypt/
  4. Restart the web server with docker-compose restart server
  5. The renewal command is executed automatically on a daily basis (see /etc/cron.daily/certbot.sh). If you want to manually renew the certificates simply execute docker-compose run certbot renew
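
To confirm that the new certificate is in place, its issuer and expiry date can be inspected with openssl. This is a minimal sketch: cert_expiry is a hypothetical helper, the default path is the one used in step 3, and depending on file permissions it may need to run as root.

```shell
# Hypothetical helper: print the issuer and the expiry date (notAfter=...)
# of a certificate file. Defaults to the path used in step 3 above.
cert_expiry() {
  openssl x509 -noout -issuer -enddate -in "${1:-/etc/xworkspace/var/server.crt}"
}

# Usage:
#   cert_expiry
#   cert_expiry /etc/letsencrypt/live/{DOMAINNAME}/fullchain.pem
```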

Change Password

To change the password for the default ubuntu user, log into the Terminal and execute the following command:

passwd

This will ask you for your current password (the default is the instance ID) and your new password.

Add a User

You can add additional users to the environment which can log in through the web interface and work concurrently on a single instance. To add a new user newuser, open a terminal session and execute the following command.

sudo useradd -m -s /bin/bash newuser

You can control whether the user will have access to the docker daemon or super-user privileges by adding them to the corresponding groups. The -a flag appends the group; without it, usermod -G replaces the user's existing supplementary groups.

# allow access to docker daemon
sudo usermod -aG docker newuser

# super-user privileges
sudo usermod -aG sudo newuser

Lastly, set the password for newuser to newpassword.

echo newuser:newpassword | sudo chpasswd
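
To check that the group changes took effect, the membership of the new user can be queried. This is a sketch; user_in_group is a hypothetical helper built on the standard id command.

```shell
# Hypothetical helper: succeed if user $1 is a member of group $2.
user_in_group() {
  # id -nG prints all group names of the user on one line
  id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF -- "$2"
}

# Usage after creating newuser as described above:
#   user_in_group newuser docker && echo "newuser can use docker"
#   user_in_group newuser sudo   && echo "newuser has sudo rights"
```

Group changes apply to new login sessions, so a user who is already logged in needs to log out and back in.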

Patch System

To keep the system up to date, run the following command from a terminal. This also updates packages from custom repositories such as NodeJS, Visual Studio Code and Google Chrome. Please note that the Linux kernel package is frozen using apt-mark, because the NVIDIA driver needs to be re-installed after a kernel upgrade.

sudo aptitude update && sudo aptitude full-upgrade

Prewarm EBS Volume

When an instance is launched from the image for the first time, the EBS volume is "cold": not all blocks have been transferred from the image to the EBS volume yet, which slows down the first access to each file. To prewarm the EBS volume, execute the following command. Once the full volume has been prewarmed, you can expect much better disk performance for tasks that run for the first time. Stopping and starting or rebooting the instance maintains the "warm" state of the EBS volume, so this command only needs to be executed once per instance.

sudo fastdd

You can find more details about this process in the official AWS Documentation.
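
For reference, the AWS documentation describes initializing a volume by reading every block once. How fastdd works internally is not stated here; the sketch below only shows the generic dd approach, with read_all_blocks as a hypothetical helper and the device name as an assumption that should be checked with lsblk.

```shell
# Hypothetical helper: read every block of a device (or file) once and
# discard the data, which forces EBS to fetch the blocks from the snapshot.
read_all_blocks() {
  dd if="$1" of=/dev/null bs=1M
}

# Usage on the instance, in a root shell. The device name is an assumption;
# list block devices first (for example /dev/xvda or /dev/nvme0n1).
#   lsblk
#   read_all_blocks /dev/nvme0n1
```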

FAQ

Why does executing programs or notebooks take longer than expected after I created the instance?

When launching a new instance, the underlying EBS volume is "cold", meaning that the data of the EBS snapshot (which the AMI is based on) has not been transferred from S3 to the EBS volume yet. As the user accesses files on the instance, individual blocks are transferred from S3 to the EBS volume. Subsequent access to the same files is then served directly by EBS and is much faster. This process is called "copy-on-read" and allows AWS to optimize the cost associated with their storage system, as only blocks required for operation are actually copied to the EBS volume.

We ship this AMI with a utility to prewarm the full volume. Check out the Prewarm EBS Volume section for details.

How do I transfer large files to the workspace?

We recommend using S3 to transfer large files. Upload files from your local machine using the AWS Console or the AWS CLI. The files can then be downloaded from the workspace using the built-in aws command or goofys, which allows you to mount an S3 bucket as a folder on your workspace. Jupyter notebooks also come with an upload feature. However, this feature is browser-based and might not be practical for file transfers of several GB.
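
The S3 workflow could look roughly like the following. The bucket name and the helper names are hypothetical, and the sketch assumes that both your local machine and the instance have credentials (for example an IAM role) with access to the bucket.

```shell
BUCKET="${BUCKET:-my-transfer-bucket}"   # placeholder bucket name (assumption)

# On your local machine: upload a file to the bucket.
upload_to_s3() {
  aws s3 cp "$1" "s3://$BUCKET/$(basename "$1")"
}

# On the workspace: download an object into a directory (default: current one).
download_from_s3() {
  aws s3 cp "s3://$BUCKET/$1" "${2:-.}/"
}

# On the workspace: mount the whole bucket as a folder with goofys.
mount_bucket() {
  mkdir -p "$1" && goofys "$BUCKET" "$1"
}

# Usage:
#   upload_to_s3 ./dataset.tar          (local machine)
#   download_from_s3 dataset.tar ~/data (workspace)
#   mount_bucket ~/s3                   (workspace)
```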

If you are using the Microsoft Remote Desktop Client, you can also choose to mount a local folder as a drive on the workspace. Please follow the instructions of your Remote Desktop Client on how to configure device forwarding.

Alternatively, you can use an SFTP client such as Cyberduck to transfer files efficiently to the instance via the SSH protocol. Make sure you launch the instance with an SSH key and open the SSH port (22) to be able to open a connection.

How can I make my Terminal session persist?

You can launch a terminal multiplexer such as tmux to persist Terminal sessions. Simply type tmux to launch a new session. If you want to restore the session (even from the Desktop environment) simply type tmux attach. More details can be found at tmuxcheatsheet.com.
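
Using a named session makes it easier to find a long-running job again. This is a sketch only; persistent_session and the session name are hypothetical.

```shell
# Hypothetical helper: attach to the named session, creating it if needed.
# -A makes new-session behave like attach when the session already exists.
persistent_session() {
  tmux new-session -A -s "${1:-main}"
}

# Usage (the session name "training" is a placeholder):
#   persistent_session training
# Detach with Ctrl-b d; the session keeps running. List sessions with:
#   tmux ls
```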

Can I run this product on smaller instance types?

Instance types with less than 2 GB of memory are not recommended.

Changes

v1.5.0

v1.4.0

Important changes

The default Conda environment is no longer activated by default when starting a terminal session. The primary reason is that Conda ships with a number of binaries that potentially conflict with system-provided binaries such as NodeJS. To activate the Conda environment for the current terminal session, run conda activate.

Upgrades:

v1.2.2

v1.2.1

v1.2.0

v1.1.0