AWS data access
Set up an S3 bucket, or an EFS or FSx file system, to use as the Nextflow work directory and to store input and output data. The IAM permissions required to access these resources are documented in AWS IAM policies.
S3 bucket creation
AWS S3 (Simple Storage Service) is a type of object storage. Use one or more S3 buckets to access input and output files with Studios and Data Explorer. An S3 bucket can also store intermediate Nextflow files, as an alternative to EFS or FSx.
EFS and FSx work directories are incompatible with Studios.
- Navigate to the AWS S3 console.
- In the top right, select the same region where you plan to create your AWS Batch compute environment.
- Select Create bucket.
- Enter a unique name for your bucket.
- Leave the rest of the options as default and select Create bucket.
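The console steps above can also be scripted. The following is a minimal sketch, assuming the boto3 SDK and configured AWS credentials; the helper names `bucket_request` and `create_work_bucket` and the example bucket name are hypothetical. It passes only the bucket name and region, leaving all other options at their defaults.

```python
def bucket_request(name: str, region: str) -> dict:
    """Build keyword arguments for the S3 CreateBucket call."""
    params = {"Bucket": name}
    # us-east-1 is the S3 default region and rejects an explicit
    # LocationConstraint, so only set it for other regions.
    if region != "us-east-1":
        params["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return params


def create_work_bucket(s3_client, name: str, region: str) -> None:
    """Create the bucket with default settings, mirroring the console steps."""
    s3_client.create_bucket(**bucket_request(name, region))


# Example usage (requires boto3 and AWS credentials; names are placeholders):
#   import boto3
#   region = "eu-west-1"  # same region as the AWS Batch compute environment
#   client = boto3.client("s3", region_name=region)
#   create_work_bucket(client, "my-nextflow-work", region)
```

The client is passed in as an argument so the region is chosen in one place, matching the region of the AWS Batch compute environment.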
Nextflow stores intermediate files in the S3 work directory. In production pipelines, these files can grow to a large volume of data. Consider an S3 lifecycle rule that automatically deletes intermediate files after a retention period, such as 30 days. See the AWS documentation for more information.
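A 30-day expiry for intermediate files could be expressed as an S3 lifecycle configuration. This is a sketch only, assuming boto3 and assuming the work directory lives under a known key prefix; the helper names, the rule ID, and the `work/` prefix are hypothetical.

```python
def work_dir_lifecycle(prefix: str, days: int = 30) -> dict:
    """Build a lifecycle configuration that expires objects under `prefix`."""
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {
        "Rules": [
            {
                "ID": f"expire-nextflow-work-{days}d",  # hypothetical rule name
                "Filter": {"Prefix": prefix},  # only objects under this prefix
                "Status": "Enabled",
                "Expiration": {"Days": days},  # delete objects older than `days`
            }
        ]
    }


def apply_lifecycle(s3_client, bucket: str, prefix: str, days: int = 30) -> None:
    """Apply the rule. This call replaces any existing lifecycle configuration."""
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=work_dir_lifecycle(prefix, days),
    )


# Example usage (requires boto3 and AWS credentials; names are placeholders):
#   import boto3
#   apply_lifecycle(boto3.client("s3"), "my-nextflow-work", "work/")
```

Scoping the rule to a prefix keeps input and output data stored elsewhere in the same bucket from being expired along with the intermediate files.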
EFS or FSx file system creation
AWS Elastic File System (EFS) and AWS FSx for Lustre are types of file storage that can be used as a Nextflow work directory, as an alternative to S3 buckets.
EFS and FSx work directories are incompatible with Studios.
To use EFS or FSx as your Nextflow work directory, create the file system in the same region as your AWS Batch compute environment.
You can let Seqera create EFS or FSx automatically when creating the AWS Batch compute environment, or create them manually. If Seqera creates the file system, it is also deleted when the compute environment is removed from Platform, unless Dispose Resources is disabled in the advanced options.
Creating an EFS file system
To create an EFS file system manually, visit the EFS console.
- Select Create file system.
- Optionally give it a name, then select the VPC where your AWS Batch compute environment will be created.
- Leave the rest of the options as default and select Create file system.
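For scripted setups, the EFS steps above might look roughly like the following. This is a sketch under stated assumptions: it uses the boto3 `efs` client, the helper names and the default creation token are hypothetical, and you supply the subnet and security group IDs of the VPC yourself (the API takes subnets rather than a VPC, with at most one mount target per Availability Zone).

```python
import time


def efs_request(name=None, token="nextflow-work-efs"):
    """Build keyword arguments for the EFS CreateFileSystem call."""
    params = {"CreationToken": token}  # idempotency token, hypothetical default
    if name:
        params["Tags"] = [{"Key": "Name", "Value": name}]  # optional name
    return params


def mount_target_requests(file_system_id, subnet_ids, security_group_ids):
    """Build one CreateMountTarget request per subnet (one subnet per AZ)."""
    return [
        {
            "FileSystemId": file_system_id,
            "SubnetId": subnet_id,
            "SecurityGroups": list(security_group_ids),
        }
        for subnet_id in subnet_ids
    ]


def create_efs(efs_client, subnet_ids, security_group_ids, name=None):
    """Create the file system, wait until it is available, add mount targets."""
    fs_id = efs_client.create_file_system(**efs_request(name))["FileSystemId"]
    # Mount targets can only be created once the file system is available.
    while True:
        state = efs_client.describe_file_systems(FileSystemId=fs_id)
        if state["FileSystems"][0]["LifeCycleState"] == "available":
            break
        time.sleep(5)
    for request in mount_target_requests(fs_id, subnet_ids, security_group_ids):
        efs_client.create_mount_target(**request)
    return fs_id
```

The subnets passed in should belong to the VPC where the AWS Batch compute environment will be created, so that compute instances can reach the mount targets.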
Creating an FSx file system
To create an FSx for Lustre file system manually, visit the FSx console.
- Select Create file system.
- Select FSx for Lustre.
- Follow the prompts to configure the file system, then select Next.
- Review the configuration and select Create file system.
To mount FSx file systems, make sure the Lustre client is installed in the AMIs used by your AWS Batch compute environment.
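The FSx for Lustre creation steps can be approximated in code as follows. This is an illustrative sketch, assuming the boto3 `fsx` client; the helper names are hypothetical, and the `SCRATCH_2` deployment type and 1200 GiB capacity are example choices rather than values taken from this guide. Check the FSx documentation for the capacities valid for your chosen deployment type.

```python
def lustre_request(subnet_id, capacity_gib=1200, security_group_ids=()):
    """Build keyword arguments for the FSx CreateFileSystem call (Lustre)."""
    # Assumption: SCRATCH_2 accepts 1200 GiB or multiples of 2400 GiB.
    if capacity_gib != 1200 and (capacity_gib <= 0 or capacity_gib % 2400 != 0):
        raise ValueError("capacity must be 1200 or a multiple of 2400 GiB")
    params = {
        "FileSystemType": "LUSTRE",
        "StorageCapacity": capacity_gib,
        "SubnetIds": [subnet_id],  # a single subnet in the compute VPC
        "LustreConfiguration": {"DeploymentType": "SCRATCH_2"},  # example choice
    }
    if security_group_ids:
        params["SecurityGroupIds"] = list(security_group_ids)
    return params


def create_lustre(fsx_client, subnet_id, capacity_gib=1200, security_group_ids=()):
    """Create the file system and return its ID."""
    response = fsx_client.create_file_system(
        **lustre_request(subnet_id, capacity_gib, security_group_ids)
    )
    return response["FileSystem"]["FileSystemId"]


# Example usage (requires boto3 and AWS credentials; IDs are placeholders):
#   import boto3
#   client = boto3.client("fsx", region_name="eu-west-1")
#   fs_id = create_lustre(client, "subnet-0123456789abcdef0")
```

As with EFS, create the file system in the same region and in a subnet reachable from the AWS Batch compute environment.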
Next steps
- Configure required IAM permissions for S3, EFS, and FSx.
- Create your AWS Batch compute environment and reference the bucket or file system as the work directory.