Cell Painting
Enhancing drug discovery with high-throughput Cell Painting on AWS. Are you having trouble processing cell images at scale? Let’s see how the Cell Painting Batch solution on AWS has changed cell analysis for life sciences customers.
Introduction
The analysis of microscope-captured cell images is a key component of drug discovery. “Cell Painting”, a high-content screening method, has emerged as a way to characterize cellular activities and phenotypes. Leading biopharma companies have begun using tools such as the Broad Institute’s CellProfiler software, which is designed for cell profiling.
However, the variety of imaging methods and the exponential growth of data pose formidable obstacles. This article describes how life sciences customers have used AWS to build a distributed, scalable, and efficient cell analysis system.
Current Situation
Cell Painting workflows need scalable processing and storage to support large file sizes and high-throughput image analysis. Scientists currently use open-source tools such as CellProfiler, but they need scalable infrastructure to run automated pipelines without worrying about infrastructure maintenance.
Scientists are trying to conduct research while also handling massive volumes of microscopy image data and provisioning infrastructure. Researchers need user-friendly tools to collaborate securely and productively across labs. Scientific reproducibility is a cornerstone of research: scientists must be able to replicate others’ findings when publishing in highly regarded journals, or even when re-examining data from their own labs.
Figure: CellProfiler software
Obstacles
Life sciences customers ran into the following difficulties when running tools such as CellProfiler on stand-alone instances:
- Difficulty adapting to workload fluctuations.
- Low productivity on complex, time-consuming tasks.
- Difficulty collaborating across teams spread throughout the company.
- Difficulty meeting the demand for compute-intensive tasks.
- Cluster capacity problems that often led to unfinished work, delays, and inefficiencies.
- Data access problems caused by the lack of a centralized data hub.
Figure: CellProfiler pipeline
Solution Overview
AWS solution architects worked with life sciences customers to create a solution called Cell Painting Batch (CPB) to address these issues. CPB uses the Broad Institute’s CellProfiler image to run CellProfiler pipelines on AWS in a scalable, distributed architecture. With CPB, researchers can analyze massive numbers of images without having to manage the underlying infrastructure. The CPB solution is built with the AWS Cloud Development Kit (CDK), which simplifies infrastructure deployment and management.
The whole process is automated: once images are uploaded and an Amazon Simple Queue Service (SQS) message is sent, image processing starts and ends with the results being stored. This gives researchers a scalable, automated, and efficient way to handle large-scale image processing.
The architecture figure shows images from microscopes being uploaded to an Amazon S3 bucket. SQS messages sent by users trigger AWS Lambda, which submits AWS Batch jobs that use container images from Amazon Elastic Container Registry (ECR). AWS Batch processes the images and writes the results to Amazon S3 for analysis.
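As a rough illustration of the request that starts this flow, the sketch below builds a processing request and sends it to SQS. The message schema (`input_prefix`, `pipeline_key`, `output_prefix`), the `ProcessingRequest` class, and the `send_request` helper are assumptions made for this example and are not taken from the CPB code; only the boto3 `send_message` call with `QueueUrl` and `MessageBody` is standard.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ProcessingRequest:
    """Hypothetical request schema; the real CPB message format may differ."""

    input_prefix: str   # S3 prefix that holds the uploaded images
    pipeline_key: str   # S3 key of the CellProfiler pipeline file to run
    output_prefix: str  # S3 prefix where the results should be written

    def to_message_body(self) -> str:
        # Serialize the request as JSON so the consumer can parse it.
        return json.dumps(asdict(self), sort_keys=True)


def send_request(request, queue_url, sqs_client=None):
    """Send one processing request to SQS and return the service response."""
    if sqs_client is None:
        # Imported lazily so the sketch can be exercised without AWS access.
        import boto3

        sqs_client = boto3.client("sqs")
    return sqs_client.send_message(
        QueueUrl=queue_url,
        MessageBody=request.to_message_body(),
    )
```

Passing the client as an argument keeps the function easy to test with a stub; in normal use the default boto3 client would pick up credentials from the environment.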
Workflow
The Cell Painting Batch (CPB) solution on AWS aims to simplify the intricate process of processing cell images so that researchers can concentrate on what really counts: extracting meaning from the data. The CPB workflow proceeds as follows:
- Researchers acquire images using microscopes or other means.
- The images are uploaded to a dedicated Amazon Simple Storage Service (S3) bucket, which serves as the image repository.
- After the images are stored, researchers send a message to Amazon Simple Queue Service (SQS) that specifies the location of the images and the CellProfiler pipeline they want to run. In essence, this message is an image processing request.
- The SQS message automatically triggers an AWS Lambda function. The main responsibility of this function is to submit the AWS Batch job for that image processing request.
- AWS Batch evaluates the requirements of the job and dynamically provisions the Amazon Elastic Compute Cloud (EC2) instances it needs.
- AWS Batch retrieves the designated container image from Amazon Elastic Container Registry (ECR) and runs the specified CellProfiler pipeline in that container. Amazon FSx for Lustre, integrated with the S3 bucket, gives containers fast access to the data.
- CellProfiler processes the images inside the container using the specified pipeline. This may include image processing operations such as segmentation and feature extraction.
- When CellProfiler finishes processing, the results are written back to the S3 bucket at the location specified in the SQS message.
- Researchers retrieve the results from the S3 bucket and analyze them for their studies.
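The Lambda and AWS Batch steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not the CPB implementation: the JSON message fields, the `/fsx` mount root, the default job queue and job definition names, and the `JOB_QUEUE` and `JOB_DEFINITION` environment variables are all hypothetical. The `cellprofiler` flags used are its standard headless options (`-c` headless, `-r` run, `-p` pipeline, `-i` input directory, `-o` output directory).

```python
import json
import os


def build_cellprofiler_command(request, mount_root="/fsx"):
    """Build a headless CellProfiler command line for one request.

    Assumes the S3 bucket is exposed to the container under mount_root
    (for example through FSx for Lustre); that path is an assumption.
    """

    def local(path):
        return f"{mount_root}/{path.strip('/')}"

    return [
        "cellprofiler", "-c", "-r",
        "-p", local(request["pipeline_key"]),
        "-i", local(request["input_prefix"]),
        "-o", local(request["output_prefix"]),
    ]


def handler(event, context, batch_client=None):
    """Lambda entry point: submit one AWS Batch job per SQS record."""
    if batch_client is None:
        # Imported lazily so the sketch can be exercised without AWS access.
        import boto3

        batch_client = boto3.client("batch")

    job_ids = []
    for record in event["Records"]:
        # Each SQS record carries the request as a JSON string in "body".
        request = json.loads(record["body"])
        response = batch_client.submit_job(
            jobName=f"cell-painting-{record['messageId']}",
            jobQueue=os.environ.get("JOB_QUEUE", "cpb-job-queue"),
            jobDefinition=os.environ.get("JOB_DEFINITION", "cpb-cellprofiler"),
            containerOverrides={"command": build_cellprofiler_command(request)},
        )
        job_ids.append(response["jobId"])
    return {"jobIds": job_ids}
```

Overriding the container command per job is one possible design; passing the locations as environment variables or job parameters would work equally well.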
Because the workflow is automated, the solution begins analyzing images and storing the results as soon as the images are uploaded and an SQS message is sent. This gives researchers a scalable, automated, and efficient way to handle large-scale image processing.
Security
Cell Painting Batch (CPB) on AWS provides a strong security architecture for Cell Painting datasets and workflows. The solution protects data with encryption at rest and in transit, controls access through AWS Identity and Access Management (IAM), and improves network security by running in an isolated VPC. Ongoing monitoring with tools such as Amazon CloudWatch further strengthens the security posture.
To strengthen security further, it is advisable to add mitigations such as version control for system configurations, strong authentication with multi-factor authentication (MFA), protection against excessive resource usage with Amazon CloudWatch and AWS Service Quotas, cost monitoring with AWS Budgets, and container scanning with Amazon Inspector.
Life Sciences Customer Success Stories
Life sciences customers have seen significant changes after switching to CPB. The move streamlined their processing pipelines, sped up image processing, and fostered collaboration. The system’s built-in scalability can handle larger datasets to accelerate drug development, which helps these organizations prepare for future growth.
Customizing the Solution
Because CPB is modular, it can be integrated with other AWS services. Options include AWS Step Functions for process orchestration, Amazon AppStream for browser-based access to scientific tools, AWS Service Catalog for self-service deployments, and Amazon SageMaker for machine learning workloads. The code on GitHub includes a parameters file for settings such as instance class and timeout duration.
Conclusion
The Cell Painting Batch approach can boost researcher productivity by removing the burden of infrastructure management. It enables scalable, fast image analysis, which speeds up therapy development. It also lets researchers manage processing and distribution themselves, reducing the need for infrastructure administration.
The CPB solution on AWS has changed how biopharmaceutical companies process cell images. By combining scalability, automation, and efficiency, it allows life sciences organizations to handle large cell imaging workloads and accelerate drug development.