From the AWS Management Console, you can choose the type of analysis you want to perform, the partners you want to collaborate with, and the datasets you want to contribute to a collaboration. With AWS Clean Rooms, you can perform three types of analyses: SQL queries (using the SQL or Spark SQL analytics engine), PySpark analyses, and machine learning.
AWS Clean Rooms offers two analytics engines, SQL and Spark SQL, for running queries in a Clean Rooms collaboration. When you run SQL or Spark SQL queries, AWS Clean Rooms reads data where it lives and applies built-in, flexible analysis rules to help you maintain control over your data. AWS Clean Rooms provides a broad set of privacy-enhancing SQL controls (including query controls, query output restrictions, and query logging) that allow you to customize restrictions on the queries run by each clean room participant. With the Spark analytics engine, you can run queries using the Spark SQL dialect in AWS Clean Rooms collaborations. AWS Clean Rooms Spark SQL offers configurable compute sizes, so you can allocate resources for your SQL queries based on your performance, scale, and cost requirements. AWS Clean Rooms Differential Privacy helps you protect the privacy of your users with mathematically backed and intuitive controls in a few clicks. It is available only when SQL is the analytics engine for the collaboration: to use it, select a SQL custom analysis rule and then configure your desired differential privacy parameters. In addition, Cryptographic Computing for Clean Rooms (C3R) helps you keep sensitive data encrypted during your SQL analyses, whether you run your queries with the Spark analytics engine or the SQL analytics engine.
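To make two of the controls above more concrete, the following is a minimal, conceptual sketch in plain Python of a query output restriction (suppressing groups with too few distinct users) and of differentially private noise added to counts (the textbook Laplace mechanism). It is not the AWS Clean Rooms implementation or API. The function names, the `min_users` and `epsilon` parameters, and the sample rows are all hypothetical, and the sketch assumes each user appears in exactly one group, so that one user changes any count by at most 1.

```python
import random
from collections import defaultdict


def aggregate_with_threshold(rows, group_key, user_key, min_users):
    """Count distinct users per group; drop groups below min_users (hypothetical output restriction)."""
    users_per_group = defaultdict(set)
    for row in rows:
        users_per_group[row[group_key]].add(row[user_key])
    return {
        group: len(users)
        for group, users in users_per_group.items()
        if len(users) >= min_users
    }


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_counts(counts, epsilon, rng):
    """Textbook Laplace mechanism for counts with sensitivity 1 (assumption: one group per user)."""
    scale = 1.0 / epsilon  # smaller epsilon means more noise and stronger privacy
    return {group: count + laplace_noise(scale, rng) for group, count in counts.items()}


# Hypothetical sample data: one row per user.
rows = [
    {"user_id": "u1", "segment": "sports"},
    {"user_id": "u2", "segment": "sports"},
    {"user_id": "u3", "segment": "sports"},
    {"user_id": "u4", "segment": "travel"},
]

# "travel" has only one distinct user, so it is suppressed from the output.
counts = aggregate_with_threshold(rows, "segment", "user_id", min_users=2)

# A seeded generator keeps the illustration repeatable; a real system would not expose its seed.
released = noisy_counts(counts, epsilon=1.0, rng=random.Random(0))
print(counts, released)
```

The two steps are shown separately on purpose: the threshold illustrates an output restriction, and the noise illustrates the general idea behind differential privacy parameters. In AWS Clean Rooms these controls are configured through analysis rules rather than written by hand.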
PySpark in AWS Clean Rooms enables companies and their partners to run sophisticated analytics across large datasets using PySpark, the Python API for Apache Spark. You and your partners can bring PySpark code and libraries to an AWS Clean Rooms collaboration and run advanced analyses without having to share underlying data or proprietary analysis methods. For example, an advertising measurement provider can use PySpark in AWS Clean Rooms to run its custom algorithms across multiple publisher datasets simultaneously to measure ad effectiveness. Similarly, a pharmaceutical company can run its proprietary algorithms and libraries across multiple healthcare provider datasets, with appropriate patient consent, to evaluate drug adherence across clinical trials without sharing its proprietary data.
AWS Clean Rooms ML helps you and your partners apply privacy-enhancing machine learning (ML) to generate predictive insights without having to share raw data with each other. AWS Clean Rooms ML supports custom and lookalike modeling. With custom modeling, you can bring a custom model for training and run inference on collective datasets, without sharing underlying data or intellectual property among collaborators. With lookalike modeling, you can use an AWS-authored model to generate an expanded set of similar profiles based on a small sample of profiles that your partners bring to a collaboration. The AWS-authored lookalike model was built and tested across a wide variety of datasets, such as e-commerce and streaming video, and can help customers improve lookalike modeling accuracy by up to 36% compared with representative industry baselines. In real-world applications such as prospecting for new customers, this accuracy improvement can translate into savings of millions of dollars.
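The general idea behind lookalike modeling, expanding a small seed set into a larger set of similar profiles, can be illustrated with a toy sketch. The example below ranks candidate profiles by cosine similarity to the average of the seed profiles. This is a stand-in technique chosen for illustration only: it is not the AWS-authored model, and the feature vectors, profile IDs, and function names are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def lookalike_segment(seed_vectors, candidates, top_k):
    """Return the IDs of the top_k candidates most similar to the seed centroid."""
    center = centroid(seed_vectors)
    ranked = sorted(
        candidates.items(),
        key=lambda item: cosine_similarity(item[1], center),
        reverse=True,
    )
    return [profile_id for profile_id, _ in ranked[:top_k]]


# Hypothetical seed profiles (for example, a partner's known customers).
seed = [[1.0, 0.0, 0.9], [0.9, 0.1, 1.0]]

# Hypothetical candidate profiles to score against the seed.
candidates = {
    "p1": [1.0, 0.0, 1.0],
    "p2": [0.0, 1.0, 0.0],
    "p3": [0.8, 0.2, 0.7],
    "p4": [0.1, 0.9, 0.2],
}

segment = lookalike_segment(seed, candidates, top_k=2)
print(segment)  # ['p1', 'p3']
```

In an actual collaboration, neither party would see the other's raw profiles as this sketch does. The point here is only the shape of the task: a small seed goes in, and a ranked, expanded segment comes out.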