Workload Diagnostics and Reviews

Amazon Redshift Operational Review
The Amazon Redshift Operational Review evaluates your Redshift cluster against design and configuration best practices. The review focuses on improving query performance, choosing the right distribution styles, minimizing data exchange during query execution, and reviewing workload management configurations to optimize your Redshift operations.
 
Amazon OpenSearch Service Operational Review
The Amazon OpenSearch Service Operational Review evaluates your OpenSearch Service domain against infrastructure, scaling, indexing, security, sharding, and use-case-specific best practices, and provides prescriptive recommendations and guidance.

AWS Glue Operational Review
The AWS Glue Operational Review evaluates your workload against infrastructure, catalog, crawler, monitoring, security, cost optimization, and extract, transform, and load (ETL) best practices, and provides prescriptive recommendations and guidance.
 
Databases Operational Review
For production databases running Amazon Relational Database Service (Amazon RDS) MySQL and PostgreSQL engines, this multi-day Databases Operational Review enables you to best utilize AWS database monitoring services and evaluates the operational readiness of your database. The review evaluates against high availability, business continuity, security, engine-specific parameter configuration, database observability configuration, and cost optimization best practices to improve your operational performance.
 
ElastiCache for Redis Operational Review and Best Practices
The ElastiCache for Redis Operational Review offers practical guidance and best practices for operating your ElastiCache clusters. The review evaluates the configuration, high availability, and durability of your clusters against AWS infrastructure best practices.
 
Kubernetes Optimization for Scale
We review your existing Amazon Elastic Kubernetes Service (Amazon EKS) workloads to identify bottlenecks and constraints in your environment. We'll help you define the acceptable performance profile for your Kubernetes-based applications, then give recommendations on how to meet those goals tailored to your workload, requirements, and cluster architecture. 
 
Amazon Elastic Kubernetes Service Networking Review
We will help you navigate the ever-expanding number of options when it comes to setting up the network for your Amazon Elastic Kubernetes Service (Amazon EKS) workloads. Starting from the ground up, we'll review your VPC design, subnets, CIDR allocations, DNS configuration (CoreDNS and Route 53), ingress controllers, service mesh options, load balancer configurations, and any Container Network Interface (CNI) plugins you may be using. 
 
Amazon Elastic Kubernetes Service Security Review
This engagement combines insights on security, Kubernetes operations, and general AWS best practices to evaluate the security posture of your Amazon Elastic Kubernetes Service (Amazon EKS) workloads. We'll work with you to discover your security requirements and discuss how you can meet them with tools like role-based access control (RBAC), Open Policy Agent (OPA) Gatekeeper, network policies, admission controllers, and OpenID Connect (OIDC) providers.
 
Amazon Elastic Kubernetes Service Observability and Monitoring Review
This engagement will review your logging, monitoring, and alerting capabilities to improve your ability to detect problems in your Amazon Elastic Kubernetes Service (Amazon EKS) environment. Covering both the infrastructure and applications, we'll highlight the things that are important to monitor for your specific use case so that you can use your existing monitoring tools, or we can help you launch new monitoring systems like Prometheus and Grafana.
 
Amazon Elastic Kubernetes Service Cost Optimization Review
This engagement will show you how to use the resources that make up your Amazon Elastic Kubernetes Service (Amazon EKS) environment efficiently, and then run them in a more cost-effective way. Using your cluster's metrics, we'll look for ways to improve the utilization of your EKS worker nodes while maintaining or improving your application's throughput. We will also review your autoscaling configurations to make sure you can handle bursty workloads while minimizing overprovisioning of your compute resources. We can also help you discover whether your workloads are well suited to running on Amazon EC2 Spot Instances or AWS Fargate.
 
Identity and Access Control Review
The Identity and Access Control Review offers a deep analysis of your AWS identity configurations and permissions. It includes a review of the current access controls in your AWS account(s) and identifies gaps to align them with AWS security best practices. After this engagement, you will leave with guidance on improving the AWS Identity and Access Management (IAM) posture in your AWS account(s).
 
Amazon Networking Infrastructure Review
The Amazon Networking Infrastructure Review offers practical guidance and best practices for improving network infrastructure connectivity, availability, scalability, performance, security, management, and monitoring. It is designed for production networks running Transit Gateway, VPC Peering, Direct Connect, Site-to-Site VPN, Client VPN, and/or Network Firewall. The review enables you to best utilize AWS network monitoring services and evaluates the operational readiness of your network.
 
Amazon Networking Application Review
The Amazon Networking Application Review offers practical guidance and best practices for improving network application connectivity, availability, scalability, performance, security, management, and monitoring. It is designed for production applications running Route 53, VPC, ELB (GWLB, NLB, ALB), VPC Endpoints (PrivateLink), Global Accelerator, and/or Network Firewall. The review enables you to best utilize AWS network monitoring services and evaluates the operational readiness of your network.
 
Amazon SageMaker Operational Review
The Amazon SageMaker Operational Review enables you to best utilize AWS machine learning (ML) services and evaluates the operational readiness of your workload. The review evaluates your workload against operational excellence, reliability, security, performance, and cost optimization best practices to improve your operational performance.
 
Amazon SageMaker Cost Optimization
In this session, we analyze the cost and usage patterns of the Amazon SageMaker services you use to build, train, and deploy machine learning applications. The session provides guidance on instance selection and right-sizing, monitoring and alerting for idle resources, and scaling, along with a review of SageMaker design principles, to keep your production workload optimized for performance at minimum cost.
 
 

Operational Workshops and Deep Dives

Database Migration Operational Deep Dive
The Database Migration Operational Deep Dive offers several hours of practical guidance and best practices for migrating your databases to Amazon RDS engines using AWS Database Migration Service (AWS DMS).
 
Databases Operational Deep Dive
The Databases Operational Deep Dive offers several hours of practical guidance and best practices for managing, monitoring, and operating your Amazon RDS for MySQL, Amazon RDS for PostgreSQL, and Amazon Aurora databases. This deep dive highlights how you can optimize costs and analyze your instance's performance using Amazon CloudWatch, Amazon RDS Performance Insights, Amazon RDS Enhanced Monitoring, and other Amazon RDS-specific diagnostic tools.
 
Threat Detection, Security Incident Response and Automation in AWS
This workshop is an interactive, hands-on learning event that provides best practices and guidance for incident response, remediation workflows, and threat detection in AWS. During this workshop, you will learn how to detect anomalous behavior in your AWS infrastructure and automate security incident response workflows.
 
Machine Learning Operations (MLOps) Workshop
The Machine Learning Operations (MLOps) workshop aims to help customers learn and practice the unique aspects of operating ML workloads. It is designed to help customers move from individual projects to streamlined ML applications at scale. We start with an overview of key phases in ML adoption, operationalization, and governance. Then, in a hands-on session, we use Amazon SageMaker Pipelines to create, automate, and manage a machine learning workflow.
 
Incident Management Workshop
The Incident Management Workshop is a tabletop exercise in which teams test their existing incident response procedures against a hypothetical incident. The engagement is an opportunity to discuss and check adoption of incident management best practices associated with people, process, and tooling. After the workshop, we create best practice matrices and next-step recommendations that aim to help you respond faster, have fewer outages, and increase uptime.
 
Operational Readiness Review Workshop
The ‘Operational Readiness Review’ workshop is an interactive “working backwards” session on people, process, and mechanisms for customers. The workshop helps customers achieve a consistent process (including a checklist) for evaluating the operational readiness of workloads prior to launch. Customers use these checklists to get visibility into risk and plan remediations. Customers with consistent evaluation procedures gain improved confidence in meeting business outcomes.
 
Building a Monitoring Strategy
The ‘Building a Monitoring Strategy’ workshop uses Amazon best practices to help customers build a consistent approach to the monitoring and observability of workloads. The workshop’s goal is to create a strategy that aligns business and operational metrics. The workshop helps customers identify key metrics which matter most to delivering successful business outcomes. 
 
Operations KPI Workshop
The ‘Operations KPI’ workshop uses Amazon best practices to build a consistent approach to developing Key Performance Indicators (KPIs). The centerpiece of the workshop is creating a strategy that ties operational practices to the business needs they support and establishes operational metrics. With a business-level view of operations activities based on the KPIs, customers can determine whether business needs are satisfied and identify areas needing improvement.
 
Operational Excellence Deep Dive
The Operational Excellence Deep Dive extends the coverage of the Well-Architected Operational Excellence Pillar through an expert-led engagement. The engagement is centered on a guided conversation about the key elements of your organization, priorities, processes, tooling, and culture that contribute to your operational outcomes. Insights gathered from the conversation are prioritized according to your goals, and we then provide recommended actions that help you improve, extend, and scale your operations toward your desired business outcomes.
 
Cost Optimization Workshop
The Cost Optimization Workshop is an engagement that improves the cost-effective utilization of AWS resources. Customers receive actionable recommendations to realize immediate savings and achieve ongoing cost efficiency. Workshop activities enable customers to achieve rapid results based on their cost optimization priorities.
 
AWS GameDays
AWS GameDays are interactive, team-based learning exercises designed to give players a chance to test their AWS skills in a real-world, gamified, and risk-free environment. Each GameDay simulates a live application hosted on AWS, and players test their skills by maintaining and fixing a production state. It provides a fully hands-on opportunity to learn about AWS best practices, services, and architectural patterns. During an event, players are given a starting architecture that they evolve in response to internal and external events on pre-packaged AWS accounts. There are multiple portfolios of GameDays to select from, each specializing in different AWS domains and services.