What is a Platform?
A platform is a complete environment able to run all kinds of applications.
A modern platform is based on containers and Kubernetes (or OpenShift), running in the cloud, on on-premises hardware, or in hybrid environments.
Multiple Cloud Native technologies provide the features needed for high reliability, security, and scalability.
Together with Platform Engineering and DevOps practices, the platform built within the organization makes it possible to develop and manage almost any software in a cost-effective manner.
What are Platform Maturity Levels?
The five levels are distinct stages, each implementing a specific subset of features. These features increase the capabilities of the platform incrementally.
Each level includes strategies for reaching a specific degree of maturity, which builds confidence in the platform and allows it to run more critical software.
Why is this level-based approach better?
This approach is evolutionary rather than revolutionary. It’s more practical and takes into account the time needed to learn new tools and processes.
It also allows the platform’s capabilities to be better aligned with the organization’s context (e.g. existing software, policies, restrictions, etc.).
Each level can introduce or extend the use of a particular practice or technology (e.g. GitOps, Zero Trust Environment, Progressive Delivery, Chaos Engineering) at a different degree of advancement.
Level 1 - The Launchpad
Provide a good enough starting point for experimentation.
The sooner the platform is available, the better. Start with rudimentary features so that the first applications can leverage the speed and flexibility of Kubernetes.
Use this level only as a starting point to experiment and test the possibilities. Be prepared for a rather reactive approach to problems and manual actions to fix them.
More advanced improvements will come at later levels.
Requirements
✅ A budget for the work to set up the platform and to deploy the first applications
✅ Access to a cloud with a Kubernetes service, or available on-prem resources
✅ A team with the skills to set up the platform (more skills are needed for on-prem environments)
For whom
→ Every organization starting the journey (non-prod!)
Benefits
- Enables developers to learn and use Kubernetes and containers
- Enables more advanced developers to run the container images they already build and use locally, now at scale
- Enables the operations team to learn how to build a more mature platform with the available infrastructure (cloud providers or on-prem)
- Leaves room to add more advanced options (i.e. additional products or services enhancing the platform features) before running real production workloads
- Development and operations teams gain more experience
Practices and Techniques
Security
Unrestricted Container Runtime
Basic Access Management
Basic Secrets Management
Efficiency
Manual Application Scaling
Unmanaged Cluster Resources
Delivery
Basic Delivery Processes
Basic Build Processes
Basic Deployment Management
Availability
Kubernetes Built-in Resiliency
Do
➕ Choose fast paths, even if they are now considered imperfect
➕ Find and encourage technology enthusiasts to participate
➕ Run your own applications so that platform use fits your context
Don’t
➖ Implement more advanced features (e.g. RBAC, GitOps, Persistent Volumes)
➖ Focus on scalability or security
➖ Automate processes of building container images, delivery or platform provisioning
Level 2 - Operational Foundation
Discover useful features and explore further.
Let people continue to learn while the platform is improved and more features are added.
A small number of non-critical, stateless applications may run to prove the usability and benefits of using the new approach.
Requirements
✅ A small number of applications are ready for a first production deployment
✅ Basic roadmap with a set of applications to run on the platform
✅ Permission to learn and experiment with the delivery process
✅ The most important components have been selected (i.e. cloud provider, Kubernetes distribution, cluster architecture)
✅ Confirmed (or disconfirmed) usability of the new approach for the company - decision to move forward or stop and analyze
For whom
→ Every organization that wants to build a more mature and professional platform
→ Companies (e.g. startups) choosing to make non-critical services available to clients
Benefits
- Increased security rules to protect the applications and the platform
- Increased visibility of platform performance and operations
- Faster and more frequent deployments with rollbacks
- Reduced response time and the ability to handle peak traffic with manual scaling
- Enables developers to use Kubernetes and containers
- Ready for non-critical production workloads (small scale, small risk, stateless)
Practices and Techniques
Security
Unrestricted Container Runtime
Basic Access Management
Basic Secrets Management
Basic Cluster Updates
Basic Traffic Filtering
Basic Vulnerabilities Management
Efficiency
Manual Application Scaling
Basic Resources Management
Delivery
Basic Delivery Processes
Basic Build Processes
Basic Deployment Management
Availability
Hard multi-tenancy
Basic Platform Monitoring
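The incremental nature of the levels becomes easier to see when the practice lists are compared directly. The sketch below copies the Level 1 and Level 2 lists from this document into Python sets and reports, per area, which practices are added and which are no longer listed. The function name `diff_levels` and the dictionary layout are illustrative assumptions, not part of the maturity model itself.

```python
# Practices per area, copied from the Level 1 and Level 2 lists in this document.
LEVEL_1 = {
    "Security": {
        "Unrestricted Container Runtime",
        "Basic Access Management",
        "Basic Secrets Management",
    },
    "Efficiency": {"Manual Application Scaling", "Unmanaged Cluster Resources"},
    "Delivery": {
        "Basic Delivery Processes",
        "Basic Build Processes",
        "Basic Deployment Management",
    },
    "Availability": {"Kubernetes Built-in Resiliency"},
}

LEVEL_2 = {
    "Security": {
        "Unrestricted Container Runtime",
        "Basic Access Management",
        "Basic Secrets Management",
        "Basic Cluster Updates",
        "Basic Traffic Filtering",
        "Basic Vulnerabilities Management",
    },
    "Efficiency": {"Manual Application Scaling", "Basic Resources Management"},
    "Delivery": {
        "Basic Delivery Processes",
        "Basic Build Processes",
        "Basic Deployment Management",
    },
    "Availability": {"Hard multi-tenancy", "Basic Platform Monitoring"},
}


def diff_levels(previous: dict[str, set[str]], current: dict[str, set[str]]) -> dict:
    """Per area, list practices added in `current` and those no longer listed."""
    changes = {}
    for area, after in current.items():
        before = previous.get(area, set())
        changes[area] = {
            "added": sorted(after - before),
            "dropped": sorted(before - after),
        }
    return changes


# Hypothetical usage: print what changes when moving from Level 1 to Level 2.
for area, change in diff_levels(LEVEL_1, LEVEL_2).items():
    print(area, change)
```

The same comparison can be repeated for any pair of adjacent levels to see what a step up actually involves.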
Do
➕ Choose the necessary services or products (Kubernetes distribution, logging, identity)
➕ Address doubts about platform capabilities with open communication
➕ Implement basic security rules
Don’t
➖ Force migration of existing, non-containerized applications
➖ Optimize infrastructure costs (yet)
➖ Standardize and unify delivery processes
Level 3 - Data-Driven Improvement
Learn and improve using reliable data.
The first semi-critical applications can run in production. It’s time to rely on the data collected on the platform to improve security, simplify troubleshooting, and reduce infrastructure costs.
At this level, more experience is required to leverage the potential of Kubernetes and Cloud Native projects.
Requirements
✅ Some apps run in production, proving the platform’s readiness
✅ Some standards have emerged (e.g. for delivery)
✅ More skilled platform team(s)
For whom
→ Organizations preparing for more critical use
→ Organizations that require more security
→ Organizations for which availability is becoming more critical
Benefits
- More reliable platform (at least 99.9% availability)
- Reduced risk of data leaks
- Possible to scale both workloads and platform
- Optimized utilization of resources
- More people trust the provisioned platform
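To put the 99.9% figure in perspective, it helps to translate it into permitted downtime. The sketch below assumes the percentage is read as availability over time, which this document does not spell out, and the function name `allowed_downtime_minutes` is hypothetical.

```python
def allowed_downtime_minutes(availability_percent: float, period_days: float) -> float:
    """Return the minutes of downtime a target permits over the given period."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    # The unavailable share of the period is whatever the target leaves over.
    return total_minutes * (1 - availability_percent / 100)


# Assumption: the Level 3 target of 99.9% measured over 30 and 365 days.
for label, days in (("30 days", 30), ("365 days", 365)):
    minutes = allowed_downtime_minutes(99.9, days)
    print(f"99.9% over {label}: about {minutes:.1f} minutes of downtime")
```

Under this reading, 99.9% leaves roughly 43 minutes of downtime in a 30-day month, or a little under 9 hours in a year.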
Practices and Techniques
Security
Basic Secrets Management
Basic Traffic Filtering
Basic Vulnerabilities Management
Shift Left Security
Restricted Container Runtime
Advanced Vulnerabilities Management
Advanced Data Protection
Efficiency
Basic Application Autoscaling
Advanced Application Autoscaling
Cluster Autoscaling
Basic Resources Management
Delivery
Basic Delivery Processes
DORA Metrics
Golden Paths
Advanced Build Processes
Advanced Deployment Management
Advanced Delivery Processes
Availability
Hard multi-tenancy
Advanced Platform Monitoring
Automated Node Provisioning
Continuous Resiliency Improvement
Manual Disaster Recovery Testing
Application Observability
Platform Observability
Basic Disaster Recovery
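DORA Metrics, listed under Delivery above, are commonly understood as four measures: deployment frequency, lead time for changes, change failure rate, and time to restore service. The sketch below shows one way they might be computed from deployment records. The `Deployment` record, its field names, and the sample data are all hypothetical; a real platform would pull this data from its CI/CD and incident tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # whether it caused a production failure
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_metrics(deployments: list[Deployment], period_days: int) -> dict:
    """Compute the four DORA metrics for deployments made within a period."""
    if not deployments:
        raise ValueError("at least one deployment is required")
    lead_times = [d.deployed_at - d.committed_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    restore_times = [
        d.restored_at - d.deployed_at for d in failures if d.restored_at is not None
    ]
    return {
        "deployments_per_day": len(deployments) / period_days,
        "median_lead_time": median(lead_times),
        "change_failure_rate": len(failures) / len(deployments),
        "median_time_to_restore": median(restore_times) if restore_times else None,
    }


# Hypothetical sample: four deployments in one week, one of which failed.
start = datetime(2024, 1, 1, 9, 0)
sample = [
    Deployment(start, start + timedelta(hours=2)),
    Deployment(start, start + timedelta(hours=4)),
    Deployment(start, start + timedelta(hours=6), failed=True,
               restored_at=start + timedelta(hours=6, minutes=30)),
    Deployment(start, start + timedelta(hours=8)),
]
print(dora_metrics(sample, period_days=7))
```

Medians are used here instead of averages so that a single unusually slow change does not dominate the result; that is a design choice of this sketch, not a requirement of the metrics.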
Do
➕ Start implementing best practices in code (security rules, delivery pipelines, etc.)
➕ Create a dedicated team for managing the platform
➕ Improve platform capabilities based on data
Don’t
➖ Force GitOps for all processes
➖ Delegate platform security to a dedicated team
➖ Run stateful applications on the platform (yet)
Level 4 - Productized Platform
Manage the platform with an “Everything as Code” approach.
The platform is now fully automated and improvements are being implemented in code (GitOps).
More proactive and automated measures are being used to improve application reliability, scalability and address security threats.
Requirements
✅ A scale large enough to justify higher costs
✅ Dedicated platform team or teams to manage the platform
For whom
→ Organizations with large-scale apps
→ Organizations with many stateful services
→ Organizations with regulatory or compliance requirements (e.g. GDPR, PCI DSS, HIPAA)
Benefits
- More reliable platform (at least 99.99% availability)
- Rapid delivery of more secure and reliable apps
- Faster and optimized scaling capabilities
- Infrastructure costs under control and available for optimization
- Significantly reduced risks of data leaks and break-ins
- Platform viewed internally as an essential service
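A reliability target such as 99.99% pairs naturally with the Error Budget Management practice listed for this level: the budget is the amount of failure the target still permits. The sketch below assumes the target is applied to the share of successful requests, which is only one possible way to measure it; the function name and the request counts are hypothetical.

```python
def error_budget(target_percent: float, total_requests: int, failed_requests: int) -> dict:
    """Return the failures a target allows and how much of that budget is used."""
    if not 0 <= target_percent <= 100:
        raise ValueError("target_percent must be between 0 and 100")
    allowed_failures = total_requests * (1 - target_percent / 100)
    if allowed_failures > 0:
        consumed = failed_requests / allowed_failures
    else:
        # A 100% target leaves no budget: any failure exhausts it.
        consumed = float("inf") if failed_requests else 0.0
    return {
        "allowed_failures": allowed_failures,
        "remaining_failures": allowed_failures - failed_requests,
        "consumed_fraction": consumed,
    }


# Hypothetical month: one million requests, 40 of them failed.
budget = error_budget(99.99, total_requests=1_000_000, failed_requests=40)
print(f"Allowed failures: {budget['allowed_failures']:.0f}")
print(f"Budget consumed:  {budget['consumed_fraction']:.0%}")
```

At 99.99%, one million requests leave room for about 100 failures. A team can use the consumed fraction to decide when to slow down risky changes and focus on reliability work.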
Practices and Techniques
Security
Shift Left Security
Restricted Container Runtime
Advanced Vulnerabilities Management
Advanced Data Protection
GitOps Platform Management
Platform Access Auditing
All traffic encrypted
Advanced Workloads Auditing
Compliance Policies Enforcement
Advanced Traffic Filtering
Efficiency
Advanced Application Autoscaling
Cluster Autoscaling
Advanced Volume Management
Advanced Resources Management
Costs Center Management
Platform Landing Zones
Delivery
DORA Metrics
Golden Paths
Advanced Build Processes
Advanced Deployment Management
Advanced Delivery Processes
Extended Images Build Processes
GitOps Application Management
Extended Delivery Processes
Availability
Hard multi-tenancy
Advanced Platform Monitoring
Automated Node Provisioning
Continuous Resiliency Improvement
Manual Disaster Recovery Testing
Application Observability
Platform Observability
Basic Disaster Recovery
Disaster Recovery for Persistent Storage
Chaos Engineering (non-prod)
Advanced Disaster Recovery Testing
Error Budget Management
GitOps Platform Management
Fault-tolerant Workload Distribution
Do
➕ Start treating the platform as an internal product
➕ Tighten platform security rules
➕ Improve platform capabilities based on data
Don’t
➖ Announce the platform’s SLA (yet)
➖ Stop people from testing and experimenting (in safe environments)
➖ Rely on a single infrastructure provider (or datacenter)
Level 5 - AI-Ready Enterprise Platform
Operate a strategic, continuously optimized platform capable of handling all enterprise workloads, including demanding AI/ML initiatives.
This represents the pinnacle of platform maturity.
No longer just infrastructure, the platform operates as a strategic internal product with defined SLAs, capable of reliably and efficiently running the organization’s most critical applications – from standard stateless services and stateful databases to complex, resource-intensive AI/ML training and inference workloads.
Requirements
✅ An organizational strategy that includes continuous development and maintenance of the platform (software costs, people)
For whom
→ Organizations with the highest requirements for platform security, reliability and cost effectiveness
Benefits
- Implemented Zero Trust Environment
- Platform as a product with defined SLA
- Capability to run any type of workload (stateless, stateful, machine learning, serverless)
- Detailed insight into the cost of operating the platform and applications
- High confidence in the platform’s capabilities and reliability
Practices and Techniques
Security
Advanced Access Management
Restricted Container Runtime
Shift Left Security
Advanced Data Protection
Advanced Vulnerabilities Management
GitOps Platform Management
Platform Access Auditing
All traffic encrypted
Advanced Workloads Auditing
Compliance Policies Enforcement
Advanced Traffic Filtering
Identity-based Access Control
Zero Trust Environment
Network Traffic Auditing
Efficiency
Advanced Application Autoscaling
Cluster Autoscaling
Advanced Volume Management
Advanced Resources Management
Costs Center Management
Platform Landing Zones
Advanced Costs Management
Just-in-time Capacity
Delivery
DORA Metrics
Golden Paths
Advanced Build Processes
Advanced Deployment Management
Advanced Delivery Processes
Extended Images Build Processes
GitOps Application Management
Extended Delivery Processes
Kubernetes Landing Zone
Internal Developer Platform
Availability
Hard multi-tenancy
Advanced Platform Monitoring
Automated Node Provisioning
Continuous Resiliency Improvement
Manual Disaster Recovery Testing
Application Observability
Platform Observability
Basic Disaster Recovery
Disaster Recovery for Persistent Storage
Chaos Engineering (non-prod)
Advanced Disaster Recovery Testing
Error Budget Management
GitOps Platform Management
Fault-tolerant Workload Distribution
Platform SLA
Chaos Engineering in Production
Multi-cluster Platform
Do
➕ Prepare and announce the platform’s SLA
➕ Obtain official confirmation of the platform’s compliance with security standards
➕ Encourage people to run all their workloads on the platform
Don’t
➖ Stop improving the platform
Ready for the Next Level?
Understanding the stages of platform maturity provides a valuable map for your organization’s journey. You can now pinpoint your current location, whether you’re establishing foundational capabilities or optimizing existing processes.
It’s important to recognize that while Level 2 or even Level 3 offers a starting point and proves initial value, it’s generally insufficient for running critical applications reliably or securely in the long term.
Continuing the evolution is crucial for sustainable success.
For many organizations, achieving Level 4 represents a powerful and sufficient end state. Reaching this level signifies a highly automated, secure, and reliable platform managed effectively through code, capable of handling complex and stateful workloads with confidence.
It’s a significant accomplishment that delivers substantial business value.
However, for organizations aiming for the absolute leading edge, especially those heavily investing in AI/ML or operating with the highest demands for resilience, efficiency, and strategic agility, Level 5 represents the next frontier.
This level transforms the platform into a core strategic asset, fully optimized for any workload, including the most demanding AI initiatives. It embodies Zero Trust security, advanced cost control, and practices like production Chaos Engineering.
While achieving Level 5 requires a significant commitment, it provides a future-proof foundation, maximizing innovation potential and offering unparalleled capabilities. Investing resources to reach this stage unlocks clear, long-term benefits and a distinct competitive advantage.
If you’d like to discuss your platform’s current maturity, evaluate the benefits and requirements of advancing, or strategize the most effective path forward – whether your target is Level 4 or the pursuit of Level 5 – I’m available to share insights and help you navigate your unique platform evolution.