ASPICE Capability Level 3 in 90 Days: the STS Implementation Playbook (from a CL1 Start)

1. CL1 → CL3 reality check: what “managed” and “established” actually add

We’ve run this transition often enough to be blunt about it: 90 days from a genuine CL1 baseline to a defensible CL3 rating is aggressive, but it is achievable on a focused process scope, with a team that already performs the engineering work well. If your base practices are weak, ninety days buys you nothing; you cannot put a process-management layer on top of work that isn’t happening. So the first honest question is never “how do we get to CL3?” It is “is our CL1 real?”

Under Automotive SPICE PAM 3.1 the capability dimension runs CL0 through CL5. CL0 (incomplete) means the process largely fails to achieve its purpose. CL1 (performed) means the base practices are executed and the expected work products exist: you build the architecture, you write the tests, you produce the outputs. That is rated against the single attribute PA 1.1, process performance.

CL2 (managed) adds two attributes that have nothing to do with engineering skill and everything to do with discipline. PA 2.1, performance management, asks whether the work is planned, monitored and adjusted: are there objectives, estimates, a schedule, defined responsibilities, and evidence you re-plan when reality diverges? PA 2.2, work product management, asks whether the outputs are identified, documented, version-controlled, reviewed and controlled for change. In our experience this is where most teams who insist “we already do all this” quietly leak ratings: the work is done, but the management of the work is informal.

CL3 (established) adds the process-asset layer. PA 3.1, process definition, asks whether a standard process exists (an organisational asset, not a project folder) with defined roles, competencies, infrastructure and tailoring guidance. PA 3.2, process deployment, asks whether each project actually deploys that standard process through tailoring, with the data and resources to run it and feedback that flows back to improve the asset. The mental shift is the whole point: CL3 is not heroics by a strong team. It is the same good outcome produced because the organisation defined how, not because three senior engineers happened to know how.

Every process attribute is rated on the N-P-L-F scale (Not / Partially / Largely / Fully achieved). A level is reached only when its own attributes are at least Largely achieved and every attribute of the levels below it is Fully achieved; for CL3 that means PA 1.1, PA 2.1 and PA 2.2 at Fully, and PA 3.1 and PA 3.2 at Largely or better. That ratchet is why scope discipline matters: it is far better to take eight processes solidly to CL3 than to smear effort across twenty and land everything at a wobbly CL2.
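The ratchet can be made concrete with a small sketch. It assumes the usual reading of the measurement framework (a level needs its own attributes at Largely or Fully, and every lower-level attribute at Fully) and covers only CL1 to CL3. The function name `capability_level` and the dictionary layout are illustrative choices, not part of the PAM.

```python
from typing import Dict

# Process attributes per capability level (CL1 to CL3 only, matching this playbook's scope).
LEVEL_ATTRIBUTES = {
    1: ["PA 1.1"],
    2: ["PA 2.1", "PA 2.2"],
    3: ["PA 3.1", "PA 3.2"],
}

# Ordering of the N-P-L-F scale, lowest to highest.
SCALE = {"N": 0, "P": 1, "L": 2, "F": 3}


def capability_level(ratings: Dict[str, str]) -> int:
    """Return the highest level (0 to 3) supported by the attribute ratings.

    Attributes missing from `ratings` are treated as Not achieved.
    """
    achieved = 0
    for level in sorted(LEVEL_ATTRIBUTES):
        # The level's own attributes must be at least Largely achieved.
        own_ok = all(
            SCALE[ratings.get(pa, "N")] >= SCALE["L"]
            for pa in LEVEL_ATTRIBUTES[level]
        )
        # Every attribute of every lower level must be Fully achieved.
        lower_ok = all(
            SCALE[ratings.get(pa, "N")] == SCALE["F"]
            for lower in range(1, level)
            for pa in LEVEL_ATTRIBUTES[lower]
        )
        if not (own_ok and lower_ok):
            break
        achieved = level
    return achieved


# A single Largely on PA 2.1 caps the process at CL2, however strong PA 3.x looks.
print(capability_level(
    {"PA 1.1": "F", "PA 2.1": "L", "PA 2.2": "F", "PA 3.1": "F", "PA 3.2": "F"}
))
```

Run per process in scope, this kind of check shows quickly which CL2 attribute is holding a CL3 claim back.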

2. The 12-week plan at a glance

We structure the 90 days as three phases. The sequencing is deliberate: you cannot define a sensible standard process (PA 3.1) until you understand how the work is actually managed (PA 2.1 / PA 2.2), and you cannot pilot tailoring until the standard process exists.

Weeks 1 to 4: baseline & GP 2.x institutionalisation

Week 1: confirm scope, run a gap assessment against the rating you claim today, and freeze the VDA process set (see section 3). Inventory existing work products against the PAM’s expected outputs.
Weeks 2 to 3: close PA 2.1 gaps (project objectives, estimates, schedule, defined responsibilities, and a re-planning trigger). Close PA 2.2 gaps (configuration identification, baselining, review records, change control).
Week 4: run the GP 2.x practices live on one project for a full cycle so there is objective evidence, not just a procedure document.

Weeks 5 to 9: standard process & tailoring (GP 3.x)

Weeks 5 to 6: distil the managed practices into an organisational standard process: process descriptions, role and competency definitions, required infrastructure, and entry/exit criteria per activity.
Weeks 7 to 8: build the tailoring guideline, meaning the rules and decision criteria for adapting the standard process to a project’s context (variant, ASIL, supplier model, reuse).
Week 9: stand up the process asset library and the feedback mechanism (lessons learned, metrics collection, a defined improvement route back to the asset owner).

Weeks 10 to 12: pilot, readiness, evidence

Weeks 10 to 11: deploy the standard process on the pilot project via a recorded tailoring decision; collect deployment data and demonstrate the feedback loop firing at least once.
Week 12: assessment-readiness review. Trace every GP to objective evidence, dry-run the interviews, and fix the gaps an experienced assessor will probe first.

3. Which processes first: the VDA scope

You do not assess all 32 processes of the PAM at once, and for a CL3 push you certainly don’t. We anchor on the VDA scope, the engineering core that OEMs ask for in practice, which spans SYS.2 to SYS.5 and SWE.1 to SWE.6 plus ACQ.4 and the supporting MAN/SUP processes, and we treat the support and management processes as enablers, not afterthoughts.

The engineering V is the heart of it:

SYS.2 System Requirements Analysis
SYS.3 System Architectural Design (with SYS.4 System Integration and Integration Test and SYS.5 System Qualification Test completing the system side where they are in scope)
SWE.1 Software Requirements Analysis
SWE.2 Software Architectural Design
SWE.3 Software Detailed Design and Unit Construction
SWE.4 Software Unit Verification
SWE.5 Software Integration and Integration Test
SWE.6 Software Qualification Test

The enablers carry more weight than teams expect, because the GP 2.x and 3.x attributes lean on them:

SUP.1 Quality Assurance: independent confirmation that processes and products conform. Assessors read this early; if QA has no teeth, GP institutionalisation is hard to argue.
SUP.8 Configuration Management: the backbone of PA 2.2. Without clean CM there is no baselining, no controlled change, no defensible traceability.
SUP.9 Problem Resolution Management.
SUP.10 Change Request Management. SUP.9 and SUP.10 together are where “plans that match reality” live or die.
MAN.3 Project Management: the natural home for much of the PA 2.1 evidence (objectives, estimates, schedule, monitoring).

A pragmatic ordering inside the 12 weeks: stabilise SUP.8 and MAN.3 first (they feed every GP 2.x rating), then walk the engineering V top to bottom so that traceability is built as you go rather than retrofitted at the end. Retrofitted traceability is the single most common cause of a slipped timeline we see.

4. GP 2.x first: where the “we already do this” teams leak

Spend the first four weeks here, because CL2 is the ratchet that CL3 sits on. Two practice families, and both are mundane in a way strong engineers underestimate.

Performance management (PA 2.1)

The GP 2.1.x practices want: process performance objectives, a plan for performing the process, estimates behind that plan, defined responsibilities and authorities, adequate resources and information, and managed interfaces between the parties involved. The recurring failure isn’t the absence of a plan; it is a plan that was written once and never updated, with no estimation basis and no monitoring against it. We’ve seen roughly 60% of first-time CL2 self-claims fall here: the project ran fine, but there is no objective evidence the team monitored and adjusted performance against defined objectives. Fix it with a thin, real cadence: a monitored metric per process, a re-planning trigger, and minutes that show adjustment actually happened.
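A re-planning trigger does not need tooling to be real. The sketch below is one possible shape for it, assuming a single monitored metric per process and a relative deviation threshold. The `MetricReading` type, the `needs_replanning` function and the 15% default are hypothetical illustrations; the PAM does not prescribe a threshold.

```python
from dataclasses import dataclass


@dataclass
class MetricReading:
    process_id: str   # e.g. "SWE.4"
    planned: float    # planned value for the period (hours, test cases, ...)
    actual: float     # measured value for the same period


def needs_replanning(reading: MetricReading, threshold: float = 0.15) -> bool:
    """True when actual deviates from plan by more than `threshold` (relative).

    The 15% default is an assumed example value, to be set per project.
    """
    if reading.planned == 0:
        # Nothing was planned: any actual effort is an unplanned deviation.
        return reading.actual != 0
    deviation = abs(reading.actual - reading.planned) / abs(reading.planned)
    return deviation > threshold


readings = [
    MetricReading("SWE.4", planned=100, actual=120),
    MetricReading("SWE.5", planned=40, actual=42),
]
for reading in readings:
    if needs_replanning(reading):
        print(f"{reading.process_id}: deviation exceeds threshold, record a re-plan")
```

The evidence an assessor looks for is not the check itself but the dated record of what was decided when the trigger fired.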

Work product management (PA 2.2)

The GP 2.2.x practices want work products with defined requirements (including structure and content), documented and controlled, under configuration management, reviewed and adjusted. This is where SUP.8 earns its keep. The leak here is almost always review discipline: outputs exist and are even version-controlled, but there is no record that they were reviewed against criteria and that findings were tracked to closure. A review without a recorded checklist, a reviewer, a date and a disposition is, to an assessor, a conversation that may or may not have happened. Make every expected work product traceable to a review record. That one habit lifts more GP 2.2 ratings than any tool purchase.
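The "every work product traceable to a review record" habit can be checked mechanically. This is a minimal sketch, assuming review records are exported as simple structures; the `ReviewRecord` fields mirror the checklist, reviewer, date and disposition named above, and every identifier in it is hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass
class ReviewRecord:
    work_product_id: str
    checklist_id: Optional[str] = None
    reviewer: Optional[str] = None
    review_date: Optional[date] = None
    disposition: Optional[str] = None   # e.g. "accepted", "rework"
    open_findings: int = 0


def record_gaps(record: ReviewRecord) -> List[str]:
    """List what is missing before the record counts as review evidence."""
    gaps = [
        name
        for name in ("checklist_id", "reviewer", "review_date", "disposition")
        if getattr(record, name) is None
    ]
    if record.open_findings > 0:
        gaps.append("open_findings")
    return gaps


def unreviewed_work_products(
    work_product_ids: Iterable[str], records: Iterable[ReviewRecord]
) -> List[str]:
    """Work products with no complete review record at all."""
    complete = {r.work_product_id for r in records if not record_gaps(r)}
    return sorted(set(work_product_ids) - complete)


records = [
    ReviewRecord("SWE.2-ARCH-001", "CHK-ARCH", "j.doe", date(2026, 3, 2), "accepted"),
    ReviewRecord("SWE.1-SRS-001", reviewer="j.doe"),
]
print(unreviewed_work_products(
    ["SWE.2-ARCH-001", "SWE.1-SRS-001", "SWE.3-DD-001"], records
))
```

A record with open findings is deliberately treated as incomplete here, on the assumption that "tracked to closure" is part of what makes the review count.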

5. GP 3.x: define the standard process, then tailor it

CL3 is frequently misread as “do CL2 harder.” It is not. The defining idea is that the good result is produced because the organisation defined how, and each project tailors that definition. Two practice families again.

Process definition (PA 3.1)

You need a standard process as an organisational asset: a documented process description with the sequencing and interaction of activities, the roles and competencies required to perform it, the infrastructure and work environment needed, and, crucially, tailoring guidelines. Suitable methods for monitoring the process’s effectiveness must also be determined. The artefact teams most often skip is the tailoring guideline itself; without it there is nothing to tailor from, and PA 3.2 collapses with it.

Process deployment (PA 3.2)

Each project must deploy the standard process by selecting it and tailoring it according to the guidelines, then assign roles and responsibilities, ensure the required competencies, provide the resources and infrastructure, collect process data, and use that data to manage and improve the standard process. The litmus test we apply: can you show one recorded tailoring decision and one piece of feedback that actually changed the asset? If the standard process has never been tailored and has never absorbed a lesson learned, it is a binder on a shelf: documented, but not established. Build the process asset library so that tailoring is a five-minute recorded decision, not an archaeology project, and so that the feedback route back to the process owner is a real, used channel.
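The litmus test above reduces to two questions that a script can ask of the process asset library. The sketch below assumes tailoring decisions and feedback items are available as plain records; `TailoringDecision`, `FeedbackItem` and `deployment_litmus` are hypothetical names, and the fields are a guess at a minimum useful shape rather than a required format.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class TailoringDecision:
    project: str
    standard_process_version: str
    rationale: str          # why the project deviates, or why it does not


@dataclass(frozen=True)
class FeedbackItem:
    project: str
    summary: str
    changed_asset: bool     # True once the standard process was actually updated


def deployment_litmus(
    projects: Iterable[str],
    decisions: Iterable[TailoringDecision],
    feedback: Iterable[FeedbackItem],
) -> List[str]:
    """Return findings; an empty list means the litmus test passes."""
    findings = []
    # A decision without a rationale is treated as not recorded.
    tailored = {d.project for d in decisions if d.rationale.strip()}
    for project in sorted(set(projects) - tailored):
        findings.append(f"{project}: no recorded tailoring decision")
    if not any(item.changed_asset for item in feedback):
        findings.append("no feedback item has changed the standard process")
    return findings


print(deployment_litmus(
    projects=["P1", "P2"],
    decisions=[TailoringDecision("P1", "1.0", "no supplier, ACQ.4 not applicable")],
    feedback=[FeedbackItem("P1", "review checklist too long", changed_asset=False)],
))
```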

6. Work products the assessor reads first

An experienced assessor does not start with your process manual. They sample work products and follow the threads. Get these right and the interviews go calmly; get them wrong and no amount of documentation rescues the rating.

Bidirectional traceability

The PAM is explicit that traceability must be bidirectional: stakeholder requirements ↔ system requirements (SYS.2) ↔ system architecture (SYS.3) ↔ software requirements (SWE.1) ↔ software architecture (SWE.2) ↔ detailed design (SWE.3), and forward into verification (SWE.4 unit verification, SWE.5 integration test, SWE.6 qualification test), with test cases tracing back to the requirements they verify. “Bidirectional” is not decoration: the assessor will pick a requirement and walk it down to a unit test, then pick a test case and walk it up to a requirement. Gaps in either direction are findings.
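The two walks an assessor performs can be rehearsed on exported link data. This is a minimal sketch for one pair of adjacent levels (for example software requirements and qualification test cases), assuming links are available as (upstream, downstream) ID pairs; the function name `trace_gaps` and all IDs are hypothetical. It checks completeness only; whether a linked test genuinely verifies its requirement still needs a human review.

```python
from typing import Dict, Iterable, List, Tuple

Link = Tuple[str, str]  # (upstream item ID, downstream item ID)


def trace_gaps(
    upstream_ids: Iterable[str],
    downstream_ids: Iterable[str],
    links: Iterable[Link],
) -> Dict[str, list]:
    """Report gaps in both directions between two adjacent levels."""
    upstream, downstream = set(upstream_ids), set(downstream_ids)
    links = list(links)
    # Only links whose both ends exist count as real trace links.
    valid = [(u, d) for u, d in links if u in upstream and d in downstream]
    linked_up = {u for u, _ in valid}
    linked_down = {d for _, d in valid}
    return {
        # Walking down: items with nothing below them.
        "no_downstream": sorted(upstream - linked_up),
        # Walking up: items with nothing above them.
        "no_upstream": sorted(downstream - linked_down),
        # Links pointing at items that do not exist (stale or mistyped IDs).
        "dangling_links": sorted(set(links) - set(valid)),
    }


report = trace_gaps(
    upstream_ids=["SWR-1", "SWR-2", "SWR-3"],
    downstream_ids=["TC-1", "TC-2"],
    links=[("SWR-1", "TC-1"), ("SWR-2", "TC-1"), ("SWR-9", "TC-1")],
)
print(report)
```

Running the same check for each adjacent pair of levels in the V gives a gap list per interface, which is easier to close incrementally than a single end-to-end report.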

Consistency

Beyond mere links, the PAM expects consistency between levels: that the architecture actually satisfies the requirements, that the design realises the architecture, that the tests genuinely exercise what the requirements demand. A trace link to a requirement that the test does not actually verify is worse than a missing link, because it is a claim that doesn’t hold.

Review records and verification evidence

For every level: review records with criteria, reviewer, date and findings closed; and verification evidence that closes the V: unit verification results (SWE.4), integration test results (SWE.5), and qualification test results against the software requirements (SWE.6), with coverage you can defend. These are the artefacts an assessor samples to confirm the GPs are real rather than asserted.

7. Common downgrades, and how we pre-empt them

The same handful of issues account for the large majority of ratings lost in CL3 assessments. We run an internal pre-assessment specifically to hunt them.

Traceability gaps. The most frequent downgrade by far. Forward links present, backward links missing; or links present but inconsistent (the test doesn’t verify the requirement it points to). One broken thread sampled by the assessor implies the population is unreliable.
No tailoring evidence. A beautifully written standard process that no project has ever tailored from. PA 3.2 needs a recorded tailoring decision per project, not an assertion that tailoring is “allowed.”
Plans that don’t match reality. A MAN.3 plan dated at kickoff and never revised while the project visibly diverged. This sinks PA 2.1: there is no evidence of monitoring and adjustment.
Missing objective evidence of GP institutionalisation. The process exists on paper but the project ran the old informal way. Assessors rate evidence, not intentions; a procedure with no instances of it being followed is Not achieved, however well written.
Toothless QA (SUP.1). QA that reports but never causes change, or that has no independence, undermines the institutionalisation argument across the board.
CM gaps (SUP.8). No clean baselines means PA 2.2 cannot be Largely achieved, and traceability cannot be trusted because the items it links are not under control.

Our pre-emption is mechanical: a traceability completeness-and-consistency check across the full sampled thread set, one recorded tailoring decision per pilot, plan-versus-actual diffs on the pilot’s MAN.3 artefacts, and a GP-to-evidence matrix where every generic practice points at a dated, attributable artefact. If a cell is empty in week 12, it will be empty in the assessment.
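The GP-to-evidence matrix lends itself to an automated empty-cell check. The sketch below assumes the matrix is keyed by (process, generic practice) and that an entry only counts when it carries both a date and an owner, matching "dated, attributable" above. `Evidence`, `empty_cells` and the sample cells are hypothetical, and the GP labels are used purely as example keys.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional, Tuple

Cell = Tuple[str, str]  # (process ID, generic practice ID)


@dataclass
class Evidence:
    artefact: str
    dated: Optional[date] = None
    owner: Optional[str] = None

    def is_usable(self) -> bool:
        """Evidence counts only when it is both dated and attributable."""
        return self.dated is not None and bool(self.owner)


def empty_cells(matrix: Dict[Cell, List[Evidence]], required: List[Cell]) -> List[Cell]:
    """Required cells with no dated, attributable artefact behind them."""
    return [
        cell
        for cell in required
        if not any(e.is_usable() for e in matrix.get(cell, []))
    ]


required = [("MAN.3", "GP 2.1.1"), ("MAN.3", "GP 2.1.3"), ("SUP.8", "GP 2.2.3")]
matrix = {
    ("MAN.3", "GP 2.1.1"): [Evidence("project plan v3", date(2026, 2, 10), "PM")],
    ("MAN.3", "GP 2.1.3"): [Evidence("status minutes")],  # undated, no owner
}
print(empty_cells(matrix, required))
```

An entry that exists but lacks a date or owner is reported the same way as a missing one, on the assumption that an assessor would treat both as absent evidence.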

8. Sustaining CL3: it decays without ownership

A rating is a snapshot; capability is a behaviour, and behaviours decay. We’ve watched hard-won CL3 ratings slide back within a year for one reason: nobody owned the standard process after the assessment. The asset goes stale, projects quietly drift to their own ways, tailoring decisions stop being recorded, and the feedback loop, the very thing PA 3.2 demands, goes silent.

What keeps it alive is unglamorous and non-negotiable. First, a named process group (it can be small) that owns the process asset library, reviews tailoring decisions, and triages improvement feedback; an engineering process group is the difference between “established once” and “established.” Second, a live feedback loop: every project contributes at least one lesson learned and at least one metric back to the asset, and you can show the asset changing because of it. Third, periodic internal assessments (a light quarterly self-check on the sampled threads and a fuller annual pass), so drift is caught at the GP level long before an external assessor would. Fourth, competency upkeep: roles defined in the standard process need the training behind them maintained, or PA 3.1 hollows out as people change.

None of this is about chasing a certificate again; Automotive SPICE produces an assessment rating, not a certification, in the first place. It is about keeping the organisation in the state where the good engineering outcome is the default, produced because the process defines how. That is, after all, the entire definition of an established process.

Want a 30-min walkthrough on your project?

No NDA needed. Tell us the standard, the item or asset, the assessor, and your deadline. Within 48 hours you’ll get a one-page diagnostic mapped to the points above, yours to keep, whether or not you hire us.


Author: Adrian Valea, Founder & Managing Director, SafetyTrust Software Technology GmbH. ASPICE Provisional Assessor (intacs / VDA), Automotive SPICE for Cybersecurity (intacs), Functional Safety Engineer (TÜV Rheinland), Automotive Cybersecurity (TÜV NORD). Published 2026-05-28.
