1Z0-1067-25 Syllabus — Learning Objectives by Topic

Learning objectives for the Oracle Cloud Infrastructure 2025 Cloud Operations Professional exam (1Z0-1067-25), organized by topic with quick links to targeted practice.

Use this syllabus as your source of truth for 1Z0-1067-25.

What’s covered

Topic 1: OCI Operations Foundations and Governance
Topic 2: Identity, Access, and Security Operations
Topic 3: Monitoring, Metrics, and Alarms
Topic 4: Logging, Audit, and Troubleshooting Signals
Topic 5: Automation, IaC, and Operations at Scale
Topic 6: Cost Management and Operational Excellence

Topic 1: OCI Operations Foundations and Governance

Practice this topic →

1.1 Operating model and shared responsibility

Explain the shared responsibility model for OCI services and what operations teams own vs what OCI manages.
Differentiate break/fix tasks from reliability engineering tasks and identify which improves long-term outcomes.
Given a scenario, choose an operational workflow that includes verification and rollback.
Recognize how environment separation (dev/test/prod) reduces incident risk and improves governance.
Identify common operational risks: configuration drift, undocumented changes, and lack of runbooks.
Explain why baselines and change tracking are required to detect regressions.

1.2 Compartments, tagging, and standardization

Design compartment structures that align with teams, environments, and least privilege.
Explain how tagging supports cost allocation, ownership, and governance.
Given a scenario, choose a tagging standard that enables reporting and automation.
Recognize the difference between tag namespaces and tag keys/values (concept-level).
Identify why consistent naming conventions reduce operational mistakes and speed incident response.
Explain how governance standards enable automated guardrails and policy enforcement.
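
As an illustration of how a tagging standard can feed automated guardrails, the sketch below checks one resource's defined tags against a required set. The namespace `Operations`, its keys, and the allowed environment values are hypothetical examples, not part of the exam material.

```python
# Hypothetical tagging standard: tag namespace -> required tag keys.
REQUIRED_TAGS = {"Operations": {"CostCenter", "Owner", "Environment"}}
ALLOWED_ENVIRONMENTS = {"dev", "test", "prod"}


def tag_violations(defined_tags):
    """Return a list of violations for one resource.

    defined_tags uses the nested shape {namespace: {key: value}}.
    """
    problems = []
    for namespace, keys in REQUIRED_TAGS.items():
        present = set(defined_tags.get(namespace, {}))
        for key in sorted(keys - present):
            problems.append(f"missing {namespace}.{key}")
    env = defined_tags.get("Operations", {}).get("Environment")
    if env is not None and env not in ALLOWED_ENVIRONMENTS:
        problems.append(f"invalid Operations.Environment={env}")
    return problems
```

A check like this could run in a pipeline before provisioning, so that untagged resources never reach cost reports.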

1.3 Limits, quotas, and capacity planning

Differentiate service limits from quotas and explain how each affects provisioning.
Given a scenario, identify symptoms of hitting limits (failed provisioning, throttling) and next steps.
Explain why capacity planning must include peak load, growth, and failure headroom.
Recognize how noisy neighbors can appear as performance degradation and how isolation helps (concept-level).
Identify which signals indicate capacity pressure (CPU, memory, I/O, queue backlogs) at a conceptual level.
Design an operational process for requesting limit increases and documenting dependencies.
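
The capacity-planning objective (peak load, growth, and failure headroom) can be made concrete with a small calculation. This is a minimal sketch with assumed inputs: the function names, the compound-growth model, and the per-instance throughput figure are illustrative choices, not an OCI sizing method.

```python
import math


def required_instances(peak_rps, per_instance_rps, annual_growth,
                       months_ahead, tolerated_failures):
    """Instances needed to serve projected peak load plus failure headroom."""
    # Project the peak forward with compound growth (an assumed model).
    projected_peak = peak_rps * (1 + annual_growth) ** (months_ahead / 12)
    serving = math.ceil(projected_peak / per_instance_rps)
    # Spare instances so the fleet survives the tolerated number of failures.
    return serving + tolerated_failures


def fits_within_limit(total_needed, service_limit):
    """True when the plan fits the current limit; otherwise request an increase."""
    return total_needed <= service_limit
```

Comparing the result against the current limit ahead of time turns a failed provisioning call during an incident into a planned limit-increase request.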

Topic 2: Identity, Access, and Security Operations

Practice this topic →

2.1 IAM operations and least privilege

Apply least privilege to users, groups, and policies for day-to-day operations.
Given a scenario, design policies that separate duties for provisioning vs audit vs security controls.
Recognize common IAM pitfalls: broad wildcard policies, shared admin accounts, and unmanaged credentials.
Explain how dynamic groups and resource principals support workload identities (concept-level).
Identify how to troubleshoot authorization failures by inspecting policies, compartments, and resource scope.
Design an access review process (who has access, why, and how to remove it safely).
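
To illustrate the "broad wildcard policies" pitfall, the sketch below flags overly broad statements written in the common `Allow group <name> to <verb> <resource-type> in <scope>` form. It is a simplified, hypothetical linter: it handles only that basic form (no `where` conditions, no `any-user`), and the finding labels are invented for this example.

```python
import re

# Matches only the simple statement form; anything else is left for manual review.
STATEMENT = re.compile(
    r"^allow\s+(?:group|dynamic-group)\s+(?P<subject>\S+)\s+to\s+"
    r"(?P<verb>inspect|read|use|manage)\s+(?P<resource>\S+)\s+"
    r"in\s+(?P<scope>tenancy|compartment\s+\S+)",
    re.IGNORECASE,
)


def broad_policy_findings(statement):
    """Return a list of reasons a policy statement looks too broad."""
    match = STATEMENT.match(statement.strip())
    if match is None:
        return ["unparsed statement; review manually"]
    findings = []
    if match["resource"].lower() == "all-resources":
        findings.append("wildcard resource type")
    if match["verb"].lower() == "manage" and match["scope"].lower() == "tenancy":
        findings.append("manage at tenancy scope")
    return findings
```

Run against exported policy statements, a check of this kind can feed a periodic access review.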

2.2 Secrets, keys, and secure configuration

Store secrets and keys in Vault and avoid embedding them in code, images, or config repos.
Given a scenario, choose a rotation strategy for keys/secrets that minimizes downtime (concept-level).
Recognize common secret leak paths (logs, CI output, backups) and mitigations.
Explain why encryption at rest and in transit matter for regulated workloads (concept-level).
Identify how to scope secrets access by compartment and workload identity (concept-level).
Design incident containment steps for suspected credential compromise (revoke/rotate/audit).

2.3 Auditability and security posture

Explain the role of audit trails in incident response and compliance investigations.
Given a scenario, choose what to log/audit for privileged actions and configuration changes.
Recognize the risk of log tampering and how to protect log integrity (concept-level).
Identify how Cloud Guard fits into security operations and posture monitoring (concept-level).
Explain why security findings must be prioritized by risk and blast radius, not volume.
Design a workflow to triage security findings and track remediation to closure.

Topic 3: Monitoring, Metrics, and Alarms

Practice this topic →

3.1 Monitoring fundamentals and baselines

Define key operational signals: latency, errors, throughput, saturation (concept-level).
Given a scenario, choose metrics that represent user impact rather than internal noise.
Explain why baselines and seasonality matter when setting alert thresholds.
Recognize leading indicators vs lagging indicators and choose appropriately for early detection.
Identify which teams should own which alerts and why ownership reduces time-to-recover.
Design dashboards that separate high-level health from deep-dive diagnostics.
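
One way to see why baselines and seasonality matter for thresholds: the sketch below derives a separate threshold for each hour of the day from historical samples, using mean plus `k` standard deviations. The function name, the `(hour, value)` input shape, and the choice of `k = 3` are assumptions for illustration; real baselining may use different statistics.

```python
from collections import defaultdict
from statistics import mean, pstdev


def hourly_thresholds(samples, k=3.0):
    """samples: iterable of (hour_of_day, value).

    Returns {hour: mean + k * population standard deviation}, so a busy
    hour gets a higher threshold than a quiet one.
    """
    by_hour = defaultdict(list)
    for hour, value in samples:
        by_hour[hour].append(value)
    return {hour: mean(values) + k * pstdev(values)
            for hour, values in by_hour.items()}
```

A single static threshold tuned for the quiet hours would page constantly at peak; one tuned for peak would miss anomalies overnight.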

3.2 Alarm design and notification routing

Create an alarm strategy that reduces alert fatigue and improves actionability (concept-level).
Given a scenario, choose appropriate alarm windows and suppression to prevent flapping.
Explain how notifications and escalation policies support incident response (concept-level).
Recognize the difference between informational alerts and paging alerts and set thresholds accordingly.
Identify how to route alerts to teams and tools while preserving context (runbooks, links, metadata).
Design a post-incident review that updates alarms based on what was missed or noisy.
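
The effect of alarm windows on flapping can be sketched as a small state machine: the alarm changes state only after several consecutive datapoints agree. This is a generic illustration with assumed parameter names (`trigger_after`, `clear_after`), not a description of how OCI Monitoring evaluates alarms internally.

```python
def alarm_states(datapoints, threshold, trigger_after=3, clear_after=3):
    """Return the alarm state after each datapoint.

    The alarm fires only after `trigger_after` consecutive breaches and
    clears only after `clear_after` consecutive healthy datapoints.
    """
    firing = False
    streak = 0  # consecutive datapoints that disagree with the current state
    states = []
    for value in datapoints:
        breaching = value > threshold
        if breaching != firing:
            streak += 1
            needed = trigger_after if breaching else clear_after
            if streak >= needed:
                firing = breaching
                streak = 0
        else:
            streak = 0
        states.append("FIRING" if firing else "OK")
    return states
```

With a threshold of 90, a single spike to 95 does not fire the alarm, and a single dip to 50 does not clear it once it is firing.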

3.3 Service health and dependency monitoring

Given a scenario, monitor critical dependencies (network, identity, storage) and detect cascading failures.
Explain why synthetic checks complement passive monitoring for availability detection (concept-level).
Recognize quota/limit pressure as an availability risk and monitor for it (concept-level).
Identify when to monitor at the edge (LB/API) vs the app tier vs the database tier for fastest diagnosis.
Design a dependency map (concept-level) that informs incident triage order.
Choose guardrail alerts that block risky changes when critical metrics regress.
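
A dependency map can inform triage order by listing each component after the things it depends on, so shared foundations are checked first. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later); the component names and the map itself are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical map: component -> the components it depends on.
DEPENDS_ON = {
    "app": {"database", "identity"},
    "database": {"block-storage", "network"},
    "identity": {"network"},
}


def triage_order(depends_on):
    """Return components with dependencies listed before their dependents."""
    return list(TopologicalSorter(depends_on).static_order())
```

During a cascading failure, checking `network` before `identity` and `app` avoids chasing symptoms in the upper tiers.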

Topic 4: Logging, Audit, and Troubleshooting Signals

Practice this topic →

4.1 Logging strategy and hygiene

Differentiate logs, metrics, and traces (concept-level) and choose the right signal for a question.
Design a log strategy: what to collect, where to route it, and retention requirements.
Given a scenario, ensure logs are useful for diagnosis without leaking secrets or PII.
Recognize why consistent correlation IDs improve debugging across distributed services (concept-level).
Identify how to reduce logging cost without losing critical signal (sampling, severity filtering) at a conceptual level.
Design log access controls that support investigations while preventing overexposure.
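
Two of the objectives above, correlation IDs and avoiding secret leaks, can be combined in one small sketch: every log line carries a correlation ID, and obvious `key=value` secrets are masked before the line is emitted. The field names and the redaction pattern are assumptions for illustration; a real deployment would need a broader pattern set.

```python
import json
import re

# Masks values of a few obviously sensitive keys; deliberately not exhaustive.
SECRET_PATTERN = re.compile(r"(?i)\b(password|token|secret)=(\S+)")


def redact(message):
    """Replace the value of sensitive key=value pairs with a marker."""
    return SECRET_PATTERN.sub(lambda m: f"{m.group(1)}=[REDACTED]", message)


def log_line(correlation_id, severity, message):
    """Build one structured log line that can be joined across services."""
    record = {
        "correlation_id": correlation_id,
        "severity": severity,
        "message": redact(message),
    }
    return json.dumps(record, sort_keys=True)
```

Searching all services' logs for one `correlation_id` value then reconstructs a single request's path without exposing credentials.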

4.2 Audit trails and change detection

Use audit trails to answer: who changed what, when, and from where (concept-level).
Given a scenario, correlate audit events with deployment timelines to identify the likely regression cause.
Recognize why “unknown changes” prolong incidents and how change management reduces that risk.
Identify which actions should require approvals and how to enforce them operationally (concept-level).
Design a workflow to detect drift between intended configuration and actual configuration (concept-level).
Explain why tamper-resistant logs and retention are required for compliance contexts (concept-level).
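
Drift detection reduces to comparing intended configuration with what is actually deployed. The following is a minimal sketch over flat dictionaries; the function name, the `<absent>` marker, and the example keys are hypothetical, and real tooling (for example a Terraform plan) compares far richer state.

```python
def detect_drift(intended, actual):
    """Return {key: (intended_value, actual_value)} for every difference."""
    drift = {}
    for key in sorted(intended.keys() | actual.keys()):
        want = intended.get(key, "<absent>")
        have = actual.get(key, "<absent>")
        if want != have:
            drift[key] = (want, have)
    return drift
```

For example, an ingress rule opened by hand during an incident would appear as a key present in `actual` but absent from `intended`.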

4.3 Troubleshooting workflow and common failure classes

Apply a structured triage workflow: reproduce, isolate, identify layer, validate fix.
Given a scenario, distinguish IAM failures from network failures from service-limit failures based on symptoms.
Recognize common outage patterns: DNS issues, expired certificates, exhausted quotas, and misrouted traffic.
Identify when to roll back quickly vs when to attempt live fixes based on blast radius and confidence.
Explain why preserving evidence (logs, timelines) matters for postmortems and learning.
Design incident communications that clarify impact, ETA, and mitigation steps (concept-level).
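
Distinguishing IAM, network, and service-limit failures by symptom can be sketched as a first-pass classifier. The mapping below is an assumption based on OCI's documented API error codes (for example `NotAuthorizedOrNotFound`, `LimitExceeded`, `TooManyRequests`); verify it against the current API error reference. Note that `NotAuthorizedOrNotFound` can also mean the resource simply does not exist.

```python
LIMIT_CODES = {"LimitExceeded", "QuotaExceeded", "TooManyRequests"}
IAM_CODES = {"NotAuthenticated", "NotAuthorizedOrNotFound"}


def classify_failure(status, code=None, timed_out=False):
    """Return a first-guess failure class: network, limits, iam, or unclassified."""
    if timed_out:
        # No response at all usually points at routing, DNS, or security rules.
        return "network"
    if code in LIMIT_CODES or status == 429:
        return "limits"
    if status in {401, 403} or code in IAM_CODES:
        return "iam"
    return "unclassified"
```

The classification only chooses which layer to inspect first; it does not replace checking policies, routes, or limits directly.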

Topic 5: Automation, IaC, and Operations at Scale

Practice this topic →

5.1 Resource Manager and Terraform workflows

Explain why infrastructure-as-code reduces drift and improves repeatability.
Given a scenario, design Terraform workflows with environment separation and safe promotion.
Recognize the need for idempotency and safe retries in automation pipelines.
Identify how state management affects safety and collaboration (concept-level).
Design module standards and naming/tagging conventions for reusable infrastructure code.
Choose guardrails that prevent destructive changes (plans, approvals, policy checks).

5.2 CLI/API automation and runbooks

Use OCI CLI and APIs at a conceptual level to automate provisioning and operational checks.
Given a scenario, design runbooks that are deterministic, versioned, and regularly exercised.
Recognize why automation credentials must be managed like secrets (rotation, least privilege).
Identify where to log automation actions for audit and troubleshooting.
Design automation that is safe under partial failures (retries, timeouts, idempotency).
Choose human-in-the-loop controls for high-risk operations (deletes, network changes).
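
Automation that is safe under partial failure typically retries transient errors with bounded, increasing delays. The sketch below shows that pattern; `TransientError` and `call_with_retries` are hypothetical names, and the wrapped operation is assumed to be idempotent (for example by sending the same retry token on every attempt, where the API supports one).

```python
import time


class TransientError(Exception):
    """Raised by an operation when a retry is reasonable (hypothetical)."""


def call_with_retries(operation, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call operation(), retrying transient failures with exponential backoff.

    The final failure is re-raised so the caller can alert or roll back.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the function testable without real waiting; non-transient errors propagate immediately rather than being retried.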

5.3 Change management and release safety

Explain why changes should be staged and verified with telemetry before full rollout.
Given a scenario, choose canary/blue-green strategies at a conceptual level for risky changes.
Recognize how to define rollback triggers using guardrail metrics (latency, errors, saturation).
Identify post-change monitoring requirements and duration based on risk and history.
Design approval workflows that balance speed and safety for production environments.
Recognize how documentation and ownership reduce incidents during on-call rotations.
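
Rollback triggers based on guardrail metrics can be expressed as a comparison between a baseline and the canary. In this sketch the metric names and the allowed ratios are hypothetical; the idea is only that the trigger is defined before the rollout, not negotiated during it.

```python
def rollback_breaches(baseline, canary, max_ratio):
    """Return the guardrail metrics where the canary regressed too far.

    max_ratio maps each metric to the largest acceptable canary/baseline ratio.
    """
    breaches = []
    for metric, limit in max_ratio.items():
        if canary[metric] > baseline[metric] * limit:
            breaches.append(metric)
    return breaches
```

A non-empty result would stop the rollout and start the rollback; an empty one lets the change proceed to the next stage.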

Topic 6: Cost Management and Operational Excellence

Practice this topic →

6.1 Cost drivers and governance controls

Identify common cost drivers: compute utilization, storage tiering, data transfer, and long-running services.
Given a scenario, choose cost controls (budgets, tags, quotas) that prevent runaway spend.
Explain why idle resources should be detected and cleaned up with automation (concept-level).
Recognize trade-offs between availability and cost and when to pay for resilience.
Design a cost reporting strategy using tags and compartment structure.
Choose cost-optimization steps that do not compromise security or reliability.
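
A cost report built on tags amounts to grouping spend by a tag value, with untagged spend made visible rather than hidden. The sketch below assumes a simplified line-item shape (`cost` plus a flat `tags` dictionary); actual cost report exports have a different and richer schema.

```python
from collections import defaultdict


def cost_by_tag(line_items, tag_key):
    """Sum cost per value of tag_key; items without the tag go to 'untagged'."""
    totals = defaultdict(float)
    for item in line_items:
        bucket = item.get("tags", {}).get(tag_key, "untagged")
        totals[bucket] += item["cost"]
    return dict(totals)
```

A growing `untagged` bucket is itself a governance signal: it shows where the tagging standard is not being enforced.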

6.2 SLOs, on-call readiness, and incident improvement

Define SLO-style targets (availability, latency) at a conceptual level and tie alerts to user impact.
Given a scenario, design escalation paths and on-call handoffs that reduce time-to-mitigate.
Recognize why runbooks must be tested and updated after changes and incidents.
Identify post-incident outputs: timeline, root cause, contributing factors, and preventive actions.
Design regression tests for operational workflows (backup checks, alarm validation) at a conceptual level.
Choose reliability investments that reduce repeat incidents (automation, better alerts, safer rollouts).
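
An SLO-style target implies an error budget: the number of failures the target permits over a period. The sketch below shows the arithmetic for a request-based availability SLO. The function names and the 25% paging threshold are assumptions for illustration.

```python
def error_budget(slo_target, total_requests, failed_requests):
    """Return (allowed_failures, remaining_budget) for a request-based SLO."""
    allowed = (1.0 - slo_target) * total_requests
    return allowed, allowed - failed_requests


def should_page(remaining_budget, allowed_failures, min_fraction_left=0.25):
    """Page when less than min_fraction_left of the budget remains."""
    return remaining_budget < allowed_failures * min_fraction_left
```

With a 99.9% target over one million requests, roughly 1,000 failures are permitted; alerting on the remaining budget ties paging to user impact rather than to individual errors.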

6.3 Operational risk controls and continuous compliance

Recognize operational risks in regulated environments (audit requirements, retention, segregation of duties).
Given a scenario, design continuous compliance checks using policies and automation (concept-level).
Explain why least privilege and access reviews are ongoing processes, not one-time tasks.
Identify which logs and metrics are required as evidence for audits and investigations.
Design controls that prevent high-risk changes without approvals (network exposure, IAM broadening).
Choose a workflow to remediate policy violations while minimizing service disruption.
