1Z0-1067-25 Syllabus — Learning Objectives by Topic

Learning objectives for the OCI 2025 Cloud Operations Professional exam (1Z0-1067-25), organized by topic.

Use this syllabus as your source of truth for 1Z0-1067-25.

What’s covered

Topic 1: OCI Operations Foundations and Governance

1.1 Operating model and shared responsibility

  • Explain the shared responsibility model for OCI services and what operations teams own vs what OCI manages.
  • Differentiate break/fix tasks from reliability engineering tasks and identify which improves long-term outcomes.
  • Given a scenario, choose an operational workflow that includes verification and rollback.
  • Recognize how environment separation (dev/test/prod) reduces incident risk and improves governance.
  • Identify common operational risks: configuration drift, undocumented changes, and lack of runbooks.
  • Explain why baselines and change tracking are required to detect regressions.

1.2 Compartments, tagging, and standardization

  • Design compartment structures that align with teams, environments, and least privilege.
  • Explain how tagging supports cost allocation, ownership, and governance.
  • Given a scenario, choose a tagging standard that enables reporting and automation.
  • Recognize the difference between tag namespaces and tag keys/values (concept-level).
  • Identify why consistent naming conventions reduce operational mistakes and speed incident response.
  • Explain how governance standards enable automated guardrails and policy enforcement.
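
As a study aid, a tagging standard can be expressed as a small check that automation runs against each resource. The sketch below assumes defined tags arrive in the nested shape OCI uses (namespace, then key, then value). The `Operations` namespace, the required keys, and the allowed environment values are hypothetical examples, not part of the exam or of any OCI default.

```python
# Hypothetical tagging standard: every resource must carry these keys
# inside one agreed tag namespace.
REQUIRED_TAG_KEYS = {"CostCenter", "Owner", "Environment"}
ALLOWED_ENVIRONMENTS = {"dev", "test", "prod"}


def tag_violations(defined_tags, namespace="Operations"):
    """Return a list of human-readable violations for one resource.

    defined_tags is expected to look like {"Operations": {"Owner": "team-a"}}.
    """
    tags = defined_tags.get(namespace, {})
    problems = [
        f"missing tag {namespace}.{key}"
        for key in sorted(REQUIRED_TAG_KEYS - tags.keys())
    ]
    env = tags.get("Environment")
    if env is not None and env not in ALLOWED_ENVIRONMENTS:
        problems.append(f"invalid Environment value: {env!r}")
    return problems


if __name__ == "__main__":
    resource_tags = {"Operations": {"Owner": "team-a", "Environment": "staging"}}
    for problem in tag_violations(resource_tags):
        print(problem)
```

The namespace groups the keys, and the keys carry the values. That is the distinction the objectives above ask you to recognize, and it is what makes reporting by cost center or owner predictable.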

1.3 Limits, quotas, and capacity planning

  • Differentiate service limits from quotas and explain how each affects provisioning.
  • Given a scenario, identify symptoms of hitting limits (failed provisioning, throttling) and next steps.
  • Explain why capacity planning must include peak load, growth, and failure headroom.
  • Recognize how noisy neighbors can appear as performance degradation and how isolation helps (concept-level).
  • Identify which signals indicate capacity pressure (CPU, memory, I/O, queue backlogs) at a conceptual level.
  • Design an operational process for requesting limit increases and documenting dependencies.
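
The capacity planning objective (peak load, growth, and failure headroom) reduces to a short calculation. This is a minimal sketch under simple assumptions: load scales linearly per instance, growth is a single percentage, and headroom is a fixed number of spare instances. The function name and the sample numbers are illustrative only.

```python
import math


def instances_needed(peak_rps, growth, per_instance_rps, spare_instances=1):
    """Estimate instance count for projected peak load plus failure headroom.

    peak_rps:         highest observed requests per second
    growth:           expected growth as a fraction (0.25 means 25%)
    per_instance_rps: load one instance can serve safely
    spare_instances:  extra instances so one failure does not cause overload
    """
    projected_peak = peak_rps * (1 + growth)
    return math.ceil(projected_peak / per_instance_rps) + spare_instances


if __name__ == "__main__":
    # 1200 rps peak, 25% growth, 200 rps per instance, one spare instance
    print(instances_needed(1200, 0.25, 200))
```

Comparing the result with the current service limit for that resource type shows whether a limit increase has to be requested before the growth arrives.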

Topic 2: Identity, Access, and Security Operations

2.1 IAM operations and least privilege

  • Apply least privilege to users, groups, and policies for day-to-day operations.
  • Given a scenario, design policies that separate duties for provisioning vs audit vs security controls.
  • Recognize common IAM pitfalls: broad wildcard policies, shared admin accounts, and unmanaged credentials.
  • Explain how dynamic groups and resource principals support workload identities (concept-level).
  • Identify how to troubleshoot authorization failures by inspecting policies, compartments, and resource scope.
  • Design an access review process (who has access, why, and how to remove it safely).

2.2 Secrets, keys, and secure configuration

  • Store secrets and keys in Vault and avoid embedding them in code, images, or config repos.
  • Given a scenario, choose a rotation strategy for keys/secrets that minimizes downtime (concept-level).
  • Recognize common secret leak paths (logs, CI output, backups) and mitigations.
  • Explain why encryption at rest and in transit matter for regulated workloads (concept-level).
  • Identify how to scope secrets access by compartment and workload identity (concept-level).
  • Design incident containment steps for suspected credential compromise (revoke/rotate/audit).

2.3 Auditability and security posture

  • Explain the role of audit trails in incident response and compliance investigations.
  • Given a scenario, choose what to log/audit for privileged actions and configuration changes.
  • Recognize the risk of log tampering and how to protect log integrity (concept-level).
  • Identify how Cloud Guard fits into security operations and posture monitoring (concept-level).
  • Explain why security findings must be prioritized by risk and blast radius, not volume.
  • Design a workflow to triage security findings and track remediation to closure.

Topic 3: Monitoring, Metrics, and Alarms

3.1 Monitoring fundamentals and baselines

  • Define key operational signals: latency, errors, throughput, saturation (concept-level).
  • Given a scenario, choose metrics that represent user impact rather than internal noise.
  • Explain why baselines and seasonality matter when setting alert thresholds.
  • Recognize leading indicators vs lagging indicators and choose appropriately for early detection.
  • Identify which teams should own which alerts and why ownership reduces time-to-recover.
  • Design dashboards that separate high-level health from deep-dive diagnostics.

3.2 Alarm design and notification routing

  • Create an alarm strategy that reduces alert fatigue and improves actionability (concept-level).
  • Given a scenario, choose appropriate alarm windows and suppression to prevent flapping.
  • Explain how notifications and escalation policies support incident response (concept-level).
  • Recognize the difference between informational alerts and paging alerts and set thresholds accordingly.
  • Identify how to route alerts to teams and tools while preserving context (runbooks, links, metadata).
  • Design a post-incident review that updates alarms based on what was missed or noisy.
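
To make the flapping objective concrete, here is a generic sketch of an alarm that fires only after the threshold has been breached for several consecutive datapoints. It illustrates the idea of an evaluation window and is not how OCI Monitoring is implemented. The function name, threshold, and sample values are assumptions chosen for the example.

```python
def alarm_states(values, threshold, trigger_after):
    """Return "OK" or "FIRING" for each datapoint.

    The alarm fires only once the value has exceeded the threshold for
    trigger_after consecutive datapoints, so a single spike does not page.
    """
    consecutive_breaches = 0
    states = []
    for value in values:
        if value > threshold:
            consecutive_breaches += 1
        else:
            consecutive_breaches = 0
        states.append("FIRING" if consecutive_breaches >= trigger_after else "OK")
    return states


if __name__ == "__main__":
    cpu_percent = [50, 95, 60, 95, 96, 97, 40]
    print(alarm_states(cpu_percent, threshold=90, trigger_after=3))
```

With `trigger_after=1` the same series would fire twice, once on the isolated spike at the second datapoint. With `trigger_after=3` it fires once, on the sustained breach.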

3.3 Service health and dependency monitoring

  • Given a scenario, monitor critical dependencies (network, identity, storage) and detect cascading failures.
  • Explain why synthetic checks complement passive monitoring for availability detection (concept-level).
  • Recognize quota/limit pressure as an availability risk and monitor for it (concept-level).
  • Identify when to monitor at the edge (LB/API) vs the app tier vs the database tier for fastest diagnosis.
  • Design a dependency map that informs incident triage order (concept-level).
  • Choose guardrail alerts that block risky changes when critical metrics regress.

Topic 4: Logging, Audit, and Troubleshooting Signals

4.1 Logging strategy and hygiene

  • Differentiate logs, metrics, and traces (concept-level) and choose the right signal for a question.
  • Design a log strategy: what to collect, where to route it, and retention requirements.
  • Given a scenario, ensure logs are useful for diagnosis without leaking secrets or PII.
  • Recognize why consistent correlation IDs improve debugging across distributed services (concept-level).
  • Identify how to reduce logging cost without losing critical signal (sampling, severity filtering) at a conceptual level.
  • Design log access controls that support investigations while preventing overexposure.

4.2 Audit trails and change detection

  • Use audit trails to answer: who changed what, when, and from where (concept-level).
  • Given a scenario, correlate audit events with deployment timelines to identify the likely regression cause.
  • Recognize why “unknown changes” prolong incidents and how change management reduces that risk.
  • Identify which actions should require approvals and how to enforce them operationally (concept-level).
  • Design a workflow to detect drift between intended configuration and actual configuration (concept-level).
  • Explain why tamper-resistant logs and retention are required for compliance contexts (concept-level).
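
The drift detection objective can be pictured as a comparison between the intended configuration (for example, what the infrastructure code declares) and the actual configuration read back from the environment. The sketch below assumes both are already available as flat dictionaries. The keys and values are made up for illustration.

```python
ABSENT = "<absent>"  # marker for a setting that exists on only one side


def config_drift(intended, actual):
    """Return {key: (intended_value, actual_value)} for every mismatch."""
    drift = {}
    for key in sorted(intended.keys() | actual.keys()):
        want = intended.get(key, ABSENT)
        have = actual.get(key, ABSENT)
        if want != have:
            drift[key] = (want, have)
    return drift


if __name__ == "__main__":
    intended = {"ocpus": 2, "public_ip": False}
    actual = {"ocpus": 4, "public_ip": False, "debug_port": 9000}
    for key, (want, have) in config_drift(intended, actual).items():
        print(f"{key}: intended={want} actual={have}")
```

Each reported key is a question for the audit trail: who changed it, when, and from where.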

4.3 Troubleshooting workflow and common failure classes

  • Apply a structured triage workflow: reproduce, isolate, identify layer, validate fix.
  • Given a scenario, distinguish IAM failures from network failures from service-limit failures based on symptoms.
  • Recognize common outage patterns: DNS issues, expired certificates, exhausted quotas, and misrouted traffic.
  • Identify when to roll back quickly vs when to attempt live fixes based on blast radius and confidence.
  • Explain why preserving evidence (logs, timelines) matters for postmortems and learning.
  • Design incident communications that clarify impact, ETA, and mitigation steps (concept-level).
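
One way to practice telling failure classes apart is to write the symptom-to-class mapping down as code. The sketch below is a rough heuristic, not an official decision table. It assumes the caller has an HTTP status and an error code string of the kind OCI APIs return (such as `NotAuthorizedOrNotFound` or `LimitExceeded`), and it should be checked against the current API error reference before being relied on.

```python
from typing import Optional


def classify_failure(status: Optional[int], code: Optional[str]) -> str:
    """Map an API failure to a likely failure class (heuristic only)."""
    if status is None:
        # No HTTP response at all: suspect DNS, routing, firewall rules, or TLS.
        return "network"
    if status == 401 or code in {"NotAuthenticated", "NotAuthorizedOrNotFound"}:
        # The second code is ambiguous by design: the resource may be missing,
        # or the caller may lack permission in that compartment.
        return "iam"
    if code in {"LimitExceeded", "QuotaExceeded"}:
        return "limit-or-quota"
    if status == 429:
        return "throttling"
    if status >= 500:
        return "service"
    return "unknown"


if __name__ == "__main__":
    print(classify_failure(404, "NotAuthorizedOrNotFound"))
    print(classify_failure(400, "LimitExceeded"))
    print(classify_failure(None, None))
```

The class only decides where to look first: policies and compartments for `iam`, limits and quota policies for `limit-or-quota`, and the network path for `network`.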

Topic 5: Automation, IaC, and Operations at Scale

5.1 Resource Manager and Terraform workflows

  • Explain why infrastructure-as-code reduces drift and improves repeatability.
  • Given a scenario, design Terraform workflows with environment separation and safe promotion.
  • Recognize the need for idempotency and safe retries in automation pipelines.
  • Identify how state management affects safety and collaboration (concept-level).
  • Design module standards and naming/tagging conventions for reusable infrastructure code.
  • Choose guardrails that prevent destructive changes (plans, approvals, policy checks).

5.2 CLI/API automation and runbooks

  • Use the OCI CLI and APIs at a conceptual level to automate provisioning and operational checks.
  • Given a scenario, design runbooks that are deterministic, versioned, and regularly exercised.
  • Recognize why automation credentials must be managed like secrets (rotation, least privilege).
  • Identify where to log automation actions for audit and troubleshooting.
  • Design automation that is safe under partial failures (retries, timeouts, idempotency).
  • Choose human-in-the-loop controls for high-risk operations (deletes, network changes).
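
The partial-failure objective (retries, timeouts, idempotency) can be sketched as a retry wrapper that reuses one idempotency token across attempts, so a request that succeeded server-side but timed out client-side is not applied twice. Everything named here (`TransientError`, `call_with_retries`, the `operation` callable) is hypothetical. A real client would pass the token to the service in whatever form its API supports, and would also set a per-request timeout, which this sketch leaves out.

```python
import time
import uuid


class TransientError(Exception):
    """Raised by the operation for failures that are worth retrying."""


def call_with_retries(operation, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call operation(token) with exponential backoff between attempts.

    The same token is sent on every attempt so the receiving service can
    recognize a repeat of an earlier request.
    """
    token = str(uuid.uuid4())
    for attempt in range(1, max_attempts + 1):
        try:
            return operation(token)
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the caller
            sleep(base_delay * 2 ** (attempt - 1))


if __name__ == "__main__":
    attempts = []

    def flaky_create(token):
        attempts.append(token)
        if len(attempts) < 3:
            raise TransientError("simulated timeout")
        return "created"

    print(call_with_retries(flaky_create, base_delay=0.01))
    print(len(set(attempts)))  # number of distinct tokens used
```

Only errors marked as transient are retried. Anything else propagates immediately, which keeps the wrapper from masking authorization or validation failures.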

5.3 Change management and release safety

  • Explain why changes should be staged and verified with telemetry before full rollout.
  • Given a scenario, choose canary/blue-green strategies at a conceptual level for risky changes.
  • Recognize how to define rollback triggers using guardrail metrics (latency, errors, saturation).
  • Identify post-change monitoring requirements and duration based on risk and history.
  • Design approval workflows that balance speed and safety for production environments.
  • Recognize how documentation and ownership reduce incidents during on-call rotations.
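
Rollback triggers based on guardrail metrics can be written as a comparison between a baseline and the canary. The sketch below assumes each metric is "lower is better" and that a guardrail is a maximum allowed ratio of canary to baseline. The metric names and ratios are hypothetical and would come from your own baselines in practice.

```python
# Hypothetical guardrails: the canary may be at most this many times the baseline.
GUARDRAILS = {
    "p95_latency_ms": 1.20,
    "error_rate": 1.50,
    "cpu_utilization": 1.25,
}


def breached_guardrails(baseline, canary, guardrails=GUARDRAILS):
    """Return the names of metrics where the canary exceeds its allowed limit."""
    breached = []
    for metric, max_ratio in guardrails.items():
        limit = baseline[metric] * max_ratio
        if canary[metric] > limit:
            breached.append(metric)
    return breached


if __name__ == "__main__":
    baseline = {"p95_latency_ms": 200.0, "error_rate": 0.01, "cpu_utilization": 0.50}
    canary = {"p95_latency_ms": 260.0, "error_rate": 0.011, "cpu_utilization": 0.55}
    breaches = breached_guardrails(baseline, canary)
    print("ROLL BACK" if breaches else "CONTINUE", breaches)
```

Deciding the ratios before the rollout starts is the point of the exercise: the rollback decision is then mechanical and does not depend on judgment calls made under pressure.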

Topic 6: Cost Management and Operational Excellence

6.1 Cost drivers and governance controls

  • Identify common cost drivers: compute utilization, storage tiering, data transfer, and long-running services.
  • Given a scenario, choose cost controls (budgets, tags, quotas) that prevent runaway spend.
  • Explain why idle resources should be detected and cleaned up with automation (concept-level).
  • Recognize trade-offs between availability and cost and when to pay for resilience.
  • Design a cost reporting strategy using tags and compartment structure.
  • Choose cost-optimization steps that do not compromise security or reliability.

6.2 SLOs, on-call readiness, and incident improvement

  • Define SLO-style targets (availability, latency) at a conceptual level and tie alerts to user impact.
  • Given a scenario, design escalation paths and on-call handoffs that reduce time-to-mitigate.
  • Recognize why runbooks must be tested and updated after changes and incidents.
  • Identify post-incident outputs: timeline, root cause, contributing factors, and preventive actions.
  • Design regression tests for operational workflows (backup checks, alarm validation) at a conceptual level.
  • Choose reliability investments that reduce repeat incidents (automation, better alerts, safer rollouts).
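
An availability SLO translates directly into an error budget, which is a useful number to tie paging alerts to. The arithmetic below is a minimal sketch. The 99.9% target and the 30-day window are example values, not figures from the exam.

```python
def error_budget_minutes(slo_target, window_days):
    """Minutes of unavailability allowed by an availability SLO over a window."""
    return window_days * 24 * 60 * (1 - slo_target)


def budget_remaining(slo_target, window_days, downtime_minutes):
    """Fraction of the error budget still unspent (negative if overspent)."""
    budget = error_budget_minutes(slo_target, window_days)
    return (budget - downtime_minutes) / budget


if __name__ == "__main__":
    # A 99.9% target over 30 days allows about 43.2 minutes of downtime.
    print(round(error_budget_minutes(0.999, 30), 1))
    # After 10.8 minutes of downtime, about 75% of the budget remains.
    print(round(budget_remaining(0.999, 30, 10.8), 2))
```

A team that has spent most of its budget early in the window has a concrete reason to slow risky changes and invest in the reliability work listed above.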

6.3 Operational risk controls and continuous compliance

  • Recognize operational risks in regulated environments (audit requirements, retention, segregation of duties).
  • Given a scenario, design continuous compliance checks using policies and automation (concept-level).
  • Explain why least privilege and access reviews are ongoing processes, not one-time tasks.
  • Identify which logs and metrics are required as evidence for audits and investigations.
  • Design controls that prevent high-risk changes without approvals (network exposure, IAM broadening).
  • Choose a workflow to remediate policy violations while minimizing service disruption.