1Z0-1111-25 Syllabus — Learning Objectives by Topic

Learning objectives for OCI 2025 Observability Professional (1Z0-1111-25), organized by topic with quick links to targeted practice.

Use this syllabus as your source of truth for 1Z0‑1111‑25.

What’s covered

Topic 1: Observability Foundations and Signal Strategy

Practice this topic →

1.1 Observability pillars and SLO thinking

  • Define observability vs monitoring and explain why observability reduces time-to-detect and time-to-recover.
  • Differentiate metrics, logs, and traces and choose the primary signal for common incident scenarios.
  • Explain SLIs and SLOs and why alerting should align with user impact rather than internal noise.
  • Given a scenario, choose leading indicators vs lagging indicators for early detection.
  • Recognize common anti-patterns: alerting on every metric, dashboards without questions, and alerts without owners.
  • Explain how environment separation and tagging improve observability governance and incident triage.
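The SLO arithmetic behind error-budget thinking can be sketched in a few lines of Python; the 99.9% target and 30-day window here are illustrative assumptions, not values from the exam:

```python
# Sketch: translate an SLO target into an error budget for a rolling window.
# The target (99.9%) and window (30 days) are illustrative, not prescribed.

def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability for a given SLO over the window."""
    window_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * window_minutes

budget = error_budget_minutes(0.999)  # 99.9% availability SLO
print(f"Error budget: {budget:.1f} minutes per 30 days")  # ~43.2 minutes
```

The useful habit this builds: alerts aligned with user impact are really alerts on error-budget burn rate, not on raw internal metrics.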

1.2 Instrumentation and telemetry collection principles

  • Describe how instrumentation choices affect cardinality, cost, and diagnostic usefulness (concept-level).
  • Given a scenario, decide what to sample vs capture at full fidelity (concept-level).
  • Explain why structured logs improve search, parsing, and correlation.
  • Identify how correlation IDs propagate across services to connect logs and traces (concept-level).
  • Given a scenario, choose where to collect telemetry (agent, SDK, platform logs) conceptually.
  • Recognize privacy and security concerns: PII in logs, tenancy boundaries, and retention controls.
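As a concept-level illustration of why structured logs correlate well, the sketch below emits one JSON line per event with a consistent timestamp and correlation ID. The field names (`ts`, `correlation_id`, `service`) are an assumed convention, not an OCI schema:

```python
import json
import uuid
from datetime import datetime, timezone

def structured_log(level: str, message: str, correlation_id: str, **fields) -> str:
    """Emit one JSON log line with a consistent timestamp and correlation ID."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "level": level,
        "message": message,
        "correlation_id": correlation_id,
        **fields,
    }
    return json.dumps(record)

cid = str(uuid.uuid4())
line = structured_log("ERROR", "payment failed", cid,
                      service="checkout", order_id="ord-123")
print(line)
```

Because every line parses to the same key set, searching for one `correlation_id` across services is a single filter rather than a free-text grep.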

1.3 Dashboards, runbooks, and operational readiness

  • Design dashboards that separate high-level health from deep-dive diagnostics and reduce time-to-diagnose.
  • Given a scenario, create an incident dashboard that surfaces key service health indicators and dependencies.
  • Explain why runbooks and annotations reduce incident duration and improve reliability.
  • Identify what to document: known failure modes, mitigation steps, verification checks, and rollback procedures.
  • Given a scenario, choose alert routing and ownership patterns that reduce time-to-acknowledge.
  • Recognize post-incident practices: blameless postmortems, action items, and alert reviews.

Topic 2: OCI Monitoring (Metrics) and Alarm Management

Practice this topic →

2.1 Metrics model and namespaces

  • Explain OCI Monitoring metric concepts: namespaces, dimensions, intervals, and statistics (concept-level).
  • Given a scenario, choose the correct metric and dimension to isolate a resource, fleet, or compartment scope.
  • Recognize the risk of high-cardinality dimensions and how to constrain them for cost and usability.
  • Explain the difference between built-in service metrics and custom/application metrics (concept-level).
  • Given a scenario, design a naming standard for custom metrics and consistent dimensions.
  • Identify how tags and compartments affect monitoring views, permissions, and reporting.
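The high-cardinality risk is easy to quantify: the worst-case number of time series is the product of per-dimension cardinalities. A minimal sketch (the dimension names and counts are made up for illustration):

```python
from math import prod

def estimated_series_count(dimension_values: dict[str, int]) -> int:
    """Worst-case time-series count is the product of per-dimension cardinalities."""
    return prod(dimension_values.values())

dims = {"service": 20, "endpoint": 50, "status_code": 8}
print(estimated_series_count(dims))  # 20 * 50 * 8 = 8000 potential series
# Adding a user_id dimension with ~100k values would multiply this by 100,000 —
# exactly the kind of explosion to constrain before publishing custom metrics.
```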

2.2 Alarms, notifications, and routing

  • Create alarms with appropriate thresholds, evaluation windows, and suppression to avoid flapping.
  • Given a scenario, choose Notifications topics and subscriptions for alert delivery and escalation.
  • Explain routing patterns for alerts (email, SMS, webhooks) and how to integrate with incident systems (concept-level).
  • Recognize when composite alarms or multi-condition alarms reduce noise (concept-level).
  • Given a scenario, design escalation paths and maintenance window behavior conceptually.
  • Identify troubleshooting steps when alarms are missing or delayed (permissions, metric gaps, mis-scoping).
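Flap suppression via an evaluation window can be illustrated conceptually: fire only after several consecutive breaching datapoints. This is a simplified model of the idea, not the OCI alarm engine:

```python
def alarm_fires(datapoints: list[float], threshold: float, consecutive: int = 3) -> bool:
    """Fire only after `consecutive` datapoints breach the threshold,
    which suppresses flapping caused by a single transient spike."""
    streak = 0
    for value in datapoints:
        streak = streak + 1 if value > threshold else 0
        if streak >= consecutive:
            return True
    return False

print(alarm_fires([85, 92, 70, 95, 96, 97], threshold=90))  # True: three in a row
print(alarm_fires([85, 92, 70, 95, 70, 97], threshold=90))  # False: isolated spikes
```

The trade-off to reason about in scenarios: a longer window means fewer false alarms but slower time-to-detect.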

2.3 Dashboards and custom monitoring

  • Build dashboards with charts that answer specific questions (availability, latency, saturation) rather than generic telemetry.
  • Given a scenario, choose appropriate aggregation (sum, average, percentiles) for the metric and question.
  • Explain how metric math/rate calculations support meaningful alerts (concept-level).
  • Identify ways to publish custom metrics from applications, jobs, and automation (concept-level).
  • Given a scenario, design monitoring for ephemeral/serverless workloads (functions, jobs) conceptually.
  • Recognize cost considerations for high-resolution metrics, many alarms, and noisy dashboards.
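Why percentiles rather than averages: a hypothetical latency sample where the mean looks healthy while the tail does not. Nearest-rank p95 is shown; the values are invented:

```python
import math
import statistics

def p95(values: list[float]) -> float:
    """Nearest-rank 95th percentile: the latency 95% of requests beat."""
    ordered = sorted(values)
    rank = math.ceil(0.95 * len(ordered))
    return ordered[rank - 1]

# 18 fast requests and 2 slow ones: the mean hides the tail, p95 exposes it.
latencies = [100.0] * 18 + [3000.0, 5000.0]
print(f"mean={statistics.mean(latencies):.0f}ms  p95={p95(latencies):.0f}ms")
# mean=490ms  p95=3000ms
```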

Topic 3: OCI Logging and Log Analytics

Practice this topic →

3.1 Log sources, log groups, and ingestion

  • Identify OCI log sources: service logs, audit logs, and custom/application logs (concept-level).
  • Given a scenario, design log group structure and retention by environment and sensitivity.
  • Explain onboarding patterns for compute and application logs (agent vs service logs) conceptually.
  • Recognize the need for consistent timestamps and fields to enable correlation across telemetry sources.
  • Given a scenario, choose masking/exclusion strategies to prevent sensitive data leakage in logs.
  • Identify common ingestion failures and remediation steps (permissions, agent config, networking).
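A masking/exclusion strategy can be sketched as regex redaction applied before ingestion. The patterns below are illustrative only, not production-grade PII detection:

```python
import re

# Illustrative patterns: a simple email matcher and a card-like digit run.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def mask_pii(line: str) -> str:
    """Redact emails and card-like digit runs before a log line is ingested."""
    line = EMAIL.sub("[EMAIL]", line)
    line = CARD.sub("[CARD]", line)
    return line

print(mask_pii("login failed for alice@example.com card 4111 1111 1111 1111"))
```

Masking at the source keeps sensitive values out of every downstream store, which is cheaper than trying to purge them after ingestion.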

3.2 Search, parsing, and correlation (Log Analytics)

  • Explain how Log Analytics supports search, parsing, and correlation across log sources (concept-level).
  • Given a scenario, write filters/queries to isolate errors, outliers, or suspicious activity (concept-level).
  • Identify the purpose of parsing rules and how structured logs reduce parsing complexity.
  • Given a scenario, create log-based metrics and alerts (concept-level).
  • Recognize approaches to correlate logs with metrics and traces using IDs and timestamps (concept-level).
  • Explain how retention and storage choices affect log analytics cost and compliance.
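Deriving a log-based metric from structured lines can be sketched without any service dependency — here, ERROR counts bucketed per minute (the field names are an assumed convention):

```python
import json
from collections import Counter

def errors_per_minute(log_lines: list[str]) -> Counter:
    """Derive a log-based metric: ERROR count bucketed by minute of timestamp."""
    counts: Counter = Counter()
    for raw in log_lines:
        rec = json.loads(raw)
        if rec.get("level") == "ERROR":
            counts[rec["ts"][:16]] += 1  # "YYYY-MM-DDTHH:MM" bucket
    return counts

lines = [
    '{"ts": "2025-01-01T10:00:05Z", "level": "ERROR", "message": "timeout"}',
    '{"ts": "2025-01-01T10:00:40Z", "level": "INFO",  "message": "ok"}',
    '{"ts": "2025-01-01T10:00:59Z", "level": "ERROR", "message": "timeout"}',
]
print(errors_per_minute(lines))  # Counter({'2025-01-01T10:00': 2})
```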

3.3 Audit, security, and governance for logs

  • Describe OCI Audit log purpose and identify high-value events to monitor (IAM/network changes, privileged actions).
  • Given a scenario, implement access controls that prevent unauthorized log access or tampering (concept-level).
  • Recognize chain-of-custody requirements for investigations and long-term evidence handling (concept-level).
  • Explain how to export logs to Object Storage or external analysis destinations (concept-level).
  • Given a scenario, design log retention and deletion policies aligned to compliance requirements.
  • Identify log governance anti-patterns: everyone can read, no retention, and logging secrets or credentials.

Topic 4: APM, Tracing, and Performance Investigation

Practice this topic →

4.1 Tracing fundamentals and service maps

  • Define traces, spans, and context propagation and explain how tracing reveals latency and dependency structure.
  • Given a scenario, choose where to instrument to capture the critical path (frontend/API/database).
  • Explain sampling strategies and trade-offs for high-traffic services (concept-level).
  • Recognize common tracing pitfalls: missing context, async boundaries, and sampling bias (concept-level).
  • Given a scenario, use service maps to identify upstream/downstream dependencies conceptually.
  • Identify how correlation IDs join traces with logs and metrics for faster diagnosis (concept-level).
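Correlation-ID propagation reduces to "reuse if present, mint at the edge, always forward." A minimal sketch; the header name here is an assumed convention, not a standard like W3C `traceparent`:

```python
import uuid

CORRELATION_HEADER = "X-Correlation-ID"  # illustrative convention, not a standard

def inbound_correlation_id(headers: dict) -> str:
    """Reuse the caller's ID when present; mint one at the edge otherwise."""
    return headers.get(CORRELATION_HEADER) or str(uuid.uuid4())

def outbound_headers(correlation_id: str) -> dict:
    """Forward the same ID on every downstream call so logs and spans join up."""
    return {CORRELATION_HEADER: correlation_id}

cid = inbound_correlation_id({})            # edge service mints the ID
downstream = outbound_headers(cid)          # forwarded to the next hop
assert inbound_correlation_id(downstream) == cid  # every hop sees the same ID
```

The async-boundary pitfall in the objectives is exactly the case where this forwarding step is forgotten — a queued message without the header breaks the chain.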

4.2 OCI Application Performance Monitoring (APM) concepts

  • Explain APM capabilities: traces, service maps, errors, and performance signals (concept-level).
  • Given a scenario, choose APM for troubleshooting latency vs relying on metrics and logs alone.
  • Recognize how APM can identify slow database calls, external dependency latency, and error hotspots (concept-level).
  • Explain how to define alert conditions for APM signals (errors, latency) conceptually.
  • Given a scenario, segment performance by dimension (service, endpoint, region) conceptually.
  • Identify how to troubleshoot instrumentation issues (missing traces, agent misconfiguration) conceptually.

4.3 Performance investigation workflows

  • Given a scenario, follow a systematic workflow: reproduce → isolate → measure → fix → verify.
  • Explain the difference between symptom metrics (CPU) and cause metrics (queue backlogs) conceptually.
  • Recognize common bottlenecks: saturation, contention, slow I/O, and mis-sized instances.
  • Given a scenario, choose load testing and synthetic checks to validate performance improvements (concept-level).
  • Identify how to use annotations and change tracking to link regressions to deployments or config changes.
  • Explain why profiling and tracing carry runtime overhead and must be applied cautiously in production (concept-level).

Topic 5: Telemetry Routing, Automation, and Integration Patterns

Practice this topic →

5.1 Service Connector Hub and telemetry pipelines

  • Explain Service Connector Hub purpose: move telemetry between OCI services reliably (concept-level).
  • Given a scenario, route logs to Log Analytics or Object Storage using Service Connector Hub.
  • Recognize transformation, filtering, and batching needs in telemetry pipelines (concept-level).
  • Explain failure handling: retries, dead-letter patterns, and monitoring connectors conceptually.
  • Given a scenario, design least-privilege policies for connectors and their destinations.
  • Identify cost and latency trade-offs when exporting high-volume telemetry.
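Connector-style failure handling usually rests on retries with exponential backoff and jitter. A sketch of the delay schedule only (the base, cap, and attempt count are illustrative):

```python
import random

def backoff_delays(attempts: int, base: float = 0.5, cap: float = 30.0) -> list[float]:
    """Exponential backoff with full jitter: the ceiling doubles per retry up to
    a cap, and each delay is randomized to avoid synchronized retry storms."""
    return [random.uniform(0, min(cap, base * 2 ** i)) for i in range(attempts)]

for i, delay in enumerate(backoff_delays(5), start=1):
    print(f"retry {i}: wait {delay:.2f}s")
```

Retries alone are not enough: events that still fail after the schedule should go to a dead-letter destination, and the connector itself should be monitored like any other service.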

5.2 Notifications, events, and automated responses

  • Differentiate Monitoring alarms, Events, and Notifications and explain how they compose into workflows.
  • Given a scenario, trigger automation (Functions/webhooks) when alarms fire to create tickets or remediate (concept-level).
  • Explain how to design safe responders that avoid runaway automation or unsafe changes.
  • Recognize when automation must be gated with human approval for risky actions (concept-level).
  • Given a scenario, design alert deduplication and suppression during planned maintenance windows.
  • Identify patterns to integrate with external incident systems via HTTPS endpoints and webhooks (concept-level).
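Deduplication typically hinges on a stable fingerprint over an alert's identity fields while ignoring volatile ones like timestamps. A hypothetical sketch (the field names are invented):

```python
import hashlib
import json

def alert_fingerprint(alert: dict) -> str:
    """Stable fingerprint over identity fields only — a changed timestamp or
    severity must not create a 'new' alert in the incident system."""
    identity = {k: alert[k] for k in ("service", "alarm", "resource")}
    payload = json.dumps(identity, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()[:16]

a = {"service": "checkout", "alarm": "HighLatency", "resource": "web-1", "ts": "10:00"}
b = {"service": "checkout", "alarm": "HighLatency", "resource": "web-1", "ts": "10:05"}
print(alert_fingerprint(a) == alert_fingerprint(b))  # True: same incident, deduped
```

The same fingerprint doubles as an idempotency key for webhook deliveries, so retries don't open duplicate tickets.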

5.3 Cross-service correlation and multi-scope views

  • Explain how compartments, tags, and resource identifiers enable correlation across telemetry sources.
  • Given a scenario, design dashboards that aggregate across compartments or regions while preserving clarity.
  • Recognize challenges of multi-tenancy/multi-account views: access boundaries, naming, and data separation (concept-level).
  • Explain how to normalize telemetry fields (service name, env, region) to support correlation at scale.
  • Given a scenario, design centralized observability while preserving least privilege and privacy controls.
  • Identify governance practices: standard log formats, metric naming conventions, and alert ownership.
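Field normalization can be as simple as an alias map onto a shared schema. The aliases below are invented examples of the kind of drift that accumulates across teams:

```python
# Map team-specific field names onto one shared schema (aliases are illustrative).
FIELD_ALIASES = {
    "svc": "service", "service_name": "service",
    "environment": "env", "stage": "env",
    "location": "region",
}

def normalize(record: dict) -> dict:
    """Canonicalize field names and lowercase string values so cross-compartment
    queries can group on one consistent key set."""
    out = {}
    for key, value in record.items():
        canonical = FIELD_ALIASES.get(key, key)
        out[canonical] = value.lower() if isinstance(value, str) else value
    return out

print(normalize({"svc": "Checkout", "stage": "Prod", "location": "us-ashburn-1"}))
# {'service': 'checkout', 'env': 'prod', 'region': 'us-ashburn-1'}
```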

Topic 6: Incident Response, Compliance, and Optimization

Practice this topic →

6.1 Incident triage and troubleshooting scenarios

  • Given a scenario, triage an outage using metrics first, then logs and traces to isolate the likely cause.
  • Explain how to distinguish availability incidents from performance incidents using observable signals.
  • Recognize common failure modes: DNS misconfiguration, routing blocks, expired certificates, and quota limits (concept-level).
  • Given a scenario, choose remediation steps that reduce blast radius (feature flags, rollback, throttling).
  • Identify how to communicate incident status, scope, and timelines effectively during an active event.
  • Explain how to verify recovery with evidence (health checks, normalized metrics) after remediation.

6.2 Compliance, retention, and privacy operations

  • Define retention requirements and how they affect telemetry storage and access controls.
  • Given a scenario, implement least-privilege access for observability data and separate admin vs viewer roles.
  • Recognize how to handle sensitive data in telemetry (masking, sampling, exclusion) conceptually.
  • Explain auditability requirements for who changed alerts/dashboards and when (concept-level).
  • Given a scenario, design evidence collection for audits using exported logs, dashboards, and reports.
  • Identify pitfalls: collecting more data than needed, unclear ownership, and unbounded retention.

6.3 Cost and scaling considerations for observability

  • Identify major cost drivers: high-cardinality metrics, large log volumes, and long retention windows.
  • Given a scenario, reduce observability cost without losing critical signals (sampling, aggregation, better log levels).
  • Explain how to set appropriate log levels and use structured logging to reduce noise and improve queryability.
  • Recognize when to archive telemetry to Object Storage for long-term retention (concept-level).
  • Given a scenario, design tiered alerting and dashboards that scale with organization growth.
  • Explain how continuous improvement (postmortems, alert reviews) keeps observability sustainable.
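Sampling that "reduces cost without losing critical signals" often uses a deterministic hash of the trace ID, so every service keeps or drops the same traces without coordination. A sketch under assumed parameters (the 10% rate and ID format are illustrative):

```python
import hashlib

def keep_trace(trace_id: str, sample_rate: float = 0.1) -> bool:
    """Hash-based sampling: the decision is a pure function of the trace ID,
    so all services agree on which traces survive — no coordination needed."""
    bucket = int(hashlib.sha256(trace_id.encode()).hexdigest(), 16) % 10_000
    return bucket < sample_rate * 10_000

kept = sum(keep_trace(f"trace-{i}") for i in range(10_000))
print(f"kept {kept} of 10000 traces (~10%)")
```

Because the decision is deterministic, a kept trace is complete end to end — unlike independent per-service coin flips, which produce fragmented traces.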