DEA-C02 — Snowflake SnowPro Advanced: Data Engineer Exam Blueprint

Practical exam blueprint for the Snowflake SnowPro Advanced: Data Engineer (DEA-C02) exam.

How to Use This Exam Blueprint

Use this checklist as an independent study map for the Snowflake SnowPro Advanced: Data Engineer (DEA-C02) exam from Snowflake. It translates common Snowflake data engineering responsibilities into practical readiness checks: what you should recognize, design, troubleshoot, and choose under scenario pressure.

Do not treat this as a list of exact official exam weights. Instead, use it to find gaps before final practice.

For each topic area, mark yourself:

  • Ready: you can explain the concept, choose the right Snowflake feature, and troubleshoot a realistic scenario.
  • Review: you understand the idea but need syntax, edge cases, or decision practice.
  • Practice: you would likely miss scenario questions without hands-on repetition.

Topic-Area Readiness Table

Readiness areaWhat to reviewReady when you can…Status
Snowflake architecture for data engineeringWarehouses, storage/compute separation, databases, schemas, stages, micro-partitions, metadata, cachingExplain how compute, storage, metadata, and services interact in pipeline design☐ Ready ☐ Review ☐ Practice
Data loading and unloadingCOPY INTO, stages, file formats, load history, validation, error handling, schema evolutionDesign repeatable batch loads and diagnose failed or duplicated loads☐ Ready ☐ Review ☐ Practice
Continuous ingestionSnowpipe, cloud notifications, pipes, streaming patterns, connector-based ingestionChoose between batch, Snowpipe, streaming, and external table approaches☐ Ready ☐ Review ☐ Practice
Transformations and ELTSQL transformations, CTAS, MERGE, views, stored procedures, Snowpark, UDFsBuild reliable transformation logic with idempotency and observability☐ Ready ☐ Review ☐ Practice
Streams and tasksCDC streams, task schedules, task graphs, stream staleness, task historyImplement and troubleshoot incremental pipelines☐ Ready ☐ Review ☐ Practice
Dynamic tables and materialized resultsDynamic tables, target lag, materialized views, refresh tradeoffsSelect the right managed refresh pattern for freshness and cost requirements☐ Ready ☐ Review ☐ Practice
Semi-structured dataVARIANT, OBJECT, ARRAY, FLATTEN, JSON pathing, schema inferenceParse, flatten, model, and optimize semi-structured workloads☐ Ready ☐ Review ☐ Practice
Performance tuningQuery Profile, warehouse sizing, clustering, pruning, search optimization, materialized viewsIdentify the bottleneck and choose a proportional optimization☐ Ready ☐ Review ☐ Practice
Security and access controlRBAC, grants, managed access schemas, masking, row access, tags, stages, integrationsGrant least privilege for pipelines without overexposing data or credentials☐ Ready ☐ Review ☐ Practice
Governance and observabilityAccount usage, information schema, query history, copy history, task history, access historyTrace pipeline behavior, audit access, and explain lineage-relevant metadata☐ Ready ☐ Review ☐ Practice
Data sharing and collaborationSecure views, shares, reader/consumer patterns, policy-aware sharingProtect sensitive data while enabling controlled downstream consumption☐ Ready ☐ Review ☐ Practice
Resilience and lifecycle operationsCloning, Time Travel concepts, replication/failover concepts, environment promotionDesign recoverable, testable, and promotable data engineering workflows☐ Ready ☐ Review ☐ Practice

Core Snowflake Architecture Checks

Can you explain these without notes?

  • How Snowflake separates storage, compute, and cloud services.
  • Why virtual warehouses can be scaled independently from stored data.
  • How database, schema, table, view, stage, pipe, stream, task, and integration objects relate.
  • How micro-partitions support pruning and why manual partition management is usually not the same as in traditional systems.
  • What metadata Snowflake tracks for query planning, loading, and object history.
  • How result caching can affect testing and performance interpretation.
  • Why workload isolation matters for ingestion, transformation, BI, and ad hoc workloads.
  • How zero-copy cloning can support development, testing, rollback, or point-in-time style workflows.
  • When temporary, transient, and permanent objects may be appropriate from a lifecycle perspective.
  • How object naming, database layering, and schema organization affect maintainability.

Architecture decision prompts

Scenario cueLikely design consideration
ETL jobs compete with BI usersSeparate warehouses or workload-specific warehouse strategy
Queries scan too much dataCheck pruning, clustering, filtering, data model, and Query Profile
Need isolated test data quicklyConsider cloning and environment promotion controls
Need multiple teams to build pipelinesUse role hierarchy, schemas, ownership model, and naming standards
Need cost attributionUse warehouse separation, tags where appropriate, and query/history metadata
Need repeatable deploymentsUse scripted DDL, versioned artifacts, and controlled grants

Data Loading and Unloading Checklist

Batch loading readiness

You should be able to read, modify, and troubleshoot common load patterns like:

CREATE OR REPLACE FILE FORMAT ff_orders_csv
  TYPE = CSV
  FIELD_OPTIONALLY_ENCLOSED_BY = '"'
  SKIP_HEADER = 1
  NULL_IF = ('NULL', '');

CREATE OR REPLACE STAGE stg_orders
  FILE_FORMAT = ff_orders_csv;

COPY INTO raw.orders
FROM @stg_orders
ON_ERROR = 'CONTINUE';

Be ready to explain:

  • Internal stages versus external stages.
  • User stages, table stages, and named stages.
  • When to define reusable file formats.
  • How COPY INTO <table> differs from COPY INTO <location>.
  • How Snowflake tracks loaded files to reduce accidental reloads.
  • When FORCE, PURGE, validation options, or load history checks may matter.
  • How to use PATTERN or file organization to target the right files.
  • How compression and file format choices affect load behavior.
  • How MATCH_BY_COLUMN_NAME can help with column alignment scenarios.
  • How staged file metadata can be used during ingestion.
  • How to validate load files before committing to a production load.
  • How unloading to cloud storage is controlled by file format, stage, and permissions.

Common load failure checks

SymptomWhat to inspect
Rows rejectedFile format, delimiter, quote handling, null handling, data types
Load succeeds but data is shiftedColumn order, header handling, delimiter, embedded separators
Files are not foundStage path, cloud storage path, pattern, integration, permissions
Files load twiceLoad history, FORCE, renamed files, pipeline idempotency
Load is slowFile sizing pattern, warehouse size, concurrency, compression, cloud storage layout
Numeric/date fields failFormat settings, regional formats, invalid source values, TRY_ functions
Semi-structured load failsJSON validity, outer arrays, compression, file format type
External stage access deniedStorage integration, cloud IAM/policy, encryption configuration, object path

Unloading readiness

  • Choose a target stage and file format for downstream consumers.
  • Control header behavior, delimiters, compression, and file naming where needed.
  • Understand the difference between exporting a query result and exporting a table.
  • Protect unloaded sensitive data with access controls and masking-aware design.
  • Validate whether downstream tools expect CSV, JSON, Parquet, or another format.

Continuous Ingestion and Pipeline Selection

Choose the ingestion pattern

RequirementPattern to considerWatch for
Scheduled large file batchesCOPY INTO with orchestrationIdempotency, file organization, retries
Near-real-time file arrival in cloud storageSnowpipe with notificationsCloud events, pipe definition, latency expectations
High-frequency row/event ingestionStreaming-oriented pattern or connectorOrdering, deduplication, schema drift, cost
Query files without immediately loading all dataExternal tables or staged-file query patternPerformance, metadata refresh, governance
Kafka or event-platform sourceSnowflake connector or streaming integration patternOffset handling, error topics, schema registry assumptions
Cross-cloud or multi-system ingestStorage integrations and controlled landing zonesCredentials, encryption, network path, auditability

Snowpipe readiness

  • Explain what a pipe does and how it references a COPY INTO statement.
  • Distinguish manual pipe refresh from event-based auto-ingest patterns.
  • Identify required dependencies: stage, file format, table, pipe, notification integration or cloud event setup.
  • Troubleshoot files that arrive but are not loaded.
  • Use load history or copy history to verify ingestion.
  • Explain how Snowpipe differs from scheduled warehouse-based batch loading.
  • Recognize when Snowpipe is not the right answer, such as heavy transformation during ingest or strict complex orchestration.

Example artifact to recognize:

CREATE OR REPLACE PIPE p_orders_auto
AS
COPY INTO raw.orders
FROM @stg_orders_auto
FILE_FORMAT = ff_orders_json;

Ingestion decision traps

TrapBetter exam-ready thinking
“Near real time” always means streaming APICheck whether the source is files, events, Kafka, or application rows
Use one large warehouse for everythingSeparate load, transform, and consumption workloads when isolation matters
Put all transformation inside COPYKeep loads simple when debuggability, replay, and lineage matter
Ignore rejected recordsCapture, inspect, and design a remediation path
Assume schema never changesPlan for schema drift, validation, and versioned raw zones

Transformation, ELT, and Data Modeling

SQL transformation skills

Be comfortable with:

  • CREATE TABLE AS SELECT patterns.
  • INSERT INTO ... SELECT for append pipelines.
  • MERGE for upsert and slowly changing dimension logic.
  • Window functions for deduplication, ranking, sessionization, and latest-record selection.
  • QUALIFY for filtering after window calculations.
  • Common table expressions for readable transformation stages.
  • TRY_CAST, TRY_TO_DATE, and related safe conversion functions.
  • Set operations for reconciliation.
  • Transactions for multi-step changes.
  • Idempotent design so reruns do not corrupt downstream tables.

Example MERGE pattern to understand:

MERGE INTO dim.customer tgt
USING stg.customer_updates src
  ON tgt.customer_id = src.customer_id
WHEN MATCHED THEN UPDATE SET
  tgt.email = src.email,
  tgt.updated_at = src.updated_at
WHEN NOT MATCHED THEN INSERT (
  customer_id,
  email,
  updated_at
) VALUES (
  src.customer_id,
  src.email,
  src.updated_at
);

Modeling checks

Model/design topicReady when you can…
Raw, curated, and consumption layersExplain why each layer exists and what transformations belong there
Star schemaIdentify facts, dimensions, grain, conformed dimensions, and surrogate keys
Snowflake schemaRecognize normalized dimensions and tradeoffs
Data Vault-style patternsRecognize hubs, links, satellites at a conceptual level if used in your environment
SCD Type 1Overwrite old values safely
SCD Type 2Preserve history using effective dates, current flags, or similar design
DeduplicationUse keys, timestamps, hashes, and window functions appropriately
Late-arriving dataDecide whether to restate facts, update dimensions, or hold exceptions
Schema evolutionSeparate raw capture from curated contract changes
Data qualityAdd checks for nulls, uniqueness, referential consistency, ranges, and freshness

Transformation decision prompts

Scenario cueWhat to consider
Need exact replay from sourcePreserve raw immutable data and load metadata
Need latest customer state onlyType 1 or current-state table may be enough
Need historical reportingType 2 dimension or historized model
Need reusable business metricCurated table, view, or governed semantic layer pattern
Need complex procedural logicStored procedure or Snowpark may fit better than a single SQL statement
Need simple dependency-based SQL refreshDynamic table or task graph may be appropriate

Streams, Tasks, and Incremental Processing

Streams readiness

  • Explain what change data a stream captures.
  • Distinguish standard and append-only stream use cases.
  • Understand that streams are offsets over source changes, not permanent event archives.
  • Recognize stream metadata columns such as action/update indicators at a conceptual level.
  • Explain how consuming a stream advances its offset.
  • Avoid designing pipelines that let streams go stale.
  • Use streams with MERGE or INSERT for incremental downstream updates.
  • Know how to check whether a stream has data before running work.

Example pattern:

CREATE OR REPLACE STREAM str_orders
ON TABLE raw.orders;

MERGE INTO curated.orders tgt
USING str_orders src
  ON tgt.order_id = src.order_id
WHEN MATCHED THEN UPDATE SET
  tgt.status = src.status,
  tgt.updated_at = src.updated_at
WHEN NOT MATCHED THEN INSERT (
  order_id,
  status,
  updated_at
) VALUES (
  src.order_id,
  src.status,
  src.updated_at
);

Tasks readiness

  • Create a scheduled task.
  • Create a task that runs after another task.
  • Understand task graphs and dependency ordering.
  • Know when a task needs a warehouse or serverless-style execution option.
  • Resume, suspend, and inspect task state.
  • Troubleshoot missed or failed task runs.
  • Use task history to identify failures, duration, and timing.
  • Design tasks to be idempotent and safe to rerun.

Example task pattern:

CREATE OR REPLACE TASK t_curate_orders
  WAREHOUSE = wh_transform
  SCHEDULE = 'USING CRON 0 * * * * UTC'
AS
  CALL sp_curate_orders();

Streams and tasks traps

TrapWhy it matters
Treating a stream as a durable audit logStreams are for change consumption, not long-term retention
Forgetting task stateA valid task definition does not help if it is not running
Consuming stream data without transaction awarenessFailed or partial logic can create inconsistent downstream state
Combining too much work in one taskHarder to retry, debug, and monitor
Ignoring empty-stream checksWastes compute and complicates monitoring
No rerun strategyFailed incremental jobs can create duplicates or missed changes

Dynamic Tables, Materialized Views, and Managed Refresh

Feature selection checklist

NeedConsiderKey tradeoff
Declarative pipeline with freshness targetDynamic tableLess custom orchestration, but understand refresh behavior
Repeated query acceleration over stable logicMaterialized viewMaintenance overhead and eligibility considerations
Full control over incremental logicStreams and tasksMore orchestration responsibility
Simple virtual abstractionViewNo stored result by default
Physical curated tableCTAS, INSERT, MERGE, task-driven ELTMore control, more operations
Low-latency point lookupsSearch optimization or data model changeCost and workload fit

Dynamic table readiness

  • Explain target lag at a conceptual level.
  • Understand how upstream dependencies affect downstream freshness.
  • Distinguish dynamic tables from views, materialized views, streams, and tasks.
  • Recognize when incremental refresh is beneficial.
  • Recognize when full refresh behavior or unsupported query patterns may affect design.
  • Monitor refresh state and failures.
  • Design dynamic table chains with clear ownership and cost expectations.

Example to recognize:

CREATE OR REPLACE DYNAMIC TABLE curated.daily_order_summary
  TARGET_LAG = '1 hour'
  WAREHOUSE = wh_transform
AS
SELECT
  order_date,
  COUNT(*) AS order_count,
  SUM(order_amount) AS total_amount
FROM curated.orders
GROUP BY order_date;

Semi-Structured Data Readiness

Core skills

  • Load JSON, Parquet, Avro, ORC, or XML-style data when relevant.
  • Use VARIANT, OBJECT, and ARRAY appropriately.
  • Navigate nested fields using path expressions.
  • Flatten arrays with LATERAL FLATTEN.
  • Handle missing fields and mixed types safely.
  • Distinguish SQL NULL from semi-structured null-like values.
  • Decide when to keep data semi-structured versus extracting columns.
  • Optimize frequently filtered attributes by modeling them explicitly when needed.
  • Use schema inference where appropriate, while validating results.

Example to understand:

SELECT
  src:value:customer:id::STRING AS customer_id,
  item.value:sku::STRING AS sku,
  item.value:quantity::NUMBER AS quantity
FROM raw.events,
LATERAL FLATTEN(input => src:value:items) item;

Semi-structured scenario checks

ScenarioGood response
JSON structure changes oftenLand raw as VARIANT, curate stable columns later
BI filters repeatedly on nested fieldConsider extracting the field or using an optimization feature
Arrays create duplicated rowsCheck FLATTEN logic and output grain
Type conversions failUse safe casts and data quality exception handling
Source sends multiple event versionsStore version metadata and branch parsing logic

Performance and Optimization Checklist

Query tuning readiness

You should be able to use Query Profile and history views to answer:

  • Is the query scanning too much data?
  • Are joins exploding row counts?
  • Is a filter applied too late?
  • Is the warehouse under-sized for the workload, or is the SQL inefficient?
  • Is concurrency the issue rather than single-query speed?
  • Is result caching hiding the real cost?
  • Are repeated transformations recomputing the same expensive logic?
  • Would clustering, materialization, or search optimization help?
  • Would a data model change help more than a warehouse change?
  • Is the workload better isolated on a different warehouse?

Optimization tools and when to consider them

Tool or design choiceBest fitCaution
Warehouse scaling upIndividual queries need more computeMay not fix poor pruning or bad joins
Multi-cluster/concurrency scaling patternMany simultaneous queriesHelps concurrency more than bad SQL
Auto-suspend/auto-resumeCost control for intermittent workloadsToo aggressive settings may add latency
Clustering keyLarge tables filtered by common selective patternsMaintenance cost must be justified
Materialized viewRepeated expensive query patternNot every query is eligible or worth materializing
Dynamic tableManaged refreshed transformationUnderstand freshness and refresh behavior
Search optimizationSelective point lookups or specific access patternsNot a general replacement for modeling
Query rewriteBad joins, missing predicates, overuse of SELECT *Often cheaper than adding compute
Table redesignWrong grain, huge semi-structured scans, poor join keysRequires downstream coordination

Performance traps

TrapExam-ready correction
“Make the warehouse bigger” for every slow queryInspect Query Profile first
Assume clustering is always neededCheck pruning and workload patterns
Use materialized views for all dashboardsValidate repetition, freshness, and maintenance tradeoff
Ignore join cardinalityRow explosion can dominate cost
Filter after flattening everythingPush filters as early as possible
Benchmark with cached results onlyTest in a way that reflects real workload behavior
Optimize one query while harming pipelinesBalance workload-level performance and cost

Security, RBAC, and Protected Data Engineering

RBAC readiness

  • Explain role-based access control in Snowflake.
  • Distinguish object ownership from object usage privileges.
  • Grant least privilege for databases, schemas, tables, views, stages, tasks, pipes, and integrations.
  • Understand future grants and why they matter for automated object creation.
  • Use managed access schemas when centralized grant control is required.
  • Avoid using overly powerful roles for routine pipelines.
  • Troubleshoot “object does not exist or not authorized” errors.
  • Design role hierarchy for developers, pipeline service roles, analysts, and administrators.

Data protection readiness

ControlUse case
Masking policyHide or transform sensitive column values based on role or context
Row access policyRestrict visible rows by user, role, tenant, region, or business rule
TagsClassify data and support governance automation
Secure viewShare or expose controlled logic with protection considerations
Storage integrationAccess cloud storage without embedding long-lived credentials directly in SQL
Network policy/private connectivity conceptsRestrict or control access paths where required
Access historyInvestigate who accessed what data, subject to available metadata
Object dependenciesUnderstand impact before changing governed objects

Example masking policy pattern to recognize:

CREATE OR REPLACE MASKING POLICY mp_email AS
  (val STRING) RETURNS STRING ->
    CASE
      WHEN CURRENT_ROLE() IN ('DATA_PRIVILEGED_ROLE') THEN val
      ELSE '***MASKED***'
    END;

Security scenario checks

Scenario cueDecision point
Pipeline loads from cloud storageUse storage integration and grant only required stage/table privileges
Analysts need data but not PIIUse views, masking policies, row access policies, or curated tables
Multiple teams create objects in shared schemaReview ownership, managed access, future grants, and naming rules
Data shared with external consumerUse secure sharing patterns and policy-aware design
Service account runs tasksAssign a dedicated role with minimal required privileges
Access error occurs after deploymentCheck active role, database/schema usage, object privileges, ownership, and future grants

Governance, Metadata, and Observability

Monitoring artifacts to know

ArtifactUse it to answer…
Query historyWhat ran, when, by whom, on which warehouse, and how long?
Copy/load historyWhich files loaded, failed, or were skipped?
Task historyDid scheduled work run successfully?
Pipe status/historyIs continuous ingestion working?
Stream metadata/checksIs there change data waiting?
Access historyWho accessed sensitive data or governed objects?
Object dependenciesWhat breaks if this table, view, or policy changes?
Warehouse metricsIs the issue concurrency, size, queueing, or workload mix?
Tags/classification metadataWhich objects contain sensitive or governed data?

Pipeline observability checklist

  • Every production pipeline has an owner.
  • Load jobs record source file, load time, row count, and exception count where useful.
  • Transform jobs can be rerun safely.
  • Task failures are visible and actionable.
  • Data quality checks run before downstream consumption when needed.
  • Freshness is measured, not assumed.
  • Cost is traceable to warehouses, tasks, or features where possible.
  • Sensitive data access is auditable.
  • Deployments include grant and policy changes, not just table definitions.

Data Sharing, Collaboration, and Lifecycle Operations

Sharing readiness

  • Explain the purpose of secure data sharing.
  • Know why secure views may be used for controlled exposure.
  • Understand that provider-side changes can affect consumers.
  • Consider masking and row access policies before sharing sensitive data.
  • Validate grants and object dependencies before exposing data.
  • Consider whether a physical copy, share, or curated export is the best fit.

Lifecycle and resilience checks

NeedSnowflake capability to review
Create test environment quicklyZero-copy clone concepts
Recover from accidental changeTime Travel concepts and recovery-oriented design
Promote code across environmentsVersioned DDL, deployment scripts, controlled grants
Validate production change safelyClone-based testing, representative data, rollback plan
Reduce blast radiusSeparate roles, warehouses, schemas, and environments
Cross-region or continuity planningReplication/failover concepts and dependency validation
Preserve pipeline historyAudit tables, metadata capture, and raw immutable zones

Snowpark, UDFs, and Stored Procedure Readiness

When procedural or programmatic logic appears

Be ready to identify when SQL alone is enough and when another tool is justified.

NeedPossible approach
Simple relational transformationSQL view, CTAS, MERGE, dynamic table, or task
Multi-step orchestration inside SnowflakeStored procedure called by task
Custom scalar logicUDF, if governance and performance fit
DataFrame-style transformationsSnowpark pattern
External library or complex logicValidate runtime, packaging, security, and maintainability assumptions
Reusable validation frameworkProcedure, table-driven checks, or orchestration tool

Readiness checks

  • Explain the difference between UDFs and stored procedures.
  • Recognize when UDFs can hurt performance or maintainability.
  • Understand how procedures can encapsulate multi-step pipeline logic.
  • Know that privileges, execution context, and ownership matter.
  • Decide whether logic should live in Snowflake, an orchestration platform, or an upstream application.
  • Keep transformations observable and testable even when packaged in code.

Scenario and Decision-Point Drill

Use this table as a rapid final-review drill. For each row, cover the answer column and decide what you would choose.

ScenarioWhat a strong answer should include
Files arrive every few minutes in cloud object storageSnowpipe or event-driven ingestion; verify stage, pipe, notifications, load history
Daily full extract must replace curated tableBatch load plus swap/transactional pattern; validate row counts and rollback
Source sends changed customer recordsStream plus MERGE, or source CDC pattern; handle updates/deletes and idempotency
Dashboard query is slow on a large tableQuery Profile, pruning, warehouse, clustering/materialization/search optimization only if justified
Analysts need masked PIIMasking policy, row access if needed, roles, secure views or curated tables
Semi-structured events have unpredictable attributesLand raw VARIANT, curate stable fields, track schema versions
Task graph stops runningCheck task state, history, owner role, warehouse, dependencies, errors
Stream has no data but source changedCheck source table, stream offset, transaction consumption, staleness, role/object
COPY loads zero rowsStage path, pattern, file format, load history, previous load state, privileges
Need low-cost development copyClone concepts, environment isolation, grants, and lifecycle cleanup
Need freshness target without custom orchestrationDynamic table if query pattern and refresh behavior fit
Need repeated precomputed query resultMaterialized view or dynamic table depending transformation and freshness need
Need to share governed data externallySecure sharing design, masking/row policies, provider/consumer dependency review
Cost spike after pipeline changeQuery history, warehouse usage, task frequency, feature usage, data volume, query plan

Common Weak Areas for DEA-C02 Candidates

Technical weak spots

  • Knowing syntax but not knowing when to use each feature.
  • Confusing Snowpipe, streams/tasks, dynamic tables, and materialized views.
  • Treating all slow queries as warehouse sizing problems.
  • Forgetting that access issues can involve database, schema, object, stage, integration, and active role.
  • Designing incremental pipelines without idempotency.
  • Not validating load history before rerunning a failed load.
  • Flattening semi-structured data without preserving correct grain.
  • Ignoring operational metadata such as task history, query history, and copy history.
  • Overusing procedural code when SQL or declarative refresh would be clearer.
  • Underestimating governance controls in pipeline design.

Scenario-reading traps

Wording in questionBe careful
“Most cost-effective”Do not automatically choose maximum warehouse size or always-on compute
“Near real time”Identify whether the source is file arrival, message stream, or application event
“Least privilege”Grants and ownership matter; avoid administrator-style answers
“Repeated query”Consider materialization, but verify freshness and maintenance needs
“Changing schema”Separate raw ingestion from curated contracts
“Failed after deployment”Check roles, grants, object ownership, task state, and integrations
“Large table”Large alone does not require clustering; access pattern matters
“Securely load from cloud storage”Prefer integration-based patterns over embedded credentials
“Incremental”Decide whether streams, dynamic tables, MERGE, or source CDC fits best

Final-Week Checklist

7 to 5 days out

  • Rebuild a batch load from stage to raw table without notes.
  • Practice troubleshooting at least three COPY INTO failures.
  • Review Snowpipe dependencies and monitoring points.
  • Write one stream-plus-MERGE pipeline.
  • Create or diagram one task graph.
  • Compare streams/tasks, dynamic tables, materialized views, and views.
  • Review Query Profile examples and identify bottlenecks.
  • Revisit RBAC, future grants, managed access, and stage/integration privileges.

4 to 2 days out

  • Drill scenario questions on feature selection.
  • Review semi-structured parsing and FLATTEN.
  • Practice SCD Type 1 and Type 2 design decisions.
  • Review masking policies, row access policies, tags, and secure views.
  • Check observability artifacts: query history, task history, copy history, pipe status.
  • Summarize performance tools and when not to use them.
  • Review deployment and lifecycle topics: cloning, environment promotion, rollback thinking.

Last day

  • Stop memorizing obscure syntax and focus on decision quality.
  • Re-read your weak-area notes.
  • Do a short mixed scenario set.
  • Review common traps: permissions, idempotency, incremental state, semi-structured grain, performance overreaction.
  • Sleep and avoid making major changes to your study plan.

Practical Next Step

Pick the three weakest readiness areas from the table at the top of this page. For each one, do a short hands-on rebuild or scenario drill, then answer: Which Snowflake feature would I choose, why, what could fail, and how would I prove it is working?