Free SnowPro Core COF-C02 Full-Length Practice Exam: 100 Questions

Try 100 free SnowPro Core COF-C02 questions across the exam domains, with explanations, then continue with full IT Mastery practice.

This free full-length SnowPro Core COF-C02 practice exam includes 100 original IT Mastery questions across the exam domains.

These questions are for self-assessment. They are not official exam questions and do not imply affiliation with the exam sponsor.

Count note: this page uses the full-length practice count maintained in the Mastery exam catalog. Some certification vendors publish total questions, scored questions, duration, or unscored/pretest-item rules differently; always confirm exam-day rules with the sponsor.

Need concept review first? Read the SnowPro Core COF-C02 Cheat Sheet on Tech Exam Lexicon, then return here for timed mocks and full IT Mastery practice.

Open the matching IT Mastery practice page for timed mocks, topic drills, progress tracking, explanations, and full practice.

Try SnowPro Core COF-C02 on Web
View full SnowPro Core COF-C02 practice page

Exam snapshot

  • Exam route: SnowPro Core COF-C02
  • Practice-set question count: 100
  • Time limit: 115 minutes
  • Practice style: mixed-domain diagnostic run with answer explanations

Full-length exam mix

  • Snowflake Architecture and Key Features: 20%
  • Account Setup, Security, and Governance: 20%
  • Data Loading, Unloading, and Transformation: 20%
  • Query Processing and Performance Optimization: 15%
  • Data Sharing, Collaboration, and Marketplace: 15%
  • Operations, Monitoring, and Business Continuity: 10%

Use this as one diagnostic run. IT Mastery gives you timed mocks, topic drills, analytics, code-reading practice where relevant, and full practice.

Practice questions

Questions 1-25

Question 1

Topic: Data Sharing, Collaboration, and Marketplace

In Snowflake, which principle best describes the purpose of using ARRAY-construction functions when working with semi-structured data?

Options:

  • A. They create in-query lists of values as array objects so multiple values can be handled together as a single semi-structured value.

  • B. They optimize micro-partition pruning by physically clustering related values together on storage.

  • C. They enforce row-level security rules by hiding specific elements in semi-structured data.

  • D. They automatically normalize semi-structured data into fully relational tables with one row per array element.

Best answer: A

Explanation: ARRAY-construction functions in Snowflake are used to build arrays—ordered lists of values—directly inside SQL queries. Instead of handling each value separately in multiple columns or rows, you can group them into a single array value, usually stored in a VARIANT column or passed as a parameter.

For example, you might construct an array from literals or expressions like:

SELECT ARRAY_CONSTRUCT(10, 20, 30) AS amounts;

This produces a single semi-structured value representing the list [10, 20, 30]. You can then store this value, pass it to a UDF, or later use functions like FLATTEN or array access syntax to read or transform its elements.

The core principle is: ARRAY-construction functions let you treat multiple related values as one semi-structured object inside Snowflake SQL, which is especially useful when working with semi-structured data patterns such as JSON-like lists or when you want to bundle parameters or keys together in a single value.
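
For illustration, two hedged follow-on examples (reusing the same literal array, with hypothetical aliases) of reading the constructed value back out:

-- Access a single element by zero-based index (returns 20)
SELECT ARRAY_CONSTRUCT(10, 20, 30)[1] AS second_value;

-- Expand the array into one row per element with the FLATTEN table function
SELECT f.value AS amount
FROM TABLE(FLATTEN(input => ARRAY_CONSTRUCT(10, 20, 30))) f;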


Question 2

Topic: Operations, Monitoring, and Business Continuity

A company currently shares Snowflake data with partners by unloading daily files to external cloud storage. Partners use Snowflake accounts in different regions and cloud platforms. They now want near-real-time, copy-free access with minimal operations. Which approach is the best fit?

Options:

  • A. Increase the unload frequency to every few minutes and let partners load the files into their own Snowflake accounts using automated COPY jobs.

  • B. Enable Snowflake database replication to the partners’ regions, then create secure shares from the replicated database so each partner queries the data directly in its own Snowflake account.

  • C. Create a single large virtual warehouse and share its credentials with partners so they can run read-only queries directly in the provider’s Snowflake account.

  • D. Create reader accounts for each partner in their own regions without using replication, then grant the reader accounts access to the existing database.

Best answer: B

Explanation: Snowflake data sharing is designed to let providers expose data to consumers without copying data files. Within the same region and cloud platform, a secure share points directly to the provider’s database, and the consumer creates a database from that share. The consumer then queries the shared data as if it were local, with no ingestion pipelines.

For cross-region or cross-cloud scenarios, Snowflake uses replication-backed sharing. The provider designates a database for replication and configures it to replicate to one or more target accounts in other regions or cloud platforms. In each target, Snowflake maintains a read-only replica of the provider’s database. From that replica, the provider (or owning account) creates secure shares or listings for consumers in that region.

This pattern gives consumers near-real-time access to shared data in their own Snowflake accounts while avoiding file exports, manual COPY operations, and extra storage copies. It also keeps strong governance boundaries because each consumer has its own account, roles, and warehouses, while the provider controls what is shared and how often it is replicated.
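
A minimal sketch of this replication-backed sharing pattern, assuming hypothetical database, organization, and account names:

-- In the provider's source account: allow the database to replicate to a target account
ALTER DATABASE partner_data ENABLE REPLICATION TO ACCOUNTS myorg.provider_target_acct;

-- In the provider's target account (other region or cloud): create and refresh the replica
CREATE DATABASE partner_data AS REPLICA OF myorg.provider_source_acct.partner_data;
ALTER DATABASE partner_data REFRESH;

-- Still in the target account: expose the replicated data through a secure share
CREATE SHARE partner_share;
GRANT USAGE ON DATABASE partner_data TO SHARE partner_share;
GRANT USAGE ON SCHEMA partner_data.public TO SHARE partner_share;
GRANT SELECT ON TABLE partner_data.public.orders TO SHARE partner_share;
ALTER SHARE partner_share ADD ACCOUNTS = consumer_acct;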


Question 3

Topic: Data Loading, Unloading, and Transformation

Which statement BEST describes when Snowflake can reuse the result cache to answer a query?

Options:

  • A. When the source data files are already stored in an internal stage, allowing COPY commands to load data faster without scanning cloud storage again.

  • B. When a query uses Time Travel to read historical data from a table, so the historical result is automatically served from the result cache.

  • C. When a table’s micro-partitions are already in the local warehouse cache, so subsequent queries never need to read from Snowflake storage.

  • D. When an identical query is re-executed, the underlying data has not changed, and the previous result is still within the result cache retention period, even if the warehouse is suspended.

Best answer: D

Explanation: The Snowflake result cache stores complete results of queries that have already been executed. If the same query (including text and relevant session settings) is run again, Snowflake first checks whether a valid cached result exists. If the underlying data referenced by the query has not changed since the previous execution and the cached result is still within its retention period, Snowflake can return the stored result directly.

This behavior is independent of whether the virtual warehouse is currently running; Snowflake can serve the cached result without consuming additional warehouse compute credits. This makes the result cache different from the warehouse cache, which speeds up data reads but still requires the query to run on a warehouse and scan data blocks.
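
A minimal sketch of the reuse pattern described above, using a hypothetical table:

-- First execution runs on a warehouse and its result is stored in the result cache
SELECT region, SUM(amount) AS total_sales
FROM sales
GROUP BY region;

-- Re-running the identical statement later, while the underlying data is unchanged,
-- can be answered from the result cache even if the warehouse is suspended
SELECT region, SUM(amount) AS total_sales
FROM sales
GROUP BY region;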


Question 4

Topic: Data Sharing, Collaboration, and Marketplace

Which statement BEST describes the trade-off when using Snowflake’s approximate aggregate functions such as APPROX_COUNT_DISTINCT instead of exact aggregates like COUNT(DISTINCT ...)?

Options:

  • A. They ignore a random subset of rows, which makes them unsuitable for any production workload in Snowflake.

  • B. They are only available for VARIANT columns and semi-structured data, where exact aggregation is not supported.

  • C. They sacrifice some accuracy in exchange for lower latency and reduced compute resource usage on large datasets.

  • D. They always return exactly the same result as COUNT(DISTINCT ...) but use more compute resources to guarantee consistency.

Best answer: C

Explanation: Snowflake’s approximate aggregate functions, such as APPROX_COUNT_DISTINCT, use probabilistic algorithms to estimate results instead of computing exact values. This approach typically requires fewer resources and completes more quickly, especially on very large datasets, making them well-suited for exploratory analytics, dashboards, or use cases where a small error margin is acceptable. However, because the results are approximate, they are not appropriate for use cases that require exact numbers, such as regulatory or financial reporting.

Exact aggregates like COUNT(DISTINCT ...) scan and compute the precise result, which can be more expensive and slower on large data volumes but provide guaranteed accuracy. Choosing between approximate and exact functions is therefore a trade-off between performance/cost and numerical precision.
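
For illustration, the two approaches side by side on a hypothetical page_views table:

-- Exact distinct count: precise, but can be slow and costly on very large tables
SELECT COUNT(DISTINCT user_id) AS exact_visitors FROM page_views;

-- Approximate distinct count: faster and cheaper, with a small estimation error
SELECT APPROX_COUNT_DISTINCT(user_id) AS approx_visitors FROM page_views;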


Question 5

Topic: Snowflake Architecture and Key Features

Which statement BEST describes a Snowflake share in the context of data sharing between accounts?

Options:

  • A. A Snowflake share is a metadata object in a provider account that references specific database objects and grants read-only access to those objects to one or more consumer accounts without copying the data.

  • B. A Snowflake share is a special role that, when granted to another account, automatically provides both read and write access to selected databases and schemas.

  • C. A Snowflake share is a full physical copy of a database that is exported from one account and imported into another, creating independent storage in each account.

  • D. A Snowflake share is a virtual warehouse configuration that allows multiple accounts to run queries on the same compute resources for shared cost savings.

Best answer: A

Explanation: A Snowflake share is a metadata object that a data provider defines to expose specific, read-only database objects (such as schemas, tables, and secure views) to one or more consumer accounts. The actual data is not copied; consumers create a database from the share in their own account and query the provider’s underlying storage, enabling governed, zero-copy data sharing.

The choice describing a full physical copy being exported and imported confuses sharing with data duplication. Snowflake’s sharing model is explicitly designed to avoid extra storage and to keep a single source of truth managed by the provider, which is why the metadata-based definition is correct.
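
A minimal sketch of the provider/consumer flow, with hypothetical object and account names:

-- Provider account: define the share and grant read-only access to specific objects
CREATE SHARE sales_share;
GRANT USAGE ON DATABASE sales TO SHARE sales_share;
GRANT USAGE ON SCHEMA sales.public TO SHARE sales_share;
GRANT SELECT ON TABLE sales.public.orders TO SHARE sales_share;
ALTER SHARE sales_share ADD ACCOUNTS = consumer_acct;

-- Consumer account: create a database from the share and query it like local data
CREATE DATABASE sales_shared FROM SHARE provider_acct.sales_share;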


Question 6

Topic: Data Sharing, Collaboration, and Marketplace

Which TWO statements about fraction-based vs fixed-size sampling using the SAMPLE clause in Snowflake are INCORRECT? (Select TWO.)

Options:

  • A. Fixed-size sampling guarantees that Snowflake scans only the number of rows requested, so it is always cheaper than fraction-based sampling.

  • B. Fraction-based sampling always returns exactly the same number of rows, even when you rerun the query on an unchanged table.

  • C. With fraction-based sampling, if the table size doubles and you keep the same percentage, the expected number of sampled rows roughly doubles.

  • D. Using SAMPLE (10) performs fraction-based sampling, returning roughly 10% of the rows from the table.

  • E. Using SAMPLE (100 ROWS) performs fixed-size sampling, requesting a sample based on a specific number of rows instead of a percentage.

Correct answers: A and B

Explanation: Snowflake’s SAMPLE clause can work in two main ways: fraction-based and fixed-size sampling.

Fraction-based sampling is used when you specify only a numeric value, such as SAMPLE (10). In this case, Snowflake interprets the number as a percentage and returns approximately that fraction of the table’s rows. As the table grows, the expected number of sampled rows grows proportionally.

Fixed-size sampling is used when you include the ROWS (or ROW) keyword, such as SAMPLE (100 ROWS). Here, you request a sample based on a target number of rows rather than a percentage. This is useful when you want roughly the same sample size even if table size changes.

Both sampling modes are random: the exact set of rows and even the exact number of rows returned can vary between executions. Also, Snowflake compute cost is based on warehouse time, not directly on the number of rows returned by SAMPLE, so you cannot assume that fixed-size sampling is always cheaper than fraction-based sampling.
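
For illustration, both sampling modes on a hypothetical web_events table:

-- Fraction-based sampling: roughly 10% of the table's rows
SELECT * FROM web_events SAMPLE (10);

-- Fixed-size sampling: a sample of about 100 rows regardless of table size
SELECT * FROM web_events SAMPLE (100 ROWS);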


Question 7

Topic: Snowflake Architecture and Key Features

A data engineering team is designing a Snowflake table to store semi-structured JSON web event data along with core user attributes used in joins and filters. Which of the following approaches should they AVOID? (Select TWO.)

Options:

  • A. Store all fields, including stable user IDs and primary keys, together in a single VARIANT column for maximum flexibility.

  • B. Land raw JSON in a VARIANT column, then parse out commonly queried attributes into dedicated typed columns in a later transformation step.

  • C. Use VARIANT columns to capture optional or evolving event attributes while keeping frequently filtered attributes as native NUMBER, VARCHAR, or TIMESTAMP columns.

  • D. Use FLATTEN and path expressions to query arrays and nested objects inside VARIANT when analyzing clickstream behavior.

  • E. Convert incoming JSON to a single large VARCHAR column before loading so that semi-structured capabilities are not used.

Correct answers: A and E

Explanation: Snowflake’s VARIANT data type is designed to store semi-structured data such as JSON, Avro, or Parquet while still allowing efficient querying using path expressions and functions like FLATTEN. However, VARIANT is not a replacement for all structured columns.

A good design typically combines native structured types (NUMBER, VARCHAR, TIMESTAMP, etc.) for stable, frequently joined or filtered attributes with VARIANT columns for flexible or evolving attributes. This preserves type checking, improves query performance and statistics, and keeps schemas understandable, while still allowing schema-on-read behavior for less predictable data.

Patterns that either push everything into a single VARIANT or ignore VARIANT entirely by storing JSON as plain text both fail to take advantage of Snowflake’s architecture and make the data harder to use.


Question 8

Topic: Data Loading, Unloading, and Transformation

In Snowflake, what is a primary benefit of the virtual warehouse data cache when the same table is queried repeatedly?

Options:

  • A. It creates and maintains physical indexes on frequently queried columns to speed up filter predicates.

  • B. It automatically extends Time Travel retention so old versions of the table remain queryable longer without extra cost.

  • C. It decreases Snowflake storage costs by permanently storing compressed copies of the data in the warehouse.

  • D. It reduces repeated remote storage reads by serving frequently accessed data from the warehouse cache.

Best answer: D

Explanation: Snowflake virtual warehouses maintain a data cache (often called the warehouse cache) that stores recently scanned micro-partitions in local, compute-side storage. When subsequent queries access the same data, Snowflake can read from this warmed cache instead of repeatedly fetching the same micro-partitions from remote storage.

By avoiding repeated remote storage I/O, queries that scan the same data benefit from lower latency and often complete faster, especially when the query pattern involves many repeated reads of a stable dataset (for example, dashboards or recurring reports). This cache is transparent to users: no special syntax or configuration is needed beyond running queries on the same active warehouse.

The cache does not change how long data is retained (Time Travel), does not affect storage billing for the underlying data, and does not introduce traditional index structures. Its main purpose is to reduce I/O for repeated scans of the same data, improving performance and sometimes reducing compute work for those queries.


Question 9

Topic: Snowflake Architecture and Key Features

A data engineer is designing automation for ELT pipelines in the Snowflake AI Data Cloud. They plan to use Snowflake tasks to schedule and orchestrate SQL-based workloads. Which of the following is NOT a correct use or expectation of Snowflake tasks for scheduling SQL-based workloads?

Options:

  • A. Expecting a task to automatically start whenever files land in external cloud storage, without any schedule, stream, or other dependency configured.

  • B. Creating a root task that runs hourly and defines a tree of dependent child tasks to orchestrate a multi-step ELT workflow.

  • C. Configuring a task to run a nightly MERGE statement into a reporting table using a dedicated, auto-suspending virtual warehouse.

  • D. Using a task with a 5-minute schedule and a WHEN condition that checks a stream for new data before running processing SQL.

Best answer: A

Explanation: Snowflake tasks are schema-level objects that let you schedule and automate SQL-based workloads. They primarily support time-based scheduling using a SCHEDULE clause, along with event-like behavior through dependencies (task trees) and WHEN conditions that check internal state such as streams.

Tasks do not directly subscribe to external events like file arrivals in cloud storage. To handle event-style ingestion, you typically combine Snowpipe (for automatic file loading) with streams and tasks that periodically check for new data using a WHEN condition such as SYSTEM$STREAM_HAS_DATA. The key idea is that tasks are driven by schedules and internal conditions, not by raw cloud storage notifications on their own.
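
A minimal sketch of the schedule-plus-stream pattern described above, using hypothetical object names:

-- Task on a 5-minute schedule that runs only when the stream has captured new rows
CREATE TASK process_new_events
  WAREHOUSE = etl_wh
  SCHEDULE = '5 MINUTE'
  WHEN SYSTEM$STREAM_HAS_DATA('EVENTS_STREAM')
AS
  INSERT INTO events_processed
  SELECT * FROM events_stream;

-- Tasks are created suspended and must be resumed before they start executing
ALTER TASK process_new_events RESUME;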


Question 10

Topic: Data Sharing, Collaboration, and Marketplace

Which of the following statements about optimizing queries on very large VARIANT columns in Snowflake is NOT correct?

Options:

  • A. Applying filters directly on VARIANT paths (for example, WHERE v:event:type = 'click') can help reduce the amount of data scanned.

  • B. Selecting the entire VARIANT column instead of specific paths usually performs better than referencing only the attributes needed by the query.

  • C. Avoiding unnecessary FLATTEN operations on large arrays in VARIANT values can help control both query runtime and compute cost.

  • D. Limiting the SELECT list to specific VARIANT paths (for example, v:customer:id) can reduce work and improve query performance.

Best answer: B

Explanation: When working with very large VARIANT columns in the Snowflake AI Data Cloud, performance is strongly influenced by how much of the semi-structured object each query needs to read and transform.

Projecting only the paths you actually need (for example, v:customer:id or v:metrics:clicks) allows Snowflake to avoid processing unnecessary parts of the VARIANT. Similarly, placing filters directly on specific VARIANT paths gives the optimizer more precise information, which can reduce the amount of data scanned via partition pruning and other optimizations.

Operations like FLATTEN can dramatically expand the number of rows, especially on large arrays. While FLATTEN is useful, applying it unnecessarily on large VARIANT data will increase both runtime and compute consumption.

For these reasons, the idea that selecting the entire VARIANT column is better for performance is incorrect. Doing so typically causes more data to be processed than required, which generally harms performance rather than improves it.
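
For illustration, a query that projects and filters on specific paths of a hypothetical VARIANT column v:

-- Reference only the needed paths instead of selecting the entire VARIANT value
SELECT
  v:customer:id::NUMBER    AS customer_id,
  v:metrics:clicks::NUMBER AS clicks
FROM raw_events
WHERE v:event:type::STRING = 'click';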


Question 11

Topic: Snowflake Architecture and Key Features

Which of the following statements about secure views in the Snowflake AI Data Cloud is NOT correct?

Options:

  • A. Secure views can expose only selected columns or filtered rows from base tables, helping limit what sensitive data consumers can see.

  • B. Secure views help prevent sensitive query results from being reused across roles through the result cache, adding an extra layer of protection.

  • C. Secure views are commonly used when sharing data with other accounts, so consumers query the view without direct access to the underlying tables.

  • D. Secure views automatically apply role-based dynamic masking to sensitive columns without any additional masking policies or logic.

Best answer: D

Explanation: Secure views in the Snowflake AI Data Cloud are specialized views that add stronger protection against data leakage. They are often used to expose a controlled subset of data while hiding underlying tables and reducing the risk of sensitive information being exposed via caches or query metadata.

However, secure views do not, by themselves, implement dynamic data masking based on user roles. To mask sensitive values, you must either embed masking logic in the view definition (for example, CASE expressions) or attach Snowflake masking policies to the underlying columns. Secure views then help ensure that whatever is exposed through the view is handled securely, but they are not a substitute for explicit masking policies or RBAC design.
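
As a hedged illustration, a secure view that embeds its own masking logic in a CASE expression (the object names and role check are hypothetical; a masking policy attached to the base column would be the policy-driven alternative):

CREATE SECURE VIEW customer_contact_v AS
SELECT
  customer_id,
  CASE
    WHEN CURRENT_ROLE() = 'SUPPORT_ADMIN' THEN email
    ELSE '***MASKED***'
  END AS email
FROM customers;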

Other statements correctly describe how secure views interact with result caching and data sharing, and how they limit visible columns and rows, which are key patterns for protecting sensitive data conceptually at the SnowPro Core level.


Question 12

Topic: Snowflake Architecture and Key Features

A team ingests clickstream events as JSON into the Snowflake AI Data Cloud. New attributes are added frequently by upstream services. They must 1) avoid frequent table DDL changes, 2) efficiently join on stable fields such as user_id and event_time, and 3) keep modeling effort low. Which table design is MOST appropriate?

Options:

  • A. Create a table with a single VARCHAR column that stores the raw JSON text, and use string functions to parse attributes when needed.

  • B. Create a table with a single VARIANT column for the full JSON and extract user_id and event_time from it in every query.

  • C. Create a wide table with dedicated typed columns for all current and expected JSON attributes, without using VARIANT.

  • D. Create a table with separate typed columns for user_id and event_time plus a VARIANT column that stores the full JSON event.

Best answer: D

Explanation: Snowflake supports both structured and semi-structured data. Traditional types such as VARCHAR, NUMBER, and TIMESTAMP are ideal for stable, well-understood columns that are commonly used for joins, filters, and aggregations.

For semi-structured data (JSON, Avro, Parquet, XML), Snowflake provides the VARIANT data type. VARIANT can store nested, evolving structures without requiring a fixed schema in advance. New attributes can appear in the incoming data without breaking existing queries or forcing immediate DDL changes.

A common best practice is a hybrid design: model stable, high-value attributes as strongly typed columns, and keep the raw or less predictable payload in a VARIANT column. This gives the convenience and performance of typed columns for key fields while still allowing flexible exploration of all other attributes via VARIANT and path expressions.

In this scenario, making user_id and event_time separate typed columns supports efficient joins and filters, while the VARIANT column captures all other semi-structured event details with minimal ongoing modeling effort.
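
A minimal sketch of that hybrid design, with hypothetical names:

-- Typed columns for stable join/filter keys, VARIANT for the evolving payload
CREATE TABLE click_events (
  user_id    NUMBER,
  event_time TIMESTAMP_NTZ,
  payload    VARIANT
);

-- New upstream attributes land in payload and stay queryable via path expressions
SELECT user_id, payload:page:url::STRING AS page_url
FROM click_events
WHERE event_time >= DATEADD(day, -7, CURRENT_TIMESTAMP());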


Question 13

Topic: Data Loading, Unloading, and Transformation

Which TWO statements about using Snowflake resource monitors and the Snowflake Budgets service to manage credit usage are correct? (Select TWO.)

Options:

  • A. Snowflake Budgets automatically block all new compute usage in the account as soon as the budget is exceeded, without any additional configuration.

  • B. Snowflake Budgets let you define a target credit usage for a period and trigger alerts when actual or forecast usage approaches or exceeds that target.

  • C. Resource monitors can automatically suspend one or more virtual warehouses when a defined credit-usage threshold is reached.

  • D. Resource monitors can retroactively refund credits for queries that ran after the quota was exceeded.

  • E. Resource monitors track and limit storage and data transfer costs in addition to compute credit usage.

Correct answers: B and C

Explanation: Snowflake provides two complementary mechanisms for managing credit spend.

Resource monitors attach directly to compute credit usage at the account or warehouse level. You define a credit quota and one or more thresholds (as percentages of that quota). When a threshold is reached, the monitor can trigger actions such as sending notifications or automatically suspending specific warehouses or even all warehouses in the account. This is the primary way to enforce hard or soft limits on compute usage.
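
A minimal sketch of a resource monitor assigned to a warehouse (names, quota, and thresholds are hypothetical):

-- Notify at 80% of the monthly quota and suspend assigned warehouses at 100%
CREATE RESOURCE MONITOR reporting_rm
  WITH CREDIT_QUOTA = 100
       FREQUENCY = MONTHLY
       START_TIMESTAMP = IMMEDIATELY
  TRIGGERS
    ON 80 PERCENT DO NOTIFY
    ON 100 PERCENT DO SUSPEND;

ALTER WAREHOUSE reporting_wh SET RESOURCE_MONITOR = reporting_rm;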

Snowflake Budgets focus on higher-level cost governance. You define a target spend (often in credits) over a time period and associate the budget with a scope, such as an account or a tag-based grouping of objects. As usage accumulates, Budgets compare actual and forecasted consumption against the target and can generate alerts when thresholds are approached or exceeded. Budgets are meant for monitoring and alerting, not for directly suspending compute.

Therefore, the statement that resource monitors can automatically suspend warehouses at thresholds is correct, and the statement that Budgets allow defining a target and alerting on approaching or exceeding that target is also correct. Claims that resource monitors refund credits, that Budgets automatically block all compute, or that resource monitors control storage and data transfer are not accurate.


Question 14

Topic: Snowflake Architecture and Key Features

Which TWO statements about Snowsight in the Snowflake AI Data Cloud are correct? (Select TWO.)

Options:

  • A. Snowsight must be installed as a desktop client application on a user’s computer before it can be used.

  • B. Snowsight is the recommended tool for fully automated batch jobs and CI/CD pipelines integrating with Snowflake.

  • C. Snowsight allows users to create and manage virtual warehouses, databases, and roles through a graphical interface.

  • D. Snowsight is only available to users with the ACCOUNTADMIN role; other roles must use SnowSQL instead.

  • E. Snowsight is the primary web-based user interface for managing and querying data in Snowflake.

Correct answers: C and E

Explanation: Snowsight is the primary, browser-based user interface for working with the Snowflake AI Data Cloud. It is designed for interactive use: running SQL queries, exploring data, managing many core objects (such as warehouses and databases), building worksheets and dashboards, and administering roles and grants when the logged-in role has sufficient privileges.

Snowsight does not require any local installation; users simply sign in via a supported web browser. It is available to all users regardless of role, as long as they can authenticate and their roles have the necessary privileges. For automation and CI/CD, Snowflake instead recommends programmatic interfaces such as SnowSQL, the language drivers and connectors, or the SQL API, which can be scripted and integrated into pipelines.


Question 15

Topic: Query Processing and Performance Optimization

A company has several Snowflake teams loading CSV files from the same external storage location. Today, each team hard-codes file format options in its own COPY INTO commands, causing inconsistent handling of NULLs and dates. The data platform team wants centralized governance and easy reuse of identical format settings. Which approach is MOST appropriate?

Options:

  • A. Have each team define its own file format options inline in the COPY INTO command, using standardized documentation as guidance.

  • B. Ask each team to create its own named FILE FORMAT object in its own schema, tuned to that team’s needs for the shared CSV files.

  • C. Store all format settings only on individual named stages and let each team create its own stage pointing to the same external location.

  • D. Create a shared named FILE FORMAT in a central schema and require all COPY INTO commands to reference that FILE FORMAT object.

Best answer: D

Explanation: Named FILE FORMAT objects in Snowflake let you define parsing options (such as delimiter, NULL handling, and date formats) once and reuse them across many COPY INTO commands and stages. This is ideal when many teams load data from the same source layout and you want consistency and easy governance.

By creating a shared FILE FORMAT in a central location and granting usage on it, administrators can enforce standard behavior for ingestion and change parsing rules in one place if the upstream file layout changes. All loads that reference the shared FILE FORMAT automatically pick up the new behavior without modifying every pipeline script.

Inline options or separate per-team definitions spread configuration across many places, increasing the risk of inconsistencies and making governance and updates harder.


Question 16

Topic: Query Processing and Performance Optimization

In Snowflake, why is it a best practice for multiple teams to use shared named FILE FORMAT objects for their COPY-based loading and unloading operations?

Options:

  • A. Because using a named FILE FORMAT lets each user automatically override parsing options at runtime without changing shared settings

  • B. Because a single, reusable FILE FORMAT centralizes settings such as delimiters and compression, keeping behavior consistent across all COPY operations that reference it

  • C. Because FILE FORMAT objects determine which encryption keys are used for individual users when reading or writing staged files

  • D. Because FILE FORMAT objects control how Snowflake creates and sizes micro-partitions for tables loaded with COPY

Best answer: B

Explanation: Shared named file format objects in Snowflake capture the reusable details about how files are structured, such as delimiters, compression, escape characters, and date formats. When teams point their COPY INTO commands and stages to the same file format object, every load or unload operation applies the same set of rules.

This pattern improves governance and consistency because changes to file-handling behavior (for example, switching compression type) can be made once in the shared file format and then automatically apply to all dependent jobs. It also reduces errors that occur when individual users repeatedly specify detailed file options in each COPY statement, where typos or forgotten options can cause inconsistent parsing, rejected files, or data quality issues.

By centralizing configuration and referencing it by name, organizations standardize how data is loaded and unloaded across teams while keeping the implementation simple and auditable.


Question 17

Topic: Snowflake Architecture and Key Features

Which of the following statements about Snowflake sequences is NOT correct?

Options:

  • A. The same sequence can be referenced from multiple tables, causing them to share one increasing stream of generated numbers.

  • B. A sequence can be used in a column DEFAULT expression (for example, DEFAULT my_seq.NEXTVAL) to populate surrogate keys during INSERT operations.

  • C. Snowflake sequences automatically reset to 1 for each table that uses them, ensuring surrogate keys always start at 1 per table.

  • D. A sequence is a schema-level object that generates numeric values when referenced using the NEXTVAL pseudo-column.

Best answer: C

Explanation: Snowflake sequences are schema-level objects designed to generate numeric values, commonly used for surrogate keys or technical identifiers. They are independent from any specific table. Applications or DDL/DML statements obtain new values by referencing the sequence’s NEXTVAL pseudo-column.

A typical pattern is to define a sequence once in a schema and then use it in one or more tables, often through a column DEFAULT expression. Every time an INSERT statement triggers that default, Snowflake calls my_seq.NEXTVAL and assigns the next numeric value from that sequence. The same sequence can be shared by multiple tables or used directly in ad hoc SQL. Sequences do not automatically reset for each table; instead, they produce a continuous stream of values wherever they are referenced.
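
For illustration, a sequence used as a column default (hypothetical names):

CREATE SEQUENCE dim_keys_seq START = 1 INCREMENT = 1;

CREATE TABLE dim_customer (
  customer_sk   NUMBER DEFAULT dim_keys_seq.NEXTVAL,
  customer_id   VARCHAR,
  customer_name VARCHAR
);

-- Inserts that omit customer_sk receive the next value from the sequence
INSERT INTO dim_customer (customer_id, customer_name)
VALUES ('C-1001', 'Acme Corp');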


Question 18

Topic: Operations, Monitoring, and Business Continuity

Which statement BEST describes a reader account in Snowflake data sharing?

Options:

  • A. A reader account is a special provider account that can create shares but cannot host its own data.

  • B. A reader account is a Snowflake-managed account that a data provider creates for consumers who do not have their own Snowflake account, allowing them to query shared data.

  • C. A reader account is a temporary database clone that allows a consumer to preview shared data without having any Snowflake account.

  • D. A reader account is a standard Snowflake account owned and billed independently by the data consumer, used to import a provider’s shared data as a full copy.

Best answer: B

Explanation: In Snowflake data sharing, a reader account is a fully functional Snowflake account that is created and managed by a data provider specifically for consumers who do not have their own Snowflake account. The reader account allows those consumers to connect, run queries, and build views or other objects over the shared data while the provider retains administrative control.

In contrast, a standard consumer account is an ordinary Snowflake account owned by the consumer’s organization, which directly receives the share and manages its own compute, users, and security. Provider accounts own datasets and create shares, but do not themselves function as reader accounts.
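
As a hedged illustration, a provider might create a reader account roughly like this (the name and credentials are placeholders):

CREATE MANAGED ACCOUNT partner_reader
  ADMIN_NAME = partner_admin,
  ADMIN_PASSWORD = 'Use-a-strong-generated-password-1',
  TYPE = READER;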


Question 19

Topic: Data Loading, Unloading, and Transformation

Which TWO statements correctly explain why data spilling shown in the Query Profile can slow down a query in Snowflake? (Select TWO.)

Options:

  • A. Data spilling causes intermediate data to be written to and read from storage instead of staying in memory, adding extra I/O and latency.

  • B. Data spilling shows that micro-partitions are being physically reorganized, which blocks other queries from accessing those partitions.

  • C. Data spilling means the virtual warehouse has automatically reduced its size during the query, lowering compute power and extending runtime.

  • D. Data spilling indicates that Snowflake has fully warmed its caches, so subsequent queries on the same data always run faster.

  • E. Data spilling means some operators must repeatedly move data between memory and storage, so they spend more time waiting on I/O than on computation.

Correct answers: A and E

Explanation: In Snowflake, the Query Profile can show when an operator has “spilled” data to local or remote storage. This happens when the working set of data for that step does not fully fit in memory allocated to the virtual warehouse. Instead of processing everything in memory, Snowflake must write intermediate results out to storage and read them back later.

Accessing storage is much slower than using data already held in memory. Every extra write and read adds I/O operations and latency, so operators spend more time waiting on data than performing computations. In some cases, the operator has to loop through the data in multiple passes, repeatedly moving chunks between memory and storage. This additional I/O and waiting time is why significant spilling, as shown in the Query Profile, often correlates with slower query performance.

Spilling does not mean that warehouses are shrinking, partitions are locked, or caches are necessarily warmed. It is specifically a signal of memory pressure during query execution, which typically reduces performance compared with an in-memory execution plan.


Question 20

Topic: Account Setup, Security, and Governance

Which TWO of the following statements about Snowflake security best practices are NOT correct? (Select TWO.)

Options:

  • A. Enabling multi-factor authentication (MFA) or SSO for interactive users is a recommended best practice to strengthen account security.

  • B. Applying least privilege means granting users only the minimum roles and privileges required to perform their job duties.

  • C. Granting the ACCOUNTADMIN role to all developers ensures they do not encounter permission errors and is considered a scalable security approach.

  • D. Service integrations and automation should use dedicated Snowflake service users with narrowly scoped roles, rather than reusing personal user accounts.

  • E. Users should frequently share a single Snowflake login and password within a team to simplify permission management.

Correct answers: C and E

Explanation: Snowflake security best practices center on strong, individual authentication and careful role-based access control. Each person should have their own Snowflake user, with permissions granted via roles that follow the principle of least privilege. High-privilege roles such as ACCOUNTADMIN must be tightly controlled.

Strong authentication (such as MFA or SSO/federation) is recommended to protect against password compromise. For non-interactive workloads and integrations, dedicated service users with restricted roles provide both clear auditing and minimized risk. In contrast, shared logins and broadly granting powerful roles are unsafe and violate core security principles.


Question 21

Topic: Data Sharing, Collaboration, and Marketplace

A team is onboarding log data files from external storage into the Snowflake AI Data Cloud. Their current ingestion pipeline is configured with file formats only for JSON, AVRO, and PARQUET; XML files must be converted before loading. The storage bucket currently contains the following files:

  • JSON: 320 files
  • PARQUET: 450 files
  • AVRO: 230 files
  • XML: 120 files

Assuming they load only the formats already configured (JSON, AVRO, PARQUET), how many files can they load immediately without conversion? (Round to the nearest whole file.)

Options:

  • A. 1,120 files

  • B. 870 files

  • C. 1,000 files

  • D. 1,230 files

Best answer: C

Explanation: Snowflake can ingest several semi-structured formats, and in this scenario the pipeline is explicitly configured for JSON, AVRO, and PARQUET. To find how many files can be loaded immediately, you add the counts for just those three formats and exclude XML, which the team has stated must be converted first.

Adding the configured formats gives 320 JSON files, 450 PARQUET files, and 230 AVRO files, for a total of 1,000 files that can be loaded without any conversion step. This reinforces that JSON, AVRO, and PARQUET are common semi-structured formats used with Snowflake.


Question 22

Topic: Data Sharing, Collaboration, and Marketplace

An analytics team stores unstructured image files in an internal named stage. They need to send one-time download links for specific images to external partners who do not have Snowflake accounts. Security requires that links automatically expire after a few hours and never expose permanent cloud storage credentials. Which approach best meets these requirements?

Options:

  • A. Share the internal stage URL and have partners access it by logging in through Snowflake reader accounts with file download privileges.

  • B. Move the images to a public cloud storage container and share the unsigned HTTP URLs with partners for direct download.

  • C. Provide direct cloud storage URLs for the underlying storage location of the stage, secured with long-lived access keys managed outside Snowflake.

  • D. Generate presigned URLs for the specific files in the internal stage with a short expiration time and share those URLs with partners.

Best answer: D

Explanation: Presigned URLs are specifically designed to provide secure, time-limited access to individual files stored in Snowflake stages, including internal stages used for unstructured data. When Snowflake generates a presigned URL, it embeds a token and an explicit expiration time into the link. External users can then download the file directly without needing a Snowflake account or separate credentials.

In this scenario, the team must share files with external partners who do not have Snowflake accounts, and they must ensure that access automatically expires after a few hours and does not reveal permanent cloud storage credentials. Presigned URLs satisfy all of these requirements: they are scoped to a specific object, have a configurable lifetime, and avoid exposing underlying storage keys.

By contrast, internal stage URLs are intended for use by authenticated Snowflake users with appropriate privileges, and cloud storage URLs (public or key-based) either expose long-lived credentials or provide overly broad access that does not automatically expire. Therefore, generating presigned URLs for files in the internal stage is the best solution.
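
A minimal sketch, assuming an internal stage named @image_stage and a hypothetical file path; the third argument is the link lifetime in seconds (14400 = 4 hours):

SELECT GET_PRESIGNED_URL(@image_stage, 'campaign/banner_001.png', 14400) AS download_url;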


Question 23

Topic: Query Processing and Performance Optimization

A data engineering team loads about 200GB of CSV files each day from an internal stage into a Snowflake table and occasionally needs to correct a few individual rows. Which loading approach is NOT recommended for this scenario?

Options:

  • A. When business rules change and the entire table must be rebuilt from another table, use INSERT OVERWRITE to replace all existing rows.

  • B. Run COPY INTO in VALIDATION_MODE first to detect data issues, then rerun COPY INTO without VALIDATION_MODE to perform the actual 200GB load.

  • C. Use COPY INTO from the internal stage for the 200GB daily files, and use small INSERT statements only for the occasional row-level corrections.

  • D. For the 200GB daily load, use a single large INSERT statement that selects directly from the staged files instead of using COPY INTO.

Best answer: D

Explanation: When loading large volumes of data from staged files, Snowflake recommends using the COPY INTO command. COPY INTO is designed and optimized for bulk file ingestion: it can process multiple files efficiently, tracks load history, and offers robust error handling and options like validation, parallelism, and partial loads.

INSERT and INSERT OVERWRITE are SQL DML commands best suited for small batches, row-level changes, or rebuilding a table from another table or query, not for high-volume file ingestion. While it might be technically possible to select from staged files using INSERT…SELECT patterns, this bypasses COPY INTO’s purpose-built bulk loading capabilities, resulting in poorer performance, harder monitoring, and potentially more fragile pipelines.

In the scenario, the best pattern is to use COPY INTO for the large, 200GB daily load and reserve INSERT for the occasional row-level corrections. Using a huge INSERT directly from staged files for the bulk load is the clearly wrong choice compared with COPY INTO.
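
For illustration, the recommended split between bulk COPY INTO and occasional DML (the stage, table, and format options are hypothetical):

-- Optional dry run that reports parsing errors without loading any rows
COPY INTO sales_raw
FROM @daily_stage
FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
VALIDATION_MODE = RETURN_ERRORS;

-- Bulk load of the staged CSV files
COPY INTO sales_raw
FROM @daily_stage
FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);

-- Occasional row-level correction handled separately with DML
UPDATE sales_raw SET amount = 42.50 WHERE order_id = 'SO-123';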


Question 24

Topic: Data Loading, Unloading, and Transformation

A data engineer is investigating slow performance on a reporting warehouse. In Snowsight they open Query History and see the following:

Exhibit:

QUERY_ID      START_TIME           TOTAL_ELAPSED_TIME_MS  BYTES_SCANNED  ROWS_PRODUCED
01a1-xyz-001  2025-06-10 09:01:22  145000                 18,000,000     120,000
01a1-xyz-002  2025-06-10 09:03:10  12000                  2,500,000      15,000
01a1-xyz-003  2025-06-10 09:04:05  18000                  3,200,000      20,500

They want to understand why query 01a1-xyz-001 was slow, including which steps consumed the most time. Which Snowflake feature should they use next?

Options:

  • A. Increase the warehouse size and re-run all queries shown in Query History

  • B. Query the ACCOUNT_USAGE.QUERY_HISTORY view directly to see the join and scan steps for query 01a1-xyz-001

  • C. Use Query History filters to show the detailed execution graph for query 01a1-xyz-001

  • D. Open the Query Profile for query 01a1-xyz-001 to see operator-level execution details

Best answer: D

Explanation: The exhibit shows Snowsight Query History with summary metrics per query: TOTAL_ELAPSED_TIME_MS, BYTES_SCANNED, and ROWS_PRODUCED. This view is excellent for workload-level analysis, such as spotting outlier queries. Here, query 01a1-xyz-001 clearly stands out with a TOTAL_ELAPSED_TIME_MS of 145,000, much higher than the other queries.

However, the data engineer specifically wants to know why that single query was slow, including which internal steps (such as table scans, joins, or aggregations) consumed the most time. Those details are not present in Query History or ACCOUNT_USAGE views. Instead, Snowflake provides a per-query diagnostic tool called Query Profile.

Query Profile displays a graphical execution plan with nodes for each operation and metrics like time spent, rows processed, and data volume at each step. This is the correct feature to use when you drill into the performance characteristics of an individual query after identifying it as an outlier in Query History.

Thus, the appropriate workflow is: use Query History for spotting slow queries or workload patterns, then open the Query Profile for a specific query to understand its internal execution behavior.


Question 25

Topic: Snowflake Architecture and Key Features

Which statement BEST describes the primary purpose of the Snowflake AI Data Cloud?

Options:

  • A. A service for provisioning and managing virtual machines, storage volumes, and networks used to host customer-built databases and applications.

  • B. A fully managed data platform that centralizes diverse data and provides scalable compute so organizations can run analytics and data-driven applications without managing underlying infrastructure.

  • C. A low-cost object storage service for keeping raw files, without built-in SQL processing, security controls, or workload isolation.

  • D. A traditional on-premises relational database that focuses on OLTP workloads and requires manual indexing, partitioning, and storage tuning for performance.

Best answer: B

Explanation: Snowflake AI Data Cloud is a fully managed, cloud-native data platform designed to centralize data from many sources and provide elastic compute for analytics and applications. Users query and process data using SQL and related tools while Snowflake handles infrastructure, scaling, storage layout, and maintenance tasks.

By separating storage from compute and exposing a simple account/database/schema/warehouse model, Snowflake lets teams focus on data and workloads instead of hardware, operating systems, and database tuning. This is the core value proposition: unified data access with scalable performance and simplified operations.


Questions 26-50

Question 26

Topic: Account Setup, Security, and Governance

A Snowflake account currently relies on LOGIN_HISTORY to review sign-ins. The security team must routinely report which users actually accessed specific sensitive tables and when, with minimal custom parsing. Which approach best meets this requirement using Snowflake’s governance capabilities?

Options:

  • A. Apply object tags to all sensitive tables and query TAG_REFERENCES to determine which users have accessed those tagged objects and when.

  • B. Query SNOWFLAKE.ACCOUNT_USAGE.ACCESS_HISTORY on a schedule and build a report using columns such as QUERY_START_TIME, USER_NAME, ROLE_NAME, and DIRECT_OBJECTS_ACCESSED.

  • C. Use GRANTS_TO_USERS and GRANTS_TO_ROLES to list which users have privileges on sensitive tables and treat this list as a record of who accessed the data.

  • D. Combine LOGIN_HISTORY and QUERY_HISTORY, then parse the QUERY_TEXT field to infer which tables and columns were accessed by each user.

Best answer: B

Explanation: The requirement is to provide a repeatable governance report that answers who accessed which sensitive objects and when, while avoiding heavy custom parsing. In Snowflake, the ACCESS_HISTORY view in the SNOWFLAKE.ACCOUNT_USAGE schema is specifically designed for this purpose.

ACCESS_HISTORY records data-access events at the query level. For each query, it exposes the actor (user and role), the time of the access (such as QUERY_START_TIME), and the set of objects accessed (for example, tables and views listed in DIRECT_OBJECTS_ACCESSED and BASE_OBJECTS_ACCESSED). This lets security and governance teams directly analyze actual usage of sensitive tables and columns.

By querying ACCESS_HISTORY on a schedule and projecting just the relevant columns (user, role, timestamp, and accessed objects), you can build a governance report that shows real access patterns without needing to parse raw SQL text or infer access from grants or tags.
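
A minimal sketch of such a report, assuming the ACCESS_HISTORY columns described above and a hypothetical sensitive table name:

SELECT
  ah.query_start_time,
  ah.user_name,
  obj.value:objectName::STRING AS object_name
FROM snowflake.account_usage.access_history ah,
     LATERAL FLATTEN(input => ah.direct_objects_accessed) obj
WHERE obj.value:objectName::STRING = 'FINANCE.PAYROLL.SALARIES'
ORDER BY ah.query_start_time DESC;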


Question 27

Topic: Data Loading, Unloading, and Transformation

Which of the following statements about caching in the Snowflake AI Data Cloud is NOT correct?

Options:

  • A. Result cache entries are reused only when the query text and relevant session parameters match and the underlying data has not changed in a way that affects the result.

  • B. Snowflake stores table data in durable storage; caches only keep copies to improve performance and are not the authoritative source of data.

  • C. Queries may return logically incorrect results if you do not periodically clear Snowflake caches; therefore cache maintenance is required for data correctness.

  • D. The local data cache for a virtual warehouse is discarded when the warehouse is suspended, causing subsequent queries to re-read needed data from storage.

Best answer: C

Explanation: Snowflake uses several caches—such as the result cache and the local data cache on virtual warehouses—to improve performance and reduce repeated reads from storage. However, these caches are optimizations only.

The authoritative copy of all table data is always stored in Snowflake’s durable storage layer. The query processing engine consults caches when possible, but it automatically invalidates or bypasses them when underlying data changes or when query conditions differ, so results remain logically correct without manual cache management.

Because of this, cache behavior should never be relied on for functional correctness or data freshness guarantees. Instead, correctness is defined by Snowflake’s transactional storage and Time Travel mechanisms; caches only help make correct queries faster and cheaper to run.


Question 28

Topic: Operations, Monitoring, and Business Continuity

A table was accidentally dropped, and a user attempts to query it using Time Travel but receives the following error.

Exhibit:

SELECT * FROM sales.public.orders
  AT(TIMESTAMP => '2025-01-01 10:00:00'::timestamp);

SQL execution error: Time Travel retention period has expired for table SALES.PUBLIC.ORDERS.
Data is now in Fail-safe and is not directly accessible.
To recover, contact Snowflake Support.

Based only on the information in the exhibit, what is the most appropriate next step?

Options:

  • A. Open a Snowflake Support case to request recovery of the table from Fail-safe.

  • B. Use zero-copy cloning on the table to restore data from Fail-safe.

  • C. Create a new table with the same name so Fail-safe automatically repopulates it.

  • D. Increase the Time Travel retention period on the database and rerun the query.

Best answer: A

Explanation: The exhibit shows a query attempting to use Time Travel on the SALES.PUBLIC.ORDERS table. The error message clearly states that the “Time Travel retention period has expired” and that the data “is now in Fail-safe and is not directly accessible.” It then explicitly instructs the user: “To recover, contact Snowflake Support.”

In Snowflake, Time Travel allows users to query, clone, and restore objects within a configured retention period. Once that window expires, historical data moves into Fail-safe. Fail-safe is a last-resort recovery mechanism that is fully managed by Snowflake. Customers cannot query or directly access data in Fail-safe using SQL, cloning, or Time Travel.

Therefore, when data is only available in Fail-safe, the correct action is to open a case with Snowflake Support and request recovery. Configuration changes like increasing Time Travel retention or creating new tables cannot bring back data whose Time Travel window has already expired.


Question 29

Topic: Snowflake Architecture and Key Features

A data platform team is migrating analytics workloads to the Snowflake AI Data Cloud. They want to 1) scale query processing independently of data storage and 2) rely on Snowflake to centrally manage authentication, RBAC, and query optimization without managing infrastructure. Which of the following actions/solutions will meet these requirements? (Select TWO.)

Options:

  • A. Increase the size of the Snowflake storage layer to add more CPU and memory for queries while delegating all authentication to an external identity provider that bypasses Snowflake RBAC.

  • B. Deploy external VMs to run SQL engines for queries, using Snowflake only as low-cost storage, and manage query optimization in those external engines.

  • C. Use Snowflake virtual warehouses for elastic compute while relying on the cloud services layer for authentication, RBAC, and query optimization over centrally stored data.

  • D. Create additional virtual warehouses to store cold and hot data separately while manually tuning query performance using user-created indexes on tables.

  • E. Adopt Snowflake’s multi-cluster shared data architecture by keeping all data in the storage layer and provisioning one or more virtual warehouses, allowing the cloud services layer to manage metadata, permissions, and query planning.

Correct answers: C and E

Explanation: Snowflake’s architecture is built around three main layers: storage, compute, and cloud services.

The storage layer holds all persisted data (micro-partitioned, compressed, and encrypted). It is fully managed and automatically optimized by Snowflake; it does not provide CPU or memory for queries.

The compute layer consists of virtual warehouses, which supply the processing resources for running queries, loading data, and performing transformations. Warehouses can be scaled up or down and started or stopped independently of the amount of stored data.
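
For illustration, compute can be provisioned and resized independently of the stored data (the warehouse name and sizes are hypothetical):

CREATE WAREHOUSE analytics_wh
  WAREHOUSE_SIZE = 'MEDIUM'
  AUTO_SUSPEND = 300
  AUTO_RESUME = TRUE;

-- Resize later without touching any stored data
ALTER WAREHOUSE analytics_wh SET WAREHOUSE_SIZE = 'LARGE';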

The cloud services layer manages metadata, authentication, RBAC, query parsing and optimization, and other coordination tasks. It centralizes security and governance and orchestrates how compute accesses shared storage.

In the scenario, the team wants to scale query processing independently of data storage and also wants Snowflake itself to manage authentication, RBAC, and query optimization. That means they must correctly leverage virtual warehouses (compute) and the cloud services layer, while recognizing that data resides in the storage layer and that storage does not supply compute resources.


Question 30

Topic: Operations, Monitoring, and Business Continuity

Which Snowflake feature should a data provider primarily use to monitor how often consumer accounts query shared objects and when that access occurs?

Options:

  • A. Dashboards in the consumer’s BI tool, which the provider can reference to understand shared data query patterns

  • B. Secure shares themselves, which automatically generate and deliver detailed usage reports to the provider by email

  • C. Time Travel on the shared database, by reviewing historical table versions to infer when consumers queried the data

  • D. Account-level usage and access history views that record query activity and shared object access in the provider account

Best answer: D

Explanation: Snowflake provides built-in usage and access history views that capture query metadata and object access, allowing a data provider to see which shared objects were queried, by which accounts, and when. These metadata views are the primary, Snowflake-native way to monitor data sharing usage and audit access to shared data. Time Travel, shares themselves, or external BI dashboards do not capture or centralize this access metadata for the provider.


Question 31

Topic: Data Sharing, Collaboration, and Marketplace

A data engineer is working with semi-structured order data stored in a VARIANT column. They want one row per item while still returning order-level columns.

Exhibit:

SELECT
  o.order_id,
  o.customer_id,
  o.order_ts,
  i.value:sku       AS sku,
  i.value:quantity::NUMBER AS quantity
FROM orders o,
     LATERAL FLATTEN(input => o.items) i;

Based on the exhibit, what is the primary purpose of using LATERAL FLATTEN in this query?

Options:

  • A. To convert the items VARIANT column into a permanent relational table that can be queried without referencing orders

  • B. To automatically improve performance by caching the parsed JSON from o.items across all future queries

  • C. To expand each element of o.items into its own row while keeping the parent row’s columns (order_id, customer_id, order_ts) available in the same result row

  • D. To cast o.items from VARIANT to ARRAY so it can later be flattened in a separate step

Best answer: C

Explanation: The query in the exhibit selects order-level columns (o.order_id, o.customer_id, o.order_ts) together with fields from individual items (i.value:sku, i.value:quantity). The FROM clause uses orders o, LATERAL FLATTEN(input => o.items) i.

FLATTEN takes an array or object (here, the items VARIANT column) and returns a set of rows, one per element. The LATERAL keyword makes this set correlated with each row of orders. For every parent row in orders, Snowflake runs FLATTEN on o.items and returns one result row for each element, while keeping the parent row’s columns in scope.

This is how you preserve the parent context (order-level details) while expanding an array into multiple rows. Each output row contains the order identifiers plus the details of a single item. Without a lateral relationship, you would lose that 1-to-many link or have to manually write a more complex join.

The other options misinterpret what LATERAL FLATTEN does. It does not create permanent tables, does not introduce special caching, and does not handle type casting from VARIANT to ARRAY; it simply expands the array per parent row while keeping the parent columns available.


Question 32

Topic: Snowflake Architecture and Key Features

A data engineering team currently runs weekly maintenance SQL scripts by manually executing statements in Snowsight worksheets. They want these tasks to be fully automated, scriptable, and easily integrated into an existing CI/CD pipeline without adding unnecessary complexity. Which approach is the BEST fit?

Options:

  • A. Write the maintenance SQL in a SnowSQL script and run it on a schedule from a job runner or CI/CD pipeline, supplying credentials via secure configuration.

  • B. Move to the classic web interface and use its query history to manually rerun the same maintenance queries whenever needed.

  • C. Keep the SQL in Snowsight worksheets and have an engineer use the “Run All” button each week to execute the maintenance tasks.

  • D. Develop a custom web application that calls the Snowflake SQL API for each maintenance statement and deploy it as a separate scheduling service.

Best answer: A

Explanation: SnowSQL is Snowflake’s command-line client, designed specifically for scripted and automated execution of SQL statements. It integrates naturally with schedulers and CI/CD tools because it can be run non-interactively, accepts connection parameters and variables, and returns clear exit codes for pipeline logic.

In this scenario, the team wants to move from manual, UI-driven execution in Snowsight to a fully automated, repeatable process that fits into an existing CI/CD pipeline. The simplest and most aligned solution is to place the maintenance SQL into a SnowSQL script and invoke it from a scheduler or pipeline job. This directly targets the core use case for SnowSQL: batch and automated operations outside the web interface.

Other options either keep manual steps in the web UI or introduce unnecessary custom infrastructure, which conflicts with the requirement to automate while keeping things simple and using recommended Snowflake tooling patterns.
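
A minimal sketch of this pattern, assuming a hypothetical maintenance.sql file and invented object names; the !set directive is a SnowSQL client command rather than SQL, and the connection would be supplied by the scheduler (for example, snowsql -c maint_conn -f maintenance.sql):

-- maintenance.sql, executed non-interactively by the pipeline
!set exit_on_error=true

USE WAREHOUSE MAINT_WH;
DELETE FROM STAGING.EVENTS WHERE LOAD_TS < DATEADD(day, -30, CURRENT_TIMESTAMP());
ALTER WAREHOUSE MAINT_WH SUSPEND;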


Question 33

Topic: Query Processing and Performance Optimization

A team receives 50 large CSV files (about 5GB total) on an internal stage every hour. Their top priority is to minimize warehouse credits while reliably loading all rows into a target table. Which approach is most appropriate?

Options:

  • A. Load each file into a temporary table using many INSERT statements, then run an INSERT OVERWRITE into the target table once per hour.

  • B. Use an application to read each file and execute individual INSERT statements for every row into the target table.

  • C. Create a recurring task that runs frequent single-row INSERT statements as soon as each file appears on the stage.

  • D. Use a single COPY INTO command to bulk-load all staged files into the target table each hour.

Best answer: D

Explanation: For large batches of data already stored as files on a stage, Snowflake is optimized around the COPY INTO command. COPY INTO reads files directly from internal or external stages using Snowflake’s parallel architecture, which allows it to load many files and large volumes of data efficiently in a single operation.

Warehouse credits are driven primarily by how long warehouses run and at what size. A single, well-structured COPY INTO load minimizes query overhead and connection round trips. In contrast, issuing thousands or millions of row-level INSERT statements keeps the warehouse busy for much longer to achieve the same result, consuming more credits and adding operational complexity.

INSERT (with or without OVERWRITE) is better suited to smaller, row-level or set-based operations that modify data already inside Snowflake tables, not for high-volume file ingestion from stages. When the requirement is to minimize credits for file-based batch loads, COPY INTO is the preferred choice.
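
A minimal sketch of the hourly bulk load (stage, table, and format options are placeholders for illustration):

COPY INTO SALES_RAW
FROM @HOURLY_STAGE
FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
ON_ERROR = ABORT_STATEMENT;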


Question 34

Topic: Account Setup, Security, and Governance

A data engineer wants to understand why BI users in different regions see different subsets of rows and masked email addresses when querying the CUSTOMER_SALES table. Based only on the exhibit, how is Snowflake enforcing this behavior?

Exhibit: Result of a query on INFORMATION_SCHEMA.POLICY_REFERENCES for table CUSTOMER_SALES.

| POLICY_NAME | POLICY_KIND | OBJECT_NAME | COLUMN_NAME | ACTIVE |
| --- | --- | --- | --- | --- |
| RAP_REGION_FILTER | ROW_ACCESS | CUSTOMER_SALES | <table> | Y |
| MP_EMAIL_MASK | MASKING | CUSTOMER_SALES | EMAIL | Y |

Options:

  • A. Snowflake enforces region filtering through a network policy and email masking through ETL transformations; the listed policies are inactive metadata only.

  • B. Snowflake uses a row access policy on CUSTOMER_SALES and a masking policy on the EMAIL column, both active, to filter rows by criteria and mask email values at query time.

  • C. Snowflake relies on a single masking policy on CUSTOMER_SALES to control both which rows are visible and whether EMAIL values are masked.

  • D. Snowflake duplicates CUSTOMER_SALES into separate regional tables, and BI users are pointed to different tables; no policies are used at query time.

Best answer: B

Explanation: The exhibit shows two entries from INFORMATION_SCHEMA.POLICY_REFERENCES for the CUSTOMER_SALES table. One entry has POLICY_KIND set to ROW_ACCESS with COLUMN_NAME shown as <table>, indicating a row access policy attached at the table level. Row access policies evaluate a condition for each row and determine which rows are visible to the querying user or role, which directly supports region-based row filtering.

The second entry has POLICY_KIND set to MASKING on the EMAIL column, meaning a masking policy is attached to that specific column. Masking policies dynamically transform returned values (for example, replacing them with nulls or hashed values) based on who is querying. This is how Snowflake implements column-level security, such as obscuring sensitive email addresses for most users while allowing full visibility for privileged roles.

Both policies are marked ACTIVE = Y, so they are enforced at query time. Together, they provide row-level security (via the row access policy) and column-level security (via the masking policy) on the same table without duplicating data or rewriting queries in applications.
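
For reference, policies like those in the exhibit are typically created and attached along these lines; this is a hedged sketch in which the predicate logic, role names, and the REGION column are invented for illustration:

CREATE MASKING POLICY MP_EMAIL_MASK AS (val STRING) RETURNS STRING ->
  CASE WHEN CURRENT_ROLE() IN ('PII_ADMIN') THEN val ELSE '***MASKED***' END;

ALTER TABLE CUSTOMER_SALES MODIFY COLUMN EMAIL SET MASKING POLICY MP_EMAIL_MASK;

CREATE ROW ACCESS POLICY RAP_REGION_FILTER AS (region STRING) RETURNS BOOLEAN ->
  CURRENT_ROLE() = 'GLOBAL_ANALYST' OR region = 'EMEA';

ALTER TABLE CUSTOMER_SALES ADD ROW ACCESS POLICY RAP_REGION_FILTER ON (REGION);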


Question 35

Topic: Operations, Monitoring, and Business Continuity

Which TWO statements about the privileges required to create and manage outbound data shares in Snowflake are NOT correct? (Select TWO.)

Options:

  • A. To alter or drop an existing share, the acting role must hold OWNERSHIP on that share.

  • B. To create an outbound share, a role must have the global CREATE SHARE privilege or use a role that already has it (for example, ACCOUNTADMIN).

  • C. To add a table to a share, the acting role must have SELECT on the table and USAGE on the table’s database and schema.

  • D. Only secure views can be added to an outbound share; base tables cannot be shared directly.

  • E. A role that owns a database (has OWNERSHIP on the database) can create outbound shares for its tables without needing any additional global privileges.

Correct answers: D and E

Explanation: Snowflake data sharing is controlled by a combination of global and object-level privileges. Outbound shares themselves are account-level objects, so creating them requires a global privilege. Managing which tables and views are included in a share depends on the role having the necessary privileges on those underlying objects. Finally, like other Snowflake objects, shares are governed by OWNERSHIP when it comes to altering or dropping them.

The global CREATE SHARE privilege controls who can create outbound shares in an account. Once a share exists, the role that owns it (has OWNERSHIP) can modify it and grant ownership if needed. To add a table or secure view to a share, the role must be able to query that object, which means having SELECT and the required USAGE privileges on the database and schema.

Snowflake allows direct sharing of tables as well as secure views and certain other objects. There is no special “share table” privilege; the combination of CREATE SHARE and the underlying object privileges is sufficient.


Question 36

Topic: Query Processing and Performance Optimization

Which TWO statements about unloading data from Snowflake using COPY INTO and controlling the number/size of output files are TRUE? (Select TWO.)

Options:

  • A. The number and size of unload files are chosen entirely by Snowflake based on warehouse size; users cannot influence this behavior at all.

  • B. Using many very small files always improves performance because it minimizes metadata overhead in external cloud storage.

  • C. Unloading into multiple moderately sized files is often preferred for analytics workloads because it allows downstream systems to read the data in parallel.

  • D. COPY INTO always produces multiple files, so creating one large file requires combining them later using an external tool.

  • E. You can configure COPY INTO to write the entire unload result into a single output file, which simplifies downstream handling but may limit parallel read performance.

Correct answers: C and E

Explanation: When unloading data with COPY INTO, Snowflake can either produce a single output file or multiple files. A single large file is often easier to manage and move, especially for tools or processes that expect exactly one file. However, a single file limits parallel reading, because only one consumer can process it at a time.

For analytics workloads, multiple moderately sized files are usually better. They allow downstream systems—such as parallel ETL jobs or distributed compute engines—to read separate files concurrently, improving throughput. However, going to the other extreme and creating many tiny files is generally a bad idea, because it increases metadata/listing overhead and can slow down directory scans and downstream processing.

Snowflake provides COPY INTO options to control whether one file or multiple files are written, and to influence the approximate maximum file size when using multiple files. This allows you to choose between operational simplicity (one big file) and parallelism (multiple medium-sized files) based on your use case.
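
As a hedged sketch of the trade-off (stage and table names are placeholders), the SINGLE and MAX_FILE_SIZE options control this behavior:

-- Multiple ~100 MB files so downstream systems can read in parallel
COPY INTO @EXPORT_STAGE/daily/
FROM SALES
FILE_FORMAT = (TYPE = CSV COMPRESSION = GZIP)
SINGLE = FALSE
MAX_FILE_SIZE = 104857600;

-- One file for tools that expect exactly one object
-- (raise MAX_FILE_SIZE if the result exceeds the default cap)
COPY INTO @EXPORT_STAGE/daily_single/
FROM SALES
FILE_FORMAT = (TYPE = CSV)
SINGLE = TRUE
MAX_FILE_SIZE = 5368709120;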


Question 37

Topic: Account Setup, Security, and Governance

Your team is deciding which system role should be used for day-to-day creation of users and management of masking policies. Review the excerpt from SNOWFLAKE.ACCOUNT_USAGE.GRANTS_TO_ROLES:

| role | privilege | object_type | object_name |
| --- | --- | --- | --- |
| SECURITYADMIN | CREATE USER | ACCOUNT | |
| SECURITYADMIN | CREATE ROLE | ACCOUNT | |
| SECURITYADMIN | APPLY MASKING POLICY | ACCOUNT | |
| SYSADMIN | CREATE WAREHOUSE | ACCOUNT | |
| SYSADMIN | MODIFY WAREHOUSE | ACCOUNT | |
| role | privilege | object_type | object_name |
| --- | --- | --- | --- |

Based on the exhibit, which role is the most appropriate choice for these user and security policy tasks?

Options:

  • A. Use the PUBLIC role so all users can grant and revoke roles and apply masking policies without needing elevation.

  • B. Use the SECURITYADMIN role to create users, manage roles, and apply masking policies.

  • C. Use the ACCOUNTADMIN role for all user onboarding and masking, and reserve SECURITYADMIN for managing warehouses.

  • D. Use the SYSADMIN role to manage users and masking policies because it already manages warehouses.

Best answer: B

Explanation: The exhibit shows how different system roles in the Snowflake AI Data Cloud are granted specific account-level privileges. The SECURITYADMIN role has CREATE USER, CREATE ROLE, and APPLY MASKING POLICY on the account, which are classic security administration capabilities. These privileges directly support managing identities (users), access control structure (roles), and security policies (masking policies).

In contrast, the SYSADMIN role in the exhibit holds CREATE WAREHOUSE and MODIFY WAREHOUSE privileges, which are focused on compute and object administration rather than identity or security policy management. This separation of duties is a key principle of Snowflake’s RBAC model: SECURITYADMIN is responsible for security-related tasks, while SYSADMIN manages data and compute objects.

Therefore, for day-to-day user onboarding (creating users and roles) and applying column-level masking policies, SECURITYADMIN is the most appropriate and least-privileged role that matches the requirements shown in the exhibit.


Question 38

Topic: Account Setup, Security, and Governance

A team is building a non-interactive integration that connects a programmatic client to Snowflake using key pair authentication. Which of the following practices should they AVOID when configuring and managing the keys? (Select TWO.)

Options:

  • A. Store the unencrypted private key file in a shared source code repository so all developers can access it.

  • B. Rotate key pairs on a regular schedule and remove old public keys from the Snowflake user once they are no longer needed.

  • C. Reuse the same key pair for multiple service accounts and different applications to simplify configuration.

  • D. Use the private key only on the client side and register only the corresponding public key with the Snowflake user.

  • E. Protect the private key with a passphrase and load it from a secure location that only the application process can access.

Correct answers: A and C

Explanation: Key pair authentication for programmatic Snowflake clients is a non-interactive mechanism where a client proves its identity by signing a challenge with a private key. The corresponding public key is registered with the Snowflake user. Snowflake verifies the signature using the stored public key, allowing the client to authenticate without passwords or manual input.

Because the private key is effectively the “password” for the integration, it must be protected carefully. Sound practices include keeping the private key strictly on the client side, associating only the public key with the Snowflake user, protecting the private key with a passphrase, storing it in a secure location such as a secrets manager or protected filesystem path, restricting access to the application identity, and rotating key pairs regularly.

Anti-patterns include exposing the private key in shared environments (such as code repositories) and reusing the same key pair across multiple service accounts or applications. These patterns violate basic security principles like least privilege and strong credential isolation, and they significantly increase risk if a key is ever compromised.
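
A minimal sketch of the Snowflake-side configuration (the user name and key values are placeholders; the private key itself never leaves the client's secure storage):

-- Register only the PUBLIC key with the service user
ALTER USER SVC_LOADER SET RSA_PUBLIC_KEY = 'MIIBIjANBgkqh...';

-- During rotation, stage the new key in the second slot, then remove the old one
ALTER USER SVC_LOADER SET RSA_PUBLIC_KEY_2 = 'MIIBIjANBgkqh...';
ALTER USER SVC_LOADER UNSET RSA_PUBLIC_KEY;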


Question 39

Topic: Data Loading, Unloading, and Transformation

A data engineer is explaining how the Snowflake result cache works to a BI team so they can design efficient dashboards. Which TWO statements about result cache behavior are INCORRECT and should be avoided in design discussions? (Select TWO.)

Options:

  • A. Existing cached results remain valid across warehouse resizing and suspension events when the underlying data stays unchanged.

  • B. Result cache contents are managed by Snowflake and may be evicted after a retention period, so applications should treat it as an opportunistic performance feature.

  • C. A cached result can be reused when the SQL text and relevant session settings are identical and the underlying data remains unchanged.

  • D. Result cache is useful for dashboards that execute the same read-only queries repeatedly without data changes between executions.

  • E. A cached result is always returned for a query whenever the SQL text is similar, even if columns are reordered or additional filters appear.

  • F. A cached result can be reused only by the same virtual warehouse that originally ran the query.

Correct answers: E and F

Explanation: Snowflake’s result cache stores the final result sets of queries in the cloud services layer. Its purpose is to speed up repeated executions of the same query when the underlying data has not changed.

For a query to reuse a cached result, several key conditions must hold:

  • The SQL text must be identical, including projections, filters, and ORDER BY clauses, together with relevant session parameters (for example, case sensitivity and time zone settings).
  • The underlying data that contributes to the result must remain unchanged; DML that affects the result set causes cache invalidation.
  • The cached result must still be present; Snowflake manages a retention period for result cache and may evict entries over time.
  • The requesting user must have the necessary privileges to see the data.

Result cache is shared across virtual warehouses within the same account, because it is stored in the cloud services layer, not on individual warehouses. It is a best-effort performance optimization: applications must always be prepared for a query to run fully if the cache is not available or has been invalidated.

The incorrect statements in this question either overstate when caching applies based on “similar” queries, or mistakenly tie cache reuse to a specific warehouse, both of which are misunderstandings of how the Snowflake result cache works.
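
One hedged way to observe this behavior during testing is to toggle result reuse for a session and compare runtimes; USE_CACHED_RESULT is a session parameter, and the query itself is only illustrative:

ALTER SESSION SET USE_CACHED_RESULT = FALSE;  -- force full execution
SELECT region, SUM(amount) FROM SALES GROUP BY region;

ALTER SESSION SET USE_CACHED_RESULT = TRUE;   -- allow result cache reuse
SELECT region, SUM(amount) FROM SALES GROUP BY region;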


Question 40

Topic: Operations, Monitoring, and Business Continuity

Which statement BEST describes how encryption key management works in the Snowflake AI Data Cloud?

Options:

  • A. Snowflake encrypts only data at rest, network traffic remains unencrypted, and encryption keys do not rotate unless explicitly requested in a support ticket.

  • B. Customers must manually rotate every data encryption key using their own external key management service for Snowflake to encrypt data at rest.

  • C. Encryption keys are stored in clear text within account metadata, but access is restricted using Snowflake roles and privileges.

  • D. Snowflake uses a hierarchical key model and automatically rotates encryption keys on a regular basis without customer intervention.

Best answer: D

Explanation: Snowflake provides built-in, end-to-end encryption for data at rest and in transit. It uses a hierarchical key architecture, where higher-level keys protect lower-level keys, which in turn protect the data. As part of this design, Snowflake automatically rotates keys on a regular schedule and when certain security events occur, reducing operational burden on customers and improving security. Customers do not need to run their own scripts or jobs to perform basic key rotation for Snowflake-managed keys.

While customers can optionally bring or control keys via supported key management integrations, the core behavior of Snowflake itself is to manage and rotate the encryption key hierarchy automatically. Keys are never stored in clear text, and security does not rely solely on RBAC permissions on metadata objects.


Question 41

Topic: Query Processing and Performance Optimization

A team wants to ingest application logs into Snowflake with deeply nested attributes and frequently changing fields. They want to load the data directly into a VARIANT column with minimal pre-processing while preserving structure. Which file format is MOST appropriate for the source files?

Options:

  • A. CSV files with all fields flattened into columns

  • B. JSON files containing the full log records

  • C. Parquet files with a fixed column schema

  • D. XML files wrapped in a single text column

Best answer: B

Explanation: For semi-structured, nested log data that changes frequently, the priority is to ingest records with minimal transformation and preserve their structure. In Snowflake, this aligns best with loading into a VARIANT column from a semi-structured format that matches VARIANT’s internal representation.

JSON is the most natural fit. Snowflake natively parses JSON into VARIANT, automatically handling nested objects and arrays. New or missing fields across records do not require altering table schemas, which is ideal for log data where attributes can evolve over time.

Flat formats like CSV require you to define columns and often to pre-flatten or encode nested content, which conflicts with the goal of minimal pre-processing. While Parquet is efficient and supports nested data, it is typically used with a more stable schema. XML can represent hierarchy, but JSON is better aligned with Snowflake’s common practices and tooling for loading semi-structured logs into VARIANT.


Question 42

Topic: Query Processing and Performance Optimization

A downstream application can only read data files from an existing external cloud storage bucket. You must deliver a daily export of a Snowflake table while minimizing extra data copy steps outside Snowflake. Which approach using COPY INTO is most appropriate?

Options:

  • A. Run COPY INTO a user’s internal stage, then use an external script to transfer the files from the internal stage to the external bucket.

  • B. Create an external table on top of the external bucket so the downstream application can query the data directly without using COPY INTO.

  • C. Share the Snowflake database with the downstream application’s team and ask them to export the data themselves from Snowflake to their bucket.

  • D. Create an external named stage pointing to the bucket and run COPY INTO that external stage from the table on a schedule.

Best answer: D

Explanation: To unload data from a Snowflake table to external cloud storage with minimal extra data movement, you should use COPY INTO with an external stage that directly targets the desired bucket.

An external named stage stores the connection information (URL, credentials, encryption settings) for the external bucket. When you run COPY INTO @external_stage/path FROM my_table, Snowflake reads from the table and writes result files straight into the bucket. This keeps the entire export process inside Snowflake and avoids a second copy step.

Other approaches either add extra data transfer steps outside Snowflake or use features designed for different directions of data flow (reading external data into Snowflake rather than exporting it).
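
A minimal sketch of this pattern (the bucket URL, storage integration, and table names are placeholders):

CREATE STAGE EXPORT_STAGE
  URL = 's3://partner-bucket/exports/'
  STORAGE_INTEGRATION = PARTNER_S3_INT;

COPY INTO @EXPORT_STAGE/daily_orders/
FROM ORDERS
FILE_FORMAT = (TYPE = CSV COMPRESSION = GZIP)
OVERWRITE = TRUE;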


Question 43

Topic: Query Processing and Performance Optimization

An analyst needs to load several CSV files stored on their laptop into a Snowflake table. They must first move the files into an internal stage and then check for formatting issues before actually loading the table. Which of the following actions/solutions will meet these requirements? (Select TWO.)

Options:

  • A. Use the PUT command in SnowSQL to upload the local files into a named internal stage, then load from that stage with COPY INTO.

  • B. Use a VALIDATE command on the internal stage with the target file format to detect rows that would fail before running COPY INTO.

  • C. Create an external stage that points to the analyst’s laptop directory and load from it with COPY INTO.

  • D. Use the GET command in SnowSQL to copy files from the internal stage to the laptop, where they can be manually inspected before loading.

  • E. Run COPY INTO directly from the local file path, allowing Snowflake to both upload and load the files in a single step.

Correct answers: A and B

Explanation: To load files from a local environment into Snowflake, the files must first be placed in a stage that Snowflake can read, typically an internal user or named stage. The PUT command in SnowSQL is specifically designed to upload files from the local file system to an internal stage.

Once files are staged, you can check data quality and formatting before actually loading the table. The VALIDATE capability (used with staged files and a file format) allows you to evaluate whether the data would load cleanly, surfacing parsing or conversion issues without committing bad data to the target table.

Option review:

  • ✔ Use the PUT command in SnowSQL to upload the local files into a named internal stage, then load from that stage with COPY INTO.
    • Correct because PUT is the standard way to move local files into an internal stage, which COPY INTO can then read.
  • ✔ Use a VALIDATE command on the internal stage with the target file format to detect rows that would fail before running COPY INTO.
    • Correct because VALIDATE is used to check staged files against a file format and report potential load errors before running COPY.
  • ✖ Run COPY INTO directly from the local file path, allowing Snowflake to both upload and load the files in a single step.
    • Incorrect because COPY INTO cannot directly access local file paths; data must be in an internal or external stage.
  • ✖ Create an external stage that points to the analyst’s laptop directory and load from it with COPY INTO.
    • Incorrect because external stages are defined over cloud storage locations, not local machines.
  • ✖ Use the GET command in SnowSQL to copy files from the internal stage to the laptop, where they can be manually inspected before loading.
    • Incorrect because GET retrieves data from a stage to local storage and does not help move local files into Snowflake or validate them prior to loading.
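
A hedged sketch of the staged-upload-and-check flow discussed above; the paths, stage, and table names are hypothetical, and VALIDATION_MODE is one way to perform the pre-load check without writing any rows:

-- Run from SnowSQL on the analyst's machine
PUT file:///tmp/orders_*.csv @MY_INT_STAGE AUTO_COMPRESS = TRUE;

-- Report parse/conversion errors without loading any data
COPY INTO ORDERS_RAW
FROM @MY_INT_STAGE
FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
VALIDATION_MODE = RETURN_ERRORS;

-- When the staged files are clean, run the same COPY without VALIDATION_MODE to load
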

Question 44

Topic: Data Loading, Unloading, and Transformation

A nightly ETL job runs on a Medium virtual warehouse (4 credits/hour) and takes 2 hours (8 credits). The job scans about 15TB but returns only 3% of rows. Estimates show:

  • Upgrading to a Large warehouse (8 credits/hour) would cut runtime to 1 hour.
  • Enabling Query Acceleration Service (QAS) while keeping the Medium warehouse would cut runtime to 30 minutes and consume 2 extra credits per run for QAS.

The team must reduce runtime below 45 minutes while minimizing total credits per run (round to whole credits). Which option is the most appropriate?

Options:

  • A. Upgrade the ETL warehouse to Large and enable multi-cluster to handle the heavy scans.

  • B. Upgrade the ETL warehouse to Large without enabling Query Acceleration Service.

  • C. Create a separate Small warehouse for ETL and run the job twice each night.

  • D. Keep the Medium warehouse and enable Query Acceleration Service only for this ETL workload.

Best answer: D

Explanation: Query Acceleration Service (QAS) uses additional serverless compute to offload parts of large, selective scans, speeding up queries that scan a lot of data but return few rows, without permanently resizing the warehouse.

In this scenario, the ETL scans about 15TB and returns only 3% of rows, which is exactly the kind of workload QAS is designed to accelerate. With QAS, the Medium warehouse finishes in 30 minutes: 30 minutes at 4 credits/hour uses about 2 credits, plus 2 credits for QAS, for about 4 credits total per run. Scaling to a Large warehouse uses 1 hour at 8 credits/hour, or 8 credits per run, which is both slower and more expensive than Medium with QAS.

Therefore, enabling QAS on the existing Medium warehouse best meets the runtime requirement while minimizing credits and directly targets the large, selective scans in the ETL job.
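
For reference, QAS is enabled per warehouse; a hedged sketch in which the warehouse name and scale factor are illustrative:

ALTER WAREHOUSE ETL_WH SET
  ENABLE_QUERY_ACCELERATION = TRUE
  QUERY_ACCELERATION_MAX_SCALE_FACTOR = 8;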


Question 45

Topic: Query Processing and Performance Optimization

A data engineering team is building an ingestion process for clickstream data in the Snowflake AI Data Cloud. Compressed JSON log files land in an external stage every 5 minutes, with about 5 million rows per hour. They need to efficiently append all new data each hour, occasionally correct or backfill a few individual rows, and must avoid replacing the entire target table contents during hourly loads. Which loading pattern is the most appropriate for this scenario?

Options:

  • A. Use INSERT OVERWRITE on the target table each hour so that its contents always match the data files in the stage.

  • B. Use COPY INTO to bulk load new files from the stage into the target table each hour, and use INSERT statements for the occasional row-level corrections.

  • C. Use COPY INTO for the occasional single-row corrections and INSERT OVERWRITE for the hourly bulk loads from the stage.

  • D. Use INSERT statements for both the hourly bulk loads and the occasional row-level corrections.

Best answer: B

Explanation: In Snowflake, different commands are optimized for different loading use cases. For high-volume ingestion from files already stored in an internal or external stage, COPY INTO is the standard bulk load mechanism. It reads files directly from the stage, parses them using a defined file format, and efficiently writes the data into a table, making it ideal for repeatedly loading millions of rows.

By contrast, INSERT is intended for row-level operations, such as inserting a small number of rows from literal values or from another table/query. It works well for occasional corrections, backfills, or small data maintenance tasks, but is not efficient for parsing and loading large staged files.

INSERT OVERWRITE replaces existing data in a target table (or partition) with the result of a query. It is useful when you intentionally want to refresh or fully replace a table or partition, not when you need to continually append to a history table.

In this scenario, the team needs to append millions of new rows from staged files hourly and occasionally correct a few rows, without replacing the full table. The best fit is to use COPY INTO for the hourly file-based bulk loads and INSERT for the rare row-level changes.


Question 46

Topic: Query Processing and Performance Optimization

A data engineering team loads CSV files from multiple sources into Snowflake. The architect wants consistent parsing rules and centralized governance of load settings. Which practice is NOT recommended for managing file formats in this scenario?

Options:

  • A. Create a reusable FILE FORMAT object for CSV settings and reference it from COPY INTO commands across different stages and tables.

  • B. Allow each developer to specify all file format options inline in every COPY INTO statement instead of using named FILE FORMAT objects.

  • C. Define a few versioned FILE FORMAT objects (for example, for different delimiters or encodings) and standardize which pipelines use each one.

  • D. Grant only a small set of roles the privilege to modify shared FILE FORMAT objects, while allowing broader read/use access for loaders.

Best answer: B

Explanation: Snowflake encourages defining reusable FILE FORMAT objects with CREATE FILE FORMAT so that data loading settings are centralized and consistent across pipelines. When COPY operations reference named file formats, any necessary change to parsing behavior (such as a delimiter or date format adjustment) can be made once in the FILE FORMAT object and automatically applied wherever it is used. This approach simplifies maintenance, improves governance, and reduces the risk of subtle inconsistencies.

Defining full file format settings inline for each COPY statement duplicates configuration and makes it difficult to ensure all pipelines use the same rules. It also complicates audits and change control because there is no single authoritative definition of how files are parsed.
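
A minimal sketch of the recommended pattern (database, schema, stage, and option values are placeholders):

CREATE OR REPLACE FILE FORMAT GOVERNANCE.FORMATS.CSV_STD
  TYPE = CSV
  FIELD_OPTIONALLY_ENCLOSED_BY = '"'
  SKIP_HEADER = 1;

COPY INTO SALES_RAW
FROM @RAW_STAGE
FILE_FORMAT = (FORMAT_NAME = 'GOVERNANCE.FORMATS.CSV_STD');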


Question 47

Topic: Account Setup, Security, and Governance

Several analysts report they must log in separately to Snowsight and a BI tool that connects to the same Snowflake account. Security also wants centralized password/MFA management. The account currently uses only local Snowflake users with username/password. Which change is the most appropriate?

Options:

  • A. Relax the Snowflake password policy so analysts can reuse their BI tool passwords.

  • B. Create a shared Snowflake user for the BI tool so only one additional login is required.

  • C. Enable key-pair authentication for all users instead of passwords.

  • D. Configure single sign-on (SSO) with an external identity provider and require users to authenticate via SSO for Snowsight and client tools.

Best answer: D

Explanation: The scenario describes two main requirements: improving user experience by avoiding multiple logins and improving security by centralizing password and MFA management. Local Snowflake users with separate username/password credentials do not meet these needs because each application maintains its own login and policy experience.

Configuring single sign-on (SSO) with an external identity provider (IdP) such as an enterprise directory service allows users to authenticate once with their corporate identity and use that session to access multiple applications, including Snowflake via Snowsight and third-party BI tools. The IdP centrally enforces password rules, MFA, and account lifecycle, while Snowflake delegates authentication to it. This both simplifies user access and strengthens governance.

Other options either weaken security (relaxing password policies, shared accounts) or change the authentication mechanism without providing single sign-on or centralized management (key-pair authentication alone). Therefore, enabling SSO with an external IdP is the best aligned with the stated goals.


Question 48

Topic: Data Sharing, Collaboration, and Marketplace

A data engineering team is considering different ways to use CREATE TABLE AS SELECT (CTAS) in their Snowflake AI Data Cloud account.

Which of the following CTAS usage patterns should you AVOID? (Select TWO.)

Options:

  • A. Using CTAS to precompute a wide, heavily joined reporting table that many BI dashboards reuse every day.

  • B. Using CTAS to persist the output of a complex transformation pipeline so that downstream tasks can read from a stable intermediate table without re-running upstream logic.

  • C. Using CTAS repeatedly to duplicate the same source table into multiple schemas for each team instead of using secure views or data sharing mechanisms.

  • D. Creating a permanent table with CTAS for every ad-hoc exploration query, even when the result is needed only once.

  • E. Using CTAS to create a daily snapshot of customer balances for audit and historical comparison.

Correct answers: C and D

Explanation: CREATE TABLE AS SELECT (CTAS) in Snowflake is used to materialize the results of a query into a new table. This is most appropriate when the result set is:

  • Expensive to recompute (for example, heavy joins or aggregations), and
  • Reused by multiple queries or downstream processes, or
  • Needed as a stable snapshot at a specific point in time.

Good CTAS usage patterns focus on reuse, performance, and clear lifecycle management. Poor patterns use CTAS for one-off exploratory work or to create unnecessary copies of the same data for each team, which drives up storage cost and complicates governance. In Snowflake, zero-copy features like views, secure views, and data sharing are preferred over duplicating tables whenever possible.
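
A hedged sketch of an appropriate CTAS use, materializing a heavily reused aggregate; the object names are invented for illustration:

CREATE OR REPLACE TABLE MART.ORDER_SUMMARY AS
SELECT
  customer_id,
  COUNT(*)    AS order_count,
  SUM(amount) AS total_amount
FROM RAW.ORDERS
GROUP BY customer_id;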


Question 49

Topic: Query Processing and Performance Optimization

Which statement BEST describes how CREATE EXTERNAL TABLE is used in Snowflake?

Options:

  • A. It creates a logical view that joins multiple internal tables without storing data, relying entirely on existing Snowflake tables for query results.

  • B. It configures Snowpipe to automatically ingest new files from an external location into a Snowflake table for low-latency loading.

  • C. It defines table metadata over files in an external stage so they can be queried in-place, with metadata periodically refreshed to detect new or changed files.

  • D. It copies data files from an external stage into Snowflake-managed storage and creates a permanent internal table holding the loaded data.

Best answer: C

Explanation: CREATE EXTERNAL TABLE in Snowflake defines a table whose rows are derived from data files stored outside of Snowflake (for example, in an external stage). The external table object stores only metadata about those files, such as locations and columns, and Snowflake reads the data in-place from external storage when queries run. To recognize newly added or changed files, the external table’s metadata must be refreshed, typically using ALTER EXTERNAL TABLE ... REFRESH or an automated mechanism.

This allows you to query external data without fully loading it into Snowflake-managed storage, which is useful for exploratory analysis, data lakes, or situations where data must remain in external storage for governance or cost reasons.
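
A minimal sketch, assuming a hypothetical stage path, columns, and Parquet source; external table columns are defined as expressions over the VALUE column:

CREATE EXTERNAL TABLE EVENTS_EXT (
  event_ts   TIMESTAMP AS (value:ts::TIMESTAMP),
  event_type STRING    AS (value:type::STRING)
)
LOCATION = @LAKE_STAGE/events/
FILE_FORMAT = (TYPE = PARQUET);

-- Pick up newly added or changed files
ALTER EXTERNAL TABLE EVENTS_EXT REFRESH;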


Question 50

Topic: Data Sharing, Collaboration, and Marketplace

A data team is ingesting JSON event payloads that frequently add or remove attributes. They want to minimize schema management effort while still being able to query individual attributes in SQL. Which description of storing the events in a VARIANT column is most accurate for this goal?

Options:

  • A. Require a fixed JSON schema to be registered for the VARIANT column so that any event with new or missing attributes is rejected at load time.

  • B. Store each JSON document in a VARIANT column, preserving its nested structure so attributes can be accessed with path expressions without predefining columns.

  • C. Automatically expand every JSON key into separate physical table columns at load time, so VARIANT only holds a reference to the relational columns.

  • D. Store each JSON document as a VARCHAR string in a VARIANT column, then use custom parsing UDFs on every query to extract attributes.

Best answer: B

Explanation: VARIANT is Snowflake’s native data type for semi-structured data such as JSON, Avro, Parquet, and others. Conceptually, it stores each semi-structured value as a single column value while preserving its hierarchical structure (objects, arrays, nested fields).

When JSON is loaded into a VARIANT column, Snowflake understands the internal structure and types of the data. You can access nested attributes using JSON path expressions (for example, event:customer:id), filter on them, and project them in queries, without first converting the JSON into many separate relational columns.

This makes VARIANT ideal when event payloads or documents evolve over time. New keys can appear or old keys can disappear in the JSON without requiring ALTER TABLE operations. You still get queryability and type awareness, but you avoid the overhead of constantly changing the relational schema.

In the scenario, the deciding factor is minimizing schema management effort while keeping the data queryable. Only the description that highlights storing the full JSON document in VARIANT and querying it via path expressions, without predefining columns, aligns with that goal and with how VARIANT conceptually works in Snowflake.
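
A hedged sketch of the pattern; the table, column, and attribute names are invented for illustration:

CREATE TABLE EVENTS_RAW (payload VARIANT);

SELECT
  payload:customer:id::NUMBER        AS customer_id,
  payload:attributes:channel::STRING AS channel
FROM EVENTS_RAW
WHERE payload:event_type::STRING = 'purchase';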


Questions 51-75

Question 51

Topic: Account Setup, Security, and Governance

A company already uses a corporate identity provider with single sign-on and MFA for all internal applications. They are onboarding 300 analysts to the Snowflake AI Data Cloud. Security requires: 1) users must not manage separate Snowflake passwords, 2) account access must be revoked centrally when employees leave, and 3) MFA rules must remain consistent with other apps. Which authentication approach is the most appropriate for Snowflake?

Options:

  • A. Configure all users to authenticate with key pair authentication using locally generated public/private keys stored on their laptops

  • B. Create a single shared Snowflake user for all analysts with a strong password rotated monthly and give that user the required roles

  • C. Require every user to create a separate Snowflake username/password and enable Snowflake native MFA for each account

  • D. Integrate Snowflake with the corporate SSO provider and require users to sign in through SSO, disabling direct password login for those SSO-managed users

Best answer: D

Explanation: Single sign-on in Snowflake allows organizations to delegate authentication to an external identity provider. Users authenticate once to the corporate SSO, which can enforce MFA and other security policies, and then access Snowflake without maintaining separate credentials. This delivers a better user experience and centralizes lifecycle management: disabling a user in the identity provider also cuts off Snowflake access.

In this scenario, the company wants to avoid separate Snowflake passwords, enforce the same MFA policies used elsewhere, and centrally revoke access when employees leave. Integrating Snowflake with the existing corporate SSO provider and requiring users to log in through SSO directly addresses these requirements.

Username/password authentication managed directly in Snowflake can be secure, but it creates another credential store for users to manage and for administrators to govern. Key pair authentication is powerful for automation and service principals but is not ideal as the primary method for large numbers of interactive users. Shared accounts are an anti-pattern because they undermine auditability and user-level access control.

Therefore, SSO integration is the best fit because it combines strong security with simpler user experience and centralized governance.


Question 52

Topic: Data Loading, Unloading, and Transformation

A data engineering team runs occasional ad-hoc queries that filter on a highly selective predicate over a 15 TB fact table. Concurrency is low, but these queries still take several minutes to complete even on a large virtual warehouse. They want to speed up these large, selective scans without permanently increasing warehouse size or redesigning the schema. Which action is the most appropriate in Snowflake?

Options:

  • A. Create a materialized view on the entire 15 TB fact table to accelerate all queries

  • B. Enable the Query Acceleration Service on the virtual warehouse used for these ad-hoc queries

  • C. Resize the virtual warehouse to the next larger size and keep it running longer

  • D. Increase the maximum number of clusters on the virtual warehouse to improve concurrency

Best answer: B

Explanation: The scenario describes low-concurrency, ad-hoc queries that perform large but highly selective scans on a very large table. The existing virtual warehouse is already large, so simply adding more permanent compute is undesirable. The team wants to improve performance for these specific scan-heavy queries without a schema redesign.

Snowflake’s Query Acceleration Service (QAS) is designed for exactly this situation: it can offload parts of large, selective scans from the main virtual warehouse to additional serverless compute resources. This helps reduce query latency for such workloads while allowing you to keep the base warehouse size stable. You pay only for the extra serverless compute consumed when acceleration is used.

Other options like multi-cluster warehouses or large materialized views either focus on concurrency or change the data model and incur ongoing maintenance costs. They do not directly match the requirement: speeding up large, selective scans without permanently increasing warehouse size or restructuring the schema.


Question 53

Topic: Account Setup, Security, and Governance

A security architect is reviewing how users connect from laptops and BI tools over the public internet to the Snowflake AI Data Cloud. The architect’s top priority is to prevent eavesdropping on queries and results in transit. Which statement BEST addresses this requirement?

Options:

  • A. Restricting access with network policies so only trusted IP ranges can connect removes the need to encrypt traffic between clients and Snowflake.

  • B. Snowflake automatically encrypts all data at rest in databases and stages, so intercepted network traffic cannot reveal any sensitive information.

  • C. Snowflake requires all client connections to use TLS-encrypted channels, ensuring query traffic between clients and Snowflake is encrypted in transit by default.

  • D. Configuring multi-factor authentication (MFA) for users ensures that any intercepted network traffic is unusable because attackers cannot log in.

Best answer: C

Explanation: The scenario is specifically about protecting query and result data while it moves between clients (such as laptops and BI tools) and the Snowflake AI Data Cloud over the public internet. The key security control for this is encryption in transit.

Snowflake enforces TLS (Transport Layer Security) for client connections, such as those using JDBC, ODBC, or HTTPS through Snowsight. TLS establishes an encrypted tunnel between the client and Snowflake so that any data sent across the network (queries, results, credentials) is unreadable to anyone who might intercept the traffic.

Other security features like encryption at rest, MFA, and network policies are important for overall security, but they do not directly provide the confidentiality of the actual bytes traveling across the network. Only the choice that explicitly calls out TLS-encrypted channels matches the architect’s stated priority of preventing eavesdropping on data in transit.


Question 54

Topic: Data Loading, Unloading, and Transformation

You are reviewing a slow query in Snowsight. In the Query Profile, several operators show spill indicators. Which TWO interpretations or actions related to these spill indicators are INCORRECT and should be AVOIDED? (Select TWO.)

Options:

  • A. Interpret the absence of any local or remote spill indicators in the Query Profile as evidence that the query operators fit in memory during execution.

  • B. Recognize that operators marked as spilling to local or remote storage likely did not fit fully in memory and may benefit from warehouse resizing or query rewrites.

  • C. Treat spill indicators as purely informational and keep the same warehouse size and query design even when many operators show spilling to remote storage.

  • D. Assume that as long as the query completes successfully, it is well-optimized, even if many nodes in the Query Profile show spilling to remote storage.

  • E. Use the Query Profile node tooltips to inspect metrics such as bytes scanned and bytes spilled to understand where memory pressure occurs.

Correct answers: C and D

Explanation: In Snowflake’s Query Profile, spill indicators such as spilled to local storage or spilled to remote storage show where operators exceeded available memory and had to write intermediate data out of memory. This usually degrades performance, especially when spilling to remote storage, because reading and writing to storage is slower than in-memory operations.

A good tuning approach is to look at where spilling occurs (using node tooltips and metrics like bytes spilled), then decide whether to resize the virtual warehouse, adjust query logic (for example, reduce data volume earlier, optimize joins/aggregations), or accept the cost if it is expected and infrequent. Simply ignoring spill markers because the query “works” is an anti-pattern that can lead to consistently slow and more expensive workloads.


Question 55

Topic: Operations, Monitoring, and Business Continuity

A data engineering team uses zero-copy cloning to create test databases from production and relies on Time Travel for recovery. They want to understand storage impact. Which of the following statements about zero-copy clones and Time Travel is INCORRECT?

Options:

  • A. If a clone is dropped, some underlying micro-partitions may still be retained because they are required to satisfy the Time Travel or Fail-safe period of the source.

  • B. Immediately after a large database is cloned, storage usage increases only minimally because the source and clone initially share the same micro-partitions.

  • C. As data in either the source or the clone is modified, additional storage is consumed only for the new or changed micro-partitions created by those operations.

  • D. Once a clone is created, it will never generate additional storage costs because all future data changes continue to be fully shared with the source database.

Best answer: D

Explanation: Zero-copy cloning in the Snowflake AI Data Cloud is a metadata operation at the moment of clone creation. The clone and the source reference the same underlying micro-partitions, so the initial storage overhead is minimal, even for large databases or tables.

However, this low storage overhead is not permanent. As soon as data changes in either the source or the clone, Snowflake writes new micro-partitions for those changes. The original micro-partitions are retained as long as any object still references them or they fall within a Time Travel or Fail-safe window, while the new micro-partitions add to total storage. This means storage consumption grows as the datasets diverge.

Time Travel and Fail-safe also affect how long micro-partitions are kept. Dropping a clone does not instantly remove micro-partitions if those data blocks are still needed to satisfy the Time Travel or Fail-safe window of the source (or other clones). Eventually, when no objects and no retention windows reference a micro-partition, Snowflake can reclaim that storage.

Therefore, it is wrong to assume that clones will never create additional storage cost. Cloning is very efficient initially, but ongoing changes in either the source or the clone will increase storage usage over time.
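
For reference, the clone itself is a single metadata operation (database names are placeholders):

-- Shares existing micro-partitions with PROD_DB at creation time
CREATE DATABASE TEST_DB CLONE PROD_DB;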


Question 56

Topic: Query Processing and Performance Optimization

A data engineer loads newline-delimited JSON files from an internal stage using COPY INTO into a table with a VARIANT column. The load fails with JSON parse errors. The COPY command currently references a CSV file format. What is the BEST action to fix the issue while keeping the source files unchanged?

Options:

  • A. Create a JSON file format and reference it in the COPY command instead of the CSV file format.

  • B. Set ON_ERROR = CONTINUE in the COPY command so that rows with JSON parse errors are skipped.

  • C. Increase the virtual warehouse size used for the COPY command to handle complex JSON data.

  • D. Convert the JSON files to CSV in external storage, then reload them using the existing CSV file format.

Best answer: A

Explanation: The load is failing with JSON parse errors while using a CSV file format for newline-delimited JSON data. Snowflake supports multiple file formats for loading, including CSV for delimited text and JSON for semi-structured data.

When loading JSON into a VARIANT column, the file format must be compatible with JSON so that Snowflake can parse each line as a separate JSON document. Using a CSV file format forces Snowflake to interpret the data as delimited text, which does not match the structure of JSON and leads to parsing errors.

The simplest and most appropriate fix is to define or use an existing file format with TYPE = JSON and reference it in the COPY INTO statement. This allows Snowflake to correctly parse the semi-structured JSON data without modifying the source files or masking errors.
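
A minimal sketch of the fix (stage, table, and format names are placeholders; the target table is assumed to have a single VARIANT column):

CREATE OR REPLACE FILE FORMAT JSON_LINES TYPE = JSON;

COPY INTO APP_LOGS
FROM @LOG_STAGE
FILE_FORMAT = (FORMAT_NAME = 'JSON_LINES');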


Question 57

Topic: Account Setup, Security, and Governance

A company wants strict separation of duties in its Snowflake AI Data Cloud account: one team manages databases, schemas, and tables, while a separate security team manages users, roles, and access policies. Which system role should the security team primarily use to administer access control?

Options:

  • A. ACCOUNTADMIN

  • B. SYSADMIN

  • C. A custom role with OWNERSHIP on all databases and schemas

  • D. SECURITYADMIN

Best answer: D

Explanation: In Snowflake’s role-based access control model, system roles are designed around separation of duties. SECURITYADMIN is intended for security administration tasks such as managing users, roles, and their grants. This allows organizations to keep identity and access management separate from day-to-day data and object administration.

In the scenario, one team is responsible for objects (databases, schemas, tables) and another team is responsible for controlling who can access those objects and how. The security team therefore needs a role whose primary purpose is access control administration, not object ownership or account-wide governance. SECURITYADMIN is exactly that role.

By giving the security team SECURITYADMIN, they can create and manage users, create and manage roles, and grant or revoke privileges to roles and users. Meanwhile, a different role such as SYSADMIN can focus on creating and owning objects, preserving a clean separation of responsibilities and supporting least-privilege access.
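
A hedged sketch of typical security-team work under this role; the user and role names are invented for illustration:

USE ROLE SECURITYADMIN;

CREATE ROLE REPORTING_READER;
CREATE USER JDOE DEFAULT_ROLE = REPORTING_READER;
GRANT ROLE REPORTING_READER TO USER JDOE;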


Question 58

Topic: Query Processing and Performance Optimization

A data engineering team must both load CSV files from a stage into Snowflake tables and periodically export query results back to external cloud storage. Which of the following ways of using the COPY command is INCORRECT for these requirements?

Options:

  • A. Use COPY INTO @archive_stage FROM orders FILE_FORMAT = (TYPE = CSV) to unload table data as files into a stage.

  • B. Use COPY INTO sales FROM (SELECT * FROM sales) to “export” table data while keeping it in Snowflake.

  • C. Use COPY INTO sales_raw FROM @raw_stage FILE_FORMAT = (TYPE = CSV) to load files from a stage into a table.

  • D. Use COPY INTO @ext_stage/exports FROM (SELECT * FROM sales WHERE load_date = CURRENT_DATE()) FILE_FORMAT = (TYPE = CSV) to export filtered query results to external cloud storage.

Best answer: B

Explanation: Snowflake provides two main COPY patterns: COPY INTO <table> for loading data from staged files into a Snowflake table, and COPY INTO <location> for unloading data from a table or query into files in a stage or external cloud storage.

To ingest data, you point COPY INTO <table> at a stage (internal or external) that contains source files. To export data, you use COPY INTO <location> where the location is a stage or URL, and the source is a table or query.

The choice that uses COPY INTO targeting a table and claims to “export” data is incorrect because it does not write files to a stage or external location; it simply attempts to write back into the table itself, which is not how unloading works in Snowflake.


Question 59

Topic: Account Setup, Security, and Governance

Security logs show many failed login attempts to your Snowflake account from unfamiliar IP addresses. RBAC is correctly configured and all users authenticate through SSO. You must further reduce the attack surface without changing roles. Which action is most appropriate?

Options:

  • A. Rotate all Snowflake user passwords and enforce a stricter password policy on the account.

  • B. Revoke the PUBLIC role from all users so they cannot access shared objects by default.

  • C. Enable mandatory multi-factor authentication (MFA) for all Snowflake users in addition to SSO.

  • D. Create an account-level network policy that only allows connections from approved corporate VPN and office IP ranges.

Best answer: D

Explanation: The scenario describes many failed login attempts coming from unfamiliar IP addresses. Role-based access control is already configured correctly, and users authenticate using SSO, so identity and privileges are in reasonable shape. The remaining risk is the network attack surface: anyone on the internet can still attempt to reach the Snowflake account.

Snowflake network policies are designed exactly for this: they restrict which client IP addresses are allowed to establish a connection to the account (or to specific users). By defining an account-level network policy that allows only trusted corporate VPN and office address ranges, you block all access attempts from untrusted locations before authentication and RBAC even come into play. This directly addresses the symptom while leaving existing role design and SSO configuration unchanged.
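
A minimal sketch of this control; the policy name is invented and the CIDR ranges are documentation examples, not real corporate ranges:

CREATE NETWORK POLICY CORP_ONLY
  ALLOWED_IP_LIST = ('203.0.113.0/24', '198.51.100.0/24');

ALTER ACCOUNT SET NETWORK_POLICY = CORP_ONLY;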


Question 60

Topic: Account Setup, Security, and Governance

A data engineer created the following secure view so the ANALYST role can query customer data.

Security now requires that ANALYST must not see SSN values.

Using only the information in the exhibit, what is the most appropriate next step?

CREATE OR REPLACE SECURE VIEW ANALYTICS.V_CUSTOMER AS
SELECT
  CUSTOMER_ID,
  EMAIL,
  PHONE,
  SSN
FROM RAW.CUSTOMERS;

GRANT SELECT ON VIEW ANALYTICS.V_CUSTOMER TO ROLE ANALYST;

Options:

  • A. Redefine the secure view so that the SSN column is removed from the SELECT list.

  • B. Grant SELECT on the RAW.CUSTOMERS base table directly to the ANALYST role and keep the view unchanged.

  • C. Convert the secure view to a regular (non-secure) view so SSN values are automatically hidden from ANALYST.

  • D. Resize the warehouse used by ANALYST queries to a smaller size so SSN data is not returned in query results.

Best answer: A

Explanation: The exhibit shows a secure view ANALYTICS.V_CUSTOMER that selects four columns, including SSN, from RAW.CUSTOMERS, and grants SELECT on this view to the ANALYST role. Because SSN appears explicitly in the SELECT list, any role with access to the view will see SSN values in query results.

To use secure views for governance, you typically restrict what underlying data is exposed in the view definition. In this case, the view should be redefined so that sensitive columns, such as SSN, are either removed entirely or replaced with a masked expression. Once the view only returns allowed columns, granting SELECT on that secure view to roles like ANALYST ensures they cannot directly access the hidden data, even though the base table might contain it.
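
For example, the simplest fix implied by the exhibit is to recreate the view without the sensitive column:

CREATE OR REPLACE SECURE VIEW ANALYTICS.V_CUSTOMER AS
SELECT
  CUSTOMER_ID,
  EMAIL,
  PHONE          -- SSN intentionally omitted
FROM RAW.CUSTOMERS;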

Secure views enhance governance by protecting both data returned and certain metadata in shared or multi-tenant scenarios, but they still return exactly the columns defined in the SELECT list. Therefore, the core governance action is to carefully design the view definition to include only the permitted columns or rows for each consuming role.


Question 61

Topic: Snowflake Architecture and Key Features

A team is designing storage for a very large event-tracking dataset in the Snowflake AI Data Cloud. They want to minimize storage costs while keeping good query performance. Which of the following design choices is NOT appropriate for this goal?

Options:

  • A. Keep all intermediate staging tables as permanent with the longest available Time Travel retention, even though their data can be fully regenerated from raw files within a day.

  • B. Store raw source files in external cloud storage and load only the necessary columns into a curated Snowflake table for analytics.

  • C. Apply clustering keys on the large fact table’s most commonly filtered columns to improve query performance, accepting some additional storage overhead.

  • D. Use transient tables with minimal Time Travel retention for intermediate staging data that can be easily recreated from raw files.

Best answer: A

Explanation: For large Snowflake tables, you need to balance storage cost and query performance by choosing appropriate table types, retention settings, and physical design features like clustering. Data that is easily re-created should not be stored with expensive, long-term protection, whereas critical analytical data might justify extra storage overhead to achieve better performance.

Intermediate staging data is a classic example of content that can be regenerated from raw files. Keeping that data in permanent tables with long Time Travel significantly increases retained micro-partitions and overall storage cost, but rarely improves business value. Instead, transient tables with short retention are typically sufficient. In contrast, investing in clustering on frequently filtered columns of a large fact table is often worthwhile because it can dramatically reduce scan volumes and improve performance, even if it adds some storage overhead for maintenance.
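
As a sketch, a regenerable staging table might be declared transient with no Time Travel retention (names are illustrative):

-- Staging data that can be rebuilt from raw files: no Fail-safe, zero-day Time Travel
CREATE TRANSIENT TABLE staging.events_stg (
  event_id  NUMBER,
  payload   VARIANT
)
DATA_RETENTION_TIME_IN_DAYS = 0;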

Storing raw data externally and loading only curated subsets into Snowflake further reduces storage consumption on the platform, which can be appropriate when raw data does not need to be queried interactively inside Snowflake.


Question 62

Topic: Account Setup, Security, and Governance

Which statement best describes the OWNERSHIP privilege on a Snowflake object such as a table?

Options:

  • A. It gives full control of the object, including the ability to modify it, drop it, and grant or revoke all privileges on it.

  • B. It allows inserting new rows into the object but not updating or deleting existing rows in the object.

  • C. It allows using the object in queries (for example, as a source or target) but does not allow seeing its data or structure.

  • D. It allows reading data from the object but does not permit changing data or managing privileges on the object.

Best answer: A

Explanation: In Snowflake, the OWNERSHIP privilege represents full administrative control over a specific object, such as a table, view, or schema. The role that owns an object can modify its structure or properties, drop it, and grant or revoke any other privileges on that object to roles. Only one role at a time can own a given object, and changing ownership transfers this control.

By contrast, data access privileges like SELECT, INSERT, UPDATE, and DELETE focus on what a role can do with the data inside an object, not on managing the object itself. Similarly, USAGE on containers or compute objects controls whether a role can reference or use those objects, but not administer them fully.
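
For example, transferring full control of a table to another role (names are illustrative):

-- The new owner gains full control; existing grants can be preserved
GRANT OWNERSHIP ON TABLE sales.public.orders TO ROLE data_engineer
  COPY CURRENT GRANTS;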


Question 63

Topic: Snowflake Architecture and Key Features

A Snowflake account has a CUSTOMER table with columns ID, EMAIL, PHONE, and SSN. All analysts currently query the base table directly using a shared ANALYST role. Security requires that junior analysts only see masked PII, senior analysts see full values, and no extra physical copies of data are created. Which approach is the most appropriate to meet these requirements?

Options:

  • A. Keep direct access to the CUSTOMER table but configure the BI tool to hide or obfuscate PII columns for junior analysts using report-level filters.

  • B. Create a nightly ETL job that copies CUSTOMER into a separate reporting table with masked PII, and grant junior analysts access only to the reporting table.

  • C. Create a regular (non-secure) view that selects only non-PII columns from CUSTOMER and grant this view to junior analysts, while keeping full table access for senior analysts.

  • D. Create a secure view on CUSTOMER that masks PII columns based on the querying role, and grant analysts access only to this secure view instead of the base table.

Best answer: D

Explanation: The scenario requires role-based masking of sensitive columns (EMAIL, PHONE, SSN) without creating additional physical copies of the data. Governance should be enforced within Snowflake, not delegated to external tools.

Secure views in Snowflake are designed for protecting sensitive data. They only expose defined columns and expressions, hide the underlying table structure from consumers, and prevent certain information about the underlying data from appearing in query history or explain plans. When combined with role-aware masking logic (for example, masking policies that inspect CURRENT_ROLE()), a secure view can show full values to authorized roles and masked values to less-privileged roles, all from a single logical object.

By granting analysts access only to the secure view and restricting access to the base table to a higher-privileged administrative or engineering role, you centralize and harden data access control while avoiding data duplication and extra pipelines.
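
A minimal sketch of the idea, embedding role-aware logic directly in a secure view (role and object names are illustrative; a masking policy attached to the columns would be the more scalable variant):

CREATE OR REPLACE SECURE VIEW analytics.v_customer AS
SELECT
  id,
  CASE WHEN CURRENT_ROLE() = 'SENIOR_ANALYST' THEN email ELSE '***MASKED***' END AS email,
  CASE WHEN CURRENT_ROLE() = 'SENIOR_ANALYST' THEN phone ELSE '***MASKED***' END AS phone,
  CASE WHEN CURRENT_ROLE() = 'SENIOR_ANALYST' THEN ssn   ELSE '***MASKED***' END AS ssn
FROM customer;

GRANT SELECT ON VIEW analytics.v_customer TO ROLE junior_analyst;
GRANT SELECT ON VIEW analytics.v_customer TO ROLE senior_analyst;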


Question 64

Topic: Query Processing and Performance Optimization

Your team is standardizing how it monitors and troubleshoots Snowflake COPY-based load and unload jobs. Which of the following approaches should you AVOID when designing this monitoring strategy? (Select TWO.)

Options:

  • A. Deleting files from the source stage and rerunning COPY commands to see if they succeed, without first checking COPY_HISTORY or other metadata views for error details.

  • B. Creating a scheduled task that periodically copies rows from ACCOUNT_USAGE load/unload history views into an internal monitoring table for long-term analysis.

  • C. Querying INFORMATION_SCHEMA.COPY_HISTORY for a specific table and time window to review job status, row counts, and load/unload errors.

  • D. Using SNOWFLAKE.ACCOUNT_USAGE.COPY_HISTORY to analyze COPY activity across the entire account, filtering by table, pipe, or time period.

  • E. Relying only on external cloud storage logs and never querying ACCOUNT_USAGE or INFORMATION_SCHEMA views because they are assumed not to contain useful load/unload information.

Correct answers: A and E

Explanation: Snowflake provides rich metadata about load and unload operations through INFORMATION_SCHEMA and SNOWFLAKE.ACCOUNT_USAGE views such as COPY_HISTORY and load/unload history views. These views expose information like execution timestamps, status, number of rows loaded or unloaded, affected tables, file names, and error indicators.

Effective monitoring and troubleshooting strategies read from these views to understand what happened during COPY operations, rather than making destructive changes or relying solely on external systems. Good practices include focusing queries on relevant tables and time windows, using account-wide views for centralized monitoring, and persisting key metrics to internal monitoring tables.

The approaches to avoid are those that ignore this metadata (for example, relying only on storage logs) or that change or delete data before understanding the cause of failures (for example, deleting stage files and blindly rerunning jobs).
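
For example, a non-destructive first step might look like this (table name and time windows are placeholders):

-- Recent loads for one table via INFORMATION_SCHEMA (low latency, narrow scope)
SELECT file_name, status, row_count, first_error_message
FROM TABLE(INFORMATION_SCHEMA.COPY_HISTORY(
  TABLE_NAME => 'SALES_RAW',
  START_TIME => DATEADD('hour', -24, CURRENT_TIMESTAMP())));

-- Account-wide COPY activity for centralized monitoring (higher latency)
SELECT table_name, status, row_count, last_load_time
FROM SNOWFLAKE.ACCOUNT_USAGE.COPY_HISTORY
WHERE last_load_time >= DATEADD('day', -7, CURRENT_TIMESTAMP());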


Question 65

Topic: Account Setup, Security, and Governance

A data engineer is investigating why the role ANALYST_ROLE receives a “SQL access control error: insufficient privileges” when running:

SELECT *
FROM SALES.PUBLIC.ORDERS;

They query SNOWFLAKE.ACCOUNT_USAGE.GRANTS_TO_ROLE and see the following results for ANALYST_ROLE:

| ROLE         | GRANTED_ON | NAME         | PRIVILEGE |
|--------------|------------|--------------|-----------|
| ANALYST_ROLE | DATABASE   | SALES        | USAGE     |
| ANALYST_ROLE | SCHEMA     | SALES.PUBLIC | USAGE     |
| ANALYST_ROLE | WAREHOUSE  | REPORTING_WH | USAGE     |

Based on this exhibit, which privilege change is MOST likely required so ANALYST_ROLE can successfully query SALES.PUBLIC.ORDERS?

Options:

  • A. Grant USAGE on warehouse REPORTING_WH to role ANALYST_ROLE.

  • B. Grant USAGE on schema SALES.PUBLIC to role ANALYST_ROLE.

  • C. Grant OWNERSHIP on database SALES to role ANALYST_ROLE.

  • D. Grant SELECT on table SALES.PUBLIC.ORDERS to role ANALYST_ROLE.

Best answer: D

Explanation: The exhibit is a filtered view of SNOWFLAKE.ACCOUNT_USAGE.GRANTS_TO_ROLE for ANALYST_ROLE. It shows that this role already has:

  • USAGE on the SALES database (GRANTED_ON = DATABASE, NAME = SALES).
  • USAGE on the SALES.PUBLIC schema (GRANTED_ON = SCHEMA, NAME = SALES.PUBLIC).
  • USAGE on the REPORTING_WH warehouse (GRANTED_ON = WAREHOUSE, NAME = REPORTING_WH).

To successfully run a SELECT query on SALES.PUBLIC.ORDERS, a role must have:

  1. USAGE on the database.
  2. USAGE on the schema.
  3. An object-level privilege like SELECT on the target table (or a higher-scope grant that covers that table).
  4. USAGE on a warehouse to execute the query.

The exhibit confirms that items 1, 2, and 4 are already satisfied. The missing piece is the table-level privilege on the ORDERS table. Therefore, granting SELECT on SALES.PUBLIC.ORDERS to ANALYST_ROLE is the specific privilege change needed.

Other proposed changes either duplicate existing privileges shown in the exhibit (such as USAGE on the schema or warehouse) or dramatically overgrant control (such as full OWNERSHIP on the database), which is not necessary for simple read access to a single table.
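
Expressed as a statement, the missing grant is simply:

GRANT SELECT ON TABLE SALES.PUBLIC.ORDERS TO ROLE ANALYST_ROLE;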


Question 66

Topic: Snowflake Architecture and Key Features

A data team uses the Snowflake AI Data Cloud. Analysts currently download CSVs from Snowsight to build ad hoc spreadsheets. Leadership wants simple, interactive “what-if” data apps that run directly on Snowflake data, reuse existing warehouses, and avoid managing any new infrastructure. Which approach is most appropriate?

Options:

  • A. Create more views and have analysts run parameterized SQL in Snowsight worksheets instead of downloading CSVs.

  • B. Use Streamlit in Snowflake to build interactive apps that query Snowflake tables directly and run using existing virtual warehouses and RBAC.

  • C. Deploy an external web server, install open-source Streamlit, and connect to Snowflake via ODBC using a dedicated service account.

  • D. Export data nightly from Snowflake into an external BI tool using extracts, then build dashboards with filters in that tool.

Best answer: B

Explanation: Streamlit in Snowflake is designed for quickly building interactive data applications that run directly inside the Snowflake AI Data Cloud. Developers can use Python and Streamlit APIs to create UI elements, bind them to queries on Snowflake tables or views, and render interactive visualizations. These apps execute within Snowflake, using existing virtual warehouses and respecting Snowflake roles and privileges.

In this scenario, leadership wants interactive “what-if” applications that:

  • Access live Snowflake data directly
  • Reuse existing Snowflake compute and RBAC
  • Avoid provisioning and operating separate servers or application platforms

Streamlit in Snowflake matches all of these goals. It keeps data, logic, and execution inside Snowflake, reducing data movement and operational overhead while enabling richer interactions than plain SQL worksheets.

Other options either remain at the SQL/worksheet level, require external infrastructure, or move data out of Snowflake into other tools, so they do not fully satisfy the requirements.


Question 67

Topic: Data Loading, Unloading, and Transformation

A company uses a single, always-on Large virtual warehouse for BI dashboards that are only used during an 8-hour business day. Storage usage and Time Travel retention are acceptable, but monthly Snowflake credit consumption is high. They want to reduce costs without changing data retention or dashboard performance. Which action is the most appropriate?

Options:

  • A. Reduce the database Time Travel retention period from 7 days to 1 day to lower virtual warehouse compute costs.

  • B. Move older fact tables to external cloud storage and query them as external tables instead of internal tables.

  • C. Configure the BI virtual warehouse with auto-suspend after a few minutes of inactivity and enable auto-resume.

  • D. Disable the query result cache so queries always read fresh data, then resize the warehouse from Large to Medium.

Best answer: C

Explanation: Snowflake costs are primarily driven by three components: storage, virtual warehouse compute, and cloud services. In this scenario, the concern is high monthly credit consumption while the BI dashboards only run during an 8-hour business day. That points directly to virtual warehouse compute being used (and billed) even when there is no workload.

Virtual warehouses consume credits while they are running, regardless of whether they are actively processing queries. A common optimization is to enable auto-suspend so that a warehouse shuts down after a short period of inactivity, and auto-resume so it automatically starts again when a new query arrives. This reduces idle compute cost without changing data retention settings or noticeable query performance, assuming reasonable resume times.
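
For example (the 5-minute threshold is an illustrative value, and bi_wh is a placeholder warehouse name):

ALTER WAREHOUSE bi_wh SET
  AUTO_SUSPEND = 300    -- suspend after 300 seconds of inactivity
  AUTO_RESUME  = TRUE;  -- start again automatically when the next query arrives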

By contrast, adjusting Time Travel mainly affects storage consumption, not ongoing compute credits. Moving data to external storage changes where data is stored, not how long the warehouse runs. Disabling the result cache and shrinking the warehouse often increases compute work and can degrade performance, which conflicts with the stated requirement to keep dashboard performance unchanged.

Therefore, configuring auto-suspend and auto-resume on the BI warehouse is the best Snowflake-native optimization that targets the main cost driver in this scenario: idle virtual warehouse compute.


Question 68

Topic: Account Setup, Security, and Governance

Which TWO statements describe how secure views in Snowflake support data governance by hiding sensitive data from certain roles? (Select TWO.)

Options:

  • A. Once a secure view is created, the base table no longer needs any privileges assigned, because all access must go through the view.

  • B. Granting SELECT on a secure view automatically grants SELECT on the underlying table to the same role.

  • C. A secure view can project only non-sensitive columns so that roles with SELECT on the view cannot see the hidden base table columns.

  • D. Secure views are required in Snowflake to implement any kind of row-level security.

  • E. A secure view can apply a WHERE clause using functions like CURRENT_ROLE() so different roles see different subsets of rows from the same table.

Correct answers: C and E

Explanation: Secure views in the Snowflake AI Data Cloud are an important governance tool for controlling how data is exposed to different roles. They behave like regular views in that they can select specific columns and filter rows, but they add stronger protections for secure sharing and for hiding underlying objects and logic.

From a governance perspective, secure views let you define an approved, filtered projection of a table and then grant roles access only to that view. By omitting sensitive columns and embedding filters that depend on session context (for example, using CURRENT_ROLE() or CURRENT_USER()), you can ensure that users only see the data they are permitted to see, even if the underlying table contains more detailed or sensitive information.
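
A minimal sketch combining both ideas, a restricted column list plus a role-dependent row filter (names are illustrative):

CREATE OR REPLACE SECURE VIEW gov.v_orders_emea AS
SELECT order_id, order_date, region, amount   -- sensitive columns omitted
FROM sales.orders
WHERE region = 'EMEA'
   OR CURRENT_ROLE() = 'GLOBAL_ANALYST';      -- broader roles see all rows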

Secure views do not change Snowflake’s privilege model: granting SELECT on a secure view does not automatically grant access to the underlying table, and they are not the only way to implement row-level security. Instead, they complement other mechanisms such as row access policies and role-based grants.


Question 69

Topic: Operations, Monitoring, and Business Continuity

A data provider uses the Snowflake AI Data Cloud to publish a curated analytics dataset to more than 20 external customers. Some customers already have their own Snowflake accounts; others do not use Snowflake at all. Requirements are:

  • Customers with existing Snowflake accounts must query the data inside their own accounts and pay for their own compute.
  • Customers without Snowflake accounts must still be able to run SQL queries in a Snowflake UI, with the provider managing and paying for their compute.
  • The provider wants to maintain a single logical copy of the data with strong access governance.

Which approach BEST meets these requirements using Snowflake data sharing capabilities?

Options:

  • A. Register as a data provider, create standard shares for customers with Snowflake accounts, and create reader accounts for customers without Snowflake accounts, granting all of them access to the same shared objects.

  • B. Create reader accounts for all customers, including those with existing Snowflake accounts, and require everyone to query through those reader accounts so the provider can centrally manage compute.

  • C. Create a separate Snowflake account per customer, replicate the dataset into each account, and let customers without Snowflake accounts use Snowsight in those dedicated accounts.

  • D. Ask all customers to export the shared data from Snowflake into external cloud storage files and then load it into their own tools, so each customer manages its own copy and compute.

Best answer: A

Explanation: Snowflake data sharing is built around three key roles: the data provider (the account that owns and publishes data), consumer accounts (regular Snowflake accounts that receive shares), and reader accounts (provider-managed accounts for consumers who do not have their own Snowflake accounts).

In this scenario, the provider needs to serve two distinct groups while keeping a single logical copy of the data. For customers who already have Snowflake accounts, the best pattern is to share data directly from the provider account to each customer’s consumer account. Those customers can then create their own roles, virtual warehouses, and views and pay for their own compute while querying the provider’s shared data.

For customers without Snowflake accounts, reader accounts are designed specifically for this use case. The provider creates one or more reader accounts under its own Snowflake account, grants those reader accounts access to the shared objects, and creates warehouses in those reader accounts. The external users log in to the reader account’s Snowsight UI, run SQL on the shared data, and all compute costs are billed to the provider. The data itself remains stored once in the provider account and is accessed zero-copy by both consumer and reader accounts.
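
A simplified sketch of the provider-side objects (all names, including the partner account identifier and the admin password, are placeholders):

-- Share for customers that already have their own Snowflake accounts
CREATE SHARE analytics_share;
GRANT USAGE ON DATABASE analytics TO SHARE analytics_share;
GRANT USAGE ON SCHEMA analytics.curated TO SHARE analytics_share;
GRANT SELECT ON TABLE analytics.curated.daily_metrics TO SHARE analytics_share;
ALTER SHARE analytics_share ADD ACCOUNTS = partner_org.partner_account;

-- Provider-managed reader account for customers without Snowflake
CREATE MANAGED ACCOUNT partner_reader
  ADMIN_NAME = reader_admin, ADMIN_PASSWORD = '<strong password>',
  TYPE = READER;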

This combination meets all requirements: existing Snowflake customers query within their own accounts and pay their own compute, non-Snowflake customers still get a Snowflake UI and SQL access with provider-managed compute, and the provider maintains a single governed dataset using Snowflake data sharing.


Question 70

Topic: Account Setup, Security, and Governance

A company uses a single Snowflake account with several shared schemas and a few virtual warehouses. They want to improve governance by classifying PII columns and allocating warehouse costs by department, using Snowflake object tags instead of spreadsheets. Which approach is the most appropriate?

Options:

  • A. Create separate object tags for data classification and cost center, attach them to PII columns and departmental warehouses, then report using Snowflake’s tag-reference views in ACCOUNT_USAGE.

  • B. Use column and warehouse comments to store classification and department names, and periodically export SHOW command output to a separate reporting system.

  • C. Create one generic governance tag on each database only, without tagging individual columns or warehouses, and track department usage from role names instead.

  • D. Define a single organization-wide tag for compliance and attach it only to ACCOUNTADMIN, then infer data classification and costs from queries run by that role.

Best answer: A

Explanation: Snowflake object tags allow you to attach structured metadata to objects such as databases, schemas, tables, columns, and virtual warehouses. This metadata can then be queried via Snowflake-provided views (for example, in SNOWFLAKE.ACCOUNT_USAGE) to support governance, classification, and internal cost allocation.

In this scenario, the company wants to classify PII at the column level for governance and associate compute costs with departments at the warehouse level. The best way is to create appropriate tags (for example, DATA_CLASSIFICATION and COST_CENTER), assign them to the correct objects (columns that store PII, warehouses used by specific departments), and then build reports on tag usage and associated costs using the tag-reference views.
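
For example (tag, schema, and object names are illustrative):

CREATE TAG governance.data_classification;
CREATE TAG governance.cost_center;

-- Classify a PII column and attribute a warehouse to a department
ALTER TABLE raw.customers MODIFY COLUMN ssn SET TAG governance.data_classification = 'PII';
ALTER WAREHOUSE marketing_wh SET TAG governance.cost_center = 'MARKETING';

-- Report on tag assignments (ACCOUNT_USAGE views carry some latency)
SELECT object_name, column_name, tag_name, tag_value
FROM SNOWFLAKE.ACCOUNT_USAGE.TAG_REFERENCES;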

This keeps the solution simple, Snowflake-native, and aligned with built-in governance and billing visibility features, without introducing complex tag-based policies or external tooling beyond standard reporting on tag metadata.


Question 71

Topic: Snowflake Architecture and Key Features

Which statement BEST describes Snowpark in the Snowflake AI Data Cloud and when you would use it instead of only writing SQL?

Options:

  • A. Snowpark is a metadata catalog that tracks data lineage and access policies for objects stored in Snowflake databases and schemas.

  • B. Snowpark is a separate compute cluster outside Snowflake used to pre-process data files before loading them into Snowflake tables.

  • C. Snowpark lets developers write data processing logic in familiar programming languages that runs inside Snowflake compute, useful when business logic is easier to express in code than in SQL alone.

  • D. Snowpark is a desktop client that connects to Snowflake and helps analysts visually build SQL queries without writing any code.

Best answer: C

Explanation: Snowpark is a developer framework that lets you build data processing and data pipeline logic in familiar programming languages while executing that logic inside Snowflake virtual warehouses. You typically choose Snowpark instead of pure SQL when the transformation or business logic is complex, better expressed as code, or needs to reuse existing application logic rather than being rewritten in SQL.

The option describing Snowpark as a desktop visual query builder is the closest distractor, but it is incorrect because Snowpark is not a UI client at all; it is an API and execution model that runs within Snowflake, not a drag-and-drop SQL tool installed on a user’s machine.


Question 72

Topic: Account Setup, Security, and Governance

Which TWO statements about using INFORMATION_SCHEMA and ACCOUNT_USAGE views for data governance reporting in Snowflake are INCORRECT? (Select TWO.)

Options:

  • A. INFORMATION_SCHEMA views typically reflect new objects and privilege changes almost immediately, making them suitable for near real-time checks within a specific database or schema.

  • B. For data governance, ACCOUNT_USAGE is preferred when you must avoid any latency because it always reflects activity in real time.

  • C. INFORMATION_SCHEMA views are stored in the SNOWFLAKE database and automatically include metadata from every database in the account.

  • D. ACCOUNT_USAGE views are appropriate for long-term governance reporting, such as analyzing privileges, object history, and query activity over weeks or months.

  • E. ACCOUNT_USAGE views aggregate metadata across the entire Snowflake account but can have a delay of up to several hours before new activity is visible.

Correct answers: B and C

Explanation: Snowflake provides both INFORMATION_SCHEMA and ACCOUNT_USAGE views to support governance and auditing, but they differ in scope and latency.

INFORMATION_SCHEMA is implemented as a schema within each database (and sometimes at the schema level), exposing metadata only for that specific database or schema. Because it is tightly coupled with the database’s current state, it typically reflects changes like new objects or privilege grants with minimal delay, making it useful for near real-time checks in that scope.

ACCOUNT_USAGE, exposed via the SNOWFLAKE database, aggregates metadata and usage information across the entire account: queries, objects, roles, grants, and more. This data is optimized for historical and cross-database reporting, but it is loaded asynchronously, resulting in a documented latency that can be up to several hours for some views. Therefore, it is ideal for long-term governance and audit reporting, but not for strict real-time monitoring.

In governance designs, it is common to use INFORMATION_SCHEMA for immediate, narrow checks (per database) and ACCOUNT_USAGE for centralized, account-wide reporting, understanding that ACCOUNT_USAGE trades freshness for breadth and history.
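
For example, the two scopes might be queried like this (filters are illustrative):

-- Near real-time, scoped to the current database
SELECT table_name, created
FROM INFORMATION_SCHEMA.TABLES
WHERE table_schema = 'PUBLIC';

-- Account-wide grant history, with documented latency of up to a few hours
SELECT grantee_name, privilege, granted_on, name
FROM SNOWFLAKE.ACCOUNT_USAGE.GRANTS_TO_ROLES
WHERE deleted_on IS NULL;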


Question 73

Topic: Account Setup, Security, and Governance

A Snowflake AI Data Cloud account currently uses shared username/password credentials in ETL scripts. Security wants stronger authentication for non-interactive workloads while simplifying credential rotation and avoiding MFA prompts in jobs. Which change BEST meets these goals?

Options:

  • A. Use a single shared technical user with password and MFA enabled, and configure all ETL scripts to authenticate using that account through the organization’s SSO provider.

  • B. Create dedicated service users for ETL and enable key pair authentication, storing private keys encrypted in a central secrets manager and disabling password login for those users.

  • C. Enable key pair authentication for all human users and distribute their private keys by email without passphrases so they can connect easily from any device.

  • D. Keep password-based authentication but enforce very long, complex passwords for the ETL user and rotate the password manually once per year.

Best answer: B

Explanation: For non-interactive workloads such as ETL scripts, Snowflake supports both password-based authentication and key pair authentication. Passwords are easy for humans to type but must be stored somewhere when used by scripts, which creates security and management challenges. Key pair authentication separates a public key stored in Snowflake from a private key kept by the client, allowing strong, non-interactive authentication without exposing a shared password.

Using dedicated service accounts with key pair authentication lets teams avoid sharing user passwords across servers and scripts. Private keys can be stored in a central secrets manager and rotated in a controlled way. Because the authentication is based on possession of the private key, scripts do not need MFA prompts, yet security is strong if keys are protected and password login is disabled for those accounts.
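
A minimal sketch of the service-user setup (the key value and all names are placeholders; the private key itself would live in the secrets manager):

-- Dedicated service user with no password, so only key pair login is possible
CREATE USER etl_service
  DEFAULT_ROLE = etl_role
  MUST_CHANGE_PASSWORD = FALSE;

-- Register the public key in Snowflake
ALTER USER etl_service SET RSA_PUBLIC_KEY = 'MIIBIjANBgkqh...';

-- Rotate by adding RSA_PUBLIC_KEY_2, switching clients, then unsetting the old key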

By contrast, longer passwords or adding MFA may help for human users, but they do not solve the underlying problem for automated jobs: credentials must still be stored and updated in each script. Sharing one technical account across many jobs also hinders auditing and increases risk. Similarly, distributing unprotected private keys for all humans would weaken security and be difficult to manage. The recommended pattern is to use key pair authentication for service accounts with proper key storage and rotation, while humans typically use SSO or password+MFA.


Question 74

Topic: Snowflake Architecture and Key Features

A data engineering team currently exports large tables from the Snowflake AI Data Cloud to an external processing cluster to run complex transformation code. They want to redesign the pipeline using the requirements shown below.

| Requirement ID | Detail                                                                                                |
|----------------|-------------------------------------------------------------------------------------------------------|
| R1             | Reuse existing general-purpose programming language skills (e.g., Python, Java) instead of only SQL.   |
| R2             | Keep data and processing inside Snowflake to minimize data movement.                                   |
| R3             | Build maintainable, testable code-based data pipelines.                                                |
| R4             | Use DataFrame-style operations while still leveraging Snowflake tables.                                |

Based on the exhibit, which Snowflake feature is MOST appropriate to implement the data processing logic?

Options:

  • A. Continue using the external processing cluster but replace exports with a Snowflake ODBC connection for reading data.

  • B. Create external tables on files in cloud storage and process them with an external Spark cluster.

  • C. Use Snowpark to build data pipelines in a supported programming language that execute inside Snowflake compute.

  • D. Rewrite all transformation logic as pure SQL in Snowsight worksheets and schedule it with tasks.

Best answer: C

Explanation: The exhibit highlights four key requirements: reuse of general-purpose programming language skills (R1), keeping both data and processing inside Snowflake (R2), building maintainable code-based pipelines (R3), and using DataFrame-style operations over Snowflake tables (R4).

Snowpark is a Snowflake feature that lets developers work in familiar programming languages while pushing the execution of that code into Snowflake virtual warehouses. It exposes DataFrame-style APIs and integrates with Snowflake’s SQL engine so that the data never needs to leave Snowflake storage, directly addressing R1–R4.

In contrast, designs that rely purely on SQL ignore the desire to reuse existing application language skills, and designs that rely on external clusters fundamentally violate the requirement to keep processing inside Snowflake to minimize data movement and operational overhead.


Question 75

Topic: Snowflake Architecture and Key Features

Which of the following statements about the responsibilities of the Snowflake cloud services layer is NOT correct?

Options:

  • A. It maintains metadata and coordinates transactions to ensure ACID behavior across queries.

  • B. It generates query execution plans and optimizes how virtual warehouses access data.

  • C. It handles user authentication and session management, including integration with SSO providers.

  • D. It stores all table data in micro-partitions on cloud object storage.

Best answer: D

Explanation: Snowflake’s cloud services layer is responsible for coordinating and governing how the platform works, not for physically storing data or performing the low-level data processing itself.

Core responsibilities include authentication and session management, metadata management, query parsing and optimization, and transaction coordination. The cloud services layer knows what data exists and how it is organized, and it decides how queries should run, but it does not perform the actual data scans or store the table data files.

In contrast, the storage layer holds all table data in compressed micro-partitions on cloud object storage. Virtual warehouses (compute) perform the query execution and data scanning based on the plans produced by the cloud services layer.


Questions 76-100

Question 76

Topic: Account Setup, Security, and Governance

A global company is onboarding two semi-autonomous business units to the Snowflake AI Data Cloud. Each unit must have isolated administration of its own data, users, roles, and virtual warehouses, and usage must be billed and monitored separately per unit. However, the central data team needs to occasionally share curated dimensional data between the units in a governed way. Based on Snowflake’s logical architecture and isolation boundaries, what is the most appropriate design?

Options:

  • A. Provision one Snowflake account and separate the business units using different virtual warehouses, relying on warehouse-level usage tracking for cost separation.

  • B. Use one Snowflake account and create a reader account for each business unit to separate their usage while keeping administration centralized.

  • C. Provision a separate Snowflake account for each business unit under the same Snowflake organization, and use secure data sharing between the accounts when curated data must be exchanged.

  • D. Create a single Snowflake account and use separate databases, schemas, and roles for each business unit, plus secure views for any shared data.

Best answer: C

Explanation: In Snowflake, the account is the top-level logical container for data, compute resources (virtual warehouses), users, roles, and most configuration. Accounts are strongly isolated from one another, with their own security boundary and billing context. When different business units need administrative and cost separation but still want to collaborate, using multiple accounts under a single Snowflake organization is the standard design.

By giving each business unit its own account, each team can independently manage its own users, roles, databases, schemas, and virtual warehouses. Usage and costs are naturally separated at the account level, while the organization construct allows central governance and consolidated commercial management. When data must be shared across units, Snowflake’s secure data sharing allows providers to expose specific objects (like curated dimensional tables) to consumer accounts without copying data.

The other options rely on logical isolation mechanisms within a single account (databases, roles, virtual warehouses, reader accounts), which do not change the fact that all security, metadata, and billing still reside in one account. That violates the requirement for isolated administration and separate billing per business unit.


Question 77

Topic: Data Loading, Unloading, and Transformation

A data engineer investigates a slow aggregation query. The Query Profile shows multiple operators with “spilling to remote storage.” The goal is to reduce runtime while keeping costs reasonable. Which of the following actions/solutions will meet these requirements? (Select TWO.)

Options:

  • A. Increase the database Time Travel retention period so Snowflake keeps more historical data available.

  • B. Turn on automatic clustering for the underlying table so micro-partitions are physically reordered over time.

  • C. Enable multi-cluster mode on the warehouse so additional clusters are added automatically for this query.

  • D. Increase the virtual warehouse size by one level so more of the working data set fits in memory during query execution.

  • E. Rewrite the query to push selective filters and aggregations earlier, reducing the volume of intermediate data processed.

Correct answers: D and E

Explanation: In Snowflake, the Query Profile can show that operators are “spilling to local storage” or “spilling to remote storage” when the working data set for that operation does not fit into the memory allocated on the virtual warehouse.

When spilling occurs, Snowflake must write intermediate data to disk and later read it back, introducing extra I/O. Accessing disk (local or remote cloud storage) is significantly slower than working in memory, so repeated spills can noticeably slow a query.

There are two main levers to improve performance in this situation:

  • Increase the memory available to the query by using a larger virtual warehouse, so operators can keep more intermediate data in memory.
  • Reduce the amount of data each operator needs to hold at once by rewriting the query to be more selective or to aggregate earlier, which shrinks intermediate result sizes.

The correct answers follow these principles. The incorrect options either target different problems (like concurrency or data recovery) or features that do not directly influence the memory usage that leads to spilling.


Question 78

Topic: Data Loading, Unloading, and Transformation

A team wants to speed up dashboard queries using materialized views in the Snowflake AI Data Cloud. Which of the following practices related to materialized views should you AVOID? (Select TWO.)

Options:

  • A. Create a materialized view on a large, stable fact table to precompute daily aggregates used by many dashboards.

  • B. Define multiple overlapping materialized views that each compute very similar aggregates on the same base table for slightly different filters.

  • C. Design materialized views primarily on tables whose data changes relatively infrequently but are read very often by BI users.

  • D. Create materialized views on highly volatile staging tables that are continuously loaded and updated throughout the day.

  • E. Periodically review materialized views and drop those that have not been queried recently to reduce ongoing maintenance costs.

  • F. Use a materialized view to precompute an expensive, frequently repeated aggregation on a large fact table that powers many reporting queries.

Correct answers: B and D

Explanation: Materialized views in Snowflake store precomputed results for a defined query against a single base table, such as aggregates or filtered projections (joins are not supported in materialized view definitions). Because the results are stored, queries against the materialized view often run much faster than recomputing the logic every time, especially for large tables and complex aggregations.

However, materialized views are not free. Snowflake must maintain them whenever the base table changes. This introduces extra compute for refresh and additional storage for the persisted results. To use materialized views effectively, you want high read benefit relative to the cost of keeping them up to date.

Creating materialized views on highly volatile tables, or creating many overlapping materialized views that compute similar results, increases refresh work and storage without a corresponding gain in query performance. Those are practices to avoid. In contrast, designing a small number of materialized views on stable, frequently queried data, especially for repeated aggregates, and periodically retiring unused views aligns with Snowflake best practices for performance and cost optimization.
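
For example, one targeted materialized view over a stable fact table (names are illustrative):

CREATE MATERIALIZED VIEW analytics.mv_daily_sales AS
SELECT sale_date, region, SUM(amount) AS total_amount, COUNT(*) AS order_count
FROM analytics.fact_sales
GROUP BY sale_date, region;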


Question 79

Topic: Data Sharing, Collaboration, and Marketplace

In Snowflake, which use case BEST justifies using CREATE TABLE AS SELECT (CTAS)?

Options:

  • A. Streaming new files from external cloud storage into a table as soon as they arrive

  • B. Automatically keeping query results up to date whenever source tables change without rerunning the query

  • C. Persisting the result of a complex query as a physical table so it can be reused and queried faster

  • D. Enforcing row-level security by filtering sensitive rows from a base table

Best answer: C

Explanation: CREATE TABLE AS SELECT (CTAS) in Snowflake is used to materialize the result of a query into a new physical table. This is especially useful when a query is complex, expensive to run, or its result needs to be reused multiple times.

By writing the result set into storage as a regular table, later queries can read directly from that table instead of recomputing the logic every time. This can improve performance, reduce compute cost, and provide a stable snapshot of data for downstream workloads, such as reporting or further transformations.
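
A minimal sketch of the pattern (names and the filter threshold are illustrative):

-- Persist an expensive result once; later queries read the table directly
CREATE OR REPLACE TABLE analytics.top_customers AS
SELECT customer_id, SUM(amount) AS lifetime_value
FROM sales.orders
GROUP BY customer_id
HAVING SUM(amount) > 10000;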

However, CTAS produces a static snapshot at the time it is run. It does not automatically stay in sync with source tables; for continuously updated or automatically maintained results, views or materialized views are more appropriate. CTAS also does not replace security mechanisms like row access policies or ingestion tools like Snowpipe.


Question 80

Topic: Snowflake Architecture and Key Features

Which TWO statements about Snowflake micro-partitions are NOT correct? (Select TWO.)

Options:

  • A. Choosing an appropriate clustering key for a large table can influence how new micro-partitions are organized and help improve pruning for frequently filtered columns.

  • B. Each micro-partition stores column-level statistics such as value ranges, which Snowflake uses for partition pruning to reduce how much data queries must scan.

  • C. Administrators must define explicit partition keys for every table; otherwise, Snowflake will not create micro-partitions and query performance will be severely degraded.

  • D. Snowflake automatically divides table data into micro-partitions as it is loaded, and users cannot directly control the exact boundaries or on-disk layout of each micro-partition.

  • E. DBAs should regularly run manual physical maintenance commands (such as explicit partition rebuilds) to reorganize micro-partitions on disk and avoid performance degradation.

Correct answers: C and E

Explanation: Snowflake micro-partitions are a core part of how the Snowflake AI Data Cloud organizes table data on storage. When data is loaded into a table, Snowflake automatically divides it into contiguous units called micro-partitions. Users do not define partition keys in the traditional database sense, nor do they manage physical files or run low-level maintenance jobs; these details are handled by Snowflake’s cloud services layer.

Each micro-partition includes metadata, such as min/max values per column and counts of distinct values, which Snowflake’s optimizer uses to decide which micro-partitions to scan. This process, called partition pruning, reduces the amount of data read for many queries and is central to Snowflake’s performance.

Although you cannot directly control micro-partition boundaries or file layout, you can influence how data is clustered using clustering keys on large, frequently filtered tables. Clustering keys help Snowflake organize new data so that related rows are more likely to fall into the same or nearby micro-partitions, improving pruning efficiency.
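
For example, clustering could be suggested on the most commonly filtered columns (names are illustrative):

ALTER TABLE analytics.fact_events CLUSTER BY (event_date, customer_id);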

Because Snowflake automatically manages micro-partitions and storage layout, you do not need (and are not able) to run manual operations like explicit partition rebuilds or VACUUM/ANALYZE. Thinking in terms of micro-partitions is useful conceptually, but daily operations focus on table design, clustering, and query patterns rather than physical partition management.


Question 81

Topic: Data Loading, Unloading, and Transformation

A Snowflake admin is investigating a cost spike on virtual warehouse WH_BI and wants to know how many credits analyst ALICE used on that warehouse in the last 60 minutes. The admin exports a small portion of Query History:

| Query ID | User  | Warehouse | Start time (minutes ago) | Credits |
|----------|-------|-----------|--------------------------|---------|
| Q101     | ALICE | WH_BI     | 110                      | 1.0     |
| Q102     | ALICE | WH_ETL    | 55                       | 0.6     |
| Q103     | ALICE | WH_BI     | 40                       | 0.4     |
| Q104     | BOB   | WH_BI     | 30                       | 0.7     |
| Q105     | ALICE | WH_BI     | 10                       | 0.5     |

Assume “last 60 minutes” includes all queries that started within the last 60 minutes. In Snowsight’s Query History, what total credits should the admin see, and which filters should be applied to match that total?

Options:

  • A. 1.5 credits; filter by user ALICE and time range Last 60 minutes (any warehouse).

  • B. 1.6 credits; filter by warehouse WH_BI and time range Last 60 minutes (any user).

  • C. 1.9 credits; filter by user ALICE, warehouse WH_BI, and time range Last 2 hours.

  • D. 0.9 credits; filter by user ALICE, warehouse WH_BI, and time range Last 60 minutes.

Best answer: D

Explanation: To answer this, you first identify which rows match all three conditions: user ALICE, warehouse WH_BI, and started within the last 60 minutes. Only Q103 (40 minutes ago, 0.4 credits) and Q105 (10 minutes ago, 0.5 credits) satisfy all of these, and together they consume 0.9 credits.

In Snowsight’s Query History, you can reproduce this subset by setting filters for the specific user (ALICE), virtual warehouse (WH_BI), and a time range of Last 60 minutes. Using all three filters together narrows the view to just the relevant queries so the credit total in the UI aligns with your manual calculation.
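
The same subset can also be reproduced in SQL against the account-level query history (this isolates the matching rows; the per-query credit figures in the exhibit are illustrative, and ACCOUNT_USAGE data is subject to some latency):

SELECT query_id, user_name, warehouse_name, start_time
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE user_name = 'ALICE'
  AND warehouse_name = 'WH_BI'
  AND start_time >= DATEADD('minute', -60, CURRENT_TIMESTAMP());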


Question 82

Topic: Data Loading, Unloading, and Transformation

Each hour, a BI dashboard sees 30 concurrent queries for 20 minutes and 5 concurrent queries for 40 minutes. An X-Small warehouse cluster supports 10 concurrent queries and uses 1 credit/hour; a Large supports 40 concurrent queries and uses 4 credits/hour. Credits scale linearly with time. To avoid queuing and stay under 3 credits/hour, which setup is best?

Options:

  • A. Configure one X-Small warehouse that runs for the full hour.

  • B. Configure one Large warehouse that runs for the full hour.

  • C. Configure a multi-cluster X-Small warehouse with min=1 and max=3, using auto-resume and auto-suspend so extra clusters run only during the 20-minute burst.

  • D. Configure a multi-cluster X-Small warehouse with min=3 and max=3, running for the full hour.

Best answer: C

Explanation: The workload has a short period of high concurrency and a longer period of low concurrency. A good design must provide enough concurrent capacity during the 20-minute burst while minimizing credits when only a few queries run.

With multi-cluster warehouses, Snowflake can spin up additional clusters during spikes and shut them down when demand drops, so you only pay for extra capacity while it is actually used. This matches the pattern of spiky BI dashboards much better than a single, always-on large warehouse.

In the chosen configuration, three X-Small clusters handle the 30-query burst, and only one X-Small runs during the quieter 40 minutes. Three clusters for 20 minutes use about one credit, and one cluster for 40 minutes uses about two-thirds of a credit, for a total of about 1.67 credits, which is under the 3-credit target.
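
A sketch of that configuration (the warehouse name and auto-suspend value are illustrative):

CREATE WAREHOUSE bi_wh
  WAREHOUSE_SIZE    = 'XSMALL'
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 3
  SCALING_POLICY    = 'STANDARD'
  AUTO_SUSPEND      = 60
  AUTO_RESUME       = TRUE;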


Question 83

Topic: Snowflake Architecture and Key Features

A team is migrating a 5 TB fact table into the Snowflake AI Data Cloud. They want good query performance while minimizing ongoing effort managing how data is physically partitioned or laid out on disk. Which approach best leverages Snowflake’s automatic micro-partitioning?

Options:

  • A. Continuously reorganize and rewrite data files in external cloud storage so that each file aligns with common filter predicates before loading into Snowflake.

  • B. Load the data into a single standard table without specifying any partitions and let Snowflake automatically create and manage micro-partitions as data is loaded.

  • C. Pre-split the source data into separate tables for each month and route queries to the correct table using application logic to reduce partition scanning.

  • D. Define a detailed multi-column partitioning scheme in the CREATE TABLE statement so Snowflake uses those columns as explicit physical partitions.

Best answer: B

Explanation: Snowflake automatically stores table data in micro-partitions, which are immutable storage units created and managed by the service as data is loaded. Users do not define physical partitions, indexes, or file layouts on disk. Instead, Snowflake’s cloud services layer tracks rich metadata about each micro-partition, such as value ranges and statistics, and uses this metadata to prune unneeded micro-partitions at query time.

Because of this design, the simplest and most aligned pattern for most workloads is to load data into a standard Snowflake table and allow the platform to manage micro-partitions automatically. This meets the requirement to minimize ongoing effort managing physical layout while still benefiting from efficient query performance.

Patterns like manual table sharding, trying to define physical partitions, or rearranging files in external storage all increase operational complexity and do not actually control how micro-partitions are created within Snowflake’s storage layer.


Question 84

Topic: Account Setup, Security, and Governance

In a secure Snowflake deployment, administrators ensure that the PUBLIC role has only minimal privileges (for example, no access to sensitive schemas and only basic usage on shared utilities). Which core security principle does this practice best represent?

Options:

  • A. Separation of duties

  • B. Principle of least privilege

  • C. Security through obscurity

  • D. Defense in depth

Best answer: B

Explanation: In Snowflake, the PUBLIC role is automatically granted to every user and role in the account. If PUBLIC is given broad privileges such as USAGE on sensitive databases or SELECT on important tables, all current and future users inherit that access, often without administrators realizing it.

Keeping PUBLIC restricted to minimal, non-sensitive privileges directly implements the principle of least privilege. This principle states that a subject (user, role, application) should have only the exact permissions needed to accomplish its tasks, and no more. In Snowflake, that means giving PUBLIC only basic capabilities (for example, access to a shared utility schema with non-sensitive functions) while using dedicated custom roles for access to production data, PII, or financial tables.

Over-privileged PUBLIC can lead to unintended data exposure. For example, if PUBLIC is granted SELECT on a customer table, any new user or role automatically gets read access to customer data, even if they were created for an unrelated purpose, such as running internal system checks or temporary testing. Similarly, granting USAGE on an entire database to PUBLIC can make all schemas and many objects discoverable and queryable to everyone.

By instead assigning more specific grants to well-defined roles (like ANALYST, ETL_ENGINEER, DATA_SCIENTIST), and keeping PUBLIC almost empty, administrators ensure that access decisions are explicit and traceable, aligned with least privilege and good governance practices.


Question 85

Topic: Data Loading, Unloading, and Transformation

A team is evaluating Snowflake’s search optimization service to improve query speed. In which scenario is enabling search optimization NOT an appropriate choice and should be avoided as a tuning strategy?

Options:

  • A. A large customer table where most queries filter on CUSTOMER_ID and usually return a single matching row.

  • B. A semi-structured log table in VARIANT format where operations teams often filter on a specific JSON attribute that matches only a few rows per day.

  • C. A wide events table where support teams frequently search for a specific SESSION_ID value to troubleshoot a single user session.

  • D. A 12-billion-row fact table where analysts run dashboards that scan nearly all rows each time to compute aggregates, with no highly selective filters.

Best answer: D

Explanation: Snowflake’s search optimization service is intended to accelerate highly selective lookup queries—for example, point lookups or predicates that return a very small subset of rows in a large table. It creates additional data structures that allow Snowflake to locate matching micro-partitions more directly than with standard pruning alone.

Because there is an extra storage and maintenance cost, it should be reserved for workloads where predicates are very selective and frequently used. It does not help when queries scan most or all of a table’s rows; those patterns are dominated by raw scan and aggregation cost, which are better addressed through warehouse sizing, query design, or data modeling.

In the scenario where dashboards always scan nearly all rows of a 12-billion-row fact table to compute aggregates, search optimization will provide little or no benefit. Snowflake still has to process almost all the data, and the search optimization index structures cannot avoid that work. Instead, using a larger or multi-cluster warehouse, summary tables, or other modeling changes is more appropriate.

In contrast, when queries repeatedly look up specific IDs, sessions, or narrow JSON attributes that match only a tiny fraction of the table, search optimization can dramatically reduce latency by quickly finding only the relevant micro-partitions and avoiding unnecessary scans.
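
For the selective lookup cases, enabling the service is a single statement (names are illustrative):

-- Accelerate point lookups on SESSION_ID for a wide events table
ALTER TABLE analytics.events ADD SEARCH OPTIMIZATION ON EQUALITY(session_id);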


Question 86

Topic: Data Sharing, Collaboration, and Marketplace

A data engineer must build a nightly ELT workflow that runs several SQL transformations in strict order across multiple schemas. The workflow must be reusable for different run dates, support centralized error logging, and ensure all steps succeed or none do.

Which of the following actions/solutions will meet these requirements? (Select TWO.)

Options:

  • A. Implement the transformations inside a stored procedure that runs all SQL steps within a single transaction and rolls back the entire workflow if any step fails.

  • B. Create separate scheduled tasks for each SQL statement, ordering them by schedule time so each task runs a different step of the pipeline.

  • C. Replace the pipeline with a single view that reads directly from the raw tables so the BI tool always queries the latest data instead of running transformations.

  • D. Implement the workflow as a user-defined function (UDF) that contains all transformation SQL and call it from a SELECT statement during the nightly run.

  • E. Create a stored procedure that accepts a run-date parameter, executes all required SQL statements in sequence, and includes exception handling that logs errors and then re-raises them to the caller.

Correct answers: A and E

Explanation: Stored procedures in Snowflake are designed to orchestrate multi-step SQL workflows such as ELT pipelines. They can call multiple SQL statements in a defined sequence, use variables and parameters for reusability, and include procedural error handling to log and control failures.

By putting the pipeline steps inside a stored procedure, the data engineer creates a single callable unit that encapsulates the entire workflow. The procedure can accept inputs like a run date or data source identifier, perform the extract/load/transform steps in order, and use exception handling constructs to log errors (for example, into an audit table) and either continue or stop, depending on requirements.

Using transaction control in a stored procedure allows the workflow to run all steps in one transaction. If any step fails, the procedure can roll back the transaction so no partial changes are committed, satisfying the all-or-nothing requirement. The stored procedure can then be invoked by a task on a schedule, giving a clean orchestration pattern for recurring transformations.
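
A minimal sketch combining both chosen answers, with placeholder object names and deliberately simplified transformation logic:

CREATE OR REPLACE PROCEDURE run_nightly_elt(run_date DATE)
RETURNS STRING
LANGUAGE SQL
AS
$$
BEGIN
  BEGIN TRANSACTION;
  -- Steps run in strict order; all succeed or none are committed
  DELETE FROM mart.daily_sales WHERE sale_date = :run_date;
  INSERT INTO mart.daily_sales
    SELECT sale_date, region, SUM(amount)
    FROM staging.orders
    WHERE sale_date = :run_date
    GROUP BY sale_date, region;
  COMMIT;
  RETURN 'SUCCESS';
EXCEPTION
  WHEN OTHER THEN
    ROLLBACK;
    LET err_msg STRING := SQLERRM;
    INSERT INTO ops.elt_errors (run_date, error_message, logged_at)
      VALUES (:run_date, :err_msg, CURRENT_TIMESTAMP());
    RAISE;  -- re-raise so the caller (for example, a scheduled task) sees the failure
END;
$$;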


Question 87

Topic: Data Sharing, Collaboration, and Marketplace

A team currently loads JSON logs, CSV tables, PDFs, and JPEG images into one VARIANT column. They want to simplify governance by separating structured, semi-structured, and unstructured zones in Snowflake. Which description correctly defines data that should go in the unstructured zone?

Options:

  • A. Relational data stored in tables with fixed columns and data types, such as customer and orders tables.

  • B. Files such as documents, images, audio, video, and other binary objects that lack a predefined tabular or self-describing structure.

  • C. Any compressed or encrypted file, regardless of its internal format, because Snowflake cannot inspect its schema.

  • D. Any data stored in a VARIANT column, such as JSON, Avro, or XML, because its schema is flexible.

Best answer: B

Explanation: In Snowflake, data is often categorized into structured, semi-structured, and unstructured based on how strongly it adheres to a schema.

Structured data fits into relational tables with predefined columns and data types (for example, customer and order tables). Semi-structured data, such as JSON, Avro, ORC, or XML, is self-describing and is typically stored in VARIANT columns; it has an internal structure but does not require a fixed table schema.

Unstructured data covers files that do not have a predefined tabular schema or a standard self-describing format that Snowflake can parse into fields. Typical examples are PDFs, Word documents, images (PNG, JPEG), audio, video, and arbitrary binary files. In Snowflake, these are handled as unstructured file objects stored in stages, with file-level metadata exposed through directory tables or tracked in separate tables that you manage.

In the scenario, the unstructured zone is meant to hold content like PDFs and JPEG images, not JSON logs or CSV tables. The correct definition is therefore the one that describes broad file-based content such as documents and media files without a predefined tabular or self-describing structure.
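As a brief sketch, an unstructured zone is often just a stage with a directory table enabled (the stage name here is illustrative):

CREATE STAGE media_stage
  DIRECTORY = (ENABLE = TRUE);

-- File-level metadata for PDFs, images, and other binary objects,
-- queryable like a table:
SELECT relative_path, size, last_modified
FROM DIRECTORY(@media_stage);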


Question 88

Topic: Data Sharing, Collaboration, and Marketplace

Which of the following statements about querying unstructured data in the Snowflake AI Data Cloud are NOT accurate? (Select TWO.)

Options:

  • A. A common pattern is to extract key attributes from unstructured files into structured or semi-structured columns before applying heavy SQL analytics.

  • B. Querying unstructured data in Snowflake generally has higher latency than querying the same content after it has been loaded into structured tables.

  • C. Unstructured data access is typically better suited for enrichment, lookups, or occasional retrieval than for tight, low-latency query loops.

  • D. Because unstructured data is stored in Snowflake, it automatically benefits from the same micro-partition pruning and statistics as regular tables.

  • E. Unstructured data in Snowflake can be filtered and aggregated using the full range of SQL operations with performance similar to columnar tables.

Correct answers: D and E

Explanation: In Snowflake, unstructured data (such as images, PDFs, or raw text files) is stored and accessed differently from structured or semi-structured table data. It is typically read at the file or object level rather than as finely organized rows and columns.

Because of this, querying unstructured data usually has higher latency and supports fewer direct SQL operations than querying data that has been loaded into structured or semi-structured columns. A common best practice is to use unstructured data as a source, extract important attributes into structured tables, and then perform most analytic work on those derived, structured representations.

Unstructured data is therefore well-suited for occasional retrieval, enrichment, and lookups, but not for workloads that require fast, repeated analytical queries using the full power of SQL and Snowflake’s columnar engine.


Question 89

Topic: Operations, Monitoring, and Business Continuity

A data provider plans to use Snowflake secure data sharing so external partners can query data in near real time without copying it. Which of the following practices should you AVOID? (Select TWO.)

Options:

  • A. Unloading provider tables to CSV files in external cloud storage so each partner can download and keep its own copy

  • B. Requiring each partner to ETL the shared data into its own on-premises database before running analytics

  • C. Creating secure views in the provider account and sharing those views instead of raw tables to restrict sensitive columns

  • D. Scheduling regular updates to provider tables so downstream consumers automatically see new data through the secure share

  • E. Using a reader account so an external partner without its own Snowflake account can directly query shared data in Snowflake

  • F. Granting a consumer Snowflake account access to a secure share so it can query shared objects in place

Correct answers: A and B

Explanation: Snowflake secure data sharing allows a provider to expose selected databases, schemas, tables, and views to consumer accounts or reader accounts without copying underlying data. Consumers query shared objects directly on the provider’s storage, which enables zero-copy access and near real-time visibility into changes.

Practices that export data to files or require separate ETL into another system break these advantages. They create redundant copies, introduce latency, complicate governance, and move away from Snowflake’s native secure sharing model. In contrast, granting access via secure shares, sharing secure views, and using reader accounts all rely on Snowflake’s built-in mechanisms and preserve zero-copy sharing and timely data access.
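A minimal provider/consumer sketch of the native sharing flow, with all account, database, and view names hypothetical:

-- Provider account: share a secure view rather than raw tables
CREATE SHARE retail_share;
GRANT USAGE ON DATABASE sales TO SHARE retail_share;
GRANT USAGE ON SCHEMA sales.analytics TO SHARE retail_share;
GRANT SELECT ON VIEW sales.analytics.orders_secure_v TO SHARE retail_share;
ALTER SHARE retail_share ADD ACCOUNTS = partner_org.partner_acct;

-- Consumer account: query the shared objects in place, with no copy
CREATE DATABASE sales_shared FROM SHARE provider_org.provider_acct.retail_share;
SELECT COUNT(*) FROM sales_shared.analytics.orders_secure_v;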


Question 90

Topic: Data Sharing, Collaboration, and Marketplace

A team uses the Snowflake AI Data Cloud to process image files stored in an internal stage. Each day they must: 1) scan the directory table for new files, 2) call an external function to analyze each image, 3) update a metadata table, and 4) capture any errors in a central log so the orchestration task does not fail on a single bad file.

They want to coordinate all steps in one place and implement robust error handling.

Which of the following actions/solutions will meet these requirements? (Select TWO.)

Options:

  • A. Create a stream on the directory table and a task that calls the external function directly for each new row, allowing the task to fail on errors and be retried manually if necessary.

  • B. Configure a Snowpipe on the stage so each new file automatically invokes the external function, and rely solely on load history views for error tracking.

  • C. Define a materialized view on the directory table that automatically updates file metadata and marks rows with an error flag whenever analysis fails.

  • D. Create a stored procedure that reads new entries from the directory table, calls an external function for each file, and writes success or failure status for each file into a control table.

  • E. Implement a stored procedure that wraps scanning the directory table, calling the external function, and updating the metadata table in a TRY/CATCH-style block, logging any exceptions to an error table while allowing the scheduled task to complete.

Correct answers: D and E

Explanation: Stored procedures in the Snowflake AI Data Cloud are well suited to orchestrating complex unstructured data workflows because they can perform multiple SQL operations in sequence, call external functions or Snowpark logic, and implement centralized error handling (for example, by catching errors and writing them to an error log table).

In this scenario, the team needs a single place to coordinate reading a directory table of unstructured files, invoking an external analysis step, updating metadata, and recording any errors without causing the entire scheduled task to fail. The best solutions therefore use stored procedures as the orchestration layer, with tasks simply triggering the procedure on a schedule.

Approaches that rely only on Snowpipe, streams, tasks, or materialized views will not provide the requested central orchestration and robust error handling in a stored procedure, even if they can react to new files or maintain some metadata automatically.
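A rough Snowflake Scripting sketch of that orchestration layer; the stage, tables, and the analyze_image() external function are hypothetical, and filtering for only new files (for example, via a stream on the directory table) is omitted for brevity:

CREATE OR REPLACE PROCEDURE process_new_images()
RETURNS STRING
LANGUAGE SQL
AS
$$
DECLARE
  c1 CURSOR FOR SELECT relative_path FROM DIRECTORY(@img_stage);
BEGIN
  FOR rec IN c1 DO
    LET fpath STRING := rec.relative_path;
    BEGIN
      -- analyze_image() stands in for the team's external function
      INSERT INTO image_metadata (path, labels)
        SELECT :fpath, analyze_image(:fpath);
      INSERT INTO file_control (path, status) VALUES (:fpath, 'OK');
    EXCEPTION
      WHEN OTHER THEN
        -- log the failure and continue so one bad file does not fail the run
        INSERT INTO file_control (path, status, error_message)
          VALUES (:fpath, 'FAILED', :sqlerrm);
    END;
  END FOR;
  RETURN 'DONE';
END;
$$;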


Question 91

Topic: Snowflake Architecture and Key Features

A data engineering team needs to implement a nightly pipeline entirely inside the Snowflake AI Data Cloud. The pipeline must: (1) run several DML steps in sequence (archiving, loading, and table swaps), (2) perform data-quality checks and, based on the results, either continue or abort, and (3) write detailed error information to a log table before returning a status code. The process will be triggered by a scheduled task but the team wants the orchestration and control flow to live in a single reusable object in Snowflake, not in an external tool. Which Snowflake feature is the most appropriate to implement this orchestration logic?

Options:

  • A. Define a view that joins staging and history tables and uses CASE expressions to implement the required branching logic

  • B. Create a stored procedure that executes the DML statements with conditional logic and error handling, and invoke it from a scheduled task

  • C. Create a SQL UDF that runs the data-quality logic and calls all required DML statements when evaluated in a query

  • D. Configure a chain of separate tasks, each running a single SQL statement on a schedule, and rely on task dependencies for branching

Best answer: B

Explanation: This scenario focuses on orchestrating a multi-step, data-dependent workflow inside Snowflake. The team needs to run multiple DML steps in sequence, evaluate data-quality checks, branch based on those results, log errors, and return a status. That is more than a single query: it is a procedural workflow with control flow and side effects.

In Snowflake, stored procedures are the primary object for implementing such logic. They allow you to execute multiple SQL statements, implement control flow (for example, IF/ELSE, loops, TRY/CATCH-style error handling), write to log tables, and return values. A task can then call the stored procedure on a schedule, giving you scheduling plus internal orchestration.

User-defined functions (UDFs), by contrast, are meant to be invoked inside SQL statements for per-row or per-set computations. They do not serve as orchestrators for multi-step pipelines and are not designed to control transactions or sequences of independent DML operations.

Other objects like tasks and views have important roles but are not sufficient by themselves to implement the sort of procedural, branching workflow described in the scenario. Tasks schedule and trigger work; views define reusable query logic. For end-to-end orchestration and control flow, the correct choice is a stored procedure.
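For example, the scheduling piece stays thin while the logic lives in the procedure (names are illustrative):

CREATE OR REPLACE TASK nightly_pipeline_task
  WAREHOUSE = etl_wh
  SCHEDULE = 'USING CRON 0 2 * * * UTC'
AS
  CALL run_nightly_pipeline(CURRENT_DATE());

ALTER TASK nightly_pipeline_task RESUME;  -- tasks are created suspended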


Question 92

Topic: Data Sharing, Collaboration, and Marketplace

In the Snowflake AI Data Cloud, which object is BEST suited to orchestrate multiple SQL transformation steps with defined sequencing, basic error handling, and reusable logic that can be invoked with a single call?

Options:

  • A. A task that runs a single SQL statement on a schedule

  • B. A stored procedure that executes a series of SQL statements in order

  • C. A view that combines data from multiple tables

  • D. A stream that tracks row-level changes on a table

Best answer: B

Explanation: Stored procedures in the Snowflake AI Data Cloud are designed to orchestrate complex transformation workflows that require more than a single SQL statement. They can execute multiple SQL commands in a defined order, apply conditional logic, manage transactions, and implement basic error handling. Because they are defined as database objects, they can be invoked repeatedly using a simple CALL statement, promoting reuse across pipelines and teams.

In a typical pattern, a stored procedure might perform several steps: staging data, merging into target tables, running validation queries, and logging results. All of these can be packaged into one procedure. If details change (for example, a new validation step is added), only the procedure definition needs to be updated, while callers continue to use the same interface.

By contrast, tasks, views, and streams each play important but different roles. Tasks primarily schedule or trigger work, views define reusable queries, and streams track changes. None of these is meant to encapsulate multi-step procedural orchestration with sequencing and error handling in the way that a stored procedure does.


Question 93

Topic: Data Loading, Unloading, and Transformation

A team runs daytime BI dashboards and ad-hoc queries with moderate concurrency and wants to keep Snowflake credit costs low while maintaining reasonable responsiveness. Which virtual warehouse configurations SHOULD YOU AVOID? (Select TWO.)

Options:

  • A. Start with a small warehouse and manually scale up to a larger size only during known peak hours.

  • B. Use a small multi-cluster warehouse with min=1 and max=3 clusters and auto-scale enabled.

  • C. Configure auto-suspend to 1 minute and enable auto-resume on a medium warehouse.

  • D. Disable auto-suspend and run a large warehouse continuously to avoid any cold-start latency.

  • E. Set a multi-cluster warehouse with min=5 and max=5 clusters for about 10 daytime users running light queries.

Correct answers: D and E

Explanation: Virtual warehouse settings in the Snowflake AI Data Cloud strongly influence both performance and cost. Features like auto-suspend and auto-resume help eliminate idle compute spend, while multi-cluster settings control how many clusters run concurrently to handle concurrent queries.

For a daytime BI workload with moderate concurrency, the main cost risk is running excessive compute when there is little or no activity. Long-lived or always-on large warehouses and high minimum cluster counts can consume a lot of credits with minimal performance gain relative to the workload’s needs.

Good practice is to keep warehouses as small and short-lived as is reasonable while still meeting responsiveness expectations. Short auto-suspend times, auto-resume, and modest multi-cluster ranges (with a low minimum) all help contain costs while providing flexibility for peaks.

The configurations to avoid are those that force large or many clusters to stay running constantly for a relatively small, predictable workload, because they violate basic cost optimization principles without providing proportional performance benefits.
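A cost-conscious configuration for this kind of workload might look like the following (warehouse name is illustrative; multi-cluster settings require Enterprise Edition):

CREATE OR REPLACE WAREHOUSE bi_wh
  WAREHOUSE_SIZE = 'SMALL'
  AUTO_SUSPEND = 60            -- suspend after 60 seconds of inactivity
  AUTO_RESUME = TRUE
  MIN_CLUSTER_COUNT = 1        -- scale out only when concurrency demands it
  MAX_CLUSTER_COUNT = 3
  SCALING_POLICY = 'STANDARD';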


Question 94

Topic: Data Loading, Unloading, and Transformation

Which of the following statements about standard (single-cluster) and multi-cluster virtual warehouses in Snowflake is NOT correct?

Options:

  • A. When concurrent demand drops, a multi-cluster warehouse can automatically shut down extra clusters, leaving fewer active clusters to reduce compute usage.

  • B. A multi-cluster warehouse can split a single long-running query across multiple clusters so that one query runs in parallel on several clusters.

  • C. A multi-cluster warehouse can automatically start additional clusters within a configured range when concurrent query load increases.

  • D. A standard warehouse uses a single compute cluster, so high concurrency can cause queries to queue on that cluster.

Best answer: B

Explanation: Standard (single-cluster) warehouses in Snowflake always use a single compute cluster. If many users or workloads run queries concurrently and exceed that cluster’s capacity, Snowflake may queue some queries until resources become available.

Multi-cluster warehouses are designed to handle higher concurrency by adding or removing entire clusters within a configured minimum and maximum. When concurrency increases, Snowflake may start additional clusters so more queries can run simultaneously. When concurrency decreases, it can shut down extra clusters to reduce compute usage.

However, Snowflake never splits a single query across multiple clusters. Each query executes entirely on one cluster, even when multiple clusters are active. Multi-cluster behavior improves concurrency (more queries at once), not the parallelism of an individual query.


Question 95

Topic: Account Setup, Security, and Governance

A data platform team wants to simplify permission management as analysts frequently move between departments. They also want consistent, auditable access aligned with Snowflake’s RBAC model. Which approach is the most appropriate?

Options:

  • A. Grant all required object privileges directly to each user account and manually revoke and re-grant them whenever a user changes departments.

  • B. Give all analysts the SYSADMIN role and rely on application-level filters to control which data each person can see in their department.

  • C. Create one shared analyst user per department with a powerful role, and have all analysts log in with that shared account to simplify role management.

  • D. Create department-specific roles, grant required object privileges to those roles, and assign or revoke the roles from user accounts as people move.

Best answer: D

Explanation: Snowflake’s role-based access control (RBAC) model is built around granting privileges to roles and then assigning those roles to users. Users do not receive object privileges directly; instead, they inherit permissions from the roles they hold. This structure allows administrators to manage access centrally and adjust it easily as people change responsibilities.

The key goal in the scenario is to simplify permission changes when analysts move between departments while keeping access auditable and consistent with Snowflake’s RBAC approach. The most effective way to do this is to design department- or function-specific roles, grant all necessary database and schema privileges to those roles, and then assign or revoke roles for individual users as they move. This keeps the privilege model stable and shifts churn to role membership, which is much easier to manage.

By contrast, per-user privilege management, shared accounts, or granting powerful system roles broadly either conflict with Snowflake’s RBAC design or undermine security and governance, even if they might appear to reduce short-term configuration work.
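A small sketch of the role-centric pattern, with hypothetical role, schema, and user names:

CREATE ROLE finance_analyst;
GRANT USAGE ON DATABASE finance TO ROLE finance_analyst;
GRANT USAGE ON SCHEMA finance.reporting TO ROLE finance_analyst;
GRANT SELECT ON ALL TABLES IN SCHEMA finance.reporting TO ROLE finance_analyst;

-- When an analyst changes departments, only role membership changes:
GRANT ROLE finance_analyst TO USER jdoe;
REVOKE ROLE marketing_analyst FROM USER jdoe;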


Question 96

Topic: Data Loading, Unloading, and Transformation

Under which condition can Snowflake automatically rewrite a user’s query to use a materialized view instead of the base table?

Options:

  • A. When the materialized view and the base table are in the same schema and owned by the same role, regardless of query structure

  • B. When the SQL text of the query exactly matches the materialized view’s defining query, including whitespace and alias names

  • C. When the user references the materialized view name in a query hint that instructs the optimizer to use it

  • D. When the query is logically equivalent to (or a subset of) the materialized view definition, with compatible filters, projections, and aggregations

Best answer: D

Explanation: Snowflake’s optimizer can transparently rewrite a query to read from a materialized view instead of the underlying base table when it can prove that the view covers the query. In practice, this means the query’s filters, projections, joins, and aggregations must fit within what the materialized view already computes, so that reading the view returns exactly the same result as querying the base table.

This behavior allows performance improvements (by leveraging precomputed and maintained results) without changing application SQL. However, it is driven by logical query equivalence, not by string matching, hints, or object placement alone.
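As an illustration (object names hypothetical; materialized views require Enterprise Edition), a query that never mentions the view may still be served from it:

CREATE MATERIALIZED VIEW daily_sales_mv AS
  SELECT order_date, region, SUM(amount) AS total_amount
  FROM sales.orders
  GROUP BY order_date, region;

-- Because its filter, grouping, and aggregation fit within the view
-- definition, the optimizer may read daily_sales_mv instead of sales.orders:
SELECT order_date, region, SUM(amount)
FROM sales.orders
WHERE region = 'EMEA'
GROUP BY order_date, region;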


Question 97

Topic: Snowflake Architecture and Key Features

A security architect wants to tighten access control in a Snowflake account by ensuring that authentication, role-based access control (RBAC), and metadata-driven governance behave consistently, regardless of which virtual warehouse or database a user queries. Which Snowflake component is primarily responsible for providing this centralized control?

Options:

  • A. The storage layer that holds micro-partitions

  • B. Virtual warehouses

  • C. External cloud storage integrated with Snowflake stages

  • D. Cloud services layer

Best answer: D

Explanation: In the Snowflake AI Data Cloud, the cloud services layer is responsible for core control-plane functions such as authentication, role-based access control (RBAC), metadata management, query optimization, and transaction coordination. These responsibilities are deliberately separated from compute (virtual warehouses) and storage so that governance and control remain consistent regardless of which warehouse runs the query or where the data is physically stored.

When a user logs in, the cloud services layer authenticates the user, evaluates their roles and privileges, and determines what objects they can access. It uses metadata catalogs to understand objects (databases, schemas, tables, views, policies) and applies security rules such as RBAC, row access policies, and masking policies. It also parses and optimizes SQL, generates execution plans, and coordinates transactions across warehouses.

This centralized control is what allows the architect’s requirement—tighter, consistent access control that is independent of any specific virtual warehouse or database—to be met by the cloud services layer and not by compute or storage components.


Question 98

Topic: Query Processing and Performance Optimization

A data engineer is designing a secure ingestion strategy and comparing internal and external stages in Snowflake. Which of the following statements about stages is INCORRECT?

Options:

  • A. Internal stages store files in Snowflake-managed storage, while external stages reference files that remain in external cloud storage owned by the customer.

  • B. External stages typically require storage credentials or a storage integration to access external cloud storage, while internal stages do not require external cloud credentials.

  • C. Data in an external stage is stored in Snowflake-managed storage and billed as Snowflake storage, just like data in internal stages.

  • D. Internal stages can be created as user, table, or named stages, whereas external stages are defined only as named stages that reference external cloud locations.

Best answer: C

Explanation: Snowflake stages provide locations for data files used in loading and unloading operations. Internal stages are fully managed by Snowflake, storing files inside Snowflake’s own storage layer. External stages, in contrast, are pointers to locations in external cloud storage, such as object storage buckets, where the customer continues to own and manage the data.

Because internal stages live entirely within Snowflake, they do not require external storage credentials, and Snowflake handles encryption and access control. External stages must reference external storage URLs and typically use credentials or a storage integration so Snowflake can read and/or write the files during COPY operations.

Internal stages can exist at different scopes (user, table, or named), which affects who can access them. External stages are always named database objects that reference an external location. Importantly, data in an external stage is not stored in Snowflake-managed storage and is not billed as Snowflake storage; it remains in the customer’s external cloud storage account.
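A short comparison in DDL form, with bucket, integration, and table names as placeholders:

-- Named internal stage: files live in Snowflake-managed storage
CREATE STAGE raw_internal_stage;

-- Named external stage: files stay in customer-owned cloud storage
CREATE STAGE raw_external_stage
  URL = 's3://example-bucket/ingest/'
  STORAGE_INTEGRATION = my_s3_int;

-- COPY is written the same way against either stage
COPY INTO raw.events
FROM @raw_external_stage
FILE_FORMAT = (TYPE = 'JSON');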


Question 99

Topic: Operations, Monitoring, and Business Continuity

A data engineer in the provider account runs the following query in Snowsight:

SELECT share_name, kind, owner_account,
       consumer_account, managed_account_name, enabled
FROM snowflake.account_usage.shares
WHERE share_name IN ('RETAIL_ANALYTICS_READER', 'RETAIL_ANALYTICS_DIRECT');

Exhibit:

SHARE_NAME                 KIND      OWNER_ACCOUNT   CONSUMER_ACCOUNT   MANAGED_ACCOUNT_NAME   ENABLED
RETAIL_ANALYTICS_READER    OUTBOUND  ORG_PROD_ACCT   NULL               RETAIL_READER_01       1
RETAIL_ANALYTICS_DIRECT    OUTBOUND  ORG_PROD_ACCT   ACME_PROD_ACCT     NULL                   1

Based on the exhibit, what is the most accurate conclusion about how the retailer accesses the RETAIL_ANALYTICS_READER share?

Options:

  • A. The retailer uses a Snowflake reader account created and managed in the provider account, so they do not need their own Snowflake account.

  • B. The retailer connects through its own Snowflake account ACME_PROD_ACCT and pays for all compute used to query the shared data.

  • C. The retailer has full read-write access to the provider’s underlying tables because the share is ENABLED and OUTBOUND.

  • D. The retailer can further share RETAIL_ANALYTICS_READER data with other Snowflake accounts because it is an OUTBOUND share.

Best answer: A

Explanation: The exhibit shows two outbound shares from the provider account ORG_PROD_ACCT. The key columns are CONSUMER_ACCOUNT and MANAGED_ACCOUNT_NAME.

For RETAIL_ANALYTICS_DIRECT, CONSUMER_ACCOUNT is ACME_PROD_ACCT and MANAGED_ACCOUNT_NAME is NULL, which indicates a standard data sharing relationship with an existing Snowflake consumer account.

For RETAIL_ANALYTICS_READER, CONSUMER_ACCOUNT is NULL and MANAGED_ACCOUNT_NAME is RETAIL_READER_01. In Snowflake, this pattern identifies a reader account: a managed account that the provider creates and controls for a consumer who does not have their own Snowflake account. The provider pays for the compute used by queries in that reader account, and the consumer accesses shared objects read-only through that managed environment.

Therefore, the most accurate conclusion is that the retailer is accessing the shared data through a Snowflake reader account created and managed in the provider account, rather than through its own independent Snowflake account.
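For context, a reader account is created and owned by the provider; the values below are placeholders:

CREATE MANAGED ACCOUNT retail_reader_01
  ADMIN_NAME = reader_admin,
  ADMIN_PASSWORD = 'ChangeMe-Str0ng!',
  TYPE = READER;

-- The provider then adds the new account to the share and creates the
-- warehouses inside the reader account, so the provider also pays for
-- the compute the retailer uses.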


Question 100

Topic: Data Loading, Unloading, and Transformation

Which TWO statements about how Snowflake database replication and failover affect storage and compute costs are correct? (Select TWO.)

Options:

  • A. Each replicated database copy incurs standard Snowflake storage charges in every account or region where it is stored.

  • B. Replication operations themselves consume compute resources, even if no user queries are run on the secondary account.

  • C. Virtual warehouses are automatically replicated with databases, eliminating the need for additional compute to recreate them after failover.

  • D. After failover, all queries against the new primary database are billed to virtual warehouses in the original primary account.

  • E. Periodic failover testing is effectively free because switching primary/secondary roles does not use any compute resources.

Correct answers: A and B

Explanation: Snowflake database and failover group replication improve availability and disaster recovery by maintaining additional copies of your data in other regions or accounts. However, each replica is a full data copy, so you pay storage for both the primary and every secondary. In addition, replication and failover are active processes that use Snowflake compute resources, and any test or real failover where you run workloads on the secondary will incur compute costs in that environment.

Virtual warehouses are not included in data replication; they must be defined in each account where you run queries. When a failover occurs, queries are executed in the target account’s warehouses and are billed there. This means careful planning is needed: limit replication to necessary objects, choose appropriate replication frequency, and schedule failover tests thoughtfully to balance resilience and cost.
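A condensed sketch of the replication commands involved (organization, account, and database names are hypothetical):

-- Primary account: allow the database to be replicated to another account
ALTER DATABASE sales ENABLE REPLICATION TO ACCOUNTS myorg.dr_account;

-- Secondary account: create the replica (billed as storage there) and refresh it
CREATE DATABASE sales AS REPLICA OF myorg.prod_account.sales;
ALTER DATABASE sales REFRESH;

-- Warehouses are not replicated; the DR account needs its own
CREATE WAREHOUSE dr_wh
  WAREHOUSE_SIZE = 'SMALL'
  AUTO_SUSPEND = 60
  AUTO_RESUME = TRUE;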


Continue with full practice

Use the SnowPro Core COF-C02 Practice Test page for the full IT Mastery route, mixed-topic practice, timed mock exams, explanations, and web/mobile app access.

Try SnowPro Core COF-C02 on Web View SnowPro Core COF-C02 Practice Test

Free review resource

Read the SnowPro Core COF-C02 Cheat Sheet on Tech Exam Lexicon for concept review before another timed run.

Revised on Thursday, May 14, 2026