Try 100 free SnowPro Core COF-C02 questions across the exam domains, with explanations, then continue with full IT Mastery practice.
This free full-length SnowPro Core COF-C02 practice exam includes 100 original IT Mastery questions across the exam domains.
These questions are for self-assessment. They are not official exam questions and do not imply affiliation with the exam sponsor.
Count note: this page uses the full-length practice count maintained in the Mastery exam catalog. Some certification vendors publish total questions, scored questions, duration, or unscored/pretest-item rules differently; always confirm exam-day rules with the sponsor.
Need concept review first? Read the SnowPro Core COF-C02 Cheat Sheet on Tech Exam Lexicon, then return here for timed mocks and full IT Mastery practice.
Open the matching IT Mastery practice page for timed mocks, topic drills, progress tracking, explanations, and full practice.
Try SnowPro Core COF-C02 on the web, or view the full SnowPro Core COF-C02 practice page.
| Domain | Weight |
|---|---|
| Snowflake Architecture and Key Features | 20% |
| Account Setup, Security, and Governance | 20% |
| Data Loading, Unloading, and Transformation | 20% |
| Query Processing and Performance Optimization | 15% |
| Data Sharing, Collaboration, and Marketplace | 15% |
| Operations, Monitoring, and Business Continuity | 10% |
Use this as one diagnostic run. IT Mastery gives you timed mocks, topic drills, analytics, code-reading practice where relevant, and full practice.
Topic: Data Loading, Unloading, and Transformation
In Snowflake, which principle best describes the purpose of using ARRAY-construction functions when working with semi-structured data?
Options:
A. They create in-query lists of values as array objects so multiple values can be handled together as a single semi-structured value.
B. They optimize micro-partition pruning by physically clustering related values together on storage.
C. They enforce row-level security rules by hiding specific elements in semi-structured data.
D. They automatically normalize semi-structured data into fully relational tables with one row per array element.
Best answer: A
Explanation: ARRAY-construction functions in Snowflake are used to build arrays—ordered lists of values—directly inside SQL queries. Instead of handling each value separately in multiple columns or rows, you can group them into a single array value, usually stored in a VARIANT column or passed as a parameter.
For example, you might construct an array from literals or expressions like:
SELECT ARRAY_CONSTRUCT(10, 20, 30) AS amounts;
This produces a single semi-structured value representing the list [10, 20, 30]. You can then store this value, pass it to a UDF, or later use functions like FLATTEN or array access syntax to read or transform its elements.
The core principle is: ARRAY-construction functions let you treat multiple related values as one semi-structured object inside Snowflake SQL, which is especially useful when working with semi-structured data patterns such as JSON-like lists or when you want to bundle parameters or keys together in a single value.
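A minimal sketch of the round trip described above, using FLATTEN to expand the constructed array back into rows (the literals are illustrative):

```sql
-- Build an array in-query: one row containing [10, 20, 30].
SELECT ARRAY_CONSTRUCT(10, 20, 30) AS amounts;

-- Expand the same array back into one row per element with FLATTEN.
SELECT f.index, f.value
FROM TABLE(FLATTEN(INPUT => ARRAY_CONSTRUCT(10, 20, 30))) f;
-- three rows: (0, 10), (1, 20), (2, 30)
```

FLATTEN's INDEX and VALUE output columns give the element position and the element itself, which is the usual way to read back values bundled by ARRAY_CONSTRUCT.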
Topic: Operations, Monitoring, and Business Continuity
A company currently shares Snowflake data with partners by unloading daily files to external cloud storage. Partners use Snowflake accounts in different regions and cloud platforms. They now want near-real-time, copy-free access with minimal operations. Which approach is the best fit?
Options:
A. Increase the unload frequency to every few minutes and let partners load the files into their own Snowflake accounts using automated COPY jobs.
B. Enable Snowflake database replication to the partners’ regions, then create secure shares from the replicated database so each partner queries the data directly in its own Snowflake account.
C. Create a single large virtual warehouse and share its credentials with partners so they can run read-only queries directly in the provider’s Snowflake account.
D. Create reader accounts for each partner in their own regions without using replication, then grant the reader accounts access to the existing database.
Best answer: B
Explanation: Snowflake data sharing is designed to let providers expose data to consumers without copying data files. Within the same region and cloud platform, a secure share points directly to the provider’s database, and the consumer creates a database from that share. The consumer then queries the shared data as if it were local, with no ingestion pipelines.
For cross-region or cross-cloud scenarios, Snowflake uses replication-backed sharing. The provider designates a database for replication and configures it to replicate to one or more target accounts in other regions or cloud platforms. In each target, Snowflake maintains a read-only replica of the provider’s database. From that replica, the provider (or owning account) creates secure shares or listings for consumers in that region.
This pattern gives consumers near-real-time access to shared data in their own Snowflake accounts while avoiding file exports, manual COPY operations, and extra storage copies. It also keeps strong governance boundaries because each consumer has its own account, roles, and warehouses, while the provider controls what is shared and how often it is replicated.
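A rough sketch of the replicate-then-share flow, with placeholder account identifiers and object names; exact steps vary depending on whether you use direct shares, listings, or replication groups:

```sql
-- Provider account (source region): allow the database to replicate.
ALTER DATABASE sales_db ENABLE REPLICATION TO ACCOUNTS myorg.provider_eu;

-- Provider's account in the target region: create and refresh the replica.
CREATE DATABASE sales_db AS REPLICA OF myorg.provider_us.sales_db;
ALTER DATABASE sales_db REFRESH;

-- From that region, expose the data through a secure share.
CREATE SHARE sales_share;
GRANT USAGE ON DATABASE sales_db TO SHARE sales_share;
GRANT USAGE ON SCHEMA sales_db.public TO SHARE sales_share;
GRANT SELECT ON TABLE sales_db.public.orders TO SHARE sales_share;
ALTER SHARE sales_share ADD ACCOUNTS = myorg.partner_eu;
```

The partner then creates a database from the share in its own account and queries it directly, with no file exports or COPY jobs.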
Topic: Query Processing and Performance Optimization
Which statement BEST describes when Snowflake can reuse the result cache to answer a query?
Options:
A. When the source data files are already stored in an internal stage, allowing COPY commands to load data faster without scanning cloud storage again.
B. When a query uses Time Travel to read historical data from a table, so the historical result is automatically served from the result cache.
C. When a table’s micro-partitions are already in the local warehouse cache, so subsequent queries never need to read from Snowflake storage.
D. When an identical query is re-executed, the underlying data has not changed, and the previous result is still within the result cache retention period, even if the warehouse is suspended.
Best answer: D
Explanation: The Snowflake result cache stores complete results of queries that have already been executed. If the same query (including text and relevant session settings) is run again, Snowflake first checks whether a valid cached result exists. If the underlying data referenced by the query has not changed since the previous execution and the cached result is still within its retention period, Snowflake can return the stored result directly.
This behavior is independent of whether the virtual warehouse is currently running; Snowflake can serve the cached result without consuming additional warehouse compute credits. This makes the result cache different from the warehouse cache, which speeds up data reads but still requires the query to run on a warehouse and scan data blocks.
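A minimal sketch of the behavior, assuming a table named sales; USE_CACHED_RESULT is the session parameter that controls result reuse:

```sql
-- Result reuse is on by default; it can be toggled per session.
ALTER SESSION SET USE_CACHED_RESULT = TRUE;

SELECT region, SUM(amount) FROM sales GROUP BY region;  -- runs on the warehouse

SELECT region, SUM(amount) FROM sales GROUP BY region;  -- identical text: may be
-- served from the result cache without warehouse compute, provided the
-- underlying data is unchanged and the cached result is still valid.
```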
Topic: Query Processing and Performance Optimization
Which statement BEST describes the trade-off when using Snowflake’s approximate aggregate functions such as APPROX_COUNT_DISTINCT instead of exact aggregates like COUNT(DISTINCT ...)?
Options:
A. They ignore a random subset of rows, which makes them unsuitable for any production workload in Snowflake.
B. They are only available for VARIANT columns and semi-structured data, where exact aggregation is not supported.
C. They sacrifice some accuracy in exchange for lower latency and reduced compute resource usage on large datasets.
D. They always return exactly the same result as COUNT(DISTINCT ...) but use more compute resources to guarantee consistency.
Best answer: C
Explanation: Snowflake’s approximate aggregate functions, such as APPROX_COUNT_DISTINCT, use probabilistic algorithms to estimate results instead of computing exact values. This approach typically requires fewer resources and completes more quickly, especially on very large datasets, making them well-suited for exploratory analytics, dashboards, or use cases where a small error margin is acceptable. However, because the results are approximate, they are not appropriate for use cases that require exact numbers, such as regulatory or financial reporting.
Exact aggregates like COUNT(DISTINCT ...) scan and compute the precise result, which can be more expensive and slower on large data volumes but provide guaranteed accuracy. Choosing between approximate and exact functions is therefore a trade-off between performance/cost and numerical precision.
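The trade-off can be seen side by side; the table and column names here are illustrative:

```sql
-- Exact distinct count: precise, but must scan and deduplicate everything.
SELECT COUNT(DISTINCT user_id) AS exact_users FROM page_views;

-- Approximate distinct count (HyperLogLog-based): faster and cheaper on
-- large data, at the cost of a small relative error in the estimate.
SELECT APPROX_COUNT_DISTINCT(user_id) AS approx_users FROM page_views;
```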
Topic: Snowflake Architecture and Key Features
Which statement BEST describes a Snowflake share in the context of data sharing between accounts?
Options:
A. A Snowflake share is a metadata object in a provider account that references specific database objects and grants read-only access to those objects to one or more consumer accounts without copying the data.
B. A Snowflake share is a special role that, when granted to another account, automatically provides both read and write access to selected databases and schemas.
C. A Snowflake share is a full physical copy of a database that is exported from one account and imported into another, creating independent storage in each account.
D. A Snowflake share is a virtual warehouse configuration that allows multiple accounts to run queries on the same compute resources for shared cost savings.
Best answer: A
Explanation: A Snowflake share is a metadata object that a data provider defines to expose specific, read-only database objects (such as schemas, tables, and secure views) to one or more consumer accounts. The actual data is not copied; consumers create a database from the share in their own account and query the provider’s underlying storage, enabling governed, zero-copy data sharing.
The choice describing a full physical copy being exported and imported confuses sharing with data duplication. Snowflake’s sharing model is explicitly designed to avoid extra storage and to keep a single source of truth managed by the provider, which is why the metadata-based definition is correct.
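A sketch of both sides of a direct share, with placeholder database, view, and account names:

```sql
-- Provider account: define the share and grant read-only access.
CREATE SHARE customer_share;
GRANT USAGE ON DATABASE crm TO SHARE customer_share;
GRANT USAGE ON SCHEMA crm.public TO SHARE customer_share;
GRANT SELECT ON VIEW crm.public.customers_v TO SHARE customer_share;
ALTER SHARE customer_share ADD ACCOUNTS = myorg.consumer_acct;

-- Consumer account: mount the share as a database and query it.
CREATE DATABASE crm_shared FROM SHARE myorg.provider_acct.customer_share;
SELECT * FROM crm_shared.public.customers_v;
```

No data is copied at any point; the consumer's database is metadata over the provider's storage.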
Topic: Data Sharing, Collaboration, and Marketplace
Which TWO statements about fraction-based vs fixed-size sampling using the SAMPLE clause in Snowflake are INCORRECT? (Select TWO.)
Options:
A. Fixed-size sampling guarantees that Snowflake scans only the number of rows requested, so it is always cheaper than fraction-based sampling.
B. Fraction-based sampling always returns exactly the same number of rows, even when you rerun the query on an unchanged table.
C. With fraction-based sampling, if the table size doubles and you keep the same percentage, the expected number of sampled rows roughly doubles.
D. Using SAMPLE (10) performs fraction-based sampling, returning roughly 10% of the rows from the table.
E. Using SAMPLE (100 ROWS) performs fixed-size sampling, requesting a sample based on a specific number of rows instead of a percentage.
Correct answers: A and B
Explanation: Snowflake’s SAMPLE clause can work in two main ways: fraction-based and fixed-size sampling.
Fraction-based sampling is used when you specify only a numeric value, such as SAMPLE (10). In this case, Snowflake interprets the number as a percentage and returns approximately that fraction of the table’s rows. As the table grows, the expected number of sampled rows grows proportionally.
Fixed-size sampling is used when you include the ROWS (or ROW) keyword, such as SAMPLE (100 ROWS). Here, you request a sample based on a target number of rows rather than a percentage. This is useful when you want roughly the same sample size even if table size changes.
Both sampling modes are random: the exact set of rows and even the exact number of rows returned can vary between executions. Also, Snowflake compute cost is based on warehouse time, not directly on the number of rows returned by SAMPLE, so you cannot assume that fixed-size sampling is always cheaper than fraction-based sampling.
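The two modes in the explanation look like this in practice (table name illustrative):

```sql
-- Fraction-based: roughly 10% of the rows; the expected sample size
-- grows proportionally as the table grows.
SELECT * FROM web_events SAMPLE (10);

-- Fixed-size: a sample targeting 100 rows, regardless of table size.
SELECT * FROM web_events SAMPLE (100 ROWS);
```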
Topic: Snowflake Architecture and Key Features
A data engineering team is designing a Snowflake table to store semi-structured JSON web event data along with core user attributes used in joins and filters. Which of the following approaches should they AVOID? (Select TWO.)
Options:
A. Store all fields, including stable user IDs and primary keys, together in a single VARIANT column for maximum flexibility.
B. Land raw JSON in a VARIANT column, then parse out commonly queried attributes into dedicated typed columns in a later transformation step.
C. Use VARIANT columns to capture optional or evolving event attributes while keeping frequently filtered attributes as native NUMBER, VARCHAR, or TIMESTAMP columns.
D. Use FLATTEN and path expressions to query arrays and nested objects inside VARIANT when analyzing clickstream behavior.
E. Convert incoming JSON to a single large VARCHAR column before loading so that semi-structured capabilities are not used.
Correct answers: A and E
Explanation: Snowflake’s VARIANT data type is designed to store semi-structured data such as JSON, Avro, or Parquet while still allowing efficient querying using path expressions and functions like FLATTEN. However, VARIANT is not a replacement for all structured columns.
A good design typically combines native structured types (NUMBER, VARCHAR, TIMESTAMP, etc.) for stable, frequently joined or filtered attributes with VARIANT columns for flexible or evolving attributes. This preserves type checking, improves query performance and statistics, and keeps schemas understandable, while still allowing schema-on-read behavior for less predictable data.
Patterns that either push everything into a single VARIANT or ignore VARIANT entirely by storing JSON as plain text both fail to take advantage of Snowflake’s architecture and make the data harder to use.
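The land-raw-then-promote pattern from option B can be sketched as follows; all names are illustrative, and the casts assume the JSON carries ISO-formatted timestamps:

```sql
-- Land the raw JSON in a VARIANT column first.
CREATE TABLE raw_events (
    payload   VARIANT,
    loaded_at TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP()
);

-- Later, promote commonly queried attributes to typed columns.
CREATE TABLE events AS
SELECT
    payload:user_id::NUMBER           AS user_id,
    payload:event_time::TIMESTAMP_NTZ AS event_time,
    payload                           AS payload   -- keep the rest flexible
FROM raw_events;
```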
Topic: Query Processing and Performance Optimization
In Snowflake, what is a primary benefit of the virtual warehouse data cache when the same table is queried repeatedly?
Options:
A. It creates and maintains physical indexes on frequently queried columns to speed up filter predicates.
B. It automatically extends Time Travel retention so old versions of the table remain queryable longer without extra cost.
C. It decreases Snowflake storage costs by permanently storing compressed copies of the data in the warehouse.
D. It reduces repeated remote storage reads by serving frequently accessed data from the warehouse cache.
Best answer: D
Explanation: Snowflake virtual warehouses maintain a data cache (often called the warehouse cache) that stores recently scanned micro-partitions in local, compute-side storage. When subsequent queries access the same data, Snowflake can read from this warmed cache instead of repeatedly fetching the same micro-partitions from remote storage.
By avoiding repeated remote storage I/O, queries that scan the same data benefit from lower latency and often complete faster, especially when the query pattern involves many repeated reads of a stable dataset (for example, dashboards or recurring reports). This cache is transparent to users: no special syntax or configuration is needed beyond running queries on the same active warehouse.
The cache does not change how long data is retained (Time Travel), does not affect storage billing for the underlying data, and does not introduce traditional index structures. Its main purpose is to reduce I/O for repeated scans of the same data, improving performance and sometimes reducing compute work for those queries.
Topic: Snowflake Architecture and Key Features
A data engineer is designing automation for ELT pipelines in the Snowflake AI Data Cloud. They plan to use Snowflake tasks to schedule and orchestrate SQL-based workloads. Which of the following is NOT a correct use or expectation of Snowflake tasks for scheduling SQL-based workloads?
Options:
A. Expecting a task to automatically start whenever files land in external cloud storage, without any schedule, stream, or other dependency configured.
B. Creating a root task that runs hourly and defines a tree of dependent child tasks to orchestrate a multi-step ELT workflow.
C. Configuring a task to run a nightly MERGE statement into a reporting table using a dedicated, auto-suspending virtual warehouse.
D. Using a task with a 5-minute schedule and a WHEN condition that checks a stream for new data before running processing SQL.
Best answer: A
Explanation: Snowflake tasks are schema-level objects that let you schedule and automate SQL-based workloads. They primarily support time-based scheduling using a SCHEDULE clause and event-like behavior through dependencies (task trees) and WHEN conditions that check internal state such as streams.
Tasks do not directly subscribe to external events like file arrivals in cloud storage. To handle event-style ingestion, you typically combine Snowpipe (for automatic file loading) with streams and tasks that periodically check for new data using a WHEN condition such as SYSTEM$STREAM_HAS_DATA. The key idea is that tasks are driven by schedules and internal conditions, not by raw cloud storage notifications on their own.
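The stream-plus-task pattern described above can be sketched like this; warehouse, table, and stream names are placeholders:

```sql
-- A stream tracks new rows in the landing table (fed by Snowpipe or COPY).
CREATE STREAM events_stream ON TABLE raw_events;

-- The task polls every 5 minutes but only executes when the stream has data.
CREATE TASK process_events
  WAREHOUSE = etl_wh
  SCHEDULE = '5 MINUTE'
  WHEN SYSTEM$STREAM_HAS_DATA('EVENTS_STREAM')
AS
  INSERT INTO events_clean SELECT payload FROM events_stream;

ALTER TASK process_events RESUME;  -- tasks are created suspended
```

Note there is still no direct subscription to cloud storage events: the task fires on its schedule, and the WHEN condition simply skips runs when nothing new has arrived.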
Topic: Query Processing and Performance Optimization
Which of the following statements about optimizing queries on very large VARIANT columns in Snowflake is NOT correct?
Options:
A. Applying filters directly on VARIANT paths (for example, WHERE v:event:type = 'click') can help reduce the amount of data scanned.
B. Selecting the entire VARIANT column instead of specific paths usually performs better than referencing only the attributes needed by the query.
C. Avoiding unnecessary FLATTEN operations on large arrays in VARIANT values can help control both query runtime and compute cost.
D. Limiting the SELECT list to specific VARIANT paths (for example, v:customer:id) can reduce work and improve query performance.
Best answer: B
Explanation: When working with very large VARIANT columns in the Snowflake AI Data Cloud, performance is strongly influenced by how much of the semi-structured object each query needs to read and transform.
Projecting only the paths you actually need (for example, v:customer:id or v:metrics:clicks) allows Snowflake to avoid processing unnecessary parts of the VARIANT. Similarly, placing filters directly on specific VARIANT paths gives the optimizer more precise information, which can reduce the amount of data scanned via partition pruning and other optimizations.
Operations like FLATTEN can dramatically expand the number of rows, especially on large arrays. While FLATTEN is useful, applying it unnecessarily on large VARIANT data will increase both runtime and compute consumption.
For these reasons, the idea that selecting the entire VARIANT column is better for performance is incorrect. Doing so typically causes more data to be processed than required, which generally harms performance rather than improves it.
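A small sketch of the recommended shape, projecting and filtering on specific paths rather than the whole VARIANT (names are illustrative):

```sql
-- Read only the attributes the query needs, with explicit casts.
SELECT
    v:customer:id::NUMBER    AS customer_id,
    v:metrics:clicks::NUMBER AS clicks
FROM events
WHERE v:event:type::STRING = 'click';   -- filter directly on a path
```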
Topic: Snowflake Architecture and Key Features
Which of the following statements about secure views in the Snowflake AI Data Cloud is NOT correct?
Options:
A. Secure views can expose only selected columns or filtered rows from base tables, helping limit what sensitive data consumers can see.
B. Secure views help prevent sensitive query results from being reused across roles through the result cache, adding an extra layer of protection.
C. Secure views are commonly used when sharing data with other accounts, so consumers query the view without direct access to the underlying tables.
D. Secure views automatically apply role-based dynamic masking to sensitive columns without any additional masking policies or logic.
Best answer: D
Explanation: Secure views in the Snowflake AI Data Cloud are specialized views that add stronger protection against data leakage. They are often used to expose a controlled subset of data while hiding underlying tables and reducing the risk of sensitive information being exposed via caches or query metadata.
However, secure views do not, by themselves, implement dynamic data masking based on user roles. To mask sensitive values, you must either embed masking logic in the view definition (for example, CASE expressions) or attach Snowflake masking policies to the underlying columns. Secure views then help ensure that whatever is exposed through the view is handled securely, but they are not a substitute for explicit masking policies or RBAC design.
The other statements correctly describe how secure views limit visible columns and rows and how they interact with result caching and data sharing, which are the key patterns for protecting sensitive data at the SnowPro Core level.
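A sketch of combining an explicit masking policy with a secure view; policy, table, and role names are illustrative:

```sql
-- Masking is explicit: secure views alone do not mask values.
CREATE MASKING POLICY email_mask AS (val STRING) RETURNS STRING ->
    CASE WHEN CURRENT_ROLE() IN ('PII_ADMIN') THEN val
         ELSE '***MASKED***' END;

ALTER TABLE customers MODIFY COLUMN email SET MASKING POLICY email_mask;

-- The secure view then limits rows and columns and hides its definition.
CREATE SECURE VIEW customers_v AS
    SELECT customer_id, email
    FROM customers
    WHERE region = 'EU';
```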
Topic: Snowflake Architecture and Key Features
A team ingests clickstream events as JSON into the Snowflake AI Data Cloud. New attributes are added frequently by upstream services. They must 1) avoid frequent table DDL changes, 2) efficiently join on stable fields such as user_id and event_time, and 3) keep modeling effort low. Which table design is MOST appropriate?
Options:
A. Create a table with a single VARCHAR column that stores the raw JSON text, and use string functions to parse attributes when needed.
B. Create a table with a single VARIANT column for the full JSON and extract user_id and event_time from it in every query.
C. Create a wide table with dedicated typed columns for all current and expected JSON attributes, without using VARIANT.
D. Create a table with separate typed columns for user_id and event_time plus a VARIANT column that stores the full JSON event.
Best answer: D
Explanation: Snowflake supports both structured and semi-structured data. Traditional types such as VARCHAR, NUMBER, and TIMESTAMP are ideal for stable, well-understood columns that are commonly used for joins, filters, and aggregations.
For semi-structured data (JSON, Avro, Parquet, XML), Snowflake provides the VARIANT data type. VARIANT can store nested, evolving structures without requiring a fixed schema in advance. New attributes can appear in the incoming data without breaking existing queries or forcing immediate DDL changes.
A common best practice is a hybrid design: model stable, high-value attributes as strongly typed columns, and keep the raw or less predictable payload in a VARIANT column. This gives the convenience and performance of typed columns for key fields while still allowing flexible exploration of all other attributes via VARIANT and path expressions.
In this scenario, making user_id and event_time separate typed columns supports efficient joins and filters, while the VARIANT column captures all other semi-structured event details with minimal ongoing modeling effort.
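The hybrid design chosen in option D might look like this (names are illustrative):

```sql
-- Typed columns for the stable join/filter keys, VARIANT for everything else.
CREATE TABLE click_events (
    user_id    NUMBER        NOT NULL,
    event_time TIMESTAMP_NTZ NOT NULL,
    raw_event  VARIANT
);

-- New upstream attributes land inside raw_event with no DDL change needed.
SELECT user_id, raw_event:page:url::STRING AS page_url
FROM click_events
WHERE event_time >= DATEADD(day, -7, CURRENT_TIMESTAMP());
```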
Topic: Operations, Monitoring, and Business Continuity
Which TWO statements about using Snowflake resource monitors and the Snowflake Budgets service to manage credit usage are correct? (Select TWO.)
Options:
A. Snowflake Budgets automatically block all new compute usage in the account as soon as the budget is exceeded, without any additional configuration.
B. Snowflake Budgets let you define a target credit usage for a period and trigger alerts when actual or forecast usage approaches or exceeds that target.
C. Resource monitors can automatically suspend one or more virtual warehouses when a defined credit-usage threshold is reached.
D. Resource monitors can retroactively refund credits for queries that ran after the quota was exceeded.
E. Resource monitors track and limit storage and data transfer costs in addition to compute credit usage.
Correct answers: B and C
Explanation: Snowflake provides two complementary mechanisms for managing credit spend.
Resource monitors attach directly to compute credit usage at the account or warehouse level. You define a credit quota and one or more thresholds (as percentages of that quota). When a threshold is reached, the monitor can trigger actions such as sending notifications or automatically suspending specific warehouses or even all warehouses in the account. This is the primary way to enforce hard or soft limits on compute usage.
Snowflake Budgets focus on higher-level cost governance. You define a target spend (often in credits) over a time period and associate the budget with a scope, such as an account or a tag-based grouping of objects. As usage accumulates, Budgets compare actual and forecasted consumption against the target and can generate alerts when thresholds are approached or exceeded. Budgets are meant for monitoring and alerting, not for directly suspending compute.
Therefore, the statement that resource monitors can automatically suspend warehouses at thresholds is correct, and the statement that Budgets allow defining a target and alerting on approaching or exceeding that target is also correct. Claims that resource monitors refund credits, that Budgets automatically block all compute, or that resource monitors control storage and data transfer are not accurate.
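The resource-monitor half of this pattern can be sketched as follows; the monitor and warehouse names are placeholders:

```sql
-- Quota of 100 credits per month: notify at 80%, suspend at 100%.
CREATE RESOURCE MONITOR monthly_cap WITH
  CREDIT_QUOTA = 100
  FREQUENCY = MONTHLY
  START_TIMESTAMP = IMMEDIATELY
  TRIGGERS ON 80  PERCENT DO NOTIFY
           ON 100 PERCENT DO SUSPEND;

-- Attach the monitor so this warehouse suspends when the quota is reached.
ALTER WAREHOUSE etl_wh SET RESOURCE_MONITOR = monthly_cap;
```

Budgets, by contrast, are configured through Snowsight or the Budgets interface for alerting only; they do not suspend compute.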
Topic: Snowflake Architecture and Key Features
Which TWO statements about Snowsight in the Snowflake AI Data Cloud are correct? (Select TWO.)
Options:
A. Snowsight must be installed as a desktop client application on a user’s computer before it can be used.
B. Snowsight is the recommended tool for fully automated batch jobs and CI/CD pipelines integrating with Snowflake.
C. Snowsight allows users to create and manage virtual warehouses, databases, and roles through a graphical interface.
D. Snowsight is only available to users with the ACCOUNTADMIN role; other roles must use SnowSQL instead.
E. Snowsight is the primary web-based user interface for managing and querying data in Snowflake.
Correct answers: C and E
Explanation: Snowsight is the primary, browser-based user interface for working with the Snowflake AI Data Cloud. It is designed for interactive use: running SQL queries, exploring data, managing many core objects (such as warehouses and databases), building worksheets and dashboards, and administering roles and grants when the logged-in role has sufficient privileges.
Snowsight does not require any local installation; users simply sign in via a supported web browser. It is available to all users regardless of role, as long as they can authenticate and their roles have the necessary privileges. For automation and CI/CD, Snowflake instead recommends programmatic interfaces such as SnowSQL, the Snowflake CLI, language drivers and connectors, or the SQL API, which can be scripted and integrated into pipelines.
Topic: Data Loading, Unloading, and Transformation
A company has several Snowflake teams loading CSV files from the same external storage location. Today, each team hard-codes file format options in its own COPY INTO commands, causing inconsistent handling of NULLs and dates. The data platform team wants centralized governance and easy reuse of identical format settings. Which approach is MOST appropriate?
Options:
A. Have each team define its own file format options inline in the COPY INTO command, using standardized documentation as guidance.
B. Ask each team to create its own named FILE FORMAT object in its own schema, tuned to that team’s needs for the shared CSV files.
C. Store all format settings only on individual named stages and let each team create its own stage pointing to the same external location.
D. Create a shared named FILE FORMAT in a central schema and require all COPY INTO commands to reference that FILE FORMAT object.
Best answer: D
Explanation: Named FILE FORMAT objects in Snowflake let you define parsing options (such as delimiter, NULL handling, and date formats) once and reuse them across many COPY INTO commands and stages. This is ideal when many teams load data from the same source layout and you want consistency and easy governance.
By creating a shared FILE FORMAT in a central location and granting usage on it, administrators can enforce standard behavior for ingestion and change parsing rules in one place if the upstream file layout changes. All loads that reference the shared FILE FORMAT automatically pick up the new behavior without modifying every pipeline script.
Inline options or separate per-team definitions spread configuration across many places, increasing the risk of inconsistencies and making governance and updates harder.
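The centralized pattern from option D might look like this; the schema, role, stage, and table names are placeholders:

```sql
-- One shared, governed definition of how the CSV files are parsed.
CREATE FILE FORMAT governance.formats.std_csv
  TYPE = CSV
  FIELD_DELIMITER = ','
  SKIP_HEADER = 1
  NULL_IF = ('', 'NULL')
  DATE_FORMAT = 'YYYY-MM-DD';

GRANT USAGE ON FILE FORMAT governance.formats.std_csv TO ROLE loader_role;

-- Every team's load references the same object by name.
COPY INTO sales FROM @shared_stage
  FILE_FORMAT = (FORMAT_NAME = 'governance.formats.std_csv');
```

Changing a parsing rule in the shared object updates every pipeline that references it, with no per-script edits.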
Topic: Data Loading, Unloading, and Transformation
In Snowflake, why is it a best practice for multiple teams to use shared named FILE FORMAT objects for their COPY-based loading and unloading operations?
Options:
A. Because using a named FILE FORMAT lets each user automatically override parsing options at runtime without changing shared settings
B. Because a single, reusable FILE FORMAT centralizes settings such as delimiters and compression, keeping behavior consistent across all COPY operations that reference it
C. Because FILE FORMAT objects determine which encryption keys are used for individual users when reading or writing staged files
D. Because FILE FORMAT objects control how Snowflake creates and sizes micro-partitions for tables loaded with COPY
Best answer: B
Explanation: Shared named file format objects in Snowflake capture the reusable details about how files are structured, such as delimiters, compression, escape characters, and date formats. When teams point their COPY INTO commands and stages to the same file format object, every load or unload operation applies the same set of rules.
This pattern improves governance and consistency because changes to file-handling behavior (for example, switching compression type) can be made once in the shared file format and then automatically apply to all dependent jobs. It also reduces errors that occur when individual users repeatedly specify detailed file options in each COPY statement, where typos or forgotten options can cause inconsistent parsing, rejected files, or data quality issues.
By centralizing configuration and referencing it by name, organizations standardize how data is loaded and unloaded across teams while keeping the implementation simple and auditable.
Topic: Snowflake Architecture and Key Features
Which of the following statements about Snowflake sequences is NOT correct?
Options:
A. The same sequence can be referenced from multiple tables, causing them to share one increasing stream of generated numbers.
B. A sequence can be used in a column DEFAULT expression (for example, DEFAULT my_seq.NEXTVAL) to populate surrogate keys during INSERT operations.
C. Snowflake sequences automatically reset to 1 for each table that uses them, ensuring surrogate keys always start at 1 per table.
D. A sequence is a schema-level object that generates numeric values when referenced using the NEXTVAL pseudo-column.
Best answer: C
Explanation: Snowflake sequences are schema-level objects designed to generate numeric values, commonly used for surrogate keys or technical identifiers. They are independent from any specific table. Applications or DDL/DML statements obtain new values by referencing the sequence’s NEXTVAL pseudo-column.
A typical pattern is to define a sequence once in a schema and then use it in one or more tables, often through a column DEFAULT expression. Every time an INSERT statement triggers that default, Snowflake calls my_seq.NEXTVAL and assigns the next numeric value from that sequence. The same sequence can be shared by multiple tables or used directly in ad hoc SQL. Sequences do not automatically reset for each table; instead, they produce a continuous stream of values wherever they are referenced.
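A sketch of one sequence feeding two tables, illustrating why values never reset per table (names are illustrative):

```sql
CREATE SEQUENCE order_seq START = 1 INCREMENT = 1;

-- Both tables draw from the same increasing stream of values.
CREATE TABLE orders  (order_id  NUMBER DEFAULT order_seq.NEXTVAL, total NUMBER);
CREATE TABLE refunds (refund_id NUMBER DEFAULT order_seq.NEXTVAL, order_id NUMBER);

SELECT order_seq.NEXTVAL;  -- the sequence is also usable directly in ad hoc SQL
```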
Topic: Data Sharing, Collaboration, and Marketplace
Which statement BEST describes a reader account in Snowflake data sharing?
Options:
A. A reader account is a special provider account that can create shares but cannot host its own data.
B. A reader account is a Snowflake-managed account that a data provider creates for consumers who do not have their own Snowflake account, allowing them to query shared data.
C. A reader account is a temporary database clone that allows a consumer to preview shared data without having any Snowflake account.
D. A reader account is a standard Snowflake account owned and billed independently by the data consumer, used to import a provider’s shared data as a full copy.
Best answer: B
Explanation: In Snowflake data sharing, a reader account is a fully functional Snowflake account that is created and managed by a data provider specifically for consumers who do not have their own Snowflake account. The reader account allows those consumers to connect, run queries, and build views or other objects over the shared data while the provider retains administrative control.
In contrast, a standard consumer account is an ordinary Snowflake account owned by the consumer’s organization, which directly receives the share and manages its own compute, users, and security. Provider accounts own datasets and create shares, but do not themselves function as reader accounts.
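As a hedged sketch of the provider-side setup (account name and credentials are placeholders), a reader account is created with CREATE MANAGED ACCOUNT:

```sql
-- Run by the data provider; Snowflake manages the resulting account,
-- and its compute costs are billed to the provider.
CREATE MANAGED ACCOUNT partner_reader
    ADMIN_NAME = 'reader_admin',
    ADMIN_PASSWORD = 'ChangeMe_12345!',
    TYPE = READER;
```

The provider then shares data to this account like any other consumer account, while retaining administrative control over it.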
Topic: Query Processing and Performance Optimization
Which TWO statements correctly explain why data spilling shown in the Query Profile can slow down a query in Snowflake? (Select TWO.)
Options:
A. Data spilling causes intermediate data to be written to and read from storage instead of staying in memory, adding extra I/O and latency.
B. Data spilling shows that micro-partitions are being physically reorganized, which blocks other queries from accessing those partitions.
C. Data spilling means the virtual warehouse has automatically reduced its size during the query, lowering compute power and extending runtime.
D. Data spilling indicates that Snowflake has fully warmed its caches, so subsequent queries on the same data always run faster.
E. Data spilling means some operators must repeatedly move data between memory and storage, so they spend more time waiting on I/O than on computation.
Correct answers: A and E
Explanation: In Snowflake, the Query Profile can show when an operator has “spilled” data to local or remote storage. This happens when the working set of data for that step does not fully fit in memory allocated to the virtual warehouse. Instead of processing everything in memory, Snowflake must write intermediate results out to storage and read them back later.
Accessing storage is much slower than using data already held in memory. Every extra write and read adds I/O operations and latency, so operators spend more time waiting on data than performing computations. In some cases, the operator has to loop through the data in multiple passes, repeatedly moving chunks between memory and storage. This additional I/O and waiting time is why significant spilling, as shown in the Query Profile, often correlates with slower query performance.
Spilling does not mean that warehouses are shrinking, partitions are locked, or caches are necessarily warmed. It is specifically a signal of memory pressure during query execution, which typically reduces performance compared with an in-memory execution plan.
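Beyond the Query Profile UI, spill volumes are also exposed as columns in the account usage views; a minimal sketch for finding the worst offenders (thresholds and limits are illustrative):

```sql
-- ACCOUNT_USAGE.QUERY_HISTORY records per-query spill metrics that
-- correspond to what the Query Profile shows per operator.
SELECT query_id,
       total_elapsed_time,
       bytes_spilled_to_local_storage,
       bytes_spilled_to_remote_storage
FROM snowflake.account_usage.query_history
WHERE bytes_spilled_to_remote_storage > 0
ORDER BY bytes_spilled_to_remote_storage DESC
LIMIT 10;
```

Remote spilling is generally the more expensive of the two, so sorting on it surfaces the queries most likely to benefit from a larger warehouse or a smaller working set.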
Topic: Account Setup, Security, and Governance
Which TWO of the following statements about Snowflake security best practices are NOT correct? (Select TWO.)
Options:
A. Enabling multi-factor authentication (MFA) or SSO for interactive users is a recommended best practice to strengthen account security.
B. Applying least privilege means granting users only the minimum roles and privileges required to perform their job duties.
C. Granting the ACCOUNTADMIN role to all developers ensures they do not encounter permission errors and is considered a scalable security approach.
D. Service integrations and automation should use dedicated Snowflake service users with narrowly scoped roles, rather than reusing personal user accounts.
E. Users should frequently share a single Snowflake login and password within a team to simplify permission management.
Correct answers: C and E
Explanation: Snowflake security best practices center on strong, individual authentication and careful role-based access control. Each person should have their own Snowflake user, with permissions granted via roles that follow the principle of least privilege. High-privilege roles such as ACCOUNTADMIN must be tightly controlled.
Strong authentication (such as MFA or SSO/federation) is recommended to protect against password compromise. For non-interactive workloads and integrations, dedicated service users with restricted roles provide both clear auditing and minimized risk. In contrast, shared logins and broadly granting powerful roles are unsafe and violate core security principles.
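A minimal sketch of the service-user pattern described above (role, user, database, and table names are all hypothetical):

```sql
-- Dedicated role scoped to exactly what the integration needs.
CREATE ROLE etl_loader_role;
GRANT USAGE  ON DATABASE analytics          TO ROLE etl_loader_role;
GRANT USAGE  ON SCHEMA   analytics.raw      TO ROLE etl_loader_role;
GRANT INSERT, SELECT ON TABLE analytics.raw.events TO ROLE etl_loader_role;

-- Dedicated service user for the automation; in practice this user would
-- authenticate with a key pair rather than a password.
CREATE USER etl_service
    DEFAULT_ROLE = etl_loader_role
    MUST_CHANGE_PASSWORD = FALSE;
GRANT ROLE etl_loader_role TO USER etl_service;
```

Auditing is then straightforward: any activity under ETL_SERVICE is unambiguously the integration, not a person.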
Topic: Data Loading, Unloading, and Transformation
A team is onboarding log data files from external storage into the Snowflake AI Data Cloud. Their current ingestion pipeline is configured with file formats only for JSON, AVRO, and PARQUET; XML files must be converted before loading. The storage bucket currently contains the following files:
| Format | File count |
|---|---|
| JSON | 320 |
| PARQUET | 450 |
| AVRO | 230 |
| XML | 120 |
Assuming they load only the formats already configured (JSON, AVRO, PARQUET), how many files can they load immediately without conversion?
Options:
A. 1,120 files
B. 870 files
C. 1,000 files
D. 1,230 files
Best answer: C
Explanation: Snowflake can ingest several semi-structured formats, and in this scenario the pipeline is explicitly configured for JSON, AVRO, and PARQUET. To find how many files can be loaded immediately, you add the counts for just those three formats and exclude XML, which the team has stated must be converted first.
Adding the configured formats gives 320 JSON files, 450 PARQUET files, and 230 AVRO files, for a total of 1,000 files that can be loaded without any conversion step. This reinforces that JSON, AVRO, and PARQUET are common semi-structured formats used with Snowflake.
Topic: Data Sharing, Collaboration, and Marketplace
An analytics team stores unstructured image files in an internal named stage. They need to send one-time download links for specific images to external partners who do not have Snowflake accounts. Security requires that links automatically expire after a few hours and never expose permanent cloud storage credentials. Which approach best meets these requirements?
Options:
A. Share the internal stage URL and have partners access it by logging in through Snowflake reader accounts with file download privileges.
B. Move the images to a public cloud storage container and share the unsigned HTTP URLs with partners for direct download.
C. Provide direct cloud storage URLs for the underlying storage location of the stage, secured with long-lived access keys managed outside Snowflake.
D. Generate presigned URLs for the specific files in the internal stage with a short expiration time and share those URLs with partners.
Best answer: D
Explanation: Presigned URLs are specifically designed to provide secure, time-limited access to individual files stored in Snowflake stages, including internal stages used for unstructured data. When Snowflake generates a presigned URL, it embeds a token and an explicit expiration time into the link. External users can then download the file directly without needing a Snowflake account or separate credentials.
In this scenario, the team must share files with external partners who do not have Snowflake accounts, and they must ensure that access automatically expires after a few hours and does not reveal permanent cloud storage credentials. Presigned URLs satisfy all of these requirements: they are scoped to a specific object, have a configurable lifetime, and avoid exposing underlying storage keys.
By contrast, internal stage URLs are intended for use by authenticated Snowflake users with appropriate privileges, and cloud storage URLs (public or key-based) either expose long-lived credentials or provide overly broad access that does not automatically expire. Therefore, generating presigned URLs for files in the internal stage is the best solution.
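A minimal sketch of generating such a link (stage and file names are hypothetical; the third argument is the expiration time in seconds, here four hours):

```sql
SELECT GET_PRESIGNED_URL(@image_stage, 'products/widget_001.jpg', 14400);
```

The returned URL can be sent to a partner and used once or repeatedly until it expires; no Snowflake login or cloud storage key is ever exposed.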
Topic: Data Loading, Unloading, and Transformation
A data engineering team loads about 200GB of CSV files each day from an internal stage into a Snowflake table and occasionally needs to correct a few individual rows. Which loading approach is NOT recommended for this scenario?
Options:
A. When business rules change and the entire table must be rebuilt from another table, use INSERT OVERWRITE to replace all existing rows.
B. Run COPY INTO in VALIDATION_MODE first to detect data issues, then rerun COPY INTO without VALIDATION_MODE to perform the actual 200GB load.
C. Use COPY INTO from the internal stage for the 200GB daily files, and use small INSERT statements only for the occasional row-level corrections.
D. For the 200GB daily load, use a single large INSERT statement that selects directly from the staged files instead of using COPY INTO.
Best answer: D
Explanation: When loading large volumes of data from staged files, Snowflake recommends using the COPY INTO command. COPY INTO is designed and optimized for bulk file ingestion: it can process multiple files efficiently, tracks load history, and offers robust error handling and options like validation, parallelism, and partial loads.
INSERT and INSERT OVERWRITE are SQL DML commands best suited for small batches, row-level changes, or rebuilding a table from another table or query, not for high-volume file ingestion. While it might be technically possible to select from staged files using INSERT…SELECT patterns, this bypasses COPY INTO’s purpose-built bulk loading capabilities, resulting in poorer performance, harder monitoring, and potentially more fragile pipelines.
In the scenario, the best pattern is to use COPY INTO for the large, 200GB daily load and reserve INSERT for the occasional row-level corrections. Using a huge INSERT directly from staged files for the bulk load is the clearly wrong choice compared with COPY INTO.
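A hedged sketch of the recommended bulk-load statement (table, stage, and format options are illustrative):

```sql
-- One COPY INTO handles all staged files for the daily load in parallel.
COPY INTO sales_raw
FROM @daily_stage
FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1)
ON_ERROR = 'ABORT_STATEMENT';
```

Row-level corrections then stay as ordinary small INSERT or UPDATE statements against the table itself.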
Topic: Query Processing and Performance Optimization
A data engineer is investigating slow performance on a reporting warehouse. In Snowsight they open Query History and see the following:
Exhibit:
| QUERY_ID | START_TIME | TOTAL_ELAPSED_TIME_MS | BYTES_SCANNED | ROWS_PRODUCED |
|---|---|---|---|---|
| 01a1-xyz-001 | 2025-06-10 09:01:22 | 145000 | 18,000,000 | 120,000 |
| 01a1-xyz-002 | 2025-06-10 09:03:10 | 12000 | 2,500,000 | 15,000 |
| 01a1-xyz-003 | 2025-06-10 09:04:05 | 18000 | 3,200,000 | 20,500 |
They want to understand why query 01a1-xyz-001 was slow, including which steps consumed the most time. Which Snowflake feature should they use next?
Options:
A. Increase the warehouse size and re-run all queries shown in Query History
B. Query the ACCOUNT_USAGE.QUERY_HISTORY view directly to see the join and scan steps for query 01a1-xyz-001
C. Use Query History filters to show the detailed execution graph for query 01a1-xyz-001
D. Open the Query Profile for query 01a1-xyz-001 to see operator-level execution details
Best answer: D
Explanation: The exhibit shows Snowsight Query History with summary metrics per query: TOTAL_ELAPSED_TIME_MS, BYTES_SCANNED, and ROWS_PRODUCED. This view is excellent for workload-level analysis, such as spotting outlier queries. Here, query 01a1-xyz-001 clearly stands out with a TOTAL_ELAPSED_TIME_MS of 145,000, much higher than the other queries.
However, the data engineer specifically wants to know why that single query was slow, including which internal steps (such as table scans, joins, or aggregations) consumed the most time. Those details are not present in Query History or ACCOUNT_USAGE views. Instead, Snowflake provides a per-query diagnostic tool called Query Profile.
Query Profile displays a graphical execution plan with nodes for each operation and metrics like time spent, rows processed, and data volume at each step. This is the correct feature to use when you drill into the performance characteristics of an individual query after identifying it as an outlier in Query History.
Thus, the appropriate workflow is: use Query History for spotting slow queries or workload patterns, then open the Query Profile for a specific query to understand its internal execution behavior.
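The same operator-level details can also be retrieved programmatically; a hedged sketch using the query ID from the exhibit:

```sql
-- GET_QUERY_OPERATOR_STATS returns one row per operator in the execution
-- plan, including its type and execution-time breakdown.
SELECT *
FROM TABLE(GET_QUERY_OPERATOR_STATS('01a1-xyz-001'));
```

This is useful when you want to analyze or alert on slow operators in SQL rather than inspecting the graphical Query Profile by hand.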
Topic: Snowflake Architecture and Key Features
Which statement BEST describes the primary purpose of the Snowflake AI Data Cloud?
Options:
A. A service for provisioning and managing virtual machines, storage volumes, and networks used to host customer-built databases and applications.
B. A fully managed data platform that centralizes diverse data and provides scalable compute so organizations can run analytics and data-driven applications without managing underlying infrastructure.
C. A low-cost object storage service for keeping raw files, without built-in SQL processing, security controls, or workload isolation.
D. A traditional on-premises relational database that focuses on OLTP workloads and requires manual indexing, partitioning, and storage tuning for performance.
Best answer: B
Explanation: Snowflake AI Data Cloud is a fully managed, cloud-native data platform designed to centralize data from many sources and provide elastic compute for analytics and applications. Users query and process data using SQL and related tools while Snowflake handles infrastructure, scaling, storage layout, and maintenance tasks.
By separating storage from compute and exposing a simple account/database/schema/warehouse model, Snowflake lets teams focus on data and workloads instead of hardware, operating systems, and database tuning. This is the core value proposition: unified data access with scalable performance and simplified operations.
Topic: Account Setup, Security, and Governance
A Snowflake account currently relies on LOGIN_HISTORY to review sign-ins. The security team must routinely report which users actually accessed specific sensitive tables and when, with minimal custom parsing. Which approach best meets this requirement using Snowflake’s governance capabilities?
Options:
A. Apply object tags to all sensitive tables and query TAG_REFERENCES to determine which users have accessed those tagged objects and when.
B. Query SNOWFLAKE.ACCOUNT_USAGE.ACCESS_HISTORY on a schedule and build a report using columns such as QUERY_START_TIME, USER_NAME, ROLE_NAME, and DIRECT_OBJECTS_ACCESSED.
C. Use GRANTS_TO_USERS and GRANTS_TO_ROLES to list which users have privileges on sensitive tables and treat this list as a record of who accessed the data.
D. Combine LOGIN_HISTORY and QUERY_HISTORY, then parse the QUERY_TEXT field to infer which tables and columns were accessed by each user.
Best answer: B
Explanation: The requirement is to provide a repeatable governance report that answers who accessed which sensitive objects and when, while avoiding heavy custom parsing. In Snowflake, the ACCESS_HISTORY view in the SNOWFLAKE.ACCOUNT_USAGE schema is specifically designed for this purpose.
ACCESS_HISTORY records data-access events at the query level. For each query, it exposes the actor (user and role), the time of the access (such as QUERY_START_TIME), and the set of objects accessed (for example, tables and views listed in DIRECT_OBJECTS_ACCESSED and BASE_OBJECTS_ACCESSED). This lets security and governance teams directly analyze actual usage of sensitive tables and columns.
By querying ACCESS_HISTORY on a schedule and projecting just the relevant columns (user, role, timestamp, and accessed objects), you can build a governance report that shows real access patterns without needing to parse raw SQL text or infer access from grants or tags.
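A minimal sketch of such a report (the sensitive table name is hypothetical; DIRECT_OBJECTS_ACCESSED is an array, so it is flattened into one row per accessed object):

```sql
SELECT ah.query_start_time,
       ah.user_name,
       obj.value:objectName::STRING AS object_name
FROM snowflake.account_usage.access_history ah,
     LATERAL FLATTEN(input => ah.direct_objects_accessed) obj
WHERE obj.value:objectName::STRING = 'PROD.PII.CUSTOMERS'
ORDER BY ah.query_start_time DESC;
```

Scheduling this query (for example, via a task) yields the recurring "who touched what, and when" report with no SQL-text parsing.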
Topic: Query Processing and Performance Optimization
Which of the following statements about caching in the Snowflake AI Data Cloud is NOT correct?
Options:
A. Result cache entries are reused only when the query text and relevant session parameters match and the underlying data has not changed in a way that affects the result.
B. Snowflake stores table data in durable storage; caches only keep copies to improve performance and are not the authoritative source of data.
C. Queries may return logically incorrect results if you do not periodically clear Snowflake caches; therefore cache maintenance is required for data correctness.
D. The local data cache for a virtual warehouse is discarded when the warehouse is suspended, causing subsequent queries to re-read needed data from storage.
Best answer: C
Explanation: Snowflake uses several caches—such as the result cache and the local data cache on virtual warehouses—to improve performance and reduce repeated reads from storage. However, these caches are optimizations only.
The authoritative copy of all table data is always stored in Snowflake’s durable storage layer. The query processing engine consults caches when possible, but it automatically invalidates or bypasses them when underlying data changes or when query conditions differ, so results remain logically correct without manual cache management.
Because of this, cache behavior should never be relied on for functional correctness or data freshness guarantees. Instead, correctness is defined by Snowflake’s transactional storage and Time Travel mechanisms; caches only help make correct queries faster and cheaper to run.
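Although no cache maintenance is ever needed for correctness, the result cache can be bypassed per session when benchmarking; a minimal sketch:

```sql
-- Forces queries in this session to re-execute instead of reusing
-- previously cached results; data correctness is unaffected either way.
ALTER SESSION SET USE_CACHED_RESULT = FALSE;
```

This is a performance-testing convenience, not a correctness control.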
Topic: Operations, Monitoring, and Business Continuity
A table was accidentally dropped, and a user attempts to query it using Time Travel but receives the following error.
Exhibit:
SELECT * FROM sales.public.orders
AT(TIMESTAMP => '2025-01-01 10:00:00'::timestamp);
SQL execution error: Time Travel retention period has expired for table SALES.PUBLIC.ORDERS.
Data is now in Fail-safe and is not directly accessible.
To recover, contact Snowflake Support.
Based only on the information in the exhibit, what is the most appropriate next step?
Options:
A. Open a Snowflake Support case to request recovery of the table from Fail-safe.
B. Use zero-copy cloning on the table to restore data from Fail-safe.
C. Create a new table with the same name so Fail-safe automatically repopulates it.
D. Increase the Time Travel retention period on the database and rerun the query.
Best answer: A
Explanation: The exhibit shows a query attempting to use Time Travel on the SALES.PUBLIC.ORDERS table. The error message clearly states that the “Time Travel retention period has expired” and that the data “is now in Fail-safe and is not directly accessible.” It then explicitly instructs the user: “To recover, contact Snowflake Support.”
In Snowflake, Time Travel allows users to query, clone, and restore objects within a configured retention period. Once that window expires, historical data moves into Fail-safe. Fail-safe is a last-resort recovery mechanism that is fully managed by Snowflake. Customers cannot query or directly access data in Fail-safe using SQL, cloning, or Time Travel.
Therefore, when data is only available in Fail-safe, the correct action is to open a case with Snowflake Support and request recovery. Configuration changes like increasing Time Travel retention or creating new tables cannot bring back data whose Time Travel window has already expired.
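For contrast, had the retention window still been open, the dropped table could have been restored directly with no support case; a minimal sketch using the table from the exhibit:

```sql
-- Works only while the table is still within its Time Travel retention period.
UNDROP TABLE sales.public.orders;
```

Once the data has aged into Fail-safe, no such SQL command is available and recovery goes through Snowflake Support.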
Topic: Snowflake Architecture and Key Features
A data platform team is migrating analytics workloads to the Snowflake AI Data Cloud. They want to 1) scale query processing independently of data storage and 2) rely on Snowflake to centrally manage authentication, RBAC, and query optimization without managing infrastructure. Which of the following actions/solutions will meet these requirements? (Select TWO.)
Options:
A. Increase the size of the Snowflake storage layer to add more CPU and memory for queries while delegating all authentication to an external identity provider that bypasses Snowflake RBAC.
B. Deploy external VMs to run SQL engines for queries, using Snowflake only as low-cost storage, and manage query optimization in those external engines.
C. Use Snowflake virtual warehouses for elastic compute while relying on the cloud services layer for authentication, RBAC, and query optimization over centrally stored data.
D. Create additional virtual warehouses to store cold and hot data separately while manually tuning query performance using user-created indexes on tables.
E. Adopt Snowflake’s multi-cluster shared data architecture by keeping all data in the storage layer and provisioning one or more virtual warehouses, allowing the cloud services layer to manage metadata, permissions, and query planning.
Correct answers: C and E
Explanation: Snowflake’s architecture is built around three main layers: storage, compute, and cloud services.
The storage layer holds all persisted data (micro-partitioned, compressed, and encrypted). It is fully managed and automatically optimized by Snowflake; it does not provide CPU or memory for queries.
The compute layer consists of virtual warehouses, which supply the processing resources for running queries, loading data, and performing transformations. Warehouses can be scaled up or down and started or stopped independently of the amount of stored data.
The cloud services layer manages metadata, authentication, RBAC, query parsing and optimization, and other coordination tasks. It centralizes security and governance and orchestrates how compute accesses shared storage.
In the scenario, the team wants to scale query processing independently of data storage and also wants Snowflake itself to manage authentication, RBAC, and query optimization. That means they must correctly leverage virtual warehouses (compute) and the cloud services layer, while recognizing that data resides in the storage layer and that storage does not supply compute resources.
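A hedged sketch of provisioning compute independently of storage (warehouse name and sizing are hypothetical):

```sql
-- Compute is created, resized, suspended, and resumed independently of
-- how much data sits in the storage layer.
CREATE WAREHOUSE analytics_wh
    WAREHOUSE_SIZE = 'MEDIUM'
    AUTO_SUSPEND = 60        -- seconds of inactivity before suspending
    AUTO_RESUME = TRUE;
```

Authentication, RBAC, and query optimization require no configuration here at all; the cloud services layer handles them for every warehouse automatically.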
Topic: Operations, Monitoring, and Business Continuity
Which Snowflake feature should a data provider primarily use to monitor how often consumer accounts query shared objects and when that access occurs?
Options:
A. Dashboards in the consumer’s BI tool, which the provider can reference to understand shared data query patterns
B. Secure shares themselves, which automatically generate and deliver detailed usage reports to the provider by email
C. Time Travel on the shared database, by reviewing historical table versions to infer when consumers queried the data
D. Account-level usage and access history views that record query activity and shared object access in the provider account
Best answer: D
Explanation: Snowflake provides built-in usage and access history views that capture query metadata and object access, allowing a data provider to see which shared objects were queried, by which accounts, and when. These metadata views are the primary, Snowflake-native way to monitor data sharing usage and audit access to shared data. Time Travel, shares themselves, or external BI dashboards do not capture or centralize this access metadata for the provider.
Topic: Data Loading, Unloading, and Transformation
A data engineer is working with semi-structured order data stored in a VARIANT column. They want one row per item while still returning order-level columns.
Exhibit:
SELECT
o.order_id,
o.customer_id,
o.order_ts,
i.value:sku AS sku,
i.value:quantity::NUMBER AS quantity
FROM orders o,
LATERAL FLATTEN(input => o.items) i;
Based on the exhibit, what is the primary purpose of using LATERAL FLATTEN in this query?
Options:
A. To convert the items VARIANT column into a permanent relational table that can be queried without referencing orders
B. To automatically improve performance by caching the parsed JSON from o.items across all future queries
C. To expand each element of o.items into its own row while keeping the parent row’s columns (order_id, customer_id, order_ts) available in the same result row
D. To cast o.items from VARIANT to ARRAY so it can later be flattened in a separate step
Best answer: C
Explanation: The query in the exhibit selects order-level columns (o.order_id, o.customer_id, o.order_ts) together with fields from individual items (i.value:sku, i.value:quantity). The FROM clause uses orders o, LATERAL FLATTEN(input => o.items) i.
FLATTEN takes an array or object (here, the items VARIANT column) and returns a set of rows, one per element. The LATERAL keyword makes this set correlated with each row of orders. For every parent row in orders, Snowflake runs FLATTEN on o.items and returns one result row for each element, while keeping the parent row’s columns in scope.
This is how you preserve the parent context (order-level details) while expanding an array into multiple rows. Each output row contains the order identifiers plus the details of a single item. Without a lateral relationship, you would lose that 1-to-many link or have to manually write a more complex join.
The other options misinterpret what LATERAL FLATTEN does. It does not create permanent tables, does not introduce special caching, and does not handle type casting from VARIANT to ARRAY; it simply expands the array per parent row while keeping the parent columns available.
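A self-contained sketch of the same pattern with inline sample data (the values are hypothetical):

```sql
SELECT t.order_id,
       i.value:sku::STRING      AS sku,
       i.value:quantity::NUMBER AS quantity
FROM (SELECT 1001 AS order_id,
             PARSE_JSON('[{"sku":"A1","quantity":2},{"sku":"B2","quantity":5}]') AS items) t,
     LATERAL FLATTEN(input => t.items) i;
-- Produces two rows, both carrying order_id 1001: (A1, 2) and (B2, 5).
```

The parent column order_id repeats on every output row, which is exactly the preserved 1-to-many context the explanation describes.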
Topic: Snowflake Architecture and Key Features
A data engineering team currently runs weekly maintenance SQL scripts by manually executing statements in Snowsight worksheets. They want these tasks to be fully automated, scriptable, and easily integrated into an existing CI/CD pipeline without adding unnecessary complexity. Which approach is the BEST fit?
Options:
A. Write the maintenance SQL in a SnowSQL script and run it on a schedule from a job runner or CI/CD pipeline, supplying credentials via secure configuration.
B. Move to the classic web interface and use its query history to manually rerun the same maintenance queries whenever needed.
C. Keep the SQL in Snowsight worksheets and have an engineer use the “Run All” button each week to execute the maintenance tasks.
D. Develop a custom web application that calls the Snowflake SQL API for each maintenance statement and deploy it as a separate scheduling service.
Best answer: A
Explanation: SnowSQL is Snowflake’s command-line client, designed specifically for scripted and automated execution of SQL statements. It integrates naturally with schedulers and CI/CD tools because it can be run non-interactively, accepts connection parameters and variables, and returns clear exit codes for pipeline logic.
In this scenario, the team wants to move from manual, UI-driven execution in Snowsight to a fully automated, repeatable process that fits into an existing CI/CD pipeline. The simplest and most aligned solution is to place the maintenance SQL into a SnowSQL script and invoke it from a scheduler or pipeline job. This directly targets the core use case for SnowSQL: batch and automated operations outside the web interface.
Other options either keep manual steps in the web UI or introduce unnecessary custom infrastructure, which conflicts with the requirement to automate while keeping things simple and using recommended Snowflake tooling patterns.
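A hedged sketch of the pipeline step (connection name, script file, and options are placeholders):

```shell
# -c selects a named connection from ~/.snowsql/config; -f runs a script file;
# exit_on_error makes the pipeline job fail fast on the first SQL error.
snowsql -c maintenance_conn -f weekly_maintenance.sql -o exit_on_error=true
```

Because SnowSQL returns a nonzero exit code on failure, the surrounding CI/CD job can gate subsequent steps on the maintenance run succeeding.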
Topic: Data Loading, Unloading, and Transformation
A team receives 50 large CSV files (about 5GB total) on an internal stage every hour. Their top priority is to minimize warehouse credits while reliably loading all rows into a target table. Which approach is most appropriate?
Options:
A. Load each file into a temporary table using many INSERT statements, then run an INSERT OVERWRITE into the target table once per hour.
B. Use an application to read each file and execute individual INSERT statements for every row into the target table.
C. Create a recurring task that runs frequent single-row INSERT statements as soon as each file appears on the stage.
D. Use a single COPY INTO command to bulk-load all staged files into the target table each hour.
Best answer: D
Explanation: For large batches of data already stored as files on a stage, Snowflake is optimized around the COPY INTO command. COPY INTO reads files directly from internal or external stages using Snowflake’s parallel architecture, which allows it to load many files and large volumes of data efficiently in a single operation.
Warehouse credits are driven primarily by how long warehouses run and at what size. A single, well-structured COPY INTO load minimizes query overhead and connection round trips. In contrast, issuing thousands or millions of row-level INSERT statements keeps the warehouse busy for much longer to achieve the same result, consuming more credits and adding operational complexity.
INSERT (with or without OVERWRITE) is better suited to smaller, row-level or set-based operations that modify data already inside Snowflake tables, not for high-volume file ingestion from stages. When the requirement is to minimize credits for file-based batch loads, COPY INTO is the preferred choice.
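An illustrative hourly load (stage, table, pattern, and options are hypothetical):

```sql
-- One COPY INTO statement loads all 50 staged CSV files in parallel.
COPY INTO hourly_events
FROM @landing_stage
PATTERN = '.*\\.csv'
FILE_FORMAT = (TYPE = 'CSV' FIELD_OPTIONALLY_ENCLOSED_BY = '"')
PURGE = TRUE;  -- remove staged files after a successful load
```

The warehouse is busy only for the duration of this one bulk operation each hour, which is what keeps credit consumption low.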
Topic: Account Setup, Security, and Governance
A data engineer wants to understand why BI users in different regions see different subsets of rows and masked email addresses when querying the CUSTOMER_SALES table. Based only on the exhibit, how is Snowflake enforcing this behavior?
Exhibit: Result of a query on INFORMATION_SCHEMA.POLICY_REFERENCES for table CUSTOMER_SALES.
| POLICY_NAME | POLICY_KIND | OBJECT_NAME | COLUMN_NAME | ACTIVE |
|---|---|---|---|---|
| RAP_REGION_FILTER | ROW_ACCESS | CUSTOMER_SALES | NULL | TRUE |
| MASK_CUSTOMER_EMAIL | MASKING | CUSTOMER_SALES | EMAIL | TRUE |
Options:
Best answer: B
Explanation: The exhibit shows two entries from INFORMATION_SCHEMA.POLICY_REFERENCES for the CUSTOMER_SALES table. The first entry is a row access policy (RAP_REGION_FILTER, POLICY_KIND = ROW_ACCESS), which determines at query time which rows each user is allowed to see; this is why BI users in different regions receive different subsets of rows. The second entry has POLICY_KIND = MASKING and is attached to the email column, which is why email addresses appear masked for some users. Both policies are marked as active, so Snowflake enforces them automatically on every query against the table, with no changes required to the queries themselves.
Question 35
Topic: Data Sharing, Collaboration, and Marketplace
Which TWO statements about the privileges required to create and manage outbound data shares in Snowflake are NOT correct? (Select TWO.)
Options:
Correct answers: D and E
Explanation: Snowflake data sharing is controlled by a combination of global and object-level privileges. Outbound shares themselves are account-level objects, so creating them requires a global privilege. Managing which tables and views are included in a share depends on the role having the necessary privileges on those underlying objects. Finally, like other Snowflake objects, shares are governed by OWNERSHIP when it comes to altering or dropping them.
The global CREATE SHARE privilege controls who can create outbound shares in an account. Once a share exists, the role that owns it (has OWNERSHIP) can modify it and grant ownership if needed. To add a table or secure view to a share, the role must be able to query that object, which means having SELECT and the required USAGE privileges on the database and schema. Snowflake allows direct sharing of tables as well as secure views and certain other objects. There is no special “share table” privilege; the combination of CREATE SHARE and the underlying object privileges is sufficient.
Question 36
Topic: Data Loading, Unloading, and Transformation
Which TWO statements about unloading data from Snowflake using COPY INTO and controlling the number/size of output files are TRUE? (Select TWO.)
Options:
Correct answers: C and E
Explanation: When unloading data with COPY INTO, Snowflake can either produce a single output file or multiple files. A single large file is often easier to manage and move, especially for tools or processes that expect exactly one file. However, a single file limits parallel reading, because only one consumer can process it at a time.
For analytics workloads, multiple moderately sized files are usually better. They allow downstream systems, such as parallel ETL jobs or distributed compute engines, to read separate files concurrently, improving throughput. However, going to the other extreme and creating many tiny files is generally a bad idea, because it increases metadata/listing overhead and can slow down directory scans and downstream processing.
Snowflake provides COPY INTO options to control whether one file or multiple files are written, and to influence the approximate maximum file size when using multiple files. This allows you to choose between operational simplicity (one big file) and parallelism (multiple medium-sized files) based on your use case.
Question 37
Topic: Account Setup, Security, and Governance
Your team is deciding which system role should be used for day-to-day creation of users and management of masking policies. Review the excerpt shown in the exhibit.
Based on the exhibit, which role is the most appropriate choice for these user and security policy tasks? Options:
Best answer: B Explanation: The exhibit shows how different system roles in the Snowflake AI Data Cloud are granted specific account-level privileges. The role chosen for these tasks should cover exactly what they require: creating users and roles and managing masking policies. In contrast, broader roles carry account-wide powers that routine security administration does not need, which violates the principle of least privilege. Therefore, for day-to-day user onboarding (creating users and roles) and applying column-level masking policies, SECURITYADMIN is the most appropriate and least-privileged role that matches the requirements shown in the exhibit. Question 38. Topic: Account Setup, Security, and Governance. A team is building a non-interactive integration that connects a programmatic client to Snowflake using key pair authentication. Which of the following practices should they AVOID when configuring and managing the keys? (Select TWO.) Options:
Correct answers: A and C Explanation: Key pair authentication for programmatic Snowflake clients is a non-interactive mechanism where a client proves its identity by signing a challenge with a private key. The corresponding public key is registered with the Snowflake user. Snowflake verifies the signature using the stored public key, allowing the client to authenticate without passwords or manual input. Because the private key is effectively the “password” for the integration, it must be protected carefully. Sound practices include keeping the private key strictly on the client side, associating only the public key with the Snowflake user, protecting the private key with a passphrase, storing it in a secure location such as a secrets manager or protected filesystem path, restricting access to the application identity, and rotating key pairs regularly. Anti-patterns include exposing the private key in shared environments (such as code repositories) and reusing the same key pair across multiple service accounts or applications. These patterns violate basic security principles like least privilege and strong credential isolation, and they significantly increase risk if a key is ever compromised. Question 39. Topic: Data Loading, Unloading, and Transformation. A data engineer is explaining how the Snowflake result cache works to a BI team so they can design efficient dashboards. Which TWO statements about result cache behavior are INCORRECT and should be avoided in design discussions? (Select TWO.) Options:
Correct answers: E and F Explanation: Snowflake’s result cache stores the final result sets of queries in the cloud services layer. Its purpose is to speed up repeated executions of the same query when the underlying data has not changed. For a query to reuse a cached result, several key conditions must hold:
- The new query must be syntactically identical to the previously executed query.
- The underlying table data contributing to the result must not have changed.
- The query must not depend on values that can differ between runs (for example, non-deterministic functions or external functions).
- The executing role must hold the privileges required on the tables the query references.
- Result reuse must not be disabled (for example, via the USE_CACHED_RESULT session parameter), and the cached result must still be within its retention window.
Result cache is shared across virtual warehouses within the same account, because it is stored in the cloud services layer, not on individual warehouses. It is a best-effort performance optimization: applications must always be prepared for a query to run fully if the cache is not available or has been invalidated. The incorrect statements in this question either overstate when caching applies based on “similar” queries, or mistakenly tie cache reuse to a specific warehouse, both of which are misunderstandings of how the Snowflake result cache works. Question 40. Topic: Operations, Monitoring, and Business Continuity. Which statement BEST describes how encryption key management works in the Snowflake AI Data Cloud? Options:
Best answer: D Explanation: Snowflake provides built-in, end-to-end encryption for data at rest and in transit. It uses a hierarchical key architecture, where higher-level keys protect lower-level keys, which in turn protect the data. As part of this design, Snowflake automatically rotates keys on a regular schedule and when certain security events occur, reducing operational burden on customers and improving security. Customers do not need to run their own scripts or jobs to perform basic key rotation for Snowflake-managed keys. While customers can optionally bring or control keys via supported key management integrations, the core behavior of Snowflake itself is to manage and rotate the encryption key hierarchy automatically. Keys are never stored in clear text, and security does not rely solely on RBAC permissions on metadata objects. Question 41. Topic: Query Processing and Performance Optimization. A team wants to ingest application logs into Snowflake with deeply nested attributes and frequently changing fields. They want to load the data directly into a VARIANT column with minimal pre-processing while preserving structure. Which file format is MOST appropriate for the source files? Options:
Best answer: B Explanation: For semi-structured, nested log data that changes frequently, the priority is to ingest records with minimal transformation and preserve their structure. In Snowflake, this aligns best with loading into a VARIANT column from a semi-structured format that matches VARIANT’s internal representation. JSON is the most natural fit. Snowflake natively parses JSON into VARIANT, automatically handling nested objects and arrays. New or missing fields across records do not require altering table schemas, which is ideal for log data where attributes can evolve over time. Flat formats like CSV require you to define columns and often to pre-flatten or encode nested content, which conflicts with the goal of minimal pre-processing. While Parquet is efficient and supports nested data, it is typically used with a more stable schema. XML can represent hierarchy, but JSON is better aligned with Snowflake’s common practices and tooling for loading semi-structured logs into VARIANT. Question 42. Topic: Query Processing and Performance Optimization. A downstream application can only read data files from an existing external cloud storage bucket. You must deliver a daily export of a Snowflake table while minimizing extra data copy steps outside Snowflake. Which approach using COPY INTO is most appropriate? Options:
Best answer: D Explanation: To unload data from a Snowflake table to external cloud storage with minimal extra data movement, you should use COPY INTO <location> targeting a named external stage that points at the bucket. An external named stage stores the connection information (URL, credentials, encryption settings) for the external bucket. When you run COPY INTO against that stage, Snowflake writes the output files directly to the external bucket, so no intermediate download or re-upload step outside Snowflake is needed. Other approaches either add extra data transfer steps outside Snowflake or use features designed for different directions of data flow (reading external data into Snowflake rather than exporting it). Question 43. Topic: Query Processing and Performance Optimization. An analyst needs to load several CSV files stored on their laptop into a Snowflake table. They must first move the files into an internal stage and then check for formatting issues before actually loading the table. Which of the following actions/solutions will meet these requirements? (Select TWO.) Options:
Correct answers: A and B Explanation: To load files from a local environment into Snowflake, the files must first be placed in a stage that Snowflake can read, typically an internal user or named stage. The PUT command in SnowSQL is specifically designed to upload files from the local file system to an internal stage. Once files are staged, you can check data quality and formatting before actually loading the table. The VALIDATE capability (used with staged files and a file format) allows you to evaluate whether the data would load cleanly, surfacing parsing or conversion issues without committing bad data to the target table.
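The stage-then-validate flow described above can be sketched as follows; the stage, table, and file names are illustrative placeholders, not taken from the question:

```sql
-- Upload local CSV files to a named internal stage
-- (PUT is issued from a client such as SnowSQL, not from a worksheet).
PUT file:///tmp/exports/*.csv @my_int_stage AUTO_COMPRESS = TRUE;

-- Dry run: report parsing/conversion errors for the staged files
-- without actually loading any rows into the target table.
COPY INTO my_table
  FROM @my_int_stage
  FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)
  VALIDATION_MODE = RETURN_ERRORS;

-- If the dry run comes back clean, perform the actual load.
COPY INTO my_table
  FROM @my_int_stage
  FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1);
```

Running the COPY with VALIDATION_MODE first is what lets the analyst surface formatting issues before committing anything to the table.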
Question 44. Topic: Data Loading, Unloading, and Transformation. A nightly ETL job runs on a Medium virtual warehouse (4 credits/hour) and takes 2 hours (8 credits). The job scans about 15TB but returns only 3% of rows. Estimates show:
- With Query Acceleration Service enabled on the Medium warehouse, the job finishes in about 30 minutes, with roughly 2 additional QAS credits.
- On a Large warehouse (8 credits/hour), the job finishes in about 1 hour.
The team must reduce runtime below 45 minutes while minimizing total credits per run (round to whole credits). Which option is the most appropriate? Options:
Best answer: D Explanation: Query Acceleration Service (QAS) uses additional serverless compute to offload parts of large, selective scans, speeding up queries that scan a lot of data but return few rows, without permanently resizing the warehouse. In this scenario, the ETL scans about 15TB and returns only 3% of rows, which is exactly the kind of workload QAS is designed to accelerate. With QAS, the Medium warehouse finishes in 30 minutes: 30 minutes at 4 credits/hour uses about 2 credits, plus 2 credits for QAS, for about 4 credits total per run. Scaling to a Large warehouse uses 1 hour at 8 credits/hour, or 8 credits per run, which is both slower and more expensive than Medium with QAS. Therefore, enabling QAS on the existing Medium warehouse best meets the runtime requirement while minimizing credits and directly targets the large, selective scans in the ETL job. Question 45. Topic: Query Processing and Performance Optimization. A data engineering team is building an ingestion process for clickstream data in the Snowflake AI Data Cloud. Compressed JSON log files land in an external stage every 5 minutes, with about 5 million rows per hour. They need to efficiently append all new data each hour, occasionally correct or backfill a few individual rows, and must avoid replacing the entire target table contents during hourly loads. Which loading pattern is the most appropriate for this scenario? Options:
Best answer: B Explanation: In Snowflake, different commands are optimized for different loading use cases. For high-volume ingestion from files already stored in an internal or external stage, COPY INTO <table> is the purpose-built bulk-load command: it loads staged files in parallel and appends the rows to the existing table. By contrast, approaches that replace the full table contents (such as recreating the table on each run) discard existing data, row-by-row INSERT statements are inefficient at this volume, and targeted DML such as UPDATE or MERGE is best reserved for small, occasional corrections rather than bulk ingestion.
In this scenario, the team needs to append millions of new rows from staged files hourly and occasionally correct a few rows, without replacing the full table. The best fit is to use COPY INTO <table> for the hourly bulk appends, combined with targeted DML statements (such as UPDATE or MERGE) for the occasional row-level corrections. Question 46. Topic: Query Processing and Performance Optimization. A data engineering team loads CSV files from multiple sources into Snowflake. The architect wants consistent parsing rules and centralized governance of load settings. Which practice is NOT recommended for managing file formats in this scenario? Options:
Best answer: B Explanation: Snowflake encourages defining reusable FILE FORMAT objects (created once with CREATE FILE FORMAT) and referencing them by name in COPY INTO statements, so that parsing rules live in one governed place. Defining full file format settings inline for each COPY statement duplicates configuration and makes it difficult to ensure all pipelines use the same rules. It also complicates audits and change control because there is no single authoritative definition of how files are parsed. Question 47. Topic: Account Setup, Security, and Governance. Several analysts report they must log in separately to Snowsight and a BI tool that connects to the same Snowflake account. Security also wants centralized password/MFA management. The account currently uses only local Snowflake users with username/password. Which change is the most appropriate? Options:
Best answer: D Explanation: The scenario describes two main requirements: improving user experience by avoiding multiple logins and improving security by centralizing password and MFA management. Local Snowflake users with separate username/password credentials do not meet these needs because each application maintains its own login and policy experience. Configuring single sign-on (SSO) with an external identity provider (IdP) such as an enterprise directory service allows users to authenticate once with their corporate identity and use that session to access multiple applications, including Snowflake via Snowsight and third-party BI tools. The IdP centrally enforces password rules, MFA, and account lifecycle, while Snowflake delegates authentication to it. This both simplifies user access and strengthens governance. Other options either weaken security (relaxing password policies, shared accounts) or change the authentication mechanism without providing single sign-on or centralized management (key-pair authentication alone). Therefore, enabling SSO with an external IdP is best aligned with the stated goals. Question 48. Topic: Data Sharing, Collaboration, and Marketplace. A data engineering team is considering different ways to use CREATE TABLE ... AS SELECT (CTAS) statements. Which of the following CTAS usage patterns should you AVOID? (Select TWO.) Options:
Correct answers: C and D Explanation:
Good CTAS usage patterns focus on reuse, performance, and clear lifecycle management. Poor patterns use CTAS for one-off exploratory work or to create unnecessary copies of the same data for each team, which drives up storage cost and complicates governance. In Snowflake, zero-copy features like views, secure views, and data sharing are preferred over duplicating tables whenever possible. Question 49. Topic: Query Processing and Performance Optimization. Which statement BEST describes how external tables allow Snowflake to work with data files that remain in external cloud storage? Options:
Best answer: C Explanation: An external table maps data files that live in external cloud storage to a read-only, table-like object in Snowflake, so the files can be queried in place with SQL. This allows you to query external data without fully loading it into Snowflake-managed storage, which is useful for exploratory analysis, data lakes, or situations where data must remain in external storage for governance or cost reasons. Question 50. Topic: Data Sharing, Collaboration, and Marketplace. A data team is ingesting JSON event payloads that frequently add or remove attributes. They want to minimize schema management effort while still being able to query individual attributes in SQL. Which description of storing the events in a VARIANT column is most accurate for this goal? Options:
Best answer: B Explanation: VARIANT is Snowflake’s native data type for semi-structured data such as JSON, Avro, Parquet, and others. Conceptually, it stores each semi-structured value as a single column value while preserving its hierarchical structure (objects, arrays, nested fields). When JSON is loaded into a VARIANT column, Snowflake understands the internal structure and types of the data. You can access nested attributes using JSON path expressions (for example, payload:user.id, optionally cast with ::string), navigating nested objects and arrays directly in SQL. This makes VARIANT ideal when event payloads or documents evolve over time. New keys can appear or old keys can disappear in the JSON without requiring ALTER TABLE statements to add or drop columns. In the scenario, the deciding factor is minimizing schema management effort while keeping the data queryable. Only the description that highlights storing the full JSON document in VARIANT and querying it via path expressions, without predefining columns, aligns with that goal and with how VARIANT conceptually works in Snowflake. Questions 51-75. Question 51. Topic: Account Setup, Security, and Governance. A company already uses a corporate identity provider with single sign-on and MFA for all internal applications. They are onboarding 300 analysts to the Snowflake AI Data Cloud. Security requires: 1) users must not manage separate Snowflake passwords, 2) account access must be revoked centrally when employees leave, and 3) MFA rules must remain consistent with other apps. Which authentication approach is the most appropriate for Snowflake? Options:
Best answer: D Explanation: Single sign-on in Snowflake allows organizations to delegate authentication to an external identity provider. Users authenticate once to the corporate SSO, which can enforce MFA and other security policies, and then access Snowflake without maintaining separate credentials. This delivers a better user experience and centralizes lifecycle management: disabling a user in the identity provider also cuts off Snowflake access. In this scenario, the company wants to avoid separate Snowflake passwords, enforce the same MFA policies used elsewhere, and centrally revoke access when employees leave. Integrating Snowflake with the existing corporate SSO provider and requiring users to log in through SSO directly addresses these requirements. Username/password authentication managed directly in Snowflake can be secure, but it creates another credential store for users to manage and for administrators to govern. Key pair authentication is powerful for automation and service principals but is not ideal as the primary method for large numbers of interactive users. Shared accounts are an anti-pattern because they undermine auditability and user-level access control. Therefore, SSO integration is the best fit because it combines strong security with simpler user experience and centralized governance. Question 52. Topic: Data Loading, Unloading, and Transformation. A data engineering team runs occasional ad-hoc queries that filter on a highly selective predicate over a 15 TB fact table. Concurrency is low, but these queries still take several minutes to complete even on a large virtual warehouse. They want to speed up these large, selective scans without permanently increasing warehouse size or redesigning the schema. Which action is the most appropriate in Snowflake? Options:
Best answer: B Explanation: The scenario describes low-concurrency, ad-hoc queries that perform large but highly selective scans on a very large table. The existing virtual warehouse is already large, so simply adding more permanent compute is undesirable. The team wants to improve performance for these specific scan-heavy queries without a schema redesign. Snowflake’s Query Acceleration Service (QAS) is designed for exactly this situation: it can offload parts of large, selective scans from the main virtual warehouse to additional serverless compute resources. This helps reduce query latency for such workloads while allowing you to keep the base warehouse size stable. You pay only for the extra serverless compute consumed when acceleration is used. Other options like multi-cluster warehouses or large materialized views either focus on concurrency or change the data model and incur ongoing maintenance costs. They do not directly match the requirement: speeding up large, selective scans without permanently increasing warehouse size or restructuring the schema. Question 53. Topic: Account Setup, Security, and Governance. A security architect is reviewing how users connect from laptops and BI tools over the public internet to the Snowflake AI Data Cloud. The architect’s top priority is to prevent eavesdropping on queries and results in transit. Which statement BEST addresses this requirement? Options:
Best answer: C Explanation: The scenario is specifically about protecting query and result data while it moves between clients (such as laptops and BI tools) and the Snowflake AI Data Cloud over the public internet. The key security control for this is encryption in transit. Snowflake enforces TLS (Transport Layer Security) for client connections, such as those using JDBC, ODBC, or HTTPS through Snowsight. TLS establishes an encrypted tunnel between the client and Snowflake so that any data sent across the network (queries, results, credentials) is unreadable to anyone who might intercept the traffic. Other security features like encryption at rest, MFA, and network policies are important for overall security, but they do not directly provide the confidentiality of the actual bytes traveling across the network. Only the choice that explicitly calls out TLS-encrypted channels matches the architect’s stated priority of preventing eavesdropping on data in transit. Question 54. Topic: Data Loading, Unloading, and Transformation. You are reviewing a slow query in Snowsight. In the Query Profile, several operators show spill indicators. Which TWO interpretations or actions related to these spill indicators are INCORRECT and should be AVOIDED? (Select TWO.) Options:
Correct answers: C and D Explanation: In Snowflake’s Query Profile, spill indicators such as “spilled to local storage” or “spilled to remote storage” show where operators exceeded available memory and had to write intermediate data out of memory. This usually degrades performance, especially when spilling to remote storage, because reading and writing to storage is slower than in-memory operations. A good tuning approach is to look at where spilling occurs (using node tooltips and metrics like bytes spilled), then decide whether to resize the virtual warehouse, adjust query logic (for example, reduce data volume earlier, optimize joins/aggregations), or accept the cost if it is expected and infrequent. Simply ignoring spill markers because the query “works” is an anti-pattern that can lead to consistently slow and more expensive workloads. Question 55. Topic: Operations, Monitoring, and Business Continuity. A data engineering team uses zero-copy cloning to create test databases from production and relies on Time Travel for recovery. They want to understand storage impact. Which of the following statements about zero-copy clones and Time Travel is INCORRECT? Options:
Best answer: D Explanation: Zero-copy cloning in the Snowflake AI Data Cloud is a metadata operation at the moment of clone creation. The clone and the source reference the same underlying micro-partitions, so the initial storage overhead is minimal, even for large databases or tables. However, this low storage overhead is not permanent. As soon as data starts changing in either the source or the clone, Snowflake writes new micro-partitions for those changes. The original micro-partitions are still retained for any objects that reference them and for their Time Travel and Fail-safe periods, while the new micro-partitions add to total storage. This means storage consumption grows as the datasets diverge. Time Travel and Fail-safe also affect how long micro-partitions are kept. Dropping a clone does not instantly remove micro-partitions if those data blocks are still needed to satisfy the Time Travel or Fail-safe window of the source (or other clones). Eventually, when no objects and no retention windows reference a micro-partition, Snowflake can reclaim that storage. Therefore, it is wrong to assume that clones will never create additional storage cost. Cloning is very efficient initially, but ongoing changes in either the source or the clone will increase storage usage over time. Question 56. Topic: Query Processing and Performance Optimization. A data engineer loads newline-delimited JSON files from an internal stage using COPY INTO with a CSV file format, and the load fails with JSON parse errors. What is the most appropriate fix? Options:
Best answer: A Explanation: The load is failing with JSON parse errors while using a CSV file format for newline-delimited JSON data. Snowflake supports multiple file formats for loading, including CSV for delimited text and JSON for semi-structured data. When loading JSON into a VARIANT column, the file format must be compatible with JSON so that Snowflake can parse each line as a separate JSON document. Using a CSV file format forces Snowflake to interpret the data as delimited text, which does not match the structure of JSON and leads to parsing errors. The simplest and most appropriate fix is to define or use an existing file format with TYPE = JSON and reference it in the COPY INTO statement. Question 57. Topic: Account Setup, Security, and Governance. A company wants strict separation of duties in its Snowflake AI Data Cloud account: one team manages databases, schemas, and tables, while a separate security team manages users, roles, and access policies. Which system role should the security team primarily use to administer access control? Options:
Best answer: D Explanation: In Snowflake’s role-based access control model, system roles are designed around separation of duties. SECURITYADMIN is intended for security administration tasks such as managing users, roles, and their grants. This allows organizations to keep identity and access management separate from day-to-day data and object administration. In the scenario, one team is responsible for objects (databases, schemas, tables) and another team is responsible for controlling who can access those objects and how. The security team therefore needs a role whose primary purpose is access control administration, not object ownership or account-wide governance. SECURITYADMIN is exactly that role. By giving the security team SECURITYADMIN, they can create and manage users, create and manage roles, and grant or revoke privileges to roles and users. Meanwhile, a different role such as SYSADMIN can focus on creating and owning objects, preserving a clean separation of responsibilities and supporting least-privilege access. Question 58. Topic: Query Processing and Performance Optimization. A data engineering team must both load CSV files from a stage into Snowflake tables and periodically export query results back to external cloud storage. Which of the following ways of using the COPY command is INCORRECT for these requirements? Options:
Best answer: B Explanation: Snowflake provides two main COPY patterns: COPY INTO <table> loads data from staged files into a table, and COPY INTO <location> unloads table data or query results to a stage. To ingest data, you point COPY INTO <table> at the staged files with an appropriate file format; to export, you run COPY INTO <location> with the table or query as the source. The incorrect choice is the one that uses the command in the wrong direction, attempting to use the table-loading form of COPY to write results out to external storage rather than the unloading form. Question 59. Topic: Account Setup, Security, and Governance. Security logs show many failed login attempts to your Snowflake account from unfamiliar IP addresses. RBAC is correctly configured and all users authenticate through SSO. You must further reduce the attack surface without changing roles. Which action is most appropriate? Options:
Best answer: D Explanation: The scenario describes many failed login attempts coming from unfamiliar IP addresses. Role-based access control is already configured correctly, and users authenticate using SSO, so identity and privileges are in reasonable shape. The remaining risk is the network attack surface: anyone on the internet can still attempt to reach the Snowflake account. Snowflake network policies are designed exactly for this: they restrict which client IP addresses are allowed to establish a connection to the account (or to specific users). By defining an account-level network policy that allows only trusted corporate VPN and office address ranges, you block all access attempts from untrusted locations before authentication and RBAC even come into play. This directly addresses the symptom while leaving existing role design and SSO configuration unchanged. Question 60. Topic: Account Setup, Security, and Governance. A data engineer created the following secure view so the ANALYST role can query customer data. Security now requires that ANALYST must not see SSN values. Using only the information in the exhibit, what is the most appropriate next step? Options:
Best answer: A Explanation: The exhibit shows a secure view that still exposes the SSN column to the ANALYST role along with the other customer columns. To use secure views for governance, you typically restrict what underlying data is exposed in the view definition. In this case, the view should be redefined so that sensitive columns, such as SSN, are excluded from (or masked in) its SELECT list. Secure views enhance governance by protecting both data returned and certain metadata in shared or multi-tenant scenarios, but they still return exactly the columns defined in the SELECT list. Therefore, the core governance action is to carefully design the view definition to include only the permitted columns or rows for each consuming role. Question 61. Topic: Snowflake Architecture and Key Features. A team is designing storage for a very large event-tracking dataset in the Snowflake AI Data Cloud. They want to minimize storage costs while keeping good query performance. Which of the following design choices is NOT appropriate for this goal? Options:
Best answer: A Explanation: For large Snowflake tables, you need to balance storage cost and query performance by choosing appropriate table types, retention settings, and physical design features like clustering. Data that is easily re-created should not be stored with expensive, long-term protection, whereas critical analytical data might justify extra storage overhead to achieve better performance. Intermediate staging data is a classic example of content that can be regenerated from raw files. Keeping that data in permanent tables with long Time Travel significantly increases retained micro-partitions and overall storage cost, but rarely improves business value. Instead, transient tables with short retention are typically sufficient. In contrast, investing in clustering on frequently filtered columns of a large fact table is often worthwhile because it can dramatically reduce scan volumes and improve performance, even if it adds some storage overhead for maintenance. Storing raw data externally and loading only curated subsets into Snowflake further reduces storage consumption on the platform, which can be appropriate when raw data does not need to be queried interactively inside Snowflake. Question 62. Topic: Account Setup, Security, and Governance. Which statement best describes the OWNERSHIP privilege on a Snowflake object such as a table? Options:
Best answer: A Explanation: In Snowflake, the OWNERSHIP privilege represents full administrative control over a specific object, such as a table, view, or schema. The role that owns an object can modify its structure or properties, drop it, and grant or revoke any other privileges on that object to roles. Only one role at a time can own a given object, and changing ownership transfers this control. By contrast, data access privileges like SELECT, INSERT, UPDATE, and DELETE focus on what a role can do with the data inside an object, not on managing the object itself. Similarly, USAGE on containers or compute objects controls whether a role can reference or use those objects, but not administer them fully. Question 63. Topic: Snowflake Architecture and Key Features. A Snowflake account has a customer table whose sensitive columns (EMAIL, PHONE, SSN) must be masked for some roles, without creating additional physical copies of the data and without delegating enforcement to external tools. Which approach is most appropriate? Options:
Best answer: D Explanation: The scenario requires role-based masking of sensitive columns (EMAIL, PHONE, SSN) without creating additional physical copies of the data. Governance should be enforced within Snowflake, not delegated to external tools. Secure views in Snowflake are designed for protecting sensitive data. They only expose defined columns and expressions, hide the underlying table structure from consumers, and prevent certain information about the underlying data from appearing in query history or explain plans. When combined with role-aware masking logic (for example, masking expressions or policies that inspect CURRENT_ROLE()), a single secure view can return masked or unmasked values depending on the querying role. By granting analysts access only to the secure view and restricting access to the base table to a higher-privileged administrative or engineering role, you centralize and harden data access control while avoiding data duplication and extra pipelines. Question 64. Topic: Account Setup, Security, and Governance. Your team is standardizing how it monitors and troubleshoots Snowflake COPY-based load and unload jobs. Which of the following approaches should you AVOID when designing this monitoring strategy? (Select TWO.) Options:
Correct answers: A and E Explanation: Snowflake provides rich metadata about load and unload operations through INFORMATION_SCHEMA and SNOWFLAKE.ACCOUNT_USAGE views such as COPY_HISTORY and load/unload history views. These views expose information like execution timestamps, status, number of rows loaded or unloaded, affected tables, file names, and error indicators. Effective monitoring and troubleshooting strategies read from these views to understand what happened during COPY operations, rather than making destructive changes or relying solely on external systems. Good practices include focusing queries on relevant tables and time windows, using account-wide views for centralized monitoring, and persisting key metrics to internal monitoring tables. The approaches to avoid are those that ignore this metadata (for example, relying only on storage logs) or that change or delete data before understanding the cause of failures (for example, deleting stage files and blindly rerunning jobs). Question 65. Topic: Account Setup, Security, and Governance. A data engineer is investigating why a role cannot successfully query a particular table. They query the grants held by the role and review the excerpt shown in the exhibit.
Based on this exhibit, which privilege change is MOST likely required so the role can query the table? Options:
Best answer: D Explanation: The exhibit is a filtered view of the privileges currently granted to the role.
To successfully run a query against a table, a role generally needs all of the following: (1) USAGE on the database containing the table, (2) USAGE on the schema, (3) an appropriate privilege on the table itself, such as SELECT, and (4) USAGE on a virtual warehouse to execute the query.
The exhibit confirms that items 1, 2, and 4 are already satisfied. The missing piece is the table-level privilege on the target table itself. Other proposed changes either duplicate existing privileges shown in the exhibit (such as the USAGE grants already in place) or grant broader access than the scenario requires. Question 66. Topic: Snowflake Architecture and Key Features. A data team uses the Snowflake AI Data Cloud. Analysts currently download CSVs from Snowsight to build ad hoc spreadsheets. Leadership wants simple, interactive “what-if” data apps that run directly on Snowflake data, reuse existing warehouses, and avoid managing any new infrastructure. Which approach is most appropriate? Options:
Best answer: B Explanation: Streamlit in Snowflake is designed for quickly building interactive data applications that run directly inside the Snowflake AI Data Cloud. Developers can use Python and Streamlit APIs to create UI elements, bind them to queries on Snowflake tables or views, and render interactive visualizations. These apps execute within Snowflake, using existing virtual warehouses and respecting Snowflake roles and privileges. In this scenario, leadership wants interactive “what-if” applications that:
Streamlit in Snowflake matches all of these goals. It keeps data, logic, and execution inside Snowflake, reducing data movement and operational overhead while enabling richer interactions than plain SQL worksheets. Other options either remain at the SQL/worksheet level, require external infrastructure, or move data out of Snowflake into other tools, so they do not fully satisfy the requirements.

Question 67
Topic: Data Loading, Unloading, and Transformation

A company uses a single, always-on Large virtual warehouse for BI dashboards that are only used during an 8-hour business day. Storage usage and Time Travel retention are acceptable, but monthly Snowflake credit consumption is high. They want to reduce costs without changing data retention or dashboard performance. Which action is the most appropriate?

Options:
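While weighing the options, it may help to see what the relevant warehouse settings look like; a minimal sketch, assuming a hypothetical warehouse named BI_WH and a 5-minute timeout:

```sql
-- Suspend the BI warehouse after 5 minutes of inactivity and let it
-- restart automatically when the next dashboard query arrives.
ALTER WAREHOUSE bi_wh SET
  AUTO_SUSPEND = 300    -- seconds of idle time before suspending
  AUTO_RESUME  = TRUE;  -- resume transparently on the next query
```

Neither setting touches Time Travel retention or table storage; they only control when compute is billed.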
Best answer: C

Explanation: Snowflake costs are primarily driven by three components: storage, virtual warehouse compute, and cloud services. In this scenario, the concern is high monthly credit consumption while the BI dashboards only run during an 8-hour business day. That points directly to virtual warehouse compute being billed even when there is no workload. Virtual warehouses consume credits while they are running, regardless of whether they are actively processing queries. A common optimization is to enable auto-suspend so that a warehouse shuts down after a short period of inactivity, and auto-resume so it automatically starts again when a new query arrives. This reduces idle compute cost without changing data retention settings or noticeably affecting query performance, assuming reasonable resume times. By contrast, adjusting Time Travel mainly affects storage consumption, not ongoing compute credits. Moving data to external storage changes where data is stored, not how long the warehouse runs. Disabling the result cache and shrinking the warehouse often increases compute work and can degrade performance, which conflicts with the stated requirement to keep dashboard performance unchanged. Therefore, configuring auto-suspend and auto-resume on the BI warehouse is the best Snowflake-native optimization that targets the main cost driver in this scenario: idle virtual warehouse compute.

Question 68
Topic: Account Setup, Security, and Governance

Which TWO statements describe how secure views in Snowflake support data governance by hiding sensitive data from certain roles? (Select TWO.)

Options:
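As a syntax refresher while considering the statements, here is a hedged sketch of a secure view that omits sensitive columns and filters rows by the querying role (all object names are hypothetical):

```sql
-- Expose only non-sensitive columns, with rows limited by an
-- entitlement table keyed on the current role.
CREATE SECURE VIEW sales_gov AS
SELECT s.order_id, s.region, s.amount      -- PII columns omitted
FROM sales s
JOIN region_entitlements e
  ON  e.region    = s.region
  AND e.role_name = CURRENT_ROLE();        -- session-context row filter

GRANT SELECT ON VIEW sales_gov TO ROLE analyst;  -- grants no table access
```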
Correct answers: C and E

Explanation: Secure views in the Snowflake AI Data Cloud are an important governance tool for controlling how data is exposed to different roles. They behave like regular views in that they can select specific columns and filter rows, but they add stronger protections for secure sharing and for hiding underlying objects and logic. From a governance perspective, secure views let you define an approved, filtered projection of a table and then grant roles access only to that view. By omitting sensitive columns and embedding filters that depend on session context (for example, using CURRENT_ROLE() or CURRENT_USER()), you can ensure that users only see the data they are permitted to see, even if the underlying table contains more detailed or sensitive information. Secure views do not change Snowflake’s privilege model: granting SELECT on a secure view does not automatically grant access to the underlying table, and they are not the only way to implement row-level security. Instead, they complement other mechanisms such as row access policies and role-based grants.

Question 69
Topic: Operations, Monitoring, and Business Continuity

A data provider uses the Snowflake AI Data Cloud to publish a curated analytics dataset to more than 20 external customers. Some customers already have their own Snowflake accounts; others do not use Snowflake at all. Requirements are:
- maintain a single governed copy of the dataset,
- let customers with Snowflake accounts query it from their own accounts and pay for their own compute, and
- give customers without Snowflake accounts a Snowflake UI and SQL access, with compute paid by the provider.
Which approach BEST meets these requirements using Snowflake data sharing capabilities?

Options:
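For reference, a hedged sketch of the provider-side statements for both customer groups (account, share, and object names are hypothetical):

```sql
-- Share a curated view with customers that have their own Snowflake accounts.
CREATE SHARE analytics_share;
GRANT USAGE  ON DATABASE analytics               TO SHARE analytics_share;
GRANT USAGE  ON SCHEMA   analytics.curated       TO SHARE analytics_share;
GRANT SELECT ON VIEW     analytics.curated.v_kpi TO SHARE analytics_share;
ALTER SHARE analytics_share ADD ACCOUNTS = myorg.customer_acct;

-- For customers without Snowflake: a provider-managed reader account,
-- whose compute is billed to the provider.
CREATE MANAGED ACCOUNT partner_reader
  ADMIN_NAME = 'partner_admin', ADMIN_PASSWORD = 'REPLACE_ME_123!',
  TYPE = READER;
```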
Best answer: A

Explanation: Snowflake data sharing is built around three key roles: the data provider (the account that owns and publishes data), consumer accounts (regular Snowflake accounts that receive shares), and reader accounts (provider-managed accounts for consumers who do not have their own Snowflake accounts). In this scenario, the provider needs to serve two distinct groups while keeping a single logical copy of the data. For customers who already have Snowflake accounts, the best pattern is to share data directly from the provider account to each customer’s consumer account. Those customers can then create their own roles, virtual warehouses, and views and pay for their own compute while querying the provider’s shared data. For customers without Snowflake accounts, reader accounts are designed specifically for this use case. The provider creates one or more reader accounts under its own Snowflake account, grants those reader accounts access to the shared objects, and creates warehouses in those reader accounts. The external users log in to the reader account’s Snowsight UI, run SQL on the shared data, and all compute costs are billed to the provider. The data itself remains stored once in the provider account and is accessed zero-copy by both consumer and reader accounts. This combination meets all requirements: existing Snowflake customers query within their own accounts and pay their own compute, non-Snowflake customers still get a Snowflake UI and SQL access with provider-managed compute, and the provider maintains a single governed dataset using Snowflake data sharing.

Question 70
Topic: Account Setup, Security, and Governance

A company uses a single Snowflake account with several shared schemas and a few virtual warehouses. They want to improve governance by classifying PII columns and allocating warehouse costs by department, using Snowflake object tags instead of spreadsheets. Which approach is the most appropriate?

Options:
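As a refresher on the tagging syntax involved here, a hedged sketch (tag, schema, and object names are hypothetical):

```sql
-- Define tags once in a governance schema.
CREATE TAG governance.tags.pii_type ALLOWED_VALUES 'EMAIL', 'PHONE', 'NONE';
CREATE TAG governance.tags.cost_center;

-- Classify a PII column and attribute a warehouse to a department.
ALTER TABLE crm.customers MODIFY COLUMN email
  SET TAG governance.tags.pii_type = 'EMAIL';
ALTER WAREHOUSE finance_wh
  SET TAG governance.tags.cost_center = 'FINANCE';

-- Report on tag assignments account-wide (ingestion latency applies).
SELECT * FROM SNOWFLAKE.ACCOUNT_USAGE.TAG_REFERENCES;
```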
Best answer: A

Explanation: Snowflake object tags allow you to attach structured metadata to objects such as databases, schemas, tables, columns, and virtual warehouses. This metadata can then be queried via Snowflake-provided views (for example, TAG_REFERENCES in SNOWFLAKE.ACCOUNT_USAGE). In this scenario, the company wants to classify PII at the column level for governance and associate compute costs with departments at the warehouse level. The best way is to create appropriate tags (for example, a PII-classification tag applied to sensitive columns and a cost-center tag applied to each warehouse) and report on them through the tag metadata views. This keeps the solution simple, Snowflake-native, and aligned with built-in governance and billing visibility features, without introducing complex tag-based policies or external tooling beyond standard reporting on tag metadata.

Question 71
Topic: Snowflake Architecture and Key Features

Which statement BEST describes Snowpark in the Snowflake AI Data Cloud and when you would use it instead of only writing SQL?

Options:
Best answer: C

Explanation: Snowpark is a developer framework that lets you build data processing and data pipeline logic in familiar programming languages while executing that logic inside Snowflake virtual warehouses. You typically choose Snowpark instead of pure SQL when the transformation or business logic is complex, better expressed as code, or needs to reuse existing application logic rather than being rewritten in SQL. The option describing Snowpark as a desktop visual query builder is the closest distractor, but it is incorrect because Snowpark is not a UI client at all; it is an API and execution model that runs within Snowflake, not a drag-and-drop SQL tool installed on a user’s machine.

Question 72
Topic: Account Setup, Security, and Governance

Which TWO statements about using INFORMATION_SCHEMA and ACCOUNT_USAGE views for data governance reporting in Snowflake are INCORRECT? (Select TWO.)

Options:
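The contrast between the two metadata sources is easiest to see side by side; a hedged sketch (database name and filter values are hypothetical):

```sql
-- INFORMATION_SCHEMA: near real-time, but scoped to one database.
SELECT table_name, created
FROM mydb.INFORMATION_SCHEMA.TABLES
WHERE table_schema = 'PUBLIC';

-- ACCOUNT_USAGE: account-wide and historical, but with documented
-- ingestion latency (up to a few hours for some views).
SELECT query_id, user_name, warehouse_name, start_time
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP());
```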
Correct answers: B and C

Explanation: Snowflake provides both INFORMATION_SCHEMA and ACCOUNT_USAGE views to support governance and auditing, but they differ in scope and latency. INFORMATION_SCHEMA is implemented as a schema within each database, exposing metadata only for that specific database or schema. Because it is tightly coupled with the database’s current state, it typically reflects changes like new objects or privilege grants with minimal delay, making it useful for near real-time checks in that scope. ACCOUNT_USAGE, exposed via the SNOWFLAKE database, aggregates metadata and usage information across the entire account: queries, objects, roles, grants, and more. This data is optimized for historical and cross-database reporting, but it is loaded asynchronously, resulting in a documented latency that can be up to several hours for some views. Therefore, it is ideal for long-term governance and audit reporting, but not for strict real-time monitoring. In governance designs, it is common to use INFORMATION_SCHEMA for immediate, narrow checks (per database) and ACCOUNT_USAGE for centralized, account-wide reporting, understanding that ACCOUNT_USAGE trades freshness for breadth and history.

Question 73
Topic: Account Setup, Security, and Governance

A Snowflake AI Data Cloud account currently uses shared username/password credentials in ETL scripts. Security wants stronger authentication for non-interactive workloads while simplifying credential rotation and avoiding MFA prompts in jobs. Which change BEST meets these goals?

Options:
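For reference while considering the options, a hedged sketch of the service-account pattern (user and role names are hypothetical; key values are truncated for illustration):

```sql
-- Dedicated service user; the private key never leaves the secrets manager.
CREATE USER etl_svc DEFAULT_ROLE = etl_role;
ALTER USER etl_svc SET RSA_PUBLIC_KEY = 'MIIBIjANBgkqh...';   -- truncated

-- Zero-downtime rotation: register the new key in the second slot,
-- switch clients over, then clear the old key.
ALTER USER etl_svc SET RSA_PUBLIC_KEY_2 = 'MIIBIjANBgkqh...'; -- truncated
ALTER USER etl_svc UNSET RSA_PUBLIC_KEY;
```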
Best answer: B

Explanation: For non-interactive workloads such as ETL scripts, Snowflake supports both password-based authentication and key pair authentication. Passwords are easy for humans to type but must be stored somewhere when used by scripts, which creates security and management challenges. Key pair authentication separates a public key stored in Snowflake from a private key kept by the client, allowing strong, non-interactive authentication without exposing a shared password. Using dedicated service accounts with key pair authentication lets teams avoid sharing user passwords across servers and scripts. Private keys can be stored in a central secrets manager and rotated in a controlled way. Because the authentication is based on possession of the private key, scripts do not need MFA prompts, yet security is strong if keys are protected and password login is disabled for those accounts. By contrast, longer passwords or adding MFA may help for human users, but they do not solve the underlying problem for automated jobs: credentials must still be stored and updated in each script. Sharing one technical account across many jobs also hinders auditing and increases risk. Similarly, distributing unprotected private keys for all humans would weaken security and be difficult to manage. The recommended pattern is to use key pair authentication for service accounts with proper key storage and rotation, while humans typically use SSO or password+MFA.

Question 74
Topic: Snowflake Architecture and Key Features

A data engineering team currently exports large tables from the Snowflake AI Data Cloud to an external processing cluster to run complex transformation code. They want to redesign the pipeline using the requirements shown below:
- R1: reuse general-purpose programming language skills,
- R2: keep both data and processing inside Snowflake,
- R3: build maintainable code-based pipelines, and
- R4: use DataFrame-style operations over Snowflake tables.
Based on the exhibit, which Snowflake feature is MOST appropriate to implement the data processing logic?

Options:
Best answer: C

Explanation: The exhibit highlights four key requirements: reuse of general-purpose programming language skills (R1), keeping both data and processing inside Snowflake (R2), building maintainable code-based pipelines (R3), and using DataFrame-style operations over Snowflake tables (R4). Snowpark is a Snowflake feature that lets developers work in familiar programming languages while pushing the execution of that code into Snowflake virtual warehouses. It exposes DataFrame-style APIs and integrates with Snowflake’s SQL engine so that the data never needs to leave Snowflake storage, directly addressing R1–R4. In contrast, designs that rely purely on SQL ignore the desire to reuse existing application language skills, and designs that rely on external clusters fundamentally violate the requirement to keep processing inside Snowflake to minimize data movement and operational overhead.

Question 75
Topic: Snowflake Architecture and Key Features

Which of the following statements about the responsibilities of the Snowflake cloud services layer is NOT correct?

Options:
Best answer: D

Explanation: Snowflake’s cloud services layer is responsible for coordinating and governing how the platform works, not for physically storing data or performing the low-level data processing itself. Core responsibilities include authentication and session management, metadata management, query parsing and optimization, and transaction coordination. The cloud services layer knows what data exists and how it is organized, and it decides how queries should run, but it does not perform the actual data scans or store the table data files. In contrast, the storage layer holds all table data in compressed micro-partitions on cloud object storage. Virtual warehouses (compute) perform the query execution and data scanning based on the plans produced by the cloud services layer.

Questions 76–100

Question 76
Topic: Account Setup, Security, and Governance

A global company is onboarding two semi-autonomous business units to the Snowflake AI Data Cloud. Each unit must have isolated administration of its own data, users, roles, and virtual warehouses, and usage must be billed and monitored separately per unit. However, the central data team needs to occasionally share curated dimensional data between the units in a governed way. Based on Snowflake’s logical architecture and isolation boundaries, what is the most appropriate design?

Options:
Best answer: C

Explanation: In Snowflake, the account is the top-level logical container for data, compute resources (virtual warehouses), users, roles, and most configuration. Accounts are strongly isolated from one another, with their own security boundary and billing context. When different business units need administrative and cost separation but still want to collaborate, using multiple accounts under a single Snowflake organization is the standard design. By giving each business unit its own account, each team can independently manage its own users, roles, databases, schemas, and virtual warehouses. Usage and costs are naturally separated at the account level, while the organization construct allows central governance and consolidated commercial management. When data must be shared across units, Snowflake’s secure data sharing allows providers to expose specific objects (like curated dimensional tables) to consumer accounts without copying data. The other options rely on logical isolation mechanisms within a single account (databases, roles, virtual warehouses, reader accounts), which do not change the fact that all security, metadata, and billing still reside in one account. That violates the requirement for isolated administration and separate billing per business unit.

Question 77
Topic: Data Loading, Unloading, and Transformation

A data engineer investigates a slow aggregation query. The Query Profile shows multiple operators with “spilling to remote storage.” The goal is to reduce runtime while keeping costs reasonable. Which of the following actions/solutions will meet these requirements? (Select TWO.)

Options:
Correct answers: D and E

Explanation: In Snowflake, the Query Profile can show that operators are “spilling to local storage” or “spilling to remote storage” when the working data set for that operation does not fit into the memory allocated on the virtual warehouse. When spilling occurs, Snowflake must write intermediate data to disk and later read it back, introducing extra I/O. Accessing disk (local or remote cloud storage) is significantly slower than working in memory, so repeated spills can noticeably slow a query. There are two main levers to improve performance in this situation: give the operation more memory by running it on a larger warehouse, or reduce the amount of intermediate data the query must hold at once (for example, by filtering earlier, aggregating sooner, or processing the data in smaller batches).
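One common first step is the warehouse-sizing lever; a hedged sketch, assuming a hypothetical warehouse named ETL_WH:

```sql
-- Resize up for the memory-hungry aggregation, then back down, so the
-- larger (more expensive) size is billed only while it is needed.
ALTER WAREHOUSE etl_wh SET WAREHOUSE_SIZE = 'LARGE';
-- ... run the slow aggregation query here ...
ALTER WAREHOUSE etl_wh SET WAREHOUSE_SIZE = 'SMALL';
```

Each size step roughly doubles the memory available to operators, which directly reduces the need to spill.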
The correct answers follow these principles. The incorrect options either target different problems (like concurrency or data recovery) or features that do not directly influence the memory usage that leads to spilling.

Question 78
Topic: Data Loading, Unloading, and Transformation

A team wants to speed up dashboard queries using materialized views in the Snowflake AI Data Cloud. Which of the following practices related to materialized views should you AVOID? (Select TWO.)

Options:
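For reference, a hedged sketch of a good candidate, a stable and frequently queried aggregate (table and column names are hypothetical):

```sql
-- Precompute a daily aggregate on a single, relatively stable table.
-- Snowflake maintains the results automatically as the base table changes.
CREATE MATERIALIZED VIEW daily_sales_mv AS
SELECT sale_date, region, SUM(amount) AS total_amount
FROM sales
GROUP BY sale_date, region;
```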
Correct answers: B and D

Explanation: Materialized views in Snowflake store precomputed results for a defined query, such as aggregates over a single base table (a Snowflake materialized view can reference only one table and cannot contain joins). Because the results are stored, queries against the materialized view often run much faster than recomputing the logic every time, especially for large tables and complex aggregations. However, materialized views are not free. Snowflake must maintain them whenever the base table changes. This introduces extra compute for refresh and additional storage for the persisted results. To use materialized views effectively, you want high read benefit relative to the cost of keeping them up to date. Creating materialized views on highly volatile tables, or creating many overlapping materialized views that compute similar results, increases refresh work and storage without a corresponding gain in query performance. Those are practices to avoid. In contrast, designing a small number of materialized views on stable, frequently queried data, especially for repeated aggregations, and periodically retiring unused views aligns with Snowflake best practices for performance and cost optimization.

Question 79
Topic: Data Sharing, Collaboration, and Marketplace

In Snowflake, which use case BEST justifies using CREATE TABLE ... AS SELECT (CTAS)?

Options:
Best answer: C

Explanation: CREATE TABLE ... AS SELECT (CTAS) creates a new, regular table populated with the result of a query. By writing the result set into storage as a regular table, later queries can read directly from that table instead of recomputing the logic every time. This can improve performance, reduce compute cost, and provide a stable snapshot of data for downstream workloads, such as reporting or further transformations. However, CTAS produces a static snapshot at the time it is run. It does not automatically stay in sync with source tables; for continuously updated or automatically maintained results, views or materialized views are more appropriate. CTAS also does not replace security mechanisms like row access policies or ingestion tools like Snowpipe.

Question 80
Topic: Snowflake Architecture and Key Features

Which TWO statements about Snowflake micro-partitions are NOT correct? (Select TWO.)

Options:
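As a refresher before checking the statements: the one physical-layout influence Snowflake does expose is a clustering key. A hedged sketch (table and column names are hypothetical):

```sql
-- Suggest how Snowflake should co-locate rows for better pruning
-- on a large, frequently filtered table.
ALTER TABLE events CLUSTER BY (event_date, customer_id);

-- Inspect how well the table is clustered on those columns.
SELECT SYSTEM$CLUSTERING_INFORMATION('events', '(event_date, customer_id)');
```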
Correct answers: C and E

Explanation: Snowflake micro-partitions are a core part of how the Snowflake AI Data Cloud organizes table data on storage. When data is loaded into a table, Snowflake automatically divides it into contiguous units called micro-partitions. Users do not define partition keys in the traditional database sense, nor do they manage physical files or run low-level maintenance jobs; these details are handled by Snowflake’s cloud services layer. Each micro-partition includes metadata, such as min/max values per column and counts of distinct values, which Snowflake’s optimizer uses to decide which micro-partitions to scan. This process, called partition pruning, reduces the amount of data read for many queries and is central to Snowflake’s performance. Although you cannot directly control micro-partition boundaries or file layout, you can influence how data is clustered using clustering keys on large, frequently filtered tables. Clustering keys help Snowflake organize new data so that related rows are more likely to fall into the same or nearby micro-partitions, improving pruning efficiency. Because Snowflake automatically manages micro-partitions and storage layout, you do not need (and are not able) to run manual operations like explicit partition rebuilds or VACUUM/ANALYZE. Thinking in terms of micro-partitions is useful conceptually, but daily operations focus on table design, clustering, and query patterns rather than physical partition management.

Question 81
Topic: Data Loading, Unloading, and Transformation

A Snowflake admin is investigating a cost spike on virtual warehouse WH_BI.
Assume “last 60 minutes” includes all queries that started within the last 60 minutes. In Snowsight’s Query History, what total credits should the admin see, and which filters should be applied to match that total?

Options:
Best answer: D

Explanation: To answer this, you first identify which rows match all three conditions: user ALICE, warehouse WH_BI, and started within the last 60 minutes. Only Q103 (40 minutes ago, 0.4 credits) and Q105 (10 minutes ago, 0.5 credits) satisfy all of these, and together they consume 0.9 credits. In Snowsight’s Query History, you can reproduce this subset by setting filters for the specific user (ALICE), virtual warehouse (WH_BI), and a time range of Last 60 minutes. Using all three filters together narrows the view to just the relevant queries so the credit total in the UI aligns with your manual calculation.

Question 82
Topic: Data Loading, Unloading, and Transformation

Each hour, a BI dashboard sees 30 concurrent queries for 20 minutes and 5 concurrent queries for 40 minutes. An X-Small warehouse cluster supports 10 concurrent queries and uses 1 credit/hour; a Large supports 40 concurrent queries and uses 4 credits/hour. Credits scale linearly with time. To avoid queuing and stay under 3 credits/hour, which setup is best?

Options:
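While working through the arithmetic, it may help to see what an elastic configuration looks like; a hedged sketch, assuming a hypothetical warehouse named BI_WH:

```sql
-- One X-Small cluster normally; up to three during the 30-query burst.
CREATE WAREHOUSE bi_wh
  WAREHOUSE_SIZE    = 'XSMALL'
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 3
  SCALING_POLICY    = 'STANDARD'  -- favor starting clusters over queuing
  AUTO_SUSPEND      = 60
  AUTO_RESUME       = TRUE;
-- Rough cost: 3 clusters x 1/3 hr + 1 cluster x 2/3 hr ≈ 1.67 credits/hour.
```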
Best answer: C

Explanation: The workload has a short period of high concurrency and a longer period of low concurrency. A good design must provide enough concurrent capacity during the 20-minute burst while minimizing credits when only a few queries run. With multi-cluster warehouses, Snowflake can spin up additional clusters during spikes and shut them down when demand drops, so you only pay for extra capacity while it is actually used. This matches the pattern of spiky BI dashboards much better than a single, always-on large warehouse. In the chosen configuration, three X-Small clusters handle the 30-query burst, and only one X-Small runs during the quieter 40 minutes. Three clusters for 20 minutes use about one credit, and one cluster for 40 minutes uses about two-thirds of a credit, for a total of about 1.67 credits, which is under the 3-credit target.

Question 83
Topic: Snowflake Architecture and Key Features

A team is migrating a 5 TB fact table into the Snowflake AI Data Cloud. They want good query performance while minimizing ongoing effort managing how data is physically partitioned or laid out on disk. Which approach best leverages Snowflake’s automatic micro-partitioning?

Options:
Best answer: B

Explanation: Snowflake automatically stores table data in micro-partitions, which are immutable storage units created and managed by the service as data is loaded. Users do not define physical partitions, indexes, or file layouts on disk. Instead, Snowflake’s cloud services layer tracks rich metadata about each micro-partition, such as value ranges and statistics, and uses this metadata to prune unneeded micro-partitions at query time. Because of this design, the simplest and most aligned pattern for most workloads is to load data into a standard Snowflake table and allow the platform to manage micro-partitions automatically. This meets the requirement to minimize ongoing effort managing physical layout while still benefiting from efficient query performance. Patterns like manual table sharding, trying to define physical partitions, or rearranging files in external storage all increase operational complexity and do not actually control how micro-partitions are created within Snowflake’s storage layer.

Question 84
Topic: Account Setup, Security, and Governance

In a secure Snowflake deployment, administrators ensure that the PUBLIC role has only minimal privileges (for example, no access to sensitive schemas and only basic usage on shared utilities). Which core security principle does this practice best represent?

Options:
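For reference, a hedged sketch of what keeping PUBLIC minimal can look like (schema and function names are hypothetical):

```sql
-- Ensure PUBLIC holds nothing sensitive...
REVOKE SELECT ON ALL TABLES IN SCHEMA finance.reporting FROM ROLE PUBLIC;

-- ...and grant only basic usage on shared utilities.
GRANT USAGE ON DATABASE utils                       TO ROLE PUBLIC;
GRANT USAGE ON SCHEMA   utils.common                TO ROLE PUBLIC;
GRANT USAGE ON FUNCTION utils.common.fmt_date(DATE) TO ROLE PUBLIC;
```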
Best answer: B

Explanation: In Snowflake, the PUBLIC role is automatically granted to every user and role in the account. If PUBLIC is given broad privileges, every user in the account implicitly inherits them. Keeping PUBLIC restricted to minimal, non-sensitive privileges directly implements the principle of least privilege. This principle states that a subject (user, role, application) should have only the exact permissions needed to accomplish its tasks, and no more. In Snowflake, that means giving PUBLIC only basic capabilities (for example, access to a shared utility schema with non-sensitive functions) while using dedicated custom roles for access to production data, PII, or financial tables. An over-privileged PUBLIC role can lead to unintended data exposure. For example, if PUBLIC is granted SELECT on a sensitive schema, every user in the account can read that data regardless of their job function. By instead assigning more specific grants to well-defined roles (such as a dedicated analyst role per domain), administrators keep access aligned with actual need.

Question 85
Topic: Data Loading, Unloading, and Transformation

A team is evaluating Snowflake’s search optimization service to improve query speed. In which scenario is enabling search optimization NOT an appropriate choice and should be avoided as a tuning strategy?

Options:
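For contrast while evaluating the scenarios, a hedged sketch of the pattern search optimization is meant for, a highly selective lookup (table and column names are hypothetical):

```sql
-- Enable search optimization for selective equality lookups.
ALTER TABLE events ADD SEARCH OPTIMIZATION ON EQUALITY(session_id);

-- A point lookup that can now skip most micro-partitions.
SELECT *
FROM events
WHERE session_id = 'a1b2c3d4';
```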
Best answer: D

Explanation: Snowflake’s search optimization service is intended to accelerate highly selective lookup queries—for example, point lookups or predicates that return a very small subset of rows in a large table. It creates additional data structures that allow Snowflake to locate matching micro-partitions more directly than with standard pruning alone. Because there is an extra storage and maintenance cost, it should be reserved for workloads where predicates are very selective and frequently used. It does not help when queries scan most or all of a table’s rows; those patterns are dominated by raw scan and aggregation cost, which are better addressed through warehouse sizing, query design, or data modeling. In the scenario where dashboards always scan nearly all rows of a 12-billion-row fact table to compute aggregates, search optimization will provide little or no benefit. Snowflake still has to process almost all the data, and the search optimization index structures cannot avoid that work. Instead, using a larger or multi-cluster warehouse, summary tables, or other modeling changes is more appropriate. In contrast, when queries repeatedly look up specific IDs, sessions, or narrow JSON attributes that match only a tiny fraction of the table, search optimization can dramatically reduce latency by quickly finding only the relevant micro-partitions and avoiding unnecessary scans.

Question 86
Topic: Data Sharing, Collaboration, and Marketplace

A data engineer must build a nightly ELT workflow that runs several SQL transformations in strict order across multiple schemas. The workflow must be reusable for different run dates, support centralized error logging, and ensure all steps succeed or none do. Which of the following actions/solutions will meet these requirements? (Select TWO.)

Options:
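For reference, a hedged sketch of such a workflow in Snowflake Scripting, with a run-date parameter, one all-or-nothing transaction, and centralized error logging (all object names are hypothetical):

```sql
CREATE OR REPLACE PROCEDURE run_nightly_elt(run_date DATE)
RETURNS STRING
LANGUAGE SQL
AS
$$
BEGIN
  BEGIN TRANSACTION;
  DELETE FROM stg.orders WHERE load_date = :run_date;        -- step 1
  INSERT INTO stg.orders
    SELECT * FROM raw.orders WHERE order_date = :run_date;   -- step 2
  COMMIT;
  RETURN 'OK';
EXCEPTION
  WHEN OTHER THEN
    ROLLBACK;  -- all-or-nothing: discard partial changes
    INSERT INTO ops.elt_errors(run_date, error_msg)
      VALUES (:run_date, SQLERRM);
    RETURN 'FAILED';
END;
$$;
```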
Correct answers: A and E

Explanation: Stored procedures in Snowflake are designed to orchestrate multi-step SQL workflows such as ELT pipelines. They can call multiple SQL statements in a defined sequence, use variables and parameters for reusability, and include procedural error handling to log and control failures. By putting the pipeline steps inside a stored procedure, the data engineer creates a single callable unit that encapsulates the entire workflow. The procedure can accept inputs like a run date or data source identifier, perform the extract/load/transform steps in order, and use exception handling constructs to log errors (for example, into an audit table) and either continue or stop, depending on requirements. Using transaction control in a stored procedure allows the workflow to run all steps in one transaction. If any step fails, the procedure can roll back the transaction so no partial changes are committed, satisfying the all-or-nothing requirement. The stored procedure can then be invoked by a task on a schedule, giving a clean orchestration pattern for recurring transformations.

Question 87
Topic: Data Sharing, Collaboration, and Marketplace

A team currently loads JSON logs, CSV tables, PDFs, and JPEG images into one VARIANT column. They want to simplify governance by separating structured, semi-structured, and unstructured zones in Snowflake. Which description correctly defines data that should go in the unstructured zone?

Options:
Best answer: B

Explanation: In Snowflake, data is often categorized into structured, semi-structured, and unstructured based on how strongly it adheres to a schema. Structured data fits into relational tables with predefined columns and data types (for example, customer and order tables). Semi-structured data, such as JSON, Avro, ORC, or XML, is self-describing and is typically stored in VARIANT columns; it has an internal structure but does not require a fixed table schema. Unstructured data covers files that do not have a predefined tabular schema or a standard self-describing format Snowflake can parse as fields. Typical examples are PDFs, Word documents, images (PNG, JPEG), audio, video, and arbitrary binary files. In Snowflake, these are handled as unstructured file objects stored in stages, with separate metadata that you manage in tables. In the scenario, the unstructured zone is meant to hold content like PDFs and JPEG images, not JSON logs or CSV tables. The correct definition is therefore the one that describes broad file-based content such as documents and media files without a predefined tabular or self-describing structure.

Question 88
Topic: Data Sharing, Collaboration, and Marketplace

Which of the following statements about querying unstructured data in the Snowflake AI Data Cloud are NOT accurate? (Select TWO.)

Options:
Correct answers: D and E

Explanation: In Snowflake, unstructured data (such as images, PDFs, or raw text files) is stored and accessed differently from structured or semi-structured table data. It is typically read at the file or object level rather than as finely organized rows and columns. Because of this, querying unstructured data usually has higher latency and supports fewer direct SQL operations than querying data that has been loaded into structured or semi-structured columns. A common best practice is to use unstructured data as a source, extract important attributes into structured tables, and then perform most analytic work on those derived, structured representations. Unstructured data is therefore well-suited for occasional retrieval, enrichment, and lookups, but not for workloads that require fast, repeated analytical queries using the full power of SQL and Snowflake’s columnar engine.

Question 89
Topic: Operations, Monitoring, and Business Continuity

A data provider plans to use Snowflake secure data sharing so external partners can query data in near real time without copying it. Which of the following practices should you AVOID? (Select TWO.)

Options:
Correct answers: A and B

Explanation: Snowflake secure data sharing allows a provider to expose selected databases, schemas, tables, and views to consumer accounts or reader accounts without copying underlying data. Consumers query shared objects directly on the provider’s storage, which enables zero-copy access and near real-time visibility into changes. Practices that export data to files or require separate ETL into another system break these advantages. They create redundant copies, introduce latency, complicate governance, and move away from Snowflake’s native secure sharing model. In contrast, granting access via secure shares, sharing secure views, and using reader accounts all rely on Snowflake’s built-in mechanisms and preserve zero-copy sharing and timely data access.

Question 90
Topic: Data Sharing, Collaboration, and Marketplace

A team uses the Snowflake AI Data Cloud to process image files stored in an internal stage. Each day they must: 1) scan the directory table for new files, 2) call an external function to analyze each image, 3) update a metadata table, and 4) capture any errors in a central log so the orchestration task does not fail on a single bad file. They want to coordinate all steps in one place and implement robust error handling. Which of the following actions/solutions will meet these requirements? (Select TWO.)

Options:
Correct answers: D and E
Explanation: Stored procedures in the Snowflake AI Data Cloud are well suited to orchestrating complex unstructured data workflows because they can perform multiple SQL operations in sequence, call external functions or Snowpark logic, and implement centralized error handling (for example, by catching errors and writing them to an error log table). In this scenario, the team needs a single place to coordinate reading a directory table of unstructured files, invoking an external analysis step, updating metadata, and recording any errors without causing the entire scheduled task to fail. The best solutions therefore use stored procedures as the orchestration layer, with tasks simply triggering the procedure on a schedule. Approaches that rely only on Snowpipe, streams, tasks, or materialized views will not provide the requested central orchestration and robust error handling in a stored procedure, even if they can react to new files or maintain some metadata automatically.

Question 91
Topic: Snowflake Architecture and Key Features
A data engineering team needs to implement a nightly pipeline entirely inside the Snowflake AI Data Cloud. The pipeline must: (1) run several DML steps in sequence (archiving, loading, and table swaps), (2) perform data-quality checks and, based on the results, either continue or abort, and (3) write detailed error information to a log table before returning a status code. The process will be triggered by a scheduled task, but the team wants the orchestration and control flow to live in a single reusable object in Snowflake, not in an external tool. Which Snowflake feature is the most appropriate to implement this orchestration logic?
Options:
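As context for this scenario, scheduling is separate from orchestration: a task triggers a procedure, while sequencing, branching, and logging live inside the procedure itself. The warehouse and procedure names below are hypothetical:

```sql
-- A task provides the trigger; the procedure provides the control flow
CREATE TASK nightly_pipeline
  WAREHOUSE = etl_wh                       -- hypothetical warehouse
  SCHEDULE = 'USING CRON 0 2 * * * UTC'    -- run nightly at 02:00 UTC
AS
  CALL run_nightly_pipeline();             -- hypothetical orchestration procedure

-- Tasks are created suspended and must be resumed to start running
ALTER TASK nightly_pipeline RESUME;
```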
Best answer: B
Explanation: This scenario focuses on orchestrating a multi-step, data-dependent workflow inside Snowflake. The team needs to run multiple DML steps in sequence, evaluate data-quality checks, branch based on those results, log errors, and return a status. That is more than a single query: it is a procedural workflow with control flow and side effects. In Snowflake, stored procedures are the primary object for implementing such logic. They allow you to execute multiple SQL statements, implement control flow (for example, IF/ELSE, loops, and TRY/CATCH-style error handling), write to log tables, and return values. A task can then call the stored procedure on a schedule, giving you scheduling plus internal orchestration. User-defined functions (UDFs), by contrast, are meant to be invoked inside SQL statements for per-row or per-set computations. They do not serve as orchestrators for multi-step pipelines and are not designed to control transactions or sequences of independent DML operations. Other objects such as tasks and views have important roles but are not sufficient by themselves to implement the procedural, branching workflow described in the scenario: tasks schedule and trigger work, and views define reusable query logic. For end-to-end orchestration and control flow, the correct choice is a stored procedure.

Question 92
Topic: Data Sharing, Collaboration, and Marketplace
In the Snowflake AI Data Cloud, which object is BEST suited to orchestrate multiple SQL transformation steps with defined sequencing, basic error handling, and reusable logic that can be invoked with a single call?
Options:
Best answer: B
Explanation: Stored procedures in the Snowflake AI Data Cloud are designed to orchestrate complex transformation workflows that require more than a single SQL statement. They can execute multiple SQL commands in a defined order, apply conditional logic, manage transactions, and implement basic error handling. Because they are defined as database objects, they can be invoked repeatedly using a simple CALL statement. In a typical pattern, a stored procedure might perform several steps: staging data, merging into target tables, running validation queries, and logging results. All of these can be packaged into one procedure. If details change (for example, a new validation step is added), only the procedure definition needs to be updated, while callers continue to use the same interface. By contrast, tasks, views, and streams each play important but different roles. Tasks primarily schedule or trigger work, views define reusable queries, and streams track changes. None of these is meant to encapsulate multi-step procedural orchestration with sequencing and error handling in the way that a stored procedure does.

Question 93
Topic: Data Loading, Unloading, and Transformation
A team runs daytime BI dashboards and ad-hoc queries with moderate concurrency and wants to keep Snowflake credit costs low while maintaining reasonable responsiveness. Which virtual warehouse configurations SHOULD YOU AVOID? (Select TWO.)
Options:
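For contrast with the configurations this question asks you to avoid, a cost-conscious setup for a moderate daytime BI workload might look like this sketch. The size and thresholds are illustrative only, and the multi-cluster settings require a Snowflake edition that supports them:

```sql
CREATE WAREHOUSE bi_wh
  WAREHOUSE_SIZE = 'SMALL'       -- start small; resize only if queries demand it
  AUTO_SUSPEND = 60              -- suspend after 60 seconds of inactivity
  AUTO_RESUME = TRUE             -- wake automatically on the next query
  MIN_CLUSTER_COUNT = 1          -- low floor so quiet periods cost little
  MAX_CLUSTER_COUNT = 3          -- headroom for concurrency peaks
  SCALING_POLICY = 'STANDARD'
  INITIALLY_SUSPENDED = TRUE;    -- no credits burned until first use
```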
Correct answers: D and E
Explanation: Virtual warehouse settings in the Snowflake AI Data Cloud strongly influence both performance and cost. Features such as auto-suspend and auto-resume help eliminate idle compute spend, while multi-cluster settings control how many clusters run at once to handle concurrent queries. For a daytime BI workload with moderate concurrency, the main cost risk is running excessive compute when there is little or no activity. Long-lived or always-on large warehouses and high minimum cluster counts can consume a lot of credits with minimal performance gain relative to the workload’s needs. Good practice is to keep warehouses as small and short-lived as is reasonable while still meeting responsiveness expectations. Short auto-suspend times, auto-resume, and modest multi-cluster ranges (with a low minimum) all help contain costs while providing flexibility for peaks. The configurations to avoid are those that force large or many clusters to stay running constantly for a relatively small, predictable workload, because they violate basic cost-optimization principles without providing proportional performance benefits.

Question 94
Topic: Data Loading, Unloading, and Transformation
Which of the following statements about standard (single-cluster) and multi-cluster virtual warehouses in Snowflake is NOT correct?
Options:
Best answer: B
Explanation: Standard (single-cluster) warehouses in Snowflake always use a single compute cluster. If many users or workloads run queries concurrently and exceed that cluster’s capacity, Snowflake may queue some queries until resources become available. Multi-cluster warehouses are designed to handle higher concurrency by adding or removing entire clusters within a configured minimum and maximum. When concurrency increases, Snowflake may start additional clusters so more queries can run simultaneously; when concurrency decreases, it can shut down extra clusters to reduce compute usage. However, Snowflake never splits a single query across multiple clusters. Each query executes entirely on one cluster, even when multiple clusters are active. Multi-cluster behavior improves concurrency (more queries at once), not the parallelism of an individual query.

Question 95
Topic: Account Setup, Security, and Governance
A data platform team wants to simplify permission management as analysts frequently move between departments. They also want consistent, auditable access aligned with Snowflake’s RBAC model. Which approach is the most appropriate?
Options:
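As context for this scenario, the role-centric RBAC pattern in Snowflake can be sketched as follows; the role, object, and user names are hypothetical:

```sql
-- Privileges attach to the department role, not to individual users
CREATE ROLE finance_analyst;
GRANT USAGE ON DATABASE finance_db TO ROLE finance_analyst;
GRANT USAGE ON SCHEMA finance_db.reporting TO ROLE finance_analyst;
GRANT SELECT ON ALL TABLES IN SCHEMA finance_db.reporting TO ROLE finance_analyst;

-- Users inherit access through role membership
GRANT ROLE finance_analyst TO USER jdoe;

-- When the analyst moves departments, only role membership changes;
-- the privilege model itself stays stable and auditable
REVOKE ROLE finance_analyst FROM USER jdoe;
GRANT ROLE marketing_analyst TO USER jdoe;
```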
Best answer: D
Explanation: Snowflake’s role-based access control (RBAC) model is built around granting privileges to roles and then assigning those roles to users. Users do not receive object privileges directly; instead, they inherit permissions from the roles they hold. This structure allows administrators to manage access centrally and adjust it easily as people change responsibilities. The key goal in the scenario is to simplify permission changes when analysts move between departments while keeping access auditable and consistent with Snowflake’s RBAC approach. The most effective way to do this is to design department- or function-specific roles, grant all necessary database and schema privileges to those roles, and then assign or revoke roles for individual users as they move. This keeps the privilege model stable and shifts churn to role membership, which is much easier to manage. By contrast, per-user privilege management, shared accounts, or broadly granting powerful system roles either conflicts with Snowflake’s RBAC design or undermines security and governance, even if it might appear to reduce short-term configuration work.

Question 96
Topic: Data Loading, Unloading, and Transformation
Under which condition can Snowflake automatically rewrite a user’s query to use a materialized view instead of the base table?
Options:
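As context, here is a case where such a rewrite can occur. The names are hypothetical, and whether the optimizer actually uses the materialized view is its decision based on logical coverage, not a guarantee:

```sql
-- Precomputed aggregate maintained by Snowflake
CREATE MATERIALIZED VIEW daily_sales_mv AS
  SELECT region, order_date, SUM(amount) AS total_amount
  FROM sales
  GROUP BY region, order_date;

-- This query never mentions the materialized view, but its filter,
-- grouping, and aggregation fit entirely within what the view computes,
-- so the optimizer may transparently serve it from daily_sales_mv:
SELECT region, order_date, SUM(amount)
FROM sales
WHERE region = 'EMEA'
GROUP BY region, order_date;
```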
Best answer: D
Explanation: Snowflake’s optimizer can transparently rewrite queries to read from a materialized view instead of the underlying base table when it can prove that the materialized view fully or partially covers the query in a logically equivalent way. In practice, this means the query’s filters, projections, joins, and aggregations must fit within what the materialized view already computes, so that using the view yields exactly the same result as querying the base table. This behavior allows performance improvements (by leveraging precomputed and maintained results) without changing application SQL. However, it is driven by logical query equivalence, not by string matching, hints, or object placement alone.

Question 97
Topic: Snowflake Architecture and Key Features
A security architect wants to tighten access control in a Snowflake account by ensuring that authentication, role-based access control (RBAC), and metadata-driven governance behave consistently, regardless of which virtual warehouse or database a user queries. Which Snowflake component is primarily responsible for providing this centralized control?
Options:
Best answer: D
Explanation: In the Snowflake AI Data Cloud, the cloud services layer is responsible for core control-plane functions such as authentication, role-based access control (RBAC), metadata management, query optimization, and transaction coordination. These responsibilities are deliberately separated from compute (virtual warehouses) and storage so that governance and control remain consistent regardless of which warehouse runs the query or where the data is physically stored. When a user logs in, the cloud services layer authenticates the user, evaluates their roles and privileges, and determines which objects they can access. It uses metadata catalogs to understand objects (databases, schemas, tables, views, policies) and applies security rules such as RBAC, row access policies, and masking policies. It also parses and optimizes SQL, generates execution plans, and coordinates transactions across warehouses. This centralized control is what allows the architect’s requirement (tighter, consistent access control that is independent of any specific virtual warehouse or database) to be met by the cloud services layer, not by compute or storage components.

Question 98
Topic: Query Processing and Performance Optimization
A data engineer is designing a secure ingestion strategy and comparing internal and external stages in Snowflake. Which of the following statements about stages is INCORRECT?
Options:
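As context for the comparison, here are minimal definitions of each stage type; the bucket, integration, and object names are hypothetical:

```sql
-- Internal named stage: files live in Snowflake-managed storage,
-- encrypted by Snowflake, with no external credentials required
CREATE STAGE my_int_stage;

-- External stage: a pointer to customer-owned cloud storage,
-- typically authorized via a storage integration rather than raw keys
CREATE STAGE my_ext_stage
  URL = 's3://my-bucket/landing/'
  STORAGE_INTEGRATION = s3_int;

-- Loading reads files from either kind of stage the same way
COPY INTO raw_events
FROM @my_ext_stage
FILE_FORMAT = (TYPE = 'JSON');
```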
Best answer: C
Explanation: Snowflake stages provide locations for data files used in loading and unloading operations. Internal stages are fully managed by Snowflake, storing files inside Snowflake’s own storage layer. External stages, in contrast, are pointers to locations in external cloud storage, such as object storage buckets, where the customer continues to own and manage the data. Because internal stages live entirely within Snowflake, they do not require external storage credentials, and Snowflake handles encryption and access control. External stages must reference external storage URLs and typically use credentials or a storage integration so Snowflake can read and/or write the files during loading and unloading operations. Internal stages can exist at different scopes (user, table, or named), which affects who can access them. External stages are always named database objects that reference an external location. Importantly, data in an external stage is not stored in Snowflake-managed storage and is not billed as Snowflake storage; it remains in the customer’s external cloud storage account.

Question 99
Topic: Operations, Monitoring, and Business Continuity
A data engineer in the provider account runs the following query in Snowsight:
Exhibit:
Based on the exhibit, what is the most accurate conclusion about how the retailer accesses the shared data?
Options:
Best answer: A
Explanation: The exhibit shows two outbound shares from the provider account. The share consumed by the retailer lists a reader account, provisioned and managed by the provider, as its consumer rather than an independent Snowflake account. Therefore, the most accurate conclusion is that the retailer is accessing the shared data through a Snowflake reader account created and managed in the provider account, rather than through its own independent Snowflake account.

Question 100
Topic: Data Loading, Unloading, and Transformation
Which TWO statements about how Snowflake database replication and failover affect storage and compute costs are correct? (Select TWO.)
Options:
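As context for this question, the basic replication flow looks roughly like the following sketch; the account and database identifiers are hypothetical:

```sql
-- Primary account: allow a database to be replicated to another account
ALTER DATABASE sales_db ENABLE REPLICATION TO ACCOUNTS my_org.dr_account;

-- Secondary account: create the replica; it is a full copy of the data,
-- billed as storage in this account
CREATE DATABASE sales_db AS REPLICA OF my_org.primary_account.sales_db;

-- Each refresh consumes compute, billed to the account performing it
ALTER DATABASE sales_db REFRESH;
```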
Correct answers: A and B
Explanation: Snowflake database and failover-group replication improve availability and disaster recovery by maintaining additional copies of your data in other regions or accounts. However, each replica is a full copy of the data, so you pay storage for both the primary and every secondary. In addition, replication and failover are active processes that use Snowflake compute resources, and any test or real failover where you run workloads on the secondary will incur compute costs in that environment. Virtual warehouses are not included in data replication; they must be defined in each account where you run queries. When a failover occurs, queries are executed in the target account’s warehouses and are billed there. This means careful planning is needed: limit replication to necessary objects, choose an appropriate replication frequency, and schedule failover tests thoughtfully to balance resilience and cost.

Continue with full practice
Use the SnowPro Core COF-C02 Practice Test page for the full IT Mastery route, mixed-topic practice, timed mock exams, explanations, and web/mobile app access.
Try SnowPro Core COF-C02 on Web View SnowPro Core COF-C02 Practice Test
Focused topic pages
Free review resource
Read the SnowPro Core COF-C02 Cheat Sheet on Tech Exam Lexicon for concept review before another timed run.
Revised on Thursday, May 14, 2026