Try 10 focused SnowPro Core COF-C02 questions on Sharing and Marketplace, with explanations, then continue with IT Mastery.
Open the matching IT Mastery practice page for timed mocks, topic drills, progress tracking, explanations, and full practice.
Try SnowPro Core COF-C02 on Web, or view the full SnowPro Core COF-C02 practice page.
| Field | Detail |
|---|---|
| Exam route | SnowPro Core COF-C02 |
| Topic area | Data Sharing, Collaboration, and Marketplace |
| Blueprint weight | 15% |
| Page purpose | Focused sample questions before returning to mixed practice |
Use this page to isolate Data Sharing, Collaboration, and Marketplace for SnowPro Core COF-C02. Work through the 10 questions first, then review the explanations and return to mixed practice in IT Mastery.
| Pass | What to do | What to record |
|---|---|---|
| First attempt | Answer without checking the explanation first. | The fact, rule, calculation, or judgment point that controlled your answer. |
| Review | Read the explanation even when you were correct. | Why the best answer is stronger than the closest distractor. |
| Repair | Repeat only missed or uncertain items after a short break. | The pattern behind misses, not the answer letter. |
| Transfer | Return to mixed practice once the topic feels stable. | Whether the same skill holds up when the topic is no longer obvious. |
Blueprint context: this topic area carries 15% of the practice outline. A focused topic score can overstate readiness when you recognize question patterns too quickly, so treat these items as repair work before timed mixed sets.
These questions are original IT Mastery practice items aligned to this topic area. They are designed for self-assessment and are not official exam questions.
Topic: Data Sharing, Collaboration, and Marketplace
When granting temporary URL access to unstructured files stored in a Snowflake stage, which practice best aligns with Snowflake security recommendations?
Options:
A. Generate presigned URLs with a short expiration time and restrict stage access to only the specific roles that require the files.
B. Move files from the stage to an external cloud storage bucket configured for anonymous public access, then share the bucket URL.
C. Create a presigned URL with a very long expiration time and post it on a shared internal wiki for easy reuse.
D. Grant READ privilege on the stage to the PUBLIC role and use directory table URLs so that expiration is not required.
Best answer: A
Explanation: The recommended way to share Snowflake-hosted unstructured files via URL is to use short-lived presigned URLs together with tightly scoped role-based privileges on the underlying stage, so access is both time-bound and least-privilege. Using a long-lived URL, even if presigned, is weaker because it greatly extends the window in which anyone with the link can access the file.
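For example, a minimal sketch of option A (the stage, file path, and role names are hypothetical): grant stage access only to the role that needs the files, then generate a presigned URL that expires after a short window.
GRANT READ ON STAGE assets_stage TO ROLE design_reader;  -- scope stage access to one role

SELECT GET_PRESIGNED_URL(@assets_stage, 'designs/logo_v2.png', 600) AS temporary_url;  -- link expires in 600 seconds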
Topic: Data Sharing, Collaboration, and Marketplace
A data engineer wants dim_sales to be updated incrementally using changes from raw_sales. Review the stream and task definition in the exhibit.
What is the most appropriate next change to implement this incremental ELT pattern?
-- Exhibit: Stream and task definition
CREATE OR REPLACE STREAM sales_stm ON TABLE raw_sales;
CREATE OR REPLACE TASK load_sales_tk
WAREHOUSE = etl_wh
SCHEDULE = 'USING CRON 0 * * * * UTC'
AS
MERGE INTO dim_sales d
USING raw_sales r
ON d.id = r.id
WHEN MATCHED THEN
UPDATE SET d.amount = r.amount
WHEN NOT MATCHED THEN
INSERT (id, amount) VALUES (r.id, r.amount);
Options:
A. Modify the MERGE to use sales_stm instead of raw_sales in the USING clause.
B. Increase the task schedule to run every 5 minutes instead of hourly.
C. Recreate the stream as CREATE OR REPLACE STREAM sales_stm ON TABLE dim_sales;.
D. Resize etl_wh to a larger warehouse to improve incremental processing.
Best answer: A
Explanation: In Snowflake incremental ELT patterns, a stream captures row-level changes on a source table and a task schedules the SQL that consumes those changes. The stream exposes the changes made since the last time it was consumed in a DML transaction, at which point its offset advances.
In the exhibit, the stream sales_stm is correctly defined on the raw_sales table:
CREATE OR REPLACE STREAM sales_stm ON TABLE raw_sales;
This means sales_stm will hold inserts, updates, and deletes that occur on raw_sales after the stream is created.
However, the task definition does not use the stream:
MERGE INTO dim_sales d
USING raw_sales r
ON d.id = r.id
...
Because the MERGE reads directly from raw_sales, each run of the task will consider the entire source table. That is a batch pattern, not an incremental one. To make the pattern incremental, the task must read from the stream sales_stm, which exposes only the new and changed rows since the last consumption.
The correct next step is therefore to modify the MERGE statement so that the USING clause references sales_stm instead of raw_sales. This change connects the scheduling capability of the task with the change-capture capability of the stream, forming a standard incremental ELT pattern.
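For example, a minimal sketch of the corrected task using the objects from the exhibit (the optional WHEN clause is an extra refinement, not part of the question):
CREATE OR REPLACE TASK load_sales_tk
  WAREHOUSE = etl_wh
  SCHEDULE = 'USING CRON 0 * * * * UTC'
  WHEN SYSTEM$STREAM_HAS_DATA('SALES_STM')  -- skip the run when the stream has no changes
AS
  MERGE INTO dim_sales d
  USING sales_stm r                         -- read only the changed rows captured by the stream
  ON d.id = r.id
  WHEN MATCHED THEN UPDATE SET d.amount = r.amount
  WHEN NOT MATCHED THEN INSERT (id, amount) VALUES (r.id, r.amount);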
Topic: Data Sharing, Collaboration, and Marketplace
A media company stores unstructured image files in a Snowflake internal stage. They must: (1) give an external design agency temporary access to a single file without sharing any Snowflake or cloud storage credentials, and (2) let internal analysts browse files using normal Snowflake authentication and role-based access. Which of the following actions/solutions will meet these requirements? (Select TWO.)
Options:
A. Give the external agency the direct cloud storage URL for the file plus a shared storage access key so they can download it whenever needed.
B. Generate a presigned URL for the specific staged image with a short expiration time and share this URL only with the external agency.
C. Create a presigned URL for the image file with no expiration so the external agency can reuse the link indefinitely without renewal.
D. Create a Snowflake user for the external agency and grant that user READ privileges on the internal stage so they can browse and download any required files.
E. Provide internal analysts with internal-stage file and directory URLs and require them to authenticate with their Snowflake accounts and appropriate roles to access the images.
Correct answers: B and E
Explanation: Presigned URLs, internal stage URLs, and direct cloud storage URLs differ mainly in who can use them, for how long, and under which security controls.
A presigned URL in Snowflake is typically used to give temporary, scoped access to a single staged file. It embeds an authorization token and expiration time, so someone without a Snowflake account or cloud credentials can download exactly that file for a limited period.
Internal stage URLs, by contrast, are designed for users who already authenticate to Snowflake. Access to files through these URLs is controlled by Snowflake RBAC and authentication, keeping all access under Snowflake governance.
Direct cloud storage URLs and shared storage keys live outside Snowflake’s control. Sharing those broadly can leak credentials or provide broader and longer-lived access than required, which conflicts with the principles of least privilege and time-bound access.
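For example, a minimal sketch of the two correct options (the stage and file names are hypothetical):
-- External agency: temporary, single-file access with no Snowflake login (option B)
SELECT GET_PRESIGNED_URL(@images_stage, 'campaign/banner_01.png', 900) AS agency_url;

-- Internal analysts: a stage file URL that still requires Snowflake authentication and RBAC (option E)
SELECT BUILD_STAGE_FILE_URL(@images_stage, 'campaign/banner_01.png') AS analyst_url;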
Topic: Data Sharing, Collaboration, and Marketplace
An engineer must orchestrate a repeatable daily transformation that loads, cleans, and merges data while logging any failures. Review the stored procedure definition in the exhibit. What is the primary advantage of this approach?
CREATE OR REPLACE PROCEDURE TRANSFORM_DAILY_BATCH()
RETURNS STRING
LANGUAGE SQL
AS
$$
BEGIN
-- Step 1: Load new data into staging
INSERT INTO STG_SALES
SELECT * FROM RAW_SALES;
-- Step 2: Clean and deduplicate
DELETE FROM STG_SALES s
USING (
SELECT ID, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY LOAD_TS) AS RN
FROM STG_SALES
) d
WHERE s.ID = d.ID AND d.RN > 1;
-- Step 3: Merge into final table
MERGE INTO SALES t
USING STG_SALES s
ON t.ID = s.ID
WHEN MATCHED THEN UPDATE SET ...
WHEN NOT MATCHED THEN INSERT (...);
RETURN 'SUCCESS';
EXCEPTION
WHEN OTHER THEN
INSERT INTO ETL_ERROR_LOG(PROC_NAME, ERROR_TIME, ERROR_MESSAGE)
VALUES('TRANSFORM_DAILY_BATCH', CURRENT_TIMESTAMP, SQLERRM);
RETURN 'FAILED';
END;
$$;
Options:
A. It causes Snowflake to automatically run each step in parallel across separate warehouses for maximum throughput.
B. It sequences all related SQL steps in one reusable procedure with centralized error logging if any step fails.
C. It guarantees automatic retries of failed steps until the RETURN 'SUCCESS' statement is reached.
D. It bypasses the need for transactions because individual statements always auto-commit even when errors occur.
Best answer: B
Explanation: The exhibit shows a stored procedure TRANSFORM_DAILY_BATCH defined with LANGUAGE SQL. Inside the BEGIN block, three SQL steps are executed in order: inserting into a staging table, cleaning/deduplicating the staging data, and merging into the final SALES table. This enforces a clear, deterministic sequence for the transformation workflow.
After these steps, the procedure returns 'SUCCESS' if all statements complete. An EXCEPTION block handles any error raised in the preceding statements. In case of failure, the code inserts a record into ETL_ERROR_LOG with the procedure name, timestamp, and error message (SQLERRM), then returns 'FAILED'. This centralizes error logging and reporting for the entire workflow, instead of handling errors separately in each statement or client.
Because the logic is encapsulated in a single procedure, callers can simply execute CALL TRANSFORM_DAILY_BATCH() from tasks, external schedulers, or ad-hoc sessions, reusing the same orchestrated sequence and error handling across environments and schedules.
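For example, a minimal sketch of scheduling the procedure with a task (the warehouse and cron expression are hypothetical):
CREATE OR REPLACE TASK transform_daily_batch_tk
  WAREHOUSE = etl_wh
  SCHEDULE = 'USING CRON 0 2 * * * UTC'  -- once per day at 02:00 UTC
AS
  CALL TRANSFORM_DAILY_BATCH();

ALTER TASK transform_daily_batch_tk RESUME;  -- newly created tasks start suspended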
Topic: Data Sharing, Collaboration, and Marketplace
A Snowflake table stores clickstream events in a VARIANT column. Each event contains an array of product IDs that were viewed. Analysts struggle to aggregate by individual product because the IDs remain nested inside the array. What is the most appropriate way to simplify these queries?
Options:
A. Convert the VARIANT column to a plain VARCHAR using TO_VARCHAR so that each array is stored as a string and can be filtered with LIKE.
B. Create a materialized view that selects the VARIANT column directly, allowing analysts to query the nested array as-is.
C. Use the FLATTEN table function on the array column to turn each array element into its own row before aggregating by product ID.
D. Increase the virtual warehouse size so queries can scan the VARIANT column faster without changing how the data is modeled.
Best answer: C
Explanation: Semi-structured data in Snowflake, such as JSON stored in a VARIANT column, often contains arrays or nested objects. When analysts need to report or aggregate at the level of individual items inside an array (for example, individual product IDs), it is inefficient and cumbersome to work directly against the nested structure.
The FLATTEN table function is built specifically for this problem. It takes an array or object and returns its elements as separate rows. Once the nested values are represented as rows, you can use standard SQL joins, GROUP BY, and aggregations on those values just like on any other relational column.
In this scenario, applying FLATTEN to the array of product IDs produces one row per product per event. Analysts can then easily group by product ID, count views, and calculate metrics without writing complex nested path expressions in every query.
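For example, a minimal sketch assuming a hypothetical clickstream table with a VARIANT column named event, where event:viewed_products holds the array of product IDs:
SELECT f.value::STRING AS product_id,
       COUNT(*)        AS view_count
FROM clickstream c,
     LATERAL FLATTEN(input => c.event:viewed_products) f
GROUP BY product_id
ORDER BY view_count DESC;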
Topic: Data Sharing, Collaboration, and Marketplace
Analysts complain that a BI dashboard counting unique customers over a 5‑billion‑row table takes 40 seconds, even on a Large warehouse. They only need results within about 1% of the true value and must control compute costs. Which change is most appropriate?
Options:
A. Replace exact DISTINCT aggregate calculations with Snowflake’s approximate aggregation functions to compute estimated unique-customer counts.
B. Create a clustered table on the customer identifier column so distinct-count queries can use pruning more effectively.
C. Increase the virtual warehouse size to X-Large and disable auto-suspend so queries always use a warm cache.
D. Build a materialized view with exact distinct-customer counts and refresh it frequently, then point the dashboard to this view.
Best answer: A
Explanation: The scenario describes slow DISTINCT aggregation over a very large table, even on a reasonably sized warehouse. The business explicitly states two important constraints: results can be approximate (within about 1% of the true value) and compute costs must be controlled.
In the Snowflake AI Data Cloud, approximate or estimation functions are designed specifically for these situations. They trade exactness for speed and reduced resource usage. Instead of computing an exact DISTINCT count over billions of rows, they maintain compact statistical summaries to estimate the result. This greatly reduces processing time and credits while giving answers that are acceptably close for many analytical use cases, such as trend monitoring or high-level KPIs.
Because the requirement explicitly allows some inaccuracy in exchange for better performance and lower cost, switching from exact aggregates to approximate aggregation functions is the most appropriate and Snowflake-native response. Other options focus on throwing more compute or maintenance at the problem while still using exact calculations, which conflicts with the stated tolerance for approximation and the need to control spend.
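For example, a minimal sketch assuming a hypothetical page_views table with a customer_id column:
-- Exact count: scans and de-duplicates billions of values
SELECT COUNT(DISTINCT customer_id) AS exact_uniques FROM page_views;

-- Approximate count: HyperLogLog-based estimate, typically within a few percent of the true value
SELECT APPROX_COUNT_DISTINCT(customer_id) AS approx_uniques FROM page_views;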
Topic: Data Sharing, Collaboration, and Marketplace
A data engineer needs each element of a JSON array column payload:items to appear on its own row while still showing the original order_id from table ORDERS. Their current query uses TABLE(FLATTEN(input => payload:items)) as a standalone subquery, and joining it back to ORDERS is causing unexpected row duplication. Which change is the most appropriate to fix this while preserving the parent row context?
Options:
A. Rewrite the query so that ORDERS and LATERAL FLATTEN(input => orders.payload:items) appear together in the FROM clause, allowing each flattened item to be correlated with its parent order row.
B. First copy all payload:items values into a separate staging table using COPY INTO, then join the staging table back to ORDERS using a synthetic row number for each array element.
C. Cast payload from VARIANT to OBJECT and use the : path syntax to directly select payload:items[0] while keeping order_id in the same row.
D. Use GROUP BY order_id with ARRAY_AGG(payload:items) so that each order_id appears once with all items, avoiding any need for FLATTEN or joins.
Best answer: A
Explanation: When working with semi-structured JSON data in Snowflake, FLATTEN expands array elements into multiple rows. However, if FLATTEN is used as a standalone subquery without being correlated to the parent table, the resulting rows lose their direct link to the original parent row. Joining these back later often leads to unexpected row multiplication or complex join conditions.
LATERAL FLATTEN solves this by treating FLATTEN as a lateral (correlated) row source: the FLATTEN call can reference columns from the parent table in the same FROM clause. Conceptually, Snowflake creates, for each parent row, a set of child rows (one per array element) that automatically carry the parent row’s values.
For example:
SELECT o.order_id,
       f.value AS item
FROM orders o,
     LATERAL FLATTEN(input => o.payload:items) f;
In this pattern, each payload:items element is expanded into its own row, and order_id remains available on every row without any additional join conditions. This directly addresses the issue in the scenario: preserving parent context while expanding arrays, and avoiding incorrect joins or duplicated rows.
Topic: Data Sharing, Collaboration, and Marketplace
A team must load changes from a staging table into a reporting table once per day. Their top priority is to minimize Snowflake compute credits while still processing all daily changes. Which approach uses streams and tasks in the most cost-efficient way?
Options:
A. Create a stream on the staging table and a task that runs every 5 minutes on a medium warehouse to MERGE streamed changes for near real-time reporting.
B. Create a stream on the staging table and two hourly tasks: one to transform the data and another to load aggregates on a dedicated warehouse kept running for concurrency.
C. Skip streams and define a daily task that recreates the reporting table with a CTAS query reading the entire staging table on a large warehouse.
D. Create a stream on the staging table and a single task that runs once per day on a small auto-suspending warehouse to MERGE only the streamed changes into the reporting table.
Best answer: D
Explanation: Snowflake streams track row-level changes (inserts, updates, deletes) on a source table since the last time the stream was consumed. Tasks schedule and automate the SQL that processes those changes. For a workload that only needs to run once per day and wants to minimize compute credits, the best pattern is to use a stream for incremental change capture and a single daily task on the smallest appropriate warehouse with auto-suspend.
By processing only changed rows via the stream, the MERGE statement does less work than a full-table reload. By scheduling the task to run once per day on a small auto-suspending warehouse, compute is used only for the short period needed to apply the daily changes. This directly supports the stated deciding factor: minimizing Snowflake credits while still processing all daily changes.
Other options either increase task frequency, increase warehouse size, or reload the full dataset, all of which increase compute usage without being required by the once-per-day business requirement.
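For example, a minimal sketch of option D (all object names are hypothetical):
CREATE WAREHOUSE IF NOT EXISTS etl_xs_wh
  WAREHOUSE_SIZE = 'XSMALL'
  AUTO_SUSPEND = 60   -- suspend after 60 seconds of inactivity
  AUTO_RESUME = TRUE;

CREATE OR REPLACE STREAM staging_changes_stm ON TABLE staging_sales;

CREATE OR REPLACE TASK daily_merge_tk
  WAREHOUSE = etl_xs_wh
  SCHEDULE = 'USING CRON 0 3 * * * UTC'  -- once per day at 03:00 UTC
  WHEN SYSTEM$STREAM_HAS_DATA('STAGING_CHANGES_STM')
AS
  MERGE INTO reporting_sales r
  USING staging_changes_stm s
  ON r.id = s.id
  WHEN MATCHED THEN UPDATE SET r.amount = s.amount
  WHEN NOT MATCHED THEN INSERT (id, amount) VALUES (s.id, s.amount);

ALTER TASK daily_merge_tk RESUME;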
Topic: Data Sharing, Collaboration, and Marketplace
Which of the following statements about creating arrays in Snowflake using array-construction functions is NOT correct?
Options:
A. Array-construction functions such as ARRAY_CONSTRUCT return values of the native SQL type ARRAY, distinct from VARIANT.
B. ARRAY_CONSTRUCT can take multiple scalar or VARIANT arguments and returns a single VARIANT value containing an ordered array of those elements.
C. ARRAY_AGG builds an array by aggregating values from multiple rows, often used with GROUP BY to produce one array per group.
D. ARRAY_CONSTRUCT can be nested to build arrays that contain other arrays or objects.
Best answer: A
Explanation: Snowflake array-construction functions help you build arrays either from explicit arguments in an expression or by aggregating values from multiple rows. These arrays are always represented as semi-structured data of type VARIANT, not a separate ARRAY data type.
ARRAY_CONSTRUCT takes a list of expressions and returns a VARIANT value whose content is an array of those values in the same order. You can nest calls to ARRAY_CONSTRUCT (and combine them with object constructors) to build nested structures like arrays of arrays or arrays of objects.
ARRAY_AGG is an aggregate function that combines values from multiple rows into a single array stored in a VARIANT. It is commonly used with GROUP BY to create one array per group, such as all items for each customer.
The incorrect statement is the one that claims Snowflake has a native ARRAY SQL type distinct from VARIANT. In Snowflake, semi-structured data types (arrays, objects) live inside VARIANT values rather than as separate top-level SQL types.
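For example, a minimal sketch of both construction patterns (the order_lines table and its columns are hypothetical):
SELECT ARRAY_CONSTRUCT(1, 'two', ARRAY_CONSTRUCT(3, 4)) AS nested_array;  -- [1, "two", [3, 4]]

SELECT customer_id,
       ARRAY_AGG(item_id) AS items_per_customer
FROM order_lines
GROUP BY customer_id;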
Topic: Data Sharing, Collaboration, and Marketplace
A data engineer stores image and PDF files in an internal stage and has created a directory table for that stage. Queries on the directory table only return file names and URLs, but the team must search inside file text without loading the files into a table. What is the most appropriate Snowflake approach?
Options:
A. Resize the virtual warehouse to a larger size so Snowflake can automatically scan the contents of unstructured files when querying the directory table.
B. Create an external table on the stage so each file is automatically exposed as a row with its full text stored in a VARIANT column.
C. Use COPY INTO to load each unstructured file into a permanent table column, then drop the stage after loading completes.
D. Use SQL file functions in queries against the stage or directory table to read file metadata and text content directly from the stored files.
Best answer: D
Explanation: Directory tables in Snowflake provide a relational view over files stored in a stage, exposing information such as file names, paths, sizes, and other attributes. However, by default they do not expose the actual contents of unstructured files like images and PDFs.
To inspect or search inside those files without loading them into regular tables, Snowflake provides SQL file functions. These functions operate over files in stages (often combined with directory tables) and return metadata or content as SQL values that can be filtered, searched, or transformed in queries.
By using SQL file functions in a SELECT statement that references the stage or its directory table, the engineer can read file metadata and extract or search text content on demand, satisfying the requirement to avoid loading the files into a separate table while still enabling rich querying inside Snowflake.
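For example, a minimal sketch of querying a directory table together with a SQL file function to list PDF files and produce scoped URLs for downstream processing (the docs_stage name is hypothetical):
SELECT d.relative_path,
       d.size,
       BUILD_SCOPED_FILE_URL(@docs_stage, d.relative_path) AS scoped_url
FROM DIRECTORY(@docs_stage) d
WHERE d.relative_path ILIKE '%.pdf';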
Use the SnowPro Core COF-C02 Practice Test page for the full IT Mastery route, mixed-topic practice, timed mock exams, explanations, and web/mobile app access.
Try SnowPro Core COF-C02 on Web, or view the SnowPro Core COF-C02 Practice Test.
Read the SnowPro Core COF-C02 Cheat Sheet on Tech Exam Lexicon, then return to IT Mastery for timed practice.